Launching the Engineering Blog

We recently re-launched Zalando's Engineering Blog. Learn how we have set up a blog with a Lighthouse score of 100.

Henning Jacobs

Senior Principal Engineer

Posted on Jul 01, 2020

Our Engineering Blog was launched in June 2020, after the previous tech blog had been on a long break. This post describes the technical setup behind engineering.zalando.com.

You will learn:

  • Which static site generator we selected and why.
  • What customizations we applied to design the blog and the publishing process.
  • How we serve static HTML using Skipper and S3.

Static Site Generator

Our previous tech blog used a CMS that only a limited number of people could access. The CMS also lacked a workflow to propose and review drafts. As authors of the Engineering Blog will (mostly) be software engineers, we decided to switch to a git-based workflow and a static site generator.

StaticGen provides a nice overview of many different static site generators. Nearly all of them provide the necessary features to generate a static HTML site from blog posts written in Markdown. So which static site generator to choose?

Because we needed to customize the blog engine, e.g. with custom templates and features like author titles, the main criterion for the static site generator was that it uses a familiar programming language for templating and plugins. It should also generate plain HTML and not contain unnecessary features we won't use. The winner was Pelican:

StaticGen: Pelican stats

Customization

We implemented the blog's design with plain HTML/CSS. The CSS is generated via PostCSS and Tailwind CSS. Customizing Pelican's Jinja templates was straightforward.

Other customizations we did:

In addition to the above, we wanted automatic linting in place for blog posts:

  • Required meta keys must be present, e.g. title, summary, and author names.
  • The blog post Markdown file must be in the right year/month folder.
  • Article tags should be curated via an explicit allowlist. We want to avoid introducing many unnecessary tags and different tags for the same concept, e.g. "Postgres" vs. "PostgreSQL".
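
The three rules above could be checked by a script along the following lines. This is a minimal sketch, not the blog's actual validate-content.py: the function names, the "Key: value" metadata header format, the Date and Tags key names, the comma-separated tag syntax, and the sample allowlist are all assumptions made for illustration.

```python
import re
from pathlib import PurePosixPath

# Hypothetical values; the real required keys and tag allowlist are not shown in this post.
REQUIRED_KEYS = {"title", "summary", "authors", "date"}
ALLOWED_TAGS = {"PostgreSQL", "Kubernetes", "Python"}

META_LINE = re.compile(r"^([A-Za-z ]+):\s*(.*)$")


def parse_meta(text):
    """Parse leading 'Key: value' lines until the first line that is not metadata."""
    meta = {}
    for line in text.splitlines():
        match = META_LINE.match(line)
        if not match:
            break
        meta[match.group(1).strip().lower()] = match.group(2).strip()
    return meta


def validate_post(path, text):
    """Return a list of human-readable problems (an empty list means the post is valid)."""
    errors = []
    meta = parse_meta(text)

    # Rule 1: required meta keys must be present and non-empty.
    for key in sorted(REQUIRED_KEYS):
        if not meta.get(key):
            errors.append(f"missing meta key: {key}")

    # Rule 2: the file must live under content/<year>/<month>/ matching its date.
    date = meta.get("date", "")
    if re.match(r"^\d{4}-\d{2}-\d{2}", date):
        expected = ("content", date[0:4], date[5:7])
        if PurePosixPath(path).parts[:3] != expected:
            errors.append(f"expected file under {'/'.join(expected)}/")

    # Rule 3: every tag must be on the allowlist.
    for tag in filter(None, (t.strip() for t in meta.get("tags", "").split(","))):
        if tag not in ALLOWED_TAGS:
            errors.append(f"tag not in allowlist: {tag}")

    return errors
```

Returning a list of problems rather than raising on the first one lets an author see every issue in a single lint run.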

Linting is done via pre-commit which calls a custom Python script to validate blog post Markdown files. The .pre-commit-config.yaml looks something like this:

minimum_pre_commit_version: 1.21.0
repos:
  - repo: meta
    hooks:
      - id: check-hooks-apply
      - id: check-useless-excludes

  - repo: local
    hooks:
      - id: validate-content
        name: Validate blog content
        language: system
        # run with poetry to get dependencies (Pelican)
        entry: poetry run ./validate-content.py
        types: [markdown]
        exclude: ^content/pages/.*\.md$

  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v3.1.0
    hooks:
      - id: check-added-large-files
      - id: end-of-file-fixer
      - id: trailing-whitespace
      - id: mixed-line-ending

Zalando's CI/CD system automatically lints all files by executing make lint.

Writing a blog post

Anybody at Zalando can pitch a blog post idea by creating an issue in the git repo:

Blog post pitch: new issue

Bootstrapping a new blog post looks like this:

hjacobs@ZALANDO-123:~/workspace/engineering-blog$ make new
poetry run ./scripts/new-post.py
This will create a new blog post, please answer a few questions..
Title of blog post: Launching the Engineering Blog
Slug [launching-the-engineering-blog]:
Date (estimated) of publishing [2020-07-04]:
Author names (separate with semicolon) [Henning Jacobs]:
Author titles (separate with semicolon) [Senior Principal Engineer]:
========================================
Title:         Launching the Engineering Blog
Slug:          launching-the-engineering-blog
Authors:       Henning Jacobs
Author Titles: Senior Principal Engineer
Date:          2020-07-04
URL:           /posts/2020/07/launching-the-engineering-blog.html
========================================
Does this look correct? Answer 'y' or 'n': y
Creating content/2020/07/launching-the-engineering-blog/2020-07-04-launching-the-engineering-blog.md ..

Useful commands:
- make devserver    Start local webserver, find your draft on http://localhost:8000/drafts/
- make lint         Validate content and formatting.

Please edit your article in content/2020/07/launching-the-engineering-blog/2020-07-04-launching-the-engineering-blog.md
and don't forget to open a PR :-)
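
The slug, file path, and URL in the transcript follow a regular pattern that can be sketched in a few lines of Python. The helper names slugify and post_paths are hypothetical, and the real new-post.py is not shown in this post; the path and URL layouts are simply read off the output above.

```python
import datetime
import re


def slugify(title):
    """Lowercase the title and join runs of letters and digits with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", title.lower()))


def post_paths(title, date):
    """Derive the Markdown file path and public URL for a new post."""
    slug = slugify(title)
    folder = f"content/{date:%Y}/{date:%m}/{slug}"
    return {
        "slug": slug,
        "file": f"{folder}/{date:%Y-%m-%d}-{slug}.md",
        "url": f"/posts/{date:%Y}/{date:%m}/{slug}.html",
    }


paths = post_paths("Launching the Engineering Blog", datetime.date(2020, 7, 4))
print(paths["file"])
print(paths["url"])
```

Deriving both the folder and the URL from the same date keeps the file layout in line with the year/month rule that the linter enforces.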

Opening a PR to the Engineering Blog repository will trigger a build (make html) on our Zalando Continuous Delivery Platform. The PR build will publish a preview of the blog under a private (authenticated) URL.

Once the PR is merged, the blog post is automatically published on the live site engineering.zalando.com.

Serving static HTML

Zalando's Continuous Delivery Platform has a built-in feature to upload files to a given S3 bucket. This feature is used to upload all files from the output directory (generated by Pelican) to the blog's S3 bucket. The S3 bucket is created via CloudFormation which also configures the S3 website:

AWSTemplateFormatVersion: 2010-09-09
Metadata:
  StackName: "engineering-blog"
  Tags:
    application: "engineering-blog"
Resources:
  S3Bucket:
    Type: AWS::S3::Bucket
    Properties:
      BucketName: "<BUCKET-NAME>"
      AccessControl: PublicRead
      WebsiteConfiguration:
        IndexDocument: index.html
        ErrorDocument: error.html
    DeletionPolicy: Retain
  BucketPolicy:
    Type: AWS::S3::BucketPolicy
    Properties:
      PolicyDocument:
        # ...

The WebsiteConfiguration property will make the bucket contents available on http://<BUCKET-NAME>.s3-website.<REGION>.amazonaws.com. The S3 website only provides an HTTP endpoint (no SSL) and not a domain we would want to use publicly.

One way to serve the contents with a custom domain and SSL is to create a CloudFront web distribution. I decided not to use CloudFront because all the required infrastructure for the domain and SSL was already in place.

We run Skipper as the Kubernetes Ingress proxy in all our 140+ Kubernetes clusters. External DNS automatically configures the DNS name, and the Kubernetes Ingress Controller for AWS configures the AWS ALB with the right ACM SSL certificate. So let's reuse this infrastructure and let Skipper proxy all requests to the S3 website bucket endpoint. This can be achieved by adding a default Skipper route as an Ingress annotation:

apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: "engineering-blog"
  labels:
    application: "engineering-blog"
  annotations:
    zalando.org/skipper-routes: |
      redirect_app_default: * -> compress() -> setDynamicBackendUrl("http://<BUCKET-NAME>.s3-website.<REGION>.amazonaws.com") -> <dynamic>;
spec:
  rules:
  - host: "engineering.zalando.com"
    http:
      paths:
      - backend:
          serviceName: "engineering-blog"
          servicePort: 80

Skipper's compress() filter enables response compression (gzip or deflate, chosen from the client's Accept-Encoding header), as the S3 website endpoint does not compress responses out of the box. The ACM certificate, HTTP/2 support, the S3 website response, and the enabled compression (deflate in this case) are visible in a curl request (output shortened):

$ curl -v --compressed https://engineering.zalando.com -o /dev/null
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* Server certificate:
*  subject: CN=engineering.zalando.com
*  subjectAltName: host "engineering.zalando.com" matched cert's "engineering.zalando.com"
*  issuer: C=US; O=Amazon; OU=Server CA 1B; CN=Amazon
*  SSL certificate verify ok.
> GET / HTTP/2
> Host: engineering.zalando.com
> user-agent: curl/7.68.0
> accept: */*
> accept-encoding: deflate, gzip, br
< HTTP/2 200
< content-type: text/html
< content-encoding: deflate
< etag: "304fcc9c31aac19255bf1d84669059df"
< last-modified: Sat, 27 Jun 2020 07:23:19 GMT
< server: AmazonS3
< vary: Accept-Encoding
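
To get a feel for what the content-encoding header in the output above buys us, the following sketch compresses a repetitive HTML snippet with Python's standard library. It only illustrates the effect of the two encodings on markup; it does not reproduce Skipper's implementation, and the sample markup is made up.

```python
import gzip
import zlib

# Made-up sample markup; real pages will compress differently.
html = ("<article><h2>Post title</h2><p>Some summary text.</p></article>\n" * 200).encode("utf-8")

# "Content-Encoding: deflate" is defined as the zlib format (RFC 1950 wrapping RFC 1951 data).
deflated = zlib.compress(html)
# "Content-Encoding: gzip" uses the gzip container around the same DEFLATE algorithm.
gzipped = gzip.compress(html)

print(f"original: {len(html)} bytes")
print(f"deflate:  {len(deflated)} bytes")
print(f"gzip:     {len(gzipped)} bytes")

# Both encodings are lossless: decompressing yields the original bytes.
assert zlib.decompress(deflated) == html
assert gzip.decompress(gzipped) == html
```

Both encodings use the same DEFLATE algorithm and differ only in their container format, so the sizes should be close to each other.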

Performance

The static website should be fast. So let's test. We can use Vegeta for some basic HTTP load testing. 60ms as p99 latency looks good:

$ echo "GET https://engineering.zalando.com/" | vegeta attack -duration=60s | vegeta report
Requests      [total, rate, throughput]         3000, 50.02, 50.00
Duration      [total, attack, wait]             59.995s, 59.98s, 15.246ms
Latencies     [min, mean, 50, 90, 95, 99, max]  12.418ms, 19.751ms, 17.049ms, 25.05ms, 38.382ms, 59.958ms, 244.094ms
Bytes In      [total, mean]                     51441000, 17147.00
Bytes Out     [total, mean]                     0, 0.00
Success       [ratio]                           100.00%
Status Codes  [code:count]                      200:3000
Error Set:
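
The headline figures in the report can be cross-checked with a little arithmetic. The sketch below assumes Vegeta's definitions of rate (requests divided by the attack duration) and throughput (successful requests divided by the total duration); the input numbers are copied from the output above.

```python
# Figures copied from the Vegeta report above.
requests = 3000
successes = 3000          # the success ratio was 100%
attack_seconds = 59.98
total_seconds = 59.995    # attack duration plus the wait for the final response
bytes_in_total = 51_441_000

rate = requests / attack_seconds
throughput = successes / total_seconds
mean_bytes = bytes_in_total / requests

print(f"rate:       {rate:.2f} requests/s")
print(f"throughput: {throughput:.2f} requests/s")
print(f"mean size:  {mean_bytes:.2f} bytes per response")
```

The computed values agree with the reported 50.02, 50.00, and 17147.00, so every request returned the same roughly 17 KB of response body.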

The user experience with a real browser is much more interesting. Chrome Lighthouse can be used to assess the page performance. Google's PageSpeed Insights uses Lighthouse for its score calculation. Running PageSpeed Insights for the blog reports a nice score of 100 out of 100 (desktop):

PageSpeed Insights for https://engineering.zalando.com/

Thanks go out to our Employer Branding colleagues who created the design and implemented the responsive HTML/CSS layout!

Summary

I hope this blog post gives you some inspiration for setting up your own blog with Pelican or some other static site generator. After re-launching our Engineering Blog, our main focus will be providing regular and high quality content. We still have to figure out the best way to source, review, and schedule blog posts.

Follow ZalandoTech on Twitter and subscribe to the Atom/RSS feed to get the latest articles.


