A French version of this post is available.

Moby Dock, the Docker whale mascot

I strive to self-host my services securely, so I wanted to serve my website content over TLS.

As I don't want to pay for a certificate for my domain, I chose to use the free Let's Encrypt Certificate Authority.

Getting a Let's Encrypt certificate

To get a Let's Encrypt certificate, the Let's Encrypt Certificate Authority needs us to prove that we own the domain for which we need a certificate.

There are different ways to prove to the CA that I control lenain.info. I chose to provision HTTP resources under a well-known URI on this domain.

To do so, I pointed my DNS records to my public IP and forwarded HTTP and HTTPS traffic to my home server.

The rest of this blog post uses example.com rather than lenain.info.

The ACME protocol

The ACME protocol involves a Let's Encrypt agent, the Let's Encrypt CA, and several steps to ensure that we control the domain:

  • The agent generates a cryptographic key pair.
  • It then asks the CA to use the HTTP resource provisioning check scheme.
  • The CA provides challenges for the agent to publish as HTTP resources at a well-known URI served over HTTP on the requested domain.
  • The CA also provides a nonce that the agent must sign to prove that it controls the key pair.
  • The agent publishes the challenges, signs the nonce, and then asks the CA to verify them.
  • Once the CA has downloaded the challenges and verified the signature on the nonce, the agent's public key is authorized to do certificate management.
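
As a rough illustration of the HTTP check, the sketch below builds the URL that the CA fetches and the matching file path under a web root. The /.well-known/acme-challenge/ prefix comes from the ACME specification (RFC 8555); the function names and the sample token are made up for this example, and certbot does all of this for us.

```shell
# Hypothetical helpers, for illustration only: certbot handles this itself.

# URL the CA downloads to validate a challenge token for a domain.
challenge_url() {
  domain=$1
  token=$2
  printf 'http://%s/.well-known/acme-challenge/%s\n' "$domain" "$token"
}

# File the agent must create under the web root so that the URL resolves.
challenge_file() {
  webroot=$1
  token=$2
  printf '%s/.well-known/acme-challenge/%s\n' "$webroot" "$token"
}

challenge_url example.com sample-token
challenge_file /var/www/certbot sample-token
```

The second function shows why the web server only has to expose one directory: every challenge ends up as a plain file below it.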

Serving the challenges

As we will use the EFF certbot agent, let's create a directory to hold all of our Let's Encrypt data on the server.

mkdir -p /home/lenain/letsencrypt/data/certbot/{conf,www}

Then we use the following nginx configuration to serve the Let's Encrypt challenges that certbot will create:

server {
  listen 80;
  server_name example.com;

  location /.well-known/acme-challenge/ {
    root /var/www/certbot;
  }

  location / {
    root   /usr/share/nginx/html;
    index  index.html index.htm;
  }
}

The next step is to run nginx with Docker, using this configuration file to serve the content of the directory.

docker run -v "$(pwd)"/nginx.conf:/etc/nginx/conf.d/default.conf:ro \
  -v "$(pwd)"/data/certbot/www:/var/www/certbot:ro \
  -p 80:80 \
  nginx
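
Before involving certbot, it can be worth checking that nginx really serves files from the challenge directory. This is a small sketch, assuming it runs from the same directory as the docker run command and that curl is installed; the function names and the probe file name are arbitrary.

```shell
# Write a throwaway file where certbot will later put its challenges.
# $1: host path of the certbot www directory, $2: file name (arbitrary).
write_probe() {
  mkdir -p "$1/.well-known/acme-challenge"
  printf 'probe-ok\n' > "$1/.well-known/acme-challenge/$2"
}

# Fetch the probe over HTTP; a reply of "probe-ok" means nginx serves the directory.
# $1: domain name, $2: file name.
fetch_probe() {
  curl -fsS "http://$1/.well-known/acme-challenge/$2"
}

# Usage, once nginx is running:
#   write_probe "$(pwd)/data/certbot/www" probe.txt
#   fetch_probe example.com probe.txt
#   rm "$(pwd)/data/certbot/www/.well-known/acme-challenge/probe.txt"
```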

Now that the content of the directory is served, let's start the ACME process.

Generating the certificate

We now launch certbot, also with Docker, on the server receiving the HTTP traffic:

docker run -v "$(pwd)"/data/certbot/conf:/etc/letsencrypt \
  -v "$(pwd)"/data/certbot/www:/var/www/certbot \
  --entrypoint "certbot" \
  certbot/certbot \
    certonly \
      --webroot \
      --webroot-path /var/www/certbot \
      --non-interactive \
      --staging \
      --email 'email@example.com' \
      --no-eff-email \
      --domains example.com --domains blog.example.com \
      --rsa-key-size 4096 \
      --agree-tos \
      --force-renewal

We mount the certbot conf and www directories as writable Docker volumes, so that we can retrieve the generated certificates and so that certbot can write the challenges it receives.

We specify:

  • --webroot: Use the webroot plugin, which stores the challenges as files under an existing web root
  • --webroot-path: The web root directory where certbot will store the challenge files
  • --non-interactive: That we don't want to interact with certbot
  • --staging: That we want to use the Let's Encrypt staging server (once we are ready to generate our final certificate, we remove this flag)
  • --email: E-mail address used for registration and recovery contact
  • --no-eff-email: Do not share the e-mail address with the EFF
  • --domains: Domain names to include in the certificate; the first one is used as the Common Name of the certificate
  • --rsa-key-size: Use a 4096-bit RSA key
  • --agree-tos: Agree to the ACME server's Subscriber Agreement
  • --force-renewal: If a certificate already exists for the requested domain, renew it now

Now, in the data/certbot/conf/live/example.com/ directory, we have the private key file privkey.pem for the certificate and the full chain file fullchain.pem, which contains all certificates including the server certificate.
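
A quick sanity check of that directory might look like the sketch below. The function name is made up; the file names are the two mentioned in this post.

```shell
# Return 0 when both files needed by the web server are present and non-empty.
# $1: path of the live directory for the domain.
check_live_dir() {
  for f in privkey.pem fullchain.pem; do
    if [ ! -s "$1/$f" ]; then
      echo "missing or empty: $1/$f" >&2
      return 1
    fi
  done
  echo "ok: $1"
}

# Usage (reading the private key may require root):
#   check_live_dir "$(pwd)/data/certbot/conf/live/example.com"
# The expiry date can then be read with:
#   openssl x509 -in "$(pwd)/data/certbot/conf/live/example.com/fullchain.pem" -noout -enddate
```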

We can now discard this temporary nginx configuration and its Docker containers, as we don't need them anymore.

Serving the website using Docker Swarm

Architecture

To serve the website, we will use Docker Swarm services.

We will have two services:

  • An nginx service that will continue to serve the certbot www directory
  • Another nginx service that will serve our website and forward challenge requests to the first one

Creating the Docker Swarm overlay network

To let the two nginx services communicate, we create a Docker Swarm overlay network:

docker network create -d overlay --attachable onsen-naitwaurk

We make the network attachable so that standalone containers, started with docker run on any node of the swarm, can join it and communicate with containers running on other Docker daemons.

The well-known service

As we want to automate Let's Encrypt certificate renewal, we need a way to serve the Let's Encrypt challenges and nonces as we did when we first generated our certificate.

So we create a Docker Swarm service that will serve the well-known endpoint.

First, let's create a directory to hold this service configuration:

mkdir -p /srv/docker/onsen-naitwaurk/letsencrypt-well-known

We then create an nginx.conf configuration file in it:

server {
  listen 80;
  server_name well-known;

  location /.well-known/acme-challenge/ {
    root /var/www/certbot;
  }
}

And a docker-compose.yml Docker Compose file:

version: "3.7"
services:
  well-known:
    image: "nginx:alpine"
    volumes:
      - type: "bind"
        source: /srv/docker/onsen-naitwaurk/letsencrypt-well-known/nginx.conf
        target: /etc/nginx/conf.d/default.conf
        read_only: true
      - type: "bind"
        source: /home/lenain/letsencrypt/data/certbot/www
        target: /var/www/certbot
        read_only: true
    networks:
      - onsen-naitwaurk
    deploy:
      placement:
        constraints:
          - node.labels.letsencrypt == true
networks:
  onsen-naitwaurk:
    external: true

  • The service will be called "well-known".
  • It will use the latest Alpine-based nginx Docker image.
  • We bind the newly created nginx configuration file to the default path in the container as read-only.
  • We also bind the certbot www directory as read-only to serve it.
  • We let the service use the externally defined onsen-naitwaurk Docker Swarm network.
  • Finally, we set a placement constraint on the service so that it runs only on the node having the Let's Encrypt data.
  • We do not expose any external ports on this service. This will be the role of the website service.

On the node having the Let's Encrypt data we then set the constraint label:

$ docker node update --label-add letsencrypt=true kawaii

We can now deploy the service as a Docker Swarm stack:

$ docker stack deploy --compose-file docker-compose.yml letsencrypt-well-known

And check that the service is up and running:

$ docker service ps letsencrypt-well-known_well-known
ID                  NAME                                  IMAGE               NODE                DESIRED STATE       CURRENT STATE        ERROR               PORTS
uyk2nue3nsim        letsencrypt-well-known_well-known.1   nginx:alpine        kawaii              Running             Running 8 days ago
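
Since the service publishes no port, one way to test it is from a throwaway container attached to the overlay network, which the --attachable flag permits. The sketch below assumes the public curlimages/curl image and a probe file placed by hand in the certbot www directory; the function name is made up, and the DOCKER variable exists only so that the command can be dry-run with echo.

```shell
# Override with DOCKER=echo to print the command instead of running it.
DOCKER="${DOCKER:-docker}"

# Fetch a challenge file from the well-known service over the overlay network.
# $1: file name under .well-known/acme-challenge/
probe_well_known() {
  "$DOCKER" run --rm --network onsen-naitwaurk curlimages/curl \
    -fsS "http://well-known/.well-known/acme-challenge/$1"
}

# Usage, after placing probe.txt in the certbot www directory on the node:
#   probe_well_known probe.txt
```

If the file content comes back, both the bind mount and the name resolution of well-known inside the network are working.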

The website service

As we want this service to be distributed across the Docker Swarm, we need to bake both the nginx configuration and the Let's Encrypt certificates into a Docker image that we publish to our private registry.

Let's create directories to hold this service configuration:

mkdir -p /srv/docker/onsen-naitwaurk/example.com/{tls,public}

Dockerfile

Then we create a Dockerfile:

FROM nginx:alpine
COPY nginx.conf /etc/nginx/conf.d/default.conf
COPY tls/ /tls/
COPY public/ /usr/share/nginx/html/

RUN ln -sf /dev/stdout /var/log/nginx/access.log && ln -sf /dev/stderr /var/log/nginx/error.log
CMD ["nginx", "-g", "daemon off;"]

This Dockerfile will:

  • use the same Alpine-based nginx image as the well-known service previously described
  • copy our nginx configuration as the default one for the container
  • copy our TLS related data in the /tls directory
  • copy our public content in the default nginx directory
  • create symbolic links to let nginx log on stdout and stderr
  • run the nginx server

Nginx configuration

We then create our nginx.conf configuration file:

server {
  listen 80;
  server_name example.com;

  location / {
    return 301 https://$host$request_uri;
  }

  location /.well-known/acme-challenge/ {
    resolver 127.0.0.11 valid=10s;
    set $endpoint well-known;
    proxy_pass     http://$endpoint;
    proxy_redirect off;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Host $server_name;
  }
}

server {
  listen 443 ssl;
  server_name example.com;

  ssl_certificate         /tls/fullchain.pem;
  ssl_trusted_certificate /tls/fullchain.pem;
  ssl_certificate_key     /tls/privkey.pem;

  location / {
    root   /usr/share/nginx/html;
    index  index.html index.htm;
  }
}

The devil is in the details. Here we have two servers defined:

HTTP server

This server listens on the HTTP port for the example.com domain.

It serves the /.well-known/acme-challenge/ requests by proxying them to the well-known service previously defined in the same Docker Swarm network.

The proxy target is held in a variable so that nginx resolves the well-known name at request time through the Docker embedded DNS server (127.0.0.11) instead of once at startup. With valid=10s, the answer is cached for 10 seconds, so nginx finds the service again if its address changes in the Docker Swarm network.

Finally, it redirects all other requests to the HTTPS version of the service.

HTTPS server

This server listens on the HTTPS port for the example.com domain.

It uses the /tls directory to get the private key and certificate chain.
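
Once the service is deployed, the redirect can be checked from any machine. The helper below is a sketch that reads HTTP response headers on standard input, such as the output of curl -sI, and succeeds only for a 301 pointing to an https:// URL; its name is hypothetical.

```shell
# Succeeds when the headers on stdin are a 301 redirect to an https:// URL.
is_https_redirect() {
  awk '
    NR == 1 && $2 == "301"                           { status = 1 }
    tolower($1) == "location:" && $2 ~ /^https:\/\// { location = 1 }
    END { exit !(status && location) }
  '
}

# Usage:
#   curl -sI http://example.com/ | is_https_redirect && echo "redirects to HTTPS"
```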

Docker Compose

Here is the docker-compose.yml configuration file:

version: '3.7'
services:
  example-com:
    image: "registry.onsen.lan:5000/example_com"
    build: .
    ports:
      - "80:80"
      - "443:443"
    networks:
      - onsen-naitwaurk
    deploy:
      mode: replicated
      replicas: 2
networks:
  onsen-naitwaurk:
    external: true

  • The service will be called "example-com".
  • It will use the image that we build with Docker Compose and push to our private registry.
  • It will publish both the HTTP and HTTPS ports. With the default Docker Swarm routing mesh, they are reachable on every node of the swarm, and requests are forwarded to a node running the service.
  • We let the service use the externally defined onsen-naitwaurk Docker Swarm network.
  • Finally, we let it be deployed in replicated mode with 2 replicas in the swarm.

Now we can build our image and push it to the private registry with Docker Compose. But first, we copy the needed TLS files from the certbot directory.

$ cp /home/lenain/letsencrypt/data/certbot/conf/live/example.com/fullchain.pem tls/fullchain.pem
$ cp /home/lenain/letsencrypt/data/certbot/conf/live/example.com/privkey.pem tls/privkey.pem
$ docker-compose build
$ docker-compose push

We can now start the service, forwarding the registry authentication to Docker Swarm agents:

$ docker stack deploy --compose-file docker-compose.yml example-com --with-registry-auth

And check that the service is up and running:

$ docker service ps example-com_example-com
ID                  NAME                            IMAGE                                               NODE                DESIRED STATE       CURRENT STATE         ERROR               PORTS
732k5bd2a140        example-com_example-com.1       registry.onsen.lan:5000/example_com:latest   kawaii              Running             Running 8 days ago
r2tqzevlfq93        example-com_example-com.2       registry.onsen.lan:5000/example_com:latest   kissu               Running             Running 8 days ago

Automatically renewing the certificate

Let's create a certbot service in our Docker Swarm that will try to renew our Let's Encrypt certificate once per day to avoid certificate expiration.

First, let's create a directory to hold this service configuration:

mkdir -p /srv/docker/onsen-naitwaurk/letsencrypt-renew

We then create a docker-compose.yml Docker Compose file:

version: "3.7"
services:
  renew:
    image: "certbot/certbot"
    volumes:
      - type: "bind"
        source: /home/lenain/letsencrypt/data/certbot/conf
        target: /etc/letsencrypt
        read_only: false
      - type: "bind"
        source: /home/lenain/letsencrypt/data/certbot/www
        target: /var/www/certbot
        read_only: false
    entrypoint:
      - /bin/sh
      - -c
      - 'trap exit TERM; while true; do certbot renew; sleep 1d & wait $${!}; done;'
    deploy:
      placement:
        constraints:
          - node.labels.letsencrypt == true

  • The service will be called "renew".
  • It will use the latest certbot docker image.
  • We bind as read/write the certbot conf directory to let certbot write its new certificates and key pairs in it.
  • We also bind as read/write the www directory to let certbot write challenges and nonces in it.
  • We set a placement constraint on the service so that it runs only on the node having the Let's Encrypt data.
  • We change the entrypoint to run certbot renew once a day.
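
The entrypoint one-liner deserves a word: sleep runs in the background and the shell waits on it, so a TERM signal sent when the service stops interrupts the wait and triggers the trap right away, instead of after the sleep ends. In the Compose file, $${!} is simply $! with the dollar sign escaped. The sketch below reproduces the pattern as a function with a configurable command and interval; renew_loop and its parameters are names invented for this illustration.

```shell
# Run a command, then sleep, forever; exit promptly on SIGTERM.
# $1: command to run (a single word here), $2: interval understood by sleep.
renew_loop() {
  trap exit TERM
  while true; do
    "$1"
    sleep "$2" &   # background sleep, so the shell itself stays interruptible
    wait $!        # returns when sleep ends or when a trapped signal arrives
  done
}

# Usage, similar in spirit to the entrypoint of the service (never returns):
#   renew_loop date 86400
```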

Let's start the service:

docker stack deploy --compose-file docker-compose.yml letsencrypt-renew

And check that it is started:

$ docker service ps letsencrypt-renew_renew
ID                  NAME                        IMAGE                    NODE                DESIRED STATE       CURRENT STATE        ERROR               PORTS
ilq8u259hmwp        letsencrypt-renew_renew.1   certbot/certbot:latest   kawaii              Running             Running 8 days ago

We can also check the logs of renewal:

$ docker service logs letsencrypt-renew_renew
[...]
letsencrypt-renew_renew.1.ilq8u259hmwp@kawaii    | - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
letsencrypt-renew_renew.1.ilq8u259hmwp@kawaii    |
letsencrypt-renew_renew.1.ilq8u259hmwp@kawaii    | The following certs are not due for renewal yet:
letsencrypt-renew_renew.1.ilq8u259hmwp@kawaii    |   /etc/letsencrypt/live/example.com/fullchain.pem expires on 2020-02-14 (skipped)
letsencrypt-renew_renew.1.ilq8u259hmwp@kawaii    | No renewals were attempted.
letsencrypt-renew_renew.1.ilq8u259hmwp@kawaii    | - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Some final thoughts

Now my website is distributed across my Docker Swarm and protected through HTTPS.

However some things are left to do:

  • If my Let's Encrypt node is unavailable, I cannot renew my certificate anymore, as the certbot and well-known services would be down (and, worse, my private registry too!).
  • Even if the certificate is automatically renewed, I still need to rebuild my website service Docker image to include the new certificate and update the Docker Swarm service.
  • I still need to manually copy certificates, keys and chains from the Let's Encrypt data directory to the website Docker tls directory before rebuilding the image, so this build process must be executed on the node having the Let's Encrypt data.
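
A first step toward automating the last two points could be a script run on the Let's Encrypt node after each renewal. The sketch below has not been tried against a real swarm and rests on assumptions: the paths, image and service names are the ones used in this post, sync_file and refresh_website are made-up helpers, and forcing a service update is only one possible way to roll out the rebuilt image.

```shell
# Copy $1 to $2 only when the content differs; return 0 if a copy happened.
sync_file() {
  if cmp -s "$1" "$2"; then
    return 1            # identical: nothing to do
  fi
  cp "$1" "$2"
}

# Rebuild and redeploy the website image when the certificate has changed.
refresh_website() {
  live=/home/lenain/letsencrypt/data/certbot/conf/live/example.com
  site=/srv/docker/onsen-naitwaurk/example.com
  changed=0
  for f in fullchain.pem privkey.pem; do
    if sync_file "$live/$f" "$site/tls/$f"; then changed=1; fi
  done
  [ "$changed" -eq 1 ] || return 0
  docker-compose -f "$site/docker-compose.yml" build &&
    docker-compose -f "$site/docker-compose.yml" push &&
    docker service update --with-registry-auth --force \
      --image registry.onsen.lan:5000/example_com:latest \
      example-com_example-com
}

# Usage, for example from a daily cron job on the node holding the data:
#   refresh_website
```

This does not remove the dependency on a single node, but it turns the manual copy and rebuild into one repeatable step.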