
Docker Production Environment - Enterprise Edition and security features

What do we get with paid Docker? Is Docker EE easy to use?


In a production environment we should care about security, and we also want ease of use.
So it is good to know Docker's security options and Docker Enterprise Edition.


This guide assumes that we have Docker Enterprise Edition installed on a three-node cluster. Most of the features described here are paid ones, but we can get a free 30-day trial of Docker Enterprise Edition.

How to do it?
Look at the official docs:

Docker EE Installation
Docker EE Trial


Security Overview

Security components in Docker:

  • Secrets - encrypted store for passwords and other sensitive data that can be exposed to containers
  • Docker Content Trust - signing images protects against man-in-the-middle attacks
  • Security Scanning - images in DTR can be scanned for security vulnerabilities
  • Swarm Mode - most Swarm components are encrypted by default

Linux security components used by Docker:

  • seccomp
  • Mandatory Access Control (MAC)
  • Capabilities
  • Control Groups (cgroups)
  • Kernel namespaces

Namespaces

A Linux technology that lets us slice the OS into several logical OSes.
A Docker container has its own set of namespaces, which are isolated from other containers and from the Docker host itself.

What namespaces are available:

  • Process ID (pid) - isolation of the process tree - every container has its own processes
  • Network (net) - network stack isolation - interfaces, IPs, ports etc.
  • Mount (mnt) - root (/) filesystem isolation
  • Inter-process Communication (ipc) - shared memory isolation
  • User (user) (not enabled by default) - maps the root user in a container to a non-root user on the Docker host
  • UTS (uts) - isolation of hostnames and uname calls
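
To see these namespaces on a real system we can look at what the kernel exposes under /proc. The sketch below assumes a Linux host with /proc mounted; the helper names list_namespaces and namespace_id and the PROC_ROOT override are made up for this example.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting the namespaces of a process.
# PROC_ROOT is only an override for experiments; it defaults to /proc.

# List the namespace types a process belongs to (default: the caller).
list_namespaces() {
    pid="${1:-self}"
    ls -1 "${PROC_ROOT:-/proc}/$pid/ns"
}

# Print the identifier of one namespace, e.g. "pid:[4026531836]".
# Two processes share a namespace when their identifiers match.
namespace_id() {
    readlink "${PROC_ROOT:-/proc}/${2:-self}/ns/$1"
}
```

Running namespace_id pid once on the host and once for the PID of a containerized process should print two different identifiers, which is the isolation described above.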

Control Groups

Control groups are used to set limits on a container's resource consumption (CPU, memory, I/O).
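
As a minimal sketch of how such limits are set, the wrapper below starts a container with memory, CPU and process-count limits. The function name run_limited and the limit values are assumptions for illustration; the flags themselves are standard docker run options.

```shell
#!/bin/sh
# Hypothetical wrapper: start a detached container with cgroup-backed limits.
# 256 MB of memory, half a CPU and at most 100 processes are example values.
run_limited() {
    name="$1"; image="$2"
    docker run -d --name "$name" \
        --memory 256m \
        --cpus 0.5 \
        --pids-limit 100 \
        "$image"
}
```

For example, run_limited web httpd starts a limited web server, and docker stats --no-stream web should then show the memory limit next to the current usage.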

Capabilities

Linux capabilities let us restrict specific privileged kernel operations, even for the root user. Docker containers run as root by default, which may lead to some dangerous situations.
We can drop some capabilities from the container's root user when starting our containers to make the Docker host more secure.
What can a capability be?

  • CAP_CHOWN - ability to change file ownership (chown)
  • CAP_NET_BIND_SERVICE - binding to low ports (below 1024)
  • CAP_SYS_BOOT - rebooting the host
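
A minimal sketch of dropping capabilities at container start: remove everything, then add back only what the workload needs. The function name run_min_caps is made up, and which capabilities a given image really needs is an assumption we have to verify per image (many images need more than NET_BIND_SERVICE to start).

```shell
#!/bin/sh
# Hypothetical wrapper: start a container with a minimal capability set.
# --cap-drop ALL removes every capability; --cap-add gives one back.
run_min_caps() {
    name="$1"; image="$2"
    docker run -d --name "$name" \
        --cap-drop ALL \
        --cap-add NET_BIND_SERVICE \
        "$image"
}
```

If the container fails to start, its logs usually point to the missing operation, and we can add the matching capability instead of falling back to the full default set.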

Mandatory Access Control

A well-known Linux MAC system is SELinux, which we love to setenforce 0 :) We can start containers with an SELinux policy applied - by default we get one that should be fine in most cases.

seccomp

Profiles which filter dangerous system calls made by containers.
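
To illustrate, the sketch below writes a small custom profile and starts a container with it. The helper names and the choice of blocked system calls are assumptions for this example; Docker's built-in default profile is stricter than this one, so treat it only as a demonstration of the mechanism.

```shell
#!/bin/sh
# write_demo_profile and run_with_seccomp are hypothetical helpers.

# Write a demonstration profile: allow everything except a few system calls.
write_demo_profile() {
    cat > "$1" <<'EOF'
{
  "defaultAction": "SCMP_ACT_ALLOW",
  "syscalls": [
    {
      "names": ["reboot", "swapon", "swapoff"],
      "action": "SCMP_ACT_ERRNO"
    }
  ]
}
EOF
}

# Start a container with the given profile instead of the default one.
run_with_seccomp() {
    profile="$1"; image="$2"
    docker run -d --security-opt "seccomp=$profile" "$image"
}
```

In most cases it is safer to keep the default profile and only write a custom one when an application needs a call that the default blocks.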

Swarm security

Swarm security includes:

  • Crypto node ID
  • TLS authentication for nodes - certificate added to node after joining cluster
  • Secure join tokens - one for managers and one for workers
  • Encrypted DB with metadata
  • Encrypted management networking
  • CA and certificate rotation
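
The join tokens from the list above can be read and rotated from a manager node. A minimal sketch, with made-up helper names around the standard docker swarm join-token command:

```shell
#!/bin/sh
# Hypothetical helpers; the role argument is "worker" or "manager".

# Print the current join token for a role.
join_token() {
    docker swarm join-token -q "$1"
}

# Invalidate the old token for a role and print the new one.
rotate_join_token() {
    docker swarm join-token --rotate -q "$1"
}
```

Rotating a token does not affect nodes that have already joined, so it is a cheap step to take when a token may have leaked.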

Check TLS node certificate

Subject:

  • O - Swarm ID
  • OU - Swarm node role
  • CN - Node ID

[root@docker-host1 ~]# openssl x509 -in /var/lib/docker/swarm/certificates/swarm-node.crt -text
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number:
            15:70:f3:02:11:f6:41:41:e3:10:4c:84:74:f8:3b:a7:b2:37:27:c5
        Signature Algorithm: ecdsa-with-SHA256
        Issuer: CN = swarm-ca
        Validity
            Not Before: May 12 16:09:00 2020 GMT
            Not After : Aug 10 17:09:00 2020 GMT
        Subject: O = rmozotgo6u1elkg2r0b1r1fzh, OU = swarm-manager, CN = q6oweiw4l87n292hn4cyekc7x
        Subject Public Key Info:
            Public Key Algorithm: id-ecPublicKey
                Public-Key: (256 bit)
                pub:
                 <>
                ASN1 OID: prime256v1
                NIST CURVE: P-256
        X509v3 extensions:
            X509v3 Key Usage: critical
                Digital Signature, Key Encipherment
            X509v3 Extended Key Usage:
                TLS Web Server Authentication, TLS Web Client Authentication
            X509v3 Basic Constraints: critical
                CA:FALSE
            X509v3 Subject Key Identifier:
                C9:5D:5B:42:2E:EA:7C:79:9A:69:B6:2D:B2:57:91:B9:72:76:C2:22
            X509v3 Authority Key Identifier:
                keyid:27:BE:CB:1C:87:0D:ED:35:7D:EB:16:4C:1E:80:60:3E:C0:DE:EB:70

            X509v3 Subject Alternative Name:
                DNS:swarm-manager, DNS:q6oweiw4l87n292hn4cyekc7x, DNS:swarm-ca
    Signature Algorithm: ecdsa-with-SHA256
         <>

Change TLS certificate rotation period

[root@docker-host1 ~]# docker swarm update --cert-expiry 720h
Swarm updated.

Secrets

Secrets allow us to store important data such as passwords in the encrypted cluster store in Swarm. This data can be decrypted and mounted in containers under /run/secrets as plain-text files.

Create secret

From Swarm manager node:

[root@docker-host1 ~]# echo "SuperSecretPassword" | docker secret create ImportantPassword -
lu126735e618m51qp6u8jc5fl

List secrets

[root@docker-host1 ~]# docker secret ls
ID                          NAME                DRIVER              CREATED             UPDATED
lu126735e618m51qp6u8jc5fl   ImportantPassword                       6 seconds ago       6 seconds ago
ei5zfeqp6f2aolcygkiqmcwrb   ucp-auth-key                            2 days ago          2 days ago

Create service with secret mounted

[root@docker-host1 ~]# docker service create --name web_server --replicas 2 --secret ImportantPassword httpd
xvkr9xbmjayy99l1u9tcrruxv
overall progress: 2 out of 2 tasks
1/2: running   [==================================================>]
2/2: running   [==================================================>]
verify: Service converged
[root@docker-host1 ~]# docker exec -it web_server.1.kooxu73k13sthobea9f6p8542 bash

root@0b93e926dc4a:/usr/local/apache2# ls -lah /run/secrets/
total 4.0K
drwxr-xr-x. 2 root root 31 May 14 20:20 .
drwxr-xr-x. 1 root root 21 May 14 20:20 ..
-r--r--r--. 1 root root 20 May 14 20:20 ImportantPassword

root@0b93e926dc4a:/usr/local/apache2# cat /run/secrets/ImportantPassword
SuperSecretPassword
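
Inside the container an application or entrypoint script can read the mounted file directly. A minimal sketch, assuming the default /run/secrets mount point; the function name read_secret and the SECRETS_DIR override are made up for this example:

```shell
#!/bin/sh
# Hypothetical helper: print the value of a mounted Swarm secret.
# SECRETS_DIR is only an override for testing; /run/secrets is the default.
read_secret() {
    file="${SECRETS_DIR:-/run/secrets}/$1"
    if [ ! -r "$file" ]; then
        echo "secret '$1' is not available" >&2
        return 1
    fi
    cat "$file"
}
```

An entrypoint could then use DB_PASSWORD="$(read_secret ImportantPassword)" instead of receiving the password through an environment variable set on the service.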

Docker Enterprise

Docker EE includes:

  • Universal Control Plane (UCP)
  • Docker EE engine
  • Docker Trusted Registry (DTR)

UCP

UCP is a web GUI for Docker. It is used to manage the Docker engine and some enterprise features, and it ships as a container for easy deployment.
If we have the Docker EE binaries installed, we can deploy UCP with a single docker run command. Before doing so, we have to be sure that:

  • all nodes have NTP set up for time synchronization
  • all nodes have a static IP address
  • all nodes have a resolvable DNS name
  • there is an odd number of managers for HA (3 or 5; 5 is the best option)
  • you have some VIPs and a load balancer to round-robin calls to the UCP/DTR GUI
  • managers are spread across multiple DC availability zones
  • all nodes meet the requirements:
    • minimum:
      • UCP Manager with DTR: 8GB RAM, 3GB disk space
      • UCP Worker: 4GB RAM, 3GB disk space
    • recommended:
      • UCP Manager with DTR: 8GB RAM, 4vCPU, 100GB disk space
      • UCP Worker: 4GB RAM, 25-100GB disk space

For setup I advise following the official docs. After following them we get a Swarm cluster that is managed by UCP, with DTR deployed on it.

Docker UCP Setup Docs
Docker Trusted Registry Setup Docs

After installation we should see on the manager node:

[root@docker-host1 ~]# docker service ls
ID                  NAME                     MODE                REPLICAS            IMAGE                        PORTS
begjzklu24xp        ucp-auth-api             global              1/1                 docker/ucp-auth:3.2.6
re8y924yflm6        ucp-auth-worker          global              1/1                 docker/ucp-auth:3.2.6
4fouy44hjo1b        ucp-cluster-agent        replicated          1/1                 docker/ucp-agent:3.2.6
dmm1i3un93x7        ucp-manager-agent        global              1/1                 docker/ucp-agent:3.2.6
az4tgr8y1fmy        ucp-worker-agent-win-x   global              0/0                 docker/ucp-agent-win:3.2.6
g4i7oduf9zgv        ucp-worker-agent-x       global              2/2                 docker/ucp-agent:3.2.6

If you are not familiar with moving around a Docker Swarm cluster and its services, look at:

Docker Swarm Complete Guide


What do these services do?

  • ucp-agent - main UCP agent, capable of scheduling containers on nodes
  • ucp-etcd - persistent cluster database
  • ucp-auth - authentication service, which makes it possible to enable SSO
  • ucp-proxy - controls access to the local Docker socket
  • ucp-swarm - communication with the Swarm cluster on which UCP is deployed

By default UCP creates its own CAs for internal and external connections.
For production we can use our own certificates (ca.pem, cert.pem, key.pem); we can supply them when starting UCP with the flag --external-server-cert.
With this flag UCP takes the certificates from a volume, which should be named ucp-controller-server-certs.
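
One way to prepare that volume before starting the UCP installer is sketched below. The helper name load_ucp_certs is made up, and using a throwaway alpine container to copy the files is an assumption of this example, not something UCP requires; any method that puts the three files into the volume should do.

```shell
#!/bin/sh
# Hypothetical helper: copy our own certificates into the volume
# that UCP reads them from. cert_dir must be an absolute path.
load_ucp_certs() {
    cert_dir="$1"   # directory containing ca.pem, cert.pem and key.pem
    for f in ca.pem cert.pem key.pem; do
        [ -f "$cert_dir/$f" ] || { echo "missing $cert_dir/$f" >&2; return 1; }
    done
    docker volume create ucp-controller-server-certs &&
    docker run --rm \
        -v "$cert_dir":/src:ro \
        -v ucp-controller-server-certs:/certs \
        alpine cp /src/ca.pem /src/cert.pem /src/key.pem /certs/
}
```

For example, load_ucp_certs /root/certs on the node where UCP will be installed.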

Accessing the UCP dashboard - adding a new node


When we log in for the first time we should first upload the license file: docker_subscription.lic.

It can be downloaded from Docker Hub, from the page with the details of our subscription.


We should use https://<docker_host_with_ucp>:443 to access UCP.
After logging in we get the dashboard:

Docker EE Dashboard

In Shared Resources -> Nodes we can click Add Node:

Docker EE Dashboard

UCP will give us the command to add a manager or worker to the swarm:

Docker EE Dashboard

After running the command, UCP will deploy the necessary containers onto the new node:

[root@docker-host3 ~]# docker container ls
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES

[root@docker-host3 ~]# docker swarm join --token SWMTKN-1-0f2m4q5xac2s7ba304t61wi100h6oiisqoiqh8ui1ji31vytcz-d68hhgvwjrkidprzz2282dapc 10.10.10.20:2377
This node joined a swarm as a worker.

[root@docker-host3 ~]# docker container ls
CONTAINER ID        IMAGE                        COMMAND                  CREATED              STATUS                        PORTS                                                                       NAMES
511c2be45d61        dd89cabc02dd                 "/install-cni.sh"        54 seconds ago       Up 53 seconds                                                                                             k8s_install-cni_calico-node-2kf4p_kube-system_6f7ae121-9696-11ea-bb79-0242ac110009_0
ee56dd0e1507        40091fdbb1b4                 "/bin/entrypoint.sh"     55 seconds ago       Up 54 seconds                                                                                             k8s_calico-node_calico-node-2kf4p_kube-system_6f7ae121-9696-11ea-bb79-0242ac110009_0
275e1706ccf5        docker/ucp-pause:3.2.6       "/pause"                 56 seconds ago       Up 55 seconds                                                                                             k8s_POD_calico-node-2kf4p_kube-system_6f7ae121-9696-11ea-bb79-0242ac110009_0
6b960e1b63f9        docker/ucp-hyperkube:3.2.6   "/bin/kubeproxy_entr…"   About a minute ago   Up About a minute                                                                                         ucp-kube-proxy
09018980245c        docker/ucp-hyperkube:3.2.6   "/bin/kubelet_entryp…"   About a minute ago   Up About a minute                                                                                         ucp-kubelet
3e39596860d7        docker/ucp-agent:3.2.6       "/bin/ucp-agent prox…"   About a minute ago   Up About a minute (healthy)   0.0.0.0:6444->6444/tcp, 0.0.0.0:12378->12378/tcp, 0.0.0.0:12376->2376/tcp   ucp-proxy
b30f7f904b63        docker/ucp-agent:3.2.6       "/bin/ucp-agent node…"   About a minute ago   Up About a minute             2376/tcp                                                                    ucp-worker-agent-x.5u5lxvnfvv98bhnvpgm1h77gr.qsbes3hnkqvpenfy6x6v2ovyn
[root@docker-host3 ~]#

Client bundles - management commands protection

UCP blocks direct use of the docker CLI: ucp-proxy secures the socket.
You can perform actions only from UCP after a successful login, or from the docker CLI if you have a valid certificate.

We get the certificate from UCP; it is part of the UCP Client Bundle.
Go to <your_user> -> My Profile

Docker EE Dashboard

Click on New Client Bundle -> Generate Client Bundle:

Docker EE Dashboard

We get a Client Bundle zip downloaded, which we can send to our host with the docker CLI:

Docker EE Dashboard

We unzip the bundle in our account on the OS:

[lukas@docker-host2 ~]$ unzip ucp-bundle-lukas.zip
Archive:  ucp-bundle-lukas.zip
 extracting: ca.pem
 extracting: cert.pem
 extracting: key.pem
 extracting: cert.pub
 extracting: env.sh
 extracting: env.ps1
 extracting: env.cmd
 extracting: kube.yml
 extracting: meta.json
 extracting: tls/kubernetes/ca.pem
 extracting: tls/kubernetes/cert.pem
 extracting: tls/kubernetes/key.pem
 extracting: tls/docker/ca.pem
 extracting: tls/docker/cert.pem
 extracting: tls/docker/key.pem

We can use the env.sh script to set the variables:

[lukas@docker-host2 ~]$ . env.sh

[lukas@docker-host2 ~]$ env | grep DOCKER
DOCKER_CERT_PATH=/home/lukas
DOCKER_TLS_VERIFY=1
DOCKER_HOST=tcp://10.10.10.20:443

After doing this we have a fully configured client connection. From one of the workers we connect to our manager with certificates (a secure connection).
The following variables are set:

  • DOCKER_HOST
  • DOCKER_TLS_VERIFY
  • DOCKER_CERT_PATH
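
For the docker CLI, this is roughly what happens when we set the three variables by hand. The sketch below is not the content of env.sh; use_bundle and leave_bundle are made-up helper names, and the port 443 matches the setup shown in this guide.

```shell
#!/bin/sh
# Hypothetical helpers for switching the docker CLI between UCP and the
# local engine.

# Point the docker CLI at UCP using an unpacked client bundle.
use_bundle() {
    bundle_dir="$1"      # directory with ca.pem, cert.pem and key.pem
    ucp_host="$2"        # address of a UCP manager, e.g. 10.10.10.20
    [ -f "$bundle_dir/cert.pem" ] || { echo "no cert.pem in $bundle_dir" >&2; return 1; }
    export DOCKER_TLS_VERIFY=1
    export DOCKER_CERT_PATH="$bundle_dir"
    export DOCKER_HOST="tcp://$ucp_host:443"
}

# Go back to talking to the local Docker socket.
leave_bundle() {
    unset DOCKER_TLS_VERIFY DOCKER_CERT_PATH DOCKER_HOST
}
```

Because the variables live only in the current shell session, opening a new terminal also brings us back to the local engine.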

We can test a command:

[lukas@docker-host2 ~]$ docker system info
Client:
 Debug Mode: false
 Plugins:
  cluster: Manage Docker clusters (Docker Inc., v1.2.0)

Server:
 Containers: 85
  Running: 59
  Paused: 0
  Stopped: 26
 Images: 49
 Server Version: ucp/3.2.6
 Role: primary
 Strategy: spread
 Filters: health, port, containerslots, dependency, affinity, constraint, whitelist
 Nodes: 3
  docker-host1.lukas.int: 10.10.10.20:12376
   └ ID: B7LV:MYVW:NSZK:3FLG:SSDE:R5ZZ:HZWD:6OKH:MQZ7:CMBT:XHHC:BTJG|10.10.10.20:12376
   └ Status: Healthy
   └ Containers: 59 (37 Running, 0 Paused, 22 Stopped)
   └ Reserved CPUs: 0 / 2
   └ Reserved Memory: 680 MiB / 8.011 GiB
   └ Labels: com.docker.security.seccomp=enabled, kernelversion=4.18.0-147.8.1.el8_1.x86_64, operatingsystem=CentOS Linux 8 (Core), ostype=linux, storagedriver=overlay2
   └ UpdatedAt: 2020-05-15T12:05:42Z
   └ ServerVersion: 19.03.5
  docker-host2.lukas.int: 10.10.10.21:12376
   └ ID: TB3A:VG63:QQ5G:HHEY:X3CR:7INY:DRHJ:5SEV:Z4LA:HBVD:AJ7O:TBK4|10.10.10.21:12376
   └ Status: Healthy
   └ Containers: 19 (15 Running, 0 Paused, 4 Stopped)
   └ Reserved CPUs: 0 / 2
   └ Reserved Memory: 0 B / 3.877 GiB
   └ Labels: com.docker.security.seccomp=enabled, kernelversion=4.18.0-147.8.1.el8_1.x86_64, operatingsystem=CentOS Linux 8 (Core), ostype=linux, storagedriver=overlay2
   └ UpdatedAt: 2020-05-15T12:05:40Z
   └ ServerVersion: 19.03.5
  docker-host3.lukas.int: 10.10.10.22:12376
   └ ID: 53HL:P433:NFCW:H2W2:VYO2:FBCK:OPCE:BO7Z:VJUL:O6MJ:PD3J:QEKD|10.10.10.22:12376
   └ Status: Healthy
   └ Containers: 7 (7 Running, 0 Paused, 0 Stopped)
   └ Reserved CPUs: 0 / 2
   └ Reserved Memory: 0 B / 3.877 GiB
   └ Labels: com.docker.security.seccomp=enabled, kernelversion=4.18.0-147.8.1.el8_1.x86_64, operatingsystem=CentOS Linux 8 (Core), ostype=linux, storagedriver=overlay2
   └ UpdatedAt: 2020-05-15T12:05:40Z
   └ ServerVersion: 19.03.5
 Cluster Managers: 1
  docker-host1.lukas.int: Healthy
   └ Orca Controller: https://10.10.10.20:443
   └ Classic Swarm Manager: tcp://10.10.10.20:2376
   └ Engine Swarm Manager: tcp://10.10.10.20:12376
   └ KV: etcd://10.10.10.20:12379
 Plugins:
  Volume:
  Network:
  Log:
 Swarm: active
  NodeID: q6oweiw4l87n292hn4cyekc7x
  Is Manager: true
  ClusterID: rmozotgo6u1elkg2r0b1r1fzh
  Managers: 1
  Nodes: 3
  Default Address Pool: 10.0.0.0/8
  SubnetSize: 24
  Data Path Port: 4789
  Orchestration:
   Task History Retention Limit: 5
  Raft:
   Snapshot Interval: 10000
   Number of Old Snapshots to Retain: 0
   Heartbeat Tick: 1
   Election Tick: 10
  Dispatcher:
   Heartbeat Period: 5 seconds
  CA Configuration:
   Expiry Duration: 4 weeks
   Force Rotate: 0
   External CAs:
     cfssl: https://10.10.10.20:12381/api/v1/cfssl/sign
  Autolock Managers: false
  Root Rotation In Progress: false
  Node Address: 10.10.10.20
  Manager Addresses:
   10.10.10.20:2377
 Kernel Version: 4.18.0-147.8.1.el8_1.x86_64
 Operating System: linux
 Architecture: amd64
 CPUs: 6
 Total Memory: 15.76GiB
 Name: ucp-controller-10.10.10.20
 ID: rmozotgo6u1elkg2r0b1r1fzh
 Docker Root Dir:
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
  com.docker.ucp.license_key=Dq3bSUuQfXCDg037VlZKDK22rQtWFxuGrhee8mx6Fq6E
  com.docker.ucp.license_max_engines=10
  com.docker.ucp.license_expires=2020-06-13 16:39:03 +0000 UTC
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false
 Product License: Quantity: 10 Nodes	Expiration date: 2020-06-13	License is currently active

Enterprise Features

Role-based access control (RBAC)

Object types:

  • subject - one or more users or teams
  • role - set of permissions
  • collection - set of resources

A grant in Docker is: Subject + Role + Collection

Creating grant

Log in as UCP Admin and first add a user:

Docker EE Dashboard

Docker EE Dashboard

Go to Access Control -> Orgs & Teams

Docker EE Dashboard

Click Create and add a new organisation; after that add a new team in it by clicking on the organisation and the plus button:

Docker EE Dashboard

Docker EE Dashboard

Enter the created team and add a user by clicking the plus in the top right corner:

Docker EE Dashboard

Go to Roles and create a custom role (select the Swarm tab):

Docker EE Dashboard

Give the new role a name and select privileges:

Docker EE Dashboard

Docker EE Dashboard

Create collection:

Docker EE Dashboard

Docker EE Dashboard

Create secret and add it to collection:

Docker EE Dashboard

Docker EE Dashboard

Docker EE Dashboard

Create grant:

Docker EE Dashboard

Docker EE Dashboard

LDAP Integration

It is possible to configure LDAP for Docker EE:

Docker EE Dashboard

Docker EE Dashboard

Docker Content Trust (DCT)

With DCT we can sign images before pushing them to a repository. Anyone who wants to use our image can check the signature to be sure that the image comes from us and not from some cracker.

Log in to Docker Hub or to your own registry:

[root@docker-host3 ~]# docker login
Login with your Docker ID to push and pull images from Docker Hub. If you don't have a Docker ID, head over to https://hub.docker.com to create one.
Username: <secret_login>
Password:
WARNING! Your password will be stored unencrypted in /root/.docker/config.json.
Configure a credential helper to remove this warning. See
https://docs.docker.com/engine/reference/commandline/login/#credentials-store

Login Succeeded

Enable image signing:

[root@docker-host3 ~]# export DOCKER_CONTENT_TRUST=1

Push image:

[root@docker-host3 ~]# docker push lbartnicki92/ubuntu:latest
The push refers to repository [docker.io/lbartnicki92/ubuntu]
8891751e0a17: Layer already exists
2a19bd70fcd4: Layer already exists
9e53fd489559: Layer already exists
7789f1a3d4e9: Layer already exists
latest: digest: sha256:5747316366b8cc9e3021cd7286f42b2d6d81e3d743e2ab571f55bcd5df788cc8 size: 1152
Signing and pushing trust metadata
You are about to create a new root signing key passphrase. This passphrase
will be used to protect the most sensitive key in your signing system. Please
choose a long, complex passphrase and be careful to keep the password and the
key file itself secure and backed up. It is highly recommended that you use a
password manager to generate the passphrase and keep it safe. There will be no
way to recover this key. You can find the key in your config directory.
Enter passphrase for new root key with ID d7db32d:
Repeat passphrase for new root key with ID d7db32d:
Enter passphrase for new repository key with ID e3b86b2:
Repeat passphrase for new repository key with ID e3b86b2:
Finished initializing "docker.io/lbartnicki92/ubuntu"
Successfully signed docker.io/lbartnicki92/ubuntu:latest

After this operation we have a root key and a repository key, located in:

[root@docker-host3 ~]# ls -lah .docker/trust/
total 0
drwx------. 4 root root  32 May 15 17:39 .
drwxr-xr-x. 4 root root  66 May 15 17:39 ..
drwx------. 2 root root 158 May 15 17:39 private
drwx------. 3 root root  23 May 15 17:39 tuf

The root key is needed to create new repositories, whereas the repository key is used to sign images.

Let’s try to pull an unsigned image with DOCKER_CONTENT_TRUST set:

[root@docker-host3 ~]# docker pull lbartnicki92/web:latest
Error: remote trust data does not exist for docker.io/lbartnicki92/web: notary.docker.io does not have trust data for docker.io/lbartnicki92/web

If you want to push an unsigned image as a one-off, you can use the --disable-content-trust flag.
Even if you have already pulled an unsigned image, you won’t be able to run a container from it while DOCKER_CONTENT_TRUST=1 is set in your shell session.
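
Both cases can be wrapped in small helpers. This is a sketch with made-up function names; is_signed assumes that docker trust inspect exits with a non-zero status when no trust data is found for the image, which is worth verifying on your Docker version.

```shell
#!/bin/sh
# Hypothetical helpers around the standard docker trust / docker push commands.

# Succeed only if trust data (signatures) can be found for the image.
is_signed() {
    docker trust inspect "$1" >/dev/null 2>&1
}

# Push one image without signing, regardless of DOCKER_CONTENT_TRUST.
push_unsigned() {
    docker push --disable-content-trust "$1"
}
```

For a human-readable view of the signers and signed tags we can also run docker trust inspect --pretty on the image.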

Additionally, from the UCP Admin page you can enforce the use of signed images only across the entire Docker Enterprise Swarm cluster.

Docker EE Dashboard

DTR

From the UCP Admin Settings page we can get the command to install DTR.
For a production-grade deployment we should have three replicas of DTR running on Swarm workers that run no containers other than the DTR ones.
It is also good to configure remote cloud storage (AWS, Microsoft Azure) for our image blobs - shared storage is what lets us run several replicas for HA DTR.

Docker EE Dashboard

[root@docker-host3 ~]# docker run -it --rm docker/dtr install \
>   --ucp-node docker-host3.lukas.int \
>   --ucp-username lukas \
>   --ucp-url https://10.10.10.20 \
>   --ucp-insecure-tls
INFO[0000] Beginning Docker Trusted Registry installation
ucp-password:
INFO[0003] Validating UCP cert
INFO[0003] Connecting to UCP
INFO[0003] health checking ucp
INFO[0003] The UCP cluster contains the following nodes without port conflicts: docker-host3.lukas.int, docker-host2.lukas.int
INFO[0004] Searching containers in UCP for DTR replicas
INFO[0004] Searching containers in UCP for DTR replicas
INFO[0004] verifying [80 443] ports on docker-host3.lukas.int
INFO[0012] Waiting for running dtr-phase2 container to finish
[..]


For a more detailed Docker Trusted Registry installation guide, follow:

Docker Trusted Registry Setup Docs


Backup Docker Enterprise Environment

We have to back up:

  • Swarm Cluster
  • UCP
  • DTR

Backup Swarm

What do we have to back up? The /var/lib/docker/swarm directory.
This directory is replicated to every manager, so we can take the backup from any one of them. In my lab I have three machines. For the purpose of this guide I promoted all of them to managers, because we have to stop the Docker engine on the node from which the backup will be made. In the previous exercises I had one manager, and we can’t stop the manager for a backup if we have only one. So it now looks like this:

[root@docker-host1 ~]# docker node ls
ID                            HOSTNAME                 STATUS              AVAILABILITY        MANAGER STATUS      ENGINE VERSION
q6oweiw4l87n292hn4cyekc7x *   docker-host1.lukas.int   Ready               Active              Leader              19.03.5
ociiukjiw31cbrjgxqx0amjl5     docker-host2.lukas.int   Ready               Active              Reachable           19.03.5
5u5lxvnfvv98bhnvpgm1h77gr     docker-host3.lukas.int   Ready               Active              Reachable           19.03.5

Let’s make the backup from docker-host3.
Shut down the Docker engine:

[root@docker-host3 ~]# systemctl stop docker

Make backup:

[root@docker-host3 ~]# tar -cvzf swarm.tar /var/lib/docker/swarm/
tar: Removing leading `/' from member names
/var/lib/docker/swarm/
/var/lib/docker/swarm/state.json
/var/lib/docker/swarm/certificates/
/var/lib/docker/swarm/certificates/swarm-root-ca.crt
/var/lib/docker/swarm/certificates/swarm-node.key
/var/lib/docker/swarm/certificates/swarm-node.crt
/var/lib/docker/swarm/docker-state.json
/var/lib/docker/swarm/worker/
/var/lib/docker/swarm/worker/tasks.db
/var/lib/docker/swarm/raft/
/var/lib/docker/swarm/raft/snap-v3-encrypted/
/var/lib/docker/swarm/raft/snap-v3-encrypted/0000000000000002-000000000000000e.snap
/var/lib/docker/swarm/raft/wal-v3-encrypted/
/var/lib/docker/swarm/raft/wal-v3-encrypted/0000000000000000-0000000000000000.wal

Start docker engine:

[root@docker-host3 ~]# systemctl start docker

Backup UCP

To back up UCP we have to choose a node, as in the previous example. This time we will do it with the Docker engine running. Still, making a UCP backup stops all UCP containers on that node, so we still need multiple managers for this.

Make a backup of UCP, password protected:

[root@docker-host3 ~]# docker container run \
    --rm \
    --log-driver none \
    --name ucp \
    --volume /var/run/docker.sock:/var/run/docker.sock \
    --volume /tmp:/backup \
    docker/ucp:3.2.6 backup \
    --file mybackup.tar \
    --passphrase "secret12chars" \
    --include-logs=false
Unable to find image 'docker/ucp:3.2.6' locally
3.2.6: Pulling from docker/ucp
4167d3e14976: Already exists
af325902296c: Already exists
1c053272dca9: Pull complete
Digest: sha256:8ef41e8fa4b40ede84fc8633a28f61ed0bae009e293aad9f1a9fb042fcab8688
Status: Downloaded newer image for docker/ucp:3.2.6
time="2020-05-15T12:35:57Z" level=info msg="Your engine version 19.03.5, build 2ee0c57608 (4.18.0-147.8.1.el8_1.x86_64) is compatible with UCP 3.2.6 (04ac981)"
time="2020-05-15T12:35:58Z" level=info msg="Loading UCP configuration"
INFO[0015] Beginning backup
INFO[0015] Backup completed successfully
[root@docker-host3 ~]# ls -lah /tmp
total 3.5M
drwxrwxrwt.  7 root root  142 May 15 14:38 .
dr-xr-xr-x. 17 root root  224 Nov 22 11:24 ..
drwxrwxrwt.  2 root root    6 May 15 11:24 .ICE-unix
drwxrwxrwt.  2 root root    6 May 15 11:24 .Test-unix
drwxrwxrwt.  2 root root    6 May 15 11:24 .X11-unix
drwxrwxrwt.  2 root root    6 May 15 11:24 .XIM-unix
drwxrwxrwt.  2 root root    6 May 15 11:24 .font-unix
-rw-r--r--.  1 root root 3.5M May 15 14:36 mybackup.tar
-rw-------.  1 root root  718 May 15 14:38 runc-process896943808

Backup DTR

This will back up only metadata - image blobs aren’t backed up!

docker container run \
  --rm \
  --interactive \
  --log-driver none \
  --env UCP_PASSWORD=<UCP_Password> \
  docker/dtr:2.7.6 backup \
  --ucp-url <ucp-url> \
  --ucp-insecure-tls \
  --ucp-username <ucp-username> \
  --existing-replica-id <replica-id> > dtr-backup-v2.7.6.tar.gz

After that you have to manually back up the image blobs.
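
How to do that depends on the storage backend. The sketch below assumes DTR stores blobs on the local filesystem, in a Docker volume named dtr-registry-<replica-id> under /var/lib/docker/volumes; with cloud storage the provider's own backup tools apply instead. The function name backup_dtr_blobs is made up.

```shell
#!/bin/sh
# Hypothetical helper: archive the local DTR image blob volume.
# Assumes filesystem storage and the volume name dtr-registry-<replica-id>.
backup_dtr_blobs() {
    replica_id="$1"
    archive="$2"
    volumes_dir="${3:-/var/lib/docker/volumes}"
    volume="dtr-registry-$replica_id"
    [ -d "$volumes_dir/$volume" ] || { echo "volume $volume not found" >&2; return 1; }
    # -C keeps the archive paths relative to the volumes directory
    tar -cf "$archive" -C "$volumes_dir" "$volume"
}
```

For example, backup_dtr_blobs <replica-id> /tmp/dtr-blobs.tar run as root on the node that hosts the replica.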

Restore Docker Enterprise Environment

When to do it?

  • If we lost one manager - just repair it and re-add it to the cluster
  • If we lost all managers - restore from backup

How to do it? To restore a Swarm cluster manager we have to:

  • stop the Docker engine on the node being restored
  • rm -rf /var/lib/docker/swarm
  • untar our swarm.tar into /var/lib/docker/swarm
  • start the Docker engine
  • add the rest of the managers
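
The steps above can be sketched as one function. It assumes the archive was created as in the backup section, so the paths inside it start with var/lib/docker/swarm (tar removed the leading /) and have to be extracted relative to the filesystem root. The function name restore_swarm and its optional root argument are made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: restore Swarm state on one manager from swarm.tar.
# The second argument is only for trying this out in a scratch directory.
restore_swarm() {
    archive="$1"
    root="${2:-/}"
    swarm_dir="${root%/}/var/lib/docker/swarm"
    [ -f "$archive" ] || { echo "no such archive: $archive" >&2; return 1; }
    systemctl stop docker || return 1
    rm -rf "$swarm_dir"
    # Archive members have no leading "/", so extract relative to the root.
    tar -xzf "$archive" -C "$root" || return 1
    systemctl start docker
}
```

When the restore is done on a fresh node, the Docker documentation additionally describes running docker swarm init --force-new-cluster on it, so that the node does not try to reach the managers of the old cluster, before the other managers are added.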

For the UCP restore, first remove any previous installation of UCP on the node:

[root@docker-host3 ~]# docker container run --rm -it \
  -v /var/run/docker.sock:/var/run/docker.sock \
  --name ucp \
  docker/ucp:3.2.6 uninstall-ucp --interactive
INFO[0001] Detected UCP instance rmozotgo6u1elkg2r0b1r1fzh
INFO[0001] We're about to uninstall UCP from this Swarm cluster
Do you want to proceed with the uninstall? (y/n): y
INFO[0004] Removing UCP Services
INFO[0058] UCP has been removed from this cluster successfully.

Restore UCP - the output can be very long, so I cut it short here:

[root@docker-host3 ~]# docker container run \
  --rm \
  --interactive \
  --name ucp \
  --volume /var/run/docker.sock:/var/run/docker.sock \
  docker/ucp:3.2.6 restore --passphrase "secret12chars" --force-minimums < /tmp/mybackup.tar

time="2020-05-15T12:51:35Z" level=info msg="Your Docker daemon version 19.03.5, build 2ee0c57608 (4.18.0-147.8.1.el8_1.x86_64) is compatible with UCP 3.2.6 (04ac981)"
time="2020-05-15T12:51:35Z" level=warning msg="Restore is running in non-interactive mode. Proceeding to read backup file from stdin"
time="2020-05-15T12:51:35Z" level=warning msg="Your system does not have enough memory. UCP suggests a minimum of 4.00 GB, but you only have 3.87 GB. You may have unexpected errors."
time="2020-05-15T12:51:37Z" level=info msg="Checking required container images"
[..]
time="2020-05-15T12:55:38Z" level=info msg="Completed pulling required images"
time="2020-05-15T12:55:38Z" level=info msg="Running restore agent container ..."
time="2020-05-15T12:55:39Z" level=info msg="running restore steps"
time="2020-05-15T12:55:39Z" level=info msg="Step 1 of 31: [Purge volumes]"
time="2020-05-15T12:55:39Z" level=info msg="Step 2 of 31: [Decrypt backup from tar]"
time="2020-05-15T12:55:39Z" level=info msg="Step 3 of 31: [Restore volumes from tar]"
time="2020-05-15T12:55:40Z" level=info msg="Step 4 of 31: [Setup Internal Cluster CA]"
time="2020-05-15T12:55:42Z" level=info msg="Step 5 of 31: [Compute CA Digests]"
time="2020-05-15T12:55:42Z" level=info msg="Step 6 of 31: [Restore etcd Cluster]"
time="2020-05-15T12:55:44Z" level=info msg="Step 7 of 31: [deploy etcd server from restore]"
[..]
time="2020-05-15T12:57:22Z" level=info msg="Step 26 of 31: [Wait for Healthy UCP Controller And Kubernetes API]"
time="2020-05-15T12:57:23Z" level=info msg="Step 27 of 31: [Deploy Manager Node Agent Service]"
time="2020-05-15T12:57:23Z" level=info msg="Step 28 of 31: [Deploy Worker Node Agent Service]"
time="2020-05-15T12:57:24Z" level=info msg="Step 29 of 31: [Deploy Windows Worker Node Agent Service]"
time="2020-05-15T12:57:24Z" level=info msg="Step 30 of 31: [Deploy Cluster Agent Service]"
time="2020-05-15T12:57:25Z" level=info msg="Step 31 of 31: [Wait for All Nodes to be Ready]"

Restore DTR

First destroy the old DTR replica:

docker container run \
  --rm \
  --tty \
  --interactive \
  docker/dtr:2.7.6 destroy \
  --ucp-insecure-tls

Restore the image blobs.

Restore the DTR metadata:

read -sp 'ucp password: ' UCP_PASSWORD;

docker container run \
  --rm \
  --interactive \
  --env UCP_PASSWORD="$UCP_PASSWORD" \
  docker/dtr:2.7.6 restore \
  --ucp-url <ucp-url> \
  --ucp-insecure-tls \
  --ucp-username <ucp-username> \
  --ucp-node <hostname> \
  --replica-id <replica-id> \
  --dtr-use-default-storage \
  --dtr-external-url <dtr-external-url> < dtr-backup-v2.7.6.tar.gz