Docker build from dockerfile - compendium

How do we build an image? What can we include in a Dockerfile?

Every Docker container is started from an image.

What’s a Docker image?

An image is a collection of read-only layers containing everything we need to run an application. Typically an image holds the base filesystem layout of the operating system we have chosen, plus the binaries needed to start our application.

For example, an image for a PostgreSQL container could contain a CentOS base filesystem template, the PostgreSQL binaries and some configuration files.

How do we build a Docker image?

We build an image with the docker build command, but before we can use it we need to create a Dockerfile.

The simplest, minimal command to build an image:

docker build -t postgresql-server:11.6 .

We use -t to set the image name and tag.
The dot at the end means that the current directory is the build context. We could give a full path to a build context directory here instead; it should contain the Dockerfile and any other files that we want to copy into our image during the build.

Other important docker build parameters are:

  • --pull - always try to pull a newer version of the base image
  • --no-cache - do not use the layer build cache
  • --build-arg - pass values for ARG variables declared in the Dockerfile
  • --file - use a Dockerfile from a custom location (it may be outside the build context directory)

For the complete list, check docker build --help

Before we can use the command above, we first have to prepare a Dockerfile.

What is a Dockerfile?

A Dockerfile is a text file with all the commands needed to create the environment for our application. We can easily define which files we want to include in our image, which binaries will run in the container and which ports we want to expose from it. A Dockerfile lets us create multiple environments quickly, just by modifying the file and running docker build again.

Assume that version 0.1 of our Java application is currently running and the dev team gives us 0.2. All we have to do is change the path to the jar file in the Dockerfile (or, if we keep the app version in a variable, pass it with --build-arg, as the sample shows) and run docker build. Of course we should give the new image a name that reflects the app version. After the image is built we can start a container with the new application.

Sample Dockerfile

FROM centos:7

LABEL maintainer="lukas@mail.com"

ARG PG_MAJOR
ARG PG_MINOR

ENV PG_MAJOR=${PG_MAJOR:-11}
ENV PG_MINOR=${PG_MINOR:-5}

ENV PG_PORT 5432
ENV PG_USER_ID 2201
ENV PG_USER_GID 2201

ENV PG_ENCODING UTF8
ENV PG_LOCALE en_US.UTF8
ENV PG_AUTH md5
ENV PG_AUTH_HOST md5
ENV PG_AUTH_LOCAL md5

RUN groupadd -g ${PG_USER_GID} postgres && useradd -g ${PG_USER_GID} -u ${PG_USER_ID} postgres

COPY my-yum-repo.repo /etc/yum.repos.d/

RUN yum update -y && \
    yum -y install \
    postgresql11-server-${PG_MAJOR}.${PG_MINOR} \
    postgresql11-${PG_MAJOR}.${PG_MINOR} \
    postgresql11-contrib-${PG_MAJOR}.${PG_MINOR} \
    postgresql11-libs-${PG_MAJOR}.${PG_MINOR} && \
    yum clean all

ENV PATH $PATH:/usr/pgsql-${PG_MAJOR}/bin
ENV PG_DIR /postgresql
ENV PG_DATA ${PG_DIR}/data/pg${PG_MAJOR}
ENV WAL_DATA ${PG_DIR}/wal

RUN set -ex; \
    mkdir -p ${PG_DIR}; \
    mkdir -p ${PG_DATA}; \
    mkdir -p ${WAL_DATA}; \
    chown -R postgres:postgres ${PG_DIR}; \
    chmod 750 -R ${PG_DIR};

VOLUME ${PG_DIR}
VOLUME ${WAL_DATA}

COPY docker-entrypoint.sh /usr/local/bin/

RUN chmod u+x /usr/local/bin/docker-entrypoint.sh

ENTRYPOINT ["/usr/local/bin/docker-entrypoint.sh"]

EXPOSE 5432

What instructions can we use in a Dockerfile?

FROM

FROM [--platform=<platform>] <image>[:<tag>] [AS <name>]

FROM centos:7

FROM tells Docker Engine which image to use as the base image for the current build.

The [--platform=<platform>] flag specifies the platform of the base image when it is a multi-platform image, for example linux/amd64. By default the platform on which the build is triggered is used.

The [AS <name>] clause is used for multi-stage builds. I will cover this topic in another post.

In the example we use the CentOS operating system with the tag 7, because we want the newest CentOS image for version 7.

Every instruction below the FROM directive runs on top of the base image. Note that the base image does not have to be an OS image: we can take any correctly built image and build on top of it.

The image version specified in the FROM clause can be parametrized with ARG and --build-arg.

ARG

ARG <name>[=<default value>]

ARG PG_MAJOR
ARG PG_MINOR

The ARG instruction defines a build-time variable.
We pass a value for it with docker build --build-arg. We can also give it a default value, used when --build-arg is omitted, for example ARG PG_MINOR=5.

These variables are available only at build time!

In our example we use ARG for the PostgreSQL version. That way we can build multiple images with different PostgreSQL versions from one Dockerfile. We can also use ARG before FROM to parametrize the version of the base image.
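
As a minimal sketch of parametrizing the base image, the fragment below declares an ARG before FROM. BASE_TAG and the image name in the build command are hypothetical names chosen for this illustration; they do not come from the sample Dockerfile.

```dockerfile
# BASE_TAG is a hypothetical build argument; 7 is its default value.
ARG BASE_TAG=7
FROM centos:${BASE_TAG}

# An ARG declared before FROM is not visible inside the build stage,
# so it has to be declared again (without a value) to be used after FROM.
ARG BASE_TAG
RUN echo "Base image tag: ${BASE_TAG}"

# Possible build command for another tag (assuming that tag exists):
#   docker build --build-arg BASE_TAG=8 -t my-image:centos8 .
```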

ENV

ENV <key> <value>
ENV <key>=<value> ...

ENV PG_PORT 5432

The ENV instruction defines a runtime variable.
In the form without the = sign, everything after the variable name is treated as one string, including whitespace. ENV variables persist when a container is run. We can override them with docker run --env <key>=<value> when starting a new container.
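
To make the two ENV forms concrete, here is a small sketch. APP_GREETING and the image name env-demo are assumptions made up for this example; PG_PORT and PG_ENCODING are taken from the sample Dockerfile.

```dockerfile
FROM centos:7

# Form without "=": the whole rest of the line is the value, spaces included.
# APP_GREETING is a hypothetical variable used only for illustration.
ENV APP_GREETING hello from the container

# Form with "=": several variables can be set in one instruction.
ENV PG_PORT=5432 PG_ENCODING=UTF8

# Print the values when the container starts.
CMD ["sh", "-c", "echo \"$APP_GREETING\", port $PG_PORT"]

# Overriding a value for one container (image name is an assumption):
#   docker run --rm --env PG_PORT=5433 env-demo
```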

ARG/ENV trick

ARG PG_MAJOR
ARG PG_MINOR

ENV PG_MAJOR=${PG_MAJOR:-11}
ENV PG_MINOR=${PG_MINOR:-5}

Why do we do something like this?

As mentioned, ENV defines a runtime variable, while ARG defines a build variable that cannot be accessed at container runtime.
In our example we want to choose the PostgreSQL version at build time, but what if scripts inside the running container also need these values after the build?

To persist these variables and make them available in a running container, we declare the ARG variables without defaults and then use them to set the ENV variables. The default values move to the ENV instructions, using the shell-like ${variable:-default} syntax.

Now we can set the values with docker build --build-arg, and they are still available in the running container.
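
For comparison only, a variant of the same idea keeps the defaults on the ARG lines and copies the values into ENV afterwards. This is a sketch of an alternative, not the approach used in the sample Dockerfile, and the image name arg-env-demo is hypothetical.

```dockerfile
FROM centos:7

# In this variant the defaults live on the ARG lines.
ARG PG_MAJOR=11
ARG PG_MINOR=5

# Copy the build-time values into runtime variables.
ENV PG_MAJOR=${PG_MAJOR} \
    PG_MINOR=${PG_MINOR}

# At runtime the values are still available, unlike plain ARG values.
CMD ["sh", "-c", "echo PostgreSQL $PG_MAJOR.$PG_MINOR"]

# Possible build command (image name is an assumption):
#   docker build --build-arg PG_MINOR=6 -t arg-env-demo .
```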

LABEL

LABEL <key>=<value> <key>=<value> <key>=<value>

LABEL maintainer="lukas@mail.com"

LABEL adds metadata to the image, which is visible in the docker inspect output.
In the example we record who maintains the image.
We can specify multiple labels, on one line or on several.

RUN

RUN <command>
RUN ["exec", "param1", "param2"]

RUN chmod u+x /usr/local/bin/docker-entrypoint.sh

RUN executes a command inside the image being built.
If you put the command directly after RUN (shell form), it is executed in a shell, by default /bin/sh -c on Linux. If you use the [] JSON form (exec form), the command is executed directly, without shell behaviour such as variable substitution. Remember to put double quotes (") around each part of the command.

Each RUN instruction creates a new layer in the image. The fewer layers, the better. Always try to merge commands that form one logical unit of work into a single RUN statement.
For example, appending to a file, moving it and changing its permissions.
On Linux, use shell syntax for that, such as RUN <command> && <command2> && <command3>, so the build stops as soon as one command fails.
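
The append, move and chmod case could look roughly like the sketch below. The file names and the setting written into the file are hypothetical; only the pattern of one RUN per logical unit of work matters here.

```dockerfile
FROM centos:7

# Shell form: three steps, one layer. The file name and the setting
# are hypothetical; && stops the chain at the first failing command.
RUN echo "max_connections = 200" >> /tmp/extra.conf && \
    mv /tmp/extra.conf /etc/extra.conf && \
    chmod 640 /etc/extra.conf

# Exec (JSON) form: no shell is involved, so $HOME is not expanded
# and the text is printed literally.
RUN ["echo", "$HOME is not expanded"]
```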

COPY & ADD

ADD [--chown=<user>:<group>] <src>... <dest>
COPY [--chown=<user>:<group>] <src>... <dest>

COPY my-yum-repo.repo /etc/yum.repos.d/

The ADD and COPY instructions copy files or directories from src to dest. In most cases COPY is the one you need. ADD is needed only when you want to copy from a URL or to unpack a local tar archive automatically - those are the main differences between them. COPY does not work over the network, only with local files.

In both cases the build context is the “root” path, so you use paths relative to it, like somefolder/somefolder2/somefile

In both cases you can use wildcards like * or ?.

In both cases you can use the [--chown=<user>:<group>] parameter to set the owner and group of the files you place in the image.
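
A few typical uses, as a hedged sketch: the conf/ directory, app.tar.gz and the example.com URL are placeholders invented for illustration, so this fragment only builds if matching files exist in your build context.

```dockerfile
FROM centos:7

# Single local file, path relative to the build context.
COPY my-yum-repo.repo /etc/yum.repos.d/

# Wildcard: every .conf file from a hypothetical conf/ directory.
# The destination ends with / because several files may match.
COPY conf/*.conf /etc/myapp/

# Set the owner while copying; the numeric IDs match PG_USER_ID and
# PG_USER_GID from the sample Dockerfile.
COPY --chown=2201:2201 docker-entrypoint.sh /usr/local/bin/

# ADD with a URL source; example.com is a placeholder, not a real download.
ADD https://example.com/some-file.txt /tmp/

# ADD with a local tar archive (hypothetical file) unpacks it into dest.
ADD app.tar.gz /opt/app/
```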

VOLUME

VOLUME ["<path>"]

VOLUME ${PG_DIR}

VOLUME creates a mount point in the container filesystem for external volumes.

ENTRYPOINT & CMD

ENTRYPOINT ["executable", "param1", "param2"]

ENTRYPOINT ["/usr/local/bin/docker-entrypoint.sh"]

ENTRYPOINT defines the binary or shell script that is started as soon as the container starts.
In our example we have a docker-entrypoint.sh script, which can contain logic like: if $PG_DATA is empty, initialize a new PostgreSQL cluster with initdb; if it is not empty, start PostgreSQL with /usr/bin/postgres -D $PG_DATA.
We can pass arguments to the ENTRYPOINT by adding them to docker run after the image name.

The CMD instruction supports ENTRYPOINT: it holds default parameters for the binary started by ENTRYPOINT, for cases where parameters are required and we pass nothing to docker run. In our example there is no need for CMD.

It is possible to override ENTRYPOINT when starting a container with the docker run --entrypoint parameter.
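
How ENTRYPOINT and CMD combine is easiest to see on a toy image. The sketch below uses echo instead of PostgreSQL, and entry-demo is a hypothetical image name.

```dockerfile
FROM centos:7

# The binary that always runs, with one fixed argument.
ENTRYPOINT ["echo", "Hello"]

# Default arguments, used only when docker run gets none.
CMD ["world"]

# Assuming the image is tagged entry-demo (a hypothetical name):
#   docker run --rm entry-demo           prints: Hello world
#   docker run --rm entry-demo Docker    prints: Hello Docker
```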

EXPOSE

EXPOSE <port> [<port>/<protocol>]

EXPOSE 5432

EXPOSE is a documentation instruction that shows on which ports the services in the container listen. It does not publish the ports; to make them reachable from the host, use docker run -p.

USER

USER <user>[:<group>]

The USER instruction sets the user that runs all RUN, CMD and ENTRYPOINT commands that follow it.

WORKDIR

WORKDIR <path>

WORKDIR sets the working directory for all RUN, CMD, ENTRYPOINT, COPY and ADD instructions that follow it.
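
USER and WORKDIR are often used together. A minimal sketch, reusing the postgres user and the /postgresql directory from the sample Dockerfile:

```dockerfile
FROM centos:7

# Create the user first; USER does not create it for us.
RUN groupadd -g 2201 postgres && useradd -g 2201 -u 2201 postgres

# WORKDIR creates the directory if it does not exist yet.
WORKDIR /postgresql

# Everything below this line runs as postgres.
USER postgres

# Prints the user name (postgres) and the working directory (/postgresql).
CMD ["sh", "-c", "id -un && pwd"]
```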

ONBUILD

ONBUILD <dockerfile_command>

ONBUILD registers an instruction that is triggered only when the current image is used as the base image for another build. After ONBUILD we can specify almost any of the instructions explained above, except FROM and another ONBUILD.
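
A rough sketch of how this could be used: a base image that expects every derived image to ship a jar file. The image name java-app-base and the file app.jar are hypothetical.

```dockerfile
# Dockerfile of a hypothetical base image, built as "java-app-base".
FROM centos:7

# Not executed now; stored in the image metadata as a trigger.
ONBUILD COPY app.jar /opt/app/app.jar

# A separate Dockerfile that starts with
#   FROM java-app-base
# runs the COPY above right after its FROM line, taking app.jar
# from that build's own context.
```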

Example build process for our Dockerfile

With the Dockerfile described above and the other required files in the build context directory:

[node1] (local) root@192.168.0.18 ~/postgres-build-dir
$ ls -lah
total 12K    
drwxr-xr-x    2 root     root          76 Feb 28 16:14 .
drwx------    1 root     root          48 Feb 28 16:14 ..
-rw-r--r--    1 root     root        1.3K Feb 28 16:14 Dockerfile
-rw-r--r--    1 root     root          99 Feb 28 16:15 docker-entrypoint.sh
-rw-r--r--    1 root     root          40 Feb 28 16:15 my-yum-repo.repo

Always keep only the files needed for the build in the context directory!
All these files are transferred to the Docker daemon, which slows down the build, and they could make your new image huge if they are copied into it!

Now we can build the PostgreSQL image. Notice that we use --build-arg to override the default 11.5 version specified in the Dockerfile - let’s say that we need 11.6:

[node1] (local) root@192.168.0.18 ~/postgres-build-dir
$ docker build -t postgresql-server:11.6 --build-arg PG_MINOR=6 .
Sending build context to Docker daemon   5.12kB
Step 1/28 : FROM centos:7
7: Pulling from library/centos
ab5ef0e58194: Extracting  75.78MB/75.78MB
[..]