At work, we use Docker for local development and production. When I first started in this job (3+ years ago), I had barely heard of Docker. I have grown to love/hate Docker, although the hate part is not so much about how Docker solves problems but more about how we are using it. The love part is because I do find containers interesting; however, I also think containers can get overused and processes can get unnecessarily complicated. In this Tale we’ll visit some struggles I (and other team members) had when first getting to know Docker.
First, some context. Here at the Texas Advanced Computing Center (TACC) we build Science Gateways, amongst other amazing things. Science Gateways can get very complicated, so for the sake of simplicity let’s think about a Science Gateway as a distributed web application. We use MySQL or Postgres for the database, Django for the back-end, AngularJS (currently upgrading to something else, haven’t fully decided what) for the front-end, and we run all of this in containers. There are other services we use, but this gives enough information to get the picture of the dev environment we’ll be talking about.
How Is Docker Running On Your Computer?
The projects we work on here at TACC are what we call highly distributed. We use the term highly distributed because these projects use multiple services built by different teams. Some of these services are exposed via REST APIs and some others are HPC resources. For instance, TACC’s main storage resource is called Corral3. For specific types of data we have to do a Network File System (NFS) mount to have direct access from the host where our Python/Django applications are running. NFS mounts are common, especially when you want a storage system that’s robust.
In production this is quite simple to set up using fstab, and in our development environment (using Docker) it’s easy to simulate since the host is the one that mounts the NFS and Docker then sees the mount as a regular folder. The development cycle runs smoothly until some serious debugging is necessary. I had to do an NFS mount directly on my laptop to figure out this one bug. I’m currently using a MacBook Pro and had to use FUSE and do an SSHFS mount. The permissions on the mount were correct. I then fired up docker-compose, waited for everything to run, and saw this error:
ERROR: for <service name> Cannot start service <service name>: b'Mounts denied: The path /path/to/mount is not shared from OS X and is not known to Docker. You can configure shared paths from Docker -> Preferences... -> File Sharing. See https://docs.docker.com/docker-for-mac/osxfs/#namespaces for more info.'
One good thing is that the error is very explicit and tells you right there where to get more information to fix it. It is still odd to get this error since I’m running Docker for Mac and it feels native. Here’s the catch: it feels and sounds native, but it is not native. Docker describes how Docker for Mac differs from Docker Toolbox. They do describe it as a native application, but only because it uses HyperKit, which means there is still a VM between OSX and Docker. It is simpler to use, but we have to keep this structure in mind. After looking into how Docker for Mac runs it is easier to understand why we cannot mount folders into our Docker containers willy-nilly. The fix is to let the Docker VM know about the folder(s) we want to mount. We configure the paths the Docker VM can see in the Docker settings. This is called File Sharing.
Another thing that threw me off was when we started using Elasticsearch. Elasticsearch maps indices into memory using a specific type of file system. This memory mapping allows Elasticsearch to be really fast. The trick here is that Linux limits how many memory map areas a process may have, which in practice restricts memory mapping to only a portion of the available virtual memory address space. The restriction exists mainly to keep one process from hogging all the virtual memory address space. On Linux operating systems this is easily fixed with one command:
sysctl -w vm.max_map_count=262144
There is more information on Elasticsearch’s site.
Once again, we were able to set up the correct max_map_count in our prod environment without any hassle. An interesting thing happened when I was first trying out Elasticsearch in development. After setting up the Elasticsearch container and starting up the project locally we saw this error:
ERROR: bootstrap checks failed max virtual memory areas vm.max_map_count  likely too low, increase to at least 
The previous error is expected when vm.max_map_count is too low, and the Elasticsearch docs are handy when solving this. The sysctl command that sets max_map_count needs to be run on the host environment. Naturally I tried running the command on OSX, only to get this error in return:
sysctl: unknown oid 'vm.max_map_count'
“How curious”, I thought. I then remembered that Docker for Mac does not actually run on OSX; instead it runs on a VM. This means the memory Elasticsearch is using is actually the memory the Docker VM has access to. Usually increasing the memory through Docker’s settings is enough.
If that is not enough, there is a way to actually drop into a tty on Docker’s VM (which is the actual host of your containers) and run the sysctl command there.
If you’re using Docker for Windows and running the regular Linux daemon you can also drop into a shell within the tiny Linux VM using nsenter.
docker-compose is defined as “a tool for defining and running multi-container Docker applications”. It can be used in different ways other than running multiple containers. I’ve seen projects where it’s used as a way to run management commands. It is common to have a docker-compose.yml file in a project which defines the entire environment of said project.
Soon after starting to use docker-compose in a project I realized I had to update the file in different ways to test things locally. Usually these are only tests and I don’t necessarily want to track these changes with git. Luckily I realized docker-compose can merge multiple files. By default docker-compose will look for two files: docker-compose.yml and docker-compose.override.yml. I tend to add the .override file to .gitignore; that way I can override different values locally without worrying about messing up everybody else’s setup. This allows you to have a docker-compose.yml file such as:
django:
  image: project/portal:local
  links:
    - potgresql:postgresql
    - memcached:memcached
  volumes:
    - ./../src:/srv/www/project
    - /srv/www/project/static:/srv/www/project/static
    - /srv/www/project/media:/srv/www/project/media
  ports:
    - 8000:8000
  command: newrelic-admin run-program /usr/local/bin/uwsgi --ini /srv/www/project/wsgi.py
Which can then be overridden with a
docker-compose.override.yml file such as:
django:
  volumes:
    - ./../src:/srv/www/project
    - static:/srv/www/project/static
    - media:/srv/www/project/media
  ports:
    - 8001:8000
  command: /srv/www/project/bin/run-dev.sh
volumes:
  static:
  media:
This would result in a
docker-compose config file such as:
django:
  image: project/portal:local
  links:
    - potgresql:postgresql
    - memcached:memcached
  volumes:
    - ./../src:/srv/www/project
    - static:/srv/www/project/static
    - media:/srv/www/project/media
  ports:
    - 8000:8000
    - 8001:8000
  command: /srv/www/project/bin/run-dev.sh
volumes:
  static:
  media:
As we can see, not every value simply gets overridden after merging. docker-compose treats fields with the following policy:
- Single-value: (e.g. mem_limit) The value from the .override file will be used.
- Multi-value: These fields are treated as arrays or maps and depending on the type docker-compose will act differently:
  - Fields such as ports and tmpfs are concatenated. Meaning we can never completely override the value set in the first file, we can only add more values to the array or map. I recommend being conservative with the value used in the first file.
  - environment and labels are merged together. Meaning, we can add more values to the array or map and we can override previously set values. The trick here is that the values used in this array or map are in the form of UNIQUE_STRING=VALUE. docker-compose will use the left side of the string as uniqueness and will override values based on that.
    Note: the environment value can be an array or a map.
  - volumes and devices are also merged together. Meaning, we can add more values to the array and we can override previously set values. The difference here is that docker-compose will use the right side of the string as uniqueness. docker-compose will see every value as VALUE:UNIQUE_STRING. This makes sense because the right side of the value is unique for the container we are creating.
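The merge policy above can be sketched in a few lines of Python. This is only an illustration of the rules as described here, not docker-compose’s actual implementation: the helper names (merge_keyed, merge_service) are made up, the field sets are deliberately incomplete, entries are assumed to be plain strings in list form, and the environment values at the end are hypothetical.

```python
# Sketch of the merge rules described above (NOT docker-compose's real code).
CONCATENATED = {"ports", "tmpfs"}            # values are appended
MERGED_BY_LEFT = {"environment", "labels"}   # UNIQUE_STRING=VALUE
MERGED_BY_RIGHT = {"volumes", "devices"}     # VALUE:UNIQUE_STRING


def merge_keyed(base, override, sep, index):
    """Merge two lists of strings; an override entry wins when its key matches."""
    entries = {}
    for entry in list(base) + list(override):
        parts = entry.partition(sep)          # (left, sep, right)
        key = parts[index] if parts[1] else entry
        entries[key] = entry                  # same key: replaced in place
    return list(entries.values())


def merge_service(base, override):
    """Return the service definition that results from merging override into base."""
    merged = dict(base)
    for field, value in override.items():
        if field not in base:
            merged[field] = value
        elif field in CONCATENATED:
            merged[field] = base[field] + [v for v in value if v not in base[field]]
        elif field in MERGED_BY_LEFT:
            merged[field] = merge_keyed(base[field], value, "=", 0)
        elif field in MERGED_BY_RIGHT:
            merged[field] = merge_keyed(base[field], value, ":", 2)
        else:
            merged[field] = value             # single-value: override wins
    return merged


base = {
    "image": "project/portal:local",
    "volumes": [
        "./../src:/srv/www/project",
        "/srv/www/project/static:/srv/www/project/static",
        "/srv/www/project/media:/srv/www/project/media",
    ],
    "ports": ["8000:8000"],
    "command": "newrelic-admin run-program /usr/local/bin/uwsgi --ini /srv/www/project/wsgi.py",
    "environment": ["DEBUG=0", "DB_HOST=postgresql"],  # hypothetical values
}
override = {
    "volumes": [
        "./../src:/srv/www/project",
        "static:/srv/www/project/static",
        "media:/srv/www/project/media",
    ],
    "ports": ["8001:8000"],
    "command": "/srv/www/project/bin/run-dev.sh",
    "environment": ["DEBUG=1"],                         # hypothetical value
}

merged = merge_service(base, override)
for field in ("image", "volumes", "ports", "command", "environment"):
    print(field, "->", merged[field])
```

The real tool handles many more fields and the map form of environment and labels, so docker-compose config remains the authoritative way to see the merged result.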
We can also use other file names, and as many as we want, taking the previous rules into consideration. You can read more about overriding configuration in the docker-compose documentation. The way to do this is to concatenate them using the -f flag, as in the next example.
$ docker-compose -f docker-compose.local-dev.yml -f docker-compose.local-dev.override.yml -f docker-compose.local-dev.shenanigans.yml
I’ve also realized this tends to get confusing and I often lose track of what I’m overriding and what I’m not. The simple way to check how everything ends up being configured is to use:
$ docker-compose config
The previous command will print the entire configuration used.
Bringing up and down the house
When we first started using Docker and
docker-compose in our projects I found
running and stopping an entire project was a bit confusing. Let me clarify this:
it wasn’t confusing because I wasn’t sure how to check if everything was running
or not, it was confusing because it wasn’t clear to me what was the correct way
to bring up or down all the containers in a project.
First, let’s talk about projects. A docker-compose.yml file describes a set of services, and each of these services corresponds to a container. docker-compose uses the concept of projects to somewhat group together the containers described in the configuration. In reality a project uses a project name to manage different container resources. A docker-compose project name can be defined by the configuration’s root folder, by using the -p argument or the COMPOSE_PROJECT_NAME environment variable.
A project name is prepended to the name of any object defined in the configuration:
- It is prepended to every service name. If one of the services is named django, the container created will be named PROJECT_NAME_django_1, unless container_name is defined in the service definition.
- When defining volumes, in reality the volume’s name is prepended with PROJECT_NAME. We can verify this by using docker volume ls.
- PROJECT_NAME is also prepended to any network defined. We can also verify this by using docker network ls.
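As a rough illustration of the naming rules above, here is a small Python sketch. It assumes the classic docker-compose naming scheme with underscores and a trailing container index, and it assumes the precedence -p, then the COMPOSE_PROJECT_NAME environment variable, then the folder name. The function names are made up for this example, and the sketch ignores the normalization docker-compose applies to folder names (newer Compose releases also join container names with hyphens).

```python
import os


def project_name(config_dir, cli_name=None, environ=None):
    """Resolve a project name: -p argument, then environment, then folder name."""
    environ = os.environ if environ is None else environ
    if cli_name:
        return cli_name
    if environ.get("COMPOSE_PROJECT_NAME"):
        return environ["COMPOSE_PROJECT_NAME"]
    return os.path.basename(os.path.abspath(config_dir))


def container_name(project, service, index=1):
    """Name of a service's container when container_name is not set."""
    return f"{project}_{service}_{index}"


def volume_name(project, volume):
    """Name shown by `docker volume ls` for a volume defined in the configuration."""
    return f"{project}_{volume}"


def network_name(project, network):
    """Name shown by `docker network ls` for a network defined in the configuration."""
    return f"{project}_{network}"


# Hypothetical project living in a folder called "portal".
name = project_name("/home/me/portal", environ={})
print(container_name(name, "django"))
print(volume_name(name, "static"))
print(network_name(name, "default"))
```

Comparing these names with the output of docker ps, docker volume ls and docker network ls is a quick way to confirm which project a resource belongs to.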
Now that we have clarified how docker-compose uses the docker-compose.yml file and projects to group together services, volumes and networks, let’s talk about how we initialize and destroy these objects.
The easiest way to start up everything within a project is to use
$ docker-compose up
This will download or build any images which need to be downloaded or built, create every container, network and volume defined in the configuration file, and run all the services defined.
docker-compose up is pretty handy, and after creating and running everything it will continue printing the output from every container to stdout. If you want to run everything in the background you can use the -d flag to run it as a daemon.
Besides up I realized there’s also run, start and create. This is a bit confusing since those verbs could potentially be synonyms. I later realized create is deprecated and should not be used.
run is used when you want to execute a one-off command inside a container. The container will be created, the command will be executed inside the container and the container will then be stopped. I recommend using --rm to make sure the container is deleted after the command runs and exits. You can read more about run in the documentation.
start is also useful; it will only run containers for the services defined in the configuration. I recommend using start only if you need to use stop to temporarily stop services.
Note: docker-compose start does not create containers, it will only start containers for services that are already created.
By now we can safely say that most of the time we’ll be using docker-compose up to start our services, since it’s the most compact way to create and run everything necessary. Sometimes we encounter errors when bringing everything up. When this happens I find it helpful to go step by step:
- Pull necessary images:
$ docker-compose pull
- Build necessary images:
$ docker-compose build
- Create necessary networks and volumes:
$ docker-compose up --no-build --no-start
- Run services:
$ docker-compose start
Note: The previous list is not the actual sequence of steps taken by Docker, but it is a simple way to simulate what happens. docker-compose build will only work correctly if the build configuration is defined in the docker-compose.yml file.
Most of the time errors appear when building images, but I find it helpful to know everything that happens when running docker-compose up.
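When I go through these steps often, a small wrapper helps. The following Python sketch is only one possible way to script the step-by-step sequence listed above. The function names are made up, the file name in the example is just the one used earlier in this post, and the runner parameter exists only so the sequence can be tried without Docker installed.

```python
import subprocess

# The same four steps listed above, in order.
STEPS = (
    ("pull",),
    ("build",),
    ("up", "--no-build", "--no-start"),
    ("start",),
)


def compose_commands(files=()):
    """Build the argument list for every step, adding one -f flag per file."""
    prefix = ["docker-compose"]
    for path in files:
        prefix += ["-f", path]
    return [prefix + list(step) for step in STEPS]


def run_step_by_step(files=(), runner=subprocess.run):
    """Run each step in order and stop at the first one that fails."""
    for argv in compose_commands(files):
        print("$", " ".join(argv))
        # check=True makes subprocess.run raise CalledProcessError on failure,
        # so the failing step is the last one printed.
        runner(argv, check=True)


# Only print the commands here; call run_step_by_step() to actually run them.
for argv in compose_commands(["docker-compose.local-dev.yml"]):
    print(" ".join(argv))
```

Because each step runs separately, the last command printed before an exception is the one to look at.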
Let’s take a look at how we can bring everything down. I quickly realized that after running
$ docker-compose up
I was able to hit Ctrl+C and compose would automatically stop every service. Sometimes this doesn’t work correctly; I believe it has to do with how Docker processes handle signals. When this happens it is necessary to use
$ docker-compose down
which is pretty simple and compact.
After a while, one is (almost) bound to work with multiple projects. Every
once in a while I encounter an error or a bug that might be related to
docker and not necessarily to the code I’m working on. When something like
this happens I usually first delete every container created for a project.
$ docker-compose rm -f
Something similar can be done using
up in this form:
$ docker-compose up --force-recreate
Which recreates every container for every service in the configuration.
Sometimes I have to delete everything from a project to make sure every service is correctly configured and initialized. This is done by removing every container with the command shown above and then:
$ docker-compose down --rmi=all -v --remove-orphans
The previous command will remove all images for a project (--rmi=all), every volume created (-v) and every orphan container left (--remove-orphans). Next time I use docker-compose up everything for that project will be created again.
We can use docker-compose stop or docker-compose kill to stop the services. The difference between down and stop is that down will, by default, delete any containers and networks created; if needed you can also tell it to delete volumes, as stated above. stop will only stop the containers, and you can restart the services with docker-compose start, so it is better suited for a temporary stop on services.
If everything else fails you can force the services to stop by using docker-compose kill. I strongly recommend using kill only in local development, because some containers can end up in an inconsistent state after being killed. If you do have to run kill in production and this happens, I recommend using docker-compose rm -f to remove all containers and then bringing everything up again. This has helped me many times.
These are some things I learned when I first started using Docker. As with any technology, Docker can be a breeze to use until it is not. I hope this will be useful to other developers. The Tales from the Keyboard series describes the things I (and other team members) learned when using different technologies.