Docker Gotchas I’ve Encountered
Time to share the snags/gotchas I’ve run into developing and deploying with Docker containers.
On my recent projects, I seem to have become the Docker wrangler on the team. Over that time, I’ve hit a number of snags or gotchas and thought, “Maybe I should write that down?”. It’s as good a time as any now that I’ve reduced the impedance of my blogging platform.
Disk Space
TL;DR: CAUTION!
$ docker system prune -a --volumes
It’s up to you to clean up unused images, containers, volumes, networks, and so on. This one will likely seem obvious to anyone who now understands how Docker works, but it certainly bit me a few times while I was climbing the learning curve.
I’m not a big fan of this fact, but it’s important to understand that the $ docker ... CLI is built only to perform the underlying operations, the building blocks, required for containerized applications. The important thing to pay attention to there is what it is not. It is not a system for managing images, containers, volumes, etc., at least not across deployments, over time, or between different applications. It is not a declarative system for describing which images should be used to run which containers connected to which volumes and networks on a given host. Even $ docker-compose ... is really only a different syntax for writing a related collection of $ docker ... CLI commands in a way that is much more readable, maintainable, and feels declarative.
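To illustrate that last point, here’s a minimal, hypothetical service definition and the $ docker ... commands it roughly boils down to (the service name, volume name, and image are all made up):

services:
  db:
    image: postgres:13
    volumes:
      - "db-data:/var/lib/postgresql/data"
volumes:
  db-data:

$ docker volume create db-data
$ docker run --detach --name db --volume db-data:/var/lib/postgresql/data postgres:13

The YAML is easier to read and review, but it is still, roughly, just those imperative commands.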
Stopping a container does not remove it. Removing a container does not remove the image, volumes, or networks. Removing an image may not remove its base image or layers. And so on. To use other terminology, neither the $ docker ... nor the $ docker-compose ... CLI is an orchestration system. As such, a developer must clean up the inevitable large quantities of Docker cruft themselves. If, heavens forfend, you’re deploying containers without an orchestration system, then you’ll also have to do the same there. A recent project was my first exposure to AWS ECS, so maybe we were using it wrong, but I was surprised to learn that this cruft also accrued for that usage of ECS.
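To make the lifecycle concrete, a quick sketch (the container name is arbitrary):

$ docker run --name debug -it ubuntu bash   # create and start a container
$ docker stop debug                         # stopped, but the container still exists
$ docker rm debug                           # container gone, but the ubuntu image remains
$ docker rmi ubuntu                         # image gone, though layers shared with other images stay

Each command removes exactly what it names and nothing more; nothing cascades.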
The easiest way to take care of this is to use the $ docker system prune command (or the prune sub-command under the other $ docker ... commands), but it’s important to understand what it does lest you destroy data, or even code in a certain sense. When Docker says “prune” it means “relative to all currently running containers”. Say you have a debug container hanging around into which you’ve installed a rich set of OS/dist packages, configured things just the way you like, and even written a few utility or introspection scripts you’ve come to rely on. This debug container is stopped most of the time and only run when you need it. Firstly, never do this! Always treat containers as ephemeral and able to be destroyed at any time. But let’s continue with this example. If this debug container isn’t running when you run $ docker system prune, then the container and its filesystem will be destroyed irrevocably, and with the -a flag its image goes too.
As such, only run $ docker system prune when you’re sure that every container whose filesystem, image, volumes, etc. you need to preserve is currently running. See also the options/flags to that command for more thorough cleanup. IMO, if it’s not always safe to run $ docker system prune then you’re doing something wrong, and that stands for both local development and deployments.
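If you want routine cleanup with a bit more of a safety margin, the prune sub-commands accept filters. A sketch that only removes things unused for at least 24 hours:

$ docker container prune --filter "until=24h"
$ docker image prune --all --filter "until=24h"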
Publicly Exposed Ports

TL;DR: Always specify the host IP for port mappings, e.g. 127.0.0.1:80:80.
Unlike SSH port forwarding, which defaults to binding only the localhost IP, 127.0.0.1, the $ docker run -p ... option binds all interfaces (0.0.0.0) by default. This is also true for $ docker-compose ... since it’s really just an alternate syntax for the $ docker ... CLI. So to prevent exposing the top secret application you’re developing on your laptop to everyone at the cafe who can copy and paste an $ nmap ... command, just default to always specifying 127.0.0.1 as the bind address.
It’s a cheap way to force yourself to think about it and to make the intention of all port mappings clear to anyone else who reads them, while simultaneously making an audit of intentionally exposed ports as simple as $ git grep -E '0\.0\.0\.0:[0-9]+:[0-9]+'. As for deployments, I’d personally much rather have a release break a deployment than have a release expose a service to the internet unintentionally and silently.
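In a ./docker-compose.yml this looks something like the sketch below (the service name and ports are hypothetical):

services:
  web:
    ports:
      - "127.0.0.1:8080:80"   # reachable only from this host
      - "0.0.0.0:443:443"     # explicitly, auditably public

Quoting the values also happens to dodge the YAML gotcha coming up next.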
YAML Needs Some Zen

TL;DR: Quote every YAML value except numbers, booleans, and null.
$ python -c 'import this'
The Zen of Python, by Tim Peters
...
Explicit is better than implicit.
...
I love YAML, so a quick rant. Personally, I find tiresome the tendency among developers to find a flaw in a technology and then nurse a long-lasting hatred because it cost them some time once. Our job is to solve difficult technical problems. Our job is also to collectively build fruitful ecosystems of tools, frameworks, and other technology. I personally think that embracing what’s good about a technology is more productive than dismissing the whole technology because of a set of flaws that are a subset of its features. Using something and complaining is much more fruitful than complaining and not using. Is what you’re saying productive criticism, or just complaining? I know the programmer’s virtues are delightfully subversive, but I don’t think whining is among them.
In fact, I’d love to see the YAML parser libraries add options, if not defaults in new major versions, to disable the following problematic type conversions. That could put pressure on the standard to move such surprising behavior to an explicit opt-in whereby it could still be powerful for those that need it but no longer surprising for the rest of us. Or maybe just the standard first, I don’t really know how these things work. ;-) Enough ranting.
I lied, here’s the next rant, but in the opposite tone. YAML does include some fairly distressing magical behavior. Try to digest this:
>>> import yaml
>>> yaml.safe_load('port: 22:22')
{'port': 1342}
I lost more time than I care to admit trying to figure out why a ./docker-compose.yml SFTP port mapping wasn’t working when everything started up with no error. I eventually did discover that port 1342:1342 was being mapped, but it certainly didn’t help me understand. My other port mappings were working without any surprises:
>>> yaml.safe_load('port: 80:80')
{'port': '80:80'}
What the actual fork! This is the very definition of surprising behavior. The root cause here is that YAML 1.1 has support for several obscure value types, and one of them is base-60 integers. Each “digit” after the first must be below 60, which is why 80:80 above stays a string while 22:22 becomes a number. I would really love to hear someone make the case that the use cases for sexagesimal are important and powerful enough for such a majority of developer users that it justifies this surprise for the rest of us. Who knew the Babylonians were such a significant constituency!?
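For the record, the arithmetic checks out:

>>> 22 * 60 + 22
1342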
There are, however, other such value types that can result in similarly unaccountably surprising behavior; for example, YAML 1.1 also parses yes, no, on, and off as booleans. Just avoid these issues: always quote every YAML value except numbers, booleans, and null, unless and until you need any of that black magic:
>>> yaml.safe_load('port: "22:22"')
{'port': '22:22'}
Build Context and Image Stowaways
TL;DR:
$ ln -sv "./.gitignore" "./.dockerignore"
Docker uses what it calls the build context, by default the directory containing the ./Dockerfile, to determine what is available to COPY into an image while building. That makes sense to me. What doesn’t make sense to me is that it copies the whole build context somewhere before building an image. If past is prologue, I’d end up agreeing if I understood the full reasons for that, but until then I remain dubious. Moving on. Regardless of how the build context is handled or managed, the notion of the build context does sensibly define what will make it into the image.
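Concretely, the build context is the path argument you hand to the build command; in the most common invocation it’s the current directory (the myapp:dev tag is made up):

$ docker build --tag myapp:dev .   # "." is the build context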
When the build context includes unnecessary files and directories, at best it just slows down your image builds and maybe also increases your built image size to no benefit. At worst, it leaks content that wasn’t intended into a deployed image, or worse, a publicly released image. I’ll be honest, though, I mostly get concerned about the former; unnecessarily long build times hurt a lot when they’re in the inner loop of developing an image. The method Docker provides to control which files under the build context directory should not be included in the build context is the ./.dockerignore file.
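As a hypothetical sketch of the kinds of entries that end up there (whether it’s a standalone file or, as I advocate below, a symlink to ./.gitignore):

# ./.dockerignore
.git
node_modules
*.log
.env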
So then just remember to add paths to ./.dockerignore every time you do something that might add something to the build context that you wouldn’t want included in an image. Simple, right? Don’t we all know that developers are really good at keeping a long list of rote considerations in mind? Well, I’m not. I also have to say I’m not the only one, given how many other developers on teams I’ve worked on have made changes that should have been accompanied by additions to ./.dockerignore.
You know what draws my attention to new files in my build context, and very reliably? Well, a new file represents a change, and in such cases the new file is a consequence of some other change in the source code. We use tools to manage changes that don’t depend on developers remembering such considerations, don’t we? Alright, enough patronizing sarcasm. VCS, good ol’ $ git status, is how I notice when some change I made resulted in new files in the build context.
This might be controversial, but in spite of the many posts I’ve found while searching which say they should be managed separately, I’ve been strongly advocating for making ./.dockerignore a symlink that points to ./.gitignore for ~2 years now. I haven’t yet regretted it once. In that same time, I’ve also worked on projects where ./.dockerignore is managed separately, and inappropriate files leaking into the build context has been a recurring issue on all of them, and not just from my commits.
There are frequent cases where a build artifact should not be committed to VCS but should be included in built images. Here we can exploit one of the differences between ./.gitignore and ./.dockerignore, namely that $ git ... processes ./.gitignore files in each descendant directory under the checkout but $ docker build ... does not. So make sure your build artifacts go somewhere in sub-directories of the checkout (a good idea in any case) and place your ./.gitignore rules for those artifacts at the appropriate level down within that sub-directory, as in the sketch below.
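For example, given a hypothetical ./dist/ directory for build artifacts:

# ./dist/.gitignore: git ignores everything in here except this file
*
!.gitignore

Since $ docker build ... never reads this nested file, the artifacts in ./dist/ still land in the build context and can be COPY’d into the image, while $ git status stays clean.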
Also note the other differences between ./.gitignore and ./.dockerignore, and keep your rules to those that are interpreted the same by both $ git ... and $ docker build .... In particular, replace your ./.gitignore rules meant to apply to files that match in any sub-directory, e.g. foo.txt, with more explicit rules that work in ./.dockerignore as well, e.g.:
/foo.txt
**/foo.txt
While more verbose, I prefer this anyway, as it’s more explicit and communicates to me what’s intended instantly and without ambiguity.
TTFN
Well those are the big gotchas I’ve encountered. I’m sure I won’t encounter any more. I’m sure there won’t be any forthcoming “Docker Gotcha: …” posts. ;-)