What Challenges to Avoid When Migrating to Docker
At Logz.io, we have built an elastic, highly available, and secure microservice architecture on top of the ELK Stack that is designed to ingest petabytes of data every day. Pushing to production is always time-consuming and risky.
Therefore, we decided to use Docker to avoid instance-configuration changes, upgrades, and corruption. To help others, I’d like to share some of the pitfalls we have encountered and how to avoid them.
1. Boot2Docker
To put it frankly, the first issue with Docker is that Boot2Docker is such a great tool that you forget that your containers are not running on your own machine. They run inside Boot2Docker’s VM, so you cannot reach an exposed port or a mapped volume through your localhost: both live inside the VM. As a result, if you’ve mapped volumes, they are mapped to the VM’s file system, not to your machine’s.
Tip:
There are two ways to work around mapped volumes on Mac OS X. The first is to map volumes under “/Users”, which Boot2Docker shares from your host with the VM where Docker is running. The second is to use the boot2docker ssh command to SSH into the VM and access its file system, along with any volumes that you have mapped, from within. Volumes that are only shared between containers are not affected, because both ends live inside the VM.
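The first workaround can be sketched as a small guard that checks a host path before it is used as a volume. This is a minimal sketch, not part of our tooling: the function names is_shared_with_vm and run_with_volume, and the paths in the usage line, are hypothetical.

```shell
# Succeeds if the host path is under /Users, the directory that
# Boot2Docker shares between the Mac and its VM.
is_shared_with_vm() {
  case "$1" in
    /Users/*) return 0 ;;
    *)        return 1 ;;
  esac
}

# Hypothetical helper: mount a host directory into a throwaway container
# only when the VM can actually see that directory.
run_with_volume() {
  host_dir=$1; container_dir=$2; image=$3
  if is_shared_with_vm "$host_dir"; then
    docker run --rm -v "$host_dir:$container_dir" "$image" ls "$container_dir"
  else
    echo "warning: $host_dir is not under /Users; the container would see the VM's directory, not your Mac's" >&2
    return 1
  fi
}

# Example check with a made-up path (does not start a container).
is_shared_with_vm /Users/me/project && echo "visible inside the VM"
```

A call such as run_with_volume /Users/me/project /data busybox would then list the Mac directory from inside the container, while a path such as /tmp/data is rejected before Docker is invoked.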
If you want other hosts to access an exposed port, you need to use SSH port forwarding. You can also reach the VM from your Mac directly through the VM’s IP address. Two containers running on the same host can use these port mappings to communicate without a problem.
In the example below, a MySQL container is running inside Boot2Docker with port 3306 exposed, and the command forwards that port so that other hosts can connect to the container:
boot2docker ssh -L 0.0.0.0:3306:127.0.0.1:3306
Explanation:
- “boot2docker ssh” – initiates the SSH connection from your Mac to the Boot2Docker VM
- “-L” – forwards a local port to the other side of the SSH connection
- “0.0.0.0:3306” – binds locally to all available IP addresses on port 3306
- “127.0.0.1:3306” – forwards all traffic to port 3306 on the VM’s localhost
Once this is set up, remote hosts can connect to your Mac IP on port 3306, and the traffic will be terminated inside the VM on port 3306.
To connect from your Mac directly to the exposed port inside Boot2Docker, run boot2docker ip. It prints the Boot2Docker VM’s IP address, which listens on any ports that are exposed by containers running on the VM.
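As a rough illustration of the direct route, the helper below builds a host:port pair from the VM’s address. The function name vm_endpoint is hypothetical, and 192.168.59.103 is only an example address; yours is whatever boot2docker ip prints.

```shell
# Hypothetical helper: print host:port for a container port exposed on the
# Boot2Docker VM. The optional second argument overrides the IP lookup,
# which is handy on machines where boot2docker is not installed.
vm_endpoint() {
  port=$1
  ip=${2:-$(boot2docker ip)}
  printf '%s:%s\n' "$ip" "$port"
}

# Example with an explicit (made-up) address.
vm_endpoint 3306 192.168.59.103
```

With the MySQL container from the example above, a client on the Mac could then connect with something like mysql -h "$(boot2docker ip)" -P 3306 -u root -p, without any SSH forwarding.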
2. Docker Image Tagging
When you first start moving a release to production or staging servers, you’ll probably want your hosts to pull Docker images that are either tagged “latest” or from your master branch. If you choose to tag images by build number, version, revision, or a combination of the three, deployment will be a bit trickier because you’ll have to know exactly which revision to pull.
The “latest” tag attaches only to the most recent image ID that you pulled. The rest of the images in the host’s history lose their tags and are listed as “<none>” in the output of docker images.
Tip:
Assign each image both a “version” tag and the “latest” tag. Hosts can keep pulling “latest”, and you can still roll back easily by pulling a specific version of that image.
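One way to apply this tip is to tag every build twice. The sketch below only prints the commands so that they can be reviewed first; the function name tag_commands, the image name myorg/myapp, and the version 1.4.2 are all hypothetical.

```shell
# Print the commands that tag one build as both a fixed version and "latest".
# Nothing is executed; pipe the output to sh once it looks right.
tag_commands() {
  image=$1; version=$2
  echo "docker build -t $image:$version ."
  echo "docker tag $image:$version $image:latest"
  echo "docker push $image:$version"
  echo "docker push $image:latest"
}

# Example with made-up names.
tag_commands myorg/myapp 1.4.2
```

Rolling back is then a matter of pulling the earlier version tag (for example myorg/myapp:1.4.1) instead of “latest”.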
If you want to learn more about Docker tagging and naming, see their documentation or leave a comment below.
3. Leftovers
Once your container has been running for a while and you are ready to pull the next version and stop the previous one, you can run into the problem that the disk space used by the previous container remains allocated. The reason: after a container exits or is stopped, Docker keeps it, because its container ID might still need to be referenced to create images. Additionally, development machines tend to accumulate lots of containers, images, and intermediary tags, and in turn, their file systems become congested with ghost containers and images.
Tip:
Even if you mapped volumes, remember that a stopped container still holds other items that take up space. To prevent your hosts from running out of space, start containers with “--rm” so that they are removed as soon as they exit.
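A minimal sketch of that habit is a wrapper that always adds the flag. The name run_ephemeral and the DOCKER override variable are assumptions made for illustration, not Docker features.

```shell
# Hypothetical wrapper: start a throwaway container that is removed as soon
# as it exits. Setting DOCKER=echo prints the command instead of running it.
run_ephemeral() {
  "${DOCKER:-docker}" run --rm "$@"
}

# Real use, e.g.: run_ephemeral busybox true
```

Containers started this way never show up in docker ps -a after they finish, so there is nothing left to clean up later.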
To clear out ghost containers, you need to use:
docker ps -a | grep 'weeks ago' | awk '{print $1}' | xargs docker rm
- “docker ps -a” – lists all containers, both running and stopped
- “grep 'weeks ago'” – keeps only the rows that contain “weeks ago”, that is, containers created a few weeks back (rows that read “months ago” are not matched)
- “awk '{print $1}'” – extracts the container ID column alone (the first column)
- “xargs docker rm” – runs docker rm on each of the previously generated container IDs
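Before deleting anything, it can help to preview which IDs the pipeline selects. The sketch below wraps the grep and awk stages in a function (the name stale_container_ids is hypothetical) and feeds it made-up sample rows instead of live docker ps -a output.

```shell
# Print the container IDs that the cleanup pipeline would pass to "docker rm".
# Reads "docker ps -a" output on stdin, so it can be previewed offline.
stale_container_ids() {
  grep 'weeks ago' | awk '{print $1}'
}

# Sample rows are invented for illustration.
# Real use: docker ps -a | stale_container_ids
stale_container_ids <<'EOF'
CONTAINER ID   IMAGE       COMMAND    CREATED       STATUS
a1b2c3d4e5f6   mysql:5.6   "mysqld"   3 weeks ago   Exited (0) 3 weeks ago
0f9e8d7c6b5a   nginx       "nginx"    2 days ago    Up 2 days
EOF
```

Note that a container that is still running but was created weeks ago also matches the filter. docker rm refuses to remove a running container unless it is forced, so such rows produce an error message rather than a removal.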
To clear out ghost images, you can use:
docker images -a | grep 'weeks ago' | awk '{print $3}' | xargs docker rmi
- “docker images -a” – lists all images, both tagged and untagged
- “grep 'weeks ago'” – keeps only the rows that contain “weeks ago”, that is, images created a few weeks back (rows that read “months ago” are not matched)
- “awk '{print $3}'” – extracts the image ID column alone (the third column)
- “xargs docker rmi” – runs docker rmi (remove image) on each of the previously generated image IDs
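As an alternative to matching on the age text, Docker’s own filters can select the leftovers by state. This is a sketch of that swapped-in approach, not the method described above; the function names and the DOCKER override variable are hypothetical, while the --filter status=exited and --filter dangling=true options are standard Docker CLI filters.

```shell
DOCKER=${DOCKER:-docker}   # set DOCKER=echo to print commands instead of running them

# IDs of containers that have exited (running containers are never listed).
exited_containers() { "$DOCKER" ps -a -q --filter status=exited; }

# IDs of untagged ("dangling") images, such as those that lost their tag.
dangling_images() { "$DOCKER" images -q --filter dangling=true; }

# Remove both kinds of leftovers, skipping a step when there is nothing to remove.
cleanup() {
  ids=$(exited_containers)
  if [ -n "$ids" ]; then "$DOCKER" rm $ids; fi
  ids=$(dangling_images)
  if [ -n "$ids" ]; then "$DOCKER" rmi $ids; fi
}
```

Running cleanup removes every exited container and every dangling image regardless of age, so it is more aggressive than the “weeks ago” pipelines; calling exited_containers and dangling_images first shows what would be removed.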
There are additional challenges when using Docker Compose (once called “Fig”) to spin up a whole platform locally for development purposes, which I will explore in a future post.