Integration woes with docker containers and NixOS

This article explains how we use Hocker with NixOS at Awake Security, a topic I deferred in the blog post introducing the Hocker project: Hocker, I can’t believe it’s not docker!

At a glance, you can use the hocker suite of utilities to:

Why did we need this?

  1. The stock docker tools in nixpkgs do not support retrieval of docker images from V2 registries
  2. Some deployment scenarios may not permit an internet connection, which prevents docker pull from working; we therefore need to include docker images in the system’s closure
  3. We wanted efficient storage and deployment of docker images by storing the layers in the Nix store (deduplicated!) instead of complete images (no deduplication!)

Few alternatives to docker exist for pulling images from the docker registry, so we chose to write Hocker. We strove to write a generally useful tool that anyone could use, including people outside the Nix and NixOS community. Because Awake Security deploys its product using both Nix and NixOS, we also needed to integrate Hocker with them; we wrote this integration in Nix and named it fetchDocker.

Two primary goals motivated the Hocker and Hocker+Nix integration project:

  • we wanted to produce a generally useful and easily-composed suite of utilities
  • we wanted to maintain the registry’s decomposition of docker images by layer, translating that decomposition (the registry’s image manifest) into something Nix understands

The second goal didn’t come along until later in the Hocker project’s history because we tried to do the naive (and faster) thing first: the integration we wrote fetched complete docker images into the Nix store as docker load-able tarballs. This method worked fine in isolated developer testing for small, quick-to-download docker images that didn’t share any layers.

We realized after using this naive integration in production that it degraded the developer experience for docker images ranging from a couple hundred megabytes to a couple of gigabytes in size. Additionally, because the hash of the complete docker image tarball wasn’t known beforehand, the integration tooling needed to prefetch the complete docker image tarball in order to calculate a sha256 hash so that Nix could treat the docker image’s derivation deterministically.

The developer experience degraded because prefetching the docker images required too much time and too much space to generate the Nix code integrating new docker image changes into the production NixOS system. A second-order effect of the network bandwidth and storage requirements and resulting slow-running integration tooling was that developers, understandably, wanted to offload this work to a CI oracle. This complicated the CI infrastructure and further disempowered developers by introducing a single point of failure (the oracle) and asynchronous failures, all before any build work even began!

We think generating Nix code to integrate a docker image into a NixOS system should feel reasonably quick and light and it should not require a CI oracle to run. We needed to make it better. This motivated us to improve the efficiency of the Hocker and NixOS integration.

To improve the poor developer workflow, we began by storing the layers of every image along with a script to assemble a composite image from those constituent layers. Storing docker images this way reduced disk-space and network usage because the Nix store deduplicated any shared layers.
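To make the deduplication argument concrete, here is a minimal sketch of a content-addressed layer store. It is a toy in-memory model, not Hocker's code and not how the Nix store is implemented; the LayerStore class, the image names, and the layer contents are all hypothetical and exist only for illustration.

```python
import hashlib


class LayerStore:
    """Toy content-addressed store: blobs are keyed by their sha256 digest."""

    def __init__(self):
        self.blobs = {}   # digest -> layer bytes
        self.images = {}  # image name -> ordered list of layer digests

    def add_image(self, name, layers):
        digests = []
        for blob in layers:
            digest = "sha256:" + hashlib.sha256(blob).hexdigest()
            # A layer that is already present is not stored a second time.
            self.blobs.setdefault(digest, blob)
            digests.append(digest)
        self.images[name] = digests

    def assemble(self, name):
        """Return the image's layers in order (roughly what an assembly script would do)."""
        return [self.blobs[d] for d in self.images[name]]

    def stored_bytes(self):
        return sum(len(b) for b in self.blobs.values())


# Two hypothetical images that share the same base layer.
base = b"base-os-layer" * 1000
store = LayerStore()
store.add_image("app-a", [base, b"app-a-binaries"])
store.add_image("app-b", [base, b"app-b-binaries"])

# Storing each image as one complete tarball would keep the base layer twice.
naive_bytes = 2 * len(base) + len(b"app-a-binaries") + len(b"app-b-binaries")
print(store.stored_bytes(), "bytes stored vs", naive_bytes, "without deduplication")
```

The shared base layer is stored once, while each image remains recoverable from its ordered list of digests.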

The integration tooling retrieves the inventory of layers for a docker image from the docker registry manifest file, which keys each layer by its content-addressable sha256 hash. Keying layers by their content-addressable hash improves the developer workflow in a second, significant way: we could remove the expensive prefetch step. The prefetch step was only necessary to produce one piece of information needed to construct a deterministic (or “fixed-output”) Nix derivation: the sha256 hash of the layer. Because we can reuse the sha256 provided by the docker manifest, we don’t need to download the layer and compute the hash of its contents ourselves.
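The idea of reusing the manifest's hashes can be sketched as follows. This assumes a manifest shaped like the Docker Image Manifest V2, Schema 2 format (a "layers" array whose entries carry a "digest" of the form "sha256:<hex>"); the sample layers, the layer_specs and verify_layer helpers, and the shape of the resulting records are hypothetical and are not fetchDocker's actual output.

```python
import hashlib
import json

# Hypothetical layer contents standing in for real layer blobs.
layer_blobs = [b"layer one contents", b"layer two contents"]


def digest_of(blob):
    return "sha256:" + hashlib.sha256(blob).hexdigest()


# A sample manifest built locally so the digests match the blobs above.
manifest_json = json.dumps({
    "schemaVersion": 2,
    "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
    "layers": [
        {
            "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
            "size": len(blob),
            "digest": digest_of(blob),
        }
        for blob in layer_blobs
    ],
})


def layer_specs(manifest_text):
    """Read each layer's hash from the manifest without downloading any layer."""
    manifest = json.loads(manifest_text)
    specs = []
    for layer in manifest["layers"]:
        algorithm, _, hex_hash = layer["digest"].partition(":")
        if algorithm != "sha256":
            raise ValueError("unsupported digest algorithm: " + algorithm)
        specs.append({"sha256": hex_hash, "size": layer["size"]})
    return specs


def verify_layer(blob, spec):
    """Check a fetched layer against the hash that was known in advance."""
    return hashlib.sha256(blob).hexdigest() == spec["sha256"]


specs = layer_specs(manifest_json)
for spec in specs:
    print(spec["sha256"], spec["size"])
```

Only the small manifest has to be read to learn every hash; a layer is hashed again only when it is eventually fetched, to confirm it matches the value declared up front.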

We transformed our docker image NixOS integration tooling from an expensive one into a quick, efficient one by decomposing docker images into their constituent layers, deduplicating the shared layers, and reusing the known content-addressable hashes of the layers to generate deterministic Nix derivations of those layers to avoid expensive prefetching.

Conclusion

Hocker makes declaratively fetching and running docker containers in a NixOS system easy. Hocker, the Docker Registry, and NixOS work in concert to efficiently generate granular deterministic Nix derivations that fetch and deduplicate the binary artifacts of a docker image into the Nix store, enabling a NixOS system to run a docker container without ever using docker pull at run-time.

You can find hocker on GitHub and Hackage.

Notes

Thanks to Gabriel Gonzalez (@GabrielG439) for reading drafts and providing feedback.