TL;DR: balena is a new container engine based on the Moby Project, with an emphasis on embedded and IoT use cases, and fully compatible with Docker containers. Compared to the standard Docker container engine, it supports container deltas for 10-70x more efficient bandwidth usage, has 3.5x smaller binaries, uses RAM and storage more conservatively, and focuses on atomicity and durability of container pulling. You can download it here.
Since 2013, when we first ported Docker to ARMv6 and the Raspberry Pi, the resin.io team has been working in and around the Docker codebase. We have ported it to many more architectures, built an OS intended to assist Docker operation on embedded devices, and released our popular Docker base image series for embedded use cases.
Having seen IoT devices used in production for tens of millions of hours, we’ve become intimately acquainted with the unique needs of the embedded world. Until recently, we addressed these by either making small modifications to Docker itself or building larger components outside of it, but that approach had reached its limits. Meanwhile, as the Docker binaries have grown in functionality, they have also grown in size, eating away at our limited space budget on IoT devices.
With the announcement of the Moby Project, we were given the opportunity to remedy the situation in a way that fully supports the Docker community. So we built a container engine that is based on Moby technology from Docker, shares the Docker components that are needed for our use case, and is augmented with the specific features that we’ve built out over time.
Today, we’re announcing balena, a container engine based on Moby technology from Docker, informed by all the knowledge we’ve accumulated from working with device fleet owners in production scenarios, ranging from turbines to drones and everything in between. Compared to the standard Docker container engine, balena:
- Adds the ability to perform true container deltas, 10-70 times more bandwidth efficient than the standard layer-based Docker pull.
- Adds the ability to perform atomic and durable image pulling, meaning you won’t end up with a corrupted container if power is lost in the midst of an update. This is not something one plans for in a datacenter, but is a daily occurrence for devices in the physical world.
- Is conservative about how much it writes to the filesystem, performing on-the-fly extraction of pulled layers and avoiding writing the compressed layer to disk.
- Prevents page cache thrashing during image pull in order to work well in low-memory situations.
- Supports build-time volumes, enabling more flexible container building workflows. This is also a feature needed for the resin.io builder.
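To make the atomic pull and on-the-fly extraction points above more concrete, here is a minimal Python sketch of the general technique. It is not balena's code: the function name `pull_layer`, the `.staging-` prefix, and the choice to handle only regular files are assumptions made for illustration, and it assumes Linux and a `final_dir` that does not exist yet.

```python
import io
import os
import shutil
import tarfile
import tempfile


def pull_layer(stream, final_dir):
    """Unpack a gzip-compressed tar stream into final_dir, all or nothing.

    Hypothetical helper for illustration only.
    """
    parent = os.path.dirname(os.path.abspath(final_dir))
    # Extract into a staging directory on the same filesystem as final_dir.
    staging = tempfile.mkdtemp(prefix=".staging-", dir=parent)

    # Mode "r|gz" treats the input as a forward-only stream, so the layer is
    # decompressed on the fly and the compressed blob never touches the disk.
    with tarfile.open(fileobj=stream, mode="r|gz") as tar:
        for member in tar:
            parts = member.name.split("/")
            # Sketch only: keep regular files with safe relative names.
            if not member.isfile() or member.name.startswith("/") or ".." in parts:
                continue
            target = os.path.join(staging, member.name)
            os.makedirs(os.path.dirname(target), exist_ok=True)
            with open(target, "wb") as dst:
                shutil.copyfileobj(tar.extractfile(member), dst)
                dst.flush()
                os.fsync(dst.fileno())  # file contents are on disk

    # Publish in one step: the rename either fully happens or not at all,
    # so a reader never sees a half-extracted final_dir.
    os.rename(staging, final_dir)

    # Sync the parent directory so the rename itself survives a power cut.
    # (A fuller implementation would also sync the nested directories.)
    dir_fd = os.open(parent, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


if __name__ == "__main__":
    # Build a tiny compressed "layer" in memory and pull it.
    blob = io.BytesIO()
    with tarfile.open(fileobj=blob, mode="w:gz") as tar:
        payload = b"hello from a layer\n"
        info = tarfile.TarInfo("etc/motd")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    blob.seek(0)
    dest = os.path.join(tempfile.mkdtemp(), "layer")
    pull_layer(blob, dest)
    print(os.listdir(dest))
```

If power is lost before the rename, only the staging directory is left behind and can be discarded on the next start. After the rename, the complete layer is in place.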
Finally, balena is a single binary that is 3.5x smaller than the Docker CE binaries. We have released balena binaries for all the architectures that resinOS works on, including armv5, armv6, armv7, aarch64, i386, and x86_64.
To achieve its reduced size, balena drops several features of Docker that are not needed in the embedded scenarios we’re focusing on. As such, balena does not include Docker Swarm, support for plugins, cloud logging drivers, overlay networking drivers, or stores that are not backed by boltdb, such as etcd, consul, and zookeeper. Balena is only available for Linux, not for Mac or Windows.
As a result of these changes, balena performs incredibly well in the embedded scenarios we care about. Let’s get to a technical comparison:
Just how much more bandwidth-efficient are true container deltas than standard layer pulls? Let's have a look:
| Pull test | Normal size | Delta size | Improvement |
|---|---|---|---|
| Adding a package to the list of apt-get install | 78.9 MB | 1.1 MB | 71.6x |
| Switching Ghost version from 1.12.0 to 1.12.1 | 368.1 MB | 21.8 MB | 16.9x |
| Changing the base from node:6-slim to node:4-slim | 426.1 MB | 56.6 MB | 7.5x |
You can run these comparisons yourself using the benchmark here.
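For readers who have not worked with deltas before, the toy Python sketch below shows why sending a delta can be so much smaller than sending whole layers. It is an assumption-laden illustration, not balena's delta format or algorithm: the names `make_delta` and `apply_delta` and the fixed 4096-byte block size are made up for this example, and real delta tools typically use rolling checksums so that matches need not be block-aligned.

```python
import hashlib

BLOCK_SIZE = 4096  # illustrative choice, not a balena parameter


def make_delta(old, new, block_size=BLOCK_SIZE):
    """Describe `new` as copies from `old` plus literal data."""
    # Index every block of the old content by its hash.
    index = {}
    for offset in range(0, len(old), block_size):
        digest = hashlib.sha256(old[offset:offset + block_size]).digest()
        index.setdefault(digest, offset)

    ops = []
    for offset in range(0, len(new), block_size):
        chunk = new[offset:offset + block_size]
        match = index.get(hashlib.sha256(chunk).digest())
        if match is not None and old[match:match + len(chunk)] == chunk:
            # The receiver already has these bytes: send a reference only.
            ops.append(("copy", match, len(chunk)))
        else:
            # New content: the bytes themselves must be transferred.
            ops.append(("data", chunk))
    return ops


def apply_delta(old, ops):
    """Rebuild the new content from the old content and a delta."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, offset, length = op
            out += old[offset:offset + length]
        else:
            out += op[1]
    return bytes(out)


def literal_bytes(ops):
    """Bytes of new data the delta carries (ignoring small op headers)."""
    return sum(len(op[1]) for op in ops if op[0] == "data")
```

With an old image of four blocks and a new image that changes only one of them, `literal_bytes(make_delta(old, new))` is a single block, while a full transfer would resend all four.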
And how much smaller are balena binaries than standard Docker?
| Architecture | Docker binaries | Balena binary | Improvement |
|---|---|---|---|
And finally, how page cache friendly is an image pull using balena?
When a container engine downloads an image it needs to iterate over every file of every layer and write it to disk. This is a lot of data, most of which will never be read by the application.
Docker optimizes for pulling speed and uses as much page cache as possible during layer extraction. This can cause useful application pages to get evicted and impact IO performance. Balena takes a more conservative approach. Every file is synced to disk, and then balena informs the kernel that it can throw away its pages, so page cache usage is minimal. The downside of this approach is that pull is slower, but we believe the tradeoff makes sense for IoT use cases. We have some ideas for getting pull speed back up to par that we'll be exploring in future releases.
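The "sync, then tell the kernel to drop the pages" pattern described above can be sketched in a few lines of Python. This is only an illustration of the general Linux mechanism (`fsync` followed by `posix_fadvise` with `POSIX_FADV_DONTNEED`), not balena's implementation. The function name `write_without_caching` is hypothetical, and the source does not say which system calls balena actually uses.

```python
import os
import tempfile


def write_without_caching(path, data):
    """Write data to path, force it to disk, then release its cached pages.

    Hypothetical helper for illustration only.
    """
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        fd = f.fileno()
        # Dirty pages cannot be discarded, so flush them to disk first.
        os.fsync(fd)
        # Advise the kernel that these pages will not be needed again.
        # offset=0 and length=0 cover the whole file. This is a hint, and
        # the call is unavailable on some platforms, hence the guard.
        if hasattr(os, "posix_fadvise"):
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "layer-file.bin")
    write_without_caching(target, b"\0" * (1024 * 1024))
    print(os.path.getsize(target))
```

The extra `fsync` per file is where the slower pull comes from: each write has to reach the disk before the next file is handled, instead of being batched in memory.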
Unless you depend on one of the features in Docker that balena omits, using balena should be a drop-in replacement, with the different tradeoffs described above.
Commands such as balena run work exactly like you’d expect them to. Balena also builds Dockerfiles, pulls containers from docker.io, and in general shares the core architecture of any Moby assembly, including Docker itself. In fact, you can install balena alongside Docker and run a comparison on your own system!
To help align expectations, balena will be versioned with the equivalent version of Docker it’s closest to. We will also add a revision marker at the end of the version string to indicate incremental updates within the same release cycle. This will enable us to release fast updates, which we expect will be needed in the early days, as we respond to community feedback. So the first version of balena will be 17.06-rev1 and we expect to continue versioning along these lines.
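If you need to compare balena versions in tooling, a version string like 17.06-rev1 is easy to split into its Docker release and revision parts. The Python sketch below assumes that future versions keep exactly the `NN.NN-revN` shape of the first release, which is not guaranteed. The name `parse_version` is made up for this example.

```python
import re

# Assumed pattern, based only on the example "17.06-rev1".
VERSION_RE = re.compile(r"^(\d+)\.(\d+)-rev(\d+)$")


def parse_version(text):
    """Return (major, minor, revision) as integers, e.g. (17, 6, 1)."""
    match = VERSION_RE.match(text)
    if match is None:
        raise ValueError("unrecognized balena version: %r" % (text,))
    major, minor, revision = (int(part) for part in match.groups())
    return (major, minor, revision)
```

Because the result is a tuple of integers, ordinary comparison orders revisions numerically, so a hypothetical 17.06-rev10 sorts after 17.06-rev9.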
We’re incredibly excited to have reached this part of our journey working with containers in embedded/IoT use cases, and grateful to Docker for enabling us to do this much more cleanly than the alternative. The Docker team has gone above and beyond to enable projects such as balena with the release of the Moby Project, and that deserves special praise.
You will find all the balena releases at balena.io and you can see the source code on GitHub.
As far as other resin.io projects are concerned, future releases of resinOS will use balena as the default container engine. This also means that resin.io users will seamlessly migrate to this new engine as they move their fleets to newer resinOS versions. We don’t expect anything to look or feel different in any way, other than across-the-board improvements and new features being enabled.
We look forward to hearing from all of you on what you think about balena and where it should go next. We'll also be at DockerCon EU (booth S8) if you'd like to stop by and say hello!