Container use has exploded far beyond its initial niche of developers building cloud-native applications and microservices. Containers are now promoted as cloud-neutral platforms to run legacy enterprise applications and as facilitators of workload migrations between data centers and various cloud services.
While containers can be used for large, monolithic applications, that doesn't mean developers should ignore efficiency and blindly convert a legacy application and its bloated system environment into a single container image. Indeed, designers of containerized applications or packages should keep the timeless KISS -- keep it simple, stupid -- principle top of mind.
In an era of 48-core servers with half a terabyte of memory and where anyone can launch similarly sized Elastic Compute Cloud instances within minutes, it might seem like container size limits should be the last thing on the mind of DevOps staffers. It's critical, however, not to forget a key reason for using containers in the first place: the ability to rapidly deploy, scale and migrate workloads on container clusters. And, when it comes to deployment performance, size makes a difference.
General principles of container design dictate sizing considerations
While containerization is new enough that design standards continue to evolve, Red Hat has developed a comprehensive set of principles. Many of the directives in Red Hat's principles whitepaper deal with how to operate in the unique environment of a Kubernetes-managed container cluster, in which containers can't rely on a consistent hardware environment from one invocation to the next; instead, they must be able to be stopped, moved and restarted in short order.
Other Red Hat principles relevant to container scope and size include the following:
- Single-concern principle. This derives from the single responsibility guideline for object-oriented software. It promotes reuse and replaceability, and it improves source code change management and debugging. The overarching rule is that each container should have one -- and only one -- reason to change, something that's impossible to abide by when multiple functions with various dependencies are united in a single container. As Red Hat puts it, "Every container should address a single concern and do it well." When developers adhere to the single-concern principle, containers will be smaller, easier to manage, faster to deploy and less complicated to update.
- Self-containment principle. This arises from a fundamental property of containers. It requires that a container bundle all the custom code, application libraries and language runtimes it needs, and that it rely only on the system kernel. Superficially, the self-containment principle seems to encourage larger containers. By also adhering to the single-concern principle, however, KISS-minded developers can resist building bloated containers, eliminating any application and language libraries that aren't required. Holding containers to a single concern and bundling the bare minimum of code and libraries results in a smaller, more efficient container image. Small, focused containers mean decomposing all but the most trivial applications into multiple independent containers. For example, web server, database and user interface applications would no longer be deployed together. With a composite of smaller images, the orchestration system has more flexibility in how it schedules and scales those components in need of more resources.
- Runtime confinement principle. This states that containers must characterize and declare their system requirements in three dimensions: CPU usage, system memory and persistent storage. Such information is critical to enabling container orchestration platforms to schedule, place, move and automatically scale containers on clusters. Declaring its requirements amounts to a contract between the container developer and the infrastructure operator stating that the application will not exceed the specified resources. The runtime confinement principle is relevant to size because the smaller the container, both physically and in CPU or memory use, the easier it is for an orchestration system to manage and efficiently deploy in a heterogeneous container cluster.
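On Kubernetes, for instance, the runtime confinement principle takes concrete form as resource requests and limits in a pod specification. The following is a minimal sketch; the container name, image tag and resource figures are illustrative, not prescriptive:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web                      # hypothetical single-concern container
spec:
  containers:
  - name: web
    image: example/web:1.0       # illustrative image name
    resources:
      requests:                  # minimum the scheduler must reserve
        cpu: "250m"
        memory: "128Mi"
      limits:                    # ceiling the container agrees not to exceed
        cpu: "500m"
        memory: "256Mi"
```

The scheduler uses the requests to place the pod on a node with sufficient capacity, while the limits enforce the "contract" that the workload won't exceed its declared resources.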
Indeed, the first item in Docker's best practices documentation describes how to keep images small. Specifically, Docker suggests that developers do the following:
- Start with a lean base image tailored to your application.
- Use multistage builds for languages that require runtime environments, such as Java. Eliminate the libraries and dependencies used during the build from the final, deployed container image, and include only the artifacts and the environment needed to run them.
- Exploit layering to create base images to be shared among multiple images with lots of common code.
In general, developers should include only the bare minimum required files in container images and be sure to clean up any temporary files or build artifacts before packaging images.
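The multistage-build advice above can be sketched as a Dockerfile for a Java application. Base image tags, paths and the artifact name here are illustrative assumptions; the pattern, not the specifics, is the point:

```dockerfile
# Build stage: includes the JDK, Maven and all build-time dependencies
FROM maven:3.9-eclipse-temurin-17 AS build
WORKDIR /app
COPY pom.xml .
COPY src ./src
RUN mvn -q package -DskipTests

# Final stage: only a JRE and the built artifact ship in the image
FROM eclipse-temurin:17-jre-alpine
WORKDIR /app
COPY --from=build /app/target/app.jar ./app.jar
ENTRYPOINT ["java", "-jar", "app.jar"]
```

Everything in the first stage -- the JDK, Maven and intermediate build artifacts -- is discarded; only the final stage's layers become the deployed image.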
Container properties promote lean sizing
Whether it is an iPhone app or a container image, smaller packages load faster than large ones. Container implementations, however, mitigate the overhead of packaging large, monolithic applications with multiple dependencies by adopting the Open Container Initiative (OCI) image standard and a layered file system, which together enable developers to bundle multiple dependent components as separate, shareable layers.
Most enterprise container implementations and cloud services support the image specification developed by the OCI. The specification defines a composite format in which an image is composed of multiple layers of compressed archive files linked by a JSON manifest file. The format promotes several of the guiding values of the OCI and container use more broadly, namely module independence, composability, portability and minimalism.
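The shape of such a manifest looks roughly like the following sketch. The media types are those defined by the OCI image specification; the digests are abbreviated with ellipses and the sizes are illustrative:

```json
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.manifest.v1+json",
  "config": {
    "mediaType": "application/vnd.oci.image.config.v1+json",
    "digest": "sha256:…",
    "size": 1469
  },
  "layers": [
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:…",
      "size": 2811478
    },
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:…",
      "size": 524288
    }
  ]
}
```

Because each layer is addressed by its content digest, layers shared among multiple images need to be stored and transferred only once.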
As a Red Hat developer blog on image sizing concludes, having "a basic understanding of Docker's layered file system can make a big difference in the size of your image." Large images become an impediment when having to deploy thousands of containers across a cluster. Also, the Red Hat recommendations reiterate the point that large, composite containers are hard to update and patch, and they can lead to bugs and security vulnerabilities.
In sum, a smaller container image is easier to deploy, debug, update and manage. It is also more secure. Further, smaller images make it easier for cluster management systems to place and scale containerized applications.