Docker Image Bloat — 1.2GB Java Image Killed Friday Deploy
A 1.2GB Java image caused 8-minute CI builds and pull timeouts on EKS.
20+ years shipping production infrastructure and CI/CD at scale. Lessons pulled from things that broke in production.
- Docker images are built as layers; each instruction adds a new read-only layer
- Layer caching speeds up rebuilds only if earlier layers haven't changed
- Multi-stage builds separate build-time deps from runtime, cutting image size by up to 90%
- Slim base images (Alpine, distroless) reduce attack surface and pull time
- Biggest mistake: installing build tools in the final image — they're never needed at runtime
- Another overlooked win: use a .dockerignore file to exclude local caches and secrets from the build context
- Use docker image history --no-trunc to see the size and command of each layer
- Track compressed size, not uncompressed: pull time depends on the former
- Set a size budget per service and enforce it in CI to prevent silent bloat creep
Imagine you're packing a suitcase for a weekend trip. A beginner throws in every piece of clothing they own — just in case. An experienced traveller packs only what they'll actually wear. Docker image optimisation is exactly that: ruthlessly removing everything your app doesn't need at runtime so the 'suitcase' is as light as possible. A 2 GB image and a 50 MB image can run identical apps — the difference is just whether you packed wisely.
Every second your CI pipeline spends pushing a bloated Docker image to a registry is a second your deployment is blocked. At scale — dozens of services, hundreds of deploys per day — a 1 GB image versus a 100 MB image isn't a minor aesthetic preference, it's the difference between a 30-second deploy and a 5-minute one. It compounds across your entire fleet, inflates egress costs on AWS/GCP/Azure, and widens your attack surface because every unused package is a potential CVE waiting to be exploited.
The root cause is almost always the same: Dockerfiles written like shell scripts, with one giant RUN block, a fat base image chosen for convenience, build tools left behind after compilation, and secrets accidentally baked into layers. Docker's union filesystem means every layer is permanent history: you can't 'delete' a file from a previous layer by removing it in a later one. That's the dirty secret of Docker bloat: the bytes you think you deleted are still there, just hidden, and they cost you on every pull.
By the end of this article you'll be able to diagnose a bloated image using real tooling, rewrite Dockerfiles using multi-stage builds and layer-cache discipline, choose the right minimal base image for your workload, and avoid the production gotchas that catch even experienced engineers off guard. We'll go deep on the internals — because understanding why Docker layers work the way they do is what separates a developer who memorises tricks from one who can solve novel problems. Understanding layers isn't trivia — it's the difference between cutting 80% of image size and cutting 0%.
Here's the hard truth: most teams don't realise how much bloat costs them until their registry bill hits four figures. One team we consulted had a 2.3GB image for a simple Go webserver. After applying the techniques in this article, they dropped it to 12MB. That's not optimisation — that's elimination.
What Is Optimising Docker Images?
At its heart, image optimisation is about understanding Docker's union file system and using it deliberately. Each Dockerfile instruction adds a new layer. The image's total compressed size isn't just the sum of final files — it includes everything that was written in earlier layers, even if later instructions delete or overwrite them. That's the trap most beginner Dockerfiles fall into: they install compilers, download Maven, compile the app, remove the compiler — but the compiler's bytes still live in an older layer, never to be recovered.
The real measure isn't the size you see when you run docker images — that's the uncompressed size. Pull time is based on compressed size, and registry storage costs are typically based on compressed size as well. So optimisation targets both. A 2GB uncompressed image might compress to 700MB, still far too large for a microservice that does nothing but serve HTTP.
Optimisation isn't a one-time thing. It's a discipline: layer ordering, multi-stage builds, base image selection, and cache management. Get it right and your deploys are faster, your registry bill drops, and your attack surface shrinks. Get it wrong and you're paying for every unnecessary megabyte, every day.
One more thing: compressed vs uncompressed matters. That 2GB image may compress to 700MB on push, but when pulled over a 100Mbps link, that's still 56 seconds of network time. Every megabyte has a cost — even if it's not obvious from docker images.
Here's a nuance most guides miss: layer deduplication across images. If you have ten microservices all built on debian:stable-slim, each pulls the same base layer once on a node. But if each uses a different apt-get install in the first RUN layer, those layers aren't shared. That's why keeping common dependencies in a shared base image saves both build time and node storage.
- Use dive to inspect each layer's filesystem.
- Check the compressed size with docker manifest inspect or regctl.
- Run dive before pushing to a registry, and apply multi-stage builds where it shows wasted space.
Understanding Docker Layers and the Union Filesystem
Docker images are built from a series of read-only layers. Instructions that change the filesystem (RUN, COPY, ADD) each create a new layer; instructions such as ENV or CMD only add metadata. The overlay2 storage driver, a union filesystem, stacks these layers and presents them as a single filesystem. This is why deleting a file in a later layer doesn't reduce image size: the file still exists in an underlying layer.
Understanding this mechanism is the key to writing efficient Dockerfiles. A layer is reused from cache as long as its instruction text and its inputs (e.g., the files being copied) haven't changed. Once one layer changes, every layer after it is rebuilt, so a badly ordered Dockerfile can invalidate almost the entire cache on every commit.
The point: place instructions that change infrequently (like installing packages) early, and instructions that change with every code change (like COPY . /app) as late as possible.
But there's a deeper trap: RUN rm -rf /var/cache/apt in a separate instruction doesn't remove those files from the previous layer. The files are still there in the layer stack, just hidden. That's why you must combine apt-get install and apt-get clean in the same RUN instruction using shell operators. Every byte you clean inside the same layer is actually gone. Every byte you clean in a later layer is still costing you.
Here's a real-world number: a single RUN with apt-get install -y build-essential && apt-get clean saves about 30MB compared to splitting it into two RUN commands. That 30MB per layer adds up fast when you have 5-10 layers.
Want a mental model? Each layer is like a delta snapshot in Git. If you add a file in commit A and delete it in commit B, the blob still exists in the object store. Docker is the same — docker history shows every layer's bytes.
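The cleanup rule can be sketched as a Dockerfile fragment. This is a minimal illustration, assuming a Debian-based image and build-essential as the package being installed; both are placeholders, not taken from a real project.

```dockerfile
FROM debian:bookworm-slim

# Anti-pattern (commented out): the files written by the first RUN stay in
# that layer even though a later RUN deletes them.
# RUN apt-get update && apt-get install -y build-essential
# RUN rm -rf /var/lib/apt/lists/*

# Install and clean up in ONE instruction, so the package lists and
# downloaded archives are never committed to any layer.
RUN apt-get update \
 && apt-get install -y --no-install-recommends build-essential \
 && apt-get clean \
 && rm -rf /var/lib/apt/lists/*
```

Running docker image history on both variants should show the difference: in the split version the install layer keeps its full size and the rm layer adds almost nothing.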
- Each Dockerfile instruction adds a new read-only layer on top of the previous ones.
- If you install a package in one layer and remove it in the next, the package still exists in the lower layer — the image stays large.
- Use multi-stage builds to copy only the final artifact into a fresh, clean layer stack.
- Combine cleanup commands into the same RUN instruction to avoid wasting bytes.
- Use --cache-from to reuse layers from previous builds; it makes a large difference in CI pipelines.
- --squash is experimental and can cause cache misses, so use docker build --squash sparingly.
- To drop bytes hidden in lower layers, use docker build --squash (experimental) or a multi-stage build that copies only the final files into a fresh base.
Multi-Stage Builds: The Right Way to Compile and Package
The single most effective technique to reduce image size is multi-stage builds. Instead of using one Dockerfile that compiles your application and then runs it — leaving all build tools, source code, and intermediate artifacts in the final image — you split the process into two or more stages.
Stage 1 (build stage): Use a full SDK base image, install all build dependencies, compile your application. Stage 2 (runtime stage): Use a minimal base image (e.g., distroless, Alpine, or a JRE-only slim image) that contains only the runtime necessary to execute your compiled artifact. Then copy only the compiled output (e.g., JAR, binary) from the build stage.
The syntax uses FROM ... AS alias, and COPY --from=alias to grab files from an earlier stage.
This pattern eliminates build-time dependencies, reduces image size dramatically, and also improves security because the final image contains only what's needed at runtime.
One pattern that catches people out: copying the entire /build directory instead of just the artifact. If your static files are in /build/static but you also have node_modules in /build, they'll all come along. Be precise with your COPY paths. For Go apps, copy only the single binary. For Java, copy only the *.jar. For Python, you might need to copy the entire site-packages, but you can control it with a virtualenv.
A common gotcha: every COPY --from instruction still adds a layer, and separate COPY instructions can't be merged into one. If you need several files, stage them into a single directory in the build stage and copy that directory once. Otherwise accept the overhead: it's still far smaller than shipping the build stage.
Real example: a Java team at a fintech used multi-stage and dropped their image from 1.8GB to 145MB. The build stage contained Maven, JDK, and all source; the runtime stage had only the JRE and the fat JAR. Their deploy time dropped from 8 minutes to 45 seconds.
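A two-stage Dockerfile along these lines might look like the sketch below. It assumes a Maven project with pom.xml and src/ at the repository root that produces exactly one fat JAR in target/. The image tags and the app.jar name are illustrative choices, not the team's actual setup.

```dockerfile
# Stage 1: build with Maven and a full JDK. Nothing from this stage ships
# unless it is copied explicitly below.
FROM maven:3.9-eclipse-temurin-17 AS builder
WORKDIR /app
# Copy the POM first so dependency downloads stay cached until it changes.
COPY pom.xml .
RUN mvn -B dependency:go-offline
COPY src ./src
RUN mvn -B package -DskipTests

# Stage 2: runtime image containing only a JRE and the fat JAR.
FROM eclipse-temurin:17-jre
WORKDIR /app
# Assumes the wildcard matches a single JAR; the build fails if it matches several.
COPY --from=builder /app/target/*.jar app.jar
ENTRYPOINT ["java", "-jar", "app.jar"]
```

Maven, the JDK, the source tree, and the local repository all stay behind in the builder stage; only app.jar crosses into the final image.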
- With COPY --link (BuildKit), copied files go into a new layer that does not depend on the base layer, which changes how layers are shared between images.
- Use a precise path such as COPY --from=builder /app/target/*.jar (or /build/target/*.jar, depending on your WORKDIR) to exclude everything else.
Choosing the Right Base Image: Alpine, Slim, Distroless
The base image you choose sets the lower bound for your final image size and directly influences your attack surface. The trade-off is between size, package availability, and compatibility.
- Full images (e.g. node:latest) are huge (600MB+). They contain a full OS with utilities, compilers, and often unnecessary libraries. A general-purpose OS image such as ubuntu:latest is much smaller than that, but still carries utilities most services never use. Avoid both as runtime bases.
- Slim variants (e.g. node:18-slim, openjdk:11-jre-slim) strip out documentation, locales, and package manager caches. Typically 50-80% smaller than full images. Good for most apps that require standard glibc.
- Alpine (e.g. node:18-alpine) is based on musl libc. Very small (~5MB base) but can cause compatibility issues with binaries compiled against glibc. Works well for Go, Rust, and interpreted languages.
- Distroless (e.g., gcr.io/distroless/java17-debian11) contains only the runtime and the application — no shell, no package manager, no utilities. Minimal attack surface. Best for security-sensitive production workloads.
A real-world nuance: Distroless images from Google are based on Debian and use glibc, so they avoid the Alpine compatibility trap. But they lack a shell, so you can't exec into them for debugging. You'll need to set up sidecar debug containers or use kubectl debug with ephemeral containers. Many teams start with slim Debian and only go distroless after their security audit demands it.
Performance impact: pulling a full ubuntu:latest over 100Mbps takes ~6 seconds; pulling alpine:3.18 takes ~0.5 seconds. At 5.5 seconds per pull across 50 nodes, that is 275 seconds of cumulative pull time on every deploy.
- Using :latest can cause builds to break when the maintainer updates the image.
- Switching from a full base to its -slim variant (for example debian:12 to debian:12-slim; Ubuntu publishes no -slim tags) can reduce image size by 30-40% without any code changes.
- If your binaries need glibc and you pick Alpine, you'll face runtime errors. Test thoroughly before switching base images in production.
- FROM node:18-alpine works fine until you add a native npm module that needs glibc; then it fails at runtime.
- Run ldd on your binary inside the container to check what it really requires before choosing the base.
- When in doubt, default to a glibc-based -slim Debian image. Avoid Alpine unless you add libc6-compat and test.
Distroless vs Alpine Comparison Table
When you've narrowed your base image choices to Alpine and distroless, the decision often comes down to a trade-off between size, compatibility, and debuggability. The table below breaks down the key differences across the dimensions that matter in production.
| Dimension | Alpine (musl) | Distroless (glibc) |
|---|---|---|
| Base size | ~5 MB | ~20-50 MB (depends on language) |
| Compatibility | May break with dynamically linked glibc binaries; use libc6-compat | Full glibc compatibility — works with almost everything |
| Shell access | Yes (ash) | No — no shell, no package manager |
| Package manager | apk | None |
| Debugging | Can exec in with /bin/sh | Must use kubectl debug or debug sidecars |
| CVE density | Low base, but adding packages increases CVEs | Very low — only runtime libs |
| Build speed | Fast (small downloads) | Moderate (larger base, but stable) |
| Best for | Static binaries (Go, Rust), scripting without native deps | Java, Python with native extensions, security-audited workloads |
The most common mistake teams make is assuming Alpine is always the right choice because it's smallest. In reality, the 15-45 MB Alpine saves over distroless is negligible when your final image is already 100+ MB. The glibc compatibility of distroless often saves more time in debugging than the size saves in pull time.
Consider this rule of thumb: if your application or any of its dependencies rely on precompiled binaries or native extensions that expect glibc, choose distroless (or a slim Debian base if you still need to install OS packages with apt). If you control the entire dependency chain and can compile statically, Alpine is a valid option. But even then, distroless offers better security with no shell.
A common compromise: use a slim glibc-based image during development for ease of debugging, and switch to distroless in production builds to minimise attack surface.
- On Alpine, expect 'no such file' errors when a binary expects glibc.
Layer Cache Optimisation for CI/CD
In a CI pipeline, image rebuilds happen multiple times a day. A well-structured Dockerfile can reuse cached layers from previous builds, cutting build time from minutes to seconds.
The rule: order instructions by frequency of change. Start with system packages (almost never change), then language dependencies (change when you update dependencies), then application code (changes every commit).
Also, use .dockerignore to avoid sending unnecessary files (like .git, node_modules, target) to the Docker daemon — they invalidate the COPY layer.
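The ordering rule can be sketched for a Node.js service. The file names (package-lock.json, src/server.js) are assumptions about the project layout.

```dockerfile
FROM node:18-slim
WORKDIR /app

# 1. Dependency manifests change rarely: copy them alone so the install
#    layer is reused until package.json or package-lock.json changes.
COPY package.json package-lock.json ./
RUN npm ci --omit=dev

# 2. Application code changes on every commit: copy it last, so a code
#    change only invalidates the layers from this point on.
COPY src ./src
CMD ["node", "src/server.js"]
```

With this order, a commit that touches only src/ reuses the npm ci layer from cache and rebuilds just the final COPY.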
BuildKit (enabled by default in recent Docker versions) offers additional cache optimization: --cache-from to use remote caches, --mount=type=cache for persistent package caches across builds.
But there's a hidden cost: cache invalidation can be unpredictable. If your CI runner doesn't reuse the Docker cache between builds (e.g., ephemeral runners), you lose all the benefit of layer ordering. In that case, lean on BuildKit's --cache-from pointing to the previous build in your registry. That's the pattern that reduces 5-minute builds to 30 seconds.
Another trick: for monorepos with multiple Dockerfiles, share a common base layer by building a base image containing all system deps, then use FROM base in each service Dockerfile. This saves both build time and registry storage.
One thing that trips up teams: cache mounts (--mount=type=cache) persist on the host. If you're running on ephemeral CI runners like GitHub Actions hosted, they don't persist between runs. You need either a persistent cache volume or --cache-from with a registry.
Pro tip: use --cache-to and --cache-from together to push cache to a registry and pull it on the next build. This works even across different runners.
- Use --cache-from to pull a previous build's cache from a registry. This speeds up builds on remote CI runners dramatically and is the key to consistent cache reuse.
- Review your .dockerignore: one missing line there can invalidate the COPY layer on every build, killing your cache strategy.
- In monorepos, start service images with FROM shared-base:latest. That one layer is cached across all services, saving both build time and disk space on CI.
- Use --mount=type=cache for your package manager (npm, pip, apt). It persists downloads across builds without bloating the image.
.dockerignore Best Practices and Template
The .dockerignore file is the simplest, most overlooked tool for keeping images lean. It tells Docker which files and directories to exclude from the build context — preventing secrets, local caches, and unnecessary files from being sent to the Docker daemon. Without a .dockerignore, every COPY . instruction will include your entire project directory, including .git (sometimes hundreds of MB), node_modules, target, .venv, and IDE configuration files.
Your .dockerignore should at minimum exclude:
- .git/: avoids leaking repository history
- node_modules/, vendor/, __pycache__/: local dependency directories
- target/, build/, dist/ (unless you copy them explicitly): build artifacts
- .env, *.pem, credentials.json: secrets and credentials
- .gitignore, .dockerignore, Dockerfile*: build context files not needed in the image
- *.md, LICENSE: documentation files
- test/, tests/, spec/: test code that's not needed at runtime
- CI/CD config files (.github/, .gitlab-ci.yml, Jenkinsfile)
But be careful: .dockerignore patterns are matched relative to the root of the build context, and unlike .gitignore a leading slash makes no difference. Both /node_modules and node_modules match only the top-level directory. Use **/node_modules to catch nested directories.
One common mistake: excluding target/ in a Java project but then realizing the JAR is built into target/. The solution: use fine-grained COPY instructions rather than broad COPY . .. Better: copy only the specific output folder in a multi-stage build.
Here is a production-ready template that works for most language ecosystems:
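The exclusions listed above can be written out as a starting point. Treat this as a sketch to adapt to your project. The **/ prefixes are an added assumption that these directories may appear at any depth.

```dockerignore
# Version control and CI configuration
.git
.gitignore
.github
.gitlab-ci.yml
Jenkinsfile

# Local dependency directories and caches (any depth)
**/node_modules
**/__pycache__
vendor
.venv

# Build outputs (remove a line if a COPY instruction needs that directory)
target
build
dist

# Secrets and credentials
**/.env
**/*.pem
credentials.json

# Build-context files, documentation, and tests not needed at runtime
Dockerfile*
.dockerignore
*.md
LICENSE
test
tests
spec
```

In a multi-stage build the artifact comes from the builder stage via COPY --from, so ignoring target, build, and dist does not hide it from the final image.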
- If you rely on COPY . . to get compiled artifacts into the image, make sure your .dockerignore doesn't exclude those files. The pattern target would block target/my-app.jar; use a more specific ignore or switch to multi-stage builds.
- A missing .dockerignore is the #1 cause of accidental secret leaking in Docker images. I've seen production .env files baked into layers because the developer forgot to add it.
- A Java project that ignores target/ but then has COPY . . in the Dockerfile will get an empty build: the JAR is missing.
- Never use COPY . . in a production Dockerfile; always copy specific folders.
- .dockerignore only affects the build context sent to the daemon, not the files inside the image. It's a pre-filter, not a layer cleaner.
- Run docker build --no-cache . to force a fresh context send and verify your ignores.
- Commit the .dockerignore file and review it as part of the code review process. It's as important as the Dockerfile itself.
- Keep a .dockerignore in every project to prevent secrets and unnecessary files from inflating the build context and image layers. Excluding .git and node_modules alone prevents hundreds of MB from entering the build context.
- Always exclude .git and IDE files. It's a safety net.
Production Monitoring: Tracking Image Size Over Time
Image size tends to creep up over time as developers add new dependencies, install debugging tools for troubleshooting, or forget to clean up temporary files. Monitoring image size as a CI metric helps catch bloat before it reaches production.
You can integrate tools like docker scout or dive into your CI pipeline to fail builds if image size exceeds a threshold. Also, use docker image history to track the impact of each Dockerfile change.
Another approach: maintain a Dockerfile.sizelimit or use external tools like regctl to query registry manifests and track image size across tags.
One team we worked with added a simple CI step that compares new image size against the previous tag and fails if it increased by more than 5%. That single check caught three regressions in the first month, each caused by a developer adding a debugging library they forgot to remove.
Pro tip: store size metrics in a time-series database (e.g., InfluxDB) and graph them on a dashboard. A trend of +2% every week compounds to roughly +64% after 25 weeks, enough to blow through most budgets, but without a graph you'll only notice when the deploy fails.
I've also seen teams use GitHub Actions to post a comment on every PR comparing the new image size to the base branch. That transparency alone stops bloat — nobody wants to see 'Image increased by 45%' on their PR.
- Use regctl image digest and regctl image manifest to read image sizes from the registry without pulling the whole image. Perfect for CI checks.
- Use docker scout to also track the number of CVEs per image; it correlates with size.
- Use dive and docker scout for deep inspection.
Security Implications of Bloated Images
Every unnecessary package in a Docker image is a potential entry point for attackers. A fat base image like ubuntu:latest includes thousands of binaries, many of which have known CVEs. Even if your app doesn't use them, they're still in the container and exploitable if an attacker gains access.
Distroless images remove most of this surface: no shell, no package manager, no utilities. But they also make debugging harder (you can't exec into the container). A compromised distroless container is harder to exploit because the attacker lacks basic tools like curl, wget, or bash.
Another angle: multi-stage builds reduce the attack surface by leaving build tools (e.g., compilers, debuggers) in the builder stage. The final image only contains what's needed to run the app.
Consider this real example: a team had a Node.js image with curl, wget, vim, and netcat installed. An attacker who got a shell via a vulnerable Express route had immediate internet access and lateral movement tools. Switching to distroless for the final image removed all those utilities — the attacker's shell, even if they got one, would have no curl, no wget, no shell history. It's a massive reduction in blast radius.
CVE density: a typical ubuntu:22.04 image has ~200 CVEs at baseline. Switch to distroless and that drops to ~5. Which would you rather deploy to production?
But here's the trade-off: distroless images can't run apt-get update to patch CVEs. You have to rebuild the image with a new base. That's fine for CI but means you can't hotfix a running container. Plan your patch cycle accordingly.
- Run docker scout cves <image> on every image you push to production. If your base image has 200+ CVEs, switch to a slim or distroless variant immediately.
- Add docker scout cves as a mandatory CI step to catch vulnerable base images early.
Advanced Layer Caching with BuildKit
Docker's BuildKit (enabled by default since Docker 23.0) offers several cache optimization features that go beyond simple layer ordering.
- --cache-from: Pull a previous build's cache from a registry. Essential for remote CI runners where local cache doesn't persist.
- --mount=type=cache: Persist package manager caches (like apt, npm, pip) across builds without including them in the final image. Reduces network downloads dramatically.
- COPY --link: Copies files into a new layer that does not depend on the previous layers. Improves cache sharing between build stages.
- Cache mounts: Mount a scratch directory for build artifacts like .m2 for Maven or node_modules for npm, which are kept across builds but not in the final image.
Using these features requires minimal Dockerfile changes but yields significant speedups in CI.
But there's a subtlety with cache mounts: they persist across builds on the same host. If you're using ephemeral CI runners (like GitHub Actions hosted runners), cache mounts give no benefit because the runner's filesystem is fresh each time. In that scenario, only --cache-from with a remote registry works. Know your CI environment before investing in one pattern over the other.
A real-world benchmark: a Node.js app with npm ci took 45 seconds per build without cache mounts. Adding --mount=type=cache,target=/root/.npm dropped that to 8 seconds on the second build — a 5.6x improvement. On a hundred builds per day, that's 62 minutes saved.
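A cache mount of this kind might be wired in as follows. This is a sketch, assuming npm's default cache location for the root user (/root/.npm) and a hypothetical src/server.js entry point.

```dockerfile
# syntax=docker/dockerfile:1
FROM node:18-slim
WORKDIR /app
COPY package.json package-lock.json ./

# The cache mount keeps npm's download cache on the build host between
# builds. It is mounted only for this RUN and is never written to a layer.
RUN --mount=type=cache,target=/root/.npm \
    npm ci --omit=dev

COPY src ./src
CMD ["node", "src/server.js"]
```

The first build populates the cache; later builds on the same host reuse the downloaded tarballs even when package-lock.json changes and the layer itself has to be rebuilt.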
- Add # syntax=docker/dockerfile:1.4 at the top of your Dockerfile to enable the latest BuildKit features. Without it, --mount and --link may not work.
- Clear cache mounts with docker builder prune --filter type=exec.cachemount.
- --cache-from is most effective when you tag intermediate images; otherwise, the cache is lost after the registry push.
- Use --mount=type=cache carefully; it can cross-contaminate packages across projects. For your package manager it will drastically reduce download times on subsequent builds.
- Use --cache-to and --cache-from together for a full cache life cycle. Push cache to a registry on successful builds, pull it on the next run.
- Use COPY --link to improve layer cache sharing between stages.
- On ephemeral runners, cache mounts don't persist; use --cache-from with a remote registry instead.
Docker Image Linting (Hadolint)
Just as you lint your source code, you should lint your Dockerfiles. Hadolint is a battle-tested Dockerfile linter that checks for common mistakes, security issues, and inefficiencies that lead to bloated images. It can be integrated into your CI pipeline to catch problems before they become production incidents.
Hadolint parses the Dockerfile using a Docker parser and applies a set of rules. Each rule is prefixed with a code like DL3000 to DL4006. Some of the most useful rules for image optimisation include:
- DL3008: Pin versions in apt-get install. This prevents surprise upgrades that double image size.
- DL3009: Delete apt-get files after install. Enforces the && rm -rf /var/lib/apt/lists/* pattern.
- DL3018: Pin versions in apk add for Alpine.
- DL3020: Use COPY instead of ADD for copying files. ADD has extra features that can unintentionally add bloat.
- DL3025: Use arguments JSON form for CMD and ENTRYPOINT to avoid shell overhead.
- DL3013: Pin versions in pip install (Python).
- DL3042: Avoid pip's cache directory by using pip install --no-cache-dir.
- DL3005: Do not use apt-get dist-upgrade; it often pulls in unnecessary bloat.
Running Hadolint locally is trivial, but the real value is in CI. You can fail the build if any errors are found, or on warnings as well, depending on your strictness. In GitHub Actions, the hadolint/hadolint-action step takes the Dockerfile path and a failure-threshold input.
- Start with failure-threshold: info and fix the most impactful rules first (DL3008, DL3009, DL3020). Tighten over time.
- Use a .hadolint.yaml config file to focus on rules that matter most for your stack.
Image Size Governance: Setting Budgets and Automating Checks
Image size is a performance and cost metric that deserves the same attention as latency or error rates. Without a budget, bloat creeps in silently. The fix: set per-service size limits and enforce them in CI.
Start by establishing a baseline: measure the current compressed size of every image using docker scout quickview, docker manifest inspect, or regctl. Note that docker images --format reports the uncompressed size, so use it only for relative comparisons. Then set a reduction target (e.g., 20% smaller) as the initial budget. Store budgets in a YAML file committed to your repository.
In CI, use a script that builds the image, extracts its compressed size, compares it against the budget, and fails if exceeded. Integrate with regctl to query historical sizes from the registry without pulling the entire image.
For advanced governance, enforce that any PR that increases image size by more than 10% requires a review. Use Docker Scout policies to automatically scan for excessive CVEs or size regressions.
One team we know saved $3000/month in ECR costs just by adding a size gate. The gate caught two instances where a developer accidentally added a 200MB debug image into their base. Without the gate, that cost would have run indefinitely.
Another approach: use docker manifest inspect in a cron job to check sizes of images in the registry and alert if any exceed the budget. That catches bloat that slipped through CI (e.g., if someone pushed manually).
- Use regctl for fast registry queries without pulling images.
Practical Refactoring Workflow: From Bloated to Lean Dockerfile
Let's walk through a real-world refactoring. Start with a typical bloated Dockerfile:
```dockerfile
FROM ubuntu:latest
RUN apt-get update
RUN apt-get install -y curl wget vim git build-essential
RUN apt-get install -y python3 python3-pip
RUN curl -sSL https://sh.rustup.rs -o rustup.sh
RUN sh rustup.sh -y
RUN git clone https://github.com/someapp
WORKDIR /someapp
RUN cargo build --release
RUN cp target/release/someapp /usr/local/bin/
RUN rm -rf /someapp ~/.cargo
CMD ["someapp"]
```
Problems: No multi-stage, fat base, unnecessary packages (vim, git), cleanup in separate RUN layers, no .dockerignore, using :latest.
Refactored:
- Stage 1 (builder): FROM rust:1.65-slim AS builder. Install only the necessary build deps and compile.
- Stage 2 (runtime): FROM debian:bullseye-slim with only the runtime libs if needed (Ubuntu publishes no -slim tags); otherwise use FROM scratch and copy a static binary.
- Add a .dockerignore: .git, target, *.md
- Pin base image digests.
- Combine apt-get install and clean in one RUN.
After refactor: 1.2GB → 12MB for a Go/Rust static binary, or ~50MB if using glibc runtime. That's a 95-99% reduction.
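Put together, the refactored Dockerfile could look like this sketch. It assumes the source is built from the local build context rather than cloned, that Cargo.lock is committed, and that the crate produces a binary named someapp with no extra system libraries. The runtime stage uses debian:bullseye-slim because Ubuntu does not publish -slim tags and this matches the glibc of the rust:1.65-slim builder.

```dockerfile
# Stage 1: compile with the Rust toolchain; nothing here ships by default.
FROM rust:1.65-slim AS builder
WORKDIR /src
COPY Cargo.toml Cargo.lock ./
COPY src ./src
RUN cargo build --release --locked

# Stage 2: glibc runtime containing only the compiled binary.
FROM debian:bullseye-slim
COPY --from=builder /src/target/release/someapp /usr/local/bin/someapp
CMD ["someapp"]
```

For a fully static binary (for example a musl target), the second stage could be FROM scratch instead, which is where the 12MB figure comes from.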
Make this process a standard checklist in your team's Dockerfile review template. After a few months, it becomes second nature.
I've seen teams automate this refactoring with a script that runs dive, identifies top layers by size, and suggests a multi-stage pattern. You don't need to do it manually every time — build a tool once.
- Run dive and docker scout in CI to suggest minimal base images automatically.
Minimum Layers: The Silently Bloated Build
Every RUN, COPY, ADD in your Dockerfile is a layer. Layers aren't cheap cache bonuses - they're permanent overhead in your final image. The misconception is that dividing install steps into pretty isolated RUN commands is clean code. It's not. It's bloat.
Consolidate. RUN apt-get update && apt-get install -y curl && rm -rf /var/lib/apt/lists/* is one layer, not three. Chain related commands with && inside one RUN rather than splitting them into separate RUN lines that look neat in a diff but inflate your image. Every orphaned /var/cache/apt/archives from a split RUN is dead weight your production host will pull across the wire.
Why this matters: layer count inflates push/pull time. A 10-layer image vs 30-layer image for the same executable — the latter pulls slower because the registry and Docker daemon negotiate more layers. There's no free lunch. The union filesystem doesn't compress them away.
- Don't assume --no-install-recommends inherits across layers. It doesn't. Each RUN starts fresh. Add it to every install command.
Docker Build Arguments: Your Secret Weapon for Conditional Bloat
You're not shipping one image for all environments, right? Right? Three identical Node.js images with dev tools, test certs, and debug endpoints baked into production. That's not 'agile.' That's a security incident waiting to be exploited.
ARG and --build-arg let you conditionally omit entire dependency trees at build time. No multi-stage gymnastics needed. Define ARG NODE_ENV=production in your Dockerfile. Dockerfile instructions themselves can't be conditional, so put the check inside a RUN shell command that tests NODE_ENV. Example: only install dev dependencies when building the test image.
Why this kills two birds: smaller prod image and tighter blast radius. If someone dumps the container, they don't get Jest, Mocha, or your package-lock.json pointing at internal npm registries. That's one less lateral movement path they can take after the initial breach.
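A sketch of the pattern for a Node.js image follows. Dockerfile has no conditional instructions, so the branch lives inside the RUN shell command. The file layout and the test value for NODE_ENV are assumptions.

```dockerfile
FROM node:18-slim
# Build-time switch; override with: docker build --build-arg NODE_ENV=test .
ARG NODE_ENV=production
WORKDIR /app
COPY package.json package-lock.json ./

# ARG values are visible to RUN as environment variables, so the shell
# can branch on them.
RUN if [ "$NODE_ENV" = "production" ]; then \
      npm ci --omit=dev; \
    else \
      npm ci; \
    fi

COPY src ./src
CMD ["node", "src/server.js"]
```

The default build produces the lean production image; the test image with dev dependencies has to be requested explicitly, which makes the heavier variant the exception rather than the rule.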
- Don't bake the contents of .env files into the image with ENV. ENV values persist into the runtime image, so production containers leak DB_PASSWORD. ARG is build-time only, but its values still show up in docker history, so for real secrets use BuildKit secret mounts (RUN --mount=type=secret) instead.
Update Base Images: The Silent Drift That Doubles Your Image
You locked your base image to alpine:3.18.0 last year. That image has now missed 12 months of security patches and whatever size reductions the maintainers shipped since. The Alpine base itself is only a few MB and has never bundled a compiler toolchain, so most of the drift sits in the packages you install on top of it and in heavier language-runtime bases.
Compare for yourself: run docker pull alpine:3.18.0 and docker pull alpine:3.20, then check docker images. For the bare Alpine base the difference is small; for language-runtime images built on top of it, the gap between an old pin and a current tag can be much larger.
Don't pin to ancient patch releases. Use docker build --pull to force fresh base layers. Automate a weekly job that rebuilds your images and checks size drift. If your base image maintainers add bloat in a patch release, you inherit it silently. The only defense is active, scheduled rebuilds with alerting on size deltas.
Your security team will thank you. So will your SRE who has to pull 200 replicas during a failover.
A docker build --pull can fail (for instance, if the registry is unreachable). Always have a fallback: pin to at least a major version like alpine:3, not alpine:3.18.0.
Use --pull on every build, run weekly rebuilds, and alert on size deltas. Your images get lighter over time, not heavier.
Introduction: Why Image Size Matters Beyond Storage Costs
Docker images are the atomic unit of deployment in modern cloud‑native stacks, yet most teams accidentally ship images that are 5×–10× larger than needed. Bloated images don’t just waste bandwidth—they increase cold‑start latency, expand the attack surface by including unnecessary packages, and silently degrade CI/CD pipeline throughput. Optimising Docker images is therefore not a cosmetic exercise but a core DevOps discipline that directly impacts deployment speed, security posture, and infrastructure cost. This guide focuses on the highest‑leverage technique—multi‑stage builds—which alone can shrink an image from gigabytes to a hundred megabytes. After mastering multi‑stage builds, you’ll see how every other optimisation (layers, ignore files, linting) reinforces the same principle: ship only what the runtime needs.
1. Adopt Multi‑Stage Builds
Multi‑stage builds use multiple FROM statements in a single Dockerfile to separate build dependencies from runtime artifacts. The build stage can include compilers, headers, and package managers (gcc, npm, pip) that are discarded in the final stage; only the compiled binary or production dependencies are copied over. For example, a Node.js app with devDependencies of 300 MB shrinks to under 100 MB by running npm ci --omit=dev (the older spelling is --production) in the final stage. Similarly, Go code built with CGO_ENABLED=0 produces a statically linked binary that runs on a scratch base (an empty, 0 MB base image). The pattern: one stage for “build everything”, one stage for “ship the result”. Pass --target to docker build when you need to build and debug an intermediate stage locally. Always pin base image digests in each stage to avoid subtle version mismatches.
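To make the "build everything, ship the result" pattern concrete, here is a hedged sketch for the Go case mentioned above. The golang:1.22 tag, the ./cmd/app package path, and the presence of go.mod and go.sum are assumptions for illustration; for brevity the sketch uses tags, where a real build would pin digests as advised.

```dockerfile
# Stage 1: full toolchain, used only for compiling.
FROM golang:1.22 AS build
WORKDIR /src

# Copy module files first so the dependency layer stays cached
# until go.mod or go.sum changes.
COPY go.mod go.sum ./
RUN go mod download

COPY . .
# CGO_ENABLED=0 yields a statically linked binary that needs no libc.
RUN CGO_ENABLED=0 go build -o /out/app ./cmd/app

# Stage 2: empty base image; only the binary is shipped.
FROM scratch
COPY --from=build /out/app /app
ENTRYPOINT ["/app"]
```

Running docker build --target build -t myapp:build . stops after the first stage, which gives you an image with the toolchain still present for local debugging. A scratch image contains no CA certificates or shell, so a service that makes TLS calls would need the certificates copied in as well.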
How a 1.2GB Java Image Took Down Our Friday Deploy
The original Dockerfile used openjdk:11-jdk (400MB) instead of a slim JRE. No multi-stage build was used.
The fix was a two-stage build: stage 1 used maven:3.8.4-openjdk-11-slim to compile, stage 2 used openjdk:11-jre-slim and copied only the JAR. Added --link to COPY so that layer can be cached and reused independently of the layers before it. Used .dockerignore to exclude local .m2. Final image size: 118MB.
- Use multi-stage builds for any compiled language: separate build tools from runtime.
- Choose the smallest base image that provides the runtime your app needs (JRE, not JDK).
- Clean up package manager caches inside the same RUN layer (e.g., apt-get clean).
- Profile every image with dive or docker scout before pushing to a registry.
- Pin base image digests, not just tags; a tag change can silently double your image size.
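The two-stage fix from this incident might look roughly like the sketch below. The image names come from the write-up; the directory layout, the dependency:go-offline step, and the app.jar artifact name (which assumes a matching finalName in pom.xml) are hypothetical details added for illustration.

```dockerfile
# syntax=docker/dockerfile:1
# The syntax line opts in to a Dockerfile frontend that supports COPY --link.

# Stage 1: Maven and a JDK, used only to produce the JAR.
FROM maven:3.8.4-openjdk-11-slim AS build
WORKDIR /build

# Resolve dependencies in their own layer so source-only changes
# do not trigger a fresh download.
COPY pom.xml .
RUN mvn -B dependency:go-offline

COPY src ./src
RUN mvn -B package -DskipTests

# Stage 2: slim JRE; no Maven, no JDK, no local repository.
FROM openjdk:11-jre-slim
# app.jar is an assumed artifact name, not taken from the incident.
COPY --link --from=build /build/target/app.jar /app/app.jar
ENTRYPOINT ["java", "-jar", "/app/app.jar"]
```

A .dockerignore entry for .m2 keeps any local Maven repository in the project directory out of the build context, so it is never sent to the builder in the first place.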
- Run dive <image> to see per-layer size. Look for layers adding big files like /usr/share/doc, /var/cache/apt, or entire language SDKs.
- Avoid floating tags (such as :latest). Pin exact base image digests in your Dockerfile.
- Use gcr.io/distroless/java17-debian11, or on Alpine run apk add libc6-compat.
- Run docker scout cves <image>. Replace a fat base with distroless or a hardened slim image. Remove unnecessary packages, especially curl, wget, and vim.
- Run file <binary> inside the container to check its target architecture. Rebuild with the correct base image for your target platform.
- Use docker pull <image>@sha256:... to lock the exact base.
- Run docker image prune and docker builder prune to clean up dangling layers and build cache. In CI, run docker builder prune after builds to avoid leftover layers from previous builds.
- dive <image>
- docker image history --no-trunc <image>
- Use docker build --no-cache to discard stale layers.
Common mistakes to avoid
- Using depends_on without a healthcheck
- Installing build tools in the final stage when using single-stage Dockerfile
- Using :latest tag for base images
- Not combining RUN apt-get install and apt-get clean in the same layer
Interview Questions on This Topic
Explain how Docker layers work and why deleting a file in a later layer doesn't reduce image size.