Multi-Stage Docker Builds — Secrets Survive rm
Even after 'rm', secrets persist in Docker layers—token was recovered by inspecting history.
- Core mechanism: Each FROM starts a new stage. Only files explicitly copied via COPY --from= end up in the final image. Everything else (compilers, build caches, source code) is discarded.
- Size impact: Typical reduction from 1.2 GB (single-stage) to 80-150 MB (multi-stage). The build tools never ship.
- Layer caching: BuildKit caches each stage independently. Changing application code does not invalidate the dependency-install stage.
- BuildKit parallelism: Stages with no dependency between them execute in parallel, cutting CI time.
- Security: Build secrets (API keys, tokens) used in early stages stay out of the final image layers, provided they are never copied forward.
- Biggest mistake: Forgetting that only explicitly COPYed artifacts survive. If you build a binary in stage 1 but forget to COPY it in stage 2, the final image has nothing to run.
Imagine you're baking a cake. You need mixing bowls, electric beaters, and measuring cups to make it — but when you serve the cake to guests, you don't put all that equipment on the plate with it. Multi-stage Docker builds work the same way: one stage is your messy kitchen where all the building happens, and the final stage is just the clean, finished cake. Your users get the cake. The mixing bowls stay in the kitchen — and never ship to production.
Every second your Docker image takes to pull across a network is a second your deployment is stalled. In Kubernetes environments rolling out hundreds of pods under load, or CI pipelines building dozens of images a day, bloated images are a reliability and cost problem. A Node.js app shipping with its full devDependencies, TypeScript compiler, and build toolchain alongside the production binary is a 1.2 GB image waiting to become a 3 AM outage.
Traditional single-stage Dockerfiles are all-or-nothing. You install the compiler, build the binary, copy the source, and all of it ends up baked into the final image. Docker has no native way to 'clean up after yourself' within a single stage, because every RUN instruction adds a new immutable layer. Removing files in a later layer does not reclaim space; it just hides them.
Multi-stage builds solve this by introducing multiple isolated build stages inside one Dockerfile. Each FROM starts a fresh stage with its own filesystem. Only artifacts you explicitly copy forward survive into the final image. The build tools, intermediate object files, and source code are discarded with the build stage. This is the single most impactful Dockerfile optimization for production images.
How Multi-Stage Builds Work at the Layer Level
A Dockerfile with multiple FROM instructions creates multiple isolated stages. Each stage starts with a fresh filesystem initialised from its base image. Stages are identified by their index (0, 1, 2...) or by an alias assigned with AS.
The critical insight: only files you explicitly COPY --from=<stage> are transferred between stages. Everything else — compilers, build caches, intermediate object files, source code — exists only in the build stage's filesystem and is discarded when the build completes. The final image contains only the last stage's filesystem plus any files copied into it.
Docker's layer system means each RUN, COPY, and ADD instruction creates an immutable layer. In a single-stage build, RUN rm file creates a NEW layer that hides the file — the original layer with the file still exists in the image. Multi-stage builds avoid this entirely by never including the file in the final stage's layers in the first place.
- Each FROM creates a new isolated filesystem
- Data moves between stages ONLY via COPY --from
- Build stage is destroyed after build — its layers never ship
- Final image = last stage filesystem only
- Secrets in early stages cannot leak into final stage unless explicitly copied
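The stage isolation summarised above can be sketched in a two-stage Dockerfile. This is a minimal illustration rather than anything from a real project: the hello.c source file, the build alias, and the /out path are hypothetical, and gcc:13 is just one convenient compiler image.

```dockerfile
# syntax=docker/dockerfile:1

# Stage 0 ("build"): compiler, source, and object files live only here.
FROM gcc:13 AS build
WORKDIR /src
# hello.c is a hypothetical single-file C program in the build context.
COPY hello.c .
# -static links libc into the binary so it can run on an empty base image.
RUN mkdir -p /out && gcc -O2 -static -o /out/hello hello.c

# Stage 1: starts from an empty filesystem; nothing from stage 0 is inherited.
FROM scratch
# The only file that crosses the stage boundary is the one named here.
COPY --from=build /out/hello /hello
ENTRYPOINT ["/hello"]
```

Running docker history on the resulting image should list only the COPY and ENTRYPOINT steps of the final stage. The gcc layers belong to the build stage and are not part of the image.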
Cause: Copying the whole source tree (COPY . .) before installing dependencies. When any source file changes, Docker invalidates the COPY layer and every subsequent layer, including the dependency installation layer.
Effect: Every code change triggers a full npm install or go mod download, adding 30-120 seconds to build time depending on dependency count. In CI, this multiplies across dozens of daily builds.
Action: Copy dependency manifests (package.json, go.mod, pom.xml) BEFORE copying source code. Install dependencies in a layer that only invalidates when the manifest changes. This single reordering typically cuts rebuild time by 60-80%.
- FROM scratch (~0 MB) or FROM alpine (~7 MB) with ca-certificates for HTTPS.
- FROM node:20-slim (~200 MB). Alpine may fail on native modules that require glibc.
- FROM eclipse-temurin:21-jre-alpine (~100 MB). JRE only, not JDK: the compiler is not needed at runtime.
- FROM python:3.12-slim (~150 MB). Alpine's musl libc can break C extensions.
- FROM scratch with a statically linked binary. No shell and almost no attack surface, but no debug tools either.
Stage Targeting with --target: Production vs Build vs Test
Docker allows you to stop the build at any named stage using the --target flag. This is invaluable for development workflows, CI, and debugging. Without --target, Docker builds the last stage in the Dockerfile and whatever it depends on (the legacy builder runs every stage in order). With --target, you can request only the stages you need.
- Development: Build only up to the deps or builder stage, which includes all dev dependencies and tooling. Developers get a container with live-reload tools (e.g., nodemon, reflex) without waiting for the full production image to build.
- Testing: Build a test stage that runs unit tests, linters, and security scans. CI can pull this stage, run tests, and discard it without ever building the final production image.
- Production: Build the final runtime stage (runtime or production) that contains only what ships to production.
- Use AS <name> on every FROM you may want to target.
- The last stage in the Dockerfile is the default target.
- Build command: docker build --target production -t myapp:latest .
- --target includes the named stage and all its upstream dependencies
- With BuildKit, stages not in the dependency chain are skipped completely
- Use --target in CI to build the test image without waiting for the production build
- Great for parallel CI: test stage and production stage can be built independently
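The stage layout described above might look like the following for a Node.js service. It is a sketch under stated assumptions: the stage names, the "test" and "build" npm scripts, the dist/index.js entry point, and a .dockerignore that excludes node_modules are all hypothetical.

```dockerfile
# syntax=docker/dockerfile:1

# Shared base: production and dev dependencies installed once.
FROM node:20-slim AS deps
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci

# Test stage, built with: docker build --target test -t myapp:test .
FROM deps AS test
COPY . .
# Assumes package.json defines a "test" script; a failing test fails the build.
RUN npm test

# Builder stage: assumes a "build" script that writes compiled output to dist/.
FROM deps AS builder
COPY . .
# After compiling, drop devDependencies so only runtime packages are copied on.
RUN npm run build && npm prune --omit=dev

# Production stage, built with: docker build --target production -t myapp:latest .
FROM node:20-slim AS production
WORKDIR /app
ENV NODE_ENV=production
COPY --from=builder /app/package.json ./
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/dist ./dist
USER node
CMD ["node", "dist/index.js"]
```

In this layout production does not depend on test, so with BuildKit a --target production build never runs the test stage. A CI job that wants both has to request each target explicitly.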
Use docker build --target test to build only the test stage. Run tests in that image, then run docker build --target production only when the tests pass. This cuts CI pipeline time by 30-50% for typical workflows.
docker build --target <stage> gives you surgical control over which stages execute. Use it to create efficient CI pipelines that separate test from production builds, and to give developers fast feedback images. Name every stage you might want to target.
- --target development or --target dev. This stage has full toolchain and dev dependencies.
- --target test. Build only dependencies + test execution stage. Faster feedback.
- --target production (or the last stage, which is the default). Minimal, secure, no build tools.
- --target builder or the stage that fails. Quickly inspect state with docker run -it <stage-image> /bin/sh.
BuildKit, Parallel Stages, and Cache Mounts
BuildKit is Docker's modern build engine. It is the default builder in Docker Engine 23.0 and later; on older versions, enable it by setting DOCKER_BUILDKIT=1 or by using docker buildx. It brings three critical capabilities to multi-stage builds:
- Parallel stage execution: Stages that do not depend on each other run concurrently. If your Dockerfile has a test stage and a build stage that both depend on the dependency-install stage, BuildKit runs test and build in parallel after dependencies are installed.
- Cache mounts: --mount=type=cache persists a directory across builds without invalidating the layer. This is transformative for package managers: mount the npm/pip/go cache directory so dependency downloads are cached across builds even when the layer would otherwise be invalidated.
- Secret mounts: --mount=type=secret provides a temporary file during RUN execution that is never stored in any layer. This is the correct way to use API keys, tokens, and credentials during builds.
- Layer cache: if inputs unchanged, skip layer entirely (0s)
- Cache mount: if the layer re-runs, reuse downloaded packages (only new packages are fetched)
- Use both together for maximum build speed
- Cache mounts require BuildKit — they do not work with the legacy builder
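The two cache levels listed above can be combined in a single dependency stage. The snippet below is a minimal sketch assuming an npm project; /root/.npm is npm's default cache directory when running as root, and the deps stage name is illustrative.

```dockerfile
# syntax=docker/dockerfile:1
FROM node:20-slim AS deps
WORKDIR /app

# Layer cache: this step and the next are skipped while the manifests are unchanged.
COPY package.json package-lock.json ./

# Cache mount: when the manifests do change and this step re-runs, npm finds
# previously downloaded tarballs in /root/.npm. The mounted directory is kept
# by BuildKit between builds and is not written into the image layer.
RUN --mount=type=cache,target=/root/.npm npm ci
```

The same shape should carry over to other package managers by changing the target path, for example /root/.cache/pip for pip.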
Use --mount=type=cache for package manager caches. Combine with proper layer ordering (dependency manifest before source code). This typically cuts CI build time from 3-5 minutes to 30-60 seconds for incremental changes.
- --mount=type=secret. Available as a file during RUN, never stored in any layer.
- --mount=type=cache targeting the package manager cache (e.g., /root/.npm, /root/.cache/pip).
- COPY --from=intermediate in the final stage.
Production Patterns: Go, Node.js, and Java
Each language ecosystem has specific multi-stage build patterns that address its unique characteristics. The patterns below are battle-tested in production and handle the most common failure modes.
- Copy dependency manifest (pom.xml, go.mod, package.json) FIRST
- Install/download dependencies in a separate layer
- Copy source code AFTER dependencies are installed
- Build/compile in the last layer of the build stage
- Changing source code only invalidates the build layer, not the dependency layer
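As one possible rendering of the ordering above, here is a sketch for a Maven-based Java service. The artifact name target/app.jar is an assumption (it depends on the project's pom.xml), and the Maven image tag is just one reasonable choice.

```dockerfile
# syntax=docker/dockerfile:1
FROM maven:3.9-eclipse-temurin-21 AS build
WORKDIR /app
# 1. Manifest first: this layer changes only when pom.xml changes.
COPY pom.xml .
# 2. Resolve dependencies in their own layer.
RUN mvn -B dependency:go-offline
# 3. Source after dependencies.
COPY src ./src
# 4. Compile and package; only this layer re-runs on a source change.
RUN mvn -B package -DskipTests

# Runtime stage: JRE only, no Maven and no JDK.
FROM eclipse-temurin:21-jre-alpine
WORKDIR /app
# target/app.jar is an assumed artifact name for this sketch.
COPY --from=build /app/target/app.jar ./app.jar
ENTRYPOINT ["java", "-jar", "app.jar"]
```

Tests are skipped here only to keep the packaging step short; a separate test stage is the more natural place to run them.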
- FROM scratch (0 MB). Binary runs directly. COPY ca-certificates from alpine for HTTPS.
- FROM node:20-alpine (~120 MB). Fastest pull, smallest footprint.
- FROM node:20-slim (~200 MB). Alpine's musl libc breaks some native modules.
- FROM eclipse-temurin:21-jre-alpine (~120 MB). JRE only, not JDK.
- FROM python:3.12-alpine (~50 MB). Minimal footprint.
- FROM python:3.12-slim (~150 MB). Alpine requires recompiling C extensions.
Image Size Reduction Comparison Table
One of the strongest arguments for multi-stage builds is the dramatic reduction in image size. Below is a comparison of typical image sizes for common language stacks using single-stage vs. optimal multi-stage builds. These numbers are based on production images (source code + dependencies + runtime) and assume best practices like using slim/alpine bases and stripping debug symbols.
The pattern is consistent: build tools and compilers account for 70-90% of image bulk. Multi-stage builds eliminate them from the final image, leaving only the runtime and the compiled artifacts.
- Go and Rust can use FROM scratch – often < 20 MB
- Node.js minimum ~120 MB (node:20-alpine), but includes Node runtime
- Java JRE alpine ~120 MB, but JDK adds ~700 MB
- Python slim ~150 MB, but C extensions add 200-400 MB
Multi-Stage Build Example: Go (Minimal Pattern)
This section provides a focused, minimal multi-stage Dockerfile for a Go application. It demonstrates the canonical pattern: build stage containing the compiler and source, and a runtime stage with only the binary. The result is a 97% reduction in image size.
Key steps:
1. Use golang:alpine as the build stage – includes Go compiler and standard library, but is smaller than full debian-based images.
2. Copy go.mod/go.sum first – layer caching ensures go mod download runs only when dependencies change.
3. Build with CGO_ENABLED=0 – produces a statically linked binary that does not depend on glibc, allowing FROM scratch as runtime.
4. Use -ldflags='-s -w' – strips debug symbol tables, reducing binary size by ~30%.
5. Runtime base: scratch – zero bytes, no shell, no package manager. Maximum security and minimal size.
- CGO_ENABLED=0 produces a fully static binary
- scratch has no shell, no utilities – secure but limited
- Add ca-certificates from builder for HTTPS calls
- For debugging, use ephemeral debug containers
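Put together, the key steps and notes above could be written roughly as follows. This is a sketch, assuming a module with go.mod and go.sum at the repository root and a main package in that same directory; the /server output path is hypothetical.

```dockerfile
# syntax=docker/dockerfile:1
FROM golang:alpine AS builder
# ca-certificates supplies the root CA bundle copied into the runtime stage.
RUN apk add --no-cache ca-certificates
WORKDIR /src
# Dependency manifests first, so this download is cached across source edits.
COPY go.mod go.sum ./
RUN go mod download
COPY . .
# CGO_ENABLED=0 gives a static binary; -s -w strips symbol and debug tables.
# /server and the package path "." are assumptions for this sketch.
RUN CGO_ENABLED=0 go build -ldflags='-s -w' -o /server .

# Runtime stage: empty base image, only the CA bundle and the binary.
FROM scratch
COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
COPY --from=builder /server /server
ENTRYPOINT ["/server"]
```

Because scratch has no shell, ENTRYPOINT must use the JSON (exec) form shown here; the shell form would try to invoke /bin/sh, which does not exist in the image.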
Verify the binary with file server (the output should say 'statically linked').
- FROM scratch. Smallest and most secure. Downside: no shell for debugging.
- FROM alpine:3.19 (5-7 MB) and install only what you need. Avoid debian-based images unless necessary.
- FROM gcr.io/distroless/base. Includes glibc, tzdata, but no shell or package manager. Good balance.
Security: Secret Management and Image Scanning
Multi-stage builds are a security primitive, not just a size optimisation. The build stage isolation means secrets used during compilation never appear in the final image — IF you use the correct mechanisms. The wrong mechanism (ENV, ARG, or inline RUN) leaks secrets into layers permanently.
Three rules for secret management in Docker builds:
1. Never use ENV or ARG for secrets: they persist in image metadata.
2. Never hardcode secrets in RUN commands: they persist in layer history.
3. Always use BuildKit secret mounts. --mount=type=secret provides temporary file access without layer persistence.
Beyond secrets, the final image should be scanned for vulnerabilities. Even a minimal alpine base image may contain packages with known CVEs. Trivy, Grype, and Snyk Container can scan images in CI and block deployment if critical vulnerabilities are found.
- ARG values appear in docker history output
- ENV values appear in docker inspect and are available to all subsequent layers
- RUN echo secret > file stores the secret in that layer permanently
- Of these, --mount=type=secret is the ONLY mechanism that does not persist the secret
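A secret mount for a private npm registry token might be wired up as below. Treat it as a sketch: the secret id npm_token, the $HOME/.npmrc source path, and the index.js entry point are assumptions, and the build context is assumed to have a .dockerignore that excludes node_modules and any local .npmrc.

```dockerfile
# syntax=docker/dockerfile:1
FROM node:20-slim AS deps
WORKDIR /app
COPY package.json package-lock.json ./

# The secret with id "npm_token" is mounted at /root/.npmrc for this RUN only.
# It is not written to the layer and does not show up in `docker history`.
# Build with (the local file path is an assumption):
#   docker build --secret id=npm_token,src=$HOME/.npmrc -t myapp:latest .
RUN --mount=type=secret,id=npm_token,target=/root/.npmrc npm ci --omit=dev

# Runtime stage: receives installed packages, never the credentials file.
FROM node:20-slim
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
# Relies on .dockerignore to keep node_modules and .npmrc out of the context.
COPY . .
CMD ["node", "index.js"]
```

Contrast this with writing the token via COPY or RUN echo and deleting it afterwards: in that case the token stays in an earlier layer even though the final filesystem no longer shows it.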
A secret stored via ENV, ARG, or an inline RUN can be read by anyone who runs docker inspect or docker history on the image. In shared registries, this means every developer and every CI system with pull access can extract production secrets. Action: Use BuildKit secret mounts exclusively. Add Trivy or Grype scanning to CI to detect accidentally embedded secrets. Run docker history <image> as a post-build verification step.
- Use --mount=type=secret,id=token to mount the secret as a file during RUN. Never stored in any layer.
- Supply values at run time with docker run -e VAR=value or Kubernetes env vars, not in the Dockerfile.
- Run docker history <image>, then docker inspect <image>. Search for leaked tokens. Add Trivy to CI.
Secrets Leaked in Docker Image Layers Exposed API Keys to Public Registry
The team assumed that RUN rm /root/.npmrc after npm install would remove the token from the image. They did not understand that Docker layers are immutable: the rm creates a new layer that hides the file, but the original layer containing the token is still present in the image history.
The token was written to /root/.npmrc in one RUN layer, npm install ran in the next layer, and RUN rm /root/.npmrc ran in a third layer. The rm command only marked the file as deleted in the new layer; the token was still recoverable from the earlier layer by inspecting docker history or extracting layers manually. The image was pushed to a public ECR registry without secret scanning.
1. Switched to --mount=type=secret to mount the npm token as a temporary file that never persists in any layer.
2. Added DOCKER_BUILDKIT=1 to CI to enable BuildKit secret mounts.
3. Added Trivy and Hadolint scanning to the CI pipeline: Trivy scans image layers for secrets, and Hadolint lints the Dockerfile itself.
4. Rotated the compromised npm token immediately.
5. Made the registry private and added IP-based access controls.
- Docker layers are immutable. RUN rm does not remove data; it creates a new layer that hides the file. The original data is still in the image.
- Never embed secrets in RUN commands. Use BuildKit secret mounts (--mount=type=secret) or multi-stage builds where the secret stage is discarded.
- Always run image security scanning (Trivy, Grype, Snyk Container) in CI before pushing to any registry.
- Public registries are hostile environments. Assume every layer will be inspected by an attacker.
- Verify that the COPY --from=builder path matches the actual build output location. Common mistake: building with WORKDIR /app in the build stage but copying from /build/output in the final stage. Run docker run --rm -it <build-stage-image> ls /app to see what the build stage actually produced.
- Using FROM node:20 as the runtime base pulls in the full Node.js SDK (~900 MB). Switch to FROM node:20-slim (~200 MB) or FROM node:20-alpine (~120 MB). Also verify that devDependencies are not installed in the final stage: use npm ci --omit=dev.
- Enable BuildKit with DOCKER_BUILDKIT=1 or add "features": {"buildkit": true} to /etc/docker/daemon.json. Also check if layer caching is being invalidated by copying files too early in the Dockerfile.
- If a secret shows up in docker history <image> output, use RUN --mount=type=secret,id=npm_token npm install and pass the secret at build time with --secret id=npm_token,src=.npmrc. Verify with docker history that the secret does not appear.