Multi-Stage Docker Builds: Smaller Images, Faster Deploys, Zero Bloat
- Multi-stage builds separate build tools from runtime artifacts. Only explicitly COPYed files survive. The final image contains only the last stage.
- BuildKit is not optional — cache mounts, secret mounts, and parallel stage execution all require it. Enable it in every CI pipeline.
- Dependency manifest before source code. Always. This single reordering cuts rebuild time by 60-80%.
- Core mechanism: Each FROM starts a new stage. Only files explicitly copied via COPY --from= end up in the final image. Everything else — compilers, build caches, source code — is discarded.
- Size impact: Typical reduction from 1.2 GB (single-stage) to 80-150 MB (multi-stage). The build tools never ship.
- Layer caching: BuildKit caches each stage independently. Changing application code does not invalidate the dependency-install stage.
- BuildKit parallelism: Stages with no dependency between them execute in parallel, cutting CI time.
- Security: Build secrets (API keys, tokens) used in early stages never appear in the final image layers.
- Biggest mistake: Forgetting that only explicitly COPYed artifacts survive. If you build a binary in stage 1 but forget to COPY it in stage 2, the final image has nothing to run.
Symptom: Final image is unexpectedly large (>500 MB for a compiled binary).

```shell
docker history <image> --no-trunc    # inspect layer sizes
docker run --rm <image> du -sh /*    # find large directories
```

Symptom: Binary or app files missing in final image.

```shell
docker run --rm <build-stage> ls -la /app/dist   # check build output
docker inspect <final-image>                     # check Entrypoint and Cmd
```

Symptom: Build cache invalidated on every run.

```shell
docker build --no-cache -t test .                            # compare with cached build time
docker build --progress=plain -t test . 2>&1 | grep CACHED   # see which layers hit cache
```

Symptom: Secrets visible in docker history.

```shell
docker history <image> | grep -i token     # check for leaked secrets
docker run --rm <image> cat /root/.npmrc   # check if secret file exists
```

Symptom: Multi-stage build slower than expected — stages not running in parallel.

```shell
docker buildx version                               # check if buildx is available
DOCKER_BUILDKIT=1 docker build --progress=plain .   # check for parallel stage execution
```

Production Incident
The team assumed that RUN rm /root/.npmrc after npm install would remove the token from the image. They did not understand that Docker layers are immutable — the rm creates a new layer that hides the file, but the original layer containing the token is still present in the image history. Their Dockerfile wrote the token to /root/.npmrc in one RUN layer, npm install ran in the next layer, and RUN rm /root/.npmrc ran in a third layer. The rm command only marked the file as deleted in the new layer — the token was still recoverable from the earlier layer by inspecting docker history or extracting layers manually. The image was pushed to a public ECR registry without secret scanning.

Remediation:
1. Converted the build to use --mount=type=secret to mount the npm token as a temporary file that never persists in any layer.
2. Added DOCKER_BUILDKIT=1 to CI to enable BuildKit secret mounts.
3. Added Trivy and Hadolint scanning to the CI pipeline — both detect secrets in image layers.
4. Rotated the compromised npm token immediately.
5. Made the registry private and added IP-based access controls.

Lessons learned:
- RUN rm does not remove data — it creates a new layer that hides the file. The original data is still in the image.
- Never embed secrets in RUN commands. Use BuildKit secret mounts (--mount=type=secret) or multi-stage builds where the secret stage is discarded.
- Always run image security scanning (Trivy, Grype, Snyk Container) in CI before pushing to any registry.
- Public registries are hostile environments. Assume every layer will be inspected by an attacker.

Production Debug Guide

When the final image is missing files, too large, or builds are slower than expected:
- Missing files in the final image → Verify that the COPY --from=builder path matches the actual build output location. Common mistake: building with WORKDIR /app in the build stage but copying from /build/output in the final stage. Check docker run --rm -it <build-stage-image> ls /app to see what the build stage actually produced.
- Image too large → FROM node:20 as the runtime base pulls in the full Node.js SDK (~900 MB). Switch to FROM node:20-slim (~200 MB) or FROM node:20-alpine (~120 MB). Also verify that devDependencies are not installed in the final stage — use npm ci --omit=dev.
- Build slower than expected → Set DOCKER_BUILDKIT=1 or add "features": {"buildkit": true} to /etc/docker/daemon.json. Also check if layer caching is being invalidated by copying files too early in the Dockerfile.
- Secret visible in docker history <image> output → The secret was used in a RUN command without BuildKit secret mounts. Convert to RUN --mount=type=secret,id=npm_token npm install and pass the secret at build time with --secret id=npm_token,src=.npmrc. Verify with docker history that the secret does not appear.

Every second your Docker image takes to pull across a network is a second your deployment is stalled. In Kubernetes environments rolling out hundreds of pods under load, or CI pipelines building dozens of images a day, bloated images are a reliability and cost problem. A Node.js app shipping with its full devDependencies, TypeScript compiler, and build toolchain alongside the production binary is a 1.2 GB image waiting to become a 3 AM outage.
Traditional single-stage Dockerfiles are all-or-nothing. You install the compiler, build the binary, copy the source — and all of it ends up baked into the final layer. Docker does not have a native concept of 'clean up after yourself' within a single build context, because every RUN instruction adds a new immutable layer. Removing files in a later layer does not reclaim space — it just hides them.
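As a minimal sketch of that trap (the file paths and base image here are illustrative, not from a real project):

```dockerfile
# Single-stage anti-pattern: "deleting" a file does not reclaim space
FROM node:20-alpine
WORKDIR /app
COPY .npmrc /root/.npmrc   # layer 1: file is baked into this layer forever
COPY . .
RUN npm ci                 # layer 2: uses the file
RUN rm /root/.npmrc        # layer 3: only HIDES the file in later layers
# The file is still recoverable: `docker save` the image and extract
# the layer tarball created by the COPY instruction above.
```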
Multi-stage builds solve this by introducing multiple isolated build contexts inside one Dockerfile. Each FROM starts a fresh stage with its own filesystem. Only artifacts you explicitly copy forward survive into the final image. The build tools, intermediate object files, and source code are discarded with the build stage. This is the single most impactful Dockerfile optimization for production images.
How Multi-Stage Builds Work at the Layer Level
A Dockerfile with multiple FROM instructions creates multiple isolated stages. Each stage starts with a fresh filesystem initialised from its base image. Stages are identified by their index (0, 1, 2...) or by an alias assigned with AS.
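For example, the same stage can be referenced by its zero-based index or by its AS alias — the two COPY lines below are equivalent (image names and paths are illustrative):

```dockerfile
FROM golang:1.22-alpine AS builder   # stage 0, alias "builder"
WORKDIR /app
COPY . .
RUN go build -o /app/server .

FROM alpine:3.19                     # stage 1, the final stage
COPY --from=0 /app/server /server        # reference stage by index...
COPY --from=builder /app/server /server  # ...or by alias (preferred: survives reordering)
ENTRYPOINT ["/server"]
```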
The critical insight: only files you explicitly COPY --from=<stage> are transferred between stages. Everything else — compilers, build caches, intermediate object files, source code — exists only in the build stage's filesystem and is discarded when the build completes. The final image contains only the last stage's filesystem plus any files copied into it.
Docker's layer system means each RUN, COPY, and ADD instruction creates an immutable layer. In a single-stage build, RUN rm file creates a NEW layer that hides the file — the original layer with the file still exists in the image. Multi-stage builds avoid this entirely by never including the file in the final stage's layers in the first place.
```dockerfile
# ─────────────────────────────────────────────────────────────
# Stage 1: Build stage — contains Go compiler, source, dependencies
# This stage is ~800 MB but NEVER ships to production
# ─────────────────────────────────────────────────────────────
FROM golang:1.22-alpine AS builder

WORKDIR /app

# Copy dependency files FIRST — this layer is cached until go.mod changes
COPY go.mod go.sum ./
RUN go mod download

# Copy source code AFTER dependencies — changing source doesn't invalidate dep cache
COPY . .

# Build a statically linked binary — no runtime dependencies needed
# CGO_ENABLED=0 ensures no C library dependency
# -ldflags='-s -w' strips debug symbols, reducing binary size by ~30%
RUN CGO_ENABLED=0 GOOS=linux go build \
    -ldflags='-s -w' \
    -o /app/server \
    ./cmd/server

# ─────────────────────────────────────────────────────────────
# Stage 2: Runtime stage — contains ONLY the binary and config
# This stage is ~15-25 MB — a 97% reduction from the build stage
# ─────────────────────────────────────────────────────────────
FROM alpine:3.19 AS runtime

# Install CA certificates for HTTPS and tzdata for timezone support
RUN apk --no-cache add ca-certificates tzdata

WORKDIR /app

# Copy ONLY the compiled binary from the builder stage
# Everything else (Go compiler, source, build cache) is discarded
COPY --from=builder /app/server .

# Copy config files if needed
COPY --from=builder /app/config ./config

# Run as non-root user — security best practice
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
USER appuser

EXPOSE 8080
ENTRYPOINT ["./server"]
```
```shell
# DOCKER_BUILDKIT=1 docker build -t myapp:latest .
#
# Size comparison:
#   golang:1.22-alpine base:     ~800 MB
#   Builder stage with binary:   ~820 MB
#   Final runtime stage:          ~22 MB (97% reduction)
#
# Verify with:
#   docker images myapp:latest
#   REPOSITORY   TAG      SIZE
#   myapp        latest   22.4MB
```
- Each FROM creates a new isolated filesystem
- Data moves between stages ONLY via COPY --from
- Build stage is destroyed after build — its layers never ship
- Final image = last stage filesystem only
- Secrets in early stages cannot leak into final stage unless explicitly copied
Biggest caching mistake: copying all source code (COPY . .) before installing dependencies. When any source file changes, Docker invalidates the COPY layer and every subsequent layer — including the dependency installation layer. Effect: Every code change triggers a full npm install or go mod download, adding 30-120 seconds to build time depending on dependency count. In CI, this multiplies across dozens of daily builds. Action: Copy dependency manifests (package.json, go.mod, pom.xml) BEFORE copying source code. Install dependencies in a layer that only invalidates when the manifest changes. This single reordering typically cuts rebuild time by 60-80%.

Runtime base image cheat sheet:
- Go (static binary): FROM scratch (~0 MB) or FROM alpine (~7 MB) with ca-certificates for HTTPS.
- Node.js (native modules): FROM node:20-slim (~200 MB). Alpine may fail on native modules that require glibc.
- Java: FROM eclipse-temurin:21-jre-alpine (~100 MB). JRE only, not JDK — the compiler is not needed at runtime.
- Python: FROM python:3.12-slim (~150 MB). Alpine's musl libc fails on C extensions.
- Maximum security: FROM scratch with a statically linked binary. No shell means almost no attack surface — but no debug tools either.

BuildKit, Parallel Stages, and Cache Mounts
BuildKit is Docker's modern build engine, enabled by setting DOCKER_BUILDKIT=1 or using docker buildx. It brings three critical capabilities to multi-stage builds:
- Parallel stage execution: Stages that do not depend on each other run concurrently. If your Dockerfile has a test stage and a build stage that both depend on the dependency-install stage, BuildKit runs test and build in parallel after dependencies are installed.
- Cache mounts: --mount=type=cache persists a directory across builds without invalidating the layer. This is transformative for package managers — mount the npm/pip/go cache directory so dependency downloads are cached across builds even when the layer would otherwise be invalidated.
- Secret mounts: --mount=type=secret provides a temporary file during RUN execution that is never stored in any layer. This is the correct way to use API keys, tokens, and credentials during builds.
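The parallelism constraint can be sketched in a small hypothetical Dockerfile: the test and build stages below each depend only on deps and share no COPY --from edge with each other, so BuildKit may execute them concurrently (stage names are illustrative):

```dockerfile
# syntax=docker/dockerfile:1
FROM node:20-alpine AS deps
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci

FROM deps AS test          # depends only on deps
COPY . .
RUN npm test               # can run concurrently with the build stage below

FROM deps AS build         # also depends only on deps
COPY . .
RUN npm run build

FROM node:20-alpine AS runtime
WORKDIR /app
COPY --from=build /app/dist ./dist   # this edge forces "build" to execute
CMD ["node", "dist/index.js"]
```

Note that BuildKit only builds stages the requested target depends on, so nothing references the test stage here by default — in CI you would run docker build --target test explicitly so it is not skipped.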
```dockerfile
# syntax=docker/dockerfile:1    # required for BuildKit features

# ─────────────────────────────────────────────────────────────
# Stage 1: Dependency installation with cache mount
# The npm cache persists across builds — re-installs are near-instant
# ─────────────────────────────────────────────────────────────
FROM node:20-alpine AS deps
WORKDIR /app
COPY package.json package-lock.json ./

# --mount=type=cache persists /root/.npm across builds
# Even if this layer is invalidated, the cache is not lost
RUN --mount=type=cache,target=/root/.npm \
    npm ci --omit=dev

# ─────────────────────────────────────────────────────────────
# Stage 2: Build stage with secret mount for private registry
# The npm token is available during build but never stored in a layer
# ─────────────────────────────────────────────────────────────
FROM node:20-alpine AS builder
WORKDIR /app
COPY package.json package-lock.json ./

# Secret mount: token is available as a file during this RUN only
# It is NEVER stored in any layer — not even in a hidden layer
RUN --mount=type=secret,id=npm_token,target=/run/secrets/npm_token \
    --mount=type=cache,target=/root/.npm \
    NPM_TOKEN=$(cat /run/secrets/npm_token) npm ci

COPY . .
RUN npm run build

# ─────────────────────────────────────────────────────────────
# Stage 3: Production runtime — minimal image
# ─────────────────────────────────────────────────────────────
FROM node:20-alpine AS runtime
WORKDIR /app

# Copy only production dependencies from deps stage
COPY --from=deps /app/node_modules ./node_modules

# Copy built application from builder stage
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/package.json ./

# Run as non-root
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
USER appuser

EXPOSE 3000
CMD ["node", "dist/index.js"]

# ─────────────────────────────────────────────────────────────
# Build command with secret:
#   DOCKER_BUILDKIT=1 docker build \
#     --secret id=npm_token,src=.npmrc_token \
#     -t myapp:latest .
```
```shell
# [+] Building 12.3s (14/14) FINISHED
#  => [deps 2/2] RUN --mount=type=cache npm ci        8.2s
#  => [builder 3/4] RUN --mount=type=secret npm ci    9.1s
#  => [builder 4/4] RUN npm run build                 3.2s
#  => [runtime 3/3] COPY --from=builder /app/dist     0.1s
#
# Second build (only source code changed):
# [+] Building 4.1s (14/14) FINISHED   ← deps cached, only build stage reruns
#  => [deps 2/2] RUN --mount=type=cache npm ci        0.3s (cache hit)
#  => [builder 3/4] RUN npm ci                        0.5s (cache hit)
#  => [builder 4/4] RUN npm run build                 3.2s (only this reruns)
```
- Use --mount=type=cache for package manager caches. Combine with proper layer ordering (dependency manifest before source code). This typically cuts CI build time from 3-5 minutes to 30-60 seconds for incremental changes.
- Pass build-time credentials with --mount=type=secret. Available as a file during RUN, never stored in any layer.
- Point --mount=type=cache at the package manager's cache directory (e.g., /root/.npm, /root/.cache/pip).
- Intermediate stage contents reach the final image only via COPY --from=intermediate in the final stage.

Production Patterns: Go, Node.js, and Java
Each language ecosystem has specific multi-stage build patterns that address its unique characteristics. The patterns below are battle-tested in production and handle the most common failure modes.
```dockerfile
# ─────────────────────────────────────────────────────────────
# Java Multi-Stage Build: Maven build + JRE runtime
# Reduces image from ~800 MB (JDK) to ~120 MB (JRE alpine)
# ─────────────────────────────────────────────────────────────

# Stage 1: Build with Maven
FROM maven:3.9-eclipse-temurin-21 AS builder
WORKDIR /app

# Copy POM first — this layer is cached until pom.xml changes
COPY pom.xml .

# Download dependencies with cache mount
RUN --mount=type=cache,target=/root/.m2 \
    mvn dependency:go-offline -B

# Copy source and build
COPY src ./src
RUN --mount=type=cache,target=/root/.m2 \
    mvn package -DskipTests -B

# Stage 2: Extract JRE runtime
FROM eclipse-temurin:21-jre-alpine AS runtime
WORKDIR /app

# Copy only the fat JAR from the build stage
COPY --from=builder /app/target/*.jar app.jar

# JVM tuning for containers — critical for production
# -XX:+UseContainerSupport respects cgroup memory limits
# -XX:MaxRAMPercentage=75.0 uses 75% of container memory for heap
ENV JAVA_OPTS="-XX:+UseContainerSupport -XX:MaxRAMPercentage=75.0 -XX:+UseG1GC"

EXPOSE 8080
ENTRYPOINT ["sh", "-c", "java $JAVA_OPTS -jar app.jar"]
```
```shell
# maven:3.9-eclipse-temurin-21:   ~800 MB (build stage — discarded)
# eclipse-temurin:21-jre-alpine:  ~120 MB (runtime stage — ships)
# Final image with app JAR:       ~145 MB
```
- Copy dependency manifest (pom.xml, go.mod, package.json) FIRST
- Install/download dependencies in a separate layer
- Copy source code AFTER dependencies are installed
- Build/compile in the final layer
- Changing source code only invalidates the build layer, not the dependency layer
- Go (static binary): FROM scratch (0 MB). Binary runs directly. COPY ca-certificates from alpine for HTTPS.
- Node.js (pure JavaScript): FROM node:20-alpine (~120 MB). Fastest pull, smallest footprint.
- Node.js (native modules): FROM node:20-slim (~200 MB). Alpine's musl libc breaks some native modules.
- Java: FROM eclipse-temurin:21-jre-alpine (~120 MB). JRE only, not JDK.
- Python (pure): FROM python:3.12-alpine (~50 MB). Minimal footprint.
- Python (C extensions): FROM python:3.12-slim (~150 MB). Alpine requires recompiling C extensions.

Security: Secret Management and Image Scanning
Multi-stage builds are a security primitive, not just a size optimisation. The build stage isolation means secrets used during compilation never appear in the final image — IF you use the correct mechanisms. The wrong mechanism (ENV, ARG, or inline RUN) leaks secrets into layers permanently.
Three rules for secret management in Docker builds: 1. Never use ENV or ARG for secrets — they persist in image metadata. 2. Never hardcode secrets in RUN commands — they persist in layer history. 3. Always use BuildKit secret mounts — --mount=type=secret provides temporary file access without layer persistence.
Beyond secrets, the final image should be scanned for vulnerabilities. Even a minimal alpine base image may contain packages with known CVEs. Trivy, Grype, and Snyk Container can scan images in CI and block deployment if critical vulnerabilities are found.
```dockerfile
# syntax=docker/dockerfile:1

# ─────────────────────────────────────────────────────────────
# WRONG: Secrets in ENV or ARG — visible in docker history and inspect
# ─────────────────────────────────────────────────────────────
# DO NOT DO THIS:
#   ENV NPM_TOKEN=ghp_xxxxxxxxxxxx     ← visible in docker inspect
#   ARG DB_PASSWORD=secret123          ← visible in docker history
#   RUN echo $NPM_TOKEN > .npmrc       ← visible in layer history

# ─────────────────────────────────────────────────────────────
# CORRECT: BuildKit secret mounts
# ─────────────────────────────────────────────────────────────
FROM node:20-alpine AS builder
WORKDIR /app
COPY package.json package-lock.json ./

# Secret is mounted as a file at /run/secrets/npm_token
# Available only during this RUN command
# Not stored in any layer — not even hidden
RUN --mount=type=secret,id=npm_token,target=/run/secrets/npm_token \
    --mount=type=cache,target=/root/.npm \
    NPM_TOKEN=$(cat /run/secrets/npm_token) npm ci

COPY . .
RUN npm run build

# ─────────────────────────────────────────────────────────────
# Final stage: no secrets, no build tools, no source code
# ─────────────────────────────────────────────────────────────
FROM node:20-alpine AS runtime
WORKDIR /app
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/package.json ./
USER node
CMD ["node", "dist/index.js"]

# ─────────────────────────────────────────────────────────────
# Build command:
#   DOCKER_BUILDKIT=1 docker build \
#     --secret id=npm_token,src=./.npm_token_value \
#     -t myapp:latest .
#
# Verify no secrets in image:
#   docker history myapp:latest | grep -i token   # should return nothing
#   docker inspect myapp:latest | grep -i token   # should return nothing
```
```shell
# $ docker history myapp:latest
# IMAGE     CREATED BY                         SIZE
# abc123    CMD ["node" "dist/index.js"]       0B
# def456    COPY /app/dist ./dist # buildkit   2.1MB
# ...
# No secret tokens visible anywhere in the history
```
Build args persist in docker inspect output even if they are not used in the final stage. If you use ARG NPM_TOKEN in a build stage, the value is stored in the image's metadata JSON. Use --mount=type=secret instead — secrets mounted this way are never stored in any image metadata, layer history, or intermediate filesystem. Consequence: anyone with pull access can extract an embedded secret by running docker inspect or docker history on the image. In shared registries, this means every developer and every CI system with pull access can extract production secrets. Action: Use BuildKit secret mounts exclusively. Add Trivy or Grype scanning to CI to detect accidentally embedded secrets. Run docker history <image> as a post-build verification step.

- Build-time secrets: use --mount=type=secret,id=token to mount the secret as a file during RUN. Never stored in any layer.
- Runtime secrets: inject with docker run -e VAR=value or Kubernetes env vars — not in the Dockerfile.
- Verification: run docker history <image> then docker inspect <image>. Search for leaked tokens. Add Trivy to CI.

| Aspect | Single-Stage Build | Multi-Stage Build |
|---|---|---|
| Dockerfile structure | One FROM instruction | Multiple FROM instructions, each starting a new stage |
| Final image contents | Everything: compilers, source, build tools, runtime | Only runtime artifacts explicitly copied from build stages |
| Typical image size | 800 MB – 1.5 GB | 20 MB – 200 MB (90%+ reduction) |
| Secret handling | Secrets persist in layers unless manually removed (unsafe) | Secrets in build stages never appear in final image |
| Layer caching | Single cache chain — any change invalidates downstream layers | Per-stage caching — dependency stage cached independently of build stage |
| BuildKit parallelism | Not applicable — single sequential build | Independent stages execute in parallel |
| Security surface | Large — shell, package manager, compilers all present | Minimal — only runtime dependencies, often no shell |
| Debugging in production | Easy — full toolchain available in container | Harder — use ephemeral debug containers or distroless+debug images |
| CI build time | Slow — full rebuild on any code change | Fast — dependency layer cached, only build layer reruns |
🎯 Key Takeaways
- Multi-stage builds separate build tools from runtime artifacts. Only explicitly COPYed files survive. The final image contains only the last stage.
- BuildKit is not optional — cache mounts, secret mounts, and parallel stage execution all require it. Enable it in every CI pipeline.
- Dependency manifest before source code. Always. This single reordering cuts rebuild time by 60-80%.
- Secrets in ENV, ARG, or inline RUN commands persist in image layers. Use --mount=type=secret exclusively.
- The runtime base image determines your security surface and pull time. Choose the smallest base that supports your runtime requirements.
- RUN rm does not remove data from images. Multi-stage builds are the correct mechanism for excluding build artifacts.
Interview Questions on This Topic
- Q: Explain how multi-stage Docker builds work at the layer level. What happens to the build stage's filesystem after the build completes?
- Q: How would you reduce a 1.2 GB Node.js Docker image to under 150 MB? Walk through every change you would make to the Dockerfile.
- Q: What is the difference between layer caching and cache mounts (--mount=type=cache)? When would you use each?
- Q: A developer uses ARG NPM_TOKEN and RUN echo $NPM_TOKEN > .npmrc to install private packages. What is the security issue, and how do you fix it?
- Q: How does BuildKit enable parallel stage execution? What constraint must be satisfied for two stages to run in parallel?
- Q: Your CI builds take 8 minutes, with 6 minutes spent on npm install. The package.json changes rarely. How do you optimise the Dockerfile to cut build time?
- Q: What is the trade-off between using FROM scratch and FROM alpine as a runtime base image? When would you choose each?
Frequently Asked Questions
What is a multi-stage Docker build?
A Dockerfile with multiple FROM instructions, each starting an isolated build stage. The build stage contains compilers and tools. The final stage contains only the runtime artifacts copied from the build stage via COPY --from. Build tools never ship in the final image.
How much smaller are multi-stage images?
Typically 90-97% smaller. A Go application goes from ~800 MB (full golang base) to ~20 MB (scratch with static binary). A Node.js application goes from ~1.2 GB (full node base with devDependencies) to ~150 MB (alpine with production dependencies only).
Do I need BuildKit for multi-stage builds?
Multi-stage builds work without BuildKit, but you lose cache mounts, secret mounts, and parallel stage execution. BuildKit is required for the security and performance features that make multi-stage builds production-grade. Always enable it.
How do I debug a container built from a minimal image?
Use ephemeral debug containers: kubectl debug -it <pod> --image=busybox --target=<container>. Or add a debug stage that includes debugging tools and use docker build --target debug when you need it. Never ship debugging tools in the production image.
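The debug-stage approach can be sketched like this — a hypothetical fragment assuming an earlier "builder" stage produced /app/server:

```dockerfile
FROM alpine:3.19 AS runtime
COPY --from=builder /app/server /server   # "builder" stage defined earlier
ENTRYPOINT ["/server"]

# Debug variant: same binary plus a shell and inspection tools,
# built only when explicitly requested with --target debug
FROM runtime AS debug
RUN apk --no-cache add curl strace busybox-extras
```

Build production with `docker build --target runtime -t myapp:latest .` and the tooled variant with `docker build --target debug -t myapp:debug .` — the debug layers never reach the production tag.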
Can I use multi-stage builds with docker-compose?
Yes. docker-compose supports multi-stage Dockerfiles natively. You can also specify a target stage with target: runtime in the build section to stop at a specific stage (useful for development builds that need the full toolchain).
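A minimal compose sketch of the target option, assuming stage names like those used in the Dockerfiles above:

```yaml
services:
  app:
    build:
      context: .
      target: runtime    # production: stop at the minimal runtime stage
  app-dev:
    build:
      context: .
      target: builder    # development: keep the full build toolchain
```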
Developer and founder of TheCodeForge. I built this site because I was tired of tutorials that explain what to type without explaining why it works. Every article here is written to make concepts actually click.