
Dockerfile Explained: Instructions, Layers & Real-World Patterns

📍 Part of: Docker → Topic 7 of 18
Dockerfile explained for intermediate developers — understand every instruction, why layers matter, multi-stage builds, and the mistakes that break production images.
⚙️ Intermediate — basic DevOps knowledge assumed
In this tutorial, you'll learn
  • Docker builds images as a stack of cached layers — order your COPY and RUN instructions from least-to-most frequently changing, always copying dependency manifests before source code, to get near-instant cached rebuilds.
  • Always use exec form (JSON array syntax) for CMD and ENTRYPOINT — shell form wraps your process in /bin/sh -c, bumping it to PID 2 and silently breaking graceful shutdown in Docker and Kubernetes.
  • Multi-stage builds let you use a full toolchain (800MB) during compilation and ship only the compiled output (12MB) to production — the build stage is discarded and never pushed to a registry.
Quick Answer
  • Each instruction creates a read-only layer — a filesystem diff on top of the previous layer
  • Docker caches layers sequentially — changing one layer invalidates all layers after it
  • Order instructions from least-likely-to-change to most-likely-to-change for optimal caching
  • Multi-stage builds let you use heavy toolchains during compilation and ship only the output
  • FROM: selects the base image (alpine for size, debian/ubuntu for compatibility)
  • RUN: executes commands during build — chain with && to reduce layer count
  • COPY: copies files into the image — prefer over ADD unless you need tar extraction
  • CMD vs ENTRYPOINT: CMD provides default args, ENTRYPOINT sets the fixed executable
  • ARG vs ENV: ARG is build-time only, ENV persists at runtime — never put secrets in either
🚨 START HERE
Dockerfile Build Triage Cheat Sheet
First-response commands when Dockerfile builds are slow, images are bloated, or containers behave unexpectedly.
🟠 Docker build is slow — every rebuild takes minutes.
Immediate Action: Check which layers are being rebuilt vs cached.
Commands
docker build --progress=plain -t test:latest . 2>&1 | grep -E 'CACHED|RUN|COPY'
docker history <image> --format '{{.CreatedBy}} {{.Size}}'
Fix Now: If RUN npm install rebuilds on every change, move COPY package.json above COPY . . and separate dependency installation from source code copying.
🟡 Image is unexpectedly large (>500MB for a web app).
Immediate Action: Inspect layer sizes and check for .dockerignore.
Commands
docker history <image> --format '{{.Size}} {{.CreatedBy}}' | sort -hr
cat .dockerignore 2>/dev/null || echo 'NO .dockerignore FILE'
Fix Now: If no .dockerignore exists, create one. If build tools are in the final image, use multi-stage builds. If apt/pip cache is in a layer, chain cleanup in the same RUN.
🟡 Container takes 30 seconds to stop (SIGTERM not handled).
Immediate Action: Check if CMD/ENTRYPOINT uses shell form.
Commands
docker inspect <image> --format '{{json .Config.Cmd}}'
docker run --rm <image> ps aux | head -5
Fix Now: If Cmd is ["/bin/sh","-c",...], change to exec form: CMD ["app", "--flag"]. Verify PID 1 is your app process, not a shell.
🟡 Secret found in image history after being removed from Dockerfile.
Immediate Action: Rotate the exposed secret immediately.
Commands
docker history --no-trunc <image> | grep -i 'secret\|password\|key\|token'
docker save <image> | tar -xOf - 2>/dev/null | grep -c 'secret-value'
Fix Now: Rotate credentials. Rebuild with BuildKit --mount=type=secret. Add secret file patterns to .dockerignore.
🟡 Build fails — 'COPY failed: file not found in build context'.
Immediate Action: Check .dockerignore and the file path relative to the build context.
Commands
cat .dockerignore
ls -la <file-path-relative-to-build-context>
Fix Now: If the file is in .dockerignore, remove it or use a different pattern. If the path is wrong, use a path relative to the directory where docker build is run.
🟡 Multi-stage build produces unexpected output or missing files.
Immediate Action: Inspect intermediate stages and COPY --from references.
Commands
docker build --target builder -t debug-builder . && docker run --rm -it debug-builder ls -la /build/
docker inspect <final-image> --format '{{json .RootFS.Layers}}' | python3 -m json.tool
Fix Now: Build the builder stage separately and inspect its filesystem. Verify the COPY --from=builder source path matches the actual output location.
Production Incident: Kubernetes Pods Take 30 Seconds to Stop — Graceful Shutdown Silently Broken by Shell-Form CMD
A team migrated their Node.js API from Docker Compose to Kubernetes. During rolling updates, old pods took exactly 30 seconds to terminate (the default grace period) instead of shutting down gracefully in 2-3 seconds. The application's graceful shutdown handler (drain HTTP connections, close database pools) never executed.
Symptom: During Kubernetes rolling updates, old pods showed Terminating status for exactly 30 seconds before being killed. Application logs showed no shutdown message (the team had added a SIGTERM handler that logged 'Shutting down gracefully...'). Database connection pools were not closed cleanly, causing 'connection reset by peer' errors on the database server. The Kubernetes events showed: 'Killing container with id docker://api: pod did not terminate in 30s, using SIGKILL'.
Assumption: The team assumed the application's SIGTERM handler had a bug. They tested it locally with docker stop and it worked — the shutdown message appeared and the process exited cleanly in 2 seconds. They then assumed Kubernetes was sending a different signal and added handlers for SIGINT, SIGHUP, and SIGQUIT. None of them fired. The team spent 3 days debugging the signal handling code.
Root cause: The Dockerfile used shell-form CMD: CMD node dist/index.js. Shell form wraps the command in /bin/sh -c, making the shell PID 1 and the Node.js process PID 2. Kubernetes sends SIGTERM to PID 1 (the shell), and the shell does not forward signals to child processes by default. After 30 seconds (the default terminationGracePeriodSeconds), Kubernetes sends SIGKILL to all processes, killing the Node.js process without running any shutdown handlers. Locally, docker stop sends SIGTERM and then SIGKILL after a timeout, but Docker Desktop's behavior differs slightly, masking the issue during local testing.
Fix:
1. Changed CMD to exec form: CMD ["node", "dist/index.js"]. This makes the Node.js process PID 1, directly receiving SIGTERM.
2. Verified with docker run --rm <image> ps aux — PID 1 was now 'node dist/index.js' instead of '/bin/sh -c node dist/index.js'.
3. Tested in Kubernetes: pods now terminated in 2 seconds with the shutdown message appearing in logs.
4. Added a CI check that scans Dockerfiles for shell-form CMD/ENTRYPOINT and fails the build if found.
5. Documented the exec-form requirement in the team's Dockerfile style guide.
Key Lesson
  • Shell-form CMD wraps your process in /bin/sh -c, making it PID 2. SIGTERM goes to PID 1 (the shell), not your app. Graceful shutdown is silently broken.
  • Exec-form CMD (JSON array syntax) makes your app PID 1. SIGTERM reaches your app directly. Always use exec form for CMD and ENTRYPOINT.
  • docker stop and Kubernetes SIGTERM behavior can differ. Test signal handling in Kubernetes, not just locally.
  • Add a CI check that detects shell-form CMD/ENTRYPOINT. This is a silent failure that only manifests under load during rolling updates.
Production Debug Guide
From slow builds to bloated images — systematic debugging paths.
Docker build is slow — every rebuild takes 3-5 minutes even for small code changes.
Check layer ordering. Run docker history <image> to see which layers were rebuilt. If the dependency install layer (npm install, pip install) rebuilds on every code change, the Dockerfile copies source code before dependency manifests. Fix: copy package.json/requirements.txt in a separate layer before COPY . . and run the install command in that layer.
Image is unexpectedly large — 1GB+ for a simple web application.
Run docker history --no-trunc <image> to see layer sizes. Look for layers with large intermediate files (apt cache, npm cache, build artifacts). Check if .dockerignore exists — without it, COPY . . includes node_modules and .git. Check if multi-stage builds are used — the final image may contain build tools that should be in a discarded builder stage.
Container ignores SIGTERM — takes 30 seconds to stop in Kubernetes.
Check if CMD or ENTRYPOINT uses shell form: docker inspect <image> --format '{{.Config.Cmd}}'. If the output is [/bin/sh -c node server.js], it is shell form. Fix: change to exec form: CMD ["node", "server.js"]. Verify PID 1: docker run --rm <image> ps aux. PID 1 should be your app, not /bin/sh.
Secrets visible in image history after being removed from Dockerfile.
Run docker history --no-trunc <image> and search for the secret value. Secrets in ENV, ARG, or RUN commands are permanently stored in layer history. Even if you delete the secret in a later layer, it persists in the earlier layer. Fix: rotate the secret immediately. Rebuild using BuildKit --mount=type=secret for build-time secrets.
Build fails with 'COPY failed: file not found'.
Check if the file is excluded by .dockerignore: cat .dockerignore. Check if the file path is correct relative to the build context (the directory where docker build is run). Check if the file exists: ls -la <file>. Common mistake: using an absolute path in COPY instead of a path relative to the build context.
Container runs as root despite USER instruction in Dockerfile.
Check if the USER instruction comes before CMD/ENTRYPOINT. Check if the base image overrides USER in its entrypoint. Verify: docker run --rm <image> whoami. If it returns root, check docker inspect <image> --format '{{.Config.User}}'. If empty, the USER instruction was not applied. Check if the user exists in /etc/passwd inside the image.

Dockerfiles eliminate environment drift. A Dockerfile is a plain-text script that defines every dependency, runtime, and configuration your application needs. Docker reads it and builds an image — a portable, immutable snapshot that runs identically on any machine.

The layer caching mechanism is the single most important concept. Each instruction creates a cached layer. Changing one instruction invalidates all subsequent layers. Order your instructions from least-likely-to-change to most-likely-to-change to maximize cache hits during development.

Three misconceptions cause the most production issues: CMD without exec form silently breaks graceful shutdown in Kubernetes, ENV and ARG are visible in docker history (never put secrets in them), and .dockerignore is not optional (COPY . . without it bakes secrets and gigabytes of junk into the image).

How Docker Builds an Image — Layers Are Everything

Before you write a single Dockerfile instruction, you need a mental model of what Docker is actually doing when it reads your file. Docker doesn't build one monolithic blob. It builds a stack of read-only layers, one per instruction. Each layer is a diff — only the filesystem changes from that step.

Why does this matter? Because Docker caches every layer. If you rebuild an image and nothing changed in a particular step, Docker reuses the cached layer instead of running it again. This turns a 3-minute build into a 4-second build. But the cache is sequential — as soon as one layer is invalidated (because something changed), every layer after it is also invalidated and rebuilt from scratch.

This single insight drives the most important Dockerfile design decision you'll ever make: order your instructions from least-likely-to-change to most-likely-to-change. Your base OS almost never changes. Your system dependencies change occasionally. Your app's package dependencies change sometimes. Your source code changes constantly. Structure your Dockerfile in that order and you'll get near-instant cached rebuilds during development.

Think of layers like a stack of transparent slides on an overhead projector. Each slide adds something. You can swap out the top slide without reprinting all the slides beneath it.

Layer size and the cleanup-in-same-layer rule: Each RUN instruction creates a new layer. If you download a 200MB package in one RUN and delete it in the next RUN, the 200MB still exists in the first layer — layers are additive. The delete only adds a whiteout marker. Always chain download and cleanup in the same RUN with && to avoid bloating the image with phantom files.
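The rule in practice: a minimal sketch contrasting the two layouts (the package names are illustrative):

```dockerfile
FROM debian:bookworm-slim

# ANTI-PATTERN: the apt cache gets committed into the first layer.
# A later RUN that deletes it only adds a whiteout marker on top,
# so the image stays bloated:
#   RUN apt-get update && apt-get install -y build-essential
#   RUN rm -rf /var/lib/apt/lists/*

# CORRECT: download, install, and cleanup all happen in ONE layer,
# so the cache is never committed into the image at all.
RUN apt-get update && \
    apt-get install -y --no-install-recommends build-essential && \
    rm -rf /var/lib/apt/lists/*
```

Both versions leave the same files on the final filesystem; only the chained version keeps them out of the layer history.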

io/thecodeforge/Dockerfile.layer-demo · DOCKERFILE
# Layer 1: Base image pulled from Docker Hub.
# This layer is cached after the first pull and almost never changes.
FROM node:20-alpine

# Layer 2: Set the working directory inside the container.
# All subsequent instructions run relative to this path.
WORKDIR /app

# Layer 3: Copy ONLY the dependency manifest files first.
# Separating this from the source code is the key cache optimization.
# This layer only rebuilds when package.json or the lockfile changes.
COPY package.json package-lock.json ./

# Layer 4: Install dependencies.
# Because we copied manifests separately above, npm install only re-runs
# when a dependency actually changes — not every time you edit a .js file.
RUN npm ci --omit=dev

# Layer 5: Now copy the actual source code.
# This layer changes on every code edit, but that's fine because
# the expensive npm install layer above is still cached.
COPY src/ ./src/

# Layer 6: Declare the port the app listens on (documentation only —
# EXPOSE does NOT actually publish the port to the host).
EXPOSE 3000

# Layer 7: The default command to start the application.
# Using the JSON array (exec) form avoids spawning a shell,
# which means SIGTERM signals reach your Node process directly.
CMD ["node", "src/index.js"]
▶ Output
$ docker build -t my-node-app:1.0 .
[+] Building 42.3s (8/8) FINISHED
=> [1/6] FROM node:20-alpine 12.1s
=> [2/6] WORKDIR /app 0.1s
=> [3/6] COPY package.json package-lock.json ./ 0.1s
=> [4/6] RUN npm ci --omit=dev 28.4s
=> [5/6] COPY src/ ./src/ 0.2s
=> [6/6] EXPOSE 3000 0.0s
=> exporting to image 1.4s

# Now edit src/index.js and rebuild:
$ docker build -t my-node-app:1.1 .
[+] Building 1.2s (8/8) FINISHED
=> [1/6] FROM node:20-alpine CACHED
=> [2/6] WORKDIR /app CACHED
=> [3/6] COPY package.json ... CACHED
=> [4/6] RUN npm ci --omit=dev CACHED <- 28 seconds saved!
=> [5/6] COPY src/ ./src/ 0.2s <- only this layer rebuilt
=> exporting to image 0.8s
Mental Model
Layers as Transparent Slides
Why does changing one layer invalidate all layers after it?
  • Each layer is a diff on top of the previous layer. If the base changes, the diff no longer applies.
  • Docker cannot know if a later instruction depends on the changed content in an earlier layer.
  • The cache is sequential, not selective — Docker rebuilds from the first invalidated layer onward.
  • This is why layer ordering (least-change to most-change) is the single most impactful Dockerfile optimization.
📊 Production Insight
The cleanup-in-same-layer rule is the most common cause of bloated images. A team's image was 1.2GB because they ran apt-get install in one RUN and apt-get clean in the next. The 800MB apt cache persisted in the first layer. Fix: chain with && and clean up in the same RUN. This alone reduced their image from 1.2GB to 340MB.
🎯 Key Takeaway
Docker builds images as a stack of cached layers. Order instructions from least-to-most frequently changing. Copy dependency manifests before source code. Chain cleanup in the same RUN as the operation. This single optimization can turn 3-minute rebuilds into 4-second rebuilds.
Layer Ordering Strategy
If: Base image (FROM)
Use: First layer. Changes rarely. Cached indefinitely until the tag is updated.
If: System dependencies (apt-get install, apk add)
Use: Second layer. Changes occasionally. Chain with && and clean up in the same RUN.
If: Dependency manifests (package.json, requirements.txt)
Use: Third layer. Changes when dependencies change. Copy BEFORE source code.
If: Dependency installation (npm ci, pip install)
Use: Fourth layer. Changes when dependencies change. Cached until manifests change.
If: Source code (COPY . . or COPY src/)
Use: Last layer. Changes on every code edit. Must be the final COPY to maximize cache.

The Instructions That Actually Matter — And What They're Really Doing

There are 18 Dockerfile instructions. In practice, you'll use about 10 of them regularly. Rather than listing all 18 mechanically, let's focus on the ones that cause confusion or have non-obvious behavior — because those are the ones that bite you in production.

FROM is always first. It picks your starting layer. FROM scratch gives you an empty image — useful for compiled Go or Rust binaries. FROM node:20-alpine gives you Node on Alpine Linux, which is ~7MB versus ~180MB for Debian-based images. Prefer Alpine for production; prefer the fuller images when you need debugging tools.

RUN executes a shell command during the build. Each RUN creates a new layer. Chain related commands with && and clean up in the same RUN to avoid bloating the image with intermediate files that persist in a layer even after you delete them later.

COPY vs ADD: Use COPY almost always. ADD does extra magic — it auto-extracts tar archives and can fetch URLs — but that magic makes builds unpredictable. Use ADD only when you explicitly need its archive extraction feature.
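A quick illustration of the difference (vendor-libs.tar.gz is a hypothetical archive sitting in the build context):

```dockerfile
FROM alpine:3.19

# COPY is predictable: the archive lands in the image byte-for-byte.
COPY vendor-libs.tar.gz /opt/vendor-libs.tar.gz

# ADD auto-extracts a local tar archive into the destination directory.
# Use it only when this extraction is exactly what you want.
ADD vendor-libs.tar.gz /opt/vendor/

# Caveat: ADD with a URL downloads the file but does NOT extract it,
# which is one more way its behavior surprises people.
```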

ENV sets environment variables available at both build time and runtime. ARG sets variables available only at build time. Never put secrets in ENV — they're visible in docker inspect and image history. Use runtime secret injection instead.

ENTRYPOINT vs CMD: ENTRYPOINT sets the executable that always runs. CMD provides default arguments to it. When you run docker run my-image --verbose, that --verbose replaces CMD but gets appended to ENTRYPOINT. Together they let you build images that behave like CLI tools.

The HEALTHCHECK instruction: HEALTHCHECK tells Docker how to determine if the container's process is healthy. Without it, Docker only checks if the process is running — not if it is functional. A process that is running but deadlocked appears healthy. HEALTHCHECK runs a command periodically and marks the container as unhealthy if it fails. This is critical for orchestrators like Docker Swarm and Kubernetes that use health status for routing decisions.

io/thecodeforge/Dockerfile.instructions-deep-dive · DOCKERFILE
# Build-time variable — available only during docker build, not at runtime.
# Pass it with: docker build --build-arg APP_VERSION=2.1.0 .
ARG APP_VERSION=1.0.0

FROM python:3.12-slim

WORKDIR /api

# Runtime environment variable — visible to the running container.
# Safe for non-sensitive config like port numbers or log levels.
ENV LOG_LEVEL=info \
    PORT=8080

# Chain RUN commands with && to keep this as ONE layer.
# The final rm -rf cleans up apt cache IN THE SAME LAYER so it doesn't
# persist and bloat the image. If you ran rm -rf in a separate RUN,
# the cache would still exist in the previous layer — wasted space.
RUN apt-get update && \
    apt-get install -y --no-install-recommends curl && \
    rm -rf /var/lib/apt/lists/*

# Copy dependency file alone first (cache optimization from section above)
COPY requirements.txt .

# Install Python deps. --no-cache-dir prevents pip from storing
# the download cache inside the image layer — saves ~50MB.
RUN pip install --no-cache-dir -r requirements.txt

# Copy application source code
COPY app/ ./app/

# HEALTHCHECK — tells Docker if the app is actually functional.
# Without this, Docker only checks if the process is running.
# A deadlocked process appears healthy without a healthcheck.
HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
    CMD curl -f http://localhost:8080/health || exit 1

# ENTRYPOINT sets the fixed executable — this always runs.
# Using exec form (JSON array) so the process receives OS signals directly.
ENTRYPOINT ["python", "-m", "uvicorn"]

# CMD provides the default arguments to ENTRYPOINT.
# You can override these at runtime without changing ENTRYPOINT:
# docker run my-api app.main:app --port 9000
CMD ["app.main:app", "--host", "0.0.0.0", "--port", "8080"]
▶ Output
$ docker build -t my-python-api:latest .
[+] Building 38.7s (9/9) FINISHED

# Default startup (uses CMD arguments):
$ docker run --rm -p 8080:8080 my-python-api:latest
INFO: Started server process [1]
INFO: Uvicorn running on http://0.0.0.0:8080

# Override CMD arguments without touching ENTRYPOINT:
$ docker run --rm -p 9000:9000 my-python-api:latest app.main:app --host 0.0.0.0 --port 9000
INFO: Uvicorn running on http://0.0.0.0:9000

# Check image size — slim base + no-cache-dir pays off:
$ docker images my-python-api
REPOSITORY TAG IMAGE ID SIZE
my-python-api latest a3f91b2cd4e1 187MB
Mental Model
ENTRYPOINT as the Car, CMD as the Default Destination
When would you use ENTRYPOINT + CMD together instead of just CMD?
  • When building CLI-like tools: ENTRYPOINT is the tool, CMD is the default subcommand.
  • When you want a fixed executable with configurable arguments: ENTRYPOINT ["python", "-m", "uvicorn"] + CMD ["app:app", "--port", "8080"].
  • When users should be able to override arguments without re-specifying the executable.
  • When you want docker run <image> --help to work — the --help replaces CMD and is appended to ENTRYPOINT.
📊 Production Insight
HEALTHCHECK is the most underused Dockerfile instruction. Without it, Docker Swarm and Kubernetes only check if the process is running — not if it is functional. A process that is deadlocked or stuck in a retry loop appears healthy. Add HEALTHCHECK to every production Dockerfile. The start-period flag prevents false failures during slow startup.
🎯 Key Takeaway
Use COPY over ADD unless you need tar extraction. Never put secrets in ENV or ARG — they are visible in docker history. ENTRYPOINT + CMD together build CLI-like tools. HEALTHCHECK tells orchestrators if the process is functional, not just running. Always use exec form for CMD and ENTRYPOINT.

Multi-Stage Builds — The Pattern That Separates Pros from Beginners

Here's a scenario every developer hits: you need a compiler or build tool to produce your application binary, but you don't need that compiler in the final image running in production. Shipping the compiler anyway means a larger attack surface, a bigger image pulling over the network, and slower startup times in Kubernetes.

Multi-stage builds solve this elegantly. You define multiple FROM blocks in one Dockerfile. Each FROM starts a fresh image context. You build your application in an early 'builder' stage that has all the tools, then you COPY only the compiled output into a final, minimal 'runtime' stage. The builder stage is discarded — it never ships.

This pattern is transformative for compiled languages. A Go application that builds in a 800MB image with all the Go toolchain can ship as a 12MB Alpine or even a 3MB scratch image containing just the binary. But it's equally powerful for JavaScript — build your React app with node_modules in one stage, then copy only the /dist folder into an nginx image.

The key instruction is COPY --from=builder. The name builder is just a label you assign with AS in the FROM line. You can have as many stages as you need, and any stage can copy from any previous stage. You can even reference external images as copy sources with --from=nginx:alpine.
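Both forms of --from, sketched briefly (the paths are illustrative):

```dockerfile
# Copy from a named earlier stage in the same Dockerfile...
COPY --from=builder /build/api-server /usr/local/bin/api-server

# ...or from any external image, without adding it as a build stage.
COPY --from=nginx:alpine /etc/nginx/nginx.conf /etc/nginx/nginx.conf
```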

Build-time secrets in multi-stage builds: Multi-stage builds are the correct pattern for handling build-time secrets. Put the secret in the builder stage (using BuildKit --mount=type=secret), use it during compilation, and the secret never appears in the final runtime stage. The builder stage is discarded, and with it any trace of the secret.
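A sketch of the pattern, assuming BuildKit and an .npmrc that reads the token from ${NPM_TOKEN} (the secret id npm_token and the file paths are illustrative):

```dockerfile
# syntax=docker/dockerfile:1
FROM node:20-alpine AS builder
WORKDIR /build
COPY package.json package-lock.json .npmrc ./

# The secret is mounted at /run/secrets/<id> for this RUN only.
# It never becomes part of a layer, so it cannot leak via docker history.
RUN --mount=type=secret,id=npm_token \
    NPM_TOKEN="$(cat /run/secrets/npm_token)" npm ci --omit=dev

# The runtime stage copies only the installed modules; the builder
# stage (and any trace of the secret mount) is discarded.
FROM node:20-alpine AS runtime
WORKDIR /app
COPY --from=builder /build/node_modules ./node_modules
```

Build with: docker build --secret id=npm_token,src=./npm-token.txt . — the token file stays on the build machine and is never written into any image layer.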

Targeting a specific stage: Use docker build --target <stage-name> to build up to a specific stage. This is useful for debugging — build the builder stage and inspect it without building the runtime stage: docker build --target builder -t debug . && docker run --rm -it debug sh.

io/thecodeforge/Dockerfile.multi-stage-go · DOCKERFILE
# ─── Stage 1: Builder ──────────────────────────────────────────────────────
# This stage has the full Go toolchain (~800MB). It compiles our app.
# The 'AS builder' label lets us reference this stage later.
FROM golang:1.22-alpine AS builder

# Install git — needed if any Go modules pull from private repos
RUN apk add --no-cache git

WORKDIR /build

# Copy go module files first for cache optimization
COPY go.mod go.sum ./

# Download dependencies — this layer is cached until go.mod changes
RUN go mod download

# Copy all source code
COPY . .

# Build the binary.
# CGO_ENABLED=0 — statically link everything, no C runtime needed.
# GOOS=linux — compile for Linux even if building on a Mac.
# -ldflags "-w -s" — strip debug info and symbol table (~30% size reduction).
RUN CGO_ENABLED=0 GOOS=linux go build \
    -ldflags="-w -s" \
    -o /build/api-server \
    ./cmd/server

# ─── Stage 2: Runtime ──────────────────────────────────────────────────────
# 'scratch' is a completely empty image — no OS, no shell, nothing.
# The only thing in this final image is our compiled binary.
# This is as lean and secure as it gets.
FROM scratch AS runtime

# Copy TLS certificates from the builder stage so our app can make
# HTTPS calls. Without this, any TLS connection would fail.
COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/

# Copy ONLY the compiled binary from the builder stage.
# Everything else from the 800MB build environment is discarded.
COPY --from=builder /build/api-server /api-server

EXPOSE 8080

# No shell in scratch, so we must use exec form
ENTRYPOINT ["/api-server"]
▶ Output
$ docker build -t go-api:production .
[+] Building 54.2s (12/12) FINISHED
=> [builder 1/7] FROM golang:1.22-alpine 18.3s
=> [builder 4/7] RUN go mod download 9.1s
=> [builder 6/7] RUN CGO_ENABLED=0 ... 22.4s
=> [runtime 1/3] FROM scratch 0.0s
=> [runtime 2/3] COPY --from=builder ... 0.1s
=> exporting to image 0.3s

# Compare image sizes — this is the payoff:
$ docker images | grep go-api
REPOSITORY TAG SIZE
go-api production 11.2MB <- final image shipped to production
go-api builder 847MB <- never leaves your build machine

# Run it:
$ docker run --rm -p 8080:8080 go-api:production
2024/01/15 10:23:01 API server listening on :8080
Mental Model
Multi-Stage as a Factory Assembly Line
Why is multi-stage build better than just deleting files in the final layer?
  • Deleting files in a layer does not reduce image size — layers are additive. The deleted files persist in earlier layers.
  • Multi-stage builds discard entire stages — the builder stage never appears in the final image.
  • The final image has fewer layers, smaller size, and a reduced attack surface (no compiler, no build tools).
  • Multi-stage builds also improve build cache — the builder stage is cached independently from the runtime stage.
📊 Production Insight
The --target flag is essential for debugging multi-stage builds. If the runtime stage fails, build only the builder stage and inspect its filesystem: docker build --target builder -t debug . && docker run --rm -it debug sh. This avoids rebuilding the entire Dockerfile when only one stage needs investigation.
🎯 Key Takeaway
Multi-stage builds let you use heavy toolchains during compilation and ship only the output to production. The builder stage is discarded and never pushed to a registry. Use --target to debug individual stages. This is the primary technique for reducing image size and attack surface.
Multi-Stage Strategy by Language
If: Go or Rust (compiled, statically linked)
Use: Builder stage with full toolchain. Runtime stage with FROM scratch. Ship only the binary. Image size: 5-15MB.
If: Node.js (TypeScript or bundled frontend)
Use: Builder stage with node for compilation. Runtime stage with node:alpine or nginx for serving. Ship only dist/ or build/.
If: Python (no compilation needed)
Use: Single stage is often sufficient. Use multi-stage only if you need build-time tools (gcc for C extensions) that are not needed at runtime.
If: Java (JVM, needs JDK to compile, JRE to run)
Use: Builder stage with JDK. Runtime stage with JRE or distroless. Ship only the .jar file.
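The JDK-to-JRE row sketched as a Dockerfile (the image tags, Maven layout, and app.jar name are illustrative):

```dockerfile
# Builder: full JDK plus Maven for compiling and packaging.
FROM maven:3.9-eclipse-temurin-21 AS builder
WORKDIR /build
COPY pom.xml .
RUN mvn -q dependency:go-offline   # cached until pom.xml changes
COPY src/ ./src/
RUN mvn -q package -DskipTests

# Runtime: JRE only. No compiler, no Maven, no source code.
FROM eclipse-temurin:21-jre-alpine
WORKDIR /app
COPY --from=builder /build/target/app.jar ./app.jar
EXPOSE 8080
ENTRYPOINT ["java", "-jar", "app.jar"]
```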

Production-Ready Dockerfile — Putting It All Together

Knowing individual instructions is one thing. Knowing how they compose into a secure, efficient, production-grade Dockerfile is what makes the difference in a real project. There are four production concerns beyond 'does it build': image size, security, build speed, and signal handling.

Image size: use a minimal base, chain RUN commands, use multi-stage builds, and add a .dockerignore file — this is the most commonly forgotten file. Without it, COPY . . sends your entire project directory (including node_modules, .git, test fixtures) to the Docker build context, which can make builds take minutes before a single instruction executes.

Security: never run as root. Add a non-root user with RUN addgroup and adduser, then switch to it with USER. If an attacker compromises your app, running as a non-root user limits the blast radius significantly.

Signal handling: always use exec form ["executable", "arg"] for CMD and ENTRYPOINT — not shell form executable arg. Shell form wraps your command in /bin/sh -c, which means your process gets PID 2, not PID 1. Kubernetes and Docker send SIGTERM to PID 1 when stopping a container. If your app isn't PID 1, it never receives the signal and gets hard-killed after the timeout.

Build speed: everything from section one — order layers by change frequency, separate dependency manifests from source code.

The .dockerignore file in detail: it excludes files from the build context before they are sent to the Docker daemon. Without it, the entire directory (including .git, node_modules, .env, test fixtures) is sent to the daemon, increasing build time and risking secret exposure. Common patterns to exclude: node_modules/, .git/, .env, *.log, coverage/, __pycache__/, *.pyc, and .dockerignore itself.

io/thecodeforge/Dockerfile.production-ready · DOCKERFILE
# ─── .dockerignore (create this file alongside your Dockerfile) ────────────
# node_modules/
# .git/
# .github/
# coverage/
# *.test.js
# .env*
# README.md
# docker-compose*.yml
# ─────────────────────────────────────────────────────────────────────────────

# ─── Stage 1: Dependency installation ────────────────────────────────────────
FROM node:20-alpine AS deps

# Create a non-root user early — we'll reuse this uid in the runtime stage.
# Using a numeric UID (1001) instead of a name is more portable across images.
RUN addgroup --system --gid 1001 appgroup && \
    adduser --system --uid 1001 --ingroup appgroup appuser

WORKDIR /app

# Copy only manifests — cache this expensive layer aggressively
COPY package.json package-lock.json ./

# ci is stricter than install — it fails if lockfile is out of sync,
# which catches dependency drift bugs in CI before they hit production.
RUN npm ci --omit=dev

# ─── Stage 2: Build (for TypeScript/React projects that need transpilation) ──
FROM node:20-alpine AS build

WORKDIR /app

# Copy deps from previous stage (avoids re-installing)
COPY --from=deps /app/node_modules ./node_modules
COPY . .

# Run the build step (TypeScript compile, bundling, etc.)
RUN npm run build

# ─── Stage 3: Production runtime ─────────────────────────────────────────────
FROM node:20-alpine AS production

WORKDIR /app

# Copy the non-root user definitions from the deps stage
COPY --from=deps /etc/passwd /etc/passwd
COPY --from=deps /etc/group /etc/group

# Copy only what production needs — nothing from build tools or dev deps
COPY --from=deps /app/node_modules ./node_modules
COPY --from=build /app/dist ./dist
COPY package.json .

# Switch to non-root user BEFORE the final CMD.
# Everything after this line runs as appuser, not root.
USER appuser

HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
    CMD wget --no-verbose --tries=1 --spider http://localhost:3000/health || exit 1

EXPOSE 3000

# Exec form — process receives signals directly as PID 1.
# No shell wrapper means clean shutdown when Kubernetes sends SIGTERM.
CMD ["node", "dist/index.js"]
▶ Output
$ docker build --target production -t my-app:prod .
[+] Building 23.1s (14/14) FINISHED

$ docker run --rm -p 3000:3000 my-app:prod
Server running on port 3000

# Verify the process is NOT running as root:
$ docker run --rm my-app:prod whoami
appuser

# Verify PID 1 is your app (not a shell):
$ docker run --rm my-app:prod ps aux
PID USER COMMAND
1 appuser node dist/index.js <- PID 1, will receive SIGTERM correctly

# Lean final image:
$ docker images my-app:prod
REPOSITORY TAG SIZE
my-app prod 142MB
Mental Model
Production Dockerfile as a Security Checklist
What are the six mandatory checks for a production Dockerfile?
  • Non-root USER instruction before CMD/ENTRYPOINT.
  • No secrets in ENV, ARG, or RUN instructions.
  • .dockerignore file exists and excludes node_modules, .git, .env.
  • Exec-form CMD/ENTRYPOINT for signal handling.
  • Multi-stage build if build tools are not needed at runtime.
  • HEALTHCHECK instruction for orchestrator integration.
📊 Production Insight
The .dockerignore file is the most commonly forgotten file in Docker projects. Without it, COPY . . sends your entire directory to the Docker daemon, including .git (often 100MB+), node_modules (500MB+), and .env files with real credentials. This slows builds and risks secret exposure. Create .dockerignore before writing your first Dockerfile.
🎯 Key Takeaway
A production Dockerfile has six mandatory elements: non-root USER, no secrets in ENV/ARG, .dockerignore, exec-form CMD, multi-stage build if applicable, and HEALTHCHECK. Missing any one creates a silent vulnerability. The .dockerignore file is not optional — it prevents secret exposure and reduces build time.
🗂 Shell Form vs Exec Form — Signal Handling and Process Management
Why exec form is mandatory for production CMD and ENTRYPOINT.
Shell form (CMD node server.js) vs exec form (CMD ["node", "server.js"]):
  • Process spawning: shell form runs your app inside /bin/sh -c as a child process; exec form runs it directly — your app IS the process
  • PID in container: shell form gives your app PID 2 or higher; exec form gives it PID 1
  • Signal handling: with shell form, SIGTERM from Docker/K8s may never reach your app; with exec form, SIGTERM reaches it directly — clean shutdown works
  • Shell features: shell form supports variable expansion, pipes, and &&; exec form does not — logic must live in the command itself
  • Best used for: shell form suits RUN instructions that need shell features; exec form is the right choice for CMD and ENTRYPOINT — always prefer it
  • Risk: shell form often silently breaks graceful shutdown; exec form is the safe default
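
Side by side in Dockerfile syntax (server.js is a placeholder for your entrypoint):

```dockerfile
# Shell form: Docker actually runs /bin/sh -c "node server.js".
# The shell becomes PID 1; node is PID 2 and never sees SIGTERM.
CMD node server.js

# Exec form: Docker runs node directly as PID 1, so it receives
# SIGTERM and can shut down gracefully.
CMD ["node", "server.js"]
```

(A real Dockerfile keeps only one CMD; when multiple appear, the last one wins.)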

🎯 Key Takeaways

  • Docker builds images as a stack of cached layers — order your COPY and RUN instructions from least-to-most frequently changing, always copying dependency manifests before source code, to get near-instant cached rebuilds.
  • Always use exec form (JSON array syntax) for CMD and ENTRYPOINT — shell form wraps your process in /bin/sh -c, bumping it to PID 2 and silently breaking graceful shutdown in Docker and Kubernetes.
  • Multi-stage builds let you use a full toolchain (800MB) during compilation and ship only the compiled output (12MB) to production — the build stage is discarded and never pushed to a registry.
  • The .dockerignore file is mandatory, not optional — without it, COPY . . silently bakes node_modules, .git history, and .env files into your image; add it before you write your first COPY instruction.
  • Never put secrets in ENV or ARG — they are permanently visible in docker history. Use BuildKit --mount=type=secret for build-time secrets and secrets managers for runtime secrets.
  • A production Dockerfile has six mandatory checks: non-root USER, no secrets in ENV/ARG, .dockerignore, exec-form CMD, multi-stage build, and HEALTHCHECK.

⚠ Common Mistakes to Avoid

    Copying source code before installing dependencies
    Symptom

    Every code change triggers a full npm install or pip install, making rebuilds take 2-5 minutes even when dependencies didn't change.

    Fix

    Always COPY the dependency manifest (package.json, requirements.txt) in its own layer and run the install command before you COPY the rest of your source code.
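
A minimal sketch of the fix, assuming a Node project (manifest names and paths are illustrative):

```dockerfile
# Manifests change rarely, so this layer and the expensive
# npm ci layer below stay cached across source-only rebuilds.
COPY package.json package-lock.json ./
RUN npm ci

# Source changes only invalidate layers from this point on.
COPY . .
```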

    Using shell form for CMD and ENTRYPOINT
    Symptom

    Your container takes 10-30 seconds to stop (Docker's default timeout) instead of shutting down instantly, and graceful shutdown hooks in your application never fire.

    Fix

    Switch to exec (JSON array) form: change CMD node server.js to CMD ["node", "server.js"] so your process becomes PID 1 and receives OS signals directly.

    Running the container as root
    Symptom

    No immediate error, but a compromised container has unrestricted access to the container filesystem and any mounted volumes.

    Fix

    Add RUN addgroup --system appgroup && adduser --system --ingroup appgroup appuser, then USER appuser before your CMD. Verify with docker run --rm your-image whoami — it should return your non-root username.

    No .dockerignore file
    Symptom

    The entire build context is sent to the Docker daemon, including node_modules, .git, and .env files. Builds are slow and secrets are baked into the image.

    Fix

    Create .dockerignore with at minimum: node_modules/, .git/, .env, *.log, coverage/.

    Putting secrets in ENV or ARG
    Symptom

    Secrets are visible in docker inspect and docker history --no-trunc. Anyone with image pull access can extract them.

    Fix

    Use BuildKit --mount=type=secret for build-time secrets. Use runtime secret injection (Docker secrets, Kubernetes secrets, env vars from a secrets manager) for runtime secrets.

    Not cleaning up package manager cache in the same RUN
    Symptom

    The image is 500MB+ larger than expected because the apt cache or pip download cache persists in a layer.

    Fix

    Chain cleanup in the same RUN: RUN apt-get update && apt-get install -y curl && rm -rf /var/lib/apt/lists/*. Separate RUN instructions create separate layers.
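
A before/after sketch of the same install (curl is just an example package):

```dockerfile
# Bad: the apt cache is committed into the first layer. The second
# RUN deletes the files, but the layer that contains them still ships.
# RUN apt-get update && apt-get install -y curl
# RUN rm -rf /var/lib/apt/lists/*

# Good: install and cleanup happen in one RUN, so the cache
# never lands in any layer.
RUN apt-get update && \
    apt-get install -y --no-install-recommends curl && \
    rm -rf /var/lib/apt/lists/*
```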

Interview Questions on This Topic

  • Q: What's the difference between CMD and ENTRYPOINT, and can you give a real example of when you'd use both together in the same Dockerfile?
  • Q: If I change one line in my source code and rebuild, which layers get rebuilt and why? How would you structure a Dockerfile to make that rebuild as fast as possible?
  • Q: What's the difference between ARG and ENV — and why should you never put a secret in either one? What's the correct alternative?
  • Q: Explain how multi-stage builds work. How would you use them to reduce a Go application's image from 800MB to 12MB?
  • Q: Your container takes 30 seconds to stop in Kubernetes instead of shutting down gracefully. Walk me through the debugging process and the most likely root cause.
  • Q: What is the purpose of the HEALTHCHECK instruction? What happens if you omit it in a Docker Swarm or Kubernetes deployment?

Frequently Asked Questions

What is the difference between a Dockerfile and a Docker image?

A Dockerfile is the source code — a plain-text instruction file you write and version control. A Docker image is the compiled artifact produced when Docker reads and executes that Dockerfile. The relationship is the same as source code to a compiled binary: you share the Dockerfile, Docker builds the image, and you run containers from the image.

How do I reduce the size of my Docker image?

The three highest-impact changes are: (1) use a minimal base image like Alpine instead of full Debian — this alone drops your base from ~180MB to ~7MB; (2) use multi-stage builds so your build tools and compiler never ship to production; (3) chain RUN commands with && and clean up package manager caches in the same RUN instruction so intermediate files don't persist in a layer.
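
A sketch combining (1)-(3) for a Go service; the module layout and entrypoint path (./cmd/server) are placeholders:

```dockerfile
# Build stage: the full Go toolchain, hundreds of MB. Never ships.
FROM golang:1.22 AS builder
WORKDIR /src
COPY go.mod go.sum ./
RUN go mod download
COPY . .
# Static binary so it can run on a minimal base image
RUN CGO_ENABLED=0 go build -o /server ./cmd/server

# Runtime stage: only the compiled binary on a ~7MB base
FROM alpine:3.19
COPY --from=builder /server /server
ENTRYPOINT ["/server"]
```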

Why does my container ignore SIGTERM and take 30 seconds to stop?

You're almost certainly using shell form for your CMD or ENTRYPOINT (e.g., CMD node server.js). This wraps your app in /bin/sh -c, making the shell PID 1 and your app PID 2. Docker sends SIGTERM to PID 1 (the shell), which doesn't forward it to your app. After the timeout, Docker sends SIGKILL. Fix it by switching to exec form: CMD ["node", "server.js"].

What is the difference between ARG and ENV?

ARG is available only during docker build — it does not exist in the running container. ENV is available at both build time and runtime. Neither should contain secrets — both are visible in docker history --no-trunc. For build-time secrets, use BuildKit --mount=type=secret. For runtime secrets, use Docker secrets or a secrets manager.
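
A build-time secret sketch using BuildKit; the secret id (npm_token) and token file path are illustrative:

```dockerfile
# syntax=docker/dockerfile:1
FROM node:20-alpine
WORKDIR /app
COPY package.json package-lock.json ./
# The secret is mounted as a file for this single RUN only.
# It never becomes a layer and never shows up in docker history.
RUN --mount=type=secret,id=npm_token \
    NPM_TOKEN="$(cat /run/secrets/npm_token)" npm ci
```

Build with: docker build --secret id=npm_token,src=./.npm_token .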

How do I debug a multi-stage build that produces unexpected output?

Use the --target flag to build only up to a specific stage: docker build --target builder -t debug . Then run the stage interactively: docker run --rm -it debug sh. Inspect the filesystem to verify files are where you expect. Then build the full image and compare.

Naren · Founder & Author

Developer and founder of TheCodeForge. I built this site because I was tired of tutorials that explain what to type without explaining why it works. Every article here is written to make concepts actually click.
