Intermediate 10 min · March 06, 2026

Dockerfile CMD Shell Form — Why SIGTERM Fails

Q: What is the difference between a Dockerfile and a Docker image?

A Dockerfile is the source code — a plain-text instruction file you write and version control. A Docker image is the compiled artifact produced when Docker reads and executes that Dockerfile. The relationship is the same as source code to a compiled binary: you share the Dockerfile, Docker builds the image, and you run containers from the image.

Q: How do I reduce the size of my Docker image?

The three highest-impact changes are: (1) use a minimal base image like Alpine instead of full Debian — this alone drops your base from ~180MB to ~7MB; (2) use multi-stage builds so your build tools and compiler never ship to production; (3) chain RUN commands with && and clean up package manager caches in the same RUN instruction so intermediate files don't persist in a layer.

Q: Why does my container ignore SIGTERM and take 30 seconds to stop?

You're almost certainly using shell form for your CMD or ENTRYPOINT (e.g., `CMD node server.js`). This wraps your app in /bin/sh -c, making the shell PID 1 and your app a child process of that shell. The runtime sends SIGTERM to PID 1 (the shell), which doesn't forward it to your app. After the grace period (10 seconds by default for docker stop, 30 seconds by default in Kubernetes), the runtime sends SIGKILL. Fix it by switching to exec form: `CMD ["node", "server.js"]`.

Q: What is the difference between ARG and ENV?

ARG is available only during docker build — it does not exist in the running container. ENV is available at both build time and runtime. Neither should contain secrets — both are visible in docker history --no-trunc. For build-time secrets, use BuildKit --mount=type=secret. For runtime secrets, use Docker secrets or a secrets manager.

Q: How do I debug a multi-stage build that produces unexpected output?

Use the --target flag to build only up to a specific stage: docker build --target builder -t debug . Then run the stage interactively: docker run --rm -it debug sh. Inspect the filesystem to verify files are where you expect. Then build the full image and compare.

Shell-form CMD makes your app a child of the shell, so Kubernetes SIGTERM hits the shell instead.

Naren Founder & Principal Engineer

20+ years shipping production infrastructure and CI/CD at scale. Everything here is grounded in real deployments.

Last updated: May 23, 2026

⚡Quick Answer

Each instruction creates a read-only layer — a filesystem diff on top of the previous layer
Docker caches layers sequentially — changing one layer invalidates all layers after it
Order instructions from least-likely-to-change to most-likely-to-change for optimal caching
Multi-stage builds let you use heavy toolchains during compilation and ship only the output
FROM: selects the base image (alpine for size, debian/ubuntu for compatibility)
RUN: executes commands during build — chain with && to reduce layer count
COPY: copies files into the image — prefer over ADD unless you need tar extraction
CMD vs ENTRYPOINT: CMD provides default args, ENTRYPOINT sets the fixed executable
ARG vs ENV: ARG is build-time only, ENV persists at runtime — never put secrets in either

✦ Definition

What is Dockerfile?

The Dockerfile CMD instruction in shell form (CMD command param1 param2) is a common cause of failed graceful shutdowns in containerized applications. Unlike the exec form (CMD ["executable", "param1"]), shell form passes your command as an argument to /bin/sh -c, so a shell process becomes PID 1.

When Docker sends SIGTERM to the container, that signal hits the shell — not your application. The shell typically ignores SIGTERM, leaving your app running until Docker's 10-second timeout forces a SIGKILL. This means database connections aren't drained, in-flight requests are dropped, and log buffers go unwritten.

The fix is trivial: always use exec form for CMD and ENTRYPOINT unless you explicitly need shell variable expansion or command chaining. If you must use shell form, wrap your command with exec — CMD exec myapp — which replaces the shell process with your application, making it PID 1 and properly signal-aware.

This isn't a Docker bug; it's a fundamental Unix process model behavior that catches everyone at least once in production.

Plain-English First

Imagine you're moving to a new city and need to set up your apartment exactly the way you like it. Instead of doing it from memory every time, you write a step-by-step instruction sheet: 'Step 1 — buy a bed frame. Step 2 — assemble it. Step 3 — put the mattress on top.' A Dockerfile is exactly that instruction sheet, but for your application's environment. Docker reads it top to bottom and builds a perfect, repeatable copy of your app's home — every single time, on any machine in the world.

Dockerfiles eliminate environment drift. A Dockerfile is a plain-text script that defines every dependency, runtime, and configuration your application needs. Docker reads it and builds an image — a portable, immutable snapshot that runs identically on any machine.

The layer caching mechanism is the single most important concept. Each instruction creates a cached layer. Changing one instruction invalidates all subsequent layers. Order your instructions from least-likely-to-change to most-likely-to-change to maximize cache hits during development.

Three misconceptions cause the most production issues: CMD without exec form silently breaks graceful shutdown in Kubernetes, ENV and ARG are visible in docker history (never put secrets in them), and .dockerignore is not optional (COPY . . without it bakes secrets and gigabytes of junk into the image).

What Dockerfile CMD Shell Form Actually Does

The Dockerfile CMD instruction defines the default command that runs when a container starts. In shell form (CMD command param1), Docker runs it as /bin/sh -c "command param1". That shell process becomes PID 1 inside the container, not your application. This matters because Linux treats PID 1 specially: the kernel does not apply the default action of a signal such as SIGTERM to PID 1, so the signal only has an effect if PID 1 installs a handler for it or forwards it. When you use shell form, your Java app (e.g., java -jar app.jar) typically runs as a child of sh, not as PID 1. docker stop sends SIGTERM to PID 1, the shell, which by default does not forward signals to child processes. Your JVM never sees SIGTERM. After a 10-second grace period, Docker escalates to SIGKILL, killing the JVM abruptly. This means no graceful shutdown: no shutdown hooks, no draining connections, no flushing buffers. In production, this causes dropped requests and can leave state inconsistent. Use exec form (CMD ["java", "-jar", "app.jar"]) to make your app PID 1 and receive signals directly.
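As a minimal sketch of the difference, the same command can be written three ways. The base image tag, the app.jar path, and the JAVA_OPTS variable are illustrative assumptions rather than part of a specific project, and only the last uncommented CMD in a Dockerfile takes effect.

```dockerfile
# Assumed base image and artifact name, for illustration only.
FROM eclipse-temurin:21-jre
WORKDIR /app
COPY app.jar .

# Variant 1, shell form: Docker runs /bin/sh -c "java -jar app.jar",
# so the shell is PID 1 and the JVM typically runs as its child.
# CMD java -jar app.jar

# Variant 2, shell form with exec: the shell expands $JAVA_OPTS, then
# exec replaces the shell process with the JVM, which takes over PID 1.
# CMD exec java $JAVA_OPTS -jar app.jar

# Variant 3, exec form: no shell is started, so the JVM is PID 1.
# Exec form performs no shell variable expansion.
CMD ["java", "-jar", "app.jar"]
```

Variant 2 is mainly useful when the command line genuinely needs shell expansion; otherwise the exec form is the simpler choice.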

Shell Form Is Not Just Syntax

Switching from shell to exec form changes signal delivery — your JVM goes from never seeing SIGTERM to handling it properly.

Production Insight

Teams using shell form with Spring Boot apps see random connection pool exhaustion after deployments because the old container's connections are never gracefully closed.

Symptom: docker stop returns after 10 seconds, but the app logs show no shutdown hook execution — the JVM is killed by SIGKILL.

Rule: Always use exec form for CMD and ENTRYPOINT when your process must handle signals — which is every production service.

Key Takeaway

Shell form wraps your command in /bin/sh -c, making the shell PID 1 and your app a child process.

PID 1 must forward signals — the default shell does not, so SIGTERM never reaches your JVM.

Use exec form (JSON array) for CMD and ENTRYPOINT to make your application PID 1 and receive signals directly.

[Figure: Dockerfile CMD Shell Form and SIGTERM Failure]

How Docker Builds an Image — Layers Are Everything

Before you write a single Dockerfile instruction, you need a mental model of what Docker is actually doing when it reads your file. Docker doesn't build one monolithic blob. It builds a stack of read-only layers, one per instruction. Each layer is a diff — only the filesystem changes from that step.

Why does this matter? Because Docker caches every layer. If you rebuild an image and nothing changed in a particular step, Docker reuses the cached layer instead of running it again. This can turn a 3-minute build into a 4-second build. But the cache is sequential: as soon as one layer is invalidated (because something changed), every layer after it is also invalidated and rebuilt from scratch.

This single insight drives the most important Dockerfile design decision you'll ever make: order your instructions from least-likely-to-change to most-likely-to-change. Your base OS almost never changes. Your system dependencies change occasionally. Your app's package dependencies change sometimes. Your source code changes constantly. Structure your Dockerfile in that order and you'll get near-instant cached rebuilds during development.

Think of layers like a stack of transparent slides on an overhead projector. Each slide adds something. You can swap out the top slide without reprinting all the slides beneath it.

Layer size and the cleanup-in-same-layer rule: Each RUN instruction creates a new layer. If you download a 200MB package in one RUN and delete it in the next RUN, the 200MB still exists in the first layer — layers are additive. The delete only adds a whiteout marker. Always chain download and cleanup in the same RUN with && to avoid bloating the image with phantom files.

io/thecodeforge/Dockerfile.layer-demo (Dockerfile)

# Layer 1 — Base image pulled from Docker Hub.
# This layer is cached after the first pull and almost never changes.
FROM node:20-alpine

# Layer 2 — Set the working directory inside the container.
# All subsequent instructions run relative to this path.
WORKDIR /app

# Layer 3 — Copy ONLY the dependency manifest files first.
# Separating this from the source code is the key cache optimization.
# This layer only rebuilds when package.json or the lockfile changes.
COPY package.json package-lock.json ./

# Layer 4 — Install dependencies.
# Because we copied manifests separately above, npm install only re-runs
# when a dependency actually changes — not every time you edit a .js file.
RUN npm ci --omit=dev

# Layer 5 — Now copy the actual source code.
# This layer changes on every code edit, but that's fine because
# the expensive npm install layer above is still cached.
COPY src/ ./src/

# Layer 6 — Declare the port the app listens on (documentation only —
# EXPOSE does NOT actually publish the port to the host).
EXPOSE 3000

# Layer 7 — The default command to start the application.
# Using the JSON array (exec) form avoids spawning a shell,
# which means SIGTERM signals reach your Node process directly.
CMD ["node", "src/index.js"]

Output

$ docker build -t my-node-app:1.0 .
[+] Building 42.3s (8/8) FINISHED
 => [1/5] FROM node:20-alpine                      12.1s
 => [2/5] WORKDIR /app                              0.1s
 => [3/5] COPY package.json package-lock.json ./    0.1s
 => [4/5] RUN npm ci --omit=dev                    28.4s
 => [5/5] COPY src/ ./src/                          0.2s
 => exporting to image                              1.4s

# EXPOSE and CMD only set image metadata, so they do not appear as build steps.

# Now edit src/index.js and rebuild:
$ docker build -t my-node-app:1.1 .
[+] Building 1.2s (8/8) FINISHED
 => [1/5] FROM node:20-alpine        CACHED
 => [2/5] WORKDIR /app               CACHED
 => [3/5] COPY package.json ...      CACHED
 => [4/5] RUN npm ci --omit=dev      CACHED   <- 28 seconds saved!
 => [5/5] COPY src/ ./src/           0.2s     <- only this layer rebuilt
 => exporting to image               0.8s

Layers as Transparent Slides

Each layer is a diff on top of the previous layer. If the base changes, the diff no longer applies.
Docker cannot know if a later instruction depends on the changed content in an earlier layer.
The cache is sequential, not selective — Docker rebuilds from the first invalidated layer onward.
This is why layer ordering (least-change to most-change) is the single most impactful Dockerfile optimization.

Production Insight

The cleanup-in-same-layer rule is the most common cause of bloated images. A team's image was 1.2GB because they ran apt-get install in one RUN and apt-get clean in the next. The 800MB apt cache persisted in the first layer. Fix: chain with && and clean up in the same RUN. This alone reduced their image from 1.2GB to 340MB.

Key Takeaway

Docker builds images as a stack of cached layers. Order instructions from least-to-most frequently changing. Copy dependency manifests before source code. Chain cleanup in the same RUN as the operation. This single optimization can turn 3-minute rebuilds into 4-second rebuilds.

Layer Ordering Strategy

If: Base image (FROM) → Use: First layer. Changes rarely. Cached indefinitely until the tag is updated.

If: System dependencies (apt-get install, apk add) → Use: Second layer. Changes occasionally. Chain with && and clean up in the same RUN.

If: Dependency manifests (package.json, requirements.txt) → Use: Third layer. Changes when dependencies change. Copy BEFORE source code.

If: Dependency installation (npm ci, pip install) → Use: Fourth layer. Changes when dependencies change. Cached until manifests change.

If: Source code (COPY . . or COPY src/) → Use: Last layer. Changes on every code edit. Must be the final COPY to maximize cache.

The Instructions That Actually Matter — And What They're Really Doing

There are 18 Dockerfile instructions. In practice, you'll use about 10 of them regularly. Rather than listing all 18 mechanically, let's focus on the ones that cause confusion or have non-obvious behaviour — because those are the ones that bite you in production.

FROM is always first. It picks your starting layer. FROM scratch gives you an empty image, useful for statically compiled Go or Rust binaries. FROM node:20-alpine gives you Node on Alpine Linux, whose base is roughly 7MB versus roughly 180MB for a Debian-based one. Prefer Alpine when image size matters most; prefer the fuller Debian-based images when you need debugging tools or broader compatibility.

RUN executes a shell command during the build. Each RUN creates a new layer. Chain related commands with && and clean up in the same RUN to avoid bloating the image with intermediate files that persist in a layer even after you delete them later.

COPY vs ADD: Use COPY almost always. ADD does extra magic — it auto-extracts tar archives and can fetch URLs — but that magic makes builds unpredictable. Use ADD only when you explicitly need its archive extraction feature.

ENV sets environment variables available at both build time and runtime. ARG sets variables available only at build time. Never put secrets in ENV — they're visible in docker inspect and image history. Use runtime secret injection instead.
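A short sketch of how ARG and ENV scope interact. The NODE_VERSION and APP_VERSION names are hypothetical and chosen only for illustration.

```dockerfile
# An ARG declared before FROM is only usable in FROM lines.
ARG NODE_VERSION=20
FROM node:${NODE_VERSION}-alpine

# Build arguments must be declared inside the stage that uses them.
# Override with: docker build --build-arg APP_VERSION=2.1.0 .
ARG APP_VERSION=1.0.0

# ARG values are gone once the build finishes. Copy non-sensitive ones
# into ENV if the running container should see them.
ENV APP_VERSION=${APP_VERSION}

# Prints the version at container start, read from the runtime environment.
CMD ["node", "-e", "console.log(process.env.APP_VERSION)"]
```

Because the value ends up in ENV, it is visible to anyone who can inspect the image, so this pattern only suits non-sensitive configuration.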

ENTRYPOINT vs CMD: ENTRYPOINT sets the executable that always runs. CMD provides default arguments to it. When you run docker run my-image --verbose, that --verbose replaces CMD but gets appended to ENTRYPOINT. Together they let you build images that behave like CLI tools.

The HEALTHCHECK instruction: HEALTHCHECK tells Docker how to determine if the container's process is healthy. Without it, Docker only checks if the process is running, not if it is functional. A process that is running but deadlocked appears healthy. HEALTHCHECK runs a command periodically and marks the container as unhealthy if it fails. This matters for Docker itself and for Docker Swarm, which uses health status for scheduling and routing decisions. Kubernetes ignores the Dockerfile HEALTHCHECK and relies on liveness and readiness probes defined in the pod spec instead.

io/thecodeforge/Dockerfile.instructions-deep-dive (Dockerfile)

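FROM python:3.12-slim

# Build-time variable: available only during docker build, not at runtime.
# Declared after FROM so it is in scope for this stage (an ARG placed
# before FROM can only be used in FROM lines).
# Pass it with: docker build --build-arg APP_VERSION=2.1.0 .
ARG APP_VERSION=1.0.0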

WORKDIR /api

# Runtime environment variable — visible to the running container.
# Safe for non-sensitive config like port numbers or log levels.
ENV LOG_LEVEL=info \
    PORT=8080

# Chain RUN commands with && to keep this as ONE layer.
# The final rm -rf cleans up apt cache IN THE SAME LAYER so it doesn't
# persist and bloat the image. If you ran rm -rf in a separate RUN,
# the cache would still exist in the previous layer — wasted space.
RUN apt-get update && \
    apt-get install -y --no-install-recommends curl && \
    rm -rf /var/lib/apt/lists/*

# Copy dependency file alone first (cache optimization from section above)
COPY requirements.txt .

# Install Python deps. --no-cache-dir prevents pip from storing
# the download cache inside the image layer — saves ~50MB.
RUN pip install --no-cache-dir -r requirements.txt

# Copy application source code
COPY app/ ./app/

# HEALTHCHECK: tells Docker if the app is actually functional.
# Without this, Docker only checks if the process is running.
# A deadlocked process appears healthy without a healthcheck.
HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
    CMD curl -f http://localhost:8080/health || exit 1

# ENTRYPOINT sets the fixed executable — this always runs.
# Using exec form (JSON array) so the process receives OS signals directly.
ENTRYPOINT ["python", "-m", "uvicorn"]

# CMD provides the default arguments to ENTRYPOINT.
# You can override these at runtime without changing ENTRYPOINT:
# docker run my-api app.main:app --port 9000
CMD ["app.main:app", "--host", "0.0.0.0", "--port", "8080"]

Output

$ docker build -t my-python-api:latest .
[+] Building 38.7s (9/9) FINISHED

# Default startup (uses CMD arguments):
$ docker run --rm -p 8080:8080 my-python-api:latest
INFO: Started server process [1]
INFO: Uvicorn running on http://0.0.0.0:8080

# Override CMD arguments without touching ENTRYPOINT:
$ docker run --rm -p 9000:9000 my-python-api:latest app.main:app --host 0.0.0.0 --port 9000
INFO: Uvicorn running on http://0.0.0.0:9000

# Check image size (slim base + no-cache-dir pays off):
$ docker images my-python-api
REPOSITORY      TAG      IMAGE ID       SIZE
my-python-api   latest   a3f91b2cd4e1   187MB

ENTRYPOINT as the Car, CMD as the Default Destination

When building CLI-like tools: ENTRYPOINT is the tool, CMD is the default subcommand.
When you want a fixed executable with configurable arguments: ENTRYPOINT ["python", "-m", "uvicorn"] + CMD ["app:app", "--port", "8080"].
When users should be able to override arguments without re-specifying the executable.
When you want docker run <image> --help to work — the --help replaces CMD and is appended to ENTRYPOINT.

Production Insight

HEALTHCHECK is the most underused Dockerfile instruction. Without it, Docker and Docker Swarm only check if the process is running, not if it is functional. A process that is deadlocked or stuck in a retry loop appears healthy. Add HEALTHCHECK to every production Dockerfile that runs under Docker or Swarm. Kubernetes ignores HEALTHCHECK, so define liveness and readiness probes in the pod spec there. The start-period flag prevents false failures during slow startup.

Key Takeaway

Use COPY over ADD unless you need tar extraction. Never put secrets in ENV or ARG — they are visible in docker history. ENTRYPOINT + CMD together build CLI-like tools. HEALTHCHECK tells orchestrators if the process is functional, not just running. Always use exec form for CMD and ENTRYPOINT.

Multi-Stage Builds — The Pattern That Separates Pros from Beginners

Here's a scenario every developer hits: you need a compiler or build tool to produce your application binary, but you don't need that compiler in the final image running in production. Shipping the compiler anyway means a larger attack surface, a bigger image pulling over the network, and slower startup times in Kubernetes.

Multi-stage builds solve this elegantly. You define multiple FROM blocks in one Dockerfile. Each FROM starts a fresh image context. You build your application in an early 'builder' stage that has all the tools, then you COPY only the compiled output into a final, minimal 'runtime' stage. The builder stage is discarded — it never ships.

This pattern is transformative for compiled languages. A Go application that builds in an 800MB image with the full Go toolchain can ship as a 12MB Alpine or even a 3MB scratch image containing just the binary. But it's equally powerful for JavaScript: build your React app with node_modules in one stage, then copy only the /dist folder into an nginx image.

The key instruction is COPY --from=builder. The name builder is just a label you assign with AS in the FROM line. You can have as many stages as you need, and any stage can copy from any previous stage. You can even reference external images as copy sources with --from=nginx:alpine.
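For the JavaScript case, a two-stage build might look like the sketch below. It assumes the project's build script writes static files to /app/dist; that path and the stage name build are illustrative.

```dockerfile
# Stage 1: build the static bundle with the full Node toolchain.
FROM node:20-alpine AS build
WORKDIR /app
COPY package.json package-lock.json ./
# devDependencies are needed here because the bundler lives in them.
RUN npm ci
COPY . .
RUN npm run build

# Stage 2: serve the bundle. No Node, no node_modules, no source code.
FROM nginx:alpine
# /usr/share/nginx/html is the default document root of the nginx image.
COPY --from=build /app/dist /usr/share/nginx/html
EXPOSE 80
```

The nginx image already defines its own exec-form CMD, so the runtime stage does not need one.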

Build-time secrets in multi-stage builds: Multi-stage builds are the correct pattern for handling build-time secrets. Put the secret in the builder stage (using BuildKit --mount=type=secret), use it during compilation, and the secret never appears in the final runtime stage. The builder stage is discarded, and with it any trace of the secret.
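A hedged sketch of that pattern follows. The secret id npm_token, the NPM_TOKEN variable (assumed to be referenced from the project's .npmrc), and the /app/dist output path are all hypothetical.

```dockerfile
# syntax=docker/dockerfile:1
FROM node:20-alpine AS builder
WORKDIR /app
COPY package.json package-lock.json .npmrc ./

# The secret file is mounted at /run/secrets/npm_token for this RUN only.
# It is not written to any image layer.
RUN --mount=type=secret,id=npm_token \
    NPM_TOKEN="$(cat /run/secrets/npm_token)" npm ci

COPY . .
RUN npm run build

# The runtime stage copies build output only; the token was never stored.
FROM node:20-alpine AS runtime
WORKDIR /app
# Copying the builder's node_modules keeps the sketch short; it still
# contains devDependencies.
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/dist ./dist
CMD ["node", "dist/index.js"]
```

A build would then pass the secret with something like docker build --secret id=npm_token,src=$HOME/.npm_token -t my-app . where the source path is wherever the token file lives on the build machine.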

Targeting a specific stage: Use docker build --target <stage-name> to build up to a specific stage. This is useful for debugging — build the builder stage and inspect it without building the runtime stage: docker build --target builder -t debug . && docker run --rm -it debug sh.

io/thecodeforge/Dockerfile.multi-stage-go (Dockerfile)

# ─── Stage 1: Builder ──────────────────────────────────────────────────────
# This stage has the full Go toolchain (~800MB). It compiles our app.
# The 'AS builder' label lets us reference this stage later.
FROM golang:1.22-alpine AS builder

# Install git — needed if any Go modules pull from private repos
RUN apk add --no-cache git

WORKDIR /build

# Copy go module files first for cache optimization
COPY go.mod go.sum ./

# Download dependencies — this layer is cached until go.mod changes
RUN go mod download

# Copy all source code
COPY . .

# Build the binary.
# CGO_ENABLED=0 — statically link everything, no C runtime needed.
# GOOS=linux — compile for Linux even if building on a Mac.
# -ldflags "-w -s" — strip debug info and symbol table (~30% size reduction).
RUN CGO_ENABLED=0 GOOS=linux go build \
    -ldflags="-w -s" \
    -o /build/api-server \
    ./cmd/server

# ─── Stage 2: Runtime ──────────────────────────────────────────────────────
# 'scratch' is a completely empty image — no OS, no shell, nothing.
# The only thing in this final image is our compiled binary.
# This is as lean and secure as it gets.
FROM scratch AS runtime

# Copy TLS certificates from the builder stage so our app can make
# HTTPS calls. Without this, any TLS connection would fail.
COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/

# Copy ONLY the compiled binary from the builder stage.
# Everything else from the 800MB build environment is discarded.
COPY --from=builder /build/api-server /api-server

EXPOSE 8080

# No shell in scratch, so we must use exec form
ENTRYPOINT ["/api-server"]

Output

$ docker build -t go-api:production .
[+] Building 54.2s (12/12) FINISHED
 => [builder 1/7] FROM golang:1.22-alpine    18.3s
 => [builder 4/7] RUN go mod download         9.1s
 => [builder 6/7] RUN CGO_ENABLED=0 ...      22.4s
 => [runtime 1/1] FROM scratch                0.0s
 => [runtime 2/2] COPY --from=builder ...     0.1s
 => exporting to image                        0.3s

# Compare image sizes (this is the payoff):
$ docker images | grep go-api
REPOSITORY   TAG          SIZE
go-api       production   11.2MB   <- final image shipped to production
go-api       builder      847MB    <- never leaves your build machine

# Run it:
$ docker run --rm -p 8080:8080 go-api:production
2024/01/15 10:23:01 API server listening on :8080

Multi-Stage as a Factory Assembly Line

Deleting files in a layer does not reduce image size — layers are additive. The deleted files persist in earlier layers.
Multi-stage builds discard entire stages — the builder stage never appears in the final image.
The final image has fewer layers, smaller size, and a reduced attack surface (no compiler, no build tools).
Multi-stage builds also improve build cache — the builder stage is cached independently from the runtime stage.

Production Insight

The --target flag is essential for debugging multi-stage builds. If the runtime stage fails, build only the builder stage and inspect its filesystem: docker build --target builder -t debug . && docker run --rm -it debug sh. This avoids rebuilding the entire Dockerfile when only one stage needs investigation.

Key Takeaway

Multi-stage builds let you use heavy toolchains during compilation and ship only the output to production. The builder stage is discarded and never pushed to a registry. Use --target to debug individual stages. This is the primary technique for reducing image size and attack surface.

Multi-Stage Strategy by Language

If: Go or Rust (compiled, statically linked) → Use: Builder stage with full toolchain. Runtime stage with FROM scratch. Ship only the binary. Image size: 5-15MB.

If: Node.js (TypeScript or bundled frontend) → Use: Builder stage with node for compilation. Runtime stage with node:alpine or nginx for serving. Ship only dist/ or build/.

If: Python (no compilation needed) → Use: Single stage is often sufficient. Use multi-stage only if you need build-time tools (gcc for C extensions) that are not needed at runtime.

If: Java (JVM, needs JDK to compile, JRE to run) → Use: Builder stage with JDK. Runtime stage with JRE or distroless. Ship only the .jar file.

Production-Ready Dockerfile — Putting It All Together

Knowing individual instructions is one thing. Knowing how they compose into a secure, efficient, production-grade Dockerfile is what makes the difference in a real project. There are four production concerns beyond 'does it build': image size, security, build speed, and signal handling.

Image size: use a minimal base, chain RUN commands, use multi-stage builds, and add a .dockerignore file, which is the most commonly forgotten file. Without it, docker build sends your entire project directory (including node_modules, .git, test fixtures) to the daemon as build context, and COPY . . then copies all of it into the image. This can make builds take minutes before a single instruction executes.

Security: never run as root. Add a non-root user with RUN addgroup and adduser, then switch to it with USER. If an attacker compromises your app, running as a non-root user limits the blast radius significantly.

Signal handling: always use exec form ["executable", "arg"] for CMD and ENTRYPOINT, not shell form executable arg. Shell form wraps your command in /bin/sh -c, which means your process runs as a child of the shell, not as PID 1. Kubernetes and Docker send SIGTERM to PID 1 when stopping a container. If your app isn't PID 1, it never receives the signal and gets hard-killed after the timeout.

Build speed: everything from section one — order layers by change frequency, separate dependency manifests from source code.

The .dockerignore file in detail: The .dockerignore file excludes files from the build context before they are sent to the Docker daemon. Without it, the entire directory (including .git, node_modules, .env, test fixtures) is sent to the daemon, increasing build time and risking secret exposure. Common patterns to exclude: node_modules/, .git/, .env, *.log, coverage/, __pycache__/, *.pyc, and .dockerignore itself.

io/thecodeforge/Dockerfile.production-ready (Dockerfile)

# ─── .dockerignore (create this file alongside your Dockerfile) ────────────
# node_modules/
# .git/
# .github/
# coverage/
# *.test.js
# .env*
# README.md
# docker-compose*.yml
# ─────────────────────────────────────────────────────────────────────────────

# ─── Stage 1: Dependency installation ────────────────────────────────────────
FROM node:20-alpine AS deps

# Create a non-root user early — we'll reuse this uid in the runtime stage.
# Using a numeric UID (1001) instead of a name is more portable across images.
RUN addgroup --system --gid 1001 appgroup && \
    adduser --system --uid 1001 --ingroup appgroup appuser

WORKDIR /app

# Copy only manifests — cache this expensive layer aggressively
COPY package.json package-lock.json ./

# ci is stricter than install — it fails if lockfile is out of sync,
# which catches dependency drift bugs in CI before they hit production.
RUN npm ci --omit=dev

# ─── Stage 2: Build (for TypeScript/React projects that need transpilation) ──
FROM node:20-alpine AS build

WORKDIR /app

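# Install ALL dependencies here, including devDependencies. The deps stage
# used --omit=dev, so its node_modules lacks build tools such as the
# TypeScript compiler and cannot run the build step.
COPY package.json package-lock.json ./
RUN npm ci
COPY . .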

# Run the build step (TypeScript compile, bundling, etc.)
RUN npm run build

# ─── Stage 3: Production runtime ─────────────────────────────────────────────
FROM node:20-alpine AS production

WORKDIR /app

# Copy the non-root user definitions from the deps stage
COPY --from=deps /etc/passwd /etc/passwd
COPY --from=deps /etc/group /etc/group

# Copy only what production needs — nothing from build tools or dev deps
COPY --from=deps /app/node_modules ./node_modules
COPY --from=build /app/dist ./dist
COPY package.json .

# Switch to non-root user BEFORE the final CMD.
# Everything after this line runs as appuser, not root.
USER appuser

HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
    CMD wget --no-verbose --tries=1 --spider http://localhost:3000/health || exit 1

EXPOSE 3000

# Exec form — process receives signals directly as PID 1.
# No shell wrapper means clean shutdown when Kubernetes sends SIGTERM.
CMD ["node", "dist/index.js"]

Output

$ docker build --target production -t my-app:prod .
[+] Building 23.1s (14/14) FINISHED

$ docker run --rm -p 3000:3000 my-app:prod
Server running on port 3000

# Verify the process is NOT running as root:
$ docker run --rm my-app:prod whoami
appuser

# Verify PID 1 is your app (not a shell). Inspect a running container,
# because passing ps as the run command would replace CMD:
$ docker run -d --name my-app my-app:prod
$ docker exec my-app ps aux
PID   USER      COMMAND
1     appuser   node dist/index.js   <- PID 1, will receive SIGTERM correctly

# Lean final image:
$ docker images my-app:prod
REPOSITORY   TAG    SIZE
my-app       prod   142MB

Production Dockerfile as a Security Checklist

Non-root USER instruction before CMD/ENTRYPOINT.
No secrets in ENV, ARG, or RUN instructions.
.dockerignore file exists and excludes node_modules, .git, .env.
Exec-form CMD/ENTRYPOINT for signal handling.
Multi-stage build if build tools are not needed at runtime.
HEALTHCHECK instruction for orchestrator integration.

Production Insight

The .dockerignore file is the most commonly forgotten file in Docker projects. Without it, docker build sends your entire directory to the Docker daemon, including .git (often 100MB+), node_modules (500MB+), and .env files with real credentials, and COPY . . copies all of it into the image. This slows builds and risks secret exposure. Create .dockerignore before writing your first Dockerfile.

Key Takeaway

A production Dockerfile has six mandatory elements: non-root USER, no secrets in ENV/ARG, .dockerignore, exec-form CMD, multi-stage build if applicable, and HEALTHCHECK. Missing any one creates a silent vulnerability. The .dockerignore file is not optional — it prevents secret exposure and reduces build time.

Why Your Dockerfile Needs a .dockerignore — And Most Don't Bother

You've seen it. The build that crawls for 90 seconds, copying node_modules or .git into the image for no reason. That's what happens when you skip a .dockerignore. The build context — everything in your project directory — gets shipped to the Docker daemon. Including your secrets, your 400MB vendor folder, and the cat picture you forgot about.

A .dockerignore works much like .gitignore. It tells the build to prune dead weight from the context before any COPY instruction runs. Smaller context means faster builds. Fewer cache invalidations. And you're not baking your .env file into the final image by accident.

Production teams treat .dockerignore as mandatory. Not optional. Not "nice to have". You don't get to ship a lean image without it.

.dockerignore and Dockerfile

# io.thecodeforge: devops tutorial

# ---- .dockerignore: ditch the junk before the build starts ----
node_modules
.git
.env
*.log
build/
dist/
.DS_Store

# ---- Dockerfile ----
FROM node:20-alpine
WORKDIR /app

# Without .dockerignore, this COPY grabs everything
# With .dockerignore, it's just src + package.json
COPY . .

Output

Without .dockerignore: Sending build context to Docker daemon 245.78MB

With .dockerignore: Sending build context to Docker daemon 4.21MB

Production Trap:

If your build context includes .git and you COPY . ., you're shipping your commit history into the image. Anyone who can pull the image can extract the repository from its layers. Always add .git to your .dockerignore.

Key Takeaway

Pair a .dockerignore with every build context. Smaller context = faster builds, and files you exclude can never be copied into the image.
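To make the "mandatory" rule enforceable, a pre-build script can refuse to continue when .dockerignore is absent or lacks the riskiest entries. This is a rough sketch, assuming a POSIX shell; the function name check_dockerignore and the choice of three required entries are illustrative. It matches whole lines only, so an equivalent pattern such as **/node_modules would be reported as missing.

```shell
#!/bin/sh
# Sketch: report which must-exclude entries are missing from a project's .dockerignore.
# The entries (.git, .env, node_modules) come from the section above; the function
# name and exit-code convention are assumptions made for this example.
check_dockerignore() {
    project_dir="$1"
    ignore_file="$project_dir/.dockerignore"

    if [ ! -f "$ignore_file" ]; then
        echo "missing: .dockerignore"
        return 1
    fi

    status=0
    for entry in .git .env node_modules; do
        # -x matches the whole line, -F treats the entry as a literal string.
        # The entry is also accepted with a trailing slash, e.g. "node_modules/".
        if ! grep -qxF -e "$entry" -e "$entry/" "$ignore_file"; then
            echo "missing: $entry"
            status=1
        fi
    done
    return "$status"
}

# Demo on a throwaway directory whose .dockerignore forgets .env.
demo_dir="$(mktemp -d)"
printf '%s\n' 'node_modules' '.git' '*.log' > "$demo_dir/.dockerignore"
check_dockerignore "$demo_dir" || echo "fix .dockerignore before building"
rm -rf "$demo_dir"
```

Run as a CI step before docker build, a non-zero return stops the pipeline before any secrets reach the build context.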

How the Build Cache Really Works — Stop Breaking It

Docker doesn't rebuild everything from scratch. It caches layers. For each instruction, Docker checks whether it already has a layer built from the same parent layer and the same instruction (for COPY and ADD, the same file contents too). If it does, it reuses it. If not, it rebuilds that layer and every layer after it.

The problem? Developers put frequently changing files early in the Dockerfile. COPY the entire source tree before running npm install. Now every code change invalidates the node_modules layer. You pay for a full npm install on every build.

The fix is brutal and effective: order your instructions from least to most volatile. Start with dependency manifests and lockfiles. Install dependencies. Then copy the source code. Docker's cache is dumb: it follows the order you give it. Give it a good order.

This pattern saves minutes per build in CI. If a one-line code change triggers a full dependency install, your layer ordering is wrong.

CacheOptimizedDockerfile · Dockerfile

# io.thecodeforge - devops tutorial
# Dockerfile comments must sit on their own line and start with #.

# WRONG ORDER: every code change busts the whole cache
FROM python:3.12-slim
WORKDIR /app
# Source changes ALL the time
COPY . .
# Rebuilds dependencies EVERY time
RUN pip install -r requirements.txt

# RIGHT ORDER: stable layers first
FROM python:3.12-slim AS builder
WORKDIR /app
# Changes rarely
COPY requirements.txt .
# Stable cache hit
RUN pip install --no-cache-dir -r requirements.txt
# Source changes often, but only busts this layer
COPY src ./src

# Build output showing cache status
# RUN pip install...  --->  Using cache (fast)
# COPY src ./src      --->  Cache busted (slow, but only one layer)

Output

Step 4/5 : RUN pip install --no-cache-dir -r requirements.txt

---> Using cache

---> a1b2c3d4e5f6

Step 5/5 : COPY src ./src

---> 7a8b9c0d1e2f

Time: 2.3 seconds (cache used for install)

Senior Shortcut:

Deduplicate your package manager installs by using a builder stage for dependencies, then copy only the virtual env or node_modules into the final stage. Keeps the cache intact and the final image small.

Key Takeaway

Order instructions from stable to volatile. Cache misses only invalidate the layers that follow — put your code last.
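The ordering rule lends itself to a simple lint. The sketch below, assuming a POSIX shell with awk, flags a Dockerfile in which a blanket COPY . . appears before the dependency install. The function name check_layer_order is hypothetical, it only recognises npm and pip installs, and it treats the file as a single stage.

```shell
#!/bin/sh
# Sketch: warn when "COPY . ." precedes the dependency install in a Dockerfile.
# check_layer_order is an illustrative name; only npm and pip are recognised.
check_layer_order() {
    awk '
        # Remember the first line that copies the whole build context.
        toupper($1) == "COPY" && $2 == "." && copy_line == 0 { copy_line = NR }
        # Remember the first RUN line that installs dependencies.
        toupper($1) == "RUN" && /(npm (install|ci)|pip install)/ && install_line == 0 { install_line = NR }
        END {
            if (copy_line && install_line && copy_line < install_line) {
                printf "line %d: COPY . . comes before the install on line %d\n", copy_line, install_line
                exit 1
            }
        }
    ' "$1"
}

# Demo with the two orderings from this section.
work_dir="$(mktemp -d)"
cat > "$work_dir/Dockerfile.wrong" <<'EOF'
FROM python:3.12-slim
WORKDIR /app
COPY . .
RUN pip install -r requirements.txt
EOF
cat > "$work_dir/Dockerfile.right" <<'EOF'
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY src ./src
EOF
check_layer_order "$work_dir/Dockerfile.wrong" || echo "wrong order detected"
check_layer_order "$work_dir/Dockerfile.right" && echo "right order passes"
rm -rf "$work_dir"
```

A dedicated linter covers far more cases; the point here is only that the "stable first, volatile last" rule is concrete enough to automate.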

Metadata in Dockerfiles — Labels, EXPOSE, and the Lies You Tell

A Dockerfile isn't just for building. It's for documentation. The instructions that don't affect the filesystem — LABEL, EXPOSE, and ARG — tell anyone who reads the image what it's supposed to do.

LABEL adds metadata as key-value pairs. Use it for maintainer contact, version, and git commit. EXPOSE doesn't actually publish a port. It annotates that the container listens on that port at runtime. It's a contract between the image author and the person running it. If you skip it, you're hiding what the app needs.

ARG defines build-time variables. Use it to pass version numbers or environment-specific configs without hardcoding. But be careful — ARG values persist in the image history. Don't put secrets in ARG unless you want them leaked.

Production workflows read these labels. docker images --filter "label=<key>" selects on them, and registries and scanners that understand the org.opencontainers.image.* keys can surface them. If your Dockerfile has zero labels, you're shipping a blank ID card.

MetadataDockerfile · Dockerfile

# io.thecodeforge - devops tutorial

FROM golang:1.22-alpine AS builder

# Build-time args, NOT for secrets
ARG APP_VERSION
ARG GIT_SHA

LABEL org.opencontainers.image.version="${APP_VERSION}"
LABEL org.opencontainers.image.revision="${GIT_SHA}"
LABEL org.opencontainers.image.source="https://github.com/yourorg/payments-api"

EXPOSE 8080

# Inspection output for the built image
# docker inspect payments-api:latest shows all labels

Output

Labels:

org.opencontainers.image.revision: abc123def456

org.opencontainers.image.source: https://github.com/yourorg/payments-api

org.opencontainers.image.version: v2.4.1

ExposedPorts:

8080/tcp

Production Trap:

Never use ARG for tokens, passwords, or API keys. They persist in the image metadata and are viewable with docker history. Use BuildKit's --secret flag with RUN --mount=type=secret at build time, or Docker secrets at runtime, instead.

Key Takeaway

LABEL your image as documentation. EXPOSE declares intent. ARG is for build-time configs, not secrets.
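The ARG values in the example above have to be supplied at build time with --build-arg. As a small sketch, the helper below assembles that command and prints it rather than running it, so it can be reviewed or logged first. The function name print_build_command is hypothetical, and the sample values are the ones shown in the inspection output above.

```shell
#!/bin/sh
# Sketch: print (not run) the docker build command that fills APP_VERSION and GIT_SHA.
# print_build_command is an illustrative helper, not a Docker feature.
print_build_command() {
    app_version="$1"
    git_sha="$2"
    image_tag="$3"
    printf 'docker build --build-arg APP_VERSION=%s --build-arg GIT_SHA=%s -t %s .\n' \
        "$app_version" "$git_sha" "$image_tag"
}

print_build_command "v2.4.1" "abc123def456" "payments-api:latest"

# In CI the revision would typically come from: git rev-parse HEAD
# After a real build, the labels can be read back with:
#   docker inspect --format '{{json .Config.Labels}}' payments-api:latest
```

Keeping the version and revision as arguments means the same Dockerfile produces correctly labelled images for every commit without edits.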

Why Every Dockerfile Needs a Clear Explanation of Base Image Choices

Most Dockerfiles start with FROM ubuntu:latest or FROM node:18-alpine without explaining why. That's a trap. The base image you pick directly dictates image size, attack surface, and compatibility. A general-purpose base like ubuntu:22.04 is roughly 77 MB and includes tools your app never calls: perfect for testing, terrible for production. Alpine images drop to single-digit megabytes but use musl libc, which can break binaries compiled against glibc. Distroless images strip everything but the runtime, which cuts the CVE count sharply but makes debugging hard because there is no shell to exec into. The rule: state your base image rationale in a comment above FROM. 'We use node:18-slim because Alpine's musl breaks our native bcrypt module.' That single line saves the next engineer hours of guessing. Never inherit a base image you can't explain in one sentence.

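BaseImageRationale · Dockerfile

# io.thecodeforge - devops tutorial

# Rationale: node:18-slim avoids musl issues with bcrypt native bindings
# Alpine would cause runtime segfaults on this specific version
FROM node:18-slim AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
# Assumption: package.json defines a "build" script that emits /app/dist
RUN npm run build

# Second stage uses distroless to minimize CVEs
# Only the compiled output and dependencies are copied
FROM gcr.io/distroless/nodejs18-debian11
COPY --from=builder /app/node_modules /app/node_modules
COPY --from=builder /app/dist /app
# The distroless Node.js image already has node as its entrypoint
CMD ["/app/server.js"]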

Output

Image shrinks from 500MB to 120MB. CVE count drops from 12 to 0.

Production Trap:

Alpine sounds lightweight but can break apps that use native extensions built against glibc (pg-native, bcrypt, sharp). Test musl compatibility before you commit.

Key Takeaway

Every FROM line must be justified with a one-line comment explaining why that base image survives the trade-off.
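The "comment above every FROM" rule can be checked automatically. The sketch below assumes a POSIX shell with awk; check_from_rationale is a hypothetical name. It cannot judge whether the comment is a good rationale, only that the line directly above FROM is a comment.

```shell
#!/bin/sh
# Sketch: flag every FROM line that is not directly preceded by a comment line.
# check_from_rationale is an illustrative name, not part of any existing linter.
check_from_rationale() {
    awk '
        toupper($1) == "FROM" && prev !~ /^[ \t]*#/ {
            printf "line %d: %s has no rationale comment above it\n", NR, $0
            missing = 1
        }
        { prev = $0 }
        END { exit (missing ? 1 : 0) }
    ' "$1"
}

# Demo using the two FROM lines from the example in this section.
demo_dir="$(mktemp -d)"
cat > "$demo_dir/Dockerfile" <<'EOF'
# Rationale: node:18-slim avoids musl issues with bcrypt native bindings
FROM node:18-slim AS builder

# Second stage uses distroless to minimize CVEs
FROM gcr.io/distroless/nodejs18-debian11
EOF
check_from_rationale "$demo_dir/Dockerfile" && echo "every FROM is documented"
rm -rf "$demo_dir"
```

A blank line between the comment and FROM counts as a miss in this sketch, which keeps the rationale visually attached to the line it explains.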

Additional Resources That Fix Real Dockerfile Pain Points

Official documentation won't teach you what hurts most: debugging broken cache hits, wrestling with BuildKit secrets, or slimming images without breaking the app. These resources close that gap. For cache debugging, read Docker's 'Optimizing Builds with Cache' docs — but skip the theory and jump to the 'Cache invalidation patterns' section. For multi-stage builds, check out 'Docker Multi-Stage Builds: The Practical Guide' on dev.to by a former Docker engineer — it covers live examples of cross-stage variable passing. For security, use Hadolint (hadolint.github.io) to lint your Dockerfile against 100+ rules, then read Aqua Security's 'Dockerfile Best Practices' for real CVE reduction metrics. These aren't blog fluff — they're the exact resources senior engineers open when their pipeline fails.

DockerfileCheatSheet · Dockerfile

# io.thecodeforge - devops tutorial

# Lint your Dockerfile before every commit
# hadolint Dockerfile --ignore DL3008 --trusted-registry docker.io

# Build with cache debugging
# DOCKER_BUILDKIT=1 docker build --progress=plain . 2>&1 | grep "CACHED"

# Security scan after build
# docker scout cves my-image:latest --only-severity high

# Add metadata for traceability
LABEL org.opencontainers.image.source="https://github.com/myorg/repo"

Output

Lint catches 3 unused ENV variables. Build cache hits jump from 20% to 80% after fixing layer ordering.

Senior Move:

Bookmark hadolint's rule list — it explains why each line you write breaks security or cache patterns.

Key Takeaway

Pro engineers rely on three tools: Hadolint for linting, Docker Scout for CVEs, and BuildKit plain progress for cache debugging.

● Production incident · Post-mortem · Severity: high

Kubernetes Pods Take 30 Seconds to Stop — Graceful Shutdown Silently Broken by Shell-Form CMD

Symptom

During Kubernetes rolling updates, old pods showed Terminating status for exactly 30 seconds before being killed. Application logs showed no shutdown message (the team had added a SIGTERM handler that logged 'Shutting down gracefully...'). Database connection pools were not closed cleanly, causing 'connection reset by peer' errors on the database server. The team checked the Kubernetes events: 'Killing container with id docker://api:pod did not terminate in 30s, using SIGKILL'.

Assumption

The team assumed the application's SIGTERM handler had a bug. They tested it locally with docker stop and it worked — the shutdown message appeared and the process exited cleanly in 2 seconds. They assumed Kubernetes was sending a different signal. They added handlers for SIGINT, SIGHUP, and SIGQUIT. None of them fired. The team spent 3 days debugging the signal handling code.

Root cause

The Dockerfile used shell-form CMD: CMD node dist/index.js. Shell form wraps the command in /bin/sh -c, making the shell PID 1 and the Node.js process PID 2. Kubernetes sends SIGTERM to PID 1 (the shell). The shell does not forward signals to child processes by default. After 30 seconds (the default terminationGracePeriodSeconds), the container's processes are killed with SIGKILL, which ends the Node.js process without running any shutdown handlers. Locally, docker stop follows the same pattern (SIGTERM to PID 1, then SIGKILL after a 10-second default timeout), but Docker Desktop's behavior differs slightly, masking the issue during local testing.
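The mechanism can be reproduced without Docker. The sketch below, assuming a POSIX shell, starts a stand-in app twice: once as the child of a wrapper shell (the shell-form analogue) and once via exec (the exec-form analogue), then sends SIGTERM to the launched PID, which plays the role of PID 1. All file and function names are illustrative. One caveat: outside a container the wrapper shell dies on SIGTERM, whereas as PID 1 it would ignore the signal; in both situations the app's handler never runs.

```shell
#!/bin/sh
# Local sketch, no Docker required. All file and function names are illustrative.
work_dir="$(mktemp -d)"

# Stand-in for the app: install a SIGTERM handler, publish our PID, then idle.
cat > "$work_dir/app.sh" <<'EOF'
#!/bin/sh
trap 'echo "Shutting down gracefully..." > "$1.log"; exit 0' TERM
echo "$$" > "$1.pid"
# "sleep & wait" lets the trap fire immediately instead of after sleep returns.
while :; do sleep 1 & wait "$!"; done
EOF

# run_case NAME COMMAND...: start COMMAND, send SIGTERM to the PID we launched,
# and record in $result whether the app's handler ran.
run_case() {
    name="$1"; shift
    "$@" &
    launched_pid=$!
    # Wait until the app has installed its trap and written its PID file.
    while [ ! -s "$work_dir/$name.pid" ]; do sleep 1; done
    app_pid="$(cat "$work_dir/$name.pid")"

    kill -TERM "$launched_pid"
    wait "$launched_pid" 2>/dev/null || true
    sleep 1   # grace period in which a forwarded signal would have been handled

    if [ -f "$work_dir/$name.log" ]; then
        result="handler ran"
    else
        result="handler never ran"
    fi
    kill -KILL "$app_pid" 2>/dev/null || true   # clean up an orphaned app
    echo "$name: $result"
}

# Shell-form analogue: a wrapper shell runs the app as a child process.
# The trailing "; true" stops the shell from replacing itself with the app.
run_case wrapper sh -c "sh '$work_dir/app.sh' '$work_dir/wrapper'; true"
wrapper_result="$result"

# Exec-form analogue: the launched PID is the app itself.
run_case exec sh -c "exec sh '$work_dir/app.sh' '$work_dir/exec'"
exec_result="$result"

rm -rf "$work_dir"
```

On a typical Linux or macOS shell the wrapper case reports that the handler never ran and the exec case reports that it ran, which mirrors the 30-second hang versus the 2-second clean shutdown described in this incident.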

Fix

1. Changed CMD to exec form: CMD ["node", "dist/index.js"]. This makes the Node.js process PID 1, directly receiving SIGTERM.
2. Verified with docker exec <container> ps aux against a running container: PID 1 was now 'node dist/index.js' instead of '/bin/sh -c node dist/index.js'. (docker run --rm <image> ps aux cannot show this, because the ps aux argument replaces CMD and ps itself becomes PID 1.)
3. Tested in Kubernetes: pods now terminated in 2 seconds with the shutdown message appearing in logs.
4. Added a CI check that scans Dockerfiles for shell-form CMD/ENTRYPOINT and fails the build if found.
5. Documented the exec-form requirement in the team's Dockerfile style guide.

Key lesson

Shell-form CMD wraps your process in /bin/sh -c, making it PID 2. SIGTERM goes to PID 1 (the shell), not your app. Graceful shutdown is silently broken.
Exec-form CMD (JSON array syntax) makes your app PID 1. SIGTERM reaches your app directly. Always use exec form for CMD and ENTRYPOINT.
docker stop and Kubernetes SIGTERM behavior can differ. Test signal handling in Kubernetes, not just locally.
Add a CI check that detects shell-form CMD/ENTRYPOINT. This is a silent failure that only manifests under load during rolling updates.
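The incident report does not show the team's CI check, so the following is only one possible shape for it, assuming a POSIX shell with awk. The function name find_shell_form is hypothetical. It skips continuation lines so that the CMD keyword inside a multi-line HEALTHCHECK is not flagged, but it does not otherwise parse Dockerfile syntax.

```shell
#!/bin/sh
# Sketch: fail when a Dockerfile uses shell-form CMD or ENTRYPOINT.
# find_shell_form is an illustrative name; this is not the check from the incident.
find_shell_form() {
    awk '
        {
            is_continuation = continued        # previous line ended with a backslash
            continued = ($0 ~ /\\[ \t]*$/)
            instr = toupper($1)
        }
        # Exec form starts its arguments with "[", the opening of the JSON array.
        !is_continuation && (instr == "CMD" || instr == "ENTRYPOINT") && substr($2, 1, 1) != "[" {
            printf "line %d: shell-form %s: %s\n", NR, instr, $0
            found = 1
        }
        END { exit (found ? 1 : 0) }
    ' "$1"
}

# Demo with the CMD line from the incident and its exec-form fix.
demo_dir="$(mktemp -d)"
printf '%s\n' 'FROM node:20-alpine' 'CMD node dist/index.js' > "$demo_dir/Dockerfile.before"
printf '%s\n' 'FROM node:20-alpine' 'CMD ["node", "dist/index.js"]' > "$demo_dir/Dockerfile.after"
find_shell_form "$demo_dir/Dockerfile.before" || echo "build would fail here"
find_shell_form "$demo_dir/Dockerfile.after" && echo "exec form passes"
rm -rf "$demo_dir"
```

Because the failure is silent at runtime, catching it at review time is the cheapest place to stop it.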

Production debug guide

From slow builds to bloated images: systematic debugging paths.

Symptom · 01

Docker build is slow — every rebuild takes 3-5 minutes even for small code changes.

→

Fix

Check layer ordering. Run docker history <image> to see which layers were rebuilt. If the dependency install layer (npm install, pip install) rebuilds on every code change, the Dockerfile copies source code before dependency manifests. Fix: copy package.json/requirements.txt in a separate layer before COPY . . and run the install command in that layer.

Symptom · 02

Image is unexpectedly large — 1GB+ for a simple web application.

→

Fix

Run docker history --no-trunc <image> to see layer sizes. Look for layers with large intermediate files (apt cache, npm cache, build artifacts). Check if .dockerignore exists — without it, COPY . . includes node_modules and .git. Check if multi-stage builds are used — the final image may contain build tools that should be in a discarded builder stage.

Symptom · 03

Container ignores SIGTERM — takes 30 seconds to stop in Kubernetes.

→

Fix

Check if CMD or ENTRYPOINT uses shell form: docker inspect <image> --format '{{.Config.Cmd}}'. If the output is [/bin/sh -c node server.js], it is shell form. Fix: change to exec form: CMD ["node", "server.js"]. Verify PID 1 in a running container: docker exec <container> ps aux. PID 1 should be your app, not /bin/sh. (docker run --rm <image> ps aux does not work for this check, because the ps aux argument replaces CMD.)

Symptom · 04

Secrets visible in image history after being removed from Dockerfile.

→

Fix

Run docker history --no-trunc <image> and search for the secret value. Secrets in ENV, ARG, or RUN commands are permanently stored in layer history. Even if you delete the secret in a later layer, it persists in the earlier layer. Fix: rotate the secret immediately. Rebuild using BuildKit --mount=type=secret for build-time secrets.
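Searching history finds a leak after the fact; a lint on the Dockerfile can catch the obvious cases earlier. The sketch below assumes a POSIX shell with awk. The function name find_secret_like_vars and the keyword list are illustrative, it inspects only the first variable on each ENV or ARG line, and it prints names only so the check itself never echoes a value into CI logs.

```shell
#!/bin/sh
# Sketch: flag ENV/ARG names that look like secrets. Heuristic only.
# find_secret_like_vars and its keyword list are assumptions for this example.
find_secret_like_vars() {
    awk '
        { instr = toupper($1) }
        instr == "ENV" || instr == "ARG" {
            name = $2
            sub(/=.*/, "", name)    # drop "=value" so the value is never printed
            if (toupper(name) ~ /PASSWORD|PASSWD|SECRET|TOKEN|API_?KEY/) {
                printf "line %d: %s %s looks like a secret\n", NR, instr, name
                found = 1
            }
        }
        END { exit (found ? 1 : 0) }
    ' "$1"
}

# Demo on a throwaway Dockerfile.
demo_dir="$(mktemp -d)"
cat > "$demo_dir/Dockerfile" <<'EOF'
FROM node:20-alpine
ARG NPM_TOKEN
ENV NODE_ENV=production
EOF
find_secret_like_vars "$demo_dir/Dockerfile" || echo "move these to --mount=type=secret"
rm -rf "$demo_dir"
```

A name-based heuristic will miss secrets with innocent names and flag harmless ones, so it complements rather than replaces rotating anything that has already been built into a layer.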

Symptom · 05

Build fails with 'COPY failed: file not found'.

→

Fix

Check if the file is excluded by .dockerignore: cat .dockerignore. Check if the file path is correct relative to the build context (the directory where docker build is run). Check if the file exists: ls -la <file>. Common mistake: using an absolute path in COPY instead of a path relative to the build context.

Symptom · 06

Container runs as root despite USER instruction in Dockerfile.

→

Fix

Check that the USER instruction is in the final stage: a USER set in a builder stage does not carry over, and a later USER root overrides an earlier one. Its position relative to CMD/ENTRYPOINT does not matter. Check if the base image's entrypoint script switches user. Verify: docker run --rm <image> whoami. If it returns root, check docker inspect <image> --format '{{.Config.User}}'. If empty, the USER instruction was not applied. Check if the user exists in /etc/passwd inside the image.

★ Dockerfile Build Triage Cheat Sheet

First-response commands when Dockerfile builds are slow, images are bloated, or containers behave unexpectedly.

Docker build is slow: every rebuild takes minutes.

Immediate action

Check which layers are being rebuilt vs cached.

Commands

docker build --progress=plain -t test:latest . 2>&1 | grep -E 'CACHED|RUN|COPY'

docker history <image> --format '{{.CreatedBy}} {{.Size}}'

Fix now

If RUN npm install rebuilds on every change, copy package.json and run the install before the COPY . . step. This separates dependency installation from source code copying.


Shell Form vs Exec Form — Signal Handling and Process Management

Aspect	Shell Form (CMD command arg)	Exec Form (CMD ["command", "arg"])
Syntax	CMD node server.js	CMD ["node", "server.js"]
Process spawning	Runs inside /bin/sh -c — your app is a child process	Runs directly — your app IS the process
PID in container	Your app gets PID 2 or higher	Your app gets PID 1
Signal handling	SIGTERM from Docker/K8s may not reach your app	SIGTERM reaches your app directly — clean shutdown works
Shell features available	Yes — variable expansion, pipes, &&	No — must handle logic in the command itself
Best used for	RUN instructions that need shell features	CMD and ENTRYPOINT — always prefer this
Risk	Graceful shutdown often silently broken	Minimal — this is the safe default

Key takeaways

Docker builds images as a stack of cached layers: order your COPY and RUN instructions from least-to-most frequently changing, always copying dependency manifests before source code, to get near-instant cached rebuilds.

Always use exec form (JSON array syntax) for CMD and ENTRYPOINT: shell form wraps your process in /bin/sh -c, bumping it to PID 2 and silently breaking graceful shutdown in Docker and Kubernetes.

Multi-stage builds let you use a full toolchain (800MB) during compilation and ship only the compiled output (12MB) to production: the build stage is discarded and never pushed to a registry.

The .dockerignore file is mandatory, not optional: without it, COPY . . silently bakes node_modules, .git history, and .env files into your image; add it before you write your first COPY instruction.

Never put secrets in ENV or ARG: they are permanently visible in docker history. Use BuildKit --mount=type=secret for build-time secrets and secrets managers for runtime secrets.

A production Dockerfile has six mandatory checks: non-root USER, no secrets in ENV/ARG, .dockerignore, exec-form CMD, multi-stage build, and HEALTHCHECK.


Naren Founder & Principal Engineer

20+ years shipping production infrastructure and CI/CD at scale. Everything here is grounded in real deployments.


Last updated: May 23, 2026
