Dockerfile CMD Shell Form — Why SIGTERM Fails
Shell-form CMD makes /bin/sh PID 1 and your app its child, so the SIGTERM Kubernetes sends hits the shell instead of your app.
20+ years shipping production infrastructure and CI/CD at scale. Everything here is grounded in real deployments.
- Each instruction creates a read-only layer — a filesystem diff on top of the previous layer
- Docker caches layers sequentially — changing one layer invalidates all layers after it
- Order instructions from least-likely-to-change to most-likely-to-change for optimal caching
- Multi-stage builds let you use heavy toolchains during compilation and ship only the output
- FROM: selects the base image (alpine for size, debian/ubuntu for compatibility)
- RUN: executes commands during build — chain with && to reduce layer count
- COPY: copies files into the image — prefer over ADD unless you need tar extraction
- CMD vs ENTRYPOINT: CMD provides default args, ENTRYPOINT sets the fixed executable
- ARG vs ENV: ARG is build-time only, ENV persists at runtime — never put secrets in either
Imagine you're moving to a new city and need to set up your apartment exactly the way you like it. Instead of doing it from memory every time, you write a step-by-step instruction sheet: 'Step 1 — buy a bed frame. Step 2 — assemble it. Step 3 — put the mattress on top.' A Dockerfile is exactly that instruction sheet, but for your application's environment. Docker reads it top to bottom and builds a perfect, repeatable copy of your app's home — every single time, on any machine in the world.
Dockerfiles eliminate environment drift. A Dockerfile is a plain-text script that defines every dependency, runtime, and configuration your application needs. Docker reads it and builds an image — a portable, immutable snapshot that runs identically on any machine.
The layer caching mechanism is the single most important concept. Each instruction creates a cached layer. Changing one instruction invalidates all subsequent layers. Order your instructions from least-likely-to-change to most-likely-to-change to maximize cache hits during development.
Three misconceptions cause the most production issues: CMD without exec form silently breaks graceful shutdown in Kubernetes, ENV and ARG are visible in docker history (never put secrets in them), and .dockerignore is not optional (COPY . . without it bakes secrets and gigabytes of junk into the image).
What Dockerfile CMD Shell Form Actually Does
The Dockerfile CMD instruction defines the default command that runs when a container starts. In shell form (CMD command param1), Docker wraps it as /bin/sh -c "command param1". That shell process becomes PID 1 inside the container, not your application.
This matters because PID 1 is the process that docker stop signals, and Linux treats PID 1 specially: a signal it has not installed a handler for is ignored instead of triggering the default action. When you use shell form, your Java app (e.g., java -jar app.jar) runs as a child of sh, not as PID 1. docker stop sends SIGTERM to PID 1, the shell, which by default does not forward signals to child processes. Your JVM never sees SIGTERM. After the default 10-second grace period, Docker escalates to SIGKILL, killing the JVM abruptly.
The result is no graceful shutdown: no shutdown hooks, no draining connections, no flushing buffers. In production, this causes dropped requests, corrupted state, and angry users. Use exec form, CMD ["java", "-jar", "app.jar"], to make your app PID 1 and receive signals directly.
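A minimal sketch of the two forms side by side. The base image tag and the app.jar file name are placeholders chosen for illustration, not taken from a real project.

```dockerfile
# Hypothetical Java service; the image tag and jar name are illustrative assumptions.
FROM eclipse-temurin:21-jre
WORKDIR /app
COPY app.jar .

# Shell form (commented out): Docker would run /bin/sh -c "java -jar app.jar".
# The shell is PID 1 and the JVM is its child, so SIGTERM may never reach the JVM.
# CMD java -jar app.jar

# Exec form: Docker runs java directly, so the JVM is PID 1 and receives SIGTERM.
CMD ["java", "-jar", "app.jar"]
```

To see which process ended up as PID 1 in a running container, something like docker exec <container> cat /proc/1/cmdline should print either the shell invocation or the java command.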
How Docker Builds an Image — Layers Are Everything
Before you write a single Dockerfile instruction, you need a mental model of what Docker is actually doing when it reads your file. Docker doesn't build one monolithic blob. It builds a stack of read-only layers, one per instruction. Each layer is a diff — only the filesystem changes from that step.
Why does this matter? Because Docker caches every layer. If you rebuild an image and nothing changed in a particular step, Docker reuses the cached layer instead of running it again. This turns a 3-minute build into a 4-second build. But the cache is sequential — as soon as one layer is invalidated (because something changed), every layer after it is also invalidated and rebuilt from scratch.
This single insight drives the most important Dockerfile design decision you'll ever make: order your instructions from least-likely-to-change to most-likely-to-change. Your base OS almost never changes. Your system dependencies change occasionally. Your app's package dependencies change sometimes. Your source code changes constantly. Structure your Dockerfile in that order and you'll get near-instant cached rebuilds during development.
Think of layers like a stack of transparent slides on an overhead projector. Each slide adds something. You can swap out the top slide without reprinting all the slides beneath it.
Layer size and the cleanup-in-same-layer rule: Each RUN instruction creates a new layer. If you download a 200MB package in one RUN and delete it in the next RUN, the 200MB still exists in the first layer — layers are additive. The delete only adds a whiteout marker. Always chain download and cleanup in the same RUN with && to avoid bloating the image with phantom files.
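The cleanup-in-same-layer rule can be sketched as follows. The package names are arbitrary examples; the point is only where the rm sits relative to the install.

```dockerfile
FROM debian:bookworm-slim

# Anti-pattern (kept as comments): the package index written by the first RUN
# stays in that layer; the second RUN only records whiteouts on top of it.
# RUN apt-get update && apt-get install -y curl
# RUN rm -rf /var/lib/apt/lists/*

# Install and clean up in the same RUN so the temporary files never land in a layer.
RUN apt-get update \
    && apt-get install -y --no-install-recommends curl ca-certificates \
    && rm -rf /var/lib/apt/lists/*
```

Running docker history on both variants is a quick way to compare the per-layer sizes.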
- Each layer is a diff on top of the previous layer. If the base changes, the diff no longer applies.
- Docker cannot know if a later instruction depends on the changed content in an earlier layer.
- The cache is sequential, not selective — Docker rebuilds from the first invalidated layer onward.
- This is why layer ordering (least-change to most-change) is the single most impactful Dockerfile optimization.
The Instructions That Actually Matter — And What They're Really Doing
There are 18 Dockerfile instructions. In practice, you'll use about 10 of them regularly. Rather than listing all 18 mechanically, let's focus on the ones that cause confusion or have non-obvious behaviour — because those are the ones that bite you in production.
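FROM starts every build stage; only ARG, comments, and parser directives may appear before it. It picks your starting layer. FROM scratch gives you an empty image, useful for statically compiled Go or Rust binaries. FROM node:20-alpine gives you Node on Alpine Linux, whose base layer is only a few megabytes compared with a Debian base that is many times larger. Prefer Alpine for production when your dependencies work with its musl libc; prefer the fuller images when you need glibc compatibility or debugging tools.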
RUN executes a shell command during the build. Each RUN creates a new layer. Chain related commands with && and clean up in the same RUN to avoid bloating the image with intermediate files that persist in a layer even after you delete them later.
COPY vs ADD: Use COPY almost always. ADD does extra magic — it auto-extracts tar archives and can fetch URLs — but that magic makes builds unpredictable. Use ADD only when you explicitly need its archive extraction feature.
ENV sets environment variables available at both build time and runtime. ARG sets variables available only at build time. Never put secrets in ENV — they're visible in docker inspect and image history. Use runtime secret injection instead.
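A small sketch of the difference, assuming a hypothetical APP_VERSION build argument. Neither value is a secret; both remain visible in the image metadata.

```dockerfile
FROM node:20-alpine

# ARG exists only while the image is being built. Override it with:
#   docker build --build-arg APP_VERSION=1.4.2 .
ARG APP_VERSION=dev

# ENV is stored in the image config and is set in every container started from it.
# Copying the ARG into an ENV is how a build-time value survives to runtime.
ENV NODE_ENV=production \
    APP_VERSION=${APP_VERSION}

# Both are readable by RUN during the build.
RUN echo "building version ${APP_VERSION} with NODE_ENV=${NODE_ENV}"
```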
ENTRYPOINT vs CMD: ENTRYPOINT sets the executable that always runs. CMD provides default arguments to it. When you run docker run my-image --verbose, that --verbose replaces CMD but gets appended to ENTRYPOINT. Together they let you build images that behave like CLI tools.
The HEALTHCHECK instruction: HEALTHCHECK tells Docker how to determine if the container's process is healthy. Without it, Docker only checks if the process is running, not if it is functional. A process that is running but deadlocked appears healthy. HEALTHCHECK runs a command periodically and marks the container as unhealthy if it fails. Docker Swarm uses this status to replace unhealthy tasks. Kubernetes ignores the Dockerfile HEALTHCHECK and relies on the liveness and readiness probes defined in the pod spec, so configure those separately.
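A hedged example of a HEALTHCHECK, assuming the app serves a /health endpoint on port 8080 (both are illustrative) and that the base image ships BusyBox wget, as Alpine does.

```dockerfile
FROM node:20-alpine
WORKDIR /app
COPY server.js .
EXPOSE 8080

# Probe the hypothetical /health endpoint every 30s. Three consecutive failures
# mark the container unhealthy; start-period gives the app time to boot.
HEALTHCHECK --interval=30s --timeout=3s --start-period=10s --retries=3 \
  CMD wget -q --spider http://127.0.0.1:8080/health || exit 1

CMD ["node", "server.js"]
```

The current status can be read with docker inspect --format '{{.State.Health.Status}}' <container>.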
- When building CLI-like tools: ENTRYPOINT is the tool, CMD is the default subcommand.
- When you want a fixed executable with configurable arguments: ENTRYPOINT ["python", "-m", "uvicorn"] + CMD ["app:app", "--port", "8080"].
- When users should be able to override arguments without re-specifying the executable.
- When you want docker run <image> --help to work — the --help replaces CMD and is appended to ENTRYPOINT.
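The uvicorn combination from the list above, written out as a sketch. The app.py module and its app object are assumed to exist; the host and port are example defaults.

```dockerfile
FROM python:3.12-slim
WORKDIR /app
RUN pip install --no-cache-dir uvicorn
# Hypothetical module exposing an ASGI callable named "app".
COPY app.py .

# Fixed executable: every container runs uvicorn.
ENTRYPOINT ["python", "-m", "uvicorn"]
# Default arguments: replaced by anything passed after the image name.
CMD ["app:app", "--host", "0.0.0.0", "--port", "8080"]
```

With this image, docker run my-image runs python -m uvicorn app:app --host 0.0.0.0 --port 8080, while docker run my-image --help runs python -m uvicorn --help.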
Multi-Stage Builds — The Pattern That Separates Pros from Beginners
Here's a scenario every developer hits: you need a compiler or build tool to produce your application binary, but you don't need that compiler in the final image running in production. Shipping the compiler anyway means a larger attack surface, a bigger image pulling over the network, and slower startup times in Kubernetes.
Multi-stage builds solve this elegantly. You define multiple FROM blocks in one Dockerfile. Each FROM starts a fresh image context. You build your application in an early 'builder' stage that has all the tools, then you COPY only the compiled output into a final, minimal 'runtime' stage. The builder stage is discarded — it never ships.
This pattern is transformative for compiled languages. A Go application that builds in an 800MB image with all the Go toolchain can ship as a 12MB Alpine or even a 3MB scratch image containing just the binary. But it's equally powerful for JavaScript: build your React app with node_modules in one stage, then copy only the /dist folder into an nginx image.
The key instruction is COPY --from=builder. The name builder is just a label you assign with AS in the FROM line. You can have as many stages as you need, and any stage can copy from any previous stage. You can even reference external images as copy sources with --from=nginx:alpine.
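A minimal two-stage sketch for a Go service. The ./cmd/server package path and the presence of go.mod and go.sum are assumptions about the project layout.

```dockerfile
# Stage 1: full Go toolchain, used only to compile.
FROM golang:1.22 AS builder
WORKDIR /src
# Copy the module files first so dependency downloads are cached separately.
COPY go.mod go.sum ./
RUN go mod download
COPY . .
# CGO is disabled so the binary is statically linked and can run on scratch.
RUN CGO_ENABLED=0 go build -o /out/server ./cmd/server

# Stage 2: empty base image containing only the binary.
FROM scratch
COPY --from=builder /out/server /server
ENTRYPOINT ["/server"]
```

A scratch image has no CA certificates or shell, so a service that makes outbound TLS calls would also need the certificate bundle copied in from the builder stage.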
Build-time secrets in multi-stage builds: Multi-stage builds are the correct pattern for handling build-time secrets. Put the secret in the builder stage (using BuildKit --mount=type=secret), use it during compilation, and the secret never appears in the final runtime stage. The builder stage is discarded, and with it any trace of the secret.
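A sketch of a BuildKit secret mount, assuming a private npm registry whose credentials live in an .npmrc file. The secret id npmrc and the dist output directory are illustrative names.

```dockerfile
# syntax=docker/dockerfile:1
FROM node:20-alpine AS builder
WORKDIR /app
COPY package.json package-lock.json ./
# The secret is mounted only for the duration of this RUN; it is not written to any layer.
RUN --mount=type=secret,id=npmrc,target=/root/.npmrc npm ci
COPY . .
RUN npm run build

# Runtime stage: static files only, no credentials and no node_modules.
FROM nginx:alpine
COPY --from=builder /app/dist /usr/share/nginx/html
```

The secret is supplied at build time, for example with docker build --secret id=npmrc,src=$HOME/.npmrc .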
Targeting a specific stage: Use docker build --target <stage-name> to build up to a specific stage. This is useful for debugging — build the builder stage and inspect it without building the runtime stage: docker build --target builder -t debug . && docker run --rm -it debug sh.
- Deleting files in a layer does not reduce image size — layers are additive. The deleted files persist in earlier layers.
- Multi-stage builds discard entire stages — the builder stage never appears in the final image.
- The final image has fewer layers, smaller size, and a reduced attack surface (no compiler, no build tools).
- Multi-stage builds also improve build cache — the builder stage is cached independently from the runtime stage.
Production-Ready Dockerfile — Putting It All Together
Knowing individual instructions is one thing. Knowing how they compose into a secure, efficient, production-grade Dockerfile is what makes the difference in a real project. There are four production concerns beyond 'does it build': image size, security, build speed, and signal handling.
Image size: use a minimal base, chain RUN commands, use multi-stage builds, and add a .dockerignore file, which is the most commonly forgotten piece. Without it, docker build sends your entire project directory (including node_modules, .git, test fixtures) to the daemon as build context, which can make builds take minutes before a single instruction executes, and COPY . . then copies all of it into the image.
Security: never run as root. Add a non-root user with RUN addgroup and adduser, then switch to it with USER. If an attacker compromises your app, running as a non-root user limits the blast radius significantly.
Signal handling: always use exec form ["executable", "arg"] for CMD and ENTRYPOINT, not shell form executable arg. Shell form wraps your command in /bin/sh -c, which means the shell is PID 1 and your process runs as its child. Kubernetes and Docker send SIGTERM to PID 1 when stopping a container. If your app isn't PID 1, it never receives the signal and gets hard-killed after the timeout.
Build speed: everything from the layers section applies. Order layers by change frequency and separate dependency manifests from source code.
The .dockerignore file in detail: The .dockerignore file excludes files from the build context before they are sent to the Docker daemon. Without it, the entire directory (including .git, node_modules, .env, test fixtures) is sent to the daemon, increasing build time and risking secret exposure. Common patterns to exclude: node_modules/, .git/, .env, *.log, coverage/, __pycache__/, *.pyc, .dockerignore itself.
- Non-root USER instruction before CMD/ENTRYPOINT.
- No secrets in ENV, ARG, or RUN instructions.
- .dockerignore file exists and excludes node_modules, .git, .env.
- Exec-form CMD/ENTRYPOINT for signal handling.
- Multi-stage build if build tools are not needed at runtime.
- HEALTHCHECK instruction for Docker and Swarm health status (Kubernetes needs its own probes in the pod spec).
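The checklist above, combined into one sketch for a Node service. The build script, the dist/server.js entry point, port 8080, and the /health endpoint are all assumptions about a hypothetical project.

```dockerfile
# Build stage: install all dependencies and compile.
FROM node:20-alpine AS builder
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
RUN npm run build

# Runtime stage: production dependencies and build output only.
FROM node:20-alpine
WORKDIR /app
ENV NODE_ENV=production
COPY package.json package-lock.json ./
RUN npm ci --omit=dev && npm cache clean --force
COPY --from=builder /app/dist ./dist

# Create and switch to an unprivileged user (BusyBox addgroup/adduser syntax).
RUN addgroup -S app && adduser -S -G app app
USER app

EXPOSE 8080
HEALTHCHECK --interval=30s --timeout=3s --retries=3 \
  CMD wget -q --spider http://127.0.0.1:8080/health || exit 1

# Exec form so node is PID 1 and receives SIGTERM directly.
CMD ["node", "dist/server.js"]
```

Because node runs as PID 1 here, the application still has to register its own SIGTERM handler; exec form only guarantees that the signal is delivered to it.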
Why Your Dockerfile Needs a .dockerignore — And Most Don't Bother
You've seen it. The build that crawls for 90 seconds, copying node_modules or .git into the image for no reason. That's what happens when you skip a .dockerignore. The build context — everything in your project directory — gets shipped to the Docker daemon. Including your secrets, your 400MB vendor folder, and the cat picture you forgot about.
A .dockerignore works much like .gitignore, although the pattern-matching rules are not identical. It tells the build to prune dead weight before the COPY instruction even runs. Smaller context means faster builds. Fewer cache invalidations. And you're not baking your .env file into the final image by accident.
Production teams treat .dockerignore as mandatory. Not optional. Not "nice to have". You don't get to ship a lean image without it.
How the Build Cache Really Works — Stop Breaking It
Docker doesn't rebuild everything from scratch. It caches layers. For each instruction, Docker checks whether it already has a layer built from the same parent layer by the same instruction (for COPY and ADD, it also compares checksums of the copied files). If it does, it reuses that layer. If not, the cache is invalidated for that instruction and for every instruction after it.
The problem? Developers put frequently changing files early in the Dockerfile. COPY the entire source tree before running npm install. Now every code change invalidates the node_modules layer. You pay for a full npm install on every build.
The fix is brutal and effective: order your instructions from least to most volatile. Start with package managers and lockfiles. Install dependencies. Then copy the source code. Docker's cache is dumb — it follows the order you give it. Give it a good order.
This pattern saves minutes per build in CI. If your builds take longer than 60 seconds, layer ordering is the first thing to check.
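The ordering fix, sketched for a Node project with the anti-pattern left in as comments. The server.js entry point is a placeholder.

```dockerfile
FROM node:20-alpine
WORKDIR /app

# Anti-pattern (commented out): any source edit invalidates the install layer.
# COPY . .
# RUN npm ci

# Dependency manifests change rarely, so copy them first.
COPY package.json package-lock.json ./
# This layer is reused until one of those two files changes.
RUN npm ci

# Source changes often; only the layers from here down are rebuilt.
COPY . .
CMD ["node", "server.js"]
```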
Metadata in Dockerfiles — Labels, EXPOSE, and the Lies You Tell
A Dockerfile isn't just for building. It's for documentation. The instructions that don't affect the filesystem — LABEL, EXPOSE, and ARG — tell anyone who reads the image what it's supposed to do.
LABEL adds metadata as key-value pairs. Use it for maintainer contact, version, and git commit. EXPOSE doesn't actually publish a port. It annotates that the container listens on that port at runtime. It's a contract between the image author and the person running it. If you skip it, you're hiding what the app needs.
ARG defines build-time variables. Use it to pass version numbers or environment-specific configs without hardcoding. But be careful — ARG values persist in the image history. Don't put secrets in ARG unless you want them leaked.
Production workflows read these labels. Tooling can filter on them (for example, docker images --filter label=...), and scanners and dashboards can surface them. If your Dockerfile has zero labels, you're shipping a blank ID card.
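A sketch of the metadata instructions together. The label keys are the OCI-standard org.opencontainers.image names; the values, the GIT_COMMIT and VERSION arguments, and the example.com URL are placeholders.

```dockerfile
FROM node:20-alpine

# Build-time values, supplied for example with:
#   docker build --build-arg GIT_COMMIT=$(git rev-parse HEAD) --build-arg VERSION=1.4.2 .
ARG GIT_COMMIT=unknown
ARG VERSION=0.0.0

# Key-value metadata stored in the image config; nothing here is secret.
LABEL org.opencontainers.image.title="example-api" \
      org.opencontainers.image.version="${VERSION}" \
      org.opencontainers.image.revision="${GIT_COMMIT}" \
      org.opencontainers.image.source="https://example.com/org/example-api"

# Documentation only: publishing still needs -p 8080:8080 at run time.
EXPOSE 8080
```

The labels can be read back with docker inspect --format '{{json .Config.Labels}}' <image>.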
Secrets passed through ARG show up in docker history. Use Docker secrets or BuildKit's --secret flag instead.
Why Every Dockerfile Needs a Clear Explanation of Base Image Choices
Most Dockerfiles start with FROM ubuntu:latest or FROM node:18-alpine without explaining why. That's a trap. The base image you pick directly dictates image size, attack surface, and compatibility. A general-purpose base like ubuntu:22.04 is 77 MB and includes tools your app never calls: convenient for testing, wasteful in production. Alpine images drop to 5 MB but use musl libc, which can break binaries compiled against glibc. Distroless images strip everything but the runtime, which sharply reduces the number of CVEs but leaves no shell, so debugging needs a debug image variant or an ephemeral debug container. The rule: state your base image rationale in a comment above FROM. 'We use node:18-slim because Alpine's musl breaks our native bcrypt module.' That single line saves the next engineer hours of guessing. Never inherit a base image you can't explain in one sentence.
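What the rationale comment might look like in practice, using the bcrypt example from the paragraph above. The rest of the file is an illustrative Node skeleton, not a prescribed layout.

```dockerfile
# Base image rationale: node:18-slim (Debian, glibc) rather than node:18-alpine,
# because Alpine's musl libc breaks our native bcrypt module.
FROM node:18-slim
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci --omit=dev
COPY . .
CMD ["node", "server.js"]
```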
Additional Resources That Fix Real Dockerfile Pain Points
Official documentation won't teach you what hurts most: debugging broken cache hits, wrestling with BuildKit secrets, or slimming images without breaking the app. These resources close that gap. For cache debugging, read Docker's 'Optimizing Builds with Cache' docs — but skip the theory and jump to the 'Cache invalidation patterns' section. For multi-stage builds, check out 'Docker Multi-Stage Builds: The Practical Guide' on dev.to by a former Docker engineer — it covers live examples of cross-stage variable passing. For security, use Hadolint (hadolint.github.io) to lint your Dockerfile against 100+ rules, then read Aqua Security's 'Dockerfile Best Practices' for real CVE reduction metrics. These aren't blog fluff — they're the exact resources senior engineers open when their pipeline fails.
Kubernetes Pods Take 30 Seconds to Stop — Graceful Shutdown Silently Broken by Shell-Form CMD
- Shell-form CMD wraps your process in /bin/sh -c, making it a child of the shell. SIGTERM goes to PID 1 (the shell), not your app. Graceful shutdown is silently broken.
- Exec-form CMD (JSON array syntax) makes your app PID 1. SIGTERM reaches your app directly. Always use exec form for CMD and ENTRYPOINT.
- docker stop and Kubernetes SIGTERM behavior can differ. Test signal handling in Kubernetes, not just locally.
- Add a CI check that detects shell-form CMD/ENTRYPOINT (Hadolint's DL3025 rule flags it). This is a silent failure that only manifests under load during rolling updates.
docker build --progress=plain -t test:latest . 2>&1 | grep -E 'CACHED|RUN|COPY'
docker history <image> --format '{{.CreatedBy}} {{.Size}}'