Intermediate 6 min · March 17, 2026

Docker Images and Containers — Why Unpinned Tags Break

Q: What is the difference between CMD and ENTRYPOINT in a Dockerfile?

ENTRYPOINT sets the executable the container runs. Arguments passed to docker run are appended to it instead of replacing it; only the --entrypoint flag overrides it. CMD provides default arguments to ENTRYPOINT, or a default command if no ENTRYPOINT is set, and is replaced by any arguments given to docker run. Together: ENTRYPOINT is the binary, CMD is the default arguments. Use ENTRYPOINT for the main process of a container, CMD for configurable defaults.
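
The interaction can be modeled in a few lines. This is a simplified sketch that covers exec-form instructions only and ignores the --entrypoint flag; the function name and the example image are hypothetical.

```python
def effective_argv(entrypoint, cmd, run_args=()):
    """Return the argv a container starts with (exec-form instructions only).

    entrypoint: list from ENTRYPOINT, or [] if the image sets none
    cmd:        list from CMD, or [] if the image sets none
    run_args:   anything typed after the image name in `docker run`
    """
    # Arguments given to `docker run` replace CMD; otherwise CMD is the default.
    args = list(run_args) if run_args else list(cmd)
    # ENTRYPOINT, when present, is always the prefix.
    return list(entrypoint) + args


# Hypothetical image: ENTRYPOINT ["uvicorn"], CMD ["main:app", "--port", "8000"]
image_entrypoint = ["uvicorn"]
image_cmd = ["main:app", "--port", "8000"]

default_start = effective_argv(image_entrypoint, image_cmd)
overridden = effective_argv(image_entrypoint, image_cmd, ["main:app", "--port", "9000"])
# Image with CMD only: run arguments replace the whole command.
cmd_only = effective_argv([], ["python", "app.py"], ["sh"])
```

With an ENTRYPOINT set, the run arguments only swap out the defaults; without one, they replace the command entirely.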

Q: How do you keep Docker images small?

Use slim or alpine base images (python:3.12-slim vs python:3.12). Use multi-stage builds to separate the build environment from the runtime image. Combine RUN commands to reduce layers: RUN apt-get update && apt-get install -y curl && rm -rf /var/lib/apt/lists/*. Copy only what the container needs — use .dockerignore.

Q: What does exit code 137 mean for a Docker container?

Exit code 137 means the container was killed by signal 9 (SIGKILL), typically by the Linux OOM killer. The container exceeded its memory limit or the host ran out of memory. Debug with docker inspect to check OOMKilled status, then either increase the memory limit (--memory flag) or fix the memory leak in your application.
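
The number itself is arithmetic: 128 plus the signal number (9 for SIGKILL). The sketch below decodes an exit code and reads the OOMKilled flag from docker inspect output. The helper names and the sample JSON are hypothetical; State.ExitCode and State.OOMKilled are the fields docker inspect reports.

```python
import json
import signal


def describe_exit_code(code):
    """Translate a container exit code into a short hint."""
    if code == 0:
        return "exited cleanly"
    if 128 < code < 256:
        signum = code - 128  # shell convention: 128 + signal number
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = f"signal {signum}"
        return f"killed by {name}"
    return f"application error (exit status {code})"


def oom_killed(inspect_output):
    """inspect_output: the JSON text printed by `docker inspect <container>`."""
    containers = json.loads(inspect_output)
    return [bool(c["State"]["OOMKilled"]) for c in containers]


# Trimmed, hypothetical sample of what `docker inspect` prints.
sample_inspect = '[{"State": {"ExitCode": 137, "OOMKilled": true}}]'
```

If OOMKilled is false for a 137 exit, something else sent SIGKILL, for example docker kill or a stop timeout expiring.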

Q: Does EXPOSE in a Dockerfile actually publish the port?

No. EXPOSE is documentation: it signals which port the application listens on but does not publish it. To make the port accessible from the host, use -p HOST:CONTAINER when running the container. EXPOSE is still useful for readers and for tooling: docker run -P (capital P) publishes every exposed port on a random host port, and some tools auto-detect ports from it. On its own, though, it opens nothing.

Q: Can I modify a running Docker image?

No. Images are immutable. You can modify a running container's writable layer (add files, change config), but those changes are lost when the container is removed. To persist changes, either use volumes for data or create a new image with docker commit (not recommended for production — use a Dockerfile instead).

Unpinned FROM python:3.12-slim broke production when Debian changed libssl.

Naren Founder & Principal Engineer

20+ years shipping production infrastructure and CI/CD at scale. Notes here come from systems that actually shipped.

July 18, 2026

last updated

Before you start ⏱ 25 min

✓ Solid grasp of DevOps fundamentals
✓ Comfortable with command-line tools
✓ Basic Linux administration knowledge

⚡Quick Answer

Image: read-only layers + metadata (CMD, ENV, EXPOSE)
Container: image + writable layer + process
Registry: stores and distributes images (Docker Hub, ECR)
Dockerfile: recipe that produces an image

✦ Definition~90s read

What Are Docker Images and Containers?

Docker images are immutable, read-only templates — think of them as a filesystem snapshot plus metadata (exposed ports, entrypoint, environment). Containers are running instances of those images, with a writable layer on top. That distinction matters because docker run creates a container from an image, but docker commit can freeze a container’s state back into an image, which is how people accidentally bake ephemeral data into production artifacts.

The image is the blueprint; the container is the runtime process. They are not interchangeable, and treating them as such leads to the unpinned-tag chaos this article addresses.

In practice, you define images via a Dockerfile — a declarative script that layers filesystem changes (e.g., FROM node:20-alpine, COPY, RUN npm install). Each instruction creates a cached layer, which speeds rebuilds but also means that apt-get update without pinning versions produces non-reproducible images.

Containers add networking, volumes (persistent data outside the container’s writable layer), and resource limits (CPU/memory via --cpus and --memory). Without limits, one container can starve others on the same host — the classic "noisy neighbor" problem that multi-tenant deployments must solve.

Multi-stage builds (FROM golang:1.21 AS builder, then FROM scratch) are not a nice-to-have; they’re how you avoid shipping a 1.2GB image with a compiler, debug tools, and OS headers to production. A single-stage go build image might be 800MB; a multi-stage result can be under 10MB.

That’s the difference between a 3-second pull and a 30-second pull at scale. Combined with pinned base image digests (e.g., node:20-alpine@sha256:abc123, where the hash is a placeholder) instead of mutable tags like :latest, you remove one of the most common sources of "works on my machine" bugs: the base image changing between builds.

This article walks through why each of these decisions — image vs. container, pinned vs. unpinned, single-stage vs. multi-stage — directly determines whether your deployment pipeline is a liability or an asset.

Plain-English First

A Docker image is like a recipe card — it describes exactly what ingredients and steps are needed to produce a dish. A container is the actual dish made from that recipe. You can make many dishes (containers) from the same recipe (image), and each dish can be slightly customized (environment variables, mounted volumes), but the base recipe never changes.

Unpinned Docker tags silently shift underneath your builds, turning reproducible deployments into lottery tickets. When Debian updated libssl under the python:3.12-slim tag, production broke for teams who assumed tags were immutable. This article explains why image digests, layer caching, and multi-stage builds are non-negotiable for production-grade containers.

Why Docker Images and Containers Are Not the Same Thing

A Docker image is an immutable, layered filesystem snapshot — a blueprint. A container is a running process with its own writable layer, spawned from that image. The core mechanic: images are read-only templates; containers add a thin writable layer on top, which is discarded when the container is removed unless explicitly committed.

Images are stacked with a union filesystem (overlay2 is Docker's default storage driver on Linux). Dockerfile instructions that change the filesystem (RUN, COPY, ADD) each create a new layer; the others only record metadata. Layers are cached and shared across images, so pulling a new image often downloads only the delta. Containers share the host kernel but get isolated namespaces (PID, network, mount). This means you can run multiple containers from the same image without duplication, but any writes inside a container exist only in that container's ephemeral layer.

Use images to distribute and version your application artifact. Use containers to run that artifact in a consistent, isolated environment. The practical trap: if you rely on latest or any unpinned tag, your image can change silently between pulls, breaking reproducibility. Pin to a digest or a semantic version tag in production.

⚠ The Layer Cache Illusion

Changing a single line in a Dockerfile invalidates all subsequent layers — not just the one that changed. Order your instructions from least to most volatile to maximize cache reuse.

📊 Production Insight

A team used node:14 in Dockerfile — Node 14.17.0 silently became 14.21.3, breaking a regex that relied on a fixed bug. The symptom: random 500 errors in production with no code change. Rule: always pin the full version tag (e.g., node:14.21.3-slim) or use the SHA256 digest.

🎯 Key Takeaway

Images are immutable; containers are ephemeral — never treat a container's filesystem as persistent storage.

Always pin image tags to a specific version or digest — latest is a moving target that will break your build.

Layer ordering in Dockerfile directly impacts build speed and CI costs — put stable dependencies first.

thecodeforge.io

Docker Images Containers

Building an Image with Dockerfile

A Dockerfile is a recipe for building an image. Each instruction creates a new layer — a filesystem diff on top of the previous layer. Docker caches layers and only rebuilds from the first changed instruction downward. Understanding this is the single most important optimization for build speed.

The layer caching principle: if a layer has not changed and all preceding layers are cached, Docker reuses the cached layer instantly. Dependencies change rarely. Code changes frequently. Put rare changes first.

The .dockerignore file controls what gets sent to the Docker Daemon as build context. Without it, your entire project directory — including .git (often 100MB+), node_modules, __pycache__, and .env files with secrets — is sent on every build. This slows builds and risks secret exposure.

Dockerfile · DOCKERFILE

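# Best practices Dockerfile for a Python FastAPI app
# slim variant: much smaller than the full python image.
# Version and OS codename are pinned so the base cannot shift between builds.
# (Dockerfile comments must sit on their own line, not after an instruction.)
FROM python:3.12.3-slim-bookworm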

WORKDIR /app

# Copy requirements FIRST for layer caching
# If code changes but requirements don't, this layer is cached
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Now copy application code
COPY . .

# Run as non-root for security
RUN useradd -m appuser
USER appuser

# Document the port (does not actually publish it)
EXPOSE 8000

CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]

Output

# docker build -t myapp:1.0 .

Mental Model

Docker Layers as a Stack of Transparent Sheets

Why does the order of Dockerfile instructions matter for build speed?

Docker caches layers top-to-bottom. A changed instruction invalidates all subsequent layers.
Dependencies change rarely. Code changes frequently. Put rare changes first.
COPY requirements.txt before COPY . . ensures pip install is cached on code-only changes.
Each RUN creates a layer. Combining with && reduces layer count and image size.
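
To make the ordering rule concrete, here is a toy model of the cache, not Docker's actual implementation. Each step gets a key made of its instruction text plus a stand-in for the content of the files it copies; the build reuses layers until the first key that differs. All names and version strings are illustrative.

```python
def cache_keys(steps, file_versions):
    """Build one cache key per Dockerfile step.

    steps:         list of (instruction_text, files_the_step_copies)
    file_versions: mapping of file name to a version/hash stand-in
    """
    return [
        (instruction, tuple(file_versions[name] for name in files))
        for instruction, files in steps
    ]


def reusable_layers(previous, current):
    """Count leading steps whose key is unchanged; everything after is rebuilt."""
    reused = 0
    for old_key, new_key in zip(previous, current):
        if old_key != new_key:
            break
        reused += 1
    return reused


code_first = [
    ("FROM python:3.12.3-slim-bookworm", ()),
    ("COPY . .", ("requirements.txt", "main.py")),
    ("RUN pip install -r requirements.txt", ()),
]
deps_first = [
    ("FROM python:3.12.3-slim-bookworm", ()),
    ("COPY requirements.txt .", ("requirements.txt",)),
    ("RUN pip install -r requirements.txt", ()),
    ("COPY . .", ("requirements.txt", "main.py")),
]

before = {"requirements.txt": "v1", "main.py": "v1"}
after = {"requirements.txt": "v1", "main.py": "v2"}  # code-only change

code_first_reused = reusable_layers(
    cache_keys(code_first, before), cache_keys(code_first, after)
)
deps_first_reused = reusable_layers(
    cache_keys(deps_first, before), cache_keys(deps_first, after)
)
```

In this model the code-first ordering reuses only the FROM layer after a code change, so pip install runs again. The dependencies-first ordering reuses three layers and rebuilds only the final COPY.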

📊 Production Insight

The layer caching insight is the highest-impact optimization for CI/CD pipelines. In a typical Python project, dependencies change once per sprint but code changes every commit. Without the requirements.txt-first pattern, every commit triggers a full pip install — adding 30-120 seconds to every build. With it, code-only changes rebuild in 2-5 seconds. Across 100 builds per day, that is 1-3 hours of CI time saved daily.
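
The daily figure is simple multiplication. A back-of-the-envelope check using the numbers quoted above, and assuming the cached layer itself costs roughly nothing:

```python
def daily_ci_hours_saved(builds_per_day, install_seconds):
    """Hours of CI time avoided per day when the install layer is cached."""
    return builds_per_day * install_seconds / 3600


# 100 builds per day, pip install taking between 30 and 120 seconds.
low_estimate = daily_ci_hours_saved(100, 30)
high_estimate = daily_ci_hours_saved(100, 120)
```

That works out to a little under one hour at the low end and a little over three at the high end, consistent with the 1-3 hour range.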

🎯 Key Takeaway

Dockerfile layer ordering is a build speed optimization, not a style choice. Copy dependencies before code. Combine RUN commands. Use .dockerignore. These three changes turn 5-minute builds into 30-second builds. In CI/CD, that compounds into hours saved per week.

Dockerfile Optimization Decisions

If: Build takes >2 minutes and dependencies rarely change
→ Use: Move COPY requirements.txt and RUN pip install before COPY . . to cache the dependency layer

If: Image is >1GB and includes build tools (gcc, make)
→ Use: Multi-stage builds. Copy only runtime artifacts to the final image

If: Build context upload is slow (>10 seconds)
→ Use: Add .dockerignore to exclude .git, node_modules, __pycache__, .env

If: Image contains secrets (API keys in ENV or ARG)
→ Use: BuildKit secrets: --mount=type=secret. Never bake secrets into layers.

Running Containers

A container is a running instance of an image with an additional thin writable layer. Multiple containers can run from the same image — each with its own writable layer, environment variables, and port mappings. The image itself never changes.

The container lifecycle: create (allocate resources), start (run the CMD process), stop (send SIGTERM, wait, then SIGKILL), remove (delete the writable layer). The --rm flag automates removal on stop.

Port mapping confusion: -p HOST:CONTAINER maps the host port to the container port. Your application inside the container must bind to 0.0.0.0 (all interfaces), not 127.0.0.1 (localhost). Binding to localhost inside the container means the application only accepts connections from within the container — the host port mapping becomes useless.

The difference between stop and kill: docker stop sends SIGTERM (graceful shutdown, 10-second default timeout), then SIGKILL if the process does not exit. docker kill sends SIGKILL immediately. Always use stop in production — it gives your application time to flush logs, close database connections, and complete in-flight requests.

container-lifecycle.sh · BASH

# Run a container in the foreground (Ctrl+C to stop)
docker run -p 8000:8000 myapp:1.0

# Flags:
# -p HOST_PORT:CONTAINER_PORT   publish port
# -d                            detached (background)
# -e DATABASE_URL=postgres://.. environment variable
# -v /host/path:/container/path bind mount volume
# --name my-container           give it a name
# --rm                          remove when stopped

# db.internal is a placeholder hostname. "localhost" here would point
# at the container itself, not at a database on the host.
docker run -d \
  --name api-server \
  -p 8000:8000 \
  -e DATABASE_URL=postgres://db.internal/mydb \
  --rm \
  myapp:1.0

# Inspect while it is running
docker ps          # running containers
docker ps -a       # all containers (including stopped)
docker logs api-server     # view logs
docker logs -f api-server  # follow logs
docker exec -it api-server bash  # shell into running container

# Shut down
docker stop api-server     # graceful stop (SIGTERM, then SIGKILL after the timeout)
docker kill api-server     # force stop (SIGKILL)
docker rm api-server       # remove a stopped container (not needed with --rm)

Output

# Container is running and accessible on port 8000

Mental Model

Container as a Process with a Mask

Why does binding to 127.0.0.1 inside a container break port mapping?

127.0.0.1 inside the container is the container's own loopback, not the host's.
Port mapping (-p 8000:8000) forwards traffic from the host to the container's network interface.
If the app binds to 127.0.0.1, it only accepts connections from inside the container.
Binding to 0.0.0.0 accepts connections from all interfaces, including the mapped port.
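
A minimal sketch of what "bind to 0.0.0.0" means in application code, using only the standard library. The BIND_HOST and PORT environment variable names are assumptions for illustration, not a Docker convention.

```python
import os
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    """Answers every GET with a small plain-text body."""

    def do_GET(self):
        body = b"ok\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def make_server(host=None, port=None):
    # 0.0.0.0 listens on every interface, including the one that
    # `docker run -p` forwards to. 127.0.0.1 would only accept
    # connections that originate inside the container.
    host = host or os.environ.get("BIND_HOST", "0.0.0.0")
    port = int(os.environ.get("PORT", "8000")) if port is None else port
    return ThreadingHTTPServer((host, port), HealthHandler)
```

Calling make_server().serve_forever() as the container's main process would then serve on port 8000 by default, matching a -p 8000:8000 mapping.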

📊 Production Insight

The stop-vs-kill distinction matters for zero-downtime deployments. When Kubernetes or a load balancer removes a pod, it sends SIGTERM first. If your application does not handle SIGTERM gracefully (flush buffers, close connections, drain requests), you get connection resets and data loss. Always implement a SIGTERM handler. The default 10-second stop timeout may be too short for applications with long-running requests — increase it with --stop-timeout or terminationGracePeriodSeconds in Kubernetes.
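
A SIGTERM handler does not have to be elaborate. The sketch below shows one common shape for a worker-style process: the handler only sets a flag, and the main loop finishes its current item before returning. The serve function and the handle_next_request callable are hypothetical, and web frameworks such as uvicorn typically install their own handlers, so this applies mainly to custom loops.

```python
import signal
import threading

shutdown_requested = threading.Event()


def handle_sigterm(signum, frame):
    """Record the request; the main loop decides when it is safe to exit."""
    shutdown_requested.set()


def install_sigterm_handler():
    # Must be called from the main thread, once, at startup.
    signal.signal(signal.SIGTERM, handle_sigterm)


def serve(handle_next_request, idle_seconds=0.1):
    """Process work until SIGTERM arrives, then return so cleanup can run.

    handle_next_request is a hypothetical callable that returns True when it
    processed a request and False when there was nothing to do.
    """
    handled = 0
    while not shutdown_requested.is_set():
        if handle_next_request():
            handled += 1
        else:
            # Sleep, but wake immediately if shutdown is requested.
            shutdown_requested.wait(idle_seconds)
    return handled
```

After serve returns, the process can flush logs and close connections, then exit before the stop timeout expires and SIGKILL arrives.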

🎯 Key Takeaway

Containers are processes with isolation masks, not mini-VMs. Bind to 0.0.0.0, not 127.0.0.1. Use docker stop (graceful) in production, not docker kill (forced). Implement SIGTERM handlers for clean shutdown. The --rm flag prevents container accumulation during development.

Container Lifecycle Decisions

If: Development, quick iteration, container should auto-cleanup
→ Use: docker run --rm. Removes the container on stop, no manual cleanup

If: Production, container should restart on crash
→ Use: restart: unless-stopped or restart: always. Implement health checks.

If: Need to inspect a running container
→ Use: docker exec -it <container> sh to shell into the container without SSH

If: Container is stuck and not responding to stop
→ Use: docker kill <container>, which sends SIGKILL immediately. Check for zombie processes.

If: Need to copy files into/out of a running container
→ Use: docker cp <host-path> <container>:<path>. No need to rebuild the image

Volumes and Networking

Containers are ephemeral: when a container is removed, its writable layer is discarded. For state that must outlive any single container (databases, file uploads, logs), you need volumes.

Named volumes: Managed by Docker. The storage location is controlled by the Docker daemon (typically /var/lib/docker/volumes/). They survive docker compose down and are destroyed only when you ask for it: docker compose down -v, docker volume rm, or docker volume prune. Best for databases.

Bind mounts: Mount a host directory into the container. Changes on either side are reflected immediately. Great for development (hot reload). Not recommended for production — ties the container to a specific host path and breaks portability.

Networking: Containers on the same user-defined Docker network can reach each other by container name. Docker's embedded DNS server (127.0.0.11) resolves container names to internal IPs; the default bridge network does not provide this name resolution. The host-mapped port (left side of -p) is for external access. Container-to-container communication uses the container port directly, never the host port.

Volume lifecycle gotcha: Named volumes persist data even after the container is removed. This is a feature for databases but a trap for test environments — stale data from previous test runs can cause non-deterministic test failures. Use docker compose down -v in CI to ensure clean state.

volumes-and-networking.sh · BASH

# Named volumes: data persists after container is removed
docker volume create postgres-data
docker run -d \
  -v postgres-data:/var/lib/postgresql/data \
  -e POSTGRES_PASSWORD=secret \
  postgres:16

# Container networking: containers on same network can reach each other by name
docker network create myapp-network

# The postgres image refuses to start without a password setting
docker run -d --name db --network myapp-network -e POSTGRES_PASSWORD=secret postgres:16
docker run -d \
  --name api \
  --network myapp-network \
  -e DATABASE_URL=postgres://db/myapp \
  myapp:1.0
# 'api' container reaches 'db' by hostname 'db'

# Image management
docker image ls          # list images
docker image rm myapp:1.0
docker system prune -a   # remove all unused images, containers, networks

Output

# Volume data persists; containers communicate by name

Mental Model

Volumes as External Hard Drives

Why should you use named volumes over bind mounts in production?

Named volumes are portable — Docker manages the storage location, not a host path.
Bind mounts couple the container to a specific host — breaks multi-machine deployments.
Named volumes survive docker compose down. Bind mounts depend on the host directory existing.
Named volumes bypass the container's union filesystem (overlay2) and live in Docker-managed storage. On Docker Desktop, bind mounts also pay a file-sharing cost between the host OS and the Linux VM.

📊 Production Insight

The bind-mount-in-production anti-pattern is common in teams that graduate from development to production without changing their Compose files. A bind mount like -v /data/postgres:/var/lib/postgresql/data ties the database to a path on a specific server. When the server is replaced during an infrastructure migration, the data is left behind. Named volumes are addressed by name rather than by host path, so the same backup, migrate, and restore commands work on any host. With the default local driver the data still lives on one machine, so it needs backups all the same.

🎯 Key Takeaway

Named volumes for production persistence. Bind mounts for development convenience. tmpfs for sensitive temporary data. Containers on the same network communicate by name — never use localhost or host-mapped ports for container-to-container traffic. Always use docker compose down -v in CI for clean test state.

Volume Type Selection

If: Database or persistent state in production
→ Use: Named volumes: docker volume create. Back up with docker run --rm -v vol:/data -v $(pwd):/backup alpine tar czf /backup/backup.tar.gz /data

If: Development, live code reloading
→ Use: Bind mounts: -v $(pwd):/app. Fast iteration, no rebuild needed.

If: Sensitive temporary data (session tokens, encryption keys)
→ Use: tmpfs mounts: --tmpfs /tmp/secrets:size=10m. Data never touches disk.

If: CI test runs that need clean state every time
→ Use: docker compose down -v to destroy named volumes between runs.

Why Multi-Stage Builds Are Mandatory, Not Optional

Production images shouldn't ship compilers, debug tools, or your secret SSH keys. Yet I see Dockerfiles that copy node_modules into production. That's lazy and insecure. Multi-stage builds solve this by letting you compile in one stage and copy only the binary into the final image. The result is an image that's 5x smaller and has zero build-time cruft. Attack surface shrinks. Startup time drops. Disk costs go down. Do it on day one.

Here's the pattern: Stage one installs your full SDK and dependencies, compiles the app. Stage two starts from a lean base like alpine:3.19, copies just the compiled artifact. No source code, no dev dependencies, no package manager. If you're not doing this, your CI/CD is delivering a bloated liability. Start now.

A builder pattern for a Go service is the cleanest example. One FROM golang:1.22 AS builder, one FROM alpine:3.19. Two FROM lines in the Dockerfile, and the compiler, source tree, and build cache never reach production.

MultiStageGoService.Dockerfile · DOCKERFILE

# io.thecodeforge - devops tutorial

# Stage 1: Build the binary
FROM golang:1.22 AS builder
WORKDIR /src
COPY go.mod go.sum ./
RUN go mod download
COPY . .
# -s -w strips the symbol table and DWARF debug info from the binary
RUN CGO_ENABLED=0 GOOS=linux go build -ldflags='-s -w' -o /app/payment-api ./cmd/server

# Stage 2: Minimal runtime
FROM alpine:3.19
# Only ship certs and the binary
RUN apk add --no-cache ca-certificates tzdata
COPY --from=builder /app/payment-api /payment-api
EXPOSE 8080
USER 1001
ENTRYPOINT ["/payment-api"]

Output

Final image size: 18.4 MB (vs. 1.2 GB for single-stage)

Security alerts: 0 critical, 2 low (ca-certificates library)

⚠ Production Trap:

Forgetting to strip debug symbols? Your production binary ships its symbol table and DWARF debug info, which bloats the image and makes reverse engineering easier. Set CGO_ENABLED=0 for a static binary and strip with -ldflags='-s -w' in Go builds.

🎯 Key Takeaway

Every production image must use at least two stages: one builder, one runtime. If your Dockerfile has only one FROM, you're shipping your dev environment to prod.

Container Resource Limits Save You From Noisy Neighbors

A single runaway container can tank your entire host. I've seen a memory leak in a metrics container OOM-kill the production database next to it. Docker's default is unlimited — don't trust it. Always pin CPU and memory. This isn't Kubernetes-specific; you set limits in docker run flags or Compose files. If you skip this, you're gambling that every app behaves perfectly every time. They won't.

Why before how: Predictable performance. If your payment service needs 512MB, cap it at 512MB. If it spikes, it gets throttled or killed — but it doesn't take down the logging pipeline. Also stops crypto-mining compromises from melting your fleet. In Compose, use deploy.resources.limits or per-service mem_limit / cpus. Test with docker stats after.

Here's a Compose file that runs a Node.js app next to a Redis cache. Both are capped. Notice Redis gets less CPU — it's a cache, not a compute engine.

ResourceLimitedStack.yml · YAML

# io.thecodeforge - devops tutorial

version: "3.9"
services:
  webapp:
    image: node:20-alpine
    mem_limit: 512m          # Hard memory cap in megabytes
    cpus: "1.5"              # Max 1.5 CPU cores
    deploy:
      resources:
        limits:
          cpus: "1.5"
          memory: 512M
    command: ["node", "server.js"]

  redis-cache:
    image: redis:7-alpine
    mem_limit: 128m
    cpus: "0.5"
    deploy:
      resources:
        limits:
          cpus: "0.5"
          memory: 128M

Output

$ docker compose up -d

$ docker stats --no-stream

CONTAINER CPU % MEM USAGE / LIMIT

webapp 0.12% 210.6MiB / 512MiB

redis-cache 0.04% 42.1MiB / 128MiB

🔥Senior Shortcut:

Set --memory-reservation to 75% of your limit. This gives the kernel a soft target to reclaim memory before the hard kill. Prevents abrupt OOMs during traffic spikes.
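
The 75% figure is easy to compute from the limit you already set. A small sketch, assuming limits written in Docker's b/k/m/g suffix style; the helper names are hypothetical.

```python
UNITS = {"b": 1, "k": 1024, "m": 1024**2, "g": 1024**3}


def parse_size(text):
    """Convert a size such as '512m' or '1g' to bytes."""
    text = text.strip().lower()
    if text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)  # a bare number is taken as bytes


def reservation_for(limit, fraction=0.75):
    """Return a soft reservation, in whole mebibytes, as a Docker-style size."""
    reserve_bytes = int(parse_size(limit) * fraction)
    return f"{reserve_bytes // 1024**2}m"


webapp_reservation = reservation_for("512m")
redis_reservation = reservation_for("128m")
```

For the two services above this gives 384m and 96m, which could be passed as --memory-reservation on docker run.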

🎯 Key Takeaway

Every container gets a hard memory and CPU cap. No exceptions. Test limits with docker stats in staging before deploying to prod.

A Simple Docker Workflow: Ship It Without the Ceremony

Stop treating Docker like a secret ceremony. A real workflow starts with a clean .dockerignore and ends with docker compose up -d in production. You don't need Kubernetes for a microservice that runs two containers. You need repeatability, not complexity.

Start with a Dockerfile that pins base image tags — FROM node:20-alpine3.19, not FROM node. Then run docker build -t myapp:v1 . and docker run -d -p 3000:3000 myapp:v1. For multi-container apps, use docker-compose.yml with services, ports, and volumes. That's it. Test locally, push to a registry like Docker Hub or ECR, and pull on your server.

This workflow catches env drift early, slashes onboarding time for new devs, and makes rollbacks a docker pull myapp:v0 away. Stop over-engineering your pipeline before you ship anything.

docker-compose.yml · YAML

# io.thecodeforge - devops tutorial
# Minimal production-ready docker-compose
version: '3.8'

services:
  api:
    build:
      context: .
      dockerfile: Dockerfile.prod
    image: myorg/api:v1
    ports:
      - "3000:3000"
    environment:
      - NODE_ENV=production
      - DB_HOST=db
    depends_on:
      - db
    restart: unless-stopped
    deploy:
      resources:
        limits:
          memory: 256m

  db:
    image: postgres:16-alpine
    volumes:
      - pgdata:/var/lib/postgresql/data
    environment:
      - POSTGRES_PASSWORD_FILE=/run/secrets/db_pw
    secrets:
      - db_pw
    restart: unless-stopped

volumes:
  pgdata:

secrets:
  db_pw:
    file: ./secrets/db_password.txt

Output

# Both services start; the api service is reachable at http://localhost:3000

⚠ Production Trap:

Never use latest tag in production. Pin to semantic versions or commit SHAs. latest is a time bomb for broken deploys and zero traceability.

🎯 Key Takeaway

Docker workflow is build → tag → push → pull → run. No ceremony, no Kubernetes required.

Advanced Tools to Know: Stop Fighting Docker Solo

Vanilla Docker is fine for toy projects. In production, you need tools that kill the edge cases you haven't hit yet. Start with Dive for image layer inspection: it shows you exactly where your image bloat lives. One dive myimage:latest and you'll see that apt-get clean you forgot. Next, Trivy for vulnerability scanning. Scan your images before they hit production, not after. The old docker scan command has been retired in favor of Docker Scout. Trivy catches CVEs in base layers and dependencies.

Then there's Hadolint, a Dockerfile linter that enforces best practices like pinning versions and preferring COPY over ADD. It'll save you from yourself during code review. And Docker Scout for continuous analysis if you need a GUI for compliance. But the real power move is combining BuildKit caching with --cache-from in CI pipelines. Your Docker builds go from 5 minutes to 30 seconds.

These tools aren't optional. They're the difference between "works on my machine" and "works everywhere, forever, and I can prove it."

Dockerfile.lint · DOCKERFILE

# io.thecodeforge - devops tutorial
# Hadolint-compliant Dockerfile with security scanning
FROM node:20-alpine3.19 AS builder

WORKDIR /app
COPY package*.json ./
# --omit=dev replaces the deprecated --only=production
RUN npm ci --omit=dev

FROM node:20-alpine3.19 AS production

RUN addgroup -S appgroup && adduser -S appuser -G appgroup
# Without WORKDIR the COPY steps below would land in /
WORKDIR /app
USER appuser

COPY --from=builder --chown=appuser:appgroup /app/node_modules ./node_modules
COPY --chown=appuser:appgroup . .

EXPOSE 3000
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
  CMD wget --no-verbose --tries=1 --spider http://localhost:3000/health || exit 1

CMD ["node", "server.js"]

Output

$ hadolint Dockerfile.lint

No errors found

$ trivy image myapp:v1

2024-01-15T10:30:00Z INFO Vulnerability scanning is enabled

2024-01-15T10:30:02Z INFO Detected OS: alpine 3.19.0

2024-01-15T10:30:03Z INFO Number of vulnerabilities: 0

✅ Image is clean. No high or critical CVEs.

💡Senior Shortcut:

Add docker run --rm -v /var/run/docker.sock:/var/run/docker.sock aquasec/trivy image --exit-code 1 --severity HIGH,CRITICAL your-image to your CI. The --exit-code 1 flag is what fails the build when a HIGH or CRITICAL CVE appears. One line, zero excuses.

🎯 Key Takeaway

Dive, Trivy, Hadolint, and BuildKit aren't nice-to-haves — they're the professional Docker stack. Use them or get burned.

Verify Installation

Before trusting Docker to run production workloads, you must verify the installation works end-to-end. This isn't about typing docker --version. You need to confirm the daemon is running, networking is operational, and containers can actually execute. Start by running docker info to check daemon status and see critical details like storage driver, OS type, and number of containers. Next, pull a small test image like hello-world: docker run hello-world. This validates the download pipeline, image extraction, and container lifecycle. Finally, run an interactive container with busybox to confirm stdin/stdout are working: docker run -it busybox sh. If any step fails, check that your user is in the docker group, the Docker service is enabled with systemctl enable docker, and your system supports virtualization if using Docker Desktop. Skipping verification leads to silent failures during deployment.

docker-verify.yml · YAML

# io.thecodeforge - devops tutorial

# Verify Docker installation
- name: Check running containers
  shell: docker ps
  register: result

- name: Pull and run hello-world
  shell: docker run hello-world
  when: result.rc == 0

- name: Test interactive container
  shell: echo "exit" | docker run -i busybox sh
  when: result.rc == 0

⚠ Production Trap:

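Running docker ps without errors doesn't mean your containers can reach the network. Test outbound connectivity with docker run --rm alpine ping -c 1 8.8.8.8, then test DNS separately with docker run --rm alpine nslookup example.com, because pinging an IP address never touches the resolver.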

🎯 Key Takeaway

Verify all three layers: daemon, image pull, and interactive shell before any production deployment.
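
The same checks can be scripted outside Ansible. The following is one possible shape, not a canonical tool: it runs the commands in order and stops at the first failure. The function name and check labels are made up for illustration, and the command runner is injectable so the logic can be exercised on a machine without Docker.

```python
import subprocess

CHECKS = [
    ("daemon reachable", ["docker", "info"]),
    ("image pull and container run", ["docker", "run", "--rm", "hello-world"]),
    ("shell inside a container", ["docker", "run", "--rm", "busybox", "sh", "-c", "exit 0"]),
]


def verify_docker(run=subprocess.run, timeout=120):
    """Run each check in order and stop at the first failure.

    Returns a list of (label, passed) tuples for the checks that were attempted.
    """
    results = []
    for label, command in CHECKS:
        try:
            passed = run(command, capture_output=True, timeout=timeout).returncode == 0
        except (OSError, subprocess.TimeoutExpired):
            passed = False  # docker missing from PATH, or the command hung
        results.append((label, passed))
        if not passed:
            break
    return results
```

Calling verify_docker() with no arguments uses the real docker CLI; a list that ends in a False entry tells you which layer to investigate first.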

Key Features

Docker containers aren't just lightweight VMs; they bring features that change how you build and ship software. First, image layering: filesystem-changing Dockerfile instructions each create a read-only layer, enabling caching, reuse, and minimal bandwidth on pulls. Second, container isolation via Linux namespaces and cgroups gives you process-level separation without a hypervisor. Third, portability: the image bundles the application plus its userland dependencies, so the same image runs on your laptop, a bare-metal server, or a cloud cluster, provided the CPU architecture matches and the host kernel supports what the app needs. Fourth, ephemeral by design: containers are meant to be created and destroyed frequently, which pushes you toward stateless architectures that scale horizontally. Fifth, built-in orchestration primitives like health checks, restart policies, and resource limits let you run containers reliably without external tools. Finally, Docker's registry ecosystem (Docker Hub, private registries) makes distribution as simple as docker push. With pinned base images, these features take most of the sting out of the classic "works on my machine" problem and make CI/CD pipelines far more repeatable.

key-features.yml · YAML

# io.thecodeforge - devops tutorial

# Docker key features demo
services:
  web:
    image: nginx:alpine
    ports:
      - "8080:80"
    deploy:
      restart_policy:
        condition: on-failure
      # resources belongs under deploy, not at the service level
      resources:
        limits:
          cpus: '0.5'
          memory: 256M
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost"]
      interval: 30s

⚠ Production Trap:

Don't treat containers like VMs—avoid SSH daemons inside containers. Use docker exec for debugging. Container immutability forces better deployment practices.

🎯 Key Takeaway

The six features—layering, isolation, portability, ephemerality, orchestration primitives, and registry—make Docker the foundation of modern DevOps.

● Production incident · POST-MORTEM · severity: high

Production Outage from Unpinned Base Image — python:3.12-slim Changed OS Underneath

Symptom

Container exits with ImportError: libssl.so.1.1: cannot open shared object file. No application code changed in 3 weeks. Previous deployment worked. CI build logs show a different base image digest than the last successful build.

Assumption

Team assumed a corrupted Docker layer cache on the CI runner. They retried the build 4 times with --no-cache. Each build produced the same error. Second assumption: a dependency in requirements.txt had a breaking update. They pinned every Python package — the error persisted.

Root cause

The Dockerfile used FROM python:3.12-slim without pinning the OS codename. Between builds, the official Python image updated the slim variant from Debian 11 (bullseye) to Debian 12 (bookworm). Debian 12 ships libssl3, not libssl1.1. The psycopg2-binary package compiled against libssl1.1 could not load. The CI runner had no warm cache, so it pulled the latest python:3.12-slim on every build.

Fix

1. Pinned to FROM python:3.12.3-slim-bookworm: exact version, exact OS codename.
2. Added --platform=linux/amd64 to the FROM line (FROM --platform=linux/amd64 python:3.12.3-slim-bookworm) to prevent ARM/AMD64 mismatches.
3. Added a CI step that extracts and logs the base image digest, failing if it changes unexpectedly.
4. Added hadolint to the CI pipeline to enforce version pinning rules.
5. Documented: every FROM must pin to an exact version and OS codename.
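The digest check in step 3 could look something like the sketch below. This is not the team's actual CI step: check_base_digest is a hypothetical helper name, and the base-image.digest file in the usage comment is an assumed place to record the expected value.

```shell
# Sketch of a CI guard: log the base image digest and fail when it differs
# from the recorded one. check_base_digest is a hypothetical helper, not a
# Docker command.
check_base_digest() {
  expected_digest="$1"
  actual_digest="$2"
  if [ -z "$expected_digest" ] || [ -z "$actual_digest" ]; then
    echo "error: expected and actual digests are both required" >&2
    return 2
  fi
  # Always log the digest so the build log records what was actually used.
  echo "base image digest: $actual_digest"
  if [ "$actual_digest" != "$expected_digest" ]; then
    echo "error: base image digest changed (expected $expected_digest)" >&2
    return 1
  fi
  return 0
}

# Possible CI usage (requires Docker; base-image.digest is an assumed file):
#   docker pull python:3.12.3-slim-bookworm
#   actual=$(docker image inspect --format '{{index .RepoDigests 0}}' python:3.12.3-slim-bookworm)
#   check_base_digest "$(cat base-image.digest)" "$actual"
```

Keeping the comparison in a small function means the guard can be exercised without Docker, and a deliberate base image upgrade becomes a reviewed change to the recorded digest.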

Key lesson

Unpinned tags are time bombs. python:3.12-slim is not a fixed target — it moves.
Pin both the version AND the OS codename: python:3.12.3-slim-bookworm.
CI runners without warm caches pull the latest image on every build, exposing you to silent upstream changes.
An 11-character change in a FROM line (python:3.12-slim to python:3.12.3-slim-bookworm) prevents hours of incident response.
Image digest pinning (FROM python:3.12.3-slim-bookworm@sha256:abc123...) is the strongest guarantee.
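To make the pinning levels concrete, here is a minimal sketch that classifies the image reference on each FROM line. The function names pin_level and report_from_pins are hypothetical, and the three levels are a simplification: a "tag" result only means a tag is present, and as the lesson above shows, a tag can still move. Only a digest is immutable.

```shell
# pin_level prints how strongly an image reference is pinned:
#   digest   -> contains @sha256:
#   tag      -> has an explicit tag other than "latest" (may still move)
#   unpinned -> no tag, or "latest"
pin_level() {
  ref="$1"
  case "$ref" in
    *@sha256:*) echo "digest"; return 0 ;;
  esac
  # Drop the registry/path prefix so a registry port (host:5000/img)
  # is not mistaken for a tag.
  name="${ref##*/}"
  case "$name" in
    *:latest) echo "unpinned" ;;
    *:*)      echo "tag" ;;
    *)        echo "unpinned" ;;
  esac
}

# report_from_pins prints "<level> <image>" for every FROM line in a Dockerfile.
# Limitation of this sketch: stage references (FROM builder) and FROM scratch
# are reported as unpinned.
report_from_pins() {
  awk 'toupper($1) == "FROM" { img = ($2 ~ /^--/) ? $3 : $2; print img }' "$1" |
  while read -r img; do
    printf '%s %s\n' "$(pin_level "$img")" "$img"
  done
}
```

A CI job could fail the build when report_from_pins prints any "unpinned" line; a dedicated linter such as hadolint covers the same ground more thoroughly.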

Production debug guide

From failed container to root cause: systematic debugging paths. 6 entries.

Symptom · 01

Container exits immediately after start.

→

Fix

Check logs: docker logs <container>. If they are empty, the command failed before writing any output. Start a shell in the image instead: docker run -it <image> sh (add --entrypoint sh if the image defines an ENTRYPOINT), then run the CMD by hand to see the error.

Symptom · 02

Container runs but application is unreachable.

→

Fix

Verify port mapping: docker port <container>. Check that the application binds to 0.0.0.0, not 127.0.0.1 (localhost inside the container is the container itself, not the host).

Symptom · 03

Image build is extremely slow (>5 minutes for a simple app).

→

Fix

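Check layer caching: docker history <image> shows each layer's size and the instruction that created it. If every build reinstalls dependencies, make sure COPY requirements.txt and the install step come before the full-source COPY . . so that code changes do not invalidate the dependency layer. Add a .dockerignore to exclude .git, node_modules, and __pycache__ from the build context.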
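The ordering rule can be checked mechanically. The sketch below is a rough heuristic rather than a real Dockerfile parser: check_copy_order is a hypothetical helper, it only recognizes plain COPY lines without flags such as --chown, and the manifest name is passed in by the caller.

```shell
# check_copy_order FILE MANIFEST
# Succeeds when MANIFEST (e.g. requirements.txt) is copied on an earlier line
# than the full-source "COPY . <dest>" instruction.
check_copy_order() {
  dockerfile="$1"
  manifest="$2"
  # Line number of the first COPY that mentions the manifest.
  manifest_line=$(awk -v m="$manifest" '$1 == "COPY" && index($0, m) { print NR; exit }' "$dockerfile")
  # Line number of the first COPY whose source is the whole build context.
  source_line=$(awk '$1 == "COPY" && $2 == "." { print NR; exit }' "$dockerfile")
  if [ -z "$manifest_line" ]; then
    echo "warn: $manifest is never copied on its own, so dependencies reinstall on every code change" >&2
    return 1
  fi
  if [ -n "$source_line" ] && [ "$manifest_line" -gt "$source_line" ]; then
    echo "warn: COPY . appears before $manifest is copied" >&2
    return 1
  fi
  echo "ok: $manifest is copied before the full source"
  return 0
}

# Possible usage:
#   check_copy_order Dockerfile requirements.txt
```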

Symptom · 04

Container works on one machine but fails on another.

→

Fix

Compare image digests on both machines: docker image inspect --format '{{.RepoDigests}}' <image>. Check for architecture mismatches (ARM vs AMD64) with docker image inspect --format '{{.Os}}/{{.Architecture}}' <image>. Check for missing environment variables or volume mounts.
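The architecture check is easy to get wrong because the kernel and Docker name the same CPU differently (x86_64 vs amd64, aarch64 vs arm64). A small sketch of the comparison, using hypothetical helper names docker_arch_for and same_arch:

```shell
# docker_arch_for maps `uname -m` output to the architecture name Docker reports.
# Unknown values are passed through unchanged.
docker_arch_for() {
  case "$1" in
    x86_64|amd64)  echo "amd64" ;;
    aarch64|arm64) echo "arm64" ;;
    armv7l)        echo "arm" ;;
    *)             echo "$1" ;;
  esac
}

# same_arch HOST_MACHINE IMAGE_ARCH
# Succeeds when the host machine type matches the image architecture.
same_arch() {
  [ "$(docker_arch_for "$1")" = "$2" ]
}

# Possible usage (requires Docker):
#   image_arch=$(docker image inspect --format '{{.Architecture}}' <image>)
#   same_arch "$(uname -m)" "$image_arch" || echo "architecture mismatch" >&2
```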

Symptom · 05

Disk space exhausted on Docker host.

→

Fix

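Check Docker disk usage: docker system df. Clean up: docker system prune -a removes stopped containers, unused networks, build cache, and all images not used by a container (not just dangling ones), so expect the next build to pull and rebuild from scratch. For selective cleanup: docker image prune, docker container prune.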

Symptom · 06

Container uses too much memory, gets OOM-killed.

→

Fix

Watch usage with docker stats <container>. Set a memory limit: docker run --memory=512m. If the application leaks memory, the limit keeps the damage contained to that container instead of the whole host. Confirm an OOM kill with exit code 137 and OOMKilled=true in docker inspect.

★ Docker Container Triage Cheat Sheet

First-response commands when a container issue is reported.

Container crashed or restarting in a loop.

Immediate action

Check logs and exit code.

Commands

docker logs --tail 50 <container>

docker inspect <container> --format='{{.State.ExitCode}} {{.State.OOMKilled}}'

Fix now

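Exit code 0 = the main process finished (often the wrong CMD for a long-running service). Exit code 1 = application error (check logs). Exit code 137 = SIGKILL (128 + 9), usually an OOM kill; confirm with OOMKilled and raise --memory or fix the leak. Exit code 139 = SIGSEGV (128 + 11), often a native library or base image mismatch.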
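The exit codes above follow one rule: values above 128 mean the process died from a signal, and the signal number is the code minus 128. A minimal decoder sketch (explain_exit_code is a hypothetical helper name, and the wording of the messages is illustrative):

```shell
# explain_exit_code CODE
# Prints a one-line first-response hint for a container exit code.
explain_exit_code() {
  code="$1"
  if [ "$code" -eq 0 ]; then
    echo "exit 0: main process finished"
  elif [ "$code" -gt 128 ]; then
    # Codes above 128 encode the fatal signal as 128 + signal number.
    sig=$((code - 128))
    case "$sig" in
      9)  echo "exit $code: signal 9 (SIGKILL), check OOMKilled" ;;
      11) echo "exit $code: signal 11 (SIGSEGV)" ;;
      15) echo "exit $code: signal 15 (SIGTERM)" ;;
      *)  echo "exit $code: signal $sig" ;;
    esac
  else
    echo "exit $code: application error, check logs"
  fi
}

# Possible usage (requires Docker):
#   explain_exit_code "$(docker inspect --format '{{.State.ExitCode}}' <container>)"
```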

Image vs Container vs Volume

Characteristic	Image	Container	Volume
What it is	Immutable template (read-only layers)	Running instance of an image	Persistent storage outside container lifecycle
Mutability	Immutable — never changes after build	Writable layer on top of image	Fully writable, persists independently
Created by	docker build (from Dockerfile)	docker run (from image)	docker volume create or Compose
Survives removal	Yes (until explicitly deleted)	No (writable layer discarded)	Yes (until explicitly deleted)
Shared between	Multiple containers	Single container instance	Multiple containers
Storage location	Docker Daemon storage (/var/lib/docker)	Docker Daemon storage (writable layer)	Docker-managed or host path (bind mount)
Size impact	Determined by layers and base image	Image size + writable layer delta	Independent of image/container size
Use case	Package application + dependencies	Run the application	Persist database, uploads, logs

⚙ Quick Reference

6 commands from this guide

File	Command / Code	Purpose
MultiStageGoService.yml	FROM golang:1.22 AS builder	Why Multi-Stage Builds Are Mandatory, Not Optional
ResourceLimitedStack.yml	version: "3.9"	Container Resource Limits Save You From Noisy Neighbors
docker-compose.yml	version: '3.8'	A Simple Docker Workflow
Dockerfile.lint	FROM node:20-alpine3.19 AS builder	Advanced Tools to Know
docker-verify.yml	- name: Check running containers	Step 4. Verify Installation
key-features.yml	services:	Key Features

Key takeaways

Images are immutable layers; containers add a thin writable layer on top, so changes do not affect the image.

Layer caching: copy requirements.txt and install before copying source code for faster builds.

EXPOSE documents a port but does not publish it; use -p HOST:CONTAINER to publish.

Named volumes persist data after container removal; bind mounts share host directories.

Containers on the same user-defined Docker network reach each other by container name.

Pin exact versions in FROM: unversioned tags are silent time bombs in production.

Bind to 0.0.0.0 inside the container, not 127.0.0.1: localhost inside a container is the container itself.

FAQ · 5 QUESTIONS

Frequently Asked Questions

What is the difference between CMD and ENTRYPOINT in a Dockerfile?

How do you keep Docker images small?

What does exit code 137 mean for a Docker container?

Does EXPOSE in a Dockerfile actually publish the port?

Can I modify a running Docker image?

Naren Founder & Principal Engineer

20+ years shipping production infrastructure and CI/CD at scale. Notes here come from systems that actually shipped.

Last updated: July 18, 2026
