Docker Images and Containers — Why Unpinned Tags Break
Unpinned FROM python:3.12-slim broke production when Debian changed libssl.
20+ years shipping production infrastructure and CI/CD at scale. Notes here come from systems that actually shipped.
- Image: read-only layers + metadata (CMD, ENV, EXPOSE)
- Container: image + writable layer + process
- Registry: stores and distributes images (Docker Hub, ECR)
- Dockerfile: recipe that produces an image
A Docker image is like a recipe card — it describes exactly what ingredients and steps are needed to produce a dish. A container is the actual dish made from that recipe. You can make many dishes (containers) from the same recipe (image), and each dish can be slightly customized (environment variables, mounted volumes), but the base recipe never changes.
Why Docker Images and Containers Are Not the Same Thing
A Docker image is an immutable, layered filesystem snapshot — a blueprint. A container is a running process with its own writable layer, spawned from that image. The core mechanic: images are read-only templates; containers add a thin writable layer on top, which is discarded when the container is removed unless explicitly committed.
Images stack layers through a union filesystem, implemented by a storage driver such as overlay2. Dockerfile instructions that change the filesystem (RUN, COPY, ADD) each create a new layer; instructions such as ENV or CMD only add metadata. Layers are cached and shared across images, so pulling a new image often downloads only the delta. Containers share the host kernel but get isolated namespaces (PID, network, mount). This means you can run multiple containers from the same image without duplication, but any writes inside a container exist only in that container's ephemeral layer.
Use images to distribute and version your application artifact. Use containers to run that artifact in a consistent, isolated environment. The practical trap: if you rely on latest or any unpinned tag, your image can change silently between pulls, breaking reproducibility. Pin to a digest or a semantic version tag in production.
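As a rough illustration of the pinning advice, the three forms of a FROM line might look like the sketch below. The tag is only an example, and the digest is deliberately left as a placeholder because it has to be read from your own registry.

```dockerfile
# Floating tag: can resolve to a different image on the next pull.
# FROM python:3.12-slim

# Full version plus OS codename: a much narrower target.
FROM python:3.12.3-slim-bookworm

# Digest pin: the strongest guarantee. Look the value up with
#   docker image inspect --format '{{index .RepoDigests 0}}' python:3.12.3-slim-bookworm
# and append it; the placeholder below is intentionally not filled in.
# FROM python:3.12.3-slim-bookworm@sha256:<digest>
```

A digest pin trades convenience for certainty: the image never changes underneath you, but security updates to the base image only arrive when you bump the digest yourself.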
Using node:14 in a Dockerfile: Node 14.17.0 silently became 14.21.3, breaking a regex that relied on a bug that had since been fixed. The symptom: random 500 errors in production with no code change. Rule: always pin the full version tag (e.g., node:14.21.3-slim) or use the SHA256 digest.
latest is a moving target that will break your build.
Building an Image with Dockerfile
A Dockerfile is a recipe for building an image. Each instruction creates a new layer — a filesystem diff on top of the previous layer. Docker caches layers and only rebuilds from the first changed instruction downward. Understanding this is the single most important optimization for build speed.
The layer caching principle: if a layer has not changed and all preceding layers are cached, Docker reuses the cached layer instantly. Dependencies change rarely. Code changes frequently. Put rare changes first.
The .dockerignore file controls what gets sent to the Docker Daemon as build context. Without it, your entire project directory — including .git (often 100MB+), node_modules, __pycache__, and .env files with secrets — is sent on every build. This slows builds and risks secret exposure.
- Docker caches layers top-to-bottom. A changed instruction invalidates all subsequent layers.
- Dependencies change rarely. Code changes frequently. Put rare changes first.
- COPY requirements.txt before COPY . . ensures pip install is cached on code-only changes.
- Each RUN creates a layer. Combining commands with && reduces layer count, and it reduces image size only when cleanup (such as deleting package caches) happens in the same RUN.
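The ordering rule above can be sketched as a small Dockerfile. This assumes a Python project with a requirements.txt at the repository root and a hypothetical app.py entry point; adjust both to your layout.

```dockerfile
FROM python:3.12.3-slim-bookworm
WORKDIR /app

# Rarely changes: copy only the dependency manifest first, so the
# pip install layer below stays cached across code-only edits.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Changes often: copying the source last means an edit invalidates
# only this layer and the ones after it.
COPY . .

# Hypothetical entry point; replace with the real start command.
CMD ["python", "app.py"]
```

Pair this with a .dockerignore that lists .git, node_modules, __pycache__, and .env, so that COPY . . neither busts the cache on irrelevant files nor bakes secrets into a layer.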
Running Containers
A container is a running instance of an image with an additional thin writable layer. Multiple containers can run from the same image — each with its own writable layer, environment variables, and port mappings. The image itself never changes.
The container lifecycle: create (allocate resources), start (run the CMD process), stop (send SIGTERM, wait, then SIGKILL), remove (delete the writable layer). The --rm flag automates removal on stop.
Port mapping confusion: -p HOST:CONTAINER maps the host port to the container port. Your application inside the container must bind to 0.0.0.0 (all interfaces), not 127.0.0.1 (localhost). Binding to localhost inside the container means the application only accepts connections from within the container — the host port mapping becomes useless.
The difference between stop and kill: docker stop sends SIGTERM (graceful shutdown, 10-second default timeout), then SIGKILL if the process does not exit. docker kill sends SIGKILL immediately. Always use stop in production — it gives your application time to flush logs, close database connections, and complete in-flight requests.
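Graceful shutdown only works if SIGTERM actually reaches your application. The sketch below contrasts the exec and shell forms of CMD. It assumes a hypothetical app.py that installs its own SIGTERM handler, since a process running as PID 1 gets no default action for that signal.

```dockerfile
FROM python:3.12.3-slim-bookworm
WORKDIR /app
COPY app.py .

# Signal that docker stop sends first. SIGTERM is already the default;
# it is spelled out here only to make the behavior visible.
STOPSIGNAL SIGTERM

# Exec form: python runs as PID 1 and receives the signal directly.
CMD ["python", "app.py"]

# Shell form, for contrast: /bin/sh -c becomes PID 1 and may not forward
# SIGTERM, so the app can sit through the timeout and then be SIGKILLed.
# CMD python app.py
```

If the application cannot handle signals itself, running the container with docker run --init puts a minimal init process in front of it. For slow shutdowns, docker stop -t extends the timeout.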
- 127.0.0.1 inside the container is the container's own loopback, not the host's.
- Port mapping (-p 8000:8000) forwards traffic from the host to the container's network interface.
- If the app binds to 127.0.0.1, it only accepts connections from inside the container.
- Binding to 0.0.0.0 accepts connections from all interfaces, including the mapped port.
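A minimal way to see the binding rule in action, using Python's built-in http.server purely as a stand-in for a real application:

```dockerfile
FROM python:3.12.3-slim-bookworm
WORKDIR /srv

# Documents the port the process listens on; it does not publish it.
EXPOSE 8000

# Listens on all interfaces, so traffic forwarded by -p 8000:8000 arrives.
CMD ["python", "-m", "http.server", "8000", "--bind", "0.0.0.0"]

# With --bind 127.0.0.1 instead, the server would only be reachable from
# inside the container and the published host port would appear dead.
```

Built under a hypothetical name such as bind-demo and started with docker run --rm -p 8000:8000 bind-demo, the server should answer on http://localhost:8000/ from the host. Switching the bind address to 127.0.0.1 and rebuilding is a quick way to reproduce the failure.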
Volumes and Networking
Containers are ephemeral: when a container is removed, its writable layer is discarded. For state that must outlive any single container (databases, file uploads, logs), you need volumes.
Named volumes: Managed by Docker. The storage location is controlled by the Docker Daemon (typically /var/lib/docker/volumes/). Survives docker compose down. Destroyed only by docker compose down -v or docker volume rm. Best for databases.
Bind mounts: Mount a host directory into the container. Changes on either side are reflected immediately. Great for development (hot reload). Not recommended for production — ties the container to a specific host path and breaks portability.
Networking: Containers on the same user-defined Docker network (Compose creates one per project by default) can reach each other by container name. Docker's embedded DNS server (127.0.0.11) resolves container names to internal IPs. The host-mapped port (left side of -p) is for external access. Container-to-container communication uses the container port directly, never the host port.
Volume lifecycle gotcha: Named volumes persist data even after the container is removed. This is a feature for databases but a trap for test environments — stale data from previous test runs can cause non-deterministic test failures. Use docker compose down -v in CI to ensure clean state.
- Named volumes are portable — Docker manages the storage location, not a host path.
- Bind mounts couple the container to a specific host — breaks multi-machine deployments.
- Named volumes survive docker compose down. Bind mounts depend on the host directory existing.
- Named volumes bypass the container's copy-on-write storage driver (overlay2) and live in a Docker-managed directory on the host. Bind mounts also bypass it, but point at a host path you choose.
Why Multi-Stage Builds Are Mandatory, Not Optional
Production images shouldn't ship compilers, debug tools, or your secret SSH keys. Yet I see Dockerfiles that copy node_modules into production. That's lazy and insecure. Multi-stage builds solve this by letting you compile in one stage and copy only the binary into the final image. The result is an image that's 5x smaller and has zero build-time cruft. Attack surface shrinks. Startup time drops. Disk costs go down. Do it on day one.
Here's the pattern: Stage one installs your full SDK and dependencies, compiles the app. Stage two starts from a lean base like alpine:3.19, copies just the compiled artifact. No source code, no dev dependencies, no package manager. If you're not doing this, your CI/CD is delivering a bloated liability. Start now.
A builder pattern for a Go service is the cleanest example. One FROM golang:1.22 AS builder, one FROM alpine:3.19. Two FROM lines in the Dockerfile, and the compiler, source code, and package caches never reach the final image.
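A minimal sketch of that two-stage pattern, assuming the main package sits at the repository root next to go.mod and go.sum. The output path and binary name are illustrative.

```dockerfile
# Stage 1: full Go toolchain, used only to compile.
FROM golang:1.22 AS builder
WORKDIR /src

# Dependency manifests first, so module downloads stay cached.
COPY go.mod go.sum ./
RUN go mod download

COPY . .
# Static binary (no cgo), stripped of symbol and debug tables.
RUN CGO_ENABLED=0 go build -ldflags='-s -w' -o /out/app .

# Stage 2: small runtime image that receives only the binary.
FROM alpine:3.19
COPY --from=builder /out/app /usr/local/bin/app
ENTRYPOINT ["/usr/local/bin/app"]
```

Disabling cgo matters here because the builder image is Debian-based while the runtime is Alpine. A dynamically linked binary built against glibc would not start on musl.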
/proc/self/mem access paths. Always set CGO_ENABLED=0 and strip with -ldflags='-s -w' in Go builds.
Container Resource Limits Save You From Noisy Neighbors
A single runaway container can tank your entire host. I've seen a memory leak in a metrics container OOM-kill the production database next to it. Docker's default is unlimited — don't trust it. Always pin CPU and memory. This isn't Kubernetes-specific; you set limits in docker run flags or Compose files. If you skip this, you're gambling that every app behaves perfectly every time. They won't.
Why before how: Predictable performance. If your payment service needs 512MB, cap it at 512MB. If it spikes, it gets throttled or killed — but it doesn't take down the logging pipeline. Also stops crypto-mining compromises from melting your fleet. In Compose, use deploy.resources.limits or per-service mem_limit / cpus. Test with docker stats after.
Here's a Compose file that runs a Node.js app next to a Redis cache. Both are capped. Notice Redis gets less CPU — it's a cache, not a compute engine.
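A minimal sketch of such a Compose file follows. The image names, port, and limit values are hypothetical and should be tuned against docker stats output for your own workload.

```yaml
services:
  app:
    image: myapp:v1          # hypothetical Node.js image
    ports:
      - "3000:3000"
    cpus: 1.0                # hard CPU cap
    mem_limit: 512m          # hard memory cap
    mem_reservation: 384m    # soft target, 75% of the limit
    depends_on:
      - cache

  cache:
    image: redis:7.2-alpine
    cpus: 0.5                # less CPU than the app: it is a cache
    mem_limit: 256m
    mem_reservation: 192m
```

The per-service keys are used here rather than deploy.resources.limits only to keep the sketch short. Either form expresses the same caps.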
Set --memory-reservation to 75% of your limit. This gives the kernel a soft target to reclaim memory before the hard kill. Prevents abrupt OOMs during traffic spikes.
Check docker stats in staging before deploying to prod.
A Simple Docker Workflow: Ship It Without the Ceremony
Stop treating Docker like a secret ceremony. A real workflow starts with a clean .dockerignore and ends with docker compose up -d in production. You don't need Kubernetes for your microservice that runs two containers. You need repeatability, not complexity.
Start with a Dockerfile that pins base image tags — FROM node:20-alpine3.19, not FROM node. Then run docker build -t myapp:v1 . and docker run -d -p 3000:3000 myapp:v1. For multi-container apps, use docker-compose.yml with services, ports, and volumes. That's it. Test locally, push to a registry like Docker Hub or ECR, and pull on your server.
This workflow catches env drift early, slashes onboarding time for new devs, and makes rollbacks a docker pull <image>:v0 away. Stop over-engineering your pipeline before you ship anything.
Never use the latest tag in production. Pin to semantic versions or commit SHAs. latest is a time bomb for broken deploys and zero traceability.
Advanced Tools to Know: Stop Fighting Docker Solo
Vanilla Docker is fine for toy projects. In production, you need tools that kill the edge cases you haven't hit yet. Start with Dive for image layer inspection: it shows you exactly where your image bloat lives. One dive myimage:latest and you'll see that apt-get clean you forgot. Next, Trivy for vulnerability scanning. Scan your images before they hit production, not after. The old docker scan command has been deprecated in favor of Docker Scout, so don't build a pipeline on it. Trivy catches CVEs in base layers and dependencies.
Then there's Hadolint, a Dockerfile linter that enforces best practices like pinning base image tags and package versions. It'll save you from yourself during code review. And Docker Scout for continuous analysis if you need a GUI for compliance. But the real power move is combining BuildKit caching with --cache-from in CI pipelines. Builds that took minutes can drop to seconds when the cache hits.
These tools aren't optional. They're the difference between "works on my machine" and "works everywhere, forever, and I can prove it."
Add docker run --rm -v /var/run/docker.sock:/var/run/docker.sock aquasec/trivy image --exit-code 1 --severity HIGH,CRITICAL your-image to your CI. The --exit-code 1 flag makes Trivy return a failing status when it finds a HIGH or CRITICAL CVE, which fails the build. One line, zero excuses.
Step 4. Verify Installation
Before trusting Docker to run production workloads, you must verify the installation works end-to-end. This isn't about typing docker --version. You need to confirm the daemon is running, networking is operational, and containers can actually execute. Start by running docker info to check daemon status and see critical details like storage driver, OS type, and number of containers. Next, pull a small test image like hello-world: docker run hello-world. This validates the download pipeline, image extraction, and container lifecycle. Finally, run an interactive container with busybox to confirm stdin/stdout are working: docker run -it busybox sh. If any step fails, check that your user is in the docker group, that the Docker service is enabled and started (systemctl enable --now docker), and that your system supports virtualization if using Docker Desktop. Skipping verification leads to silent failures during deployment.
Key Features
Docker containers aren't just lightweight VMs—they bring unique features that shift how you build and ship software. First, image layering: every Dockerfile instruction creates a read-only layer, enabling caching, reuse, and minimal bandwidth on pulls. Second, container isolation via Linux namespaces and cgroups gives you process-level separation without a hypervisor. Third, portability: containers run identically on your laptop, a bare-metal server, or a cloud cluster, because the image bundles the application plus its OS dependencies. Fourth, ephemeral by design: containers are meant to be created and destroyed frequently, forcing stateless architectures that scale horizontally. Fifth, built-in orchestration primitives like health checks, restart policies, and resource limits let you run containers reliably without external tools. Finally, Docker's registry ecosystem (Docker Hub, private registries) makes distribution as simple as docker push. These features eliminate the classic "works on my machine" problem and make CI/CD pipelines deterministic.
Production Outage from Unpinned Base Image — python:3.12-slim Changed OS Underneath
- Unpinned tags are time bombs. python:3.12-slim is not a fixed target — it moves.
- Pin both the version AND the OS codename: python:3.12.3-slim-bookworm.
- CI runners without warm caches pull the latest image on every build, exposing you to silent upstream changes.
- A 20-character change in a FROM line prevents hours of incident response.
- Image digest pinning (FROM python:3.12.3-slim-bookworm@sha256:abc123...) is the strongest guarantee.
docker logs --tail 50 <container>
docker inspect <container> --format='{{.State.ExitCode}} {{.State.OOMKilled}}'