Intermediate 11 min · March 06, 2026

Docker Interview Questions — Production Failure Patterns

Q: What is the difference between a Docker image and a Docker container?

An image is a read-only, layered template — think of it as a frozen snapshot of a filesystem and its metadata. A container is a live, running instance of that image with a thin writable layer added on top. You can create dozens of containers from a single image simultaneously, each isolated from the others. When a container is deleted, its writable layer is gone, but the original image is untouched.

Q: How do you handle secrets like passwords or API keys in Docker containers?

Secrets should never be baked into Docker images. For local dev, use environment variables or .env files (in .gitignore). In production, use orchestration-level features like Docker Secrets or Kubernetes Secrets. Within a standalone container, mounting a tmpfs volume is a secure way to pass sensitive data into memory so it never persists to the host disk. BuildKit --mount=type=secret handles build-time secrets without leaving them in image layers.
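As a rough sketch of the runtime side: Docker and Compose mount secrets as files under /run/secrets/<name>. The helper below reads that file first and falls back to an environment variable for local development. The function name, the secret name db_password, and the upper-cased fallback variable are illustrative assumptions, not part of any Docker API.

```python
import os
from pathlib import Path

SECRETS_DIR = Path("/run/secrets")  # where Docker mounts secrets inside a container


def read_secret(name, secrets_dir=SECRETS_DIR):
    """Return the secret's value, or None if it is not provided anywhere."""
    secret_file = Path(secrets_dir) / name
    if secret_file.is_file():
        # Secret files usually end with a newline; strip it.
        return secret_file.read_text().strip()
    # Hypothetical convention: fall back to an upper-cased env var for local dev.
    return os.environ.get(name.upper())


if __name__ == "__main__":
    password = read_secret("db_password")  # hypothetical secret name
    print("secret found" if password else "secret missing")
```

Reading from a file keeps the value out of `docker inspect` output and out of the process environment, which is the main reason to prefer it over plain environment variables in production.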

Q: Can you run a container without the Docker Daemon?

Yes. The Docker CLI depends on the dockerd daemon, but the actual container execution is handled by lower-level components: containerd manages container lifecycles and runc creates the container process. Tools like Podman provide a daemonless alternative that implements the same OCI standards as Docker, allowing you to run containers as a standard user process without a background service.

Q: What does exit code 137 mean for a Docker container?

Exit code 137 means the container was killed by signal 9 (SIGKILL), typically by the Linux OOM killer. The container exceeded its memory limit or the host ran out of memory. Debug with docker inspect to check OOMKilled status, then either increase the memory limit (--memory flag) or fix the memory leak in your application.
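The arithmetic behind 137 is 128 + 9: exit codes above 128 conventionally mean "terminated by signal (code - 128)". A minimal sketch of that decoding, assuming a Linux host (signal numbers are platform-specific) and a helper name chosen for illustration:

```python
import signal


def describe_exit_code(code):
    """Translate a container exit code into a human-readable hint."""
    if code == 0:
        return "exited cleanly"
    if code > 128:
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = f"signal {signum}"  # number not defined on this platform
        return f"killed by {name}"
    return f"application error (exit status {code})"


print(describe_exit_code(137))
print(describe_exit_code(143))
```

The same rule explains exit code 143 after a graceful docker stop: 128 + 15, where 15 is SIGTERM.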

Q: What is the difference between docker stop and docker kill?

docker stop sends SIGTERM (graceful shutdown) and waits 10 seconds by default (configurable per call with docker stop --time, or per container with --stop-timeout on docker run), then sends SIGKILL if the process has not exited. docker kill sends SIGKILL immediately with no grace period, unless you pick another signal with --signal. In production, always use docker stop to give your application time to flush logs, close database connections, and drain in-flight requests. docker kill is for stuck containers that do not respond to SIGTERM.

Bind mount migrations wiped a production database.

Naren Founder & Principal Engineer

20+ years shipping production code across the stack, with years spent interviewing engineers. Drawn from code that ran under real load.

✓ Production tested · Last updated July 18, 2026 · 2,466 articles, all by Naren

Before you start (⏱ 25 min)

✓Solid grasp of fundamentals
✓Comfortable reading code examples
✓Basic production concepts

⚡Quick Answer

Images vs containers: immutable template vs running instance with writable layer
Layer caching: instruction order determines build speed
Volumes vs bind mounts vs tmpfs: three storage mechanisms with different lifecycle guarantees
Networking: bridge/host/none drivers, DNS resolution by service name
Multi-stage builds: separate build-time from runtime dependencies

✦ Definition (~90s read)

What Are Docker Interview Questions?

Every junior parrots 'containers are lightweight VMs.' That's wrong, and interviewers will let you hang yourself with it.

A container is a process. Full stop. It runs on the host kernel with cgroups limiting resources and namespaces isolating what it can see. No hypervisor, no guest OS. That's why a container boots in milliseconds, not minutes.

The isolation is an illusion, but a useful one. Namespaces give each container its own filesystem, process tree, network stack, and user IDs. Cgroups enforce CPU quotas, memory caps, and I/O throttling. When you run docker run, the daemon creates a new cgroup tree and namespace for that process, then starts a single application in it.
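One way to see namespaces for yourself, sketched under the assumption of a Linux host with /proc mounted: every process exposes the namespaces it belongs to as symlinks under /proc/<pid>/ns. The helper name is illustrative.

```python
import os

NS_DIR = "/proc/self/ns"  # Linux-only; one entry per namespace this process belongs to


def current_namespaces(ns_dir=NS_DIR):
    """Map namespace type (pid, net, mnt, ...) to its kernel identifier."""
    if not os.path.isdir(ns_dir):
        return {}  # not on Linux, or /proc is not mounted
    namespaces = {}
    for entry in sorted(os.listdir(ns_dir)):
        try:
            # Each link reads like "pid:[4026531836]"
            namespaces[entry] = os.readlink(os.path.join(ns_dir, entry))
        except OSError:
            continue  # entry not readable in this environment
    return namespaces


for ns_type, ident in current_namespaces().items():
    print(f"{ns_type:10s} {ident}")
```

Run it once on the host and once inside a container: differing identifiers for pid, net, and mnt mean separate namespaces, while identical ones mean the namespace is shared with the host.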

Here's the production headache: since containers share the host kernel, a kernel panic takes down every container on that host. With virtual machines, a guest kernel panic takes down only that one VM. Never treat containers as hardened security boundaries, because they aren't. Use them for dependency isolation and resource control, not multi-tenant hostile workloads.

Senior engineers understand this trade-off. Juniors think containers are magic.

Plain-English First

Imagine you're shipping a birthday cake to a friend across the country. Instead of hoping the bakery at their end can recreate the exact recipe, you ship the entire kitchen — oven, ingredients, and instructions — sealed in a box. When they open it, the cake comes out perfectly every time. That sealed box is a Docker container. It bundles your app and everything it needs to run, so it works the same on your laptop, a test server, and production — no 'it worked on my machine' excuses.

Docker interview questions probe production instincts, not memorized definitions. Interviewers want to know if you have debugged a container that could not reach its database, optimized a build that took 10 minutes, or tracked down a secret leak in an image layer.

The three pillars that separate strong answers from weak ones: understanding the image/container lifecycle (immutable templates vs ephemeral instances), mastering layer caching (instruction order as a performance decision), and knowing the storage and networking trade-offs (named volumes vs bind mounts, bridge vs host networking).

Common misconceptions that fail interviews: EXPOSE publishes a port (it does not), containers are VMs (they share the host kernel), and docker stop and docker kill are the same (SIGTERM vs SIGKILL). Getting these wrong signals a lack of hands-on production experience.

Why Docker Interview Questions Focus on Production Failure Patterns

Docker interview questions are not about memorizing commands or Dockerfile syntax. They test your understanding of containerization mechanics—how Linux namespaces and cgroups isolate processes, and how layered images (UnionFS) enable efficient distribution. The core mechanic is that a container is just a process with restricted visibility and resource limits, not a lightweight VM. This distinction matters because it changes how you debug, monitor, and secure applications.

In practice, Docker's key properties are ephemeral filesystems, shared kernel with the host, and network namespace isolation. Containers start in milliseconds, but they also inherit host kernel vulnerabilities. A container crash doesn't persist state unless volumes are mounted. Resource limits (CPU, memory, PID) are enforced by cgroups, and misconfiguring them leads to OOM kills or throttling, not graceful degradation.

You use Docker to achieve environment parity between dev, CI, and production, and to enable microservice decomposition with minimal overhead. It matters because teams that treat containers as VMs hit failure patterns like zombie processes, leaked file descriptors, or network DNS timeouts. Understanding these patterns is what separates a Docker user from a Docker engineer.

⚠ Container ≠ VM

Containers share the host kernel; a kernel panic triggered by one container's workload takes down every container on that host. Always design for host-level failure.

📊 Production Insight

A team ran a Java app in Docker without setting -Xmx, assuming cgroup memory limits would constrain the JVM. The JVM read host memory, allocated a huge heap, and got OOM-killed repeatedly.

Symptom: container exits with code 137 (SIGKILL from OOM killer) but no Java heap dump—because the kernel kills the process before the JVM can write one.

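Rule: Always size the JVM heap relative to the cgroup limit, for example with -XX:MaxRAMPercentage=75.0. Container awareness (-XX:+UseContainerSupport) is on by default from JDK 10 and 8u191; older JVMs ignore cgroup limits and need an explicit -Xmx.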
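To make "relative to cgroup limits" concrete, here is a minimal sketch of what a container-aware runtime does: read the memory limit from the cgroup filesystem and budget a percentage of it. The two paths are the usual cgroup v2 and v1 locations; the helper names and the "treat huge v1 values as unlimited" threshold are assumptions for illustration.

```python
from pathlib import Path

CGROUP_V2 = Path("/sys/fs/cgroup/memory.max")
CGROUP_V1 = Path("/sys/fs/cgroup/memory/memory.limit_in_bytes")
UNLIMITED_THRESHOLD = 1 << 60  # assumption: v1 reports a huge number when unlimited


def read_memory_limit(candidates=(CGROUP_V2, CGROUP_V1)):
    """Return the cgroup memory limit in bytes, or None if unlimited or unknown."""
    for path in candidates:
        try:
            raw = Path(path).read_text().strip()
        except OSError:
            continue  # file absent: try the next cgroup layout
        if raw == "max":  # cgroup v2 spelling of "no limit"
            return None
        value = int(raw)
        return None if value >= UNLIMITED_THRESHOLD else value
    return None


def heap_budget(limit_bytes, percent=75.0):
    """Bytes to give the heap, mirroring the idea behind MaxRAMPercentage."""
    return int(limit_bytes * percent / 100)


limit = read_memory_limit()
print("no cgroup memory limit found" if limit is None else heap_budget(limit))
```

With a 512 MiB limit (docker run --memory 512m), a 75% budget leaves 128 MiB for thread stacks, metaspace, and native buffers, which is the headroom that keeps the OOM killer away.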

🎯 Key Takeaway

Containers are processes, not VMs—debug with ps, top, and strace, not with a virtual console.

Ephemeral filesystem means logs and state must be externalized via volumes or stdout.

Resource limits set at the Docker level are not enough on their own: the runtime inside the container (JVM heap, worker pools) must be configured to respect them.

thecodeforge.io

Docker Interview Questions

Core Concepts: Images, Containers, and the Daemon — What Interviewers Really Want to Hear

Most candidates can define an image and a container. What separates a strong answer is explaining the relationship between them.

An image is an immutable, layered snapshot of a filesystem and its metadata — think of it as a read-only template. A container is a running instance of that image, plus a thin writable layer on top. When the container dies, that writable layer is gone. This is why containers are considered ephemeral by design.

The Docker daemon (dockerd) is the long-running background process that does the actual work: building images, managing container lifecycles, handling networking, and talking to registries. The Docker CLI you type commands into is just a client that sends API requests to the daemon over a Unix socket.

Interviewers love asking about layers because they reveal whether you understand caching. Each filesystem-changing instruction in a Dockerfile (RUN, COPY, ADD) creates a new layer; instructions like ENV, LABEL, and CMD only add metadata, but every instruction still gets its own cache entry. The cache key for a step is derived from its parent plus the instruction text, and for COPY and ADD also from a checksum of the files being copied. If step 3 changes, every step after it is invalidated and must be rebuilt. This is why instruction order in a Dockerfile matters enormously for build speed: put the things that change least (installing OS packages) at the top, and the things that change most (copying your app source code) near the bottom.

Copy-on-Write (CoW) internals: Docker's storage drivers (overlay2 is the default) use Copy-on-Write. When a container reads a file, it reads directly from the image layer. When it writes, the file is copied to the writable layer and modified there. This means multiple containers started from the same image share the same read-only layers on disk, and with overlay2 they share the kernel's page cache for those files too; only the writable deltas are unique per container. This is why starting 50 containers from the same image is fast and memory-efficient.

io/thecodeforge/OptimisedNodeApp.dockerfile (DOCKERFILE)

# ─── STEP 1: Base OS + system packages ───
FROM node:20-alpine AS base

# Image metadata
LABEL maintainer="engineering@thecodeforge.io"

WORKDIR /app

# ─── STEP 2: Install dependencies (layer caching strategy) ───
# Copy only the manifests so this layer stays cached until dependencies change
COPY package.json package-lock.json ./
RUN npm ci --omit=dev

# ─── STEP 3: Application source (changes most often, so it goes last) ───
COPY src/ ./src/

# Documents the listening port; does not publish it
EXPOSE 3000

# Exec form makes node PID 1, so it receives SIGTERM directly
CMD ["node", "src/server.js"]

Output

Step 5/8 : RUN npm ci --omit=dev

---> Using cache

Successfully built a1b2c3d4e5f6

Successfully tagged io.thecodeforge/node-app:latest

Mental Model

Images as Git Commits, Containers as Working Directories

Why does changing one Dockerfile instruction invalidate all subsequent layers?

Each layer's content hash depends on the layer below it. A changed instruction produces a different hash.
Docker caches by hash. If the hash changes, the cache is invalidated for that layer and all layers above.
This cascading invalidation is why instruction order matters — put stable instructions first.
The layer cache is local to the build machine. CI runners without warm caches rebuild everything.
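The cascade described above can be modelled in a few lines. This is a deliberately simplified sketch, not how BuildKit computes its cache keys: each step's key is a hash of its parent's key plus the instruction text, and the "[files vN]" suffix is a made-up stand-in for the checksum of copied files.

```python
import hashlib


def cache_keys(steps):
    """Chain-hash each step with its parent's key, like a simplified build cache."""
    keys, parent = [], ""
    for step in steps:
        parent = hashlib.sha256(f"{parent}\n{step}".encode()).hexdigest()
        keys.append(parent)
    return keys


def rebuilt_steps(old_steps, new_steps):
    """Return the new steps whose cache key no longer matches the old build."""
    old_keys = cache_keys(old_steps)
    new_keys = cache_keys(new_steps)
    return [
        step
        for i, (step, key) in enumerate(zip(new_steps, new_keys))
        if i >= len(old_keys) or key != old_keys[i]
    ]


old = [
    "FROM node:20-alpine",
    "COPY package.json package-lock.json ./ [files v1]",
    "RUN npm ci --omit=dev",
    "COPY src/ ./src/ [files v1]",
]
src_changed = old[:3] + ["COPY src/ ./src/ [files v2]"]
deps_changed = [old[0], "COPY package.json package-lock.json ./ [files v2]"] + old[2:]

print(len(rebuilt_steps(old, src_changed)), "step rebuilt after a source edit")
print(len(rebuilt_steps(old, deps_changed)), "steps rebuilt after a dependency edit")
```

A source edit invalidates only the final COPY, while a manifest edit invalidates the COPY, the npm ci, and everything after them, even though the later instructions did not change.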

📊 Production Insight

The Copy-on-Write mechanism has a performance implication for write-heavy containers. The first write to a file that lives in an image layer copies the entire file up to the writable layer before modifying it, and every later write still goes through the overlay filesystem. For large files that are frequently written (databases, log files), this adds latency. Use named volumes for write-heavy data, because volumes bypass CoW and write directly to the host filesystem.

🎯 Key Takeaway

Images are immutable templates. Containers are ephemeral instances with a writable layer. The daemon is the engine — the CLI is just a client. Layer caching is a build speed optimization driven by instruction order. Copy-on-Write means shared image layers are memory-efficient but write-heavy workloads need volumes.

Image vs Container Decisions in Production

If: Need to deploy the same app to multiple environments → Use: Build one image, run multiple containers with different environment variables

If: Need to persist data across container restarts → Use: Named volumes (the image is immutable, volumes are mutable)

If: Need to debug a running container → Use: docker exec to shell in. Never SSH into a container. Never modify the running container for permanent fixes.

If: Need to roll back a deployment → Use: Deploy the previous image tag. Containers are disposable; images are the source of truth.

Volumes vs Bind Mounts vs tmpfs — The Storage Question That Trips People Up

Data persistence is one of Docker's most misunderstood areas, and interviewers use it to separate people who've read the docs from people who've debugged production.

Containers are ephemeral. The writable layer that gets created when a container starts is destroyed when the container is removed. If you write a database file into that layer, you lose it the moment the container exits. The three storage mechanisms Docker offers each solve this differently.

A named volume is managed entirely by Docker. Docker decides where on the host filesystem the data lives (usually /var/lib/docker/volumes/). Your container just sees a directory. Volumes survive container deletion, can be shared between containers, and work across platforms. Use volumes for anything you care about keeping — databases, uploads, generated certificates.

A bind mount maps a specific host path into the container. You control the path. This is powerful for local development — you mount your source code directory into the container and edits you make on the host appear instantly inside the container, enabling hot-reload workflows. But bind mounts are tightly coupled to host filesystem layout, which makes them fragile in production.

tmpfs mounts are stored in the host's memory only. The moment the container stops, the data is gone. Use tmpfs for sensitive temporary data you explicitly do not want written to disk — think secrets, session tokens, or scratch space for cryptographic operations.

Failure scenario — bind mount in production causes data loss: A team ran PostgreSQL in Docker Compose with a bind mount: -v /data/postgres:/var/lib/postgresql/data. During a server migration, they copied the container configuration but not the host directory. The new server started with an empty mount, PostgreSQL initialized a fresh database, and the team deleted the old server. All production data was lost. The fix: use named volumes (docker volume create) which are managed by Docker and can be backed up and migrated independently of the host filesystem.
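A migration like the one above needs an explicit backup step, whichever mount type is used. A minimal sketch that builds the helper-container command for archiving a named volume; the function name and default archive name are illustrative, while the docker, alpine, and tar pieces are the standard tools:

```python
from pathlib import Path


def volume_backup_command(volume, backup_dir, archive="backup.tar.gz"):
    """Build the docker argv that archives a named volume into backup_dir."""
    backup_dir = Path(backup_dir).resolve()
    return [
        "docker", "run", "--rm",
        "-v", f"{volume}:/data:ro",      # the volume to back up, mounted read-only
        "-v", f"{backup_dir}:/backup",   # host directory that receives the archive
        "alpine",
        "tar", "czf", f"/backup/{archive}", "-C", "/data", ".",
    ]


cmd = volume_backup_command("thecodeforge_db_data", "/tmp/backups")
print(" ".join(cmd))
```

Pass the list to subprocess.run(cmd, check=True) to execute it. For a live database, a file-level archive can capture an inconsistent state, so stop the container first or use the database's own dump tool.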

Performance trade-off: On a Linux host, named volumes and bind mounts both bypass the storage driver's Copy-on-Write layer and write straight to the host filesystem, so their raw I/O performance is effectively the same; a named volume is just a directory under /var/lib/docker/volumes/. The gap appears on Docker Desktop (macOS and Windows), where bind mounts cross the VM boundary through a file-sharing layer and can be noticeably slower than named volumes, which live inside the VM. Choose between them on portability and management grounds, not on expected throughput.

io/thecodeforge/storage_setup.sh (BASH)

#!/bin/bash

# Production Pattern: Create a managed volume for the database
docker volume create thecodeforge_db_data

# The postgres image refuses to initialise without a password;
# ${VAR:?msg} aborts the script if POSTGRES_PASSWORD is unset
docker run -d \
  --name forge-db \
  -e POSTGRES_PASSWORD="${POSTGRES_PASSWORD:?set POSTGRES_PASSWORD}" \
  -v thecodeforge_db_data:/var/lib/postgresql/data \
  postgres:16-alpine

# Security Pattern: Using tmpfs for API keys in memory
docker run -d \
  --name forge-sec-processor \
  --mount type=tmpfs,destination=/app/secrets,tmpfs-size=64m \
  io.thecodeforge/processor:latest

Output

$ docker volume ls

DRIVER VOLUME NAME

local thecodeforge_db_data

Mental Model

Storage Types as Lease Agreements

Why should you never use bind mounts in production?

Bind mounts couple the container to a specific host path — breaks portability across machines.
If the host directory does not exist, Docker creates it as root — permission issues on subsequent runs.
Host filesystem permissions can conflict with container user permissions.
Server migration requires manually copying the host directory — easy to forget, impossible to recover from.
Named volumes are managed by Docker, do not depend on a host path, and can be archived from a helper container (docker run --rm -v vol:/data ... tar).

📊 Production Insight

The bind-mount-in-production anti-pattern is one of the most common storage mistakes in Docker deployments. Teams use bind mounts during development for hot-reload convenience, then deploy the same compose file to production without changing the volume configuration. When the server is replaced during infrastructure migration, the data is left behind on the old host. Named volumes reduce this risk by turning the data into an explicit object that shows up in docker volume ls, but a volume is still local to one host, so a migration plan must include backing it up and restoring it.

🎯 Key Takeaway

Named volumes for production: they survive container removal, do not depend on a host path, and are managed by Docker. Bind mounts for development: they enable hot-reload but break portability. tmpfs for secrets: RAM-only, never touches disk. The bind-mount-in-production mistake is a common cause of data loss in Docker deployments.

Volume Type Selection by Environment

If: Production database or persistent state → Use: Named volumes (docker volume create). Back up with docker run --rm -v vol:/data -v $(pwd):/backup alpine tar czf /backup/backup.tar.gz /data

If: Development with live code reloading → Use: Bind mounts: -v $(pwd):/app. Fast iteration, no rebuild needed.

If: Sensitive temporary data (secrets, session tokens) → Use: tmpfs mounts: --tmpfs /tmp/secrets:size=10m. Data never touches disk.

If: CI test runs that need clean state every time → Use: docker compose down -v to destroy named volumes between runs.

Docker Networking Deep Dive — How Containers Actually Talk to Each Other

Networking is where many Docker users hit a wall. The mental model that unlocks it: each Docker network is a private virtual switch. Containers attached to the same user-defined network can talk to each other by container name. Containers on different switches can't reach each other unless you explicitly connect them or use a shared network.

Docker ships with several network drivers; three come up most in interviews: bridge, host, and none. The bridge driver (default) creates a private network on the host. Containers on the same user-defined bridge network can communicate using DNS: Docker has an embedded DNS server that resolves container names and Compose service names automatically. The default bridge network, which containers join when you pass no --network flag, does not provide this name resolution. This is how a Node.js API container can connect to a Postgres container using the hostname 'postgres' rather than an IP address that can change on every restart.

The host driver removes the network namespace entirely. The container uses the host's network stack directly. This gives maximum network performance (no virtual switch overhead) but destroys isolation — the container can see all host ports.

The none driver disables networking completely. The container has only a loopback interface. Useful for running batch jobs that must be air-gapped, or for testing how your app behaves with no network access.

Connection refused debugging: When two containers cannot communicate, the most common causes are: (1) they are on different networks — verify with docker network inspect, (2) the target service binds to 127.0.0.1 instead of 0.0.0.0 — localhost inside a container is the container itself, not the host, (3) the connection string uses the host-mapped port instead of the container port — container-to-container communication uses the container port directly, (4) a healthcheck or depends_on race condition — the target service is not ready yet.
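The first three causes can be told apart from inside the calling container with a two-stage check: resolve the name, then attempt a TCP connect. A minimal sketch using only the standard library; the function name and return labels are illustrative:

```python
import socket


def probe(host, port, timeout=2.0):
    """Report which stage of a TCP connection attempt fails."""
    try:
        # Stage 1: name resolution (fails if the containers share no network)
        addresses = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "dns_failure"
    family, socktype, proto, _, sockaddr = addresses[0]
    try:
        # Stage 2: TCP connect (fails if nothing listens on that address and port)
        with socket.socket(family, socktype, proto) as sock:
            sock.settimeout(timeout)
            sock.connect(sockaddr)
    except ConnectionRefusedError:
        return "refused"
    except OSError:  # includes timeouts and unreachable networks
        return "unreachable"
    return "ok"
```

Called as probe("database", 5432) from the API container, "dns_failure" points at cause (1), while "refused" points at a 127.0.0.1 bind or a wrong port, causes (2) and (3). A result that flips from "refused" to "ok" a few seconds after startup suggests the readiness race in (4).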

Performance trade-off (bridge vs host networking): Bridge networking adds a virtual Ethernet pair and iptables NAT rules for each container. This adds a small per-packet cost, often quoted in the tens of microseconds, though the figure depends on kernel version and hardware, so measure it for your own workload. For latency-sensitive workloads (high-frequency trading, real-time gaming), host networking eliminates this overhead. But it destroys port isolation: two containers cannot bind to the same port on host networking.

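io/thecodeforge/docker-compose.yml (YAML)

# The top-level "version" key is obsolete in Compose v2 and is omitted here
services:
  api:
    image: io.thecodeforge/api:latest
    networks:
      - forge_internal
    environment:
      - DB_HOST=database
    depends_on:
      database:
        condition: service_healthy  # wait for pg_isready, not just container start

  database:
    image: postgres:16-alpine
    networks:
      - forge_internal
    environment:
      # The postgres image refuses to initialise without a password
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD:?set POSTGRES_PASSWORD}
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 5

networks:
  forge_internal:
    driver: bridge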

Output

$ docker compose up -d

[+] Running 3/3

✔ Network forge_internal Created

✔ Container database Started

✔ Container api Started

Mental Model

Docker Networks as Virtual LANs

Why can containers resolve each other by service name without any DNS configuration?

Docker's embedded DNS server at 127.0.0.11 resolves container names to internal IPs on user-defined networks.
Each container's /etc/resolv.conf points to this embedded DNS server automatically.
The DNS server is network-aware — it only resolves names for containers on the same network.
This is why containers on different networks cannot resolve each other — the DNS server enforces network isolation.

📊 Production Insight

Never publish database ports to the host (ports: - '5432:5432') in any environment beyond local development. Internal services should live on private networks with no published ports. Only edge services (APIs, reverse proxies) should have published ports. This is a security boundary — if the database port is published, anyone who can reach the host can attempt to connect directly, bypassing application-level access controls.

🎯 Key Takeaway

Docker networks are virtual LANs. Bridge networking provides isolation with DNS resolution by service name. Host networking eliminates overhead but destroys isolation. Never publish database ports to the host. The embedded DNS server at 127.0.0.11 is network-aware — containers on different networks cannot resolve each other.

Network Driver Selection

If: Standard application deployment with multiple services → Use: Bridge networking (isolation with DNS resolution by service name)

If: Latency-sensitive workload (trading, gaming, real-time) → Use: Host networking (eliminates virtual switch overhead but loses port isolation)

If: Batch job that must be air-gapped or security-scanned → Use: None networking (no network access at all)

If: Multiple tiers that should not see each other (frontend, API, database) → Use: Multiple bridge networks, assigning services to only the networks they need

Multi-Stage Builds and Image Size — The Optimisation Question That Defines Seniors

A production Docker image should contain only what is needed to run the application. Not the compiler. Not the test framework. Not the build tools. Most candidates understand single-stage builds. Seniors reach for multi-stage builds by default.

The idea is simple: use one stage to build your app (with all the heavyweight tools that requires), then start fresh from a minimal base image and copy only the compiled output. The final image has no knowledge of how it was built — just what needs to run.

For a Go application this is dramatic: the builder stage might pull in the entire Go toolchain (hundreds of MB), but the final stage starts from scratch (literally 'FROM scratch') and contains only the statically compiled binary — often under 10MB total image size.

Security benefit: Every tool in your production image is an attack surface. gcc, make, curl, wget — if an attacker gets shell access to your container, these tools let them compile exploits, download payloads, and pivot. A distroless or Alpine runtime image with no build tools gives an attacker almost nothing to work with.

Deployment speed impact: Container images must be pulled to every node before they can run. A 1.2GB image takes 30-60 seconds to pull over a fast network. A 12MB image pulls in under 1 second. During rolling deployments across 20 nodes, that difference is minutes of deployment time.

Why deleting files in a RUN command does not shrink the image: Docker layers are additive. If you RUN apt-get install gcc in one layer and RUN apt-get remove gcc in the next, the gcc files still exist in the first layer — the image size does not decrease. The second layer just marks them as deleted. Multi-stage builds solve this by starting a fresh stage — the runtime stage never contains build tools in any layer.

io/thecodeforge/multistage.dockerfile (DOCKERFILE)

# ─── STAGE 1: Build Environment ───
FROM golang:1.22-alpine AS builder
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /app/forge-binary main.go

# ─── STAGE 2: Production Distroless ───
# Using distroless for security and minimal footprint
FROM gcr.io/distroless/static-debian12
COPY --from=builder /app/forge-binary /forge-binary

USER nonroot:nonroot
ENTRYPOINT ["/forge-binary"]

Output

REPOSITORY TAG IMAGE ID SIZE

io.thecodeforge/app latest d8f3e2a1b0c9 12.4MB

Mental Model

Multi-Stage Builds as a Factory Conveyor Belt

Why not just delete build tools in a RUN command at the end of a single-stage Dockerfile?

Docker layers are additive. A file added then deleted in a later layer still occupies space in the earlier layer.
RUN apt-get install gcc followed by a separate RUN apt-get remove gcc still has gcc in the install layer.
Multi-stage builds start fresh: the runtime stage never contains build tools in any layer.
Chaining install and cleanup in a single RUN avoids the extra layer, but multi-stage builds are the cleanest way to keep build tools out of the runtime image entirely.
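The additive-layer arithmetic can be made concrete with a toy model. The layer sizes below are invented for illustration, not measurements; the point is only that a deleting layer contributes roughly zero and never subtracts.

```python
def image_size(layers):
    """Total image size in MB: every layer's added bytes count, deletions never subtract."""
    return sum(layer["added_mb"] for layer in layers)


# Hypothetical single-stage build: gcc is installed, then removed in a later layer
single_stage = [
    {"step": "FROM debian base", "added_mb": 120},
    {"step": "RUN apt-get install gcc", "added_mb": 250},
    {"step": "RUN apt-get remove gcc", "added_mb": 0},  # whiteout markers only
    {"step": "COPY app binary", "added_mb": 12},
]

# Hypothetical runtime stage of a multi-stage build: gcc never enters any layer
multi_stage_runtime = [
    {"step": "FROM minimal base", "added_mb": 2},
    {"step": "COPY --from=builder app binary", "added_mb": 12},
]

print(image_size(single_stage), "MB single-stage")
print(image_size(multi_stage_runtime), "MB multi-stage runtime")
```

In this model the single-stage image still carries the 250 MB install layer after the removal step, while the runtime stage of the multi-stage build only ever contains the base and the binary.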

📊 Production Insight

Multi-stage builds are not optional for production. A single-stage image with build tools has a larger attack surface, slower pull times, and higher storage costs. The security benefit alone justifies the effort — every unnecessary binary in your production image is a tool an attacker can use post-compromise. Scan your images with Trivy or Docker Scout to verify your runtime image contains no build tools.

🎯 Key Takeaway

Multi-stage builds separate build-time and runtime dependencies. The builder stage compiles everything. The runtime stage copies only the artifact. This can shrink the image dramatically (the Go example above ends up at about 12MB), reduces attack surface, and speeds up deployments. Deleting files in a later RUN command does not shrink the image because layers are additive. Multi-stage builds are the cleanest way to keep build tools out of every layer of the runtime image.

Image Size Optimization Strategy

If: Compiled language (Go, Rust, Java) → Use: Multi-stage build with FROM scratch (static binaries) or distroless (including its Java variants) for the runtime stage

If: Interpreted language (Python, Node.js) → Use: Multi-stage build with slim/alpine base for the runtime stage. Copy only installed packages, not build tools.

If: Image is > 500MB → Use: Audit with dive (github.com/wagoodman/dive) to find bloated layers. Use .dockerignore. Combine RUN commands.

If: Need maximum security (minimal attack surface) → Use: Distroless images, which have no shell, no package manager, and no utilities. Just your binary and its runtime dependencies.

What Is a Docker Container, Really? (Forget the Marketing)

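Every junior parrots 'containers are lightweight VMs.' That's wrong, and interviewers will let you hang yourself with it.

The quickest proof is to look from the host: a container's init process has an ordinary host PID and an ordinary /proc entry, because it is an ordinary process running on the host kernel. Senior engineers understand this trade-off. Juniors think containers are magic.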

InspectContainerProcess.py (PYTHON)

# io.thecodeforge: interview tutorial

import subprocess

# Placeholder ID: substitute a running container's ID or name
container_id = "a1b2c3d4e5f6"

# Ask the daemon for the host PID of the container's init process
docker_inspect = subprocess.run(
    ["docker", "inspect", "--format", "{{.State.Pid}}", container_id],
    capture_output=True, text=True, check=True
)
container_pid = int(docker_inspect.stdout.strip())

# The container's process is visible in the host's /proc
with open(f"/proc/{container_pid}/status") as status_file:
    for line in status_file:
        if line.startswith(("Name:", "Pid:")):
            print(line.rstrip())

Output

Name: python3.11

Pid: 31204

⚠ Production Trap:

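A container lives exactly as long as its PID 1: whether that is your app or a placeholder like sleep infinity, the container dies the moment that process exits. Wrapper scripts that launch the app without exec leave the shell as PID 1, which neither forwards SIGTERM nor reaps orphaned children, so zombies accumulate. Use exec in entrypoint scripts, or start the container with docker run --init.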

🎯 Key Takeaway

A container is just a cgroup-restricted, namespace-isolated process. No kernel, no boot, no VM benefits.

Docker Hub Isn't a Backup Plan — It's Your First Attack Vector

Docker Hub is the public registry where developers push and pull images. It's also the easiest way to get pwned in production.

Pull a popular image from Docker Hub and check its layer history. You'll often find images built from ubuntu:latest, apt-get invoked with --allow-unauthenticated, pip packages pinned to version ranges instead of exact versions, or worse, images whose publishers signed them three years ago and have since lost access to the account.

Interviewers ask about Docker Hub because supply-chain attacks kill companies. SolarWinds wasn't a Docker attack, but the pattern is identical: one compromised image in your pipeline and you're exfiltrating production secrets to an S3 bucket you don't own.

Here's the senior response: never pull latest. Pin every image to its SHA256 digest. Run your own private registry (Harbor, ECR, GCR) with vulnerability scanning. Sign images and verify the signatures at deploy time (Docker Content Trust is the built-in option). And don't let production nodes pull straight from Docker Hub: kubelets pull anonymously unless you configure image pull secrets, anonymous pulls are rate-limited, and anyone can push a malicious image under a name one typo away from the one you meant.
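Digest pinning is easy to enforce mechanically. A minimal sketch of a CI-style check that flags any image reference not ending in @sha256:<64 hex characters>; the function name is illustrative and the digest in the example is a made-up placeholder, not a real image:

```python
import re

# An image reference is digest-pinned when it ends in @sha256:<64 hex chars>
DIGEST_PINNED = re.compile(r"^[^@\s]+@sha256:[0-9a-f]{64}$")


def unpinned_images(references):
    """Return the image references that rely on a mutable tag."""
    return [ref for ref in references if not DIGEST_PINNED.match(ref)]


fake_digest = "a" * 64  # placeholder digest for illustration only
images = [
    f"nginx@sha256:{fake_digest}",
    "nginx:latest",
    "postgres:16-alpine",
]
print(unpinned_images(images))
```

Feeding it the image fields from your compose files or manifests and failing the build on a non-empty result turns "never pull latest" from a convention into a gate.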

Docker Hub is a public square. You wouldn't trust a random USB stick. Don't trust a random image.

VerifyImageIntegrity.py (PYTHON)

# io.thecodeforge: interview tutorial

import docker

client = docker.from_env()

# Placeholder digest: substitute the full sha256 value your pipeline pins
expected_digest = "sha256:abc123..."

# Pull by digest, not by mutable tag
image = client.images.pull(f"nginx@{expected_digest}")

# RepoDigests entries look like "nginx@sha256:..."; compare the digest part
pulled_digests = {ref.split("@", 1)[1] for ref in image.attrs["RepoDigests"]}

if expected_digest not in pulled_digests:
    raise RuntimeError(f"Image tampered: {sorted(pulled_digests)}")
print(f"Image verified: {expected_digest[:20]}...")

Output

Image verified: sha256:abc123456789...

🔥Senior Shortcut:

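Run docker image inspect --format '{{index .RepoDigests 0}}' nginx to get the registry digest the image was pulled by. Compare that against your CI pipeline's pinned digest. The {{.Id}} field is the local image ID, which is a different hash and will not match a pinned registry digest.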

🎯 Key Takeaway

Pin images by SHA256 digest, not tag. Use a private registry with scanning. Docker Hub is a public bus, not a vault.

How Many Containers Are Actually Running? The Stats Command That Exposes Your Monitoring Gaps

You can't secure or scale what you can't count. Interviewers ask this because most engineers limp along with docker ps and miss half the picture. The real answer isn't about counting—it's about knowing your system's actual load.

docker container ls shows running containers, including paused ones (their status reads 'Up ... (Paused)'). Stopped containers are invisible unless you add --all. That's how production incidents happen: an engineer sees no container for the service, thinks it was never started, and starts a fire drill, when actually the container exited and is still sitting there under docker ps --all with its exit code and logs intact. docker info summarises three buckets: running, paused, stopped. For the precise state of each container (created, running, paused, restarting, exited, dead), use docker ps --all --format '{{.State}}'.

The senior move is knowing that paused containers still hold their memory, file descriptors, and published ports while doing no work. Counting isn't a parlor trick; it's capacity planning and incident root cause. If you can't tell me exactly how many containers are in each state right now, you're flying blind.

container_count.py (PYTHON)

# io.thecodeforge: interview tutorial

import subprocess
import json

def get_container_counts():
    # docker ps --all reports every container with a State field;
    # docker stats output carries no state information
    raw = subprocess.run(
        ["docker", "ps", "--all", "--format", "{{json .}}"],
        capture_output=True, text=True, check=True
    )
    lines = [l for l in raw.stdout.strip().split('\n') if l]

    # Anything that is neither running nor paused is counted as stopped here
    states = {"running": 0, "paused": 0, "stopped": 0}
    for line in lines:
        state = json.loads(line).get("State", "unknown").lower()
        if state in ("running", "paused"):
            states[state] += 1
        else:
            states["stopped"] += 1

    return states

if __name__ == "__main__":
    counts = get_container_counts()
    print(f"Running: {counts['running']}, Paused: {counts['paused']}, Stopped: {counts['stopped']}")

Output

Running: 3, Paused: 1, Stopped: 7

⚠ Production Trap:

Paused containers look harmless but still consume PIDs and network ports. Always alert on non-zero paused counts in prod.

🎯 Key Takeaway

Never rely on docker ps alone. Always count running, paused, and stopped containers separately for accurate capacity insight.
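
For a second opinion on the census, the daemon keeps its own per-state totals. A minimal sketch, assuming the JSON produced by docker info --format '{{json .}}' and its ContainersRunning, ContainersPaused, and ContainersStopped fields; the function names are hypothetical:

```python
import json
import subprocess


def parse_info_counts(info_json: str) -> dict:
    """Extract per-state container counts from `docker info` JSON output."""
    info = json.loads(info_json)
    return {
        "running": info.get("ContainersRunning", 0),
        "paused": info.get("ContainersPaused", 0),
        "stopped": info.get("ContainersStopped", 0),
    }


def docker_info_counts() -> dict:
    """Ask the local daemon for its own census (requires a running Docker daemon)."""
    raw = subprocess.run(
        ["docker", "info", "--format", "{{json .}}"],
        capture_output=True, text=True, check=True,
    )
    return parse_info_counts(raw.stdout)
```

Keeping the parsing separate from the subprocess call makes the logic testable without a daemon.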

Docker Object Labels: The Metadata Hack That Saves You From Container Chaos

Labels are tiny key-value pairs attached to containers, images, or volumes. Ignore them and you'll be grepping through container names at 3 AM trying to figure out which microservice version ran that job last week. Interviewers want to know you think in metadata, not magic strings.

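Labels aren't mere comments; they're the backbone of production governance. For example, a version label such as org.opencontainers.image.version=2.0 (the OCI successor to the deprecated org.label-schema.version) lets your CI system identify exactly which build is running and target it for a canary rollback. A label like prometheus.io/scrape=true can tell monitoring to collect metrics from that container, provided your Prometheus service discovery is configured to act on it; nothing scrapes on the label alone. Labels set in a Docker Compose file let tooling filter and target workloads without touching the container internals.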

The production pattern: enforce label schemas in your Dockerfile as a compliance gate. Use docker container ls --filter "label=environment=production" to audit workloads in seconds. No labels means no accountability. If a container crashes and has zero labels, you're debugging in the dark.

label_filter.pyPYTHON

# io.thecodeforge - interview tutorial

import subprocess
import json

def get_containers_by_label(label_key, label_value):
    filter_expr = f"label={label_key}={label_value}"
    raw = subprocess.run(
        ["docker", "container", "ls", "--filter", filter_expr, "--format", "{{json .}}"],
        capture_output=True, text=True
    )
    lines = [l for l in raw.stdout.strip().split('\n') if l]
    return [json.loads(l) for l in lines]

if __name__ == "__main__":
    prod_containers = get_containers_by_label("environment", "production")
    print(f"Production containers: {len(prod_containers)}")
    for c in prod_containers:
        print(f"  - {c['Names']} (Image: {c['Image']})")

Output

Production containers: 2
  - web-api-v2 (Image: myapp/api:2.0.1)
  - worker-queue (Image: myapp/worker:1.8.3)

💡Senior Shortcut:

Stick labels in Dockerfiles as early as possible. You can't retroactively label a running container—build the schema into your CI pipeline.

🎯 Key Takeaway

Labels are fixed at creation time and stay attached to the container across restarts. Use them for monitoring, compliance, and service discovery before you need them.
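
A label schema only helps if something enforces it. A minimal sketch of such a gate, assuming a hypothetical required set (environment, team, version) and reading labels from docker inspect's Config.Labels field; the function names are illustrative:

```python
import json
import subprocess

# Hypothetical schema: replace with whatever your organisation requires.
REQUIRED_LABELS = ("environment", "team", "version")


def missing_labels(labels, required=REQUIRED_LABELS):
    """Return the required label keys that are absent or empty."""
    labels = labels or {}  # docker inspect reports null when no labels are set
    return [key for key in required if not labels.get(key)]


def container_labels(container: str) -> dict:
    """Read a container's labels via `docker inspect` (requires Docker)."""
    raw = subprocess.run(
        ["docker", "inspect", "--format", "{{json .Config.Labels}}", container],
        capture_output=True, text=True, check=True,
    )
    return json.loads(raw.stdout) or {}
```

Calling missing_labels(container_labels("web-api-v2")) in an audit job would list what a given container lacks.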

Container Hardening: What Interviewers Watch for Beyond Basic Security

Hardening Docker containers means stripping them to the bare minimum. Interviewers want candidates who understand why default images are dangerous: every unnecessary package, setuid binary, or Linux capability expands the attack surface.

Start with a scratch or distroless base image: no shell, no package manager, no utilities. Drop all capabilities with --cap-drop=ALL, then add back only what the process needs. Run as a non-root user inside the container; the USER directive in the Dockerfile is not optional. Mount the root filesystem read-only, and use --read-only --tmpfs /tmp when the process needs writable scratch space without persistence. Use --security-opt=no-new-privileges to block privilege escalation through setuid binaries.

Together these make common breakout techniques much harder, but they do not make a container immune: every container shares the host kernel, so a kernel exploit can still cross the boundary. Interviewers ask this to see if you treat containers as disposable processes, not mini-VMs.

SecureContainer.pyPYTHON

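# io.thecodeforge - interview tutorial

# Run a hardened nginx container
import subprocess

cmd = [
    "docker", "run", "-d",
    "--name", "hardened-nginx",
    "--cap-drop=ALL",
    "--cap-add=NET_BIND_SERVICE",
    "--read-only",
    "--tmpfs", "/tmp:noexec,nosuid,size=64m",
    # The stock nginx image also writes a cache directory and a pid file;
    # without writable mounts for them it is likely to exit under --read-only.
    "--tmpfs", "/var/cache/nginx",
    "--tmpfs", "/var/run",
    "--security-opt=no-new-privileges",
    "--user", "101:101",
    "nginx:alpine"
]
result = subprocess.run(cmd, capture_output=True, text=True)
# On success, docker run -d prints the new container's ID
print(result.stdout.strip() if result.returncode == 0 else result.stderr)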

Output

e1b2c3d4e5f6

⚠ Production Trap:

Adding --privileged to get a container working is a job interview red flag. It grants every capability and bypasses all seccomp and AppArmor profiles.

🎯 Key Takeaway

Attack surface is cumulative: remove defaults you don't control.
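
The hardening flags can also be audited after the fact. A minimal sketch, assuming one container object from docker inspect output (the HostConfig.Privileged, HostConfig.ReadonlyRootfs, HostConfig.CapDrop, HostConfig.SecurityOpt, and Config.User fields); the function name audit_hardening and the wording of the findings are hypothetical:

```python
def audit_hardening(inspect_obj: dict) -> list:
    """Return a list of hardening gaps for one `docker inspect` container object."""
    host = inspect_obj.get("HostConfig") or {}
    config = inspect_obj.get("Config") or {}
    findings = []

    if host.get("Privileged"):
        findings.append("runs with --privileged")
    if not host.get("ReadonlyRootfs"):
        findings.append("root filesystem is writable")

    # CapDrop is null when nothing was dropped.
    cap_drop = [cap.upper() for cap in (host.get("CapDrop") or [])]
    if "ALL" not in cap_drop:
        findings.append("capabilities not dropped with --cap-drop=ALL")

    sec_opts = host.get("SecurityOpt") or []
    if not any(opt.startswith("no-new-privileges") for opt in sec_opts):
        findings.append("no-new-privileges not set")

    # An empty user means the process runs as root.
    user = (config.get("User") or "").split(":")[0]
    if user in ("", "0", "root"):
        findings.append("runs as root")

    return findings
```

Feeding it each element of json.loads(docker inspect output) gives a quick per-container report; an empty list means none of these five gaps were found, not that the container is secure.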

Docker CE vs EE: The Licensing Question That Reveals Production Experience

Docker CE (Community Edition) and EE (Enterprise Edition) diverged for good when Docker Inc. sold its enterprise business to Mirantis in November 2019. CE is now simply Docker Engine: free, open source, and without support SLAs. The EE product line was renamed under Mirantis: the engine became Mirantis Container Runtime (MCR), Universal Control Plane became Mirantis Kubernetes Engine (MKE), and Docker Trusted Registry became Mirantis Secure Registry (MSR). Together they form a commercial platform with built-in orchestration, image scanning, role-based access control, and vendor support.

The key difference for interviewers: CE gives you Swarm (docker swarm init) for basic clustering, while MKE manages Swarm and Kubernetes workloads side by side and layers on controls such as signed image enforcement. Enterprise releases have also historically come with a fixed, longer maintenance window (24 months is the figure usually quoted) versus CE's rolling releases; check the current Mirantis terms before quoting a number. The why matters here: CE ships no vulnerability scanning, so you must add Docker Scout or a third-party scanner yourself, whereas the enterprise platform bundles it.

Interviewers probe this to see if you've dealt with compliance requirements like FedRAMP or SOC 2, where CE alone is unlikely to pass audit without extensive bolt-ons. Know your licensing economics: CE is free but costs engineering time in security and stability gap work.

CheckEdition.pyPYTHON

# io.thecodeforge - interview tutorial

# Detect Docker edition from CLI output
import subprocess

version = subprocess.run(
    ["docker", "version", "--format", "{{.Server.Version}}"],
    capture_output=True, text=True
).stdout.strip()

# Heuristic only: some legacy Enterprise builds carried an "-ee" suffix in the
# version string, and a Mirantis build may mention "mirantis". Matching a bare
# "ee" substring would also catch unrelated strings.
is_ee = "mirantis" in version.lower() or "-ee" in version.lower()
print(f"Docker Version: {version}")
print(f"Enterprise Edition: {is_ee}")

Output

Docker Version: 24.0.7

Enterprise Edition: False

🔥Interview Shortcut:

If your resume says 'Docker EE 3.x', expect questions on Mirantis Kubernetes Engine upgrades and image trust policies.

🎯 Key Takeaway

CE is free, but you own the security and patching work; EE costs money but saves audit time.

Multi-Stage Builds: Optimizing Image Size and Build Cache

Multi-stage builds let you use multiple FROM statements in a single Dockerfile, each starting a new stage. The technique reduces final image size by separating build-time dependencies from runtime artifacts. For example, a Go application can compile in a stage with the full Go toolchain, then copy only the compiled binary into a minimal base image like alpine. The toolchain and source never reach the final image, which shrinks it and reduces attack surface.

Docker's build cache rewards ordering instructions from least to most frequently changed: installing system packages and downloading dependencies before copying source code means those cached layers are reused on every code change. A common pitfall is leaving package-manager caches in a layer, which bloats the image: use apk add --no-cache (or clean up in the same RUN), and use BuildKit's RUN --mount=type=cache when you want a download cache that speeds up rebuilds without being committed to the image. Seniors demonstrate this by showing how to reduce a 1GB image to under 20MB, and by explaining trade-offs between build speed and image size.

DockerfileDOCKERFILE

FROM golang:1.20 AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -o myapp .

FROM alpine:3.18
RUN apk add --no-cache ca-certificates
COPY --from=builder /app/myapp /usr/local/bin/
CMD ["myapp"]

💡Leverage Build Cache

📊 Production Insight

In production, use multi-stage builds with distroless or scratch images to minimize vulnerabilities. Always pin base image versions and use image scanning tools like Trivy to detect issues before deployment.

🎯 Key Takeaway

Multi-stage builds separate build and runtime environments, drastically reducing image size and improving security by excluding build tools from the final image.
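
The layer-ordering rule lends itself to a quick lint. A minimal sketch, assuming a simple Dockerfile without line continuations and a hypothetical list of install commands to look for; it flags any stage where a broad COPY . comes before the dependency install, which defeats the cache:

```python
# Hypothetical hints; extend with the install commands your stack uses.
INSTALL_HINTS = ("go mod download", "npm ci", "npm install", "pip install")


def copy_all_before_install(dockerfile_text: str) -> list:
    """Return (stage, copy_line, run_line) for each install that follows `COPY .`."""
    warnings = []
    stage = 0
    broad_copy_line = None
    for lineno, raw in enumerate(dockerfile_text.splitlines(), start=1):
        line = raw.strip()
        upper = line.upper()
        if upper.startswith("FROM "):
            # Each FROM starts a new stage with a fresh layer history.
            stage += 1
            broad_copy_line = None
        elif upper.startswith("COPY "):
            # Ignore flags such as --from=builder or --chown=...
            args = [a for a in line.split()[1:] if not a.startswith("--")]
            if args and args[0] in (".", "./"):
                broad_copy_line = lineno
        elif upper.startswith("RUN ") and broad_copy_line is not None:
            if any(hint in line for hint in INSTALL_HINTS):
                warnings.append((stage, broad_copy_line, lineno))
    return warnings
```

An empty result means no install step was found after a whole-context copy; it does not prove the ordering is optimal.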

Docker Compose vs Kubernetes: When to Use Each

Docker Compose and Kubernetes both run multi-container applications, but they serve different purposes. Compose is ideal for single-host development environments, local testing, and simple deployments. It uses a YAML file to define services, networks, and volumes, and starts everything with docker compose up (docker-compose up in the legacy v1 CLI). Kubernetes is designed for production-grade, multi-host clusters, with features like auto-scaling, self-healing, and rolling updates.

Use Compose when you need a lightweight setup for a small team or a handful of services on a single machine. Use Kubernetes when you require high availability, load balancing, and scheduling across multiple nodes. A common mistake is running Compose in production without monitoring or resilience: on its own it will not reschedule your containers onto another machine when the host dies. A typical path is a startup that prototypes on Compose, then migrates to Kubernetes as it scales. Interviewers look for understanding of the trade-off: Compose simplifies development, while Kubernetes adds operational complexity but enables enterprise-grade reliability.

docker-compose.ymlYAML

version: '3.8'
services:
  web:
    image: nginx:alpine
    ports:
      - "80:80"
  app:
    build: .
    environment:
      - DB_HOST=db
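  db:
    image: postgres:13
    environment:
      # The official postgres image refuses to initialise without a superuser
      # password; supply it from the environment instead of hardcoding it.
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD:?POSTGRES_PASSWORD must be set}
    volumes:
      - pgdata:/var/lib/postgresql/data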
volumes:
  pgdata:

🔥Hybrid Approach

📊 Production Insight

In production, never rely on Compose for critical workloads without proper monitoring, logging, and backup strategies. Kubernetes requires investment in cluster management, but pays off with scalability and resilience.

🎯 Key Takeaway

Docker Compose suits single-host dev/test environments; Kubernetes is for production-grade, multi-host orchestration with advanced features.

Container Security: Image Scanning, Non-Root, Read-Only Root

Container security is a top concern in production. Three key practices are image scanning, running as non-root, and using a read-only root filesystem.

Image scanning tools like Trivy, Clair, or Docker Scout analyze images for known vulnerabilities (CVEs) in packages and libraries. Integrate scanning into CI/CD pipelines to block insecure images. Running containers as non-root (the USER directive in the Dockerfile, for example USER 1001) limits what an attacker can do if the container is compromised. A read-only root filesystem (--read-only flag) blocks writes to the container's filesystem, so writes can only land on mounted volumes; this limits the damage malware can do. Combine it with --tmpfs for temporary files.

Interviewers expect you to know Docker's security defaults and how to harden beyond them. Common failures are running as root inside containers and ignoring base image vulnerabilities. Be ready to show how to build a secure image with minimal layers and no unnecessary tools.

DockerfileDOCKERFILE

FROM node:18-alpine AS base
RUN addgroup -g 1001 appgroup && adduser -u 1001 -G appgroup -s /bin/sh -D appuser

FROM base AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev

FROM base
COPY --from=build /app/node_modules /app/node_modules
COPY --chown=appuser:appgroup . /app
USER appuser
WORKDIR /app
CMD ["node", "server.js"]

⚠ Avoid Root by Default

📊 Production Insight

In production, enforce security policies via admission controllers (e.g., OPA/Gatekeeper) that require non-root and read-only root. Regularly scan images and update base images to patch vulnerabilities.

🎯 Key Takeaway

Container security requires image scanning, non-root execution, and read-only root filesystems to minimize attack surface and limit breach impact.
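
Blocking insecure images in CI comes down to reading the scanner's report and failing on severity. A minimal sketch, assuming a Trivy-style JSON report with a top-level Results list whose entries carry a Vulnerabilities list with VulnerabilityID, PkgName, and Severity fields; that shape is an assumption to verify against your scanner version, and the function name is hypothetical:

```python
import json

# Severities that should fail the pipeline (an illustrative policy).
BLOCKING = frozenset({"HIGH", "CRITICAL"})


def blocking_findings(report_json: str, blocking=BLOCKING) -> list:
    """Return (vulnerability id, package) pairs whose severity should block a deploy."""
    report = json.loads(report_json)
    found = []
    for result in report.get("Results") or []:
        # Targets with no findings may omit the Vulnerabilities key entirely.
        for vuln in result.get("Vulnerabilities") or []:
            if str(vuln.get("Severity", "")).upper() in blocking:
                found.append((vuln.get("VulnerabilityID"), vuln.get("PkgName")))
    return found
```

A pipeline step could exit non-zero whenever the returned list is non-empty; many scanners can also enforce a severity threshold themselves, which is simpler when it fits your policy.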

● Production incident · POST-MORTEM · severity: high

Production Data Loss — Bind Mount in Docker Compose Wiped Database on Server Migration

Symptom

After server migration, the application started successfully but all API responses returned empty results. Database queries against user tables returned 0 rows. The team initially assumed a connection string issue pointing to the wrong database instance.

Assumption

Team assumed the DATABASE_URL was misconfigured after migration. They verified the connection string, tested DNS resolution, and confirmed the container was running. All checks passed. Second assumption: a migration script had not run. They checked migration logs — all migrations reported as already applied against the empty database.

Root cause

The docker-compose.yml bind-mounted a host directory into the database container: /data/postgres:/var/lib/postgresql/data. During migration, the team copied the docker-compose.yml to the new server but did not copy /data/postgres. PostgreSQL started with an empty /data/postgres directory and initialized a fresh database cluster. The old server was decommissioned and wiped 48 hours later. The data lived in that server's /data/postgres directory, not in a Docker-managed volume.

Fix

1. Replaced bind mount with named volume: volumes: - postgres_data:/var/lib/postgresql/data.
2. Added automated daily backups using pg_dump to an S3 bucket.
3. Added a pre-migration checklist that includes volume data verification.
4. Implemented docker volume inspect to verify data exists before starting the database container.
5. Added a CI check that flags bind mounts in production compose files.

Key lesson

Bind mounts in production couple your data to a specific host path. When the host goes away, the data goes with it.
Named volumes are managed by Docker and are easier to back up, migrate, and verify, but they still live on the host's disk under Docker's data directory; copying the compose file does not copy them either.
Always verify data exists in the target volume before starting a database container after migration.
Decommissioning a server without verifying data has been migrated is an irreversible mistake.
Automated backups are not optional for production databases — they are the last line of defense against data loss.
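
The CI check from the fix list can be sketched in a few lines. This version assumes the compose file has already been parsed into a dict by the YAML library of your choice (not shown), and treats any volume source starting with /, ., or ~ as a host path; the function name find_bind_mounts is hypothetical:

```python
def find_bind_mounts(compose: dict) -> list:
    """Return (service, host_path) pairs for every bind mount in a parsed compose file."""
    hits = []
    for name, service in (compose.get("services") or {}).items():
        for vol in (service or {}).get("volumes") or []:
            if isinstance(vol, dict):
                # Long syntax: {type: bind, source: ..., target: ...}
                if vol.get("type") == "bind":
                    hits.append((name, str(vol.get("source"))))
            elif ":" in vol:
                # Short syntax: SOURCE:TARGET[:MODE]; a path-like source is a bind mount.
                source = vol.split(":", 1)[0]
                if source.startswith(("/", ".", "~")):
                    hits.append((name, source))
    return hits
```

A string without a colon is an anonymous volume (target only) and is skipped; Windows-style host paths are not handled in this sketch.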

Production debug guide

Systematic debugging paths that demonstrate production experience.

Symptom · 01

Container exits immediately after start with no log output.

→

Fix

The CMD failed before writing to stdout. Run interactively: docker run -it <image> sh (add --entrypoint sh if the image defines an ENTRYPOINT, otherwise sh is passed to it as an argument), then execute the CMD manually. Check if the entrypoint script has a shebang and execute permissions. Check if the binary exists at the expected path.

Symptom · 02

Two containers on the same network cannot communicate.

→

Fix

Verify both are on the same network: docker network inspect <network>. Test DNS resolution: docker exec <container> nslookup <target-service>. Check if the target service binds to 0.0.0.0, not 127.0.0.1. Check if a firewall or security group is blocking the container port.

Symptom · 03

Container gets OOM-killed repeatedly (exit code 137).

→

Fix

Check memory usage: docker stats <container>. Set a memory limit to prevent host-wide impact: docker run --memory=512m. Profile the application for memory leaks. Check if the container is processing data larger than expected (large file uploads, unbounded caches).

Symptom · 04

Docker build reinstalls dependencies on every code change.

→

Fix

Check Dockerfile layer ordering. Ensure COPY requirements.txt comes before COPY . . Run docker history <image> to see which layers were rebuilt. Add .dockerignore to exclude .git, node_modules, __pycache__ from build context.

Symptom · 05

Container works locally but fails in CI or on another machine.

→

Fix

Compare base image digests: docker inspect --format='{{.Image}}' <container>. Check for architecture mismatches (ARM vs AMD64). Check for missing .env file or environment variables. Check Docker Engine version compatibility.

Symptom · 06

docker stop takes 10 seconds and the container is killed.

→

Fix

The application does not handle SIGTERM. Check if CMD uses shell form (CMD npm start) instead of exec form (CMD ["npm", "start"]). Shell form makes /bin/sh PID 1, which does not forward signals. Implement a SIGTERM handler in the application. If shutdown legitimately needs longer, raise the grace period with docker stop -t 30, or set it at creation time with docker run --stop-timeout 30 (stop_grace_period in Compose).
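
What a SIGTERM handler looks like depends on the application. A minimal sketch in Python, where the handler only sets a flag and the main loop does the cleanup; the names shutdown_requested, install_handlers, and serve are hypothetical, and the loop body is a placeholder for real work:

```python
import signal
import threading

# Set by the signal handler, polled by the main loop.
shutdown_requested = threading.Event()


def _handle_shutdown(signum, frame):
    # Keep the handler tiny: record the request and return.
    shutdown_requested.set()


def install_handlers():
    """Register handlers for the signals docker stop and Ctrl+C deliver."""
    signal.signal(signal.SIGTERM, _handle_shutdown)
    signal.signal(signal.SIGINT, _handle_shutdown)


def serve(poll_interval: float = 0.5) -> str:
    """Run until a shutdown signal arrives, then clean up and return."""
    while not shutdown_requested.wait(poll_interval):
        pass  # placeholder for real work: handle requests, process jobs
    # Placeholder for cleanup: flush logs, close connections, drain requests.
    return "clean shutdown"
```

Call install_handlers() once at startup and then serve(). This only helps if the process actually receives the signal, which is why the exec-form CMD matters.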

★ Docker Container Triage Cheat Sheet

First-response commands for common Docker issues in production.

Container crashed or restarting in a loop.

Immediate action

Check logs and exit code.

Commands

docker logs --tail 50 <container>

docker inspect <container> --format='{{.State.ExitCode}} {{.State.OOMKilled}}'

Fix now

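Exit code 0 = CMD completed (the process finished or daemonised; wrong CMD for a long-running service). Exit code 1 = app error (check logs). Exit code 137 = SIGKILL: an OOM kill if OOMKilled is true (--memory too low or a leak), otherwise docker kill or an expired stop timeout. Exit code 139 = SIGSEGV (segfault, often a native library or libc mismatch with the base image).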
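
Exit codes above 128 encode the terminating signal as 128 plus the signal number, which makes triage scriptable. A minimal sketch, assuming the exit code and OOMKilled flag have already been read from docker inspect; the function name explain_exit and its messages are hypothetical:

```python
import signal


def explain_exit(exit_code: int, oom_killed: bool = False) -> str:
    """Translate a container exit code into a first-guess diagnosis."""
    if exit_code == 0:
        return "process exited normally; check that CMD is a long-running process"
    if oom_killed:
        return "killed by the OOM killer; raise --memory or fix the leak"
    if exit_code > 128:
        # Shell convention: 128 + N means the process died from signal N.
        signum = exit_code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = f"signal {signum}"
        return f"terminated by {name}"
    return "application error; check docker logs"
```

For example, 143 decodes to SIGTERM (a normal docker stop), while 137 without OOMKilled points to docker kill or a stop timeout rather than memory.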


Docker Storage Mechanisms Compared

Aspect	Docker Volume	Bind Mount	tmpfs Mount
Managed by	Docker daemon	You (host path)	Docker daemon (RAM)
Data persists after container stop	Yes	Yes (it's a host file)	No — gone immediately
Data persists after docker rm	Yes	Yes (it's a host file)	N/A — already gone
Best use case	Production databases, uploads	Dev hot-reload, config injection	Secrets, session tokens, scratch space
Portability	High — works on any Docker host	Low — tied to host path structure	High — host-agnostic
Performance	Good (slight overhead)	Best (direct host I/O)	Excellent (RAM speed)
Visible to 'docker volume ls'	Yes	No	No
Works in Docker Compose	Yes — named volume syntax	Yes — relative path syntax	Yes — tmpfs key

⚙ Quick Reference

13 commands from this guide

File	Command / Code	Purpose
iothecodeforgeOptimisedNodeApp.dockerfile	FROM node:20-alpine AS base	Core Concepts: Images, Containers, and the Daemon
iothecodeforgestorage_setup.sh	docker volume create thecodeforge_db_data	Volumes vs Bind Mounts vs tmpfs
iothecodeforgedocker-compose.yml	version: '3.9'	Docker Networking Deep Dive
iothecodeforgemultistage.dockerfile	FROM golang:1.22-alpine AS builder	Multi-Stage Builds and Image Size
InspectContainerProcess.py	container_id = "a1b2c3d4e5f6"	What Is a Docker Container, Really? (Forget the Marketing)
VerifyImageIntegrity.py	client = docker.from_env()	Docker Hub Isn't a Backup Plan
container_count.py	def get_container_counts():	How Many Containers Are Actually Running? The Stats Command
label_filter.py	def get_containers_by_label(label_key, label_value):	Docker Object Labels
SecureContainer.py	cmd = [	Container Hardening
CheckEdition.py	version = subprocess.run(	Docker CE vs EE
Dockerfile	FROM golang:1.20 AS builder	Multi-Stage Builds
docker-compose.yml	version: '3.8'	Docker Compose vs Kubernetes
Dockerfile	FROM node:18-alpine AS base	Container Security

Key takeaways

Instruction order in a Dockerfile is a performance decision: least-volatile instructions (OS packages, dependency installs) must come before most-volatile ones (source code copy) or you eliminate all caching benefit and rebuild from scratch on every code change.

Named volumes survive 'docker rm'; bind mounts are host filesystem paths that survive by definition; tmpfs lives in memory and is the only Docker storage mechanism that keeps data off the host filesystem (it can still be written to swap if the host has swap enabled), which makes it critical for ephemeral secrets handling.

Docker's built-in DNS lets containers on the same network resolve each other by service name, not IP address. Never hardcode container IPs: they can change on every restart. Always reference services by name and let Docker handle the resolution.

Multi-stage builds are not an optimisation you do later: they're the default pattern for any compiled language. Shipping a Go or Java app in a single-stage image that includes the compiler is equivalent to shipping a kitchen with every meal you deliver.

Shell form CMD wraps your process in /bin/sh which does not forward SIGTERM. Your app gets SIGKILL after 10 seconds. Always use exec form (array syntax) for production containers.


Naren Founder & Principal Engineer

20+ years shipping production code across the stack, with years spent interviewing engineers. Drawn from code that ran under real load.


Last updated: July 18, 2026
