Advanced 10 min · March 06, 2026

Kubernetes Pods and Deployments - Missing Startup Probe 502

A readiness probe with initialDelaySeconds=5 caused 8 minutes of 502 errors in production.

Naren · Founder
Quick Answer
  • Pod: Ephemeral, non-self-healing. Shares a network namespace (localhost) and volumes across containers.
  • Deployment: Manages ReplicaSets which manage Pods. Handles rolling updates, rollbacks, and scaling.
  • ReplicaSet: The intermediate controller that ensures N replicas exist. You rarely interact with it directly.
  • Probes: Liveness (restart if deadlocked) vs Readiness (remove from service endpoints if not ready).
  • maxSurge: More surge = faster rollout but higher peak resource usage.
  • maxUnavailable: 0 = zero-downtime but slower rollout. Higher = faster but riskier.
  • Common mistake: creating Pods directly instead of through a Deployment. Direct Pods are not self-healing. If the node dies, the Pod is never replaced.
Plain-English First

Think of a Pod as a shipping container — it holds your application and its immediate companions (like a logging sidecar). A Deployment is the shipping company's logistics system — it makes sure the right number of containers are always at the dock, replaces any that get damaged, and can swap out old cargo for new cargo without stopping operations.

Most production Kubernetes workloads run as Pods managed by a controller, and for stateless services that controller is a Deployment. The Pod is the atomic scheduling unit: a group of containers sharing a network namespace and volumes. The Deployment is the declarative controller that ensures the right number of Pods exist, handles rolling updates, and supports rollback to a previous revision.

Misconfiguring either object causes production incidents: Pods without resource limits cause noisy-neighbor OOMKills, Deployments without proper probes cause 502 errors during rollouts, and direct Pod creation bypasses self-healing entirely. Understanding the reconciliation loop — how the Deployment controller continuously drives current state toward desired state — is the foundation for debugging every higher-level Kubernetes object.

Pod Basics: The Atomic Unit

In the world of Kubernetes, the Pod is the atomic unit of scheduling. While you might be used to thinking in terms of 'containers,' Kubernetes thinks in 'Pods.' A Pod can host a single container, or a tightly coupled group of containers (like an app container and a 'sidecar' logging agent) that need to share the same local network (localhost) and storage volumes.

Crucially, Pods are ephemeral. They are born, they live, and they die. They are never 'repaired'; they are replaced.

io/thecodeforge/k8s/pod.yaml · YAML
# Package: io.thecodeforge.k8s.manifests
# pod.yaml — used primarily for local debugging, rarely in production
apiVersion: v1
kind: Pod
metadata:
  name: forge-api-pod
  labels:
    app: forge-api
    env: production
spec:
  containers:
    - name: api-container
      image: io.thecodeforge/api:1.2.0
      ports:
        - containerPort: 8080
      resources:
        requests:
          memory: "128Mi"
          cpu: "100m"
        limits:
          memory: "256Mi"
          cpu: "500m"
      env:
        - name: APP_COLOR
          value: "blue"
        - name: DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: cloud-secrets
              key: db-password
Output
# Note: Direct Pods are not self-healing. If the node dies, the pod stays dead.
Pods Are Cattle, Not Pets
  • Pod IP is assigned when the Pod is created and survives container restarts, but a replacement Pod gets a new IP. Use Service DNS for discovery.
  • Container filesystem is ephemeral. Use PersistentVolumes for data that must survive restarts.
  • Containers in the same Pod share localhost networking. Containers in different Pods do not.
  • Sidecar containers (logging, proxy) share the Pod's lifecycle. If one crashes, the kubelet restarts that container according to the Pod's restartPolicy; the Pod itself is not rescheduled.
  • Init containers run before the main container. They block Pod startup until they complete successfully.
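The points above about shared volumes, sidecars, and init containers can be sketched in one manifest. This is a minimal illustration, not a production pattern: the Pod name, the busybox image tag, the volume name, and the log path are all assumptions chosen for the example.

```yaml
# Hypothetical Pod: an init container, an app container, and a logging sidecar
# sharing one emptyDir volume. Names, image tag, and paths are illustrative.
apiVersion: v1
kind: Pod
metadata:
  name: sidecar-demo
spec:
  volumes:
    - name: app-logs
      emptyDir: {}              # lives as long as the Pod and is shared by its containers
  initContainers:
    - name: prepare-logs        # must exit successfully before the containers below start
      image: busybox:1.36
      command: ["sh", "-c", "touch /var/log/app/app.log"]
      volumeMounts:
        - name: app-logs
          mountPath: /var/log/app
  containers:
    - name: app                 # stands in for the real application; appends a line every 5s
      image: busybox:1.36
      command: ["sh", "-c", "while true; do date >> /var/log/app/app.log; sleep 5; done"]
      volumeMounts:
        - name: app-logs
          mountPath: /var/log/app
    - name: log-sidecar         # reads what the app writes through the shared volume
      image: busybox:1.36
      command: ["sh", "-c", "tail -f /var/log/app/app.log"]
      volumeMounts:
        - name: app-logs
          mountPath: /var/log/app
          readOnly: true
```

If the sidecar process exits, only that container is restarted; the app container and the emptyDir contents are left in place.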
Production Insight
The most dangerous Pod anti-pattern is creating Pods directly (via kubectl run or a Pod manifest) for production workloads. Direct Pods have no self-healing: if the node crashes, the Pod is gone forever with no replacement. The Deployment controller is what provides self-healing, rolling updates, and rollback. Every production workload must use a Deployment, StatefulSet, or DaemonSet — never a raw Pod.
Key Takeaway
Pods are the atomic scheduling unit but they are ephemeral and non-self-healing. They share a network namespace and volumes across containers. Never create Pods directly for production workloads; use a controller such as a Deployment.

Deployments: Orchestrating the Desired State

A Deployment is a high-level object that manages a ReplicaSet, which in turn manages Pods. Its job is to ensure the 'Current State' matches the 'Desired State.' If you tell a Deployment you want 3 replicas, and a node crashes taking one Pod with it, the ReplicaSet owned by the Deployment notices the discrepancy and creates a replacement Pod, which the scheduler places on a healthy node.

Deployments are also the primary vehicle for Rolling Updates. By tuning the maxSurge and maxUnavailable parameters, you can swap out version 1.0 for 2.0 without dropping user requests, provided readiness probes and graceful shutdown are configured correctly.

io/thecodeforge/k8s/deployment.yaml · YAML
# deployment.yaml — The production standard
apiVersion: apps/v1
kind: Deployment
metadata:
  name: forge-api-deployment
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: forge-api
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 0
  template:
    metadata:
      labels:
        app: forge-api
    spec:
      containers:
        - name: api
          image: io.thecodeforge/api:2.0.1
          ports:
            - containerPort: 8080
          livenessProbe:
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 15
            periodSeconds: 20
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 10
Output
# kubectl apply -f deployment.yaml
# deployment.apps/forge-api-deployment created
The Reconciliation Loop
  • Deployment creates ReplicaSets. ReplicaSets create Pods. You interact with Deployments.
  • Each rollout creates a new ReplicaSet. Old ReplicaSets are kept for rollback (default revisionHistoryLimit: 10 revisions).
  • The Deployment controller manages the transition: scale up new ReplicaSet, scale down old ReplicaSet.
  • maxSurge: How many extra Pods can exist above the desired count during rollout.
  • maxUnavailable: How many Pods can be missing below the desired count during rollout.
Production Insight
The maxSurge and maxUnavailable parameters directly control rollout speed vs resource usage. maxSurge: 25%, maxUnavailable: 0 means: during rollout, create up to 25% extra new Pods (rounded up) and terminate an old Pod only after a new one is Ready. This keeps full serving capacity throughout but requires up to 125% of normal resource capacity. For resource-constrained clusters, set maxSurge: 0, maxUnavailable: 25% to terminate old Pods first (no extra resource usage, but a brief capacity reduction). The one combination that cannot work is maxSurge: 0, maxUnavailable: 0: the rollout could neither create new Pods (no surge) nor terminate old Pods (no unavailability allowed), so the API server rejects the manifest at validation.
Key Takeaway
Deployments manage ReplicaSets which manage Pods. The reconciliation loop continuously drives current state toward desired state. maxSurge and maxUnavailable control rollout speed vs resource usage. Never set both to 0: the API server rejects that combination because the rollout could never make progress.
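As a rough sketch of the capacity-constrained alternative, the manifest below sets maxSurge: 0 with a percentage maxUnavailable. It assumes 10 replicas purely to make the rounding visible; percentages for maxUnavailable round down, while percentages for maxSurge round up.

```yaml
# Sketch only: replica count is an assumption chosen to show the rounding.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: forge-api-deployment
  namespace: production
spec:
  replicas: 10
  selector:
    matchLabels:
      app: forge-api
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 0           # never more than 10 Pods, so no extra capacity is needed
      maxUnavailable: 25%   # 25% of 10 = 2.5, rounded down to 2 Pods replaced at a time
  template:
    metadata:
      labels:
        app: forge-api
    spec:
      containers:
        - name: api
          image: io.thecodeforge/api:2.0.1
          ports:
            - containerPort: 8080
```

With these values the rollout never drops below 8 available Pods and never exceeds 10 in total, which is the trade described above: less headroom required, at the price of reduced capacity while the rollout runs.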

Probe Comparison: Liveness vs Readiness vs Startup

Kubernetes provides three distinct probe types (liveness, readiness, and startup), each serving a different purpose in the Pod lifecycle. Misunderstanding their roles is the root cause of many production incidents, including the 502 incident covered in the post-mortem at the end of this guide.

Liveness Probe: Determines if the container is alive. If it fails, the kubelet kills the container and restarts it (according to the Pod's restartPolicy). Use it to recover from deadlocks or infinite loops. Never check external dependencies in a liveness probe — if the database is slow, the probe fails, the container restarts, and the restart increases load on the database, causing cascading failure.

Readiness Probe: Determines if the container is ready to serve traffic. If it fails, the Pod's IP is removed from the Service endpoints — traffic stops, but the container is NOT restarted. Use it for applications that need to load cache, connect to databases, or run startup migrations before handling requests. The readiness probe controls traffic flow, not container lifecycle.

Startup Probe: Gates both liveness and readiness probes. While the startup probe has not yet succeeded, liveness and readiness probes are disabled. Use it for applications with startup times greater than 30 seconds. Set a high failureThreshold (e.g., 60) with a short periodSeconds (e.g., 5) to allow up to 300 seconds for startup. Once the startup probe succeeds, liveness and readiness probes begin their normal checks.

All three probes support the same handler types: httpGet, tcpSocket, exec, and grpc.
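The handler types other than httpGet can be sketched on a single container as follows. This is illustrative only: the port numbers, the marker file path, and the presence of a cat binary in the image are assumptions, and the grpc handler only works if the application implements the standard gRPC health checking service.

```yaml
# Hypothetical Pod using exec, tcpSocket, and grpc handlers.
apiVersion: v1
kind: Pod
metadata:
  name: probe-handlers-demo
spec:
  containers:
    - name: app
      image: myapp:1.0.0
      ports:
        - containerPort: 8080   # HTTP/TCP port (assumed)
        - containerPort: 9090   # gRPC port (assumed)
      startupProbe:
        exec:
          # Passes once the app has written a marker file (assumed app behavior).
          command: ["cat", "/tmp/started"]
        failureThreshold: 60
        periodSeconds: 5
      livenessProbe:
        tcpSocket:
          port: 8080            # passes if a TCP connection can be opened
        periodSeconds: 15
      readinessProbe:
        grpc:
          port: 9090            # calls the gRPC health checking service on this port
        periodSeconds: 10
```

A tcpSocket check only proves the port is open, so it suits liveness better than readiness, where the check should reflect whether the app can actually serve requests.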

io/thecodeforge/k8s/probes.yaml · YAML
# probes.yaml — production configuration for a slow-starting app
apiVersion: v1
kind: Pod
metadata:
  name: probe-demo
spec:
  containers:
  - name: app
    image: myapp:1.0.0
    ports:
    - containerPort: 8080
    startupProbe:
      httpGet:
        path: /startupz
        port: 8080
      failureThreshold: 60
      periodSeconds: 5
    livenessProbe:
      httpGet:
        path: /healthz
        port: 8080
      failureThreshold: 3
      periodSeconds: 15
      initialDelaySeconds: 0
    readinessProbe:
      httpGet:
        path: /readyz
        port: 8080
      failureThreshold: 3
      periodSeconds: 10
      initialDelaySeconds: 0
Output
# Probe comparison:
# startupProbe: Allows 300s for boot. Gates liveness/readiness.
# livenessProbe: Restarts if app deadlocks after startup.
# readinessProbe: Removes from Service if app not ready.
When to Use Each Probe
Use a startup probe when the application takes more than 30 seconds to become ready. Without it, the liveness probe fires during boot and triggers restarts. Use readiness probes for any application that should not receive traffic until fully warmed. Use liveness probes only for recovering from deadlocks — never for dependency checks.
Production Insight
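In the 502 incident covered in the post-mortem, there was no startup probe, and the readiness probe started checking 5 seconds after container start even though the application needed 45 seconds to warm up. Kubernetes does not add a Pod to the Service endpoints just because it is Running; it adds the Pod once the readiness probe succeeds. Traffic therefore reaches a cold Pod only when the readiness endpoint reports success before warm-up has finished, so the check behind /ready must reflect real readiness (cache loaded, database connected). A startup probe with failureThreshold: 60 and periodSeconds: 5, pointed at an endpoint that succeeds only after initialization, keeps readiness checks and traffic away from the Pod until boot completes and closes this error window.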
Key Takeaway
Startup probes gate liveness and readiness probes for slow-starting apps. Liveness probes restart dead containers; readiness probes control traffic flow. Never check external dependencies in liveness probes — use readiness probes for that.

Liveness/Readiness/Startup Probe Comparison Table

While the previous section explained each probe's purpose, a comparison table helps you quickly decide which probe to use in any scenario. The table below summarizes the differences across key dimensions: behavior on failure, impact on the Pod, typical use cases, handler types, and best practices for configuration.

| Aspect | Liveness Probe | Readiness Probe | Startup Probe |
| --- | --- | --- | --- |
| Purpose | Is the container alive? | Is the container ready to serve traffic? | Has the container finished booting? |
| On failure | Kubelet kills container, restarts per restartPolicy | Pod removed from Service endpoints (no restart) | Liveness and readiness stay disabled until it succeeds; after failureThreshold failures the kubelet kills the container and restarts it per restartPolicy |
| Impact | Container restarts (may cause CrashLoopBackOff) | Traffic stops, but container stays running | Pod receives no traffic until it passes; once it passes, liveness/readiness start |
| Typical use case | Deadlock detection, infinite loop recovery | Cache warmup, DB connection, migration completion | Apps with >30s boot time, legacy apps, heavy initialization |
| Handler support | httpGet, tcpSocket, exec, grpc | httpGet, tcpSocket, exec, grpc | httpGet, tcpSocket, exec, grpc |
| Configuration advice | Keep simple: high threshold, long period; avoid dependency checks | Set initialDelaySeconds to 0 when using startup probe; check real readiness | High failureThreshold (e.g., 60) × short periodSeconds (e.g., 5) to cover max startup |
| Common mistakes | Using it to check the database (cascading failure) | Relying on a short initialDelaySeconds with a shallow check that passes before the app is warm (traffic arrives too early) | Not used at all for slow-start apps (liveness kills during boot) |

When to combine: Always pair readiness probes with startup probes for applications that warm up slowly. The startup probe disables liveness and readiness during boot, preventing premature restarts and premature traffic. For fast-starting apps (<30s), a simple readiness probe with initialDelaySeconds may suffice, but adding a startup probe costs nothing and adds safety.

Probe Configuration Cheat Sheet
For most production APIs: startupProbe (failureThreshold=60, periodSeconds=5), livenessProbe (failureThreshold=3, periodSeconds=15), readinessProbe (failureThreshold=3, periodSeconds=10). Adjust periods based on your app's expected behavior, but keep the multipliers reasonable to avoid killing healthy Pods.
Production Insight
The probe comparison table is your quick reference for incident response. When you see 502 errors during a rollout, the first thing to check is whether the readiness probe is correctly delaying traffic until the app is truly ready. The second is whether a startup probe exists. Without this mental model, engineers often spend hours debugging application code when the fix is a YAML change. Document your probe configuration in your runbook and validate it during every canary deploy.
Key Takeaway
Use the probe comparison table to quickly select the right probe type: startup for boot gating, readiness for traffic control, liveness for deadlock recovery. Always combine startup + readiness for slow-start apps.

Deployment Strategies: RollingUpdate vs Recreate

When updating a Deployment, Kubernetes supports two strategies: RollingUpdate (default) and Recreate. The choice between them directly impacts availability, resource usage, and rollout speed.

RollingUpdate: Replaces old Pods with new Pods incrementally. Controlled by two parameters:
  • maxSurge: How many extra Pods (count or percentage) can be created above the desired replica count during the update.
  • maxUnavailable: How many Pods (count or percentage) can be unavailable during the update.

During a rolling update, the Deployment controller creates new Pods in a new ReplicaSet, waits for them to become ready, then scales down the old ReplicaSet. The process repeats until all old Pods are replaced. This strategy enables zero-downtime deployments, but at the cost of extra cluster capacity for up to maxSurge additional Pods.

Recreate: Terminates all old Pods simultaneously, then creates new Pods. This is a simple 'kill all, start all' pattern. During the update, the application is completely unavailable: no traffic can be served until at least one new Pod is ready. Use this strategy only when the application cannot have multiple versions running concurrently (e.g., stateful applications with incompatible schemas, or when file locks prevent coexistence). Recreate is also useful for cost-constrained environments where you cannot afford the overhead of extra Pods during a rollout.

When to choose
  • RollingUpdate: Stateless APIs, microservices, web frontends — anything that needs 100% uptime during deploys.
  • Recreate: Stateful databases (during schema migrations), batch jobs, or any application that enforces exclusive access to a resource.
io/thecodeforge/k8s/deployment-strategy.yaml · YAML
# rollingupdate-strategy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: rolling-deploy
spec:
  replicas: 3
  # apps/v1 requires a selector that matches the template labels
  selector:
    matchLabels:
      app: rolling
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 0
  template:
    metadata:
      labels:
        app: rolling
    spec:
      containers:
      - name: app
        image: nginx:1.25
---
# recreate-strategy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: recreate-deploy
spec:
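  replicas: 3
  # apps/v1 requires a selector that matches the template labels
  selector:
    matchLabels:
      app: recreate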
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: recreate
    spec:
      containers:
      - name: app
        image: nginx:1.25
Output
# RollingUpdate: gradual replacement, zero-downtime.
# Recreate: kill all, then start all — downtime expected.
Recreate Means Downtime
The Recreate strategy terminates all Pods before creating new ones. Your application will be completely unavailable from the moment the last old Pod is deleted until the first new Pod passes its readiness probe. For most production services, this is unacceptable. Only use Recreate when you understand and accept the downtime tradeoff.
Production Insight
The RollingUpdate strategy is not free: it requires additional cluster capacity for up to maxSurge extra Pods. For a 10-replica Deployment with maxSurge: 25%, the surge is 2.5 Pods, which Kubernetes rounds up to 3, so you need capacity for 13 Pods during the rollout. In resource-constrained clusters, you can trade availability for capacity by setting maxUnavailable: 25% and maxSurge: 0, which terminates old Pods before creating new ones (brief capacity dip but no overhead). Always calculate the peak resource requirement for rolling updates and ensure the cluster can handle it.
Key Takeaway
RollingUpdate provides zero-downtime at the cost of extra resource capacity. Recreate is simple but causes downtime. Choose based on availability requirements and cluster headroom.

RollingUpdate vs Recreate Visual Comparison

While the previous section explained when to use each strategy, this comparison highlights the key differences in resource usage, timeline, and availability during a deployment. Use the timeline and resource figures below to communicate rollout behavior to your team and to decide which strategy fits your workload.

Timeline Comparison (3 replicas, 10s per Pod startup):
  • Recreate: 0s: all 3 old pods terminated. 10s: first new pod ready. 20s: all new pods ready. Total downtime: ~10s (from 0s to first pod ready).
  • RollingUpdate (maxSurge=1, maxUnavailable=0): 0s: 3 old pods serving, 1 new pod created. 10s: new pod ready, old pod terminated. 15s: second new pod created. 25s: second new pod ready, second old pod terminated. 30s: third new pod created. 40s: third new pod ready, third old pod terminated. Total downtime: 0s (with maxUnavailable=0, at least 3 ready pods are serving at all times).

Resource Peak Comparison (3 replicas, each requesting 256Mi memory and 500m CPU):
  • Recreate: 0s-10s: 0 pods, 0 resources used. 10s-20s: 3 pods, 768Mi/1.5 CPU. Peak: same as steady state.
  • RollingUpdate (maxSurge=1, maxUnavailable=0): During each surge step, 3 serving pods plus 1 surge pod = 4 pods simultaneously. Peak memory: 4×256Mi = 1024Mi. Peak CPU: 4×500m = 2000m. Requires 33% more capacity than steady state.

Capacity Planning for Rolling Updates
When using RollingUpdate with maxUnavailable=0, ensure your cluster has enough CPU and memory to accommodate the peak number of Pods (desired + maxSurge). If the cluster is already near capacity, the rollout may stall because new Pods cannot be scheduled. Monitor cluster utilization before triggering a rolling update.
Production Insight
The visual comparison makes it clear: RollingUpdate trades resource overhead for zero downtime. In cost-conscious environments, you can optimize by setting maxSurge to a lower value (e.g., 1 Pod instead of a percentage) or by temporarily scaling down non-critical workloads before the deploy. Always benchmark your app's startup time — it directly affects the rollout duration. For apps that take 2 minutes to boot, a rolling update with 3 replicas can take 6+ minutes. In such cases, consider canary deployments with a smaller batch size first.
Key Takeaway
RollingUpdate uses more resources during rollout but maintains availability. Recreate uses no extra resources but incurs downtime. Use the timeline and resource comparison to plan capacity and set expectations with your team.

Resource Management: Requests, Limits, and QoS Classes

Every container in Kubernetes should specify CPU and memory requests and limits. These settings directly affect scheduling, runtime performance, and cluster stability. Misconfiguring them is a top cause of production incidents.

Requests are the minimum amount of resources guaranteed to the container. The Kubernetes scheduler uses requests to make placement decisions: it only places a Pod on a node whose unreserved capacity covers the sum of the requests of all containers in that Pod. Requests also ensure the container gets at least that much CPU and memory under contention.

Limits are the maximum amount of resources a container is allowed to consume. If a container exceeds its CPU limit, it gets throttled (not killed). If it exceeds its memory limit, it gets OOMKilled (exit code 137). Limits prevent a single container from starving other containers on the same node.

QoS Classes: Kubernetes categorizes Pods into three Quality of Service classes based on request and limit settings:
  • Guaranteed: Every container sets both CPU and memory, with requests == limits (e.g., requests cpu: 500m, limits cpu: 500m). These Pods are least likely to be evicted and get the most predictable performance.
  • Burstable: The Pod does not meet the Guaranteed criteria, but at least one container sets a CPU or memory request or limit (typically requests < limits). These Pods get their requested minimum but can burst to their limit if node capacity is available.
  • BestEffort: No requests or limits set on any container. These Pods receive no guarantees and are the first to be evicted under node pressure. Avoid BestEffort in production.

Production Best Practices:
1. Always set requests and limits for every container. Never leave them unset.
2. Set requests equal to limits for critical workloads to achieve Guaranteed QoS and predictable performance.
3. Base request values on steady-state resource usage observed in production over 7 days. Do not guess.
4. Set memory limits to 1.5x the observed maximum to handle transient spikes without OOMKill.
5. CPU limits are less critical than memory limits, but still set them to prevent noisy neighbors. CPU throttling is better than OOMKill.
6. Use Vertical Pod Autoscaler (VPA) in recommendation mode to generate initial request/limit values, then refine manually.
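A recommendation-only VPA for practice 6 could look roughly like the following. This is a sketch under assumptions: the Vertical Pod Autoscaler is an add-on, so its components and CRDs must already be installed in the cluster, and the object name here is hypothetical. The target is the forge-api-deployment used earlier in this guide.

```yaml
# Sketch: VPA that only computes recommendations and never changes Pods.
# Requires the Vertical Pod Autoscaler add-on (not part of core Kubernetes).
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: forge-api-vpa            # hypothetical name
  namespace: production
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: forge-api-deployment
  updatePolicy:
    updateMode: "Off"            # recommend only; do not evict or resize Pods
```

With updateMode set to "Off", the suggested values appear in the object's status, where they can be reviewed and copied into the Deployment by hand.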

io/thecodeforge/k8s/resources.yaml · YAML
# resources.yaml — Guaranteed QoS example
apiVersion: v1
kind: Pod
metadata:
  name: guaranteed-pod
spec:
  containers:
  - name: app
    image: myapp:1.0
    resources:
      requests:
        memory: "256Mi"
        cpu: "500m"
      limits:
        memory: "256Mi"
        cpu: "500m"
---
# Burstable example
apiVersion: v1
kind: Pod
metadata:
  name: burstable-pod
spec:
  containers:
  - name: app
    image: myapp:1.0
    resources:
      requests:
        memory: "128Mi"
        cpu: "100m"
      limits:
        memory: "512Mi"
        cpu: "1"
Output
# QoS classes:
# Guaranteed: requests == limits, most predictable, last to evict.
# Burstable: requests < limits, can burst, moderate eviction priority.
# BestEffort: no requests/limits, first to evict, unpredictable performance.
Guaranteed QoS for Critical Workloads
For production APIs and databases, set requests equal to limits (Guaranteed QoS). This gives predictable CPU scheduling and makes the Pod the last candidate for eviction under node pressure. It does not protect a container that exceeds its own memory limit, which is still OOMKilled. The trade-off is that you cannot overcommit resources, but for critical services, predictability is worth the cost.
Production Insight
The most common resource incident in production is the 'noisy neighbor' scenario: one Pod without limits consumes all available memory on a node, causing other Pods to be OOMKilled. This cascades as the evicted Pods restart on other nodes, potentially overwhelming the cluster. Properly set requests and limits, combined with Guaranteed QoS for critical workloads, prevent this pattern. Additionally, setting memory limits too low causes needless OOMKills — monitor actual usage with tools like kubectl top and adjust limits based on real data, not guesses.
Key Takeaway
Requests guarantee scheduling and minimum resources; limits prevent runaway consumption. Always set both. Guaranteed QoS (requests == limits) provides the most predictable performance and is recommended for production workloads.

Resource Management Best Practices (Requests/Limits)

While the previous section explained the mechanics of requests, limits, and QoS classes, this section provides a concise list of best practices that you can apply immediately to your production Deployments. These recommendations come from real-world incidents and thousands of production clusters.

1. Never leave resources unset. A Pod whose containers set neither requests nor limits runs as BestEffort QoS: it gets no guarantees and is first to be evicted. A container without limits can consume all node memory and cause other Pods to be OOMKilled or evicted. Always set both.

2. Base requests on steady-state usage, not peak. Monitor your application for at least 7 days using kubectl top pod or a metrics system (Prometheus). Set requests to the 50th percentile of observed usage. Set limits to 1.5x the 95th percentile for memory (to handle spikes) and 2x the 95th percentile for CPU (since throttling is acceptable).

3. Use Guaranteed QoS for all critical Deployments. Setting requests == limits makes the Pod the last candidate for eviction under node pressure and gives it the most stable CPU scheduling. The only cost is that you cannot overcommit, but for critical services this is a feature, not a bug.

4. Test resource configurations under load. A common mistake: setting memory too low based on idle measurement. During traffic spikes, memory usage can double. Run load tests that mimic peak traffic and verify that memory stays below the limit.

5. Avoid setting identical CPU and memory values for all containers. Each container type has different resource profiles. For example, a sidecar proxy (Envoy) needs different requests than the main application container. Use per-container settings.

6. Implement resource quotas at the namespace level. Even with per-Pod limits, a runaway Deployment can create many Pods that collectively use too many resources. Use ResourceQuota objects to enforce aggregate limits per namespace.

7. Monitor for OOMKill events and CPU throttling. Set up alerts on kube_pod_container_status_terminated_reason (OOMKilled) and container_cpu_cfs_throttled_seconds_total. These events indicate misconfigured resource limits.

8. Use VPA in recommendation mode (not auto) for initial tuning. The Vertical Pod Autoscaler can analyze historical usage and suggest request/limit values. Review its recommendations before applying. Do not enable auto mode for stateless workloads unless you understand the disruption it causes (Pod restart on every recommendation).

9. Consider using alternative scheduling policies. For batch workloads that can tolerate lower priority, use priorityClassName to distinguish critical vs. best-effort. Combined with Guaranteed QoS, priority ensures your most important workloads survive contention.

10. Document your resource strategy. Include typical request/limit values for each service, the reasoning behind them, and instructions for adjusting. This prevents future engineers from guessing or copying incorrect values from other services.

io/thecodeforge/k8s/resource-best-practices.yaml · YAML
# deployment-with-resources.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-deployment
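spec:
  # apps/v1 requires a selector and matching template labels;
  # the label value below is illustrative.
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec: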
      containers:
      - name: app
        image: myapp:1.0
        resources:
          requests:
            memory: "512Mi"
            cpu: "500m"
          limits:
            memory: "768Mi"
            cpu: "1000m"
      - name: sidecar
        image: envoy:latest
        resources:
          requests:
            memory: "128Mi"
            cpu: "100m"
          limits:
            memory: "256Mi"
            cpu: "500m"
Output
# Best practice: different containers have different resource profiles.
# Use ResourceQuota to enforce namespace-wide limits.
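Practices 6 and 9 can be sketched as two small objects: a ResourceQuota that caps a namespace in aggregate, and a PriorityClass for critical services. All names and numbers below are placeholders rather than recommendations; size them from your own measurements.

```yaml
# Sketch: aggregate caps for one namespace. Values are placeholders.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: production-quota         # hypothetical name
  namespace: production
spec:
  hard:
    requests.cpu: "20"           # sum of CPU requests across all Pods
    requests.memory: 40Gi
    limits.cpu: "40"             # sum of CPU limits across all Pods
    limits.memory: 80Gi
    pods: "100"                  # maximum number of Pods in the namespace
---
# Sketch: a priority class that critical Deployments can opt into.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: critical-service         # hypothetical name
value: 100000                    # higher value = higher scheduling priority
globalDefault: false
description: "Latency-sensitive production services."
```

A Pod opts in by setting priorityClassName: critical-service in its spec. Note that once a quota constrains requests or limits for a resource, Pods in that namespace that omit those values are rejected at admission, which also enforces practice 1.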
Memory Limits: The Most Common Mistake
Setting memory limits too low is far more dangerous than setting them too high. An OOMKill causes an immediate container restart, and repeated restarts can cascade into cluster instability. When in doubt, start with a higher limit (e.g., 2x expected peak) and gradually lower it based on monitoring. Never guess; always measure.
Production Insight
Resource management is not a one-time configuration. As your application evolves, its resource usage changes. Set up periodic reviews of request/limit values (e.g., every quarter). The cost of over-provisioning is often lower than the cost of an OOMKill-induced incident. In most production clusters, the number of CPU throttling events directly correlates with how well resource limits are tuned. Throttling is driven by the CPU limit, not the request, so if you see heavy throttling, raise the CPU limit. If you see zero throttling, you are probably over-provisioning, which is fine for critical services but wasteful for batch jobs.
Key Takeaway
Resource management best practices: base requests on measured usage, set limits to 1.5-2x peak, use Guaranteed QoS for critical workloads, and implement namespace quotas. Monitor OOMKills and CPU throttling as indicators of misconfiguration.

Production Operations: Essential kubectl Patterns

Managing Deployments in production requires more than just apply. You need to be able to inspect the rollout history, trigger instant rollbacks, and scale on demand.

io/thecodeforge/k8s/deployment-ops.sh · BASH
# 1. Trigger a manual rolling update by changing the image
kubectl set image deployment/forge-api-deployment api=io.thecodeforge/api:2.1.0

# 2. Watch the rollout in real-time
kubectl rollout status deployment/forge-api-deployment

# 3. Emergency! Roll back to the previous stable version
kubectl rollout undo deployment/forge-api-deployment

# 4. View the rollout history to find a specific revision
kubectl rollout history deployment/forge-api-deployment

# 5. Scale the workload horizontally (manual scaling)
kubectl scale deployment/forge-api-deployment --replicas=10

# 6. Deep inspection of pod failures
kubectl describe pod <pod-name>
kubectl logs -f deployment/forge-api-deployment --all-containers
Output
# deployment.apps/forge-api-deployment rolled back
Rollback Is Not Free
  • Rollback = scale up old ReplicaSet + scale down new ReplicaSet. No image pull needed if cached.
  • revisionHistoryLimit (default 10): How many old ReplicaSets to keep. Increase to 20-50 for critical services.
  • kubectl rollout pause/resume: Pause a rollout mid-way to test a subset of new Pods before completing.
  • kubectl rollout undo --to-revision=N: Rollback to a specific revision, not just the previous one.
  • kubectl rollout restart: Triggers a rolling restart without changing the image. Useful for picking up ConfigMap changes.
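The rollback-related settings above can also live in the manifest itself. The sketch below raises revisionHistoryLimit and records a change cause through the kubernetes.io/change-cause annotation, which kubectl rollout history displays for each revision. The annotation text and the chosen limit are placeholders.

```yaml
# Sketch: keep more rollback history and label each revision.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: forge-api-deployment
  namespace: production
  annotations:
    # Shown in the CHANGE-CAUSE column of `kubectl rollout history`; text is a placeholder.
    kubernetes.io/change-cause: "Upgrade api to 2.1.0"
spec:
  replicas: 3
  revisionHistoryLimit: 20       # keep 20 old ReplicaSets instead of the default 10
  selector:
    matchLabels:
      app: forge-api
  template:
    metadata:
      labels:
        app: forge-api
    spec:
      containers:
        - name: api
          image: io.thecodeforge/api:2.1.0
          ports:
            - containerPort: 8080
```

Updating the annotation together with the image on each release makes it much easier to pick the right target for kubectl rollout undo --to-revision=N.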
Production Insight
The kubectl rollout pause command is the foundation of canary deployments without a service mesh. Pause the rollout after one new Pod is created. Because that Pod carries the same labels as the old ones, the Service sends it a share of traffic roughly proportional to its fraction of the ready Pods. Monitor error rates, then either resume or undo. This gives you canary behavior with standard Kubernetes objects. Combine with maxSurge: 1 to ensure only one canary Pod is created during the pause.
Key Takeaway
Production operations require more than kubectl apply. Use rollout undo for fast rollbacks, rollout pause for canary deployments, and rollout restart for ConfigMap-driven restarts. Keep old images available: a rollback fails with image pull errors if the old image has been deleted from the registry and is no longer cached on the node.
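The pause-based canary described above can be sketched as three small shell functions. This is a minimal sketch, not a complete release tool: the function names are hypothetical, the Deployment and container names are reused from the commands earlier in this section, and it assumes maxSurge: 1 with maxUnavailable: 0 so that only one new Pod exists when the pause takes effect.

```shell
#!/usr/bin/env bash
# Pause-based canary sketch. Function names are illustrative, not a standard tool.
# Assumes the Deployment uses maxSurge: 1 and maxUnavailable: 0.

DEPLOY="deployment/forge-api-deployment"   # name reused from the examples above

canary_start() {
  local image="$1"                         # e.g. io.thecodeforge/api:2.1.0
  # Start the rolling update, then freeze it after the first new Pod is created.
  kubectl set image "$DEPLOY" "api=${image}"
  kubectl rollout pause "$DEPLOY"
}

canary_promote() {
  # Canary looks healthy: let the rollout continue and wait for it to finish.
  kubectl rollout resume "$DEPLOY"
  kubectl rollout status "$DEPLOY"
}

canary_abort() {
  # A paused Deployment cannot be rolled back, so resume it first.
  kubectl rollout resume "$DEPLOY"
  kubectl rollout undo "$DEPLOY"
}
```

Run canary_start with the new image, watch error rates while the rollout is paused, then call either canary_promote or canary_abort. The pause is not atomic with the image change, so on a very fast-starting application more than one new Pod may exist by the time the pause lands.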
● Production incident · POST-MORTEM · severity: high

Rolling Update Caused 502 Errors for 8 Minutes During Production Deploy

Symptom
HTTP 502 and connection refused errors spiked immediately after kubectl apply. The rollout completed in 3 minutes but 502 errors persisted for 5 additional minutes. kubectl get pods showed all Pods in Running state. No CrashLoopBackOff. No OOMKill events.
Assumption
The new application version had a bug that caused it to reject requests.
Root cause
The Deployment had maxUnavailable: 0 (correct for zero-downtime), but the readiness probe was misconfigured: initialDelaySeconds: 5, periodSeconds: 10, and failureThreshold: 1. The application needed 45 seconds to warm its cache and connect to the database, yet the probe only confirmed that the process was up. The first check passed 5 seconds after container start, so Kubernetes marked the Pod Ready and added it to the Service endpoints about 40 seconds before it could serve requests. Kubernetes never routes to a Pod before its readiness probe succeeds, so a probe that passes too early is what opens this window. kube-proxy then routed traffic to a Pod that was still warming up. In addition, the old Pods were terminated as soon as the new Pods passed that first readiness check, leaving a window in which neither old nor new Pods were fully serving traffic.
Fix
1. Added a startup probe with failureThreshold: 60 and periodSeconds: 5 (up to 300 seconds of startup time) so that liveness and readiness probes do not run until the app has fully booted.
2. Changed the readiness probe to initialDelaySeconds: 0 (the startup probe now handles the delay) with failureThreshold: 3 and periodSeconds: 5.
3. Added terminationGracePeriodSeconds: 60 with a pre-stop hook that sleeps 15 seconds, so in-flight connections drain before the container is killed.
4. Set maxSurge: 1 alongside maxUnavailable: 0, so an old Pod is terminated only after its replacement is Ready.
5. Added a PodDisruptionBudget with minAvailable: 2 to keep node drains and other voluntary evictions from taking down multiple Pods at once.
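The probe, shutdown, and surge settings from the fix can be sketched as a single strategic-merge patch. This is a minimal sketch, not the team's actual manifest: the container name api comes from the kubectl set image example earlier, while port 8080 and the /ready path are assumptions. /ready is assumed to return 200 only once the cache is warm and the database connection is up, and the image is assumed to ship a sleep binary for the pre-stop hook.

```shell
#!/usr/bin/env bash
# Prints a strategic-merge patch with the settings listed in the fix.
# Assumptions (not from the incident report): port 8080, the /ready endpoint,
# and a sleep binary inside the container image.

render_probe_patch() {
  cat <<'YAML'
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  template:
    spec:
      terminationGracePeriodSeconds: 60
      containers:
        - name: api
          startupProbe:            # gates liveness and readiness until boot completes
            httpGet:
              path: /ready
              port: 8080
            periodSeconds: 5
            failureThreshold: 60   # 60 x 5s = up to 300s of startup time
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
            initialDelaySeconds: 0
            periodSeconds: 5
            failureThreshold: 3
          lifecycle:
            preStop:
              exec:
                command: ["sleep", "15"]   # let in-flight requests drain
YAML
}

# Usage (needs cluster access):
#   render_probe_patch > probe-patch.yaml
#   kubectl patch deployment forge-api-deployment --patch-file probe-patch.yaml
```

The PodDisruptionBudget is a separate object and is left out here because its selector depends on Pod labels that this section does not show.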
Key lesson
  • Readiness probes must reflect actual readiness, not just process liveness. A Pod that is 'running' is not the same as a Pod that is 'ready'.
  • Use a startup probe for any application whose warm-up can exceed the liveness probe's budget (failureThreshold × periodSeconds, which is 30 seconds with the defaults). Without one, the liveness probe can kill the container during boot.
  • Always combine terminationGracePeriodSeconds with a pre-stop hook to drain in-flight connections before container shutdown.
  • Test rolling updates in staging with real traffic patterns. The first time you see a rollout failure should not be in production.
  • PodDisruptionBudgets limit how many Pods voluntary disruptions such as node drains can take down at once. Rolling updates are not governed by PDBs; they are bounded by maxSurge and maxUnavailable.
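The probe timings in this incident reduce to simple arithmetic. The helper below is a rough sketch (the function name is hypothetical) that estimates how long a probe tolerates failures before Kubernetes acts; it ignores timeoutSeconds and scheduling jitter, so treat the results as approximations.

```shell
#!/usr/bin/env bash
# Approximate failure budget of a probe in seconds:
# initialDelaySeconds + failureThreshold * periodSeconds (timeouts ignored).
probe_budget() {
  local failure_threshold="$1" period_seconds="$2" initial_delay="${3:-0}"
  echo $(( initial_delay + failure_threshold * period_seconds ))
}

probe_budget 3 10    # default liveness settings   -> 30
probe_budget 60 5    # startup probe from the fix  -> 300
probe_budget 3 5     # readiness probe from the fix -> 15
```

An application that needs 45 seconds to boot exceeds the 30-second default liveness budget, which is why the startup probe's 300-second budget is needed to cover the boot phase.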
Production debug guide
Symptom-first investigation path for Pod lifecycle and Deployment rollout failures.
Symptom · 01
Pod stuck in CrashLoopBackOff.
Fix
Check container logs with kubectl logs <pod> --previous to see why it crashed. Check for OOMKill (exit code 137), missing environment variables, or failed startup dependencies. If the crash happens during startup, add a startup probe.
Symptom · 02
Pod stuck in Pending with no events.
Fix
Check node resources with kubectl describe nodes | grep -A 5 Allocatable. Check for insufficient CPU/memory, PVC binding failures, or taints/tolerations mismatches. If no nodes can schedule the Pod, it stays Pending indefinitely.
Symptom · 03
Deployment rollout hangs at 'Waiting for rollout'.
Fix
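Check kubectl rollout status deployment/<name>. If maxUnavailable is 0 and a new Pod cannot become ready, the rollout blocks. Check readiness probe failures with kubectl describe pod <pod>. PodDisruptionBudgets do not block rolling updates, so look instead for new Pods stuck in Pending because of resource quotas or insufficient node capacity.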
Symptom · 04
Pods restarting frequently but not in CrashLoopBackOff.
Fix
Check liveness probe configuration. If failureThreshold is too low or periodSeconds is too aggressive, healthy-but-slow Pods are killed. Check liveness probe endpoint latency. Consider using a startup probe to gate liveness.
Symptom · 05
502/503 errors during Deployment rollout.
Fix
Check readiness probe timing. Ensure new Pods are ready before old Pods are terminated. Check terminationGracePeriodSeconds and pre-stop hooks. Verify maxSurge and maxUnavailable settings. Check Service endpoints during rollout: kubectl get endpoints <service> -w.
Symptom · 06
Pod evicted with reason 'Evicted' and no OOMKill.
Fix
The kubelet evicted the Pod due to node pressure (disk, memory, PID). Check kubectl describe node <node> | grep -A 10 Conditions. Check if the Pod's QoS class is BestEffort (first to be evicted). Set requests equal to limits for both CPU and memory on every container to get Guaranteed QoS.
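The QoS rules mentioned in the eviction entry can be sketched as a small classifier. This is a simplified sketch with a hypothetical function name: it handles a single-container Pod and compares quantities as literal strings, so 500m and 0.5 would not be treated as equal.

```shell
#!/usr/bin/env bash
# Classify a single-container Pod's QoS class from its resource settings.
# Pass an empty string for any value that is not set in the manifest.
# Simplification: quantities are compared as strings, not parsed.
qos_class() {
  local cpu_lim="$2" mem_lim="$4"
  # Kubernetes defaults a missing request to the limit when a limit is set.
  local cpu_req="${1:-$cpu_lim}" mem_req="${3:-$mem_lim}"

  if [ -z "$cpu_req$cpu_lim$mem_req$mem_lim" ]; then
    echo "BestEffort"      # nothing set: first candidate for eviction
  elif [ -n "$cpu_lim" ] && [ -n "$mem_lim" ] \
       && [ "$cpu_req" = "$cpu_lim" ] && [ "$mem_req" = "$mem_lim" ]; then
    echo "Guaranteed"      # requests equal limits for CPU and memory
  else
    echo "Burstable"       # anything in between
  fi
}

qos_class "" "" "" ""                # BestEffort
qos_class 500m 500m 256Mi 256Mi      # Guaranteed
qos_class 250m 500m 256Mi 256Mi      # Burstable
```

To confirm what the cluster actually assigned, read .status.qosClass from the Pod rather than relying on this sketch.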
★ Pod and Deployment Triage Commands
Rapid commands to isolate Pod lifecycle and Deployment issues.
Pod in CrashLoopBackOff.
Immediate action
Check previous container logs and exit code.
Commands
kubectl logs <pod> --previous --tail=50
kubectl get pod <pod> -o jsonpath='{.status.containerStatuses[0].lastState.terminated.exitCode}'
Fix now
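Exit code 137 = SIGKILL (128 + 9), usually an OOMKill: confirm that lastState.terminated.reason is OOMKilled, then increase the memory limit. Exit code 1 = application error (check logs). Exit code 143 = SIGTERM (128 + 15; check graceful shutdown).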
Pod stuck in Pending.
Immediate action
Check Pod events and node resources.
Commands
kubectl describe pod <pod> | grep -A 20 Events
kubectl describe nodes | grep -A 5 'Allocated resources'
Fix now
If 'Insufficient cpu/memory', scale cluster or reduce requests. If 'node(s) had taint', add tolerations. If 'waiting for volume', check PVC status.
Deployment rollout stuck.
Immediate action
Check rollout status and new Pod readiness.
Commands
kubectl rollout status deployment/<name> --timeout=30s
kubectl get pods -l app=<label> -o wide | grep -v Running
Fix now
If new Pods are not Ready, check readiness probe. If rollout is stuck, rollback: kubectl rollout undo deployment/<name>.
502 errors during deploy.
Immediate action
Check Service endpoints during rollout.
Commands
kubectl get endpoints <service> -w
kubectl get pod -l app=<label> -o jsonpath='{range .items[*]}{.metadata.name} {.status.conditions[?(@.type=="Ready")].status}{"\n"}{end}'
Fix now
If endpoints are empty during rollout, Pods are not passing readiness probes. Add startup probe and increase terminationGracePeriodSeconds.
Pod evicted from node.
Immediate action
Check node conditions and Pod QoS class.
Commands
kubectl describe node <node> | grep -A 10 Conditions
kubectl get pod <pod> -o jsonpath='{.status.qosClass}'
Fix now
If QoS is BestEffort, set resource requests and limits. If node has DiskPressure, clean up unused images. If MemoryPressure, reduce Pod memory usage.
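The exit codes in the CrashLoopBackOff triage entry follow the common 128 + signal-number convention. The decoder below is a small sketch with a hypothetical function name; it only interprets the number, so an exit code of 137 still needs the terminated reason to confirm an OOMKill.

```shell
#!/usr/bin/env bash
# Decode a container exit code using the 128 + signal-number convention.
explain_exit_code() {
  local code="$1"
  if [ "$code" -eq 0 ]; then
    echo "exited cleanly"
  elif [ "$code" -gt 128 ]; then
    local sig=$(( code - 128 ))
    case "$sig" in
      9)  echo "killed by SIGKILL (OOMKill if reason is OOMKilled)" ;;
      15) echo "terminated by SIGTERM (check graceful shutdown)" ;;
      *)  echo "killed by signal $sig" ;;
    esac
  else
    echo "application error (check logs)"
  fi
}

explain_exit_code 137
explain_exit_code 143
explain_exit_code 1

# With cluster access, feed it the real exit code and check the reason:
#   explain_exit_code "$(kubectl get pod <pod> -o jsonpath='{.status.containerStatuses[0].lastState.terminated.exitCode}')"
#   kubectl get pod <pod> -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'
```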
Pod Management Objects: Deployment vs StatefulSet vs DaemonSet vs Job
| Aspect | Deployment | StatefulSet | DaemonSet | Job |
| --- | --- | --- | --- | --- |
| Use case | Stateless applications (APIs, web servers) | Stateful applications (databases, queues) | Node-level agents (log collectors, monitors) | One-off or batch tasks (migrations, reports) |
| Pod identity | Interchangeable. No stable identity. | Stable identity (ordinal index: pod-0, pod-1). | One Pod per node (or subset of nodes). | Single Pod (or N completions). Runs to completion. |
| Scaling | Horizontal (add/remove replicas freely). | Ordered: scale up 0->1->2. Scale down in reverse (OrderedReady by default; Parallel is optional). | Automatic: one Pod per matching node. | parallelism controls concurrent Pods. |
| Storage | Ephemeral or shared PersistentVolume. | PersistentVolumeClaim per Pod (stable storage). | HostPath or shared storage. | Ephemeral. Output to external storage. |
| Rolling update | Supported (RollingUpdate strategy). | Supported (RollingUpdate or OnDelete; Pods update in reverse ordinal order). | Supported (RollingUpdate or OnDelete). | Not applicable (Pods run to completion). |
| Networking | Pod IP changes when a Pod is replaced. Use a Service. | Stable network identity through a headless Service (pod-0.<service>.<namespace>.svc.cluster.local). | Regular Pod IP. Shares the node IP only with hostNetwork: true. | Pod IP is ephemeral. |
| Self-healing | Yes. Replaces failed Pods automatically. | Yes. Replaces failed Pods with same identity. | Yes. Replaces failed Pods on the same node. | Yes. Retries failed Pods up to backoffLimit. |
| Production example | Web API, microservice, frontend | PostgreSQL, Kafka, Redis cluster | Fluentd, node-exporter, Cilium agent | Database migration, backup job, report generation |

Key takeaways

1. Pods are ephemeral and non-self-healing: never deploy individual Pods for production workloads.
2. Deployments are controllers that maintain the desired number of Pod replicas and handle lifecycle management.
3. Use 'RollingUpdate' with 'maxUnavailable: 0' to achieve zero-downtime application updates.
4. Liveness probes manage the container lifecycle (restarts), while Readiness probes manage the traffic flow (service discovery).
5. Resource 'requests' determine scheduling (which node); 'limits' determine runtime enforcement (CPU throttling/OOM killing).
6. Use a startup probe for any application whose warm-up can exceed the liveness probe's budget (30 seconds with the default settings). Without one, the liveness probe can kill containers during boot.
7. PodDisruptionBudgets limit simultaneous Pod evictions during node drains and other voluntary disruptions. Rolling updates are bounded by maxSurge and maxUnavailable, not by PDBs.
8. Rollback is not free: it fails if the old image is no longer available in the registry or node cache. Keep old images available and increase revisionHistoryLimit for critical services.
INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

FAQ · 6 QUESTIONS

Frequently Asked Questions

01 · What is the difference between a liveness probe and a readiness probe?
02 · Why should I set both requests and limits for my containers?
03 · How does a Deployment know which Pods it owns?
04 · What happens to a Deployment if a Node fails?
05 · What is a startup probe and when should I use it?
06 · How do I perform a canary deployment without a service mesh?