kubectl describe: read state + events (aggregated GET + events query).
kubectl apply: write desired state (POST/PUT/PATCH requests).
kubectl delete: remove state (DELETE requests).
The --output flag controls formatting: -o yaml, -o json, -o wide, -o custom-columns.
The -n flag scopes commands to a namespace. Without it, commands target the namespace set in the current context, which is 'default' unless you changed it.
The --context flag switches between clusters (dev, staging, prod).
kubectl describe is the single most valuable debugging command — the Events section shows scheduler decisions, image pull failures, and probe failures.
kubectl logs --previous is the only way to see why a container crashed before it restarted.
Avoid running kubectl delete pod --force --grace-period=0 on a StatefulSet pod: it removes the pod without letting it drain connections, potentially corrupting persistent storage.
What Is a kubectl Commands Cheatsheet?
A kubectl commands cheatsheet is a curated reference of the most essential kubectl invocations for day-to-day Kubernetes operations. It exists because Kubernetes exposes hundreds of API objects and subcommands, and even experienced operators can't keep every flag and resource type in working memory.
A good cheatsheet distills the 20-30 commands you actually use—like kubectl get pods -o wide, kubectl describe, kubectl logs -f, and kubectl exec -it—into a single page you can glance at during an incident or deployment. It's not a replacement for the official kubectl --help or the Kubernetes API docs; rather, it's a survival tool for when you're under pressure and need to find a misconfigured pod, drain a node, or roll back a bad deployment in seconds.
The best cheatsheets also include context-switching commands (kubectl config use-context, kubectl config current-context) because the most common kubectl mistake is running a destructive command against the wrong cluster. If you're managing more than a handful of clusters or working in a team, a printed or pinned cheatsheet saves you from costly typos and context confusion.
Plain-English First
kubectl is like a remote control for your Kubernetes cluster. Instead of clicking through a web interface, you type commands to see what is running, check logs, fix problems, and deploy updates. Think of it as the steering wheel, dashboard, and toolbox combined into one command-line tool.
kubectl is not just a command-line tool — it is a Kubernetes API client. Every command you type translates into HTTP requests to the kube-apiserver. Understanding this relationship explains why certain commands are fast (cached reads), why others are slow (watch calls), and why some fail with 'connection refused' when the API server is unreachable.
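To make the API-client relationship concrete, here is a minimal sketch that reads the Pod list through its REST path instead of through the usual get pods shorthand. It assumes a reachable cluster; the helper name list_pods_raw is made up for illustration and is not part of kubectl.

```shell
#!/usr/bin/env bash
# Sketch: fetch the Pod list for a namespace through the raw REST path,
# the same endpoint that `kubectl get pods -n <ns>` reads from.
# list_pods_raw is a hypothetical helper name, not part of kubectl.
list_pods_raw() {
  local namespace="${1:-default}"
  # Core-group resources such as Pods live under /api/v1.
  kubectl get --raw "/api/v1/namespaces/${namespace}/pods"
}
```

Calling list_pods_raw payments returns the raw JSON list. Adding -v=6 to an ordinary kubectl get pods -n payments logs the request URLs kubectl issues, which you can compare against the path built here.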
The common misconception is that kubectl is a deployment tool. It is a state inspection and mutation tool. Deployment is one use case. The real value is observability — seeing what the cluster is doing right now, what it did in the past, and why a specific Pod is stuck. Engineers who master kubectl's inspection commands debug production incidents in minutes. Engineers who only know apply and delete debug by redeploying and hoping.
This cheatsheet is organized by intent: what are you trying to do? Find a resource? Debug a failure? Roll back a deployment? Each section includes the commands, the output you should expect, and the production gotchas that bite when you use them at scale.
What a kubectl Commands Cheatsheet Actually Does
A kubectl commands cheatsheet is a curated reference of the most essential kubectl commands for interacting with Kubernetes clusters. It condenses the full kubectl CLI, with its dozens of subcommands and many more flag combinations, into a focused set of operations that cover most daily tasks: pod inspection, resource creation, log retrieval, and debugging. The core mechanic is mapping human-readable commands (like kubectl get pods -o wide) to API calls against the Kubernetes API server, which then returns structured JSON or YAML output.
In practice, a good cheatsheet groups commands by intent: cluster info, workload management, troubleshooting, and resource mutation. It highlights the most common flags (-o wide, -n <namespace>, --all-namespaces, -w for watch) and shows how to combine them for real workflows. For example, kubectl describe pod gives you events and status, while kubectl logs --tail=50 -f streams recent logs — two commands that together resolve most pod failures in under 30 seconds.
Use a cheatsheet when you need to move fast without memorizing every flag. It’s not a replacement for the official docs or kubectl --help, but it eliminates context-switching during incidents. Teams that internalize a cheatsheet tend to reach a diagnosis faster in production outages, because they don’t stop to look up syntax under pressure.
Cheatsheet vs. Reference
A cheatsheet is not a substitute for understanding kubectl's resource model — it's a performance tool for engineers who already know what they want to do.
Production Insight
During a node failure, an engineer ran kubectl delete pod --force on a StatefulSet pod, which orphaned the PVC and caused the replacement pod to fail with PersistentVolumeClaim is not bound.
The symptom: the new pod stayed in Pending with an 'error while running volume binding' event, and the old PVC showed Lost status.
Rule of thumb: never force-delete a StatefulSet pod. Delete it with the default grace period or scale the StatefulSet down first, and check the PVC and volume attachment state before cleaning anything up by hand.
Key Takeaway
A cheatsheet is a force multiplier for incident response — learn it, don't just bookmark it.
The most dangerous commands are the ones that work too well: --force, --grace-period=0, and --all can cause cascading failures.
Always validate with kubectl api-resources and kubectl explain before running an unfamiliar command in production.
The get and describe commands are your primary observability tools. They read cluster state from the API server without modifying anything — safe to run in production at any time.
kubectl get lists resources in a compact table format. It is fast because the API server caches the response. kubectl describe shows the same resources with full detail including events, conditions, and annotations — this is where the debugging signal lives.
The --output flag is more powerful than most engineers realize. -o jsonpath lets you extract specific fields programmatically. -o custom-columns builds custom dashboards in your terminal. -o name outputs only resource names, perfect for scripting.
getting-information.sh (Bash)
# List resources
kubectl get pods
kubectl get pods -o wide # + node and IP
kubectl get pods --all-namespaces # all namespaces
kubectl get deployments
kubectl get services
kubectl get nodes
kubectl get all # pods, services, deployments in current namespace
# Get YAML/JSON of a resource
kubectl get pod myapp-pod -o yaml
kubectl get deployment myapp-deployment -o json
# Detailed view with events — essential for debugging
kubectl describe pod myapp-pod
kubectl describe node my-node
# Watch resources update in real time
kubectl get pods -w
# Custom columns output
kubectl get pods -o custom-columns=NAME:.metadata.name,STATUS:.status.phase
# ── PRODUCTION-GRADE EXTRAS ──────────────────────────────────────────────
# Extract a specific field with jsonpath
kubectl get pod myapp-pod -o jsonpath='{.status.podIP}'
kubectl get pods -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.podIP}{"\n"}{end}'
# Get all pods with their resource requests
kubectl get pods -o custom-columns=NAME:.metadata.name,CPU_REQ:.spec.containers[0].resources.requests.cpu,MEM_REQ:.spec.containers[0].resources.requests.memory
# Find pods where any container has no resource limits (security/compliance check)
kubectl get pods -A -o json | jq -r '.items[] | select(any(.spec.containers[]; .resources.limits == null)) | .metadata.namespace + "/" + .metadata.name'
# Get events sorted by time — critical for post-incident analysis
kubectl get events --sort-by='.lastTimestamp' -A
# Get events for a specific resource
kubectl get events --field-selector involvedObject.name=myapp-pod
# Check API server health
kubectl get --raw /healthz
kubectl get --raw /readyz
kubectl get --raw /livez
Output
NAME READY STATUS RESTARTS AGE
myapp-pod-abc 1/1 Running 0 5m
10.244.1.45
myapp-pod-abc 10.244.1.45
myapp-pod-def 10.244.2.78
ok
get vs describe — When to Use Each
Events have a TTL (default: 1 hour). If you investigate late, the events are gone. Export them early.
kubectl get events --sort-by='.lastTimestamp' shows the chronological story of what happened.
kubectl describe pod shows events scoped to that Pod. kubectl get events -A shows cluster-wide events.
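Because events expire, it can help to snapshot them as soon as an incident starts. The following is a minimal sketch of that idea; the function name snapshot_events and the file naming scheme are illustrative assumptions, not an established convention.

```shell
#!/usr/bin/env bash
# Sketch: save cluster-wide events to a timestamped file before they expire.
# snapshot_events and the file naming scheme are illustrative assumptions.
snapshot_events() {
  local out_dir="${1:-.}"
  local out_file
  out_file="${out_dir}/events-$(date +%Y%m%d-%H%M%S).yaml"
  # -A covers all namespaces; sorting keeps the file in chronological order.
  if kubectl get events -A --sort-by=.lastTimestamp -o yaml > "$out_file"; then
    echo "$out_file"
  else
    # Do not leave an empty or partial file behind on failure.
    rm -f "$out_file"
    return 1
  fi
}
```

A call such as snapshot_events ./incident-notes prints the path of the file it wrote, so it can be attached to a post-incident report. The target directory is assumed to exist already.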
Production Insight
kubectl get with jsonpath is the foundation of production automation scripts. CI/CD pipelines, health checks, and alerting scripts all rely on extracting specific fields from kubectl output. The -o jsonpath syntax is powerful but fragile: field names can change between API versions. In scripts, request an explicit version of the resource (for example kubectl get deployments.v1.apps) rather than relying on whichever version the server prefers. For complex extraction, pipe to jq instead of wrestling with jsonpath edge cases.
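As a sketch of the kind of field extraction described above, the function below lists Pods whose phase is neither Running nor Succeeded. The name list_unhealthy_pods is hypothetical, and treating those two phases as healthy is an assumption you may want to adjust.

```shell
#!/usr/bin/env bash
# Sketch: print "name<TAB>phase" for Pods that are not Running or Succeeded.
# list_unhealthy_pods is a hypothetical helper; the phase filter is an assumption.
list_unhealthy_pods() {
  local namespace="${1:-default}"
  kubectl get pods -n "$namespace" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.phase}{"\n"}{end}' |
    # NF == 2 skips blank or malformed lines before comparing the phase column.
    awk -F'\t' 'NF == 2 && $2 != "Running" && $2 != "Succeeded" { print $1 "\t" $2 }'
}
```

In a health-check script, an empty result from list_unhealthy_pods production can serve as the pass condition.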
Key Takeaway
kubectl get is for scanning; kubectl describe is for investigating. The Events section in describe is the primary debugging signal — it shows scheduler decisions, probe failures, and image pull errors. Use -o jsonpath or jq for programmatic extraction in automation scripts.
Choosing the Right Output Format
If you need a quick overview of many resources → use kubectl get <resource> (default table output).
If you need a specific field for scripting → use kubectl get <resource> -o jsonpath='{.status.phase}' or pipe to jq.
If you are debugging a specific resource's state → use kubectl describe <resource> <name> and read the Events and Conditions sections.
If you need the full resource definition for backup or modification → use kubectl get <resource> <name> -o yaml, then export, edit, and reapply.
If you are watching for changes in real time → use kubectl get <resource> -w, which streams updates as they happen.
Debugging
The debugging commands — logs, exec, port-forward, cp — are how you interact with running (or crashed) containers. These commands go through the kubelet on the target node, not directly to the container runtime.
kubectl logs retrieves stdout/stderr from the container. The --previous flag is critical — it retrieves logs from the container instance that just crashed, which is the only way to see why a CrashLoopBackOff Pod is failing.
kubectl exec opens a shell inside the container. This is the Kubernetes equivalent of SSH. It requires the container to be running and have a shell binary (bash or sh). Distroless containers do not have a shell — use ephemeral debug containers instead.
kubectl port-forward creates a tunnel from your local machine to a Pod or Service inside the cluster. It bypasses Ingress, LoadBalancer, and NodePort — useful for accessing services that are not exposed externally.
debugging.sh (Bash)
# Container logs
kubectl logs myapp-pod
kubectl logs myapp-pod -c container-name # specific container
kubectl logs myapp-pod --previous # logs from crashed container
kubectl logs -f deployment/myapp # follow deployment logs
kubectl logs myapp-pod --tail=100 # last 100 lines
# Execute commands inside a container
kubectl exec -it myapp-pod -- bash
kubectl exec -it myapp-pod -- sh # if bash not available
kubectl exec myapp-pod -- env # print env vars
kubectl exec myapp-pod -- cat /config/app.properties
# Port forward to access a service locally
kubectl port-forward pod/myapp-pod 8080:8000
kubectl port-forward service/myapp-svc 8080:80
# Copy files to/from a pod
kubectl cp myapp-pod:/var/log/app.log ./app.log
kubectl cp ./config.yaml myapp-pod:/app/config.yaml
# ── PRODUCTION-GRADE DEBUGGING ───────────────────────────────────────────
# Logs from a deployment, all containers (kubectl picks one pod; use -l <selector> to cover every pod)
kubectl logs deployment/myapp --all-containers --prefix
# Logs with timestamps — essential for correlating with external events
kubectl logs myapp-pod --timestamps
# Logs from a specific time window
kubectl logs myapp-pod --since=1h
kubectl logs myapp-pod --since-time=2026-04-07T10:00:00Z
# Ephemeral debug container for distroless images (K8s 1.23+)
kubectl debug -it myapp-pod --image=busybox --target=app-container
# Debug a node by creating a privileged debug Pod
kubectl debug node/my-node -it --image=ubuntu
# Copy logs from a crashed pod before it is garbage collected
kubectl logs <pod> --previous > crash.log 2>&1
# Stream logs from multiple pods with a label selector
kubectl logs -l app=payment-service --all-containers --prefix --follow
# Check what environment variables a running pod has
kubectl exec myapp-pod -- printenv | sort
# Test network connectivity from inside a pod
kubectl exec myapp-pod -- curl -s http://other-service.default.svc.cluster.local:8080/health
kubectl exec myapp-pod -- nslookup other-service.default.svc.cluster.local
# Check mounted volumes and config
kubectl exec myapp-pod -- df -h
kubectl exec myapp-pod -- mount | grep config
Output
# port-forward: http://localhost:8080 → container port 8000
# ephemeral debug container
Defaulting debug container name to debugger-xxxxx.
/ # ls /app
config.yaml app.jar lib/
Distroless and Debug Containers
kubectl debug -it <pod> --image=busybox --target=<container> shares the process namespace.
kubectl debug node/<node> -it --image=ubuntu creates a Pod on that node with host namespaces mounted.
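Ephemeral containers cannot be removed from a running Pod; they go away only when the Pod is deleted. Node debug Pods are not cleaned up automatically either, so delete them with kubectl delete pod <debug-pod-name> when you are done.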
Production Insight
kubectl logs --previous is the most underused debugging command. When a Pod enters CrashLoopBackOff, the current container has no logs — it just crashed. Only --previous shows the stdout/stderr from the crashed instance. Set up your incident response runbook to always run logs --previous before investigating further. Additionally, logs are ephemeral by default — they are lost when the Pod is deleted. Use a log aggregation stack (Fluentd/Fluent Bit -> Loki/ELK) for persistent log retention. kubectl logs is for real-time debugging, not historical analysis.
Key Takeaway
kubectl logs --previous is the first command for CrashLoopBackOff debugging. kubectl exec requires a shell in the container — use ephemeral debug containers for distroless images. kubectl port-forward bypasses all networking layers for direct Pod access. Always capture logs before a Pod is garbage collected.
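The "capture logs first" habit can be scripted. This is a minimal sketch under the assumption that you want the crashed instance's logs when they exist and the current instance's logs otherwise; capture_crash_logs and the output file name are illustrative, not a standard tool.

```shell
#!/usr/bin/env bash
# Sketch: save logs from the crashed container instance, falling back to the
# current instance when there has been no restart yet.
# capture_crash_logs and the "<pod>-crash.log" name are illustrative assumptions.
capture_crash_logs() {
  local pod="$1" namespace="${2:-default}" out_dir="${3:-.}"
  local out_file="${out_dir}/${pod}-crash.log"
  # --previous fails if the container has never restarted, so try it first.
  if kubectl logs "$pod" -n "$namespace" --previous --timestamps > "$out_file" 2>/dev/null; then
    echo "previous:${out_file}"
  elif kubectl logs "$pod" -n "$namespace" --timestamps > "$out_file"; then
    echo "current:${out_file}"
  else
    rm -f "$out_file"
    return 1
  fi
}
```

The prefix in the printed result (previous: or current:) tells you which container instance the saved file came from.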
Debugging Tool Selection
If the container is running and you need to inspect its state → use kubectl exec -it <pod> -- sh to get a shell and investigate.
If the container is running but has no shell (distroless) → use kubectl debug -it <pod> --image=busybox --target=<container> to attach an ephemeral debug container.
If the container has crashed or is in CrashLoopBackOff → use kubectl logs <pod> --previous to get logs from the crashed instance.
If you need to access a service that is not exposed externally → use kubectl port-forward service/<name> 8080:80 to tunnel to the service from your machine.
If you need to extract a log file or core dump from a Pod → use kubectl cp <pod>:/path/to/file ./local-file (it copies files in either direction).
If you need to debug node-level issues (networking, disk, kernel) → use kubectl debug node/<node> -it --image=ubuntu to launch a privileged Pod on the node.
Managing Deployments
The apply, set image, scale, and rollout commands are how you modify cluster state. These are write operations — they change what is running. Use them with the same care as database writes.
kubectl apply is the declarative entry point. It reads a YAML file, compares it with the live state, and sends a PATCH request to the API server to reconcile the difference. This is idempotent — running apply twice with the same file produces no change.
kubectl rollout undo is the most important safety net. It reverts a Deployment to the previous ReplicaSet revision. This is not a delete-and-recreate — it is a controlled rollback that respects rolling update parameters (maxUnavailable, maxSurge).
kubectl delete is the imperative counterpart to apply. It removes resources. For Deployments, the default (--cascade=background) deletes the Deployment and lets the garbage collector remove its ReplicaSets and Pods; only --cascade=orphan leaves them running. Always prefer apply with a modified YAML over delete-then-create.
managing-deployments.sh (Bash)
# Apply config
kubectl apply -f deployment.yaml
kubectl apply -f ./k8s/ # apply all files in directory
# Update image
kubectl set image deployment/myapp api=myapp:2.0
# Scale
kubectl scale deployment myapp --replicas=5
# Rollout management
kubectl rollout status deployment/myapp
kubectl rollout history deployment/myapp
kubectl rollout undo deployment/myapp # rollback one step
kubectl rollout undo deployment/myapp --to-revision=2 # rollback to specific revision
# Delete
kubectl delete pod myapp-pod
kubectl delete deployment myapp
kubectl delete -f deployment.yaml # delete what was applied
# Force delete a stuck pod
kubectl delete pod myapp-pod --force --grace-period=0
# ── PRODUCTION-GRADE DEPLOYMENT MANAGEMENT ───────────────────────────────
# Preview what would change: diff against live state, then a server-side dry run
kubectl diff -f deployment.yaml
kubectl apply -f deployment.yaml --dry-run=server
# Apply with field validation — reject invalid manifests
kubectl apply -f deployment.yaml --validate=true --server-side
# Restart all pods in a deployment (rolling restart)
kubectl rollout restart deployment/myapp
# Pause a rollout mid-way (canary testing)
kubectl rollout pause deployment/myapp
# ... verify canary pods ...
kubectl rollout resume deployment/myapp
# Check rollout revision details
kubectl rollout history deployment/myapp --revision=3
# Scale with resource-aware scripting
CURRENT=$(kubectl get deployment myapp -o jsonpath='{.spec.replicas}')
DESIRED=$((CURRENT + 2))
kubectl scale deployment myapp --replicas=$DESIRED
echo "Scaled from $CURRENT to $DESIRED replicas"
# Delete with label selector (dangerous — always dry-run first)
kubectl delete pods -l app=temp-worker --dry-run=client
kubectl delete pods -l app=temp-worker
# Apply with pruning — delete resources not in the applied set
kubectl apply -f ./k8s/ --prune -l app=myapp
# Export current state before making changes (safety net)
kubectl get deployment myapp -o yaml > myapp-backup-$(date +%Y%m%d-%H%M%S).yaml
Output
deployment.apps/myapp-deployment configured
Waiting for deployment "myapp" rollout to finish: 3 out of 5 new replicas have been updated...
deployment "myapp" successfully rolled out
apply vs replace vs create
apply with --server-side (K8s 1.22+) avoids conflicts from multiple actors updating the same resource.
apply with --prune deletes resources that are no longer in your manifest directory — powerful but dangerous.
Always run --dry-run=client or --dry-run=server before applying to production.
Production Insight
kubectl rollout pause is the foundation of canary deployments. Pause a rollout after the first new Pod starts, verify it is healthy (check logs, metrics, error rates), then resume. This gives you a manual gate between 'new code is running' and 'all Pods are new code.' Combine with PodDisruptionBudgets and readiness probes for zero-downtime deployments. The biggest production mistake is running kubectl apply without --dry-run first. A misconfigured manifest can delete all Pods simultaneously if the selector changes.
Key Takeaway
kubectl apply is the declarative standard — always dry-run first. kubectl rollout undo is the fastest rollback mechanism, reverting to the previous ReplicaSet in seconds. kubectl rollout pause enables canary deployments. Never use --force on StatefulSet Pods — it breaks the identity contract.
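The dry-run, apply, and undo steps can be chained into a single guarded rollout. The sketch below is one possible arrangement, not a recommended standard: apply_with_rollback is a hypothetical name, and the 120s default timeout is an arbitrary assumption.

```shell
#!/usr/bin/env bash
# Sketch: validate, apply, wait for the rollout, and undo it on failure.
# apply_with_rollback and the 120s default timeout are illustrative assumptions.
apply_with_rollback() {
  local manifest="$1" deployment="$2" timeout="${3:-120s}"
  # Server-side dry run: validated by the API server, nothing is persisted.
  kubectl apply -f "$manifest" --dry-run=server || return 1
  kubectl apply -f "$manifest" || return 1
  # rollout status exits non-zero if the rollout does not finish in time.
  if ! kubectl rollout status "deployment/${deployment}" --timeout="$timeout"; then
    echo "rollout failed, reverting ${deployment}" >&2
    kubectl rollout undo "deployment/${deployment}"
    return 1
  fi
}
```

For example, apply_with_rollback deployment.yaml myapp 300s would wait up to five minutes before reverting. Note that the undo only reverts the Deployment named in the second argument, not other objects in the manifest.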
If you need to update a container image quickly → use kubectl set image deployment/<name> <container>=<image>:<tag> (faster than editing YAML).
If you need to restart all Pods without changing the spec → use kubectl rollout restart deployment/<name> for a rolling restart with zero downtime.
If you need to test new code on a subset of Pods before full rollout → use kubectl rollout pause deployment/<name> after the first Pod starts. Verify. Then kubectl rollout resume deployment/<name>.
If the Deployment is broken and you need to go back to the previous version → use kubectl rollout undo deployment/<name> for an instant rollback to the last known good ReplicaSet.
If you need to roll back to a specific older revision → use kubectl rollout history deployment/<name> (find revision), then kubectl rollout undo deployment/<name> --to-revision=<N>.
If a Pod is stuck in Terminating and won't die → use kubectl delete pod <name> --force --grace-period=0 as a last resort. Not for StatefulSets.
Kubectl Context and Configuration: The Cluster You Think You're In
You can have five clusters configured, but kubectl only cares about one: the current-context. Forget this and you'll run a delete meant for staging against production at 3 AM. I've seen it. The kubeconfig file lives at ~/.kube/config by default. That file is a list of clusters, users, and contexts, but the 'current-context' field decides which one kubectl touches.
Before any destructive command, run kubectl config current-context. Then kubectl config get-contexts to see all available contexts and their namespaces. If you need to switch, use kubectl config use-context prod-east-1. For quick namespace switching without juggling full context, kubectl config set-context --current --namespace=payments-api saves you from typing -n on every call.
Merging kubeconfig files? Export KUBECONFIG=/path/to/file1:/path/to/file2 and kubectl config view --flatten. This is how you handle multi-cluster setups without a catalog of aliases.
ContextChecks.yml (shell commands)
# io.thecodeforge - devops tutorial
# check current context before a rollout
kubectl config current-context
# output: gke_production-us-east1_payments-cluster
# switch to a different cluster
kubectl config use-context gke_staging-us-west1_payments-staging
# output: Switched to context "gke_staging-us-west1_payments-staging".
# set default namespace for current context
kubectl config set-context --current --namespace=billing-service
# output: Context "gke_staging-us-west1_payments-staging" modified.
Output
gke_production-us-east1_payments-cluster
Production Trap:
Never rely on kubectl config current-context output alone — validate with kubectl get nodes before a rollout. Context names can lie; the nodes list doesn't.
Key Takeaway
Always run kubectl config current-context before apply, delete, or cordon. Your kubeconfig is the single source of truth until it isn't.
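One way to turn "check the context first" from a habit into a guard is a small wrapper that refuses to continue on a mismatch. This is a minimal sketch; require_context is a hypothetical helper name, and the context names in the usage note are the ones used in this section's examples.

```shell
#!/usr/bin/env bash
# Sketch: abort unless the active kubeconfig context matches the expected one.
# require_context is a hypothetical helper, not a kubectl feature.
require_context() {
  local expected="$1"
  local current
  current="$(kubectl config current-context)" || return 1
  if [ "$current" != "$expected" ]; then
    echo "refusing to continue: current context is '${current}', expected '${expected}'" >&2
    return 1
  fi
}
```

Chained with &&, for instance require_context gke_staging-us-west1_payments-staging && kubectl delete -f old-job.yaml, the destructive command only runs when the context matches. The file name old-job.yaml is a placeholder.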
kubectl apply: Declarative Over Imperative, or Regret It Later
kubectl create and kubectl run are imperative: they work now, but they bypass version control and let your cluster drift from the source of truth. kubectl apply is the declarative hammer. It reads a YAML file and ensures the cluster state matches that file. Run it once, run it in CI/CD, run it from GitOps: it is idempotent.
But here's the gotcha: kubectl apply merges patches, it doesn't replace. If someone manually edited a Deployment (and someone will), kubectl apply will merge those changes with your file. That's how you get fields in the live object that exist nowhere in your manifests. Solution: --server-side apply or Kustomize's prune labels.
Always use --dry-run=client first, then --dry-run=server to check validation against the API server. Then drop the flag. And for the love of reproducible builds, never pipe curl into kubectl apply — that's how you deploy a rootkit.
ApplyWorkflow.yml (shell commands)
# client-side dry run before apply
kubectl apply -f deployment-payments-v2.yaml --dry-run=client
# output: deployment.apps/payments-api created (dry run)
# check what would change against the live object
kubectl diff -f deployment-payments-v2.yaml
# actual apply, using server-side apply
kubectl apply --server-side -f deployment-payments-v2.yaml
# output: deployment.apps/payments-api serverside-applied
Output
deployment.apps/payments-api serverside-applied
Senior Shortcut:
Use kubectl apply --prune -l app=payments to remove resources from the cluster that carry the label but are no longer in your manifest directory. It keeps drift down, but dry-run it first: a wrong selector deletes more than you intended.
Key Takeaway
Use kubectl apply --server-side in production. Never let kubectl create or manual edits become your source of truth — only YAML in Git.
Deleting Resources: Clean Up Before Your Cluster Bleeds Out
Deleting is not a panic button — it's maintenance. Every dangling Pod, orphaned Service, or stale ConfigMap is debt. Your cluster doesn't get a vacation; it keeps polling, routing, and logging garbage. Delete deliberately.
kubectl delete removes resources by name, label, or file. But watch the cascade: deleting a Deployment kills its ReplicaSet and Pods. Deleting a Namespace nukes everything inside — no undo. Add --cascade=orphan if you want to detach but not destroy. For troubleshooting, kubectl delete pod --force --grace-period=0 is the ripper, but only use it when a pod hangs on Terminating. Prefer graceful deletion: let your containers catch SIGTERM, not SIGKILL.
Production rule: never delete resources you didn't create. Use kubectl diff first, or label your own objects with owner: me. The kubectl delete command expects type/name syntax, not random guesses — kubectl delete pod my-pod, not kubectl delete my-pod.
DeleteExamples.yml (shell commands)
# Delete a specific pod
kubectl delete pod web-app-7f8b9c
# Delete all pods with a label
kubectl delete pods -l app=frontend
# Force delete a stuck pod (last resort)
kubectl delete pod stuck-pod --force --grace-period=0
# Delete a deployment (cascades to ReplicaSet + Pods)
kubectl delete deployment web-app
# Orphan the ReplicaSet but keep it alive
kubectl delete deployment web-app --cascade=orphan
# Delete everything in a namespace (careful!)
kubectl delete namespace staging --wait=false
Output
pod "web-app-7f8b9c" deleted
pod "stuck-pod" force deleted
deployment.apps "web-app" deleted
namespace "staging" deleted
Production Trap:
Delete with --cascade=orphan on a Deployment and the ReplicaSet keeps running and keeps replacing failed Pods, but nothing manages rollouts or rollbacks for it any more. You just made a zombie deployment. Always verify with kubectl get all after deletion.
Key Takeaway
Delete is not a delete — it's a cascade. Know the chain before you pull the trigger.
kubectl describe is your X-ray. kubectl get shows you the label on the box; describe shows you the manufacturing defects. When a pod is crashing, a service has no endpoints, or a node is tainted, describe tells you why in plain text — including events, conditions, and last state logs.
The syntax: kubectl describe <resource> <name> (or <resource>/<name>). For pods, you get the full container state, restart count, and recent events. For nodes, you get capacity, allocatable resources, and taints. For services, you get endpoint lists and selector matching. This is your first debugging step, not Stack Overflow.
Pro tip: run kubectl describe pod with the pod name, then scroll to the Events section. That's where Kubernetes whispers what went wrong — image pull errors, liveness probe failures, OOM kills. If you see BackOff or CrashLoopBackOff, don't restart — read the event. Also, kubectl describe works on Deployments, Services, Nodes, even PersistentVolumeClaims. It's the universal interrogator.
DescribeExamples.yml (shell commands)
# Describe a specific pod (see events, state, conditions)
kubectl describe pod web-app-7f8b9c -n production
# Describe a service to check endpoints
kubectl describe service api-gateway
# Describe a node for capacity and taints
kubectl describe node worker-2
# Describe a deployment (rollout status, selector)
kubectl describe deployment frontend
# Shortcut: describe all resources of a type
kubectl describe pods
# Combine with label filter
kubectl describe pods -l app=backend
Output
Name: web-app-7f8b9c
Namespace: production
Node: worker-2/10.0.0.5
Start Time: Mon, 14 Apr 2025 10:30:00 +0000
Labels: app=web-app
Status: Running
IP: 10.42.0.12
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 10m default-scheduler Successfully assigned
Normal Pulled 10m kubelet Container image "nginx:1.25" already present
Normal Created 10m kubelet Created container nginx
Normal Started 10m kubelet Started container nginx
Senior Shortcut:
Combine describe with grep — kubectl describe pod my-pod | grep -A5 'Events:' — to skip the noise and land on what actually broke.
Key Takeaway
Don't restart what you don't understand. kubectl describe first, Google second.
Production incident · POST-MORTEM · severity: high
The --force Delete That Corrupted a StatefulSet's Persistent Volume
Symptom
New StatefulSet Pod stuck in Pending with event: 'Warning FailedScheduling ... pod has unbound immediate PersistentVolumeClaims.' The old Pod is gone but the PVC shows 'Bound' and the PV shows 'Released' instead of 'Available'.
Assumption
The PVC is broken or the storage provisioner is down.
Root cause
kubectl delete pod kafka-2 --force --grace-period=0 bypassed the kubelet's normal termination flow. The kubelet did not unmount the volume or detach it from the node. The cloud provider's volume controller saw the attachment as active (the old node still held it) and refused to re-attach to the new node. The PV was stuck in 'Released' state because the PVC's claimRef still pointed to the old binding.
Fix
1. Patch the PV to remove the claimRef: kubectl patch pv <pv-name> -p '{"spec":{"claimRef": null}}'.
2. Delete the stuck PVC and let the StatefulSet controller recreate it.
3. If the volume is still attached to the old node, force-detach via cloud CLI (e.g., aws ec2 detach-volume).
4. Restart the kubelet on the old node to clear stale mount references.
5. Never use --force on StatefulSet Pods. Use kubectl delete pod <name> with the default grace period.
Key lesson
--force --grace-period=0 is a last resort for stuck Pods, not a restart mechanism.
StatefulSet Pods have identity (ordinal index, stable network name, persistent storage). Force-deleting breaks the identity contract.
Always check volume attachment state when a StatefulSet Pod fails to reschedule: kubectl describe pv and kubectl get volumeattachments.
Use kubectl rollout restart statefulset/<name> for controlled restarts of StatefulSets.
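The lesson above can be enforced mechanically by checking who owns a Pod before force-deleting it. The following is a sketch under the assumption that the first ownerReference identifies the controller; both function names are hypothetical. If the ownership lookup fails for any reason, the guard refuses to delete.

```shell
#!/usr/bin/env bash
# Sketch: refuse --force deletion for Pods owned by a StatefulSet.
# pod_owner_kind and guarded_force_delete are hypothetical helper names.
pod_owner_kind() {
  local pod="$1" namespace="${2:-default}"
  # Assumes the first ownerReference is the controlling object.
  kubectl get pod "$pod" -n "$namespace" \
    -o jsonpath='{.metadata.ownerReferences[0].kind}'
}

guarded_force_delete() {
  local pod="$1" namespace="${2:-default}"
  local kind
  # Fail closed: if ownership cannot be determined, do not delete.
  kind="$(pod_owner_kind "$pod" "$namespace")" || return 1
  if [ "$kind" = "StatefulSet" ]; then
    echo "refusing to force-delete ${pod}: owned by a StatefulSet" >&2
    return 1
  fi
  kubectl delete pod "$pod" -n "$namespace" --force --grace-period=0
}
```

With this in place, guarded_force_delete kafka-2 would have stopped with an error instead of removing the Pod in the incident described here.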
Production debug guide: from symptom to root cause using only kubectl commands.
Symptom · 01
Pod stuck in Pending.
→
Fix
1. kubectl describe pod <name>: read the Events section.
2. If 'FailedScheduling' with 'Insufficient cpu/memory': check kubectl describe nodes | grep -A 5 'Allocated resources'.
3. If 'persistentvolumeclaim not found': kubectl get pvc to verify the PVC exists and is Bound.
4. If 'node(s) had taint': check kubectl describe nodes | grep Taints and compare with the Pod's tolerations.
5. If no events at all: the scheduler may be down. Check kubectl get pods -n kube-system | grep scheduler.
Symptom · 02
Pod in CrashLoopBackOff.
→
Fix
1. kubectl logs <pod> --previous: see why the last instance crashed.
2. kubectl describe pod <pod>: check 'Last State' for OOMKilled (exit code 137).
3. If OOMKilled: increase memory limit or fix the leak.
4. If application error: fix the code.
5. If probe failure: kubectl describe pod <pod> | grep -A 3 'Liveness\|Readiness' and check probe config against actual startup time.
Symptom · 03
Service returns 502/503.
→
Fix
1. kubectl get endpoints <service>: check whether the Service has endpoints, then kubectl describe service <service> to get the selector.
2. kubectl get pods -l app=<label>: verify Pods exist and are Ready (READY column shows 1/1).
3. kubectl exec -it <pod> -- curl localhost:<port>: test the app directly inside the Pod.
4. If Pods are Ready but endpoints still empty: check the Service selector matches Pod labels exactly (case-sensitive).
Symptom · 04
kubectl commands are slow or timeout.
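Fix
1. kubectl get --raw /healthz: check API server health. Newer clusters also expose /livez, which is preferred over the deprecated /healthz.
2. kubectl get --raw /readyz: check whether the API server is ready.
3. Check etcd health: kubectl exec -n kube-system etcd-<control-plane-node> -- etcdctl endpoint health. The etcd Pod name depends on the node name, and etcdctl usually needs --cacert, --cert, and --key pointing at the etcd certificates.
4. If the API server is healthy but kubectl is slow: check your kubeconfig context with kubectl config current-context. You may be hitting a distant cluster over VPN.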
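The first, second, and fourth steps can be combined so that the context is always printed next to the health result. A minimal sketch, assuming a hypothetical helper named api_health and an arbitrary 5-second request timeout:

```shell
# Sketch: print the target context, then the API server's per-check readiness.
# api_health and the 5s timeout are assumptions for illustration.
api_health() {
  local context="${1:-$(kubectl config current-context)}"
  echo "context: ${context}"
  # ?verbose lists each readiness check individually instead of a bare "ok".
  kubectl --context "${context}" get --raw '/readyz?verbose' --request-timeout=5s
}
```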
Symptom · 05
Deployment rollout hangs — never completes.
Fix
1. kubectl rollout status deployment/<name>: see which ReplicaSet is not becoming ready.
2. kubectl describe rs <new-rs-name>: check if Pods are Pending or CrashLoopBackOff.
3. If image pull failure: kubectl describe pod <pod> | grep 'Failed', then verify the image exists and imagePullSecrets are configured.
4. If maxUnavailable=0 and no capacity: the rollout cannot proceed because old Pods cannot be removed until new Pods are Ready.
5. Rollback: kubectl rollout undo deployment/<name>.
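The status check and the rollback can be chained so that a hung rollout reverts on its own. This is a sketch, not a recommended policy: rollout_or_undo is a hypothetical helper, and the 120s default timeout is an arbitrary assumption.

```shell
# Sketch: wait for a rollout and undo it on timeout (helper name is an assumption).
rollout_or_undo() {
  local deploy="$1" namespace="${2:-default}" timeout="${3:-120s}"
  # rollout status exits non-zero if the rollout has not completed within --timeout.
  if kubectl rollout status "deployment/${deploy}" -n "${namespace}" --timeout="${timeout}"; then
    echo "rollout of ${deploy} completed"
  else
    echo "rollout of ${deploy} did not complete within ${timeout}; rolling back" >&2
    kubectl rollout undo "deployment/${deploy}" -n "${namespace}"
  fi
}
```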
kubectl Triage Cheat Sheet
First-response commands for common production incidents. Scan and execute.
Pod not starting: no events visible.
Immediate action
Check scheduler health and node capacity.
Commands
kubectl get pods -n kube-system | grep scheduler
kubectl describe nodes | grep -A 5 'Allocated resources'
Fix now
If scheduler down: check kube-system logs. If no capacity: scale cluster or evict low-priority Pods.
Container keeps restarting (CrashLoopBackOff).
Immediate action
Get logs from the crashed container instance.
Commands
kubectl logs <pod> --previous --tail=50
kubectl describe pod <pod> | grep -A 5 'Last State'
Fix now
If OOMKilled: increase memory limit. If application error: fix code. If probe failure: adjust initialDelaySeconds.
Service unreachable: connection refused or timeout.
Immediate action
Check if Service has endpoints.
Commands
kubectl get endpoints <service-name>
kubectl get pods -l app=<selector> -o wide
Fix now
If no endpoints: Pods are not Ready or labels don't match. If endpoints exist: test Pod directly with kubectl exec -- curl.
Node marked NotReady: Pods being evicted.
Immediate action
Check node conditions and kubelet status.
Commands
kubectl describe node <node> | grep -A 10 'Conditions'
kubectl get events --field-selector involvedObject.name=<node> --sort-by='.lastTimestamp'
Fix now
If disk pressure: clean images with crictl rmi --prune. If memory pressure: identify the leak. If kubelet down: SSH and systemctl restart kubelet.
PersistentVolumeClaim stuck in Pending.
Immediate action
Check available PersistentVolumes and StorageClass.
Commands
kubectl get pv
kubectl describe pvc <pvc-name>
Fix now
If no PV available: provision one or check StorageClass provisioner. If PV exists but not binding: verify accessModes and storageClassName match.
ImagePullBackOff: container image cannot be pulled.
Immediate action
Verify image name, tag, and registry credentials.
Commands
kubectl describe pod <pod> | grep -A 5 'Events'
kubectl get secret <imagepullsecret-name> -o jsonpath='{.data.\.dockerconfigjson}' | base64 -d
Fix now
If image name wrong: fix the Deployment spec. If auth failure: recreate imagePullSecret with correct credentials. If private registry: verify network access from the node.
kubectl Command Comparison
| Command | HTTP Verb | Use Case | Production Risk |
| --- | --- | --- | --- |
| kubectl get | GET | Read current state of resources. Fast, cached, safe. | None: read-only operation. |
| kubectl describe | GET + Events query | Detailed view with events, conditions, annotations. Primary debugging tool. | None: read-only. Events expire after 1 hour by default; investigate promptly. |
| kubectl apply | PATCH (strategic merge) | Declarative state management. Idempotent. GitOps standard. | Medium: selector changes can orphan Pods. Always dry-run first. |
| kubectl create | POST | Create a new resource. Fails if it already exists. | Low: fails safely if the resource exists. Not idempotent. |
| kubectl replace | PUT | Full resource replacement. Overwrites everything. | High: replaces the entire resource. Missing fields are removed. Use with caution. |
| kubectl delete | DELETE | Remove a resource. Triggers graceful termination. | Medium: deleting a Deployment also deletes its ReplicaSets and Pods unless --cascade=orphan is set. --force on StatefulSet Pods can corrupt storage. |
| kubectl logs | GET (via kubelet proxy) | Retrieve container stdout/stderr. | None: read-only. Logs are ephemeral; aggregate externally. |
| kubectl exec | POST (SPDY/WebSocket) | Run commands inside a container. | Medium: can modify running state. Audit exec usage in production. |
| kubectl port-forward | POST (SPDY tunnel) | Tunnel local port to Pod/Service. | Low: bypasses network policies. Do not use as a permanent access method. |
| kubectl rollout undo | PATCH (roll back ReplicaSet) | Revert Deployment to previous revision. | Low: safest rollback method. Respects rolling update parameters. |
Key takeaways
1. kubectl describe shows events: always check the Events section when a pod is not starting.
2. kubectl logs --previous retrieves logs from a crashed container: essential for crash debugging.
3. kubectl port-forward lets you access a pod or service without a LoadBalancer: great for development.
4. kubectl rollout undo is a one-command rollback: no redeployment needed.
5. Use -n NAMESPACE on every command that targets a namespace other than default.
6. Always dry-run before applying to production: kubectl apply -f file.yaml --dry-run=client.
7. kubectl debug creates ephemeral containers for debugging distroless images: an alternative to exec when the image ships no shell.
8. kubectl get with -o jsonpath or jq is the foundation of production automation and monitoring scripts.
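As an illustration of the last takeaway, the sketch below uses -o jsonpath to feed a simple restart-count check. The helper names and the default threshold of 5 are assumptions, and only the first container of each Pod is inspected.

```shell
# Sketch: find Pods whose first container has restarted too often.
# restart_counts, high_restarts, and the threshold are illustrative assumptions.
restart_counts() {
  local namespace="${1:-default}"
  # Emit one "name<TAB>restartCount" line per Pod.
  kubectl get pods -n "${namespace}" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.containerStatuses[0].restartCount}{"\n"}{end}'
}

high_restarts() {
  local namespace="${1:-default}" threshold="${2:-5}"
  # Force numeric comparison and print only the Pod names above the threshold.
  restart_counts "${namespace}" | awk -F'\t' -v t="${threshold}" '($2 + 0) > (t + 0) { print $1 }'
}
```

A monitoring script could call high_restarts <namespace> <threshold> on a schedule and alert when the output is non-empty.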
Frequently Asked Questions
01
How do I debug a pod that is in CrashLoopBackOff?
Check the crash reason with kubectl describe pod POD_NAME and look at the 'Last State' section and 'Events'. Check logs from the crashed container with kubectl logs POD_NAME --previous. Common causes: application crash on startup (check logs for a stack trace), a liveness probe failing before the app finishes starting (check probe config), missing environment variables or ConfigMap, and OOMKilled (memory limit too low).
02
How do I access a ConfigMap or Secret in a running pod?
kubectl exec -it POD_NAME -- env shows all environment variables including those from ConfigMaps and Secrets. kubectl exec -it POD_NAME -- cat /etc/config/KEY shows a file-mounted ConfigMap. kubectl get configmap NAME -o yaml shows the ConfigMap contents.
03
What is the difference between kubectl delete pod and kubectl delete pod --force?
kubectl delete pod sends a graceful termination signal. The kubelet gives the container time to shut down (default 30 seconds), unmounts volumes, and cleans up. kubectl delete pod --force --grace-period=0 skips all of this — the Pod is removed from the API server immediately, but the container may still be running on the node. Use --force only when the Pod is stuck in Terminating and the kubelet is unreachable. Never use it on StatefulSet Pods.
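The rule about StatefulSet Pods can be enforced in tooling by checking the Pod's owner before force-deleting. A minimal sketch follows; guarded_force_delete is a hypothetical helper, not a kubectl feature, and it only inspects the first owner reference.

```shell
# Sketch: refuse to force-delete Pods owned by a StatefulSet.
# guarded_force_delete is an assumption for illustration.
guarded_force_delete() {
  local pod="$1" namespace="${2:-default}" owner
  owner="$(kubectl get pod "${pod}" -n "${namespace}" \
    -o jsonpath='{.metadata.ownerReferences[0].kind}')" || return 1
  if [ "${owner}" = "StatefulSet" ]; then
    echo "refusing: ${pod} belongs to a StatefulSet; use a graceful delete" >&2
    return 1
  fi
  kubectl delete pod "${pod}" -n "${namespace}" --force --grace-period=0
}
```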
04
How do I see what kubectl is actually sending to the API server?
Add -v=6 (or higher) to any kubectl command to see the HTTP requests it makes. -v=6 shows request URLs and response codes. -v=8 adds request and response bodies, truncated. -v=9 shows untruncated bodies along with curl commands you could use to replicate the request. Example: kubectl get pods -v=6 shows the GET request to /api/v1/namespaces/default/pods.
05
How do I switch between multiple Kubernetes clusters?
kubectl config get-contexts shows all available contexts (clusters). kubectl config use-context <context-name> switches to a different cluster. kubectl config current-context shows which cluster you are targeting. For safety, always run kubectl config current-context before applying changes to verify you are on the right cluster. Use kubectx (a popular third-party tool) for faster context switching.
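The "check the context first" habit can be turned into a guard that scripts call before any write. This is a sketch; require_context is a hypothetical helper name.

```shell
# Sketch: abort unless kubectl is pointed at the expected context.
# require_context is an assumption for illustration.
require_context() {
  local expected="$1" current
  current="$(kubectl config current-context)" || return 1
  if [ "${current}" != "${expected}" ]; then
    echo "refusing to continue: current context is '${current}', expected '${expected}'" >&2
    return 1
  fi
}
```

Used as require_context <context-name> && kubectl apply -f file.yaml, the apply only runs when the active context matches.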