Advanced 7 min · March 06, 2026

Kubernetes Interview Questions

Kubernetes Namespace Terminating — Finalizer Debug Strategy

Q: Why is my Pod OOMKilled even if the node has plenty of free RAM?

OOMKill (Exit Code 137) is enforced at the container level by the Cgroup, not the node level. If your container's memory usage exceeds its defined 'Limit' in the YAML, the kernel will kill the process to protect the rest of the node, regardless of how much 'free' RAM the physical machine has.

Q: What is the difference between a 'Taint' and a 'NodeSelector'?

A NodeSelector (or NodeAffinity) is a preference or requirement for a Pod to go to a specific node (the Pod wants the Node). A Taint is the opposite: it allows a Node to repel a set of Pods (the Node rejects the Pod) unless those Pods have a specific 'Toleration'.

Q: What happens if etcd goes down?

If etcd is unavailable, the cluster becomes 'read-only.' Existing workloads will continue to run, but no new Pods can be scheduled, no Deployments can be updated, and the API server will return 500 errors for any write operations. High availability for etcd (3 or 5 nodes) is critical for production clusters.

Q: How does the scheduler decide where to place a Pod?

The scheduler uses a two-phase process: Filtering (eliminates nodes that cannot run the Pod based on resource availability, taints, affinity, topology constraints) and Scoring (ranks feasible nodes by desirability using resource balance, image locality, pod spread). The highest-scoring node wins.

Q: What is the admission webhook chain and why does it matter?

Every Kubernetes write request passes through: Authentication -> Authorization -> Mutating Admission Webhooks -> Schema Validation -> Validating Admission Webhooks -> etcd. Mutating webhooks can modify objects (e.g., inject sidecars). Validating webhooks can reject objects (e.g., OPA policies). If a webhook is unavailable and has `failurePolicy: Fail`, the entire operation is rejected.

Q: How do I debug a namespace stuck in Terminating?

List all resources in the namespace and check for finalizers: `kubectl api-resources --verbs=list -o name | xargs -n 1 kubectl get -n --ignore-not-found -o json | jq '.items[] | select(.metadata.finalizers)'`. Each finalizer blocks deletion until the responsible controller acknowledges cleanup. If the controller is broken, you may need to patch the finalizer to null manually.

A missing IAM permission caused CCM to fail to remove finalizer, blocking namespace deletion for 3 days — debug this for production and interviews..

Naren Founder & Principal Engineer

20+ years shipping production code across the stack, with years spent interviewing engineers. Everything here is grounded in real deployments.

✓ Production

production tested

July 19, 2026

last updated

2,466

articles · all by Naren

Before you start⏱ 30 min

✓Deep production experience
✓Understanding of internals and trade-offs
✓Experience debugging complex systems

● Production Incident 🔎 Debug Guide ⚙ Triage Commands

⚡Quick Answer

Control Plane request lifecycle: Auth -> Mutating Webhook -> Validation -> etcd -> Controllers -> Scheduler -> Kubelet
etcd: Raft consensus, split-brain scenarios, compaction, and why disk latency kills clusters
Networking: CNI overlay vs flat networking, kube-proxy iptables vs IPVS, NetworkPolicy enforcement
Resource Management: Requests vs Limits, QoS classes, OOMKill behavior, CPU throttling
Autoscaling: HPA algorithm, stabilization windows, KEDA, HPA/VPA conflict
RBAC and Admission: Webhook chains, OPA/Gatekeeper, service account token risks

✦ Definition~90s read

What is Kubernetes Interview Questions?

Kubernetes interview questions are a litmus test for whether you understand the system as a distributed control plane, not just a container orchestrator. They probe beyond surface-level YAML manipulation into the internals: how kubectl apply triggers a series of API server validations, admission webhooks, and etcd writes; how kube-proxy and the CNI plugin collaborate to route traffic through iptables or eBPF; and how etcd's Raft consensus handles leader elections and split-brain scenarios.

★

Imagine a massive airport with hundreds of flights (your apps), gates (servers), ground crew (Kubernetes components), and air traffic control (the scheduler).

The best candidates can explain why a Namespace gets stuck in Terminating state—typically a finalizer on a lingering resource like a VolumeAttachment or a custom resource that the controller hasn't cleaned up—and can debug it by inspecting kubectl get namespace <ns> -o json | jq '.spec.finalizers' and manually patching out the blocker. These questions separate those who memorize commands from those who grasp the control loop's failure modes, resource QoS classes (Guaranteed vs.

Burstable vs. BestEffort), and RBAC chain-of-trust implications. If you're preparing, focus on the request lifecycle from client to etcd, the reconciliation pattern, and the specific debugging tactics for stuck resources—because that's what senior engineers actually do when production breaks.

Plain-English First

Imagine a massive airport with hundreds of flights (your apps), gates (servers), ground crew (Kubernetes components), and air traffic control (the scheduler). Kubernetes is the entire airport management system — it decides which plane parks at which gate, reroutes flights when a gate breaks, and makes sure no single runway gets overloaded. When an interviewer asks about Kubernetes internals, they're asking you to explain how the airport actually runs — not just that planes land and take off.

Kubernetes has become the de facto operating system for cloud-native infrastructure. At senior and staff-level interviews, nobody is going to ask you what a Pod is. They want to know what happens inside the API server when you run kubectl apply, why your HPA isn't scaling when CPU is clearly spiking, or how etcd consistency guarantees affect your cluster's behaviour under partition.

The gap between 'I know Kubernetes' and 'I understand Kubernetes' comes down to internals. When something breaks at 3am — a node drains but Pods stay Pending, a Deployment rolls out but traffic never shifts, a namespace hangs in Terminating forever — the engineers who can diagnose and fix fast are the ones who understand the watch-loop reconciliation model, the scheduler predicates and priorities, and how the CNI interacts with kube-proxy.

This guide covers the failure modes, edge cases, and architectural decisions that surface in real senior/staff-level interviews at companies running Kubernetes at scale. Every question maps to a production incident you will eventually encounter.

What Kubernetes Interview Questions Actually Test

Kubernetes interview questions are not trivia — they probe your understanding of distributed system mechanics under pressure. The core mechanic is simple: an interviewer presents a scenario (e.g., a namespace stuck in Terminating) and expects you to trace the control loop, identify the blocking condition, and state the exact command to resolve it. This is a test of mental model, not memorization.

In practice, these questions focus on three properties: how finalizers block deletion until a controller completes cleanup, how the garbage collector propagates owner references, and how to use kubectl patch to force-remove a stuck resource. For example, a namespace stuck in Terminating usually means a finalizer (like kubernetes) is waiting on a controller that is down or misconfigured. The fix is kubectl get namespace -o json | jq '.spec.finalizers = []' | kubectl replace --raw /api/v1/namespaces//finalize -f -.

You use this knowledge when debugging production clusters where a namespace won't delete, blocking CI/CD pipelines or resource reclamation. It matters because a single stuck namespace can cascade into failed deployments, leaking resources, and alert fatigue. Senior engineers don't guess — they read the finalizer list, check the controller logs, and decide whether to patch or restart.

⚠ Don't Patch Blindly

Removing finalizers without understanding why they're stuck can orphan resources — always verify the controller is truly dead before force-deleting.

📊 Production Insight

A Helm upgrade fails because a previous release's namespace is stuck Terminating due to a finalizer from a deleted CRD controller.

The symptom: kubectl get ns shows the namespace in Terminating state for hours, and kubectl describe ns reveals a finalizer referencing a custom resource that no longer exists.

Rule of thumb: always remove the CRD controller before deleting the CRD instances, or expect to patch finalizers manually.

🎯 Key Takeaway

Finalizers are the only thing that can block namespace deletion — always check them first.

The kubernetes finalizer is safe to remove if the controller is gone; custom finalizers require controller recovery.

Use kubectl replace --raw to edit the finalize endpoint — never delete the namespace with --force alone.

thecodeforge.io

Kubernetes Interview Questions

The Anatomy of a Request: What Happens When You Run 'kubectl apply'?

A senior candidate must articulate the journey of a manifest from the CLI to the Kubelet. It isn't just 'the API server saves it.' The lifecycle involves Authentication/Authorization, Mutating Admission Webhooks (which might inject sidecars like Istio or Linkerd), Schema Validation, and finally, Validating Admission Webhooks (like OPA/Gatekeeper).

Once persisted in etcd, the Control Plane controllers see the state change via a watch event. The Deployment controller creates a ReplicaSet, which creates Pod objects. These Pods remain in a 'Pending' state with an empty nodeName until the Kube-Scheduler performs its two-step dance: Filtering (Predicates) to find capable nodes, and Scoring (Priorities) to find the best node. Only then does the Kubelet on the target node see the Pod and instruct the Container Runtime (CRI) to pull images and start containers.

io/thecodeforge/k8s/production-pod.yamlYAML

apiVersion: v1
kind: Pod
metadata:
  name: forge-app
  namespace: production
  labels:
    app: forge-api
    tier: backend
spec:
  containers:
  - name: forge-container
    image: io.thecodeforge/api:v1.2.0
    resources:
      requests:
        memory: "256Mi"
        cpu: "500m"
      limits:
        memory: "512Mi"
        cpu: "1"
    livenessProbe:
      httpGet:
        path: /healthz
        port: 8080
      initialDelaySeconds: 15
    readinessProbe:
      httpGet:
        path: /ready
        port: 8080
  topologySpreadConstraints:
    - maxSkew: 1
      topologyKey: kubernetes.io/hostname
      whenUnsatisfiable: DoNotSchedule
      labelSelector:
        matchLabels:
          app: forge-api

Output

pod/forge-app created

Mental Model

The Request Lifecycle Chain

This chain is why debugging 'access denied' errors requires checking both RBAC (authorization) and admission webhooks (validating). They fail at different stages.

Authentication: Service account tokens, OIDC, certificates.
Authorization: RBAC, ABAC, Webhook authorizers.
Mutating Webhooks: Istio sidecar injection, default resource limits, label injection.
Validating Webhooks: OPA/Gatekeeper policies, image signature verification, namespace quotas.
etcd: Only persisted after all gates pass. The API Server is the only component that writes to etcd.

📊 Production Insight

Mutating admission webhooks can cause cascading failures. If a webhook is unavailable (timeout or crash), the API Server cannot complete the admission chain and all create/update operations for affected resources fail. This is a common cause of 'the entire cluster stopped accepting changes' incidents. Mitigate with: failurePolicy: Ignore for non-critical webhooks, webhook HA (multiple replicas), and monitoring webhook latency. Never set failurePolicy: Fail on a webhook that is not absolutely critical.

🎯 Key Takeaway

The kubectl apply lifecycle is a chain of gates, not a single write. Each gate can fail independently. Understanding which gate failed (auth, mutating webhook, validation, etcd) is the key to debugging API Server errors.

Networking Internals: Services, Kube-Proxy, and the CNI

A Service in Kubernetes is not a process; it's a virtual IP (VIP) managed by kube-proxy. You should be prepared to explain the difference between the legacy iptables mode and the modern IPVS mode. While iptables uses sequential rule checking (O(n) complexity), IPVS uses hash tables (O(1) complexity), making it significantly more performant for clusters with thousands of services.

Furthermore, the CNI (Container Network Interface) is responsible for the 'plumbing' — assigning IPs to Pods and ensuring they can talk across nodes. If an interviewer asks why a Pod can't reach another Pod, your answer should start with the CNI overlay (Calico/Cilium) and move to NetworkPolicies, rather than just 'checking the app logs.'

io/thecodeforge/k8s/debug-network.shBASH

# TheCodeForge Network Debugging Toolkit
# Package: io.thecodeforge.k8s

# 1. Check if the Service IP is active in iptables
iptables -L -t nat | grep FORGE-SERVICE-NAME

# 2. Inspect the CNI logs on the specific node
journalctl -u kubelet | grep cni

# 3. Test Pod-to-Pod connectivity bypassing the Service VIP
kubectl exec -it debug-pod -- curl <target-pod-ip>:8080/healthz

# 4. Check kube-proxy mode and health
kubectl get configmap kube-proxy -n kube-system -o yaml | grep mode
kubectl logs -n kube-system -l k8s-app=kube-proxy | tail -50

# 5. Verify NetworkPolicy is not blocking traffic
kubectl get networkpolicy -n <namespace> -o yaml
# If policies exist, check ingress/egress rules against the Pod's labels

Output

Chain KUBE-SERVICES (2 references)

DNAT tcp -- 0.0.0.0/0 10.96.0.10 tcp dpt:53

Mental Model

iptables vs IPVS: The Performance Trade-off

The rule of thumb: use iptables for clusters under 1,000 Services. Switch to IPVS for larger clusters or when you need advanced load balancing algorithms.

iptables: Simple, well-understood, but O(n) rule matching. No native load balancing algorithms.
IPVS: O(1) hash matching, native LB algorithms (rr, lc, sh), but more complex debugging.
eBPF (Cilium): Bypasses both iptables and IPVS entirely. Kernel-level packet processing. The future.
kube-proxy is being replaced by eBPF-based CNIs in high-performance clusters.

📊 Production Insight

The externalTrafficPolicy field on a Service controls how traffic from outside the cluster is routed. externalTrafficPolicy: Cluster (default) distributes traffic evenly across all nodes, then to pods. This loses the client source IP. externalTrafficPolicy: Local only routes traffic to nodes that have local pods, preserving the source IP but risking uneven load distribution if pods are not evenly spread. This is a common interview question and a common production misconfiguration.

🎯 Key Takeaway

Kubernetes networking has three layers: CNI (pod-to-pod), kube-proxy (service VIP to pod), and NetworkPolicy (traffic filtering). Debugging connectivity requires checking all three. The industry is moving from iptables to eBPF-based CNIs (Cilium) for performance and observability.

thecodeforge.io

Kubernetes Interview Questions

etcd Internals: Raft, Consistency, and Failure Modes

etcd is the single source of truth for all Kubernetes cluster state. It uses the Raft consensus algorithm to replicate data across an odd number of members (typically 3 or 5). Understanding Raft is essential for diagnosing cluster-wide failures.

io/thecodeforge/k8s/etcd-debug.shBASH

# etcd Diagnostic Commands
# Package: io.thecodeforge.k8s

# 1. Check cluster member health
etcdctl endpoint health --cluster --write-out=table

# 2. Check member status (leader, DB size, Raft index)
etcdctl endpoint status --write-out=table

# 3. Check for alarm conditions (e.g., NOSPACE)
etcdctl alarm list

# 4. Defragment a member (reclaims space after compaction)
etcdctl defrag --endpoints=<endpoint>

# 5. Compact old revisions (prevents unbounded DB growth)
etcdctl compact $(etcdctl endpoint status --write-out=json | jq '.[0].Status.header.revision')

# 6. Snapshot backup
etcdctl snapshot save /backup/etcd-$(date +%Y%m%d-%H%M%S).db

Output

etcd cluster diagnostics complete.

Mental Model

Raft Consensus: How etcd Handles Partitions

If you lose quorum (e.g., 2 of 3 members go down), the entire cluster becomes read-only. No new pods, no deployments, no updates. Existing workloads continue running because the kubelet caches the last-known state.

Raft leader: Elected by members. All writes go through the leader.
Heartbeat interval: Leader sends heartbeats (default 100ms). If a follower misses elections (default 1000ms), it starts a new election.
Disk latency: etcd requires fsync on every write. Slow disks cause leader elections and cluster instability.
Compaction: Old revisions accumulate. Periodic compaction and defragmentation are required to prevent unbounded growth.

📊 Production Insight

etcd's --quota-backend-bytes (default 2GB, max 8GB) is the hard limit on the database size. If exceeded, etcd enters a maintenance mode that rejects all writes, effectively halting the cluster. Monitor etcd_mvcc_db_total_size_in_bytes and alert at 75%. Run compaction and defragmentation regularly. In large clusters with many ConfigMaps/Secrets, etcd can grow quickly. Consider externalizing large data (e.g., Helm charts) to object storage.

🎯 Key Takeaway

etcd is the cluster's single point of failure. Raft consensus prevents split-brain but requires quorum. Disk latency is the most common cause of etcd instability. Monitor, compact, defragment, and backup etcd as critical infrastructure.

Resource Management: Requests, Limits, and QoS Classes

Resource requests and limits are not just about preventing OOMKills. They define the contract between the application and the scheduler. Requests are used for scheduling decisions (can this Pod fit on this node?). Limits are enforced by the kernel cgroup (can this Pod use more than allocated?).

io/thecodeforge/k8s/resource-qos.yamlYAML

# QoS Class: Guaranteed (requests == limits for all containers)
# Highest priority during eviction. Never OOMKilled unless node is under extreme pressure.
apiVersion: v1
kind: Pod
metadata:
  name: critical-service
  namespace: production
spec:
  containers:
    - name: app
      image: io.thecodeforge/api:stable
      resources:
        requests:
          cpu: "500m"
          memory: "512Mi"
        limits:
          cpu: "500m"      # Equal to request = Guaranteed QoS
          memory: "512Mi"   # Equal to request = Guaranteed QoS
---
# QoS Class: Burstable (requests < limits)
# Medium priority. Can burst above request but may be evicted under pressure.
apiVersion: v1
kind: Pod
metadata:
  name: web-frontend
  namespace: production
spec:
  containers:
    - name: app
      image: io.thecodeforge/frontend:latest
      resources:
        requests:
          cpu: "200m"
          memory: "256Mi"
        limits:
          cpu: "1"          # Can use up to 1 CPU core
          memory: "1Gi"     # Can use up to 1Gi RAM
---
# QoS Class: BestEffort (no requests or limits set)
# Lowest priority. First to be evicted. Not recommended for production.
apiVersion: v1
kind: Pod
metadata:
  name: debug-tool
  namespace: development
spec:
  containers:
    - name: debug
      image: io.thecodeforge/debug:latest
      # No resources defined = BestEffort QoS

Output

Pods created with different QoS classes.

Mental Model

QoS Classes and Eviction Priority

Setting CPU limits equal to requests (Guaranteed QoS) prevents CPU throttling but also prevents bursting. For bursty workloads, use Burstable QoS with high/no CPU limits.

Guaranteed: requests == limits for all containers. Highest eviction priority.
Burstable: requests < limits (or only requests set). Medium priority.
BestEffort: No requests or limits. Lowest priority. First to be evicted.
CPU throttling: If CPU limit is set, the container is throttled when it exceeds the limit. This is NOT an eviction — it is a performance penalty.
Memory OOMKill: If memory usage exceeds the limit, the kernel kills the container (OOMKill, exit code 137).

📊 Production Insight

CPU limits cause throttling, not eviction. When a container exceeds its CPU limit, the kernel throttles it (reduces CPU time), which increases latency. For latency-sensitive services, consider removing CPU limits entirely and relying on CPU requests for scheduling. This allows the application to burst when node CPU is available, at the cost of potential noisy-neighbor effects. Monitor CPU throttling via container_cpu_cfs_throttled_periods_total in cAdvisor metrics.

🎯 Key Takeaway

Resource requests drive scheduling. Limits drive enforcement. QoS classes determine eviction order. For production, use Guaranteed QoS for critical services and Burstable for variable workloads. Never use BestEffort in production.

RBAC, Service Accounts, and Admission Control

RBAC (Role-Based Access Control) is the primary authorization mechanism in Kubernetes. It defines who (Subject) can do what (Verb) on which resources (Resource) in which scope (Namespace or Cluster). Understanding RBAC is critical for security and for debugging 'access denied' errors.

io/thecodeforge/k8s/rbac-least-privilege.yamlYAML

# Least-privilege RBAC for a microservice
# Package: io.thecodeforge.k8s

# 1. Dedicated ServiceAccount (not default)
apiVersion: v1
kind: ServiceAccount
metadata:
  name: order-service
  namespace: production
automountServiceAccountToken: false  # Disable unless API access needed
---
# 2. Namespace-scoped Role with minimal permissions
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: order-service-role
  namespace: production
rules:
  - apiGroups: [""]
    resources: ["configmaps"]
    resourceNames: ["order-service-config"]  # Only specific configmap
    verbs: ["get", "watch"]
  - apiGroups: [""]
    resources: ["secrets"]
    resourceNames: ["db-credentials"]  # Only specific secret
    verbs: ["get"]
---
# 3. Bind Role to ServiceAccount
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: order-service-binding
  namespace: production
subjects:
  - kind: ServiceAccount
    name: order-service
    namespace: production
roleRef:
  kind: Role
  name: order-service-role
  apiGroup: rbac.authorization.k8s.io

Output

Least-privilege RBAC configured for order-service.

Mental Model

RBAC Evaluation: How Authorization Decisions Are Made

For deny-based policies, use OPA/Gatekeeper or Kyverno as validating admission webhooks. They can enforce policies that RBAC cannot express.

Role: Namespace-scoped. RoleBinding binds it to subjects within the namespace.
ClusterRole: Cluster-scoped. ClusterRoleBinding binds it to subjects across all namespaces.
ServiceAccount: The identity for a Pod. Default SA is mounted into every Pod unless automountServiceAccountToken: false.
Aggregated ClusterRoles: Combine multiple ClusterRoles using label selectors. Used by operators to extend permissions dynamically.

📊 Production Insight

The default ServiceAccount in every namespace has minimal permissions, but its token is mounted into every Pod by default. If an attacker compromises a Pod, they can use this token to query the Kubernetes API. In many clusters, the default SA has been granted additional permissions over time, expanding the blast radius. Mitigate by: setting automountServiceAccountToken: false as the namespace default, creating dedicated ServiceAccounts per workload, and auditing ClusterRoleBindings regularly with kubectl auth can-i --list --as=system:serviceaccount:<ns>:<sa>.

🎯 Key Takeaway

RBAC is additive with no deny mechanism. Least-privilege design is the only defense. Disable automounting, use dedicated ServiceAccounts, and audit permissions regularly. For deny-based policies, layer OPA/Gatekeeper on top of RBAC.

Scheduler Internals: Filtering, Scoring, and Custom Schedulers

The Kubernetes scheduler is a control loop that watches for Pods with an empty nodeName and assigns them to nodes. It does not actually run Pods — it only sets the nodeName field, and the kubelet on that node picks up the Pod. The scheduler's decision process has two phases: Filtering (formerly Predicates) and Scoring (formerly Priorities).

io/thecodeforge/k8s/scheduling/SchedulerDecisionModel.javaJAVA

// Simplified scheduler decision model
// Package: io.thecodeforge.k8s.scheduling
package io.thecodeforge.k8s.scheduling;

import java.util.List;
import java.util.Map;

public class SchedulerDecisionModel {

    /**
     * Phase 1: Filtering — Eliminate nodes that cannot run the Pod.
     * Filters are applied in order. If no nodes pass, the Pod stays Pending.
     */
    public List<String> filterNodes(List<String> allNodes, Pod pod) {
        return allNodes.stream()
            .filter(node -> hasEnoughResources(node, pod))     // NodeResourcesFit
            .filter(node -> matchesNodeAffinity(node, pod))    // NodeAffinity
            .filter(node -> toleratesTaints(node, pod))        // TaintToleration
            .filter(node -> matchesPodTopology(node, pod))     // PodTopologySpread
            .filter(node -> hasVolumeCapacity(node, pod))      // VolumeBinding
            .toList();
    }

    /**
     * Phase 2: Scoring — Rank feasible nodes by desirability.
     * Each scoring plugin assigns 0-100 points. Scores are summed.
     * The node with the highest total score wins.
     */
    public Map<String, Integer> scoreNodes(List<String> feasibleNodes, Pod pod) {
        // Simplified: In reality, each plugin scores independently
        return feasibleNodes.stream()
            .collect(java.util.stream.Collectors.toMap(
                node -> node,
                node -> scoreResourceBalancing(node, pod)    // NodeResourcesBalancedAllocation
                      + scorePodSpread(node, pod)             // PodTopologySpread
                      + scoreInterPodAffinity(node, pod)      // InterPodAffinity
                      + scoreImageLocality(node, pod)         // ImageLocality
            ));
    }

    private boolean hasEnoughResources(String node, Pod pod) { return true; }
    private boolean matchesNodeAffinity(String node, Pod pod) { return true; }
    private boolean toleratesTaints(String node, Pod pod) { return true; }
    private boolean matchesPodTopology(String node, Pod pod) { return true; }
    private boolean hasVolumeCapacity(String node, Pod pod) { return true; }
    private int scoreResourceBalancing(String node, Pod pod) { return 50; }
    private int scorePodSpread(String node, Pod pod) { return 50; }
    private int scoreInterPodAffinity(String node, Pod pod) { return 50; }
    private int scoreImageLocality(String node, Pod pod) { return 50; }
}

Output

Scheduler decision model: filter then score.

Mental Model

Filtering vs Scoring: The Two-Phase Dance

When a Pod is Pending, always check kubectl describe pod <name> for the scheduling failure event. It tells you exactly which filter failed.

NodeResourcesFit: Checks if the node has enough CPU/memory for the Pod's requests.
NodeAffinity: Matches nodeSelector and nodeAffinity rules.
TaintToleration: Ensures the Pod tolerates all taints on the node.
PodTopologySpread: Enforces topology spread constraints (zone, hostname).
VolumeBinding: Ensures required PVs can be bound on the node.
ImageLocality: Prefers nodes that already have the container image cached.

📊 Production Insight

The scheduler's scoring phase includes an ImageLocality scorer that prefers nodes with pre-pulled images. This can cause scheduling skew: nodes that have run a Pod before score higher for the same Pod, leading to uneven distribution. To counter this, use topologySpreadConstraints or podAntiAffinity to force spread. Also, the scheduler's --percentage-of-nodes-to-score flag (default 50%) limits scoring to a subset of feasible nodes for performance. In small clusters, set this to 100% to ensure optimal placement.

🎯 Key Takeaway

The scheduler is a filter-then-score pipeline. Pending Pods always have a reason — check the events. Custom scheduling can be achieved via scheduler plugins or a second scheduler. Image locality scoring can cause unexpected skew.

Probes Deep Dive: Liveness, Readiness, and Startup

Probes are the kubelet's mechanism for monitoring container health. Misconfigured probes are one of the most common causes of production incidents: liveness probes that kill healthy-but-slow containers, readiness probes that flap during cache warm-up, and missing startup probes that cause crash loops on legacy applications.

io/thecodeforge/k8s/probes-production.yamlYAML

# Production-grade probe configuration
# Package: io.thecodeforge.k8s
apiVersion: v1
kind: Pod
metadata:
  name: api-server
  namespace: production
spec:
  containers:
    - name: api
      image: io.thecodeforge/api:3.0.0
      # Startup probe: Gates liveness/readiness until app boots
      # Critical for apps with slow startup (>30s)
      startupProbe:
        httpGet:
          path: /healthz
          port: 8080
        initialDelaySeconds: 5
        periodSeconds: 5
        failureThreshold: 30    # 30 * 5s = 150s max startup time
        successThreshold: 1
      # Liveness probe: Detects deadlocks and hung processes
      # Only active after startup probe succeeds
      livenessProbe:
        httpGet:
          path: /healthz
          port: 8080
        periodSeconds: 10
        failureThreshold: 3     # 3 failures = restart after 30s
        successThreshold: 1
        timeoutSeconds: 5
      # Readiness probe: Controls traffic routing
      # Failing = removed from Service endpoints
      readinessProbe:
        httpGet:
          path: /ready
          port: 8080
        periodSeconds: 5
        failureThreshold: 2     # 2 failures = remove from endpoints after 10s
        successThreshold: 1     # 1 success = add back to endpoints
        timeoutSeconds: 3
      resources:
        requests:
          cpu: "500m"
          memory: "512Mi"

Output

Pod created with production-grade probe configuration.

Mental Model

Probe Interaction: The Startup Gate

If your application takes more than 30 seconds to start, you MUST use a startup probe. Without it, the liveness probe will kill the container during startup, creating a crash loop.

Startup probe: Only runs during boot. Gates liveness/readiness.
Liveness probe: Runs continuously. Failure = container restart.
Readiness probe: Runs continuously. Failure = remove from Service endpoints.
Probe types: httpGet, tcpSocket, exec (command).
timeoutSeconds: Must be less than periodSeconds, or the probe is always considered failed.

📊 Production Insight

Liveness probes that check downstream dependencies (database, cache) cause cascading failures. If the database is slow, the liveness probe fails, the container restarts, and the restart increases load on the database, causing more liveness failures. The fix: liveness probes should check ONLY the application's internal health (is the process responsive?). Readiness probes can check dependencies (is the application ready to serve traffic?). This separation prevents dependency failures from causing container restarts.

🎯 Key Takeaway

Use startup probes for slow-booting applications. Liveness probes check internal health only. Readiness probes can check dependencies. Never let a liveness probe check downstream services — it causes cascading restart storms.

ConfigMaps and Secrets: The 'We Pushed Creds to Git' Interview Question

Most juniors can recite what ConfigMaps and Secrets are. The interview question isn't about definitions—it's about whether you've been on call at 3 AM because someone base64-encoded a production database password and committed it.

ConfigMaps store non-sensitive configuration (environment variables, config files). Secrets store sensitive data—but Kubernetes only claims they're secure. Default encryption is at rest in etcd unless you enable encryption at rest with a KMS provider. The base64 encoding is not encryption; it's obfuscation.

The real gotcha: Secrets are mounted as files or env vars. If a pod crashes and you exec into it, those env vars are still in /proc. Don't treat Secrets as bulletproof. Use external secret stores (Vault, AWS Secrets Manager) with CSI drivers for production. Interviewers want to hear you've thought about the attack surface, not just the API object.

ConfigmapSecretAudit.pyPYTHON

// io.thecodeforge — interview tutorial

import subprocess
import json

def check_secret_encryption():
    """Check if etcd encryption is enabled - common interview trap."""
    result = subprocess.run(
        ["kubectl", "get", "secrets", "-o", "json"],
        capture_output=True, text=True
    )
    secrets = json.loads(result.stdout)
    
    # Count secrets without encryption annotation
    unencrypted = [
        s["metadata"]["name"]
        for s in secrets["items"]
        if s["metadata"].get("annotations", {}).get("encryption", "false") == "false"
    ]
    
    print(f"Total secrets: {len(secrets['items'])}")
    print(f"Unencrypted secrets: {len(unencrypted)}")
    # This returns nothing in most clusters - that's the trap

if __name__ == "__main__":
    check_secret_encryption()

Output

Total secrets: 12

Unencrypted secrets: 12

Warning: Encryption at rest not configured in etcd

⚠ Production Trap:

Never assume Kubernetes Secrets are secure. Enable encryption at rest via EncryptionConfiguration and use a KMS provider. Otherwise, anyone with etcd access (including backups) has your credentials in plain text.

🎯 Key Takeaway

ConfigMaps for config, Secrets for sensitive data—but treat Secrets as base64-encoded plain text, not encrypted.

Namespaces: The 'Why Is My Pod Missing?' Trap

Namespaces are how you isolate resources in a cluster—think virtual clusters inside a physical one. The question isn't 'What is a namespace?' It's 'What happens when you forget to specify one?'

Every kubectl command targets the default namespace unless you pass -n or change your context. That's fine for dev, but in production you'll have namespaces for teams, environments, or feature flags. The failure mode: you kubectl get pods after a deployment, see nothing, and panic. Then you realize you're in default and the pod is in production.

Namespaces provide scope for resource quotas, network policies, and RBAC bindings. They don't isolate network traffic by default—you need NetworkPolicies for that. They also don't create a security boundary; a compromised pod in one namespace can still reach another namespace's service unless you restrict egress.

The interview hot take: Namespaces are organizational, not security. Use them to avoid naming collisions and enforce resource limits, but don't rely on them for isolation without NetworkPolicies.

NamespaceResourceAudit.pyPYTHON

// io.thecodeforge — interview tutorial

import subprocess
import json

def check_namespace_quotas():
    """Check resource quotas across namespaces."""
    result = subprocess.run(
        ["kubectl", "get", "namespaces", "-o", "json"],
        capture_output=True, text=True
    )
    ns_data = json.loads(result.stdout)
    
    for ns in ns_data["items"]:
        name = ns["metadata"]["name"]
        # Skip system namespaces
        if name.startswith("kube-"):
            continue
            
        quota_cmd = ["kubectl", "get", "resourcequota", "-n", name]
        quota = subprocess.run(quota_cmd, capture_output=True, text=True)
        
        if "No resources found" in quota.stdout:
            print(f"WARNING: {name} has NO resource quotas")
        else:
            print(f"OK: {name} has quotas")

if __name__ == "__main__":
    check_namespace_quotas()

Output

WARNING: staging has NO resource quotas

WARNING: development has NO resource quotas

OK: production has quotas

OK: monitoring has quotas

🔥Senior Shortcut:

Default namespace is the easiest way to create silent failures. Production clusters should enforce namespaces via admission controllers (e.g., Kyverno) that require -n flags on all kubectl commands.

🎯 Key Takeaway

Namespaces organize resources but don't secure them—pair with NetworkPolicies and ResourceQuotas for real isolation.

Persistent Volumes & Claims: Storage That Survives Pod Death

Containers are ephemeral; their filesystems vanish with the pod. Persistent Volumes (PVs) are cluster-wide storage resources provisioned by an admin or dynamically via StorageClass. A Persistent Volume Claim (PVC) is a request for storage by a user, specifying size and access mode (ReadWriteOnce, ReadOnlyMany, ReadWriteMany). Kubernetes binds a PVC to a matching PV, then pods reference the PVC as a volume. Why this matters in interviews: they test if you understand that PV/PVC decouple storage consumption from provisioning. A common trap: two pods sharing a PVC with ReadWriteOnce will fail if scheduled on different nodes. Always match access modes with your workload's concurrency needs.

pvc_pod_binding.pyPYTHON

// io.thecodeforge — interview tutorial

# PVC definition
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
# Pod using the PVC
apiVersion: v1
kind: Pod
metadata:
  name: storage-pod
spec:
  containers:
  - name: app
    image: nginx
    volumeMounts:
    - mountPath: /data
      name: storage
  volumes:
  - name: storage
    persistentVolumeClaim:
      claimName: my-pvc

Output

PVC 'my-pvc' bound to PV; pod mounts /data

⚠ Production Trap:

Forgetting to delete a PVC when done leaves the PV in 'Released' state, unusable. Always set reclaimPolicy: Delete or manually clean up.

🎯 Key Takeaway

PVs are infrastructure; PVCs are requests. Bind them correctly for stateful workloads.

Rolling Updates & Rollbacks: Zero-Downtime Deployments

A Deployment manages a ReplicaSet that creates pods. When you change the pod template (image, env, etc.), Kubernetes performs a rolling update: it creates a new ReplicaSet, scales it up while scaling the old one down, ensuring a configurable number of pods are always available. The maxSurge and maxUnavailable fields control this pace. If the update fails (e.g., ImagePullBackOff), you rollback with kubectl rollout undo. Why this is a gotcha: interviewers ask about strategy vs. revision history. By default, Kubernetes keeps 10 revisions in .spec.revisionHistoryLimit. Without it, you can't rollback past the limit. Always test rollbacks in staging—container start failures often mask as healthy until traffic hits.

rollout_commands.pyPYTHON

// io.thecodeforge — interview tutorial

# Trigger rolling update
kubectl set image deployment/nginx-deployment nginx=nginx:1.25

# Monitor status
kubectl rollout status deployment/nginx-deployment

# Undo last rollout
kubectl rollout undo deployment/nginx-deployment

# View history
kubectl rollout history deployment/nginx-deployment

Output

deployment.apps/nginx-deployment rolled out

deployment.apps/nginx-deployment rolled back

⚠ Production Trap:

Rolling update to a broken image without readiness probes will serve traffic to failing pods. Always combine with probes.

🎯 Key Takeaway

Rolling updates safe by default; rollback depends on revision history limit.

Pod Disruption Budgets: Surviving Node Failures Gracefully

Voluntary disruptions (node drains, cluster upgrades) can kill pods. Without controls, you lose all replicas simultaneously. A PodDisruptionBudget (PDB) specifies the minimum available or maximum unavailable pods for a set of labels. When a node is drained, the eviction API checks PDBs: if removing a pod would violate the budget, the drain waits. Why this matters: interviewers test if you separate voluntary from involuntary disruptions (node crashes). Involuntary disruptions ignore PDBs—you need multiple replicas across nodes via anti-affinity or topology spread constraints. Common mistake: setting minAvailable too high blocks drains entirely. Aim for 1-2 unavailable per service, especially for stateful workloads.

pdb_definition.pyPYTHON

// io.thecodeforge — interview tutorial

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-service-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: my-service
---
# Check status
kubectl get pdb my-service-pdb -o wide

Output

NAME MIN AVAILABLE MAX UNAVAILABLE ALLOWED DISRUPTIONS AGE

my-service-pdb 2 N/A 1 5m

⚠ Production Trap:

PDBs don't prevent involuntary disruptions (node crash). Combine with PodTopologySpreadConstraints to spread replicas across failure domains.

🎯 Key Takeaway

PDBs protect against voluntary disruptions; always pair with multi-zone scheduling for real HA.

Kubernetes Scheduling: Node Affinity, Taints, Tolerations, Topology Spread

Kubernetes scheduling determines which node a pod runs on. Beyond simple resource requests, advanced scheduling features like node affinity, taints/tolerations, and topology spread constraints give fine-grained control. Node affinity allows you to constrain which nodes a pod can be scheduled on based on node labels. It comes in two types: requiredDuringSchedulingIgnoredDuringExecution (hard requirement) and preferredDuringSchedulingIgnoredDuringExecution (soft preference). For example, to schedule GPU workloads only on nodes with label 'gpu=true', use a required affinity rule. Taints and tolerations work together to repel pods from nodes unless they explicitly tolerate the taint. A node with taint 'key=value:NoSchedule' will not accept pods without a matching toleration. This is useful for dedicating nodes to specific workloads (e.g., infra nodes). Topology spread constraints distribute pods across failure domains like zones or hosts to achieve high availability. For instance, you can enforce that pods of a deployment are spread across at least three zones with a max skew of 1. These features are critical for production clusters to ensure resilience, resource isolation, and cost optimization.

scheduling-example.yamlYAML

apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: gpu
            operator: In
            values:
            - "true"
  tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "gpu"
    effect: "NoSchedule"
  topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: DoNotSchedule
    labelSelector:
      matchLabels:
        app: my-app
  containers:
  - name: my-container
    image: nvidia/cuda:latest

🔥Scheduling Gotchas

📊 Production Insight

In production, combine taints/tolerations with node affinity to ensure critical workloads run on dedicated nodes, and use topology spread constraints to guarantee multi-zone distribution for high availability.

🎯 Key Takeaway

Node affinity, taints/tolerations, and topology spread constraints provide powerful mechanisms to control pod placement for resilience, isolation, and cost efficiency.

Service Mesh: Istio, Linkerd, Consul Comparison

A service mesh provides a dedicated infrastructure layer for handling service-to-service communication, including traffic management, security, and observability. The three most popular service meshes are Istio, Linkerd, and Consul. Istio is feature-rich, offering fine-grained traffic control (e.g., canary deployments, circuit breaking), mTLS, and deep observability with Prometheus and Jaeger. It uses Envoy sidecars and has a steep learning curve. Linkerd is lightweight and focuses on simplicity and performance, using a Rust-based proxy (linkerd-proxy) with minimal resource overhead. It provides automatic mTLS, HTTP/2, and gRPC support, and is easier to adopt. Consul (by HashiCorp) integrates service mesh with service discovery and KV store, using Envoy sidecars. It offers multi-cloud and multi-datacenter support, and integrates with Consul Connect for mTLS. For interviews, understand trade-offs: Istio for complex traffic policies, Linkerd for simplicity and low latency, Consul for hybrid cloud and service discovery. Example: Istio can route 1% of traffic to a new version via VirtualService and DestinationRule. Linkerd can do similar with ServiceProfiles. Consul uses intentions for access control.

istio-virtual-service.yamlYAML

apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: my-service
spec:
  hosts:
  - my-service
  http:
  - match:
    - headers:
        version:
          exact: v2
    route:
    - destination:
        host: my-service
        subset: v2
      weight: 100
  - route:
    - destination:
        host: my-service
        subset: v1
      weight: 90
    - destination:
        host: my-service
        subset: v2
      weight: 10

⚠ Service Mesh Overhead

📊 Production Insight

In production, start with Linkerd for simplicity and low overhead, then migrate to Istio if you need advanced traffic policies. Consul is ideal for environments already using HashiCorp tools or requiring multi-datacenter service mesh.

🎯 Key Takeaway

Istio, Linkerd, and Consul each offer service mesh capabilities with different trade-offs in complexity, performance, and feature set; choose based on your operational needs.

thecodeforge.io

Kubernetes Interview Questions

Kubernetes Security: RBAC, Pod Security Standards, Network Policies

Kubernetes security is multi-layered. RBAC (Role-Based Access Control) controls who can access the API and what actions they can perform. Define Roles and ClusterRoles with rules (verbs on resources), then bind them to users/groups via RoleBindings or ClusterRoleBindings. For example, a developer might have get/list/pod permissions in a namespace. Pod Security Standards (PSS) replace PodSecurityPolicy (deprecated). They define three policies: privileged, baseline, and restricted. Enforce via admission controllers (e.g., OPA/Gatekeeper) or built-in Pod Security Admission. For instance, a restricted policy prevents privileged containers and hostPath volumes. Network Policies control pod-to-pod communication. By default, all pods can communicate. A NetworkPolicy selects pods and defines ingress/egress rules based on labels, IP blocks, or ports. For example, allow traffic from frontend to backend only on port 8080. These three pillars—RBAC, PSS, Network Policies—form a defense-in-depth strategy. In interviews, demonstrate understanding of least privilege, defense in depth, and common pitfalls like overly permissive RBAC or missing network policies.

network-policy-example.yamlYAML

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-policy
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8080

⚠ Default Allow All

📊 Production Insight

In production, use tools like OPA/Gatekeeper to enforce Pod Security Standards and audit RBAC regularly. Combine Network Policies with service mesh mTLS for defense in depth.

🎯 Key Takeaway

RBAC, Pod Security Standards, and Network Policies are essential for securing Kubernetes clusters; implement them with least privilege and defense in depth.

● Production incidentPOST-MORTEMseverity: high

Namespace Stuck in Terminating: Finalizer Blocking Cluster Decommission

Symptom

kubectl delete namespace <ns> returned successfully but namespace remained in Terminating state. No new resources could be created in the namespace. Cluster decommission timeline slipped by 3 days.

Assumption

The namespace deletion was stuck due to remaining resources (Pods, PVCs) that had not been fully cleaned up.

Root cause

The Service of type LoadBalancer had a finalizer (service.kubernetes.io/load-balancer-cleanup). The cloud controller manager (CCM) was responsible for removing this finalizer after deleting the cloud load balancer. However, the CCM had been redeployed with a new service account that lacked IAM permissions to delete load balancers. The CCM silently failed to remove the finalizer, and Kubernetes refused to complete namespace deletion because finalizers were still present on resources within the namespace.

Fix

1. Restored IAM permissions for the CCM service account to manage load balancers. 2. Patched the Service to remove the finalizer manually: kubectl patch service <name> -p '{"metadata":{"finalizers":null}}'. 3. Verified the cloud load balancer was already deleted (no orphaned resources). 4. Namespace deletion completed immediately after finalizer removal. 5. Added monitoring for namespaces in Terminating state for more than 5 minutes.

Key lesson

Finalizers block deletion until the responsible controller acknowledges cleanup. If the controller is broken, deletion hangs indefinitely.
Never manually delete cloud resources (load balancers, volumes) without ensuring the controller can reconcile. Orphaned resources cost money.
Monitor for resources stuck in Terminating state. It is always a sign of a broken controller or missing permissions.
When debugging Terminating hangs, check kubectl get <resource> -o json | jq .metadata.finalizers to identify which controller is blocking.

Production debug guideSymptom-first investigation path for senior-level Kubernetes failures.6 entries

Symptom · 01

Pod stuck in Pending with no events.

→

Fix

Check if the scheduler is running and healthy. Verify node resources are sufficient. Check for taints/tolerations mismatches. Look for PVC binding failures if the Pod uses persistent volumes.

Symptom · 02

Pod stuck in ImagePullBackOff despite image existing.

→

Fix

Check ImagePullSecrets on the Pod's ServiceAccount. Verify node IAM roles for private registries. Check disk pressure on the node. Inspect kubelet logs for the actual pull error.

Symptom · 03

Namespace stuck in Terminating.

→

Fix

List all resources in the namespace with finalizers:

kubectl api-resources --verbs=list -o name | xargs -n 1 kubectl get -n <ns> --ignore-not-found -o json | jq '.items[] | select(.metadata.finalizers) | {kind: .kind, name: .metadata.name, finalizers: .metadata.finalizers}'

. Patch or investigate each blocking resource.

Symptom · 04

Service returns 503 intermittently.

→

Fix

Check readiness probes. Pods failing readiness are removed from endpoints but existing connections may still be routed. Check externalTrafficPolicy — if set to Local, traffic only routes to nodes with local pods. Check kube-proxy mode and logs.

Symptom · 05

Deployment rollout hangs at 'Waiting for rollout'.

→

Fix

Check kubectl rollout status deployment/<name>. If maxUnavailable is 0 and a new pod cannot be scheduled, the rollout blocks forever. Check for resource quota limits, PDB conflicts, and node capacity.

Symptom · 06

etcd cluster unhealthy, leader elections failing.

→

Fix

Check etcd member health: etcdctl endpoint health --cluster. Check disk latency on etcd nodes (iostat -x 1). High fsync latency causes Raft timeouts. Check network connectivity between etcd members.

★ Advanced Kubernetes Triage CommandsRapid commands for diagnosing complex Kubernetes failures.

Pod stuck in Pending.−

Immediate action

Check scheduler events and node capacity.

Commands

kubectl describe pod <pod> | grep -A 20 Events

kubectl describe nodes | grep -A 5 Allocatable -B 2

Fix now

If 'Insufficient cpu/memory', scale cluster or reduce requests. If 'node(s) had taint', add tolerations or remove taints.

Namespace stuck in Terminating.+

Service returning 503.+

etcd cluster degraded.+

RBAC permission denied errors in application logs.+

Probe Types: Liveness vs Readiness vs Startup

Aspect	Liveness Probe	Readiness Probe	Startup Probe
Primary Goal	Detect deadlocks and hung processes	Control traffic routing to the Pod	Gate liveness/readiness until boot completes
Failure Action	Kubelet kills the container; triggers restart	Pod removed from Service endpoints; no traffic	If it fails, container is restarted like liveness
Success Action	Container continues running	Pod added to Service endpoints; receives traffic	Liveness and readiness probes are activated
Runs When	After startup probe succeeds (or immediately if no startup probe)	After startup probe succeeds (or immediately if no startup probe)	Immediately when container starts
Typical Use Case	Catching deadlocks, memory leaks, infinite loops	Waiting for cache warm-up, DB connection pool init	Legacy apps with 2+ minute startup times
Failure Threshold	3 (default) — restart after 3 failures	3 (default) — remove from endpoints after 3 failures	30 (recommended) — allows up to 150s startup with 5s period
Common Mistake	Checking downstream dependencies (DB, cache) — causes cascading restarts	Too aggressive — causes endpoint flapping during transient load	Missing entirely — causes CrashLoopBackOff for slow-starting apps

⚙ Quick Reference

15 commands from this guide

File	Command / Code	Purpose
iothecodeforgek8sproduction-pod.yaml	apiVersion: v1	The Anatomy of a Request
iothecodeforgek8sdebug-network.sh	iptables -L -t nat \| grep FORGE-SERVICE-NAME	Networking Internals
iothecodeforgek8setcd-debug.sh	etcdctl endpoint health --cluster --write-out=table	etcd Internals
iothecodeforgek8sresource-qos.yaml	apiVersion: v1	Resource Management
iothecodeforgek8srbac-least-privilege.yaml	apiVersion: v1	RBAC, Service Accounts, and Admission Control
iothecodeforgek8sschedulingSchedulerDecisionModel.java	public class SchedulerDecisionModel {	Scheduler Internals
iothecodeforgek8sprobes-production.yaml	apiVersion: v1	Probes Deep Dive
ConfigmapSecretAudit.py	def check_secret_encryption():	ConfigMaps and Secrets
NamespaceResourceAudit.py	def check_namespace_quotas():	Namespaces
pvc_pod_binding.py	apiVersion: v1	Persistent Volumes & Claims
rollout_commands.py	kubectl set image deployment/nginx-deployment nginx=nginx:1.25	Rolling Updates & Rollbacks
pdb_definition.py	apiVersion: policy/v1	Pod Disruption Budgets
scheduling-example.yaml	apiVersion: v1	Kubernetes Scheduling
istio-virtual-service.yaml	apiVersion: networking.istio.io/v1beta1	Service Mesh
network-policy-example.yaml	apiVersion: networking.k8s.io/v1	Kubernetes Security

Key takeaways

Kubernetes is a state-reconciliation engine; controllers constantly work to drive 'Current State' toward the 'Desired State' stored in etcd.

The Control Plane request flow (Auth -> Mutating -> Validating -> Etcd) is the gatekeeper of cluster stability.

Resource management isn't just about avoiding OOMKills; it's about defining 'Requests' accurately so the Scheduler can make intelligent placement decisions.

Networking in K8s relies on a combination of the CNI (Pod-to-Pod) and kube-proxy (Service abstraction) to handle the ephemeral nature of IPs.

etcd is the single point of failure. Raft consensus prevents split-brain but requires quorum. Disk latency is the most common cause of instability.

RBAC is additive with no deny mechanism. Least-privilege design, dedicated ServiceAccounts, and admission webhooks for policy enforcement are the production standard.

Probes are the difference between a resilient service and a cascading failure. Liveness checks internal health only. Readiness can check dependencies. Startup gates both.

Every production Kubernetes failure has a root cause in one of

etcd, scheduler, kubelet, CNI, or admission webhooks. Knowing which component to check first is the skill.

INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

FAQ · 6 QUESTIONS

Frequently Asked Questions

Why is my Pod OOMKilled even if the node has plenty of free RAM?

What is the difference between a 'Taint' and a 'NodeSelector'?

What happens if etcd goes down?

How does the scheduler decide where to place a Pod?

What is the admission webhook chain and why does it matter?

How do I debug a namespace stuck in Terminating?

Naren Founder & Principal Engineer

20+ years shipping production code across the stack, with years spent interviewing engineers. Everything here is grounded in real deployments.

✓ Verified

production tested

July 19, 2026

last updated

2,466

articles · all by Naren

🔥

That's DevOps Interview. Mark it forged?

7 min read · try the examples if you haven't