
Kubernetes Architecture Explained: Control Plane, Nodes & Internals

📍 Part of: Kubernetes → Topic 10 of 12
Kubernetes architecture deep-dive: control plane internals, node components, scheduler decisions, etcd consistency, and production gotchas senior engineers must know.
🔥 Advanced — solid DevOps foundation required
In this tutorial, you'll learn
  • Kubernetes is a set of independent control loops reconciling actual state toward a declared desired state. This model is powerful but introduces eventual consistency.
  • The Control Plane's reliability is paramount. etcd is the single source of truth and its performance (disk, network) dictates the entire cluster's health.
  • Nodes are intelligent agents, not dumb workers. Node-level issues (kubelet, runtime, kernel) require node-level debugging tools.
Quick Answer
  • Control Plane: The brain (API Server, Scheduler, Controller Manager, etcd).
  • Nodes: The workers (kubelet, kube-proxy, container runtime).
  • etcd: The single source of truth for all cluster state.
  • Scheduler: Assigns pods to nodes based on resources and constraints.
  • Kubelet: Ensures containers described in PodSpecs are running and healthy.
🚨 START HERE
Kubernetes Control Plane & Node Triage
Rapid commands to isolate cluster issues.
🟡 Cluster-wide unresponsiveness.
Immediate Action: Check etcd health and API Server logs.
Commands
etcdctl endpoint health --cluster
kubectl get --raw='/readyz?verbose'
Fix Now: If etcd is unhealthy, check disk latency on etcd nodes (`iostat -x 1`). Isolate etcd immediately.
🟡 Specific node not scheduling new pods.
Immediate Action: Check node conditions and resource allocation.
Commands
kubectl describe node <node-name> | grep -A 10 Conditions
kubectl top node <node-name>
Fix Now: If under DiskPressure, clean up unused images/containers. If under MemoryPressure, identify memory-hungry pods.
🟡 Pods failing to start on a node.
Immediate Action: Check kubelet and container runtime logs.
Commands
journalctl -u kubelet -n 50 --no-pager
crictl ps -a # or docker ps -a
Fix Now: If kubelet cannot talk to the runtime, restart the runtime service. Check for CNI plugin errors.
Production Incident: Cascading API Server Failure Due to etcd Disk Latency
Cluster became unresponsive. `kubectl` commands timed out. Pods stopped scheduling.
Symptom: All kubectl operations returned timeout errors. Controller-manager logs showed failed lease renewals. New pods stuck in Pending.
Initial assumption: Network partition or API Server OOM.
Root cause: etcd members were deployed on the same nodes as other workloads. A batch job caused high disk I/O on those nodes. etcd's consensus protocol (Raft) requires fsync to disk within an election timeout. High disk latency caused leader elections to fail, which made the API Server unable to write new state.
Fix:
1. Isolated etcd onto dedicated nodes with local SSDs and no other workloads.
2. Configured etcd heartbeat-interval and election-timeout appropriately for the network.
3. Set up monitoring for etcd fsync duration and disk IOPS.
Key Lesson
etcd is the cluster's central nervous system; its performance is non-negotiable. Disk latency, not network, is the most common cause of etcd instability. etcd must be isolated and its hardware provisioned for predictable, low-latency I/O.
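The heartbeat/election tuning from the fix might look like the flags below. The values shown are etcd's own defaults and are illustrative starting points, not universal recommendations; tune them to your measured round-trip times.

```
# Illustrative etcd flags (values are etcd's defaults; tune per environment):
--heartbeat-interval=100          # ms; roughly 0.5-1.5x the RTT between members
--election-timeout=1000           # ms; etcd docs suggest ~10x the heartbeat interval
--quota-backend-bytes=8589934592  # 8 GiB backend quota; writes are rejected above it
```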
Production Debug Guide
A symptom-first investigation path for control plane and node issues.
kubectl commands timeout or fail with 'server error'. → Check API Server and etcd health first. The API Server is the gateway; if it's down, nothing else works.
Pods stuck in Pending state. → Investigate scheduler logs and node resource availability (kubectl describe node). Check for resource fragmentation or taints/tolerations mismatches.
Pods in CrashLoopBackOff. → Inspect kubelet logs on the node and the pod's events (kubectl describe pod). The container runtime (e.g., containerd) logs are critical here.
Node marked NotReady. → SSH to the node. Check kubelet and container runtime status (systemctl status kubelet). Check disk pressure, memory pressure, and PID pressure.

Kubernetes replaces bespoke deployment scripts and manual server management with a declarative control loop. You describe the desired state, and the cluster continuously reconciles reality toward it. The architecture is not a black box; it's a set of coordinated components with specific failure domains and performance characteristics. Understanding these internals is what separates engineers who debug Kubernetes from those who are confused by it.

This is not a getting-started guide. It is for engineers already running Kubernetes who need to understand the 'why' behind scheduler decisions, etcd consistency guarantees, and kubelet behavior under pressure. We will trace what happens, component by component, when you run kubectl apply -f deployment.yaml, and identify the production decisions that bite teams hardest.

What Is Kubernetes Architecture?

Kubernetes architecture is the blueprint of the system: a replicated control plane that makes global decisions and a fleet of worker nodes that carry them out. Rather than starting with a dry definition, the sections below walk through each component and why it exists. The design directly enables Kubernetes' core promise: a self-healing, declarative system for running distributed applications.

Mental Model
The Declarative Control Loop
This is why you don't 'run a deploy'. You 'update the desired state'. The system then does the work.
  • You write a Deployment YAML (desired state).
  • The Deployment controller creates ReplicaSets.
  • The ReplicaSet controller ensures the right number of Pods exist.
  • The Scheduler assigns Pods to Nodes.
  • The Kubelet on each Node runs the containers.
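The reconcile pattern every controller follows can be sketched in a few lines of shell. This is a toy model, not Kubernetes code: `desired` stands in for the ReplicaSet's spec.replicas and `actual` for the observed pod count.

```shell
#!/bin/sh
# Toy reconcile loop: observe actual state, compare to desired, act, repeat.
desired=3   # spec.replicas in the ReplicaSet (simulated)
actual=0    # pods currently observed (simulated)

while [ "$actual" -lt "$desired" ]; do
  actual=$((actual + 1))          # "create a pod" to close the gap
  echo "reconcile: actual=$actual desired=$desired"
done
echo "converged: actual state matches desired state"
```

A real controller never terminates: it keeps watching, and any drift (a pod dying, a spec change) re-triggers the same loop.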
📊 Production Insight
The power and risk of this model is eventual consistency. The system will eventually reach the desired state, but there is a delay. During a rolling update, both old and new versions run simultaneously; your application must be designed for this. The controllers' resync interval (kube-controller-manager's --min-resync-period) and the API Server's watch cache are levers that affect how fast 'eventually' is.
🎯 Key Takeaway
Kubernetes is a set of independent control loops, not a monolithic program. This decoupling makes it resilient but introduces eventual consistency. Your debugging mindset must shift from 'what command failed?' to 'what state is the system converging toward, and what is blocking it?'.

Control Plane: The Cluster's Brain

The Control Plane makes global decisions (e.g., scheduling) and detects and responds to cluster events. It consists of the API Server, etcd, Scheduler, and Controller Manager. In production, it is almost always replicated across multiple nodes for high availability.

control_plane_check.sh · BASH
#!/bin/bash
# Check health of core control plane components

# 1. API Server health
kubectl get --raw='/healthz?verbose'

# 2. etcd member list (run on a control plane node)
etcdctl member list -w table

# 3. Scheduler and Controller Manager leader election
#    (modern clusters hold these as Lease objects in kube-system)
kubectl get lease kube-scheduler -n kube-system -o yaml
kubectl get lease kube-controller-manager -n kube-system -o yaml
Mental Model
etcd: The Source of Truth
This is why disk performance is critical. A slow disk on one etcd member can slow down the entire cluster's write operations.
  • All cluster state (pods, configs, secrets) is stored in etcd.
  • The API Server is the only component that talks to etcd.
  • etcd's performance is governed by its heartbeat-interval and election-timeout.
  • A network partition can cause leader elections, pausing all writes.
📊 Production Insight
etcd is the closest thing the cluster has to a single point of failure: lose quorum and no state can change. A production cluster must run etcd on dedicated nodes with fast, local SSDs (not network-attached storage). The etcdctl snapshot save command is your most important backup tool; run it frequently and test restores. The --quota-backend-bytes limit must be monitored: if etcd's database exceeds it, etcd raises a NOSPACE alarm and rejects writes, effectively halting the cluster.
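etcd exports its WAL fsync latency as a Prometheus histogram (`etcd_disk_wal_fsync_duration_seconds`). The sketch below shows the arithmetic for the mean; the metrics lines are a made-up sample standing in for the output of `curl -s http://<etcd>:2379/metrics` (the port depends on your --listen-metrics-urls).

```shell
#!/bin/sh
# Estimate average WAL fsync latency from etcd's histogram metrics.
# Real source: curl -s http://<etcd>:2379/metrics
# The sample below uses made-up values so the parsing is self-contained.
METRICS='etcd_disk_wal_fsync_duration_seconds_sum 1.2
etcd_disk_wal_fsync_duration_seconds_count 600'

avg_ms=$(printf '%s\n' "$METRICS" | awk '
  /fsync_duration_seconds_sum/   { sum = $2 }
  /fsync_duration_seconds_count/ { count = $2 }
  END { printf "%.1f", (sum / count) * 1000 }')

echo "average WAL fsync: ${avg_ms} ms"
```

etcd's tuning docs treat a p99 fsync much above 10 ms as a warning sign; a production check should use the histogram buckets rather than the mean shown here.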
🎯 Key Takeaway
The Control Plane's reliability is the cluster's reliability. The API Server's horizontal scaling is limited by etcd's write throughput. To scale the control plane, you often need to optimize etcd first (dedicated hardware, tuning, defragmentation).

Nodes: The Worker Machines

A Node is a worker machine (VM or physical) where containers are run. Each node contains the services necessary to run Pods: the kubelet, the container runtime (e.g., containerd), and the kube-proxy.

node_debug.sh · BASH
#!/bin/bash
# Deep inspection of a Node's status

NODE_NAME="worker-node-1"

# 1. Node conditions and allocatable resources
kubectl describe node $NODE_NAME | grep -A 15 Conditions
kubectl describe node $NODE_NAME | grep -A 5 Allocatable

# 2. Check kubelet logs (run on the node)
journalctl -u kubelet --since "1 hour ago" | grep -iE "error|fail"

# 3. Check container runtime status (containerd example)
sudo crictl info | jq '.status.conditions'
sudo crictl ps
Mental Model
The kubelet: Node-Level Controller
When a pod is Terminating, the kubelet is executing the pre-stop hook and sending SIGTERM. The terminationGracePeriodSeconds is the kubelet's deadline before it sends SIGKILL.
  • Runs liveness and readiness probes.
  • Mounts volumes specified in the PodSpec.
  • Reports node conditions (MemoryPressure, DiskPressure).
  • Manages cgroups for resource isolation.
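The termination behavior described above is driven by two fields in the PodSpec. A minimal sketch (the pod name, image, and sleep duration are placeholders):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: graceful-shutdown-demo        # hypothetical name
spec:
  terminationGracePeriodSeconds: 30   # kubelet's deadline before SIGKILL
  containers:
    - name: app
      image: nginx:1.25               # placeholder image
      lifecycle:
        preStop:
          exec:
            command: ["sh", "-c", "sleep 5"]  # e.g. let the LB drain connections
```

The preStop hook runs before SIGTERM is sent, and both the hook and the application's shutdown must finish inside the grace period, or the kubelet sends SIGKILL.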
📊 Production Insight
The kubelet's --max-pods flag (default 110) is a hard limit. Exceeding it will prevent new pods from being scheduled, even if the node has CPU/RAM. Conversely, resource-based scheduling relies on the Allocatable field, which subtracts system/kubelet reservations from the node's total capacity. Misconfiguring reservations leads to resource fragmentation where the scheduler sees 'available' resources the node cannot actually provide.
🎯 Key Takeaway
Nodes are not dumb. They are intelligent agents enforcing local state. Node failures are often local (disk pressure, kubelet crash, runtime hang) and require node-level debugging (journalctl, crictl), not just kubectl. The kubelet is the bridge between the global control plane and the local node reality.
🗂 Control Plane Components: Role and Failure Impact
Understanding what each component does and what breaks if it fails.
Component | Primary Function | Failure Impact | Recovery Action
API Server | Frontend for the control plane; all communication goes through it. | Cluster becomes unresponsive; kubectl fails. | Check logs, ensure etcd is healthy, scale horizontally if load is high.
etcd | Consistent, fault-tolerant store for all cluster data. | Cluster state cannot be updated; no new pods, updates, or scaling. | Check member health and disk latency. Restore from backup if quorum is lost.
Scheduler | Assigns new pods to nodes based on constraints and resources. | New pods remain in Pending state indefinitely. | Check scheduler logs for constraint conflicts or resource fragmentation.
Controller Manager | Runs core control loops (ReplicaSet, Node, Endpoint, etc.). | Cluster stops self-healing; node failures not detected, replicas not reconciled. | Check leader election, restart the process, investigate specific controller logs.
kubelet (per Node) | Ensures pods are running on its node; reports status. | Pods on that node stop being managed; NotReady condition appears. | SSH to the node; check kubelet service, container runtime, disk/memory pressure.

🎯 Key Takeaways

  • Kubernetes is a set of independent control loops reconciling actual state toward a declared desired state. This model is powerful but introduces eventual consistency.
  • The Control Plane's reliability is paramount. etcd is the single source of truth and its performance (disk, network) dictates the entire cluster's health.
  • Nodes are intelligent agents, not dumb workers. Node-level issues (kubelet, runtime, kernel) require node-level debugging tools.
  • Production readiness requires understanding failure modes: etcd quorum loss, scheduler conflicts, resource fragmentation, and network partitions.
  • The most effective way to learn is to break things. Intentionally crash etcd, fill a node's disk, and kill the kubelet. Observe the system's behavior and recovery.

⚠ Common Mistakes to Avoid

  • Memorising syntax before understanding the declarative control loop model.
  • Skipping practice and only reading theory.
  • Treating etcd like a regular database instead of a replicated log with strict latency requirements.
  • Ignoring node-level conditions (DiskPressure, MemoryPressure) and wondering why pods get evicted.
  • Not setting resource requests and limits, leading to resource starvation or inefficient packing.
  • Assuming a highly available control plane means no single component can fail; etcd quorum loss is catastrophic.
  • Overlooking the container runtime (e.g., containerd) and its configuration (systemd cgroups).

Interview Questions on This Topic

  • Q: Trace the exact sequence of events from kubectl apply to a container running on a node.
  • Q: How does the Kubernetes scheduler make a placement decision? What factors does it consider?
  • Q: Explain the role of etcd. Why is its disk performance so critical?
  • Q: What happens when a node loses network connectivity but is still running? How does the cluster detect and respond?
  • Q: A pod is stuck in Pending. Walk me through your debugging process.
  • Q: What is the difference between a liveness probe and a readiness probe? What are the failure modes for each?
  • Q: How would you design a highly available control plane? What are the trade-offs?

Frequently Asked Questions

What is Kubernetes architecture in simple terms?

Kubernetes architecture is the blueprint of the system: it shows how the brain (Control Plane) makes decisions and how the workers (Nodes) carry them out. Understanding it is key to operating and debugging clusters effectively.

Why is etcd so important and sensitive?

etcd is the only persistent store for all cluster state. Its Raft consensus protocol requires low-latency disk writes for every cluster operation. Slow disks or network partitions can cause leader elections, halting all cluster changes until resolved.

How does the Scheduler decide where to put a pod?

The Scheduler filters nodes that meet the pod's resource requests and constraints (nodeSelector, affinity, taints). It then scores the feasible nodes based on a set of priority functions (e.g., spreading pods, balancing resource usage) and picks the highest-scoring node.
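That filter-then-score flow can be sketched in shell. This is a toy model with made-up numbers, not the real scheduler (which runs plugins such as NodeResourcesFit): each line is a node and its free CPU in millicores, and the score simply prefers the most free CPU.

```shell
#!/bin/sh
# Toy filter-then-score: pick a node for a pod requesting 500m CPU.
request=500
nodes='node-a 200
node-b 1500
node-c 900'

best=$(printf '%s\n' "$nodes" | awk -v req="$request" '
  $2 >= req {                 # Filter: node must fit the request
    score = $2 - req          # Score: prefer the most free CPU
    if (score > max) { max = score; best = $1 }
  }
  END { print best }')

echo "scheduled on: $best"
```

Here node-a is filtered out (cannot fit the request) and node-b wins the scoring. The real scheduler combines many weighted scoring plugins (spreading, affinity, image locality) rather than a single metric.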

What's the biggest production gotcha with the kubelet?

Resource misconfiguration. If you don't set proper requests and limits, a single pod can consume all node resources, causing evictions or node instability. Also, the kubelet's --max-pods limit is a hard cap that overrides available CPU/RAM.
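A minimal requests/limits stanza for a container spec looks like the following (the values are placeholders; size them from observed usage):

```yaml
resources:
  requests:            # what the scheduler reserves on the node
    cpu: "250m"
    memory: "256Mi"
  limits:              # what the kubelet/cgroups enforce at runtime
    cpu: "500m"
    memory: "512Mi"
```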

Naren · Founder & Author

Developer and founder of TheCodeForge. I built this site because I was tired of tutorials that explain what to type without explaining why it works. Every article here is written to make concepts actually click.
