Kubernetes Architecture Explained: Control Plane, Nodes & Internals
- Control Plane: The brain (API Server, Scheduler, Controller Manager, etcd).
- Nodes: The workers (kubelet, kube-proxy, container runtime).
- etcd: The single source of truth for all cluster state.
- Scheduler: Assigns pods to nodes based on resources and constraints.
- Kubelet: Ensures containers described in PodSpecs are running and healthy.
- Cluster-wide unresponsiveness: `etcdctl endpoint health --cluster`, then `kubectl get --raw='/readyz?verbose'`.
- Specific node not scheduling new pods: `kubectl describe node <node-name> | grep -A 10 Conditions`, then `kubectl top node <node-name>`.
- Pods failing to start on a node: `journalctl -u kubelet -n 50 --no-pager`, then `crictl ps -a` (or `docker ps -a`).

Production Incident

kubectl operations returned timeout errors. Controller-manager logs showed failed lease renewals. New pods were stuck in Pending. The root cause surfaced in etcd's fsync duration and disk IOPS.

Production Debug Guide

A symptom-first investigation path for control plane and node issues.

- Pods stuck in Pending: inspect the candidate nodes (`kubectl describe node`). Check for resource fragmentation or taints/tolerations mismatches.
- Pods failing after scheduling: inspect events (`kubectl describe pod`). The container runtime (e.g., containerd) logs are critical here.
- Node reported NotReady: check the kubelet service (`systemctl status kubelet`). Check disk pressure, memory pressure, and PID pressure.

Kubernetes replaces bespoke deployment scripts and manual server management with a declarative control loop. You describe the desired state, and the cluster continuously reconciles reality toward it. The architecture is not a black box; it's a set of coordinated components with specific failure domains and performance characteristics. Understanding these internals is what separates engineers who debug Kubernetes from those who are confused by it.
This is not a getting-started guide. It is for engineers already running Kubernetes who need to understand the 'why' behind scheduler decisions, etcd consistency guarantees, and kubelet behavior under pressure. We will trace what happens, component by component, when you run `kubectl apply -f deployment.yaml`, and identify the production decisions that bite teams hardest.
What Is Kubernetes Architecture?

Kubernetes architecture is the division of labor between a control plane that decides and nodes that execute. Rather than starting with a dry definition, let's see it in action and understand why it exists. The architecture's design directly enables its core promise: a self-healing, declarative system for running distributed applications.
- You write a Deployment YAML (desired state).
- The Deployment controller creates ReplicaSets.
- The ReplicaSet controller ensures the right number of Pods exist.
- The Scheduler assigns Pods to Nodes.
- The Kubelet on each Node runs the containers.
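Every step in that chain is the same pattern: a control loop comparing desired state to observed state. A minimal sketch of the reconcile pattern (plain Python, not Kubernetes source; `reconcile` and its pod names are hypothetical):

```python
# Minimal sketch of the Kubernetes reconcile pattern (illustrative).
# A controller diffs desired state against observed state and emits actions.

def reconcile(desired_replicas: int, running_pods: list) -> list:
    """One reconcile pass for a ReplicaSet-like controller."""
    actions = []
    diff = desired_replicas - len(running_pods)
    if diff > 0:
        # Too few pods: create the missing ones.
        actions += [f"create pod-{i}" for i in range(diff)]
    elif diff < 0:
        # Too many pods: delete the surplus.
        actions += [f"delete {p}" for p in running_pods[:-diff]]
    return actions  # an empty list means the loop has converged

print(reconcile(3, ["pod-a"]))           # scale up by 2
print(reconcile(1, ["pod-a", "pod-b"]))  # scale down by 1
```

Real controllers run this pass repeatedly on watch events and resync timers, which is why convergence is eventual rather than instant.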
Eventual consistency has tunables: controller resync intervals (e.g., the controller-manager's `--sync-period`) and the API Server's watch cache are levers that affect how fast 'eventually' is.

Control Plane: The Cluster's Brain
The Control Plane makes global decisions (e.g., scheduling) and detects and responds to cluster events. It consists of the API Server, etcd, Scheduler, and Controller Manager. In production, it is almost always replicated across multiple nodes for high availability.
```bash
#!/bin/bash
# Check health of core control plane components

# 1. API Server health
kubectl get --raw='/healthz?verbose'

# 2. etcd member list (run on a control plane node)
etcdctl member list -w table

# 3. Scheduler and Controller Manager leader election
# (recent versions use Leases instead: kubectl get lease -n kube-system)
kubectl get endpoints kube-scheduler -n kube-system -o yaml
kubectl get endpoints kube-controller-manager -n kube-system -o yaml
```
- All cluster state (pods, configs, secrets) is stored in etcd.
- The API Server is the only component that talks to etcd.
- etcd's performance is governed by its `heartbeat-interval` and `election-timeout`.
- A network partition can cause leader elections, pausing all writes.
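The quorum arithmetic behind those leader elections is simple and worth internalizing. A quick sketch (plain Python, not etcd code):

```python
# Raft quorum arithmetic: a write commits only when a majority of members ack it.

def quorum(members: int) -> int:
    """Votes needed to elect a leader or commit a write."""
    return members // 2 + 1

def tolerated_failures(members: int) -> int:
    """Members that can fail while the cluster keeps accepting writes."""
    return members - quorum(members)

for n in (1, 3, 4, 5):
    print(f"{n} members: quorum={quorum(n)}, tolerates {tolerated_failures(n)} failures")
```

Note that a 4-member cluster tolerates no more failures than a 3-member one, which is why etcd clusters use odd sizes.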
The `etcdctl snapshot save` command is your most important backup tool. Run it frequently and test restores. The `--quota-backend-bytes` flag must be monitored: if etcd's database exceeds this quota, etcd raises an alarm and rejects writes, effectively halting the cluster.

Nodes: The Worker Machines
A Node is a worker machine (VM or physical) where containers are run. Each node contains the services necessary to run Pods: the kubelet, the container runtime (e.g., containerd), and the kube-proxy.
```bash
#!/bin/bash
# Deep inspection of a Node's status

NODE_NAME="worker-node-1"

# 1. Node conditions and allocatable resources
kubectl describe node $NODE_NAME | grep -A 15 Conditions
kubectl describe node $NODE_NAME | grep -A 5 Allocatable

# 2. Check kubelet logs (run on the node)
journalctl -u kubelet --since "1 hour ago" | grep -i "error\|fail"

# 3. Check container runtime (containerd example)
sudo crictl info | jq '.config.systemdCgroup'
sudo crictl ps
```
When a pod is stuck in `Terminating`, the kubelet is executing the pre-stop hook and sending SIGTERM. The `terminationGracePeriodSeconds` is the kubelet's deadline before it sends SIGKILL.

Beyond starting containers, the kubelet also:

- Runs liveness and readiness probes.
- Mounts volumes specified in the PodSpec.
- Reports node conditions (MemoryPressure, DiskPressure).
- Manages cgroups for resource isolation.
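The `Terminating` sequence above can be sketched as a timeline. The ordering is real kubelet behavior; the helper function itself is a hypothetical illustration:

```python
# Sketch of the kubelet's pod termination timeline (illustrative helper).

def termination_timeline(grace_seconds: int, prestop_seconds: int) -> list:
    """Approximate event order after a pod is deleted."""
    events = [(0, "pod marked Terminating; preStop hook starts")]
    # SIGTERM follows the preStop hook, but never later than the grace period.
    events.append((min(prestop_seconds, grace_seconds), "SIGTERM sent to containers"))
    # The grace period is a hard deadline measured from deletion.
    events.append((grace_seconds, "SIGKILL for anything still running"))
    return events

for t, event in termination_timeline(grace_seconds=30, prestop_seconds=5):
    print(f"t+{t}s: {event}")
```

The practical lesson: a slow preStop hook eats into the same `terminationGracePeriodSeconds` budget, so size the grace period to cover both the hook and your app's shutdown.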
The kubelet's `--max-pods` flag (default 110) is a hard limit. Exceeding it prevents new pods from being scheduled, even if the node has spare CPU/RAM. Conversely, resource-based scheduling relies on the `Allocatable` field, which subtracts system/kubelet reservations from the node's total capacity. Misconfiguring reservations leads to resource fragmentation, where the scheduler sees 'available' resources the node cannot actually provide.

Debug nodes with node-level tools (`journalctl`, `crictl`), not just kubectl. The kubelet is the bridge between the global control plane and the local node reality.

| Component | Primary Function | Failure Impact | Recovery Action |
|---|---|---|---|
| API Server | Frontend for the control plane. All communication goes through it. | Cluster becomes unresponsive. kubectl fails. | Check logs, ensure etcd is healthy, scale horizontally if load is high. |
| etcd | Consistent, fault-tolerant store for all cluster data. | Cluster state cannot be updated. No new pods, updates, or scaling. | Check member health, disk latency. Restore from backup if quorum is lost. |
| Scheduler | Assigns new pods to nodes based on constraints and resources. | New pods remain in Pending state indefinitely. | Check scheduler logs for constraint conflicts or resource fragmentation. |
| Controller Manager | Runs core control loops (ReplicaSet, Node, Endpoint, etc.). | Cluster stops self-healing. Node failures not detected, replicas not reconciled. | Check leader election. Restart process. Investigate specific controller logs. |
| kubelet (per Node) | Ensures pods are running on its node. Reports status. | Pods on that node stop being managed. NotReady condition appears. | SSH to node. Check kubelet service, container runtime, disk/memory pressure. |
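The kubelet row's resource failure modes come down to `Allocatable` arithmetic plus the separate pod-count gate. A sketch with illustrative numbers (these are not real kubelet defaults, and the helper names are hypothetical):

```python
# Sketch: how node Allocatable is derived, and why max-pods is a separate gate.
# All numbers are illustrative, not real kubelet defaults.

def allocatable(capacity_mib: int, system_reserved: int,
                kube_reserved: int, eviction_threshold: int) -> int:
    """Memory the scheduler may hand out on this node."""
    return capacity_mib - system_reserved - kube_reserved - eviction_threshold

def can_schedule(request_mib: int, allocatable_mib: int, used_mib: int,
                 pod_count: int, max_pods: int = 110) -> bool:
    """Both gates must pass: the resource fit AND the pod-count cap."""
    return (used_mib + request_mib <= allocatable_mib) and (pod_count < max_pods)

alloc = allocatable(16384, 512, 1024, 256)   # 14592 MiB schedulable
print(can_schedule(1024, alloc, 14000, 50))  # False: no memory headroom
print(can_schedule(64, alloc, 1000, 110))    # False: max-pods cap hit
```

If the reservations in `allocatable` understate what the system actually uses, the scheduler places pods the node cannot truly serve — that is the fragmentation failure mode from the table.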
🎯 Key Takeaways
- Kubernetes is a set of independent control loops reconciling actual state toward a declared desired state. This model is powerful but introduces eventual consistency.
- The Control Plane's reliability is paramount. etcd is the single source of truth and its performance (disk, network) dictates the entire cluster's health.
- Nodes are intelligent agents, not dumb workers. Node-level issues (kubelet, runtime, kernel) require node-level debugging tools.
- Production readiness requires understanding failure modes: etcd quorum loss, scheduler conflicts, resource fragmentation, and network partitions.
- The most effective way to learn is to break things. Intentionally crash etcd, fill a node's disk, and kill the kubelet. Observe the system's behavior and recovery.
Interview Questions on This Topic
- Q: Trace the exact sequence of events from `kubectl apply` to a container running on a node.
- Q: How does the Kubernetes scheduler make a placement decision? What factors does it consider?
- Q: Explain the role of etcd. Why is its disk performance so critical?
- Q: What happens when a node loses network connectivity but is still running? How does the cluster detect and respond?
- Q: A pod is stuck in `Pending`. Walk me through your debugging process.
- Q: What is the difference between a liveness probe and a readiness probe? What are the failure modes for each?
- Q: How would you design a highly available control plane? What are the trade-offs?
Frequently Asked Questions
What is Kubernetes architecture in simple terms?

Kubernetes architecture is the blueprint of the system. It shows how the brain (Control Plane) makes decisions and how the workers (Nodes) carry them out. Understanding it is key to operating and debugging clusters effectively.
Why is etcd so important and sensitive?
etcd is the only persistent store for all cluster state. Its Raft consensus protocol requires low-latency disk writes for every cluster operation. Slow disks or network partitions can cause leader elections, halting all cluster changes until resolved.
How does the Scheduler decide where to put a pod?
The Scheduler filters nodes that meet the pod's resource requests and constraints (nodeSelector, affinity, taints). It then scores the feasible nodes based on a set of priority functions (e.g., spreading pods, balancing resource usage) and picks the highest-scoring node.
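That two-phase decision (filter, then score) can be sketched in a few lines. The scoring function here is a stand-in for the real priority plugins, and the node model is deliberately tiny:

```python
# Sketch of the scheduler's filter-then-score cycle (illustrative, not kube-scheduler code).

def schedule(pod_request: int, nodes: dict):
    """nodes maps name -> {"free": int, "tainted": bool}. Returns a node name or None."""
    # Filter phase: drop nodes that cannot run the pod at all.
    feasible = {name: info for name, info in nodes.items()
                if info["free"] >= pod_request and not info["tainted"]}
    if not feasible:
        return None  # no feasible node: the pod stays Pending
    # Score phase: prefer the node with the most free resources
    # (a stand-in for real priorities like spreading and balanced usage).
    return max(feasible, key=lambda name: feasible[name]["free"])

nodes = {
    "node-a": {"free": 2000, "tainted": False},
    "node-b": {"free": 8000, "tainted": False},
    "node-c": {"free": 9000, "tainted": True},
}
print(schedule(1000, nodes))   # node-b: node-c is filtered out despite more free memory
print(schedule(10000, nodes))  # None -> the pod stays Pending
```

The key insight: filters are absolute vetoes (taints, resource requests), while scores only rank the survivors. A pod stuck in `Pending` means the filter phase emptied the list.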
What's the biggest production gotcha with the kubelet?
Resource misconfiguration. If you don't set proper requests and limits, a single pod can consume all node resources, causing evictions or node instability. Also, the kubelet's `--max-pods` limit is a hard cap that overrides available CPU/RAM.
Developer and founder of TheCodeForge. I built this site because I was tired of tutorials that explain what to type without explaining why it works. Every article here is written to make concepts actually click.