Advanced 8 min · March 06, 2026

AWS EKS ENI Limits — Why Healthy Nodes Reject Pods

EC2 ENI slots cap EKS pod count per node, even with free CPU and memory.

Naren · Founder
Quick Answer
  • EKS is a managed Kubernetes control plane on AWS.
  • Control plane runs in a separate AWS account, multi-AZ for HA.
  • VPC CNI assigns each pod a VPC IP address directly.
  • Node groups: managed, self-managed, or Fargate.
  • IRSA uses OIDC to grant IAM roles to pods.
  • Biggest gotcha: hitting ENI limits causes pod scheduling failures.

Kubernetes is the de facto standard for running containerised workloads at scale, but running a production-grade Kubernetes control plane yourself is genuinely brutal. etcd upgrades, API server HA, certificate rotation, audit log pipelines — it's a full-time job before you've written a single line of application code. That's the gap AWS EKS was built to fill, and in 2024 it powers thousands of production systems from fintech to streaming to machine learning pipelines.

The problem EKS solves isn't just 'run Kubernetes for me.' It's the deep integration question: how do your pods get IAM permissions without storing static credentials? How does pod networking interact with AWS VPC routing tables? How do you autoscale nodes without leaving zombie instances behind? These are the questions that burn teams at 2 AM, and they all have specific EKS answers that differ from vanilla Kubernetes.

By the end of this article you'll understand exactly how the EKS control plane is architected and why, how VPC CNI assigns IPs to pods and where it breaks under load, how IAM Roles for Service Accounts (IRSA) works at the token level, how to choose between managed node groups, self-managed nodes, and Fargate, and which production gotchas have silently broken real deployments. This is the article you'll come back to before your next EKS architecture review.

EKS Control Plane Internals

AWS EKS runs the Kubernetes control plane (etcd, API server, controller manager, scheduler) in a separate AWS-managed account that you never see. It's fully managed: AWS handles patching, scaling, and multi-AZ high availability, and that work is covered by the per-cluster hourly fee rather than billed separately. Kubernetes version upgrades are still yours to initiate. But 'managed' doesn't mean you can ignore it. The control plane exposes a public API server endpoint, a private one, or both. You interact with it exactly like a vanilla Kubernetes API server, but there are subtle differences. For example, you cannot directly access etcd; AWS abstracts it completely. Audit logs are off by default and must be enabled as a control plane log type, after which they are delivered to CloudWatch Logs (CloudTrail records calls to the EKS API itself, not Kubernetes API requests). The API server is fronted by a Network Load Balancer (NLB). The control plane also runs AWS-specific admission webhooks, such as the pod identity webhook that mutates pods whose service account carries an IRSA role annotation.

Under the hood, AWS runs the control plane components on EC2 instances in its own infrastructure, and each cluster's control plane is isolated from every other tenant's. The etcd data is encrypted at rest and in transit. If you need to inspect control plane health, you rely on Prometheus-format metrics such as apiserver_request_duration_seconds or etcd_request_duration_seconds, which the API server serves from its /metrics endpoint. You can scrape them into Amazon Managed Service for Prometheus (AMP), and AWS publishes a subset of control plane metrics to CloudWatch. In production, the most common control plane issue is throttling from the API server when your clients' requests or watches generate too many requests per second. The API server enforces this itself, through its in-flight request limits and Kubernetes API Priority and Fairness.

Debugging hint: If you see 429 Too Many Requests from kubectl, you've hit the API server rate limit. Enable retries with backoff in your automation tools.
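The retry advice above can be sketched as exponential backoff with jitter. This is a minimal illustration rather than a client library feature: `ThrottledError` and `call_with_backoff` are hypothetical names, and the delays are assumptions you would tune for your own tooling.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for an HTTP 429 response from the API server."""


def call_with_backoff(request, max_attempts=5, base_delay=0.5, max_delay=30.0,
                      sleep=time.sleep):
    """Call request() and retry on ThrottledError with capped exponential backoff.

    A random delay below the cap ("full jitter") keeps many clients from
    retrying in lockstep after a shared throttling event.
    """
    for attempt in range(max_attempts):
        try:
            return request()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the throttling error
            ceiling = min(max_delay, base_delay * (2 ** attempt))
            sleep(random.uniform(0, ceiling))
```

Passing `sleep` as a parameter is only there to make the function easy to test; real callers can leave the default.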

VPC CNI Networking: Pod IP Allocation and Limits

The AWS VPC CNI is what makes EKS unique among Kubernetes distributions. Instead of using an overlay network (like Flannel or Calico in IPIP mode), each pod gets a real IP address from your VPC subnet. This means pods can communicate with other VPC resources (RDS, EC2, Lambda) without any NAT or proxy. The CNI achieves this by attaching multiple Elastic Network Interfaces (ENIs) to each EC2 node and assigning secondary IP addresses from those ENIs to pods. When a pod is created, the CNI plugin picks an unused IP from a warm pool and assigns it to the pod's network namespace. This setup gives native VPC integration, but it comes with hard limits.

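The first limit is per-instance ENI count. An m5.large has a maximum of 3 ENIs, each with up to 10 IPv4 addresses. The first address on each ENI is not handed to pods, and two host-network pods (aws-node and kube-proxy) don't consume a pod IP, so the maximum is 3 × (10 - 1) + 2 = 29 pods. An m5.4xlarge supports 8 ENIs × 30 IPs, which gives 8 × (30 - 1) + 2 = 234 pods. Smaller instance types hit these limits faster than you think. The second limit is IP address exhaustion in your VPC subnet. A /24 subnet has 256 addresses, of which AWS reserves 5, so 251 are usable; 1000 pods will not fit. Pods also compete for those addresses with nodes, load balancers, and any other resource that places an ENI in the subnet.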
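The arithmetic is worth scripting before you pick an instance type. The sketch below uses the formula AWS applies when it computes max pods for secondary-IP mode, ENIs × (IPs per ENI - 1) + 2, which is why the result for an m5.large comes out one below the raw 3 × 10 product. The function names are illustrative, and the ENI and IP counts are inputs you should confirm against the EC2 limits for your instance type.

```python
import ipaddress

# AWS reserves the first four addresses and the last address in every subnet.
AWS_RESERVED_IPS_PER_SUBNET = 5


def max_pods(enis: int, ips_per_eni: int) -> int:
    """Pod capacity of a node in secondary-IP mode.

    The first IP on each ENI is not handed to pods, hence (ips_per_eni - 1).
    The +2 covers host-network pods (aws-node, kube-proxy) that need no pod IP.
    """
    return enis * (ips_per_eni - 1) + 2


def usable_subnet_ips(cidr: str) -> int:
    """Addresses in a VPC subnet that can actually be assigned."""
    return ipaddress.ip_network(cidr).num_addresses - AWS_RESERVED_IPS_PER_SUBNET


if __name__ == "__main__":
    print(max_pods(enis=3, ips_per_eni=10))   # m5.large -> 29
    print(usable_subnet_ips("10.0.0.0/24"))   # -> 251
```

Dividing the usable subnet size by the per-node pod count gives a rough ceiling on how many full nodes a subnet can host.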

To scale beyond these limits, AWS offers custom networking (assign pods IPs from different subnet ranges, typically a secondary VPC CIDR) and prefix delegation (assign /28 prefixes to ENIs, giving 16 addresses per slot instead of one). Prefix delegation is enabled by setting the ENABLE_PREFIX_DELEGATION environment variable to true on the aws-node daemonset. It is not on by default, and it requires Nitro-based instance types.
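To see how much headroom prefix delegation adds, here is a rough sketch of the address math. It assumes each non-primary slot on an ENI holds one /28 prefix (16 addresses); the function name is illustrative. Treat the result as an IP ceiling only: the kubelet's max-pods setting is normally capped well below it (AWS's guidance, as I understand it, is 110 on smaller instance types).

```python
PREFIX_SIZE = 16  # a /28 IPv4 prefix contains 16 addresses


def max_pod_ips_with_prefixes(enis: int, slots_per_eni: int) -> int:
    """IP-based pod ceiling when every non-primary slot holds a /28 prefix."""
    return enis * (slots_per_eni - 1) * PREFIX_SIZE + 2


if __name__ == "__main__":
    # m5.large: 3 ENIs with 10 slots each
    print(max_pod_ips_with_prefixes(enis=3, slots_per_eni=10))  # -> 434
```

The trade-off is subnet fragmentation: a /28 must be carved out as a contiguous block, so a subnet with many scattered single IPs in use can fail to allocate prefixes even when free addresses remain.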

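Another common production trap: The CNI plugin uses the EC2 API to attach/detach ENIs, which means the IAM role it runs under (the node role, or an IRSA role for aws-node) needs specific permissions (ec2:AttachNetworkInterface, ec2:CreateNetworkInterface, etc.). AWS bundles them in the AmazonEKS_CNI_Policy managed policy. If those permissions are missing, the CNI cannot allocate addresses, and pods stay stuck (phase Pending, shown as ContainerCreating) with FailedCreatePodSandBox events. The underlying error appears in the aws-node logs rather than on the pod itself.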
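A quick way to catch this before it bites is to diff a policy document against the actions the CNI needs. The sketch below is a simplification under stated assumptions: it only checks the two actions named above (the real managed policy contains more), it only looks at Allow statements, and it ignores Deny statements, conditions, and resource scoping. `missing_actions` is a hypothetical helper, not part of any AWS SDK.

```python
import json
from fnmatch import fnmatchcase

# Only the two actions named in the text; the full list is longer.
REQUIRED_ACTIONS = ["ec2:AttachNetworkInterface", "ec2:CreateNetworkInterface"]


def missing_actions(policy_json: str, required=REQUIRED_ACTIONS) -> list:
    """Return the required actions that no Allow statement covers."""
    statements = json.loads(policy_json)["Statement"]
    if isinstance(statements, dict):  # a single statement may not be wrapped in a list
        statements = [statements]

    allowed = []
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt.get("Action", [])
        allowed.extend([actions] if isinstance(actions, str) else actions)

    # IAM action names are case-insensitive and may contain * and ? wildcards.
    return [
        action for action in required
        if not any(fnmatchcase(action.lower(), pattern.lower()) for pattern in allowed)
    ]
```

Feed it the JSON of the policy attached to the role; an empty list means both actions are allowed somewhere in the document.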

Node Group Strategies: Managed vs Self-Managed vs Fargate

EKS offers three modes to run worker nodes: managed node groups, self-managed node groups, and AWS Fargate. Each has trade-offs in operational overhead, cost, and flexibility.

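Managed node groups are the default choice. You specify an AMI type (for example Amazon Linux 2023 or Bottlerocket), instance types, and scaling config. AWS provisions the Auto Scaling group, replaces unhealthy nodes, and drains nodes gracefully during updates. It publishes patched AMIs, but it does not roll them out for you: you trigger each node group update, and AWS then performs the rolling replacement. You also get automatic security group rules for the control plane. Customisation goes through an EC2 launch template, which lets you supply a custom AMI, user data, or kernel settings, and AWS offers accelerated (GPU) variants of the EKS optimized AMIs. The downside is that you work within the managed update and lifecycle model, which is less flexible than owning the Auto Scaling group yourself.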

Self-managed node groups give you full control. You launch EC2 instances (usually from an EKS optimized AMI or your own build), run the bootstrap script, and join them to the cluster. You control the AMI, the bootstrap script, and the lifecycle policies. This is useful for heavily customised GPU workloads (like ML training with specific driver builds) or when you need to pin specific kernel versions. The trade-off is operational burden: you must manage AMI updates, security patches, and node replacement yourself.

Fargate is the serverless option. You define a Fargate profile, and pods that match its namespace and label selectors run on Fargate instead of EC2. This is ideal for batch jobs, CI/CD runners, or sporadic workloads that don't justify always-on instances. The catch: you pay for the vCPU and memory each pod requests, billed per second with a one-minute minimum. Fargate also caps a pod at 16 vCPU and 120 GB of memory, and you cannot run daemonsets or privileged containers.

A real-world pattern: Run critical microservices on managed node groups (steady state), bursty data processing on Fargate, and GPU training on a self-managed node group with custom AMI.

IAM Roles for Service Accounts (IRSA) — How Pods Get AWS Credentials

Before IRSA, you had to store AWS access keys in Kubernetes Secrets, or share the node's instance profile and restrict it per pod with kube2iam or kiam. Both were fragile. IRSA solves this by using OpenID Connect (OIDC) federation. Every EKS cluster has an OIDC issuer URL (e.g., oidc.eks.<region>.amazonaws.com/id/XXXXX). You register that issuer as an IAM OIDC identity provider, then create an IAM role with a trust policy that allows identities from that provider to assume the role for a specific service account in a specific namespace. The Kubernetes API server issues a signed token (a JWT) that includes the pod's service account and namespace. The EKS pod identity mutating webhook adds a projected volume (named aws-iam-token) and the AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE environment variables to the pod, and the kubelet writes the token to a file (by default at /var/run/secrets/eks.amazonaws.com/serviceaccount/token). The AWS SDK's credential chain picks up this token and calls sts:AssumeRoleWithWebIdentity to get temporary AWS credentials.
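The trust policy is where most IRSA setups go wrong, so it helps to generate it rather than hand-edit it. Below is a minimal sketch that builds the policy document as a Python dict. The account ID, region, issuer ID, namespace, and service account name in the example are placeholders, and `irsa_trust_policy` is an illustrative helper, not an AWS API.

```python
import json


def irsa_trust_policy(account_id, region, oidc_id, namespace, service_account):
    """Build a trust policy scoped to one service account in one namespace."""
    issuer = f"oidc.eks.{region}.amazonaws.com/id/{oidc_id}"
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {
                "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{issuer}"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {"StringEquals": {
                # Both claims are checked against the token the pod presents.
                f"{issuer}:aud": "sts.amazonaws.com",
                f"{issuer}:sub": f"system:serviceaccount:{namespace}:{service_account}",
            }},
        }],
    }


if __name__ == "__main__":
    # All values below are placeholders for illustration.
    policy = irsa_trust_policy("111122223333", "us-east-1", "EXAMPLE",
                               "payments", "s3-reader")
    print(json.dumps(policy, indent=2))
```

Pinning the sub claim with StringEquals is the important design choice: without it, any service account in the cluster could assume the role.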

Common pitfalls: The IAM OIDC identity provider must be registered with the cluster's exact issuer URL and with sts.amazonaws.com as its audience. If the token's aud claim is not sts.amazonaws.com, or the sub condition in the trust policy doesn't match the pod's namespace and service account, STS rejects the request. The token is short-lived (24 hours by default for IRSA), but the kubelet rotates the file before it expires and the AWS SDK re-reads it when it refreshes credentials, as long as the pod can reach AWS STS. If your VPC doesn't have internet access and you're using a private cluster, you need an STS interface VPC endpoint (com.amazonaws.<region>.sts) and SDKs configured to use the regional STS endpoint.

Another gotcha: The token is issued by the Kubernetes API server, not by AWS STS. It's literally a Kubernetes service account token. AWS verifies the token's signature using the public keys published at the cluster's OIDC issuer URL. The thumbprint, where one is used, is stored on the IAM OIDC identity provider resource, not in the role trust policy. A Kubernetes version upgrade doesn't change the issuer URL, but recreating the cluster does: the new cluster gets a new issuer ID, and pods lose all AWS permissions until you register the new provider and update each role's trust policy.

Production Gotchas and How to Avoid Them

Even with managed services, EKS clusters suffer from specific failure patterns. Here are the ones we've seen destroy weekends.

CNI plugin race condition: When nodes start, the aws-node daemonset must be running before pods can get IPs. If pods are scheduled to a node before the CNI is ready, they fail with FailedCreatePodSandBox. This is especially common during node group scaling events. In most cases the kubelet retries and the pod starts once aws-node is up. An init container won't help here, because it needs the same pod sandbox. Fix: If the errors persist, check the aws-node pod and its logs on the affected node, and keep the VPC CNI add-on current.

Cluster Autoscaler not scaling down: Start by finding out what is blocking scale-down (e.g., pods annotated with cluster-autoscaler.kubernetes.io/safe-to-evict: "false"). A common culprit is kube-system pods that are not backed by a PDB (Pod Disruption Budget). By default, the autoscaler will not remove a node that runs non-daemonset kube-system pods unless those pods have a PDB, because it can't guarantee safe eviction. Fix: Add PDBs for critical system pods.

AWS Load Balancer Controller (ALB/NLB) misconfig: The controller needs IAM permissions to create target groups, listeners, etc. If load balancers provision in one environment but never appear in another, check the subnet tags: the subnets where load balancers are created must be tagged with kubernetes.io/role/elb (public) or kubernetes.io/role/internal-elb (private). Without those tags, subnet auto-discovery fails and no load balancer is provisioned; the error shows up only in the controller logs and in events on the Ingress or Service.
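A tag audit is easy to automate. The sketch below works on subnet records shaped like the output of EC2 DescribeSubnets (a SubnetId plus a Tags list of Key/Value pairs) and reports which load balancer roles each subnet is tagged for. It assumes the tag value is "1" or empty, which is my understanding of what the controller accepts; `elb_roles` is an illustrative helper, and fetching the subnets is left to whatever tooling you already use.

```python
PUBLIC_TAG = "kubernetes.io/role/elb"
INTERNAL_TAG = "kubernetes.io/role/internal-elb"
ACCEPTED_VALUES = ("1", "")  # assumption: the controller accepts "1" or an empty value


def elb_roles(subnet: dict) -> list:
    """Return the load balancer roles a subnet is tagged for."""
    tags = {t["Key"]: t["Value"] for t in subnet.get("Tags", [])}
    roles = []
    if tags.get(PUBLIC_TAG) in ACCEPTED_VALUES:
        roles.append("internet-facing")
    if tags.get(INTERNAL_TAG) in ACCEPTED_VALUES:
        roles.append("internal")
    return roles


if __name__ == "__main__":
    # Hypothetical subnet records for illustration.
    subnets = [
        {"SubnetId": "subnet-aaa", "Tags": [{"Key": PUBLIC_TAG, "Value": "1"}]},
        {"SubnetId": "subnet-bbb", "Tags": [{"Key": "Name", "Value": "private-a"}]},
    ]
    for s in subnets:
        print(s["SubnetId"], elb_roles(s) or "UNTAGGED")
```

Any subnet that prints UNTAGGED will be skipped by auto-discovery for both public and internal load balancers.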

DNS resolution issues with CoreDNS: CoreDNS pods sometimes get scheduled on nodes under memory pressure, causing them to be OOMKilled. This leads to intermittent DNS failures across the cluster. Fix: Set resource requests and limits on CoreDNS, and consider deploying two replicas on different node types.

EBS CSI driver not installed: If you run stateful workloads, you need the EBS CSI driver. Many teams miss this and see pods stuck in ContainerCreating with an event regarding volumes. Fix: Verify the EBS CSI add-on is enabled in the EKS console or via eksctl.

Security group limits: Each ENI can have up to 5 security groups by default. This is an adjustable per-interface VPC quota, not a per-cluster one. If you attach many security groups per pod (via the CNI's security groups for pods feature), you'll hit that quota. Plan your security groups carefully.

EKS Node Group Types Comparison
Feature | Managed Node Groups | Self-Managed Node Groups | Fargate
AMI control | AWS-provided AMIs, or a custom AMI via launch template | Full control, custom AMI | No control, uses AWS-managed infra
Upgrades | Rolling updates, triggered by you | Manual (AMIs, kubelet) | Automatic (infra is ephemeral)
Instance types | Most EC2 instance types | Any EC2 instance type | No instance choice: max 16 vCPU, 120 GB memory per pod
Daemonsets | Full support | Full support | Not supported
Cost model | EC2 instances (per hour) | EC2 instances (per hour) | Per pod-second (min 1 min)
Pod networking | VPC CNI (native) | VPC CNI (native) or overlay | VPC CNI (native, limited)

Key Takeaways

  • EKS abstracts the control plane but exposes metrics — monitor etcd and API server request latency.
  • VPC CNI is the most powerful EKS feature and the most dangerous — plan subnet size per node type.
  • IRSA eliminates static credentials but depends on a correctly registered IAM OIDC provider and a trust policy scoped by the aud and sub claims.
  • Most production incidents are network-related: ENI limits, DNS, or security group caps.
  • Use a hybrid node group strategy: managed for steady workloads, Fargate for bursty jobs, self-managed for custom needs.

Common Mistakes to Avoid

  • Not planning IP address exhaustion in subnets
    Symptom: Cluster runs out of IPs and new pods cannot be scheduled. No errors in CloudWatch, just Pending events.
    Fix: Monitor IP usage per subnet with VPC CNI metrics. Use prefix delegation or add secondary CIDRs. Plan pod density per instance type before deployment.
  • Assuming managed node groups handle all security patching
    Symptom: Node AMIs become outdated with severe CVEs. AWS patches the control plane but not the nodes automatically unless you update the node group.
    Fix: Set up regular node group updates (via eksctl or console). Use Bottlerocket for automated OS updates. Enable security scanning tools.
  • Running critical workloads without Pod Disruption Budgets
    Symptom: During new node deployments, pods are evicted without warning, causing service disruption. Cluster autoscaler cannot drain nodes.
    Fix: Define PDBs for all production services: minAvailable: 2 for replicas >= 3. Test node group updates in staging first.
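The PDB fix in the last item can be generated rather than hand-written. Here is a minimal sketch that builds a policy/v1 PodDisruptionBudget manifest as a dict and prints it as JSON, which kubectl apply accepts. The name, namespace, and labels are placeholders, and `pdb_manifest` is an illustrative helper.

```python
import json


def pdb_manifest(name, namespace, match_labels, min_available=2):
    """Build a PodDisruptionBudget that keeps min_available pods running."""
    return {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "minAvailable": min_available,
            "selector": {"matchLabels": match_labels},
        },
    }


if __name__ == "__main__":
    # Placeholder service: three or more replicas labelled app=checkout.
    manifest = pdb_manifest("checkout-pdb", "prod", {"app": "checkout"})
    print(json.dumps(manifest, indent=2))
```

Keep minAvailable below the replica count. A budget equal to the number of replicas blocks every voluntary eviction, which stalls node drains and node group updates.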

Interview Questions on This Topic

  • Q: How does EKS VPC CNI assign IPs to pods? What limit does it have? (Mid-level)
    The VPC CNI attaches ENIs to each EC2 node and assigns secondary IPs from those ENIs to pods. The limit is per-instance ENI count and IPs per ENI. For example, an m5.large can have 3 ENIs × 10 IPs; because the first IP on each ENI is not handed to pods, that works out to 3 × (10 - 1) + 2 = 29 pods. You can use prefix delegation to assign /28 prefixes per ENI, dramatically increasing pod density. Without it, you'll hit the limit quickly.
  • Q: Explain IRSA (IAM Roles for Service Accounts) and how it works at the token level. (Senior)
    IRSA uses OIDC federation. Each EKS cluster has an OIDC issuer URL. You create an IAM role whose trust policy allows identities from that OIDC provider to assume the role for a specific service account. The Kubernetes API server issues a short-lived JWT that the kubelet projects into the pod, and the AWS SDK exchanges it for temporary credentials via sts:AssumeRoleWithWebIdentity. The kubelet rotates the token and the SDK refreshes the credentials automatically. The key is that the token contains the service account identity, and the role's trust policy checks the aud and sub claims.
  • Q: Your EKS cluster's pods are stuck in Pending but nodes have free CPU and memory. What do you investigate? (Mid-level)
    First check pod events: kubectl describe pod. Look for signs of IP exhaustion (pod limit), volume attachment limits, or CNI errors. Then check node allocatable pods: kubectl describe node. Also verify subnet IP space isn't exhausted using aws ec2 describe-subnets. Another common cause is AWS EBS volume attachment limits per instance type. Finally, check the CNI daemonset logs if it's failing to allocate IPs.
  • Q: What are the trade-offs between managed node groups and self-managed node groups? (Senior)
    Managed node groups reduce operational overhead: AWS provisions the Auto Scaling group, drains nodes during updates, and integrates with EKS. Custom AMIs and user data are still possible through a launch template, but you work within the managed update model. Self-managed gives full control over the AMI, bootstrap, and lifecycle but requires manual AMI management, patching, and scaling logic. For most workloads, managed node groups are the right choice. Self-managed makes sense when you need control that the managed lifecycle doesn't offer.

Frequently Asked Questions

Can I use EKS without any VPC experience?

Not safely. EKS networking depends heavily on VPC design. You need to understand subnets, route tables, security groups, and ENI limits before you launch a cluster. We recommend a dedicated VPC for each environment.

How does EKS pricing work?

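You pay $0.10 per cluster per hour while the Kubernetes version is in standard support (regardless of number of nodes); versions in extended support cost more. Nodes are charged separately as EC2 instances. Fargate pods are charged for the vCPU and memory they request, billed per second with a one-minute minimum. There is no separate control plane charge beyond the per-cluster fee.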
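As a worked example of the per-cluster fee, the sketch below multiplies the hourly rate quoted above by a number of hours. It uses Decimal to avoid floating-point rounding on currency, assumes a 30-day month for illustration, and leaves out node and Fargate charges.

```python
from decimal import Decimal

CLUSTER_FEE_PER_HOUR = Decimal("0.10")  # USD, the per-cluster rate quoted above


def cluster_fee(hours: int) -> Decimal:
    """Control plane fee for one cluster over the given number of hours."""
    return CLUSTER_FEE_PER_HOUR * hours


if __name__ == "__main__":
    print(cluster_fee(24 * 30))  # 30-day month -> 72.00
```

The fee is per cluster, so a separate cluster for each environment multiplies it by the number of environments.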

Can I run a private EKS cluster without internet access?

Yes. You can create an EKS cluster with a private API server endpoint and use VPC endpoints for ECR, S3, STS, and other AWS services. Nothing is cached for you: nodes still pull container images (including the VPC CNI, kube-proxy, and CoreDNS images) from ECR, so the ECR API, ECR Docker, and S3 endpoints must be in place before nodes can join and add-ons can start.

Why do my pods fail to start after an EKS cluster upgrade?

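Common causes: Kubernetes API deprecations, CNI plugin version incompatibility, or node and add-on versions that have drifted too far from the control plane. Before upgrading, check that your add-ons support the target version; after the control plane upgrade, update the VPC CNI, CoreDNS, and kube-proxy add-ons, then the node groups. Test in a staging environment before production.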
