IP Subnetting - The /25 Mask That Broke Internet Access
- What is IP Addressing and Subnetting?
- CIDR Notation: How to Read and Calculate Hosts
- Subnet Masks: Binary and Decimal
- IP addressing uniquely identifies devices; subnetting divides address space into smaller, routable blocks.
- CIDR notation (e.g., /24) replaces classful addressing and defines how many host bits you get.
- Hosts = 2^(32 - prefix) - 2; the -2 is for the network and broadcast addresses you cannot assign.
- Production failure: a /25 instead of /24 changes the network boundary and can make your default gateway unreachable.
- Performance insight: each wrong bit in a subnet mask can route traffic to the wrong VLAN or blackhole it entirely.
- Biggest mistake: thinking subnetting is only about saving IPs. It's about routing — wrong mask, wrong network.
Subnet Calculation Quick Cheat Sheet
Need to know how many usable IPs a CIDR provides
    ipcalc 10.0.1.0/24
    ipcalc 10.0.1.0/24 --ipaddress 10.0.1.15
Wondering if two subnets overlap
    ipcalc 10.0.1.0/24 10.0.2.0/24
    python3 -c "from ipaddress import ip_network; print('overlap' if ip_network('10.0.1.0/24').overlaps(ip_network('10.0.2.0/24')) else 'no')"
Need the subnet mask from a CIDR prefix length
    printf '/24 = %s\n' $(python3 -c "import ipaddress; print(ipaddress.IPv4Network('0.0.0.0/24').netmask)")
Need to check if a CIDR is in the RFC 1918 private range
    python3 -c "from ipaddress import ip_network; net = ip_network('10.0.0.0/8'); print('private' if net.is_private else 'public')"
Need to calculate how many /24 subnets fit into a /20 VPC
    python3 -c "print(2**(24-20))" # gives 16 for /20 -> /24
    python3 -c "from ipaddress import ip_network; net = ip_network('10.0.0.0/20'); print(list(net.subnets(new_prefix=24))[:5])"
Need to verify the network address of a host IP
    python3 -c "from ipaddress import ip_network; net = ip_network('10.0.1.55/24', strict=False); print(net.network_address)"
Need to check if a subnet has enough room for a given number of hosts
    python3 -c "from ipaddress import ip_network; net = ip_network('10.0.0.0/24'); print('Usable:', net.num_addresses - 2)"
    python3 -c "print('AWS usable:', 2**(32-24) - 5)"
Production Incident
Production Debug Guide
Quick reference for diagnosing common subnet-related production issues.
- Run aws ec2 describe-subnets --subnet-ids and look at 'AvailableIpAddressCount'. Increase subnet size or create a larger one.
- Use ipcalc to calculate the network address for both prefixes. If they differ, one router must be reconfigured with a matching mask or the route must be summarised.
- Use ipcalc or ipaddress.collapse_addresses to check for gaps.
Every packet that crosses your network (an API call, a database query, a Kubernetes pod talking to another) carries a source and destination IP. Without IP addressing, you're just shouting into the void. Get the subnet mask wrong and traffic doesn't just slow down. It stops.
IPv4 has about 4.3 billion addresses, and we ran out years ago. Subnetting, CIDR, and private ranges are the engineering fixes that made the internet keep working. They're baked into every VPC, every router config, every cloud environment you'll ever touch.
Here's the truth: most engineers won't calculate subnets by hand daily. But the one time you need to, a single wrong mask can silence an entire production fleet. That's why you need to understand it — not just pass a cert exam.
This guide covers CIDR math, binary masks, the /25 that broke internet access, and the Python commands that'll save you from subnet calculators forever.
The real cost of a misconfigured subnet isn't just wasted IPs – it's hours of debugging, missed SLAs, and sometimes a full incident post-mortem. That's why this guide focuses on what actually breaks and how to fix it fast.
What is IP Addressing and Subnetting?
An IP address is a 32-bit binary number, usually written in dotted decimal. The subnet mask separates the address into a network part and a host part. Routers use the network part to forward packets; the host part identifies a specific device on that network. Subnetting lets you split one large network into smaller ones, a technique that reduces routing table size, improves security, and conserves addresses. Without it, every router on the internet would need to know the location of every single host, which is an impossible task. In production, the key insight is that the boundary between network and host is purely a design decision: you choose the mask. Choose wrong and you either waste addresses or break routing.
Here's the mental model: think of an IP address as a phone number. The area code is the network part, the local number is the host part. Routers only care about the area code to forward your call to the right exchange. Subnetting lets you create new area codes within a city — without it, every router would need to know every local number individually. That doesn't scale.
One more angle: subnetting also creates security boundaries. A router won't forward broadcast traffic across subnets. That means a misconfigured device can't flood your whole network with ARP requests if it's locked to its own subnet. That's a feature you'll appreciate after your first broadcast storm.
Let me tell you something I learned the hard way: when a mask is off by one bit, it's not just a little wrong, it's completely wrong. In production, a /25 instead of a /24 cuts the network in half, and if the default gateway sits in the other half it becomes unreachable. The host decides the gateway is on a different network and has no route to it, so outbound traffic goes nowhere. No error. No log entry. Just silence.
You might think you'll never make that mistake. But I've seen it three times in the last two years alone. Each time the engineer stared at the route table for hours before someone finally checked the subnet mask. Always verify the mask first.
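To make the /25 failure concrete, here is a minimal sketch using Python's ipaddress module. The host 10.0.1.200 and gateway 10.0.1.1 are illustrative assumptions, not values from the incident, and gateway_on_link is a hypothetical helper name.

```python
import ipaddress

def gateway_on_link(host_ip: str, prefix: int, gateway_ip: str) -> bool:
    """Return True if a host configured with this prefix sees the gateway as on-link."""
    iface = ipaddress.ip_interface(f"{host_ip}/{prefix}")
    return ipaddress.ip_address(gateway_ip) in iface.network

# Assumed example addresses: the gateway lives in the lower half of the /24.
host, gateway = "10.0.1.200", "10.0.1.1"

for prefix in (24, 25):
    network = ipaddress.ip_interface(f"{host}/{prefix}").network
    reachable = gateway_on_link(host, prefix, gateway)
    print(f"/{prefix}: host network {network}, gateway on-link: {reachable}")
```

With /24 the host's network is 10.0.1.0/24 and the gateway is on-link. With /25 the host lands in 10.0.1.128/25, the gateway falls outside it, and the host has no directly connected path to its own default gateway.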
package io.thecodeforge.subnetting;

public class SubnetDemo {
    public static void main(String[] args) {
        String ip = "192.168.1.55";
        String mask = "255.255.255.0";
        String network = ipAndMask(ip, mask);
        System.out.println("Network: " + network); // Network: 192.168.1.0
    }

    // AND each octet of the address with the matching octet of the mask.
    static String ipAndMask(String ip, String mask) {
        // split() takes a regex, so the dot must be escaped as "\\."
        String[] ipParts = ip.split("\\.");
        String[] maskParts = mask.split("\\.");
        StringBuilder result = new StringBuilder();
        for (int i = 0; i < 4; i++) {
            int ipPart = Integer.parseInt(ipParts[i]);
            int maskPart = Integer.parseInt(maskParts[i]);
            result.append(ipPart & maskPart);
            if (i < 3) result.append(".");
        }
        return result.toString();
    }
}
CIDR Notation: How to Read and Calculate Hosts
CIDR (Classless Inter-Domain Routing) notation replaced the rigid classful system (A, B, C) back in the '90s. Instead of assuming network boundaries based on the first octet, you specify the prefix length explicitly: 192.168.1.0/24 means the first 24 bits are the network prefix, and the remaining 8 bits are host bits. That gives you 2^8 = 256 total addresses, but you lose two: the network address (all host bits 0) and the broadcast address (all host bits 1). So usable hosts = 2^(32 - prefix) - 2. For /24, that's 254. For /16, it's 65534. For /28, it's 14 — way too small for most production workloads.
The formula is simple, but the production trap is thinking that 'size' means usable hosts. I've seen teams provision a /28 for an API service that needed 20 IPs per AZ, then scramble to redesign after the launch failed. Always add 20-30% buffer.
CIDR also enabled route aggregation (supernetting), which dramatically shrinks the global routing table. Before CIDR, the internet was running out of routes. Now, a single /8 aggregate can represent millions of addresses.
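Route aggregation is easy to see on a small scale. The sketch below uses ipaddress.collapse_addresses on a handful of example /24 blocks (the addresses are illustrative assumptions) to show when adjacent networks merge into one summary route and when they do not.

```python
import ipaddress

# Four contiguous /24s that start on a /22 boundary collapse into one route.
aligned = [ipaddress.ip_network(f"10.0.{i}.0/24") for i in range(4)]
summary = list(ipaddress.collapse_addresses(aligned))
print(summary)

# Two adjacent /24s that straddle a /23 boundary cannot be summarised.
unaligned = [ipaddress.ip_network("10.0.1.0/24"), ipaddress.ip_network("10.0.2.0/24")]
not_merged = list(ipaddress.collapse_addresses(unaligned))
print(not_merged)
```

The first list collapses to 10.0.0.0/22. The second stays as two routes, because 10.0.1.0/24 and 10.0.2.0/24 belong to different /23 parents. Adjacency alone is not enough; the blocks have to share a common prefix.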
Here's a quick way to estimate: for any /X, usable hosts ≈ 2^(32-X). For /24, that's ~250. For /23, ~500. For /22, ~1000. The pattern doubles each time you reduce the prefix by 1. So /16 gives ~65000. That's your mental shortcut.
Another common mistake: confusing the CIDR notation with the subnet mask. When someone says "the CIDR is 255.255.255.0", they mean the mask, not the prefix. CIDR notation is /24. Keep that straight in team discussions.
A production-grade tip: always document your CIDR blocks in a central spreadsheet or IPAM tool. I've seen teams waste hours because they didn't know which /24 was already used. Automation is your friend here.
One more thing: in cloud environments, subnet sizes are often limited by the provider's reserved addresses. In AWS, every subnet loses 5 IPs, not 2. So a /28 gives you only 11 usable IPs. Your 14-host formula is wrong for AWS. Always check the cloud provider's documentation.
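A small sketch of that difference: the hypothetical helper usable_hosts below takes the number of reserved addresses as a parameter, so the same function covers classic subnetting (2 reserved) and the AWS figure quoted above (5 reserved). Other providers may reserve a different count, so treat the constant as an assumption to verify.

```python
import ipaddress

CLASSIC_RESERVED = 2  # network + broadcast
AWS_RESERVED = 5      # figure quoted in the text above; confirm in the provider docs

def usable_hosts(cidr: str, reserved: int = CLASSIC_RESERVED) -> int:
    """Total addresses in the block minus the addresses you cannot assign."""
    net = ipaddress.ip_network(cidr, strict=False)
    return max(net.num_addresses - reserved, 0)

for cidr in ("10.0.0.0/24", "10.0.0.0/28"):
    print(cidr, "classic:", usable_hosts(cidr), "AWS:", usable_hosts(cidr, AWS_RESERVED))
```

For the /28 this prints 14 for the classic case and 11 for AWS, matching the numbers in the paragraph above.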
And don't forget about overlapping CIDRs: if you accidentally assign the same /24 to two subnets, routing chaos follows. Use a central IPAM tool to prevent that.
import ipaddress

def cidr_info(cidr: str) -> dict:
    net = ipaddress.IPv4Network(cidr, strict=False)
    return {
        'network_address': str(net.network_address),
        'broadcast_address': str(net.broadcast_address),
        'netmask': str(net.netmask),
        'prefix_length': net.prefixlen,
        'total_addresses': net.num_addresses,
        'usable_hosts': net.num_addresses - 2,
    }

# Usage:
for cidr in ['192.168.1.0/24', '10.0.0.0/16', '172.16.0.0/28']:
    info = cidr_info(cidr)
    print(f'{cidr} -> {info["usable_hosts"]} usable hosts')
- The network bits are fixed and define the neighbourhood.
- The host bits are variable — they define the specific house.
- Shorter prefix (/16) = more hosts, fewer networks.
- Longer prefix (/28) = fewer hosts, more networks — useful for point-to-point links.
- Each reduction in prefix by 1 doubles the number of hosts.
- The -2 is non-negotiable: network and broadcast addresses cannot be assigned.
Subnet Masks: Binary and Decimal
The subnet mask is a 32-bit number that, in binary, has a contiguous block of 1s for the network portion followed by 0s for the host portion. The dotted decimal representation (e.g., 255.255.255.0) is just a human-friendly way to write those 32 bits. Convert each octet to decimal and you get the familiar mask. /24 = 255.255.255.0; /16 = 255.255.0.0; /8 = 255.0.0.0.
But here's the production trap: you can't always trust the dotted decimal. I've seen config files where someone meant to type 255.255.254.0 (/23) but a typo produced 255.255.240.0 (/20). The device accepted it, but routing broke silently because the network boundary changed. Always cross-verify the binary representation, especially when editing configs manually.
Let's walk through an example: IP 192.168.1.55 with mask 255.255.255.0. Binary: 11000000.10101000.00000001.00110111. AND with mask gives 11000000.10101000.00000001.00000000 = 192.168.1.0 (network). The host part is 00110111 = 55. If mask were 255.255.254.0, the network would be 192.168.0.0, and 192.168.1.55 would be part of that network — completely different routing behaviour.
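The same walk-through can be reproduced in a few lines. This is a minimal sketch; to_binary and network_of are hypothetical helper names, and the addresses are the ones from the example above.

```python
import ipaddress

def to_binary(address: str) -> str:
    """Render a dotted-decimal IPv4 address or mask as four 8-bit groups."""
    packed = ipaddress.IPv4Address(address).packed  # 4 raw bytes
    return ".".join(f"{octet:08b}" for octet in packed)

def network_of(ip: str, mask: str) -> str:
    """Bitwise AND of address and mask gives the network address."""
    ip_int = int(ipaddress.IPv4Address(ip))
    mask_int = int(ipaddress.IPv4Address(mask))
    return str(ipaddress.IPv4Address(ip_int & mask_int))

ip = "192.168.1.55"
for mask in ("255.255.255.0", "255.255.254.0"):
    print("ip     ", to_binary(ip))
    print("mask   ", to_binary(mask))
    print("network", network_of(ip, mask))
```

Changing only the last bit of the third mask octet moves the host from 192.168.1.0 to 192.168.0.0, which is the boundary shift described above.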
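A quick way to read a mask without writing out all 32 bits: find the one octet that is neither 255 nor 0 and subtract it from 256 to get the block size. For /23, the mask is 255.255.254.0; the third octet is 254, and 256 - 254 = 2, so networks start at every even value of the third octet (0, 2, 4, and so on). In binary, 254 is 11111110: 7 network bits and 1 host bit in that octet. That 1 host bit plus the 8 host bits of the last octet gives 9 host bits, so a /23 holds 2^9 = 512 addresses in total. If that feels fiddly, it is; it's usually easier to think in prefix length.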
Another trap: a non-contiguous subnet mask like 255.128.128.0 is invalid. The binary must be a continuous run of 1s from the left. Always check with ipcalc if you're unsure.
One more nuance: some legacy systems use "wildcard masks" (inverse masks) for OSPF or ACLs. That's the bitwise NOT of the subnet mask. Don't confuse them. A wildcard of 0.0.0.255 matches a /24 network, but it's written as inverted bits.
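As an illustration of that inversion, the sketch below derives a wildcard mask from a prefix length with the hostmask attribute of ipaddress networks, and converts a wildcard back to a netmask with a bitwise NOT. The helper names are hypothetical.

```python
import ipaddress

def wildcard_for(prefix: int) -> str:
    """Wildcard (inverse) mask for a prefix length, e.g. 24 -> 0.0.0.255."""
    return str(ipaddress.ip_network(f"0.0.0.0/{prefix}").hostmask)

def netmask_from_wildcard(wildcard: str) -> str:
    """Flip all 32 bits of a wildcard mask to recover the subnet mask."""
    inverted = int(ipaddress.IPv4Address(wildcard)) ^ 0xFFFFFFFF
    return str(ipaddress.IPv4Address(inverted))

print(wildcard_for(24))                    # 0.0.0.255
print(wildcard_for(23))                    # 0.0.1.255
print(netmask_from_wildcard("0.0.0.255"))  # 255.255.255.0
```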
Here's a story from the field: a colleague once typed 255.255.255.255 by accident (a /32) instead of 255.255.255.0 for an interface. The interface came up but no traffic could reach the subnet: with a /32 the router treated the interface address as a single host with no on-link neighbours. It took three hours to find the typo. Always use automation to validate masks.
Another quick sanity check: if you see a mask like 255.255.256.0, that's invalid because 256 is out of range. Catch those before they hit production.
import ipaddress
import re

def mask_to_cidr(mask: str) -> int:
    """Convert dotted decimal mask to CIDR prefix length."""
    net = ipaddress.IPv4Network(f'0.0.0.0/{mask}', strict=False)
    return net.prefixlen

def cidr_to_mask(prefix: int) -> str:
    """Convert CIDR prefix to dotted decimal mask."""
    return str(ipaddress.IPv4Network(f'0.0.0.0/{prefix}', strict=False).netmask)

# Examples
print(mask_to_cidr('255.255.255.0'))  # 24
print(cidr_to_mask(16))  # 255.255.0.0

def validate_mask(mask: str) -> bool:
    """Check if mask is contiguous"""
    octets = mask.split('.')
    # reject malformed masks and out-of-range octets such as 256
    if len(octets) != 4 or not all(o.isdigit() and int(o) <= 255 for o in octets):
        return False
    binary = ''.join(f'{int(octet):08b}' for octet in octets)
    # mask must be contiguous ones followed by zeros
    # (this pattern also rejects the all-ones /32 and all-zeros /0 masks)
    return re.match(r'^1+0+$', binary) is not None

print(validate_mask('255.255.255.0'))  # True
print(validate_mask('255.128.128.0'))  # False
- show ip interface (Cisco) or ip addr show (Linux) and verify the mask matches the documentation.
- route -n (Linux).
Private IP Ranges and RFC 1918: Why 10.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16 Are Everywhere
RFC 1918 reserves three blocks of IPv4 addresses for private use: 10.0.0.0/8 (about 16.7 million addresses), 172.16.0.0/12 (about 1 million), and 192.168.0.0/16 (65,536). These addresses are not routable on the public internet; they're meant for internal networks. That's why most home routers use 192.168.x.x, and most AWS VPCs use a range from 10.0.0.0/8 or 172.16.0.0/12.
The choice between them is about scale. 10.0.0.0/8 is huge — you can build a sprawling enterprise network without overlapping. 172.16.0.0/12 is good for medium-sized orgs. 192.168.0.0/16 is tiny and often leads to collisions when companies merge or need to peer VPCs. Production lesson: never use 192.168.0.0/16 for a corporate network — you'll hit address collisions the moment you need to connect to a partner or acquire another company.
You can also use public IP ranges internally if you control them (uncommon). But the standard practice is to pick a /16 from the 10.x range for your VPC and subnet from there. This gives you flexibility and avoids the RFC 1918 collision risk that 192.168 brings.
One more thing: don't forget about RFC 6598 (Carrier-Grade NAT space: 100.64.0.0/10). This is used by ISPs for CGNAT, but you might encounter it in shared environments. Avoid using it internally unless you're building an ISP network.
Also note: just because you can't route private IPs on the internet doesn't mean they can't be leaked. Misconfigured BGP can advertise private ranges. Always filter outbound routes to your upstream provider.
A real-world story: A startup used 192.168.0.0/16 for their entire infrastructure. When they tried to connect to a customer's VPN that also used 192.168.0.0/16, routing fell apart. They had to re-IP their whole network over a weekend. Don't be that team.
Another lesson: when you acquire a company, the first thing to check is their private IP range. If you both use 10.0.0.0/16, you'll need to re-address one side or use NAT. That's expensive and error-prone. Plan ahead.
Here's a quick tip: if you're designing a multi-cloud environment, use a different /16 for each cloud provider. That way, peering between clouds won't cause conflicts.
import ipaddress

def is_private(ip_cidr: str) -> bool:
    """Check if a CIDR block is within RFC 1918 private ranges."""
    net = ipaddress.IPv4Network(ip_cidr, strict=False)
    private_ranges = [
        ipaddress.IPv4Network('10.0.0.0/8'),
        ipaddress.IPv4Network('172.16.0.0/12'),
        ipaddress.IPv4Network('192.168.0.0/16'),
    ]
    return any(net.subnet_of(pr) for pr in private_ranges)

def is_cgnat(ip_cidr: str) -> bool:
    net = ipaddress.IPv4Network(ip_cidr, strict=False)
    cgnat = ipaddress.IPv4Network('100.64.0.0/10')
    return net.subnet_of(cgnat)

# Examples
print(is_private('10.0.1.0/24'))  # True
print(is_cgnat('100.64.1.0/24'))  # True
Designing Subnets in AWS VPC: A Real-World Example
Let's design a VPC for a typical three-tier web application. We'll use the private IPv4 range 10.0.0.0/16 (65,536 addresses). We need:
- Public subnets for load balancers and NAT gateways (at least 2 AZs, small)
- Private subnets for application servers (more IPs needed for scaling)
- Database subnets (locked down, no internet access)
Best practice is to allocate contiguous blocks to keep routing simple. Here's a sample design:
- Public: 10.0.1.0/24 (us-east-1a), 10.0.2.0/24 (us-east-1b), 256 addresses each, of which 251 are usable in AWS
- App: 10.0.10.0/23, 10.0.12.0/23, 512 addresses each, enough for auto-scaling groups
- DB: 10.0.20.0/24, 10.0.21.0/24, since RDS takes one IP per instance plus Multi-AZ
Notice we left gaps (10.0.3.0 through 10.0.9.255) for future use. That's the planning rule: never fill a VPC completely. Leave at least 30% address space unallocated. Production lesson: I once saw a VPC with 90% utilisation because someone carved 10.0.0.0/16 into /24s end-to-end. When a new service needed a new subnet, they had to rebuild the VPC.
A pro tip: use this same design pattern in AWS by creating subnets with explicit CIDR blocks in CloudFormation or Terraform. Validate that no two subnets overlap and that all are within the VPC CIDR.
One more nuance: AWS reserves 5 IPs per subnet, not just the 2 that classic subnetting takes. For a /24, you lose .0 (network), .1 (VPC router), .2 (DNS), .3 (reserved for future use), and .255 (broadcast). That's 5 IPs gone, so you really have 251 usable, not 254. Factor that into your capacity planning.
Also note: when you use a NAT Gateway in a public subnet, it consumes an Elastic IP and one usable IP from that subnet. Make sure your public subnets have enough headroom for both NAT Gateways and future ALB/NLBs.
A lesson from the field: I've seen teams run out of IPs in their app subnet because they didn't account for the fact that each pod in EKS gets its own VPC IP. A /24 supports 251 pods — fine for small clusters, but a production cluster can blow through that in days. Use a /20 for pod subnets.
Another trap: when you create a VPC, you must also consider the CIDR for future peering. If you use 10.0.0.0/16 and later peer with another VPC that also uses 10.0.0.0/16, you'll have overlapping CIDRs and peering will be impossible. Plan your addressing from a larger block such as 10.0.0.0/8 and give each environment its own /16.
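One way to keep VPC CIDRs from ever colliding is to hand them out from a single parent block. The sketch below is an assumption-laden illustration: allocate_vpc_cidrs is a hypothetical helper, and the environment names are made up.

```python
import ipaddress

def allocate_vpc_cidrs(parent: str, names: list[str], new_prefix: int = 16) -> dict[str, str]:
    """Assign each name the next unused block of size new_prefix from the parent."""
    parent_net = ipaddress.ip_network(parent)
    available = 2 ** (new_prefix - parent_net.prefixlen)
    if len(names) > available:
        raise ValueError(f"{parent} only holds {available} /{new_prefix} blocks")
    blocks = parent_net.subnets(new_prefix=new_prefix)  # generator, in address order
    return {name: str(block) for name, block in zip(names, blocks)}

# Hypothetical environments, each getting its own non-overlapping /16.
plan = allocate_vpc_cidrs("10.0.0.0/8", ["prod", "staging", "dev"])
print(plan)
```

Because every block comes from the same generator, no two environments can receive overlapping ranges, and any pair of these VPCs can be peered later.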
And don't forget about the bastion host: if you need to SSH into private instances, you'll need a jump box in a public subnet. That public subnet should be sized to allow at least one EC2 instance plus the NAT Gateway.
import ipaddress

def generate_subnets(vpc_cidr: str, subnet_cidrs: list) -> list:
    vpc = ipaddress.IPv4Network(vpc_cidr, strict=False)
    subnets = [ipaddress.IPv4Network(c, strict=False) for c in subnet_cidrs]
    # Validate all subnets are within VPC and non-overlapping
    for s in subnets:
        if not vpc.supernet_of(s):
            raise ValueError(f'{s} is not within {vpc_cidr}')
    for i, s1 in enumerate(subnets):
        for s2 in subnets[i+1:]:
            if s1.overlaps(s2):
                raise ValueError(f'{s1} overlaps with {s2}')
    return [str(s) for s in subnets]

# Example design
design = generate_subnets(
    '10.0.0.0/16',
    ['10.0.1.0/24', '10.0.2.0/24', '10.0.10.0/23', '10.0.12.0/23', '10.0.20.0/24', '10.0.21.0/24']
)
print('Valid design:', design)
Common Subnetting Mistakes and How to Fix Them
After years of debugging network problems, I've seen the same patterns over and over. Here are the top three:
- Overlapping subnets: When two subnets in different VPCs (or the same VPC!) overlap, routing becomes unpredictable. The router doesn't know which is the correct destination. In VPC peering, AWS rejects overlapping CIDRs entirely.
- Wrong gateway IP: The default gateway is not always the first usable IP. In AWS, the first IP (.1) is the VPC router, but in on-premises networks, the gateway might be .254 or something else. Hardcoding .1 as gateway is a common mistake when migrating from cloud to on-prem.
- Forgetting the broadcast address: Some applications accidentally use the broadcast address as a host IP. When that happens, traffic to that 'host' floods the entire subnet, causing performance issues and mysterious packet loss.
These are the mistakes that cause 'can't reproduce in dev' incidents. Always validate your subnet plan with automation.
Another mistake: using non-contiguous mask bits. A mask like 255.255.255.128 is fine because its 25 one-bits are contiguous, but a mask like 255.128.128.0 is invalid. Always ensure the binary mask is a continuous string of 1s followed by 0s.
One more trap: using a default subnet size without thinking about the service requirements. I've seen teams use /24 for a point-to-point VPN link, wasting 252 IPs. Use /30 or /31 for those links to conserve address space.
A hidden mistake: forgetting that subnets need to be sized for high availability. In AWS, if you lose an Availability Zone, the remaining AZ must handle all traffic. That means your subnet in the surviving AZ must have enough IPs to accommodate all instances. Plan for AZ failure — not just normal operation.
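A rough way to test a plan against AZ failure is to remove each AZ in turn and see whether the remaining subnets can still hold the whole fleet. The sketch below is deliberately simplified: survives_az_loss is a hypothetical helper, it assumes 5 reserved addresses per subnet, and it ignores IPs consumed by load balancers, NAT gateways, and other non-instance resources.

```python
import ipaddress

AWS_RESERVED = 5  # assumed per-subnet reservation, as discussed earlier

def survives_az_loss(subnets_by_az: dict[str, str], instances: int) -> bool:
    """True if the fleet still fits after losing any single AZ's subnet."""
    capacity = {
        az: ipaddress.ip_network(cidr).num_addresses - AWS_RESERVED
        for az, cidr in subnets_by_az.items()
    }
    for lost_az in capacity:
        remaining = sum(cap for az, cap in capacity.items() if az != lost_az)
        if remaining < instances:
            return False
    return True

small = {"us-east-1a": "10.0.10.0/24", "us-east-1b": "10.0.11.0/24"}
large = {"us-east-1a": "10.0.10.0/23", "us-east-1b": "10.0.12.0/23"}
print(survives_az_loss(small, 400))  # False: 251 IPs left for 400 instances
print(survives_az_loss(large, 400))  # True: 507 IPs left
```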
Here's a real one: a team used overlapping subnets in two different VPCs and then peered them. The peering succeeded (because AWS only checks overlap at peering time for certain scenarios), but traffic was intermittently blackholed because the routing table couldn't decide which /24 to use. The fix involved tearing down the peering and redesigning one VPC's CIDR.
Also worth mentioning: when using Terraform, you can avoid overlap by deriving every subnet from the VPC CIDR with the cidrsubnet function and keeping the inputs in well-managed variables. Always run a validation step before apply.
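For readers who want to see what that calculation does, here is a Python sketch that mirrors the arguments of Terraform's cidrsubnet(prefix, newbits, netnum). It is an illustrative re-implementation for IPv4 and IPv6 networks, not Terraform's own code.

```python
import ipaddress

def cidrsubnet(prefix: str, newbits: int, netnum: int) -> str:
    """Extend the prefix by newbits and return subnet number netnum."""
    parent = ipaddress.ip_network(prefix)
    new_prefix = parent.prefixlen + newbits
    if new_prefix > parent.max_prefixlen:
        raise ValueError("not enough host bits left for newbits")
    if not 0 <= netnum < 2 ** newbits:
        raise ValueError("netnum does not fit in newbits")
    # Shift netnum into the bit positions just after the parent prefix.
    offset = netnum << (parent.max_prefixlen - new_prefix)
    base = int(parent.network_address) + offset
    return str(ipaddress.ip_network((base, new_prefix)))

print(cidrsubnet("10.0.0.0/16", 8, 2))  # 10.0.2.0/24
print(cidrsubnet("10.0.0.0/16", 7, 5))  # 10.0.10.0/23
```

Because every subnet is computed from an index rather than typed by hand, two subnets with different netnum values and the same newbits can never overlap.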
import ipaddress

def validate_subnet_design(subnets: list[str]):
    """
    Validate a list of CIDR subnets for common mistakes.
    Returns list of issues found.
    """
    issues = []
    nets = [ipaddress.IPv4Network(s, strict=False) for s in subnets]
    # Check for overlaps
    for i, n1 in enumerate(nets):
        for n2 in nets[i+1:]:
            if n1.overlaps(n2):
                issues.append(f'Overlap: {n1} and {n2}')
    # Check for broadcast usage
    # This is a simplified check: flag if any host address is the broadcast
    for net in nets:
        bcast = str(net.broadcast_address)
        # In real code check against actual IP assignments
    return issues if issues else ['Design is valid']

# Example
design = ['10.0.1.0/24', '10.0.2.0/24', '10.0.2.128/26']  # last one overlaps
print(validate_subnet_design(design))
ipcalc, subnetcalc, or Python's ipaddress module can catch overlaps, wrong sizes, or misaligned boundaries before they become production incidents.
Subnetting for Kubernetes: Pod CIDR and Service CIDR
Kubernetes adds two more CIDR layers on top of your VPC: the pod CIDR and the service CIDR. Each node gets a slab of the pod CIDR (e.g., /24 per node), and each pod gets an IP from that node's slab. The service CIDR is a separate block used for ClusterIP services. These CIDRs must not overlap with each other or with the VPC CIDR. If they do, traffic routing breaks silently — pods can't reach services, or worse, traffic destined for a service IP goes to an unrelated VPC resource.
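The per-node slicing has a direct consequence for cluster size, which the sketch below works out. It assumes a fixed per-node block size (the /24 from the example above); node_capacity and first_node_blocks are hypothetical helper names, and the pod CIDR is illustrative.

```python
import ipaddress
from itertools import islice

def node_capacity(pod_cidr: str, per_node_prefix: int = 24) -> int:
    """How many nodes a pod CIDR can serve when each node gets one block."""
    pool = ipaddress.ip_network(pod_cidr)
    if per_node_prefix < pool.prefixlen:
        raise ValueError("per-node block is larger than the pod CIDR")
    return 2 ** (per_node_prefix - pool.prefixlen)

def first_node_blocks(pod_cidr: str, per_node_prefix: int = 24, count: int = 3) -> list[str]:
    """The first few per-node blocks, in the order they would be carved out."""
    pool = ipaddress.ip_network(pod_cidr)
    return [str(b) for b in islice(pool.subnets(new_prefix=per_node_prefix), count)]

print(node_capacity("10.1.0.0/16"))      # 256 nodes at a /24 each
print(first_node_blocks("10.1.0.0/16"))  # ['10.1.0.0/24', '10.1.1.0/24', '10.1.2.0/24']
```

A /16 pod CIDR with /24 node blocks tops out at 256 nodes, so the pod CIDR size is also a ceiling on node count, not just on pod count.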
Production lesson: plan your cluster CIDRs before creating the cluster. If your VPC uses 10.0.0.0/16, you might set pod CIDR to 10.1.0.0/16 and service CIDR to 10.2.0.0/16. But watch out: if you have multiple clusters, each needs its own non-overlapping pod and service CIDRs. In AWS EKS, Amazon VPC CNI allows pods to receive VPC IPs, which can exhaust the subnet quickly. Use a dedicated /18 or larger for pods. In self-managed clusters, ensure the pod network plugin (Calico, Flannel) is configured with a CIDR that doesn't conflict with anything else.
Another trap: when using a service mesh like Istio, the mesh may require additional IP ranges. Always document all CIDR allocations upfront.
And don't forget about `kube-proxy` mode: if you use IPVS mode instead of iptables, the service CIDR is handled differently. The IPVS mode can handle more services, but it introduces its own quirks. Make sure your service CIDR doesn't overlap with your node CIDR.
A useful check: before creating a cluster, run a quick Python script (like the one below) to verify non-overlap of all three ranges.
One more production-grade tip: in EKS with the VPC CNI, the default maximum pods per node is derived from the instance type's network interface (ENI) count and per-interface IP limit, not from the subnet size. A /24 pod subnet caps the whole subnet at roughly 250 pod IPs, and most EC2 instance types allow far fewer pods per node than that. Check the AWS docs for your instance type's max-pods before planning.
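As a rough guide, the commonly documented VPC CNI calculation (without prefix delegation) is ENIs × (IPv4 addresses per ENI − 1) + 2. The sketch below encodes that formula; treat both the formula and the example limits as assumptions to confirm against the current AWS documentation for your instance type and CNI configuration.

```python
def eks_max_pods(enis: int, ipv4_per_eni: int) -> int:
    """Approximate default max pods per node for the VPC CNI.

    One address on each ENI is the interface's own primary IP, and the +2
    accounts for pods that use the host network rather than a VPC IP.
    """
    return enis * (ipv4_per_eni - 1) + 2

# Assumed limits for illustration only; look up the real values per instance type.
assumed_limits = {
    "3 ENIs x 10 IPs": (3, 10),
    "3 ENIs x 6 IPs": (3, 6),
}
for label, (enis, ips) in assumed_limits.items():
    print(label, "->", eks_max_pods(enis, ips), "pods")
```

With 3 ENIs of 10 addresses each the ceiling is 29 pods per node, far below what a /24 could address, which is why per-node limits and subnet size need to be planned together.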
I once debugged a cluster where the pod CIDR overlapped with the VPC CIDR by just one bit. Pods trying to reach the API server at 10.0.0.1 were routed to a pod instead. It took two weeks to reproduce because the behaviour was intermittent — it only happened when a pod happened to have the same IP as the service.
If you're using Calico with IPIP encapsulation, you can avoid VPC CIDR conflicts by using a separate IP pool. That's a common solution for overlapping issues.
# Check pod and service CIDRs in a k8s cluster using kubeadm
kubeadm config print init-defaults | grep -E 'podSubnet|serviceSubnet'

# Or check from a running cluster config
kubectl get configmap -n kube-system kube-proxy -o yaml | grep -E 'clusterCIDR|podCIDR'

# Check per-node pod CIDR
kubectl get nodes -o jsonpath='{.items[*].spec.podCIDR}' | tr ' ' '\n'

# Validate no overlap between VPC, pod, service CIDRs
python3 -c "
from ipaddress import ip_network
vpc = ip_network('10.0.0.0/16')
pod = ip_network('10.1.0.0/16')
svc = ip_network('10.2.0.0/16')
assert not vpc.overlaps(pod), 'VPC and pod CIDR overlap'
assert not vpc.overlaps(svc), 'VPC and service CIDR overlap'
assert not pod.overlaps(svc), 'Pod and service CIDR overlap'
print('All good: no overlap')
"
🎯 Key Takeaways