Senior 16 min · March 06, 2026

IP Subnetting - The /25 Mask That Broke Internet Access

A /25 subnet mask instead of /24 made EC2 instances unreachable from the internet gateway.

Naren Founder & Principal Engineer

20+ years shipping production systems from the metal up. Lessons pulled from things that broke in production.

May 24, 2026
last updated
Quick Answer
  • IP addressing uniquely identifies devices; subnetting divides address space into smaller, routable blocks.
  • CIDR notation (e.g., /24) replaces classful addressing and defines how many host bits you get.
  • Hosts = 2^(32 - prefix) - 2 — the -2 is for network and broadcast addresses you cannot assign.
  • Production failure: a /25 instead of /24 changes the network boundary and can make your default gateway unreachable.
  • Performance insight: each wrong bit in a subnet mask can route traffic to the wrong VLAN or blackhole it entirely.
  • Biggest mistake: thinking subnetting is only about saving IPs. It's about routing — wrong mask, wrong network.
✦ Definition~90s read
What is IP Addressing and Subnetting?

IP subnetting is the practice of dividing a larger IP network into smaller, logical subnetworks. It exists because the original classful addressing scheme (Class A, B, C) was wildly inefficient—organizations got either too many or too few addresses, and routing tables were bloated.


Subnetting lets you carve out exactly the number of hosts you need, reduce broadcast domains, and control traffic flow. The /25 mask (255.255.255.128) is a classic trap: it gives you two subnets of 126 usable hosts each, but if you misconfigure the gateway, DHCP range, or route table, you'll silently break internet access for half your network.

This is why understanding the binary math behind the mask isn't academic—it's the difference between a working VPC and a ticket at 2 AM.

CIDR (Classless Inter-Domain Routing) notation, like /25, is how you express the subnet mask in a compact form. The number after the slash is the count of leading 1-bits in the 32-bit mask. For /25, that's 25 ones followed by 7 zeros: 11111111.11111111.11111111.10000000.

The host bits (the zeros) determine how many addresses are available: 2^7 = 128 total, minus 2 for the network and broadcast addresses, leaves 126 usable hosts. When you're designing subnets in AWS VPC, you must respect these boundaries—AWS reserves the first four and last IP in each subnet for routing, DNS, and broadcast.
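
The prefix-to-mask arithmetic above is easy to check in a few lines of Python. This is a minimal sketch using the standard-library ipaddress module; the helper name prefix_to_binary_mask and the 192.168.1.0/25 block are made up for illustration.

```python
import ipaddress

def prefix_to_binary_mask(prefix: int) -> str:
    """Return the dotted binary form of a mask (hypothetical helper name)."""
    bits = '1' * prefix + '0' * (32 - prefix)
    return '.'.join(bits[i:i + 8] for i in range(0, 32, 8))

net = ipaddress.IPv4Network('192.168.1.0/25')  # example block, not a real network

print(prefix_to_binary_mask(net.prefixlen))  # 11111111.11111111.11111111.10000000
print(net.netmask)                           # 255.255.255.128
print(net.num_addresses - 2)                 # 126 usable by the classic formula
```

Note that the 126 here is the classic count; the AWS reservation described above reduces it further.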

If you size your subnet too tightly (e.g., a /28, which leaves only 11 usable IPs in AWS, for a service that needs 14 instances plus a NAT gateway), you'll exhaust IPs and your Auto Scaling group will quietly stop launching new instances.

Private IP ranges from RFC 1918—10.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16—are the foundation of every internal network you've ever touched. They exist because public IPv4 addresses are scarce; these ranges are reserved for internal use and are not routable on the public internet.

In practice, 10.0.0.0/8 is your go-to for large-scale AWS VPCs (you can carve out /16s for each environment), 172.16.0.0/12 is common in corporate VPNs and on-premises networks, and 192.168.0.0/16 is what your home router uses. The most common subnetting mistake?

Overlapping CIDR blocks when peering VPCs or connecting to on-prem. If your VPC uses 10.0.0.0/16 and your data center uses 10.0.0.0/8, the VPC range sits inside the data center range, so traffic meant for on-prem hosts in 10.0.x.x never leaves the VPC and vanishes into a black hole. Fix it by planning your IP allocation upfront, using consistent prefix lengths, and never assuming a /24 is 'big enough' for production.

Plain-English First

Think of the internet like a massive city with billions of houses. Every house needs a unique street address so mail can reach it — that's an IP address. But a city isn't just one giant street; it's divided into neighbourhoods, zip codes, and districts to keep things organised and efficient. Subnetting is exactly that: carving a big block of addresses into smaller neighbourhoods so traffic flows to the right place without chaos. Without it, your router would be like a postman trying to deliver to every house on Earth from a single sorting office.

Every packet that crosses your network — an API call, a database query, a Kubernetes pod talking to another — carries a source and destination IP. Without IP addressing, you're just shouting into the void. Get the subnet mask wrong and traffic doesn't just slow down. It stops.

IPv4 has about 4.3 billion addresses, and we ran out years ago. Subnetting, CIDR, and private ranges are the engineering fixes that made the internet keep working. They're baked into every VPC, every router config, every cloud environment you'll ever touch.

Here's the truth: most engineers won't calculate subnets by hand daily. But the one time you need to, a single wrong mask can silence an entire production fleet. That's why you need to understand it — not just pass a cert exam.

This guide covers CIDR math, binary masks, the /25 that broke internet access, and the Python commands that'll save you from subnet calculators forever.

The real cost of a misconfigured subnet isn't just wasted IPs – it's hours of debugging, missed SLAs, and sometimes a full incident post-mortem. That's why this guide focuses on what actually breaks and how to fix it fast.

What Subnetting Actually Does to Your Network

IP subnetting divides a single IP network into smaller, logically isolated segments. The core mechanic is borrowing host bits and adding them to the network prefix, so that each longer prefix identifies one subnet. For example, a /24 network (255.255.255.0) with 256 addresses can be split into two /25 networks (255.255.255.128), each with 128 addresses. This shrinks each broadcast domain and improves routing efficiency.
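
To see the split concretely, here is a small sketch with Python's ipaddress module. The 192.168.1.0/24 range is only an illustrative choice; subnets(prefixlen_diff=1) borrows exactly one host bit.

```python
import ipaddress

parent = ipaddress.IPv4Network('192.168.1.0/24')  # illustrative range
halves = list(parent.subnets(prefixlen_diff=1))   # borrow one host bit

for half in halves:
    # prefix, mask, network address, broadcast address
    print(half, half.netmask, half.network_address, half.broadcast_address)
```

The first half ends at 192.168.1.127 and the second begins at 192.168.1.128, which is the boundary a /25 introduces in the middle of the old /24.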

Each subnet has a network address (all host bits zero), a broadcast address (all host bits one), and usable host addresses in between. A /25 subnet yields 126 usable addresses (128 minus 2). The subnet mask determines the boundary: any IP address AND its mask reveals the network address. Misconfiguring the mask by even one bit can silently isolate machines or create overlapping subnets that break routing.

Use subnetting when you need to segment traffic for security, performance, or IP conservation. In production, it's essential for VPC design, multi-tenant isolation, and controlling broadcast storms. Without proper subnet planning, you'll exhaust IP space or create routing black holes that are hard to debug.

The /25 Trap
A /25 mask splits a /24 into two subnets, but the second subnet's network address is the first subnet's broadcast address plus one — easy to misconfigure.
Production Insight
A payment service deployed a new microservice with a /25 subnet mask that overlapped the existing /24 subnet. The result: intermittent routing failures where packets to the new service were dropped because the router saw two routes for the same IP range. The rule: never assign a subnet that overlaps any existing route — always verify with a subnet calculator before deployment.
Key Takeaway
Subnetting is about borrowing host bits to create smaller broadcast domains.
A /25 subnet gives 126 usable addresses, not 128 — always subtract the network and broadcast addresses.
Overlapping subnets cause silent routing failures; verify with a subnet calculator before deploying.
[Figure: IP Subnetting with /25 Mask. From CIDR notation to AWS VPC design and hybrid cloud overlap. Panels: CIDR Notation & Host Calculation (/25 = 255.255.255.128, 126 hosts per subnet); Subnet Masks: Binary & Decimal; Private IP Ranges (RFC 1918); AWS VPC Subnet Design (10.0.0.0/16 split into /24 subnets); Kubernetes Pod & Service CIDR (pod CIDR /16, service CIDR /20, avoid overlap). Warning: overlapping CIDRs in hybrid cloud break routing; use unique RFC 1918 ranges per environment. Source: thecodeforge.io]

CIDR Notation: How to Read and Calculate Hosts

CIDR (Classless Inter-Domain Routing) notation replaced the rigid classful system (A, B, C) back in the '90s. Instead of assuming network boundaries based on the first octet, you specify the prefix length explicitly: 192.168.1.0/24 means the first 24 bits are the network prefix, and the remaining 8 bits are host bits. That gives you 2^8 = 256 total addresses, but you lose two: the network address (all host bits 0) and the broadcast address (all host bits 1). So usable hosts = 2^(32 - prefix) - 2. For /24, that's 254. For /16, it's 65534. For /28, it's 14 — way too small for most production workloads.

The formula is simple, but the production trap is thinking that 'size' means usable hosts. I've seen teams provision a /28 for an API service that needed 20 IPs per AZ, then scramble to redesign after the launch failed. Always add 20-30% buffer.

CIDR also enabled route aggregation (supernetting), which dramatically shrinks the global routing table. Before CIDR, the internet was running out of routes. Now, a single /8 aggregate can represent millions of addresses.
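
Route aggregation can be demonstrated on a small scale. The sketch below uses ipaddress.collapse_addresses from the standard library to merge four contiguous /24s into one /22; the 10.0.x.0 ranges are illustrative, not taken from a real routing table.

```python
import ipaddress

# Four contiguous /24 routes (illustrative private ranges)
routes = [ipaddress.IPv4Network(f'10.0.{i}.0/24') for i in range(4)]

# collapse_addresses merges adjacent blocks into the shortest covering prefixes
aggregated = list(ipaddress.collapse_addresses(routes))
print(aggregated)  # [IPv4Network('10.0.0.0/22')]
```

Four routing-table entries become one, which is the same idea that shrinks the global table at internet scale.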

Here's a quick way to estimate: for any /X, usable hosts ≈ 2^(32-X). For /24, that's ~250. For /23, ~500. For /22, ~1000. The pattern doubles each time you reduce the prefix by 1. So /16 gives ~65000. That's your mental shortcut.

Another common mistake: confusing the CIDR notation with the subnet mask. When someone says "the CIDR is 255.255.255.0", they mean the mask, not the prefix. CIDR notation is /24. Keep that straight in team discussions.

A production-grade tip: always document your CIDR blocks in a central spreadsheet or IPAM tool. I've seen teams waste hours because they didn't know which /24 was already used. Automation is your friend here.

One more thing: in cloud environments, usable subnet sizes are further reduced by the provider's reserved addresses. In AWS, every subnet loses 5 IPs, not 2, so a /28 gives you only 11 usable IPs. The classic formula's answer of 14 is wrong for AWS. Always check the cloud provider's documentation.
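
A quick way to compare the two counts side by side. This is a sketch only: usable_ips and its aws flag are hypothetical names, and the constant reflects the five-address AWS reservation described in this guide.

```python
import ipaddress

AWS_RESERVED_PER_SUBNET = 5  # first four addresses plus the last one

def usable_ips(cidr: str, aws: bool = False) -> int:
    """Usable addresses by the classic formula, or with the AWS reservation (hypothetical helper)."""
    total = ipaddress.IPv4Network(cidr, strict=False).num_addresses
    reserved = AWS_RESERVED_PER_SUBNET if aws else 2
    return max(total - reserved, 0)

for cidr in ['10.0.0.0/28', '10.0.0.0/24']:
    print(cidr, usable_ips(cidr), usable_ips(cidr, aws=True))
```

For the /28 this prints 14 and 11; for the /24 it prints 254 and 251.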

And don't forget about overlapping CIDRs: if you accidentally assign the same /24 to two subnets, routing chaos follows. Use a central IPAM tool to prevent that.

io/thecodeforge/subnetting/cidr_calculator.py (Python)
import ipaddress

def cidr_info(cidr: str) -> dict:
    """Summarise a CIDR block using the classic (non-cloud) host formula."""
    net = ipaddress.IPv4Network(cidr, strict=False)
    return {
        'network_address': str(net.network_address),
        'broadcast_address': str(net.broadcast_address),
        'netmask': str(net.netmask),
        'prefix_length': net.prefixlen,
        'total_addresses': net.num_addresses,
        # /31 and /32 have no room for a network/broadcast pair, so clamp at 0
        'usable_hosts': max(net.num_addresses - 2, 0),
    }

# Usage:
for cidr in ['192.168.1.0/24', '10.0.0.0/16', '172.16.0.0/28']:
    info = cidr_info(cidr)
    print(f'{cidr} -> {info["usable_hosts"]} usable hosts')
Mental Model: CIDR as a Sliding Window
  • The network bits are fixed and define the neighbourhood.
  • The host bits are variable — they define the specific house.
  • Shorter prefix (/16) = more hosts, fewer networks.
  • Longer prefix (/28) = fewer hosts, more networks — useful for point-to-point links.
  • Each reduction in prefix by 1 doubles the number of hosts.
Production Insight
I once saw a team use /28 for a Kubernetes node subnet — they hit IP exhaustion in three days because each node consumes one IP plus pods get IPs from the same block.
Always choose a /24 as the default for any subnet that might grow.
Rule: when in doubt, pick /24 — it fits most workloads and leaves room to breathe.
Key Takeaway
Hosts = 2^(32 - prefix) - 2.
The -2 is non-negotiable: network and broadcast addresses cannot be assigned.
Plan for 30% growth — running out of IPs in a subnet is a production incident you can avoid.
Choosing the Right CIDR Size
If: Single point-to-point link (two devices)
Use: /30 (2 usable addresses) or /31 (0 by the classic formula, but RFC 3021 lets both addresses be used on point-to-point links)
If: Small production service (< 50 IPs needed)
Use: /26 (62 usable) or /25 (126 usable), leaving 30% headroom
If: Standard tier (up to 200 IPs)
Use: /24 (254 usable), the industry standard for most subnets
If: Large subnet (hundreds of IPs, e.g., private network)
Use: /20 (4094 usable) or larger, but avoid /8 unless you're a large enterprise

Subnet Masks: Binary and Decimal

The subnet mask is a 32-bit number that, in binary, has a contiguous block of 1s for the network portion followed by 0s for the host portion. The dotted decimal representation (e.g., 255.255.255.0) is just a human-friendly way to write those 32 bits. Convert each octet to decimal and you get the familiar mask. /24 = 255.255.255.0; /16 = 255.255.0.0; /8 = 255.0.0.0.

But here's the production trap: you can't always trust the dotted decimal. I've seen config files where someone meant to type 255.255.254.0 (/23) but a typo produced 255.255.240.0 (/20). The device accepted it, but routing broke silently because the network boundaries changed. Always cross-verify the binary representation, especially when editing configs manually.

Let's walk through an example: IP 192.168.1.55 with mask 255.255.255.0. Binary: 11000000.10101000.00000001.00110111. AND with mask gives 11000000.10101000.00000001.00000000 = 192.168.1.0 (network). The host part is 00110111 = 55. If mask were 255.255.254.0, the network would be 192.168.0.0, and 192.168.1.55 would be part of that network — completely different routing behaviour.
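
The same walk-through can be reproduced with integer bitwise AND. This is a minimal sketch; network_of and to_binary are hypothetical helper names, and the addresses are the ones from the example above.

```python
import ipaddress

def network_of(ip: str, mask: str) -> str:
    """AND the address with the mask to get the network address (hypothetical helper)."""
    ip_int = int(ipaddress.IPv4Address(ip))
    mask_int = int(ipaddress.IPv4Address(mask))
    return str(ipaddress.IPv4Address(ip_int & mask_int))

def to_binary(addr: str) -> str:
    """Dotted binary form of a dotted decimal address."""
    return '.'.join(f'{int(octet):08b}' for octet in addr.split('.'))

print(to_binary('192.168.1.55'))                    # 11000000.10101000.00000001.00110111
print(network_of('192.168.1.55', '255.255.255.0'))  # 192.168.1.0
print(network_of('192.168.1.55', '255.255.254.0'))  # 192.168.0.0
```

One changed mask bit moves the same host into a different network, which is why a mask typo changes routing behaviour.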

A quick way to reason about a mask: find the first octet that isn't 255 and subtract it from 256 to get the block size. For /23, the mask is 255.255.254.0; the third octet is 254, which is 11111110 in binary, meaning 7 network bits and 1 host bit in that octet. 256 - 254 = 2, so networks start at every second value of the third octet (0, 2, 4, ...) and each one spans 2 × 256 = 512 addresses. Once you see the pattern, it's easier to think in prefix length.

Another trap: a non-contiguous subnet mask like 255.128.128.0 is invalid. The binary must be a continuous run of 1s from the left. Always check with ipcalc if you're unsure.

One more nuance: some legacy systems use "wildcard masks" (inverse masks) for OSPF or ACLs. That's the bitwise NOT of the subnet mask. Don't confuse them. A wildcard of 0.0.0.255 matches a /24 network, but it's written as inverted bits.
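
For reference, the ipaddress module exposes the wildcard form directly as hostmask. The sketch below also derives it by hand with a bitwise NOT, so the relationship is explicit; the /24 is an illustrative choice.

```python
import ipaddress

net = ipaddress.IPv4Network('192.168.1.0/24')  # illustrative network

print(net.netmask)   # 255.255.255.0
print(net.hostmask)  # 0.0.0.255 (the wildcard mask)

# Same result by hand: flip all 32 bits of the subnet mask
wildcard = ipaddress.IPv4Address(int(net.netmask) ^ 0xFFFFFFFF)
print(wildcard)      # 0.0.0.255
```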

Here's a story from the field: a colleague once typed 255.255.255.255 (a /32) by accident instead of 255.255.255.0 for an interface. The interface came up but no traffic could reach the subnet, because with a /32 the router treated its own address as the entire network and saw no directly connected neighbours. It took three hours to find the typo. Always use automation to validate masks.

Another quick sanity check: if you see a mask like 255.255.256.0, that's invalid because 256 is out of range. Catch those before they hit production.

io/thecodeforge/subnetting/mask_utils.py (Python)
import ipaddress
import re

def mask_to_cidr(mask: str) -> int:
    """Convert dotted decimal mask to CIDR prefix length."""
    net = ipaddress.IPv4Network(f'0.0.0.0/{mask}', strict=False)
    return net.prefixlen

def cidr_to_mask(prefix: int) -> str:
    """Convert CIDR prefix to dotted decimal mask."""
    return str(ipaddress.IPv4Network(f'0.0.0.0/{prefix}', strict=False).netmask)

# Examples
print(mask_to_cidr('255.255.255.0'))   # 24
print(cidr_to_mask(16))                # 255.255.0.0

def validate_mask(mask: str) -> bool:
    """Check that mask is four octets in 0-255 forming contiguous 1s then 0s."""
    octets = mask.split('.')
    # Reject malformed input such as 255.255.256.0 before building the bit string
    if len(octets) != 4 or not all(o.isdigit() and int(o) <= 255 for o in octets):
        return False
    binary = ''.join(f'{int(o):08b}' for o in octets)
    # Contiguous ones followed by zeros; /0 and /32 are valid edge cases
    return re.fullmatch(r'1*0*', binary) is not None

print(validate_mask('255.255.255.0'))  # True
print(validate_mask('255.128.128.0'))  # False
print(validate_mask('255.255.256.0'))  # False
Warning: Mask Mismatch Breaks Routing
Two devices on the same wire must agree on the subnet mask. If one uses /24 and the other /25, they will disagree on whether an IP is local or remote, causing packets to be sent to the default gateway even for neighbours that are directly connected. This is a classic silent failure that only shows up as dropped pings.
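
The disagreement is easy to reproduce. In this sketch, two hosts sit on the same wire with different masks; the addresses are illustrative, and the check simply asks each host whether the other falls inside its own idea of the local network.

```python
import ipaddress

# Two hosts on the same wire, configured with different masks (illustrative addresses)
host_a = ipaddress.IPv4Interface('192.168.1.10/24')
host_b = ipaddress.IPv4Interface('192.168.1.200/25')

a_sees_b_local = host_b.ip in host_a.network  # A's network is 192.168.1.0/24
b_sees_a_local = host_a.ip in host_b.network  # B's network is 192.168.1.128/25

print(a_sees_b_local, b_sees_a_local)  # True False
```

Host A tries to reach B directly, while B sends its replies to the default gateway. Whether the ping survives depends on what that gateway does with the packet.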
Production Insight
When debugging an inter-VLAN routing issue, the first thing to check is mask consistency on both ends.
I once spent four hours chasing a routing table problem that turned out to be a /23 mask on one router and /24 on its peer.
Rule: after any interface config change, run show ip interface (Cisco) or ip addr show (Linux) and verify the mask matches the documentation.
Key Takeaway
Subnet mask defines the network boundary.
Both ends must agree — mismatch causes silent packet drops.
Always validate masks in binary or use automated tools to avoid typos.
Debugging Mask Mismatch
If: Two VMs on same VLAN cannot ping each other
Use: Check subnet mask on both VMs. If different, they see different network boundaries. Set both to the same mask.
If: VM can ping gateway but not other hosts on same subnet
Use: Gateway might have a wrong mask or the routing table might be overriding the directly connected route. Check route -n (Linux).

Private IP Ranges and RFC 1918: Why 10.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16 Are Everywhere

RFC 1918 reserves three blocks of IPv4 addresses for private use: 10.0.0.0/8 (16.7 million addresses), 172.16.0.0/12 (about 1 million), and 192.168.0.0/16 (65,536). These addresses are not routable on the public internet; they're meant for internal networks. That's why almost every home router uses 192.168.x.x, and most AWS VPCs use 10.x.x.x or a 172.x.x.x range such as the default 172.31.0.0/16.

The choice between them is about scale. 10.0.0.0/8 is huge — you can build a sprawling enterprise network without overlapping. 172.16.0.0/12 is good for medium-sized orgs. 192.168.0.0/16 is tiny and often leads to collisions when companies merge or need to peer VPCs. Production lesson: never use 192.168.0.0/16 for a corporate network — you'll hit address collisions the moment you need to connect to a partner or acquire another company.

You can also use public IP ranges internally if you control them (uncommon). But the standard practice is to pick a /16 from the 10.x range for your VPC and subnet from there. This gives you flexibility and avoids the RFC 1918 collision risk that 192.168 brings.

One more thing: don't forget about RFC 6598 (Carrier-Grade NAT space: 100.64.0.0/10). This is used by ISPs for CGNAT, but you might encounter it in shared environments. Avoid using it internally unless you're building an ISP network.

Also note: just because you can't route private IPs on the internet doesn't mean they can't be leaked. Misconfigured BGP can advertise private ranges. Always filter outbound routes to your upstream provider.

A real-world story: A startup used 192.168.0.0/16 for their entire infrastructure. When they tried to connect to a customer's VPN that also used 192.168.0.0/16, routing fell apart. They had to re-IP their whole network over a weekend. Don't be that team.

Another lesson: when you acquire a company, the first thing to check is their private IP range. If you both use 10.0.0.0/16, you'll need to re-address one side or use NAT. That's expensive and error-prone. Plan ahead.

Here's a quick tip: if you're designing a multi-cloud environment, use a different /16 for each cloud provider. That way, peering between clouds won't cause conflicts.
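
One way to hand out non-overlapping /16s is to walk the /16 subnets of 10.0.0.0/8 in order. The sketch below does that; the environment names are hypothetical, and a real plan would live in an IPAM tool rather than a script.

```python
import ipaddress

supernet = ipaddress.IPv4Network('10.0.0.0/8')
environments = ['aws', 'gcp', 'azure', 'on-prem']  # hypothetical environment names

# subnets() yields /16 blocks in ascending order; zip stops after the last name
allocation = dict(zip(environments, supernet.subnets(new_prefix=16)))

for name, block in allocation.items():
    print(name, block)
```

Because every block comes from the same generator, no two environments can receive overlapping ranges.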

io/thecodeforge/subnetting/rfc1918_check.py (Python)
import ipaddress

def is_private(ip_cidr: str) -> bool:
    """Check if a CIDR block is within RFC 1918 private ranges."""
    net = ipaddress.IPv4Network(ip_cidr, strict=False)
    private_ranges = [
        ipaddress.IPv4Network('10.0.0.0/8'),
        ipaddress.IPv4Network('172.16.0.0/12'),
        ipaddress.IPv4Network('192.168.0.0/16'),
    ]
    return any(net.subnet_of(pr) for pr in private_ranges)

def is_cgnat(ip_cidr: str) -> bool:
    net = ipaddress.IPv4Network(ip_cidr, strict=False)
    cgnat = ipaddress.IPv4Network('100.64.0.0/10')
    return net.subnet_of(cgnat)

# Examples
print(is_private('10.0.1.0/24'))       # True
print(is_cgnat('100.64.1.0/24'))       # True
AWS VPC Default CIDR
When you create a default VPC in AWS, it uses 172.31.0.0/16 — that's from the 172.16.0.0/12 private range. It's a valid choice, but if you later need to peer with another VPC using 172.31.0.0/16, you're stuck. Always plan your private range allocation centrally.
Production Insight
A common failure: two divisions of the same company both used 192.168.1.0/24 for their internal networks. When they tried to interconnect via VPN, routing broke because the same IPs existed on both sides.
Solution: Use NAT on one side or re-address one network — both are painful.
Rule: reserve a unique /16 from 10.0.0.0/8 for each business unit or environment.
Key Takeaway
Private IP ranges are free to use internally but must never leak to the internet.
Choose 10.0.0.0/8 for flexibility; avoid 192.168.0.0/16 for corporate networks.
Plan your private address allocation globally to prevent collision-induced routing nightmares.
Choosing a Private Range
If: Small office or home network (< 100 devices)
Use: 192.168.x.x with a /24. Simple and widely supported.
If: Corporate network with multiple sites
Use: 10.x.x.x with a /16 per site. Plenty of room and avoids overlap.
If: Cloud VPC for a startup (single region)
Use: 10.0.0.0/16. Leaves room for expansion and works with any cloud.
If: Enterprise with multi-cloud / hybrid
Use: 10.x.x.x with a global allocation plan across 10.0.0.0/8, coordinated centrally to avoid overlap.

Designing Subnets in AWS VPC: A Real-World Example

Let's design a VPC for a typical three-tier web application. We'll use the private IPv4 range 10.0.0.0/16 (65534 usable addresses). We need:
  • Public subnets for load balancers and NAT gateways (at least 2 AZs, small)
  • Private subnets for application servers (more IPs needed for scaling)
  • Database subnets (locked down, no internet access)

Best practice is to allocate contiguous blocks to keep routing simple. Here's a sample design:
  • Public: 10.0.1.0/24 (us-east-1a), 10.0.2.0/24 (us-east-1b), 254 IPs each
  • App: 10.0.10.0/23 (512 IPs), 10.0.12.0/23, enough for auto-scaling groups
  • DB: 10.0.20.0/24, 10.0.21.0/24. RDS takes one IP per instance plus Multi-AZ

Notice we left gaps (10.0.3.0 through 10.0.9.255, for example) for future use. That's the planning rule: never fill a VPC completely. Leave at least 30% of the address space unallocated. Production lesson: I once saw a VPC with 90% utilisation because someone carved 10.0.0.0/16 into /24s end-to-end. When a new service needed a new subnet, they had to rebuild the VPC.
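
To check how much of the VPC a plan consumes, sum the subnet sizes and divide by the VPC size. This is a rough sketch using the sample design from this section; it assumes the subnets don't overlap, since overlapping blocks would be double-counted.

```python
import ipaddress

vpc = ipaddress.IPv4Network('10.0.0.0/16')
subnets = ['10.0.1.0/24', '10.0.2.0/24', '10.0.10.0/23',
           '10.0.12.0/23', '10.0.20.0/24', '10.0.21.0/24']

# Total addresses handed out to subnets (assumes no overlaps)
allocated = sum(ipaddress.IPv4Network(s).num_addresses for s in subnets)
utilisation = allocated / vpc.num_addresses

print(f'{allocated} of {vpc.num_addresses} addresses allocated ({utilisation:.1%})')
```

The sample design uses 2,048 of 65,536 addresses, which leaves far more than the 30% reserve.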

A pro tip: use this same design pattern in AWS by creating subnets with explicit CIDR blocks in CloudFormation or Terraform. Validate that no two subnets overlap and that all are within the VPC CIDR.

One more nuance: AWS reserves 5 IPs per subnet, not just the 2 the classic formula assumes. For a /24, you lose .0 (network), .1 (VPC router), .2 (DNS), .3 (reserved for future use), and .255 (broadcast). That's 5 IPs gone, so you really have 251 usable, not 254. Factor that into your capacity planning.
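
If you want to list the reserved addresses for a given subnet, for example to keep them out of a static IP plan, indexing into the network object is enough. The helper name aws_reserved is hypothetical.

```python
import ipaddress

def aws_reserved(cidr: str) -> list[str]:
    """First four addresses plus the last one in the subnet (hypothetical helper)."""
    net = ipaddress.IPv4Network(cidr)
    return [str(net[i]) for i in (0, 1, 2, 3, -1)]

print(aws_reserved('10.0.1.0/24'))
```

For 10.0.1.0/24 this returns 10.0.1.0 through 10.0.1.3 and 10.0.1.255.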

Also note: when you use a NAT Gateway in a public subnet, it consumes an Elastic IP and one usable IP from that subnet. Make sure your public subnets have enough headroom for both NAT Gateways and future ALB/NLBs.

A lesson from the field: I've seen teams run out of IPs in their app subnet because they didn't account for the fact that each pod in EKS gets its own VPC IP. A /24 supports 251 pods — fine for small clusters, but a production cluster can blow through that in days. Use a /20 for pod subnets.

Another trap: when you create a VPC, you must also consider the CIDR for future peering. If you use 10.0.0.0/16 and later peer with another VPC that also uses 10.0.0.0/16, you'll have overlapping CIDRs and peering will be impossible. Plan allocations across a larger block such as 10.0.0.0/8 and give each environment its own /16.

And don't forget about the bastion host: if you need to SSH into private instances, you'll need a jump box in a public subnet. That public subnet should be sized to allow at least one EC2 instance plus the NAT Gateway.

io/thecodeforge/subnetting/vpc_designer.py (Python)
import ipaddress

def generate_subnets(vpc_cidr: str,
        subnet_cidrs: list) -> list:
    vpc = ipaddress.IPv4Network(vpc_cidr, strict=False)
    subnets = [ipaddress.IPv4Network(c, strict=False) for c in subnet_cidrs]
    # Validate all subnets are within VPC and non-overlapping
    for s in subnets:
        if not vpc.supernet_of(s):
            raise ValueError(f'{s} is not within {vpc_cidr}')
    for i, s1 in enumerate(subnets):
        for s2 in subnets[i+1:]:
            if s1.overlaps(s2):
                raise ValueError(f'{s1} overlaps with {s2}')
    return [str(s) for s in subnets]

# Example design
design = generate_subnets(
    '10.0.0.0/16',
    ['10.0.1.0/24', '10.0.2.0/24', '10.0.10.0/23', '10.0.12.0/23', '10.0.20.0/24', '10.0.21.0/24']
)
print('Valid design:', design)
AWS VPC IP Reservation
AWS reserves the first four IP addresses and the last one in every subnet. For 10.0.1.0/24, that's 10.0.1.0, .1, .2, .3 and .255, so you really have only 251 usable IPs (256 - 5). Factor this into sizing; a /28 with 16 total minus 5 leaves only 11 usable IPs, barely enough for a single application tier.
Production Insight
The most common subnet design failure in AWS is running out of IPs in a subnet because growth wasn't forecast.
A /28 might seem fine for a Proof-of-Concept, but once it goes live with auto-scaling, you'll exhaust it within a week.
Rule: always allocate subnets with at least 3x your initial estimated need — IPs are free, subnet redesigns are not.
Key Takeaway
Design subnets with 3x headroom.
AWS reserves 5 IPs per subnet — account for that.
Leave large gaps in the VPC CIDR for future services.
Choosing Subnet Size for Tiers
If: Public subnet with NAT Gateway and a few instances
Use: /24, which ensures enough IPs for NAT, ELB, and a small number of EC2s
If: Application subnet with auto-scaling (up to 100 instances)
Use: /23 (512 IPs) to leave room for peak scaling. /24 can run out during deployments.
If: Database subnet with RDS Multi-AZ
Use: /24. Each DB cluster consumes 1 IP, plus you may need replica instances. Rarely needs more.

Common Subnetting Mistakes and How to Fix Them

After years of debugging network problems, I've seen the same patterns over and over. Here are the top three:

  1. Overlapping subnets: When two subnets in different VPCs (or the same VPC!) overlap, routing becomes unpredictable. The router doesn't know which is the correct destination. In VPC peering, AWS rejects overlapping CIDRs entirely.
  2. Wrong gateway IP: The default gateway is not always the first usable IP. In AWS, the first IP (.1) is the VPC router, but in on-premises networks, the gateway might be .254 or something else. Hardcoding .1 as gateway is a common mistake when migrating from cloud to on-prem.
  3. Forgetting the broadcast address: Some applications accidentally use the broadcast address as a host IP. When that happens, traffic to that 'host' floods the entire subnet, causing performance issues and mysterious packet loss.

These are the mistakes that cause 'can't reproduce in dev' incidents. Always validate your subnet plan with automation.

Another mistake: using non-contiguous mask bits (e.g., 255.255.255.128 is fine because it's contiguous, but a mask like 255.128.128.0 is invalid). Always ensure the binary mask is a continuous string of 1s followed by 0s.

One more trap: using a default subnet size without thinking about the service requirements. I've seen teams use /24 for a point-to-point VPN link, wasting 252 IPs. Use /30 or /31 for those links to conserve address space.
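
Python's ipaddress module treats /31 as the point-to-point special case, so hosts() shows what each prefix offers on a link. The 10.255.0.0 range below is an illustrative choice.

```python
import ipaddress

link_hosts = {}
for cidr in ['10.255.0.0/30', '10.255.0.0/31']:  # illustrative link ranges
    net = ipaddress.IPv4Network(cidr)
    # hosts() skips network/broadcast on a /30 but returns both addresses of a /31
    link_hosts[cidr] = [str(h) for h in net.hosts()]
    print(cidr, link_hosts[cidr])
```

Both prefixes give you two endpoints, but the /31 does it with half the address space.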

A hidden mistake: forgetting that subnets need to be sized for high availability. In AWS, if you lose an Availability Zone, the remaining AZ must handle all traffic. That means your subnet in the surviving AZ must have enough IPs to accommodate all instances. Plan for AZ failure — not just normal operation.

Here's a real one: a team used overlapping subnets in two different VPCs and then peered them. The peering succeeded (because AWS only checks overlap at peering time for certain scenarios), but traffic was intermittently blackholed because the routing table couldn't decide which /24 to use. The fix involved tearing down the peering and redesigning one VPC's CIDR.

Also worth mentioning: when using Terraform, you can avoid overlap by deriving subnets with the cidrsubnet function and managing variables carefully. Always run a validation step before apply.
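
If you validate plans outside Terraform, a rough Python analogue of cidrsubnet(prefix, newbits, netnum) can help with cross-checking. This is a sketch, not Terraform's implementation; the function borrows Terraform's name for readability and handles IPv4 only.

```python
import ipaddress

def cidrsubnet(prefix: str, newbits: int, netnum: int) -> str:
    """Return subnet number `netnum` after extending `prefix` by `newbits` bits (IPv4 sketch)."""
    net = ipaddress.IPv4Network(prefix)
    new_prefix = net.prefixlen + newbits
    if new_prefix > 32 or not 0 <= netnum < 2 ** newbits:
        raise ValueError('newbits or netnum out of range for this prefix')
    # Shift the subnet number into the newly borrowed bits
    base = int(net.network_address) + (netnum << (32 - new_prefix))
    return str(ipaddress.IPv4Network((base, new_prefix)))

print(cidrsubnet('10.0.0.0/16', 8, 2))  # 10.0.2.0/24
print(cidrsubnet('10.0.0.0/16', 7, 5))  # 10.0.10.0/23
```

Deriving every subnet from one parent prefix and a unique netnum makes accidental overlap much harder than typing CIDRs by hand.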

io/thecodeforge/subnetting/validation.py (Python)
import ipaddress
from typing import Optional

def validate_subnet_design(subnets: list[str],
                           assigned_ips: Optional[list[str]] = None) -> list[str]:
    """
    Validate a list of CIDR subnets for common mistakes.
    assigned_ips is an optional list of host IPs already in use.
    Returns list of issues found.
    """
    issues = []
    nets = [ipaddress.IPv4Network(s, strict=False) for s in subnets]
    # Check every pair of subnets for overlap
    for i, n1 in enumerate(nets):
        for n2 in nets[i+1:]:
            if n1.overlaps(n2):
                issues.append(f'Overlap: {n1} and {n2}')
    # Flag assigned host IPs that sit on a subnet's network or broadcast address
    for ip in map(ipaddress.IPv4Address, assigned_ips or []):
        for net in nets:
            if net.prefixlen < 31 and ip in (net.network_address, net.broadcast_address):
                issues.append(f'Reserved address in use: {ip} ({net})')
    return issues if issues else ['Design is valid']

# Example
design = ['10.0.1.0/24', '10.0.2.0/24', '10.0.2.128/26']  # last one overlaps
print(validate_subnet_design(design, assigned_ips=['10.0.1.255']))
Pro Tip: Automate Validation
Before deploying any subnet configuration, run it through a validation script. Tools like ipcalc, subnetcalc, or Python's ipaddress module can catch overlaps, wrong sizes, or misaligned boundaries before they become production incidents.
Production Insight
A classic silent failure: two microservices deployed in overlapping subnets within the same VPC. Traffic between them worked intermittently because the VPC router used longest prefix match, but the application code made assumptions about specific IPs.
Debugging took two days and involved packet traces.
Rule: never assume subnets are non-overlapping — always validate programmatically.
Key Takeaway
Overlapping subnets cause unpredictable routing.
Never hardcode gateway IPs — use DHCP or cloud metadata.
Automate subnet validation as part of your CI/CD pipeline.
Mistake Severity Decision Tree
If: Subnet overlap detected
Use: Immediate re-address; routing is unpredictable.
If: Mask mismatch between peers
Use: Fix mask on one side, verify both ends.
If: Subnet too small (IP exhaustion)
Use: Create larger subnet, migrate resources, plan headroom.
If: Non-contiguous mask used
Use: Invalid configuration; replace with proper mask.

Subnetting for Kubernetes: Pod CIDR and Service CIDR

Kubernetes adds two more CIDR layers on top of your VPC: the pod CIDR and the service CIDR. Each node gets a slab of the pod CIDR (e.g., /24 per node), and each pod gets an IP from that node's slab. The service CIDR is a separate block used for ClusterIP services. These CIDRs must not overlap with each other or with the VPC CIDR. If they do, traffic routing breaks silently — pods can't reach services, or worse, traffic destined for a service IP goes to an unrelated VPC resource.

Production lesson: plan your cluster CIDRs before creating the cluster. If your VPC uses 10.0.0.0/16, you might set pod CIDR to 10.1.0.0/16 and service CIDR to 10.2.0.0/16. But watch out: if you have multiple clusters, each needs its own non-overlapping pod and service CIDRs. In AWS EKS, Amazon VPC CNI allows pods to receive VPC IPs, which can exhaust the subnet quickly. Use a dedicated /18 or larger for pods. In self-managed clusters, ensure the pod network plugin (Calico, Flannel) is configured with a CIDR that doesn't conflict with anything else.
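
The per-node arithmetic is worth running before you commit to a pod CIDR. This sketch assumes the example layout above, a 10.1.0.0/16 pod CIDR with one /24 handed to each node; your CNI plugin's actual block size may differ.

```python
import ipaddress

pod_cidr = ipaddress.IPv4Network('10.1.0.0/16')
node_prefix = 24  # assumed per-node block size, as in the example above

node_blocks = pod_cidr.subnets(new_prefix=node_prefix)
first_three = [str(next(node_blocks)) for _ in range(3)]
max_nodes = 2 ** (node_prefix - pod_cidr.prefixlen)

print(first_three)  # ['10.1.0.0/24', '10.1.1.0/24', '10.1.2.0/24']
print(max_nodes)    # 256
```

Under these assumptions the cluster tops out at 256 nodes, however many free IPs remain inside each node's block.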

Another trap: when using a service mesh like Istio, the mesh may require additional IP ranges. Always document all CIDR allocations upfront.

And don't forget about `kube-proxy` mode: if you use IPVS mode instead of iptables, the service CIDR is handled differently. The IPVS mode can handle more services, but it introduces its own quirks. Make sure your service CIDR doesn't overlap with your node CIDR.

A useful check: before creating a cluster, run a quick Python script (like the one below) to verify non-overlap of all three ranges.

One more production-grade tip: in EKS with the VPC CNI, the default maximum pods per node is derived from the instance type's ENI count and the number of IPv4 addresses each ENI can hold, not from the subnet size. A /24 pod subnet offers 251 assignable addresses in AWS, shared by every node in that subnet, and individual EC2 instance types have lower per-node IP limits than that. Check the AWS docs for your instance type's max-pods before planning.
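For the VPC CNI in its default mode (no prefix delegation), the commonly cited formula is max pods = ENIs × (IPv4 addresses per ENI − 1) + 2. The sketch below only encodes that arithmetic; the ENI and address counts passed in are illustrative inputs, so confirm the real limits for your instance type in the AWS documentation.

```python
def max_pods(enis: int, ipv4_per_eni: int) -> int:
    """Max pods under the default VPC CNI mode (no prefix delegation).

    Each ENI keeps one address for itself. The +2 is usually explained
    as covering pods that run with host networking and therefore do not
    consume a VPC address.
    """
    return enis * (ipv4_per_eni - 1) + 2


# Illustrative inputs only: look up the real values for your instance type.
print(max_pods(enis=3, ipv4_per_eni=10))  # 29
print(max_pods(enis=3, ipv4_per_eni=6))   # 17
```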

I once debugged a cluster where the pod CIDR overlapped with the VPC CIDR by just one bit. Pods trying to reach the API server at 10.0.0.1 were routed to a pod instead. It took two weeks to reproduce because the behaviour was intermittent — it only happened when a pod happened to have the same IP as the service.

If you're using Calico with IPIP encapsulation, you can avoid VPC CIDR conflicts by using a separate IP pool. That's a common solution for overlapping issues.

io/thecodeforge/subnetting/k8s_cidr_check.sh (BASH)
# Print kubeadm's built-in defaults for pod and service CIDRs (not the values of a running cluster)
kubeadm config print init-defaults | grep -E 'podSubnet|serviceSubnet'

# Or check from a running cluster config
kubectl get configmap -n kube-system kube-proxy -o yaml | grep -E 'clusterCIDR|podCIDR'

# Check per-node pod CIDR
kubectl get nodes -o jsonpath='{.items[*].spec.podCIDR}' | tr ' ' '\n'

# Validate no overlap between VPC, pod, service CIDRs
python3 -c "
from ipaddress import ip_network
vpc = ip_network('10.0.0.0/16')
pod = ip_network('10.1.0.0/16')
svc = ip_network('10.2.0.0/16')
assert not vpc.overlaps(pod), 'VPC and pod CIDR overlap'
assert not vpc.overlaps(svc), 'VPC and service CIDR overlap'
assert not pod.overlaps(svc), 'Pod and service CIDR overlap'
print('All good: no overlap')
"
Warning: Overlapping CIDRs in Kubernetes
If your pod CIDR overlaps with your VPC CIDR, pods cannot communicate with services or external resources that fall within that overlapping range. Always assign non-overlapping blocks for VPC, pods, and services.
Production Insight
I once joined a team where the pod CIDR (10.0.0.0/14) overlapped with the VPC CIDR (10.0.0.0/16). Pods trying to reach the DNS service at 10.0.0.10 couldn't tell if it was a pod or a VPC resource. Traffic was misrouted for weeks before someone noticed.
The fix: recreate the cluster with a non-overlapping pod CIDR. That meant data-loss risk and downtime.
Rule: never let pod, service, and VPC CIDRs overlap — triple-check before cluster creation.
Key Takeaway
Kubernetes adds pod and service CIDRs — they must not overlap with the VPC CIDR.
Plan all CIDRs before cluster creation; fixing later is painful.
Use dedicated, non-overlapping blocks for each cluster to avoid routing chaos.
Choosing K8s CIDR Blocks
If: VPC uses 10.0.0.0/16, single cluster
Use: Pod CIDR 10.1.0.0/16, service CIDR 10.2.0.0/16; no overlap.
If: Multiple clusters in same VPC or peered VPCs
Use: Unique /16 blocks per cluster, e.g., cluster1: pod=10.1.0.0/16, svc=10.2.0.0/16; cluster2: pod=10.3.0.0/16, svc=10.4.0.0/16.
If: Using AWS VPC CNI (pods get VPC IPs)
Use: Allocate dedicated subnets for pods, e.g., /18 each. Ensure total pod IPs don't exceed subnet size.
If: Service mesh (e.g., Istio) requires extra IPs
Use: Reserve an additional /20 or /16 for mesh traffic, outside the pod and service CIDRs.

Subnetting in Hybrid Cloud: Avoiding Overlap with On-Premises

When you connect an on-premises network to a cloud VPC via VPN or Direct Connect, the biggest risk is CIDR overlap. If both sides use 10.0.0.0/16, traffic to any 10.0.x.x address is ambiguous: does it go on-prem or to the cloud? This causes asymmetric routing, dropped packets, and hours of debugging.

The fix: before any hybrid connection, audit both sides' CIDR allocations. Assign a unique /16 from 10.0.0.0/8 to each environment (e.g., on-prem gets 10.1.0.0/16, cloud-prod gets 10.2.0.0/16, cloud-dev gets 10.3.0.0/16). If overlap is unavoidable, use NAT to translate overlapping addresses at the boundary.
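Handing out a unique /16 per environment can be scripted rather than eyeballed. The following is a minimal sketch, assuming the existing allocations are available as a plain list of CIDR strings; `next_free_block` and the sample allocations are hypothetical, and a real IPAM tool would replace the list.

```python
import ipaddress


def next_free_block(supernet: str, allocated, new_prefix: int = 16):
    """Return the first block of size new_prefix inside supernet that
    overlaps none of the allocated CIDRs, or None if nothing is free."""
    taken = [ipaddress.ip_network(c) for c in allocated]
    for candidate in ipaddress.ip_network(supernet).subnets(new_prefix=new_prefix):
        if not any(candidate.overlaps(t) for t in taken):
            return candidate
    return None


# Hypothetical allocations already recorded for on-prem and cloud.
# Note that 10.2.0.0/15 covers both 10.2.x.x and 10.3.x.x.
allocated = ["10.0.0.0/16", "10.1.0.0/16", "10.2.0.0/15"]

print(next_free_block("10.0.0.0/8", allocated))
```

Because the check uses `overlaps` rather than string comparison, it also catches allocations with a different prefix length, such as the /15 above.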

A real-world example: a company with a 10.0.0.0/16 on-prem wanted to migrate to AWS using a Direct Connect. They created an AWS VPC with 10.0.0.0/16. The Direct Connect failed to establish. They had to re-IP the entire on-prem network to 10.1.0.0/16 — a multi-month project.

Another approach: use the RFC 6598 shared address space (100.64.0.0/10, reserved for Carrier-Grade NAT) for one side if you control both ends, but that's rare. Most enterprises stick to RFC 1918 and manage with careful planning.
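When auditing allocations it helps to label each block by the range it comes from. The sketch below is one possible way to do that with explicit range lists; the function name `classify` and the sample CIDRs are illustrative assumptions.

```python
import ipaddress

RFC1918 = [ipaddress.ip_network(c)
           for c in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]
RFC6598 = ipaddress.ip_network("100.64.0.0/10")  # shared address space (CGNAT)


def classify(cidr: str) -> str:
    """Label a CIDR as 'rfc1918', 'rfc6598', or 'other'."""
    net = ipaddress.ip_network(cidr)
    if any(net.subnet_of(block) for block in RFC1918):
        return "rfc1918"
    if net.subnet_of(RFC6598):
        return "rfc6598"
    return "other"


for cidr in ("10.1.0.0/16", "172.32.0.0/16", "100.64.0.0/16", "192.168.50.0/24"):
    print(cidr, "->", classify(cidr))
```

Explicit ranges are used instead of `is_private` because that attribute also returns True for other special-purpose ranges such as loopback and link-local. `subnet_of` requires Python 3.7 or later.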

Tooling: maintain a central IP address management (IPAM) system that tracks all CIDR blocks across on-prem and cloud. Tools like phpIPAM, NetBox, or even a spreadsheet with validation scripts can prevent overlaps before they happen.

Final tip: always include a 'last mile' step in your migration plan that validates no overlap between the new cloud CIDR and existing on-prem CIDRs. A simple Python script can save you weeks of rework.

io/thecodeforge/subnetting/overlap_check.py (PYTHON)
import ipaddress

def check_overlap(cidr1: str, cidr2: str) -> bool:
    """Return True if two CIDRs overlap."""
    net1 = ipaddress.IPv4Network(cidr1, strict=False)
    net2 = ipaddress.IPv4Network(cidr2, strict=False)
    return net1.overlaps(net2)

# Example: check cloud vs on-prem
cloud_cidr = '10.0.0.0/16'
onprem_cidr = '10.0.0.0/16'
if check_overlap(cloud_cidr, onprem_cidr):
    print("ERROR: Overlap detected! Change one CIDR.")
else:
    print("OK: No overlap.")
Output
ERROR: Overlap detected! Change one CIDR.
Warning: Hybrid CIDR Overlap
If you already have on-premises 10.0.0.0/16, do not use the same block in your cloud VPC. You will either have to re-address one side or use complex NAT rules. Prevention is far cheaper than correction.
Production Insight
I've been part of two major re-IP projects because no one checked CIDR overlap before setting up VPN connections. Each took months and caused significant application downtime.
Solution: implement a mandatory CIDR conflict check in your change management process.
Rule: never create a new VPC without first querying your IPAM tool for existing allocations.
Key Takeaway
Hybrid connectivity requires non-overlapping CIDRs.
Audit both sides before connecting.
Use an IPAM tool to track all allocations and prevent overlaps at design time.
Hybrid CIDR Planning
If: On-prem and cloud CIDRs do not overlap
Use: Proceed with VPN/Direct Connect setup. Ensure routes are propagated.
If: On-prem and cloud CIDRs overlap partially
Use: NAT on one side for the overlapping range, or re-address the smaller block.
If: Complete overlap (same /16)
Use: Re-address one environment to a non-overlapping block. This is painful but necessary.

Why Subnetting Exists: The Real Reason Your Network Is Not a Flat Parking Lot

Imagine a single network with 200 machines. A printer sends a broadcast. Every single NIC on that wire wakes up, processes the packet, then goes back to sleep. That's a flat network. It works until it doesn't.

Subnetting carves your broadcast domain into pieces. Each subnet is its own broadcast island, and traffic stays local unless a router explicitly forwards it. You get three things: containment, isolation, and efficiency. Containment stops broadcast storms from taking down the entire org. Isolation gives you a routed boundary where HR traffic can be kept away from the production database. Efficiency means you stop wasting addresses on a classful scheme that doesn't fit your real headcount.

So the need for subnetting comes down to this: a /24 with 254 addresses is wasteful for 10 IoT sensors, security requires boundaries, and routing breaks down when every router thinks it owns the same block. That's the why. The how follows.

network_audit.go (GO)
// io.thecodeforge
package main

import (
	"fmt"
	"net"
)

// Production trap: broadcast storms in flat /16s killed this API gateway
func auditBroadcastDomain(cidr string) {
	// Discard the parsed IP: an unused variable is a compile error in Go.
	_, ipnet, err := net.ParseCIDR(cidr)
	if err != nil {
		fmt.Printf("invalid CIDR %q: %v\n", cidr, err)
		return
	}
	ones, bits := ipnet.Mask.Size()
	hosts := 1 << (bits - ones) // total addresses in the block
	fmt.Printf("CIDR %s: %d addresses, %d usable\n", cidr, hosts, hosts-2)
	if hosts > 1024 {
		fmt.Println("WARN: Unnecessarily large broadcast domain. Segment this.")
	}
}

func main() {
	auditBroadcastDomain("10.0.0.0/16") // 65534 usable: ouch
	auditBroadcastDomain("10.0.1.0/24") // 254 usable: sane
}
Output
CIDR 10.0.0.0/16: 65536 addresses, 65534 usable
WARN: Unnecessarily large broadcast domain. Segment this.
CIDR 10.0.1.0/24: 256 addresses, 254 usable
Production Trap:
Don't subnet to 'save IPs' in private space — you have millions. Subnet to control broadcast radius and enforce routing boundaries. A /16 for 50 containers is a disaster waiting for a broadcast.
Key Takeaway
Subnet first for broadcast containment, then for security, then for address efficiency — in that order.

Classful Subnetting: The Ancient Art of /8, /16, /24 and Why You Should Forget It

Classful addressing was the original mold: Class A got 16M hosts, Class B got 65K, Class C got 254. It was rigid and wasteful. If you had 300 machines, you took a Class B and left more than 65,000 addresses unused. Subnetting smashed that mold by borrowing host bits to create smaller networks. But here's the trap: classful assumptions still linger in old routing configs, firewall rules, and tool defaults. If you configure a /23 and a device assumes a classful /24, half of the subnet looks like a remote network to that device and packets vanish. The senior engineer takeaway: treat your subnet mask as a binary mask, not a class label. Use CIDR everywhere. Never assume '192.168.x.x is a /24': the private block itself is 192.168.0.0/16, and an address inside it belongs to whatever prefix you actually configured. A subnet mask is a 32-bit value made of contiguous 1 bits followed by 0 bits, and the prefix length is simply the count of those 1 bits. Memorize the powers of two. Forget the classes.
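Treating the mask as a plain 32-bit value can be shown in a few lines. This is a minimal sketch, and the helper names `prefix_to_mask` and `mask_is_contiguous` are hypothetical, chosen here for illustration.

```python
import ipaddress

ALL_ONES = 0xFFFFFFFF


def prefix_to_mask(prefix: int) -> str:
    """Build the dotted-decimal mask for a prefix length (0 to 32)."""
    value = (ALL_ONES << (32 - prefix)) & ALL_ONES  # keep only 32 bits
    return str(ipaddress.IPv4Address(value))


def mask_is_contiguous(mask: str) -> bool:
    """True when the mask is a run of 1 bits followed only by 0 bits."""
    inverted = int(ipaddress.IPv4Address(mask)) ^ ALL_ONES
    # A valid inverted mask has the form 2**k - 1 (all 1s at the low end).
    return (inverted & (inverted + 1)) == 0


for prefix in (23, 24, 25):
    mask = prefix_to_mask(prefix)
    bits = format(int(ipaddress.IPv4Address(mask)), "032b")
    print(f"/{prefix}  {mask:<16} {bits}")

print(mask_is_contiguous("255.255.255.128"))  # True
print(mask_is_contiguous("255.128.128.0"))    # False
```

The `& ALL_ONES` step matters in Python because integers are unbounded, so the left shift alone would produce a value wider than 32 bits.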

subnet_calc.py (PYTHON)
# io.thecodeforge
import ipaddress

def explain_subnet(cidr: str):
    net = ipaddress.IPv4Network(cidr, strict=False)
    print(f"Network: {net.network_address}")
    print(f"Netmask: {net.netmask} (/{net.prefixlen})")
    print(f"Usable hosts: {net.num_addresses - 2}")
    print(f"First host: {net.network_address + 1}")
    print(f"Broadcast: {net.broadcast_address}")

# Real production scenario: aligning kubernetes pod CIDR
explain_subnet("10.100.0.0/23")
print("---")
# Don't trust classful defaults
explain_subnet("192.168.50.0/24")
Output
Network: 10.100.0.0
Netmask: 255.255.254.0 (/23)
Usable hosts: 510
First host: 10.100.0.1
Broadcast: 10.100.1.255
---
Network: 192.168.50.0
Netmask: 255.255.255.0 (/24)
Usable hosts: 254
First host: 192.168.50.1
Broadcast: 192.168.50.255
Senior Engineer Rule:
Never write a subnet calculator by hand. The ipaddress library in Python or net package in Go handles binary math. Your job is to pick the right prefix length for the workload.
Key Takeaway
Forget IP classes. They died in 1993. Every subnet is just a network address plus a prefix length (CIDR).
● Production incident · POST-MORTEM · severity: high

The /25 That Killed Internet Access

Symptom
EC2 instances in a public subnet could not reach the internet (yum update failed, API calls timed out), despite having a correct route table with 0.0.0.0/0 pointing to an internet gateway.
Assumption
The route table must be broken — someone must have deleted the default route.
Root cause
The subnet was created with a /25 mask (10.0.1.0/25) instead of the intended /24. A /25 covers only 10.0.1.0 to 10.0.1.127, so the upper half of the planned range (10.0.1.128 to 10.0.1.255) was no longer part of the subnet. The rest of the design, including the gateway addressing, still assumed the /24 boundary, so resources and the gateway they expected ended up on opposite sides of the mask boundary. The gateway simply wasn't reachable from that subnet.
Fix
Recreated the subnet with the correct CIDR 10.0.1.0/24 and migrated the instances. No route table change was needed.
Key lesson
  • Always double-check subnet mask boundaries: a /25 instead of a /24 halves the address range and moves the broadcast boundary, which can break connectivity silently.
  • When designing public subnets, use at least /24 to avoid confusion and leave room for growth.
  • Automate subnet creation with infrastructure-as-code and validate CIDR alignment before applying.
  • Always validate subnet mask before attaching internet gateway — a mismatch can take down outbound traffic silently.
  • After the fix, run a connectivity test from inside the subnet: ping 8.8.8.8 should succeed immediately.
  • Lesson: When in doubt, use /24. The cost of a larger subnet is zero; the cost of debugging a wrong mask is hours.
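To see what the narrower mask changes, the following minimal sketch compares the intended /24 with the /25 that was actually created. The four test addresses are hypothetical examples of hosts or gateways a design might have planned inside the /24.

```python
import ipaddress

intended = ipaddress.ip_network("10.0.1.0/24")
actual = ipaddress.ip_network("10.0.1.0/25")

print("intended:", intended[0], "to", intended[-1])
print("actual:  ", actual[0], "to", actual[-1])

# Hypothetical addresses planned under the /24 assumption.
for text in ("10.0.1.10", "10.0.1.127", "10.0.1.128", "10.0.1.254"):
    addr = ipaddress.ip_address(text)
    in_24 = addr in intended
    in_25 = addr in actual
    print(f"{text:<12} in /24: {in_24}  in /25: {in_25}")
```

Anything from 10.0.1.128 upward is inside the /24 but outside the /25, which is the kind of mismatch a pre-apply check in infrastructure-as-code can catch.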
Production debug guide: Quick reference for diagnosing common subnet-related production issues. (10 entries)
Symptom · 01
EC2 launch fails with 'Insufficient IP addresses'
Fix
Check subnet size: run aws ec2 describe-subnets --subnet-ids <subnet-id> and look at 'AvailableIpAddressCount'. AWS subnets cannot be resized after creation, so create a larger subnet (or add another one) and move workloads into it.
Symptom · 02
Instance gets an IP but cannot reach internet (public subnet)
Fix
Verify subnet is correctly associated with a route table that has 0.0.0.0/0 to an internet gateway. Then confirm the internet gateway is attached to the VPC.
Symptom · 03
Two peers cannot communicate over VPC peering
Fix
Check for overlapping CIDR blocks between the two VPCs. AWS does not support peering between VPCs whose CIDRs match or overlap, so the connection never becomes active.
Symptom · 04
Ping to a neighbour fails but config looks correct
Fix
Verify both ends have the same subnet mask. Mismatched masks cause routing to treat the neighbour as on a different network.
Symptom · 05
VPN connection failing between on-premises and cloud
Fix
Ensure the on-premises CIDR and cloud VPC CIDR do not overlap. If they do, re-address one side or use NAT. Check VPN tunnel status and BGP prefixes.
Symptom · 06
Two routers with the same public IP range but different prefixes cannot peer
Fix
Use ipcalc to calculate the network address for both prefixes. If they differ, one router must be reconfigured with a matching mask or the route must be summarised.
Symptom · 07
Application logs show intermittent connectivity to a specific service
Fix
Check if the service runs in a different subnet that overlaps with a subnet of another VPC. Overlapping can cause asymmetric routing.
Symptom · 08
EC2 instance gets IP but internal traffic to another subnet fails
Fix
Verify that the subnet's route table has routes for the destination subnet. A missing route or a mismatched mask on the subnet itself can cause this.
Symptom · 09
Auto Scaling group fails to launch instances due to IP exhaustion
Fix
Check the subnet's current available IP count. If below 10% of total, consider adding a larger subnet or distributing across more subnets. Use a /23 or larger for auto-scaling groups.
Symptom · 10
Route summarisation causes traffic blackhole
Fix
Verify that the aggregate route exactly covers only the subnets you own. Use ipcalc or ipaddress.collapse_addresses to check for gaps.
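The summarisation check from the last entry can be sketched with `ipaddress.collapse_addresses`. This is a minimal example; `summarises_exactly` is a hypothetical helper name and the CIDRs are illustrative.

```python
import ipaddress


def summarises_exactly(aggregate: str, owned) -> bool:
    """True when the owned subnets collapse to exactly the aggregate,
    meaning the summary route covers no gaps and nothing extra."""
    collapsed = list(ipaddress.collapse_addresses(
        ipaddress.ip_network(c) for c in owned))
    return collapsed == [ipaddress.ip_network(aggregate)]


# Two adjacent /24s fill the /23 completely.
print(summarises_exactly("10.0.0.0/23", ["10.0.0.0/24", "10.0.1.0/24"]))  # True

# A /22 advertised for only two of its four /24s would attract traffic
# for 10.0.1.0/24 and 10.0.3.0/24, which nobody here owns.
print(summarises_exactly("10.0.0.0/22", ["10.0.0.0/24", "10.0.2.0/24"]))  # False
```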
★ Subnet Calculation Quick Cheat Sheet: Use these commands to validate CIDR, mask, and host counts when debugging network designs.
Need to know how many usable IPs a CIDR provides
Immediate action
Run `ipcalc 10.0.1.0/24` (Linux/macOS) or use an online calculator.
Commands
ipcalc 10.0.1.0/24
ipcalc 10.0.1.15/24
Fix now
Use a /23 if you need more than 254 hosts — never use /28 for production workloads unless you're sure.
Wondering if two subnets overlap
Immediate action
Convert both to binary and compare network bits. Tools like subnetcalc.net can do this.
Commands
ipcalc 10.0.1.0/24 10.0.2.0/24
python3 -c "from ipaddress import ip_network; print('overlap' if ip_network('10.0.1.0/24').overlaps(ip_network('10.0.2.0/24')) else 'no')"
Fix now
Redesign the two overlapping blocks. You cannot use the same address space in two different parts of your network without NAT at the boundary, and VPC peering requires non-overlapping CIDRs.
Need the subnet mask from a CIDR prefix length
Immediate action
Memorise the common ones: /24 = 255.255.255.0, /16 = 255.255.0.0, /8 = 255.0.0.0.
Commands
printf '/24 = %s\n' $(python3 -c "import ipaddress; print(ipaddress.IPv4Network('0.0.0.0/24').netmask)")
Fix now
For any prefix, convert: subnet mask = (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF, written as dotted decimal.
Need to check if a CIDR is in the RFC 1918 private range
Immediate action
Use Python's ipaddress module. Note that is_private is broader than RFC 1918: it also returns True for other special-purpose ranges such as loopback and link-local.
Commands
python3 -c "from ipaddress import ip_network; net = ip_network('10.0.0.0/8'); print('private' if net.is_private else 'public')"
Fix now
If a CIDR is public but used internally, ensure it's not leaked via routing. Use private ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16) to avoid conflicts.
Need to calculate how many /24 subnets fit into a /20 VPC
Immediate action
Divide the total addresses: a /20 has 4096 total addresses, so 16 /24 subnets. Use ipcalc or Python.
Commands
python3 -c "print(2**(24-20))" # gives 16 for /20 -> /24
python3 -c "from ipaddress import ip_network; net = ip_network('10.0.0.0/20'); print(list(net.subnets(new_prefix=24))[:5])"
Fix now
When planning subnets, start from the largest block and work down. Ensure the total subnets don't exceed available space.
Need to verify the network address of a host IP
Immediate action
Use `ipcalc` or Python: `ip_network('10.0.1.55/24', strict=False).network_address`
Commands
python3 -c "from ipaddress import ip_network; net = ip_network('10.0.1.55/24', strict=False); print(net.network_address)"
Fix now
If the network address doesn't match the subnet's base, you've got a mask mismatch.
Need to check if a subnet has enough room for a given number of hosts
Immediate action
Calculate usable hosts: 2^(32-prefix)-2. On AWS, subtract 5 instead of 2. Use Python.
Commands
python3 -c "from ipaddress import ip_network; net = ip_network('10.0.0.0/24'); print('Usable:', net.num_addresses - 2)"
python3 -c "print('AWS usable:', 2**(32-24) - 5)"
Fix now
Always add 30% headroom. If you need 100 IPs, plan for 130: a /25 (126 usable) falls short, so use a /24.
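The sizing rule can be turned into a small helper. This is a sketch under stated assumptions: `smallest_prefix` is a hypothetical name, the 30% headroom default mirrors the tip above, and `reserved=5` models the AWS reservation.

```python
import math


def smallest_prefix(hosts_needed: int, headroom_pct: int = 30, reserved: int = 2) -> int:
    """Return the longest prefix (smallest subnet) that fits the hosts
    plus headroom once the reserved addresses are subtracted."""
    target = math.ceil(hosts_needed * (100 + headroom_pct) / 100)
    for prefix in range(32, -1, -1):  # try the smallest subnets first
        if 2 ** (32 - prefix) - reserved >= target:
            return prefix
    raise ValueError("requirement does not fit in IPv4")


print(smallest_prefix(100))              # target 130 -> 24
print(smallest_prefix(90))               # target 117 -> 25
print(smallest_prefix(100, reserved=5))  # AWS-style reserve -> 24
```

Passing `headroom_pct=0` gives the bare minimum size, which is useful for comparing how much the headroom actually costs.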

Key takeaways

1. Subnetting divides IP space into routable blocks; the subnet mask defines the network boundary.
2. Usable hosts = 2^(32 - prefix) - 2; cloud providers may reserve additional IPs.
3. Private ranges (RFC 1918) must be planned globally to avoid overlaps in hybrid/multi-cloud.
4. Kubernetes adds pod and service CIDRs that must not overlap with the VPC CIDR.
5. Always validate subnet masks, CIDR boundaries, and non-overlap using automation before deployment.

Common mistakes to avoid

5 patterns

Overlapping subnets in different VPCs or environments

Symptom
VPN/VPC peering fails or traffic is intermittently blackholed; routing tables have conflicting entries.
Fix
Run ipaddress overlap check before peering. Ensure each environment uses a dedicated /16 from 10.0.0.0/8. If already overlapping, re-address one side or use NAT.

Hardcoding the default gateway as the first usable IP (e.g., 10.0.1.1) without verification

Symptom
Instances in a subnet cannot reach the internet or other subnets because the gateway IP is wrong (e.g., on-prem uses .254).
Fix
Use DHCP or cloud metadata to obtain the correct gateway. In AWS, the VPC router is always the subnet's base address plus one (.1 only when the subnet starts at .0); on-prem, check with ip route show default.

Using a non-contiguous subnet mask (e.g., 255.128.128.0)

Symptom
Router rejects the mask or routing behaves unpredictably because the mask doesn't have a contiguous block of 1s.
Fix
Always use masks like 255.255.255.0 (contiguous). Validate with Python: mask_to_cidr should not throw. Replace with a valid contiguous mask.

Underestimating IP consumption in Kubernetes (pod CIDR too small)

Symptom
Pod creation fails with 'No IP addresses available' or nodes experience IP exhaustion.
Fix
Allocate a /18 or larger for pod CIDR in EKS when using VPC CNI. In self-managed, use a /16 for pods. Monitor IP usage with kubectl describe nodes.

Forgetting AWS reserves 5 IPs per subnet

Symptom
Launching instances fails with 'Insufficient IP addresses' even though the subnet seems large enough.
Fix
Account for 5 reserved IPs when sizing. For a /24, usable = 251, not 254. A /28 gives only 11 usable addresses, which is too small for production.
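The five reserved addresses can be listed for any subnet: AWS reserves the first four addresses and the last one. The sketch below is illustrative; `aws_reserved` is a hypothetical helper and the dictionary keys are labels chosen here, not AWS API fields.

```python
import ipaddress


def aws_reserved(cidr: str) -> dict:
    """List the five addresses AWS reserves in a subnet."""
    net = ipaddress.ip_network(cidr)
    return {
        "network": net[0],
        "vpc_router": net[1],   # base + 1
        "dns": net[2],          # base + 2, reserved by AWS
        "future_use": net[3],   # base + 3
        "broadcast": net[-1],
    }


for cidr in ("10.0.1.0/24", "10.0.1.128/25", "10.0.1.0/28"):
    net = ipaddress.ip_network(cidr)
    reserved = aws_reserved(cidr)
    print(cidr, "router:", reserved["vpc_router"],
          "usable:", net.num_addresses - len(reserved))
```

The 10.0.1.128/25 case shows why a hardcoded `.1` gateway is fragile: the router there is 10.0.1.129.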
INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

Q01 (SENIOR): Explain the difference between a /24 and a /25 subnet in terms of usable...
Q02 (SENIOR): Your VPC uses 10.0.0.0/16 and you need to peer with another VPC using 10...
Q03 (JUNIOR): Describe how you would calculate the number of usable IPs in a /20 subne...
Q04 (SENIOR): A junior engineer sets up a public subnet with CIDR 10.0.1.0/25 but the ...

Q01 of 04 (SENIOR)

Explain the difference between a /24 and a /25 subnet in terms of usable host addresses and network boundaries. Give a production scenario where using /25 instead of /24 would break connectivity.

ANSWER
A /24 has 256 total addresses (254 usable after subtracting network and broadcast). A /25 has 128 total addresses (126 usable). The /25 uses 1 additional bit for the network prefix, so the /24 splits into two networks. For example, 10.0.1.0/24 has network address 10.0.1.0 and broadcast 10.0.1.255. 10.0.1.0/25 keeps network address 10.0.1.0, but its broadcast moves to 10.0.1.127, and 10.0.1.128 to 10.0.1.255 becomes a separate block (10.0.1.128/25). A default gateway at 10.0.1.1 is still inside the lower /25, but a gateway at the top of the range (for example 10.0.1.254, common on-prem) is not. In production, if a subnet is created as a /25 while hosts, gateways, or route entries still assume the /24, anything addressed in the other half is treated as a different network and becomes unreachable. Always verify mask boundary alignment with gateway IPs.
FAQ · 5 QUESTIONS

Frequently Asked Questions

01. What is the difference between a subnet mask and a CIDR prefix?
02. How many usable IPs does a /28 subnet have in AWS?
03. Can I use the same private IP range in two different AWS VPCs?
04. What tool can I use to verify subnet overlap before creating a VPC peering connection?
05. Why does my EC2 instance in a public subnet get an IP but cannot reach the internet?