DNS Outage: A Deleted A Record Took Down an E-Commerce Site
- Computer networks are interconnected devices sharing data using protocols
- Data travels in packets through layers (OSI/TCP/IP) with headers and payload
- DNS translates domain names to IPs; DHCP assigns addresses dynamically
- Every network hop adds latency (~5 ms per hop is a rough rule of thumb, not a constant); sustained packet loss above about 1% severely degrades TCP throughput
- Production network failures often stem from DNS misconfig or subnet overlap
- Biggest mistake: assuming the network is reliable — it's not, and it drops silently
Network Troubleshooting Cheat Sheet
No network connectivity at all
- ip link show (or ifconfig)
- ping -c 4 8.8.8.8
DNS resolution fails
- nslookup example.com
- dig +trace example.com
Application-specific timeout (e.g., database)
- nc -zv db-server 3306
- ss -tunap | grep 3306
Production Debug Guide
Quick symptom-to-action map for the most common network failures
- telnet api.example.com 443 from the server.
- dig +short example.com and check TTL values. Compare against authoritative NS responses.
- mtr --report target-ip to identify the hop with loss. Check for bandwidth saturation or misconfigured MTU on that link.
- … arp -a). Verify subnet mask consistency. Look for VLAN misconfig on the switch.

Every single time you open Instagram, pay for something online, or video-call a friend on the other side of the world, a computer network is the invisible plumbing making it happen. Networks are not just a niche topic for network engineers; they're the foundation of almost every piece of software ever built. If you don't understand how devices communicate, you'll spend your career confused about why your app is slow, why a request times out, or what an API even is at a physical level. This article breaks down the essentials: how data actually moves from your laptop to a server across the globe, what protocols are, and the real-world failures you'll hit when the network breaks.
What is a Computer Network?
A computer network is a collection of interconnected devices — laptops, servers, routers, switches — that exchange data using agreed-upon protocols. Networks come in different sizes: LAN (Local Area Network) connects devices within a single building, WAN (Wide Area Network) stretches across cities or continents, and the Internet itself is the biggest WAN of all. The core job of a network is to move data from source to destination reliably and efficiently. That means handling addressing (who gets the data), routing (which path it takes), and error recovery (what happens when a packet is lost).
At the simplest level, every device gets a unique identifier — an IP address — and data is split into packets. Each packet carries the destination IP, the source IP, and a payload. Routers along the way inspect the destination and forward the packet toward its target. This is the fundamental mechanism behind everything from loading a webpage to streaming a video.

    #!/bin/bash
    # TheCodeForge - basic network diagnostics

    # Check local IP and connectivity
    ip addr show eth0
    echo "---"
    ping -c 2 google.com

    # Trace route to a host
    traceroute 8.8.8.8

Think of the network as a postal system:
- Your device (house) has a return address (IP).
- DNS is the phone book: it tells you the address of "google.com".
- TCP is registered mail — it confirms delivery and retries if lost.
- Routers are sorting offices that decide the next hop.
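
To make the "sorting office" idea concrete, here is a minimal sketch of how a router might pick the next hop using longest-prefix match. The routing table and the hop names (isp-gateway, core-router, rack-switch) are made up for illustration; real routers hold far larger tables and use specialised data structures.

```python
import ipaddress

# Hypothetical routing table: (destination prefix, next-hop label).
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "isp-gateway"),   # default route
    (ipaddress.ip_network("10.0.0.0/8"), "core-router"),
    (ipaddress.ip_network("10.1.2.0/24"), "rack-switch"),
]


def next_hop(dst_ip: str) -> str:
    """Return the next hop for dst_ip using longest-prefix match."""
    addr = ipaddress.ip_address(dst_ip)
    # Collect every route whose prefix contains the destination.
    matches = [(net, hop) for net, hop in ROUTES if addr in net]
    # The most specific route (largest prefix length) wins.
    _, hop = max(matches, key=lambda match: match[0].prefixlen)
    return hop


if __name__ == "__main__":
    for ip in ("10.1.2.7", "10.9.9.9", "142.250.80.46"):
        print(ip, "->", next_hop(ip))
```

The default route (0.0.0.0/0) matches everything, so a destination that fits no specific prefix still gets forwarded somewhere, which is exactly what your home router does with Internet-bound traffic.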
How Data Travels: The OSI and TCP/IP Models
Data travels through multiple layers, each adding its own header. The OSI model defines seven layers: Physical, Data Link, Network, Transport, Session, Presentation, Application. In practice, TCP/IP collapses these into four: Link, Internet, Transport, Application.
When you send an HTTP request, the application layer (e.g., browser) creates the payload. The transport layer (TCP) adds a header with source and destination ports, splits data into segments, and provides reliable, ordered delivery through acknowledgements and retransmission. The internet layer (IP) wraps each segment into a packet with source and destination IP addresses. Finally, the link layer adds MAC addresses and sends the frame over the wire.
Each intermediate router strips the incoming link-layer header and adds a new one for the next hop. The IP packet's addresses and payload stay the same (barring NAT), although the router decrements the TTL and recomputes the header checksum. The destination host unwraps layers in reverse order, reassembles the segments, and delivers the data to the application.

    # TheCodeForge - simulate packet encapsulation
    def encapsulate(data, src_port, dst_port, src_ip, dst_ip):
        # Transport layer: TCP segment
        segment = f"{src_port}:{dst_port}|{data}"
        # Network layer: IP packet
        packet = f"{src_ip}->{dst_ip}|{segment}"
        # Link layer: Ethernet frame (simplified)
        frame = f"[MAC src->MAC dst]{packet}"
        return frame

    print(encapsulate("GET /index.html", 54321, 80, "192.168.1.5", "142.250.80.46"))

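
The receiving side does the same thing in reverse. The sketch below is a toy decapsulate function (a hypothetical helper, not part of any library) that peels the layers off the string format used by the encapsulation example; encapsulate is repeated so the block runs on its own. Real headers are binary structures, not delimited text.

```python
def encapsulate(data, src_port, dst_port, src_ip, dst_ip):
    """Toy encapsulation: wrap data in transport, internet, and link headers."""
    segment = f"{src_port}:{dst_port}|{data}"
    packet = f"{src_ip}->{dst_ip}|{segment}"
    return f"[MAC src->MAC dst]{packet}"


def decapsulate(frame):
    """Unwrap the toy frame layer by layer, outermost header first."""
    # Link layer: discard everything up to the closing bracket.
    _, packet = frame.split("]", 1)
    # Internet layer: the IP header is the first '|'-delimited field.
    ip_header, segment = packet.split("|", 1)
    src_ip, dst_ip = ip_header.split("->")
    # Transport layer: ports come next; the remainder is the payload.
    tcp_header, data = segment.split("|", 1)
    src_port, dst_port = (int(port) for port in tcp_header.split(":"))
    return {
        "src_ip": src_ip,
        "dst_ip": dst_ip,
        "src_port": src_port,
        "dst_port": dst_port,
        "data": data,
    }


if __name__ == "__main__":
    frame = encapsulate("GET /index.html", 54321, 80, "192.168.1.5", "142.250.80.46")
    print(decapsulate(frame))
```

Splitting with a maximum of one split at each layer means a payload that happens to contain the delimiter still comes out intact, which mirrors how each real layer treats everything after its own header as opaque bytes.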
IP Addressing and Subnetting
Every device on a network needs a unique IP address. IPv4 addresses are 32-bit numbers, usually written as four octets (e.g., 192.168.1.1). IPv6 uses 128 bits to solve address exhaustion. Subnetting divides a network into smaller logical segments. A subnet mask (e.g., 255.255.255.0 or /24) defines which part of the address is the network prefix and which part identifies the host.
CIDR (Classless Inter-Domain Routing) notation replaces classful addressing. For instance, 10.0.0.0/16 means the first 16 bits are the network, giving 65,534 usable host addresses. Subnetting allows efficient use of IP space and improves security by isolating broadcast domains. In production, misconfiguring subnet masks is a common cause of connectivity issues — two hosts with different subnet masks may think the other is on a different network and send traffic to the default gateway, even though they're on the same physical segment.

    # TheCodeForge - simple subnet calculator
    def subnet_info(ip_cidr):
        ip, prefix = ip_cidr.split('/')
        prefix = int(prefix)
        mask = (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF
        mask_str = '.'.join(str((mask >> (24 - 8*i)) & 0xFF) for i in range(4))
        return f"{ip}/{prefix} subnet mask: {mask_str}"

    print(subnet_info("10.0.0.0/16"))

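
Python's standard ipaddress module can check the same arithmetic and also reproduce the mismatched-mask problem described above. In this sketch the two host addresses (10.0.1.10/16 and 10.0.2.20/24) are invented for illustration, and considers_on_link is a hypothetical helper name.

```python
import ipaddress

# The /16 example from the text: 2**16 addresses minus network and broadcast.
net = ipaddress.ip_network("10.0.0.0/16")
usable_hosts = net.num_addresses - 2


def considers_on_link(host_interface: str, peer_ip: str) -> bool:
    """True if a host configured as 'ip/prefix' treats peer_ip as on its own subnet."""
    iface = ipaddress.ip_interface(host_interface)
    return ipaddress.ip_address(peer_ip) in iface.network


if __name__ == "__main__":
    print(net.netmask, usable_hosts)
    # Host A uses /16, host B uses /24, both on the same physical segment.
    print("A sees B as local:", considers_on_link("10.0.1.10/16", "10.0.2.20"))
    print("B sees A as local:", considers_on_link("10.0.2.20/24", "10.0.1.10"))
```

Host A believes B is local and will ARP for it directly, while host B believes A is remote and sends its replies to the default gateway. That asymmetry is why mask mismatches often show up as traffic that works in one direction only.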
Key Network Services: DNS and DHCP
DNS (Domain Name System) translates human-readable domain names (e.g., google.com) into IP addresses. It's a hierarchical, distributed database. When your browser looks up a domain, it queries a resolver (usually your ISP or a public DNS like 8.8.8.8), which walks the chain of root, TLD, and authoritative name servers to find the IP. DNS uses UDP on port 53 for queries, with TCP for zone transfers and large responses.
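
To see why a typical lookup fits comfortably in a single UDP datagram, here is a rough sketch that builds the bytes of a DNS query for an A record, following the classic message layout (12-byte header, then the question). The function name build_dns_query and the query ID 0x1234 are arbitrary choices for illustration; the sketch only builds the message and does not send it.

```python
import struct


def build_dns_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query message asking for the A record of name."""
    # Header: ID, flags (0x0100 = recursion desired), QDCOUNT=1, ANCOUNT=0,
    # NSCOUNT=0, ARCOUNT=0. All fields are 16-bit, network byte order.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is prefixed by its length; a zero byte ends the name.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A record), QCLASS=1 (IN, the Internet class).
    question = qname + struct.pack("!HH", 1, 1)
    return header + question


if __name__ == "__main__":
    query = build_dns_query("example.com")
    print(len(query), "bytes:", query.hex())
```

The whole query for example.com is 29 bytes. Responses are what occasionally outgrow UDP, which is when the fallback to TCP mentioned above comes into play.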
DHCP (Dynamic Host Configuration Protocol) automatically assigns IP addresses, subnet masks, default gateways, and DNS servers to devices when they join a network. Without DHCP, every device would need manual configuration. In production, DHCP lease times affect address availability; short leases (e.g., 5 minutes) cause churn, long leases (e.g., 24 hours) can exhaust the pool during scale-out events.
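
The lease-time trade-off is easier to see with a toy model. The LeasePool class below is a hypothetical simplification, not the DHCP protocol itself (there is no DISCOVER/OFFER exchange); it only tracks which addresses are leased and until when. The two-address pool and the client names are invented.

```python
class LeasePool:
    """Toy address pool: hands out addresses that expire after lease_seconds."""

    def __init__(self, addresses, lease_seconds):
        self.free = list(addresses)
        self.lease_seconds = lease_seconds
        self.leases = {}  # client_id -> (ip, expiry_time)

    def _expire(self, now):
        # Return addresses whose lease has run out to the free list.
        for client, (ip, expiry) in list(self.leases.items()):
            if expiry <= now:
                del self.leases[client]
                self.free.append(ip)

    def request(self, client_id, now):
        """Lease (or renew) an address at time now; None if the pool is empty."""
        self._expire(now)
        if client_id in self.leases:
            ip, _ = self.leases[client_id]  # renewal keeps the same address
        elif self.free:
            ip = self.free.pop(0)
        else:
            return None
        self.leases[client_id] = (ip, now + self.lease_seconds)
        return ip


if __name__ == "__main__":
    pool = ["10.0.0.10", "10.0.0.11"]
    for lease in (86400, 300):
        dhcp = LeasePool(pool, lease)
        dhcp.request("old-node-a", now=0)
        dhcp.request("old-node-b", now=0)
        # The old nodes are terminated without releasing; a new node arrives.
        print(f"lease={lease}s ->", dhcp.request("new-node", now=600))
```

With a 24-hour lease the new node gets nothing, because the addresses of the terminated nodes stay reserved; with a 5-minute lease they have already been reclaimed. The price of the short lease is that every live client must renew far more often.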

    # TheCodeForge - resolve a domain and see the query path
    dig +trace thecodeforge.com

    # Check DHCP lease
    ip addr show | grep dynamic

- Root servers (.) know where .com lives.
- TLD servers (.com) know where authoritative nameservers are.
- Authoritative servers return the actual IP for example.com.
- DNS resolvers cache results to speed up subsequent lookups.
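
Caching is also why DNS changes, including mistakes, take time to show up everywhere. Below is a minimal sketch of a TTL-based cache of the kind a resolver might keep. TtlCache is a hypothetical class, the address 192.0.2.10 comes from a documentation-only range, and the clock is injectable so the behaviour can be demonstrated without waiting.

```python
import time


class TtlCache:
    """Toy resolver cache: entries are served until their TTL runs out."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # name -> (ip, expires_at)

    def put(self, name, ip, ttl):
        """Store an answer together with the time at which it expires."""
        self._entries[name] = (ip, self._clock() + ttl)

    def get(self, name):
        """Return the cached IP, or None if it is missing or has expired."""
        entry = self._entries.get(name)
        if entry is None:
            return None
        ip, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]  # expired: force a fresh lookup
            return None
        return ip


if __name__ == "__main__":
    now = [0.0]  # fake clock so the example runs instantly
    cache = TtlCache(clock=lambda: now[0])
    cache.put("example.com", "192.0.2.10", ttl=300)
    now[0] = 299
    print(cache.get("example.com"))
    now[0] = 300
    print(cache.get("example.com"))
```

If the record is deleted or changed at the authoritative server, resolvers holding a cached copy keep serving the old answer until the TTL expires. That is why different users can see different results during a DNS incident.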
Common Network Failures and Debugging
Network failures are inevitable in production. The most common: DNS failures (domain not resolving), routing issues (packets taking wrong path), firewall blocks (silent drops), ARP cache poisoning, MTU mismatches, and bandwidth saturation. Debugging requires a systematic approach: start at the application layer and work downward.
Essential tools: ping (basic reachability), traceroute/mtr (path analysis), nslookup/dig (DNS), netstat/ss (listening ports), tcpdump/Wireshark (packet inspection), and curl/wget (HTTP layer). Many silent failures happen because ICMP is blocked — path MTU discovery and traceroute rely on it.
A real story: a team deployed a Kubernetes cluster with an overlay network MTU of 1450 on a physical network with an MTU of 1500. Applications experienced intermittent timeouts because packets were fragmented at the IP layer and the fragments were dropped by the AWS network load balancer. The fix was to lower the overlay MTU to 1430 so that encapsulated packets (VXLAN plus any additional overhead) fit within the physical MTU, or to make sure path MTU discovery (PMTUD) works end to end.
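
The MTU numbers in that story come down to simple header arithmetic. The sketch below adds up the standard VXLAN encapsulation overhead; the function name max_overlay_mtu is hypothetical, optional extras such as VLAN tags are ignored, and the source does not say which underlay the incident cluster used, so treat the IPv6 case as one possible explanation for the 1430 figure rather than the confirmed cause.

```python
# Header sizes in bytes for VXLAN encapsulation.
OUTER_IPV4 = 20
OUTER_IPV6 = 40
OUTER_UDP = 8
VXLAN_HEADER = 8
INNER_ETHERNET = 14  # the encapsulated frame carries its own Ethernet header


def max_overlay_mtu(physical_mtu: int, ipv6_underlay: bool = False) -> int:
    """Largest overlay MTU whose encapsulated packets fit in physical_mtu."""
    outer_ip = OUTER_IPV6 if ipv6_underlay else OUTER_IPV4
    overhead = outer_ip + OUTER_UDP + VXLAN_HEADER + INNER_ETHERNET
    return physical_mtu - overhead


if __name__ == "__main__":
    print("IPv4 underlay:", max_overlay_mtu(1500))
    print("IPv6 underlay:", max_overlay_mtu(1500, ipv6_underlay=True))
```

Over an IPv4 underlay, 1450 is the exact ceiling with zero headroom, so any additional encapsulation on the path pushes packets over 1500. Over an IPv6 underlay the ceiling is 1430.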

    #!/bin/bash
    # TheCodeForge - systematic network debug
    echo "1. Check local interface and IP"
    ip addr show
    echo "2. Check default gateway reachability"
    ping -c 2 $(ip route | grep default | awk '{print $3}')
    echo "3. DNS resolution"
    nslookup google.com
    echo "4. Port reachability to remote"
    nc -zv db.example.com 5432
    echo "5. Full path analysis"
    mtr --report github.com

| Type | Scope | Typical Speed | Example |
|---|---|---|---|
| LAN | Single building / campus | 1 Gbps – 10 Gbps | Office network, home network |
| WAN | Cities / continents | 10 Mbps – 10 Gbps | Internet, corporate MPLS |
| MAN | City-wide | 100 Mbps – 10 Gbps | ISP backbone, municipal Wi-Fi |
🎯 Key Takeaways
- A computer network is a system of interconnected devices that exchange data using protocols.
- Data is broken into packets; each packet includes headers for addressing, routing, and error recovery.
- DNS and DHCP are critical services — misconfigurations cause silent outages.
- Always design applications to handle network failures; networks are not reliable.
- Debug network issues systematically: application → transport → internet → link layer.
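
One common way to "handle network failures" in application code is to retry transient errors with exponential backoff and jitter. The sketch below is one possible shape for that, not a prescription: call_with_retries is a hypothetical helper, and the attempt count and base delay are arbitrary defaults. Only retry operations that are safe to repeat.

```python
import random
import time


def call_with_retries(operation, attempts=4, base_delay=0.1, sleep=time.sleep):
    """Run operation(), retrying connection errors and timeouts with backoff."""
    for attempt in range(attempts):
        try:
            return operation()
        except (ConnectionError, TimeoutError):
            if attempt == attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Exponential backoff with full jitter: wait a random time between
            # 0 and base_delay * 2**attempt so clients do not retry in lockstep.
            sleep(random.uniform(0, base_delay * (2 ** attempt)))


if __name__ == "__main__":
    state = {"calls": 0}

    def flaky():
        # Simulated operation: fails twice, then succeeds.
        state["calls"] += 1
        if state["calls"] < 3:
            raise ConnectionError("simulated packet drop")
        return "ok"

    print(call_with_retries(flaky), "after", state["calls"], "calls")
```

The jitter matters as much as the backoff: without it, every client that failed at the same moment retries at the same moment, which can keep an overloaded service down.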
Interview Questions on This Topic
- Q: Explain how a client connects to a server using TCP. What happens during the three-way handshake? (Junior)
- Q: What happens when you type a URL into a browser and press Enter? Describe the network flow. (Mid-level)
- Q: How does a subnet mask affect communication between two hosts? Give an example of a misconfiguration. (Senior)
- Q: Describe a production incident you debugged that was caused by a network issue. How did you diagnose and fix it? (Senior)
Frequently Asked Questions
What is the difference between a hub, a switch, and a router?
A hub broadcasts all data to all ports (simple, insecure). A switch learns MAC addresses and forwards data only to the intended port (layer 2, efficient). A router forwards packets between different networks using IP addresses (layer 3, connects LAN to WAN/Internet). In most production networks, you'll use switches for internal LAN and routers for WAN connectivity.
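
The "learns MAC addresses" part of a switch can be sketched in a few lines. LearningSwitch below is a toy model with invented MAC strings and port numbers; real switches also age out table entries and handle VLANs, which this ignores.

```python
class LearningSwitch:
    """Toy layer-2 switch: learns which port each source MAC lives on."""

    def __init__(self, port_count):
        self.ports = list(range(port_count))
        self.mac_table = {}  # MAC address -> port number

    def receive(self, in_port, src_mac, dst_mac):
        """Process one frame and return the list of ports it is sent out of."""
        # Learn: the sender is reachable through the port the frame arrived on.
        self.mac_table[src_mac] = in_port
        out_port = self.mac_table.get(dst_mac)
        if out_port is None:
            # Unknown destination: flood to every port except the incoming one,
            # which is what a hub does for every single frame.
            return [port for port in self.ports if port != in_port]
        if out_port == in_port:
            return []  # destination sits on the same port: nothing to forward
        return [out_port]


if __name__ == "__main__":
    switch = LearningSwitch(port_count=4)
    print(switch.receive(0, "aa:aa", "bb:bb"))  # bb:bb unknown, so flood
    print(switch.receive(1, "bb:bb", "aa:aa"))  # aa:aa was learned on port 0
    print(switch.receive(0, "aa:aa", "bb:bb"))  # bb:bb was learned on port 1
```

After a single exchange in each direction the switch stops flooding and forwards only to the right port, which is the efficiency gain over a hub.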
Why does my application sometimes get 'Connection refused' vs 'Connection timed out'?
Connection refused means the server actively rejected the connection (no service listening on that port, or firewall sent a RST). Connection timed out means the server didn't respond at all (network path broken, firewall dropped the packet silently, or the server is overloaded and not accepting connections). The two errors have very different root causes: 'refused' is usually a server-side port issue, while 'timeout' is a network or load issue.
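
You can tell the two cases apart programmatically, since Python raises different exceptions for them. The probe function below is a hypothetical helper with an arbitrary 2-second timeout; the demo uses a throwaway listener on localhost and assumes a Unix-like system, where a closed local port is refused immediately.

```python
import socket


def probe(host: str, port: int, timeout: float = 2.0) -> str:
    """Classify a TCP connection attempt as open, refused, or timed out."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # Something answered with a RST: the host is reachable, the port is not.
        return "refused"
    except (socket.timeout, TimeoutError):
        # Nothing answered at all: a silent drop or a broken path.
        return "timed out"
    except OSError as exc:
        # Anything else, e.g. no route to host or a name resolution failure.
        return f"error: {exc}"


if __name__ == "__main__":
    # Start a temporary listener on a free localhost port.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    port = server.getsockname()[1]
    print("while listening:", probe("127.0.0.1", port))
    server.close()
    print("after close:", probe("127.0.0.1", port))
```

A "refused" result tells you the packets arrived, so start with the service and its listening port. A "timed out" result tells you to look at firewalls, security groups, and routing first.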
What is NAT and why is it needed?
NAT (Network Address Translation) allows multiple devices on a private network (e.g., 192.168.x.x) to share a single public IP address when accessing the Internet. It rewrites the source IP and port in outgoing packets and remembers the mapping so return traffic is forwarded to the correct internal device. NAT conserves IPv4 address space and adds a layer of security (external hosts cannot directly reach internal devices). Drawback: it breaks end-to-end connectivity and complicates protocols that embed IP addresses (e.g., SIP, FTP).
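
The "remembers the mapping" step is just a lookup table. Here is a simplified sketch of port-based NAT; the Nat class, the public address 203.0.113.5 (a documentation range), and the starting port 40000 are all invented for illustration. Real implementations also key on protocol and remote endpoint and expire idle mappings.

```python
import itertools


class Nat:
    """Toy port-translating NAT with one public IP."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self._next_port = itertools.count(first_port)
        self._outbound = {}  # (private_ip, private_port) -> public_port
        self._inbound = {}   # public_port -> (private_ip, private_port)

    def outbound(self, src_ip, src_port):
        """Rewrite the source of an outgoing packet; reuse an existing mapping."""
        key = (src_ip, src_port)
        if key not in self._outbound:
            public_port = next(self._next_port)
            self._outbound[key] = public_port
            self._inbound[public_port] = key
        return self.public_ip, self._outbound[key]

    def inbound(self, dst_port):
        """Map return traffic to the internal host, or None if unsolicited."""
        return self._inbound.get(dst_port)


if __name__ == "__main__":
    nat = Nat("203.0.113.5")
    print(nat.outbound("192.168.1.10", 51000))
    print(nat.outbound("192.168.1.11", 51000))
    print(nat.inbound(40001))
    print(nat.inbound(40002))  # no mapping exists, so the packet is dropped
```

The last lookup shows both the security side effect and the drawback: traffic that no internal host asked for has no table entry, so it has nowhere to go.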
What is the difference between TCP and UDP? When would you use each?
TCP is connection-oriented, provides reliable delivery, in-order data, flow control, and error recovery via retransmission. It has higher overhead (headers + handshake). Use TCP for applications that require all data to arrive correctly and in order: HTTP, email, file transfers. UDP is connectionless, fire-and-forget; no guarantees on delivery or order. Use UDP for real-time applications where speed matters over completeness: video streaming, VoIP, DNS queries, online gaming.
Developer and founder of TheCodeForge. I built this site because I was tired of tutorials that explain what to type without explaining why it works. Every article here is written to make concepts actually click.