IP Addressing and Anycast: How to Route Traffic to the Closest Server and Survive Failures
IP addressing and anycast routing explained with production patterns.
20+ years shipping large-scale distributed systems. Everything here is grounded in real deployments.
Anycast allows multiple servers to advertise the same IP address from different locations. Routers send traffic to the nearest server based on BGP path cost. This improves latency and provides automatic failover if one server goes down.
Think of anycast like a chain of coffee shops with the same name. When you search for 'Coffee' on your phone, you're directed to the nearest shop, not a central headquarters. If that shop is closed, you get routed to the next closest. Each shop looks the same to you — same name, same menu — but they're independent. Anycast does the same for IP addresses: multiple servers announce the same IP, and the internet's routing system sends you to the nearest one.
Most engineers think an IP address belongs to one server. That's a lie. Anycast lets you put the same IP on hundreds of servers worldwide, and the internet magically routes each user to the closest one. It's how DNS root servers, CDNs, and Google's front-ends work. But when it breaks — and it will — you'll be debugging BGP withdrawals at 3am.
The problem anycast solves is simple: latency and availability. Without it, all traffic for a service hits a single location. That means high latency for distant users and a single point of failure. Anycast spreads traffic across regions and fails over automatically, once BGP converges, when a server dies.
By the end of this article, you'll understand how anycast works under the hood, how to design services that use it, and — more importantly — the three failure modes that will wake you up at night. You'll also get a production-ready BGP configuration snippet and a debugging checklist.
How Anycast Routing Actually Works: BGP Lies and Route Propagation
Anycast relies on BGP (Border Gateway Protocol) to announce the same IP prefix from multiple locations. Each router runs BGP best-path selection on the announcements it hears: highest local preference first, then shortest AS path, then tie-breakers such as lowest MED. This is not real-time: BGP updates take seconds to minutes to propagate. When a server fails, its BGP session drops, the route is withdrawn, and traffic shifts to the next closest server. This convergence delay is the Achilles' heel.
The key insight: anycast doesn't balance load intelligently. It's purely topological. If one site has a shorter AS path, it gets all traffic from that region. You can manipulate this with AS-path prepending or community strings, but it's coarse.
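To make the "purely topological" point concrete, here is a minimal sketch of a simplified best-path comparison and of what AS-path prepending does to it. The AS numbers, site names, and the Route class are hypothetical, and the real BGP decision process has more steps (and only compares MED between routes from the same neighboring AS), so treat this as an illustration rather than a router implementation.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Route:
    site: str            # anycast site that originated this announcement (hypothetical label)
    as_path: tuple       # AS numbers as seen by the router, origin AS last
    local_pref: int = 100
    med: int = 0


def best_route(routes):
    """Simplified decision order: highest local-pref, shortest AS path, lowest MED."""
    return min(routes, key=lambda r: (-r.local_pref, len(r.as_path), r.med))


def prepend_origin(route, times):
    """Model AS-path prepending: the origin AS repeats itself `times` extra times."""
    origin = route.as_path[-1]
    return replace(route, as_path=route.as_path + (origin,) * times)


# Two announcements of the same prefix, as seen by one hypothetical European router.
eu = Route("eu-west", (64500, 65001))
us = Route("us-east", (64501, 64502, 65001))

print(best_route([eu, us]).site)                      # eu-west
print(best_route([prepend_origin(eu, 2), us]).site)   # us-east
```

Note that prepending moves every router that was deciding on path length at once. There is no way to shift "20% of the traffic" with it, which is why the control is coarse.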
Here's a production scenario: you run a global authentication service. You deploy servers in US-East, EU-West, and AP-Southeast, all advertising 203.0.113.0/24. A user in London is routed to EU-West because the AS path is shorter than to US-East. If EU-West dies, traffic shifts to US-East after BGP converges. That's the theory. In practice, you'll see asymmetric routing, TCP resets during convergence, and stateful services breaking.
Designing Services for Anycast: Stateful vs Stateless
Anycast works beautifully for stateless services like DNS, HTTP redirectors, or static content. Each request can go to any server. But stateful services — databases, WebSocket connections, login sessions — break hard. If a user's TCP connection lands on server A, then BGP reconverges and the next packet goes to server B, the connection resets. The user sees a timeout.
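The reset described above can be sketched with a toy model in which each site keeps its own table of established flows. The Site class and the addresses are hypothetical; real TCP stacks track far more state, but the failure has the same shape: the second site has never seen the handshake.

```python
class Site:
    """Toy anycast site that only knows about flows whose handshake it handled."""

    def __init__(self, name):
        self.name = name
        self.connections = set()   # established flows, keyed by (client_ip, client_port)

    def handle(self, flow, flags):
        if flags == "SYN":
            self.connections.add(flow)
            return "SYN-ACK"
        if flow in self.connections:
            return "ACK"
        return "RST"               # no state for this flow, so the packet is rejected


site_a, site_b = Site("A"), Site("B")
flow = ("198.51.100.7", 50432)

handshake = site_a.handle(flow, "SYN")          # handshake lands on site A
data = site_a.handle(flow, "ACK")               # data flows normally on A
after_reconverge = site_b.handle(flow, "ACK")   # routing shifts, next packet reaches B

print(handshake, data, after_reconverge)        # SYN-ACK ACK RST
```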
The fix: use anycast only for the entry point, then pin the session to a specific backend. CDNs do this: anycast routes to the closest edge node, which then proxies to an origin server using unicast. For your own services, put anycast on a load balancer tier, not on the application servers directly.
Here's a pattern: deploy anycast IPs on HAProxy or Nginx instances in each region. These terminate TCP and forward to backend servers via internal unicast IPs. The anycast IP provides regional load balancing and failover; the backend servers stay stable.
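Inside a region, the proxy tier still has to decide which backend a client is pinned to. One possible approach, not something this article prescribes, is rendezvous (highest-random-weight) hashing on the client address, sketched below. The backend IPs are hypothetical, and pick_backend is an illustrative helper rather than an HAProxy or Nginx API.

```python
import hashlib


def pick_backend(client_ip, backends):
    """Rendezvous hashing: the same client maps to the same backend
    for as long as that backend stays in the list."""
    def score(backend):
        digest = hashlib.sha256(f"{client_ip}|{backend}".encode()).digest()
        return int.from_bytes(digest[:8], "big")
    return max(backends, key=score)


backends = ["10.0.1.11", "10.0.1.12", "10.0.1.13"]   # hypothetical internal unicast IPs

first = pick_backend("198.51.100.7", backends)
again = pick_backend("198.51.100.7", backends)
print(first == again)   # True
```

The useful property is that removing a backend only remaps the clients that were pinned to it; everyone else keeps their backend.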
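To pass the original client address through to the backends, use the PROXY protocol; in Nginx, the proxy_protocol directive does this.
When Anycast Breaks: Three Failure Modes You'll Hit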
- BGP Route Flapping: If a server's BGP session keeps going up and down, routes are continuously withdrawn and re-advertised. Every flap triggers a BGP update to all peers, and the churn spreads well beyond your own network. Mitigation: route flap damping (also called route dampening) on upstream routers, which suppresses an unstable prefix until it settles.
- Asymmetric Routing: Different paths for forward and return traffic. This happens when your anycast IP is announced with different AS paths from different sites. The return traffic might take a different path, causing stateful firewalls to drop packets. Fix: ensure symmetric routing by using the same upstream ISP or by manipulating MED values.
- Hot Potato Routing: ISPs prefer to hand off traffic to you as soon as possible. If your anycast sites have different upstream ISPs, traffic may be routed to a suboptimal site because the ISP wants to minimize its own transit costs. This is hard to control. Use BGP communities to influence upstream decisions.
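For the route-flapping case in the list above, you can also reduce flaps at the source by putting hysteresis between the health check and the announcement. The sketch below is one way to do that, with assumptions stated up front: the AnnouncementGate class and its rise/fall thresholds are hypothetical, and the command strings follow the text format used by ExaBGP's process API, which is an assumption about your BGP speaker rather than anything this article specifies.

```python
class AnnouncementGate:
    """Only change the announcement after several consecutive agreeing health checks."""

    def __init__(self, prefix, rise=3, fall=2):
        self.prefix = prefix
        self.rise = rise          # consecutive successes required before announcing
        self.fall = fall          # consecutive failures required before withdrawing
        self.announced = False
        self._streak = 0          # consecutive results that disagree with the current state

    def observe(self, healthy):
        """Feed one health-check result; return a command on state change, else None."""
        if healthy == self.announced:
            self._streak = 0      # result agrees with current state, nothing to do
            return None
        self._streak += 1
        threshold = self.rise if healthy else self.fall
        if self._streak < threshold:
            return None
        self.announced = healthy
        self._streak = 0
        verb = "announce" if healthy else "withdraw"
        # ExaBGP-style text command (assumed format for this sketch)
        return f"{verb} route {self.prefix} next-hop self"


gate = AnnouncementGate("203.0.113.0/24", rise=3, fall=2)
results = [True, True, True, False, True, False, False]
commands = [gate.observe(r) for r in results]
print([c for c in commands if c])
```

With these thresholds, the single failed check in the middle of the sequence does not cause a withdrawal; only the two consecutive failures at the end do.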
Anycast vs DNS Load Balancing: When to Use Which
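DNS load balancing returns different IPs to different clients based on geo or round-robin. It's simple but slow: DNS caching means changes take minutes to hours to reach every client. Anycast failover doesn't depend on client-side caching. With BFD detecting a dead session quickly, a withdrawal typically takes effect in seconds, subject to the BGP convergence delay described earlier.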
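A rough back-of-the-envelope comparison of the two failover paths, as a sketch. All numbers are illustrative assumptions, not measurements: a 300 ms BFD interval with a multiplier of 3, 10 seconds of BGP convergence, a 300-second DNS TTL, and a 30-second health-check interval. The one fixed relationship is that BFD declares a session down after the configured number of consecutive missed packets.

```python
def bfd_detection_ms(tx_interval_ms, multiplier):
    """BFD detection time: interval times the number of missed packets tolerated."""
    return tx_interval_ms * multiplier


def anycast_failover_s(tx_interval_ms, multiplier, convergence_s):
    """Failure detection plus the time for the withdrawal to propagate."""
    return bfd_detection_ms(tx_interval_ms, multiplier) / 1000 + convergence_s


def worst_case_dns_failover_s(ttl_s, health_check_s):
    """A resolver that cached the old record just before the change keeps it for a full TTL."""
    return health_check_s + ttl_s


anycast_s = anycast_failover_s(tx_interval_ms=300, multiplier=3, convergence_s=10)
dns_s = worst_case_dns_failover_s(ttl_s=300, health_check_s=30)

print(f"anycast: ~{anycast_s:.1f}s, DNS worst case: ~{dns_s}s")
```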
Use anycast when you need fast failover and low latency for global users. Use DNS when you need fine-grained control (e.g., send users to specific servers based on load) or when you can't control BGP (e.g., cloud environments).
Many production systems combine both: DNS returns an anycast IP, and anycast routes to the closest data center. This gives you the best of both: DNS provides a static entry point, anycast handles regional routing and failover.
Debugging Anycast: Tools and Commands That Actually Work
When anycast goes wrong, you need to see the world from the router's perspective. traceroute from multiple vantage points shows you the path. mtr combines traceroute and ping for continuous monitoring. dig +trace shows DNS resolution paths.
On your BGP routers, show ip bgp 203.0.113.0/24 shows the best path and all alternatives. show ip bgp neighbors shows session state. ping from a remote location confirms reachability.
For a global view, use RIPE Atlas or ThousandEyes to probe from hundreds of locations. These tools reveal asymmetric routing and black holes that you can't see from your own network.
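Once you have results from many vantage points, the useful question is "which site did each region actually reach?" The sketch below summarizes that, assuming you have already reduced each measurement to a (region, site reached, RTT) tuple. That record shape, the region names, and the expected-site mapping are all hypothetical; this is not the output format of RIPE Atlas or ThousandEyes.

```python
from collections import defaultdict


def catchment_mismatches(probes, expected_site):
    """Return, per region, the probes that reached a site other than the expected one.

    probes: iterable of (region, site_reached, rtt_ms); site_reached is None
    when the probe got no answer at all (a possible black hole).
    """
    mismatches = defaultdict(list)
    for region, site, rtt_ms in probes:
        if site != expected_site.get(region):
            mismatches[region].append((site, rtt_ms))
    return dict(mismatches)


expected_site = {"london": "eu-west", "virginia": "us-east", "singapore": "ap-southeast"}
probes = [
    ("london", "eu-west", 12.0),
    ("london", "us-east", 85.0),      # crossed the Atlantic: worth a traceroute
    ("virginia", "us-east", 9.0),
    ("singapore", None, None),        # no response from this vantage point
]

report = catchment_mismatches(probes, expected_site)
print(report)
```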
The DNS Outage That Took Down Half the Internet
- Anycast is only as reliable as your BGP configuration.
- One fat-fingered no network command can take down a global service.
1. Run traceroute from a user in that region to the anycast IP.
2. Check the BGP table on your router for that prefix.
3. Verify the local anycast node is healthy and its BGP session is up.
4. Check for route flapping using show ip bgp flap-statistics.

2. Confirm bgp fast-external-fallover is enabled.
3. Verify BFD is configured and working.
4. Check whether the upstream ISP is filtering your prefix.

1. Inspect the current paths with show ip bgp.
2. Adjust AS-path prepending on overloaded sites.
3. Use BGP MED or local-preference to influence path selection.
4. Monitor traffic volume per site with NetFlow.

show ip bgp 203.0.113.0/24
show ip bgp neighbors 192.0.2.1 advertised-routes
network 203.0.113.0 mask 255.255.255.0

Key takeaways
Interview Questions on This Topic
How does anycast handle TCP connections when a server fails? What happens to in-flight connections?
Frequently Asked Questions