Mid-level 6 min · March 06, 2026

ARP Cache Timeout — Why 300s Default Breaks HA Failover

Switches ignore gratuitous ARP until cache expires.

Naren · Founder
Plain-English first. Then code. Then the interview question.
Quick Answer
  • ARP maps IPs to MACs on local Ethernet/Wi-Fi. Hardware doesn't understand IPs.
  • Request: broadcast 'who has 192.168.1.1?'; reply: target responds with its MAC.
  • Cache entries live 60-300 seconds. That delay is the #1 cause of slow failover.
  • Performance: first packet to new IP triggers a broadcast; subsequent packets use cached MAC.
  • Production trap: ARP has zero authentication. Spoofing trivially redirects traffic.
  • Biggest mistake: assuming gratuitous ARP instantly updates all caches (it doesn't).
Plain-English First

Imagine you move to a new neighborhood and you know your friend Sarah's house number (42 Maple Street) but you don't know what her front door looks like. So you stand outside and shout 'Hey, who lives at number 42?' — Sarah hears you, waves, and now you know exactly which door to knock on. ARP does the same thing on a network: your computer knows the IP address it wants to reach, but it needs the physical MAC address to actually deliver the data. It broadcasts a 'who has this IP?' question to everyone on the local network, and the right machine shouts back its MAC address.

Every time you load a webpage, send a Slack message, or ping a server, your operating system has to solve a puzzle before a single byte leaves your machine: it knows the destination's IP address, but your network hardware — your Ethernet card, your Wi-Fi adapter — doesn't understand IP addresses. It only speaks in MAC addresses, those 48-bit hardware identifiers burned into every network interface at the factory. Without a way to bridge that gap, your packets go nowhere.

This is the exact problem ARP was designed to solve back in 1982 (RFC 826), and it's still doing that job on virtually every LAN on the planet. It sits at the boundary between Layer 2 (Data Link) and Layer 3 (Network) of the OSI model, acting as a live translation service that maps 'logical' IP addresses to 'physical' MAC addresses. When it works, it's invisible. When it breaks — or gets exploited — things get interesting fast.

By the end you'll understand exactly how ARP request and reply packets are constructed, why the ARP cache exists and what happens when it goes stale, how ARP spoofing works at a packet level so you can reason about network security, and how to inspect and manipulate ARP behavior on a real Linux or macOS machine. This is the kind of depth that separates engineers who just use networks from engineers who actually understand them.

What is ARP — Address Resolution Protocol?

ARP (Address Resolution Protocol) is a core networking mechanism that bridges Layer 2 (MAC) and Layer 3 (IP). Instead of a dry definition, let's see it in action. When your machine wants to send a packet to another machine on the same Ethernet segment, it needs the destination's MAC address. It broadcasts an ARP request: 'who-has 192.168.1.42? Tell 192.168.1.1'. The owner of 192.168.1.42 unicasts a reply containing its MAC. Your OS caches that mapping so future packets don't need to broadcast again.

That's the entire protocol in one paragraph. The details — packet format, cache behavior, timeouts — are where production issues live.
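To make the request format concrete, here is a minimal sketch that prints the 28-byte ARP payload of a request as hex, following the field order in RFC 826 (hardware type, protocol type, address lengths, opcode, then sender and target addresses). The function names `ip_to_hex`, `mac_to_hex`, and `arp_request_hex` and the sample MAC are made up for illustration; nothing is sent on the wire.

```bash
#!/bin/bash
# Sketch: build the ARP request payload (RFC 826 layout) as a hex string.
# Function names are illustrative only; no packet is transmitted.

ip_to_hex() {    # 192.168.1.1 -> c0a80101
  local IFS=. a b c d
  read -r a b c d <<< "$1"
  printf '%02x%02x%02x%02x' "$a" "$b" "$c" "$d"
}

mac_to_hex() {   # AA:bb:cc:dd:ee:ff -> aabbccddeeff
  printf '%s' "$1" | tr -d ':' | tr 'A-F' 'a-f'
}

arp_request_hex() {   # args: sender_mac sender_ip target_ip
  local htype=0001          # hardware type 1 = Ethernet
  local ptype=0800          # protocol type 0x0800 = IPv4
  local hlen=06 plen=04     # MAC is 6 bytes, IPv4 address is 4 bytes
  local oper=0001           # opcode 1 = request (2 = reply)
  local tha=000000000000    # target MAC is unknown, so it is zeroed
  printf '%s%s%s%s%s%s%s%s%s\n' "$htype" "$ptype" "$hlen" "$plen" "$oper" \
    "$(mac_to_hex "$1")" "$(ip_to_hex "$2")" "$tha" "$(ip_to_hex "$3")"
}

# who-has 192.168.1.42? tell 192.168.1.1
arp_request_hex 02:00:00:00:00:01 192.168.1.1 192.168.1.42
```

The zeroed target MAC is the whole point of the exchange: it is the one field the sender cannot fill in, and the reply carries it back in the sender-MAC position.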

io/thecodeforge/networking/arp_inspect.sh (Bash)

```bash
#!/bin/bash
# Inspect ARP cache and watch ARP traffic on Linux.
# Assumes the interface is eth0; adjust for your system.

echo "=== Current ARP cache ==="
ip neigh show

echo ""
echo "=== Watch ARP packets for up to 5 seconds (max 10 packets) ==="
# tcpdump options go before the filter expression. timeout exits with 124
# when the 5 seconds elapse, which is expected here and not an error.
sudo timeout 5 tcpdump -i eth0 -n -c 10 arp 2>/dev/null
[ $? -eq 124 ] && echo "Capture window ended"

echo ""
echo "=== Clear ARP entry for a specific IP ==="
# Uncomment to clear: sudo ip neigh del 192.168.1.1 dev eth0
# Legacy net-tools equivalent: sudo arp -d 192.168.1.1

echo ""
echo "=== Force re-resolution via arping ==="
# arping -c 2 -I eth0 192.168.1.1
```
Why ARP Requests Are Broadcast but Replies Are Unicast
The sender doesn't know the target's MAC — that's the whole problem. So it broadcasts (MAC FF:FF:FF:FF:FF:FF) to everyone. Only the owner replies, and it replies unicast because it now knows the sender's MAC from the request. Efficient: one broadcast, one unicast reply.
Production Insight
ARP resolves IP to MAC only for local subnet traffic. Packets to a remote subnet go to the default gateway's MAC, not the final destination.
The ARP cache reduces broadcast overhead but causes failover delay. Stale entries persist for minutes.
Rule: Always check ARP cache when debugging 'network is up but traffic fails'. An incomplete or wrong MAC is a common cause.
Key Takeaway
ARP maps IPs to MACs only on the local network. External traffic goes to gateway MAC, not final destination.
ARP cache entries live for minutes (default 60-300 seconds). That cache is the single biggest cause of slow failover.
Rule: When the link is up but a local host is unreachable, check the ARP cache first. A wrong MAC sends packets to the wrong host, and an incomplete entry sends them nowhere.
Is ARP Involved in This Network Issue?
If: Destination IP is on same subnet (check netmask)
Then: ARP is used. Source sends 'who-has' ARP request for destination IP. Expect reply before packets flow.
If: Destination IP is on different subnet
Then: ARP is NOT used for final destination. Packet sent to default gateway's MAC. Gateway then routes. If gateway ARP fails, packet cannot leave subnet.
If: Ping fails but arp -a shows 'incomplete'
Then: No ARP reply was received. Destination may be down, switch isolating ports, or firewall blocking ARP (unlikely). Use tcpdump to see if request leaves.
If: Ping works but intermittent (some packets succeed, some fail)
Then: ARP cache may be flapping if two devices claim same IP. Check arp -a for same IP with different MACs across multiple queries.
If: Failover takes minutes instead of seconds
Then: ARP cache timeouts too high on switches/clients. Reduce gc_stale_time and switch aging time. Test with arping -U on failover.
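
The first two branches of this decision come down to one bitwise comparison. The sketch below shows it in plain Bash arithmetic; `ip_to_int`, `same_subnet`, and `next_hop_for` are hypothetical helper names, and the addresses are examples only.

```bash
#!/bin/bash
# Sketch: decide whether a host ARPs for the destination or for its gateway.
# Helper names are illustrative; IPv4 only.

ip_to_int() {    # dotted quad -> 32-bit integer
  local IFS=. a b c d
  read -r a b c d <<< "$1"
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

same_subnet() {  # args: ip1 ip2 prefix_len ; exit status 0 if same subnet
  local mask=$(( (0xFFFFFFFF << (32 - $3)) & 0xFFFFFFFF ))
  (( ($(ip_to_int "$1") & mask) == ($(ip_to_int "$2") & mask) ))
}

next_hop_for() { # args: src_ip dst_ip prefix_len gateway_ip
  if same_subnet "$1" "$2" "$3"; then
    echo "ARP for $2 directly"
  else
    echo "ARP for gateway $4"
  fi
}

next_hop_for 192.168.1.10 192.168.1.42 24 192.168.1.1
next_hop_for 192.168.1.10 8.8.8.8 24 192.168.1.1
```

A wrong netmask on either side flips this decision, which is why a host can be "on the same switch" and still never ARP for its neighbor.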

ARP Spoofing: The Attack That Redirects Traffic Without Routes

ARP spoofing (ARP cache poisoning) exploits the fact that ARP has no authentication. An attacker sends unsolicited ARP replies (gratuitous ARP) claiming to own the IP address of the default gateway or another host. The victim's ARP cache updates with the attacker's MAC, and all traffic destined for that IP is sent to the attacker instead.

How it works: attacker sends "192.168.1.1 is at aa:bb:cc:dd:ee:ff" (where aa:bb:cc:dd:ee:ff is attacker's MAC). The target believes this unsolicited update and forwards all traffic. The attacker can then inspect, modify, or block the traffic — a classic man-in-the-middle attack.

Mitigations: dynamic ARP inspection (DAI) on switches validates ARP packets against DHCP snooping bindings. Port security limits MAC addresses per port. Static ARP entries for critical IPs (gateway, DNS, NTP) prevent poisoning but are administratively heavy. On Linux, the arp_ignore and arp_filter sysctls only control which ARP requests the host answers; they do not authenticate what the host learns, so they are not a defense against poisoning on their own.

Detection: use arpwatch (logs ARP changes) or arp-scan to detect duplicate IP claims. On Linux, arp -a may show the same IP with different MACs over time. Anomaly detection can alert when gateway MAC changes outside maintenance windows.
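
One quick manual check is to look for a single MAC that answers for several IPs, since a poisoner typically claims the gateway's IP alongside its own. The sketch below parses text in the `ip neigh show` format; the function name `macs_with_multiple_ips` and the sample table are invented for illustration. A hit is a lead, not proof: routers and multi-homed hosts legitimately hold several IPs on one MAC.

```bash
#!/bin/bash
# Sketch: list MACs that appear for more than one IP in `ip neigh show` output.
# Reads the table on stdin so it can be fed live data or a saved snapshot.

macs_with_multiple_ips() {
  awk '
    {
      # The MAC follows the "lladdr" keyword; FAILED/INCOMPLETE lines have none.
      for (i = 1; i < NF; i++)
        if ($i == "lladdr") { mac = $(i + 1); ips[mac] = ips[mac] " " $1; n[mac]++ }
    }
    END { for (m in n) if (n[m] > 1) print m ips[m] }
  ' | sort
}

# Sample snapshot (made-up addresses): one MAC claims the gateway and a host.
macs_with_multiple_ips <<'EOF'
192.168.1.1 dev eth0 lladdr aa:bb:cc:dd:ee:ff REACHABLE
192.168.1.20 dev eth0 lladdr 11:22:33:44:55:66 STALE
192.168.1.66 dev eth0 lladdr aa:bb:cc:dd:ee:ff REACHABLE
192.168.1.30 dev eth0  FAILED
EOF
```

On a live host the same function can be fed with `ip neigh show | macs_with_multiple_ips`.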

Why ARP Spoofing Still Works in 2026
ARP has no authentication because the same 1982 protocol is still used. No MAC address validation, no cryptographic signatures. Enterprises mitigate with switch security (DAI), but many small networks remain vulnerable. Never trust that ARP mappings are correct without additional security layers like IPsec or HTTPS.
Production Insight
ARP spoofing can intercept traffic without touching routing tables. It's purely layer-2, so firewalls see legitimate src/dst IPs.
Detection: an IP with two MACs in the cache is the clearest sign. Use arpwatch to detect 'flip' events.
Rule: For any sensitive network, enable switch security: DHCP snooping + dynamic ARP inspection (DAI). It's not perfect but raises the bar significantly.
Key Takeaway
ARP spoofing redirects traffic without touching IP routes — pure layer-2 attack. No authentication in ARP.
Detection: look out for the same IP with multiple MAC addresses in the cache. arpwatch is the standard monitoring tool.
Rule: For sensitive networks, enable Dynamic ARP Inspection (DAI) on switches. For public networks, encrypt everything, because ARP cannot be trusted.
ARP Security Controls Selection
If: Critical infrastructure (payment, auth, database)
Then: Use static ARP entries for critical IPs (gateway, DNS, NTP). Disable dynamic ARP learning on those entries. Use port security and MAC limiting on switches.
If: General production network with switch support
Then: Enable DHCP snooping + Dynamic ARP Inspection (DAI). DAI validates ARP packets against the DHCP binding table, discarding unsolicited or mismatched ARP.
If: Public Wi-Fi or untrusted network
Then: Use VPN or IPsec. ARP spoofing is trivial on shared networks (coffee shops, airports). Do not rely on ARP security at all; encrypt from endpoint.
If: Cloud environment (AWS, GCP, Azure)
Then: Cloud SDN replaces the shared layer-2 segment with an overlay network. ARP requests are typically answered by the virtual network from its own IP-to-MAC mappings rather than flooded to neighbors, and each tenant's layer 2 is isolated, so ARP spoofing between tenants is generally not possible.
If: Small office / home office (SOHO)
Then: Upgrade to a switch that supports DAI, or use static ARP for gateway. Most home routers are vulnerable. Use HTTPS and TLS everywhere as defense-in-depth.

Gratuitous ARP: The Double-Edged Sword

Gratuitous ARP (GARP) is an ARP announcement sent without a corresponding request. It's used for IP address takeover (failover), MAC address updates, and duplicate address detection (DAD).

In a gratuitous ARP, the sender puts its own IP in both the sender IP and the target IP fields, so nobody is actually being asked a question. The message says 'this IP is now at this MAC'. Recipients may update their ARP cache immediately, even though they didn't ask.

Common uses:
  • HA failover (VRRP, CARP): Standby server sends GARP to update switch MAC tables and client caches when VIP moves.
  • MAC address change: If a NIC MAC changes (rare, but possible with virtual machines), GARP can notify the network.
  • Duplicate IP detection: A node that receives GARP claiming an IP it already owns can detect conflict.

Why GARP fails in production:
  • Some switches, firewalls, and client OSes ignore or rate-limit unsolicited ARP updates (security hardening) and only learn from replies to their own requests.
  • Even when accepted, some implementations only update if the entry doesn't already exist or is stale.
  • A common workaround is to trigger fresh ARP requests for the IP, so that caches are refreshed by a solicited reply.

io/thecodeforge/networking/gratuitous_arp_test.sh (Bash)

```bash
#!/bin/bash
# Demonstrating gratuitous ARP and testing cache update behavior.
# Assumes interface eth0, and that 192.168.1.100 is configured on this host
# when the gratuitous ARP is sent.

echo "=== Sending gratuitous ARP from this machine ==="
# -U means unsolicited (gratuitous) ARP, sent in request form
# -c 3 sends 3 packets in case one is lost
# -I specifies the network interface (eth0, wlan0, etc.)
arping -U -c 3 -I eth0 192.168.1.100

echo ""
echo "=== Forcing ARP cache update via request to target ==="
# Alternative, run from a client: send a normal request for the IP.
# Whoever currently owns the IP replies, which normally refreshes this
# machine's entry. Other hosts' caches are not affected.
arping -c 2 -I eth0 192.168.1.100

echo ""
echo "=== Check current ARP cache entry for the IP ==="
ip neigh show 192.168.1.100

echo ""
echo "=== To test GARP acceptance on another machine ==="
# On a Linux destination, check if GARP updated cache:
# sudo tcpdump -i eth0 arp
# ip neigh show
# The sysctl 'arp_accept' controls whether gratuitous ARP may create
# entries for IPs that are not yet in the cache.
```
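Linux arp_accept Controls GARP Behavior
sysctl -w net.ipv4.conf.eth0.arp_accept=1 lets Linux create new neighbor entries from gratuitous ARP. The default is 0, which means a GARP for an IP that is not already in the cache is ignored; many distributions leave it at 0 for security. Per the kernel's ip-sysctl documentation, an entry that already exists is updated by a GARP regardless of this setting. Always test whether your GARP is actually being accepted in your environment.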
Production Insight
Gratuitous ARP is not magic. Linux with the default arp_accept=0 will not create a new entry from it, and some switches, firewalls, and other OSes ignore or rate-limit unsolicited updates.
Failover scripts often fail because they assume GARP works everywhere. Test with packet capture.
Rule: For critical failover, use VRRP/CARP (which includes GARP but also MAC address takeover). Or use BFD (Bidirectional Forwarding Detection) on routed interfaces to bypass ARP entirely.
Key Takeaway
Gratuitous ARP is a hint, not a guarantee. OS and switch implementations vary wildly in whether they accept unsolicited updates.
For reliable failover, use VRRP/CARP (MAC takeover) or layer-3 health checks + routing.
Rule: Test your failover with packet capture. If you see GARP sent but traffic still goes to old MAC, cache TTL is too high.
Ensuring Fast Failover with ARP
If: Virtual IP failover (keepalived, heartbeat)
Then: Use keepalived with VRRP. With a virtual MAC (use_vmac in keepalived) the VIP's MAC moves with it, so neighbors' ARP entries stay valid; without it, keepalived relies on gratuitous ARP. VRRP between routers is standard.
If: Switches must update MAC table quickly on failover
Then: Reduce MAC address aging time on switch from default 300s to 30s. Apply globally or per VLAN.
If: Clients (Linux) need faster ARP expiry
Then: Lower net.ipv4.neigh.default.base_reachable_time_ms so entries are revalidated sooner, and set net.ipv4.neigh.default.gc_stale_time = 30 so unused stale entries are collected faster.
If: Gratuitous ARP completely unsupported on network
Then: Use layer-3 solutions: BFD + ECMP with health checks. Or cloud load balancer (AWS NLB) which handles failover at proxy level.
If: Need sub-second failover on local network
Then: Use MAC takeover (VRRP with a virtual MAC), not ARP updates. VRRP sends multicast so the switch learns the new port for that MAC immediately via the normal MAC learning process.
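
For setups that do rely on gratuitous ARP, a failover hook usually announces the VIP in both request form and reply form, because receivers differ in which one they honor. The sketch below is a hedged example of such a hook with a dry-run switch; `run`, `announce_vip`, and the `DRY_RUN` variable are hypothetical names, while `-U`, `-A`, `-c`, and `-I` are iputils arping options. It defaults to dry-run, so it only prints what it would execute.

```bash
#!/bin/bash
# Sketch: announce a VIP after takeover. Defaults to dry-run (prints commands).
# Set DRY_RUN=0 to really send packets (needs root and the VIP on the interface).

DRY_RUN=${DRY_RUN:-1}

run() {   # print the command in dry-run mode, execute it otherwise
  if [ "$DRY_RUN" = "1" ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}

announce_vip() {   # args: vip interface [count]
  local vip=$1 iface=$2 count=${3:-3}
  run arping -U -c "$count" -I "$iface" "$vip"   # gratuitous ARP, request form
  run arping -A -c "$count" -I "$iface" "$vip"   # gratuitous ARP, reply form
}

announce_vip 192.168.1.100 eth0 3
```

Sending the announcement is only half the job: confirm on a client with a packet capture and `ip neigh show` that the entry actually changed.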

ARP Cache Internals: Aging, GC and Production Tuning

The ARP cache is a simple key-value store: IP → MAC. But its behavior is governed by several timers and thresholds that directly impact production reliability.

Key Linux sysctl parameters:
  • gc_stale_time (default 60s): how long an entry can be stale before it's considered for garbage collection. A stale entry means the MAC hasn't been verified recently, but the entry still exists.
  • gc_thresh1 (default 128): if the cache has fewer entries than this, GC doesn't run.
  • gc_thresh2 (default 512): soft maximum. If the cache exceeds this, GC runs more aggressively.
  • gc_thresh3 (default 1024): hard limit. Once reached, new ARP resolutions fail with "neighbour table overflow".
  • base_reachable_time (default 30s): base time for an entry to be considered reachable. The actual reachable time is randomized between base_reachable_time/2 and 3*base_reachable_time/2.
  • delay_first_probe_time (default 5s): time to wait before the first probe once a stale entry is used again.
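
A little arithmetic shows what these timers mean for failover. Per the kernel's neighbor documentation, the effective reachable time is randomized between half and one and a half times base_reachable_time. The sketch below computes that window, plus a rough upper bound on how long a client keeps sending to a dead MAC. The bound is a back-of-the-envelope assumption (no upper-layer confirmation, three unicast probes one second apart, no accepted GARP), and the function names are made up.

```bash
#!/bin/bash
# Sketch: timer arithmetic for the Linux neighbor cache. Estimates only.

reachable_window_ms() {   # arg: base_reachable_time in milliseconds
  echo "$(( $1 / 2 )) $(( $1 * 3 / 2 ))"
}

# Rough bound in seconds: longest REACHABLE period, then the delay before the
# first probe, then the unicast probes that must fail before re-resolution.
stale_detection_bound_s() {   # args: base_ms delay_first_probe_s ucast_solicit retrans_s
  local max_ms=$(( $1 * 3 / 2 ))
  echo $(( max_ms / 1000 + $2 + $3 * $4 ))
}

reachable_window_ms 30000               # prints: 15000 45000
stale_detection_bound_s 30000 5 3 1     # prints: 53
```

With the defaults, that is close to a minute per client before any switch-side delay is added, which is why tuning base_reachable_time matters more than gc_stale_time for active flows.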

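Windows ARP cache: netsh interface ip delete arpcache flushes it. Windows Vista and later do not use one fixed timer; they manage IPv4 neighbors with a neighbor unreachability detection (NUD) state machine, in which an entry stays reachable for a randomized interval, then goes stale and is revalidated on next use. Measure the effective lifetime on your own clients instead of assuming a single default.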

Switch MAC aging: Layer-2 switches keep a MAC address table that maps MACs to ports. The default aging time is often 300 seconds. MAC learning happens on the source MAC of any frame: if a frame arrives on a different port than the current entry, the switch moves the entry immediately (a MAC move). A GARP from the new owner triggers exactly this. The ARP cache on a layer-3 switch is a separate table, however, and may not update from a GARP.

Tuning for HA:
  • Reduce gc_stale_time to 15-30 seconds for faster failover.
  • Increase gc_thresh3 if you have many neighbors (e.g., container hosts).
  • Set arp_accept=1 if you trust GARP from your failover script.
  • Always test: send GARP and verify cache update on target with ip neigh show.

ARP Cache as a Phonebook
  • Entries have a 'reachable' state and 'stale' state. Stale entries are still usable but need verification.
  • GC runs periodically to purge entries that haven't been used. Not all stale entries are removed immediately.
  • gc_stale_time sets how long an entry can stay stale before GC considers it for deletion.
  • gc_thresh1/2/3 set the watermarks for GC aggression. Overflow causes ARP failures.
  • Tuning is a trade-off: faster failover vs more ARP broadcasts.
Production Insight
Stale ARP entries cause silent blackholing. The source sends packets to a MAC that no longer exists.
Cache overflow (gc_thresh3) causes 'neighbour table overflow' errors — new connections fail silently.
Rule: Monitor cat /proc/net/stat/arp_cache for table fullness. Increase gc_thresh3 if you have >1000 neighbours.
Key Takeaway
ARP cache tuning is a direct trade-off between failover speed and broadcast overhead.
Monitor gc_thresh to avoid silent failures. Reduce gc_stale_time for HA.
Rule: Tune cache aggressively for failover environments; test with packet capture to verify GARP acceptance.
ARP Cache Tuning Decisions
If: Failover must complete within 5 seconds
Then: Reduce gc_stale_time to 15s, set arp_accept=1, reduce switch MAC aging to 30s. Use VRRP for sub-second.
If: Host has many neighbors (e.g., Docker host with hundreds of containers)
Then: Increase gc_thresh3 to 4096 or higher. Monitor cache usage. Adjust neigh/default/gc_interval if needed.
If: Random ARP failures in logs: 'neighbour table overflow'
Then: gc_thresh3 is too low. Increase it. Also consider reducing gc_stale_time to flush stale entries faster.
If: ARP requests are flooding the network (broadcast storm)
Then: gc_stale_time might be too low, causing frequent re-resolutions. Increase it to reduce broadcasts, but balance with failover needs.
If: Migrating a VM with same IP to new host (live migration)
Then: After migration, send GARP from the new host. Recipients that ignore it will not update their cache until the entry expires or is re-resolved (could be minutes). Set arp_accept=1 on critical clients so the GARP can also create entries, send both request-form (arping -U) and reply-form (arping -A) announcements, and verify with a packet capture.
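
To catch a filling table before it overflows, the entry count can be read from /proc/net/stat/arp_cache, where the first column of each data row is the table-wide entry count in hexadecimal. The sketch below compares it with gc_thresh3; `neigh_entries` and `usage_pct` are hypothetical helper names, and the script prints nothing if the proc files are not readable.

```bash
#!/bin/bash
# Sketch: report how full the IPv4 neighbor table is relative to gc_thresh3.

neigh_entries() {   # arg: stat file (default /proc/net/stat/arp_cache)
  local file=${1:-/proc/net/stat/arp_cache} hex
  # Line 1 is the header; column 1 of each data row is the entry count in hex.
  hex=$(awk 'NR == 2 { print $1 }' "$file")
  [ -n "$hex" ] || { echo 0; return; }
  echo $(( 16#$hex ))
}

usage_pct() {       # args: entries limit
  echo $(( $1 * 100 / $2 ))
}

stat_file=/proc/net/stat/arp_cache
limit_file=/proc/sys/net/ipv4/neigh/default/gc_thresh3
if [ -r "$stat_file" ] && [ -r "$limit_file" ]; then
  entries=$(neigh_entries "$stat_file")
  limit=$(cat "$limit_file")
  echo "IPv4 neighbor table: $entries / $limit ($(usage_pct "$entries" "$limit")%)"
fi
```

Alerting at some fraction of gc_thresh3 (say 80%) leaves room to raise the limit before new resolutions start failing.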

Proxy ARP and ARP in Virtualized/Cloud Environments

Proxy ARP is a technique where a device (usually a router) answers ARP requests on behalf of another host. It's used in scenarios like VPNs, virtual IPs, and transparent bridging. The router sees an ARP request for an IP that belongs to a host behind it, and it replies with its own MAC address. This tricks the sender into forwarding traffic to the router, which then forwards the packet to the real destination.

When to use Proxy ARP:
  • VPN clients on a subnet need to appear as local hosts.
  • Load balancers that proxy connections to backend servers.
  • Containers in host-networking mode where the host answers for container IPs.

Production pitfalls:
  • Proxy ARP can cause routing loops if misconfigured. The router answers for an IP that is on the same subnet but behind itself, leading to a cycle.
  • It hides the true topology, making debugging harder.
  • Many security teams disable proxy ARP to prevent spoofing.

ARP in cloud environments (AWS, GCP, Azure):
  • Cloud providers use Software-Defined Networking (SDN) instead of a flooded layer-2 segment. ARP requests from an instance are typically answered by the virtual network itself and never reach other instances.
  • The hypervisor handles MAC-to-IP mapping. Even if you see MACs in arp -a, they are virtual MACs assigned by the cloud controller.
  • Gratuitous ARP is ignored. Failover must use cloud-specific mechanisms: health checks, load balancers, Elastic IPs (AWS), etc.
  • In AWS, if you move an Elastic IP to another instance, the network mapping updates in seconds, but it is a control plane update and not ARP-based.
  • Key rule: In the cloud, forget everything you know about ARP. It doesn't work the same way.

io/thecodeforge/networking/proxy_arp_config.sh (Bash)

```bash
#!/bin/bash
# Configure Proxy ARP on Linux (assumes interface eth0)

echo "=== Enable Proxy ARP on an interface ==="
sudo sysctl -w net.ipv4.conf.eth0.proxy_arp=1

echo ""
echo "=== Add a proxy ARP entry for a remote IP ==="
# Respond to ARP requests for 10.0.0.5 as if it's on the local subnet
sudo ip neigh add proxy 10.0.0.5 dev eth0

echo ""
echo "=== Verify proxy entries ==="
ip neigh show proxy

echo ""
echo "=== Check if proxying is active (from another host) ==="
# arping -I eth0 10.0.0.5
# If proxy works, this returns a reply from the router's MAC.
```
Cloud Tip: Forget ARP in the Cloud
AWS, GCP, and Azure replace ARP with SDN mappings. Migrating on-prem HA scripts that rely on GARP to the cloud will fail. Use cloud-native health checks and load balancers for failover.
Production Insight
Proxy ARP can solve reachability issues but introduces routing loops if misconfigured. Always verify with traceroute.
In cloud environments, ARP does not work as expected. GARP is ignored, cache entries are virtual.
Rule: Use static ARP sparingly. Prefer routed solutions (BGP, OSPF) or cloud-native abstractions.
Key Takeaway
Proxy ARP bridges subnets but adds complexity and security risk. Prefer routed solutions.
In the cloud, ARP is replaced by SDN. Forget ARP-based failover.
Rule: When moving on-prem HA to cloud, redesign the failover layer — do not port ARP scripts.
Should You Use Proxy ARP?
If: You need to make a remote subnet appear local (VPN)
Then: Proxy ARP can work, but consider route-based VPN (tun) or VXLAN for cleaner abstraction. Proxy ARP adds debugging complexity.
If: Load balancer with direct server return (DSR)
Then: Proxy ARP is often used to make real servers appear to have the VIP. Better: use LVS with DSR and arp_ignore/arp_announce to prevent servers from responding on VIP.
If: Cloud instance (AWS, GCP)
Then: Do not use Proxy ARP. It will not work as expected. Use cloud load balancer (ALB/NLB), health checks, and autoscaling.
If: Home lab / small network
Then: Works fine for testing. Enable proxy_arp on the gateway router. Monitor for loops.
Production incident · Post-mortem · Severity: high

The 7-Minute Failover That Cost $400k

Symptom
Primary server crashed. Standby detected failure (3 seconds) and sent gratuitous ARP (GARP) for the VIP. But clients and the switch still sent traffic to the dead primary MAC for 7 minutes. Manual ARP cache clearing on the switch fixed it immediately, confirming the root cause.
Assumption
The team assumed gratuitous ARP was a magic 'flush cache' command. They didn't know most switches and OSes only update cache if the new MAC arrives in response to a request (unsolicited GARP is often ignored). They also assumed ARP cache expiry was seconds, not minutes.
Root cause
The switch's ARP timeout was 300 seconds (default on many devices). When the standby sent a GARP, the switch logged it but did NOT replace the existing cache entry because the entry hadn't expired. The cache still mapped the VIP to the primary's MAC. Clients that had already resolved the VIP also held it in their cache (default Linux expiry 60 seconds, Windows 300 seconds). The team had no mechanism to clear caches remotely. Failover required waiting for all caches to expire naturally — up to 5 minutes on clients, plus switch TTL.
Fix
Reduced ARP cache expiry on the switch from 300 to 30 seconds. Reduced client cache TTL via router advertisements (for stateless) or DHCP option. Added a failover script that runs arping -U -I eth0 -c 3 <VIP> (unsolicited ARP); some OSes only accept this with the arp_accept=1 sysctl. Implemented VRRP, which sends multicast advertisements and gratuitous ARP with the proper MAC. Used send_arp in keepalived to emit unsolicited ARPs. After tuning, failover dropped to 3 seconds.
Key lesson
  • Gratuitous ARP is a hint, not a command. Switches and clients ignore it unless configured to accept unsolicited updates. Never rely on it as your only failover mechanism.
  • Always tune ARP cache timeout for your failover requirement. 300 seconds (default switch) is too long for HA. 30-60 seconds is safer; use VRRP or BFD for sub-second.
  • Test failover with packet capture. Look for ARP requests/replies during transition. If you see GARP but traffic still goes to wrong MAC, cache is the culprit.
  • In cloud environments (AWS, GCP), ARP is disabled or replaced with SDN forwarding rules. Use health checks and load balancers, not VIPs with ARP.
Production debug guide: symptom → action mapping for common ARP-related failures
Symptom · 01
Traffic destined to an IP is blackholed — incoming packets stop arriving
Fix
Check ARP cache on the sender. Run arp -a | grep <destination_IP>. If the MAC is incomplete or wrong, ARP resolution failed or the entry is stale. Clear cache: ip neigh flush dev eth0 or arp -d <IP>. Watch with tcpdump -i eth0 arp to see whether requests go out and replies come back.
Symptom · 02
Slow failover (tens of seconds) despite VIP moving quickly
Fix
Check ARP cache timeout on clients and MAC aging on switches. Defaults are often 60-300 seconds. Reduce switch MAC aging via mac address-table aging-time (on a layer-3 switch, the ARP timeout is a separate setting). Use arping -U from the new owner. For Linux clients, set net.ipv4.neigh.default.gc_stale_time = 30.
Symptom · 03
ARP spoofing warning in IDS or security scan
Fix
Use arpwatch or arp-scan to detect anomalies. arp -a may show duplicate IP with different MACs. Mitigate: port security and Dynamic ARP Inspection on switches, plus static ARP entries for critical IPs. The Linux arp_filter and arp_ignore sysctls only govern which requests the host answers, so they do not block poisoned replies. For high-security segments, prefer layer-3 routing over a shared layer-2 domain.
Symptom · 04
Intermittent connectivity — some pings work, some fail
Fix
ARP cache flapping — two devices claim same IP (IP conflict). Run arp -a | grep <IP> and see multiple MAC entries. Use arping -D -I eth0 <IP> to detect duplicate address. Fix by renumbering the conflict or shutting down the rogue device.
Symptom · 05
No ARP replies — 'who-has' requests sent, no response
Fix
Target IP not reachable at layer 2 (different subnet) or the target host is down. Check subnet mask: if destination IP is outside local network, ARP is not used (packet goes to gateway instead). Run tcpdump -i eth0 arp and host <target_IP>. Ensure both machines are on same VLAN/physical segment.
Symptom · 06
VIP reachable from some clients but not others
Fix
Clients with stale ARP caches still point to old MAC. Check arp -a on affected clients. Clear the entry and force re-resolution: arp -d <VIP>. If GARP sent but ignored, check arp_accept on Linux clients.
ARP Quick Debug Cheat Sheet
Fast diagnostics for network layer-2 mapping issues. Run these commands before touching routing.
Can't ping a local IP, but IP is up
Immediate action
Check ARP entry for that IP
Commands
arp -a | grep -i <destination_IP>
ip neigh show | grep <destination_IP>
Fix now
If entry is incomplete or stale: arp -d <IP> or ip neigh del <IP> dev eth0. Then retry ping to refresh cache.
Suspect ARP cache poisoning (spoofing)
Immediate action
Look for one MAC that answers for several IPs (for example the gateway's IP and another host's)
Commands
arp -an | awk '{print $4}' | sort | uniq -d
sudo arp-scan --localnet | grep 'DUP'
Fix now
Clear the suspicious entry: arp -d <IP>. For servers, static ARP: arp -s <IP> <MAC>. Use network switch security (DHCP snooping + ARP inspection).
Failover from primary to standby takes too long
Immediate action
Check the MAC aging time on the switch and the neighbor timers on clients
Commands
show mac address-table aging-time (Cisco)
sysctl net.ipv4.neigh.default.gc_stale_time
Fix now
Reduce switch MAC aging from 300s to 30s. On Linux clients: sysctl -w net.ipv4.neigh.default.gc_stale_time=30. On failover, send GARP: arping -U -c 3 -I eth0 <VIP>.
ARP requests sent but no replies
Immediate action
Capture ARP traffic to see request and missing reply
Commands
tcpdump -n -i eth0 -c 10 arp and host <target_IP>
arping -c 3 -I eth0 <target_IP>
Fix now
If target IP not on same subnet, ARP won't be used. Check netmask: ip addr show on both sides. If on same subnet but no reply, check switch VLAN assignment and firewall blocking (ARP is layer 2, not routable).
Duplicate IP address detected: intermittent connectivity
Immediate action
Find the two MACs claiming same IP
Commands
arp-scan --localnet | grep <IP>
tcpdump -i eth0 arp and host <IP>
Fix now
Shut down one device or change its IP. For DHCP, enable conflict detection. For static IP misassignment, update documentation and monitoring.
ARP vs NDP vs Inverse ARP
| Protocol | Spec | What it does | Security | Used in | Key difference |
| --- | --- | --- | --- | --- | --- |
| ARP (IPv4) | RFC 826 | Maps IPv4 address to MAC address on Ethernet/Wi-Fi | No security; spoofing trivial | IPv4 LANs everywhere | Broadcast request: who-has? |
| NDP (IPv6) | RFC 4861 | Address resolution, router discovery, prefix discovery | SeND (Secure Neighbor Discovery) optional | IPv6 networks | Uses ICMPv6 multicast, not broadcast |
| Inverse ARP | RFC 2390 | Maps DLCI (frame relay) to IP address | Legacy | Frame relay networks (obsolete) | Request asks 'what IP is at DLCI X?' |
| RARP | RFC 903 | Reverse ARP: MAC to IP (for diskless boot) | Obsolete | Historical (replaced by DHCP) | RARP server responds with IP for given MAC |

Key takeaways

1. ARP maps IP addresses to MAC addresses on local Ethernet networks. It is required because hardware only understands MACs.
2. ARP cache entries live for minutes (default 60-300 seconds). That cache is the biggest cause of slow failover; tune expiry times aggressively for HA.
3. ARP has no authentication, so spoofing is trivial. Mitigate with Dynamic ARP Inspection on switches, static ARP for critical IPs, and end-to-end encryption.
4. Gratuitous ARP is not a reliable way to update caches. Linux with arp_accept=0 will not create new entries from it, and other systems may ignore unsolicited updates altogether. Always test with packet capture.
5. In cloud environments, ARP is replaced by SDN. Use cloud-native failover mechanisms, not ARP.

Common mistakes to avoid


Assuming ARP works for cross-subnet communication

Symptom
arp -a shows incomplete or no entry for remote IP; pings fail despite correct routes.
Fix
ARP only resolves IPs on the same subnet. For remote IPs, packets go to default gateway MAC, not the final destination. Check subnet mask on both sides.

Leaving ARP cache timeout too long for failover environment

Symptom
VIP failover takes 3-5 minutes despite instant detection. GARP sent but ignored.
Fix
Reduce gc_stale_time on Linux clients (default 60s). Reduce switch MAC aging time from 300s to 30s. For Linux, set arp_accept=1 if unsolicited ARP is safe in your environment.

Trusting ARP for security — no additional controls

Symptom
ARP spoofing attack detected; traffic intercepted; no alarms.
Fix
Enable Dynamic ARP Inspection (DAI) on switches if supported. Use static ARP for critical IPs. Monitor with arpwatch. Encrypt traffic end-to-end (TLS, IPsec) as defense-in-depth.

Forgetting to clear ARP cache after moving IP to new MAC

Symptom
IP migrated to new machine but some clients still send packets to old MAC.
Fix
Send GARP from the new owner AND proactively flush the cache on critical clients: arp -d <IP> (the same command works on Linux and Windows). On the switch, clear the MAC address table for that VLAN.

Using `arping -U` without testing acceptance on target OS

Symptom
Gratuitous ARP sent but arp -a on receiver doesn't update. Failover scripts think update succeeded.
Fix
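Check the arp_accept sysctl on Linux (sysctl net.ipv4.conf.eth0.arp_accept). If it is 0, the receiver will not create a new entry from an unsolicited ARP. Change it to 1, or have clients refresh their own entry with a normal request (arping -c 3) instead of relying on -U (gratuitous) alone.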

Assuming ARP works in cloud as it does on-prem

Symptom
Failover script using GARP doesn't work in AWS; traffic still goes to old instance.
Fix
ARP is replaced by SDN in the cloud. Use cloud-native mechanisms: AWS Elastic IP reassignment, load balancer target group changes, or health check-based routing.
INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

Q01 · Senior: Explain the difference between ARP request, ARP reply, and gratuitous AR...
Q02 · Senior: How does ARP spoofing work, and how can you detect it on a running netwo...
Q03 · Senior: What is the ARP cache, and why does it cause slow failover in high-avail...
Q04 · Senior: What is Proxy ARP and when would you use it?
Q05 · Senior: How does ARP differ in IPv6 compared to IPv4?

Q01 of 05 · Senior

Explain the difference between ARP request, ARP reply, and gratuitous ARP. When is gratuitous ARP used?

ANSWER
ARP request: broadcast packet (who-has 192.168.1.1?) sent to find the MAC of an IP. ARP reply: unicast response from the owner of that IP, containing its MAC address. Gratuitous ARP (GARP): unsolicited broadcast (or unicast) where the sender puts its own IP in both the sender and target IP fields, announcing 'this IP is now at this MAC'. GARP is used for: (1) HA failover, where the new owner of a virtual IP announces itself; (2) duplicate address detection, where a node checks whether its IP is already in use; (3) MAC address change, after a NIC swap or VM migration. The problem: receivers are free to ignore it. Linux with the default arp_accept=0 will not create a new entry from GARP, and some devices drop unsolicited updates entirely. So GARP is not reliable for failover without additional tuning and testing.
FAQ · 7 QUESTIONS

Frequently Asked Questions

01. Does ARP work across routers?
02. What is the difference between ARP and RARP?
03. Why does the ARP request use broadcast but the reply is unicast?
04. How can I flush the ARP cache on Linux and Windows?
05. What happens if two devices have the same IP address on a network?
06. Is ARP used in wireless networks (Wi-Fi)?
07. How do I check if my Linux machine accepts gratuitous ARP?