Mid-level 8 min · March 06, 2026

ARP — Address Resolution Protocol

ARP Cache Timeout — Why 300s Default Breaks HA Failover

Q: Does ARP work across routers?

No. ARP only resolves IP addresses on the same local subnet. When a packet's destination is on a different subnet, the source uses ARP to find the MAC address of its default gateway (router). The router then forwards the packet to the next hop, resolving MACs on its own interfaces. ARP packets themselves are not routed; they are layer-2 broadcasts and do not cross a router (unless you have proxy ARP, which is generally disabled for security).

Q: What is the difference between ARP and RARP?

ARP (Address Resolution Protocol) maps IP → MAC address. RARP (Reverse ARP) maps MAC → IP address. RARP was used historically by diskless workstations to determine their IP address at boot. It has been completely replaced by BOOTP and then DHCP, which provide additional configuration (gateway, DNS, etc.). RARP is obsolete and not implemented in modern networks.

Q: Why does the ARP request use broadcast but the reply is unicast?

The ARP request uses a broadcast (destination MAC FF:FF:FF:FF:FF:FF) because the sender does not know the target's MAC address. Broadcasting ensures the request reaches all hosts on the segment. However, the target knows the sender's MAC address from the request, so it can send the reply directly (unicast) to the sender without disturbing other hosts. This is more efficient — only the request is broadcast, not the reply.

Q: How can I flush the ARP cache on Linux and Windows?

On Linux: `ip neigh flush dev eth0` (flushes all entries on that interface) or `arp -d <IP>` (removes a single entry). On Windows: `netsh interface ip delete arpcache` (requires admin). On macOS: `arp -a -d` (deletes all dynamic entries).

Q: What happens if two devices have the same IP address on a network?

The ARP cache on other hosts will flip between the two MAC addresses based on which device replies last to an ARP request — this is called ARP cache flapping. Intermittent connectivity and packet loss occur because traffic alternates between the two devices. Use `arping -D` to detect duplicate addresses. DHCP conflict detection can prevent this, but static assignments require administrative coordination.

Q: Is ARP used in wireless networks (Wi-Fi)?

Yes, ARP works the same way on Wi-Fi as on Ethernet. The Wi-Fi access point forwards ARP broadcasts to all associated stations (or uses multicast-to-unicast conversion). However, Wi-Fi has some optimizations: the AP can use proxy ARP to answer on behalf of its stations and reduce over-the-air broadcasts. In enterprise Wi-Fi, ARP may be filtered or inspected by the controller.

Q: How do I check if my Linux machine accepts gratuitous ARP?

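Check `sysctl net.ipv4.conf.eth0.arp_accept`. If 0 (default), a gratuitous ARP for an IP that is not already in the neighbor table does not create a new entry; if 1, it does. Per the kernel's ip-sysctl documentation, an entry that already exists is updated by gratuitous ARP regardless of this setting. You can test by sending a GARP from another machine with `arping -U` and then checking `ip neigh show` on the target to see if the MAC updated.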

Switches ignore gratuitous ARP until cache expires.

Naren Founder & Principal Engineer

20+ years shipping production systems from the metal up. Notes here come from systems that actually shipped.

May 24, 2026

last updated

⚡Quick Answer

ARP maps IPs to MACs on local Ethernet/Wi-Fi. Hardware doesn't understand IPs.
Request: broadcast 'who has 192.168.1.1?'; reply: target responds with its MAC.
Cache entries live 60-300 seconds. That delay is the #1 cause of slow failover.
Performance: first packet to new IP triggers a broadcast; subsequent packets use cached MAC.
Production trap: ARP has zero authentication. Spoofing trivially redirects traffic.
Biggest mistake: assuming gratuitous ARP instantly updates all caches (it doesn't).

Plain-English First

Imagine you move to a new neighborhood and you know your friend Sarah's house number (42 Maple Street) but you don't know what her front door looks like. So you stand outside and shout 'Hey, who lives at number 42?' — Sarah hears you, waves, and now you know exactly which door to knock on. ARP does the same thing on a network: your computer knows the IP address it wants to reach, but it needs the physical MAC address to actually deliver the data. It broadcasts a 'who has this IP?' question to everyone on the local network, and the right machine shouts back its MAC address.

Every time you load a webpage, send a Slack message, or ping a server, your operating system has to solve a puzzle before a single byte leaves your machine: it knows the destination's IP address, but your network hardware — your Ethernet card, your Wi-Fi adapter — doesn't understand IP addresses. It only speaks in MAC addresses, those 48-bit hardware identifiers burned into every network interface at the factory. Without a way to bridge that gap, your packets go nowhere.

This is the exact problem ARP was designed to solve back in 1982 (RFC 826), and it's still doing that job on virtually every LAN on the planet. It sits at the boundary between Layer 2 (Data Link) and Layer 3 (Network) of the OSI model, acting as a live translation service that maps 'logical' IP addresses to 'physical' MAC addresses. When it works, it's invisible. When it breaks — or gets exploited — things get interesting fast.

By the end you'll understand exactly how ARP request and reply packets are constructed, why the ARP cache exists and what happens when it goes stale, how ARP spoofing works at a packet level so you can reason about network security, and how to inspect and manipulate ARP behavior on a real Linux or macOS machine. This is the kind of depth that separates engineers who just use networks from engineers who actually understand them.

What is ARP — Address Resolution Protocol?

ARP — Address Resolution Protocol is a core networking mechanism that bridges Layer 2 (MAC) and Layer 3 (IP). Instead of a dry definition, let's see it in action. When your machine wants to send a packet to another machine on the same Ethernet segment, it needs the destination's MAC address. It broadcasts an ARP request: 'who-has 192.168.1.42? Tell 192.168.1.1'. The target unicasts back its MAC. Your OS caches that mapping so future packets don't need to broadcast again.

That's the entire protocol in one paragraph. The details — packet format, cache behavior, timeouts — are where production issues live.

io/thecodeforge/networking/arp_inspect.sh (Bash)

#!/bin/bash
# Inspect ARP cache and watch ARP traffic on Linux

echo "=== Current ARP cache ==="
ip neigh show

echo ""
echo "=== Watch ARP packets for 5 seconds ==="
sudo timeout 5 tcpdump -i eth0 -n -c 10 arp 2>/dev/null || echo "Capture stopped before 10 ARP packets arrived (timeout or tcpdump error)"

echo ""
echo "=== Clear ARP entry for a specific IP ==="
# Uncomment to clear: sudo arp -d 192.168.1.1
# Or: sudo ip neigh del 192.168.1.1 dev eth0

echo ""
echo "=== Force re-resolution via arping ==="
# arping -c 2 -I eth0 192.168.1.1

Why the ARP Request Is Broadcast but the Reply Is Unicast

The sender doesn't know the target's MAC — that's the whole problem. So it broadcasts (MAC FF:FF:FF:FF:FF:FF) to everyone. Only the owner replies, and it replies unicast because it now knows the sender's MAC from the request. Efficient: one broadcast, one unicast reply.

Production Insight

ARP resolves IP to MAC only for local subnet traffic. Packets to a remote subnet go to the default gateway's MAC, not the final destination.

The ARP cache reduces broadcast overhead but causes failover delay. Stale entries persist for minutes.

Rule: Always check ARP cache when debugging 'network is up but traffic fails'. An incomplete or wrong MAC is a common cause.

Key Takeaway

ARP maps IPs to MACs only on the local network. External traffic goes to gateway MAC, not final destination.

ARP cache entries live for minutes (default 60-300 seconds). That cache is the single biggest cause of slow failover.

Rule: When the link is up but traffic to a local IP fails or reaches the wrong machine, check the ARP cache. A wrong MAC means packets go to the wrong host.

Is ARP Involved in This Network Issue?

If: Destination IP is on the same subnet (check netmask)
→ ARP is used. The source sends a 'who-has' ARP request for the destination IP. Expect a reply before packets flow.

If: Destination IP is on a different subnet
→ ARP is NOT used for the final destination. The packet is sent to the default gateway's MAC, and the gateway then routes it. If ARP for the gateway fails, the packet cannot leave the subnet.

If: Ping fails and arp -a shows 'incomplete'
→ No ARP reply was received. The destination may be down, the switch may be isolating ports, or (less likely) a firewall is blocking ARP. Use tcpdump to see whether the request leaves the interface.

If: Ping is intermittent (some packets succeed, some fail)
→ The ARP cache may be flapping because two devices claim the same IP. Run arp -a several times and check whether the MAC for that IP changes between queries.

If: Failover takes minutes instead of seconds
→ ARP cache timeouts are too high on switches or clients. Reduce gc_stale_time and the switch aging time. Test with arping -U on failover.
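The first two branches of this decision come down to one subnet check. A minimal sketch using Python's standard `ipaddress` module; the function name `arp_target` and the example addresses are illustrative assumptions, not part of any tool mentioned here:

```python
import ipaddress


def arp_target(local_interface: str, destination: str, gateway: str) -> str:
    """Return the IP that has to be resolved via ARP to reach `destination`.

    local_interface is the sender's address with prefix, e.g. "192.168.1.10/24".
    """
    interface = ipaddress.ip_interface(local_interface)
    dest = ipaddress.ip_address(destination)
    if dest in interface.network:
        # Same subnet: ARP for the destination itself.
        return str(dest)
    # Different subnet: ARP for the default gateway instead.
    return str(ipaddress.ip_address(gateway))


print(arp_target("192.168.1.10/24", "192.168.1.42", "192.168.1.1"))  # 192.168.1.42
print(arp_target("192.168.1.10/24", "8.8.8.8", "192.168.1.1"))       # 192.168.1.1
```

A real host makes this choice from its routing table, which can hold more than one on-link prefix; the single-interface check above is only the simplest case.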

[Figure: ARP Cache Timeout & HA Failover Impact]

ARP Spoofing: The Attack That Redirects Traffic Without Routes

ARP spoofing (ARP cache poisoning) exploits the fact that ARP has no authentication. An attacker sends unsolicited ARP replies (gratuitous ARP) claiming to own the IP address of the default gateway or another host. The victim's ARP cache updates with the attacker's MAC, and all traffic destined for that IP is sent to the attacker instead.

How it works: attacker sends "192.168.1.1 is at aa:bb:cc:dd:ee:ff" (where aa:bb:cc:dd:ee:ff is attacker's MAC). The target believes this unsolicited update and forwards all traffic. The attacker can then inspect, modify, or block the traffic — a classic man-in-the-middle attack.

Mitigations: dynamic ARP inspection (DAI) on switches validates ARP packets against DHCP snooping bindings. Port security limits MAC addresses per port. Static ARP entries for critical IPs (gateway, DNS, NTP) prevent poisoning but are administratively heavy. On Linux, keep arp_accept at 0 and, where gratuitous ARP is never needed, consider the drop_gratuitous_arp sysctl. Note that arp_filter and arp_ignore only control how the host answers ARP requests; they do not filter incoming ARP updates.

Detection: use arpwatch (logs ARP changes) or arp-scan to detect duplicate IP claims. On Linux, arp -a may show the same IP with different MACs over time. Anomaly detection can alert when gateway MAC changes outside maintenance windows.
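The detection idea can be sketched in a few lines: collect (IP, MAC) observations over time and flag IPs whose MAC changed, plus MACs that answer for several IPs. This is an illustrative sketch, not how arpwatch is implemented; `find_arp_anomalies` and the sample data are hypothetical.

```python
from collections import defaultdict


def find_arp_anomalies(observations):
    """observations: iterable of (ip, mac) pairs in the order they were seen.

    Returns (changed, shared):
      changed: IPs seen with more than one MAC, mapped to the MACs in order
      shared:  MACs that claimed more than one IP, mapped to sorted IPs
    """
    macs_by_ip = defaultdict(list)
    ips_by_mac = defaultdict(set)
    for ip, mac in observations:
        mac = mac.lower()  # normalise case before comparing
        if mac not in macs_by_ip[ip]:
            macs_by_ip[ip].append(mac)
        ips_by_mac[mac].add(ip)
    changed = {ip: macs for ip, macs in macs_by_ip.items() if len(macs) > 1}
    shared = {mac: sorted(ips) for mac, ips in ips_by_mac.items() if len(ips) > 1}
    return changed, shared


samples = [
    ("192.168.1.1", "00:11:22:33:44:55"),   # gateway, legitimate MAC
    ("192.168.1.66", "aa:bb:cc:dd:ee:ff"),  # another host on the segment
    ("192.168.1.1", "AA:BB:CC:DD:EE:FF"),   # gateway IP now claimed by that host
]
changed, shared = find_arp_anomalies(samples)
print(changed)
print(shared)
```

A shared MAC is not always an attack (a host with several addresses, or a proxy-ARP router, produces the same pattern), so treat both results as prompts for investigation rather than verdicts.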

Why ARP Spoofing Still Works in 2026

ARP has no authentication because the same 1982 protocol is still used. No MAC address validation, no cryptographic signatures. Enterprises mitigate with switch security (DAI), but many small networks remain vulnerable. Never trust that ARP mappings are correct without additional security layers like IPsec or HTTPS.

Production Insight

ARP spoofing can intercept traffic without touching routing tables. It's purely layer-2, so firewalls see legitimate src/dst IPs.

Detection: an IP whose cached MAC changes between checks, or two IPs that map to the same MAC, are the clearest signs. Use arpwatch to detect 'flip' events.

Rule: For any sensitive network, enable switch security: DHCP snooping + dynamic ARP inspection (DAI). It's not perfect but raises the bar significantly.

Key Takeaway

ARP spoofing redirects traffic without touching IP routes — pure layer-2 attack. No authentication in ARP.

Detection: look out for the same IP mapping to different MAC addresses over time. arpwatch is the standard monitoring tool.

Rule: For sensitive networks, enable Dynamic ARP Inspection (DAI) on switches. For public networks, encrypt everything, because ARP cannot be trusted.

ARP Security Controls Selection

If: Critical infrastructure (payment, auth, database)
→ Use static ARP entries for critical IPs (gateway, DNS, NTP) so those mappings cannot be overwritten dynamically. Use port security and MAC limiting on switches.

If: General production network with switch support
→ Enable DHCP snooping + Dynamic ARP Inspection (DAI). DAI validates ARP packets against the DHCP binding table and discards packets whose IP-to-MAC binding does not match.

If: Public Wi-Fi or untrusted network
→ Use a VPN or IPsec. ARP spoofing is trivial on shared networks (coffee shops, airports). Do not rely on ARP security at all; encrypt from the endpoint.

If: Cloud environment (AWS, GCP, Azure)
→ Classic ARP spoofing generally does not apply. The provider's SDN replaces the flat layer-2 segment with an overlay that handles address resolution itself, and tenants are isolated from one another, so one tenant cannot poison another's mappings.

If: Small office / home office (SOHO)
→ Upgrade to a switch that supports DAI, or use a static ARP entry for the gateway. Most home routers are vulnerable. Use HTTPS and TLS everywhere as defense-in-depth.

Gratuitous ARP: The Double-Edged Sword

Gratuitous ARP (GARP) is an ARP announcement sent without a corresponding request. It's used for IP address takeover (failover), MAC address updates, and duplicate address detection (DAD).

In gratuitous ARP, the sender puts its own IP in both the 'sender IP' and 'target IP' fields, which a normal request never does. The message says 'this IP is now at this MAC'. Recipients may update their ARP cache immediately, even though they didn't ask.

Common uses:
- HA failover (VRRP, CARP): The standby server sends GARP to update switch MAC tables and client caches when the VIP moves.
- MAC address change: If a NIC's MAC changes (rare, but possible with virtual machines), GARP can notify the network.
- Duplicate IP detection: A node that receives a GARP claiming an IP it already owns can detect the conflict.

Why GARP fails in production:
- Many layer-3 switches, routers, and client OSes ignore unsolicited ARP updates (security hardening). They only update the cache in response to their own requests.
- Even when accepted, some implementations only update an entry under certain conditions, for example only if it already exists or only if it is stale.
- A common workaround is to follow up with a series of ARP requests for the same IP, forcing a cache refresh via the reply.
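To make the frame layout concrete, here is a minimal sketch that builds a gratuitous ARP request as raw bytes with the standard `struct` and `socket` modules. The helper names and example addresses are assumptions for illustration, and the target hardware address is zeroed here, although some tools fill it with the broadcast address instead. The sketch only builds the bytes; actually sending them would need a raw socket and root privileges.

```python
import socket
import struct

BROADCAST_MAC = b"\xff" * 6
ZERO_MAC = b"\x00" * 6


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def build_gratuitous_arp(mac: str, ip: str) -> bytes:
    """Build a 42-byte Ethernet frame carrying a gratuitous ARP request."""
    src_mac = mac_to_bytes(mac)
    ip_bytes = socket.inet_aton(ip)
    # Ethernet header: broadcast destination, our MAC, EtherType 0x0806 (ARP)
    ethernet = BROADCAST_MAC + src_mac + struct.pack("!H", 0x0806)
    # ARP header: Ethernet (1), IPv4 (0x0800), sizes 6 and 4, opcode 1 (request)
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)
    arp += src_mac + ip_bytes   # sender MAC, sender IP
    arp += ZERO_MAC + ip_bytes  # target MAC, target IP (same as sender IP)
    return ethernet + arp


frame = build_gratuitous_arp("00:1a:2b:3c:4d:5e", "192.168.1.100")
print(len(frame))                    # 42
print(frame[28:32] == frame[38:42])  # True: sender IP equals target IP
```

The equality of the two IP fields is what marks the packet as gratuitous; everything else is an ordinary ARP request.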

io/thecodeforge/networking/gratuitous_arp_test.sh (Bash)

#!/bin/bash
# Demonstrating gratuitous ARP and testing cache update behavior

# Using arping to send gratuitous ARP (unsolicited)
# This tells the network that IP address 192.168.1.100 is at this interface's MAC

echo "=== Sending gratuitous ARP from this machine ==="
# -U means unsolicited (gratuitous)
# -c 3 sends 3 packets to ensure delivery
# -I specifies the network interface (eth0, wlan0, etc.)
arping -U -c 3 -I eth0 192.168.1.100

echo ""
echo "=== Refreshing this host's own ARP entry via a request ==="
# Alternative: send a normal request for the IP (as if we're looking for it).
# The current owner replies and this host learns its MAC. Note that this
# refreshes only the cache of the machine running the command.
arping -c 2 -I eth0 192.168.1.100

echo ""
echo "=== Check current ARP cache entry for the IP ==="
arp -a 192.168.1.100

echo ""
echo "=== To test GARP acceptance on another machine ==="
# On a Linux destination, check if GARP updated cache:
# sudo tcpdump -i eth0 arp
# ip neigh show
# The sysctl 'arp_accept' controls whether unsolicited ARP updates are accepted.

Linux arp_accept Controls GARP Behavior

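sysctl -w net.ipv4.conf.eth0.arp_accept=1 lets Linux create new neighbor entries from gratuitous ARP. The default is 0, which keeps unknown IPs out of the table; per the kernel documentation, entries that already exist are updated either way. Many distributions leave it at 0 for security. Always test whether your GARP is actually being accepted in your environment.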

Production Insight

Gratuitous ARP is not magic. Receivers differ in how they treat unsolicited updates: Linux with the default arp_accept=0 will not create a new entry from one, and some devices ignore them entirely.

Failover scripts often fail because they assume GARP works everywhere. Test with packet capture.

Rule: For critical failover, use VRRP/CARP (which includes GARP but also MAC address takeover). Or move failover to layer 3 with BFD (Bidirectional Forwarding Detection) on routed interfaces, so recovery does not depend on ARP caches.

Key Takeaway

Gratuitous ARP is a hint, not a guarantee. OS and switch implementations vary wildly in whether they accept unsolicited updates.

For reliable failover, use VRRP/CARP (MAC takeover) or layer-3 health checks + routing.

Rule: Test your failover with packet capture. If you see GARP sent but traffic still goes to the old MAC, the receivers are not applying the update and are waiting for their cache entries to expire.

Ensuring Fast Failover with ARP

If: Virtual IP failover (keepalived, heartbeat)
→ Use keepalived with VRRP. With a VRRP virtual MAC (in keepalived this requires the use_vmac option), the MAC moves with the VIP, so client ARP entries stay valid. VRRP between routers is standard.

If: Switches must update MAC table quickly on failover
→ Reduce the MAC address aging time on the switch from the default 300s to 30s. Apply globally or per VLAN.

If: Clients (Linux) need faster ARP expiry
→ Set net.ipv4.neigh.default.gc_stale_time = 30 and lower net.ipv4.neigh.default.base_reachable_time_ms so entries are re-verified sooner. (proxy_qlen only sizes the proxy-ARP queue and does not affect expiry.)

If: Gratuitous ARP completely unsupported on network
→ Use layer-3 solutions: BFD + ECMP with health checks. Or use a cloud load balancer (AWS NLB), which handles failover at the proxy level.

If: Need sub-second failover on local network
→ Use MAC takeover (VRRP), not ARP updates. VRRP advertisements are multicast frames sourced from the virtual MAC, so the switch learns the new port immediately via normal MAC learning.

ARP Cache Internals: Aging, GC and Production Tuning

The ARP cache is a simple key-value store: IP → MAC. But its behavior is governed by several timers and thresholds that directly impact production reliability.

Key Linux sysctl parameters (under net.ipv4.neigh.default):
- gc_stale_time (default 60s): how long an entry can be stale before it's considered for garbage collection. A stale entry means the MAC hasn't been verified recently, but the entry still exists.
- gc_thresh1 (default 128): if the cache has fewer entries than this, GC doesn't run.
- gc_thresh2 (default 512): soft limit. If the cache exceeds this, GC runs more aggressively.
- gc_thresh3 (default 1024): hard limit. Once reached, new ARP resolutions fail with "neighbour table overflow".
- base_reachable_time (default 30s): base time for an entry to be considered reachable. The actual reachable time is randomized, uniformly between half and one and a half times the base (15 to 45 seconds by default).
- delay_first_probe_time (default 5s): time to wait before the first probe after traffic is sent to a stale entry.

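Windows ARP cache: netsh interface ip delete arpcache flushes it. Windows (Vista and later) does not use a single fixed timeout; it follows the neighbor unreachability detection (NUD) model, where an entry stays reachable for a randomized period derived from a base reachable time and then goes stale until it is re-verified. Figures such as 60 or 300 seconds are often quoted, so measure the behavior on your own Windows version rather than assuming one.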

Switch MAC aging: Layer-2 switches have a MAC address table that maps MACs to ports. The default aging time is often 300 seconds. MAC learning happens on every frame, based on its source MAC: if a frame arrives on a different port than the current entry, the switch updates the entry immediately (a MAC move). A GARP triggers this like any other frame. However, the ARP cache on a layer-3 switch is separate from the MAC table and may not update from GARP.
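The MAC-learning behaviour is easy to model. The following is a toy sketch of a layer-2 MAC table with aging, assuming a 300-second aging time; the `MacTable` class and the port numbers are hypothetical and do not represent any vendor's implementation.

```python
class MacTable:
    """Toy model of a layer-2 switch MAC table with aging."""

    def __init__(self, aging_seconds=300):
        self.aging_seconds = aging_seconds
        self._entries = {}  # mac -> (port, last_seen)

    def learn(self, src_mac, port, now):
        # Every received frame refreshes or moves the entry for its source MAC.
        self._entries[src_mac] = (port, now)

    def lookup(self, dst_mac, now):
        """Return the egress port, or None if unknown (the switch would flood)."""
        entry = self._entries.get(dst_mac)
        if entry is None:
            return None
        port, last_seen = entry
        if now - last_seen > self.aging_seconds:
            del self._entries[dst_mac]  # entry aged out
            return None
        return port


table = MacTable(aging_seconds=300)
vmac = "00:00:5e:00:01:01"
table.learn(vmac, port=1, now=0)     # primary sends from port 1
print(table.lookup(vmac, now=10))    # 1
table.learn(vmac, port=2, now=11)    # standby sends a frame (e.g. GARP) from port 2
print(table.lookup(vmac, now=12))    # 2
print(table.lookup(vmac, now=400))   # None: aged out, frames would be flooded
```

The move to port 2 only happens because the frame's source MAC is the same one clients already use. If the standby sends from a different MAC, the table simply gains a second entry and clients must still update their ARP caches.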

Tuning for HA:
- Reduce gc_stale_time to 15-30 seconds for faster failover.
- Increase gc_thresh3 if you have many neighbors (e.g., container hosts).
- Set arp_accept=1 if you trust GARP from your failover script.
- Always test: send GARP and verify the cache update on the target with ip neigh show.

ARP Cache as a Phonebook

Entries have a 'reachable' state and 'stale' state. Stale entries are still usable but need verification.
GC runs periodically to purge entries that haven't been used. Not all stale entries are removed immediately.
gc_stale_time sets how long an entry can stay stale before GC considers it for deletion.
gc_thresh1/2/3 set the watermarks for GC aggression. Overflow causes ARP failures.
Tuning is a trade-off: faster failover vs more ARP broadcasts.
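To see these states on a live host, the output of `ip neigh show` can be summarised with a small parser. This is a sketch that assumes the usual iproute2 line layout (address first, MAC after the `lladdr` keyword, state as the last token); the sample text is illustrative, not captured from a real machine.

```python
from collections import Counter


def parse_ip_neigh(text):
    """Parse `ip neigh show` output into dicts with ip, mac and state."""
    entries = []
    for line in text.splitlines():
        tokens = line.split()
        if len(tokens) < 2:
            continue  # skip blank or malformed lines
        mac = None
        if "lladdr" in tokens:
            idx = tokens.index("lladdr")
            if idx + 1 < len(tokens):
                mac = tokens[idx + 1]
        entries.append({"ip": tokens[0], "mac": mac, "state": tokens[-1]})
    return entries


SAMPLE = """\
192.168.1.1 dev eth0 lladdr 00:11:22:33:44:55 REACHABLE
192.168.1.42 dev eth0 lladdr 66:77:88:99:aa:bb STALE
192.168.1.77 dev eth0 FAILED
"""

entries = parse_ip_neigh(SAMPLE)
print(Counter(e["state"] for e in entries))
print([e["ip"] for e in entries if e["mac"] is None])  # unresolved neighbours
```

On a real host the input would come from something like `subprocess.run(["ip", "neigh", "show"], capture_output=True, text=True).stdout`.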

Production Insight

Stale ARP entries cause silent blackholing. The source sends packets to a MAC that no longer exists.

Cache overflow (gc_thresh3) causes 'neighbour table overflow' errors. New connections fail, and the only trace is a kernel log message.

Rule: Monitor cat /proc/net/stat/arp_cache for table fullness. Increase gc_thresh3 if you have >1000 neighbours.
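The values in /proc/net/stat/arp_cache are hexadecimal, with one row per CPU, which makes the raw file hard to read. Below is a sketch of a parser that sums the per-CPU counters. The sample uses only a few of the real columns for brevity (the parser reads whatever header is present), and it assumes that `entries` is a table-wide value repeated on every row rather than a per-CPU counter.

```python
def parse_arp_cache_stats(text):
    """Sum the per-CPU hex counters from /proc/net/stat/arp_cache."""
    lines = text.strip().splitlines()
    columns = lines[0].split()
    rows = [[int(value, 16) for value in line.split()] for line in lines[1:]]
    totals = dict.fromkeys(columns, 0)
    for row in rows:
        for name, value in zip(columns, row):
            totals[name] += value
    # 'entries' is the current table size, repeated per CPU, so do not sum it.
    if rows and "entries" in columns:
        totals["entries"] = rows[0][columns.index("entries")]
    return totals


SAMPLE = """\
entries allocs destroys table_fulls
0000000a 00000010 00000006 00000000
0000000a 00000004 00000002 00000001
"""

stats = parse_arp_cache_stats(SAMPLE)
print(stats["entries"])      # 10
print(stats["table_fulls"])  # 1
```

A table_fulls counter that keeps growing, or an entries value close to gc_thresh3, is the signal to raise the thresholds.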

Key Takeaway

ARP cache tuning is a direct trade-off between failover speed and broadcast overhead.

Monitor gc_thresh to avoid silent failures. Reduce gc_stale_time for HA.

Rule: Tune cache aggressively for failover environments; test with packet capture to verify GARP acceptance.

ARP Cache Tuning Decisions

If: Failover must complete within 5 seconds
→ Reduce gc_stale_time to 15s, set arp_accept=1, reduce switch MAC aging to 30s. Use VRRP for sub-second.

If: Host has many neighbors (e.g., Docker host with hundreds of containers)
→ Increase gc_thresh3 to 4096 or higher. Monitor cache usage. Lower neigh/default/gc_interval if GC needs to run more often.

If: Random ARP failures in logs: 'neighbour table overflow'
→ gc_thresh3 is too low. Increase it. Also consider reducing gc_stale_time to flush stale entries faster.

If: ARP requests are flooding the network (broadcast storm)
→ gc_stale_time might be too low, causing frequent re-resolutions. Increase it to reduce broadcasts, but balance with failover needs.

If: Migrating a VM with same IP to new host (live migration)
→ After migration, send GARP from the new host. Recipients that ignore GARP will not update their cache until the next resolution (could be minutes). Set arp_accept=1 on critical clients, or send the gratuitous ARP in request mode (arping -U).

Proxy ARP and ARP in Virtualized/Cloud Environments

Proxy ARP is a technique where a device (usually a router) answers ARP requests on behalf of another host. It's used in scenarios like VPNs, virtual IPs, and transparent bridging. The router sees an ARP request for an IP that belongs to a host behind it, and it replies with its own MAC address. This tricks the sender into forwarding traffic to the router, which then forwards the packet to the real destination.

When to use Proxy ARP:
- VPN clients on a subnet need to appear as local hosts.
- Load balancers that proxy connections to backend servers.
- Routed container networking where the host answers ARP for container IPs.

Production pitfalls:
- Proxy ARP can cause routing loops if misconfigured. The router answers for an IP that is on the same subnet but behind itself, leading to a cycle.
- It hides the true topology, making debugging harder.
- Many security teams disable proxy ARP to prevent spoofing.

ARP in cloud environments (AWS, GCP, Azure):
- Cloud providers use Software-Defined Networking (SDN) in place of a flat layer-2 segment. An instance's ARP requests are not flooded to other instances; the virtual network layer resolves addresses itself.
- The hypervisor handles MAC-to-IP mapping. The MACs you see in arp -a are virtual MACs assigned by the cloud controller.
- Gratuitous ARP is ignored. Failover must use cloud-specific mechanisms: health checks, load balancers, Elastic IPs (AWS), etc.
- In AWS, if you move an Elastic IP to another instance, the network mapping updates in seconds, but it's not ARP-based. It's a control plane update.
- Key rule: In the cloud, do not assume ARP behaves the way it does on-prem. It doesn't work the same way.

io/thecodeforge/networking/proxy_arp_config.sh (Bash)

#!/bin/bash
# Configure Proxy ARP on Linux

echo "=== Enable Proxy ARP on an interface ==="
sudo sysctl -w net.ipv4.conf.eth0.proxy_arp=1

echo ""
echo "=== Add a proxy ARP entry for a remote IP ==="
# Respond to ARP requests for 10.0.0.5 as if it's on the local subnet
sudo ip neigh add proxy 10.0.0.5 dev eth0

echo ""
echo "=== Verify proxy entries ==="
ip neigh show proxy

echo ""
echo "=== Check if proxying is active (from another host) ==="
# arping -I eth0 10.0.0.5
# If proxy works, this returns a reply from the router's MAC.

Cloud Tip: Forget ARP in the Cloud

AWS, GCP, and Azure replace ARP with SDN mappings. Migrating on-prem HA scripts that rely on GARP to the cloud will fail. Use cloud-native health checks and load balancers for failover.

Production Insight

Proxy ARP can solve reachability issues but introduces routing loops if misconfigured. Always verify with traceroute.

In cloud environments, ARP does not work as expected. GARP is ignored, cache entries are virtual.

Rule: Use proxy ARP sparingly. Prefer routed solutions (BGP, OSPF) or cloud-native abstractions.

Key Takeaway

Proxy ARP bridges subnets but adds complexity and security risk. Prefer routed solutions.

In the cloud, ARP is replaced by SDN. Forget ARP-based failover.

Rule: When moving on-prem HA to cloud, redesign the failover layer — do not port ARP scripts.

Should You Use Proxy ARP?

If: You need to make a remote subnet appear local (VPN)
→ Proxy ARP can work, but consider a route-based VPN (tun) or VXLAN for a cleaner abstraction. Proxy ARP adds debugging complexity.

If: Load balancer with direct server return (DSR)
→ Proxy ARP is sometimes used to make real servers appear to have the VIP. Better: use LVS with DSR and set arp_ignore/arp_announce so the real servers do not answer ARP for the VIP.

If: Cloud instance (AWS, GCP)
→ Do not use Proxy ARP. It will not work as expected. Use a cloud load balancer (ALB/NLB), health checks, and autoscaling.

If: Home lab / small network
→ Works fine for testing. Enable proxy_arp on the gateway router. Monitor for loops.

How ARP Actually Resolves an IP — The Two-Message Dance

When your machine needs to send a packet to 192.168.1.105, it checks the ARP cache. No entry? It broadcasts an ARP Request: "Who has 192.168.1.105? Tell 192.168.1.10" (sent from MAC 00:1A:2B:3C:4D:5E). Every host on the broadcast domain sees this. The owner of that IP unicasts back an ARP Reply containing its MAC address. Your kernel then inserts the mapping into the ARP cache and sends the frame. This entire exchange happens before a single TCP handshake or UDP datagram leaves the interface.

If the request or reply is lost on a congested network, the packet waits until a retry succeeds or resolution fails. Because the request is a broadcast, every host on the subnet has to receive it, process it, and (unless it owns the IP) drop it. In a /24 with 250 hosts, heavy ARP traffic can spike CPU on older switches. Design your network so that ARP traffic is the exception, not the heartbeat.

arp_cache_lookup.c (C)

// io.thecodeforge
#include <stdio.h>
#include <string.h>

struct arp_entry {
    unsigned char ip[4];
    unsigned char mac[6];
    int valid;
};

// Linear scan of the cache; neither the cache nor the target IP is modified.
int arp_lookup(const struct arp_entry *cache, int size, const unsigned char *target_ip) {
    for (int i = 0; i < size; i++) {
        if (cache[i].valid &&
            memcmp(cache[i].ip, target_ip, 4) == 0) {
            return i;
        }
    }
    return -1; // Cache miss; must broadcast ARP request
}

Output

Returns the index of the cached entry, or -1 to signal that an ARP request must be broadcast.

Production Trap:

Do not rely on the ARP cache alone for security. A malicious host can answer alongside the real owner, and whether the first or the most recent reply wins depends on the kernel, so ARP poisoning often succeeds.

Key Takeaway

The first packet to a new local host triggers an ARP broadcast. That broadcast is the weakest link in L2 security.

ARP Message Format — What the Wire Actually Carries

The ARP packet is tiny: 28 bytes for IPv4 over Ethernet. It sits inside a Layer 2 frame with EtherType 0x0806. The format is dead simple: hardware type (1 for Ethernet), protocol type (0x0800 for IPv4), hardware size (6 for MAC), protocol size (4 for IPv4), and an opcode (1 for request, 2 for reply). Then come four addresses: sender MAC, sender IP, target MAC (zeroed in requests), target IP. That's it. No options. No checksum of its own. No validation. Why does this matter? Because a forged ARP packet with a spoofed sender MAC can corrupt your cache, and nothing in the protocol can reject it. When debugging "intermittent connectivity" on a VLAN, dump the ARP frames with tcpdump -i eth0 arp. If you see the same IP announced with different MACs, you've found your gremlin.

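arp_packet_parse.py (Python)

# io.thecodeforge
import struct

OPCODES = {1: 'REQUEST', 2: 'REPLY'}

def parse_arp(packet):
    """Parse an Ethernet frame carrying an IPv4-over-Ethernet ARP packet."""
    if len(packet) < 42:
        raise ValueError('frame too short for Ethernet + ARP (need 42 bytes)')
    # Skip Ethernet header (14 bytes); the ARP payload is the next 28 bytes
    arp = packet[14:42]
    htype, ptype, hlen, plen, opcode = struct.unpack('!HHBBH', arp[:8])
    sender_mac = ':'.join(f'{b:02x}' for b in arp[8:14])
    sender_ip = '.'.join(str(b) for b in arp[14:18])
    target_mac = ':'.join(f'{b:02x}' for b in arp[18:24])
    target_ip = '.'.join(str(b) for b in arp[24:28])
    return {
        # Opcodes other than 1 and 2 are reported rather than mislabelled
        'opcode': OPCODES.get(opcode, f'UNKNOWN({opcode})'),
        'sender': f'{sender_ip} @ {sender_mac}',
        'target': f'{target_ip} @ {target_mac}'
    }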

Output

{'opcode': 'REQUEST', 'sender': '192.168.1.10 @ 00:1a:2b:3c:4d:5e', 'target': '192.168.1.105 @ 00:00:00:00:00:00'}

Know Your Wire:

ARP has no authentication. Any host can forge a reply. In cloud VPCs, the virtual network layer typically answers ARP itself, so what you capture there will not look like a physical LAN.

Key Takeaway

28 bytes. No authentication. No checksum. That simplicity is why spoofing works.

Production ARP Attacks You Will Actually See

Users don't complain about "ARP cache poisoning." They complain about "random disconnects" and "my bank login page shows a certificate warning." Here are three ARP problems that firewalls will not stop.

First, ARP spoofing: the attacker sends forged ARP replies associating the gateway's IP with the attacker's own MAC. Traffic flows through the attacker, allowing MITM sniffing. Second, ARP denial: the attacker floods the network with fake MAC claims for a target IP. Other hosts' caches flip to an invalid MAC for that IP, dropping traffic until the entry is corrected or times out. Third, gratuitous ARP floods: a misconfigured VM or container announces its MAC thousands of times per second. Every host on the segment has to process each broadcast, and switches that inspect ARP punt the packets to their CPU.

In production, mitigate with Dynamic ARP Inspection (DAI) on managed switches. DAI intercepts ARP packets and validates them against a trusted DHCP snooping database. Without it, any host on the segment can claim any IP.

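arp_spoof_detect.sh (Bash)

#!/bin/bash
# io.thecodeforge
# Detect ARP spoofing by polling the gateway's cached MAC address.
# For continuous monitoring, arpwatch is the more robust option.

GATEWAY="192.168.1.1"
KNOWN_MAC="00:11:22:33:44:55"

while true; do
  # Extract the MAC that follows the "lladdr" keyword in `ip neigh` output
  CUR_MAC=$(ip -4 neigh show "$GATEWAY" | awk '{for (i = 1; i < NF; i++) if ($i == "lladdr") print $(i + 1)}' | head -n 1)
  # Skip the comparison when the gateway has no resolved entry (empty result)
  if [ -n "$CUR_MAC" ] && [ "$CUR_MAC" != "$KNOWN_MAC" ]; then
    echo "ALERT: Gateway MAC changed from $KNOWN_MAC to $CUR_MAC" | logger -t arp_watch
  fi
  sleep 5
done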

Output

ALERT: Gateway MAC changed from 00:11:22:33:44:55 to aa:bb:cc:dd:ee:ff

MITM Is Too Easy:

Any user on the same L2 segment can ARP spoof your gateway. Segment untrusted devices into separate VLANs with ACLs. Enable port security to limit MACs per port.

Key Takeaway

ARP spoofing is trivial, devastating, and completely undetectable to the application layer. Trust nothing on L2.

● Production incident · Post-mortem · Severity: high

The 7-Minute Failover That Cost $400k

Symptom

Primary server crashed. Standby detected failure (3 seconds) and sent gratuitous ARP (GARP) for the VIP. But clients and the switch still sent traffic to the dead primary MAC for 7 minutes. Manual ARP cache clearing on the switch fixed it immediately, confirming the root cause.

Assumption

The team assumed gratuitous ARP was a magic 'flush cache' command. They didn't know most switches and OSes only update cache if the new MAC arrives in response to a request (unsolicited GARP is often ignored). They also assumed ARP cache expiry was seconds, not minutes.

Root cause

The switch's ARP timeout was 300 seconds (default on many devices). When the standby sent a GARP, the switch logged it but did NOT replace the existing cache entry because the entry hadn't expired. The cache still mapped the VIP to the primary's MAC. Clients that had already resolved the VIP also held it in their cache (default Linux expiry 60 seconds, Windows 300 seconds). The team had no mechanism to clear caches remotely. Failover required waiting for all caches to expire naturally — up to 5 minutes on clients, plus switch TTL.

Fix

Reduced ARP cache expiry on the switch from 300 to 30 seconds. Reduced client cache TTL via router advertisements (for stateless) or DHCP option. On failover, a script now runs arping -U -I eth0 -c 3 <VIP> (unsolicited ARP); some OSes only accept this with the arp_accept=1 sysctl. Implemented link-layer failover with VRRP (which sends multicast advertisements and GARP with the proper MAC). Used send_arp in keepalived to emit unsolicited ARPs. After tuning, failover dropped to 3 seconds.

Key lesson

Gratuitous ARP is a hint, not a command. Switches and clients ignore it unless configured to accept unsolicited updates. Never rely on it as your only failover mechanism.
Always tune ARP cache timeout for your failover requirement. 300 seconds (default switch) is too long for HA. 30-60 seconds is safer; use VRRP or BFD for sub-second.
Test failover with packet capture. Look for ARP requests/replies during transition. If you see GARP but traffic still goes to wrong MAC, cache is the culprit.
In cloud environments (AWS, GCP), ARP is disabled or replaced with SDN forwarding rules. Use health checks and load balancers, not VIPs with ARP.
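
The tuning rule in the lessons above can be checked mechanically. The following is a minimal sketch, assuming the cache lifetimes quoted in this post-mortem and a hypothetical 30-second failover budget; the function name and the cache labels are illustrative and do not come from any real tool.

```python
def caches_over_budget(cache_ttls_s, budget_s):
    """Return (name, ttl) pairs whose expiry exceeds the budget, worst first."""
    over = {name: ttl for name, ttl in cache_ttls_s.items() if ttl > budget_s}
    return sorted(over.items(), key=lambda item: item[1], reverse=True)


# Lifetimes quoted in the post-mortem (seconds); labels are hypothetical.
ttls = {"switch": 300, "linux_client": 60, "windows_client": 300}

for name, ttl in caches_over_budget(ttls, budget_s=30):
    print(f"{name}: {ttl}s cache lifetime exceeds the 30s failover budget")
```

Any cache that appears in the result can hold the old MAC for longer than the failover target, so it needs either a shorter timeout or an explicit refresh step.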

Production debug guide

Symptom → Action mapping for common ARP-related failures (6 entries)

Symptom · 01

Traffic destined to an IP is blackholed — incoming packets stop arriving

→

Fix

Check the ARP cache on the sender: arp -a | grep <destination_IP>. An incomplete entry means resolution failed; a wrong MAC means the entry is stale. Clear it with ip neigh flush dev eth0 or arp -d <IP>, then run tcpdump -i eth0 arp to confirm that requests go out and replies come back.

Symptom · 02

Slow failover (tens of seconds) despite VIP moving quickly

→

Fix

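Check the ARP cache timeout on clients and switches. The default on many devices is 60-300 seconds. Reduce the ARP timeout on the switch or router interface (on Cisco IOS, arp timeout <seconds> in interface configuration). Note that mac address-table aging-time tunes the separate MAC address table, not the ARP cache. Use arping -U from the new owner. For Linux clients, set net.ipv4.neigh.default.gc_stale_time = 30.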

Symptom · 03

ARP spoofing warning in IDS or security scan

→

Fix

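Use arpwatch or arp-scan to detect anomalies. During an attack, arp -a may show the gateway IP with an unexpected MAC, or one MAC behind several IPs. Mitigate with port security and Dynamic ARP Inspection on switches, plus static ARP entries for critical IPs. On Linux, keep arp_accept=0 so gratuitous ARP cannot create new cache entries; arp_filter and arp_ignore only control how the host answers ARP requests, not what it accepts. For high-security segments, use static ARP entries or route at layer 3 instead of sharing a layer-2 domain.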

Symptom · 04

Intermittent connectivity — some pings work, some fail

→

Fix

This is ARP cache flapping: two devices claim the same IP (an IP conflict). Run arp -a | grep <IP> several times and watch the MAC alternate between two values. Use arping -D -I eth0 <IP> to detect the duplicate address. Fix by renumbering one of the hosts or shutting down the rogue device.
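
Flapping is easy to miss by eye because each lookup shows only one MAC. The following is a minimal sketch of the check, assuming you have already collected (IP, MAC) pairs by polling the ARP cache over time; the sample observations and the function name are hypothetical.

```python
from collections import defaultdict


def find_flapping_ips(observations):
    """Return {ip: [macs in first-seen order]} for IPs seen with more than one MAC."""
    seen = defaultdict(list)
    for ip, mac in observations:
        mac = mac.lower()  # normalize case so AA:BB... and aa:bb... match
        if mac not in seen[ip]:
            seen[ip].append(mac)
    return {ip: macs for ip, macs in seen.items() if len(macs) > 1}


# Hypothetical samples taken a few seconds apart from the same host.
samples = [
    ("192.168.1.50", "00:11:22:33:44:55"),
    ("192.168.1.1", "de:ad:be:ef:00:01"),
    ("192.168.1.50", "AA:BB:CC:DD:EE:FF"),
    ("192.168.1.50", "00:11:22:33:44:55"),
]

print(find_flapping_ips(samples))
```

An IP that shows up in the result has answered from two different MACs, which points to either an address conflict or spoofing; arping -D then confirms which.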

Symptom · 05

No ARP replies — 'who-has' requests sent, no response

→

Fix

Target IP not reachable at layer 2 (different subnet) or the target host is down. Check subnet mask: if destination IP is outside local network, ARP is not used (packet goes to gateway instead). Run tcpdump -i eth0 arp and host <target_IP>. Ensure both machines are on same VLAN/physical segment.

Symptom · 06

VIP reachable from some clients but not others

→

Fix

Clients with stale ARP caches still point to the old MAC. Check arp -a on the affected clients, then clear the entry to force re-resolution: arp -d <VIP>. If GARP was sent but ignored, check arp_accept on Linux clients.

★ ARP Quick Debug Cheat Sheet

Fast diagnostics for network layer-2 mapping issues. Run these commands before touching routing.

Can't ping a local IP, but IP is up

Immediate action

Check ARP entry for that IP

Commands

arp -a | grep -i <destination_IP>

ip neigh show | grep <destination_IP>

Fix now

If entry is incomplete or stale: arp -d <IP> or ip neigh del <IP> dev eth0. Then retry ping to refresh cache.
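
When the same check has to run across many hosts, the output of ip neigh show can be parsed instead of read by eye. The following is a minimal sketch, assuming the usual one-entry-per-line layout with the state as the last word; the sample text, addresses, and function names are hypothetical. It flags only missing, INCOMPLETE, and FAILED entries, on the assumption that Linux re-verifies a STALE entry the next time it is used.

```python
def parse_ip_neigh(output: str) -> dict:
    """Map each IP in `ip neigh show` output to a (mac_or_None, state) tuple."""
    entries = {}
    for line in output.splitlines():
        fields = line.split()
        if not fields:
            continue
        ip, state = fields[0], fields[-1]
        mac = None
        if "lladdr" in fields:
            mac = fields[fields.index("lladdr") + 1]
        entries[ip] = (mac, state)
    return entries


def needs_refresh(entries: dict, ip: str) -> bool:
    """True when the entry is missing, unresolved, or failed."""
    mac, state = entries.get(ip, (None, "MISSING"))
    return mac is None or state in {"INCOMPLETE", "FAILED"}


# Hypothetical capture of `ip neigh show`.
SAMPLE = """\
192.168.1.1 dev eth0 lladdr 00:11:22:33:44:55 REACHABLE
192.168.1.20 dev eth0 lladdr aa:bb:cc:dd:ee:ff STALE
192.168.1.30 dev eth0 FAILED
"""

entries = parse_ip_neigh(SAMPLE)
for ip in ("192.168.1.1", "192.168.1.30", "192.168.1.99"):
    print(ip, "needs refresh" if needs_refresh(entries, ip) else "ok")
```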

Suspect ARP cache poisoning (spoofing)

Failover from primary to standby takes too long

ARP requests sent but no replies

Duplicate IP address detected — intermittent connectivity

ARP vs NDP vs Inverse ARP vs RARP

Protocol	Spec	What it does	Security	Used in	Key difference
ARP (IPv4)	RFC 826	Maps IPv4 address to MAC address on Ethernet/Wi-Fi	No security; spoofing trivial	IPv4 LANs everywhere	Broadcast request: `who-has?`
NDP (IPv6)	RFC 4861	Address resolution, router discovery, prefix discovery	SeND (Secure Neighbor Discovery) optional	IPv6 networks	Uses ICMPv6 multicast, not broadcast
Inverse ARP	RFC 2390	Maps DLCI (frame relay) to IP address	None (legacy)	Frame relay networks (obsolete)	Request asks 'what IP is at DLCI X?'
RARP	RFC 903	Reverse ARP: MAC to IP (for diskless boot)	None (obsolete)	Historical (replaced by DHCP)	RARP server responds with IP for given MAC

Key takeaways

ARP maps IP addresses to MAC addresses on local Ethernet networks. It is required because hardware only understands MACs.

ARP cache entries live for minutes (default 60-300 seconds). That cache is the biggest cause of slow failover; tune expiry times aggressively for HA.

ARP has no authentication, so spoofing is trivial. Mitigate with Dynamic ARP Inspection on switches, static ARP for critical IPs, and end-to-end encryption.

Gratuitous ARP is not a reliable way to update caches. Many systems ignore unsolicited ARP updates (arp_accept=0 on Linux). Always test with packet capture.

In cloud environments, ARP is replaced by SDN. Use cloud-native failover mechanisms, not ARP.

Common mistakes to avoid

6 patterns

Assuming ARP works for cross-subnet communication

Symptom

arp -a shows incomplete or no entry for remote IP; pings fail despite correct routes.

Fix

ARP only resolves IPs on the same subnet. For remote IPs, packets go to default gateway MAC, not the final destination. Check subnet mask on both sides.
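
The on-link decision is a plain prefix match, which is why a wrong subnet mask changes which address gets ARPed. The following is a minimal sketch using Python's standard ipaddress module; the function name and all addresses are hypothetical.

```python
import ipaddress


def arp_target(local_interface: str, destination: str, gateway: str) -> str:
    """Return the IP the sender will ARP for when sending to `destination`."""
    network = ipaddress.ip_interface(local_interface).network
    if ipaddress.ip_address(destination) in network:
        return destination  # on-link: ARP for the destination itself
    return gateway  # off-link: ARP for the default gateway instead


# Same destination, two different masks on the sender.
print(arp_target("192.168.1.10/24", "192.168.1.200", "192.168.1.1"))
print(arp_target("192.168.1.10/25", "192.168.1.200", "192.168.1.1"))
```

With the /24 mask the sender ARPs for the destination directly; with the /25 mask the same destination falls outside the local prefix, so the sender ARPs for the gateway and never asks for the destination's MAC.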

Leaving ARP cache timeout too long for failover environment

Symptom

VIP failover takes 3-5 minutes despite instant detection. GARP sent but ignored.

Fix

Reduce gc_stale_time on Linux clients (default 60s). Reduce switch MAC aging time from 300s to 30s. For Linux, set arp_accept=1 if unsolicited ARP is safe in your environment.

Trusting ARP for security — no additional controls

Symptom

ARP spoofing attack detected; traffic intercepted; no alarms.

Fix

Enable Dynamic ARP Inspection (DAI) on switches if supported. Use static ARP for critical IPs. Monitor with arpwatch. Encrypt traffic end-to-end (TLS, IPsec) as defense-in-depth.

Forgetting to clear ARP cache after moving IP to new MAC

Symptom

IP migrated to new machine but some clients still send packets to old MAC.

Fix

Send GARP from the new owner AND proactively flush the cache on critical clients: ip neigh del <IP> dev eth0 on Linux or arp -d <IP> on Windows. On the switch, clear the mac address-table for that VLAN.

Using `arping -U` without testing acceptance on target OS

Symptom

Gratuitous ARP sent but arp -a on receiver doesn't update. Failover scripts think update succeeded.

Fix

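Check the arp_accept sysctl on Linux receivers (sysctl net.ipv4.conf.eth0.arp_accept). With 0 (the default), the receiver does not create new cache entries from unsolicited ARP. Set it to 1 where that is safe, or send ordinary ARP requests from the VIP to each critical client instead of relying on -U (gratuitous), for example arping -c 3 -s <VIP> <client_IP> with iputils arping, since a host that is the target of a request records the sender's mapping. Either way, have the failover script confirm with arp -a on a receiver that the MAC actually changed before it reports success.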
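
A failover pre-flight check can read the same setting from procfs rather than shelling out to sysctl. The following is a minimal sketch, assuming a Linux host where net.ipv4.conf.<interface>.arp_accept is exposed under /proc/sys; the function name and the proc_root parameter (there only so the function can be pointed at a test directory) are hypothetical.

```python
from pathlib import Path


def read_arp_accept(interface: str, proc_root: str = "/proc/sys") -> int:
    """Read net.ipv4.conf.<interface>.arp_accept and return it as an int."""
    path = Path(proc_root) / "net/ipv4/conf" / interface / "arp_accept"
    return int(path.read_text().strip())
```

On a receiver, read_arp_accept("eth0") returning 0 tells the failover script not to count on gratuitous ARP alone for that host. The call raises FileNotFoundError if the interface does not exist, which is usually the behavior you want in a pre-flight check.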

Assuming ARP works in cloud as it does on-prem

Symptom

Failover script using GARP doesn't work in AWS; traffic still goes to old instance.

Fix

ARP is replaced by SDN in the cloud. Use cloud-native mechanisms: AWS Elastic IP reassignment, load balancer target group changes, or health check-based routing.

INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

Q01 · SENIOR

Explain the difference between ARP request, ARP reply, and gratuitous AR...

Q02 · SENIOR

How does ARP spoofing work, and how can you detect it on a running netwo...

Q03 · SENIOR

What is the ARP cache, and why does it cause slow failover in high-avail...

Q04 · SENIOR

What is Proxy ARP and when would you use it?

Q05 · SENIOR

How does ARP differ in IPv6 compared to IPv4?

Q01 of 05 · SENIOR

Explain the difference between ARP request, ARP reply, and gratuitous ARP. When is gratuitous ARP used?

ANSWER

ARP request: a broadcast packet (who-has 192.168.1.1?) sent to find the MAC address for an IP. ARP reply: a unicast response from the owner of that IP, containing its MAC address. Gratuitous ARP (GARP): an unsolicited ARP packet, usually broadcast, in which the sender sets both the sender and target IP fields to its own address, announcing 'this IP is now at this MAC'. GARP is used for: (1) HA failover, where the new owner of a virtual IP announces itself; (2) duplicate address detection, where a node checks whether its IP is already in use; (3) MAC address changes after a NIC swap or VM migration. The catch: many network stacks ignore GARP unless configured otherwise (arp_accept=1 on Linux), so GARP alone is not reliable for failover without additional tuning.
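
To make the field layout concrete, the following is a minimal sketch that builds the bytes of a gratuitous ARP request using only the standard library. It constructs the frame and does not send it. The MAC and IP are hypothetical, and leaving the target hardware address as zeros is an assumption of this sketch; some implementations fill it with the broadcast address instead.

```python
import socket
import struct


def build_gratuitous_arp(mac: str, ip: str) -> bytes:
    """Build a 42-byte Ethernet frame carrying a gratuitous ARP request."""
    src_mac = bytes.fromhex(mac.replace(":", ""))
    ip_bytes = socket.inet_aton(ip)
    broadcast = b"\xff" * 6

    # Ethernet header: destination MAC, source MAC, EtherType 0x0806 (ARP).
    ethernet = broadcast + src_mac + struct.pack("!H", 0x0806)

    # ARP header: hardware type 1 (Ethernet), protocol type 0x0800 (IPv4),
    # hardware length 6, protocol length 4, opcode 1 (request).
    arp_header = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)

    # Sender and target protocol addresses are both the announced IP.
    # Target hardware address is zeros here (an assumption of this sketch).
    arp_body = src_mac + ip_bytes + b"\x00" * 6 + ip_bytes
    return ethernet + arp_header + arp_body


frame = build_gratuitous_arp("aa:bb:cc:dd:ee:ff", "192.168.1.10")
print(len(frame), frame.hex())
```

The only thing that distinguishes this from an ordinary request is that the sender IP and target IP are identical, which is also what to look for in a packet capture during failover.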

FAQ · 7 QUESTIONS

Frequently Asked Questions

Does ARP work across routers?

What is the difference between ARP and RARP?

Why does the ARP request use broadcast but the reply is unicast?

How can I flush the ARP cache on Linux and Windows?

What happens if two devices have the same IP address on a network?

Is ARP used in wireless networks (Wi-Fi)?

How do I check if my Linux machine accepts gratuitous ARP?

Naren Founder & Principal Engineer

20+ years shipping production systems from the metal up. Notes here come from systems that actually shipped.

Last updated: May 24, 2026
