URL Shortener Design — Why Auto-Increment Kills at Scale
Auto-increment locks dropped throughput from 1000/sec to 0 mid-campaign.
- A URL shortener maps a long URL to a short code and redirects clients via HTTP 301/302
- Hashing strategies: base62 encoding of unique IDs vs hash-then-collision-check
- Redirects are cheap: aim for <10ms total latency at P99
- Caching must handle hot keys: a single viral link can generate millions of requests per minute
- Biggest mistake: using a single database counter to generate IDs — single point of failure and bottleneck
Imagine every long book title in a library had a short call number stamped on its spine — '792.4 SHA' instead of 'The Complete Works of Shakespeare, Volume III'. A URL shortener does exactly that for web addresses. You hand it a massive, ugly link and it gives you back a tiny code — like a coat-check ticket — that it keeps pinned to the original address. When someone shows up with the ticket, the system finds the coat (the real URL) and sends them straight to it.
Every time you see a link like 'bit.ly/3xQp9R' in a tweet, a QR code, or an SMS campaign, a surprisingly complex distributed system is working behind the scenes. URL shorteners process billions of redirects per day, and companies like Bitly, TinyURL, and Twitter's t.co have quietly become some of the most read-heavy services on the internet — often handling tens of thousands of requests per second at peak. Getting this design wrong at scale doesn't just mean slow pages; it means broken marketing campaigns, dead QR codes on printed packaging, and lost revenue that can't be recovered.
The core problem sounds trivial: map a long string to a short one and reverse the mapping on demand. But that simplicity is deceptive. You need to generate short codes that are globally unique, store hundreds of millions of mappings efficiently, serve redirects in under 10 milliseconds, handle hot keys (a single viral link getting millions of hits per minute), expire links, support custom aliases, and survive datacenter failures — all simultaneously.
By the end of this article you'll have a production-grade mental model for a URL shortener: you'll know exactly how to generate collision-free short codes, why you should never put a counter in a single database row, how to layer caching to absorb viral traffic spikes, and what the interview panel is really testing when they ask you this question.
What Is a URL Shortener?
A URL shortener is a service that takes a long URL and returns a shorter, unique alias that redirects clients to the original URL. The typical flow: a client submits a long URL via an API, the service generates a short code (e.g., 'abc123'), stores the mapping in a database with optional metadata (creation time, expiration, owner), and returns the full short URL (e.g., 'https://short.url/abc123'). When a client requests that short URL, the service looks up the code, retrieves the original URL, and issues an HTTP redirect (301 for permanent, 302 for temporary). Analytics (clicks, referrers, timestamps) are usually logged asynchronously.
Short Code Generation — Hashing vs Counter-Based IDs
There are two dominant strategies for generating short codes. The first is hash-based: take the long URL, compute a hash (e.g., MD5 or SHA-256), take the first N characters (usually 6–8), check for collisions, and if one exists add a salt or retry with a different prefix. The second is ID-based: use a globally unique integer (from a distributed ID generator) and encode it in base62 (0-9, a-z, A-Z) to produce a compact alphanumeric string. Base62 encoding of a 64-bit integer yields up to 11 characters; typical shorteners use 6 to 7 characters, which gives between 62^6 ≈ 56.8 billion and 62^7 ≈ 3.5 trillion combinations.
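As a minimal sketch of the ID-based strategy, the functions below convert a non-negative integer to base62 and back. The names base62_encode and base62_decode are illustrative, and the alphabet order (digits, lowercase, uppercase) simply follows the order listed above; any fixed ordering works as long as encode and decode agree.

```python
import string

# Alphabet order follows the text: 0-9, a-z, A-Z (62 symbols in total).
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)


def base62_encode(n: int) -> str:
    """Encode a non-negative integer as a base62 string."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n > 0:
        n, rem = divmod(n, BASE)
        chars.append(ALPHABET[rem])
    # Digits were produced least-significant first, so reverse them.
    return "".join(reversed(chars))


def base62_decode(code: str) -> int:
    """Inverse of base62_encode; raises ValueError on unknown characters."""
    n = 0
    for ch in code:
        n = n * BASE + ALPHABET.index(ch)
    return n
```

Because the encoding is one-to-one, two different IDs can never map to the same code, and the largest 6-character code corresponds to the ID 62^6 - 1.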
ID-based systems are simpler for uniqueness (just generate a unique ID) but require a reliable ID generator. Hash-based systems must handle collisions and need longer codes to reach the same collision probability. Many production designs prefer ID-based generation with base62 encoding because the mapping from ID to code is deterministic and one-to-one, so unique IDs can never produce colliding codes.
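For contrast, here is a rough sketch of the hash-then-collision-check strategy. It assumes a plain dict as a stand-in for the database, SHA-256 as the hash, and a simple "#attempt" suffix as the salt; the function name short_code_from_hash and all parameters are hypothetical. Note that a hex prefix draws from only 16 symbols per character, so it collides sooner than a base62 code of the same length.

```python
import hashlib


def short_code_from_hash(long_url, store, length=7, max_attempts=5):
    """Derive a short code from a hash prefix, re-salting on collision.

    `store` is a dict standing in for the code -> URL table (assumption).
    """
    for attempt in range(max_attempts):
        # The first attempt hashes the URL as-is; retries append a salt.
        salted = long_url if attempt == 0 else f"{long_url}#{attempt}"
        digest = hashlib.sha256(salted.encode("utf-8")).hexdigest()
        code = digest[:length]
        existing = store.get(code)
        if existing is None:
            store[code] = long_url
            return code
        if existing == long_url:
            # Same URL was shortened before: reuse its code.
            return code
        # Otherwise a different URL owns this code: retry with a new salt.
    raise RuntimeError("could not find a free code; consider a longer length")
```

The extra lookup on every creation is the cost this strategy pays for not needing an ID generator.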
Database Schema & Write Path
The core database stores the mapping from short code to long URL. The schema is simple: a primary key on short_code, plus columns for original_url, created_at, expires_at, and an optional owner_id. Writes are not the steady-state bottleneck (traffic is roughly 99% reads), but the write path still has to absorb creation bursts, and a single database used for ID generation becomes the choke point. Decouple ID generation from the database: generate IDs in the application tier using a Snowflake-like algorithm or pre-allocated ID segments. The insert itself should stay synchronous, so that a short URL is never returned before its mapping is durable; under load, inserts can be buffered and committed in batches.
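A Snowflake-like generator can be sketched as follows. This is an illustration under stated assumptions, not a reference implementation: the bit layout (10 worker bits, 12 sequence bits), the custom epoch, and the class name SnowflakeLikeGenerator are all choices made for this example, and each application instance is assumed to receive a distinct worker_id from configuration.

```python
import threading
import time


class SnowflakeLikeGenerator:
    """IDs are (milliseconds since epoch | worker id | per-ms sequence)."""

    EPOCH_MS = 1_600_000_000_000  # arbitrary custom epoch (assumption)
    WORKER_BITS = 10
    SEQ_BITS = 12

    def __init__(self, worker_id, clock=None):
        if not 0 <= worker_id < (1 << self.WORKER_BITS):
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        # The clock is injectable so the generator can be tested.
        self._clock = clock or (lambda: int(time.time() * 1000))
        self._lock = threading.Lock()
        self._last_ms = -1
        self._seq = 0

    def next_id(self):
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                self._seq = (self._seq + 1) & ((1 << self.SEQ_BITS) - 1)
                if self._seq == 0:
                    # Sequence exhausted for this millisecond: wait for the next.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._seq = 0
            self._last_ms = now
            return (
                ((now - self.EPOCH_MS) << (self.WORKER_BITS + self.SEQ_BITS))
                | (self.worker_id << self.SEQ_BITS)
                | self._seq
            )
```

No database round trip is involved, so instances never contend with each other. One trade-off worth noting: IDs of this shape are large, so their base62 form is around 10 to 11 characters; designs that want 6 to 7 character codes tend to use pre-allocated ranges of smaller integers instead.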
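For reads, the index on short_code is critical. A covering index that includes original_url lets the database answer the redirect query from the index alone, without a second lookup into the table. Partition the table by short_code prefix to distribute writes; this assumes prefixes are evenly spread, so if codes come from sequential IDs, partitioning on a hash of the code avoids a hot partition. Use a read replica for analytics queries, and serve redirect lookups from the cache first, falling back to the primary on a miss.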
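The schema and the synchronous write path can be sketched with Python's built-in sqlite3 module. SQLite is used here only so the example runs anywhere; the table name links, the helper names, and the use of integer epoch seconds for timestamps are assumptions. The expiry column is named expires_at to match the expiration discussion later in the article.

```python
import sqlite3

SCHEMA = """
CREATE TABLE links (
    short_code   TEXT PRIMARY KEY,   -- lookups go through this key
    original_url TEXT NOT NULL,
    created_at   INTEGER NOT NULL,   -- epoch seconds (assumption)
    expires_at   INTEGER,            -- NULL means the link never expires
    owner_id     TEXT                -- optional
)
"""


def create_link(conn, short_code, original_url, now, expires_at=None, owner_id=None):
    """Insert synchronously; return False if the code is already taken."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO links (short_code, original_url, created_at, "
                "expires_at, owner_id) VALUES (?, ?, ?, ?, ?)",
                (short_code, original_url, now, expires_at, owner_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def lookup(conn, short_code):
    """Return (original_url, expires_at), or None if the code is unknown."""
    return conn.execute(
        "SELECT original_url, expires_at FROM links WHERE short_code = ?",
        (short_code,),
    ).fetchone()
```

Relying on the primary-key constraint to reject duplicates means the uniqueness check and the insert happen in one step, rather than as a separate read followed by a write.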
Caching Layer — Survival Guide for Viral Traffic
A single viral link can generate millions of requests per minute. Without caching, your database will melt. The caching architecture needs at least two tiers: L1 (in-memory cache per application instance) and L2 (distributed cache like Redis or Memcached). L1 stores the hottest keys (recently accessed short codes) and evicts using LRU. L2 stores a larger set of mappings with a longer TTL.
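Cache-aside pattern: on a redirect request, check L1 → if miss, check L2 → if miss, fetch from DB and populate both caches. Set a TTL of 24 hours for L2, and invalidate proactively when a link is deleted or expires. For read-heavy workloads, consider a write-through cache: on creation, write the mapping to the DB synchronously and put it in the cache in the same request, so the first read is already fast. Queueing the DB write and acknowledging from the cache alone is write-behind rather than write-through, and it risks handing out a short URL whose mapping is lost if the queue fails.
Hot key problem: when a single short code gets 100k requests per second, the one Redis node that owns that key becomes a hotspot. Sharding by key does not help here, because every request for the same code lands on the same shard. Solutions: local L1 caching (each app server caches the hot key), or Redis read replicas so that reads for the hot key are spread across several nodes.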
- L1: in-memory per microservice instance. Fastest. Limited size. Evict aggressively.
- L2: Redis cluster. Shared across all instances. Slower than L1 because of the network hop, but typically around a millisecond.
- Cache miss penalty: L1 miss → Redis hit ~1ms. Redis miss → DB hit ~10ms. Every miss hurts throughput.
- Proactive populate: write-through cache on URL creation prevents the first request from hitting the DB.
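The two-tier lookup described above can be sketched like this. Plain dicts stand in for Redis (L2) and the database, TTLs are left out for brevity, and the class names LRUCache and TwoTierResolver are hypothetical. The db_reads counter exists only to make the effect of caching visible.

```python
from collections import OrderedDict


class LRUCache:
    """Small in-process L1 cache with least-recently-used eviction."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._data = OrderedDict()

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict least recently used


class TwoTierResolver:
    """Cache-aside lookup: L1, then L2, then the database."""

    def __init__(self, l1, l2, db):
        self.l1 = l1
        self.l2 = l2  # dict standing in for Redis (assumption)
        self.db = db  # dict standing in for the database (assumption)
        self.db_reads = 0

    def resolve(self, code):
        url = self.l1.get(code)
        if url is not None:
            return url
        url = self.l2.get(code)
        if url is None:
            self.db_reads += 1
            url = self.db.get(code)
            if url is None:
                return None  # unknown code: nothing to cache
            self.l2[code] = url
        self.l1.put(code, url)
        return url
```

With this layout a viral code is served from each instance's L1 after its first lookup, so the hot key stops reaching Redis at all until it is evicted or invalidated.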
Redirect Mechanics — HTTP Status and Performance
When a client requests a short URL, the server must respond with an HTTP redirect. Two status codes matter: 301 (Moved Permanently) and 302 (Found). A 301 is cacheable by default, so browsers may store it and send subsequent requests directly to the long URL without hitting the shortener. This is great for performance but breaks analytics if you want to count every click (because browsers holding a cached redirect don't hit your service). A 302 is not cached unless the response carries explicit caching headers such as Cache-Control, so every request hits the shortener, enabling click tracking.
Services that rely on click analytics typically default to 302 and offer 301 as an option for permanent links. The redirect response carries the target in the Location header. CORS headers are only needed if the short URL is requested by cross-origin JavaScript (fetch or XMLHttpRequest); whether a page can be embedded in an iframe is controlled by the destination's X-Frame-Options or Content-Security-Policy frame-ancestors settings, not by CORS.
Performance: server-side processing of a redirect (from request received to response sent) should complete in under 10ms at P99; with caching it is typically <1ms. What the user perceives also includes client-side DNS resolution, the TCP connection, and the TLS handshake, which usually cost more than the server work itself. Improvements that reduce that overhead: keepalive connections, HTTP/2 multiplexing, and edge caching (CDN).
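The choice between the two status codes can be sketched as a small helper that builds the response. The function name build_redirect and the specific Cache-Control values are assumptions made for illustration; in particular, the one-day max-age on the 301 is just an example of bounding how long a browser may reuse the redirect.

```python
def build_redirect(long_url, permanent=False):
    """Return (status_code, headers) for a redirect to long_url."""
    if permanent:
        # 301: caches may reuse this response, so later clicks can skip
        # the shortener. max-age here is an illustrative value.
        return 301, {
            "Location": long_url,
            "Cache-Control": "public, max-age=86400",
        }
    # 302 with no-store: every click comes back to the shortener,
    # which keeps per-click analytics accurate.
    return 302, {
        "Location": long_url,
        "Cache-Control": "no-store",
    }
```

For example, build_redirect("https://example.com/long") yields a 302 whose Location header is the long URL.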
Expiration, Custom Aliases, and Analytics
Real URL shorteners support link expiration (e.g., for temporary campaign links) and custom aliases (user picks a meaningful short code). Expiration is implemented by storing an expires_at column and checking during redirect lookup. If the current time exceeds expires_at, return 410 Gone or redirect to a fallback page. Custom aliases require a separate validation: they must be unique globally and not conflict with auto-generated codes. A common approach is to reserve a prefix for auto-generated codes (e.g., starting with a digit) and allow custom aliases to start with a letter. Or use two separate tables.
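A minimal sketch of both checks follows. It assumes the letter-first convention for custom aliases mentioned above, a length limit of 3 to 32 characters chosen purely for illustration, and link records represented as dicts; the names is_valid_custom_alias and redirect_status are hypothetical.

```python
import re

# Custom aliases start with a letter so they cannot clash with
# auto-generated codes that start with a digit (convention from the text).
# The 3-32 character length range is an assumption for this sketch.
ALIAS_RE = re.compile(r"[A-Za-z][A-Za-z0-9]{2,31}")


def is_valid_custom_alias(alias):
    """Check the shape of a user-chosen alias (uniqueness is checked in the DB)."""
    return ALIAS_RE.fullmatch(alias) is not None


def redirect_status(record, now):
    """Decide the response for a lookup result.

    `record` is None for an unknown code, else a dict with
    'original_url' and an optional 'expires_at' (epoch seconds).
    Returns (status_code, location_or_None).
    """
    if record is None:
        return 404, None
    expires_at = record.get("expires_at")
    if expires_at is not None and now > expires_at:
        return 410, None  # Gone: the link existed but has expired
    return 302, record["original_url"]
```

The shape check only rules out malformed aliases; global uniqueness still has to be enforced by the primary key on short_code at insert time.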
Analytics: every redirect should asynchronously log the click event (time, referrer, user-agent, IP) to a high-throughput queue (Kafka, Kinesis). A separate consumer processes the stream to update click counts and generate reports. The click count on the links table should be denormalised for quick display but must be updated asynchronously to avoid write contention. Use eventual consistency: the consumer updates the count in the DB via upsert.
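The asynchronous counting path can be sketched with an in-process queue standing in for Kafka or Kinesis and a dict standing in for the links table. The function names log_click and drain_and_count are hypothetical. The point of the sketch is the batching: the consumer folds many click events into one increment per short code before touching the store.

```python
import queue
from collections import Counter


def log_click(q, short_code, referrer, user_agent, ts):
    """Called on the redirect path: enqueue and return immediately."""
    q.put_nowait({
        "short_code": short_code,
        "referrer": referrer,
        "user_agent": user_agent,
        "ts": ts,
    })


def drain_and_count(q, counts, max_events=1000):
    """Consumer step: aggregate a batch, then apply one upsert per code.

    `counts` is a dict standing in for the click_count column (assumption).
    Returns the number of events processed.
    """
    batch = Counter()
    processed = 0
    while processed < max_events:
        try:
            event = q.get_nowait()
        except queue.Empty:
            break
        batch[event["short_code"]] += 1
        processed += 1
    for code, n in batch.items():
        # Upsert: create the row if missing, otherwise add to it.
        counts[code] = counts.get(code, 0) + n
    return processed
```

Because counts are applied after the fact, the number shown for a link lags reality slightly, which is the eventual consistency the paragraph above accepts in exchange for a contention-free redirect path.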
The Single-Table Counter That Took Down a Shortener
- Never rely on a single database auto-increment for ID generation at scale — it's a write bottleneck and a single point of failure.
- Use distributed ID generators or pre-allocated ID ranges to eliminate contention.
- Always design for write scalability even if you expect a read-heavy workload: link-creation traffic spikes during campaigns.
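Pre-allocated ID ranges can be sketched as follows. RangeAllocator stands in for whatever central store hands out blocks (a database row or a coordination service, for instance), and LocalIdSource is what each application instance would hold; both names and the block size are assumptions for this example.

```python
import threading


class RangeAllocator:
    """Central allocator: hands out disjoint half-open blocks [start, end)."""

    def __init__(self, block_size=1000, start=1):
        self.block_size = block_size
        self._next_start = start
        self._lock = threading.Lock()
        self.allocations = 0  # how often the central store was touched

    def allocate(self):
        with self._lock:
            start = self._next_start
            self._next_start += self.block_size
            self.allocations += 1
            return start, start + self.block_size


class LocalIdSource:
    """Per-instance ID source: touches the allocator once per block."""

    def __init__(self, allocator):
        self._allocator = allocator
        self._lock = threading.Lock()
        self._next = 0
        self._end = 0  # empty range forces an allocation on first use

    def next_id(self):
        with self._lock:
            if self._next >= self._end:
                self._next, self._end = self._allocator.allocate()
            value = self._next
            self._next += 1
            return value
```

With a block size of 1000, the central store sees one write per thousand links instead of one per link, and an instance that crashes merely wastes the unused remainder of its block.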
Common mistakes to avoid
- Using a single database auto-increment for short code IDs
- Not caching redirect lookups
- Using 302 for all links (no CDN edge caching)
Interview Questions on This Topic
How would you generate a unique short code for every URL in a distributed system?