TinyURL generates short codes via Base62 encoding of a unique 64-bit ID, guaranteeing no collisions.
The system is read-heavy (100:1 ratio) — choose NoSQL (Cassandra) with Redis LRU caching.
Distributed ID generation (Snowflake/ZooKeeper) is the backbone for collision-free scale.
301 redirects let browsers cache the mapping, reducing server load; 302 redirects pass through for analytics.
Biggest mistake: using MD5 hashing for code generation — collisions force retry loops at scale.
Plain-English First
Imagine every website address is a long home address like '123 Sunflower Lane, Apartment 4B, Springfield, Illinois, 62701, USA'. TinyURL is like a nickname system — you tell the post office 'call that address #XK9' and now anyone who says '#XK9' gets redirected to the full address instantly. The post office (the server) keeps a giant lookup book that maps short nicknames to long addresses. That's the whole system — a glorified, globally-distributed lookup book that has to handle billions of lookups per day without breaking a sweat.
Every senior engineer has sat across from an interviewer who says 'design a URL shortener' with a calm smile. It sounds trivial — take a long URL, make it short. But behind that smile is a question that probes distributed systems, database design, caching strategy, hash collision handling, rate limiting, analytics, and horizontal scaling simultaneously. Bit.ly processes over 600 million redirects per day. TinyURL has been alive since 2002. These systems are deceptively simple on the surface and genuinely hard to build correctly at scale.
The core problem is a deceptively asymmetric one: writes are rare, reads are overwhelmingly frequent. When you shorten a URL, that's a one-time write. But that short link might be embedded in a viral tweet and hit 10 million times in an hour. Your design has to reflect this read-heavy reality — every architectural choice from your hashing scheme to your cache eviction policy flows from that single insight.
By the end of this article you'll be able to walk into any system design interview and design TinyURL end-to-end: justify your short code generation strategy, design a DB schema that survives traffic spikes, build a caching layer that handles 99% of reads from memory, handle custom aliases and expiration, discuss analytics pipelines, and correctly answer every follow-up an interviewer throws at you. Let's build it.
The Core Logic: Base62 Encoding vs. Hashing
In a URL shortener, the 'Magic' is how we generate the tiny string. You have two main paths: Hashing (MD5/SHA-256) or Base62 Encoding a unique ID. Hashing often leads to collisions that require complex 'check-and-retry' logic. The industry-standard approach is to use a distributed ID generator (like a Snowflake ID or a centralized Range Manager) and convert that numeric ID into a Base62 string (a-z, A-Z, 0-9).
For example, an ID like 125 converted to Base62 results in a short, predictable, and unique string. To prevent predictability (so people can't guess the 'next' URL), we can add a bit of salt or shuffle our Base62 alphabet.
io.thecodeforge.shortener.Base62Encoder.javaJAVA
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
package io.thecodeforge.shortener;
/**
* TheCodeForgeProduction-GradeBase62Encoder
* Converts a unique LongID into a 7-character short code.
*/
publicclassBase62Encoder {
privatestaticfinalStringALPHABET = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
privatestaticfinalintBASE = ALPHABET.length();
publicstaticStringencode(long id) {
StringBuilder sb = newStringBuilder();
while (id > 0) {
sb.append(ALPHABET.charAt((int) (id % BASE)));
id /= BASE;
}
// Pad to ensure consistent length if required by business logicwhile (sb.length() < 7) {
sb.append(ALPHABET.charAt(0));
}
return sb.reverse().toString();
}
publicstaticvoidmain(String[] args) {
long uniqueId = 56800235584L; // Example ID from a distributed generatorSystem.out.println("Short Code for " + uniqueId + ": " + encode(uniqueId));
}
}
Output
Short Code for 56800235584: dXp8Baa
Forge Tip: Collision Prevention
If you use MD5, even the first 7 characters will eventually collide. Using a Counter-based approach with Base62 encoding guarantees uniqueness as long as your counter is globally unique (e.g., using ZooKeeper to manage ID ranges).
Production Insight
Using MD5 for short code generation leads to collision retries that spike write latency from <10ms to >100ms at scale.
A production system using MD5 with retries once hit a 50% fail rate under 100k writes/min because collision checks consumed DB connections.
Rule: always use a unique ID generator + deterministic Base62 encoding for write paths.
Key Takeaway
Base62 encoding of a unique ID is the most robust way to generate short codes.
Hashing leads to collisions that become a scaling bottleneck.
Rule: never use hash-based codes for a shortener beyond prototyping.
Choosing Code Generation Strategy
IfNeed deterministic, collision-free codes
→
UseUse Base62 encoding of a globally unique 64-bit ID
IfNeed stateless, no external ID generator
→
UseUse hash (MD5/SHA) with retry logic — but accept collision overhead
IfUser wants a custom alias (e.g., /mybrand)
→
UseReserve a namespace in DB, check uniqueness, allow manual input
Data Layer Strategy: Handling Scale and Redirection
Since this is a read-heavy system (100:1 read/write ratio), our database choice and caching strategy are critical. We use a NoSQL database like Cassandra or a sharded MongoDB for the URL mappings because we don't need complex joins—just a simple Key-Value lookup.
To achieve sub-millisecond redirects, we put a Redis cache in front of the database. We use an LRU (Least Recently Used) eviction policy because in the real world, 20% of the links (the viral ones) will generate 80% of the traffic.
SchemaDesign.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
-- io.thecodeforge.shortener - Database Schema-- Optimized for NoSQL or Sharded SQLCREATETABLE io_thecodeforge.url_mapping (
short_key VARCHAR(7) PRIMARYKEY,
original_url TEXTNOTNULL,
user_id BIGINT,
created_at TIMESTAMPDEFAULT CURRENT_TIMESTAMP,
expires_at TIMESTAMP,
click_count BIGINTDEFAULT0
);
-- Secondary Index for User ManagementCREATEINDEX idx_user_urls ON io_thecodeforge.url_mapping(user_id);
Output
Table created. In production, 'short_key' would be the shard key.
Interview Gold:
Mention 301 vs 302 redirects. Use 301 (Permanent) if you want the browser to cache the redirect and reduce server load. Use 302 (Temporary) if you need to track every single click for analytics.
Production Insight
A 302 redirect on a viral link can cause up to 10x more requests to your server than a 301.
Bit.ly uses 301 for the first hit to let browsers cache, then uses 302 for subsequent requests with a cookie to track users.
Trade-off: 301 means you lose ability to update the destination URL without changing the short code.
Key Takeaway
Choose your HTTP status code based on analytics needs.
301 reduces load at the cost of flexibility.
Rule: for production shorteners, prefer 302 with a smart client-side caching strategy.
Redirect Status Code Decision
IfNeed high throughput, minimal analytics
→
UseUse 301 — browsers cache it, reducing server load
IfNeed per-click analytics or URL updatability
→
UseUse 302 — every request hits your server
IfHybrid approach
→
UseUse 301 for first redirect, then client-side redirect with 302 for subsequent visits
Distributed ID Generation: The Backbone of Uniqueness
The unique ID that feeds into Base62 encoding must be globally unique across all servers. A simple auto-increment DB column doesn't scale — you'd have a single point of contention. The standard pattern is to use a distributed ID generator. Two common approaches: Snowflake (Twitter's algorithm) and ZooKeeper-managed ID ranges.
Snowflake generates 64-bit IDs: timestamp (41 bits) + machine ID (10 bits) + sequence (12 bits). This gives 4096 IDs per millisecond per machine, and the IDs are time-sortable. ZooKeeper assigns a range of IDs (e.g., 0-100000) to each app server; when exhausted, the server requests a new range. Both avoid collisions without a central DB write bottleneck.
In production, you'll also want to make the short code appear random. You can shuffle the Base62 alphabet permanently or XOR the ID with a secret before encoding. That prevents users from guessing sequential short codes and scraping all URLs.
Snowflake: each server gets a unique machine ID and produces tickets from its own counter — no coordination needed.
ZooKeeper: servers request fresh ticket blocks from a central coordinator. ZooKeeper is the single source of truth for block allocation.
Both methods guarantee collision-free IDs without a central DB sequence bottleneck.
Shuffle the Base62 alphabet to obscure sequential IDs from users.
Clock skew in Snowflake can cause ID collisions or negative timestamps — use NTP and monitor clock drift.
Production Insight
A production Snowflake ID generator experienced a 2-second clock skew causing 2000 duplicate IDs in 10 minutes.
The fix was to add a clock skew detection and alerting, plus a fallback to ZooKeeper for critical writes.
Rule: always monitor clock drift in Snowflake deployments and have a fallback ID generator for edge cases.
Key Takeaway
Distributed ID generation (Snowflake/ZooKeeper) is the heart of a collision-free system.
Clock skew is your enemy — monitor it.
Rule: never use a single DB sequence for ID generation in a distributed shortener.
ID Generation Strategy
IfNeed time-sortable IDs and low latency
→
UseUse Snowflake — fast, no network calls, but requires clock monitoring
IfNeed simple, no clock dependency
→
UseUse ZooKeeper ID ranges — more network overhead but safer
IfRunning on cloud with perfect NTP
→
UseSnowflake is fine. Add a ZooKeeper-based fallback for safety.
Caching Strategy: Surviving the Viral Spike
We already mentioned a Redis cluster with LRU eviction. But to really survive a viral spike, you need a multi-layer caching strategy. The first layer is an in-memory cache (like Caffeine or Guava on each application server) that holds the hottest entries with a very short TTL (1-2 seconds). The second layer is a Redis cluster, and the third is the database.
When a request arrives, the app server checks its local cache first. On miss, it queries Redis. On Redis miss, it queries the database and then populates both caches. To prevent a stampede (thundering herd) when a cached key expires, use a distributed lock or a get-or-compute pattern. Only one thread should reload a cache entry; others should wait or serve a stale value.
For short codes that go viral, you can proactively pin them to dedicated cache nodes or increase their priority. Use consistent hashing for the Redis cluster so that adding nodes doesn't cause mass cache invalidation.
Local cache returns <1ms, Redis ~5ms, DB ~50ms. Two-layer cache ensures 99.9% hits.
Cache Stampede Prevention
Use a 'probabilistic early recompute' or a distributed lock (Redis SETNX with TTL) to ensure only one thread reloads a cache entry. Otherwise, 10k concurrent requests all hit the DB when a hot key expires.
Production Insight
A 100ms DB query multiplied by 10k concurrent requests = 1000 seconds of cumulative DB time. That's how a cache stampede takes down your database.
Add a local cache with a 2-second TTL to absorb this spike; the DB only sees ~1 request per 2 seconds per key.
Rule: always layer caches and use stampede protection for any key that can go viral.
Key Takeaway
Read-heavy systems require heavy caching (Redis + local) and stampede protection.
A two-layer cache with coalescing is the minimum for production scale.
Rule: if you only have one cache layer, you have a single point of failure.
Cache Layer Placement
IfApplication servers < 100, request rate < 50k/s
→
UseSingle Redis cluster with LRU eviction is sufficient
IfNeed to survive 10x spikes
→
UseAdd local in-memory cache with short TTL
IfViral links generate >100k req/s on one key
→
UsePin that key to a dedicated Redis node, use consistent hashing
Analytics and Click Tracking Pipeline
A URL shortener is not just about redirection — it's a data business. Every click is valuable analytical data: geo-location, referrer, user agent, timestamp. You can't afford to write this data synchronously during a redirect (that would add latency). The pattern is asynchronous: the web server publishes a click event to a message queue (Kafka) and returns the 302/301 immediately. A separate consumer processes these events and updates the click_count in the database and aggregates data for dashboards.
Kafka topics can be partitioned by short key to maintain ordering per URL. The consumer can batch updates to the database (e.g., update click_count = click_count + 1 for 100 events at once). For real-time analytics, use a stream processor (Spark Streaming, Flink) to compute counters down to 1-minute granularity.
We also need to handle deduplication: users may refresh or multiple bots may click. Use a combination of IP + user agent + timestamp window to filter duplicates, or accept a small error percentage (most shorteners tolerate 1-2% overcount).
Click event published asynchronously. Redirect latency stays under 5ms.
Analytics Precision vs Latency
Using 302 redirects for every click ensures accurate analytics but increases server load. A compromise: use 301 for the first visit (browser caches) and a JavaScript pixel or service worker for subsequent visits to track them without server overhead.
Production Insight
A production shortener used synchronous click counting in the redirect handler. When a viral link hit, the DB write caused the redirect to take 2 seconds, making Twitter's crawler time out and report the link as broken.
Moving click counting to an async queue resolved the issue and cut redirect latency from 2s to 4ms.
Rule: never write synchronously in a redirect path — use async queues for analytics.
Key Takeaway
Analytics pipeline must be async to decouple from redirect performance.
Use Kafka/Flink for scalable click processing.
Rule: never mix the read path with the write path for analytics.
Click Tracking Strategy
IfClick accuracy critical, low request volume
→
UseUse synchronous DB update on redirect (acceptable for <10k req/s)
IfHigh volume, need accurate per-click data
→
UsePublish to Kafka, batch updates to DB every 1 second
IfHundreds of millions of clicks daily
→
UseUse stream processing (Flink) for real-time aggregation, write only aggregated counts to DB
● Production incidentPOST-MORTEMseverity: high
Cache Stampede Took Down Viral Link
Symptom
Users reported 503 and 504 errors. Response times jumped from <5ms to >10s. The application servers were at 100% CPU waiting for database queries.
Assumption
The assumption was that cache warming during deployment was sufficient. The team believed the 99th percentile load would stay under 50k req/s because historical data showed that pattern.
Root cause
A single short code became globally viral. The Redis cluster was sized for the average load, not for a 10x spike. The LRU cache evicted the hot key just before the spike, causing every subsequent request to hit Cassandra. The database read replicas were overwhelmed, and the primary node was occupied with write requests.
Fix
1) Implement a cache-aside pattern with a distributed read lock (Redis SETNX) so only one app server reloads a cache miss. 2) Add a local in-memory cache (Caffeine) as a second layer. 3) Increase the read replica pool and use connection pooling with high max connections.
Key lesson
Always design for traffic spikes that are 50x your mean load.
Cache stampedes are silent until they kill your DB.
A two-layer cache (local + distributed) with coalescing is necessary for viral scenarios.
Production debug guideCommon failures and the exact commands to diagnose them4 entries
Symptom · 01
Short link returns 404
→
Fix
Check Redis: GET <short_key>. If miss, query Cassandra: SELECT * FROM url_mapping WHERE short_key='<key>'. If not in DB, the link was never created or expired.
Symptom · 02
Redirect takes >1s
→
Fix
Ping Redis latency: redis-cli -p 6379 PING. Check Cassandra read latency via nodetool cfstats. Verify cache hit ratio: redis-cli INFO stats | grep hit_rate.
Symptom · 03
Custom alias already taken
→
Fix
Check user's custom alias mapping: SELECT * FROM url_mapping WHERE short_key='<custom>' AND user_id IS NOT NULL. If exists, return 409. Consider prefixing custom aliases with a separate namespace (e.g., '@').
Symptom · 04
Click count doesn't increment
→
Fix
Check analytics pipeline: ensure the async consumer (Kafka) is running and not lagging. Verify the click event was published: kafka-console-consumer --bootstrap-server localhost:9092 --topic clicks --from-beginning | grep <short_key>.
★ Quick Debug Cheat Sheet: TinyURL Production IncidentsThree most common production issues and their immediate fixes
Cache miss flood — all requests hitting DB−
Immediate action
Scale up read replicas: use AWS RDS read replica promotion or add Cassandra nodes. Reduce cache TTL to 0 and force warm-up with a batch job.
Commands
redis-cli -p 6379 INFO stats | grep 'keyspace_hits|keyspace_misses'
nodetool cfhistograms url_mapping url_mapping
Fix now
Increase Redis maxmemory and enable allkeys-lru eviction. Add a local in-memory cache with a short TTL (e.g., Caffeine, 2s).
Short code collision on write+
Immediate action
Ensure ID generator is returning unique values. Check for clock skew in Snowflake nodes. If using DB sequence, verify no two writers got same ID.
Commands
Check ID generator: select max(id) from id_sequence (if using DB).
Test with insert ignore and check affected_rows: if zero, collision occurred.
Fix now
Switch to ranged ID generation (ZooKeeper/etcd). Add a retry loop with a different salt after each collision.
Viral link causing DB outage+
Immediate action
Enable circuit breaker on DB reads. Serve stale cache (allow expired entries) until DB recovers. Add a rate limiter per short_code to throttle requests.
Deploy a standalone hot-link cache (dedicated Redis instance for viral short codes). Use consistent hashing to pin viral links to specific cache servers.
URL Shortening Approaches
Approach
Pros
Cons
Hashing (MD5/SHA)
Stateless, simple implementation
Collisions require check-before-insert
Base62 Encoding
Guaranteed unique, no collisions
Requires a centralized ID generator
Custom Aliases
Better UX/Branding
Requires manual check for availability
Key takeaways
1
Base62 encoding of a unique 64-bit ID is the most robust way to generate short codes.
2
Read-heavy systems require heavy caching (Redis + local) and stampede prevention.
3
Distributed ID generation (Snowflake/ZooKeeper) is the heart of a collision-free system.
4
Choose your HTTP status code (301 vs 302) based on analytics and caching requirements.
Using a single relational database without sharding
Symptom
Database becomes a bottleneck at 10k+ req/s. Read replicas can't keep up, connection pool exhaustion, queries queue up.
Fix
Use sharded NoSQL (Cassandra, DynamoDB) with short_key as partition key. Or shard MySQL/PostgreSQL by hash of short_key.
×
Ignoring URL validation
Symptom
Malicious links (phishing, malware) are shortened and shared, causing your domain to be blacklisted. Recursive TinyURLs create infinite redirects.
Fix
Validate original URL: check format, domain reputation (Google Safe Browsing API), reject localhost/private IPs. Detect if original URL points to your own shortener and reject.
×
Underestimating storage growth
Symptom
At 1 billion URLs per year, with 500 bytes per record (including indexes), storage grows by 500GB/year. Indexes take additional space. Eventually you hit disk limits or costs spike.
Fix
Plan for compaction: use database with built-in compression (Cassandra, ScyllaDB). Enable TTL-based expiry. Archive old records to cold storage (S3, Glacier) after 1 year.
×
Forgetting about background cleanup of expired links
Symptom
Database grows indefinitely. Expired links still consume cache space. Over time, query performance degrades.
Fix
Implement a scheduled worker (cron, Kubernetes CronJob) that deletes expired rows from DB. Use Redis keyspace notifications to invalidate cache on expiry. For database, consider using Cassandra's TTL feature.
×
Not handling cache stampedes on viral links
Symptom
When a hot key expires, thousands of requests simultaneously hit the database. DB CPU spikes, requests time out, cascading failure.
Fix
Use a distributed lock (Redis SETNX) so only one request reloads the cache. Others either wait or serve stale data. Also add a local in-memory cache with a short TTL.
INTERVIEW PREP · PRACTICE MODE
Interview Questions on This Topic
Q01SENIOR
How would you generate unique short codes in a distributed system?
Q02SENIOR
How do you handle a viral link that gets millions of hits in an hour?
Q03SENIOR
Explain the trade-offs between 301 and 302 redirects for a URL shortener...
Q04SENIOR
How would you design the database schema for a URL shortener that suppor...
Q05SENIOR
How do you ensure high availability for a URL shortener?
Q01 of 05SENIOR
How would you generate unique short codes in a distributed system?
ANSWER
Use a globally unique ID generator like Snowflake (timestamp + machine ID + sequence) and convert the ID to a Base62 string. This guarantees no collisions and avoids the need for a central DB counter. Alternatively, use ZooKeeper to hand out ID ranges to each app server. Both scale horizontally. For additional security, shuffle the Base62 alphabet or XOR the ID with a secret to make codes non-sequential.
Q02 of 05SENIOR
How do you handle a viral link that gets millions of hits in an hour?
ANSWER
We have a multi-layer caching strategy: 1) Local in-memory cache (Caffeine) per application server with a 1-2 second TTL. 2) Distributed Redis cluster with LRU eviction. 3) Database only as the last resort. We also implement cache stampede prevention: only one thread per key reloads the cache. If the load still overwhelms, we can rate-limit requests per short code or serve stale cached data. Proactive pinning of known viral links to dedicated cache nodes helps.
Q03 of 05SENIOR
Explain the trade-offs between 301 and 302 redirects for a URL shortener.
ANSWER
301 (Permanent) makes browsers cache the redirect, so subsequent requests go directly to the target without hitting our server. This reduces server load by up to 90% for repeat visitors. The downside is we can't update the destination URL without changing the short code, and we lose per-click analytics for cached requests. 302 (Temporary) passes every request through our server, allowing click tracking, URL updates, and analytics. The trade-off is higher server load. In production, a hybrid approach works: use 301 for the first visit, then track subsequent visits via JavaScript or service worker.
Q04 of 05SENIOR
How would you design the database schema for a URL shortener that supports custom aliases and expiration?
ANSWER
Use a Cassandra table with short_key as primary key and clustering columns for versioning if needed. Schema includes: short_key (text), original_url (text), user_id (bigint, for custom aliases), created_at (timestamp), expires_at (timestamp), click_count (counter column). Use a secondary index on user_id to list all links by user. For expiration, set TTL on rows or use a separate column and a background worker to delete. Cassandra's counter columns handle atomic increments for click counting.
Q05 of 05SENIOR
How do you ensure high availability for a URL shortener?
ANSWER
We design for no single points of failure. The application layer is stateless and runs behind a load balancer across multiple availability zones. The database (Cassandra) is configured with replication factor 3 across zones. The Redis cache is a cluster with replicas. For ID generation, use Snowflake with NTP monitoring or ZooKeeper for coordination. For the DNS resolution of the short domain, use anycast with multiple DNS providers. Implement circuit breakers on external dependencies (e.g., Safe Browsing API).
01
How would you generate unique short codes in a distributed system?
SENIOR
02
How do you handle a viral link that gets millions of hits in an hour?
SENIOR
03
Explain the trade-offs between 301 and 302 redirects for a URL shortener.
SENIOR
04
How would you design the database schema for a URL shortener that supports custom aliases and expiration?
SENIOR
05
How do you ensure high availability for a URL shortener?
SENIOR
FAQ · 6 QUESTIONS
Frequently Asked Questions
01
How do you handle hash collisions if you use MD5?
You take the first 7 characters of the hash. If that key already exists in the database with a different original URL, you append a predefined string (salt) to the original URL and re-hash until you find a unique key.
Was this helpful?
02
What happens if the Redis cache is full?
We follow an LRU (Least Recently Used) eviction policy. The least accessed links are evicted to make room for new ones. Since most links follow a long-tail distribution, the 'cold' links will live in the DB while 'hot' links stay in memory.
Was this helpful?
03
How do you prevent people from guessing all your shortened URLs?
Instead of using a simple incrementing ID (1, 2, 3...), we use a distributed ID generator (Snowflake) and then shuffle the Base62 alphabet or XOR the ID with a secret. This makes the generated strings appear random to the end user while remaining technically sequential internally.
Was this helpful?
04
How do you handle custom aliases like 'mybrand' when someone wants a specific short code?
We reserve the short code in the database with a flag indicating it's a custom alias. Before creating, we check if the key is already taken (both for generated and custom). Custom aliases are stored with a prefix in the ID or in a separate table to avoid collision with generated codes. We also add validation to prevent users from taking too short or offensive codes.
Was this helpful?
05
How do you scale the database writes for click counting?
We don't write each click synchronously to the DB. Instead, we batch click events from Kafka and update the click_count in batches (e.g., update 100 clicks at once). For Cassandra, we use counter columns which are atomic and scalable. The analytics pipeline is decoupled from the redirect path.
Was this helpful?
06
What if the ID generator fails?
Have a fallback mechanism. For Snowflake, if clock skew is detected, switch to a ZooKeeper-based ID generator or use a Redis atomic increment as a temporary fallback. Also, have alerting on clock drift and sequence exhaustion.