Senior 4 min · June 25, 2026

Design Pastebin: How to Build a Production-Grade Paste Service That Won't Fall Over at 3 AM

Design a pastebin for production: learn how to handle text storage, expiry, rate limiting, and sharding, with real-world trade-offs and war stories.

Naren Founder & Principal Engineer

20+ years shipping large-scale distributed systems. Lessons pulled from things that broke in production.

June 25, 2026
last updated
Quick Answer

To design a pastebin, you need a web server, a database (SQL or NoSQL), a unique ID generator (like base62 encoding of a counter or UUID), and a background job for expiry. Key trade-offs: SQL for consistency vs NoSQL for scale, and client-side vs server-side deduplication.

✦ Definition~90s read
What is Design Pastebin?

Design Pastebin is a system design exercise for building a service that lets users upload text snippets (pastes) and share them via unique URLs. It covers storage, expiry, deduplication, rate limiting, and scaling to millions of users.

Plain-English First

Think of a pastebin like a public bulletin board where you can pin a note and get a ticket stub with a number. Anyone with the stub can read the note. The board automatically tears down old notes after a while. If someone tries to pin the exact same note twice, the board just hands them the same stub instead of wasting space.

Most pastebin tutorials are toy projects that die the second they see real traffic. They use a single database, no caching, and no rate limiting. I've seen a paste service take down an entire API gateway because one user uploaded a 50MB log file and the server tried to load it all into memory. Don't be that team. Here's how to build a pastebin that survives production.

The core challenge is simple: accept text, store it, give back a short URL, and delete it after a TTL. But the devil is in the details — how do you generate unique IDs at scale? How do you handle duplicate pastes? What happens when a paste is 100MB? How do you prevent abuse? This article answers all of that with battle-tested patterns.

By the end, you'll be able to design a pastebin that handles 10K writes/sec and 100K reads/sec, with proper expiry, deduplication, and rate limiting. You'll also know exactly when to use SQL vs NoSQL, and why your first instinct (just hash the content!) might burn you.

Why Most Pastebin Designs Fail at Scale

The textbook pastebin design uses a single SQL database, generates IDs via auto-increment, and stores pastes as TEXT columns. This works for 100 users. As write volume grows, every insert funnels through one primary and one ID sequence, so the database becomes the write bottleneck. Large TEXT values bloat the table and slow down backups and any query that touches those rows. And if you ever need to shard, auto-increment IDs become a nightmare, because independent shards hand out colliding IDs unless you coordinate them. The fix: use a distributed ID generator like Snowflake or random short URLs, and key the stored content by its hash. Also, separate metadata (short URL, user, expiry) from content (the actual paste text): store content in blob storage like S3, and metadata in a fast database like Cassandra or DynamoDB.

PasteStorage.systemdesign
// io.thecodeforge — System Design tutorial

// Metadata table (Cassandra or DynamoDB)
CREATE TABLE paste_metadata (
  short_url text PRIMARY KEY,  // e.g., "aB3xY9"
  user_id uuid,
  content_hash text,           // SHA256 of paste content
  content_url text,            // S3 key: "pastes/{content_hash}"
  created_at timestamp,
  expires_at timestamp
);

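// Expiry: an index on expires_at does not expire anything by itself.
// In Cassandra, write rows with USING TTL; in DynamoDB, mark expires_at as the TTL attribute.
// For SQL, index expires_at so the cleanup job can find expired rows quickly:
CREATE INDEX ON paste_metadata (expires_at);

// Content storage: S3 bucket with lifecycle policy
// Bucket: paste-content
// Key: pastes/{content_hash}
// Lifecycle: expire objects after 30 days (or match TTL)

// Unique ID generation (Snowflake-like)
// 64-bit: 1 bit unused, 41 bits timestamp, 10 bits worker ID, 12 bits sequence
// Base62-encoding a full 64-bit ID takes up to 11 chars.
// A random 7-char base62 URL gives 62^7 ≈ 3.5 trillion combinations.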
Output
Tables created. S3 bucket configured. ID generator ready.
Production Trap: Auto-Increment IDs
Never use auto-increment for a public pastebin. Attackers can enumerate all pastes by incrementing the ID. Use random short URLs (base62 of a random number) or hash-based URLs.
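As a rough illustration of the random short URLs recommended above, here is a minimal sketch of a generator. The name `generateShortUrl`, the 7-character default, and the alphabet ordering are assumptions for this example, not a prescribed implementation. It uses rejection sampling so each base62 character is equally likely.

```typescript
import { randomBytes } from "node:crypto";

const BASE62_ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

// Largest multiple of 62 that fits in one byte (62 * 4 = 248). Bytes >= 248 are
// rejected so that every character is equally likely (no modulo bias).
const REJECTION_THRESHOLD = 248;

// Hypothetical helper: returns `length` random base62 characters.
function generateShortUrl(length: number = 7): string {
  let id = "";
  while (id.length < length) {
    // Draw extra bytes per round because a few of them will be rejected.
    const bytes = randomBytes(length * 2);
    for (let i = 0; i < bytes.length && id.length < length; i++) {
      const byte = bytes[i];
      if (byte < REJECTION_THRESHOLD) {
        id += BASE62_ALPHABET[byte % 62];
      }
    }
  }
  return id;
}
```

Random IDs can still collide, so the metadata insert should be conditional on the short URL not existing yet, with a retry using a fresh ID when it does.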
[Figure: Pastebin Production Architecture. From ingestion to retrieval: ingest paste (user submits content via API), deduplication (hash content, check existing), expiry manager (TTL-based deletion or archive), rate limiter (token bucket per user/IP), cache layer (read-through cache for hot pastes), sharded database (hash-based sharding across nodes). Note: hashing alone misses duplicate content variations; canonicalize content before hashing.]

Deduplication: Why Hashing Alone Isn't Enough

Deduplication saves storage: if two users paste the same content, store it once. The naive approach is to hash the content (SHA256) and use the hash as the storage key. Hash collisions are astronomically unlikely, so they are not the problem. The problem is that trivial differences (a trailing newline, Windows vs Unix line endings) produce different hashes for what is effectively the same paste. So normalize the content (trim whitespace, unify line endings) before hashing. Even then, handing every uploader the same URL ties their pastes together, and you might want separate records per user (e.g., for analytics). The better approach: store content by hash, but return a unique short URL per paste. The hash is only for storage dedup, and each metadata row keeps its own short URL. This way, you save storage but preserve per-paste identity.

Deduplication.systemdesign
// io.thecodeforge — System Design tutorial

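// Assumes an AWS SDK v2 client (`s3`), a metadata client (`db`), and generateShortUrl() are defined elsewhere.
const crypto = require('crypto');

// Normalize content before hashing:
// unify line endings to \n, then trim leading/trailing whitespace.
// String.prototype.trim() also strips a leading BOM (U+FEFF).
function normalizeContent(content: string): string {
  return content.replace(/\r\n?/g, '\n').trim();
}

async function createPaste(pasteContent: string): Promise<string> {
  // Generate content hash from the normalized text
  const normalized = normalizeContent(pasteContent);
  const contentHash = crypto.createHash('sha256').update(normalized).digest('hex');

  // Check if content already exists in S3.
  // headObject rejects with a 404 error when the key is missing; it never resolves to a falsy value.
  const s3Key = `pastes/${contentHash}`;
  let exists = true;
  try {
    await s3.headObject({ Bucket: 'paste-content', Key: s3Key }).promise();
  } catch (err: any) {
    if (err.statusCode !== 404) throw err;
    exists = false;
  }

  // If yes, reuse the S3 key. If no, upload the normalized text so the object matches its hash.
  if (!exists) {
    await s3.upload({ Bucket: 'paste-content', Key: s3Key, Body: normalized }).promise();
  }

  // Generate unique short URL (not based on hash)
  const shortUrl = generateShortUrl(); // e.g., 7 random base62 chars

  // Store metadata with unique short URL and content hash
  await db.insert({
    short_url: shortUrl,
    content_hash: contentHash,
    content_url: s3Key,
    created_at: new Date(),
    expires_at: new Date(Date.now() + 7 * 24 * 60 * 60 * 1000) // 7 days
  });

  return shortUrl;
}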
Output
Paste stored. Short URL: 'aB3xY9'
Senior Shortcut: Content Normalization
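Always normalize content before hashing. A single trailing space changes the hash. Also, consider compressing content before storage (gzip) to save space. S3 does not compress or decompress anything itself, but if you upload the gzipped bytes with Content-Encoding: gzip, it returns that header on reads, and browsers and most HTTP clients decompress transparently. Clients that ignore the header must decompress on read.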
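To make the compression step concrete, the following is a small sketch using Node's built-in zlib. The helper names (`compressPaste`, `decompressPaste`, `GZIP_UPLOAD_PARAMS`) are made up for this example; only the zlib calls and the two S3 parameter names come from real APIs.

```typescript
import { gzipSync, gunzipSync } from "node:zlib";

// Compress the (already normalized) paste text before uploading it.
function compressPaste(text: string): Buffer {
  return gzipSync(Buffer.from(text, "utf8"));
}

// Reverse step for readers that fetch the raw object and must decode it themselves.
function decompressPaste(stored: Buffer): string {
  return gunzipSync(stored).toString("utf8");
}

// Extra parameters to merge into the S3 upload call so that HTTP clients
// know the body is gzip-encoded text.
const GZIP_UPLOAD_PARAMS = {
  ContentEncoding: "gzip",
  ContentType: "text/plain; charset=utf-8",
};
```

Compute the content hash on the uncompressed, normalized text, so the storage key does not depend on the compression settings.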
[Figure: Deduplication Flow with Content Hashing. Store once, reference many times: user uploads paste, compute SHA256 hash, look up the hash in the dedup index; if the hash exists, return the existing URL, otherwise store the new paste and return a short URL. Note: hashing alone misses trailing-whitespace changes, so normalize content first.]

Expiry: How to Actually Delete Pastes Without Breaking Reads

Expiry is easy to get wrong. The simplest approach: store an expires_at timestamp and run a cron job to delete expired rows. But if the cron job fails, expired pastes linger. Better: use database-level TTL if supported (DynamoDB TTL, Redis EXPIRE, Cassandra TTL). For S3, use lifecycle policies. But there's a catch: if you delete the content from S3 while metadata still references it, reads will 404. Solution: soft-delete metadata first, then delete content after a grace period. Or use a reference count: only delete content when no metadata references it. The simple version is to expire metadata after 7 days and let an S3 lifecycle rule delete objects after 30 days, so content outlives its metadata and is cleaned up eventually. With deduplication this has a hole: lifecycle expiration counts from the object's creation date, so content first uploaded 29 days ago and pasted again today would disappear while the new paste is still live. Rewrite the object on a dedup hit to reset its age, or fall back to reference counting.

Expiry.systemdesign
// io.thecodeforge — System Design tutorial

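// DynamoDB table with TTL attribute
// Enable TTL on 'expires_at' (console or UpdateTimeToLive API); the attribute must be a Number holding epoch seconds
// DynamoDB deletes expired items in the background, not at the exact expiry time (it can lag by up to about 48 hours),
// so reads must still compare expires_at with the current time and treat expired items as gone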

// For SQL: use a scheduled job
// PostgreSQL: CREATE INDEX ON paste_metadata (expires_at);
// Cron job every hour:
DELETE FROM paste_metadata WHERE expires_at < NOW();

// For S3 lifecycle policy (JSON)
{
  "Rules": [
    {
      "Id": "expire-pastes",
      "Status": "Enabled",
      "Filter": { "Prefix": "pastes/" },
      "Expiration": { "Days": 30 }
    }
  ]
}

// To avoid 404s on recently deleted content, implement a grace period:
// 1. Mark metadata as deleted (soft delete)
// 2. After 1 hour, delete content from S3
// 3. During grace period, return 410 Gone instead of 404
Output
TTL configured. Lifecycle policy applied.
Never Do This: Deleting Content Before Metadata
If you delete the S3 object before the metadata row, a read request that arrives between the two deletes will get a 404. Always delete metadata first, then content. Or use soft deletes.
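The reference-count option can be sketched with an in-memory model. This is illustrative only: `PasteStore` and its Maps stand in for the metadata database and S3, and the sketch assumes each short URL is inserted once. It shows the ordering (metadata first, content only when the last reference is gone), not a production implementation.

```typescript
interface PasteMeta {
  shortUrl: string;
  contentHash: string;
  expiresAt: number; // epoch milliseconds
}

class PasteStore {
  private metadata = new Map<string, PasteMeta>();
  private content = new Map<string, string>(); // stands in for S3
  private refCounts = new Map<string, number>();

  put(shortUrl: string, contentHash: string, body: string, expiresAt: number): void {
    if (!this.content.has(contentHash)) this.content.set(contentHash, body);
    this.refCounts.set(contentHash, (this.refCounts.get(contentHash) ?? 0) + 1);
    this.metadata.set(shortUrl, { shortUrl, contentHash, expiresAt });
  }

  // Expired pastes are treated as gone even if the sweeper has not run yet.
  read(shortUrl: string, now: number): string | null {
    const meta = this.metadata.get(shortUrl);
    if (!meta || meta.expiresAt <= now) return null;
    return this.content.get(meta.contentHash) ?? null;
  }

  // Delete expired metadata first, then content once nothing references it.
  sweep(now: number): void {
    for (const meta of Array.from(this.metadata.values())) {
      if (meta.expiresAt > now) continue;
      this.metadata.delete(meta.shortUrl);
      const remaining = (this.refCounts.get(meta.contentHash) ?? 1) - 1;
      if (remaining > 0) {
        this.refCounts.set(meta.contentHash, remaining);
      } else {
        this.refCounts.delete(meta.contentHash);
        this.content.delete(meta.contentHash);
      }
    }
  }

  hasContent(contentHash: string): boolean {
    return this.content.has(contentHash);
  }
}
```

In a real system the count would live in the metadata store and be updated atomically with the row insert and delete; the in-memory version only demonstrates the invariant.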

Rate Limiting: How to Stop Abuse Without Hurting Legit Users

Pastebin is a prime target for spam and abuse. Without rate limiting, a single user can upload thousands of pastes per second and fill your storage. The standard approach: token bucket or sliding window per user (IP or API key). But IP-based limiting is fragile behind NAT. Better: use API keys for authenticated users, and a CAPTCHA for anonymous uploads. For the rate limit itself, use a Redis-backed sliding window counter. Set limits: 10 pastes per minute for anonymous, 100 per minute for authenticated. Return 429 Too Many Requests with a Retry-After header. Also, implement a global rate limit to protect the database from traffic spikes.

RateLimiter.systemdesign
// io.thecodeforge — System Design tutorial

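// Redis sliding window rate limiter (node-redis v4 API)
const redis = require('redis');
const crypto = require('crypto');
const client = redis.createClient();
// node-redis v4 clients must be connected once at startup: await client.connect();

async function checkRateLimit(userId: string, limit: number, windowSeconds: number): Promise<boolean> {
  const key = `rate_limit:${userId}`;
  const now = Date.now();
  const windowStart = now - windowSeconds * 1000;

  // Remove entries older than the window
  await client.zRemRangeByScore(key, 0, windowStart);

  // Count entries in current window
  const count = await client.zCard(key);

  if (count >= limit) {
    return false; // rate limited
  }

  // Add current request. Sorted-set members must be unique, so append a random suffix;
  // otherwise two requests in the same millisecond would be counted once.
  await client.zAdd(key, { score: now, value: `${now}:${crypto.randomUUID()}` });
  await client.expire(key, windowSeconds); // auto-cleanup

  // Note: these steps are not atomic. Concurrent requests can both pass the count check;
  // move the logic into a Lua script if the limit must be exact.
  return true;
}

// Usage in upload endpoint
if (!await checkRateLimit(userId, 10, 60)) {
  res.set('Retry-After', '60');
  return res.status(429).json({ error: 'Too many requests. Try again in 60 seconds.' });
}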
Output
Rate limit check passed. Request allowed.
Interview Gold: Rate Limiting at Scale
For distributed rate limiting, use a centralized Redis cluster. But watch out for Redis being a single point of failure. Use Redis Sentinel or Cluster for high availability. Also, consider client-side rate limiting (e.g., exponential backoff) to reduce server load.
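The section also mentions token buckets as an alternative to the sliding window. A minimal single-process sketch follows, assuming one bucket per user held in memory; the class name and the injectable clock are choices made for this example. A distributed version would keep the same two fields (token count and last refill time) in Redis.

```typescript
class TokenBucket {
  private tokens: number;
  private lastRefillMs: number;
  private capacity: number;
  private refillPerSecond: number;
  private now: () => number;

  constructor(capacity: number, refillPerSecond: number, now: () => number = Date.now) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.now = now;
    this.tokens = capacity; // start full so short bursts are allowed
    this.lastRefillMs = now();
  }

  // Returns true and spends `cost` tokens if the request is allowed.
  tryConsume(cost: number = 1): boolean {
    this.refill();
    if (this.tokens < cost) return false;
    this.tokens -= cost;
    return true;
  }

  // Seconds until `cost` tokens are available; usable as a Retry-After value.
  retryAfterSeconds(cost: number = 1): number {
    this.refill();
    if (this.tokens >= cost) return 0;
    return Math.ceil((cost - this.tokens) / this.refillPerSecond);
  }

  // Add tokens for the time elapsed since the last call, capped at capacity.
  private refill(): void {
    const nowMs = this.now();
    const elapsedSeconds = (nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefillMs = nowMs;
  }
}
```

Compared with the sliding window, a token bucket stores constant state per user and tolerates short bursts up to `capacity`, at the cost of a less exact per-window count.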

Reading Pastes: Caching Strategies That Actually Work

Pastebin is read-heavy: a popular paste can get millions of views. Without caching, every read hits the database and S3, causing high latency and cost. The solution: cache metadata in Redis (or Memcached) and cache content in a CDN (CloudFront, Cloudflare). For metadata, cache the short URL → content URL mapping with a TTL of a few minutes. For content, set S3 bucket as an origin for CDN and cache with a long TTL (e.g., 24 hours). But beware: if a paste is deleted, the CDN might serve stale content. Use cache invalidation or short TTLs for sensitive data. Also, implement a read-through cache: on cache miss, fetch from DB and populate cache.

ReadCache.systemdesign
// io.thecodeforge — System Design tutorial

// Redis cache for metadata (same node-redis v4 `client` as the rate limiter, connected elsewhere)
async function getPasteMetadata(shortUrl: string): Promise<PasteMetadata | null> {
  const cacheKey = `paste:${shortUrl}`;
  const cached = await client.get(cacheKey);
  if (cached) {
    return JSON.parse(cached);
  }

  // Cache miss: fetch from DB
  const metadata = await db.get({ short_url: shortUrl });
  if (metadata) {
    // Set TTL to 5 minutes
    await client.setEx(cacheKey, 300, JSON.stringify(metadata));
  }
  return metadata ?? null;
}

// CDN for content: S3 bucket as origin
// CloudFront distribution with cache policy
// Cache based on query string? No, just path.
// Example TTLs: min=3600, default=86400, max=86400
// For invalidation: call CloudFront API when paste is deleted
// But invalidation costs money at volume. Alternative: lower the default TTL to 1 hour and accept eventual consistency.
Output
Metadata cached. Content served via CDN.
Senior Shortcut: Cache Invalidation
Don't invalidate cache on every delete. Instead, use short TTLs (1 hour) and accept that deleted pastes might be accessible for up to an hour. For compliance, implement a hard delete that bypasses cache (e.g., check a blacklist in Redis before serving).
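One way to picture the denylist idea is a read-through cache that consults the denylist before anything else. The sketch below is synchronous and in-memory to keep it short; `ReadThroughCache`, `hardDelete`, and the loader callback are hypothetical names, and a real version would make the same checks against Redis asynchronously.

```typescript
interface CacheEntry<V> {
  value: V;
  expiresAtMs: number;
}

class ReadThroughCache<V> {
  private entries = new Map<string, CacheEntry<V>>();
  private denylist = new Set<string>();
  private load: (key: string) => V | undefined;
  private ttlMs: number;
  private now: () => number;

  constructor(load: (key: string) => V | undefined, ttlMs: number, now: () => number = Date.now) {
    this.load = load;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    // Hard-deleted keys are never served, even if a cached copy exists.
    if (this.denylist.has(key)) return undefined;

    const hit = this.entries.get(key);
    if (hit && hit.expiresAtMs > this.now()) return hit.value;

    // Miss or stale entry: read through to the backing store and repopulate.
    const value = this.load(key);
    if (value === undefined) {
      this.entries.delete(key);
      return undefined;
    }
    this.entries.set(key, { value, expiresAtMs: this.now() + this.ttlMs });
    return value;
  }

  // Compliance path: block the key immediately instead of waiting for the TTL.
  hardDelete(key: string): void {
    this.denylist.add(key);
    this.entries.delete(key);
  }
}
```

An ordinary delete leaves the cached copy in place until its TTL runs out, which is the eventual consistency the note accepts; only `hardDelete` takes effect at once.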

Sharding: When One Database Isn't Enough

At massive scale (billions of pastes), a single database can't handle the write throughput. You need to shard. The simplest sharding key is the short URL's first character (or a hash of it). But that leads to hot spots (e.g., 'a' might have more pastes than 'z'). Better: use consistent hashing on the short URL. Distribute shards across multiple database instances. For reads, you need to know which shard to query: either embed the shard ID in the short URL (e.g., first 2 chars = shard ID) or use a lookup service. The former is simpler: generate short URLs with a prefix that maps to a shard. For example, shard 0 handles URLs starting with '0'-'9', shard 1 handles 'a'-'z', etc. But this requires rebalancing when adding shards. Consistent hashing minimizes rebalancing.

Sharding.systemdesign
// io.thecodeforge — System Design tutorial

// Consistent hashing for shard assignment
// ConsistentHashRing is not a built-in; assume a small class with addNode(name, virtualNodes) and getNode(key)
const hashRing = new ConsistentHashRing();
hashRing.addNode('shard0', 100); // 100 virtual nodes per shard
hashRing.addNode('shard1', 100);
hashRing.addNode('shard2', 100);

function getShard(shortUrl: string): string {
  return hashRing.getNode(shortUrl);
}

// Routing alternatives to hashing on every request:
// Option 1: embed shard ID in URL (e.g., '0aB3xY9')
// Option 2: use a separate lookup table (not recommended for latency)

// For writes: route to correct shard
async function writeMetadata(metadata: PasteMetadata): Promise<void> {
  const shard = getShard(metadata.short_url);
  await dbShards[shard].insert(metadata);
}

// For reads: same routing
async function readMetadata(shortUrl: string): Promise<PasteMetadata | null> {
  const shard = getShard(shortUrl);
  return dbShards[shard].get({ short_url: shortUrl });
}

// Adding a new shard: rebalance only a fraction of keys
// Use virtual nodes to spread load evenly
Output
Shard routing configured.
The Classic Bug: Hot Shard on Write
If your short URL generation is not random enough, certain shards may get more writes. Use a good random source (crypto.randomBytes) and base62 encode. Avoid using timestamps as part of the URL — they cause sequential writes to the same shard.
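The `ConsistentHashRing` used in the routing example is not a library class. One possible minimal version is sketched below, assuming SHA-256 truncated to 32 bits as the ring hash and the same `addNode`/`getNode` method names; both choices are assumptions made for this example.

```typescript
import { createHash } from "node:crypto";

// Map any string to a position on a 32-bit ring.
function hashToUint32(input: string): number {
  return createHash("sha256").update(input).digest().readUInt32BE(0);
}

class ConsistentHashRing {
  // Sorted list of (position, node) pairs; each node appears once per virtual node.
  private ring: { point: number; node: string }[] = [];

  addNode(node: string, virtualNodes: number = 100): void {
    for (let i = 0; i < virtualNodes; i++) {
      this.ring.push({ point: hashToUint32(`${node}#${i}`), node });
    }
    this.ring.sort((a, b) => a.point - b.point);
  }

  getNode(key: string): string {
    if (this.ring.length === 0) throw new Error("ring has no nodes");
    const target = hashToUint32(key);
    // Binary search for the first ring position >= target.
    let lo = 0;
    let hi = this.ring.length;
    while (lo < hi) {
      const mid = (lo + hi) >>> 1;
      if (this.ring[mid].point < target) lo = mid + 1;
      else hi = mid;
    }
    // Wrap around to the first position when the key hashes past the last one.
    return this.ring[lo % this.ring.length].node;
  }
}
```

When a shard is added, the only keys that change owner are the ones that now fall just before one of the new shard's virtual nodes, which is why rebalancing touches a fraction of the data rather than all of it.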
[Figure: Sharding: Range vs Consistent Hashing. Range sharding keys by the first character of the short URL: simple to implement, but popular letters create hot spots and rebalancing requires moving data. Consistent hashing maps keys into a ring: even distribution across shards and minimal data movement on resize, at the cost of slightly more complex logic. Consistent hashing wins at scale; range sharding is fine for early stages.]

When Not to Use This Design

This design is overkill for a small internal pastebin with <100 users. In that case, just use a SQLite file and a simple HTTP server. Also, if you need strong consistency (e.g., paste must be immediately readable after upload), avoid eventual consistency caches and CDNs. For compliance (e.g., GDPR right to deletion), you need immediate cache invalidation, which adds complexity. Finally, if your pastes are tiny (<1KB) and you have few users, just store them in the database directly — no need for S3. The trade-off is simplicity vs scalability.

Production incident · Post-mortem · Severity: high

The 4GB Container That Kept Dying

Symptom
Paste uploads started failing with 'Connection reset by peer' after 30 seconds. The container was OOM-killed every 10 minutes.
Assumption
We assumed a memory leak in the Node.js process handling uploads.
Root cause
The paste service was loading the entire request body into a Buffer before writing to disk. A user uploaded a 2GB log file, which blew past the 4GB container memory limit. The OOM killer terminated the container, and the load balancer retried the request on another container, cascading the failure.
Fix
Switch to streaming uploads: pipe the request directly to disk or S3. Set a max request body size (e.g., 10MB) and reject larger payloads with a 413 error. Add a memory limit per request using a streaming counter.
Key lesson
  • Never buffer the entire request body in memory.
  • Stream or die.
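The streaming counter from the fix could look something like the sketch below: a pass-through stream that counts bytes and fails once the limit is crossed. `SizeLimitStream` and `PayloadTooLargeError` are names invented for this example, and the 10MB figure in the usage note is the limit suggested above.

```typescript
import { Transform } from "node:stream";

class PayloadTooLargeError extends Error {
  readonly statusCode = 413;

  constructor(limitBytes: number) {
    super(`request body exceeds ${limitBytes} bytes`);
    this.name = "PayloadTooLargeError";
  }
}

// Pass-through stream that forwards chunks untouched and errors out as soon as
// the running byte total exceeds the limit, so memory use stays at one chunk.
class SizeLimitStream extends Transform {
  private seenBytes = 0;
  private limitBytes: number;

  constructor(limitBytes: number) {
    super();
    this.limitBytes = limitBytes;
  }

  _transform(
    chunk: Buffer,
    _encoding: string,
    callback: (error?: Error | null, data?: Buffer) => void,
  ): void {
    this.seenBytes += chunk.length;
    if (this.seenBytes > this.limitBytes) {
      callback(new PayloadTooLargeError(this.limitBytes));
      return;
    }
    callback(null, chunk);
  }
}
```

In an upload handler, the request would be piped through `new SizeLimitStream(10 * 1024 * 1024)` on its way to disk or S3 (for example with `pipeline` from `node:stream`), and the error's `statusCode` mapped to a 413 response. Keep the proxy-level limit as well, so oversized bodies are usually rejected before they reach the application.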
Production debug guide
Systematic recovery paths for the failure modes engineers actually hit. 3 entries.
Symptom · 01
Paste upload returns 413 Request Entity Too Large
Fix
1. Check the max body size setting in your reverse proxy (nginx: client_max_body_size). 2. Verify the application-level limit (e.g., 10MB). 3. Inform the user of the limit in the error message.
Symptom · 02
Paste read returns 404 but paste exists
Fix
1. Check if the paste has expired (compare expires_at with current time). 2. Check if the metadata exists in the database. 3. Check if the S3 object exists. 4. If using CDN, check cache invalidation.
Symptom · 03
Rate limiter blocking legitimate users
Fix
1. Check Redis for the user's rate limit key. 2. Verify the limit value and window size. 3. Check if the user is behind a NAT (all users share the same IP). 4. Implement API keys for authenticated users to bypass IP-based limits.
Design Pastebin Triage Cheat Sheet
First-response commands for when things go wrong, copy-paste ready.
Upload fails with `Connection reset by peer`
Immediate action
Check if request body is too large and causing OOM.
Commands
`curl -v -X POST --data-binary @largefile.txt https://pastebin.example.com/api/paste`
`kubectl top pod -l app=pastebin`
Fix now
Set client_max_body_size 10m; in nginx and add streaming upload.
Read returns 404 for existing paste
Immediate action
Check if paste expired or if cache is stale.
Commands
`redis-cli get paste:aB3xY9`
`aws s3api head-object --bucket paste-content --key pastes/<hash>`
Fix now
If metadata exists but content missing, restore from backup. If cache stale, flush cache key.
Rate limiter returning 429 for all users
Immediate action
Check Redis for global rate limit key.
Commands
`redis-cli --scan --pattern 'rate_limit:*' | wc -l`
`redis-cli ttl rate_limit:global`
Fix now
If global limit hit, increase limit or scale up. If Redis down, fail open (allow requests) or use a fallback.
High latency on paste reads
Immediate action
Check if cache is missing or database is slow.
Commands
`redis-cli info stats | grep hits`
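`aws cloudwatch get-metric-statistics --namespace AWS/DynamoDB --metric-name SuccessfulRequestLatency --dimensions Name=TableName,Value=paste_metadata Name=Operation,Value=GetItem --start-time <start> --end-time <end> --period 60 --statistics Average`
Fix now
Increase cache TTL or add more cache nodes. If the DB is slow, add read replicas (SQL) or put DynamoDB Accelerator (DAX) in front of DynamoDB.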
Feature / Aspect | SQL (PostgreSQL) | NoSQL (DynamoDB)
Consistency | Strong (ACID transactions) | Eventual by default; strongly consistent reads available per request
Scalability | Vertical + read replicas | Horizontal auto-scaling
TTL support | Manual cron job | Built-in TTL attribute
Query flexibility | Rich SQL | Limited to primary key + secondary indexes
Cost at scale | Expensive for high write throughput | Pay-per-request, cheaper for variable load

Key takeaways

1. Never buffer the entire request body in memory: stream to storage or die.
2. Use content-addressable storage (hash-based) for deduplication, but normalize content first.
3. Separate metadata from content: metadata in a fast database, content in blob storage.
4. For short URLs, use random base62 strings, not auto-increment or hash of content.
INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

Q01 (Senior): How does your pastebin handle duplicate content under concurrent uploads...
Q02 (Senior): When would you choose DynamoDB over PostgreSQL for paste metadata?
Q03 (Senior): What happens when your Redis cache goes down? How do you mitigate?
Q04 (Junior): How would you generate unique short URLs for a pastebin?
Q05 (Senior): A user reports that their paste returns 410 Gone even though it hasn't e...
Q06 (Senior): How would you design the pastebin to handle 1 billion pastes with 99.99%...
Q01 of 06 (Senior)

How does your pastebin handle duplicate content under concurrent uploads? What if two users upload the same paste at the exact same time?

ANSWER
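Use content-addressable storage: hash the content and check existence before writing. For concurrency, use a distributed lock (Redis Redlock) or, more simply, a conditional write (If-None-Match: *). The first writer creates the S3 object; the second gets 412 Precondition Failed (or 409 Conflict if the two writes overlap) and reuses the existing object. Because the key is the content hash, both writers would store identical bytes, so the race cannot corrupt the content object. Each upload still gets its own short URL.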
FAQ · 4 QUESTIONS

Frequently Asked Questions

01
How do I generate unique short URLs for a pastebin?
02
What's the difference between storing pastes in SQL vs NoSQL?
03
How do I handle paste expiry in production?
04
What happens if two users upload the exact same paste at the same time?