Senior 4 min · June 25, 2026

Design Pastebin: How to Build a Production-Grade Paste Service That Won't Fall Over at 3 AM

Q: How do I generate unique short URLs for a pastebin?

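Generate a random 7-character base62 string from a cryptographically secure source (62^7 ≈ 3.5 trillion combinations) and insert it with a uniqueness check, so the rare collision triggers a retry. A 64-bit Snowflake-style ID (timestamp + worker ID + sequence) is unique without a lookup, but it needs up to 11 base62 characters, not 7, and it is time-ordered, which makes URLs easier to enumerate.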

Q: What's the difference between storing pastes in SQL vs NoSQL?

SQL offers strong consistency and complex queries but scales vertically. NoSQL (DynamoDB) scales horizontally, has built-in TTL, and is cheaper for high write throughput. For pastebin, NoSQL is usually better because the access pattern is simple (get by short URL).

Q: How do I handle paste expiry in production?

Use database-level TTL (DynamoDB TTL, Redis EXPIRE) or a background job that deletes expired rows. For S3, use lifecycle policies. Always delete metadata before content to avoid 404s. Implement a grace period for soft deletes.

Q: What happens if two users upload the exact same paste at the same time?

Use content-addressable storage: hash the content and check existence before writing. Use a distributed lock or conditional write to ensure only one creates the object. The second upload reuses the existing object and returns a new short URL.

Design pastebin for production: learn how to handle text storage, expiry, rate limiting, and sharding with real-world trade-offs and war stories.

Naren Founder & Principal Engineer

20+ years shipping large-scale distributed systems. Lessons pulled from things that broke in production.

✓ Production

production tested

June 25, 2026

last updated

1,663

articles · all by Naren

⚡Quick Answer

To design a pastebin, you need a web server, a database (SQL or NoSQL), a unique ID generator (such as a random base62 string, or base62 encoding of a UUID), and either database TTL or a background job for expiry. Key trade-offs: SQL for consistency vs NoSQL for scale, and deduplicating stored content vs keeping a separate URL per paste.

✦ Definition · ~90s read

What is Design Pastebin?

Design Pastebin is a system design exercise for building a service that lets users upload text snippets (pastes) and share them via unique URLs. It covers storage, expiry, deduplication, rate limiting, and scaling to millions of users.

Plain-English First

Think of a pastebin like a public bulletin board where you can pin a note and get a ticket stub with a number. Anyone with the stub can read the note. The board automatically tears down old notes after a while. If someone pins the exact same note twice, the board keeps a single copy but still hands out a separate stub each time.

Most pastebin tutorials are toy projects that die the second they see real traffic. They use a single database, no caching, and no rate limiting. I've seen a paste service take down an entire API gateway because one user uploaded a 50MB log file and the server tried to load it all into memory. Don't be that team. Here's how to build a pastebin that survives production.

The core challenge is simple: accept text, store it, give back a short URL, and delete it after a TTL. But the devil is in the details — how do you generate unique IDs at scale? How do you handle duplicate pastes? What happens when a paste is 100MB? How do you prevent abuse? This article answers all of that with battle-tested patterns.

By the end, you'll be able to design a pastebin that handles 10K writes/sec and 100K reads/sec, with proper expiry, deduplication, and rate limiting. You'll also know exactly when to use SQL vs NoSQL, and why your first instinct (just hash the content!) might burn you.

Why Most Pastebin Designs Fail at Scale

The textbook pastebin design uses a single SQL database, generates IDs via auto-increment, and stores pastes as TEXT columns. This works for 100 users. As write volume grows, auto-increment becomes a bottleneck because every insert must go through the one primary that owns the sequence. As data grows, large TEXT values bloat the table and slow down backups, maintenance, and any query that touches those rows. And if you ever need to shard, auto-increment IDs become a nightmare because independent shards hand out the same numbers. The fix: use a distributed ID generator like Snowflake or a key-value store with content-addressed hashing. Also, separate metadata (short URL, user, expiry) from content (the actual paste text): store content in blob storage like S3, and metadata in a fast database like Cassandra or DynamoDB.

PasteStorage.systemdesignSYSTEMDESIGN

// io.thecodeforge — System Design tutorial

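// Metadata table (Cassandra CQL shown; in DynamoDB, short_url is the partition key)
CREATE TABLE paste_metadata (
  short_url text PRIMARY KEY,  // e.g., "aB3xY9k"
  user_id uuid,
  content_hash text,           // SHA-256 of the normalized paste content
  content_url text,            // S3 key: "pastes/{content_hash}"
  created_at timestamp,
  expires_at timestamp
);

// Expiry cleanup: Cassandra has no TTL index, and a secondary index on
// expires_at does not delete anything. Set a TTL (in seconds) on the row at
// write time instead, e.g. 7 days:
// INSERT INTO paste_metadata (short_url, content_hash, content_url, created_at, expires_at)
//   VALUES (?, ?, ?, ?, ?) USING TTL 604800;

// Content storage: S3 bucket with lifecycle policy
// Bucket: paste-content
// Key: pastes/{content_hash}
// Lifecycle: expire objects after 30 days (or match TTL)

// Short URL generation
// Public URLs: 7 random base62 chars (62^7 ≈ 3.5 trillion combinations),
// inserted with IF NOT EXISTS so a collision is retried.
// Snowflake-like 64-bit IDs (1 bit unused, 41 bits timestamp, 10 bits worker ID,
// 12 bits sequence) are unique without a lookup, but they are time-ordered and
// need up to 11 base62 chars, not 7.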

Output

Tables created. S3 bucket configured. ID generator ready.

Production Trap: Auto-Increment IDs

Never use auto-increment for a public pastebin. Attackers can enumerate all pastes by incrementing the ID. Use random short URLs (base62 of a random number) or hash-based URLs.
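A minimal sketch of the random short URL generator this trap calls for, assuming Node's built-in `crypto` module. The names `generateShortId` and `base62LengthForBits` are illustrative, not part of any library. The second helper only shows how many base62 characters a given bit width needs, which is why a full 64-bit ID does not fit in 7 characters.

```typescript
import { randomInt } from "node:crypto";

const BASE62_ALPHABET =
  "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

// Draw each character independently from a CSPRNG.
// randomInt(max) returns a uniform integer in [0, max), so there is no modulo bias.
function generateShortId(length: number = 7): string {
  let id = "";
  for (let i = 0; i < length; i++) {
    id += BASE62_ALPHABET.charAt(randomInt(BASE62_ALPHABET.length));
  }
  return id;
}

// Smallest n such that 62^n >= 2^bits, i.e. how many base62 characters
// are needed to represent any value of the given bit width.
function base62LengthForBits(bits: number): number {
  return Math.ceil(bits / (Math.log(62) / Math.LN2));
}
```

Random IDs can collide, so the metadata insert should still be conditional (insert only if the short URL does not exist) and retry with a fresh ID on conflict.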

[Figure: Pastebin Production Architecture]

Deduplication: Why Hashing Alone Isn't Enough

Deduplication saves storage: if two users paste the same content, store it once. The naive approach is to hash the content (SHA-256), use the hash as the storage key, and return the same URL to both users. Hash collisions are astronomically unlikely, so that is not the problem. The problem is that trivially different content (e.g., a trailing newline or CRLF line endings) produces a different hash, so you need to normalize content (trim whitespace, unify line endings) before hashing. Sharing one URL also ties the pastes together: they cannot have separate owners, expiry times, or analytics. The better approach: store content by hash, but return a unique short URL per paste. The hash is only for storage dedup, and each metadata row keeps its own short URL. This way, you save storage but preserve per-paste identity.

Deduplication.systemdesignSYSTEMDESIGN

// io.thecodeforge — System Design tutorial

const crypto = require('crypto');

// Normalize content before hashing
function normalizeContent(content: string): string {
  // trim() strips leading/trailing whitespace, including a leading BOM (U+FEFF)
  // replace() unifies \r\n and bare \r line endings to \n
  return content.trim().replace(/\r\n?/g, '\n');
}

// s3 (AWS SDK v2 client), db, and generateShortUrl are assumed to be set up elsewhere
async function createPaste(pasteContent: string): Promise<string> {
  const normalized = normalizeContent(pasteContent);

  // Generate content hash
  const contentHash = crypto.createHash('sha256')
    .update(normalized)
    .digest('hex');

  // Check if content already exists in S3
  // If yes, reuse the S3 key. If no, upload.
  // headObject rejects with a 404 error when the key is missing;
  // it never resolves to a falsy value.
  const s3Key = `pastes/${contentHash}`;
  try {
    await s3.headObject({Bucket: 'paste-content', Key: s3Key}).promise();
  } catch (err: any) {
    if (err.statusCode !== 404) throw err;
    // Upload the normalized text so the stored bytes match the hash in the key
    await s3.upload({Bucket: 'paste-content', Key: s3Key, Body: normalized}).promise();
  }

  // Generate unique short URL (not based on hash)
  const shortUrl = generateShortUrl(); // e.g., 7 random base62 chars

  // Store metadata with unique short URL and content hash
  await db.insert({
    short_url: shortUrl,
    content_hash: contentHash,
    content_url: s3Key,
    created_at: new Date(),
    expires_at: new Date(Date.now() + 7 * 24 * 60 * 60 * 1000) // 7 days
  });

  return shortUrl;
}

Output

Paste stored. Short URL: 'aB3xY9k'

Senior Shortcut: Content Normalization

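Always normalize content before hashing. A single trailing space changes the hash. Also, consider compressing content before storage (gzip) to save space. S3 does not compress or decompress anything itself: you upload the gzipped bytes with a Content-Encoding: gzip header, S3 returns that header on reads, and browsers and most HTTP clients decompress transparently.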
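Here is a small sketch of the gzip step, assuming Node's built-in `zlib` module. `compressPaste` and `decompressPaste` are hypothetical helper names. Hash the normalized text before compressing, so the storage key does not depend on compression settings.

```typescript
import { gzipSync, gunzipSync } from "node:zlib";

// Compress normalized paste text before uploading it to blob storage.
function compressPaste(text: string): Buffer {
  return gzipSync(Buffer.from(text, "utf8"));
}

// Reverse step for readers that fetch the raw object and do not
// rely on HTTP Content-Encoding handling.
function decompressPaste(stored: Buffer): string {
  return gunzipSync(stored).toString("utf8");
}
```

The synchronous calls block the event loop, so they only suit small pastes. For large bodies, `zlib.createGzip()` in a stream pipeline is the safer choice.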

[Figure: Deduplication Flow with Content Hashing]

Expiry: How to Actually Delete Pastes Without Breaking Reads

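Expiry is easy to get wrong. The simplest approach is to store an expires_at timestamp and run a cron job that deletes expired rows, but if the cron job fails, expired pastes linger. Better: use database-level TTL where supported (DynamoDB TTL, Redis EXPIRE, Cassandra TTL), and still check expires_at on read, because TTL deletion can lag behind the expiry time. For S3, use lifecycle policies. The catch: if content is deleted from S3 while metadata still references it, reads will 404. Solution: soft-delete metadata first, then delete content after a grace period. Or use a reference count and only delete content when no metadata references it. A simple setup is to delete metadata after 7 days and let an S3 lifecycle rule delete objects after 30 days, so content outlives its metadata. With deduplication this needs one extra step: lifecycle expiration counts from the object's creation date, so a new paste that reuses a 29-day-old object would lose its content a day later. Re-upload the object whenever an existing hash is reused so the clock restarts, or rely on reference counts instead of a lifecycle rule.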

Expiry.systemdesignSYSTEMDESIGN

// io.thecodeforge — System Design tutorial

// DynamoDB table with TTL attribute
// Enable TTL on 'expires_at' (console or UpdateTimeToLive API)
// The attribute must be a Number holding Unix epoch time in seconds
// Deletion is asynchronous (typically within a few days of expiry), so reads
// must still filter out items whose expires_at is in the past

// For SQL: use a scheduled job
// PostgreSQL: CREATE INDEX ON paste_metadata (expires_at);
// Cron job every hour:
DELETE FROM paste_metadata WHERE expires_at < NOW();

// For S3 lifecycle policy (JSON); the rule identifier key is "ID"
{
  "Rules": [
    {
      "ID": "expire-pastes",
      "Status": "Enabled",
      "Filter": { "Prefix": "pastes/" },
      "Expiration": { "Days": 30 }
    }
  ]
}

// To avoid 404s on recently deleted content, implement a grace period:
// 1. Mark metadata as deleted (soft delete)
// 2. After 1 hour, delete content from S3
// 3. During grace period, return 410 Gone instead of 404

Output

TTL configured. Lifecycle policy applied.

Never Do This: Deleting Content Before Metadata

If you delete the S3 object before the metadata row, a read request that arrives between the two deletes will get a 404. Always delete metadata first, then content. Or use soft deletes.
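The soft-delete and grace-period rules above can be sketched as two pure functions. This is an illustration under assumptions: the `PasteMeta` shape, the field names, and the `liveReferences` count (how many non-deleted metadata rows still point at the same content hash) are hypothetical, not part of the schema shown earlier.

```typescript
// Timestamps are epoch milliseconds. deletedAt is null until the row is soft-deleted.
type PasteMeta = { shortUrl: string; expiresAt: number; deletedAt: number | null };
type ReadStatus = 200 | 404 | 410;

const GRACE_PERIOD_MS = 60 * 60 * 1000; // 1 hour, as in the steps above

// Decide the HTTP status for a read before touching blob storage.
function resolveRead(meta: PasteMeta | null, nowMs: number): ReadStatus {
  if (meta === null) return 404;               // never existed, or already hard-deleted
  if (meta.deletedAt !== null) return 410;     // soft-deleted, content may be gone
  if (nowMs >= meta.expiresAt) return 410;     // expired, but cleanup has not run yet
  return 200;
}

// Content may be removed only after the grace period, and only when no
// other live paste shares the same deduplicated object.
function canDeleteContent(meta: PasteMeta, liveReferences: number, nowMs: number): boolean {
  if (meta.deletedAt === null) return false;
  if (liveReferences > 0) return false;
  return nowMs - meta.deletedAt >= GRACE_PERIOD_MS;
}
```

Checking `expiresAt` on every read matters even with database TTL, because TTL deletion runs in the background and can lag behind the expiry time.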

Rate Limiting: How to Stop Abuse Without Hurting Legit Users

Pastebin is a prime target for spam and abuse. Without rate limiting, a single user can upload thousands of pastes per second and fill your storage. The standard approach: token bucket or sliding window per user (IP or API key). But IP-based limiting is fragile behind NAT. Better: use API keys for authenticated users, and a CAPTCHA for anonymous uploads. For the rate limit itself, use a Redis-backed sliding window counter. Set limits: 10 pastes per minute for anonymous, 100 per minute for authenticated. Return 429 Too Many Requests with a Retry-After header. Also, implement a global rate limit to protect the database from traffic spikes.

RateLimiter.systemdesignSYSTEMDESIGN

// io.thecodeforge — System Design tutorial

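// Redis sliding window rate limiter (node-redis v4 API)
const redis = require('redis');
const crypto = require('crypto');
const client = redis.createClient();
const ready = client.connect(); // v4 clients must connect before running commands

async function checkRateLimit(userId: string, limit: number, windowSeconds: number): Promise<boolean> {
  await ready;
  const key = `rate_limit:${userId}`;
  const now = Date.now();
  const windowStart = now - windowSeconds * 1000;

  // Remove entries older than the window
  await client.zRemRangeByScore(key, 0, windowStart);

  // Count entries in current window
  const count = await client.zCard(key);

  if (count >= limit) {
    return false; // rate limited
  }

  // Add current request. The member must be unique: using only the timestamp
  // would merge two requests that arrive in the same millisecond.
  await client.zAdd(key, { score: now, value: `${now}:${crypto.randomUUID()}` });
  await client.expire(key, windowSeconds); // auto-cleanup

  // Note: these steps are not atomic, so concurrent requests can slightly
  // exceed the limit. Move them into a Lua script if that matters.
  return true;
}

// Usage in upload endpoint
if (!await checkRateLimit(userId, 10, 60)) {
  res.set('Retry-After', '60');
  return res.status(429).json({ error: 'Too many requests. Try again in 60 seconds.' });
}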

Output

Rate limit check passed. Request allowed.

Interview Gold: Rate Limiting at Scale

For distributed rate limiting, use a centralized Redis cluster. But watch out for Redis being a single point of failure. Use Redis Sentinel or Cluster for high availability. Also, consider client-side rate limiting (e.g., exponential backoff) to reduce server load.
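The section mentions token buckets but only shows a sliding window, so here is a minimal in-memory token bucket for comparison. It is a single-process sketch with an injected clock. The class name and the convention of returning seconds to wait are assumptions made for illustration, and a distributed version would keep the same two fields (tokens, last refill time) in Redis.

```typescript
class TokenBucket {
  private tokens: number;
  private lastRefillMs: number;

  constructor(
    private readonly capacity: number,        // maximum burst size
    private readonly refillPerSecond: number, // sustained rate
    nowMs: number
  ) {
    this.tokens = capacity;
    this.lastRefillMs = nowMs;
  }

  // Returns 0 when the request is allowed, otherwise the number of whole
  // seconds the caller should wait (usable as a Retry-After value).
  tryConsume(nowMs: number): number {
    const elapsedSeconds = Math.max(0, nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefillMs = nowMs;

    if (this.tokens >= 1) {
      this.tokens -= 1;
      return 0;
    }
    return Math.ceil((1 - this.tokens) / this.refillPerSecond);
  }
}
```

For the anonymous limit of 10 pastes per minute, this would be `new TokenBucket(10, 10 / 60, Date.now())`. Unlike the sliding window, it stores two numbers per user instead of one sorted-set entry per request.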

Reading Pastes: Caching Strategies That Actually Work

Pastebin is read-heavy: a popular paste can get millions of views. Without caching, every read hits the database and S3, causing high latency and cost. The solution: cache metadata in Redis (or Memcached) and cache content in a CDN (CloudFront, Cloudflare). For metadata, cache the short URL → content URL mapping with a TTL of a few minutes. For content, set S3 bucket as an origin for CDN and cache with a long TTL (e.g., 24 hours). But beware: if a paste is deleted, the CDN might serve stale content. Use cache invalidation or short TTLs for sensitive data. Also, implement a read-through cache: on cache miss, fetch from DB and populate cache.

ReadCache.systemdesignSYSTEMDESIGN

// io.thecodeforge — System Design tutorial

// Redis cache for metadata (client is a connected node-redis v4 client)
async function getPasteMetadata(shortUrl: string): Promise<PasteMetadata | null> {
  const cacheKey = `paste:${shortUrl}`;
  const cached = await client.get(cacheKey);
  if (cached) {
    // Dates come back as ISO strings after JSON.parse
    return JSON.parse(cached);
  }

  // Cache miss: fetch from DB
  const metadata = await db.get({ short_url: shortUrl });
  if (metadata) {
    // Set TTL to 5 minutes (use a shorter value if the paste expires sooner)
    await client.setEx(cacheKey, 300, JSON.stringify(metadata));
  }
  return metadata;
}

// CDN for content: S3 bucket as origin
// CloudFront distribution with cache policy
// Cache based on query string? No, just path.
// Set TTL: min=0, max=3600, default=3600 (1 hour)
// For invalidation: call CloudFront API when paste is deleted
// But invalidation costs money. Better: use short TTL (1 hour) and accept eventual consistency.

Output

Metadata cached. Content served via CDN.

Senior Shortcut: Cache Invalidation

Don't invalidate cache on every delete. Instead, use short TTLs (1 hour) and accept that deleted pastes might be accessible for up to an hour. For compliance, implement a hard delete that bypasses cache: check a denylist in Redis before the application serves a paste, and issue an explicit CDN invalidation for that one path, since the denylist cannot stop an edge-cached copy.
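A minimal in-memory sketch of the read-through cache with a hard-delete denylist. It is synchronous and single-process to keep the logic visible. `ReadThroughCache`, `hardDelete`, and the injected clock are illustrative names, and in production the map and the set would live in Redis.

```typescript
type Loader<V> = (key: string) => V | null;

class ReadThroughCache<V> {
  private entries = new Map<string, { value: V; expiresAtMs: number }>();
  private denylist = new Set<string>();

  constructor(private readonly ttlMs: number, private readonly load: Loader<V>) {}

  get(key: string, nowMs: number): V | null {
    // Hard-deleted keys are never served, even if a cached copy exists.
    if (this.denylist.has(key)) return null;

    const hit = this.entries.get(key);
    if (hit !== undefined && nowMs < hit.expiresAtMs) return hit.value;

    // Cache miss or stale entry: load from the source of truth and repopulate.
    const value = this.load(key);
    if (value !== null) {
      this.entries.set(key, { value, expiresAtMs: nowMs + this.ttlMs });
    } else {
      this.entries.delete(key);
    }
    return value;
  }

  hardDelete(key: string): void {
    this.denylist.add(key);
    this.entries.delete(key);
  }
}
```

Ordinary deletes simply wait for the TTL to run out, while `hardDelete` takes effect on the next read.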

Sharding: When One Database Isn't Enough

At massive scale (billions of pastes), a single database can't handle the write throughput. You need to shard. The simplest scheme is range-based: map the short URL's first character to a shard, for example shard 0 handles URLs starting with '0'-'9', shard 1 handles 'a'-'z', etc. That split is uneven (10 leading characters versus 26, so shard 1 gets roughly 2.6 times the keys of shard 0 when URLs are random), and changing the number of shards means redrawing ranges and moving large amounts of data. Better: use consistent hashing on the short URL and distribute shards across multiple database instances. For reads, you need to know which shard to query: recompute the hash from the short URL, embed the shard ID in the short URL (e.g., first 2 chars = shard ID), or use a lookup service. Embedding is simple but pins every paste to its shard, and a lookup service adds a hop to every read. Consistent hashing needs neither, and it moves only a fraction of keys when a shard is added.

Sharding.systemdesignSYSTEMDESIGN

// io.thecodeforge — System Design tutorial

// Consistent hashing for shard assignment
// ConsistentHashRing is a placeholder for your ring implementation or library
const hashRing = new ConsistentHashRing();
hashRing.addNode('shard0', 100); // 100 virtual nodes per shard
hashRing.addNode('shard1', 100);
hashRing.addNode('shard2', 100);

function getShard(shortUrl: string): string {
  return hashRing.getNode(shortUrl);
}

// Alternatives to hashing the short URL on every request:
// Option 1: embed shard ID in URL (e.g., '0aB3xY9')
// Option 2: use a separate lookup table (not recommended for latency)

// For writes: route to correct shard (dbShards maps shard name to a DB client)
async function writePaste(metadata: PasteMetadata): Promise<void> {
  await dbShards[getShard(metadata.short_url)].insert(metadata);
}

// For reads: same routing
async function readPaste(shortUrl: string): Promise<PasteMetadata | null> {
  return dbShards[getShard(shortUrl)].get({ short_url: shortUrl });
}

// Adding a new shard: rebalance only a fraction of keys
// Use virtual nodes to spread load evenly

Output

Shard routing configured.

The Classic Bug: Hot Shard on Write

If your short URL generation is not random enough, certain shards may get more writes. Use a good random source (crypto.randomBytes) and base62 encode. With range-based or prefix-based sharding, avoid timestamps or counters at the front of the URL: consecutive IDs share a prefix, so all new writes land on the same shard.
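The `ConsistentHashRing` used in the routing code is not a built-in, so here is one way it could look. This is a sketch assuming Node's `crypto` module. MD5 is used only to spread keys around the ring, not for security, and the `node#index` naming of virtual nodes is an arbitrary choice.

```typescript
import { createHash } from "node:crypto";

class ConsistentHashRing {
  // Sorted list of ring points, each owned by one physical node.
  private ring: { point: number; node: string }[] = [];

  // First 4 bytes of the MD5 digest as an unsigned 32-bit ring position.
  private static hash(key: string): number {
    return createHash("md5").update(key).digest().readUInt32BE(0);
  }

  addNode(node: string, virtualNodes: number = 100): void {
    for (let i = 0; i < virtualNodes; i++) {
      this.ring.push({ point: ConsistentHashRing.hash(`${node}#${i}`), node });
    }
    this.ring.sort((a, b) => a.point - b.point);
  }

  // A key belongs to the first ring point at or after its hash,
  // wrapping around to the start of the ring.
  getNode(key: string): string {
    if (this.ring.length === 0) throw new Error("ring is empty");
    const h = ConsistentHashRing.hash(key);
    let lo = 0;
    let hi = this.ring.length;
    while (lo < hi) {
      const mid = (lo + hi) >>> 1;
      if (this.ring[mid].point < h) lo = mid + 1;
      else hi = mid;
    }
    return this.ring[lo % this.ring.length].node;
  }
}
```

When a fourth shard joins, the only keys that change owner are the ones that now fall just before one of its ring points, and they all move to the new shard. Existing shards never trade keys with each other.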

[Figure: Sharding: Range vs Consistent Hashing]

When Not to Use This Design

This design is overkill for a small internal pastebin with <100 users. In that case, just use a SQLite file and a simple HTTP server. Also, if you need strong consistency (e.g., paste must be immediately readable after upload), avoid eventual consistency caches and CDNs. For compliance (e.g., GDPR right to deletion), you need immediate cache invalidation, which adds complexity. Finally, if your pastes are tiny (<1KB) and you have few users, just store them in the database directly — no need for S3. The trade-off is simplicity vs scalability.

● Production incident · Post-mortem · Severity: high

The 4GB Container That Kept Dying

Symptom

Paste uploads started failing with 'Connection reset by peer' after 30 seconds. The container was OOM-killed every 10 minutes.

Assumption

We assumed a memory leak in the Node.js process handling uploads.

Root cause

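The paste service was loading the entire request body into a Buffer before writing to disk. A user uploaded a 2GB log file. Buffering a body that size can need roughly double the payload, because the accumulated chunks and the final concatenated Buffer are in memory at the same time, and that pushed the process past the 4GB container memory limit. The OOM killer terminated the container, and the load balancer retried the request on another container, cascading the failure.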

Fix

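Switch to streaming uploads: pipe the request directly to disk or S3. Set a max request body size (e.g., 10MB) and reject larger payloads with a 413 error. Enforce the limit while streaming by counting bytes as chunks arrive and aborting once the count exceeds the limit, since the Content-Length header can be missing or wrong.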

Key lesson

Never buffer the entire request body in memory.
Stream or die.
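A sketch of the streaming byte counter from the fix, assuming Node's `stream` module. `ByteBudget` and `createSizeLimiter` are hypothetical names, and the 10MB figure in the usage comment is the example limit from the fix, not a recommendation.

```typescript
import { Transform } from "node:stream";

// Tracks how many bytes a single request has sent so far.
class ByteBudget {
  private used = 0;
  constructor(private readonly maxBytes: number) {}

  // Returns false once the running total exceeds the limit.
  consume(byteCount: number): boolean {
    this.used += byteCount;
    return this.used <= this.maxBytes;
  }
}

// Pass-through stream that fails as soon as the body exceeds maxBytes,
// so an oversized upload is never held in memory as a whole.
function createSizeLimiter(maxBytes: number): Transform {
  const budget = new ByteBudget(maxBytes);
  return new Transform({
    transform(
      chunk: Buffer,
      _encoding: BufferEncoding,
      callback: (error?: Error | null, data?: Buffer) => void
    ): void {
      if (budget.consume(chunk.length)) {
        callback(null, chunk);
      } else {
        callback(new Error(`request body exceeds ${maxBytes} bytes`));
      }
    },
  });
}

// Usage inside an upload handler (destination is a file or S3 upload stream):
//   pipeline(req, createSizeLimiter(10 * 1024 * 1024), destination, onDone);
// Map the limiter's error to a 413 response in onDone.
```

Memory use stays at one chunk per request regardless of how large the upload claims to be.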

Production debug guide

Systematic recovery paths for the failure modes engineers actually hit.

Symptom · 01

Paste upload returns 413 Request Entity Too Large

Fix

1. Check the max body size setting in your reverse proxy (nginx: client_max_body_size).
2. Verify the application-level limit (e.g., 10MB).
3. Inform the user of the limit in the error message.

Symptom · 02

Paste read returns 404 but paste exists

Fix

1. Check if the paste has expired (compare expires_at with current time).
2. Check if the metadata exists in the database.
3. Check if the S3 object exists.
4. If using a CDN, check whether it is serving a stale cached response.

Symptom · 03

Rate limiter blocking legitimate users

Fix

1. Check Redis for the user's rate limit key.
2. Verify the limit value and window size.
3. Check if the user is behind a NAT (all users share the same IP).
4. Implement API keys for authenticated users to bypass IP-based limits.

★ Design Pastebin Triage Cheat Sheet

First-response commands for when things go wrong, copy-paste ready.

Upload fails with `Connection reset by peer`

Immediate action

Check if request body is too large and causing OOM.

Commands

`curl -v -X POST --data-binary @largefile.txt https://pastebin.example.com/api/paste`

`kubectl top pod -l app=pastebin`

Fix now

Set client_max_body_size 10m; in nginx and add streaming upload.

Feature / Aspect	SQL (PostgreSQL)	NoSQL (DynamoDB)
Consistency	Strong ACID	Eventual by default; strongly consistent reads with ConsistentRead=true (DAX is a cache and does not provide them)
Scalability	Vertical + read replicas	Horizontal auto-scaling
TTL support	Manual cron job	Built-in TTL attribute
Query flexibility	Rich SQL	Limited to primary key + secondary indexes
Cost at scale	Expensive for high write throughput	Pay-per-request, cheaper for variable load

Key takeaways

Never buffer the entire request body in memory: stream to storage or die.

Use content-addressable storage (hash-based) for deduplication, but normalize content first.

Separate metadata from content: metadata in a fast database, content in blob storage.

For short URLs, use random base62 strings, not auto-increment or hash of content.

INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

Q01 · SENIOR

How does your pastebin handle duplicate content under concurrent uploads...

Q02 · SENIOR

When would you choose DynamoDB over PostgreSQL for paste metadata?

Q03 · SENIOR

What happens when your Redis cache goes down? How do you mitigate?

Q04 · JUNIOR

How would you generate unique short URLs for a pastebin?

Q05 · SENIOR

A user reports that their paste returns 410 Gone even though it hasn't e...

Q06 · SENIOR

How would you design the pastebin to handle 1 billion pastes with 99.99%...

Q01 of 06 · SENIOR

How does your pastebin handle duplicate content under concurrent uploads? What if two users upload the same paste at the exact same time?

ANSWER

Use content-addressable storage: hash the content and check existence before writing. For concurrency, use a distributed lock (Redis Redlock) or optimistic locking (a conditional write with 'If-None-Match: *'). The first writer creates the S3 object; the second gets a 412 Precondition Failed (or a 409 conflict if the two writes overlap) and reuses the existing object.
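The put-if-absent behaviour in this answer can be illustrated with an in-memory stand-in for the blob store. This sketch only models the semantics: `ContentStore`, `putIfAbsent`, and `storePaste` are hypothetical names, and a real deployment would rely on the storage service's conditional write rather than a local map.

```typescript
import { createHash } from "node:crypto";

class ContentStore {
  private objects = new Map<string, string>();

  // Mimics a conditional write: stores the body only if the key is absent
  // and reports whether this call created the object.
  putIfAbsent(key: string, body: string): boolean {
    if (this.objects.has(key)) return false;
    this.objects.set(key, body);
    return true;
  }

  get objectCount(): number {
    return this.objects.size;
  }
}

// Both uploaders run the same code path. Whoever loses the race simply
// reuses the existing object, and each still gets its own short URL.
function storePaste(
  store: ContentStore,
  content: string,
  newShortUrl: () => string
): { shortUrl: string; contentHash: string; created: boolean } {
  const contentHash = createHash("sha256").update(content, "utf8").digest("hex");
  const created = store.putIfAbsent(`pastes/${contentHash}`, content);
  return { shortUrl: newShortUrl(), contentHash, created };
}
```

Because the write itself is conditional, no separate existence check is needed, which removes the window between checking and writing.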
