
QPS Estimation Explained — How to Calculate Queries Per Second Like a Pro

In Plain English 🔥
Imagine a lemonade stand at a summer fair. If 600 kids show up over 10 minutes and each buys one cup, your stand is handling 1 cup per second — that's your QPS. If you only have one pitcher, you're in trouble. QPS is just the speed at which requests hit your system, and estimating it correctly tells you exactly how big your 'pitcher' needs to be before the fair even starts.
⚡ Quick Answer
QPS = (daily active users × actions per user per day) ÷ 86,400 seconds. That gives you the average; multiply it by a peak factor of 2x–5x to get the peak QPS your infrastructure must actually be designed for.

Every system that has ever gone down under traffic had one thing in common: the engineers underestimated how many requests per second were coming in. QPS — Queries Per Second — is the single most important number in a capacity planning conversation. It's the heartbeat of your system, and if you don't know it, you're flying blind. Twitter's 2013 Super Bowl outage, Ticketmaster collapsing during Taylor Swift's Eras Tour presale, and countless startup launch-day crashes all trace back to the same root cause: nobody did the math ahead of time.

QPS estimation solves the problem of 'how much infrastructure do I actually need?' It bridges the gap between product thinking ('we expect 10 million users!') and engineering reality ('so that means we need X database replicas, Y cache nodes, and a load balancer that can sustain Z connections per second'). Without this translation step, you're guessing — and guessing with servers is expensive.

By the end of this article you'll be able to take any back-of-the-napkin user metric — daily active users, monthly signups, event-driven spikes — and convert it into a concrete QPS number with a peak multiplier, read/write split, and storage growth rate. You'll also know the three most dangerous estimation mistakes engineers make in system design interviews, and how to sidestep all of them confidently.

The Core Formula — From DAU to QPS in Three Steps

The foundation of every QPS estimate is the same simple chain: how many users, how many actions each, spread over how many seconds.

Step 1 — Anchor on Daily Active Users (DAU). This is your starting point. Product gives you this number, or you derive it from total registered users multiplied by an engagement rate. A typical consumer app sees 10–20% of registered users active on any given day.

Step 2 — Estimate actions per user per day. Think about what a single user actually does in a session. For a Twitter-like feed app: they open the app (1 read), scroll through 20 posts (20 reads), post once (1 write), and like 5 things (5 writes). That's 27 requests per user per day — 21 reads and 6 writes. Most systems are 80–95% reads.

Step 3 — Divide by seconds in a day. One day has 86,400 seconds. Divide total daily requests by 86,400 to get your average QPS.

Average QPS is never your target. It's your baseline. Real traffic is not flat — it spikes. Always apply a peak multiplier (typically 2x–5x for consumer apps) to get the number your infrastructure must actually survive. That peak QPS is what you design for.

qps_estimator.py · PYTHON
# QPS Estimator — Back-of-Napkin System Design Calculator
# Run this with Python 3. No external libraries needed.

def estimate_qps(
    daily_active_users: int,
    reads_per_user_per_day: int,
    writes_per_user_per_day: int,
    peak_multiplier: float = 3.0
) -> dict:
    """
    Converts user-level product metrics into engineering-level QPS numbers.
    peak_multiplier: how much higher than average your busiest hour gets.
                     Use 2x for stable enterprise apps, 5x for viral consumer apps.
    """
    SECONDS_IN_A_DAY = 86_400  # 60 seconds * 60 minutes * 24 hours

    total_daily_reads  = daily_active_users * reads_per_user_per_day
    total_daily_writes = daily_active_users * writes_per_user_per_day
    total_daily_requests = total_daily_reads + total_daily_writes

    # Average QPS — assumes traffic is perfectly flat across 24 hours (it never is)
    avg_read_qps  = total_daily_reads  / SECONDS_IN_A_DAY
    avg_write_qps = total_daily_writes / SECONDS_IN_A_DAY
    avg_total_qps = total_daily_requests / SECONDS_IN_A_DAY

    # Peak QPS — the number your system MUST handle without degrading
    peak_read_qps  = avg_read_qps  * peak_multiplier
    peak_write_qps = avg_write_qps * peak_multiplier
    peak_total_qps = avg_total_qps * peak_multiplier

    # Read/write ratio — critical for choosing DB architecture (replicas, caching strategy)
    read_write_ratio = total_daily_reads / max(total_daily_writes, 1)  # guard against division by zero

    return {
        "avg_read_qps":    round(avg_read_qps,  1),
        "avg_write_qps":   round(avg_write_qps,  1),
        "avg_total_qps":   round(avg_total_qps,  1),
        "peak_read_qps":   round(peak_read_qps,  1),
        "peak_write_qps":  round(peak_write_qps, 1),
        "peak_total_qps":  round(peak_total_qps, 1),
        "read_write_ratio": round(read_write_ratio, 1),
    }


# --- Example: Twitter-like social feed app ---
# Assumptions:
#   50 million DAU (mid-size social network)
#   Each user reads 30 tweets per session (timeline, explore, notifications)
#   Each user writes 1 tweet + 5 likes = 6 write actions per day
#   Peak multiplier of 3x (busy evenings vs. quiet early mornings)

results = estimate_qps(
    daily_active_users=50_000_000,
    reads_per_user_per_day=30,
    writes_per_user_per_day=6,
    peak_multiplier=3.0
)

print("=== QPS Estimation: Twitter-like App ===")
print(f"  Average Read  QPS : {results['avg_read_qps']:>10,.1f}")
print(f"  Average Write QPS : {results['avg_write_qps']:>10,.1f}")
print(f"  Average Total QPS : {results['avg_total_qps']:>10,.1f}")
print()
print(f"  Peak Read  QPS    : {results['peak_read_qps']:>10,.1f}  <-- design your read path for this")
print(f"  Peak Write QPS    : {results['peak_write_qps']:>10,.1f}  <-- design your write path for this")
print(f"  Peak Total QPS    : {results['peak_total_qps']:>10,.1f}")
print()
print(f"  Read/Write Ratio  : {results['read_write_ratio']:>10.1f}x  <-- heavy read bias → caching is critical")
▶ Output
=== QPS Estimation: Twitter-like App ===
  Average Read  QPS :   17,361.1
  Average Write QPS :    3,472.2
  Average Total QPS :   20,833.3

  Peak Read  QPS    :   52,083.3  <-- design your read path for this
  Peak Write QPS    :   10,416.7  <-- design your write path for this
  Peak Total QPS    :   62,500.0

  Read/Write Ratio  :        5.0x  <-- heavy read bias → caching is critical
⚠️
The 86,400 Anchor — memorise this: one day = 86,400 seconds. In interviews, round it up to 100,000 for faster mental math — that's only a 16% overestimate of the seconds (so your QPS comes out about 14% low), and it keeps your arithmetic clean. Interviewers care about your reasoning process, not your arithmetic precision.
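The shortcut is easy to sanity-check. A minimal sketch using the article's 50-million-DAU example (36 requests per user per day, i.e. 30 reads + 6 writes):

```python
# Mental-math shortcut: divide by 100,000 instead of 86,400.
daily_requests = 50_000_000 * 36          # 30 reads + 6 writes per user per day
exact_qps = daily_requests / 86_400       # precise average QPS
quick_qps = daily_requests / 100_000      # interview-speed approximation

print(f"exact: {exact_qps:,.0f} QPS")     # ~20,833
print(f"quick: {quick_qps:,.0f} QPS")     # 18,000 — about 14% low
print(f"error: {1 - quick_qps / exact_qps:.0%}")
```

The error is well inside the uncertainty of your other assumptions, and it's an underestimate that a 2x–5x peak multiplier absorbs completely.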

Peak QPS vs. Average QPS — Why Average Will Get You Fired

If you provision infrastructure for average QPS, your system will collapse during every spike — which is exactly when your users need you most. The Super Bowl, Black Friday, a viral tweet, a product launch: all of these are predictable spike patterns, and none of them look like an average day.

Traffic follows a diurnal pattern (fancy word for 'it changes with the time of day'). For a US-based consumer app, traffic is lowest at 3–5am Eastern and peaks between 7–9pm Eastern. The ratio between peak hour and the overnight trough can easily be 10:1 or higher.

There are two types of peak you must plan for separately. The first is the predictable daily peak — use a 2x–3x multiplier above your daily average. The second is the burst peak — think of a celebrity tweeting your app link or a DDoS. This can be 10x–50x and you handle it with rate limiting and autoscaling, not by provisioning for it statically.
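Sketched in code, the two-peak split might look like this — the multipliers are illustrative assumptions in the 2x–3x and 10x–50x ranges named above, not measured data:

```python
# Sketch: splitting capacity into a statically provisioned sustained peak
# and an autoscaling / rate-limiting burst ceiling.
avg_qps = 20_833                 # average from the earlier Twitter-like estimate

sustained_peak = avg_qps * 3     # predictable daily peak: provision for this statically
burst_ceiling = avg_qps * 10     # celebrity-tweet scenario: autoscale + rate-limit to this

print(f"Provision statically for : {sustained_peak:>8,.0f} QPS")
print(f"Autoscale / rate-limit to: {burst_ceiling:>8,.0f} QPS")
```

The sustained number sizes your always-on fleet; the burst number sizes your autoscaling ceiling and your rate limiter's cutoff.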

The multiplier you choose also informs your architectural decisions. At 2x peak you might be fine with a single primary database with replicas. At 10x peak you're looking at tiered caching, read replicas, and queue-based write buffering. The number drives the architecture — not the other way around.

traffic_pattern_simulator.py · PYTHON
# Traffic Pattern Simulator — visualises how QPS changes over 24 hours
# Shows WHY designing for average QPS is dangerous

import math

def hourly_traffic_multiplier(hour_of_day: int) -> float:
    """
    Returns a multiplier representing relative traffic at a given hour.
    Models a typical US consumer app's diurnal (time-of-day) traffic pattern.
    Hour 0 = midnight, Hour 12 = noon, Hour 20 = 8pm (typical peak).
    """
    # A sine wave centred at 8pm (hour 20), scaled so peak = ~3x, trough = ~0.2x
    # This is a simplified but realistic approximation of real traffic curves
    phase_shift = (hour_of_day - 20) * (math.pi / 12)  # shift peak to 8pm
    raw_wave = math.cos(phase_shift)                    # -1 to +1
    multiplier = 1.6 + (1.4 * raw_wave)                # scale to 0.2x – 3.0x range
    return max(0.1, multiplier)                         # floor at 0.1x (never zero traffic)


AVERAGE_QPS = 20_833  # from our previous estimation for the Twitter-like app

print("Hour | Multiplier | Actual QPS  | Safe to provision at average? ")
print("-" * 65)

for hour in range(24):
    multiplier = hourly_traffic_multiplier(hour)
    actual_qps = AVERAGE_QPS * multiplier
    at_capacity = actual_qps > AVERAGE_QPS  # True means average provisioning is insufficient
    danger_flag = "⚠ OVERLOADED" if at_capacity else "  OK"

    print(f"  {hour:02d}:00 |  {multiplier:5.2f}x     | {actual_qps:>10,.0f}  | {danger_flag}")

print()
print(f"Average QPS (baseline):  {AVERAGE_QPS:>10,}")
peak_hour_qps = AVERAGE_QPS * hourly_traffic_multiplier(20)
print(f"Peak hour QPS (8pm):     {peak_hour_qps:>10,.0f}")
print(f"Peak-to-average ratio:   {peak_hour_qps / AVERAGE_QPS:>10.1f}x")
print()
print("Conclusion: provisioning for average QPS means your system is")
print("overloaded for 15 hours of every single day in this model.")
▶ Output
Hour | Multiplier | Actual QPS  | Safe to provision at average?
-----------------------------------------------------------------
  00:00 |   2.30x     |     47,916  | ⚠ OVERLOADED
  01:00 |   1.96x     |     40,882  | ⚠ OVERLOADED
  02:00 |   1.60x     |     33,333  | ⚠ OVERLOADED
  03:00 |   1.24x     |     25,784  | ⚠ OVERLOADED
  04:00 |   0.90x     |     18,750  |   OK
  05:00 |   0.61x     |     12,709  |   OK
  06:00 |   0.39x     |      8,074  |   OK
  07:00 |   0.25x     |      5,160  |   OK
  08:00 |   0.20x     |      4,167  |   OK
  09:00 |   0.25x     |      5,160  |   OK
  10:00 |   0.39x     |      8,074  |   OK
  11:00 |   0.61x     |     12,709  |   OK
  12:00 |   0.90x     |     18,750  |   OK
  13:00 |   1.24x     |     25,784  | ⚠ OVERLOADED
  14:00 |   1.60x     |     33,333  | ⚠ OVERLOADED
  15:00 |   1.96x     |     40,882  | ⚠ OVERLOADED
  16:00 |   2.30x     |     47,916  | ⚠ OVERLOADED
  17:00 |   2.59x     |     53,956  | ⚠ OVERLOADED
  18:00 |   2.81x     |     58,591  | ⚠ OVERLOADED
  19:00 |   2.95x     |     61,505  | ⚠ OVERLOADED
  20:00 |   3.00x     |     62,499  | ⚠ OVERLOADED
  21:00 |   2.95x     |     61,505  | ⚠ OVERLOADED
  22:00 |   2.81x     |     58,591  | ⚠ OVERLOADED
  23:00 |   2.59x     |     53,956  | ⚠ OVERLOADED

Average QPS (baseline):      20,833
Peak hour QPS (8pm):         62,499
Peak-to-average ratio:          3.0x

Conclusion: provisioning for average QPS means your system is
overloaded for 15 hours of every single day in this model.
⚠️
Watch Out: The 'Average' Trap — presenting average QPS as your design target in a system design interview is a red flag for experienced interviewers. Always state your peak multiplier explicitly and justify it — 'I'll use 3x because this is a consumer app with strong evening traffic patterns.' That one sentence shows you understand real production systems.

Storage Growth Rate — QPS Has a Write Side Effect

QPS estimation doesn't stop at request throughput. Every write request creates data, and that data accumulates. If you only estimate QPS for request handling but ignore storage growth, you'll build a system that handles traffic fine on day one but runs out of disk space on day 90.

The storage calculation is a direct extension of write QPS. Take your peak write QPS, multiply by the average payload size per write, and you get bytes per second written to disk. Multiply that by seconds per day, then by 365, and you know your one-year raw storage requirement. Always add 20–30% overhead for indexes, replication, and metadata.

This calculation also drives replication decisions. If your write QPS is 10,000 and every write is replicated 3 ways, each replica must apply all 10,000 writes per second — read replicas don't just serve reads, they also have to keep up with the full write stream. That write load sets a floor on the machine spec for every node in the cluster, not just the primary.

In interviews, connecting QPS → storage growth → replication factor is the sign of someone who's actually run production systems. It shows you're thinking about the full lifecycle of data, not just the happy path.

storage_growth_estimator.py · PYTHON
# Storage Growth Estimator — connects write QPS to long-term storage needs
# This is the calculation interviewers want to see AFTER you state your write QPS

def estimate_storage_growth(
    peak_write_qps: float,
    avg_record_size_bytes: int,
    replication_factor: int = 3,
    index_overhead_pct: float = 0.30,
    projection_years: int = 3
) -> None:
    """
    Given a write QPS, calculates raw and replicated storage growth over time.
    replication_factor: how many copies of each write are stored (e.g., 3 for most distributed DBs)
    index_overhead_pct: extra space consumed by DB indexes (30% is a safe default for B-tree indexes)
    """
    SECONDS_PER_DAY  = 86_400
    BYTES_PER_GB     = 1_073_741_824  # 1024^3 — using binary GiB for accuracy
    BYTES_PER_TB     = BYTES_PER_GB * 1_024

    # Raw bytes written per second (before replication)
    raw_bytes_per_second = peak_write_qps * avg_record_size_bytes

    # Total bytes on disk per second — accounts for all replicas
    replicated_bytes_per_second = raw_bytes_per_second * replication_factor

    # Add index overhead on top of replicated data
    total_bytes_per_second = replicated_bytes_per_second * (1 + index_overhead_pct)

    print(f"=== Storage Growth Estimation ===")
    print(f"Peak Write QPS       : {peak_write_qps:>12,.0f} writes/sec")
    print(f"Avg Record Size      : {avg_record_size_bytes:>12,} bytes")
    print(f"Replication Factor   : {replication_factor:>12}x")
    print(f"Index Overhead       : {index_overhead_pct*100:>11.0f}%")
    print()
    print(f"Raw write throughput : {raw_bytes_per_second/1_000_000:>11.1f} MB/sec")
    print(f"Total disk write rate: {total_bytes_per_second/1_000_000:>11.1f} MB/sec  (with replication + indexes)")
    print()

    print(f"{'Period':<12} | {'Raw Storage':>14} | {'With Replication + Index':>24}")
    print("-" * 58)

    for period_label, seconds in [
        ("1 Day",   SECONDS_PER_DAY),
        ("1 Month", SECONDS_PER_DAY * 30),
        ("1 Year",  SECONDS_PER_DAY * 365),
        (f"{projection_years} Years", SECONDS_PER_DAY * 365 * projection_years),
    ]:
        raw_total    = raw_bytes_per_second       * seconds
        on_disk_total = total_bytes_per_second    * seconds

        # Format in human-readable units
        raw_str     = f"{raw_total    / BYTES_PER_TB:.2f} TB" if raw_total     > BYTES_PER_TB else f"{raw_total    / BYTES_PER_GB:.1f} GB"
        on_disk_str = f"{on_disk_total / BYTES_PER_TB:.2f} TB" if on_disk_total > BYTES_PER_TB else f"{on_disk_total / BYTES_PER_GB:.1f} GB"

        print(f"{period_label:<12} | {raw_str:>14} | {on_disk_str:>24}")

    print()
    print("Architecture implications:")
    yearly_tb = (total_bytes_per_second * SECONDS_PER_DAY * 365) / BYTES_PER_TB
    if yearly_tb < 10:
        print("  → Single-region storage likely sufficient for year 1")
    elif yearly_tb < 100:
        print("  → Plan for sharding or partitioning by end of year 1")
    else:
        print("  → Distributed storage (e.g. Cassandra, S3 + data lake) required from day 1")


# Using the write QPS from our Twitter-like example: 10,417 peak writes/sec
# Tweet record: ~300 bytes (tweet text + user_id + timestamp + metadata)
estimate_storage_growth(
    peak_write_qps=10_417,
    avg_record_size_bytes=300,
    replication_factor=3,
    index_overhead_pct=0.30,
    projection_years=3
)
▶ Output
=== Storage Growth Estimation ===
Peak Write QPS       :       10,417 writes/sec
Avg Record Size      :          300 bytes
Replication Factor   :            3x
Index Overhead       :          30%

Raw write throughput :         3.1 MB/sec
Total disk write rate:        12.2 MB/sec  (with replication + indexes)

Period       |    Raw Storage | With Replication + Index
----------------------------------------------------------
1 Day        |       251.5 GB |                 980.7 GB
1 Month      |        7.37 TB |                 28.73 TB
1 Year       |       89.63 TB |                349.57 TB
3 Years      |      268.90 TB |               1048.71 TB

Architecture implications:
  → Distributed storage (e.g. Cassandra, S3 + data lake) required from day 1
🔥
Interview Gold: The Storage Chain — interviewers love when you say: 'My write QPS is 10,000/sec at peak. At 300 bytes per record with 3x replication and 30% index overhead, I'm writing about 12 MB/sec to disk — over 300 TB per year. That means I need a distributed storage solution from day one, which rules out a single Postgres instance.' That chain of reasoning — from QPS to architecture — is what separates senior candidates from junior ones.
Aspect | Average QPS | Peak QPS
Definition | Total daily requests ÷ 86,400 seconds | Average QPS × peak multiplier (2x–5x typical)
When to use it | Capacity cost estimation, SLA reporting | Infrastructure provisioning, autoscaling config
Danger of over-relying on it | System is overloaded during every traffic spike | Over-provisioning costs money if spike never comes
Typical multiplier from average | 1x (it IS the baseline) | 2x–3x consumer apps, 5x+ viral/event-driven apps
Drives which architecture decision | Database tier sizing, data retention costs | Load balancer limits, cache hit targets, replica count
How to mitigate its downside | Always state peak alongside average | Use autoscaling so you only pay for peak when needed

🎯 Key Takeaways

  • The core QPS formula is: (DAU × actions per user) ÷ 86,400 — memorise 86,400 seconds per day, or use 100,000 for faster interview math
  • Always separate read QPS from write QPS immediately — a 5:1 read/write ratio means you cache reads aggressively; a 1:1 ratio means you focus on write durability and replication lag
  • Design for peak QPS (average × 2x–3x for consumer apps), not average — average QPS guarantees outages during the exact moments users need you most
  • Write QPS × record size × replication factor × (1 + index overhead) = your storage growth rate — always complete this chain to connect throughput to long-term infrastructure cost

⚠ Common Mistakes to Avoid

  • Mistake 1: Using total registered users instead of DAU — Symptom: your QPS estimate is 10x too high, leading to wildly over-engineered (and over-budget) architecture. Fix: always ask 'what percentage of registered users are active daily?' A healthy consumer app is 10–20% DAU/MAU. A dormant B2B tool might be 5%. Use that ratio to get a realistic DAU before you start.
  • Mistake 2: Forgetting the read/write split — Symptom: you design a single-tier database that gets saturated on reads, even though writes are perfectly fine. Fix: always separate your QPS into read QPS and write QPS early. A 5:1 read/write ratio means your read path needs horizontal scaling (replicas, caching) but your write path may be fine on a single primary for now. The ratio drives completely different architectural decisions.
  • Mistake 3: Treating peak QPS as a static hard limit instead of a trigger for autoscaling — Symptom: you either provision exactly at peak (expensive 24/7) or under-provision and rely on magic. Fix: define a sustained peak (what you provision for statically) and a burst ceiling (what autoscaling must reach within 60–90 seconds). For example: 'I'll statically provision for 50,000 QPS and configure autoscaling to burst to 150,000 QPS within 90 seconds.' This is how real production teams think.
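The fix for Mistake 1 is easy to make concrete. A quick sketch using assumed numbers — 100 million registered users, 15% daily engagement (the middle of the 10–20% range above), 36 requests per user per day:

```python
# Sketch of Mistake 1's fix: anchor on DAU, never on registered users.
registered_users = 100_000_000
dau_ratio = 0.15                  # healthy consumer app: 10-20% DAU (assumed here)
requests_per_user = 36

naive_qps = registered_users * requests_per_user / 86_400                # wrong anchor
realistic_qps = registered_users * dau_ratio * requests_per_user / 86_400

print(f"Naive (all registered users): {naive_qps:>9,.0f} QPS")
print(f"Realistic (15% DAU)         : {realistic_qps:>9,.0f} QPS")
```

Skipping the engagement ratio inflates the estimate by nearly 7x here — exactly the over-engineering symptom described above.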

Interview Questions on This Topic

  • Q: Walk me through how you'd estimate the QPS for a URL shortener like bit.ly. Assume 100 million DAU. What assumptions would you make, and how does your QPS estimate affect your choice of database?
  • Q: Your system handles 50,000 QPS on average but spikes to 200,000 QPS during live events. How would you design the system differently to handle the spike without paying for 200,000 QPS capacity 24/7?
  • Q: A junior engineer on your team says 'our average QPS is only 5,000 so a single database instance is fine.' What's wrong with that reasoning, and what questions would you ask before agreeing or disagreeing?
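For the first question, one reasonable way to open is to state assumptions out loud and run the same three-step chain. Every per-user number below is an assumption for the interview, not known bit.ly traffic data:

```python
# Hedged opening sketch for the URL-shortener question.
dau = 100_000_000
redirects_per_user = 5      # assumption: reads dominate — people click links
shortens_per_user = 0.1     # assumption: roughly 1 in 10 users creates a link daily
peak_multiplier = 3         # assumption: consumer-style evening peak

read_qps = dau * redirects_per_user / 86_400 * peak_multiplier
write_qps = dau * shortens_per_user / 86_400 * peak_multiplier

print(f"Peak redirect (read) QPS: {read_qps:>9,.0f}")
print(f"Peak shorten (write) QPS: {write_qps:>9,.0f}")
print(f"Read/write ratio        : {read_qps / write_qps:.0f}:1")
```

A ratio this lopsided is the database answer: redirects belong in an aggressive cache layer keyed by short code, while the comparatively tiny write path can live in almost any durable store.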

Frequently Asked Questions

What is a good QPS for a web application?

There's no universally 'good' QPS — it depends entirely on your infrastructure. A single well-tuned Nginx server can handle 10,000–50,000 simple HTTP requests per second. A single PostgreSQL instance on modern hardware typically maxes out around 5,000–10,000 complex queries per second. The real question isn't whether your QPS is 'good,' but whether your infrastructure can sustain your peak QPS with less than 100ms added latency.
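To turn those ballpark per-server figures into a machine count, divide peak QPS by per-server capacity — after the cache absorbs its share. The cache hit rate and per-replica figure below are illustrative assumptions, with the replica number taken as the midpoint of the 5,000–10,000 range quoted above:

```python
# Rough capacity division: peak read QPS → number of database read replicas.
import math

peak_read_qps = 52_083       # from the Twitter-like estimate earlier
qps_per_db_replica = 7_500   # midpoint of the quoted 5,000-10,000 Postgres range
cache_hit_rate = 0.90        # assumption: 90% of reads served from cache

reads_hitting_db = peak_read_qps * (1 - cache_hit_rate)
replicas_needed = math.ceil(reads_hitting_db / qps_per_db_replica)

print(f"Reads reaching the database: {reads_hitting_db:,.0f} QPS")
print(f"Read replicas needed       : {replicas_needed}")
```

Note how much work the cache does here: without it, the same peak would need seven replicas instead of one, which is why the read/write ratio drives the caching strategy first and the replica count second.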

What's the difference between QPS, RPS, and TPS?

QPS (Queries Per Second) typically refers to database-level requests. RPS (Requests Per Second) refers to HTTP-level requests hitting your API layer. TPS (Transactions Per Second) refers to complete business operations, which may involve multiple queries. In a system design context, these terms are often used interchangeably — what matters is that you're consistent and explicit about what you're measuring when you use the term.
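A toy example of the fan-out between the three levels — the per-operation counts are invented purely for illustration:

```python
# One business transaction fans out into several HTTP requests,
# each of which fans out into several database queries.
checkouts_per_second = 100        # TPS: complete business operations
http_calls_per_checkout = 3       # assumption: cart, payment, confirmation
db_queries_per_http_call = 4      # assumption: auth check, read, write, audit log

tps = checkouts_per_second
rps = tps * http_calls_per_checkout       # API-layer load
qps = rps * db_queries_per_http_call      # database-layer load

print(f"TPS: {tps}  |  RPS: {rps}  |  QPS: {qps}")
```

The same workload can legitimately be described as 100, 300, or 1,200 'requests per second' — which is exactly why you must say which layer you're measuring.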

How do I estimate QPS if I don't know the number of users yet?

Work backwards from a comparable product. If you're designing a new photo-sharing app, find the public DAU and post-frequency data for Instagram or Pinterest and use those as benchmarks, then adjust for your expected market size. Stating 'I'm modelling this on a product with similar behaviour' is completely acceptable in system design interviews — interviewers are testing your reasoning process, not your ability to recall proprietary traffic data.

TheCodeForge Editorial Team — Verified Author

Written and reviewed by senior developers with real-world experience across enterprise, startup and open-source projects. Every article on TheCodeForge is written to be clear, accurate and genuinely useful — not just SEO filler.
