Senior 3 min · March 06, 2026

Database Partitioning — Missing DEFAULT Partition Breaks

Batch inserts fail with ORA-14400 when LIST partition lacks DEFAULT for unexpected values.

N
Naren · Founder
Plain-English first. Then code. Then the interview question.
About
 ● Production Incident 🔎 Debug Guide
Quick Answer
  • Partitioning splits a large table into smaller, manageable pieces (partitions) managed transparently by the database.
  • Range partitioning: splits by value ranges (e.g., by year) — best for time-series data lifecycle.
  • Hash partitioning: distributes rows evenly via a hash function — solves write hotspot issues.
  • List partitioning: assigns rows to partitions based on discrete values (e.g., country code).
  • Partition pruning: queries with WHERE on the partition key skip irrelevant partitions entirely, cutting I/O dramatically.
  • Production truth: dropping a partition is instant; deleting rows is slow and locks the table.

Range Partitioning — The Production Gold Standard

Range partitioning is the go-to strategy for time-series data. It maps rows to partitions based on a continuous range of values. This is incredibly powerful for data lifecycle management; instead of running expensive DELETE queries that bloat the transaction log and fragment indexes, you simply drop the oldest partition.

By using the PARTITION BY RANGE clause, the database engine gains the intelligence to perform 'Partition Pruning'—ignoring every file on disk that doesn't contain the requested date range.

ExampleSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
-- io.thecodeforge.db.schema
-- PostgreSQL range partitioning by year
CREATE TABLE orders (
    id          BIGINT,
    customer_id INT NOT NULL,
    order_date  DATE NOT NULL,
    total       DECIMAL(12,2)
) PARTITION BY RANGE (order_date);

-- Creating the child tables (partitions)
CREATE TABLE orders_2023 PARTITION OF orders
    FOR VALUES FROM ('2023-01-01') TO ('2024-01-01');

CREATE TABLE orders_2024 PARTITION OF orders
    FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');

CREATE TABLE orders_2025 PARTITION OF orders
    FOR VALUES FROM ('2025-01-01') TO ('2026-01-01');

-- Add indexes on children automatically via the parent
CREATE INDEX idx_orders_customer ON orders (customer_id);

-- Partition pruning in action
EXPLAIN ANALYZE 
SELECT * FROM orders 
WHERE order_date >= '2025-06-01' AND order_date <= '2025-06-30';
Output
Seq Scan on orders_2025 (cost=0.00..15.50 rows=5 width=48)
Filter: ((order_date >= '2025-06-01'::date) AND (order_date <= '2025-06-30'::date))
Production Insight
Time-series data often sees a 'hot partition' — the most recent range receives all writes.
Range partitioning makes this visible: the latest partition can become a bottleneck under high write load.
Rule: for high-volume ingest, consider sub-partitioning the hot range by hash to spread writes.
Always schedule partition maintenance (detach/drop) during low traffic to avoid lock contention.
Key Takeaway
Range partitioning is ideal for time-series.
Dropping a partition is instant; deleting rows is not.
Plan for hot partitions: sub-partition or use smaller range intervals.

Hash Partitioning — Solving Hotspot Issues

When your data doesn't have a natural 'range' (like a timestamp) or when all your writes hit the 'current' range creating a bottleneck, Hash Partitioning is the solution. It uses a hash function on the partition key to distribute rows evenly across a fixed number of partitions.

This ensures that even if 10,000 users are signing up at the same second, their data is spread across multiple physical files, reducing I/O contention.

ExampleSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
-- io.thecodeforge.db.schema
-- Hash partitioning for even load distribution
CREATE TABLE user_sessions (
    session_id UUID PRIMARY KEY,
    user_id    BIGINT NOT NULL,
    data       JSONB,
    last_seen  TIMESTAMP
) PARTITION BY HASH (user_id);

-- Create 4 partitions using the modulus/remainder logic
CREATE TABLE user_sessions_p0 PARTITION OF user_sessions
    FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE user_sessions_p1 PARTITION OF user_sessions
    FOR VALUES WITH (MODULUS 4, REMAINDER 1);
CREATE TABLE user_sessions_p2 PARTITION OF user_sessions
    FOR VALUES WITH (MODULUS 4, REMAINDER 2);
CREATE TABLE user_sessions_p3 PARTITION OF user_sessions
    FOR VALUES WITH (MODULUS 4, REMAINDER 3);
Output
-- Rows are distributed: hash(user_id) % 4
Production Insight
Hash partitioning guarantees even distribution only if the hash function provides uniform output.
Changing the modulus later (e.g., from 4 to 8) requires dropping and recreating all partitions.
Queries with range conditions on the partition key (e.g., user_id between 1000 and 2000) scan ALL partitions.
Rule: use hash only when the primary access pattern is point lookups by the partition key.
Key Takeaway
Hash partitioning eliminates write hotspots.
Trade-off: range queries can't use pruning — full scan of all partitions.
Choose modulus based on expected data growth; changing it is an expensive migration.

List Partitioning — Categorical Data Isolation

List partitioning groups rows by discrete values such as country, status, or category. Each partition holds rows where the partition key matches a predefined list of values. It's perfect for multi-tenant systems where each tenant has a separate partition, or for data that naturally splits by region.

A critical consideration is the 'default' partition — a catch-all for values that don't match any defined list. Without it, inserts with unexpected values fail.

ExampleSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
-- io.thecodeforge.db.schema
-- List partitioning by region
CREATE TABLE customer_data (
    customer_id   BIGINT,
    region        TEXT NOT NULL,
    signup_date   DATE,
    lifetime_value DECIMAL(12,2)
) PARTITION BY LIST (region);

-- Partitions for major regions
CREATE TABLE customer_na PARTITION OF customer_data
    FOR VALUES IN ('US', 'CA', 'MX');
CREATE TABLE customer_eu PARTITION OF customer_data
    FOR VALUES IN ('UK', 'DE', 'FR', 'IT', 'ES');
CREATE TABLE customer_apac PARTITION OF customer_data
    FOR VALUES IN ('JP', 'CN', 'IN', 'AU');

-- Required: default partition for safety
CREATE TABLE customer_default PARTITION OF customer_data DEFAULT;
Output
Rows with region 'US' go to customer_na; region 'BR' goes to customer_default
Production Insight
Without a DEFAULT partition, adding a new region requires an ALTER TABLE to add a new partition — which can lock the table.
The DEFAULT partition can grow large if many unmapped values arrive; monitor its size.
For global apps with many regions, consider using hash or composite partitioning instead.
Rule: always include a DEFAULT partition and set up an alert when it receives rows.
Key Takeaway
List partitioning enforces data categorization.
Always add a DEFAULT partition to avoid insertion failures.
Monitor the default partition: its size indicates missing partition definitions.

Composite Partitioning — Combining Strategies

Composite (or sub) partitioning combines two partitioning methods, typically range + hash or range + list. The table is first partitioned by a range, and then each range partition is further divided into sub-partitions using hash or list. This is useful for massive tables where you need both pruning on a time dimension and distribution across storage for parallelism.

Example: partition by month, then sub-partition by hash on customer_id. Queries on a single month only scan one range partition, and writes are spread across sub-partitions within that month.

ExampleSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
-- io.thecodeforge.db.schema
-- Composite range-hash partitioning
CREATE TABLE audit_logs (
    log_id      BIGSERIAL,
    event_time  TIMESTAMPTZ NOT NULL,
    user_id     BIGINT NOT NULL,
    action      TEXT
) PARTITION BY RANGE (event_time);

-- Monthly partitions, each sub-partitioned by hash on user_id
CREATE TABLE audit_logs_2026_01 PARTITION OF audit_logs
    FOR VALUES FROM ('2026-01-01') TO ('2026-02-01')
    PARTITION BY HASH (user_id);

-- Sub-partitions within the month
CREATE TABLE audit_logs_2026_01_p0 PARTITION OF audit_logs_2026_01
    FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE audit_logs_2026_01_p1 PARTITION OF audit_logs_2026_01
    FOR VALUES WITH (MODULUS 4, REMAINDER 1);
-- ... p2, p3
Output
Query: WHERE event_time BETWEEN '2026-01-10' AND '2026-01-15' AND user_id = 12345
Prunes to audit_logs_2026_01_p1 (month + hash)
Production Insight
Composite partitioning increases query optimizer complexity — execution plans become harder to read.
Planned maintenance (e.g., dropping old months) must consider sub-partitions; dropping the top-level partition drops all sub-partitions.
Be careful with partition key order: place the most selective key first for maximum pruning.
Rule: only use composite partitioning when a single strategy cannot handle your data volume and access patterns.
Key Takeaway
Composite partitioning layers two strategies for flexibility.
Query optimizer overhead increases — test execution plans thoroughly.
Drop top-level partition to remove all sub-partition data instantly.

Partition Pruning — The Engine That Makes Partitioning Fast

Partition pruning is the query optimizer's ability to skip irrelevant partitions based on the WHERE clause. Without pruning, partitioning can actually degrade performance because the database must check metadata for every partition. Pruning occurs only when the partition key is used in a sargable predicate (e.g., equality, range, IN list).

Common pitfalls: wrapping the partition key in a function (e.g., DATE(order_date) = '2025-01-01') prevents pruning. Ensure the column is used directly.

ExampleSQL
1
2
3
4
5
6
7
8
9
10
-- io.thecodeforge.db.schema
-- Example where pruning works (direct column comparison)
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE order_date >= '2025-06-01' AND order_date < '2025-07-01';

-- Example where pruning fails (function on column)
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE extract(year from order_date) = 2025;

-- The second query will scan all partitions because the expression is not matched to partition bounds.
Output
-- First query: 'Subplans Removed: n' shows pruning occurred.
-- Second query: 'Subplans Removed: 0' — full scan.
Production Insight
I've seen production queries run 100x slower because a migration script used CAST(order_date AS DATE) instead of order_date directly.
Function-based indexes don't help; the partition key must be the raw column.
Rule: always test with EXPLAIN after creating or modifying queries on partitioned tables.
If pruning isn't happening, check for function wrappers, implicit type casting, or missing partition key in WHERE.
Key Takeaway
Partition pruning is the performance magic behind partitioning.
Never wrap the partition key in a function in WHERE clauses.
Always verify with EXPLAIN — you can't assume pruning works.

Containerized Database Management

To test partitioning strategies locally without polluting your system, use Docker to spin up a pre-configured instance. This ensures your staging and production environments use the exact same partitioning logic.

ExampleDOCKER
1
2
3
4
5
6
7
8
9
10
11
12
# io.thecodeforge.infrastructure
FROM postgres:16-alpine

# Optimized for large-scale partitioning tests
ENV POSTGRES_DB=forge_warehouse
ENV POSTGRES_USER=admin
ENV POSTGRES_PASSWORD=secure_password

# Copy initialization scripts that create partitions
COPY init-partitioning.sql /docker-entrypoint-initdb.d/

EXPOSE 5432
Output
Step 4/5 : COPY init-partitioning.sql /docker-entrypoint-initdb.d/
Production Insight
Docker is ideal for testing partitioning strategies, but remember: production hardware differences affect partition scan performance.
Use the same PostgreSQL major version locally as in production to avoid optimizer differences.
Volume mount a data directory to persist test partitions across container restarts.
Rule: automate partition creation in init scripts — never manually create partitions in production.
Key Takeaway
Docker containers provide reproducible partitioning test environments.
Match PostgreSQL version to production for accurate pruning behavior.
Automate partition creation as part of deployment scripts.
● Production incidentPOST-MORTEMseverity: high

The Missing Default Partition That Broke Batch Insert

Symptom
A batch insert for customer data from a newly acquired company in a country not previously in the list partition started failing with ORA-14400 (inserted partition key does not map to any partition). The pipeline retried indefinitely, causing data lag.
Assumption
The team assumed list partition would reject unknown values with a clear error. They thought the ETL would be stopped and fixed before data loss.
Root cause
The list partition was defined for specific country codes (US, UK, DE, FR). The new data came from 'JP' (Japan). No DEFAULT partition existed. The database rejected the rows, and the ETL's error handling wasn't configured to catch this specific error and route to a fallback.
Fix
Added a DEFAULT partition to catch any unmapped values. Then set up an alert on rows entering the DEFAULT partition so the team could add proper partitions proactively.
Key lesson
  • Always create a DEFAULT partition for LIST and RANGE partitioning to catch unexpected values.
  • Do not rely on application validation alone — the database should have a safety net.
  • Monitor DEFAULT partition size as a signal for missing partition definitions.
Production debug guideSymptom → Action for common partitioning failures4 entries
Symptom · 01
Query on partitioned table takes as long as full table scan
Fix
Check if WHERE clause includes the partition key. Run EXPLAIN PLAN to see if partition pruning occurred (e.g., 'PARTITION RANGE SINGLE' vs 'PARTITION RANGE ALL'). Ensure index on partition key.
Symptom · 02
Insert fails with partition key mapping error
Fix
Verify the inserted value matches defined partition boundaries. Check for missing DEFAULT partition. Use SELECT * FROM ALL_TAB_PARTITIONS to see existing partitions.
Symptom · 03
Query performance degrades over time despite partitioning
Fix
Check for partition skew — if one partition grows much larger than others (e.g., 'hot' partition in range partitioning). Use SELECT partition_name, num_rows FROM ALL_TAB_PARTITIONS to compare sizes. Consider splitting the hot partition or rebalancing hash partitions.
Symptom · 04
Maintenance (VACUUM, ANALYZE) takes too long on partitioned table
Fix
Perform maintenance on each partition individually instead of the parent table. Use concurrent VACUUM on child tables. Set autovacuum tuning per partition size.
★ Partitioning Quick Debug Cheat SheetImmediate commands to diagnose partitioning issues in PostgreSQL
Query not using partition pruning
Immediate action
Run EXPLAIN (ANALYZE, BUFFERS) on the query.
Commands
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM orders WHERE order_date >= '2025-01-01';
Check if 'Subplans Removed' shows 0 — indicates no pruning.
Fix now
Add a WHERE clause on the partition key (order_date) with a sargable condition. Ensure index exists on partition key.
Insert into partitioned table fails+
Immediate action
Check the insertion value against existing partitions.
Commands
SELECT tableoid::regclass, count(*) FROM parent_table GROUP BY tableoid;
SELECT * FROM information_schema.table_partitions WHERE table_catalog = 'mydb' AND table_name = 'parent_table';
Fix now
Add DEFAULT partition: CREATE TABLE default_part PARTITION OF parent_table DEFAULT;
Partition size imbalance+
Immediate action
Identify size distribution across partitions.
Commands
SELECT relname, pg_total_relation_size(oid) FROM pg_class WHERE relkind = 'r' AND relname LIKE 'orders_%' ORDER BY pg_total_relation_size DESC;
For hash partitions, check hash distribution: SELECT (hashvalue % modulus) FROM ...
Fix now
Rebalance hash partitions by rebuilding with different modulus, or split an overgrown range partition.
TypeSplit ByBest ForWeakness
RangeValue ranges (date, ID)Time-series, historical archivalHot partition if writes cluster at range boundary
HashHash of column valueEven distribution, high-concurrency writesRange-based queries force a full scan of all partitions
ListDiscrete values (country, status)Regional data, status-based isolationUnknown/New values fail unless a 'DEFAULT' partition exists
CompositeCombination of above (e.g. Range then Hash)Massive multi-tenant SaaS platformsSignificant administrative and query planning overhead

Key takeaways

1
Partitioning is transparent
applications query the parent table and the database routes to partitions.
2
Partition pruning
queries with WHERE on the partition key skip irrelevant partitions entirely.
3
Dropping a partition is instantaneous
the best way to archive old data at scale.
4
Partitioning differs from sharding
partitions live on one database instance; sharding splits across multiple.
5
Always create an index on the partition key
otherwise partition pruning cannot work.
6
Always add a DEFAULT partition for list and range partitioning to catch unexpected values.
7
Hash partitioning solves write hotspots but kills range-query pruning.

Common mistakes to avoid

4 patterns
×

Missing DEFAULT partition in LIST or RANGE partitioning

Symptom
INSERT fails with ORA-14400 (or equivalent) when a value doesn't match any partition boundary. ETL pipelines halt.
Fix
Add a DEFAULT partition for LIST, and for RANGE ensure ranges cover all expected values or add a DEFAULT partition (PostgreSQL 12+ supports DEFAULT for RANGE). Use CREATE TABLE ... PARTITION OF parent DEFAULT;.
×

Using a non-sargable expression on partition key in WHERE clause

Symptom
Queries scan all partitions despite seemingly correct filtering, causing severe performance degradation.
Fix
Always use the raw column with direct comparisons. Avoid wrapping in functions like DATE_TRUNC, TO_CHAR, or CAST. Rewrite queries to use range conditions like col >= '2025-01-01' AND col < '2025-02-01'.
×

Not creating index on partition key

Symptom
Partition pruning may still occur but full scans of the relevant partition are slow without an index.
Fix
Create an index on the partition key (either on parent or on each child). PostgreSQL 12+ allows creating index on parent which propagates to children.
×

Assuming partitioning automatically improves all queries

Symptom
Queries without WHERE on partition key become slower due to overhead of managing multiple partition objects.
Fix
Ensure all critical queries include a filter on the partition key. If not possible, consider an alternative index strategy or a different partition scheme.
INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

Q01JUNIOR
Explain the concept of 'Partition Pruning'. How does the query optimizer...
Q02SENIOR
You have a table that stores logs for 5 years. Deleting old logs is taki...
Q03SENIOR
LeetCode Standard: Given a massive dataset of transactions, how would yo...
Q04SENIOR
When would you choose Hash partitioning over Range partitioning, and wha...
Q05SENIOR
Can a primary key in a partitioned table be any column, or does it have ...
Q01 of 05JUNIOR

Explain the concept of 'Partition Pruning'. How does the query optimizer use it to reduce I/O?

ANSWER
Partition pruning is the query optimizer's ability to eliminate entire partitions from the scan when the WHERE clause includes the partition key. The optimizer compares the query predicates against the metadata of each partition (bounds for range, list values, or hash ranges). Partitions that cannot contain matching rows are not scanned at all. This reduces I/O and CPU because fewer pages are read from disk. For example, a query with WHERE order_date BETWEEN '2025-01-01' AND '2025-01-31' on a table range-partitioned by month will only scan the partition for January 2025.
FAQ · 6 QUESTIONS

Frequently Asked Questions

01
What is the difference between partitioning and sharding?
02
Does partitioning improve query performance automatically?
03
Can I partition an existing table with millions of rows?
04
What is a 'Default Partition' and why do I need it?
05
How do I choose the partition key?
06
Can I alter a partition after creation?
🔥

That's Database Design. Mark it forged?

3 min read · try the examples if you haven't

Previous
CAP Theorem and Databases
9 / 16 · Database Design
Next
Database Connection Pooling