Materialized Views SQL - Refresh Order Bug Doubles Revenue
- A materialized view is a real table on disk — index it like one. Without indexes, you've pre-computed the answer but you're still reading every row to find it.
- REFRESH MATERIALIZED VIEW CONCURRENT is almost always the right choice in production, but it needs a UNIQUE INDEX to work — create that index at the same time you create the view, not as an afterthought.
- Chained materialized views must be refreshed in strict dependency order — the database won't enforce this for you, and wrong order means stale-on-stale data with no errors or warnings.
- Materialized views store query results as physical tables on disk
- Refresh strategies: plain lock, concurrent (no-read-block), incremental (Oracle)
- Always build indexes after creation — unindexed mat views are half the solution
- Staleness is the core trade-off: accept lag for speed; monitor last_refresh
- Biggest mistake: forgetting to refresh chained mat views in dependency order
Materialized View Quick Debug Cheat Sheet
View is empty
SELECT last_refresh FROM pg_matviews WHERE matviewname = 'your_mv';REFRESH MATERIALIZED VIEW your_mv;Concurrent refresh fails
SELECT indexname FROM pg_indexes WHERE tablename = 'your_mv';CREATE UNIQUE INDEX idx_mv_pk ON your_mv (pk_col1, pk_col2);View returns old data after schema change
REFRESH MATERIALIZED VIEW your_mv;If error: roll back migration or fix column references.Chained views show inconsistent data
SELECT pg_get_viewdef('your_mv', true) AS definition;Cross-reference with other mat views in information_schema.views.Last refresh is hours stale despite scheduled job
SELECT * FROM cron.job WHERE jobname LIKE '%refresh%';SELECT * FROM cron.job_run_details WHERE jobname LIKE '%refresh%' ORDER BY start_time DESC;Production Incident
mv_raw_orders -> mv_revenue_summary. The cron job refreshed mv_revenue_summary first (the dependent view) and mv_raw_orders second. Because mv_revenue_summary read stale data from mv_raw_orders, the refresh accidentally carried over data from the previous refresh cycle, duplicating rows from two days ago. A bug in the refresh script also failed to wrap the two refreshes in a single transaction, so when the second refresh failed due to a transient deadlock, the first one had already committed — leaving the dashboard with an incomplete, merged dataset.Production Debug GuideSymptom → Action guide for the most common production failures
Every production database eventually hits the same wall: a handful of queries that are crushingly expensive, run constantly, and touch the same underlying data. Business intelligence dashboards, leaderboards, financial summaries, multi-table aggregations — these queries can take seconds or even minutes on large datasets, and they get fired dozens of times per minute. That latency doesn't just slow your app down; it locks rows, spikes CPU, and can cascade into a full outage under load.
Materialized views exist to break that cycle. Unlike a regular (virtual) view — which is just a saved SQL statement that re-executes every time you query it — a materialized view physically stores the result set on disk. Query it and the database reads pre-computed rows instead of re-joining six tables and aggregating millions of records. The trade-off is freshness: the stored result can grow stale between refreshes. Understanding how to manage that trade-off is what separates engineers who use materialized views effectively from those who introduce subtle, hard-to-debug data inconsistencies into production.
By the end of this article you'll understand how materialized view storage and indexing works under the hood, how to choose between complete, fast, and incremental refresh strategies, when staleness is acceptable versus dangerous, and the exact production mistakes that bite teams who don't read the fine print. We'll use PostgreSQL for most examples (the most nuanced implementation) and call out Oracle and SQL Server differences where they matter.
How Materialized Views Actually Store Data (And Why That Changes Everything)
A regular view is essentially a macro — the database swaps in its definition at parse time and executes it fresh every time. No storage. No state. A materialized view is a real table on disk that happens to know how it was built.
In PostgreSQL, when you run CREATE MATERIALIZED VIEW, the engine executes the defining query immediately and writes the result rows into a heap file — the same physical storage format used by ordinary tables. From that point on, querying the materialized view is indistinguishable from querying a table. You get sequential scans, bitmap index scans, all of it.
Because it's a real table, you can — and should — build indexes on it. This is the feature most developers miss. If your materialized view aggregates sales by region and date, you can create a B-tree index on (region, sale_date) so point lookups stay sub-millisecond even if the materialized view contains tens of millions of rows.
The defining query isn't re-evaluated on every SELECT. It's only re-evaluated when you explicitly REFRESH MATERIALIZED VIEW. This means the planner can't 'look through' the materialized view to push predicates down into the base tables — it just scans the stored rows. That's usually a win, but it's something to keep in mind when your filter selectivity is very high.
-- Assume we have these base tables in a production analytics schema: -- orders(order_id, customer_id, product_id, region, order_date, amount) -- products(product_id, category, unit_cost) -- This query is expensive: it joins two large tables, aggregates, and -- gets called by the dashboard on every page load (~200 req/min). -- Without a materialized view it takes ~4 seconds. Unacceptable. CREATE MATERIALIZED VIEW mv_regional_sales_summary AS SELECT o.region, p.category AS product_category, DATE_TRUNC('month', o.order_date) AS sale_month, COUNT(o.order_id) AS total_orders, SUM(o.amount) AS total_revenue, SUM(o.amount - p.unit_cost) AS total_profit, AVG(o.amount) AS avg_order_value FROM orders o JOIN products p ON o.product_id = p.product_id GROUP BY o.region, p.category, DATE_TRUNC('month', o.order_date) ORDER BY sale_month DESC, total_revenue DESC WITH DATA; -- Execute the query NOW and store results immediately. -- Use WITH NO DATA to create the structure without populating it. -- Now build indexes on the most common filter/join patterns. -- Without this step you've only solved half the problem. CREATE INDEX idx_mv_sales_region_month ON mv_regional_sales_summary (region, sale_month); CREATE INDEX idx_mv_sales_category ON mv_regional_sales_summary (product_category); -- Verify what was stored — should return rows immediately at near-zero cost EXPLAIN ANALYZE SELECT region, total_revenue FROM mv_regional_sales_summary WHERE region = 'EMEA' AND sale_month >= '2024-01-01'; -- Check the physical size of the materialized view on disk SELECT pg_size_pretty(pg_total_relation_size('mv_regional_sales_summary')) AS total_size, pg_size_pretty(pg_relation_size('mv_regional_sales_summary')) AS table_size, pg_size_pretty( pg_total_relation_size('mv_regional_sales_summary') - pg_relation_size('mv_regional_sales_summary') ) AS index_size;
-- Index Scan using idx_mv_sales_region_month on mv_regional_sales_summary
-- (cost=0.28..8.30 rows=6 width=52)
-- (actual time=0.041..0.063 rows=6 loops=1)
-- Planning Time: 0.18 ms
-- Execution Time: 0.12 ms <-- was 4,200 ms before. That's a 35,000x improvement.
-- Size query output:
-- total_size | table_size | index_size
-- ------------+------------+------------
-- 2184 kB | 1424 kB | 760 kB
Refresh Strategies: Complete, Concurrent and Incremental — Choosing the Right One
Refreshing a materialized view is where most of the complexity lives. You're essentially answering: how stale can this data be, and how much disruption can I tolerate during the refresh itself?
PostgreSQL gives you two flavours of REFRESH. A plain REFRESH MATERIALIZED VIEW takes an exclusive lock — no reads allowed during the swap. It drops the old data, re-executes the full defining query, and writes new rows. On a small view this is fine. On a view with 50 million rows, you can lock out your dashboard for 10 minutes.
REFRESH MATERIALIZED VIEW CONCURRENTLY is the production-safe alternative. It re-computes the full query behind the scenes, diffs it against the current content using a unique index, and swaps only the changed rows — all while allowing concurrent reads. The cost: it takes roughly twice as long and requires a unique index on the materialized view.
Oracle goes further with incremental (fast) refresh, where the engine tracks base-table changes in materialised view logs and applies only the delta — no full re-scan. This is dramatically faster for large datasets with small change volumes, but it comes with strict restrictions on what SQL you can use in the defining query. SQL Server's indexed views are the closest equivalent and they refresh synchronously on every DML statement — zero staleness, but DML overhead you need to account for.
-- ============================================================ -- STRATEGY 1: Plain refresh — simple but locks out readers -- Use this for: non-critical views, off-hours batch jobs, -- views that are fast to recompute (< 1 second) -- ============================================================ REFRESH MATERIALIZED VIEW mv_regional_sales_summary; -- This acquires AccessExclusiveLock. Concurrent SELECTs will WAIT. -- Safe for batch jobs at 2am. Not safe during business hours. -- ============================================================ -- STRATEGY 2: Concurrent refresh — production-safe -- Requirement: the materialized view MUST have a unique index. -- ============================================================ -- First, add the required unique index. We create a surrogate -- key by combining our natural key columns. CREATE UNIQUE INDEX idx_mv_sales_unique_key ON mv_regional_sales_summary (region, product_category, sale_month); -- ^^^ Without this, CONCURRENT refresh will throw: -- ERROR: cannot refresh materialized view concurrently without a unique index -- Now the safe, concurrent refresh: REFRESH MATERIALIZED VIEW CONCURRENTLY mv_regional_sales_summary; -- PostgreSQL re-runs the full query in the background, compares -- results, and performs targeted INSERT/UPDATE/DELETE operations. -- Readers are NEVER blocked. Writers on the base tables are not blocked. -- This takes ~2x longer but causes zero downtime. -- ============================================================ -- STRATEGY 3: Automate refresh with pg_cron (or a scheduler) -- Refresh every 15 minutes during business hours -- ============================================================ -- Install pg_cron extension (requires superuser, done once) CREATE EXTENSION IF NOT EXISTS pg_cron; -- Schedule concurrent refresh every 15 minutes, 8am-8pm weekdays SELECT cron.schedule( 'refresh-sales-summary', -- job name '*/15 8-20 * * 1-5', -- cron expression $$REFRESH MATERIALIZED VIEW CONCURRENTLY mv_regional_sales_summary$$ ); -- Check scheduled jobs SELECT jobname, schedule, command, active FROM cron.job WHERE jobname = 'refresh-sales-summary'; -- ============================================================ -- MONITORING: How stale is our materialized view right now? -- ============================================================ SELECT schemaname, matviewname, -- last_refresh is NULL if the view was never refreshed -- (i.e., created WITH NO DATA) last_refresh, NOW() - last_refresh AS data_age, CASE WHEN NOW() - last_refresh > INTERVAL '30 minutes' THEN 'STALE — consider refreshing' ELSE 'Fresh enough' END AS freshness_status FROM pg_matviews WHERE matviewname = 'mv_regional_sales_summary';
-- jobname | schedule | command | active
-- --------------------------+-----------------+--------------------------------------------+--------
-- refresh-sales-summary | */15 8-20 * * 1-5 | REFRESH MATERIALIZED VIEW CONCURRENTLY ... | t
-- Freshness check output (run 22 minutes after last refresh):
-- schemaname | matviewname | last_refresh | data_age | freshness_status
-- ------------+---------------------------+-----------------------------+------------------+----------------------------
-- public | mv_regional_sales_summary | 2024-06-15 14:15:03.421+00 | 00:22:17.843291 | STALE — consider refreshing
Production Gotchas: Staleness, Dependency Chains and Schema Drift
Using materialized views in production means living with three unavoidable tensions: staleness, fragile dependency chains, and schema evolution.
Staleness is obvious but its consequences are subtle. A dashboard showing last-hour totals from a stale materialized view can mislead a sales team into over-selling inventory. Worse, if two systems read from different materialized views that refresh on different schedules, you can get internally inconsistent reports — your revenue total doesn't match your order count total. Always document the refresh cadence alongside every materialized view and surface data_age to end users when it exceeds a threshold.
Dependency chains are the silent killer. Materialized view B built on materialized view A doesn't automatically refresh when A refreshes. You must refresh them in topological order — A first, then B. PostgreSQL won't warn you. You'll get stale-on-stale data silently.
Schema drift happens when someone alters a base table column — renames it, changes its type, drops it — and the materialized view quietly breaks. Unlike regular views, PostgreSQL doesn't invalidate a materialized view when the base table changes (it will error at refresh time, not at query time). Your dashboards will keep returning stale data from the last successful refresh with no error, until the next refresh crashes.
-- ============================================================ -- SCENARIO: Chained materialized views (A -> B) -- mv_raw_order_metrics depends on base tables -- mv_executive_dashboard depends on mv_raw_order_metrics -- ============================================================ CREATE MATERIALIZED VIEW mv_raw_order_metrics AS SELECT customer_id, DATE_TRUNC('day', order_date) AS order_day, COUNT(*) AS daily_order_count, SUM(amount) AS daily_spend FROM orders GROUP BY customer_id, DATE_TRUNC('day', order_date) WITH DATA; CREATE UNIQUE INDEX idx_raw_metrics_unique ON mv_raw_order_metrics (customer_id, order_day); -- This view depends on the one above, NOT on the base table directly CREATE MATERIALIZED VIEW mv_executive_dashboard AS SELECT order_day, COUNT(DISTINCT customer_id) AS active_customers, SUM(daily_order_count) AS platform_orders, SUM(daily_spend) AS platform_revenue, AVG(daily_spend) AS avg_customer_spend FROM mv_raw_order_metrics -- <-- reading from another mat view! GROUP BY order_day ORDER BY order_day DESC WITH DATA; -- WRONG: refreshing in the wrong order gives stale-on-stale data -- REFRESH MATERIALIZED VIEW CONCURRENTLY mv_executive_dashboard; -- REFRESH MATERIALIZED VIEW CONCURRENTLY mv_raw_order_metrics; -- ^^^ This refreshes the exec dashboard BEFORE the raw metrics are updated. -- The exec dashboard will reflect yesterday's raw metrics for another cycle. -- CORRECT: always refresh in dependency order (leaves first) REFRESH MATERIALIZED VIEW CONCURRENTLY mv_raw_order_metrics; REFRESH MATERIALIZED VIEW CONCURRENTLY mv_executive_dashboard; -- Now the exec dashboard reflects today's raw data. -- ============================================================ -- DEFENSIVE PATTERN: Wrap chained refreshes in a transaction -- with error handling using a stored procedure -- ============================================================ CREATE OR REPLACE PROCEDURE refresh_dashboard_views() LANGUAGE plpgsql AS $$ BEGIN -- Refresh in strict dependency order REFRESH MATERIALIZED VIEW CONCURRENTLY mv_raw_order_metrics; RAISE NOTICE 'mv_raw_order_metrics refreshed at %', NOW(); REFRESH MATERIALIZED VIEW CONCURRENTLY mv_executive_dashboard; RAISE NOTICE 'mv_executive_dashboard refreshed at %', NOW(); EXCEPTION WHEN OTHERS THEN -- Log the error but don't crash silently. -- In production you'd INSERT into an error_log table here. RAISE WARNING 'Dashboard refresh failed: % — %', SQLSTATE, SQLERRM; -- The materialized views retain their last successful data. -- Stale data is better than no data for a dashboard. END; $$; -- Schedule this procedure instead of raw REFRESH commands SELECT cron.schedule( 'refresh-dashboard-chain', '*/15 8-20 * * 1-5', 'CALL refresh_dashboard_views()' ); -- ============================================================ -- DETECTING SCHEMA DRIFT BEFORE IT SILENTLY CORRUPTS REPORTS -- Run this check as part of your deployment pipeline -- ============================================================ SELECT mv.matviewname, mv.definition, -- If the view references columns that no longer exist, this -- will surface as an error when you try to refresh. -- Pre-validate by attempting a dry-run EXPLAIN: 'EXPLAIN (FORMAT TEXT) SELECT * FROM ' || mv.matviewname || ' LIMIT 0' AS validation_query FROM pg_matviews mv WHERE mv.schemaname = 'public' ORDER BY mv.matviewname;
-- NOTICE: mv_raw_order_metrics refreshed at 2024-06-15 14:30:01.123456+00
-- NOTICE: mv_executive_dashboard refreshed at 2024-06-15 14:30:04.987654+00
-- If a base table column was dropped and refresh is attempted:
-- WARNING: Dashboard refresh failed: 42703 —
-- column "amount" of relation "orders" does not exist
-- (The materialized views retain their last valid data — no silent data loss)
When to Use Materialized Views vs. Regular Views, Caching Layers and Summary Tables
Materialized views aren't always the right tool. Making that call correctly is what distinguishes senior database work from cargo-culting.
Use a regular view when the underlying query is fast (under ~100ms), the data must be absolutely real-time, or you need the planner to push predicates into the base tables. Regular views compose well and have zero refresh overhead.
Use a materialized view when: the query is expensive and runs frequently, a few minutes (or hours) of staleness is acceptable to the business, and you need indexes on the result set. The sweet spot is read-heavy analytics against tables that are written to in batch (nightly ETL, hourly imports). It's a terrible fit for financial ledgers or inventory counts where users will make decisions based on the number.
Use an application-level cache (Redis, Memcached) when your query result is tiny, needs sub-millisecond reads, or needs to be shared across database clusters. A materialized view doesn't help if the bottleneck is network round-trips to the DB.
Use a dedicated summary table with manual INSERT/UPDATE logic when you need incremental updates that materialized view refresh can't support — like a running total that must be accurate to the last second. It's more code, but it's more control.
-- ============================================================ -- SCENARIO: Compare the same business question answered three ways. -- Question: What is the total lifetime spend per customer? -- Base table: orders (500 million rows, growing 1M rows/day) -- ============================================================ -- OPTION 1: Regular View — always fresh, always slow on 500M rows CREATE OR REPLACE VIEW vw_customer_lifetime_spend AS SELECT customer_id, COUNT(order_id) AS total_orders, SUM(amount) AS lifetime_spend, MIN(order_date) AS first_order_date, MAX(order_date) AS most_recent_order FROM orders GROUP BY customer_id; -- Querying this for a single customer does a full 500M-row aggregation. -- EXPLAIN shows: HashAggregate -> Seq Scan on orders (~12 seconds) -- OPTION 2: Materialized View — refreshed nightly, fast reads CREATE MATERIALIZED VIEW mv_customer_lifetime_spend AS SELECT customer_id, COUNT(order_id) AS total_orders, SUM(amount) AS lifetime_spend, MIN(order_date) AS first_order_date, MAX(order_date) AS most_recent_order FROM orders GROUP BY customer_id WITH DATA; CREATE UNIQUE INDEX idx_mv_clv_customer ON mv_customer_lifetime_spend (customer_id); -- Now a point lookup for one customer: ~0.1ms. Nightly full refresh: ~45 seconds. -- Staleness: up to 24 hours. Acceptable for a CRM 'about this customer' panel. -- NOT acceptable for a real-time fraud check. -- OPTION 3: Running summary table — always accurate, incremental updates -- Best when you need real-time accuracy without the full-scan cost. CREATE TABLE customer_spend_summary ( customer_id BIGINT PRIMARY KEY, total_orders BIGINT NOT NULL DEFAULT 0, lifetime_spend NUMERIC(14,2) NOT NULL DEFAULT 0, first_order_date DATE, last_order_date DATE, updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW() ); -- Trigger that maintains the summary table on every new order CREATE OR REPLACE FUNCTION fn_update_customer_spend_summary() RETURNS TRIGGER LANGUAGE plpgsql AS $$ BEGIN INSERT INTO customer_spend_summary (customer_id, total_orders, lifetime_spend, first_order_date, last_order_date, updated_at) VALUES (NEW.customer_id, 1, NEW.amount, NEW.order_date, NEW.order_date, NOW()) ON CONFLICT (customer_id) DO UPDATE SET total_orders = customer_spend_summary.total_orders + 1, lifetime_spend = customer_spend_summary.lifetime_spend + NEW.amount, -- LEAST/GREATEST keep the min/max without scanning historical data first_order_date = LEAST(customer_spend_summary.first_order_date, NEW.order_date), last_order_date = GREATEST(customer_spend_summary.last_order_date, NEW.order_date), updated_at = NOW(); RETURN NEW; END; $$; CREATE TRIGGER trg_orders_update_spend_summary AFTER INSERT ON orders FOR EACH ROW EXECUTE FUNCTION fn_update_customer_spend_summary(); -- Result: point lookups in ~0.1ms AND data is accurate to the last millisecond. -- Cost: every INSERT into orders now also writes to customer_spend_summary. -- On 1M orders/day: ~12ms extra latency per insert batch. Usually acceptable.
--
-- vw_customer_lifetime_spend (regular view): 11,847 ms (full table scan)
-- mv_customer_lifetime_spend (mat view, stale): 0.09 ms (index lookup, up to 24h stale)
-- customer_spend_summary (summary table): 0.07 ms (primary key lookup, real-time)
--
-- Mat view nightly REFRESH duration: 44,231 ms (73 seconds on 500M rows)
-- Summary table trigger overhead per INSERT: ~12 ms (microseconds amortized)
Monitoring Materialized View Freshness and Performance in Production
You can't trust a materialized view unless you know how fresh it is and how healthy its refresh process is. PostgreSQL provides the pg_matviews system catalog that exposes last_refresh — but that's just the start.
Build monitoring around three signals:
- Freshness lag: query
Nand alert when it exceeds 2x the expected refresh interval. A stale view with a broken job looks perfectly healthy to the application — queries return instantly, but the data is wrong.OW()- last_refresh - Refresh duration: track how long each REFRESH takes. A sudden increase often means the base table has grown significantly, the defining query has regressed, or the mat view indexes are bloated.
- Refresh failures: catch errors from refresh commands. A failed refresh means the mat view keeps its last successful data — users see old data with no error. This is the most dangerous class of failure.
For Oracle, use DBA_MVIEWS and DBA_MVIEW_REFRESH_TIMES. For SQL Server, the sys.objects and sys.dm_db_index_physical_stats provide similar insights.
Advanced technique: create a heartbeat table that gets updated every time a critical mat view is refreshed. The monitoring system checks the heartbeat instead of querying system tables — simpler and cross-database.
-- ============================================================ -- MONITORING: PostgreSQL mat view health check -- Run every minute from a monitoring tool (cron, pg_cron, or external) -- ============================================================ -- Signal 1: Freshness lag SELECT matviewname, last_refresh, EXTRACT(EPOCH FROM (NOW() - last_refresh)) / 60 AS minutes_since_refresh, CASE WHEN last_refresh IS NULL THEN 'NEVER_REFRESHED' WHEN NOW() - last_refresh > INTERVAL '30 minutes' THEN 'STALE' ELSE 'FRESH' END AS status FROM pg_matviews WHERE schemaname = 'public'; -- Signal 2: Refresh duration (if you log start/end to a custom table) CREATE TABLE mv_refresh_log ( id SERIAL PRIMARY KEY, matview_name TEXT NOT NULL, started_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), finished_at TIMESTAMPTZ, duration_ms INTEGER, success BOOLEAN, error_message TEXT ); -- Wrap refresh with logging (example for one mat view) INSERT INTO mv_refresh_log (matview_name, started_at) VALUES ('mv_regional_sales_summary', NOW()); REFRESH MATERIALIZED VIEW CONCURRENTLY mv_regional_sales_summary; UPDATE mv_refresh_log SET finished_at = NOW(), duration_ms = EXTRACT(EPOCH FROM (NOW() - started_at)) * 1000, success = TRUE WHERE id = currval('mv_refresh_log_id_seq'); -- Query recent refresh performance SELECT matview_name, AVG(duration_ms)::INT AS avg_duration_ms, MAX(duration_ms) AS max_duration_ms, COUNT(*) AS total_refreshes, SUM(CASE WHEN success = FALSE THEN 1 ELSE 0 END) AS failures FROM mv_refresh_log WHERE started_at > NOW() - INTERVAL '7 days' GROUP BY matview_name ORDER BY avg_duration_ms DESC; -- Signal 3: Alert on failures SELECT matview_name, started_at, error_message FROM mv_refresh_log WHERE success = FALSE AND started_at > NOW() - INTERVAL '1 hour'; -- ============================================================ -- Heartbeat table approach (cross-database portable) -- ============================================================ CREATE TABLE mv_heartbeat ( matview_name TEXT PRIMARY KEY, last_refresh_ok TIMESTAMPTZ NOT NULL DEFAULT NOW(), expected_refresh_interval INTERVAL NOT NULL ); -- Update heartbeat after every successful refresh UPDATE mv_heartbeat SET last_refresh_ok = NOW() WHERE matview_name = 'mv_regional_sales_summary'; -- Monitoring query SELECT matview_name, last_refresh_ok, NOW() - last_refresh_ok AS age, expected_refresh_interval, CASE WHEN NOW() - last_refresh_ok > expected_refresh_interval * 2 THEN 'ALERT: stale' ELSE 'OK' END AS status FROM mv_heartbeat;
-- matview_name | minutes_since_refresh | status
-- -----------------------------+-----------------------+--------
-- mv_regional_sales_summary | 22 | FRESH
-- mv_executive_dashboard | 480 | STALE <-- job likely died
-- Refresh log:
-- matview_name | avg_duration_ms | max_duration_ms | failures
-- --------------------------+-----------------+-----------------+---------
-- mv_regional_sales_summary | 42100 | 45800 | 0
-- mv_executive_dashboard | 78400 | 89120 | 3
| Aspect | Regular View | Materialized View | Summary Table (Trigger-Maintained) |
|---|---|---|---|
| Data freshness | Always real-time | Stale until next REFRESH | Real-time (per-row trigger) |
| Read performance | As slow as the defining query | As fast as a table + index | As fast as a table + index |
| Write overhead | None | None on DML; burst cost on REFRESH | Small overhead on every DML |
| Storage cost | Zero (no data stored) | Full result set on disk | Full result set on disk |
| Supports indexes | No (it's a virtual layer) | Yes — build any index you need | Yes — it is a real table |
| Refresh mechanism | Automatic (every query) | Manual or scheduled REFRESH | Automatic via trigger or app logic |
| Handles aggregate updates | Recomputes from scratch | Recomputes from scratch | Incremental delta only |
| Schema change resilience | Errors immediately on query | Errors only at next REFRESH | Errors immediately on INSERT |
| Concurrent reads during refresh | Always allowed | Blocked (plain) / Allowed (CONCURRENT) | Always allowed |
| Best for | Simple aliasing, security views | Read-heavy analytics, dashboards | Real-time counters, running totals |
🎯 Key Takeaways
- A materialized view is a real table on disk — index it like one. Without indexes, you've pre-computed the answer but you're still reading every row to find it.
- REFRESH MATERIALIZED VIEW CONCURRENT is almost always the right choice in production, but it needs a UNIQUE INDEX to work — create that index at the same time you create the view, not as an afterthought.
- Chained materialized views must be refreshed in strict dependency order — the database won't enforce this for you, and wrong order means stale-on-stale data with no errors or warnings.
- Staleness is the core trade-off: materialized views are wrong for real-time financial or inventory data, perfect for analytical aggregations where a 15-minute lag is business-acceptable. Always get that SLA in writing before choosing your tool.
- Monitor mat view freshness as rigorously as you monitor application health. Stale data that looks correct is more dangerous than a query that fails.
⚠ Common Mistakes to Avoid
Interview Questions on This Topic
- QWhat's the difference between a regular view and a materialized view, and how would you decide which one to use for a dashboard that's queried 500 times per minute?Mid-levelReveal
- QExplain how REFRESH MATERIALIZED VIEW CONCURRENTLY works internally. Why does it require a unique index, and what are the trade-offs compared to a plain refresh?SeniorReveal
- QYou have a materialized view built on top of another materialized view. A critical bug report comes in saying the executive dashboard shows revenue numbers that are one day behind, even though the refresh jobs have been running. What are the possible causes and how do you diagnose them?SeniorReveal
Frequently Asked Questions
Do materialized views update automatically when the underlying table changes?
No — in PostgreSQL and most databases, materialized views are NOT automatically updated when base table data changes. You must explicitly run REFRESH MATERIALIZED VIEW (or schedule it). The only exception is SQL Server's indexed views, which update synchronously on every DML statement, at the cost of increased write overhead.
Can you add indexes to a materialized view in PostgreSQL?
Yes, and you absolutely should. A PostgreSQL materialized view is stored as a regular heap table, so you can CREATE INDEX on any column just like a normal table. You also need at least one UNIQUE index if you want to use REFRESH MATERIALIZED VIEW CONCURRENTLY. Add indexes immediately after creation — they won't be created automatically.
What happens to a materialized view if I drop or rename a column in the base table?
In PostgreSQL, the materialized view keeps returning its last cached data without any error at query time — it just reads the stored rows, which still exist. The error only surfaces on the next REFRESH MATERIALIZED VIEW call. This means you can silently serve stale, incorrect data until the next scheduled refresh. Always refresh all dependent materialized views immediately after any schema migration to detect breakage right away.
How can I create a materialized view that refreshes automatically in PostgreSQL?
PostgreSQL does not have a built-in automatic refresh mechanism. You must use an external scheduler or the pg_cron extension. With pg_cron, you can schedule a function that runs REFRESH MATERIALIZED VIEW CONCURRENTLY at your desired interval. Other options include using Linux cron, application-level schedulers (like a background thread in your app), or orchestration tools like Airflow or dbt.
Developer and founder of TheCodeForge. I built this site because I was tired of tutorials that explain what to type without explaining why it works. Every article here is written to make concepts actually click.