A PostgreSQL write succeeded but Elasticsearch timed out — no rollback left products missing from search.
Database Types and Their Sweet Spots
Each database type excels at specific access patterns. Relational databases enforce ACID and support complex joins – they're for transactional core. Document databases tolerate schema variations – ideal for catalogs. Key-value stores provide constant-time lookups – perfect for session data. Search engines invert indices for fast full-text queries. Graph databases traverse relationships efficiently. Time-series databases optimize for write-heavy append-only data. Choosing the wrong database for a pattern leads to painful workarounds: for example, using a relational database for full-text search leads to inefficient LIKE queries and poor relevance ranking.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
# Package: io.thecodeforge.python.system_design
# Typical polyglot architecture for an e-commerce system:
# PostgreSQL — relational: users, orders, products, payments
# - ACID transactions: order.create() and inventory.decrement() atomically
# - Complex queries: reporting, joins between entities
# - Example: user creates an order
# Redis — key-value: sessions, caching, rate limiting
# - Sub-millisecond reads: session token → user object
# - TTL-based expiry: sessions expire automatically
# - Example: cache product page for 10 minutes
# Elasticsearch — search: product search with relevance
# - Full-text search with typo tolerance
# - Faceted search: filter by price, brand, rating
# - Example: user searches 'wireless headphones'
# MongoDB (document) — product catalogue
# - Flexible schema: different products have different attributes
# - Laptop has CPU, RAM; T-shirt has size, colour
# - Example: store varied product attributes without schema migration
# Neo4j (graph) — social features, recommendations
# - 'Users who bought X also bought Y'
# - Friend-of-friend queries
# - Example: find all users within 3 hops in social graph
# InfluxDB (time-series) — metrics, analytics
# - Write-optimised for timestamped data
# - Example: page views, API response times
Output
# Each database chosen for its access pattern strengths
The Full Toolbox Analogy
- Relational = socket wrench (precise, standard, can handle many bolts)
- Document = adjustable wrench (flexible, fits odd shapes)
- Key-Value = screwdriver (fast, specific, limited range)
- Search = power drill (specialized for one task but extremely fast)
- Graph = pliers (excellent for gripping complex relationships)
Production Insight
Using the wrong database type for a workload creates production pain that's hard to undo.
Teams often attempt to 'make it work' with feature X of a relational DB, only to hit scaling limits.
Rule: if you're writing a full-text search engine on top of PostgreSQL FTS, buy yourself time but plan to migrate to a search-specific database.
Key Takeaway
Pick the database that matches the access pattern, not the one you're most comfortable with.
Your PostgreSQL won't become Elasticsearch with enough indices.
The best database is the one that makes your code simpler.
IfNeed ACID transactions across entities
→
UseUse a relational database (PostgreSQL, MySQL)
IfData structure varies per record (product attributes differ)
→
UseUse a document database (MongoDB, Couchbase)
IfSub-millisecond reads with TTL required
→
UseUse a key-value store (Redis, Memcached)
IfFull-text search with relevance ranking
→
UseUse a search engine (Elasticsearch, Solr)
IfDeep relationship traversal (friend-of-friend, recommendations)
→
UseUse a graph database (Neo4j, Amazon Neptune)
IfHigh-write throughput of timestamped data
→
UseUse a time-series database (InfluxDB, TimescaleDB)
Data Consistency Across Systems
Data consistency across multiple databases is the central challenge of polyglot persistence. When the same logical entity (e.g., a product) lives in PostgreSQL (source of truth) and Elasticsearch (search index), any update must propagate to both. Two common patterns: dual writes (application writes to both) and change data capture (CDC) where the primary database's write-ahead log is streamed to secondary systems. Dual writes are simple but fragile – partial failures leave data permanently inconsistent. The outbox pattern mitigates this by writing a message within the same transaction as the primary write, then having a separate process deliver it asynchronously. CDC avoids application-level dual writes entirely but adds infrastructure complexity (Debezium, Kafka).
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
import asyncio
# When a product is created: must update PostgreSQL AND Elasticsearch
# Option 1: Dual write (naive — risks partial failure)
async def create_product_bad(product):
await postgres.insert('products', product) # succeeds
await elasticsearch.index('products', product) # what if this fails?
# Product exists in Postgres but not in search — inconsistency
# Option 2: Write to primary, sync via CDC (Change Data Capture)
# Debezium reads PostgreSQL WAL → publishes to Kafka → Elasticsearch consumer
# Primary write is source of truth; Elasticsearch is eventually consistent
# Option 3: Outbox pattern
async def create_product_outbox(product):
async with postgres.transaction():
await postgres.insert('products', product)
await postgres.insert('outbox', {
'event': 'product_created',
'data': product,
'processed': False
})
# Separate process reads outbox and syncs to Elasticsearch
# If it fails, retries safely (at-least-once delivery)
print('Outbox pattern ensures eventual consistency')Output
Outbox pattern ensures eventual consistency
Production Insight
Dual writes fail silently in production – a timeout on the secondary write doesn't roll back the primary.
The outbox pattern gives you at-least-once delivery guarantees, but requires careful deduplication.
Rule: never dual-write to two databases without a transactional outbox or CDC.
Key Takeaway
Dual writes are the #1 cause of data inconsistency in polyglot systems.
Use the outbox pattern or CDC for reliable synchronization.
Accept eventual consistency – your users will forgive a few seconds of lag, not permanently missing data.
When to Adopt Polyglot Persistence
Polyglot persistence adds operational cost. Adopt it only when a single database clearly fails to meet requirements. Signs you need multiple databases: your relational database has a 3-second full-text search query, your document database can't enforce a unique constraint across documents, your key-value store can't do aggregation queries. Start with one database (usually relational) and add others incrementally. The rule: each new database must solve a problem that the existing stack cannot solve without significant hackery. If you can solve it with a secondary index, a materialized view, or a dedicated read replica, do that first.
Production Insight
Teams often adopt polyglot persistence prematurely, adding five databases to a system that only serves 1000 QPS.
The operational burden of managing multiple DBs – backups, monitoring, patching – grows non-linearly.
Rule: only add a database when you can't make the existing one work, and be ready to hire specialists for each.
Key Takeaway
Polyglot persistence is a solution, not a design goal.
Start with one database – add others only when you hit a measurable wall.
Each additional database is a long-term commitment to operational maintenance.
IfCurrent database handles 95% of patterns adequately
→
UseDo not adopt polyglot – optimize existing database first.
IfSpecific pattern is 10x slower than alternative database
→
UseConsider adding one more database for that pattern.
IfTwo distinct patterns with conflicting requirements
→
UseSplit into two databases, each optimized for its pattern.
Operational Complexity Realities
Running a polyglot system means managing multiple database technologies, each with its own backup strategy, monitoring stack, scaling approach, and failure modes. You need expertise in each database – a PostgreSQL DBA might not know Elasticsearch shard sizing. Monitoring must cover each database's health metrics. Incident response becomes more complex because a single business transaction touches multiple systems. Invest in automation: consistent deployment via IaC, standardized monitoring dashboards, and runbooks for each database. The cost of this operational overhead must be justified by the business value gained from using the right tool.
Production Insight
In an outage, diagnosing a polyglot system takes longer because you have to check multiple databases and their sync state.
Example: a product reduction in PostgreSQL not reflecting in search because the CDC consumer crashed 4 hours ago.
Rule: invest heavily in observability (distributed tracing, log aggregation, synthetic transactions) across all databases.
Key Takeaway
Polyglot persistence doesn't eliminate operational complexity – it distributes it.
Your monitoring and alerting must cover every database's health and sync pipeline.
You can't have a polyglot system without a dedicated SRE team or equivalent automation.
Data Synchronization Patterns (Outbox, CDC, Batch)
Three common patterns to synchronize data between databases: 1) Outbox pattern: write to primary and an outbox table in the same transaction. A separate process reads the outbox and pushes to secondary systems. Guarantees at-least-once delivery. Requires deduplication. 2) Change Data Capture: use a tool like Debezium to stream changes from the primary database's transaction log. Decouples application from sync logic. Adds infrastructure components (Kafka, connectors). 3) Batch synchronization: scheduled jobs that periodically compare data between systems. Simple to implement but latency can be minutes to hours. Only suitable for non-critical data. Choose based on latency requirements and infrastructure maturity.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
# Package: io.thecodeforge.python.sync_patterns
# Outbox pattern - reliable async sync
# Write to PostgreSQL and outbox in same transaction
async def outbox_create_product(product):
async with postgres.transaction():
await postgres.insert('products', product)
await postgres.insert('outbox', {
'event': 'product_created',
'data': product,
'processed': False
})
# Worker process (runs periodically or via message queue)
async def outbox_worker():
events = await postgres.query(
"SELECT * FROM outbox WHERE processed = false"
)
for event in events:
try:
await elasticsearch.index('products', event.data)
await postgres.update('outbox',
{'processed': True},
{'id': event.id}
)
except Exception:
# Retry later (exponential backoff)
pass
print('Outbox pattern ensures eventual consistency with retries')Output
Outbox pattern ensures eventual consistency with retries
Production Insight
CDC via Debezium + Kafka is the gold standard for low-latency sync, but it introduces operational complexity.
The outbox pattern is simpler and works well for most systems, provided the worker is resilient and idempotent.
Batch sync is the fallback – good for reporting systems where hourly updates are acceptable.
Rule: choose outbox for new systems; CDC for existing ones where you can't modify the application.
Key Takeaway
Outbox pattern is the simplest reliable sync mechanism – start there.
CDC decouples sync from application code but adds infrastructure.
Batch sync is for non-critical, high-latency-tolerant use cases only.
IfLow latency required (< 1s), can modify application code
→
UseUse outbox pattern with a dedicated worker.
IfLow latency required, cannot modify application
→
UseUse CDC (Debezium + Kafka).
IfHigh latency acceptable (minutes to hours)
→
UseUse batch synchronization (cron job, scheduled query).