Strangler Fig: Bidirectional Sync Failure Lost Finances
With 40% of traffic routed to the new service, financial data was lost.
- Intercept traffic at the edge proxy, route individual features to new services
- Old system stays live until all functionality is migrated — no big bang
- Risk is bounded to the slice currently being migrated, not the entire system
- Data sync between old and new is the hardest part — expect weeks of reconciliation
- Rollback means flipping traffic back to legacy, cheap and fast
- Biggest mistake: migrating the database before the service — you'll need dual writes
Imagine a giant old oak tree in your garden. A strangler fig vine wraps around it, growing its own roots and branches, slowly taking over — until one day the oak rots away and only the fig is left, strong and healthy. Nobody had to chop the oak down overnight. That's exactly what this pattern does to legacy software: you grow a new system around the old one, route traffic to the new parts gradually, and quietly retire the old code piece by piece.
Every senior engineer has a war story about the legacy monolith. The codebase that nobody dares touch, where a one-line change takes three weeks of regression testing and still breaks something in production at 2 AM on a Friday. These systems didn't become terrifying overnight — they grew that way over years of feature additions, hotfixes, and 'we'll clean this up later' compromises. The business depends on them. You cannot simply turn them off.
The Strangler Fig Pattern, coined by Martin Fowler in 2004 after observing actual strangler fig trees in Australian rainforests, is an architectural migration strategy that solves one specific problem: how do you replace a working-but-painful system with a better one without a risky, all-or-nothing 'big-bang' rewrite? The answer is that you don't replace it all at once. You intercept traffic at the edge, divert individual capabilities to new services as they're built, and let the old system die by starvation rather than demolition. The risk at any point in time is bounded to the slice you're currently migrating.
By the end of this article you'll understand the full mechanics of the pattern — the proxy/facade layer, feature-by-feature traffic routing, data synchronisation between old and new, rollback strategies, and the production gotchas that turn a smooth migration into a nightmare if you don't see them coming. You'll also have working code for the routing facade and a feature-flag-driven traffic splitter you can adapt to your own stack today.
What Is the Strangler Fig Pattern? (And Why Your Team Needs It)
The Strangler Fig Pattern is a migration strategy that lets you replace a legacy system incrementally, one feature at a time. You put a routing layer — a reverse proxy, API gateway, or even a smart load balancer — in front of the existing monolith. Every incoming request hits this facade instead of the legacy app directly.
The facade checks a routing table (often backed by feature flags) and decides whether to send the request to the old system or the new service. Over time, you build a replacement service for each capability while the legacy app still handles everything else. When a replacement is ready, you route that capability's traffic to it. Once a feature is fully replaced and tested, you remove the legacy code for that feature.
This isn't a new idea — Martin Fowler described it in 2004. But most teams still default to the 'rewrite it all' approach, which collapses under its own risk. The Strangler Fig pattern caps the blast radius of any mistake to exactly one feature.
- You never rewrite 'everything'. You pick one capability — login, search, payments — and replace that.
- The legacy system continues running all unmigrated features. Zero risk outside the slice.
- If the new service fails, you flip the routing rule back. The legacy system never stopped.
- This pattern works because each slice is small enough to reason about, test, and roll back independently.
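The routing table described above can be sketched in a few lines. This is a minimal illustration rather than a production facade: the `RoutingTable` class, the path prefixes, and the backend name `search-service` are all hypothetical, and a real gateway would load these rules from a config store that can change at runtime.

```python
from dataclasses import dataclass, field

LEGACY = "legacy"


@dataclass
class RoutingTable:
    """Maps path prefixes to the backend that currently owns them."""

    # prefix -> backend name; anything not listed stays on legacy
    migrated: dict[str, str] = field(default_factory=dict)

    def migrate(self, prefix: str, backend: str) -> None:
        """Send one slice of traffic to a new backend."""
        self.migrated[prefix] = backend

    def roll_back(self, prefix: str) -> None:
        """Return a slice to legacy by deleting its rule."""
        self.migrated.pop(prefix, None)

    def resolve(self, path: str) -> str:
        """Pick the backend for a request path; the longest matching prefix wins."""
        matches = [
            p for p in self.migrated
            if path == p or path.startswith(p.rstrip("/") + "/")
        ]
        if not matches:
            return LEGACY
        return self.migrated[max(matches, key=len)]


table = RoutingTable()
table.migrate("/search", "search-service")
print(table.resolve("/search/products"))  # search-service
print(table.resolve("/checkout"))         # legacy
table.roll_back("/search")
print(table.resolve("/search/products"))  # legacy
```

Defaulting to legacy for anything without a rule is the design choice that keeps unmigrated features out of the blast radius.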
Building the Routing Facade: Feature Flags and Traffic Splitting
The facade is the single most critical piece of a Strangler Fig migration. It must be performant, stateless (or keep its state in an external store), and observable. Most teams use an API gateway (Kong, Nginx, Envoy) or a reverse proxy with dynamic routing. The key requirement: routing decisions must be changeable at runtime without a deployment.
Feature flags control which users or requests go to the new service. You start at 0% traffic, enable it for internal testing (1% of users), then gradually increase to 100%. The flag can be based on user ID hash, geographic region, or any attribute. If something breaks, you turn the flag off — traffic instantly goes back to legacy.
Don't implement your own feature flag system in-house. Use LaunchDarkly, Unleash, or even a simple Redis-backed toggle. Your only job is to read the flag in the facade, not implement the flag infrastructure.
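A percentage rollout keyed on a user ID hash could look like the sketch below. The function names and the flag name are assumptions made for illustration; in practice `rollout_percent` would be read from your flag provider at request time rather than passed in by hand.

```python
import hashlib


def bucket_for(user_id: str, flag_name: str) -> int:
    """Deterministically map a user to a bucket in the range 0..99."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def route_for(user_id: str, flag_name: str, rollout_percent: int) -> str:
    """Return "new" if the user falls inside the rollout, otherwise "legacy"."""
    if not 0 <= rollout_percent <= 100:
        raise ValueError("rollout_percent must be between 0 and 100")
    in_rollout = bucket_for(user_id, flag_name) < rollout_percent
    return "new" if in_rollout else "legacy"


# The same user always lands in the same bucket, so raising the percentage
# only ever moves users from legacy to new, never back and forth.
for percent in (0, 10, 50, 100):
    print(percent, route_for("user-42", "search-migration", percent))
```

Hashing the flag name together with the user ID means a user who is an early adopter for one migrated feature is not automatically an early adopter for every other one.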
Data Synchronisation: The Real Challenge of Strangler Fig
Routing traffic is the easy part. The hard part is keeping data consistent between the legacy database and your new service's database. During migration, both systems need to access and modify the same user data, orders, or inventory. If you don't have a solid data sync strategy, you'll end up with silent corruption.
A common starting point is to keep a single source of truth (the legacy database): the new service reads from it and writes to both its own database and the legacy one (dual writes). This keeps both systems in sync, but dual writes are error-prone because one side can fail while the other succeeds. A more robust approach is change data capture (CDC) from the legacy database: every change in the legacy DB is streamed to a message topic, and the new service consumes that stream to update its own store. Writes made by the new service flow back to the legacy DB through a second CDC stream running in the opposite direction (reverse sync). Each event should carry its origin so that a change is not echoed back and forth between the two systems.
Alternatively, you can migrate data at the database level first (e.g., use database views or federation), but that adds a different kind of coupling. The key is bidirectional replication until you cut over completely.
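On the consuming side of a CDC stream, the two properties that matter most are idempotent application and loop prevention. The sketch below models both with in-memory structures. The `ChangeEvent` shape, the per-key `version` counter, and the `origin` field are assumptions for illustration, not the format of any particular CDC tool.

```python
from dataclasses import dataclass


@dataclass
class ChangeEvent:
    key: str
    value: dict
    version: int  # assumed to increase with every change to this key
    origin: str   # name of the system that made the original write


class ReplicaStore:
    """In-memory stand-in for a store that is fed by a change stream."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.rows: dict[str, tuple[int, dict]] = {}

    def apply(self, event: ChangeEvent) -> bool:
        """Apply one change event; return True if the store was modified."""
        if event.origin == self.name:
            # Our own write coming back through the reverse stream:
            # drop it so the two systems do not replicate in a loop.
            return False
        current = self.rows.get(event.key)
        if current is not None and current[0] >= event.version:
            # Duplicate or stale delivery: applying it again must be a no-op.
            return False
        self.rows[event.key] = (event.version, event.value)
        return True


new_store = ReplicaStore("new-service")
paid = ChangeEvent("order-1", {"status": "paid"}, version=1, origin="legacy")
print(new_store.apply(paid))  # True
print(new_store.apply(paid))  # False
```

Message brokers commonly deliver at least once, which is why the second `apply` of the same event has to be harmless.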
- There is no distributed transaction between two databases in a strangler fig migration.
- Your write path must handle three partial-failure outcomes: legacy succeeds and new fails, legacy fails and new succeeds, or both fail.
- The legacy system must remain the authoritative source until cutover is complete.
- Use a reconciliation cron job to detect and fix differences between the two stores hourly.
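A reconciliation pass along the lines of the last point might look like this. It is a sketch that uses plain dictionaries in place of real databases, and the names `reconcile` and `repair` are hypothetical. It treats legacy as authoritative, consistent with the rule above, and deliberately does not delete rows that exist only in the new store.

```python
from dataclasses import dataclass, field


@dataclass
class ReconciliationReport:
    missing_in_new: list[str] = field(default_factory=list)
    missing_in_legacy: list[str] = field(default_factory=list)
    mismatched: list[str] = field(default_factory=list)

    @property
    def clean(self) -> bool:
        return not (self.missing_in_new or self.missing_in_legacy or self.mismatched)


def reconcile(legacy: dict[str, dict], new: dict[str, dict]) -> ReconciliationReport:
    """Compare both stores key by key and record every difference."""
    report = ReconciliationReport()
    for key in sorted(legacy.keys() | new.keys()):
        if key not in new:
            report.missing_in_new.append(key)
        elif key not in legacy:
            report.missing_in_legacy.append(key)
        elif legacy[key] != new[key]:
            report.mismatched.append(key)
    return report


def repair(legacy: dict[str, dict], new: dict[str, dict],
           report: ReconciliationReport) -> list[str]:
    """Copy authoritative legacy rows into new; return keys that need review."""
    for key in report.missing_in_new + report.mismatched:
        new[key] = dict(legacy[key])
    # Rows that exist only in the new store are not deleted automatically.
    # They are returned so they can be backfilled into legacy or inspected.
    return list(report.missing_in_legacy)


legacy_db = {"a": {"total": 10}, "b": {"total": 20}}
new_db = {"a": {"total": 10}, "b": {"total": 99}, "c": {"total": 30}}
report = reconcile(legacy_db, new_db)
print(report.mismatched, report.missing_in_legacy)  # ['b'] ['c']
```

Returning the new-only keys instead of dropping them matters because those rows may be the only copy of writes that failed to reach legacy.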
Rollback Strategy: How to Undo a Migration Without Pain
A good strangler fig migration must have a rapid rollback plan for every slice. The beauty of the pattern is that the legacy system never goes away until the last feature is migrated. You can always flip the routing flag back to legacy for a particular feature.
But a simple routing rollback isn't always enough — you also need to handle data. If the new service wrote data that doesn't exist in legacy, you can lose it on rollback. The rule: the legacy system must be the authoritative writer until cutover. Any writes from the new service must be replicated back to legacy (dual writes or CDC reverse sync). That way, when you flip the routing back, the legacy system has all the data.
Your rollback sequence:
1) Turn off the feature flag to stop routing traffic to the new service.
2) Run a data consistency check to verify that the legacy system can serve all the data.
3) If data is missing, backfill it from the new service's database.
4) Decommission the new service only after a rollback window of at least 48 hours with no issues.
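The rollback sequence can be expressed as a small runbook function. This is a sketch under simplifying assumptions: flags and both stores are plain dictionaries, the consistency check only looks for rows missing from legacy, and the names `roll_back_slice` and `safe_to_decommission` are made up for this example.

```python
from datetime import datetime, timedelta

ROLLBACK_WINDOW = timedelta(hours=48)


def roll_back_slice(flags: dict[str, int], feature: str,
                    legacy: dict[str, dict], new: dict[str, dict]) -> list[str]:
    """Run steps 1 to 3 and return the keys that were backfilled into legacy."""
    # Step 1: stop routing traffic to the new service.
    flags[feature] = 0
    # Step 2: consistency check. Which rows does legacy not have?
    missing = sorted(key for key in new if key not in legacy)
    # Step 3: backfill those rows from the new service's store.
    for key in missing:
        legacy[key] = dict(new[key])
    return missing


def safe_to_decommission(rolled_back_at: datetime, now: datetime) -> bool:
    """Step 4: only decommission once the rollback window has fully passed."""
    return now - rolled_back_at >= ROLLBACK_WINDOW


flags = {"search-migration": 40}
legacy_db = {"order-1": {"status": "paid"}}
new_db = {"order-1": {"status": "paid"}, "order-2": {"status": "new"}}
print(roll_back_slice(flags, "search-migration", legacy_db, new_db))  # ['order-2']
print(flags["search-migration"])  # 0
```

Turning the flag off first means no new writes land in the new store while the backfill is running.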
Test your rollback before you need it. Simulate a failure scenario in staging: let the new service crash and verify that the routing facade correctly falls back to legacy without any UX interruption.
When NOT to Use the Strangler Fig Pattern
The Strangler Fig pattern isn't a silver bullet. It works best for replacing parts of a monolithic system where you can isolate a single capability. It fails when:
- The legacy system has no clear interface boundaries — everything is tightly coupled through a shared database or global state. In that case, you can't extract a single feature without dragging half the monolith with it.
- The new system requires a fundamentally different data model that can't be mapped to the legacy one. If every request needs to transform heavily between old and new schema, the proxy becomes a bottleneck.
- You need performance improvements immediately — the strangler fig approach adds latency from the proxy and dual writes for many months. If you need to make the system 2x faster this quarter, a rewrite (with careful planning) might be the better call.
- The team is unwilling to maintain two codebases in parallel. The pattern requires you to keep the legacy app around until migration is complete. If your team can't handle that cognitive load, consider a big-bang migration with a well-tested rollback plan instead.
Evaluate your specific context. The pattern is a tool, not a religion.
The Midnight Data Loss That Killed a Migration
- Data sync must be bidirectional during the migration period — not just one way.
- Assume every user can be served by either system at any time until migration is complete.
- Change data capture with a message broker is the only reliable way to handle dual writes without application-level coupling.
curl -H "X-Force-Route: new-service" to isolate the issue.
Key takeaways
Common mistakes to avoid
- Migrating the database before the service
- Not planning for bidirectional data sync
- Ramping traffic too fast without load testing the new service
- Assuming the proxy is stateless when it holds routing state
Interview Questions on This Topic
Explain the Strangler Fig Pattern. How does it differ from a big-bang rewrite?