Migration Strategies: Moving Away from Legacy Without Burning Down the House
Every mature organization has legacy systems. The question is never whether to modernize but how to do it without halting the delivery of business value. The graveyard of failed "big bang" rewrites is large. Incremental strategies win.
Migration Pattern Comparison
| Pattern | Approach | Risk | Duration | Team Size | Rollback |
|---|---|---|---|---|---|
| Strangler Fig | Replace piece by piece behind a facade | Low | Long (months-years) | Small-Medium | Per component |
| Branch by Abstraction | Abstract interfaces, swap implementations | Low-Medium | Medium | Medium | Per abstraction |
| Parallel Run | Run old and new simultaneously, compare | Medium | Medium | Large | Instant (keep old) |
| Big Bang Rewrite | Replace everything at once | Very High | Long | Large | All or nothing |
| Lift and Shift | Move as-is to new infra | Low | Short | Small | Easy |
| Feature Parity + Cutover | Build new until feature parity, then switch | High | Very Long | Large | Complex |
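To make the Parallel Run row concrete, here is a minimal sketch of the idea: the old implementation keeps serving callers while the new one is shadow-called and any difference is recorded. The names `parallel_run`, `Mismatch`, `legacy_price`, and `new_price` are hypothetical and exist only for illustration. A real setup would likely sample traffic and run the comparison asynchronously.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class Mismatch:
    request: Any
    old_result: Any
    new_result: Any


def parallel_run(
    request: Any,
    old_impl: Callable[[Any], Any],
    new_impl: Callable[[Any], Any],
    mismatches: List[Mismatch],
) -> Any:
    """Serve from the old system; shadow-call the new one and record differences."""
    old_result = old_impl(request)
    try:
        new_result = new_impl(request)
    except Exception as exc:  # a failure in the new path must never reach the caller
        new_result = exc
    if new_result != old_result:
        mismatches.append(Mismatch(request, old_result, new_result))
    return old_result  # the old system remains the source of truth


# Hypothetical implementations, for illustration only.
def legacy_price(qty: int) -> int:
    return qty * 100


def new_price(qty: int) -> int:
    # Deliberately differs for bulk orders so the comparison has something to catch.
    return qty * 100 if qty < 10 else qty * 95


mismatches: List[Mismatch] = []
results = [parallel_run(q, legacy_price, new_price, mismatches) for q in (1, 5, 12)]
```

Because callers only ever see the old result, rollback is simply a matter of switching the comparison off, which is why the table lists it as "Instant (keep old)".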
Strangler Fig Pattern
  Phase 1: Facade        Phase 2: Migrate       Phase 3: Complete
  ┌──────────────┐       ┌──────────────┐       ┌──────────────┐
  │   Facade /   │       │   Facade /   │       │   Facade /   │
  │    API GW    │       │    API GW    │       │    API GW    │
  └──┬────────┬──┘       └──┬────────┬──┘       └──────┬───────┘
     │        │             │        │                 │
     │ 100%   │ 0%          │ 40%    │ 60%             │ 100%
     │        │             │        │                 │
  ┌──▼──┐  ┌──▼──┐       ┌──▼──┐  ┌──▼──┐           ┌──▼──┐
  │ OLD │  │ NEW │       │ OLD │  │ NEW │           │ NEW │
  │     │  │     │       │     │  │     │           │     │
  └─────┘  └─────┘       └─────┘  └─────┘           └─────┘
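The traffic split in the diagram can be sketched as a small routing facade. This is an illustration under assumptions, not a production gateway: `StranglerFacade`, `set_new_share`, and the `/orders` route are hypothetical names, and the handlers are plain callables standing in for the old and new systems. Shares are kept per route so that components can be migrated one at a time.

```python
import random
from typing import Callable, Dict, Optional

Handler = Callable[[str, str], str]  # (route, payload) -> response


class StranglerFacade:
    """Routes each request to the old or new handler by per-route weight."""

    def __init__(self, old: Handler, new: Handler,
                 rng: Optional[random.Random] = None) -> None:
        self._old = old
        self._new = new
        self._new_share: Dict[str, float] = {}  # route -> fraction sent to new
        self._rng = rng if rng is not None else random.Random()

    def set_new_share(self, route: str, share: float) -> None:
        if not 0.0 <= share <= 1.0:
            raise ValueError("share must be between 0.0 and 1.0")
        self._new_share[route] = share

    def handle(self, route: str, payload: str) -> str:
        # Routes with no configured share stay entirely on the legacy system.
        share = self._new_share.get(route, 0.0)
        target = self._new if self._rng.random() < share else self._old
        return target(route, payload)


facade = StranglerFacade(
    old=lambda route, payload: f"old:{route}",
    new=lambda route, payload: f"new:{route}",
    rng=random.Random(42),  # seeded only to make the example repeatable
)
facade.set_new_share("/orders", 0.0)  # Phase 1: facade in place, no traffic moved
facade.set_new_share("/orders", 0.6)  # Phase 2: 60% of /orders goes to the new system
facade.set_new_share("/orders", 1.0)  # Phase 3: legacy handler can be retired
```

Setting a route's share back to 0.0 is the per-component rollback that the comparison table attributes to this pattern.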
Risk Assessment Matrix
| Risk Factor | Low (1) | Medium (2) | High (3) |
|---|---|---|---|
| Data complexity | Simple schemas, few relations | Moderate schemas, some coupling | Complex schemas, heavy cross-refs |
| Integration count | 0-3 downstream systems | 4-10 downstream systems | More than 10 downstream systems |
| Domain knowledge | Well-documented, team knows it | Partial docs, some tribal knowledge | No docs, original team gone |
| Business criticality | Internal tool | Revenue-supporting | Revenue-critical, SLA-bound |
| Test coverage | > 70% | 30-70% | < 30% or unknown |
| Data volume | < 10 GB | 10 GB-1 TB | > 1 TB |
| Regulatory constraints | None | Audit trail needed | Compliance certification required |
Scoring: sum the seven factor scores (possible range 7-21). 7-10 = Strangler Fig or Lift-and-Shift, 11-16 = Branch by Abstraction or Parallel Run, 17-21 = start with an assessment phase and consider external help.
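The scoring rule is simple arithmetic, and a small helper makes the bands explicit. The sketch below assumes each of the seven factors is scored 1, 2, or 3 as in the matrix. The factor keys and the `recommend` function are hypothetical names chosen for this example.

```python
from typing import Dict, Tuple

FACTORS = (
    "data_complexity",
    "integration_count",
    "domain_knowledge",
    "business_criticality",
    "test_coverage",
    "data_volume",
    "regulatory_constraints",
)


def recommend(scores: Dict[str, int]) -> Tuple[int, str]:
    """Sum the seven factor scores and map the total to a band from the matrix."""
    missing = [f for f in FACTORS if f not in scores]
    if missing:
        raise ValueError(f"missing factors: {missing}")
    for factor in FACTORS:
        if scores[factor] not in (1, 2, 3):
            raise ValueError(f"{factor} must be scored 1, 2, or 3")

    total = sum(scores[f] for f in FACTORS)
    if total <= 10:
        return total, "Strangler Fig or Lift-and-Shift"
    if total <= 16:
        return total, "Branch by Abstraction or Parallel Run"
    return total, "Start with an assessment phase; consider external help"


# Example: a revenue-critical system with many integrations but good test coverage.
example = {
    "data_complexity": 2,
    "integration_count": 3,
    "domain_knowledge": 2,
    "business_criticality": 3,
    "test_coverage": 1,
    "data_volume": 2,
    "regulatory_constraints": 1,
}
total, advice = recommend(example)  # 2 + 3 + 2 + 3 + 1 + 2 + 1 = 14
```

A total of 14 falls in the 11-16 band, so the example lands on Branch by Abstraction or Parallel Run.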
Timeline Planning Framework
| Phase | Duration | Activities | Gate Criteria |
|---|---|---|---|
| Discovery | 2-4 weeks | Map dependencies, identify seams, assess risk | Risk matrix completed |
| Foundation | 2-6 weeks | Build facade/proxy, set up dual-write, CI/CD | Routing works, zero traffic to new system |
| Migration Wave 1 | 4-8 weeks | Migrate lowest-risk component | Production traffic on new component |
| Migration Wave N | 4-8 weeks each | Migrate next component by priority | Each wave validated in production |
| Decommission | 2-4 weeks | Remove legacy code, clean up data | Old system fully offline |
| Stabilization | 2-4 weeks | Monitor, fix edge cases, update docs | SLOs met for 2 consecutive weeks |
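The table's ranges add up to a rough end-to-end estimate. The sketch below assumes the phases run strictly one after another and that every migration wave takes 4-8 weeks. Real plans often overlap waves, so treat the result as an upper-level bound rather than a schedule. `estimate_weeks` is a hypothetical helper name.

```python
from typing import Dict, Tuple

# (min_weeks, max_weeks) for the phases that occur once, taken from the table.
FIXED_PHASES: Dict[str, Tuple[int, int]] = {
    "Discovery": (2, 4),
    "Foundation": (2, 6),
    "Decommission": (2, 4),
    "Stabilization": (2, 4),
}
WAVE_WEEKS: Tuple[int, int] = (4, 8)  # each migration wave


def estimate_weeks(waves: int) -> Tuple[int, int]:
    """Return (min, max) total weeks for a migration with the given number of waves."""
    if waves < 1:
        raise ValueError("a migration needs at least one wave")
    low = sum(lo for lo, _ in FIXED_PHASES.values()) + waves * WAVE_WEEKS[0]
    high = sum(hi for _, hi in FIXED_PHASES.values()) + waves * WAVE_WEEKS[1]
    return low, high


low, high = estimate_weeks(3)  # 8 + 3*4 = 20 weeks minimum, 18 + 3*8 = 42 weeks maximum
```

Even a three-wave migration spans roughly five to ten months under these assumptions, which matches the "months-years" duration given for Strangler Fig.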
The Feature Parity Trap
Effort ▲
       │                                   ╱  "Just one more feature"
       │                                ╱     before we can switch
       │                             ╱
       │                          ╱  ← Trap zone: legacy keeps
       │                       ╱       evolving, target moves
       │                    ╱
       │            ●────╱── Planned cutover (missed)
       │              ╱
       │           ╱
       │        ╱
       │     ╱
       └──────────────────────────────────────────► Time
       Start      6mo       12mo      18mo      24mo
Solution: Define a FROZEN feature set for the legacy system. New features go to the new system only. Accept temporary gaps.
Key Principles
Migrate data last. Move routing and compute first. Data migration is the riskiest and most irreversible step. Delay it until everything else is proven.
Measure parity, do not feel it. Define quantitative criteria for when a component is ready: latency P99, error rate, throughput. "It seems to work" is not a migration gate.
Budget for the long tail. As a rule of thumb, the last 10% of a migration takes about 50% of the effort. Plan for it. Edge cases, data inconsistencies, and undocumented behaviors live in that tail.
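The "measure parity" principle can be turned into an explicit gate. The following sketch compares the three metrics named above for the old and new components. The `Metrics` type, the `parity_failures` function, and all tolerance values are assumptions made for illustration; each team would set its own thresholds.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Metrics:
    p99_latency_ms: float
    error_rate: float      # fraction of failed requests, 0.0-1.0
    throughput_rps: float


def parity_failures(
    old: Metrics,
    new: Metrics,
    latency_tolerance: float = 1.10,  # assumed: new P99 may be up to 10% slower
    error_tolerance: float = 0.001,   # assumed: up to 0.1 percentage points more errors
    throughput_floor: float = 0.95,   # assumed: at least 95% of old throughput
) -> List[str]:
    """Return the list of failed criteria; an empty list means the gate passes."""
    failures: List[str] = []
    if new.p99_latency_ms > old.p99_latency_ms * latency_tolerance:
        failures.append("p99 latency regressed")
    if new.error_rate > old.error_rate + error_tolerance:
        failures.append("error rate regressed")
    if new.throughput_rps < old.throughput_rps * throughput_floor:
        failures.append("throughput regressed")
    return failures


# Hypothetical measurements, for illustration only.
old = Metrics(p99_latency_ms=250.0, error_rate=0.002, throughput_rps=1200.0)
good = Metrics(p99_latency_ms=240.0, error_rate=0.0021, throughput_rps=1180.0)
slow = Metrics(p99_latency_ms=300.0, error_rate=0.002, throughput_rps=1200.0)

ready = not parity_failures(old, good)   # all three criteria hold
blocked = parity_failures(old, slow)     # latency exceeds the 10% tolerance
```

Returning the list of failed criteria rather than a bare boolean keeps the gate decision auditable: the migration log can record exactly which metric blocked a wave.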