Data Architecture Patterns: Lambda, Kappa, Medallion, Mesh & Fabric
Choosing a data architecture pattern shapes everything downstream: team structure, tooling, cost, and how fast insights reach decision-makers. Each pattern emerged from specific constraints. Understanding those constraints prevents cargo-culting the wrong one.
Pattern Comparison
| Dimension | Lambda | Kappa | Medallion | Data Mesh | Data Fabric |
|---|---|---|---|---|---|
| Core idea | Batch + stream layers | Stream-only | Bronze/Silver/Gold layers | Domain ownership | Metadata-driven integration |
| Complexity | High (dual pipeline) | Medium | Medium | High (organizational) | High (metadata layer) |
| Latency | Batch: hours, Stream: seconds | Seconds | Depends on layer | Varies by domain | Varies |
| Team model | Centralized data team | Centralized data team | Centralized or platform | Federated domain teams | Centralized + federated |
| Best for | Mixed batch/real-time needs | Pure streaming | Lakehouse / analytics | Large orgs, many domains | Multi-source enterprises |
| Key risk | Code duplication, drift | Reprocessing at scale | Gold layer bottleneck | Governance fragmentation | Metadata complexity |
| Typical tools | Spark + Kafka + warehouse | Kafka/Flink/Pulsar | Databricks/Delta/Iceberg | Domain APIs + catalog | Knowledge graph + catalog |
| Maturity | Proven (2010s) | Proven (2010s) | Mainstream (2020s) | Emerging (2020s) | Emerging (2020s) |
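To make the "dual pipeline" entry for Lambda concrete, here is a minimal sketch rather than a reference implementation. It assumes a hypothetical page-view counting workload; the names `EVENTS`, `BATCH_CUTOFF`, `batch_view`, `speed_view`, and `query` are illustrative and do not come from any particular framework. The batch layer recomputes counts from history up to a cutoff, the speed layer covers events after the cutoff, and the serving step merges the two.

```python
from collections import Counter

# Hypothetical events: (timestamp, page). All names here are illustrative.
EVENTS = [
    (1, "home"), (2, "pricing"), (3, "home"),  # covered by the last batch run
    (4, "home"), (5, "docs"),                  # arrived after the batch cutoff
]
BATCH_CUTOFF = 3  # assumption: the last batch run processed timestamps <= 3


def batch_view(events, cutoff):
    """Batch layer: recompute counts from the full history up to the cutoff."""
    return Counter(page for ts, page in events if ts <= cutoff)


def speed_view(events, cutoff):
    """Speed layer: count only the events the batch layer has not seen yet."""
    return Counter(page for ts, page in events if ts > cutoff)


def query(page, events=EVENTS, cutoff=BATCH_CUTOFF):
    """Serving step: merge both views so the answer is complete and fresh."""
    return batch_view(events, cutoff)[page] + speed_view(events, cutoff)[page]
```

Note that the counting logic appears twice. In a real Lambda deployment the two layers often run on different engines, which is where the "code duplication, drift" risk in the table comes from.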
Evolution Timeline
| Year | Pattern | Origin | Core idea |
|---|---|---|---|
| 2011 | Lambda | Marz | Batch + stream |
| 2014 | Kappa | Kreps | Stream only |
| 2018 | Medallion | Databricks | Lakehouse layers |
| 2020 | Data Mesh | Dehghani | Domain ownership |
| 2023 | Data Fabric | Gartner | Metadata automation |
| 2026 | Convergence | | Mesh + Fabric hybrids |
Selection Decision Tree
Start
│
├─ Do you need real-time AND batch processing?
│  ├─ Yes → Can you unify into a single stream?
│  │  ├─ Yes → KAPPA
│  │  └─ No → LAMBDA
│  └─ No → Primarily analytics/BI?
│     ├─ Yes → MEDALLION (Lakehouse)
│     └─ No → Multiple autonomous domains?
│        ├─ Yes → DATA MESH
│        └─ No → Many heterogeneous sources?
│           ├─ Yes → DATA FABRIC
│           └─ No → Start simple (warehouse + ELT)
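The tree above can also be written as a small function, which makes the order of the questions explicit. This is only a sketch: the `Needs` dataclass, its field names, and `select_pattern` are hypothetical names introduced for illustration, and real selection involves more judgment than five booleans.

```python
from dataclasses import dataclass


@dataclass
class Needs:
    """Answers to the tree's questions (field names are illustrative)."""
    realtime_and_batch: bool = False
    can_unify_into_stream: bool = False
    primarily_analytics: bool = False
    multiple_autonomous_domains: bool = False
    many_heterogeneous_sources: bool = False


def select_pattern(needs: Needs) -> str:
    """Walk the tree top to bottom and return the first matching pattern."""
    if needs.realtime_and_batch:
        return "Kappa" if needs.can_unify_into_stream else "Lambda"
    if needs.primarily_analytics:
        return "Medallion"
    if needs.multiple_autonomous_domains:
        return "Data Mesh"
    if needs.many_heterogeneous_sources:
        return "Data Fabric"
    return "Warehouse + ELT"
```

For example, `select_pattern(Needs(primarily_analytics=True))` returns `"Medallion"`. Because the questions are evaluated in order, a team that answers yes to both analytics and multiple domains lands on Medallion first, which matches the tree.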
Organizational Fit Matrix
| Factor | Lambda | Kappa | Medallion | Data Mesh | Data Fabric |
|---|---|---|---|---|---|
| Org size | Medium-Large | Any | Any | Large (50+ engineers) | Enterprise |
| Data team maturity | High | Medium | Low-Medium | High | High |
| Number of domains | Few | Few | Few-Many | Many (4+) | Many |
| Regulatory needs | Medium | Medium | High (lineage) | High (contracts) | High (catalog) |
| Cloud strategy | Any | Streaming-heavy | Lakehouse vendor | Multi-platform | Multi-cloud |
| Budget | High | Medium | Medium | High (platform team) | Very High |
Practical Guidance
Start with Medallion if you are building a new analytics platform. The Bronze/Silver/Gold layering is intuitive, well-tooled (Delta Lake, Apache Iceberg), and does not require organizational change.
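As an illustration of the Bronze/Silver/Gold layering, the sketch below uses plain Python lists in place of Delta Lake or Iceberg tables. The orders feed, its field names, and the functions `to_bronze`, `to_silver`, and `to_gold` are assumptions made up for this example. The point is only the division of responsibility between layers.

```python
from collections import defaultdict

# Hypothetical raw feed; field names and values are made up for illustration.
RAW_ORDERS = [
    {"order_id": "1", "region": "EU", "amount": "10.50"},
    {"order_id": "2", "region": "US", "amount": "20.00"},
    {"order_id": "2", "region": "US", "amount": "20.00"},         # duplicate delivery
    {"order_id": "3", "region": "EU", "amount": "not-a-number"},  # bad record
]


def to_bronze(raw):
    """Bronze: keep every record as received, adding only a source tag."""
    return [dict(rec, _source="orders_feed") for rec in raw]


def to_silver(bronze):
    """Silver: parse types, skip unparseable rows, deduplicate by order_id."""
    seen, silver = set(), []
    for rec in bronze:
        try:
            amount = float(rec["amount"])
        except ValueError:
            continue  # a real pipeline would usually quarantine this row
        if rec["order_id"] in seen:
            continue
        seen.add(rec["order_id"])
        silver.append({
            "order_id": int(rec["order_id"]),
            "region": rec["region"],
            "amount": amount,
        })
    return silver


def to_gold(silver):
    """Gold: aggregate to a business-level view, here revenue per region."""
    revenue = defaultdict(float)
    for rec in silver:
        revenue[rec["region"]] += rec["amount"]
    return dict(revenue)


bronze = to_bronze(RAW_ORDERS)
silver = to_silver(bronze)
gold = to_gold(silver)
```

Bronze keeps all four records, including the duplicate and the bad row, so Silver and Gold can be rebuilt later if the cleaning rules change.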
Adopt Data Mesh principles gradually. Start with a data catalog and domain ownership of key datasets. Do not reorganize teams around mesh until you have proven the governance model with 2-3 domains.
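One way to picture "a data catalog and domain ownership of key datasets" is a registry that records who owns each dataset and what schema consumers can rely on. The sketch below is a toy, in-memory version under that assumption. `DataProduct`, `Catalog`, `register`, and `violations` are hypothetical names, not the API of any real catalog product.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """A domain-owned dataset plus the contract consumers rely on (illustrative)."""
    name: str
    owner_domain: str
    schema: dict  # column name -> expected Python type


@dataclass
class Catalog:
    """A toy in-memory catalog; a real one would be a shared service."""
    products: dict = field(default_factory=dict)

    def register(self, product: DataProduct) -> None:
        """Record ownership; each dataset name has exactly one owning domain."""
        if product.name in self.products:
            owner = self.products[product.name].owner_domain
            raise ValueError(f"{product.name} is already owned by {owner}")
        self.products[product.name] = product

    def violations(self, name: str, rows: list) -> list:
        """Return human-readable contract violations for a batch of rows."""
        schema = self.products[name].schema
        problems = []
        for i, row in enumerate(rows):
            for column, expected in schema.items():
                if column not in row:
                    problems.append(f"row {i}: missing {column}")
                elif not isinstance(row[column], expected):
                    problems.append(f"row {i}: {column} is not {expected.__name__}")
        return problems


catalog = Catalog()
catalog.register(DataProduct("orders", "sales", {"order_id": int, "amount": float}))
issues = catalog.violations("orders", [
    {"order_id": 1, "amount": 9.99},
    {"order_id": "2", "amount": 5.0},  # wrong type for order_id
    {"order_id": 3},                   # amount missing
])
```

Running a check like this for two or three pilot domains is a low-cost way to test the governance model before any team reorganization.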
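Treat Lambda as legacy in most cases. Its separate batch and stream pipelines keep the same logic in two codebases that can drift apart. If you are running Lambda today, evaluate whether Kappa (unified streaming) or Medallion (lakehouse) can replace that dual-pipeline complexity.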
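The appeal of Kappa's unified streaming is that one processing function serves both live traffic and reprocessing, with reprocessing done by replaying the log from the beginning. The following is a minimal sketch under stated assumptions: the in-memory `LOG` stands in for a durable, replayable log such as a Kafka topic, and `apply_event` and `run` are illustrative names.

```python
# Hypothetical append-only log of (offset, user, amount) events, standing in
# for a replayable stream. Names and values are illustrative.
LOG = [
    (0, "alice", 5),
    (1, "bob", 7),
    (2, "alice", 3),
]


def apply_event(state, event):
    """The single processing function: used for live traffic and for replays."""
    _, user, amount = event
    state[user] = state.get(user, 0) + amount
    return state


def run(log, from_offset=0, state=None):
    """Consume the log from an offset; starting at 0 rebuilds state from scratch."""
    state = {} if state is None else state
    for event in log:
        if event[0] >= from_offset:
            apply_event(state, event)
    return state


live_state = run(LOG)                                    # normal streaming job
LOG.append((3, "bob", 1))                                # a new event arrives
live_state = run(LOG, from_offset=3, state=live_state)   # process only the new event
rebuilt_state = run(LOG)                                 # reprocessing: replay from 0
```

Here `live_state` and `rebuilt_state` end up identical because both paths run the same code. The trade-off is the one the comparison table lists for Kappa: replaying a very large log can be slow and expensive.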