
Data Architecture Patterns: Lambda, Kappa, Medallion, Mesh & Fabric

#data-architecture #architecture #data-engineering #strategy

Choosing a data architecture pattern shapes everything downstream: team structure, tooling, cost, and how fast insights reach decision-makers. Each pattern emerged from specific constraints. Understanding those constraints prevents cargo-culting the wrong one.

Pattern Comparison

| Dimension | Lambda | Kappa | Medallion | Data Mesh | Data Fabric |
|---|---|---|---|---|---|
| Core idea | Batch + stream layers | Stream-only | Bronze/Silver/Gold layers | Domain ownership | Metadata-driven integration |
| Complexity | High (dual pipeline) | Medium | Medium | High (organizational) | High (metadata layer) |
| Latency | Batch: hours, Stream: seconds | Seconds | Depends on layer | Varies by domain | Varies |
| Team model | Centralized data team | Centralized data team | Centralized or platform | Federated domain teams | Centralized + federated |
| Best for | Mixed batch/real-time needs | Pure streaming | Lakehouse / analytics | Large orgs, many domains | Multi-source enterprises |
| Key risk | Code duplication, drift | Reprocessing at scale | Gold layer bottleneck | Governance fragmentation | Metadata complexity |
| Typical tools | Spark + Kafka + warehouse | Kafka/Flink/Pulsar | Databricks/Delta/Iceberg | Domain APIs + catalog | Knowledge graph + catalog |
| Maturity | Proven (2010s) | Proven (2010s) | Mainstream (2020s) | Emerging (2020s) | Emerging (2020s) |

Evolution Timeline

2011          2014          2018          2020          2023          2026
 │             │             │             │             │             │
 ▼             ▼             ▼             ▼             ▼             ▼
Lambda      Kappa        Medallion     Data Mesh    Data Fabric   Convergence
(Marz)      (Kreps)      (Databricks)  (Dehghani)   (Gartner)    (Mesh+Fabric
                                                                   hybrids)
 │             │             │             │             │
 └─ Batch +    └─ Stream     └─ Lakehouse  └─ Domain    └─ Metadata
    Stream        only          layers        ownership     automation

Selection Decision Tree

Start
  │
  ├─ Do you need real-time AND batch processing?
  │   ├─ Yes → Can you unify into a single stream?
  │   │         ├─ Yes → KAPPA
  │   │         └─ No  → LAMBDA
  │   └─ No  → Primarily analytics/BI?
  │             ├─ Yes → MEDALLION (Lakehouse)
  │             └─ No  → Multiple autonomous domains?
  │                       ├─ Yes → DATA MESH
  │                       └─ No  → Many heterogeneous sources?
  │                                 ├─ Yes → DATA FABRIC
  │                                 └─ No  → Start simple (warehouse + ELT)
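
The tree above can be read as a chain of early returns. The following is a minimal sketch of that reading, not a definitive selection tool: the `Requirements` class, its field names, and `select_pattern` are hypothetical names introduced here for illustration, and each field corresponds to one question in the tree.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    """Answers to the questions in the decision tree (field names are illustrative)."""
    needs_realtime_and_batch: bool = False
    can_unify_into_single_stream: bool = False
    primarily_analytics_bi: bool = False
    multiple_autonomous_domains: bool = False
    many_heterogeneous_sources: bool = False


def select_pattern(req: Requirements) -> str:
    """Walk the tree top to bottom and return the first pattern that matches."""
    if req.needs_realtime_and_batch:
        return "Kappa" if req.can_unify_into_single_stream else "Lambda"
    if req.primarily_analytics_bi:
        return "Medallion"
    if req.multiple_autonomous_domains:
        return "Data Mesh"
    if req.many_heterogeneous_sources:
        return "Data Fabric"
    return "Warehouse + ELT"


print(select_pattern(Requirements(needs_realtime_and_batch=True)))
```

Because the questions are evaluated in order, a team that answers "yes" to both analytics/BI and autonomous domains lands on Medallion first, which matches the order of the branches in the tree.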

Organizational Fit Matrix

| Factor | Lambda | Kappa | Medallion | Data Mesh | Data Fabric |
|---|---|---|---|---|---|
| Org size | Medium-Large | Any | Any | Large (50+ engineers) | Enterprise |
| Data team maturity | High | Medium | Low-Medium | High | High |
| Number of domains | Few | Few | Few-Many | Many (4+) | Many |
| Regulatory needs | Medium | Medium | High (lineage) | High (contracts) | High (catalog) |
| Cloud strategy | Any | Streaming-heavy | Lakehouse vendor | Multi-platform | Multi-cloud |
| Budget | High | Medium | Medium | High (platform team) | Very High |
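
One way to use the matrix is as a filter that rules patterns out before any deeper evaluation. The sketch below encodes only the "Data team maturity" and "Budget" rows and treats each value as a minimum requirement. That reading is an assumption, as is treating "Low-Medium" as a minimum of "Low". The names `LEVELS`, `FIT`, and `shortlist` are hypothetical.

```python
LEVELS = ["Low", "Medium", "High", "Very High"]

# Values copied from the matrix; "Low-Medium" is read as a minimum of "Low" (assumption).
FIT = {
    "Lambda":      {"maturity": "High",   "budget": "High"},
    "Kappa":       {"maturity": "Medium", "budget": "Medium"},
    "Medallion":   {"maturity": "Low",    "budget": "Medium"},
    "Data Mesh":   {"maturity": "High",   "budget": "High"},
    "Data Fabric": {"maturity": "High",   "budget": "Very High"},
}


def shortlist(maturity: str, budget: str) -> list:
    """Return patterns whose maturity and budget demands the organization can meet."""
    have_maturity = LEVELS.index(maturity)
    have_budget = LEVELS.index(budget)
    return [
        name
        for name, need in FIT.items()
        if LEVELS.index(need["maturity"]) <= have_maturity
        and LEVELS.index(need["budget"]) <= have_budget
    ]


print(shortlist("Medium", "Medium"))  # ['Kappa', 'Medallion']
```

A shortlist like this only removes poor fits. The remaining rows (domains, regulatory needs, cloud strategy) still need a human judgment call.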

Practical Guidance

Start with Medallion if you are building a new analytics platform. The Bronze/Silver/Gold layering is intuitive, well-tooled (Delta Lake, Apache Iceberg), and does not require organizational change.
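
To make the layering concrete, here is a minimal sketch of the three layers in plain Python, with lists standing in for tables. In practice these would be Delta Lake or Iceberg tables. The sample records and the `to_silver` and `to_gold` functions are illustrative assumptions, not part of any library.

```python
from collections import defaultdict

# Bronze: raw events kept exactly as they arrived, including bad rows.
bronze = [
    {"order_id": "1", "region": "EU", "amount": "19.90"},
    {"order_id": "2", "region": "eu", "amount": "5.10"},
    {"order_id": "2", "region": "eu", "amount": "5.10"},  # duplicate delivery
    {"order_id": "3", "region": "US", "amount": "n/a"},   # unparseable amount
]


def to_silver(rows):
    """Clean, type, and deduplicate bronze rows."""
    seen, silver = set(), []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # a real pipeline would quarantine the row rather than drop it
        if row["order_id"] in seen:
            continue
        seen.add(row["order_id"])
        silver.append({
            "order_id": int(row["order_id"]),
            "region": row["region"].upper(),
            "amount": amount,
        })
    return silver


def to_gold(rows):
    """Aggregate silver rows into a business-level table: revenue per region."""
    revenue = defaultdict(float)
    for row in rows:
        revenue[row["region"]] += row["amount"]
    return {region: round(total, 2) for region, total in revenue.items()}


silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)  # {'EU': 25.0}
```

Bronze is never modified, so Silver and Gold can be rebuilt from it whenever the cleaning or aggregation rules change.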

Adopt Data Mesh principles gradually. Start with a data catalog and domain ownership of key datasets. Do not reorganize teams around mesh until you have proven the governance model with 2-3 domains.
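
A first step of this kind can be very small. The sketch below assumes a catalog is nothing more than a registry that records which domain owns each dataset and what schema it promises. `DataProduct`, `Catalog`, and their methods are hypothetical names for illustration, not the API of any catalog product.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataProduct:
    name: str
    owner_domain: str
    schema: dict  # column name -> expected Python type


@dataclass
class Catalog:
    products: dict = field(default_factory=dict)

    def register(self, product: DataProduct) -> None:
        """Record a dataset and its owning domain; names must be unique."""
        if product.name in self.products:
            raise ValueError(f"{product.name} is already registered")
        self.products[product.name] = product

    def validate(self, name: str, record: dict) -> list:
        """Return contract violations for one record (an empty list means valid)."""
        schema = self.products[name].schema
        problems = [f"missing column: {col}" for col in schema if col not in record]
        problems += [
            f"{col}: expected {typ.__name__}, got {type(record[col]).__name__}"
            for col, typ in schema.items()
            if col in record and not isinstance(record[col], typ)
        ]
        return problems

    def domains(self) -> set:
        """Domains that currently own at least one registered dataset."""
        return {p.owner_domain for p in self.products.values()}


catalog = Catalog()
catalog.register(DataProduct("orders", "sales", {"order_id": int, "amount": float}))
catalog.register(DataProduct("shipments", "logistics", {"order_id": int, "carrier": str}))

print(catalog.validate("orders", {"order_id": 7, "amount": "12.50"}))
print(sorted(catalog.domains()))
```

Tracking `domains()` gives a simple way to see when the governance model has been exercised by the 2-3 domains mentioned above.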

Lambda is legacy in most cases. If you are running Lambda today, evaluate whether Kappa (unified streaming) or Medallion (lakehouse) can replace the dual-pipeline complexity.
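
The appeal of Kappa over Lambda is that one piece of logic serves both live processing and reprocessing. The following sketch illustrates the idea with an in-memory list standing in for a replayable log such as a Kafka topic. The event data and the `apply_event` and `replay` functions are assumptions made for illustration.

```python
# An append-only event log stands in for a Kafka topic (illustrative data).
log = [
    {"user": "a", "clicks": 2},
    {"user": "b", "clicks": 1},
    {"user": "a", "clicks": 3},
]


def apply_event(state: dict, event: dict) -> dict:
    """The single piece of business logic: running click totals per user."""
    state[event["user"]] = state.get(event["user"], 0) + event["clicks"]
    return state


def replay(events, from_offset: int = 0) -> dict:
    """Rebuild state from the log; full reprocessing is a replay from offset 0."""
    state = {}
    for event in events[from_offset:]:
        apply_event(state, event)
    return state


# Live path: process events one at a time as they arrive.
live_state = {}
for event in log:
    apply_event(live_state, event)

# Reprocessing path: the same function, replayed from the start of the log.
rebuilt_state = replay(log)
print(live_state == rebuilt_state)  # True
```

In a Lambda setup the batch layer would carry a second implementation of `apply_event`, which is where the code duplication and drift risk comes from. The cost of the Kappa approach is that a full replay must scan the retained log, which is the "reprocessing at scale" risk.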

Resources