Data Product Thinking: Treating Data as a First-Class Product

#data-strategy #product-management #data-mesh #analytics

The shift from "data as a byproduct" to "data as a product" is one of the most impactful organizational changes a company can make. Inspired by data mesh principles, data product thinking applies product management discipline — user research, SLAs, lifecycle management — to datasets, APIs, and analytical models.

Data Product Lifecycle

┌──────────┐    ┌──────────┐    ┌──────────┐    ┌─────────────┐
│ DISCOVER │───▶│  BUILD   │───▶│ OPERATE  │───▶│   EVOLVE    │
│          │    │          │    │          │    │             │
│ Identify │    │ Schema   │    │ Monitor  │    │ Versioning  │
│ consumers│    │ Contracts│    │ SLAs     │    │ Deprecation │
│ Define   │    │ Pipeline │    │ Quality  │    │ Migration   │
│ value    │    │ Testing  │    │ Support  │    │ Sunsetting  │
└──────────┘    └──────────┘    └──────────┘    └─────────────┘
      ▲                                               │
      └───────────────── Feedback Loop ───────────────┘
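
The lifecycle above can be read as a small state machine. The following is a minimal sketch, not a prescribed implementation: the names `Stage`, `NEXT_STAGE`, `ACTIVITIES`, and `advance` are hypothetical, and the transitions are taken directly from the arrows in the diagram, including the feedback loop from Evolve back to Discover.

```python
from enum import Enum


class Stage(Enum):
    DISCOVER = "discover"
    BUILD = "build"
    OPERATE = "operate"
    EVOLVE = "evolve"


# Transitions read off the diagram: a linear flow plus the
# feedback loop from EVOLVE back to DISCOVER.
NEXT_STAGE = {
    Stage.DISCOVER: Stage.BUILD,
    Stage.BUILD: Stage.OPERATE,
    Stage.OPERATE: Stage.EVOLVE,
    Stage.EVOLVE: Stage.DISCOVER,
}

# Activities per stage, copied from the diagram boxes.
ACTIVITIES = {
    Stage.DISCOVER: ["identify consumers", "define value"],
    Stage.BUILD: ["schema", "contracts", "pipeline", "testing"],
    Stage.OPERATE: ["monitor", "SLAs", "quality", "support"],
    Stage.EVOLVE: ["versioning", "deprecation", "migration", "sunsetting"],
}


def advance(stage: Stage) -> Stage:
    """Return the stage that follows `stage` in the lifecycle."""
    return NEXT_STAGE[stage]
```

Modelling the loop explicitly makes the point that a data product never reaches a terminal "done" state: after Evolve, it goes back to consumer discovery.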

Data Product Canvas

Use this template to design any data product before building it:

| Dimension | Questions to Answer |
|---|---|
| Name & Domain | What is this product called? Which domain owns it? |
| Consumers | Who uses this data? What decisions does it support? |
| Value Proposition | What question does it answer? What would break without it? |
| Source Data | What upstream data does it depend on? |
| Schema & Contract | What are the fields, types, and guarantees? |
| Quality SLAs | Freshness, completeness, accuracy targets? |
| Access Patterns | SQL query? API call? Dashboard? ML feature store? |
| Security & Privacy | PII handling? Access controls? Retention policy? |
| Owner & Support | Who is on-call? How are issues reported? |
| Cost | Compute and storage cost? Cost per consumer query? |
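
One way to keep the canvas honest is to store it as a structured record and check which dimensions are still blank before work starts. This is a minimal sketch assuming Python 3.9 or later; the class `DataProductCanvas`, its field names, and the example values are hypothetical and simply mirror the dimensions listed in the canvas.

```python
from dataclasses import dataclass, field, fields


@dataclass
class DataProductCanvas:
    # One attribute per canvas dimension (field names are illustrative).
    name: str = ""
    domain: str = ""
    consumers: list[str] = field(default_factory=list)
    value_proposition: str = ""
    source_data: list[str] = field(default_factory=list)
    schema_contract: dict[str, str] = field(default_factory=dict)
    quality_slas: dict[str, str] = field(default_factory=dict)
    access_patterns: list[str] = field(default_factory=list)
    security_privacy: str = ""
    owner_support: str = ""
    cost: str = ""

    def missing_dimensions(self) -> list[str]:
        """Return the canvas dimensions that have not been filled in yet."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# Hypothetical, partially completed canvas.
canvas = DataProductCanvas(
    name="customer_revenue",
    domain="finance",
    consumers=["finance analysts"],
)
print(canvas.missing_dimensions())
```

A team could treat an empty result from `missing_dimensions()` as the gate for moving from Discover to Build.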

Quality SLA Template

| SLA Dimension | Definition | Example Target | Measurement |
|---|---|---|---|
| Freshness | Max age of data | <2 hours for operational, <24 hours for analytical | Metadata timestamp check |
| Completeness | % of expected records present | >99.5% | Row count vs source comparison |
| Accuracy | % of values matching source of truth | >99.9% | Reconciliation queries |
| Schema Stability | Breaking changes per quarter | 0 unannounced | Schema registry monitoring |
| Availability | Uptime of data product endpoint | >99.5% | Endpoint health checks |
| Latency | Query response time (p95) | <5s for dashboards | Query performance monitoring |
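
The first three SLA dimensions reduce to simple comparisons once the measurements exist. The sketch below is illustrative only: the function names and input values are hypothetical, the thresholds are the example targets for an operational product, and in practice the inputs would come from the metadata timestamps, row counts, and reconciliation queries named under Measurement.

```python
from datetime import datetime, timedelta, timezone


def freshness_ok(last_updated: datetime, now: datetime, max_age: timedelta) -> bool:
    """Freshness: the data's age must be strictly below the target."""
    return now - last_updated < max_age


def ratio_ok(numerator: int, denominator: int, target: float) -> bool:
    """Completeness or accuracy: the observed ratio must exceed the target."""
    if denominator == 0:
        return False  # nothing to compare against, so treat it as a breach
    return numerator / denominator > target


def evaluate(last_updated, now, rows_loaded, rows_expected,
             values_matching, values_compared) -> dict:
    """Check one data product against the example operational targets."""
    return {
        "freshness": freshness_ok(last_updated, now, timedelta(hours=2)),
        "completeness": ratio_ok(rows_loaded, rows_expected, 0.995),
        "accuracy": ratio_ok(values_matching, values_compared, 0.999),
    }


# Hypothetical measurements for a single load.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
report = evaluate(
    last_updated=now - timedelta(minutes=90),
    now=now,
    rows_loaded=9_960, rows_expected=10_000,
    values_matching=9_980, values_compared=10_000,
)
print(report)  # {'freshness': True, 'completeness': True, 'accuracy': False}
```

Here the load is fresh (90 minutes old) and complete (99.6%), but accuracy at 99.8% misses the 99.9% target, which is the kind of breach the product owner would be expected to act on.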

Discoverability Maturity Model

| Level | Stage | How Data Is Found | Tooling |
|---|---|---|---|
| 0 | Tribal knowledge | Ask someone who knows | Slack, word of mouth |
| 1 | Documentation | Written guides in wiki pages | Confluence, Notion |
| 2 | Catalog | Searchable metadata catalog | DataHub, OpenMetadata |
| 3 | Marketplace | Data products with ratings, usage stats, SLAs | Atlan, DataZone, Collibra |
| 4 | AI-Assisted | Natural language search, auto-recommendations | Catalog + LLM integration |

Internal Data Marketplace Architecture

┌──────────────────────────────────────────────────────┐
│              DATA CONSUMERS                          │
│  Analysts · Data Scientists · Applications · LLMs    │
└──────────────────┬───────────────────────────────────┘
                   │  Search, subscribe, consume
┌──────────────────▼───────────────────────────────────┐
│            DATA MARKETPLACE                          │
│  ┌────────────┐ ┌────────────┐ ┌────────────┐        │
│  │ Discovery  │ │ Access     │ │ Quality    │        │
│  │ & Search   │ │ Request    │ │ Dashboard  │        │
│  │            │ │ Workflow   │ │            │        │
│  └────────────┘ └────────────┘ └────────────┘        │
│  ┌────────────┐ ┌────────────┐ ┌────────────┐        │
│  │ Usage      │ │ Lineage    │ │ Cost       │        │
│  │ Analytics  │ │ Viewer     │ │ Attribution│        │
│  └────────────┘ └────────────┘ └────────────┘        │
├──────────────────────────────────────────────────────┤
│            DATA PRODUCTS (by domain)                 │
│  ┌────────────┐ ┌────────────┐ ┌────────────┐        │
│  │ Finance    │ │ Product    │ │ Marketing  │        │
│  │ Domain     │ │ Domain     │ │ Domain     │  ...   │
│  │ revenue    │ │ events     │ │ campaigns  │        │
│  │ costs      │ │ users      │ │ attribution│        │
│  └────────────┘ └────────────┘ └────────────┘        │
├──────────────────────────────────────────────────────┤
│            PLATFORM LAYER                            │
│  Compute · Storage · Orchestration · Governance      │
└──────────────────────────────────────────────────────┘
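
The "search, subscribe, consume" interaction between consumers and the marketplace can be sketched with an in-memory registry. This is a toy model under stated assumptions, not how any particular catalog tool works: `DataProduct`, `Marketplace`, and their methods are hypothetical names, and the access request workflow, lineage, and cost attribution components are left out. It assumes Python 3.9 or later.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    name: str
    domain: str
    description: str
    subscribers: set[str] = field(default_factory=set)


class Marketplace:
    """Toy registry covering discovery, subscription, and usage counts."""

    def __init__(self) -> None:
        self._products: dict[str, DataProduct] = {}

    def publish(self, product: DataProduct) -> None:
        # Domain teams publish; the marketplace only indexes the product.
        self._products[product.name] = product

    def search(self, term: str) -> list[str]:
        """Case-insensitive match on name, domain, or description."""
        term = term.lower()
        return sorted(
            p.name
            for p in self._products.values()
            if term in f"{p.name} {p.domain} {p.description}".lower()
        )

    def subscribe(self, consumer: str, product_name: str) -> None:
        self._products[product_name].subscribers.add(consumer)

    def usage(self) -> dict[str, int]:
        """Subscriber count per product, a crude adoption signal."""
        return {name: len(p.subscribers) for name, p in self._products.items()}


market = Marketplace()
market.publish(DataProduct("revenue", "finance", "Recognized revenue by customer"))
market.publish(DataProduct("events", "product", "Product usage events"))
market.publish(DataProduct("campaigns", "marketing", "Campaign spend and reach"))

market.subscribe("analyst_a", "revenue")
market.subscribe("pricing_service", "revenue")
print(market.search("finance"))  # ['revenue']
print(market.usage())            # {'revenue': 2, 'events': 0, 'campaigns': 0}
```

Note that the registry holds metadata and subscriptions only. The data itself stays with the owning domain, which matches the layering in the diagram.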

Organizational Implications

Data product thinking requires three shifts:

  1. From project to product: Data initiatives have continuous ownership, not project end dates. A domain team owns "customer revenue data" permanently, not as a one-time ETL task.

  2. From central to federated: Domain teams own and publish their data products. The central platform team provides tooling, standards, and infrastructure — not the data itself.

  3. From output to outcome: Success is measured by consumer adoption and decision impact, not by the number of tables or pipelines shipped.
