Data Product Modeling
Data products are consumption-ready datasets designed for specific use cases. They form the final layer of your data architecture: the reports, dashboards, and metrics that business users consume.
Core Principles
1. Consumer-First Design
Data products are built for specific consumers and use cases:
- Designed around how data will be consumed
- Organized by consumer needs (executives, analysts, operations)
- Include only relevant metrics for the use case
- Clear ownership and refresh schedules
2. Self-Service Ready
Products should be discoverable and usable without deep technical knowledge:
- Clear business definitions and metadata
- Documented metrics with calculations
- Known consumers and usage patterns
- Refresh schedules and data freshness
3. Product Thinking
Treat data as a product with lifecycle management:
- Documented and supported
- Quality-tested and monitored
- Evolved based on consumer feedback
- Clear ownership and accountability
Data Product Structure
A data product in Modality consists of:
- Product metadata: Description, owner, consumers, refresh schedule
- Reports: Collections of related metrics and visualizations
- Metrics: Calculated business measures with formulas
- Entity references: Links to conceptual entities used in calculations
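One way to picture this structure is as nested records. The Python sketch below is illustrative only: the class names and the `entities` helper are assumptions made for this example, not part of Modality, and the fields simply mirror the attributes described in this guide.

```python
from dataclasses import dataclass, field


@dataclass
class Metric:
    name: str
    description: str = ""
    calculation: str = ""
    target_value: str = ""
    uses: list[str] = field(default_factory=list)  # entity references, e.g. "Sales.Order"


@dataclass
class Report:
    name: str
    type: str = "other"  # "dashboard", "analytical", "operational", or "other"
    description: str = ""
    uses: list[str] = field(default_factory=list)
    metrics: list[Metric] = field(default_factory=list)


@dataclass
class DataProduct:
    name: str
    description: str = ""
    owner: str = ""
    refresh_schedule: str = ""  # cron expression
    consumers: list[str] = field(default_factory=list)
    reports: list[Report] = field(default_factory=list)

    def entities(self) -> set[str]:
        """Collect every entity referenced by any report or metric (hypothetical helper)."""
        refs: set[str] = set()
        for report in self.reports:
            refs.update(report.uses)
            for metric in report.metrics:
                refs.update(metric.uses)
        return refs


product = DataProduct(
    name="customer_analytics",
    owner="analytics-team",
    refresh_schedule="0 6 * * *",
    consumers=["product-team"],
    reports=[
        Report(
            name="Revenue Dashboard",
            type="dashboard",
            uses=["Sales.Order"],
            metrics=[
                Metric(
                    name="Total Revenue",
                    calculation="SUM(Order.amount)",
                    uses=["Sales.Order"],
                )
            ],
        )
    ],
)
print(sorted(product.entities()))
```

The nesting (product, then reports, then metrics) is the same nesting the syntax uses, which is why entity references can be rolled up from metrics to the product as a whole.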
Basic Syntax
data_product "product_name" {
  description = "What this product provides"
  owner = "team-name"
  refresh_schedule = "0 6 * * *" # Cron expression
  consumer "team-name"
  consumer "dashboard-url"
  report "Report Name" {
    type = "dashboard" # One of: "dashboard", "analytical", "operational", "other"
    description = "Report purpose"
    uses "Domain.Entity"
    metric "Metric Name" {
      description = "What this measures"
      calculation = "SQL formula or business logic"
      target_value = "Goal or benchmark"
      uses "Domain.Entity"
    }
  }
}
Product Attributes
description
A clear description of what the data product provides and its intended use cases.
data_product "customer_analytics" {
  description = "Customer behavior and lifecycle analytics"
}
owner
The team or individual responsible for maintaining the data product.
data_product "customer_analytics" {
  owner = "analytics-team"
}
refresh_schedule
When the product data is refreshed, using cron expression format.
data_product "customer_analytics" {
  refresh_schedule = "0 6 * * *" # Daily at 6am UTC
}
Common schedules:
- "0 * * * *": Hourly
- "0 6 * * *": Daily at 6am
- "0 6 * * 1": Weekly on Mondays at 6am
- "0 6 1 * *": Monthly on the 1st at 6am
consumer
Who uses this data product (teams, individuals, or dashboard URLs).
data_product "customer_analytics" {
  consumer "product-team"
  consumer "marketing-team"
  consumer "https://analytics.company.com/dashboard"
}
Reports
Reports group related metrics and visualizations together.
Report Types
- dashboard: Interactive visualizations for exploration
- analytical: Deep-dive analysis reports
- operational: Real-time operational metrics
- other: Custom report types
Report Syntax
report "Report Name" {
  type = "dashboard"
  description = "Report purpose"
  uses "Domain.Entity" # Entity references
  metric "Metric Name" { ... }
}
Metrics
Metrics are calculated business measures (KPIs).
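To make "calculated business measure" concrete, the Python sketch below evaluates a few formulas of the kind a metric's calculation holds, such as `SUM(Order.amount)` and `SUM(Order.amount) / COUNT(DISTINCT Customer.id)`, over a handful of invented order rows. The sample data and variable names are assumptions for illustration. Modality records the formula as documentation and this is not how it evaluates anything. The sketch also assumes customers are counted from the orders themselves, so customers with no orders are not included.

```python
from collections import Counter

# Invented sample rows standing in for the Order entity.
orders = [
    {"customer_id": "c1", "amount": 120.0},
    {"customer_id": "c1", "amount": 80.0},
    {"customer_id": "c2", "amount": 200.0},
    {"customer_id": "c3", "amount": 100.0},
]

# SUM(Order.amount)
total_revenue = sum(order["amount"] for order in orders)

# AVG(Order.amount)
average_order_value = total_revenue / len(orders)

# SUM(Order.amount) / COUNT(DISTINCT Customer.id)
customer_ids = {order["customer_id"] for order in orders}
lifetime_value = total_revenue / len(customer_ids)

# Share of customers with more than one order, as a percentage.
orders_per_customer = Counter(order["customer_id"] for order in orders)
repeat_customers = sum(1 for count in orders_per_customer.values() if count > 1)
repeat_rate = repeat_customers / len(customer_ids) * 100

print(total_revenue, average_order_value, round(lifetime_value, 2), round(repeat_rate, 2))
```

Working a formula through by hand like this is a quick way to check that a metric's description and its calculation actually agree.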
Metric Attributes
description
What the metric measures in business terms.
metric "Total Revenue" {
  description = "Sum of all order values"
}
calculation
The formula or business logic for calculating the metric.
metric "Total Revenue" {
  calculation = "SUM(Order.amount)"
}
target_value
The goal or benchmark for this metric.
metric "Total Revenue" {
  target_value = "$1M monthly"
}
uses
The entities referenced in this metric's calculation.
metric "Customer Lifetime Value" {
  calculation = "SUM(Order.amount) / COUNT(DISTINCT Customer.id)"
  uses "Sales.Customer"
  uses "Sales.Order"
}
Complete Example
# Define conceptual entities
domain "Sales" {
  color = "#e74c3c"
  description = "Customer sales and orders"
  entity "Customer" {
    type = "entity"
    description = "A customer who can place orders"
  }
  entity "Order" {
    type = "entity"
    description = "A customer order with line items"
    contains entity.Customer
  }
}
# Define data product
data_product "Sales Analytics" {
  description = "Sales performance metrics and dashboards"
  owner = "sales-team"
  refresh_schedule = "0 * * * *" # Hourly
  consumer "sales-team"
  consumer "executives"
  consumer "https://analytics.company.com/sales"
  report "Revenue Dashboard" {
    type = "dashboard"
    description = "Real-time revenue tracking"
    uses "Sales.Order"
    uses "Sales.Customer"
    metric "Total Revenue" {
      description = "Sum of all orders"
      calculation = "SUM(Order.amount)"
      target_value = "$1M monthly"
      uses "Sales.Order"
    }
    metric "Average Order Value" {
      description = "Average revenue per order"
      calculation = "AVG(Order.amount)"
      target_value = "$150"
      uses "Sales.Order"
    }
    metric "Customer Count" {
      description = "Number of unique customers"
      calculation = "COUNT(DISTINCT Customer.id)"
      target_value = "10,000 active"
      uses "Sales.Customer"
    }
  }
  report "Customer Insights" {
    type = "dashboard"
    description = "Customer behavior analysis"
    uses "Sales.Customer"
    uses "Sales.Order"
    metric "Customer Lifetime Value" {
      description = "Average revenue per customer"
      calculation = "SUM(Order.amount) / COUNT(DISTINCT Customer.id)"
      target_value = "$500"
      uses "Sales.Customer"
      uses "Sales.Order"
    }
    metric "Repeat Customer Rate" {
      description = "Percentage of customers with multiple orders"
      calculation = "COUNT(DISTINCT Customer WHERE order_count > 1) / COUNT(DISTINCT Customer) * 100"
      target_value = "40%"
      uses "Sales.Customer"
      uses "Sales.Order"
    }
  }
}
Product Components (Advanced)
Modality supports flexible component types for representing different data product patterns:
Component Types
- report: Traditional BI dashboards and reports
- cube: OLAP cubes for multidimensional analysis
- metric: Standalone KPI definitions
- reverse-etl: Data synced back to operational systems
- api: Programmatic data access endpoints
- stream: Real-time data streams
Each component references the conceptual entities it uses, creating lineage from source → entity → product component.
data_product "Customer 360" {
  description = "Complete customer view"
  owner = "analytics-team"
  # Multiple component types in one product
  component "customer_dashboard" {
    type = "report"
    uses "Sales.Customer"
  }
  component "customer_api" {
    type = "api"
    uses "Sales.Customer"
  }
  component "customer_sync" {
    type = "reverse-etl"
    uses "Sales.Customer"
  }
}
Best Practices
1. Organize by Consumer
Create data products around who will use them:
data_product "Executive Dashboard" {
  description = "C-suite metrics"
  consumer "executives"
}
data_product "Sales Team Analytics" {
  description = "Daily sales operations"
  consumer "sales-team"
}
2. Document Ownership Clearly
Always specify who maintains the product:
data_product "Customer Analytics" {
  owner = "analytics-team"
}
3. Set Refresh Schedules
Define how often data should be updated:
data_product "Real-time Dashboard" {
  refresh_schedule = "0 * * * *" # Hourly
}
data_product "Daily Reports" {
  refresh_schedule = "0 6 * * *" # Daily at 6am
}
4. List All Consumers
Track who depends on this data:
data_product "Customer Analytics" {
  consumer "product-team"
  consumer "marketing-team"
  consumer "executives"
}
5. Group Related Metrics
Keep related metrics together in the same report:
data_product "Sales Analytics" {
  report "Revenue Metrics" {
    metric "Total Revenue" { ... }
    metric "Average Order Value" { ... }
    metric "Revenue Growth Rate" { ... }
  }
  report "Customer Metrics" {
    metric "Customer Lifetime Value" { ... }
    metric "Customer Acquisition Cost" { ... }
    metric "Churn Rate" { ... }
  }
}
6. Use Clear Names
Choose descriptive names that indicate purpose:
# Good
data_product "Customer Lifecycle Analytics"
# Avoid
data_product "Dashboard 1"
7. Document Calculations
Make metric formulas clear and understandable:
metric "Net Revenue" {
  description = "Total revenue minus refunds and discounts"
  calculation = "SUM(Order.amount) - SUM(Refund.amount) - SUM(Discount.amount)"
  target_value = "$800K monthly"
}
8. Link to Conceptual Entities
Always reference the entities used in your metrics:
metric "Customer Lifetime Value" {
  calculation = "SUM(Order.amount) / COUNT(DISTINCT Customer.id)"
  # This creates lineage
  uses "Sales.Customer"
  uses "Sales.Order"
}
This creates clear lineage from source systems → entities → metrics → products.
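The practical payoff of declaring these references is impact analysis: given an entity, you can list the metrics that depend on it. The Python sketch below inverts a small table of metric-to-entity references taken from the Complete Example. `METRIC_USES` and `build_impact_index` are hypothetical names for this illustration and do not describe how Modality stores or queries lineage. Source systems are left out, since this page does not show how they are declared.

```python
from collections import defaultdict

# (product, metric) -> entities named in that metric's `uses` lines.
METRIC_USES = {
    ("Sales Analytics", "Total Revenue"): ["Sales.Order"],
    ("Sales Analytics", "Average Order Value"): ["Sales.Order"],
    ("Sales Analytics", "Customer Count"): ["Sales.Customer"],
    ("Sales Analytics", "Customer Lifetime Value"): ["Sales.Customer", "Sales.Order"],
}


def build_impact_index(metric_uses):
    """Invert metric -> entities into entity -> set of (product, metric)."""
    index = defaultdict(set)
    for metric_key, entities in metric_uses.items():
        for entity in entities:
            index[entity].add(metric_key)
    return index


impact = build_impact_index(METRIC_USES)
affected = sorted(metric for _product, metric in impact["Sales.Customer"])
print(affected)  # ['Customer Count', 'Customer Lifetime Value']
```

A metric with no declared entity references never appears in such an index, which is why an undocumented dependency is invisible when an upstream entity changes.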
Common Patterns
Executive Dashboard
data_product "Executive Dashboard" {
  description = "High-level company metrics"
  owner = "analytics-team"
  refresh_schedule = "0 * * * *" # Hourly
  consumer "executives"
  consumer "board"
  report "Company KPIs" {
    type = "dashboard"
    metric "Monthly Recurring Revenue" { ... }
    metric "Customer Acquisition Cost" { ... }
    metric "Net Promoter Score" { ... }
    metric "Burn Rate" { ... }
  }
}
Operational Dashboard
data_product "Operations Dashboard" {
  description = "Real-time operational metrics"
  owner = "ops-team"
  refresh_schedule = "*/15 * * * *" # Every 15 minutes
  consumer "operations-team"
  consumer "customer-support"
  report "Live Metrics" {
    type = "operational"
    metric "Active Users (Now)" { ... }
    metric "Error Rate (Last Hour)" { ... }
    metric "Queue Depth" { ... }
    metric "Response Time (p95)" { ... }
  }
}
Analytical Report
data_product "Cohort Analysis" {
  description = "Customer cohort retention analysis"
  owner = "analytics-team"
  refresh_schedule = "0 6 * * 1" # Weekly on Monday
  consumer "product-team"
  consumer "marketing-team"
  report "Retention Analysis" {
    type = "analytical"
    description = "Customer retention by acquisition cohort"
    metric "30-Day Retention" { ... }
    metric "90-Day Retention" { ... }
    metric "Cohort Revenue" { ... }
  }
}