Data Product Modeling

Data products are consumption-ready datasets designed for specific use cases. They form the final layer of your data architecture: the reports, dashboards, and metrics that business users consume.

Core Principles

1. Consumer-First Design

Data products are built for specific consumers and use cases:

  • Designed around how data will be consumed
  • Organized by consumer needs (executives, analysts, operations)
  • Limited to the metrics relevant to the use case
  • Backed by clear ownership and refresh schedules

2. Self-Service Ready

Products should be discoverable and usable without deep technical knowledge:

  • Clear business definitions and metadata
  • Documented metrics with calculations
  • Known consumers and usage patterns
  • Refresh schedules and data freshness

3. Product Thinking

Treat data as a product with lifecycle management:

  • Documented and supported
  • Quality-tested and monitored
  • Evolved based on consumer feedback
  • Clear ownership and accountability

Data Product Structure

A data product in Modality consists of:

  1. Product metadata: Description, owner, consumers, refresh schedule
  2. Reports: Collections of related metrics and visualizations
  3. Metrics: Calculated business measures with formulas
  4. Entity references: Links to conceptual entities used in calculations

Basic Syntax

mml
data_product "product_name" {
  description = "What this product provides"
  owner = "team-name"
  refresh_schedule = "0 6 * * *"  # Cron expression

  consumer "team-name"
  consumer "dashboard-url"

  report "Report Name" {
    type = "dashboard"  # One of: "dashboard", "analytical", "operational", "other"
    description = "Report purpose"

    uses "Domain.Entity"

    metric "Metric Name" {
      description = "What this measures"
      calculation = "SQL formula or business logic"
      target_value = "Goal or benchmark"

      uses "Domain.Entity"
    }
  }
}

Product Attributes

description

A clear description of what the data product provides and its intended use cases.

mml
data_product "customer_analytics" {
  description = "Customer behavior and lifecycle analytics"
}

owner

The team or individual responsible for maintaining the data product.

mml
data_product "customer_analytics" {
  owner = "analytics-team"
}

refresh_schedule

When the product data is refreshed, written as a five-field cron expression (minute, hour, day of month, month, day of week).

mml
data_product "customer_analytics" {
  refresh_schedule = "0 6 * * *"  # Daily at 6am UTC
}

Common schedules:

  • "0 * * * *" - Hourly
  • "0 6 * * *" - Daily at 6am
  • "0 6 * * 1" - Weekly on Mondays at 6am
  • "0 6 1 * *" - Monthly on the 1st at 6am
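To make the field order concrete, here is a minimal Python sketch that labels each field of a five-field cron expression. It is not part of Modality; `describe_cron` and `CRON_FIELDS` are hypothetical names used only for illustration, and the sketch checks the field count only, not the validity of each value.

```python
# Field names for the standard five-field cron format, in order.
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def describe_cron(expression):
    """Map each field of a five-field cron expression to its field name."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(
            f"expected {len(CRON_FIELDS)} fields, got {len(parts)}: {expression!r}"
        )
    return dict(zip(CRON_FIELDS, parts))


# Label the fields of the common schedules listed above.
for schedule in ("0 * * * *", "0 6 * * *", "0 6 * * 1", "0 6 1 * *"):
    print(schedule, "->", describe_cron(schedule))
```

Reading "0 6 * * 1" this way gives minute 0, hour 6, any day of month, any month, and day of week 1 (Monday).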

consumer

Who uses this data product (teams, individuals, or dashboard URLs).

mml
data_product "customer_analytics" {
  consumer "product-team"
  consumer "marketing-team"
  consumer "https://analytics.company.com/dashboard"
}

Reports

Reports group related metrics and visualizations together.

Report Types

  • dashboard: Interactive visualizations for exploration
  • analytical: Deep-dive analysis reports
  • operational: Real-time operational metrics
  • other: Custom report types

Report Syntax

mml
report "Report Name" {
  type = "dashboard"
  description = "Report purpose"

  uses "Domain.Entity"  # Entity references

  metric "Metric Name" { ... }
}

Metrics

Metrics are calculated business measures (KPIs).

Metric Attributes

description

What the metric measures in business terms.

mml
metric "Total Revenue" {
  description = "Sum of all order values"
}

calculation

The formula or business logic for calculating the metric.

mml
metric "Total Revenue" {
  calculation = "SUM(Order.amount)"
}

target_value

The goal or benchmark for this metric.

mml
metric "Total Revenue" {
  target_value = "$1M monthly"
}

uses

The entities referenced in this metric's calculation.

mml
metric "Customer Lifetime Value" {
  calculation = "SUM(Order.amount) / COUNT(DISTINCT Customer.id)"

  uses "Sales.Customer"
  uses "Sales.Order"
}
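As a worked illustration of the formula above, the following Python sketch evaluates SUM(Order.amount) / COUNT(DISTINCT Customer.id) on a few made-up rows. The sample data and the function name are hypothetical. The sketch also assumes customers are counted from the order rows themselves, which this page does not specify; counting from the full Customer entity would include customers with no orders and lower the result.

```python
# Hypothetical order rows as (customer_id, amount). Not data from a real system.
orders = [
    ("c1", 120.0),
    ("c1", 80.0),
    ("c2", 300.0),
    ("c3", 100.0),
]


def customer_lifetime_value(rows):
    """Compute SUM(amount) / COUNT(DISTINCT customer_id) over the given rows."""
    total_amount = sum(amount for _, amount in rows)
    distinct_customers = {customer_id for customer_id, _ in rows}
    if not distinct_customers:
        return 0.0  # Assumption: empty input yields 0 instead of an error.
    return total_amount / len(distinct_customers)


print(customer_lifetime_value(orders))  # 600.0 total over 3 customers = 200.0
```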

Complete Example

mml
# Define conceptual entities
domain "Sales" {
  color = "#e74c3c"
  description = "Customer sales and orders"

  entity "Customer" {
    type = "entity"
    description = "A customer who can place orders"
  }

  entity "Order" {
    type = "entity"
    description = "A customer order with line items"

    contains entity.Customer
  }
}

# Define data product
data_product "Sales Analytics" {
  description = "Sales performance metrics and dashboards"
  owner = "sales-team"
  refresh_schedule = "0 * * * *"  # Hourly

  consumer "sales-team"
  consumer "executives"
  consumer "https://analytics.company.com/sales"

  report "Revenue Dashboard" {
    type = "dashboard"
    description = "Real-time revenue tracking"

    uses "Sales.Order"
    uses "Sales.Customer"

    metric "Total Revenue" {
      description = "Sum of all orders"
      calculation = "SUM(Order.amount)"
      target_value = "$1M monthly"
      uses "Sales.Order"
    }

    metric "Average Order Value" {
      description = "Average revenue per order"
      calculation = "AVG(Order.amount)"
      target_value = "$150"
      uses "Sales.Order"
    }

    metric "Customer Count" {
      description = "Number of unique customers"
      calculation = "COUNT(DISTINCT Customer.id)"
      target_value = "10,000 active"
      uses "Sales.Customer"
    }
  }

  report "Customer Insights" {
    type = "dashboard"
    description = "Customer behavior analysis"

    uses "Sales.Customer"
    uses "Sales.Order"

    metric "Customer Lifetime Value" {
      description = "Average revenue per customer"
      calculation = "SUM(Order.amount) / COUNT(DISTINCT Customer.id)"
      target_value = "$500"
      uses "Sales.Customer"
      uses "Sales.Order"
    }

    metric "Repeat Customer Rate" {
      description = "Percentage of customers with multiple orders"
      calculation = "COUNT(DISTINCT Customer WHERE order_count > 1) / COUNT(DISTINCT Customer) * 100"
      target_value = "40%"
      uses "Sales.Customer"
      uses "Sales.Order"
    }
  }
}
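The "Repeat Customer Rate" calculation in the example refers to `order_count`, which is not declared on any entity shown here. One plausible reading, sketched below in Python, is that it is the number of orders per customer. The function name and sample identifiers are hypothetical, and the interpretation is an assumption rather than Modality's documented behavior.

```python
from collections import Counter


def repeat_customer_rate(order_customer_ids, all_customer_ids):
    """Percentage of customers that appear on more than one order."""
    customers = set(all_customer_ids)
    if not customers:
        return 0.0  # Assumption: no customers yields 0 instead of an error.
    # order_count[c] is the number of orders placed by customer c (0 if none).
    order_count = Counter(order_customer_ids)
    repeat_customers = sum(1 for c in customers if order_count[c] > 1)
    return repeat_customers / len(customers) * 100


# Six hypothetical orders across four customers; c1 and c3 ordered more than once.
rate = repeat_customer_rate(
    ["c1", "c1", "c2", "c3", "c3", "c3"],
    ["c1", "c2", "c3", "c4"],
)
print(rate)  # 2 repeat customers out of 4 = 50.0
```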

Product Components (Advanced)

Modality supports flexible component types for representing different data product patterns:

Component Types

  • report: Traditional BI dashboards and reports
  • cube: OLAP cubes for multidimensional analysis
  • metric: Standalone KPI definitions
  • reverse-etl: Data synced back to operational systems
  • api: Programmatic data access endpoints
  • stream: Real-time data streams

Each component references the conceptual entities it uses, creating lineage from source → entity → product component.

mml
data_product "Customer 360" {
  description = "Complete customer view"
  owner = "analytics-team"

  # Multiple component types in one product
  component "customer_dashboard" {
    type = "report"
    uses "Sales.Customer"
  }

  component "customer_api" {
    type = "api"
    uses "Sales.Customer"
  }

  component "customer_sync" {
    type = "reverse-etl"
    uses "Sales.Customer"
  }
}
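Because every component declares the entities it uses, the lineage can be inverted to answer "which components depend on this entity?". The Python sketch below illustrates that idea with a hand-written dictionary mirroring the "Customer 360" example. It does not parse MML, and `components_by_entity` is a hypothetical helper, not a Modality API.

```python
from collections import defaultdict

# Mirrors the "Customer 360" example: component name -> (type, entities used).
components = {
    "customer_dashboard": ("report", ["Sales.Customer"]),
    "customer_api": ("api", ["Sales.Customer"]),
    "customer_sync": ("reverse-etl", ["Sales.Customer"]),
}


def components_by_entity(components):
    """Invert component -> entities into entity -> sorted component names."""
    index = defaultdict(list)
    for name, (_component_type, entities) in components.items():
        for entity in entities:
            index[entity].append(name)
    return {entity: sorted(names) for entity, names in index.items()}


print(components_by_entity(components))
```

A change to `Sales.Customer` would then be known to affect all three components.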

Best Practices

1. Organize by Consumer

Create data products around who will use them:

mml
data_product "Executive Dashboard" {
  description = "C-suite metrics"
  consumer "executives"
}

data_product "Sales Team Analytics" {
  description = "Daily sales operations"
  consumer "sales-team"
}

2. Document Ownership Clearly

Always specify who maintains the product:

mml
data_product "Customer Analytics" {
  owner = "analytics-team"
}

3. Set Refresh Schedules

Define how often data should be updated:

mml
data_product "Hourly Dashboard" {
  refresh_schedule = "0 * * * *"  # Hourly
}

data_product "Daily Reports" {
  refresh_schedule = "0 6 * * *"  # Daily at 6am
}

4. List All Consumers

Track who depends on this data:

mml
data_product "Customer Analytics" {
  consumer "product-team"
  consumer "marketing-team"
  consumer "executives"
}

5. Group Related Metrics

Keep related metrics together in the same report:

mml
data_product "Sales Analytics" {
  report "Revenue Metrics" {
    metric "Total Revenue" { ... }
    metric "Average Order Value" { ... }
    metric "Revenue Growth Rate" { ... }
  }

  report "Customer Metrics" {
    metric "Customer Lifetime Value" { ... }
    metric "Customer Acquisition Cost" { ... }
    metric "Churn Rate" { ... }
  }
}

6. Use Clear Names

Choose descriptive names that indicate purpose:

mml
# Good
data_product "Customer Lifecycle Analytics"

# Avoid
data_product "Dashboard 1"

7. Document Calculations

Make metric formulas clear and understandable:

mml
metric "Net Revenue" {
  description = "Total revenue minus refunds and discounts"
  calculation = "SUM(Order.amount) - SUM(Refund.amount) - SUM(Discount.amount)"
  target_value = "$800K monthly"
}

8. Reference Entities

Always reference the entities used in your metrics:

mml
metric "Customer Lifetime Value" {
  calculation = "SUM(Order.amount) / COUNT(DISTINCT Customer.id)"

  # This creates lineage
  uses "Sales.Customer"
  uses "Sales.Order"
}

This creates clear lineage from source systems → entities → metrics → products.
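One way to keep this lineage honest is to check that every entity named in a calculation also appears in a `uses` line. The Python sketch below shows such a check under a stated assumption: calculations refer to attributes as `Entity.attribute` (as in `Order.amount`). `undeclared_entities` is a hypothetical helper, not a Modality feature, and it will not detect bare entity names such as `COUNT(DISTINCT Customer)`.

```python
import re

# Assumption: attributes are written as Entity.attribute, with a capitalized entity name.
ENTITY_ATTRIBUTE = re.compile(r"\b([A-Z]\w*)\.\w+")


def undeclared_entities(calculation, uses):
    """Return entity names found in `calculation` that no `uses` entry covers."""
    referenced = set(ENTITY_ATTRIBUTE.findall(calculation))
    # "Sales.Order" declares the entity "Order": keep the part after the last dot.
    declared = {reference.rsplit(".", 1)[-1] for reference in uses}
    return sorted(referenced - declared)


calculation = "SUM(Order.amount) / COUNT(DISTINCT Customer.id)"
print(undeclared_entities(calculation, ["Sales.Customer", "Sales.Order"]))  # []
print(undeclared_entities(calculation, ["Sales.Order"]))  # ['Customer']
```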

Common Patterns

Executive Dashboard

mml
data_product "Executive Dashboard" {
  description = "High-level company metrics"
  owner = "analytics-team"
  refresh_schedule = "0 * * * *"  # Hourly

  consumer "executives"
  consumer "board"

  report "Company KPIs" {
    type = "dashboard"

    metric "Monthly Recurring Revenue" { ... }
    metric "Customer Acquisition Cost" { ... }
    metric "Net Promoter Score" { ... }
    metric "Burn Rate" { ... }
  }
}

Operational Dashboard

mml
data_product "Operations Dashboard" {
  description = "Real-time operational metrics"
  owner = "ops-team"
  refresh_schedule = "*/15 * * * *"  # Every 15 minutes

  consumer "operations-team"
  consumer "customer-support"

  report "Live Metrics" {
    type = "operational"

    metric "Active Users (Now)" { ... }
    metric "Error Rate (Last Hour)" { ... }
    metric "Queue Depth" { ... }
    metric "Response Time (p95)" { ... }
  }
}

Analytical Report

mml
data_product "Cohort Analysis" {
  description = "Customer cohort retention analysis"
  owner = "analytics-team"
  refresh_schedule = "0 6 * * 1"  # Weekly on Monday

  consumer "product-team"
  consumer "marketing-team"

  report "Retention Analysis" {
    type = "analytical"
    description = "Customer retention by acquisition cohort"

    metric "30-Day Retention" { ... }
    metric "90-Day Retention" { ... }
    metric "Cohort Revenue" { ... }
  }
}

Released under the MIT License.