
Use Cases

Composite metrics are a flexible metric type designed to combine aggregated metric sources at the user level. Use them when you need to add or subtract two or more aggregated values to produce a single signal, such as:
  • Net value (e.g., revenue minus refunds)
  • Unit-level experiment change (latest value minus first value)

Calculation

At the unit level, composite metrics first compute the aggregations of each component. The aggregated results are then added or subtracted according to the formula specified (for example: A + B or A - B + C). This calculation is done at a daily, 7-day, and cumulative level during experiment analysis.
Composite Metrics do not currently support daily rollups in turbo mode.
At a group level, the mean is calculated as the average of the unit-level composite aggregation.
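
The two steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Statsig's implementation: the component names, the unit IDs, the `composite_by_unit` and `group_means` helpers, and the choice to treat a unit with no data for a component as 0 are all hypothetical.

```python
from statistics import mean

# Hypothetical per-unit aggregates for three components A, B and C.
components = {
    "A": {"u1": 10.0, "u2": 4.0, "u3": 7.0},
    "B": {"u1": 3.0, "u2": 1.0},
    "C": {"u1": 2.0, "u3": 1.0},
}
# The formula A - B + C written as signed terms.
formula = [("A", +1), ("B", -1), ("C", +1)]
# Hypothetical exposure assignment: unit -> experiment group.
groups = {"u1": "control", "u2": "test", "u3": "test"}


def composite_by_unit(components, formula, units):
    """Add or subtract each component's unit-level aggregate per the formula.

    Assumption: a unit with no value for a component contributes 0.
    """
    return {
        unit: sum(sign * components[name].get(unit, 0.0) for name, sign in formula)
        for unit in units
    }


def group_means(unit_values, groups):
    """Average the unit-level composite values within each group."""
    by_group = {}
    for unit, group in groups.items():
        by_group.setdefault(group, []).append(unit_values[unit])
    return {group: mean(values) for group, values in by_group.items()}


unit_values = composite_by_unit(components, formula, groups)
means = group_means(unit_values, groups)
print(unit_values)
print(means)
```

Expressing the formula as signed terms keeps the addition and subtraction cases in one code path, which is why the sketch does not need a separate branch per operator.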

Example

If you define a composite gap metric as Max Price - Min Price, we compute Max Price and Min Price for each user, then subtract to get each user’s price gap. The experiment group mean is the average of those per-user results. The SQL for the individual components and for the composite would look like the following:
MAX Component:
-- Unit Level
SELECT
  source_data.unit_id,
  exposure_data.group_id,
  MAX(source_data.value_column) as value
FROM source_data
JOIN exposure_data
ON
  -- Only include users who saw the experiment
  source_data.unit_id = exposure_data.unit_id
  -- Only include data from after the user saw the experiment
  -- In this case exposure_data is already deduped to the "first exposure"
  AND source_data.timestamp >= exposure_data.timestamp
GROUP BY
  source_data.unit_id,
  exposure_data.group_id;

-- Experiment
SELECT
  group_id,
  COUNT(distinct unit_id) total_units
FROM exposure_data
GROUP BY group_id;

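-- Group Level
-- unit_data and group_data are the results of the two queries above
SELECT
  group_id,
  -- total_units repeats on every joined row, so take it once instead of summing it
  SUM(value) / MAX(total_units) AS mean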
FROM unit_data
JOIN group_data
USING (group_id)
GROUP BY group_id;
MIN Component:
-- Unit Level
SELECT
  source_data.unit_id,
  exposure_data.group_id,
  MIN(source_data.value_column) as value
FROM source_data
JOIN exposure_data
ON
  -- Only include users who saw the experiment
  source_data.unit_id = exposure_data.unit_id
  -- Only include data from after the user saw the experiment
  -- In this case exposure_data is already deduped to the "first exposure"
  AND source_data.timestamp >= exposure_data.timestamp
GROUP BY
  source_data.unit_id,
  exposure_data.group_id;

-- Experiment
SELECT
  group_id,
  COUNT(distinct unit_id) total_units
FROM exposure_data
GROUP BY group_id;

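-- Group Level
-- unit_data and group_data are the results of the two queries above
SELECT
  group_id,
  -- total_units repeats on every joined row, so take it once instead of summing it
  SUM(value) / MAX(total_units) AS mean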
FROM unit_data
JOIN group_data
USING (group_id)
GROUP BY group_id;
Composite Aggregation:
-- Unit Level
SELECT
  source_data.unit_id,
  exposure_data.group_id,
  MAX(source_data.value_column) - MIN(source_data.value_column) AS value
FROM source_data
JOIN exposure_data
ON
  -- Only include users who saw the experiment
  source_data.unit_id = exposure_data.unit_id
  -- Only include data from after the user saw the experiment
  -- In this case exposure_data is already deduped to the "first exposure"
  AND source_data.timestamp >= exposure_data.timestamp
GROUP BY
  source_data.unit_id,
  exposure_data.group_id;

-- Experiment
SELECT
  group_id,
  COUNT(distinct unit_id) total_units
FROM exposure_data
GROUP BY group_id;

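-- Group Level
-- unit_data and group_data are the results of the two queries above
SELECT
  group_id,
  -- total_units repeats on every joined row, so take it once instead of summing it
  SUM(value) / MAX(total_units) AS mean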
FROM unit_data
JOIN group_data
USING (group_id)
GROUP BY group_id;
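
To make the composite aggregation concrete, here is a small sketch that runs the three steps on toy data using Python's built-in sqlite3 module. The table contents, unit IDs, and integer timestamps are invented for illustration. The sketch also assumes that an exposed unit with no post-exposure data contributes 0 to the group total while still counting in the denominator; treat that as an assumption rather than confirmed behavior.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE exposure_data (unit_id TEXT, group_id TEXT, timestamp INTEGER);
CREATE TABLE source_data (unit_id TEXT, value_column REAL, timestamp INTEGER);

-- Hypothetical first exposures (one row per unit)
INSERT INTO exposure_data VALUES
  ('u1', 'control', 10), ('u2', 'test', 10), ('u3', 'test', 10);

-- Hypothetical price events
INSERT INTO source_data VALUES
  ('u1', 5.0, 5),    -- before exposure, filtered out by the join
  ('u1', 20.0, 11), ('u1', 12.0, 12),
  ('u2', 8.0, 11), ('u2', 15.0, 13),
  ('u4', 99.0, 11);  -- never exposed, filtered out by the join
""")

query = """
WITH unit_data AS (
  -- Unit level: per-unit Max Price - Min Price after first exposure
  SELECT s.unit_id, e.group_id,
         MAX(s.value_column) - MIN(s.value_column) AS value
  FROM source_data s
  JOIN exposure_data e
    ON s.unit_id = e.unit_id AND s.timestamp >= e.timestamp
  GROUP BY s.unit_id, e.group_id
),
group_data AS (
  -- Experiment: number of exposed units per group
  SELECT group_id, COUNT(DISTINCT unit_id) AS total_units
  FROM exposure_data
  GROUP BY group_id
)
-- Group level: summed unit values divided by exposed units
SELECT g.group_id,
       COALESCE(SUM(u.value), 0) * 1.0 / g.total_units AS mean
FROM group_data g
LEFT JOIN unit_data u ON g.group_id = u.group_id
GROUP BY g.group_id, g.total_units
ORDER BY g.group_id
"""
rows = conn.execute(query).fetchall()
print(rows)
```

In this toy data, u1 has a gap of 20 - 12 = 8, so the control mean is 8.0. In the test group, u2 has a gap of 15 - 8 = 7 and u3 has no events, so the mean over the two exposed units is 3.5.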

Options

  • Cohort Windows
    • You can specify a window for data collection after a unit’s exposure. For example, a 0-1 day cohort window would only count actions from days 0 and 1 after a unit was exposed to an experiment.
      • The "Only include units with a completed window" option can be selected to exclude units from pulse analysis for this metric until their cohort window has completed.
  • Winsorization
    • Specify a lower and/or upper percentile bound to winsorize at. These bounds are applied separately to the top-level metric and to each of its individual components.
  • Baked Metrics
    • Baked Metrics allow you to specify how long a metric needs to mature. This is common in situations like chargebacks or cancellations. Statsig will delay loading the data until the window has elapsed, and only calculate pulse results for that metric if a unit’s metric has matured.
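
As a rough illustration of the cohort window and winsorization options, the sketch below filters events to a 0-1 day window and clips values at percentile bounds. Everything here is an assumption made for illustration: the helper names are hypothetical, "day" is taken to mean a 24-hour period counted from the exposure timestamp, and the percentile uses a simple nearest-rank rule. Statsig's actual definitions may differ.

```python
import math
from datetime import datetime, timedelta


def in_cohort_window(exposure_ts, event_ts, start_day=0, end_day=1):
    """True if the event falls on days start_day..end_day after exposure.

    Assumption: day N is the N-th 24-hour period after the exposure timestamp.
    """
    if event_ts < exposure_ts:
        return False
    day_offset = (event_ts - exposure_ts).days
    return start_day <= day_offset <= end_day


def window_completed(exposure_ts, now, end_day=1):
    """True once the last day of the unit's cohort window has fully elapsed."""
    return now >= exposure_ts + timedelta(days=end_day + 1)


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    if not sorted_values:
        raise ValueError("percentile() needs at least one value")
    rank = max(1, math.ceil(pct * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def winsorize(values, lower_pct=None, upper_pct=None):
    """Clip values to the given lower and/or upper percentile bounds."""
    ordered = sorted(values)
    low = percentile(ordered, lower_pct) if lower_pct is not None else ordered[0]
    high = percentile(ordered, upper_pct) if upper_pct is not None else ordered[-1]
    return [min(max(v, low), high) for v in values]


exposure = datetime(2024, 1, 1, 12, 0)
print(in_cohort_window(exposure, datetime(2024, 1, 2, 18, 0)))  # day 1
print(in_cohort_window(exposure, datetime(2024, 1, 3, 13, 0)))  # day 2
print(window_completed(exposure, datetime(2024, 1, 3, 12, 0)))
print(winsorize([1, 2, 3, 4, 100], upper_pct=80))
```

Because the bounds are described as applying separately to the top-level metric and its components, a function like `winsorize` would be called once per component's unit-level values and once more on the composite values, rather than only on the final result.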