Sum Metrics

Sum metrics calculate the sum of a numeric column in a metric source.

Use cases

Sum metrics calculate the sum of a numeric column, which makes them well suited to comparing behavior across groups. Common examples are:

Totalling revenue by summing a revenue column from purchase logs
Totalling time spent by summing an aggregated time_spent metric on a user-day fact table

Calculation

At the unit level, sum metrics take the SUM of their input column.

At the group level, Statsig calculates the mean as the SUM of the unit-level sums, divided by the count of UNIQUE UNITS exposed to the experiment.

The SQL for this calculation:

sql

-- Unit Level
SELECT
  unit_id,
  group_id,
  SUM(value_column) as value
FROM source_data
JOIN exposure_data
ON
  -- Only include users who saw the experiment
  source_data.unit_id = exposure_data.unit_id
  -- Only include data from after the user saw the experiment
  -- In this case exposure_data is already deduped to the "first exposure"
  AND source_data.timestamp >= exposure_data.timestamp
GROUP BY
  source_data.unit_id,
  exposure_data.group_id;

-- Experiment
SELECT
  group_id,
  COUNT(distinct unit_id) total_units
FROM exposure_data
GROUP BY group_id;

-- Group Level
SELECT
  group_id,
  SUM(value)/SUM(total_units) as mean
FROM unit_data
JOIN group_data
USING (group_id)
GROUP BY group_id;

Methodology notes

Sum metrics use the SQL SUM aggregation. Refer to the Options section below for many advanced options.

Options

Metric Breakdowns: configure Metadata Columns to group results by for dimensional views in pulse results.
Multi Source: build sums from multiple metric sources, equivalent to a UNION in SQL. Useful when the same measure exists in sharded or disparate tables.
Winsorization: specify a lower and/or upper percentile bound. Statsig clamps all values below the lower threshold or above the upper threshold to reduce the impact of outliers.
Capping: specify an exact cap value (per unit type) to limit a unit's daily value. For example, cap daily purchases at 100 on an e-commerce site to exclude reseller behavior.
CUPED: enable CUPED and specify the lookback window for pre-experiment data inputs.
Thresholding: convert this metric into a 1/0 unit count metric that flags whether the unit's total sum meets or exceeds a given threshold.
Cohort Windows: specify a window for data collection after a unit's exposure. For example, a 0-1 day cohort window counts only actions from days 0 and 1 after exposure. Select Only include units with a completed window to exclude units from pulse analysis until their cohort window has completed.
Baked Metrics: specify how long a metric needs to mature. Metric maturation is common for chargebacks or cancellations. Statsig delays loading the data until the window has elapsed, and calculates pulse results only after a unit's metric has matured.

Was this helpful?