Deprecation Notice

This documentation page may be out of date, but is being kept for reference. Please refer to the newer documentation here.

Metrics

Metrics are measures of user or system behavior that you'll use as evaluation criteria for experiments. In Statsig Warehouse Native, Metrics are calculations on top of Metric Sources. The configuration of a metric includes:

  • The type of aggregation you want to perform (e.g., Sum, Mean, Count, Unique Users)
  • For Means, Sums, etc., the field to use for the numerical calculation
    • Optional filters on fields from your Metric Source query
    • Optional drill-down dimensions on metadata fields to be used in experiment analysis
    • Optional toggle to set lower/upper winsorization bounds for outlier control
    • Optional toggle to set a cohort window that input data is taken from
    • Optional toggle for whether the metric should wait for a user's data to bake before including that user in results
  • For Ratios and Funnels, the metric sources and fields that contribute the components of the metric
    • Optional filters on fields from your Metric Source query
    • Optional toggle to set a cohort window that input data is taken from

Metric Tab

Supported Metric Types

Metrics define the user-level aggregation to perform, as well as the group-level aggregation and statistical calculations applied on top of it.

We're actively working on adding more metric types - refer to the table below for the metric types available today, and how each is calculated on the backend.

| Metric Type | Examples | Metrics Tab Calculation | User Level Calculation | Group Calculation Type | Stats Notes |
| --- | --- | --- | --- | --- | --- |
| Count | clicks, purchases, API requests | Count of Metric Source rows | Count of Metric Source rows | Sum of user values | |
| Sum | revenue, time spent, bandwidth | Sum of Metric Source values | Sum of Metric Source values | Sum of user values | |
| Mean | average latency, average purchase price | Average of non-null Metric Source values | Sum of values, Count of values | Sum(values)/Sum(counts) | Delta Method applied |
| Count Distinct | Unique game rooms the user connected to | Count of distinct user-value pairs | Count of distinct values | Average of user-level counts | |
| Unit Count | See below - varies by rollup mode | See below | See below | See below | See below |
| - Daily Participation | Average DAU of exposed users | Daily Active Users | 1/0 flag for participation on each day | Sum of values / Total Days | |
| - One Time Event | Did a user complete an event during the experiment | Daily Active Users | 1/0 flag for participation across experiment lifespan | Count of users | |
| - Custom Window | Did a user subscribe between 3-7 days from exposure | Daily Active Users | 1/0 flag for participation within window | Count of users | |
| - Latest Value | Is the user a subscriber today? | Daily Active Users | 1/0 flag for participation on latest available day of data | Count of users | |
| Ratio | revenue per page hit, revenue per new user | Value of Numerator/Value of Denominator based on types | Value of numerator and denominator based on types | Sum(numerators)/Sum(denominators) | Delta Method applied |
| Funnel | conversion through a 5-step buy flow | Value of Numerator/Value of Denominator based on types | For each step, did the user complete all previous steps | Sum(completions)/Sum(step starts) | Delta Method applied |
| Percentile | p99.5 latency on page load | PX of all daily values observed | N/A | Configured Percentile of value column | Uses the outer CI method |

You can think of each of these in terms of a SQL query. The means of the experiment groups are either calculated directly (for ratios and mean metrics) or as the group total divided by the group population.

Count

-- User Level
SELECT
user_id,
COUNT(1) as value
FROM source_data
GROUP BY user_id;

-- Group Level
SELECT
group_id,
SUM(value) as total,
COUNT(distinct user_id) as population
FROM user_level_data
GROUP BY group_id;

Sum

-- User Level
SELECT
user_id,
SUM(value_column) as value
FROM source_data
GROUP BY user_id;

-- Group Level
SELECT
group_id,
SUM(value) as total,
COUNT(distinct user_id) as population
FROM user_level_data
GROUP BY group_id;

Mean

-- User Level
SELECT
user_id,
SUM(value_column) as value,
COUNT(value_column) as records
FROM source_data
WHERE value_column IS NOT NULL
GROUP BY user_id;

-- Group Level
SELECT
group_id,
SUM(value)/SUM(records) as mean
FROM user_data
GROUP BY group_id;

Count Distinct

-- User Level
SELECT
user_id,
COUNT(distinct value_column) as value
FROM source_data
GROUP BY user_id;

-- Group Level
SELECT
group_id,
SUM(value) as total,
COUNT(distinct user_id) as population
FROM user_data
GROUP BY group_id;

Percentile

-- Group Level
SELECT
group_id,
PERCENTILE(value, percentile_level) as value,
COUNT(distinct user_id) as population
FROM user_data
WHERE value IS NOT NULL
GROUP BY group_id;

Unit Count

-- User Level
SELECT
user_id,
COUNT(distinct date_column) as `Daily Participation Value`,
MAX(1) as `One-Time Event Value`,
MAX(IF(minutes_since_exposure between window_start and window_end, 1, 0)) as `Custom Window Value`,
MAX_BY(passes_filters, date_column) as `Latest Value Value`
FROM source_data
GROUP BY user_id;

-- Group Level
SELECT
group_id,
SUM(`Daily Participation Value`/days_exposed) as `Daily Participation Total`,
SUM(`One-Time Event Value`) as `One-Time Event Total`,
SUM(`Custom Window Value`) as `Custom Window Total`,
SUM(`Latest Value Value`) as `Latest Value Total`,
COUNT(distinct user_id) as population
FROM user_data
GROUP BY group_id;

Ratio

-- User Level
SELECT
user_id,
<> as numerator, -- depends on numerator type
<> as denominator -- depends on denominator type
FROM source_data
GROUP BY user_id;

-- Group Level
SELECT
group_id,
SUM(numerator)/SUM(denominator) as mean
FROM user_data
WHERE COALESCE(denominator, 0) != 0
GROUP BY group_id;

Funnel

-- User Level, per step
SELECT
user_id,
funnel_session_id, -- optional
funnel_step_id,
IF(`Completed All Steps Up to Current Step In Order`, 1, 0) as numerator,
IF(`Completed Previous Steps In Order`, 1, 0) as denominator
FROM source_data;

-- Group Level
SELECT
group_id,
funnel_step_id,
SUM(numerator)/SUM(denominator) as mean
FROM user_data
GROUP BY group_id, funnel_step_id;

Configuring Your Metric

When creating a metric, pick your metric source and decide whether you want to filter the metric source data. You can press the reload button to validate that any filters are working as expected. Then, depending on the metric type, you will configure the inputs to the metric as well as settings specific to experiment analysis.

Example Metric

Some common configurations you can use are below:

Filters

Filters are a powerful way to narrow down data from a Metric Source. They let experimenters create metrics that answer targeted questions as needed, without having to write SQL, or modify the underlying Metric Source SQL and thereby change existing metrics.

These filters are pushed down as far as possible into the metric source query to minimize reads.
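
As a rough sketch, a count metric filtered to a hypothetical event_name = 'checkout' condition would behave like the query below; the column name, value, and exact generated SQL are illustrative, not the actual output.

-- Sketch only: a count metric with a filter pushed down into the
-- Metric Source query to limit the rows read.
SELECT
user_id,
COUNT(1) as value
FROM source_data
WHERE event_name = 'checkout' -- hypothetical filter field and value
GROUP BY user_id;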

Winsorization

Winsorization sets a percentile-based ceiling and/or floor for user-level metric values at the time of analysis. For example, in a 30-day experiment looking at revenue, this would consider each user's total 30-day revenue. If the winsorization threshold were 99.9%, the analysis would identify the 99.9th percentile of user-level revenue and cap any user's revenue at that value if theirs was higher. This is useful for controlling extreme outliers from bugs or power users, which might unduly impact a group's average result.
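
Conceptually, upper-bound winsorization at the 99.9th percentile looks like the sketch below. The percentile function and syntax vary by warehouse, and the SQL Statsig generates may differ.

-- Sketch only: cap each user's value at the 99.9th percentile
-- of all user-level values (upper-bound winsorization).
WITH caps AS (
SELECT PERCENTILE_CONT(0.999) WITHIN GROUP (ORDER BY value) as upper_bound
FROM user_level_data
)
SELECT
u.user_id,
LEAST(u.value, c.upper_bound) as winsorized_value
FROM user_level_data u
CROSS JOIN caps c;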

Time windows

Some metrics (particularly user metrics and funnels) have time windows built into the metric definition. For example, a window can restrict the analysis to only consider an event or metric value within the first week after a user's exposure, or after their first funnel event.

A secondary toggle for matured data excludes users from the result set until their window has fully elapsed, which avoids shifting denominators as the experiment matures. For example, if your funnel is set to include the first week since exposure and this toggle is enabled, a user only 5 days from their first exposure would be excluded from the analysis until they reach their 7th day.
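
As a sketch, a 7-day window with the matured-data toggle enabled could translate into logic like the following; exposure_time and event_time are hypothetical columns, and interval syntax varies by warehouse.

-- Sketch only: only count events in the first 7 days after exposure,
-- and exclude users whose 7-day window has not yet elapsed (baking).
SELECT
user_id,
SUM(value_column) as value
FROM source_data
WHERE event_time >= exposure_time
AND event_time < exposure_time + INTERVAL 7 DAY
AND exposure_time + INTERVAL 7 DAY <= CURRENT_TIMESTAMP
GROUP BY user_id;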

Breakdowns

Some metric types can include a dimension-based breakdown. This is very useful if you frequently want to see how the metric was influenced across high-level cuts like country or product category. This does increase the cost of calculation, as each dimension is functionally another metric for the purposes of analysis.
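
In effect, a breakdown adds the dimension to the user-level aggregation (and carries it through to the group-level results), roughly like the sketch below, where dimension_column stands in for your metadata field.

-- Sketch only: a count metric broken down by a metadata dimension.
SELECT
user_id,
dimension_column, -- e.g. country or product category
COUNT(1) as value
FROM source_data
GROUP BY user_id, dimension_column;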

Thresholds

Sum and count metrics can be configured to use a threshold. When a threshold is set, the metric measures whether each user's sum or count surpassed the given value. This is usually combined with cohort windows to create a metric like "% of users who spent more than $100 in their first week".
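
A minimal sketch of a threshold metric, assuming a hypothetical revenue column and a $100 threshold:

-- Sketch only: flag whether each user's total spend exceeded $100;
-- usually paired with a cohort window such as the first week after exposure.
SELECT
user_id,
IF(SUM(revenue) > 100, 1, 0) as value
FROM source_data
GROUP BY user_id;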

Example Metrics

Filtered Revenue Metric

This metric would calculate the total revenue from checkout events on clothing and electronics items costing less than $1,000. User-level values would be winsorized at the 99.9th percentile, and the results would be grouped by two selected metadata columns by default.

Sum Metric
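
Setting aside winsorization (applied at analysis time) and the breakdown columns, the user-level aggregation would look roughly like this sketch; event_name, category, and price are placeholders for fields in your Metric Source.

-- Sketch only: revenue from checkout events on clothing/electronics
-- items under $1,000; winsorization is applied later, at analysis time.
SELECT
user_id,
SUM(revenue) as value
FROM source_data
WHERE event_name = 'checkout'
AND category IN ('clothing', 'electronics')
AND price < 1000
GROUP BY user_id;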

Ratio Metric

This metric would calculate the ratio of total checkout events to the number of distinct users in the experiment who ever visited.

Ratio Metric
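
A sketch of the user-level stage, assuming hypothetical 'checkout' and 'visit' event names; the group-level result is then Sum(numerators)/Sum(denominators) as shown earlier.

-- Sketch only: checkout events per distinct user who ever visited.
SELECT
user_id,
SUM(IF(event_name = 'checkout', 1, 0)) as numerator,
MAX(IF(event_name = 'visit', 1, 0)) as denominator
FROM source_data
GROUP BY user_id;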

Activation Metric

This metric would calculate whether a user ever visited during the span of the experiment.

Activation Metric

Week 2 Retention Metric

This metric would calculate the rate at which exposed users were active in the 2nd week since their exposure, a common pattern for signup experiments.

Since the metric is configured to wait for users to reach the window, users who were exposed < 14 days ago would not be included in the experiment results on a given day.

Retention Metric
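
A rough sketch of the underlying logic, assuming hypothetical exposure_time and event_time columns; the actual generated SQL will differ.

-- Sketch only: 1/0 flag for activity in days 7-14 after exposure,
-- excluding users whose 14-day window has not yet elapsed.
SELECT
user_id,
MAX(IF(event_time >= exposure_time + INTERVAL 7 DAY
AND event_time < exposure_time + INTERVAL 14 DAY, 1, 0)) as value
FROM source_data
WHERE exposure_time + INTERVAL 14 DAY <= CURRENT_TIMESTAMP
GROUP BY user_id;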

Funnel Metric

This funnel metric would calculate the overall and step-level conversion for users going from a visit -> cart view -> checkout at a unique user level. Each event would only count if the user had already completed the previous steps in order.

Since the calculation window is set to 7 days, a user would have 7 days to complete the funnel from the time of their first visit.

Funnel Metric

When Analysis Units and Assignment Units Are Different

Analysis with Different ID

Viewing your Metric

With the metric set up, you can load, or schedule recurring loads of, a timeseries view that tracks the metric's population-level value and aids in debugging or identifying data issues. This can also be used to calculate the projected topline impact of releasing an experiment.

Timeseries

From there, you can start using the metric in experiments.