Metric Deltas
How Statsig Warehouse Native computes metric deltas to compare absolute and relative differences between experiment groups in scorecards.
Computing metric deltas
A metric delta refers to the difference in metric values between two groups, by default the test and control groups. This is the impact measured when evaluating experiment results. To account for the different number of users (or units) in each group, Statsig compares the mean metric value per user, not the total.
Selecting groups
All deltas are defined as the difference between a "treatment" group and a presumably unchanged "control" group. However, Statsig allows comparison between any two groups.
Pulse provides two different metric deltas. The absolute delta is the difference between the two means:
$$
\Delta \overline{X}=\overline{X}_t-\overline{X}_c $$
Understanding the impact relative to the baseline value of the metric is often helpful. For example, an absolute delta of +1 clicks/user has different meanings with a baseline value of 1 (+100% increase) vs. a baseline value of 100 (+1% increase). The relative delta is computed using the mean of the control group as the baseline:
$$
\Delta \overline{X} \%=\frac{\overline{X}_t-\overline{X}_c}{\overline{X}_c} \times 100 \% $$
If you reverse the order of group comparison in Pulse to be "control" vs "treatment", all deltas reverse and the direction of change inverts.
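The two formulas above can be sketched in a few lines of Python. This is an illustrative sketch only, not Statsig's implementation: the function name `metric_deltas` and the sample values are hypothetical, and the reversed call assumes the relative formula is applied with the groups swapped, so the former treatment mean becomes the baseline.

```python
from typing import Tuple


def metric_deltas(mean_test: float, mean_control: float) -> Tuple[float, float]:
    """Return (absolute delta, relative delta in percent) for two group means."""
    if mean_control == 0:
        # The relative delta divides by the baseline mean, so it is undefined here.
        raise ValueError("relative delta is undefined when the control mean is 0")
    absolute = mean_test - mean_control
    relative_pct = absolute / mean_control * 100.0
    return absolute, relative_pct


# The same +1 clicks/user absolute delta against two different baselines.
low_baseline = metric_deltas(2.0, 1.0)       # baseline of 1: a +100% change
high_baseline = metric_deltas(101.0, 100.0)  # baseline of 100: about a +1% change

# Swapping the comparison order flips the sign of both deltas.
# Assumption: the baseline also swaps, so the relative magnitude changes too.
reversed_order = metric_deltas(1.0, 2.0)     # (-1.0, -50.0)
```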
Computing means
Computing group means correctly is critical for obtaining meaningful metric deltas. The methodology for calculating metric means depends on the metric type.
Event count and sum metrics
These metrics represent totals: number of times an event occurs, sum of time spent, total purchase amount, and similar values. The mean is the average user-level total during the analysis period.
The mean value of the metric $X$ for a group is given by:
$$
\overline{X}=\frac{1}{N} \sum_{i=1}^N \sum_{d=1}^{n_i} X_{i, d} $$
where:
- $N$ is the number of users in the group
- $n_i$ is the number of days during the analysis period that user $i$ was in the experiment
- $X_{i,d}$ is the metric value for user $i$ on day $d$
The group mean includes only metric values recorded after the user was exposed to the experiment.
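As a rough sketch of this calculation, the function below sums each user's post-exposure values and averages over all exposed users. The data shapes, the name `count_sum_group_mean`, and the day-level rule that an event counts when its date is on or after the first exposure date are assumptions made for illustration, not a description of Statsig's pipeline.

```python
from datetime import date


def count_sum_group_mean(exposures, events):
    """Mean per-user total for one group.

    exposures: {user_id: first_exposure_date}
    events: iterable of (user_id, day, value) rows
    """
    if not exposures:
        raise ValueError("group has no exposed users")
    # Every exposed user counts toward N, including users with no events.
    totals = {user: 0.0 for user in exposures}
    for user, day, value in events:
        first_exposure = exposures.get(user)
        # Assumed rule: keep only rows dated on or after the first exposure.
        if first_exposure is not None and day >= first_exposure:
            totals[user] += value
    return sum(totals.values()) / len(totals)


exposures = {
    "a": date(2024, 1, 2),
    "b": date(2024, 1, 1),
    "c": date(2024, 1, 3),  # exposed but never fires the event
}
events = [
    ("a", date(2024, 1, 1), 5.0),  # before exposure, ignored
    ("a", date(2024, 1, 2), 3.0),
    ("b", date(2024, 1, 1), 1.0),
    ("b", date(2024, 1, 4), 2.0),
]
group_mean = count_sum_group_mean(exposures, events)  # (3 + 3 + 0) / 3 = 2.0
```

User "c" contributes zero to the sum but still counts in $N$, which is what makes this a per-user mean rather than a per-active-user mean.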
User accounting and Event User metrics (and legacy Event DAU)
Event User metrics set to "Daily Participation Rate" capture the number of distinct users that have the event each day. In Pulse results, these values are normalized by the number of days the user is in the experiment. This represents the probability that a user is daily active for that event (the daily participation rate). The group mean is given by:
$$
\overline{X}=\frac{1}{N} \sum_{i=1}^N \frac{1}{n_i} \sum_{d=1}^{n_i} X_{i, d} $$
where:
- $X_{i,d}$ takes the value 0 or 1 depending on whether user $i$ has the event on day $d$
The following user accounting metrics are computed the same way: DAU, WAU, MAU_28day, L7, L14, L28.
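A minimal sketch of the daily participation rate: each user's count of active days is divided by that user's own number of days in the experiment, and the per-user rates are then averaged. The input shapes and the name `daily_participation_rate` are hypothetical, and the sketch assumes every active day falls within the user's time in the experiment.

```python
def daily_participation_rate(active_days_by_user, days_in_experiment):
    """Group mean of per-user daily participation rates.

    active_days_by_user: {user_id: set of day indices on which the user had the event}
    days_in_experiment: {user_id: n_i, the number of days the user was in the experiment}
    """
    if not days_in_experiment:
        raise ValueError("group has no users")
    rates = []
    for user, n_days in days_in_experiment.items():
        active_days = len(active_days_by_user.get(user, set()))
        rates.append(active_days / n_days)  # normalize by this user's own n_i
    return sum(rates) / len(rates)


days_in_experiment = {"a": 4, "b": 2, "c": 5}
active_days_by_user = {"a": {1, 3}, "b": {1, 2}}  # "c" never has the event
rate = daily_participation_rate(active_days_by_user, days_in_experiment)
# Per-user rates are 2/4, 2/2 and 0/5, so the group mean is 0.5.
```

Normalizing per user means a user who joined late is not penalized for having fewer possible active days.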
For new user accounting (new_DAU, new_WAU, new_MAU_28day), Statsig counts users that are new xAU at some point during the analysis window. The group mean is given by:
$$
\overline{X}=\frac{1}{N} \sum_{i=1}^N \max \left(X_i\right) $$
Where $\max(X_i)$ is the maximum value of the new xAU metric for user $i$.
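The per-user maximum can be sketched as follows, assuming each user has a list of daily 0/1 flags for the new xAU metric over the analysis window. The function name and input shape are hypothetical.

```python
def new_user_group_mean(daily_flags_by_user):
    """Share of users flagged as new xAU on at least one day.

    daily_flags_by_user: {user_id: list of 0/1 daily flags in the analysis window}
    """
    if not daily_flags_by_user:
        raise ValueError("group has no users")
    # max() collapses the daily flags to 1 if the user was ever new, else 0.
    per_user = [max(flags, default=0) for flags in daily_flags_by_user.values()]
    return sum(per_user) / len(per_user)


flags = {"a": [0, 1, 0], "b": [0, 0], "c": []}
new_user_mean = new_user_group_mean(flags)  # one of three users was ever new
```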
event_dau metrics are now in legacy support only and Statsig no longer creates them for new events. Existing event_dau metrics continue to be available for any of your new experiments and continue to be computed daily. For all new events, create an event_user metric to measure daily active users.
Custom ratios, means, retention, and stickiness metrics
These metrics include click-through rate, average purchase value, sessions per user, and similar values. Each is obtained by dividing a numerator value, $X$, by a denominator value, $Y$. The mean value of a ratio metric $R$ for an experiment group is given by:
$$
\overline{R}=\frac{\frac{1}{N} \sum_{i=1}^N \sum_{d=1}^{n_i} X_{i, d}}{\frac{1}{N} \sum_{i=1}^N \sum_{d=1}^{n_i} Y_{i, d}}=\frac{\overline{X}}{\overline{Y}} $$
Where $N$ is the number of users in the experiment group that participate in the metric, i.e. have a non-zero denominator value. $X_{i,d}$ and $Y_{i,d}$ are the $X$ and $Y$ values for user $i$ on day $d$.
Different approaches exist for ratio metrics in experiments. Statsig selected this implementation because it's statistically sound and interpretable:
- $R$ is the ratio of two means of independent observations: a set of user-level $X$ values and a set of user-level $Y$ values. The central limit theorem can therefore be used to separately obtain the summary statistics of $X$ and $Y$.
- The group means are computed in the same way as the topline metric value, making the means easier to interpret and relate to the topline metric.
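To make the ratio-of-means definition concrete, the sketch below computes $\overline{X}/\overline{Y}$ over participating users and, for contrast, the mean of per-user ratios, which is one of the alternative approaches and generally gives a different number. The function names and sample data are hypothetical, and the inputs are assumed to be per-user totals already summed over the analysis period.

```python
def ratio_group_mean(user_totals):
    """Ratio of means: mean numerator divided by mean denominator.

    user_totals: {user_id: (numerator_total, denominator_total)}
    """
    # Only users with a non-zero denominator participate in the metric.
    participants = [(x, y) for x, y in user_totals.values() if y != 0]
    if not participants:
        raise ValueError("no users with a non-zero denominator")
    n = len(participants)
    mean_x = sum(x for x, _ in participants) / n
    mean_y = sum(y for _, y in participants) / n
    return mean_x / mean_y


def mean_of_user_ratios(user_totals):
    """Alternative shown only for comparison: average each user's own ratio."""
    ratios = [x / y for x, y in user_totals.values() if y != 0]
    return sum(ratios) / len(ratios)


# Clicks (numerator) and page views (denominator) per user.
user_totals = {"a": (2, 4), "b": (1, 20), "c": (0, 0)}
ctr = ratio_group_mean(user_totals)             # 3 clicks / 24 views = 0.125
per_user_avg = mean_of_user_ratios(user_totals)  # weights each user equally instead
```

The ratio of means matches what you would get by dividing total clicks by total views in the group, which is why it lines up with the topline metric.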
Event User one-time event
For custom event_user metrics with "One-Time Event" selected, Statsig computes how many users have the event at any time after entering the experiment. Statsig doesn't normalize this result by the number of days a user is in the experiment. The group mean is given by:
$$
\overline{X}=\frac{1}{N} \sum_{i=1}^N X_{i} $$
where:
- $N$ is the number of users in the group
- $X_{i}$ takes the value 0 or 1 depending on whether user $i$ has the event at any point after entering the experiment
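A short sketch of the one-time event mean, assuming you already have the list of exposed users and the set of users who had the event after exposure. Both inputs and the function name are hypothetical.

```python
def one_time_event_mean(exposed_users, users_with_event):
    """Fraction of exposed users who had the event at least once after exposure."""
    if not exposed_users:
        raise ValueError("group has no exposed users")
    # Each user contributes 1 or 0, with no normalization by days in the experiment.
    converted = sum(1 for user in exposed_users if user in users_with_event)
    return converted / len(exposed_users)


exposed_users = ["a", "b", "c", "d"]
users_with_event = {"a", "c", "z"}  # "z" is not in this group and is ignored
one_time_mean = one_time_event_mean(exposed_users, users_with_event)  # 2 of 4 users
```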