Topline and Projected Impact
How Statsig estimates the topline impact of an experiment on company metrics by scaling experiment lift to your total addressable user base.
The topline impact is the average daily effect that an experiment has on the overall metric value between two groups. This is the actual daily impact to a metric resulting from running the experiment, measured across the two groups being evaluated. The projected launch impact is an estimate of the daily impact expected in the metric measured globally if the test group is launched to all users (beyond those in the experiment). Statsig computes this impact relative to the expected baseline value of the metric without the experiment running.
Statsig shows topline impact and projected launch impact in both absolute and relative units. Neither calculation applies CUPED. CUPED adjusts results using pre-exposure data, which is the same data that topline metrics are measured as a change from, so applying it here would double-count that adjustment.
Example: Consider an experiment with a Control group of 1000 users and a Test group of another 1000 users that ran for 30 days. For an event_count metric, the Experiment Delta is +1.0 events per user (abs). The Topline Impact for this metric is +33.33 events per day (abs), that is, 1.0 events per user × 1000 test users ÷ 30 days.
Computing topline impact
Statsig computes the topline impact over the total duration of the experiment, which gives the most accurate estimate and a tight confidence interval. The calculation depends on whether the metric represents an absolute quantity or a ratio:
Count and sum metrics (event_count, sum)
Statsig derives the absolute topline impact directly from the experiment results. It depends on the difference in means between test and control, and the average number of users in the test group per day.
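As a minimal sketch of the count and sum case, assume the absolute topline impact is the difference in per-user means multiplied by the average number of test users per day (test users divided by experiment duration). The function name, parameter names, and the per-user means of 6.0 and 5.0 are illustrative assumptions, not part of Statsig's API. The numbers otherwise match the example above.

```python
def topline_impact_abs(mean_test, mean_control, test_users, days):
    """Average daily absolute impact for a count or sum metric.

    Hypothetical helper: assumes impact = (difference in per-user means)
    * (average number of test users per day).
    """
    if days <= 0:
        raise ValueError("days must be positive")
    delta_per_user = mean_test - mean_control
    avg_test_users_per_day = test_users / days
    return delta_per_user * avg_test_users_per_day


# Illustrative per-user means chosen so that the delta is +1.0 events per user.
impact = topline_impact_abs(mean_test=6.0, mean_control=5.0,
                            test_users=1000, days=30)
print(round(impact, 2))  # 33.33
```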
Ratio and mean metrics
To derive the topline impact on a ratio metric, Statsig first determines the impact on the numerator (X) and denominator (Y) separately. The topline impact is the current value of the ratio metric minus the baseline value obtained by subtracting the numerator and denominator impacts:
Where the baseline value is the expected value of the topline metric if the experiment wasn't running:
Statsig computes the relative impact for ratio metrics by dividing the absolute impact by the baseline value:
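The description above can be sketched in code as follows. This is a reconstruction from the prose, not Statsig's implementation: it assumes X and Y are the current overall numerator and denominator values, and that the baseline is obtained by subtracting the numerator and denominator impacts from each. All names and input values are hypothetical.

```python
def ratio_topline_impact(x_total, y_total, x_impact, y_impact):
    """Return (baseline, absolute impact, relative impact) for a ratio metric.

    Assumed reading of the text:
      baseline = (X - dX) / (Y - dY)
      absolute = X / Y - baseline
      relative = absolute / baseline
    """
    y_base = y_total - y_impact
    if y_total == 0 or y_base == 0:
        raise ValueError("denominator must be non-zero")
    current = x_total / y_total
    baseline = (x_total - x_impact) / y_base
    if baseline == 0:
        raise ValueError("baseline must be non-zero for a relative impact")
    absolute = current - baseline
    return baseline, absolute, absolute / baseline


# Made-up inputs: the experiment adds 50 to the numerator, 0 to the denominator.
baseline, absolute, relative = ratio_topline_impact(1050.0, 500.0, 50.0, 0.0)
print(f"baseline={baseline:.2f} absolute={absolute:.2f} relative={relative:.1%}")
```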
Computing projected launch impact
Statsig uses the layer allocation of the experiment and the size of the test group to estimate a scaling factor m, which represents the increase in absolute impact expected when the test group is launched.
Statsig calculates the launch factor over a rollup window as
to accommodate changes in allocation during the experiment.
The targeting gate isn't factored in. The projected impact calculation assumes that the targeting gate remains the same after the experiment is launched.
Count and sum metrics (event_count, event_dau, sum)
For count and sum metrics, the projected absolute impact is the current topline impact scaled by m. For example, consider an experiment running with 50% layer allocation and a 50/50 test/control split, so that 25% of all users are in the test group and m = 4. If the topline impact is currently +10 events per day, then launching the experiment would result in +40 events per day. If the allocation changed during the experiment, Statsig uses a weighted average based on historical allocations.
The relative projected impact is the expected percentage change in the topline metric, relative to the baseline value of the metric without the experiment running.
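A minimal sketch of the count and sum case, assuming a constant allocation so that m is the reciprocal of the share of all users in the test group. This is consistent with the 25% example giving a fourfold scale-up, but the weighted handling of allocation changes is not modelled. The overall metric total of 1010 events per day and all function names are hypothetical.

```python
def launch_factor(layer_allocation, test_share):
    """Scaling factor m under a constant allocation (assumed form: 1 / exposed share)."""
    exposed = layer_allocation * test_share
    if not 0 < exposed <= 1:
        raise ValueError("exposed share must be in (0, 1]")
    return 1.0 / exposed


def projected_count_impact(topline_impact, metric_total, m):
    """Return (projected absolute impact, projected relative impact).

    Assumes the baseline is the overall metric value minus the topline impact,
    i.e. the expected value without the experiment running.
    """
    baseline = metric_total - topline_impact
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    projected_abs = m * topline_impact
    return projected_abs, projected_abs / baseline


m = launch_factor(layer_allocation=0.5, test_share=0.5)
projected_abs, projected_rel = projected_count_impact(10.0, 1010.0, m)
print(m, projected_abs, f"{projected_rel:.1%}")
```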
Ratio and mean metrics
The projected impact of ratio metrics depends on the numerator and denominator impacts, using the same scaling factor m to obtain the projected impact for each term:
Where the first term represents the projected metric value after launch.
Finally, the projected relative impact of a ratio metric is the projected absolute impact divided by the baseline value of the ratio:
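The ratio case can be sketched the same way. This reconstruction assumes the projected metric value after launch is the baseline numerator and denominator each increased by m times their topline impacts, and that the projected absolute impact is that value minus the baseline ratio. The names and inputs are hypothetical, and the exact expression Statsig uses may differ.

```python
def projected_ratio_impact(x_total, y_total, x_impact, y_impact, m):
    """Return (projected absolute impact, projected relative impact) for a ratio metric.

    Assumed reading of the text:
      baseline  = (X - dX) / (Y - dY)
      projected = (X - dX + m*dX) / (Y - dY + m*dY)
      absolute  = projected - baseline
      relative  = absolute / baseline
    """
    x_base = x_total - x_impact
    y_base = y_total - y_impact
    y_projected = y_base + m * y_impact
    if y_base == 0 or y_projected == 0:
        raise ValueError("denominator must be non-zero")
    baseline = x_base / y_base
    if baseline == 0:
        raise ValueError("baseline must be non-zero for a relative impact")
    projected_value = (x_base + m * x_impact) / y_projected
    absolute = projected_value - baseline
    return absolute, absolute / baseline


# Made-up inputs with m = 4, as in a 25% exposed share.
absolute, relative = projected_ratio_impact(1050.0, 500.0, 50.0, 0.0, m=4.0)
print(f"absolute={absolute:.2f} relative={relative:.1%}")
```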
Confidence intervals
Statsig computes the confidence intervals for topline and projected impact using the same method as confidence intervals for experiment deltas. For the absolute impact of count and sum metrics, the variance calculation is simply a linear combination of the test and control variances:
For projected launch impact:
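One way to write these two variances, as a sketch rather than Statsig's exact expressions: assume the test and control groups are independent, and treat the average number of test users per day and the launch factor m as constants. The symbols are assumptions introduced here: N_t is the number of test users, D the experiment duration in days, and \bar{X}_t and \bar{X}_c the per-user means in test and control.

```latex
% Absolute topline impact of a count or sum metric (assumed form)
\mathrm{Var}\left(\Delta_{\text{topline}}\right)
  = \left(\frac{N_t}{D}\right)^{2}
    \left[\mathrm{Var}\left(\bar{X}_t\right) + \mathrm{Var}\left(\bar{X}_c\right)\right]

% Projected launch impact: the same quantity scaled by the launch factor m
\mathrm{Var}\left(\Delta_{\text{launch}}\right)
  = m^{2}\,\mathrm{Var}\left(\Delta_{\text{topline}}\right)
```

Both lines follow from the identity Var(cZ) = c² Var(Z) for a constant c, together with the variance of a difference of independent means.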
For ratio metrics and relative impacts, Statsig calculates the variance using the Delta method. This accounts for the correlation between numerator and denominator terms, using Taylor expansion to linearize expressions containing non-linear combinations of experiment variables.
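To illustrate the Delta method in general terms, the sketch below applies the standard first-order approximation for the variance of a ratio of two means, including the covariance term. It is a generic textbook form with hypothetical names and made-up inputs, not the specific expression Statsig uses for any particular metric.

```python
import math


def delta_method_ratio_variance(mean_x, mean_y, var_x, var_y, cov_xy):
    """First-order Delta method approximation of Var(mean_x / mean_y).

    var_x, var_y and cov_xy are the variances and covariance of the two
    sample means (already divided by the sample size).
    """
    if mean_y == 0:
        raise ValueError("mean_y must be non-zero")
    r = mean_x / mean_y
    return (var_x - 2.0 * r * cov_xy + r * r * var_y) / (mean_y ** 2)


# Made-up inputs for illustration only.
var_ratio = delta_method_ratio_variance(2.0, 4.0, 0.04, 0.01, 0.005)
half_width = 1.96 * math.sqrt(var_ratio)  # approximate 95% interval half-width
print(f"variance={var_ratio:.8f} half_width={half_width:.4f}")
```

A positive covariance between numerator and denominator reduces the variance of the ratio, which is why ignoring the correlation can overstate the width of the interval.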
For example, the variance in the relative impact of a count metric is given by: