CUPED
How Statsig Warehouse Native uses CUPED variance reduction to improve experiment sensitivity by adjusting for pre-experiment user behavior on metrics.
CUPED: Controlled-experiment Using Pre-Existing Data
CUPED (Controlled-experiment Using Pre-Existing Data) is a technique that uses user information from before an experiment to reduce variance and increase confidence in experimental metrics. At Statsig, this pre-experiment data is defined as the 7 days before each user's exposure, rather than a fixed window before the experiment starts for all users. This helps to debias experiments that have meaningful pre-exposure bias (for example, groups that were randomly different before any treatment was applied).
The Cloud product uses a 7-day window for CUPED calculation. For Warehouse Native customers, a 7-day window is recommended, but you can customize it to any length.
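The per-unit window described above can be illustrated with a small Python sketch. The function name `pre_exposure_total`, the `(timestamp, value)` event shape, and the choice to include the window start and exclude the exposure instant are all assumptions made for illustration; they are not Statsig's implementation.

```python
from datetime import datetime, timedelta


def pre_exposure_total(events, first_exposure, window_days=7):
    """Sum metric values logged in the window_days before a unit's first exposure.

    events: iterable of (timestamp, value) pairs for one unit (hypothetical shape).
    The window is anchored to this unit's own exposure time, not the experiment start.
    """
    window_start = first_exposure - timedelta(days=window_days)
    # Assumed boundary handling: window start inclusive, exposure time exclusive.
    return sum(value for ts, value in events if window_start <= ts < first_exposure)


exposure = datetime(2024, 3, 10, 12, 0)
events = [
    (datetime(2024, 3, 1, 9, 0), 5.0),   # more than 7 days before exposure: ignored
    (datetime(2024, 3, 4, 9, 0), 2.0),   # inside the 7-day window
    (datetime(2024, 3, 9, 18, 0), 3.0),  # inside the 7-day window
    (datetime(2024, 3, 11, 8, 0), 7.0),  # after exposure: ignored
]

seven_day_total = pre_exposure_total(events, exposure)
fourteen_day_total = pre_exposure_total(events, exposure, window_days=14)
```

Passing a different `window_days` mirrors the customizable window length available to Warehouse Native customers.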
For more information, go to the Variance Reduction page. For an in-depth look at the methodology, refer to CURE by Statsig.
CUPED for simple aggregations
The methodology for simple aggregations is described in the original Microsoft paper and in Statsig's in-depth article on the technique.
The Cloud product uses stratification alongside CUPED to account for users who may not have pre-experiment data. Users are grouped into strata based on the pre-experiment information available for them. Treatment and control effects are first estimated within each stratum, then aggregated to produce an overall result. The standard difference-in-means and variance estimation are then applied. This approach retains users with missing pre-experiment data while still benefiting from variance reduction where applicable.
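As a rough illustration of the basic adjustment for a simple aggregation, the sketch below applies the textbook CUPED formula $Y_{cv} = Y - \theta (X - \bar{X})$ with $\theta = Cov(Y, X) / Var(X)$. The helper names and toy data are hypothetical, and the stratification step for users without pre-experiment data is deliberately left out.

```python
from statistics import mean


def covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    mean_a, mean_b = mean(a), mean(b)
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / (len(a) - 1)


def cuped_adjust(y, x):
    """Return CUPED-adjusted values y - theta * (x - mean(x)).

    y: in-experiment metric values, x: pre-experiment values for the same units.
    """
    theta = covariance(y, x) / covariance(x, x)
    mean_x = mean(x)
    return [yi - theta * (xi - mean_x) for yi, xi in zip(y, x)]


# Toy data: the in-experiment metric is strongly correlated with the pre-period.
pre = [1.0, 2.0, 3.0, 4.0, 5.0]
post = [2.1, 3.9, 6.2, 7.8, 10.0]
adjusted = cuped_adjust(post, pre)

variance_before = covariance(post, post)
variance_after = covariance(adjusted, adjusted)
```

The adjustment leaves the mean of the metric unchanged and shrinks its variance in proportion to how well the pre-experiment values predict the in-experiment values.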
CUPED for ratio metrics
The Microsoft paper also gives details on how to implement CUPED for a different analysis unit (Appendix B). Statsig extends this to work for ratio metrics, where each experiment unit has a numerator and a denominator. The variance reduction process works by finding the variance of the experiment data, the variance of the pre-experiment data, and the covariance between the two.
Denote the numerator, denominator, pre-experiment numerator, and pre-experiment denominator of a unit as $Y$, $N$, $X$, and $M$, respectively. Using the CUPED-reduced variance formula,
$$
\mathrm{Var}\left(\frac{Y_{cv}}{N_{cv}}\right) = \mathrm{Var}\left(\frac{Y}{N}\right) + \theta^2\, \mathrm{Var}\left(\frac{X}{M}\right) - 2\theta\, \mathrm{Cov}\left(\frac{Y}{N}, \frac{X}{M}\right)
$$
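where the optimal $\theta$ is found as
$$
\theta = \frac{\mathrm{Cov}\left(\frac{Y}{N}, \frac{X}{M}\right)}{\mathrm{Var}\left(\frac{X}{M}\right)}
$$
which, after linearizing each ratio with the delta method, expands to
$$
\theta = \frac{\mathrm{Cov}\left(\frac{Y}{\mu_N}-\frac{\mu_Y N}{\mu^2_N}, \frac{X}{\mu_M}-\frac{\mu_X M}{\mu^2_M}\right)}{\mathrm{Var}\left(\frac{X}{\mu_M}-\frac{\mu_X M}{\mu^2_M}\right)}
$$
where $\mu$ denotes the mean of the subscripted quantity.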
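The expanded expression for $\theta$ can be sketched in Python by linearizing each ratio per unit and then taking ordinary sample covariances. The function names (`linearize_ratio`, `ratio_theta`, `reduced_variance`) and the toy data are hypothetical, and the sample means stand in for the $\mu$ terms; this is an illustration of the formula, not Statsig's pipeline.

```python
from statistics import mean


def covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    mean_a, mean_b = mean(a), mean(b)
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / (len(a) - 1)


def linearize_ratio(num, den):
    """Per-unit linearized term num/mu_den - mu_num * den / mu_den**2."""
    mu_num, mu_den = mean(num), mean(den)
    return [p / mu_den - mu_num * q / mu_den**2 for p, q in zip(num, den)]


def ratio_theta(y, n, x, m):
    """Optimal theta: Cov(post, pre) / Var(pre) on the linearized terms."""
    post, pre = linearize_ratio(y, n), linearize_ratio(x, m)
    return covariance(post, pre) / covariance(pre, pre)


def reduced_variance(y, n, x, m):
    """Var(post) + theta^2 * Var(pre) - 2 * theta * Cov(post, pre)."""
    post, pre = linearize_ratio(y, n), linearize_ratio(x, m)
    theta = ratio_theta(y, n, x, m)
    return (covariance(post, post) + theta**2 * covariance(pre, pre)
            - 2 * theta * covariance(post, pre))


# Toy per-unit data: Y, N are in-experiment; X, M are pre-experiment.
y = [3.0, 6.0, 3.0, 9.0]
n = [1.0, 2.0, 2.0, 3.0]
x = [2.0, 5.0, 3.0, 8.0]
m = [1.0, 2.0, 2.0, 3.0]

theta = ratio_theta(y, n, x, m)
variance_after = reduced_variance(y, n, x, m)
```

With the optimal $\theta$, the reduced variance equals the unadjusted variance minus $Cov^2 / Var$ of the pre-experiment term, so it can never exceed the unadjusted value on the data used to fit $\theta$.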
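From this, the adjusted ratio for each group takes the standard CUPED form $\frac{Y_{cv}}{N_{cv}} = \frac{Y}{N} - \theta\left(\frac{X}{M} - \mathbb{E}[R]\right)$, where $R = \frac{X}{M}$ is the pre-experiment ratio.
Because $\mathbb{E}[R]$ is difficult to derive and the expectation term is the same for both groups, Statsig substitutes $\mathbb{E}[R]$ with the control group's pre-experiment ratio $\frac{X_{c}}{M_{c}}$, transforming the formula above to: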
$$
\frac{Y_{cv}(control)}{N_{cv}(control)}=\frac{Y(control)}{N(control)} $$
$$
\begin{aligned}
\frac{Y_{cv}(test)}{N_{cv}(test)} &= \frac{Y(control)}{N(control)} - \left(\frac{Y(control)}{N(control)} - \theta \frac{X(control)}{M(control)}\right) + \left(\frac{Y(test)}{N(test)} - \theta\frac{X(test)}{M(test)}\right) \\
&= \frac{Y(test)}{N(test)} - \theta\frac{X(test)}{M(test)} + \theta \frac{X(control)}{M(control)}
\end{aligned}
$$
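The two group-level formulas above translate directly into a few lines of Python. The function name `adjusted_ratios`, the dictionary layout with keys `Y`, `N`, `X`, `M`, and the numbers used are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def adjusted_ratios(control, test, theta):
    """Return (control, test) CUPED-adjusted ratios from group totals.

    Each group is a dict with totals Y, N (in-experiment numerator and
    denominator) and X, M (pre-experiment numerator and denominator).
    """
    pre_control = control["X"] / control["M"]
    pre_test = test["X"] / test["M"]
    # Control is unchanged because its pre-experiment ratio replaces E[R].
    adj_control = control["Y"] / control["N"]
    # Test is shifted by theta times the pre-experiment gap between groups.
    adj_test = test["Y"] / test["N"] - theta * pre_test + theta * pre_control
    return adj_control, adj_test


control = {"Y": 200.0, "N": 100.0, "X": 150.0, "M": 100.0}
test = {"Y": 230.0, "N": 100.0, "X": 160.0, "M": 100.0}

adj_control, adj_test = adjusted_ratios(control, test, theta=0.5)
raw_delta = test["Y"] / test["N"] - control["Y"] / control["N"]
adjusted_delta = adj_test - adj_control
```

In this toy case the test group already had a higher pre-experiment ratio (1.6 versus 1.5), so the adjusted difference is smaller than the raw difference by $\theta \times 0.1$.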
Using the optimal $\theta$, Statsig reduces group-level variance by plugging the parameter back in to calculate the adjustment. A $\theta$ computed across groups doesn't necessarily reduce the variance of any one group, or the sum of variances across all groups, but in most cases it does: in simulation, 98.3% of metrics saw a decrease in variance with CUPED.
Statsig uses CUPED variance when all of the following conditions hold:
- The core assumptions of the CUPED model are satisfied; these can be violated by rounding error or other data artifacts
- $\mathbb{E}[\hat{X}] = \mathbb{E}[X]$
- The pooled variance of the adjusted population across groups is less than the variance of the unadjusted population
- Enough units have pre-experiment values (more than 100)
- A large enough percentage of units have pre-experiment values (more than 5%)
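The checklist above can be read as a single boolean gate. The sketch below is a hypothetical helper, not Statsig's code: the function name and parameters are invented for illustration, the first two conditions are passed in as precomputed flags because the source does not say how they are tested, and only the thresholds (100 units, 5%) come from the list.

```python
def should_use_cuped(
    assumptions_hold,
    pre_mean_unbiased,
    adjusted_pooled_variance,
    unadjusted_variance,
    units_with_pre_data,
    total_units,
    min_units=100,
    min_share=0.05,
):
    """Return True only when every listed condition holds."""
    if total_units <= 0:
        return False
    return (
        assumptions_hold                                   # core model assumptions
        and pre_mean_unbiased                              # E[X_hat] = E[X]
        and adjusted_pooled_variance < unadjusted_variance
        and units_with_pre_data > min_units                # more than 100 units
        and units_with_pre_data / total_units > min_share  # more than 5% of units
    )


# Plenty of units with pre-experiment data and a lower adjusted variance.
use_cuped = should_use_cuped(True, True, 0.8, 1.0, 5_000, 20_000)

# Too few units with pre-experiment data, so the unadjusted variance is kept.
fallback = should_use_cuped(True, True, 0.8, 1.0, 80, 1_000)
```

When the gate returns `False`, the natural reading of the list is that the unadjusted variance is used instead.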