Variance Reduction

Overview of variance reduction techniques in Statsig experiments, including CUPED, stratified sampling, and regression adjustment for higher sensitivity.

Variance measures the amount of noise in a metric or in experiment results. Higher variance produces wider confidence intervals, so a larger sample is needed to consistently observe a statistically significant result for the same effect size. Reducing variance shortens experiment run times because fewer samples are needed.

Statsig uses a form of CUPED based on a 2013 Microsoft paper (Deng, Xu, Kohavi, & Walker). Statsig automatically applies CUPED to experiments and runs it for the topline results on key metrics in Pulse. CUPED produces a substantial variance reduction for the large majority of metrics where Statsig can apply it. For more details, refer to the launch post for CUPED.
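To make the link between variance and sample size concrete, the sketch below uses the standard two-sample z-test approximation for the number of users per group. It is a textbook formula rather than Statsig's internal power calculation, and the function name, the default alpha and power values, and the example numbers are all assumptions chosen for illustration.

```python
from statistics import NormalDist


def required_sample_per_group(variance, min_effect, alpha=0.05, power=0.8):
    """Approximate users per group for a two-sided, two-sample z-test.

    Hypothetical helper: assumes equal group sizes and equal variance
    in both groups.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return 2 * (z_alpha + z_beta) ** 2 * variance / min_effect ** 2


# Illustrative numbers only: same effect size, variance cut in half.
baseline = required_sample_per_group(variance=100.0, min_effect=1.0)
reduced = required_sample_per_group(variance=50.0, min_effect=1.0)
print(f"baseline: {baseline:.0f} users per group")
print(f"after variance reduction: {reduced:.0f} users per group")
```

Because the required sample scales linearly with variance in this approximation, halving the variance halves the sample needed to detect the same effect.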

CUPED - Controlled-experiment Using Pre-Existing Data

CUPED is a technique that uses user information from before an experiment to reduce variance and increase confidence in experimental metrics. At Statsig, this pre-experiment data covers the 7 days before each user's exposure, rather than a fixed window before the experiment starts for all users. Adjusting for pre-exposure data also helps reduce bias in experiments where the groups happened to differ before any treatment was applied.
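The core adjustment can be sketched as follows. This is the textbook single-covariate form from the Deng, Xu, Kohavi, & Walker paper, not necessarily Statsig's exact implementation; the function name `cuped_adjust` and the simulated data are assumptions made for illustration.

```python
import numpy as np


def cuped_adjust(metric, pre_metric):
    """Return CUPED-adjusted values: Y - theta * (X - mean(X)).

    metric:     per-user values observed during the experiment (Y)
    pre_metric: the same users' pre-exposure values (X)
    """
    metric = np.asarray(metric, dtype=float)
    pre_metric = np.asarray(pre_metric, dtype=float)
    # theta = cov(X, Y) / var(X) minimizes the variance of the adjusted metric.
    theta = np.cov(pre_metric, metric, ddof=1)[0, 1] / np.var(pre_metric, ddof=1)
    return metric - theta * (pre_metric - pre_metric.mean())


# Simulated data (assumption): in-experiment values correlate with pre-exposure values.
rng = np.random.default_rng(0)
pre = rng.normal(10.0, 3.0, size=5000)
post = 0.8 * pre + rng.normal(0.0, 1.0, size=5000)

adjusted = cuped_adjust(post, pre)
print("variance before:", post.var(ddof=1))
print("variance after: ", adjusted.var(ddof=1))
```

The adjustment leaves the overall mean unchanged and removes the part of the variance that the pre-exposure data explains. In an experiment, theta and the covariate mean would be computed on the pooled users from all groups so that the same adjustment is applied to treatment and control.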

The Cloud product uses stratification alongside CUPED to account for users who may not have pre-experiment data. Statsig groups users into strata based on available pre-experimentation information. Statsig first estimates treatment and control effects within each stratum, then aggregates them to produce an overall result. Statsig then applies the standard difference-in-means and variance estimation. This approach retains users with missing pre-experiment data while still providing variance reduction where applicable.
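A minimal sketch of a stratified estimate is shown below, assuming two strata (users with and without pre-experiment data) and weighting each stratum by its share of users. The weighting scheme, the function name `stratified_effect`, and the toy data are assumptions for illustration; the source does not specify how Statsig defines strata or aggregates them.

```python
import numpy as np


def stratified_effect(values, is_treatment, stratum):
    """Estimate the treatment effect per stratum, then combine.

    Each stratum's difference in means is weighted by that stratum's
    share of all users (an assumed weighting scheme).
    """
    values = np.asarray(values, dtype=float)
    is_treatment = np.asarray(is_treatment, dtype=bool)
    stratum = np.asarray(stratum)
    n_total = len(values)

    effect, variance = 0.0, 0.0
    for s in np.unique(stratum):
        in_s = stratum == s
        treated = values[in_s & is_treatment]
        control = values[in_s & ~is_treatment]
        weight = in_s.sum() / n_total
        effect += weight * (treated.mean() - control.mean())
        # Variance of a difference in means within the stratum.
        variance += weight ** 2 * (
            treated.var(ddof=1) / len(treated) + control.var(ddof=1) / len(control)
        )
    return effect, variance


# Toy data (assumption): four users per stratum, two in each group.
values = [12, 14, 10, 12, 3, 5, 2, 4]
is_treatment = [True, True, False, False, True, True, False, False]
stratum = ["has_pre"] * 4 + ["no_pre"] * 4

effect, variance = stratified_effect(values, is_treatment, stratum)
print(effect, variance)
```

Users without pre-experiment data stay in the analysis through their own stratum instead of being dropped.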

Winsorization

Winsorization is another technique for reducing noise by managing the influence of outliers.

Winsorization computes a chosen percentile P_x of a metric and caps all values above P_x at P_x. This reduces the influence of extreme outliers caused by factors such as logging errors or bad actors.
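The capping step described above can be sketched in a few lines. The function name `winsorize_upper` and the percentile used in the example are assumptions for illustration; the source does not state which percentile Statsig uses.

```python
import numpy as np


def winsorize_upper(values, percentile):
    """Cap every value above the given percentile at that percentile."""
    values = np.asarray(values, dtype=float)
    cap = np.percentile(values, percentile)
    return np.minimum(values, cap)


# Toy data (assumption): one extreme value, such as a logging error.
raw = np.array([1.0, 2.0, 3.0, 4.0, 1000.0])
capped = winsorize_upper(raw, percentile=75)
print(capped)  # the 1000.0 outlier is capped at the 75th percentile, 4.0
```

Only the upper tail is capped here, matching the description above. Values below the threshold are left unchanged.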

Metric selection

The metrics you use can significantly influence the sensitivity of your analysis. Winsorization and CUPED, combined with techniques such as creating threshold-based flags, allow you to trade exact numbers for more statistical power. For more information, refer to the blog post on understanding and reducing variance.
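As a rough illustration of the trade-off, the sketch below compares a heavy-tailed metric with a threshold-based flag derived from it. The simulated revenue distribution, the threshold of 10.0, and the helper name `relative_standard_error` are all assumptions chosen for the example.

```python
import numpy as np


def relative_standard_error(values):
    """Standard error of the mean divided by the mean (lower is more sensitive)."""
    values = np.asarray(values, dtype=float)
    return values.std(ddof=1) / np.sqrt(len(values)) / values.mean()


# Simulated heavy-tailed metric (assumption), e.g. revenue per user.
rng = np.random.default_rng(1)
revenue = rng.lognormal(mean=1.0, sigma=2.0, size=10_000)

# Threshold-based flag: 1.0 if the user spent at least 10.0, else 0.0.
big_spender = (revenue >= 10.0).astype(float)

print("raw metric:", relative_standard_error(revenue))
print("flag metric:", relative_standard_error(big_spender))
```

The flag discards the exact amounts but has a smaller relative standard error on this simulated data, so the same relative change is easier to detect.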

Deng, Xu, Kohavi, & Walker: seminal paper on using CUPED for online controlled experiments
Booking.com: CUPED in practice: blog post on the theory and practice of CUPED
Improving the Sensitivity of Online Controlled Experiments: Case Studies at Netflix
