Variance Reduction
Overview of variance reduction techniques in Statsig Warehouse Native, including CUPED, stratified sampling, and regression adjustment for sensitivity.
Variance reduction
Variance measures the dispersion (or "noise") in a metric or in experiment results. Higher variance produces wider confidence intervals, so detecting a statistically significant result for the same effect size requires a larger sample. Lower variance reduces the required sample size, which shortens experiment run times. Statsig uses a form of CUPED based on a 2013 Microsoft paper (Deng, Xu, Kohavi, & Walker). Statsig automatically applies CUPED to experiments and runs it for the topline results on key metrics in Pulse, producing significant variance reduction for most metrics. Go to the CUPED launch post for more details.
CUPED - Controlled-experiment Using Pre-Existing Data
CUPED (Controlled-experiment Using Pre-Existing Data) uses user information from before an experiment to reduce variance and increase confidence in experimental metrics. In Statsig, the pre-experiment data window is the 7 days before each user's exposure, rather than a fixed window before the experiment starts. This helps debias experiments where groups were randomly different before any treatment was applied.
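As a rough illustration of the adjustment described above, the sketch below applies the textbook CUPED formula from the Deng et al. paper: each user's in-experiment value y is replaced by y - theta * (x - mean(x)), where x is the same user's pre-exposure value and theta = cov(x, y) / var(x). This is a minimal sketch under stated assumptions, not Statsig's implementation: the function name `cuped_adjust` and the synthetic data are hypothetical, and theta is estimated once on the pooled data.

```python
import numpy as np

def cuped_adjust(y, x):
    """Return (adjusted_y, theta) using the textbook CUPED covariate adjustment.

    y: in-experiment metric value per user
    x: the same metric per user from the pre-exposure window
       (assumed here to be available for every user)
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    var_x = np.var(x, ddof=1)
    if var_x == 0:
        # A constant covariate carries no information; leave y unchanged.
        return y.copy(), 0.0
    theta = np.cov(x, y, ddof=1)[0, 1] / var_x
    # Subtracting theta * (x - mean(x)) keeps the mean of y and removes
    # the part of its variance that x explains.
    return y - theta * (x - x.mean()), theta

# Synthetic example (hypothetical data): the in-experiment metric is
# strongly correlated with the pre-exposure metric.
rng = np.random.default_rng(0)
n = 10_000
pre = rng.normal(10.0, 3.0, size=n)
post = pre + rng.normal(0.0, 1.0, size=n)

adjusted, theta = cuped_adjust(post, pre)
print("variance before:", post.var(ddof=1))
print("variance after: ", adjusted.var(ddof=1))
```

The stronger the correlation between the pre-exposure and in-experiment values, the larger the variance reduction; with no correlation, theta is near zero and the metric is essentially unchanged.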
The Cloud product uses stratification alongside CUPED to account for users who may not have pre-experiment data. Users are grouped into strata based on the pre-experiment information available for them. Treatment and control effects are estimated within each stratum, then aggregated to produce an overall result. Standard difference-in-means and variance estimation are then applied. This approach retains users with missing pre-experiment data while still benefiting from variance reduction where applicable.
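The within-stratum estimation and aggregation steps can be sketched as a standard post-stratified difference in means. This is only an illustration under assumptions: the source does not specify how strata are defined or weighted, so the function name `stratified_effect`, the stratum labels, and the choice to weight each stratum by its share of users are all hypothetical. The sketch also assumes every stratum has at least two users in each group.

```python
import numpy as np

def stratified_effect(y, treated, stratum):
    """Return (effect, variance) for a post-stratified difference in means.

    Each stratum's treatment-minus-control difference is weighted by the
    stratum's share of all users (an assumed weighting scheme).
    """
    y = np.asarray(y, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    stratum = np.asarray(stratum)
    n = len(y)
    effect, variance = 0.0, 0.0
    for s in np.unique(stratum):
        in_s = stratum == s
        yt = y[in_s & treated]    # treatment users in this stratum
        yc = y[in_s & ~treated]   # control users in this stratum
        w = in_s.sum() / n        # stratum weight
        effect += w * (yt.mean() - yc.mean())
        variance += w**2 * (yt.var(ddof=1) / len(yt) + yc.var(ddof=1) / len(yc))
    return effect, variance

# Tiny hypothetical example with two strata: users without and with
# pre-experiment data.
y = [1, 2, 3, 4, 10, 12, 14, 16]
treated = [0, 0, 1, 1, 0, 0, 1, 1]
stratum = ["no_pre"] * 4 + ["has_pre"] * 4

effect, variance = stratified_effect(y, treated, stratum)
print(effect, variance)
```

In practice the values in strata with pre-experiment data could be CUPED-adjusted before this step, while users without such data keep their raw values.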
Winsorization
Another technique for reducing noise is Winsorization, which limits the influence of outliers. Winsorization computes a chosen percentile P<sub>x</sub> of a metric's values and sets every value above P<sub>x</sub> to P<sub>x</sub>. This reduces the influence of extreme outliers caused by factors such as logging errors or bad actors.
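The capping step described above can be sketched with NumPy. The function name `winsorize_upper` and the default percentile are assumptions chosen for illustration; the source does not state which percentile Statsig uses.

```python
import numpy as np

def winsorize_upper(values, percentile=99.9):
    """Cap every value above the given percentile at that percentile.

    Returns (capped_values, cap). The default percentile is illustrative.
    """
    values = np.asarray(values, dtype=float)
    cap = np.percentile(values, percentile)
    return np.minimum(values, cap), cap

# Hypothetical metric values with one extreme outlier.
raw = np.array([1.0, 2.0, 3.0, 4.0, 1000.0])
capped, cap = winsorize_upper(raw, percentile=75)
print(cap, capped)
```

Only values above the cap change; everything at or below it is left as-is, so the bulk of the distribution is preserved while the outlier's contribution to the variance shrinks.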
Metric selection
The metrics you use can dramatically influence the sensitivity of your analysis. The transformations above, along with techniques like creating threshold-based flags, let you trade exact numbers for significantly more statistical power. For example, a flag for whether a user made at least one purchase is typically much less noisy than total revenue per user. Go to the Statsig blog post on variance reduction for more information.
Related resources
- Deng, Xu, Kohavi, & Walker: the seminal paper on using CUPED for online controlled experiments
- Booking.com on CUPED: theory and practice of CUPED
- Improving the Sensitivity of Online Controlled Experiments: Case Studies at Netflix