Standard Error & Mean Variance
How Statsig computes variance for experiment metrics, including handling of ratio metrics, clustered data, and user-level aggregation across exposures.
The standard error (also denoted "SE" or "std err") of the mean of each group is required to compute the confidence interval and p-value of a metric delta between those groups. You obtain the standard error of the mean by dividing the sample standard deviation of $X$ by the square root of the number of users in the group.
$$
\sigma_{\overline X} = \frac{\sigma_{X}}{\sqrt{N}} = \sqrt{\frac{var(X)}{N}} = \sqrt{var(\overline{X})} $$
Standard deviation is the square root of the variance. Because variances are easier to manipulate algebraically, Statsig derives the variance for each metric type and then takes the square root to obtain the standard error used in the confidence intervals.
Pulse displays the standard error of the mean of each group alongside the units and mean of each group.
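As a rough illustration of the standard error formula in this section, the sketch below computes the standard error of the mean for one group using only the Python standard library. The function name `standard_error` and the sample values are hypothetical and chosen for illustration; this is not Statsig's implementation.

```python
import math
import statistics


def standard_error(values):
    """Standard error of the mean: sample standard deviation / sqrt(N)."""
    n = len(values)
    # statistics.stdev uses the N-1 (sample) denominator.
    return statistics.stdev(values) / math.sqrt(n)


# Hypothetical per-user metric values for one group.
control = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
se = standard_error(control)

# The same quantity expressed as sqrt(var(X) / N).
se_from_variance = math.sqrt(statistics.variance(control) / len(control))
```

Both expressions give the same value, which mirrors the chain of equalities in the formula.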
Computing variance
Because the test and control groups are independent samples, the variance of the absolute metric delta is the sum of the variances of the test and control means:
$$
var(\Delta \overline X) =var(\overline X_t - \overline X_c) = var(\overline X_t) + var(\overline X_c) $$
The calculation reduces to correctly computing the variance of the means for each group.
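A minimal sketch of this step, assuming two small lists of hypothetical per-user values: it adds the variances of the two group means and takes the square root to get the standard error of the delta. The 95% interval at the end uses a normal approximation, which is an assumption made for illustration and not a statement about which test Statsig applies.

```python
import math
import statistics


def var_of_mean(values):
    """Variance of the sample mean: sample variance / N."""
    return statistics.variance(values) / len(values)


def delta_variance(test, control):
    """Variance of the difference of two independent group means."""
    return var_of_mean(test) + var_of_mean(control)


# Hypothetical per-user metric values.
test = [3.0, 5.0, 4.0, 6.0, 7.0, 5.0]
control = [2.0, 4.0, 4.0, 5.0, 3.0, 6.0]

delta = statistics.fmean(test) - statistics.fmean(control)
se_delta = math.sqrt(delta_variance(test, control))

# Assumption: two-sided 95% interval from a normal approximation.
z = statistics.NormalDist().inv_cdf(0.975)
ci = (delta - z * se_delta, delta + z * se_delta)
```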
Count and sum metrics
For count and sum metrics, Statsig derives the variance of the sample mean for a given group directly from the sample variance:
$$
var(\overline{X}) = \frac{var(X)}{N} = \frac{\frac{1}{N-1}\sum_{i=1}^{N}(X_i-\overline{X})^2}{N} $$
Where:
- $N$ is the number of users in the group
- $X_i$ is the metric value for user $i$
- $\overline{X}$ is the user-level average of $X$ for users in that group
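The sketch below shows one way the definitions above could be applied to raw event data: events are first summed per user to obtain each $X_i$, and the variance of the mean is then computed over users. The function name, the `(user_id, value)` event shape, and the choice to count exposed users with no events as 0 are all assumptions made for illustration.

```python
from collections import defaultdict


def variance_of_mean_from_events(exposed_users, events):
    """Aggregate events to user level, then compute var(X) / N."""
    # Sum metric values per user to get one X_i per user.
    totals = defaultdict(float)
    for user_id, value in events:
        totals[user_id] += value

    # Assumption: exposed users with no events contribute X_i = 0.
    per_user = [totals.get(user, 0.0) for user in exposed_users]

    n = len(per_user)
    mean = sum(per_user) / n
    # Sample variance with the N-1 denominator.
    sample_var = sum((x - mean) ** 2 for x in per_user) / (n - 1)
    return sample_var / n


# Hypothetical exposure list and event log.
exposed = ["u1", "u2", "u3", "u4"]
events = [("u1", 2.0), ("u1", 1.0), ("u2", 5.0), ("u4", 4.0)]
var_mean = variance_of_mean_from_events(exposed, events)
```

Here the per-user totals are 3, 5, 0, and 4, so $N$ is the number of users, not the number of events.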
Ratio and mean metrics
Ratio and mean metrics combine two variables, $X$ and $Y$, rather than a single variable $X$. The variance of these metrics depends on both the numerator and denominator variables, which are typically correlated. The metric of interest $R$ has a group mean $\overline{R}$ and a group variance of the mean $var(\overline{R})$.
For example, consider a clicks per session metric. The number of clicks and the number of sessions are two sets of observations from the same group of users, so they aren't independent of each other.
To account for this correlation, Statsig obtains the variance of the mean of a ratio metric $R$ using the delta method:
$$
var(\overline R) = var\left(\frac{\overline X}{\overline Y}\right) \approx \left(\frac{\overline X}{\overline Y}\right)^2 \cdot \left(\frac{var(\overline X)}{\overline X^2} + \frac{var(\overline Y)}{\overline Y^2} - 2 \cdot \frac{covar(\overline X, \overline Y)}{\overline X\cdot \overline Y} \right) $$
where Statsig computes the variance of the numerator and denominator means in the same way as for count metrics above, and the covariance is
$$
covar(\overline X, \overline Y) = \frac{covar(X, Y)}{N} = \frac{\frac{1}{N-1}\sum_{i=1}^{N}(X_i-\overline X)\cdot (Y_i-\overline Y)}{N} $$
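The delta method formula and the covariance term can be sketched as follows. The function name `ratio_variance` and the clicks and sessions values are hypothetical; the code is an illustration of the formulas in this section under those assumptions, not Statsig's implementation.

```python
import statistics


def ratio_variance(xs, ys):
    """Delta-method variance of the ratio of means, mean(xs) / mean(ys).

    xs and ys are per-user numerator and denominator values, in the same user order.
    """
    n = len(xs)
    x_bar = statistics.fmean(xs)
    y_bar = statistics.fmean(ys)

    # Variances of the means, computed as for count and sum metrics.
    var_x_bar = statistics.variance(xs) / n
    var_y_bar = statistics.variance(ys) / n

    # Sample covariance (N-1 denominator), then covariance of the means.
    covar_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)
    covar_bar = covar_xy / n

    ratio = x_bar / y_bar
    return ratio ** 2 * (
        var_x_bar / x_bar ** 2
        + var_y_bar / y_bar ** 2
        - 2 * covar_bar / (x_bar * y_bar)
    )


# Hypothetical clicks and sessions per user.
clicks = [2.0, 4.0, 6.0, 8.0]
sessions = [1.0, 2.0, 2.0, 4.0]
var_ratio = ratio_variance(clicks, sessions)

# Sanity check: if every user has exactly 2 clicks per session, the ratio
# does not vary across users and the covariance term cancels the other two.
var_constant = ratio_variance([2.0, 4.0, 6.0, 8.0], [1.0, 2.0, 3.0, 4.0])
```

The second call shows why the covariance term matters: ignoring it would report a positive variance for a ratio that is identical for every user.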