Fieller Intervals

How Fieller intervals build accurate confidence intervals for ratio metrics in Statsig Warehouse Native experiment scorecards and analysis views.

Fieller intervals

You can use Fieller Intervals as the methodology for calculating confidence intervals for the relative change between the test and control groups.

The Delta Method approximates the variance of a ratio of two variables, and Statsig then uses that variance to establish a confidence interval. Fieller Intervals instead provide an exact solution for the confidence interval.

In most cases, Fieller Interval results are very similar to results from the Delta Method. Because Fieller Intervals are more accurate, Statsig recommends using this methodology.

Calculation

1: Determine if a Fieller interval is well-defined

Before applying Fieller’s Theorem, Statsig checks that the denominator of the relative lift metric, $\overline{X_C}$, is statistically distinct from 0.

Statsig calculates the parameter $g$:

$$g = \frac{Z_{\alpha/2}^2 \cdot \mathrm{var}(X_C)}{(n_C-1) \cdot \overline{X_C}^2}$$

Where:

$Z_{\alpha/2}$ is the critical value associated with the confidence level you want
$\mathrm{var}(X_C)$ is the variance of the control group metric values
$n_C$ is the number of units in the control group
$\overline{X_C}$ is the mean of the control group metric values

When $g < 1$, the control mean is significantly different from 0, and Fieller intervals apply.
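The check above can be sketched in a few lines of Python. This is a minimal illustration, not Statsig's implementation: the function name `fieller_g`, the summary statistics, and the handling of an exactly-zero control mean are all assumptions made for the example.

```python
from statistics import NormalDist


def fieller_g(mean_c, var_c, n_c, alpha=0.05):
    """Return g = z^2 * var(X_C) / ((n_C - 1) * mean(X_C)^2)."""
    if mean_c == 0:
        # Assumption: an exactly-zero control mean is treated as
        # "not distinct from 0", so g is reported as infinite.
        return float("inf")
    z = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    return z**2 * var_c / ((n_c - 1) * mean_c**2)


# Hypothetical control-group summary statistics.
g_stable = fieller_g(mean_c=10.0, var_c=25.0, n_c=1001)
g_unstable = fieller_g(mean_c=0.1, var_c=25.0, n_c=101)

print(f"g = {g_stable:.6f}, Fieller applies: {g_stable < 1}")
print(f"g = {g_unstable:.2f}, Fieller applies: {g_unstable < 1}")
```

In the first case the control mean is large relative to its standard error, so $g$ is close to 0. In the second, a small mean with the same variance and fewer units pushes $g$ well above 1.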

2A: Apply Fieller interval formula

Since the control and test group results are independent of each other, Statsig can drop covariance terms in Fieller's Theorem.

$$CI(\% \Delta \overline{X} ) = \frac{1}{1-g} \left( \frac{\overline{X_T}}{\overline{X_C}} \pm \frac{Z_{\alpha/2}}{\overline{X_C}} \sqrt{ \frac{\overline{X_T}^2}{\overline{X_C}^2} \cdot \frac{\mathrm{var}(X_C)}{n_C-1} + (1-g)\frac{\mathrm{var}(X_T)}{n_T-1} } \right) - 1$$
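As a rough sketch, the formula above translates to Python as follows. The function name `fieller_relative_ci` and the input values are hypothetical, and raising an error when $g \geq 1$ is a choice made for this example rather than documented Statsig behavior.

```python
from math import sqrt
from statistics import NormalDist


def fieller_relative_ci(mean_t, var_t, n_t, mean_c, var_c, n_c, alpha=0.05):
    """Fieller interval for the relative lift, with covariance terms dropped."""
    if mean_c == 0:
        raise ValueError("control mean is zero; relative lift is undefined")
    z = NormalDist().inv_cdf(1 - alpha / 2)
    se2_c = var_c / (n_c - 1)  # var(X_C) / (n_C - 1)
    se2_t = var_t / (n_t - 1)  # var(X_T) / (n_T - 1)
    g = z**2 * se2_c / mean_c**2
    if g >= 1:
        raise ValueError("g >= 1: the interval is unbounded")
    ratio = mean_t / mean_c
    # abs() keeps lower <= upper when the control mean is negative;
    # the set of two endpoints is the same as in the +/- formula.
    half_width = (z / abs(mean_c)) * sqrt(ratio**2 * se2_c + (1 - g) * se2_t)
    lower = (ratio - half_width) / (1 - g) - 1
    upper = (ratio + half_width) / (1 - g) - 1
    return lower, upper


# Hypothetical summary statistics: observed relative lift of +10%.
lower, upper = fieller_relative_ci(
    mean_t=11.0, var_t=30.0, n_t=1001,
    mean_c=10.0, var_c=25.0, n_c=1001,
)
print(f"95% CI for relative lift: [{lower:.4f}, {upper:.4f}]")
```

Note that the interval is not symmetric around the observed lift: dividing by $1-g$ stretches both endpoints away from $-1$.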

2B: Edge case: control mean not statistically distinct from zero

In rare cases (fewer than 5% of observed metric comparisons on Statsig), $g \geq 1$, which means the control group’s mean isn’t statistically distinguishable from 0.

When $\overline{X_C}$ isn't statistically different from zero, the denominator of the relative lift calculation is unstable. This means that the confidence interval for the percent difference between test and control is unbounded.

When the confidence interval is unbounded, Statsig surfaces the relative lift observed during the experiment.

$$\% \Delta \overline{X} = \frac{\overline{X_T}-\overline{X_C}}{\overline{X_C}}$$
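The fallback can be illustrated with a short sketch. The helper names `relative_lift` and `ci_is_bounded` and the input values are assumptions for this example, and the code assumes a nonzero control mean.

```python
from statistics import NormalDist


def relative_lift(mean_t, mean_c):
    """Observed relative lift: (mean_T - mean_C) / mean_C."""
    return (mean_t - mean_c) / mean_c


def ci_is_bounded(mean_c, var_c, n_c, alpha=0.05):
    """True when g < 1, i.e. the Fieller interval is well-defined."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    g = z**2 * var_c / ((n_c - 1) * mean_c**2)
    return g < 1


# Hypothetical noisy metric whose control mean is close to zero.
mean_t, mean_c, var_c, n_c = 0.12, 0.10, 25.0, 101

lift = relative_lift(mean_t, mean_c)
if ci_is_bounded(mean_c, var_c, n_c):
    print(f"lift = {lift:+.1%}, confidence interval is bounded")
else:
    print(f"lift = {lift:+.1%}, confidence interval is unbounded")
```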

Enabling Fieller intervals in Statsig

Configure the relative confidence interval methodology in Experimentation Settings at the organization level. Changing this setting only affects experiments created after the setting change.

Experimentation settings configuration interface

In many cases, the results are effectively the same as using the Delta Method. However, if you're running experiments with small sample sizes or noisy denominators, Fieller Intervals are more reliable, and Statsig strongly recommends using them.
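To make the comparison concrete, the sketch below computes both intervals from the same summary statistics. It is illustrative only: the Delta Method function uses the standard first-order variance approximation for a ratio of independent means and the same $n-1$ divisor as the Fieller formula, which may differ from Statsig's exact implementation, and all names and inputs are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def delta_ci(mean_t, var_t, n_t, mean_c, var_c, n_c, alpha=0.05):
    """Delta Method interval for relative lift (first-order approximation)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    ratio = mean_t / mean_c
    se = sqrt(var_t / (n_t - 1) + ratio**2 * var_c / (n_c - 1)) / abs(mean_c)
    return ratio - 1 - z * se, ratio - 1 + z * se


def fieller_ci(mean_t, var_t, n_t, mean_c, var_c, n_c, alpha=0.05):
    """Fieller interval for relative lift; assumes g < 1 and no covariance."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    se2_c, se2_t = var_c / (n_c - 1), var_t / (n_t - 1)
    g = z**2 * se2_c / mean_c**2
    ratio = mean_t / mean_c
    half = (z / abs(mean_c)) * sqrt(ratio**2 * se2_c + (1 - g) * se2_t)
    return (ratio - half) / (1 - g) - 1, (ratio + half) / (1 - g) - 1


def width(ci):
    return ci[1] - ci[0]


# Hypothetical inputs: a large, low-noise sample and a small, noisy one.
large = dict(mean_t=11.0, var_t=25.0, n_t=10001,
             mean_c=10.0, var_c=25.0, n_c=10001)
small = dict(mean_t=11.0, var_t=400.0, n_t=21,
             mean_c=10.0, var_c=400.0, n_c=21)

for label, stats in (("large", large), ("small", small)):
    d, f = delta_ci(**stats), fieller_ci(**stats)
    print(f"{label}: delta=[{d[0]:.3f}, {d[1]:.3f}] "
          f"fieller=[{f[0]:.3f}, {f[1]:.3f}]")
```

With the large sample, $g$ is close to 0 and the two intervals nearly coincide. With the small, noisy sample, $g$ approaches 1 and the Fieller interval becomes much wider and asymmetric, reflecting the uncertainty in the denominator.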

In the experiment scorecard, Fieller Intervals appear as shown below.

Experiment scorecard with Fieller intervals
