What is SRM?
SRM, or sample ratio mismatch, is a problem in experiments where some groups receive too many units and others too few. The example below is an exposure crosstab of an experiment that has SRM. Even though the group percentages may look similar, if the assignment system were actually splitting traffic evenly, an imbalance this extreme (or more extreme) would have less than a 0.01% chance of occurring randomly.
Why SRM Is An Issue
SRM is an issue because it is usually non-random; the extra or missing traffic is not identical to the original traffic.
How can SRM occur?
- A bug causes a user’s client or browser to crash before a log for the exposure can be sent. The users who don’t come back are never re-exposed or counted, but those who do come back are. This biases measurement.
- A conditional dependency causes “who is exposed” to be filtered by some characteristic for one or more groups, so those groups are no longer comparable to the others. This also biases measurement.
- A script bulk-exposes users one group at a time, and logs are truncated after a certain count, so the last group’s exposures are cut off.
How SRM is detected
SRM is generally detected with a chi-squared test, a common and accepted way to check whether the observed frequencies in categorical data match the expected frequencies. For example, in the experiment above we expect an even distribution of 167.85k units per group, and observe [166.08k, 171.18k, 166.30k]. If the p-value of the test is low, we reject the null hypothesis that the observed group sizes match the intended allocation, and conclude that the groups’ observed and intended/expected assignment rates differ.
What to do if an experiment has SRM
On Statsig, SRM will create a warning or failure state on an experiment’s health check when detected - depending on how extreme the SRM is.
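To check an experiment's exposure counts by hand, the chi-squared test described in the detection section can be sketched in a few lines of Python. This is a minimal illustration, not Statsig's implementation. The function names `chi2_sf` and `srm_check` are hypothetical, and the counts are an assumption: they are the rounded figures from the example (166.08k, 171.18k, 166.30k) expanded to whole units. The sketch uses only the standard library.

```python
import math


def chi2_sf(stat, df):
    """Upper-tail probability P(X >= stat) for a chi-squared variable with df degrees of freedom."""
    if stat <= 0:
        return 1.0
    y = stat / 2.0
    if df % 2 == 0:
        a = 1.0
        q = math.exp(-y)                 # Q(1, y)
    else:
        a = 0.5
        q = math.erfc(math.sqrt(y))      # Q(1/2, y)
    # Recurrence for the regularized upper incomplete gamma function:
    # Q(a + 1, y) = Q(a, y) + y^a * e^(-y) / Gamma(a + 1)
    while a < df / 2.0:
        q += math.exp(a * math.log(y) - y - math.lgamma(a + 1.0))
        a += 1.0
    return q


def srm_check(observed, weights=None):
    """Return (chi-squared statistic, p-value) for observed group counts.

    weights is the intended allocation per group; an even split is assumed
    when it is omitted.
    """
    if weights is None:
        weights = [1.0] * len(observed)
    total = sum(observed)
    weight_sum = sum(weights)
    expected = [total * w / weight_sum for w in weights]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = chi2_sf(stat, len(observed) - 1)
    return stat, p_value


# Assumed counts: the rounded example figures expanded to whole units.
observed = [166_080, 171_180, 166_300]
stat, p_value = srm_check(observed)
print(f"chi-squared = {stat:.2f}, p-value = {p_value:.3g}")
```

With these assumed counts the p-value comes out far below 0.0001, in line with the "less than 0.01%" figure quoted for the example. Where SciPy is available, `scipy.stats.chisquare` performs the same test and is the more practical choice.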
