Create a New Experiment
To create a user-level experiment:
1. Log in to the Statsig console at https://console.statsig.com/
2. Navigate to Experiments in the left-hand navigation panel
3. Click on the Create button
4. Enter the name and description for your experiment as shown in the figure below
5. By default, your experiment runs in its own Layer. If you would like to add this experiment to a Layer, select the Add Layer option under Advanced in the experiment creation modal or create a new Layer via Create New Layer.
6. Click "Create"
Configure Your Scorecard
When running an experiment, you are usually testing an explicit hypothesis and measuring it with a set of key metrics. The Scorecard makes this easy by providing fields for your Hypothesis, Primary metrics, and Secondary metrics.
Primary metrics are the metrics you want to influence directly with your experiment. Secondary metrics are metrics you want to monitor, or ensure don't regress, but that you aren't directly trying to move.
Configuring the Scorecard is optional, but it helps ensure that other members of your team viewing your experiment have context on the hypothesis being tested and how success is being measured. Additionally, all metrics added to the Scorecard are pre-computed daily and are eligible for more advanced statistical treatments like CUPED and Sequential Testing.
Read more about best practices for configuring your Scorecard here.
Configure Your Groups and Parameters
This is where the meat of experiment configuration happens. Whereas the Scorecard is an optional setup step, configuring your experiment's allocation, targeting criteria, and groups and parameters is mandatory.
For Allocation, enter the percentage of users that you want to allocate to this experiment. By default, each experiment runs in its own layer and you can enter a value up to 100%.
If you want to use a targeting gate in your experiment, tap on All Allocated Users in the Targeting section to select a Feature Gate from the drop-down. A targeting gate restricts the users eligible for the experiment to those who pass the list of conditions defined in the linked Feature Gate. The targeting gate continues to apply after you make a decision on the experiment in the Statsig UI.
By default, no Feature Gate is selected and your experiment will use all allocated users (up to the Allocation % specified in the previous step) within either your exposed userbase or within the Layer you have selected.
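As a rough illustration of how Allocation and a targeting gate combine, the following Python sketch decides whether a user is eligible for an experiment. It is not Statsig's implementation: the hash-based bucketing, the `salt` argument, and the names `bucket`, `is_in_experiment`, and `targeting_gate` are all assumptions made for this example.

```python
import hashlib

BUCKETS = 10_000  # assumed bucket resolution for this sketch (0.01% steps)


def bucket(unit_id: str, salt: str) -> int:
    """Map a unit ID to a stable bucket in [0, BUCKETS)."""
    digest = hashlib.sha256(f"{salt}.{unit_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % BUCKETS


def is_in_experiment(user: dict, allocation_pct: float, salt: str,
                     targeting_gate=None) -> bool:
    """Return True if the user is eligible for the (hypothetical) experiment.

    targeting_gate is an optional callable standing in for a Feature Gate;
    when it is None, all allocated users are eligible.
    """
    # Users who fail the targeting gate are never eligible.
    if targeting_gate is not None and not targeting_gate(user):
        return False
    # Otherwise, only the first allocation_pct percent of buckets are allocated.
    return bucket(user["user_id"], salt) < allocation_pct / 100 * BUCKETS


if __name__ == "__main__":
    us_only = lambda u: u.get("country") == "US"  # hypothetical gate condition
    user = {"user_id": "user-123", "country": "US"}
    print(is_in_experiment(user, allocation_pct=50, salt="my_experiment",
                           targeting_gate=us_only))
```

Because the bucket depends only on the salt and the unit ID, the same user gets the same answer on every call, which is the property an allocation scheme of this kind relies on.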
When configuring your Groups and Parameters, we recommend adding your test parameter(s) first. Parameters are what actually control the different experiment variants in code. Enter the values that the experiment parameter will take for each variant. Read more about Groups vs. Parameters here. Please note that you cannot start your experiment without adding at least one parameter.
If you are looking to test more variants than just an A/B, add more Groups to your experiment by tapping the "+" to the right of the existing experiment groups. You will be prompted to enter the parameter value for each new experiment group added. You'll notice that the split percentages between the experiment groups automatically change to evenly distribute users between the groups.
Once parameters and their values for different groups are defined, you can add additional Group metadata to name, describe, and add a corresponding variant image to each experiment group via the "Groups" section. Note that neither the Group name nor the description is used in your end experiment; only the parameters and their values are actually called in code to influence the end experience a user sees based on their group assignment.
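The relationship between groups and parameters can be sketched in a few lines of Python. This is an illustration only, not the Statsig SDK: the `GROUPS` table, the `button_color` parameter, and the helpers `even_split` and `get_param` are hypothetical names chosen for this example.

```python
# Hypothetical experiment: each group maps to the parameter values it receives.
GROUPS = {
    "Control": {"button_color": "blue"},
    "Test A": {"button_color": "green"},
    "Test B": {"button_color": "red"},
}


def even_split(group_names):
    """Return an even percentage split across the given groups."""
    share = 100 / len(group_names)
    return {name: share for name in group_names}


def get_param(group_name, param, default):
    """Look up a parameter value for a group, falling back to a default."""
    return GROUPS.get(group_name, {}).get(param, default)


def render_button(group_name):
    # Application code branches on the parameter value, never on the group name.
    color = get_param(group_name, "button_color", "blue")
    return f"<button style='background:{color}'>Buy</button>"


if __name__ == "__main__":
    print(even_split(list(GROUPS)))
    print(render_button("Test A"))
```

Keeping the branch on the parameter value means a group can be renamed or re-described in the console without touching application code.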
Device-level and Custom ID Experiments
The default randomization unit for experiments is User ID. To create an experiment with a different unit ID type, follow steps 1-4 from the "User-level Experiments" section above. Then,
- Click the ID Type drop-down menu and make a selection.
- Click Create
Now follow the remaining steps as described in the previous section to complete your experiment setup.
By default, each experiment runs in its own Layer. When you want to create an experiment that excludes any users exposed to other experiments, follow steps 1-4 from the "User-level Experiments" section above. Then,
- Select Advanced
- Click the checkbox for Add Layer
- Select an existing layer or create a new layer.
- Click Create
Now follow the remaining steps as described in the top section on this page to complete your experiment setup.
Significance Level Adjustments
By default, Pulse results are displayed with 95% confidence intervals and without Bonferroni correction. These defaults can be changed during experiment creation and can also be adjusted in the settings when viewing results in Pulse.
- Bonferroni Correction: Select this option to automatically apply the correction in experiments with more than one test group. This reduces the probability of Type I errors (false positives) by adjusting the significance level alpha, which will be divided by the number of test variants in the experiment.
- Default Confidence Interval: The selection will be used by default every time Pulse results are shown for this experiment. Choose a lower confidence level (e.g., 80%) when there's higher tolerance for false positives and fast iteration with directional results is preferred over longer or larger experiments with increased certainty.
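The arithmetic behind the Bonferroni option can be sketched as follows. The function names are hypothetical, and the sketch only restates the rule described above (alpha divided by the number of test variants); it is not taken from Statsig's code.

```python
def bonferroni_alpha(confidence_level: float, num_test_groups: int) -> float:
    """Return the per-comparison significance level after Bonferroni correction.

    confidence_level is a fraction, e.g. 0.95 for 95% confidence intervals.
    num_test_groups counts the test variants (the control group is excluded).
    """
    if not 0 < confidence_level < 1:
        raise ValueError("confidence_level must be between 0 and 1")
    if num_test_groups < 1:
        raise ValueError("num_test_groups must be at least 1")
    alpha = 1 - confidence_level     # unadjusted significance level
    return alpha / num_test_groups   # Bonferroni: split alpha across comparisons


def adjusted_confidence_level(confidence_level: float, num_test_groups: int) -> float:
    """Return the confidence level implied by the corrected alpha."""
    return 1 - bonferroni_alpha(confidence_level, num_test_groups)


if __name__ == "__main__":
    # 95% intervals with two test groups
    print(round(bonferroni_alpha(0.95, 2), 4))
    print(round(adjusted_confidence_level(0.95, 2), 4))
```

With 95% confidence intervals and two test groups, each comparison is evaluated at an alpha of about 0.025, which corresponds to 97.5% intervals. With a single test group the correction leaves alpha unchanged.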
Setting a target duration is optional, but can be good experimental practice to ensure you wait long enough for an experiment to reach full power before reading out the results. You can set either a target number of days or number of exposures for your experiment, and use the Power Analysis Calculator to determine what target duration to set based on which metrics you're looking to move.
- 💡 Selecting a target duration greater than 90 days 💡 By default, Statsig will compute Pulse Results (i.e., metric lifts) for your experiment for the first 90 days, while your user assignment will continue to work until the experiment stops. While you will be able to extend this compute window as you approach the 90-day cap, it's worth asking why it's necessary to run an experiment beyond 90 days: What conclusion do you expect to see beyond 90 days that you might not see before then? Will this decision still be relevant in 90 days?
Once a target duration is set, you can track your progress against it from the header of your experiment. When your experiment hits its target duration, you will be notified via email and, if you have enabled the Statsig Slack integration, via Slack.