Your First A/B Test

In this guide, you will create and implement your first A/B/n test. While you can use Statsig's Feature Flags to roll out new features, Statsig's Experiments enable you to run all kinds of A/B tests, from simple bivariant (A vs. B) experiments to multi-variant (A vs. B/C/D/n) and mutually exclusive experiments.

This guide is for a user-level experiment

Most experiments randomize the users that are exposed to different variants. We call these user-level experiments. To randomize based on devices, say when you don't yet have a userID (e.g. for unregistered users), consider a Device-level Experiment.

Do I need a large sample size?

This is a common question that comes up in the context of A/B tests. Most assume (incorrectly) that they can't get statistically significant results with small sample sizes. Here's a good article that clears this up:

You don't need large sample sizes to run A/B Tests

Step 1: Create a new experiment in the Statsig console

Log into the Statsig console at https://console.statsig.com/ and navigate to Experiments in the left-hand navigation panel.

Click on the Create button and enter the name and description for your experiment. Click Create.

In the Setup tab, fill out the scorecard with your experiment's Hypothesis and the primary metrics you want to track. While Statsig will show you experiment results for all your metrics, these key metrics represent your hypothesis for the experiment. Establishing a hypothesis upfront ensures that the experiment serves to improve your understanding of users rather than simply serving data points to bolster the case for shipping or not shipping your experiment.

Then, define the test groups for this experiment and what percentage of users should receive each group's configured values. First, click on "Add Parameter." Here, we'll add a Boolean parameter called "enable feature".

Now you can set the value for your Test and Control groups. An easy way to double-check your Statsig integration is to start with an A/A test, meaning there is no difference between the test and control groups. In that case, the parameter values will be the same:
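
If it helps to picture what each group resolves to, here is an illustrative sketch in code (the values are assumptions for this guide, not output from the console):

// Illustrative only: in an A/A test, both groups return the same parameter value.
const groups = {
  Control: { "enable feature": false },
  Test: { "enable feature": false }, // identical to Control for an A/A test
};

// In a real A/B test, the Test group would flip the value:
// Test: { "enable feature": true }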

In the Allocation panel, enter the percentage of users that you want to allocate to this experiment. By default, this is allocated to 100% of your users.

Finally, don't forget to scroll back up and click the Save button at the top right of the page to complete your experiment setup.

Step 2: Install and Initialize the Statsig SDK

In this guide, we use the Statsig JavaScript client SDK. See the Statsig client or server SDK docs for your desired programming language and technology.

You can install the Statsig SDK via npm, yarn, or jsDelivr:
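
With npm or yarn, install the statsig-js package (the same package imported later in this guide):

npm install statsig-js
# or
yarn add statsig-js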

Statsig is available from jsDelivr, an open-source CDN. We use this installation method for Statsig on www.statsig.com! Go ahead, inspect the page source.

To access the current primary JavaScript bundle, use:

https://cdn.jsdelivr.net/npm/statsig-js/build/statsig-prod-web-sdk.min.js

To access specific files/versions:

https://cdn.jsdelivr.net/npm/statsig-js@{version}/{file}

<script src="https://cdn.jsdelivr.net/npm/statsig-js/build/statsig-prod-web-sdk.min.js"></script>
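
If you load the SDK from the CDN rather than a bundler, the build is expected to expose a global statsig object (treat the exact global name as an assumption and verify it against the SDK docs). A minimal sketch of initializing from that global:

<script>
  // Assumes the CDN bundle exposes a global `statsig` object.
  statsig
    .initialize("client-sdk-key", { userID: "some_user_id" })
    .then(() => {
      // SDK ready: safe to check experiments and log events.
    });
</script>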

After you install the SDK, you must initialize the SDK using an active Client API Key from the API Keys tab under Settings in the Statsig console.

In addition to the Client API key, you must also pass in a Statsig User ID to ensure users can be assigned and tracked for different variants of the experiment.

import statsig from "statsig-js";

// initialize returns a promise which always resolves
await statsig.initialize(
  "client-sdk-key",
  { userID: "some_user_id" },
  { environment: { tier: "staging" } } // optional, pass options here if needed
);
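
Beyond userID, the user object can carry additional attributes, such as email or custom fields, that you can use for targeting and for breaking down results. A sketch with placeholder values:

// Illustrative user object; the field values here are placeholders.
await statsig.initialize(
  "client-sdk-key",
  {
    userID: "some_user_id",
    email: "user@example.com", // optional
    custom: { plan: "free" }, // optional custom attributes
  },
  { environment: { tier: "staging" } }
);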

Step 3: Check the experiment in your application and log an event

Get your experiment's configuration parameters to give each user the experience you want to create for their variant. In this case, we fetch the value of the "enable feature" parameter that we created earlier.

const expConfig = statsig.getExperiment("onboarding_banner");

const showBanner = expConfig.get("enable feature", false);
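
You can then branch your UI on the returned value. Here is a minimal sketch, where showBanner comes from the snippet above and renderOnboardingBanner is a hypothetical helper standing in for your own UI code:

// Hypothetical helper for this sketch; replace with your own UI logic.
function renderOnboardingBanner() {
  document.body.insertAdjacentHTML("afterbegin", "<div class='banner'>Welcome!</div>");
}

if (showBanner) {
  renderOnboardingBanner(); // Test group: show the new banner
} else {
  // Control group: keep the existing experience
}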

Now, when a user loads this page in your client application, you will automatically start to see a live log stream in the Statsig console when you navigate to your experiment and select the Diagnostics tab.

At this stage, you may also want to track downstream events to measure how your users respond to the different variants of the experiment. To do this, call the logEvent API with the event name, an optional value that you want to aggregate, and any metadata that you may use as additional dimensions to break down the aggregate results. Events logged to Statsig are associated with the user identifiers, not with a specific experiment.

statsig.logEvent('add_to_cart', 'groceries', {'price': '9.99', 'SKU_ID': 'SKU-123'});

As the experiment progresses, you will see how many experiment checks occur and how many events are logged on an hourly basis in the Diagnostics tab.

Step 4: Review experiment results

Within 24 hours of starting your experiment, you'll see the cumulative exposures in the Results tab.

You will also start to see the key results for each test group compared to the control in the Results tab. In the Metric Lifts panel, you can see the full picture of how all your tagged metrics have shifted in the experiment.

After your experiment has run for your predetermined duration, you can start to make a decision based on the results in the Metric Lifts panel.

Alternatively, you can choose to ramp up the experiment to a larger user base by increasing the allocation in the Setup tab. We recommend scaling up allocation slowly in the initial stages, and then by a larger margin in later stages, say 0% -> 2% -> 10% -> 50% -> 100%. This helps you get more data and build more confidence before you decide to ship the experiment.