Getting Started with Statsig Warehouse Native
In this guide, you'll set up an experiment and load results in Statsig Warehouse Native.
To run an experiment in Statsig Warehouse Native, you'll need:
- At least one data source in your warehouse that tracks experiment assignment events. This needs a user identifier, a timestamp, and assignment data (experiment_id, group_id).
- At least one data source in your warehouse that can serve as a source for metric data. This could be any kind of data with a user identifier, a timestamp, and a value you want to turn into a metric.
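To make the two requirements concrete, here's a rough sketch of what each source might look like. The table and column names (`analytics.exposure_events`, `analytics.purchase_events`, and so on) are hypothetical; your schema will differ.

```sql
-- Hypothetical assignment-source shape: one row per exposure event.
SELECT user_id, ts, experiment_id, group_id
FROM analytics.exposure_events;

-- Hypothetical metric-source shape: one row per event, with a value
-- you could turn into a metric (here, revenue).
SELECT user_id, ts, event_name, revenue_usd
FROM analytics.purchase_events;
```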
Step 1 - Connect Your Data
Use the landing page guide or go to your Project Settings (the gear in the top right menu). Then, click on Data Connection.
This connection is very similar to the setup for warehouse ingestion. Refer to the documentation for your warehouse vendor for connection details.
One difference for Statsig Warehouse Native is that you should create an isolated Dataset or Schema that Statsig's service user has write access to. We will use this to save intermediate results, making queries more performant and giving you access to your experimental data.
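As an illustration, creating that isolated schema and granting access might look like the following. This is Snowflake-style syntax with made-up names (`statsig_staging`, `STATSIG_SVC`); the exact statements vary by warehouse, so check your vendor's documentation.

```sql
-- Illustrative only; schema and role names are examples, and
-- GRANT syntax differs across warehouses.
CREATE SCHEMA IF NOT EXISTS statsig_staging;
GRANT ALL PRIVILEGES ON SCHEMA statsig_staging TO ROLE STATSIG_SVC;
```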
Note - while you're here, you can go to Basic Settings and add any Custom ID Types you want to use for experimentation.
Step 2 - Create a Metric Source
Next, click into Metrics on the left navbar and go to the Metric Sources tab. Click Create to create a new Metric Source. Give your new Metric Source a relevant name and description.
When defining a Metric Source, you'll give us a SQL query that functions as a view into your data. For example, a Metric Source could represent data for a single event:

```sql
SELECT * FROM my_events_table WHERE event_name = 'myEvent'
```

or an entire dataset:

```sql
SELECT * FROM my_metrics_table
```
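One convenient pattern is to alias columns in your query so that the mapping step is obvious. This is a hypothetical example; `my_events_table` and its columns are stand-ins for your own data:

```sql
-- Alias columns so mapping them to Statsig's required fields is clear.
SELECT
  user_id,                  -- unit identifier
  event_timestamp AS ts,    -- timestamp column
  amount_usd AS value       -- numeric value for Sum/Mean metrics
FROM my_events_table
WHERE event_name = 'purchase';
```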
After running your SQL to pull a small sample set, you'll be asked to map the specific columns we require to calculate results. You can also include extra columns for later use in breakdowns and filtering when you create Metrics.
You can expand the sample set we pull to help you map columns correctly and validate that the data looks the way you expect.
Once that's done, save your metric source.
Step 3 - Create Metrics
Once you have a Metric Source set up, creating metrics is very similar to the usual Statsig flow. Go back to the Metrics view and click Create in the Metrics tab.
To create a metric, you pick the Metric Source you're deriving the metric from and specify some settings:
Metric types specify the aggregations and available settings for your experiment metric. We support:
- Count: basic counts of events with filters, such as click events
- Sum: sums of values from events, such as revenue or aggregating over pre-aggregated metric fields (e.g. daily_clicks in a user-day level source)
- Mean: the mean value of non-null value fields
- User Count: the number of users who had records in your Metric Source, calculated as a daily average, an overall participation rate, or a participation rate in some window based on their exposure
- Ratio: the population-level ratio of two metrics you define
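To give a feel for what these metric types compute, here is a rough per-user sketch of the Count, Sum, and Mean aggregations over a metric source. This is not Statsig's actual generated SQL, just an illustration:

```sql
-- Illustrative per-user aggregation over a metric source.
SELECT
  user_id,
  COUNT(*)   AS count_metric,  -- Count: events per user
  SUM(value) AS sum_metric,    -- Sum: total of the value field
  AVG(value) AS mean_metric    -- Mean: average of non-null values
FROM metric_source
GROUP BY user_id;
```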
For most metric types, you can specify breakdown columns which will automatically break down a result into common elements in metadata fields.
For example, you might break down a "click" metric by which "page_name" the click happened on. When you calculate experiment results, we'll calculate deltas for the top pages alongside the topline change.
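Conceptually, a breakdown adds the metadata field to the grouping, so each common value of the field gets its own metric slice. A sketch for the "click" example above (table and column names hypothetical):

```sql
-- Sketch of a click metric broken down by page_name.
SELECT page_name, user_id, COUNT(*) AS clicks
FROM click_events
GROUP BY page_name, user_id;
```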
Filters are a powerful way to flexibly reuse metric sources. Your Data Team can provide cleaned and organized views into data, and anyone can use the filtering UI to set up specific custom metrics as needed.
Once a metric is set up, you can reload your sampled source with filters applied to make sure things look good. Save your metric, and you're good to go!
Step 4 - Create an Assignment Source
An Assignment Source is very similar to a Metric Source. You can create one by going to Experiments in the left navbar, clicking on the Assignment Sources tab, and then clicking Create.
Here, you'll specify a query that provides a view into your exposure assignment data. You'll need to provide a timestamp, experiment ID, group ID, and at least one unit ID (i.e. a unique identifier for the type of units you're experimenting on).
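An Assignment Source query might look like the following, with columns aliased to the required fields. All names here are hypothetical placeholders for your own schema:

```sql
-- Hypothetical assignment-source query.
SELECT
  user_id,                          -- unit ID
  assignment_ts AS ts,              -- timestamp of exposure
  experiment_name AS experiment_id, -- experiment identifier
  variant AS group_id               -- assigned group
FROM analytics.assignment_events;
```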
Step 5 - Create an Experiment
With Metrics and an Assignment Source in place, creating an experiment is easy. Go back to the Experiments tab and create a new experiment. You'll see a pop-up where you'll enter a name, your core hypothesis, and relevant tags. You can also sync your Assignment Sources to pull in any new experiments.
In the experiment config, you can configure which metrics you want to measure and manage the groups you want to analyze. We'll automatically detect groups and infer the experimental split, but in some cases you may need to correct the intended split so we can properly evaluate Sample Ratio Mismatch issues.
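If you want to eyeball the observed split yourself before correcting it, a quick count of distinct units per group is often enough. This is an illustrative query (Statsig runs its own Sample Ratio Mismatch test); the table name and experiment ID are hypothetical:

```sql
-- Compare observed group sizes against the intended split.
SELECT group_id, COUNT(DISTINCT user_id) AS units
FROM analytics.assignment_events
WHERE experiment_id = 'my_experiment'
GROUP BY group_id;
```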
Once it's set up, click Analyze to define the experiment timeline. After this, your results will start calculating!
Step 6 - Analyze Results
Once you start analysis, you'll see a progress bar tracking the jobs in our pipeline. This generally takes 15-30 seconds for small-to-mid-sized companies, but depends on the warehouse resources you provide and the size of your data.
From here on out, you'll have access to the Statsig Pulse view for your experiment, where you can read results, discuss them, and make decisions just as in the standard Statsig product!