Warehouse Native Debugging Guide

Debug Statsig Warehouse Native experiment configurations and queries using query tools, pipeline logs, and metric source previews in the console.

Common debugging scenarios

Statsig Warehouse Native proactively monitors your data jobs and notifies you when results look off. This guide helps you diagnose the errors, missing data, and result discrepancies that complex data sources can cause, then reconcile Statsig's analysis with your own.

Common issues that cause errors or missing data

Mismatched IDs

The same ID often appears in different forms across tables. For example, the user abc123 in one source might be USER_abc123 in another, or certain loggers may hash ID values for privacy reasons.

This usually triggers a Statsig alert that it was unable to join data between sources. To resolve this:

Go to the relevant sources and run the sample queries to check for obvious ID mismatches.
If no mismatch is obvious, find the job that is unable to join sources and copy the SQL. Working through the SQL query should pinpoint where the mismatch is occurring.
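As a rough illustration of the ID check described above, the following Python sketch compares how many exposure IDs find a match in a metric source before and after normalizing a prefix. The sample IDs, the USER_ prefix handling, and the helper names normalize_id and join_rate are all hypothetical; they are not part of Statsig's tooling.

```python
def normalize_id(raw_id: str, prefix: str = "USER_") -> str:
    """Strip a known prefix so IDs from different sources are comparable."""
    return raw_id[len(prefix):] if raw_id.startswith(prefix) else raw_id


def join_rate(exposure_ids, metric_ids, normalize=None):
    """Return the fraction of distinct exposure IDs found in the metric source."""
    norm = normalize or (lambda value: value)
    exposures = {norm(i) for i in exposure_ids}
    metrics = {norm(i) for i in metric_ids}
    if not exposures:
        return 0.0
    return len(exposures & metrics) / len(exposures)


# Hypothetical samples, as if copied from the sample queries of two sources.
exposure_ids = ["abc123", "def456", "ghi789"]
metric_ids = ["USER_abc123", "USER_def456", "USER_zzz000"]

raw_rate = join_rate(exposure_ids, metric_ids)                  # no IDs match as-is
fixed_rate = join_rate(exposure_ids, metric_ids, normalize_id)  # prefix stripped first

print(f"join rate before normalizing: {raw_rate:.2f}")
print(f"join rate after normalizing:  {fixed_rate:.2f}")
```

A join rate near zero that jumps after a simple normalization suggests a formatting mismatch rather than genuinely missing data. Hashed IDs cannot be recovered this way and would need the same hash applied on both sides.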

Conflicting filters

Statsig allows a high degree of customization in explore queries and on explore pages. This can lead to scenarios where you add two conflicting filters that, together, never pass. For example:

You have a metric with a cohort window from 7 to 13 days, but set up your experiment analysis to run on users' first 6 days. These two filters return no users.
You create a count metric filtered to event='purchase', and then create a local metric that filters to event='checkout'. No event can satisfy both conditions, so this set of filters also returns an empty set.

To resolve conflicting filters, go to the Unit-Level Aggregations jobs (where Statsig renders filters) and search for the metric name of interest in the SQL code.
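The two examples above can be reproduced in a few lines of Python. This is only a sketch of the logic, not of how Statsig renders filters: the day numbering (days 1 to 6 for "first 6 days"), the helper overlapping_days, and the sample events are assumptions made for illustration.

```python
def overlapping_days(metric_window, analysis_window):
    """Return the days since exposure that satisfy both inclusive windows."""
    start = max(metric_window[0], analysis_window[0])
    end = min(metric_window[1], analysis_window[1])
    return list(range(start, end + 1))  # empty when start > end


# Cohort window of days 7 to 13 combined with an analysis of the first 6 days.
cohort_window = (7, 13)
first_six_days = (1, 6)
conflict = overlapping_days(cohort_window, first_six_days)

# A longer analysis window leaves some days in common.
compatible = overlapping_days(cohort_window, (1, 10))

# Two equality filters on the same field can never both pass.
events = [{"event": "purchase"}, {"event": "checkout"}, {"event": "purchase"}]
both_filters = [
    e for e in events
    if e["event"] == "purchase" and e["event"] == "checkout"
]

print(conflict, compatible, both_filters)
```

If a metric unexpectedly returns no rows, listing every filter applied to it and checking them pairwise in this way is a quick first test.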

Common points of confusion

Many data scientists run their own analyses using the Statsig staging data in their warehouse. In some cases, they see very different results. A few common methodology differences tend to explain these discrepancies:

Join conditions

Statsig joins events and metric inputs to exposures only when the input data occurs after the user's exposure to the experiment under analysis. By default, the join compares timestamps. If you enable the Treat Timestamp as Date setting, Statsig compares dates instead. Omitting the time filter in your own analysis, or comparing dates when Statsig compares timestamps (or vice versa), can yield very different results.
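To see how much the join condition matters, the sketch below counts the events attributed to one user under three conditions: a timestamp comparison, a date comparison, and no time filter. The timestamps are made up, and the use of >= is an assumption; check the SQL of the relevant job for the exact comparison Statsig applies.

```python
from datetime import datetime

# Hypothetical exposure time and event times for a single user.
exposure_ts = datetime(2024, 3, 5, 14, 0)
event_timestamps = [
    datetime(2024, 3, 4, 20, 0),  # day before exposure
    datetime(2024, 3, 5, 9, 0),   # same day, but before exposure
    datetime(2024, 3, 5, 18, 0),  # same day, after exposure
    datetime(2024, 3, 6, 10, 0),  # day after exposure
]

# Timestamp comparison: only events at or after the exposure moment.
by_timestamp = [t for t in event_timestamps if t >= exposure_ts]

# Date comparison: any event on or after the exposure date.
by_date = [t for t in event_timestamps if t.date() >= exposure_ts.date()]

# No time filter: every event is attributed, including pre-exposure ones.
no_filter = list(event_timestamps)

print(len(by_timestamp), len(by_date), len(no_filter))
```

The same user contributes a different number of events under each condition, which is why a replication that uses a different condition than Statsig can diverge.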

Winsorization/capping

You can configure Statsig to cap or winsorize metrics. If your data has extreme outliers and your own analysis doesn't apply the same procedure, you will see significant differences in results. To reconcile them, either apply winsorization yourself, or create a local metric (or clone the metric) without winsorization.
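The effect of a single outlier can be seen with a minimal upper-tail winsorization sketch. It uses a simple nearest-rank percentile as the cap, which is an assumption; Statsig's exact thresholds and percentile method may differ, and the revenue values and the helper winsorize_upper are hypothetical.

```python
import math


def winsorize_upper(values, upper_pct=99):
    """Cap values above the nearest-rank upper percentile (given in percent)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(upper_pct * len(ordered) / 100))
    cap = ordered[rank - 1]
    return [min(v, cap) for v in values]


# Hypothetical per-user revenue with one extreme outlier.
revenue = [10, 12, 9, 11, 10, 13, 8, 12, 11, 5000]

capped = winsorize_upper(revenue, upper_pct=90)
raw_mean = sum(revenue) / len(revenue)
capped_mean = sum(capped) / len(capped)

print(f"mean without winsorization: {raw_mean:.1f}")
print(f"mean with winsorization:    {capped_mean:.1f}")
```

When comparing against Statsig, apply the cap to the same unit-level values and at the same threshold that the metric is configured with.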

CUPED

CUPED can greatly reduce variance and, for aggregation (non-ratio) metrics, can shift the group-level mean estimates if there is pre-experiment bias. If your own analysis doesn't apply CUPED, both the means and the confidence intervals can differ from Statsig's. You can disable CUPED in the scorecard at any time to view the non-CUPED results for comparison.
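The following sketch shows the textbook form of the CUPED adjustment, where each unit's post-exposure value is reduced by theta times its centered pre-experiment value. It is meant only to show why group means can move when the groups differ before the experiment; it is not Statsig's implementation, and the data and helper names are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def pooled_theta(pre, post):
    """Regression slope of post on pre: cov(pre, post) / var(pre)."""
    mean_pre, mean_post = mean(pre), mean(post)
    cov = sum((x - mean_pre) * (y - mean_post) for x, y in zip(pre, post))
    var = sum((x - mean_pre) ** 2 for x in pre)
    return cov / var


def cuped_adjust(pre, post, theta, pre_mean):
    """Subtract the part of each post value predicted by the pre value."""
    return [y - theta * (x - pre_mean) for x, y in zip(pre, post)]


# Hypothetical data: the test group was already higher before the experiment.
control_pre, control_post = [1, 2, 3, 4], [2, 3, 4, 5]
test_pre, test_post = [3, 4, 5, 6], [5, 6, 7, 8]

all_pre = control_pre + test_pre
all_post = control_post + test_post
theta = pooled_theta(all_pre, all_post)
pre_mean = mean(all_pre)

control_adj = cuped_adjust(control_pre, control_post, theta, pre_mean)
test_adj = cuped_adjust(test_pre, test_post, theta, pre_mean)

raw_delta = mean(test_post) - mean(control_post)
adjusted_delta = mean(test_adj) - mean(control_adj)

print(f"raw delta:      {raw_delta:.3f}")
print(f"adjusted delta: {adjusted_delta:.3f}")
```

In this made-up data most of the raw difference was already present before exposure, so the adjusted delta is much smaller than the raw one.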

Metric types

In some cases, users assume that their input data is binomial, meaning the metric is only ever 1 or 0 at the unit/user level. If that assumption doesn't hold and you configure the metric as a count or sum, the variance can be much higher than expected.
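A quick way to test the 0/1 assumption is to inspect the unit-level values directly. The sketch below, using hypothetical per-user purchase counts and a hypothetical is_binary helper, checks whether the data is really 0/1 and compares the variance of the raw counts with the variance of a 0/1 version of the same data.

```python
from statistics import pvariance


def is_binary(values):
    """True when every unit-level value is exactly 0 or 1."""
    return all(v in (0, 1) for v in values)


# Hypothetical per-user purchase counts; some users purchased more than once.
per_user_purchases = [0, 1, 0, 3, 0, 12, 1, 0]

# The 0/1 version: did the user purchase at all?
binarized = [1 if v > 0 else 0 for v in per_user_purchases]

count_variance = pvariance(per_user_purchases)
binary_variance = pvariance(binarized)

print(f"is binary: {is_binary(per_user_purchases)}")
print(f"variance as count:  {count_variance:.3f}")
print(f"variance as binary: {binary_variance:.3f}")
```

If the check fails, decide whether the question is "did the user do this at all" or "how many times", and configure the metric type to match.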
