Warehouse Native Debugging Guide
Debug Statsig Warehouse Native experiment configurations and queries using query tools, pipeline logs, and metric source previews in the console.
Common debugging scenarios
When interacting with complex data sources, you may encounter unexpected results. Statsig proactively monitors for these and notifies you if something looks off. This guide helps you identify what went wrong and reconcile differences in results between Statsig's analysis and your own.
Common issues that cause errors or missing data
Mismatched IDs
In many cases, you may have multiple versions of the same ID across tables. For example, the user abc123 in one source might be USER_abc123 in another, or certain loggers may hash ID values for privacy reasons.
This usually triggers a Statsig alert that it was unable to join data between sources. To resolve this:
- Go to the relevant sources and run the sample queries to check for obvious ID mismatches.
- If no mismatch is obvious, find the job that is unable to join sources and copy the SQL. Working through the SQL query should pinpoint where the mismatch is occurring.
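As a rough illustration of the first check, the sketch below compares ID overlap between two sources before and after normalizing a prefix. The sample IDs, the USER_ prefix rule, and the helper names normalize_id and overlap_rate are hypothetical and not part of Statsig; in practice you would pull a sample of IDs from each source's sample query.

```python
def normalize_id(raw_id, prefix="USER_"):
    """Strip surrounding whitespace and a hypothetical prefix from an ID."""
    cleaned = raw_id.strip()
    if cleaned.startswith(prefix):
        cleaned = cleaned[len(prefix):]
    return cleaned


def overlap_rate(exposure_ids, metric_ids, normalize=None):
    """Fraction of distinct exposure IDs that also appear in the metric source."""
    if normalize is not None:
        exposure_ids = [normalize(i) for i in exposure_ids]
        metric_ids = [normalize(i) for i in metric_ids]
    exposed = set(exposure_ids)
    if not exposed:
        return 0.0
    return len(exposed & set(metric_ids)) / len(exposed)


# Hypothetical samples: the metric source prefixes every ID with "USER_".
exposure_ids = ["abc123", "def456", "ghi789"]
metric_ids = ["USER_abc123", "USER_def456", "USER_zzz000"]

raw_rate = overlap_rate(exposure_ids, metric_ids)
normalized_rate = overlap_rate(exposure_ids, metric_ids, normalize=normalize_id)
print(raw_rate, normalized_rate)
```

An overlap near zero on raw IDs that jumps after normalization suggests a formatting mismatch rather than genuinely missing data.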
Conflicting filters
Statsig allows a high degree of customization in explore queries and on explore pages. This flexibility makes it possible to add two filters that conflict, so that no data ever passes both. For example:
- You have a metric with a cohort window from 7 to 13 days, but set up your experiment analysis to run on users' first 6 days. These two filters return no users.
- You create a count metric filtered to event='purchase', and then create a local metric that filters to event='checkout'. This set of filters also returns a null set.
To resolve conflicting filters, go to the Unit-Level Aggregations jobs (where filters are rendered) and search for the metric name of interest in the SQL code.
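To reason about whether two filters can ever pass together, it can help to write them down as ranges or sets and check the intersection. The sketch below is a toy model of the two examples above; the day numbering (days counted from 0) and the helper name window_overlap are assumptions for illustration, not Statsig behavior.

```python
def window_overlap(window_a, window_b):
    """Return the inclusive overlap of two (start_day, end_day) windows, or None."""
    start = max(window_a[0], window_b[0])
    end = min(window_a[1], window_b[1])
    return (start, end) if start <= end else None


# Cohort window of days 7 to 13 versus an analysis limited to the first 6 days
# (assumed here to be days 0 to 5; days 1 to 6 would also fail to overlap).
metric_cohort_window = (7, 13)
analysis_window = (0, 5)
overlap = window_overlap(metric_cohort_window, analysis_window)

# Event filters modeled as sets: a row must satisfy both filters to be counted.
count_metric_events = {"purchase"}
local_metric_events = {"checkout"}
shared_events = count_metric_events & local_metric_events

print(overlap, shared_events)
```

An overlap of None or an empty set means the combined filters can never match any data, which is the situation to look for in the rendered SQL.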
Common points of confusion
Many data scientists run their own analyses using the Statsig staging data in their warehouse. In some cases, they see very different results. A few common methodology differences tend to explain these discrepancies:
Join conditions
Statsig joins events and metric input data to exposures only when that data occurs after the user was exposed to the experiment being analyzed. By default, the comparison uses timestamps. If you enable the Treat Timestamp as Date setting, the join compares dates instead. Omitting the time filter, or comparing dates where Statsig compares timestamps (or vice versa), can yield very different results.
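The difference between the two comparisons is easiest to see with an event logged earlier on the same calendar day as the exposure. The following is a minimal sketch, not Statsig's join logic: the function name occurs_after_exposure is hypothetical, and treating the boundary as inclusive (>=) is an assumption.

```python
from datetime import datetime


def occurs_after_exposure(event_time, exposure_time, treat_timestamp_as_date=False):
    """Decide whether an event counts toward the experiment (illustrative only)."""
    if treat_timestamp_as_date:
        # Date comparison: any event on the exposure date or later qualifies.
        return event_time.date() >= exposure_time.date()
    # Timestamp comparison: the event must not precede the exposure itself.
    return event_time >= exposure_time


# Event logged six hours before the exposure, on the same calendar day.
exposure_time = datetime(2024, 1, 10, 15, 0)
event_time = datetime(2024, 1, 10, 9, 0)

by_timestamp = occurs_after_exposure(event_time, exposure_time)
by_date = occurs_after_exposure(event_time, exposure_time, treat_timestamp_as_date=True)
print(by_timestamp, by_date)
```

Here the event is excluded under the timestamp comparison but included under the date comparison, so an analysis using the other mode than Statsig will count a different set of events.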
Winsorization/capping
You can configure metrics to be capped or winsorized in Statsig. If you have extreme outliers, you'll see significant differences in results if you don't apply this procedure. You can apply winsorization yourself, or create a local metric (or clone the metric) without winsorization.
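To see how much a single outlier can move a mean, the sketch below caps values at an upper percentile before averaging. It uses a simple nearest-rank percentile and a 90th-percentile cap chosen only to suit the tiny sample; the helper name winsorize_upper and these choices are assumptions, and Statsig's own percentile computation and thresholds may differ.

```python
import math


def winsorize_upper(values, upper_percentile=99):
    """Cap values above the given percentile (nearest-rank method)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(upper_percentile * len(ordered) / 100))
    cap = ordered[rank - 1]
    return [min(v, cap) for v in values]


# Hypothetical per-user values with one extreme outlier.
values = [1, 2, 2, 3, 3, 4, 4, 5, 5, 1000]
capped = winsorize_upper(values, upper_percentile=90)

raw_mean = sum(values) / len(values)
capped_mean = sum(capped) / len(capped)
print(raw_mean, capped_mean)
```

If your own analysis skips this step while the Statsig metric applies it, a gap of this kind between the two means is expected.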
CUPED
CUPED can greatly reduce variance, and for aggregation (non-ratio) metrics can shift the group-level mean estimates if there is pre-experiment bias. You can disable CUPED in the scorecard at any time to view the non-CUPED results for comparison.
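For intuition, the textbook form of CUPED adjusts each unit's value by its pre-experiment covariate: adjusted = post - theta * (pre - mean(pre)), with theta = cov(pre, post) / var(pre). The sketch below applies that formula to deliberately imbalanced toy data. It is a generic illustration under those assumptions, not Statsig's implementation, and the function name cuped_adjust is hypothetical.

```python
from statistics import mean, variance


def cuped_adjust(pre_values, post_values):
    """Apply the textbook CUPED adjustment using a pooled theta."""
    pre_mean = mean(pre_values)
    post_mean = mean(post_values)
    cov = sum((x - pre_mean) * (y - post_mean) for x, y in zip(pre_values, post_values))
    var = sum((x - pre_mean) ** 2 for x in pre_values)
    theta = cov / var if var else 0.0
    return [y - theta * (x - pre_mean) for x, y in zip(pre_values, post_values)]


# Toy data: the first three units are control, the last three are treatment.
# Treatment units happen to have higher pre-experiment values (pre-experiment bias).
pre = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
post = [2.0, 4.5, 5.5, 8.5, 9.5, 12.0]
adjusted = cuped_adjust(pre, post)

print(variance(post), variance(adjusted))        # variance drops after adjustment
print(mean(post[3:]), mean(adjusted[3:]))        # treatment mean shifts downward
```

The overall mean is unchanged, but each group's mean moves according to how its pre-experiment values compare with the pooled average, which is why CUPED and non-CUPED group means can differ.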
Metric types
In some cases, users assume that their input data is binomial, meaning the metric takes only the values 1 or 0 at a unit/user level. If this isn't the case and the metric is configured as a count or sum, the results can show much higher variance than expected.
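A quick way to test the assumption is to check whether every unit-level value is 0 or 1, and to compare the variance of the raw values against a binarized version. The sample counts and the helper name is_binary below are hypothetical.

```python
from statistics import variance


def is_binary(values):
    """True when every unit-level value is exactly 0 or 1."""
    return all(v in (0, 1) for v in values)


# Hypothetical per-user event counts: most users have 0 or 1, a few have many.
event_counts = [0, 0, 1, 3, 0, 12, 1, 0, 0, 5]
did_event = [1 if c > 0 else 0 for c in event_counts]

counts_are_binary = is_binary(event_counts)
count_variance = variance(event_counts)
binary_variance = variance(did_event)
print(counts_are_binary, count_variance, binary_variance)
```

If the check fails, the count or sum metric is working as configured, and the wider confidence intervals reflect the heavier-tailed data rather than an error.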