
Introduction
Statsig Cloud can directly ingest data from your Data Warehouse. This lets you send raw events and pre-computed metrics for tracking and experimental measurement. We currently support ingestion from the following providers:
Warning: You can make multiple data connections to your project, but only a single export connection is supported.
How it works
In the Statsig console, you can:
- Set up a connection to your data warehouse
- Query your data warehouse for appropriate data
- Map your data fields to Statsig’s expected schema
- Bulk ingest & schedule future ingestions
How to Begin Data Ingestion
To begin ingestion from a Data Warehouse:
- Go to your Statsig Console
- Navigate to the Data tab on the side navigation bar
- Go to the “Ingestion” tab

Connection Flow
See the docs sidebar to find the documentation for the data warehouse of your choice. Once connected, you will provide a SQL query that generates a view of the data for Statsig to ingest.
Data Mapping
After connecting and providing a SQL query, you’ll map columns in your data output to fields Statsig expects. We’ll run a small sample query to ensure there are no basic issues with data types. To process data correctly, Statsig requires each ingestion to include columns for unit_id, event_name, timestamp, and metadata.
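Before mapping, it can help to confirm that your query output covers the required fields. The following is a minimal sketch of such a check, not part of Statsig's tooling: the column names come from the paragraph above, while the `missing_columns` helper and the sample mapping are hypothetical.

```python
# Required fields as listed in the Data Mapping section.
REQUIRED_COLUMNS = ("unit_id", "event_name", "timestamp", "metadata")


def missing_columns(mapped_columns):
    """Return the required fields that are absent from the mapped columns.

    Hypothetical helper for a local pre-check; Statsig runs its own
    sample query during mapping.
    """
    present = set(mapped_columns)
    return [name for name in REQUIRED_COLUMNS if name not in present]


# Example: a mapping that forgot the metadata column.
sample_mapping = ["unit_id", "event_name", "timestamp"]
print(missing_columns(sample_mapping))
```

An empty list means every required field is covered by the mapping.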
Scheduling Ingestion & Backfilling
Statsig supports multiple schedules for ingestion. At the scheduled window, we check whether data is present in your warehouse for the latest date, and load it if it exists. We also check the underlying source table for changes: for up to 3 days after the initial ingestion, we check for >5% changes in row counts and reload the data if we find one. We also support a user-triggered backfill. This can be useful if a specific metric definition has changed, or if you want to resync data older than a few days. To change your ingestion schedule or start a backfill, click the ellipsis at the end of the data connection and navigate to these menus. Reloading data and backfilling metrics and events are billed like any other custom event.
Auto-generated User Accounting Metrics are not supported today for data warehouse ingestions.
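The reload rule above can be illustrated with a short sketch. This is an assumption about how such a check might be expressed, not Statsig's actual implementation: the text does not say how the change is computed, so the sketch measures it relative to the originally ingested row count, and the function name and parameters are hypothetical.

```python
def needs_reload(ingested_rows, current_rows, days_since_ingestion,
                 threshold=0.05, window_days=3):
    """Decide whether a row-count change would trigger a reload.

    Assumptions (not confirmed by the docs): the change is measured
    relative to the originally ingested row count, and the check only
    applies within `window_days` of the initial ingestion.
    """
    if days_since_ingestion > window_days:
        return False  # outside the re-check window
    if ingested_rows == 0:
        return current_rows > 0  # any new data counts as a change
    change = abs(current_rows - ingested_rows) / ingested_rows
    return change > threshold


print(needs_reload(1000, 1100, days_since_ingestion=1))  # 10% change, day 1
print(needs_reload(1000, 1100, days_since_ingestion=5))  # same change, day 5
```

For data older than the re-check window, a user-triggered backfill is the way to resync.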
Troubleshooting Ingestions
If any ingestion errors occur, Statsig will notify you in the project and direct you to the Ingestions page. You can diagnose an error directly in Statsig by following the step-by-step triage flow. Common errors include missing permissions and out-of-date credentials.
API Triggered Ingestion (mark_data_ready)
Enterprise customers can trigger ingestion for metrics or events using the Statsig API. This runs your daily ingestion immediately after triggering, which can be helpful for companies whose data availability timing varies from day to day and who want data to land in Statsig as soon as possible. To enable this, select “API Triggered” as your ingestion schedule. Note that with this enabled, there will be no automatic ingestion, but we will still re-sync data after the initial ingestion if we observe a change.
To trigger ingestion, send a POST request to the https://api.statsig.com/v1/mark_data_ready_dwh endpoint using your Statsig API key. An example would be:
- datestamps: Refers to the date of the data being triggered.
- type: metrics or events
- sources (only for multi-source ingestions): Array of strings representing the sources to trigger
This endpoint is rate limited to once every two hours, and there may be a delay of a few minutes after triggering before the status updates while compute resources are created.
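The trigger request described above can be sketched in Python as follows. This is a hedged illustration, not an official client: the endpoint and the field names (datestamps, type, sources) come from this page, but the `statsig-api-key` header name, the JSON body encoding, and the date string format are assumptions, and `build_trigger_request` is a hypothetical helper.

```python
import json
import urllib.request

ENDPOINT = "https://api.statsig.com/v1/mark_data_ready_dwh"


def build_trigger_request(api_key, datestamps, data_type, sources=None):
    """Build (but do not send) a POST request that triggers ingestion.

    Assumptions: the API key is passed in a `statsig-api-key` header and
    the body is JSON. Check both against the Statsig API reference.
    """
    if data_type not in ("metrics", "events"):
        raise ValueError("data_type must be 'metrics' or 'events'")
    payload = {"datestamps": datestamps, "type": data_type}
    if sources is not None:
        # Only relevant for multi-source ingestions.
        payload["sources"] = list(sources)
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "statsig-api-key": api_key,  # assumed header name
            "Content-Type": "application/json",
        },
        method="POST",
    )


# The date format here is an assumption for illustration.
request = build_trigger_request("YOUR_API_KEY", "2023-02-20", "metrics")
# To send it: urllib.request.urlopen(request)
```

Building the request separately from sending it makes it easy to inspect the payload first, which matters given the two-hour rate limit.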