Client Event Logger Redesign

Troubleshooting reference for Statsig client event logger messages, including initialization errors, network failures, and missing exposure logs.

Version 3.32.0 of all client SDKs introduces a new event logger architecture focused on smart retry/backoff, improved batching, bounded queues, and new flush mechanisms.

What changed

Logging now uses coordinated batching/scheduling instead of a single queue.
Retries now happen at the batch level: Statsig requeues a failed batch and retries it on a schedule.
Queue growth is explicitly bounded. Under sustained pressure, Statsig drops events by design to protect stability.
Adds limit flushing (size-based) and scheduled flushing (time-based).
The loggingIntervalMs option is now deprecated.

Architecture (high level)

PendingEvents: in-memory collection of newly logged events.
BatchQueue: queue of batched events waiting to send.
FlushCoordinator: controls flush timing and modes.
EventSender: performs network sends and emits flush lifecycle events.
FlushInterval: manages cooldown/backoff timing.
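As a rough illustration of how PendingEvents could feed BatchQueue, the sketch below moves full batches out of a pending list and leaves any partial remainder in place. The function name takeFullBatches and its signature are hypothetical and are not part of the SDK; the real implementation may differ.

```typescript
// Hypothetical helper, not an SDK API: splits full batches off the front of
// the pending list and leaves any partial remainder in place.
function takeFullBatches<T>(pending: T[], batchSize: number): T[][] {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const batches: T[][] = [];
  while (pending.length >= batchSize) {
    // splice removes the first batchSize events from pending and returns them
    batches.push(pending.splice(0, batchSize));
  }
  return batches;
}

// Usage: 7 pending events with a batch size of 3 yield two full batches,
// and one event stays pending until more events arrive or a flush occurs.
const pendingEvents = ["e1", "e2", "e3", "e4", "e5", "e6", "e7"];
const fullBatches = takeFullBatches(pendingEvents, 3);
```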

Flush mechanisms

Limit flush: flushes when a full batch is reached and backoff is satisfied. Performs opportunistic draining and keeps flushing as long as each network send succeeds. Falls into backoff upon failure.
Scheduled:full-batch flush: scheduler flushes full batches when cooldown allows.
Scheduled:max-time flush: scheduler flushes partial batches when max interval (60s) is reached.
Manual flush: explicit client.flush().
Shutdown flush: best-effort flush on explicit client.shutdown(). Events that fail to send during shutdown are persisted to local storage.
Quick flush: startup optimization for first-event latency. Flushes within a 200ms window.
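The limit and scheduled flush rules above can be read as a small decision function. The sketch below is one interpretation under stated assumptions: the type names, the chooseFlush function, and the snapshot fields are hypothetical, and only the 60s max interval comes from the description above. Manual, shutdown, and quick flushes are left out because they are triggered explicitly or at startup.

```typescript
// Hypothetical names throughout; this is not the SDK's internal API.
type FlushKind = "limit" | "scheduled:full-batch" | "scheduled:max-time" | "none";
type Trigger = "event-logged" | "scheduler-tick";

interface LoggerSnapshot {
  pendingEvents: number;    // events waiting to be sent
  batchSize: number;        // events per full batch
  msSinceLastFlush: number; // time elapsed since the previous flush
  cooldownMs: number;       // current cooldown/backoff requirement
}

const MAX_FLUSH_INTERVAL_MS = 60_000; // max interval (60s) from the text above

function chooseFlush(trigger: Trigger, s: LoggerSnapshot): FlushKind {
  const hasFullBatch = s.pendingEvents >= s.batchSize;
  const cooldownElapsed = s.msSinceLastFlush >= s.cooldownMs;

  // A full batch flushes only once the cooldown/backoff is satisfied.
  if (hasFullBatch && cooldownElapsed) {
    return trigger === "event-logged" ? "limit" : "scheduled:full-batch";
  }
  // Partial batches are flushed by the scheduler when the max interval is reached.
  if (
    trigger === "scheduler-tick" &&
    !hasFullBatch &&
    s.pendingEvents > 0 &&
    s.msSinceLastFlush >= MAX_FLUSH_INTERVAL_MS
  ) {
    return "scheduled:max-time";
  }
  return "none";
}
```

In this reading, the same full batch results in a limit flush when the trigger is a newly logged event and a scheduled:full-batch flush when the trigger is the scheduler.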

Retry nuances

Statsig requeues and retries failed batches.
- Statsig doesn't requeue batches that fail with non-retryable errors; it drops them.
- Each batch gets 3 retries; Statsig drops batches that exceed that threshold.
Backoff adjusts with success/failure and affects scheduled flush timing.
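A minimal sketch of the retry rules above, assuming a per-batch retry counter. The names Batch, onSendFailure, and nextCooldownMs are hypothetical. The doubling backoff with a reset on success is an assumption made for illustration only: the text states that backoff adjusts with success and failure but does not specify the formula or its bounds.

```typescript
// Hypothetical types and functions; not the SDK's internal API.
interface Batch {
  events: string[];
  retries: number; // retries already used by this batch
}

type FailureOutcome = "requeued" | "dropped:non-retryable" | "dropped:max-retries";

const MAX_RETRIES = 3; // each batch gets 3 retries, per the text above

function onSendFailure(batch: Batch, retryable: boolean, queue: Batch[]): FailureOutcome {
  if (!retryable) return "dropped:non-retryable";
  if (batch.retries >= MAX_RETRIES) return "dropped:max-retries";
  batch.retries += 1;
  queue.push(batch); // requeue for a later scheduled retry
  return "requeued";
}

// ASSUMPTION: double the cooldown on failure, reset to the minimum on success.
// The 1s and 60s bounds are illustrative values, not documented SDK values.
function nextCooldownMs(currentMs: number, succeeded: boolean, minMs = 1_000, maxMs = 60_000): number {
  return succeeded ? minMs : Math.min(currentMs * 2, maxMs);
}
```

With this counter, a batch is sent at most four times in total: the initial attempt plus three retries.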

Drop scenarios (important)

Batch queue overflow during batching or requeue: if events in the batch queue exceed capacity, Statsig drops the oldest batches.
- Queue capacity is batch size (default: 100) * max number of batches (30), which is 3,000 events with the default batch size.
- Increase queue capacity by increasing the batch size with the loggingBufferMaxSize option.
If the batch queue is full and a failed batch can't be requeued, Statsig drops the entire batch.
Non-retryable network failure.
Max retries exceeded.
Storage persistence failure (disabled/shutdown paths).
Persisted-event cap exceeded (oldest events trimmed). Local storage has a hard maximum of 500 events.
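To make the overflow rule concrete, the sketch below models a batch queue that drops its oldest batches once it holds more than the maximum number of batches, and computes the resulting event capacity. BoundedBatchQueue and queueCapacity are hypothetical names; only the defaults (batch size 100, 30 batches) come from the text above. The other drop scenarios, such as the persisted-event cap, are not modeled.

```typescript
// Hypothetical model of the drop-oldest overflow rule; not the SDK's implementation.
const MAX_BATCHES = 30;

class BoundedBatchQueue<T> {
  private batches: T[][] = [];
  private readonly maxBatches: number;
  droppedEvents = 0; // running count of events lost to overflow

  constructor(maxBatches: number = MAX_BATCHES) {
    this.maxBatches = maxBatches;
  }

  enqueue(batch: T[]): void {
    this.batches.push(batch);
    // On overflow, drop from the front so the newest batches survive.
    while (this.batches.length > this.maxBatches) {
      const oldest = this.batches.shift();
      this.droppedEvents += oldest ? oldest.length : 0;
    }
  }

  get size(): number {
    return this.batches.length;
  }

  peekOldest(): T[] | undefined {
    return this.batches[0];
  }
}

// Capacity in events = batch size * max number of batches.
function queueCapacity(batchSize: number = 100, maxBatches: number = MAX_BATCHES): number {
  return batchSize * maxBatches;
}
```

With the defaults, queueCapacity() evaluates to 3,000 events, and raising the batch size to 200 doubles that to 6,000.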

Behavioral impact for upgrades

Under very high throughput or long outages, Statsig may drop events by design.
- Adjust your batch size based on throughput and logging volume to avoid dropping events due to queue size limits. Contact Statsig Support if you require higher throughput.
Statsig drops events in batches that fail with non-retryable errors.
Flushing cadence has changed. The new flushing mechanisms described above replace the adjustable background tick flush.
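One way to reason about batch size is to work backward from expected throughput and the longest outage you want to ride out without overflow. The helper below is a back-of-the-envelope sketch: minBatchSizeFor is a hypothetical function, the 30-batch limit comes from the drop scenarios above, and the estimate ignores retries and events sent successfully during the window. The options object is illustrative; loggingBufferMaxSize is the only name taken from this page.

```typescript
// Hypothetical sizing helper; not part of the SDK.
function minBatchSizeFor(eventsPerSecond: number, outageSeconds: number, maxBatches: number = 30): number {
  const eventsToHold = eventsPerSecond * outageSeconds;
  // The queue holds batchSize * maxBatches events, so solve for batchSize.
  return Math.max(1, Math.ceil(eventsToHold / maxBatches));
}

// Example: 20 events/second during a 5-minute outage is 6,000 events,
// which needs a batch size of at least 200 across 30 batches.
const suggestedBatchSize = minBatchSizeFor(20, 300);

// Illustrative options object to pass when constructing the client.
const options = { loggingBufferMaxSize: suggestedBatchSize };
```

Larger batches also mean larger request payloads, so treat the result as a starting point rather than a recommendation.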
