Log Metrics
Log metrics are a special case of [Sum](./sum) and [Count](./count) metrics in which the unit-level metric value is log-transformed before pulse results are calculated. You can enable this in the advanced settings of a Sum or Count metric. The transform defaults to the natural log, but you can specify a custom base.
Use cases
Log metrics are useful for understanding whether the distribution of a log-normal or tail-driven metric has shifted. Statsig calculates the metric as a conditional mean: a ratio metric where the numerator is the sum of unit-level log values and the denominator is the count of units with a valid log. Units without a valid log (a missing or non-positive value) are filtered out rather than counted as 0, because imputing 0s for logs doesn't work without a treatment such as an inverse hyperbolic sine function.
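To illustrate why zeros need special handling, the following minimal sketch compares the natural log with the inverse hyperbolic sine. It is only an illustration of the math, not Statsig's implementation. The helper name `ihs` is hypothetical.

```python
import math

def ihs(x: float) -> float:
    """Inverse hyperbolic sine: ln(x + sqrt(x^2 + 1)). Defined for every real x."""
    return math.asinh(x)

for x in [0.0, 1.0, 10.0, 1000.0]:
    # ln(x) is undefined at 0, so there is no natural value to impute for a zero unit
    log_x = math.log(x) if x > 0 else None
    # For large x, asinh(x) approaches ln(2x) = ln(x) + ln(2)
    print(x, log_x, ihs(x))
```

For large inputs the two transforms differ by roughly the constant ln(2), which is why IHS is often described as a log-like transform that accepts 0.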
Common uses include revenue, time spent, or other metrics where a small portion of users drives most of the value but bulk improvements matter. Log metrics measure relative change per unit: an increase of 1 corresponds to a multiplication by the log's base.
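Because the mean of logs is the log of the geometric mean, a difference in a log metric between two groups can be read as a ratio of geometric means. The sketch below shows that conversion under this reading. The function name `relative_change` and the sample values are hypothetical and are not part of Statsig's product.

```python
import math

def relative_change(delta_mean_log: float, base: float = math.e) -> float:
    """Convert a difference in mean log value into a relative change
    in the geometric mean of the unit-level values."""
    return base ** delta_mean_log - 1.0

def mean_log(values, base=math.e):
    """Mean of the logs of strictly positive values."""
    return sum(math.log(v, base) for v in values) / len(values)

# Hypothetical data: every treatment unit has double the control value
control = [10.0, 100.0]
treatment = [20.0, 200.0]
delta = mean_log(treatment, 10) - mean_log(control, 10)
print(relative_change(delta, base=10))  # close to 1.0, i.e. a doubling
```

A delta of exactly 1 in base 10 gives a relative change of 9, meaning a tenfold multiplication, which matches the rule stated above.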
Calculation
At the unit level, log metrics run the same COUNT(1) or SUM(value) aggregation across their metric source as the underlying Count or Sum metric.
At the group level, Statsig calculates the mean as the SUM of the log of the unit-level values, divided by the count of units whose unit-level value has a valid log (the value exists and is greater than 0).
The SQL for a log metric built on a count metric looks like the following:
-- Unit Level
SELECT
  source_data.unit_id,
  exposure_data.group_id,
  COUNT(1) AS value
FROM source_data
JOIN exposure_data
  ON
    -- Only include users who saw the experiment
    source_data.unit_id = exposure_data.unit_id
    -- Only include data from after the user saw the experiment
    -- In this case exposure_data is already deduped to the "first exposure"
    AND source_data.timestamp >= exposure_data.timestamp
GROUP BY source_data.unit_id, exposure_data.group_id;

-- Group Level
SELECT
  group_id,
  -- Divide the sum of the logged values by the count of participating units
  -- (the argument order of LOG varies by SQL dialect)
  SUM(LOG(value, <base>)) / COUNT(1) AS mean
FROM unit_data
-- For a count metric this filter is implicit from the unit-level CTE,
-- but it is made explicit here because a sum metric might have negative values
WHERE value > 0
GROUP BY group_id;
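The group-level step can also be sketched in Python to make the numerator and denominator explicit. This is a minimal sketch under the assumption that unit-level rows arrive as `(group_id, value)` pairs. The function name `log_metric_by_group` and the sample rows are hypothetical.

```python
import math
from collections import defaultdict

def log_metric_by_group(unit_rows, base=math.e):
    """Return {group_id: (sum_of_logs, valid_unit_count, mean_log)}.

    Units whose value is missing or not greater than 0 are skipped,
    mirroring the WHERE value > 0 filter in the SQL.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for group_id, value in unit_rows:
        if value is None or value <= 0:
            continue  # no valid log, so the unit leaves numerator and denominator
        sums[group_id] += math.log(value, base)
        counts[group_id] += 1
    # Groups with no valid units are omitted instead of dividing by zero
    return {g: (sums[g], counts[g], sums[g] / counts[g]) for g in counts}

# Hypothetical unit-level rows
rows = [("control", 1), ("control", 100), ("control", 0),
        ("test", 10), ("test", -5), ("test", None)]
result = log_metric_by_group(rows, base=10)
```

Returning the sum and the count alongside the mean makes it visible when a change in the mean comes with a change in how many units participate.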
Methodology notes
Log metrics can be difficult to interpret and to extrapolate to topline values. Use them alongside the corresponding raw or winsorized Sum or Count metric.
There are a few ways to handle 0s in a log metric. A transformation like the inverse hyperbolic sine (IHS) can approximate the behavior of log for large values while accepting 0s as inputs. Alternatively, you can scope the analysis to non-zero units. Statsig uses the second approach for ease of interpretation, because log properties are broadly understood. This means participation rate is a potential confounding factor. To mitigate this, Statsig presents results in the same way as for ratio metrics, including statistics for the overall result and for the implicit numerator and denominator.
Options
Non-log options depend on whether the metric is a Sum or Count.
- Custom log base
  - You can configure a custom base for the log operation. Defaults to the natural log (LN).