
Autotune (Bandits)

Autotune and Autotune AI automatically balance exploration against exploitation to deliver the best-performing variant for a single metric.

Autotune and Autotune AI are Statsig's Multi-Armed Bandit solutions. They dynamically allocate traffic toward the high-performing variants in a group of candidates, optimize for a single target metric, and can eventually identify a winning variant.

How Autotune works

Autotune is Statsig's Bayesian Multi-Armed Bandit. It tests and measures different variations and their effect on a target outcome. The multi-armed bandit continuously adjusts traffic toward the best-performing variations until it can confidently pick the best variation, which then receives 100% of traffic.

Bandits balance the explore/exploit problem: exploiting the current best-known solution versus exploring to gather more information about other solutions.
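
The explore/exploit tradeoff can be illustrated with a small simulation. The following is a minimal sketch of Thompson sampling with Beta posteriors over a binary conversion metric. It is a common Bayesian bandit technique, not a description of Statsig's actual implementation. The function names, the 95% stopping threshold, and the simulated conversion rates are all assumptions made for illustration.

```python
import random


def thompson_pick(successes, failures, rng):
    """Draw one plausible conversion rate per variant and pick the highest.

    Variants with little data have wide posteriors, so they still get picked
    sometimes (explore); variants with strong results win most draws (exploit).
    """
    samples = [rng.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)]
    return max(range(len(samples)), key=samples.__getitem__)


def prob_best(successes, failures, rng, draws=2000):
    """Monte Carlo estimate of the probability that each variant is the best."""
    wins = [0] * len(successes)
    for _ in range(draws):
        wins[thompson_pick(successes, failures, rng)] += 1
    return [w / draws for w in wins]


def run_bandit(true_rates, max_users=20000, threshold=0.95, check_every=500, seed=0):
    """Simulate users one at a time; stop once a variant is confidently best.

    true_rates are hypothetical conversion rates used only to simulate feedback.
    Returns (winner index or None, users seen, users assigned to each variant).
    """
    rng = random.Random(seed)
    k = len(true_rates)
    successes, failures = [0] * k, [0] * k
    for n in range(1, max_users + 1):
        arm = thompson_pick(successes, failures, rng)
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
        if n % check_every == 0:
            probs = prob_best(successes, failures, rng)
            best = max(range(k), key=probs.__getitem__)
            if probs[best] >= threshold:
                return best, n, [s + f for s, f in zip(successes, failures)]
    return None, max_users, [s + f for s, f in zip(successes, failures)]


if __name__ == "__main__":
    winner, users, traffic = run_bandit([0.05, 0.07, 0.12])
    print(f"winner={winner} after {users} users; traffic per variant={traffic}")
```

In a run like this, the traffic counts are typically skewed toward the stronger variant well before a winner is declared, which is the behavior described above: the split shifts as evidence accumulates rather than staying fixed.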

The blog posts on Multi-Armed Bandits and Contextual Bandits go into depth on use cases and considerations.

Implementing Autotune

To implement Autotune, check the Autotune experiment through a Statsig SDK. On server SDKs, or on client SDKs after initialization, the check has sub-millisecond latency.

Autotune has a JSON config associated with each variant. The SDK returns this config, which you can use to modify elements of your webpage (for example, an image URL or button color) or to identify which variant is active so you know which code to run.
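
Both ways of using the config can be sketched as follows. This example does not call the Statsig SDK: a plain dictionary stands in for the JSON config the SDK returns, and the keys (button_color, image_url, flow) and helper names are hypothetical. Treat it as an illustration of the pattern, under those assumptions, rather than as the SDK's API.

```python
from typing import Any, Mapping

# Hypothetical defaults, used when a key is missing from the variant config
# (for example, if the check fails and an empty config comes back).
DEFAULTS = {"button_color": "blue", "image_url": "/img/default.png"}


def page_settings(variant_config: Mapping[str, Any]) -> dict:
    """Use config values directly to modify page elements."""
    return {key: variant_config.get(key, default) for key, default in DEFAULTS.items()}


def signup_flow(variant_config: Mapping[str, Any]) -> str:
    """Use a config value to decide which code path to run."""
    flow = variant_config.get("flow", "classic")  # "flow" is a hypothetical key
    if flow == "one_page":
        return "render one-page signup"
    return "render classic multi-step signup"


if __name__ == "__main__":
    # Stand-in for the config of whichever variant the SDK returned.
    config = {"button_color": "green", "flow": "one_page"}
    print(page_settings(config))
    print(signup_flow(config))
```

Reading every value with a default keeps the page usable when a key is absent, so a variant only needs to specify the values it changes.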

When to use Autotune

Autotune has two major differences from A/B testing (Statsig Experiments):

  1. The traffic split isn't fixed over the duration of the test. This allows Autotune to divert more traffic to the winner and less to losers while making fewer mistakes. However, the user experience may not be consistent upon repeated visits.
  2. Autotune can only optimize for a single metric. Autotune can't accurately measure a collection of metrics, and isn't a reliable way to understand secondary effects of your changes. It works best when the metric is well-understood and has a direct, immediate relationship to the change being tested.

Because of these differences, Statsig recommends Autotune in the following scenarios:

  1. The cost of exposing users to a losing treatment is high. For example, sending new users to an inferior landing page may result in lost revenue or churn, and testing two registration flows may cause some users to never sign up. Autotune limits these losses because it shifts traffic away from underperforming variants as feedback arrives, unlike a static A/B test with a fixed split.
  2. You want the decision to be automated. Because Autotune automatically selects the winner, no human decision-making is required. This is well-suited for launching many simultaneous tests or running a long-term unmonitored test.
  3. It's acceptable for users to see different experiences upon return visits. For example, when changing text or recommendation algorithms.
  4. You have one simple metric to optimize for (for example, click-through rate) that responds immediately to the change being tested.
  5. You want to test multiple variations. Autotune quickly eliminates poor performers while focusing traffic on the best variants.

Autotune should be avoided in the following scenarios:

  1. When you have a complex ecosystem and want to understand secondary effects, tradeoffs between variants, and user behavior.
  2. When you are optimizing for complex metrics or delayed effects.

For these cases, use A/B testing with Experiments. It's also a best practice to run Autotune inside an experiment with a small holdback group that is excluded from Autotune, so you can measure Autotune's overall impact.
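
As a rough illustration of what measuring impact against a holdback means, the sketch below computes the relative lift of the Autotune group over the holdback group on a conversion metric. The function name and the numbers are made up for illustration, and the sketch ignores statistical significance, which an experiment analysis would also report.

```python
def relative_lift(autotune_conversions: int, autotune_users: int,
                  holdback_conversions: int, holdback_users: int) -> float:
    """Relative difference in conversion rate between Autotune and holdback."""
    if autotune_users <= 0 or holdback_users <= 0:
        raise ValueError("both groups need at least one user")
    autotune_rate = autotune_conversions / autotune_users
    holdback_rate = holdback_conversions / holdback_users
    if holdback_rate == 0:
        raise ValueError("holdback rate is zero; relative lift is undefined")
    return (autotune_rate - holdback_rate) / holdback_rate


if __name__ == "__main__":
    # Hypothetical counts: 12% conversion with Autotune, 10% in the holdback.
    lift = relative_lift(1200, 10000, 50, 500)
    print(f"relative lift: {lift:.1%}")
```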
