Bots & Filtering

Bot traffic from sources such as search engines and AI scrapers is a common source of frustration when monitoring gate traffic, because it makes it harder to see how many “real” users are seeing your changes. Statsig filters known bots out of your exposures data, so the exposure counts you see and any analysis built on them reflect real users rather than web scrapers.

Bot filtering applies to all types of exposures data, not just feature flags. Whenever you look at analysis results for feature flags, holdouts, layers, or experiments, known bots have already been filtered out. For more on the Bot Filtering rollout, see the Statsig Blog.

Once bot data is filtered from your exposures data, it is not viewable in the Statsig console. We’re exploring how to better surface this information in the future. Please reach out via Slack support if you have additional questions.
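As a rough illustration of what filtering bots out of exposures data means, the Python sketch below drops exposure records whose browser name appears in a set of known bots before counting them. The `Exposure` record, the abbreviated `KNOWN_BOTS` set, and `filter_bot_exposures` are hypothetical names invented for this example; they are not Statsig's internal implementation or API.

```python
from dataclasses import dataclass

# Hypothetical, heavily abbreviated bot list (assumption for this sketch).
KNOWN_BOTS = {"Googlebot", "GPTBot", "ClaudeBot"}


@dataclass(frozen=True)
class Exposure:
    user_id: str
    gate: str
    browser_name: str  # assumed to be derived from the request's user agent


def filter_bot_exposures(exposures):
    """Return only the exposures that did not come from a known bot."""
    return [e for e in exposures if e.browser_name not in KNOWN_BOTS]


exposures = [
    Exposure("u1", "new_homepage", "Chrome"),
    Exposure("crawler-1", "new_homepage", "Googlebot"),
    Exposure("u2", "new_homepage", "Safari"),
]

clean = filter_bot_exposures(exposures)
print(len(exposures), len(clean))  # prints "3 2"
```

In this sketch the bot still generated an exposure; it is only left out of the counts used for analysis.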

Controlling Gates and Experiments for Bots

By design, Statsig doesn’t block bots from getting your feature flags and experiments. It only filters their exposures out of analysis data and out of the exposure counts you see in Pulse. API and SDK results are unchanged for bots, so they are served configs and variants according to your setup. You might, however, want to deliberately restrict which features bots see. For example, you might be testing a new homepage variant that you don’t want search engines to index yet. In this case, you can do so with Segments:
  1. Create a “Known Bots” Segment for your project: Create a new segment, ensuring that it will be a conditional segment.
    New conditional segment setup for capturing known bots
    Once created, add a new rule to the segment. Set Criteria to “Browser Name”. Leave Operator as “Any Of”. In the Values field, copy and paste the following string in its entirety. When you paste it, the Statsig console splits the string into individual bot names.
    Scomplerbot, WincherBot, fixbot, keys-so-bot, MojeekBot, Gulper Web Bot, Mattermost-Bot, SerendeputyBot, uipbot, WebCrawler, HearsayPDFBot, WRTNBot, BublupBot, InsytfulBot, DingTalkBot, uk_ldfc_renderbot, crawlers, ImagesiftBot, idealo-bot, taboolabot, KlaxoonBot, SemrushBot, archiver/3.1.1 +http://www.archive.org/details/archive.org_bot, StractBot, crawler_eb_germany_2, exabot, DocBase Crawler, co Bot, Superfeedr bot, Pokey_Bot, GooglePlusBot, OtherwebBot, PubMatic Crawler Bot, SiteAuditBot, Gensparkbot, wpbot, archive.org_bot, Audisto Crawler, amazon-product-discovery-bot, Atomseobot, Googlebot-Mobile, hubspot crawler, XoviOnpageCrawler, PerplexityBot, QualifiedBot, YodaoBot, BitSightBot, GG PeekBot, SMTBot, amazonproductbot, FAST-WebCrawler, TwitterCommerceBot, WellKnownBot, PAGEFREEZER CRAWLER,  dbot, htc_botdugls, RavenCrawler, oBot, notebot, ViberBot, KStandBot, scoopit-crawler, SpeechifyBot, Spider_Bot, txt Crawler, net/bot, BugBountyBot, Letianpai_Robot, by fynd.bot, discobot, LineBotWebhook, ahrefsbot, Veoozbot, tkbot, coccocbot, Googlebot-Video, Streamline3Bot, Zoombot, adbeat_bot, msnbot, Nextdoorbot, node DuckAssistBot, redditbot, Xing Bot, DocSearch Crawler, ; bot, Storebot, playwright-bot, 47_safeAreaBottom, online-webceo-bot, SmarshBot, BeeperBot, ChannelBot, BrightEdge Crawler, //boteden, Quantcastbot, SpringserveBot, IAS Crawler, managr-webcrawler, ) Bot, Dragonbot, crawler4j, tyseobotmobile, IVW-Crawler, SEBot, sap-search-web-crawler, AndersPinkBot, Dcard-link-preview-bot, Mediumbot, Light Crawler, dataforseobot, Better Uptime Bot, CCBot, es_bot, DuckAssistBot, SeznamBot, telegrambot, crawler, Jugendschutzprogramm-Crawler, SeoCherryBot, GroupMeBot, HyperMegaBotGettingOnlyHTMLsFromYourWebsite, our-crawler, Slackbot-LinkExpanding, YandexMobileBot, Web-Crawler, PaperLiBot, Swiftbot, Paqlebot, YandexRenderResourcesBot, MetaJobBot, SynologyChatBot, GenomeCrawlerd, robot, StatusCakeBot, node bitlybot, Your robot, Pharosbot, TSMbot, WalluBot, 
slackbot, AmazonAdBot, AspiegelBot, EzoicBot, TiggeritoBot, eventseekerBot, AwarioBot, Leikibot, Timpibot, like Gecko) bot, Quora-Bot, JobBot, googlebot, PetalBot, eu bot, LinkArchiver twitter bot, DF Bot, Screaming Frog Wise SEO Spider, Clickagy Intelligence Bot, BLP_bbot, bitlybot, WazzupCrawler, web-crawler, pingbot, yoozBot, triptease-bot, Plesk screenshot bot, Magus Bot, node Screaming Frog SEO Spider, YextBot, seobilitybot, tyseobot, applebot, bingbot, GetLocalBot, TwitterBot, rogerbot, Preview Service; bot, traq-ogp-fetcher-curl-bot, seesawbot, Greppr Web Crawler, ResearchBot, web-bot, iAskBot, JobboerseBot, CriteoBot, FandomOpenGraphBot, com feedbot, Amazon-Advertising-ad-standards-bot, MotoMinerBot, peer39_crawler, Discordbot, DuckDuckGo-Favicons-Bot, KeybaseBot, adsbot, Open Graph Bot, emulate-seobots, bountybot, InfobipCrawler, GoogleBot, macox bot, Google-bot, captify-crawler, Robot, aiHitBot, fedistatsCrawler, ExtendedStayBot, NetpeakCheckerBot, com/bots, Automattic Analytics Crawler, Blog Rssbot, Dubbotbot, Rightlander Crawler, ClarityBot, Cookiebot, UOrgTestingBot, AcademicBotRTU, SEMrushBot, Server Crawler, Diffbot, DiscourseBot, chatbot, VirusTotalBot, SaberBot, TZUnfurlBot, Mail.RU_Bot, Monsidobot, YandexAccessibilityBot, preview service; bot, ecoresearchCrawler, PulsePoint-Crawler, DataForSeoBot, petalbot, Xbot, LinkedInBot, GnowitNewsbot, vebidoobot, Bawaab_bot, Brightbot, ClineCrawler, ; Bot, MSIECrawler, MoodleBot, Testcrawler, AASA-Bot, GPTBot, StrapBot, 5) bot, mj12bot, screaming frog seo spider, HubSpot Crawler, COIBotParser, OcelotBot, com crawler, Pinterestbot, VelenPublicWebCrawler, Firefox superpagesbot2, Parser Robot, GrapeshotCrawler, Mediatoolkitbot, am a bot, semrushbot, SearchAtlas Bot, DiffeoBot, IBM-Crawler, spbot, DatoCmsSearchBot, SISTRIX Crawler, bountybotttt, Summalybot, ID bot, node AppleNewsBot, DotBot, TesseractBotAgent, foundeebot, BadooBot, BacklinksExtendedBot, wowLink Crawler, about-crawlers, find-seo-bot, GraphiteBot, 
Sidetrade indexer bot, BLEXBot, rc-crawler, FacebookBot, nerdybot, Senutobot, Facebot, //botim, WallabyupBot, TurnitinBot,  XBot_Senior, node ZoominfoBot, AhrefsBot, Exabot, PhaverBot, Applebot, TermlyBot, SemjiBot, Space Unfurl Bot, Slackbot, bidswitchbot, //botsin, AdsBot-Google, Morningscore Bot, DuckDuckBot, UptimeRobot, ClaudeBot, naverbookmarkcrawler, PingdomBot, Web Crawler, PlurkBot, node GrowSEOBot, WebExplorerSearchBot, node FullStoryBot, WebwikiBot, bot, policy adbeat_bot, trendictionbot0, ezoicbot, Catrobatbot, AdsTxtCrawlerTP, com/bot, Nigooutbot, PiBot, pinterestbot, http-spiders-bot, Ocarinabot, msnbot-media, AppsFlyerBot, SeobilityBot, Impressumscrawler, SurdotlyBot, cXensebot, Amazonbot, Rankabot, 2ip bot, harsilbot, FullStoryBot, com bot, Rhobot, FreshpingBot, twitterbot, Twitterbot, Caliperbot, Googlebot-Image, osapon ) bot, yandexbot, MJ12bot, Taboolabot, ActiveComplyBot, MixrankBot, 48_safeAreaBottom, compatible; botify, LoomlyBot, Googlebot, ev-crawler, pagefreezer crawler, AwarioSmartBot, iCjobs Stellenangebote, jbot, aixnew_aibot, SemanticScholarBot, Wire LinkPreview Bot, Elastic-Crawler, UCMore Crawler, x28-job-bot, ISSCyberRiskCrawler, AnytypeBot, clever tech bot, LivelapBot, Screaming Frog SEO Spider, RyteBot, SiteGuruCrawler,  XBot, SuperBot, TypetalkBot, RepoLookoutBot, obot, TimeTreeBot, siteauditbot, iASD_SpiderBot, semaltbot, PopeTech-ScanBot, SummalyBot, aka-bot, YandexBot, HatenaBlog-bot, Googlebot-News, yacybot, SiteCheckerBotCrawler, node AwarioSmartBot, turbotime, net-Robot, reurl-bot, Google-Display-Ads-Bot, node CCBot, fr_bot, CapterraBot, seo-audit-check-bot, gptbot, Testomatobot, Snap-URL-Preview (bot, robots, ZumBot, hstspreload-bot, Fedicabot, serpstatbot, Synapse (bot, NetSeer crawler, uk_ldfc_bot, PartnerOptimizer-bot, Scrapbox Bot, domainsbot, NE Crawler, startmebot, VelaBot, SeekportBot, Radius Compliance Bot, AdkernelTopicCrawler, Linkbot, AppleNewsBot, Jones Searchbot, DropboxPreviewBot, ZoominfoBot, edansbot, 
Baiduspider-render, Python Requests, YisouSpider, OAI-SearchBot, AliyunSecBot, Baiduspider, Bytespider, TikTokSpider, Claude-SearchBot, Qwantbot, Thinkbot, ABEvalBot, HanaleiBot, OpenindexSpider, VamosBot, StartmeBot, Bingbot, parisbot, BetaLiveCrawlBot, MetaBot, CharSiuBot, PlagAwareBot, io/bot, ShowUpCrawler, ScraperBot, Monibot, ShellBot, IbouBot, MatchboxBot, CookieYesbot, Stripebot, Clearscopebot, quillbot, Sogou web spider, Checker Spider,
    
    Segment rule showing browser name filter populated with bot list
    This common segment can then be used for all your launches.
  2. Apply the Segment to your Gates and Experiments: For Gates, create a new rule that controls the bot experience.
    Feature gate rule ensuring known bots fail the rollout
    For experiments, create a Conditional Override that forces units in this segment to receive the variant you choose.
    Experiment conditional override mapping bot segment to control variant
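The two steps above can be sketched in Python to show the logic involved: splitting a pasted comma-separated string into individual names, applying an “Any Of” check on the browser name, and then using that check to fail a gate or force an experiment group. All function names here are hypothetical, the pasted string is a short excerpt of the list, and the case-insensitive comparison is an assumption of this sketch rather than documented Statsig behavior.

```python
# Short excerpt of the pasted string, including the stray spaces,
# line break, and trailing comma that the full list contains.
PASTED = "Googlebot, GPTBot,  dbot, ClaudeBot, \nbingbot,"


def split_bot_names(pasted: str) -> list[str]:
    """Split a pasted comma-separated string into unique, trimmed names."""
    seen = set()
    names = []
    for raw in pasted.replace("\n", " ").split(","):
        name = raw.strip()
        if name and name not in seen:
            seen.add(name)
            names.append(name)
    return names


def is_known_bot(browser_name: str, bot_names: list[str]) -> bool:
    """'Any Of' check. Case-insensitive matching is an assumption here."""
    lowered = {name.lower() for name in bot_names}
    return browser_name.lower() in lowered


def gate_passes(browser_name: str, bot_names: list[str], rollout_passes: bool) -> bool:
    """Evaluate the bot rule first so known bots fail the gate."""
    if is_known_bot(browser_name, bot_names):
        return False
    return rollout_passes


def experiment_group(browser_name: str, bot_names: list[str],
                     assigned_group: str, override_group: str = "Control") -> str:
    """Mimic a conditional override: known bots always get override_group."""
    if is_known_bot(browser_name, bot_names):
        return override_group
    return assigned_group


names = split_bot_names(PASTED)
print(names)
print(gate_passes("Googlebot", names, rollout_passes=True))
print(experiment_group("Chrome", names, assigned_group="Test"))
```

Placing the bot rule ahead of the rollout rule matters in this sketch: if the rollout were checked first, a bot that happened to pass it would never reach the bot rule.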

Opting Out of Bot Filtering

Bot filtering is done at the project level. Admins can opt out of filtering through their console settings.
Bot filtering settings interface

Suggesting New Bots to Statsig

If you discover bots that aren’t in Statsig’s default set, or your company manages internal bots that you’d like included in Statsig’s bot filtering, please reach out to us in Slack.