Integrating your product with Statsig means depending on Statsig, and we take reliability seriously. Below are questions people commonly ask when evaluating the risks. If your question is not listed here, please reach out on Slack.
Does Statsig use any caching to help with latency?
We use a combination of caching solutions, depending on the problem we are solving. For serving our console and API requests, most caching is done at the region or host level.
What else does Statsig do to make sure the service is resilient?
Our SDKs are designed to stay resilient when API requests fail, so that your users have a seamless experience.
Client SDKs:
The SDKs use the latest values from the Statsig server if the user can reach it;
If not, they use cached values from a previous session, if available;
If no cached values exist, they fall back to the default values that the APIs require you to set in your code. In the worst case, your users get the default experience.
The SDKs will automatically retry failed event requests if for some reason Statsig event servers are unreachable. The Client SDKs will even persist failed log requests to local storage and retry in subsequent sessions.
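The fallback order described above can be sketched as follows. This is a minimal illustration and not the Statsig SDK implementation: `resolve_gate`, `fetch_from_network`, and `cached_values` are hypothetical names chosen for this example, and a plain dictionary stands in for whatever on-device storage a client SDK uses.

```python
from typing import Callable, Optional


def resolve_gate(
    gate_name: str,
    fetch_from_network: Callable[[str], Optional[bool]],  # hypothetical network call
    cached_values: dict[str, bool],  # stand-in for values saved by an earlier session
    default: bool,
) -> tuple[bool, str]:
    """Return (value, source) using the order network -> cache -> default."""
    try:
        fresh = fetch_from_network(gate_name)
        if fresh is not None:
            cached_values[gate_name] = fresh  # keep for later sessions
            return fresh, "network"
    except OSError:
        # ConnectionError and TimeoutError are subclasses of OSError;
        # fall through to the cached value.
        pass
    if gate_name in cached_values:
        return cached_values[gate_name], "cache"
    # Nothing reachable and nothing cached: use the default set in code.
    return default, "default"
```

Returning the source alongside the value makes it easy to log or debug which tier served a given check.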
Server SDKs:
They store rules for gates and experiments in memory, so they can continue evaluating them even if Statsig is down.
There is also an option to “bootstrap” your server SDKs with rule values from a previous session if Statsig is down when your server is starting up. This is done with a Server Data Store, which lets you plug a storage provider into the Statsig SDK to store your rule values.
The SDKs will automatically retry failed event requests if for some reason Statsig event servers are unreachable.
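To illustrate the idea of holding rules in memory and bootstrapping from a storage provider, here is a minimal sketch. It is not the actual Statsig Data Store interface: `RuleStore`, `InMemoryStore`, `RuleCache`, `RULES_KEY`, and the shape of the rules dictionary are all assumptions made for this example.

```python
import json
from typing import Callable, Optional, Protocol


class RuleStore(Protocol):
    """Hypothetical storage provider; not the actual Statsig interface."""

    def get(self, key: str) -> Optional[str]: ...
    def set(self, key: str, value: str) -> None: ...


class InMemoryStore:
    """Stand-in for Redis, a file, or any other persistent store."""

    def __init__(self) -> None:
        self._data: dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def set(self, key: str, value: str) -> None:
        self._data[key] = value


RULES_KEY = "rules"  # hypothetical key name


class RuleCache:
    """Holds rules in process memory so evaluation needs no network call."""

    def __init__(self, fetch_rules: Callable[[], dict], store: RuleStore) -> None:
        self._fetch_rules = fetch_rules  # hypothetical call to the rules endpoint
        self._store = store
        self.rules: dict = {}

    def start(self) -> str:
        """Load rules at startup and report where they came from."""
        try:
            self.rules = self._fetch_rules()
            # Save a copy so a later startup can bootstrap from it.
            self._store.set(RULES_KEY, json.dumps(self.rules))
            return "network"
        except OSError:
            saved = self._store.get(RULES_KEY)
            if saved is not None:
                self.rules = json.loads(saved)
                return "store"
            return "empty"

    def check_gate(self, name: str) -> bool:
        # Evaluated purely from memory; unknown gates are treated as off.
        return bool(self.rules.get(name, False))
```

In this sketch, a server that starts while the rules endpoint is unreachable still evaluates gates using the copy written by an earlier, healthy startup.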
We use GitHub and Docker Hub for code and binary storage. We keep track of the entire CI/CD process from source code to production deployment with traceable versioning and binary verification.