At 00:06 UTC on March 17, 2024, we observed increased error rates and latency for History calls in our Tokyo region. We traced the errors and latency to our third-party storage provider and alerted them. The provider restarted the impacted storage nodes, and the issue was resolved at 00:47 UTC on March 17, 2024.
To prevent a similar issue from occurring in the future, we have added monitoring of swap space usage on our servers so that we have better alerting if such issues with our third-party provider recur.
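
For illustration, a swap usage check of the kind described above might look like the following minimal Python sketch. It is not the production monitoring: the 80% threshold and the function names are hypothetical, and it assumes a Linux host that exposes `/proc/meminfo`.

```python
from pathlib import Path

# Hypothetical alert threshold; the report does not state the real value.
SWAP_ALERT_THRESHOLD_PERCENT = 80.0


def parse_swap_usage_percent(meminfo_text: str) -> float:
    """Return the percentage of swap in use, given the text of /proc/meminfo."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            # The first token is the numeric value (in kB for memory fields).
            fields[key.strip()] = int(parts[0])
    total = fields.get("SwapTotal", 0)
    free = fields.get("SwapFree", 0)
    if total == 0:
        return 0.0  # no swap configured, so nothing can be in use
    return 100.0 * (total - free) / total


def swap_alert_needed(
    meminfo_text: str, threshold: float = SWAP_ALERT_THRESHOLD_PERCENT
) -> bool:
    """Return True when swap usage is at or above the alert threshold."""
    return parse_swap_usage_percent(meminfo_text) >= threshold


if __name__ == "__main__":
    # Assumes a Linux host; /proc/meminfo does not exist on other systems.
    text = Path("/proc/meminfo").read_text()
    usage = parse_swap_usage_percent(text)
    status = "ALERT" if swap_alert_needed(text) else "ok"
    print(f"swap usage: {usage:.1f}% ({status})")
```

A check like this would typically run on a schedule and feed an existing alerting system, so that rising swap usage is flagged before it shows up as latency or errors.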