10:20-11:02 AM PDT
The Presence service experienced increased latency and timeouts in all regions.
The Presence service DB was recently rolled over to a new Redis DB cluster, after which the timeouts were noticed. As an immediate remedy, the DB was rolled back to the previous version. A step in the roll-back process then failed, which increased the timeouts. The failing step was immediately identified and fixed, which resolved the issue and returned the service to a healthy state.
The roll-back process will be revisited to prevent failing steps in the future.