Problem Description, Impact, and Resolution
At 18:16 UTC on June 13, 2024, we observed increased latency in the delivery of mobile push messages in our Frankfurt and US-East points of presence. In response, we increased the resources available to the service and redeployed it. The issue was resolved at 21:21 UTC on June 13, 2024.
Upon further investigation, we determined that the issue was caused by malformed message payloads, which created a backlog in the message queue.
To prevent a similar issue from occurring in the future, we increased the memory allocated to the service so that it can handle similar malformed payloads, and we put additional monitoring in place.