Degraded performance of Push, Presence, and Storage services in the US East PoP
Incident Report for PubNub
Postmortem

Problem Description, Impact, and Resolution 

At 15:45 UTC on 2022-06-27, we observed delays in push notifications sent and messages written to history, as well as excess presence join and leave events. In response, we scaled the underlying systems supporting these services, and the issue was resolved at 16:05 UTC. This issue occurred because our third-party service provider experienced an outage in the US-East PoP.

Mitigation Steps and Recommended Future Preventative Measures 

To prevent a similar issue from occurring in the future, we are updating our processes to ensure that malfunctioning nodes are restarted in a way that will preserve their state for analysis, as well as updating our runbook for scaling the system.

Posted Jun 29, 2022 - 20:15 UTC

Resolved
We are resolving this issue, and we will follow up with a post-mortem soon.

We apologize for any impact this may have had on your service. Please reach out to us by contacting PubNub Support (support@pubnub.com) if you wish to discuss the impact on your service.
Posted Jun 27, 2022 - 16:57 UTC
Monitoring
Between 15:45 and 16:05 UTC on 2022-06-27, a small percentage of traffic published from a single PoP in the US East region experienced delayed Push notifications, Presence events, and Storage writes.

We apologize for the impact this may have had on your service. Please reach out to us by contacting PubNub Support (support@pubnub.com) if you'd like to discuss the impact on your service.
Posted Jun 27, 2022 - 16:16 UTC
This incident affected: Points of Presence (North America Points of Presence) and Realtime Network (Storage and Playback Service, Presence Service, Mobile Push Gateway).