The Presence service reported errors globally
Incident Report for PubNub
Postmortem

Problem Description, Impact, and Resolution 

At 07:30 UTC on 2021-09-27, we observed an elevated number of failed requests to our Presence service over a 6-minute period, which may have caused some clients to appear to go offline and then rejoin. An infrastructure provider notified us of degraded performance in a service we use to maintain presence state, and quickly resolved the issue before we were able to perform mitigation steps. The issue was resolved at 07:36 UTC on 2021-09-27.

Mitigation Steps and Recommended Future Preventative Measures 

To prevent a similar issue from occurring in the future, we are working with our infrastructure provider to determine how this storage module can be distributed to better protect us from failures in a single node or region.

Posted Oct 05, 2021 - 15:52 UTC

Resolved
The issues have not resurfaced since 07:36 UTC. We will follow up with a post-mortem soon.

We apologize for the impact this may have had on your service. Please reach out to us by contacting PubNub Support (support@pubnub.com) if you wish to discuss the impact on your service.
Posted Sep 27, 2021 - 08:14 UTC
Monitoring
Between 07:30 UTC and 07:36 UTC, customers using our Presence APIs will have experienced a high rate of errors and increased latency. Devices might have falsely timed out and rejoined during this time. The incident was due to an issue experienced by an underlying platform provider. Everything has recovered at this point, and we will continue to monitor.

Please reach out to us by contacting PubNub Support (support@pubnub.com) if you wish to discuss the impact on your service.
Posted Sep 27, 2021 - 08:00 UTC
This incident affected: Points of Presence (North America Points of Presence, European Points of Presence, Asia Pacific Points of Presence, Southern Asia Points of Presence) and Realtime Network (Presence Service).