Increased Error Rate and Latency for Presence

Incident Report for PubNub

Postmortem

Problem Description, Impact, and Resolution 

At 18:14 UTC on September 7, 2025, we observed increased error rates and latency for our Presence service in our San Jose, Virginia, and Tokyo regions. We increased capacity in those regions, and the issue was resolved at 18:17 UTC. This was a recurrence of the incident on September 2, 2025, in which a bug in one of our APIs allowed a request, in extreme cases, to execute an operation that exceeded assumed limits, causing out-of-memory conditions in the Presence service.

Mitigation Steps and Recommended Future Preventative Measures 

After the previous instance of this issue, we placed restrictions on the API in question. Those restrictions were not strict enough, which allowed this recurrence. We have corrected that oversight and have also increased memory capacity in this area of our system as an additional safeguard.
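
As an illustration only, the kind of restriction described above can be sketched as a hard cap that is checked before a request does any work. This report does not name the affected API or the actual limits, so every name and value below (`MAX_ITEMS_PER_REQUEST`, `BulkRequest`, `validate_request`, `handle_request`) is a hypothetical assumption and not PubNub's implementation.

```python
from dataclasses import dataclass

# Hypothetical cap: the report does not state the real limit or the affected API.
MAX_ITEMS_PER_REQUEST = 1000


class RequestTooLargeError(ValueError):
    """Raised when a request asks for more work than the configured cap allows."""


@dataclass(frozen=True)
class BulkRequest:
    # Hypothetical request shape: a collection of identifiers to operate on.
    items: tuple[str, ...]


def validate_request(request: BulkRequest, max_items: int = MAX_ITEMS_PER_REQUEST) -> None:
    """Reject the request up front if it exceeds the cap, before doing any work on it."""
    if len(request.items) > max_items:
        raise RequestTooLargeError(
            f"request has {len(request.items)} items; limit is {max_items}"
        )


def handle_request(request: BulkRequest) -> int:
    """Validate first, then run the (placeholder) operation."""
    validate_request(request)
    # Placeholder for the real operation: here we only count the items.
    return len(request.items)
```

Checking the cap before the operation starts means an oversized request fails fast with a clear error instead of consuming memory shared with other requests. Extra memory headroom, as mentioned above, then acts as a second safeguard and not as the primary one.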

Posted Sep 12, 2025 - 23:21 UTC

Resolved

The issue affected the Presence service from 18:14 to 18:17 UTC. We will follow up with a postmortem soon.

We apologize for the impact this may have had on your service. Please contact PubNub Support (support@pubnub.com) if you wish to discuss the impact on your service.
Posted Sep 07, 2025 - 19:02 UTC

Monitoring

We experienced increased latency and error rates in our Tokyo, San Jose, and Virginia points of presence from 18:14 to 18:17 UTC. A fix was deployed and we are monitoring the situation.
Posted Sep 07, 2025 - 18:45 UTC
This incident affected: Realtime Network (Presence Service).