Potential for some missed messages for subscribers in IAD

Incident Report for PubNub

Postmortem

Problem Description, Impact, and Resolution 

On October 17, 2025 at 04:51 UTC, some customers may have experienced elevated latency and error rates with the Pub/Sub service in the IAD region (US-East). Our engineering teams began investigating immediately and identified a spike in errors related to a recent update to the Pub/Sub service.

We began formal incident response and initiated rollback of the service deployment shortly thereafter. The issue was fully resolved by 06:50 UTC, and rollback across all regions was completed by 08:00 UTC.

The issue occurred because a misconfiguration in the release caused incorrect behavior in the channel cleanup logic. Additionally, our alerting configuration did not include coverage for the synthetic test failures that would have surfaced this issue sooner, delaying detection.

Mitigation Steps and Recommended Future Preventative Measures 

To prevent a similar issue from occurring in the future, our engineering teams have written a simpler and more reliable replacement for the faulty logic. That code is currently undergoing rigorous testing before being reintroduced in a future release.

We are also addressing the gap in alerting that contributed to the delayed response. We have reviewed our synthetic tests and will add alerting on their failures so that similar regressions are detected earlier. In parallel, we are updating our development and testing processes to catch such issues before code reaches production. Lastly, we are conducting refresher training on our incident response process to ensure faster execution and coordination in the future.
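
As an illustration only, alerting on synthetic test failures could follow a rule like the one sketched below: raise an alert once a region's most recent probes have all failed. This is a minimal sketch under stated assumptions, not PubNub's actual alerting configuration; the names SyntheticResult and should_alert and the threshold of three consecutive failures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SyntheticResult:
    """One outcome of a synthetic publish/subscribe probe (hypothetical type)."""
    region: str   # point of presence the probe ran against, e.g. "IAD"
    passed: bool  # True if the probe received the message it published


def should_alert(results, region, threshold=3):
    """Return True when the last `threshold` probes for `region` all failed.

    `results` is ordered oldest to newest. Requiring consecutive failures
    avoids paging on a single flaky probe; the threshold is an assumption.
    """
    recent = [r for r in results if r.region == region][-threshold:]
    return len(recent) == threshold and all(not r.passed for r in recent)
```

A lower threshold detects regressions sooner at the cost of more false alarms, so the value would need tuning against the probe interval.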

Posted Oct 22, 2025 - 22:22 UTC

Resolved

There have been no further issues for the past 45 minutes. We are resolving this issue, and we will follow up with a post-mortem soon.
Posted Oct 17, 2025 - 08:06 UTC

Monitoring

A fix has been implemented and we are monitoring the results.
Posted Oct 17, 2025 - 07:08 UTC

Identified

The issue has been identified and a fix is being implemented.
Posted Oct 17, 2025 - 06:48 UTC

Investigating

We are currently investigating an incident that could lead to some missed messages by subscribers in the IAD region. All messages are being received and persisted, and can be retrieved from the Storage service.
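
Because messages are persisted, a subscriber that suspects gaps can reconcile what it received live against what it retrieves from storage. The sketch below shows one possible way to do that, assuming each message is a dictionary carrying a unique "timetoken" key and that the stored messages have already been fetched by whatever client call you use; the function name backfill is hypothetical, and no specific SDK call is implied.

```python
def backfill(received, stored):
    """Merge live and stored messages, dropping duplicates by timetoken.

    `received` holds messages delivered to the subscriber in real time;
    `stored` holds messages fetched from storage for the same channel and
    time window. Both are lists of dicts with a "timetoken" key (assumption).
    """
    seen = {m["timetoken"] for m in received}
    missed = [m for m in stored if m["timetoken"] not in seen]
    # Return one timeline ordered by timetoken, oldest first.
    return sorted(received + missed, key=lambda m: m["timetoken"])
```

For example, if the subscriber received timetokens 1 and 3 while storage holds 1, 2, and 3, the merged result contains all three in order, with message 2 recovered from storage.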

This incident started around 22:07 UTC (15:07 PDT) on October 16, 2025. We suspect moderate impact. Please report any impact related to this incident to support@pubnub.com, with as much detail as you can provide.
Posted Oct 17, 2025 - 06:42 UTC
This incident affected: Realtime Network (Publish/Subscribe Service) and Points of Presence (North America Points of Presence).