Incident date and time: 7/1/18, intermittently between 12:05 and 13:15 UTC
Affected Services: PubNub Presence
Problem Description, Impact and Resolution:
A subset of customers experienced higher latencies and error rates for presence operations. The initial impact was caused by a spike in system resource usage, which overloaded presence counting for a subset of customers. We deployed a configuration change to relieve the impacted resources, which allowed the service to recover. We also identified a bug in our presence applications that prevented them from recovering as quickly as expected.
Mitigation Steps and Recommended Future Preventative Measures:
We are actively improving the automated healing configuration for presence. We are confident that our tools enabled us to identify and resolve the issue in this incident, but as always we strive to improve our speed to resolution. Additionally, we are working to resolve the identified bugs and ensure they do not recur.