At 08:40 UTC on October 6, 2025, we observed elevated error rates in the Events & Actions service in our EU-Central (FRA) region, which led to delays in processing publish-triggered events. Some customers may have experienced slower-than-expected execution of their event workflows during this time.
We identified a malformed payload that was causing backend consumers to fail when attempting to process the queue. We deployed an updated build with improved parsing logic, which cleared the blockage and restored normal service. The issue was fully resolved by 11:00 UTC on October 6, 2025.
The root cause was that our event processing service did not correctly handle a malformed message: backend consumers failed when attempting to process it, which caused the processing queue to stall. In addition, our alerting was not configured to detect this failure mode promptly, which delayed our response.
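As an illustration of how a stalled queue can be detected, the sketch below flags a queue that has a backlog but has processed nothing over a time window. It is not our production alerting; the `QueueSample` fields, the `is_stalled` function, and the window length are hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class QueueSample:
    """One monitoring sample. All field names here are hypothetical."""
    timestamp: float       # seconds since epoch
    depth: int             # messages currently waiting in the queue
    processed_total: int   # cumulative count of successfully processed messages


def is_stalled(samples: list[QueueSample], window_seconds: float) -> bool:
    """Return True if the queue had a backlog but made no progress.

    Assumes `samples` is sorted by timestamp, oldest first.
    """
    if not samples:
        return False
    latest = samples[-1]
    # Keep only the samples that fall inside the window ending at `latest`.
    window = [s for s in samples
              if latest.timestamp - s.timestamp <= window_seconds]
    oldest = window[0]
    # Without a full window of history we cannot judge yet.
    if latest.timestamp - oldest.timestamp < window_seconds:
        return False
    no_progress = latest.processed_total == oldest.processed_total
    backlog_present = all(s.depth > 0 for s in window)
    return no_progress and backlog_present
```

Checking progress together with backlog, rather than queue depth alone, distinguishes a stalled queue from one that is simply idle or busy but draining.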
To reduce the risk of similar delays in the future, we are refining our alert thresholds and naming conventions to improve early detection and clarity during response. We are also reviewing validation logic to ensure malformed messages are consistently isolated before reaching backend queues.
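To make the idea of isolating malformed messages concrete, here is a minimal sketch of validating each message before it is queued and diverting failures to a separate dead-letter queue. It assumes JSON-encoded messages; the required field names, the `validate` and `route` functions, and the in-memory queues are hypothetical and do not describe our actual message format or infrastructure.

```python
import json
from collections import deque

# Hypothetical schema: fields a well-formed event is assumed to carry.
REQUIRED_FIELDS = {"event_id", "event_type", "payload"}


def validate(raw: bytes) -> dict:
    """Parse and validate one message; raise ValueError if it is malformed."""
    try:
        message = json.loads(raw)
    except (json.JSONDecodeError, UnicodeDecodeError) as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(message, dict):
        raise ValueError("top-level value must be a JSON object")
    missing = REQUIRED_FIELDS - message.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return message


def route(raw_messages, work_queue: deque, dead_letter_queue: deque) -> None:
    """Send valid messages to the work queue and quarantine the rest."""
    for raw in raw_messages:
        try:
            work_queue.append(validate(raw))
        except ValueError as exc:
            # Keep the raw bytes and the reason so the message can be inspected later.
            dead_letter_queue.append((raw, str(exc)))


work, dead = deque(), deque()
route(
    [b'{"event_id": "1", "event_type": "publish", "payload": {}}', b"{not json"],
    work,
    dead,
)
```

With this approach, a single malformed message is set aside with its rejection reason while the messages after it continue to be processed.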