Unable to start Functions
Incident Report for PubNub
Postmortem

Incident Summary:

Amazon EC2 experienced outbound network connectivity issues from the us-west-2 region. Here is their statement:

1:59 PM PDT We are investigating Network Connectivity issues in the US-WEST-2 Region.
2:33 PM PDT Between 1:46 PM and 2:15 PM PDT we experienced network connectivity issues in the US-WEST-2 Region. The issue has been resolved and the service is operating normally.

Root Cause:

Our portal cluster runs in us-west-2, and the portal schedules to all regions through us-west-1. Because of the connectivity issues described above, the portal cluster could not reach out to transpile and schedule against us-west-1. Once the networking was fixed, the issue went away. As an aside, we also upgraded our underlying transpilation service to ensure it was not getting OOM kills and causing the problem; however, this proved to be a red herring.

Posted Oct 18, 2017 - 21:53 UTC

Resolved
Our datacenter provider has confirmed that the issue is resolved.
Posted Oct 18, 2017 - 21:43 UTC
Monitoring
We have not seen any dropped packets between the datacenters since the last update. We will monitor the situation for the next hour and then resolve the incident.
Posted Oct 18, 2017 - 21:32 UTC
Update
The issue has been identified as underlying connectivity between two datacenters. We are working with our datacenter provider to resolve it. We are now also seeing intermittent connectivity, so some start calls should go through; we are therefore upgrading the status to degraded performance.
Posted Oct 18, 2017 - 21:17 UTC
Update
The issue has been identified and a fix is being implemented.
Posted Oct 18, 2017 - 21:05 UTC
Identified
We have identified an issue with starting new Functions. Engineering is actively working to resolve it.
Posted Oct 18, 2017 - 21:04 UTC