Yesterday, Google Cloud servers in the us-east1 region were cut off from the rest of the world as there was an issue reported with Cloud Networking and Load balancing within us-east1.
These issues with Google Cloud Networking and Load Balancing have caused physical damage to multiple concurrent fiber bundles that serve network paths in us-east1. At 10:25 am PT yesterday, the status was updated that the “Customers may still observe traffic through Global Load-balancers being directed away from back-ends in us-east1 at this time.”
It was later posted on the status dashboard that the mitigation work was underway for addressing the issue with Google Cloud Networking and Load Balancing in us-east1. However, the rate of errors was decreasing at the time but few users faced elevated latency.
Around 4:05 pm PT, the status was updated, “The disruptions with Google Cloud Networking and Load Balancing have been root caused to physical damage to multiple concurrent fiber bundles serving network paths in us-east1, and we expect a full resolution within the next 24 hours. In the meantime, we are electively rerouting traffic to ensure that customers' services will continue to operate reliably until the affected fiber paths are repaired. Some customers may observe elevated latency during this period. We will provide another status update either as the situation warrants or by Wednesday, 2019-07-03 12:00 US/Pacific tomorrow.”
This outage seems to be the second major one that hit Google's services in recent times. Last month, Google Calendar was down for nearly three hours around the world. Last month Google Cloud suffered a major outage that took down a number of Google services including YouTube, GSuite, Gmail, etc.
According to a person who works on Google Cloud, the team is experiencing an issue with a subset of the fiber paths that supply the region and the team is working towards resolving the issue. They have mostly removed all the Google.com traffic out of the Region to prefer GCP customers.
A Google employee commented on the HackerNews thread, “I work on Google Cloud (but I'm not in SRE, oncall, etc.). As the updates to [1] say, we're working to resolve a networking issue. The Region isn't (and wasn't) "down", but obviously network latency spiking up for external connectivity is bad. We are currently experiencing an issue with a subset of the fiber paths that supply the region. We're working on getting that restored. In the meantime, we've removed almost all Google.com traffic out of the Region to prefer GCP customers. That's why the latency increase is subsiding, as we're freeing up the fiber paths by shedding our traffic.”
Google Cloud users are tensed about this outage and awaiting the services to get restored back to normal.
https://twitter.com/IanFortier/status/1146079092229529600
https://twitter.com/beckynagel/status/1146133614100221952
https://twitter.com/SeaWolff/status/1146116320926359552
Ritiko, a cloud-based EHR company is also experiencing issues because of the Google Cloud outage, as they host their services there.
https://twitter.com/ritikoL/status/1146121314387857408
As of now there is no further update from Google on if the outage is resolved, but they expect a full resolution within the next 24 hours. Check this space for new updates and information.
Google Calendar was down for nearly three hours after a major outage
Do Google Ads secretly track Stack Overflow users?
Google open sources its robots.txt parser to make Robots Exclusion Protocol an official internet standard