On November 12th, 2018, between 1:00 PM and 2:23 PM PST, ThousandEyes noticed issues connecting to G Suite, a critical application for our organization. Reviewing ThousandEyes Endpoint Agent stats, we saw that the problem was affecting all users at the ThousandEyes office. The outage affected not only G Suite but also Google Search and Google Analytics. What caught our attention was that traffic to Google was being dropped at China Telecom. Why would traffic from a San Francisco office to Google go all the way to China? We also noticed a Russian ISP in the traffic path, which definitely sparked some concerns.
Figure 1: Traffic from ThousandEyes users in San Francisco getting dropped in China.
Upon further investigation, we saw that several ThousandEyes vantage points around the globe were reporting similar unusual traffic routing, all terminating at China Telecom.
Figure 2: Traffic from multiple locations to Google getting black-holed at China Telecom.
ThousandEyes BGP Route Visualization painted an interesting picture. Traffic from Paris to www.google.com resolved to 22.214.171.124. While Google announces many /24 prefixes to cover its IP address range, this address was not covered by a /24 prefix. Instead, it was covered only by a /19 prefix. At about 12:45 PM PST, a suspicious announcement for 126.96.36.199/19 appeared with a convoluted AS path that included TransTelecom (AS 20485) in Russia, China Telecom (AS 4809) in China, and MainOne (AS 37282), a small ISP in Nigeria.
Figure 3: Suspicious announcement for 188.8.131.52/19 showing the best path to Google via Russia, China and Nigeria.
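Why does it matter that the address sat under a /19 rather than a /24? Routers forward on the most specific matching prefix, so an address that also has a /24 announcement is shielded from a competing /19, while an address covered only by the /19 is left to BGP path selection. The sketch below illustrates this with Python's standard `ipaddress` module. The prefixes are hypothetical values from private address space, not Google's real announcements, and `best_match` and `contested_by` are illustrative helper names.

```python
import ipaddress

def best_match(table, address):
    """Return the most specific prefix in `table` containing `address`, or None."""
    addr = ipaddress.ip_address(address)
    candidates = [p for p in table if addr in p]
    return max(candidates, key=lambda p: p.prefixlen, default=None)

def contested_by(address, legit, leaked):
    """True when the leaked prefix covers `address` and is at least as
    specific as the best legitimate prefix, so longest-prefix match alone
    cannot keep traffic on the legitimate route."""
    addr = ipaddress.ip_address(address)
    best = best_match(legit, address)
    return addr in leaked and (best is None or leaked.prefixlen >= best.prefixlen)

# Hypothetical announcements (assumption: private space used for illustration).
legit = [
    ipaddress.ip_network("10.20.0.0/19"),  # covering aggregate
    ipaddress.ip_network("10.20.1.0/24"),  # more-specific announcement
]
leaked = ipaddress.ip_network("10.20.0.0/19")  # same /19, learned via a leaked path

print(contested_by("10.20.1.5", legit, leaked))  # False: the /24 wins on specificity
print(contested_by("10.20.9.5", legit, leaked))  # True: only the /19 covers it
```

In the contested case, the outcome depends on each network's routing policy and path preferences rather than on prefix length.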
The traffic paths we saw mirrored the BGP AS path, except that all the traffic slammed into the Great Firewall, terminating at a China Telecom edge router. For a detailed look at this incident, follow this ThousandEyes snapshot share link.
At a minimum, this incident caused a massive denial of service to G Suite and Google Search. However, it also put valuable Google traffic in the hands of ISPs in countries with a long history of Internet surveillance. Overall, ThousandEyes detected over 180 prefixes affected by this route leak, covering a vast scope of Google services. Our analysis indicates that the origin of this leak was the BGP peering relationship between MainOne, the Nigerian provider, and China Telecom. MainOne has a peering relationship with Google via IXPN in Lagos and has direct routes to Google, which leaked into China Telecom. These leaked routes propagated from China Telecom, via TransTelecom, to NTT and other transit ISPs. We also noticed that this leak was primarily propagated by business-grade transit providers and did not impact consumer ISP networks as much.
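One simple way to surface a leak like this is to scan observed AS paths for networks that should never sit between you and the destination. The following is a minimal sketch under stated assumptions, not how ThousandEyes implements detection: the watchlist reuses the three ASNs named in this post, the sample paths are made up, 64500 is a documentation ASN standing in for a vantage point, and 15169 is assumed to be Google's origin AS.

```python
# ASNs named in this post; treating them as a watchlist is an assumption.
WATCHLIST = {
    20485: "TransTelecom",
    4809: "China Telecom",
    37282: "MainOne",
}

def parse_as_path(text):
    """Turn a space-separated AS path string into a list of integers."""
    return [int(token) for token in text.split()]

def flag_path(path, watchlist):
    """Return watch-listed ASNs found on the path, in path order.
    Repeated ASNs (AS-path prepending) are reported once."""
    flagged = []
    for asn in path:
        if asn in watchlist and asn not in flagged:
            flagged.append(asn)
    return flagged

# Hypothetical observations for the same destination prefix.
LEAKED_PATH = "64500 20485 4809 37282 15169"
CLEAN_PATH = "64500 15169"

for raw in (LEAKED_PATH, CLEAN_PATH):
    hits = flag_path(parse_as_path(raw), WATCHLIST)
    names = [WATCHLIST[asn] for asn in hits]
    print(raw, "->", names if names else "no unexpected transit")
```

A static watchlist is crude. In practice, a monitor would more likely compare each new path against a baseline of previously seen paths for the prefix and alert on unfamiliar transit networks.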
On November 13, MainOne tweeted that the root cause of the problem was in fact a configuration error.
This incident further underscores one of the fundamental weaknesses in the fabric of the Internet. BGP was designed to be a chain of trust between well-meaning ISPs and universities that blindly believe the information they receive. It hasn’t evolved to reflect the complex commercial and geopolitical relationships that exist between ISPs and nations today. While verification methods like Route Origin Authorizations (ROAs) exist, few ISPs use them. Even corporations like Google, with massive resources at their disposal, are not immune to this sort of BGP leak or to malicious hijacks. MainOne took 74 minutes to either notice or be notified of the issue and fix it, and it took about three quarters of an hour more for services to come back up. Most enterprises that don’t have Google’s reach and resources may not be able to resolve the issue as quickly, which can significantly impact business.
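To make the ROA mechanism concrete, here is a minimal sketch of route origin validation in the style of RFC 6811, which classifies an announcement as valid, invalid, or not-found against a set of ROAs. The `ROA` tuple and `validate` function are illustrative names, and the prefixes and ASNs are hypothetical values from private and documentation ranges.

```python
import ipaddress
from typing import NamedTuple

class ROA(NamedTuple):
    prefix: ipaddress.IPv4Network  # authorized address block
    max_length: int                # longest prefix length allowed
    origin_as: int                 # AS authorized to originate it

def validate(route_prefix, origin_as, roas):
    """Classify an announcement as 'valid', 'invalid', or 'not-found'."""
    route = ipaddress.ip_network(route_prefix)
    covered = False
    for roa in roas:
        same_family = route.version == roa.prefix.version
        if same_family and route.subnet_of(roa.prefix):
            covered = True  # at least one ROA covers this route
            if route.prefixlen <= roa.max_length and origin_as == roa.origin_as:
                return "valid"
    return "invalid" if covered else "not-found"

# Hypothetical ROA: AS 64496 may originate 10.20.0.0/19 down to /24.
roas = [ROA(ipaddress.ip_network("10.20.0.0/19"), 24, 64496)]

print(validate("10.20.0.0/19", 64496, roas))  # valid
print(validate("10.20.0.0/19", 64511, roas))  # invalid: wrong origin AS
print(validate("10.20.1.0/25", 64496, roas))  # invalid: longer than max_length
print(validate("10.99.0.0/16", 64496, roas))  # not-found: no covering ROA
```

Note that this check looks only at the origin AS. A leaked route that still ends at the legitimate origin would be classified as valid, so origin validation by itself does not catch every leak.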
BGP-related incidents have been on the rise. In April 2018 we saw a brazen cryptocurrency heist involving the hijack of an entire DNS provider (Route 53). Just a year before that, in April 2017, we saw the Rostelecom BGP route leak which affected a large number of e-commerce and financial services websites. While the ISP community recognizes the scope of the problem, and solutions such as ROA and IRR filtering exist, none of them are silver bullets and implementing them risks breaking the Internet.
In the absence of guarantees, enterprises need to continuously monitor their BGP routes and detect such incidents quickly in order to mitigate any service impacts to their business. Learn more about how you can use ThousandEyes to detect BGP hijacks and leaks.