'BGP. The Internet's weakest link'
Yesterday KPN's fixed and mobile telephone network was down for about 3 hours. As a result, the 112 emergency number was also unavailable.
Joost Farwerck, a member of KPN's Board of Management, told Newsweek that KPN is still investigating the cause of the disruption, but that it is in any case related to a large amount of traffic that was routed incorrectly.
Several weeks ago there was also a nationwide PIN (debit card payment) outage, which later turned out to be a routing problem (a BGP hijack).
Almost simultaneously with yesterday's outage at KPN, things also went wrong in the U.S. at Internet provider Verizon, where part of the Internet went offline.
A possible cause? The Border Gateway Protocol, the Internet's weakest link.
What is the Border Gateway Protocol?
The Border Gateway Protocol (BGP) is the Internet's main routing protocol. It is used between providers to tie all autonomous systems (AS) and networks together and ensure that they can communicate with each other. Thus, without BGP, the Internet would be very limited.
In a sense, BGP can therefore be compared to the Domain Name System (DNS). DNS links an easy-to-remember website address to the IP address of the associated web server. BGP does something similar, except that it links ranges of IP addresses to the providers through which they can be reached. This is what allows providers to communicate with each other.
How can such a routing problem suddenly occur?
BGP is based on the principle of "trust among systems". This means that each BGP router can announce its own table of IP ranges and pass it on to the BGP routers it is connected to (its peers). In turn, these routers pass the information on to their own peers or neighbors, without verifying it.
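The way an announcement travels from peer to peer can be illustrated with a small sketch. This is a simplified model, not how real BGP routers are implemented: the peering graph, the names `AS_A` to `AS_E`, and the `propagate` function are all made up for illustration, and every network simply keeps the first (shortest) path it hears.

```python
from collections import deque

# Hypothetical peering graph: each key is a network (AS), each value lists its peers.
PEERS = {
    "AS_A": ["AS_B"],
    "AS_B": ["AS_A", "AS_C", "AS_D"],
    "AS_C": ["AS_B", "AS_E"],
    "AS_D": ["AS_B", "AS_E"],
    "AS_E": ["AS_C", "AS_D"],
}

PREFIX = "192.0.2.0/24"  # documentation address range, used as an example


def propagate(origin, peers):
    """Flood an announcement from `origin` and return, per AS, the path it learned.

    Nothing is verified: every AS trusts what its peer tells it and keeps
    the first (and therefore shortest) path it hears.
    """
    learned = {origin: [origin]}
    queue = deque([origin])
    while queue:
        current = queue.popleft()
        for neighbor in peers[current]:
            if neighbor not in learned:
                # The neighbor prepends itself to the path it was told about.
                learned[neighbor] = [neighbor] + learned[current]
                queue.append(neighbor)
    return learned


routes = propagate("AS_A", PEERS)
for asn, path in sorted(routes.items()):
    print(f"{asn} reaches {PREFIX} via {' -> '.join(path)}")
```

After a few rounds, every network in the graph knows a path to the range, even though only `AS_A` ever claimed it and nobody checked that claim.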
This, then, is where the problem arises. BGP is an old protocol that was first used in its current form in 1994. What the designers of BGP could not have foreseen at the time is how big the Internet was going to become. And because BGP is such a fundamental part of today's Internet, it cannot simply be replaced. That would require changes worldwide.
One of the most common problems with BGP is the BGP hijack. In a BGP hijack, a provider announces an IP range that it does not actually hold. Because peers trust the announcement, it spreads to other BGP routers and thus all over the Internet. Until someone notices and the rightful routes are restored by announcing them again, traffic for that range is sent to the wrong provider. A system can then receive more traffic than it can handle and fail. This is also called a DoS (Denial of Service).
A BGP hijack does not have to be intentional. A simple typo can be enough to cause the damage.
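A rough sketch of what happens during a hijack: two networks announce the same range, and every other network picks whichever origin is closest. The topology, the names `AS_HOLDER` and `AS_WRONG`, and the function `chosen_origin` are hypothetical, and "shortest path wins" is a deliberate simplification of the real BGP route selection rules.

```python
from collections import deque

# Hypothetical chain of networks. AS_HOLDER really holds the IP range;
# AS_WRONG announces the same range, by typo or on purpose.
PEERS = {
    "AS_HOLDER": ["AS_B"],
    "AS_B": ["AS_HOLDER", "AS_C"],
    "AS_C": ["AS_B", "AS_D"],
    "AS_D": ["AS_C", "AS_E"],
    "AS_E": ["AS_D", "AS_WRONG"],
    "AS_WRONG": ["AS_E"],
}


def chosen_origin(origins, peers):
    """Return, per AS, the origin it routes to and the number of hops.

    Every AS trusts all announcements and keeps the shortest path;
    on a tie it keeps the one it heard first.
    """
    best = {origin: (origin, 0) for origin in origins}
    queue = deque(origins)
    while queue:
        current = queue.popleft()
        origin, hops = best[current]
        for neighbor in peers[current]:
            if neighbor not in best:
                best[neighbor] = (origin, hops + 1)
                queue.append(neighbor)
    return best


before = chosen_origin(["AS_HOLDER"], PEERS)
after = chosen_origin(["AS_HOLDER", "AS_WRONG"], PEERS)

# Networks that now send their traffic to the wrong place.
misled = sorted(
    asn for asn, (origin, _) in after.items()
    if origin == "AS_WRONG" and asn != "AS_WRONG"
)
print("Misled networks:", misled)
```

In this toy topology the networks nearest to `AS_WRONG` switch over, while those nearer to the rightful holder are unaffected. That is one reason a hijack can look like a partial, regional outage.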
Where did things go wrong yesterday?
Where the routing went wrong yesterday is as yet unknown. Whether the problem is related to BGP therefore cannot be said yet. However, it is a good moment to look at BGP and realize how fragile our Internet actually is.
The PIN outage several weeks ago, at least, did turn out to be a BGP hijack, in which a Swiss company redirected KPN's IP addresses through China Telecom (source).
Incidentally, it is not just KPN and other Internet providers that make mistakes. In 2017 a mistake at Google left much of Japan unreachable. Conversely, in 2008 Pakistan managed to block YouTube worldwide. Accidents happen easily.
It is therefore clear that something must be done about the Border Gateway Protocol. Several initiatives already exist for this, such as RPKI. However, they require changes worldwide, and cooperation is necessary to make BGP a secure protocol.
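To give an idea of how RPKI helps: the holder of an IP range publishes a signed statement saying which network may announce it, and routers can check incoming announcements against those statements. The sketch below illustrates that origin check with its three usual outcomes (valid, invalid, not found). The `ROAS` entries, the AS numbers, and the `validate` function are invented for illustration, and the cryptographic verification that real RPKI performs is left out entirely.

```python
import ipaddress

# Hypothetical authorizations: (IP range, longest prefix allowed, authorized AS number).
# Addresses and AS numbers come from ranges reserved for documentation.
ROAS = [
    ("192.0.2.0/24", 24, 64496),
    ("203.0.113.0/24", 24, 64497),
]


def validate(prefix, origin_as, roas):
    """Classify an announcement as 'valid', 'invalid' or 'not-found'."""
    announced = ipaddress.ip_network(prefix)
    covered = False
    for roa_prefix, max_length, roa_as in roas:
        roa_net = ipaddress.ip_network(roa_prefix)
        # Only compare ranges of the same IP version that cover the announcement.
        if announced.version != roa_net.version or not announced.subnet_of(roa_net):
            continue
        covered = True
        if announced.prefixlen <= max_length and origin_as == roa_as:
            return "valid"
    # Covered by an authorization but not matching it: likely a hijack or a typo.
    return "invalid" if covered else "not-found"


print(validate("192.0.2.0/24", 64496, ROAS))     # rightful holder
print(validate("192.0.2.0/24", 64511, ROAS))     # someone else announces the range
print(validate("192.0.2.0/25", 64496, ROAS))     # more specific than authorized
print(validate("198.51.100.0/24", 64496, ROAS))  # no authorization published
```

A router that drops "invalid" announcements would not pass a mistaken or malicious claim on to its peers. Announcements that come back as "not-found" remain unprotected, which is why worldwide adoption matters.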