A Practical Introduction to BGP and Its Role in Networking

Brief Introduction

  • Communication is on TCP port 179
  • Uses Autonomous Systems (AS) to share routes, either internally within an AS (iBGP) or externally between AS’s (eBGP)
  • Slow but very customizable by nature, since it has to support and share thousands of routes

What’s an Autonomous System?

An autonomous system, put simply, is a network that a single administration controls. In the example above, R1, R2, R3, and R4 make up BGP AS 65500. Additionally, BGP AS 65000 consists of R5, R6, R7, and R8. Finally, BGP AS 65100 consists of R9 and R10.

Within an AS is where Interior Gateway Protocols (IGPs) live. Some of the most common IGPs are OSPF, IS-IS, EIGRP, and even RIP (yay….). Static routes are not a protocol, but they fill the same role. All of these are considered interior because they control the routing internally in the AS.

AS numbers have both private and public ranges, just like IPv4 addresses. Traditional (2-byte) AS numbers range from 0 to 65535:

  • 0: Reserved
  • 1 – 64495: Public AS’s
  • 64496 – 64511: Used in documentation
  • 64512 – 65534: Private AS’s
  • 65535: Reserved

Newer devices support 4-byte AS numbers. With this enabled, the AS range grows to 0 through 4294967295, and 4200000000 through 4294967294 becomes an additional private range.
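As a rough illustration, the ranges above can be turned into a small lookup. This is only a sketch: the function name classify_asn is made up for this example, it covers just the ranges listed in this section, and it assumes the last 4-byte value (4294967295) is reserved in the same way 65535 is.

```python
def classify_asn(asn: int) -> str:
    """Label an AS number using only the ranges listed in this section."""
    if not 0 <= asn <= 4294967295:
        raise ValueError("AS number must fit in 4 bytes")
    # 0 and 65535 are reserved; treating 4294967295 the same way is an
    # assumption that mirrors the 2-byte range.
    if asn in (0, 65535, 4294967295):
        return "reserved"
    if 64496 <= asn <= 64511:
        return "documentation"
    # The 2-byte private range, plus the 4-byte private range.
    if 64512 <= asn <= 65534 or 4200000000 <= asn <= 4294967294:
        return "private"
    # Everything else is treated as public here, which glosses over other
    # special-purpose values that this section does not cover.
    return "public"


for asn in (65500, 65000, 65100, 64500, 4200000001):
    print(asn, classify_asn(asn))
```

The three AS numbers used in this article (65500, 65000, and 65100) all fall in the 2-byte private range.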

Peering

As mentioned earlier, there are two types of BGP peers:

  • iBGP – Peering between two routers in the same AS (Internal BGP)
  • eBGP – Peering between two routers in different AS’s (External BGP)

In the example above, if the routers within AS 65500 (R1, R2, R3, and R4) peer with each other, this will be considered iBGP (Internal BGP). If R4 (AS 65500) and R5 (65000) peer with each other, this is considered eBGP (External BGP).

In order to peer, various parameters have to be agreed upon:

  • The AS number they’re assigned to
  • Neighbor Statements – Both neighbors have to have the other’s IP address configured
  • Matching BGP Versions
  • Router-ID has to be unique
  • Address Family Agreement (IPv4, IPv6, etc.)
  • MD5 Authentication (If configured)

The router-id (RID) is a way of identifying the peer in case the peer is using multiple IP addresses with BGP. The RID looks just like an IPv4 address, and can be set automatically from one of the router’s IPv4 addresses. The RID is typically chosen in the following order, though this can vary by vendor:

  1. Manually Set (Recommended)
  2. Highest IP address of any Loopback that is up
  3. If there are no loopbacks, highest IP of an interface that is up
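The selection order above can be sketched in a few lines. This is a simplified model rather than any vendor's actual implementation; select_router_id and the (address, is_up) tuples are hypothetical names chosen for this example.

```python
import ipaddress
from typing import List, Optional, Tuple

Interface = Tuple[str, bool]  # (IPv4 address, is the interface up?)


def select_router_id(manual: Optional[str],
                     loopbacks: List[Interface],
                     physical: List[Interface]) -> Optional[str]:
    """Pick a RID: manual first, then loopbacks, then other interfaces."""
    if manual is not None:
        return manual
    for group in (loopbacks, physical):
        # Compare as addresses, not strings, so 10.0.0.1 beats 9.0.0.1.
        up = [ipaddress.IPv4Address(ip) for ip, is_up in group if is_up]
        if up:
            return str(max(up))
    return None  # nothing usable, so no RID can be derived


print(select_router_id(None, [("1.1.1.1", True)], [("192.168.1.1", True)]))
```

Note that the loopback wins even though the physical interface has the numerically higher address, which is why a loopback is a stable choice for the RID.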

The two timers we have to be aware of when establishing peering are the keepalive timer and the hold timer (sometimes loosely called the holddown timer). The keepalive timer fires every 60 seconds by default and sends a packet telling the peer we’re still alive. The hold timer is set to 180 seconds by default, meaning if the router doesn’t get a keepalive within 180 seconds, the peer will go down.

The hold timer doesn’t have to be the same on both BGP peers, but if the values differ, the lower of the two is used.
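To make the "lowest value wins" rule concrete, here is a minimal sketch. The function name negotiate_timers is hypothetical, and deriving the keepalive interval as one third of the agreed value is an assumption that happens to match the 60/180 defaults; real platforms may also honor a separately configured keepalive.

```python
from typing import Tuple


def negotiate_timers(local_hold: int, peer_hold: int) -> Tuple[int, int]:
    """Return (hold, keepalive) in seconds for a session."""
    hold = min(local_hold, peer_hold)  # the lower value is used by both sides
    # A value of 0 disables the timers; otherwise it must be at least 3 seconds.
    if hold != 0 and hold < 3:
        raise ValueError("timer value must be 0 or at least 3 seconds")
    keepalive = hold // 3  # assumption: keepalive is one third of the agreed value
    return hold, keepalive


print(negotiate_timers(180, 180))
print(negotiate_timers(180, 90))
```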

There are multiple attributes that are shared between AS’s, or within the AS, depending on whether iBGP or eBGP is in use. We will discuss all these attributes later on.

eBGP

eBGP will establish a peering between two different AS’s. For example, in the below topology, R5 and R4 will have a BGP peering, and R8 and R9 will have a BGP peering.

By default, eBGP peers have to be one hop away (unlike iBGP peers, which can be multiple hops away). You can turn on BGP multi-hop, which allows the BGP peers to be more than one hop away.

iBGP

iBGP is special in terms of configuration. Unlike other routing protocols, all BGP-enabled routers in the AS have to peer with each other, even if they are multiple hops away. Additionally, routes learned from one iBGP peer are not propagated to other iBGP peers (read below).

If we look at the above topology, if R2 sends the route 192.168.1.0/24 over to R3, then R3 won’t advertise it to any of the other iBGP routers in the AS. This is because everything, by default, is expected to have a fully meshed peering in the AS. Meaning, R1 should peer with R2, R3, R4, and R5. R2 should peer with R1, R3, R4, and R5. This continues until every router in the AS peers with every other router.
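To get a feel for how quickly a full mesh grows, we can enumerate the sessions it requires. This is just a counting sketch; full_mesh_sessions is a made-up helper and the router names are the ones used in the paragraph above.

```python
from itertools import combinations
from typing import List, Tuple


def full_mesh_sessions(routers: List[str]) -> List[Tuple[str, str]]:
    """Every unordered pair of routers needs its own iBGP session."""
    return list(combinations(routers, 2))


sessions = full_mesh_sessions(["R1", "R2", "R3", "R4", "R5"])
for a, b in sessions:
    print(f"{a} <-> {b}")
print(len(sessions))  # n * (n - 1) / 2 sessions for n routers
```

Five routers already need ten sessions, and the count grows quadratically from there, which is the scaling problem that motivates the next paragraph.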

To make this more scalable, we can implement route reflectors (discussed in another post at a later time), so each router only has to peer with a couple of routers, and those routers are able to propagate routes to the rest.

One of the main reasons we need iBGP is to share the attributes of routes learned from other AS’s, so no information is lost.

BGP Peering States

BGP will have neighbor states (see the description of the states below):

  1. Idle
  2. Connect
  3. Active
  4. OpenSent
  5. OpenConfirm
  6. Established

The states go from Idle all the way to Established when attempting to peer with another router. However, when a peer is “Active”, it is considered a bad state to be in (more on this below). At each of these states, BGP negotiates different values in order to establish a good session, then starts sending routes in the Established state. So we should expect all our peers to be in the Established state at the very end.

Idle

This will be the initial state we start out in. If we’re stuck in this state, then we will usually have one of two problems:

  1. There is no route to the peer we identified
  2. If multi-hop isn’t enabled, the peer isn’t directly connected for eBGP peers

The 2nd problem is important because eBGP by default expects its peers to be directly connected, since eBGP packets are sent with a TTL (Time-to-Live) of 1. We can change this with the multi-hop setting, which tells the router how many hops away the peer is, up to 255. The TTL is decremented by one at each hop along the path, so we want to pick a TTL that covers the number of hops to the peer.

For iBGP the default TTL is 255, so we shouldn’t have a problem with TTL unless there is a routing loop in the network.
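The TTL reasoning above can be checked with a tiny simulation. This is a sketch under a simplifying assumption: a directly connected peer counts as one hop, and every router in between decrements the TTL and drops the packet when it reaches zero. The name reaches_peer is hypothetical.

```python
def reaches_peer(initial_ttl: int, hops_to_peer: int) -> bool:
    """Return True if a packet sent with initial_ttl survives to the peer."""
    ttl = initial_ttl
    # Only the routers between us and the peer decrement the TTL.
    for _ in range(hops_to_peer - 1):
        ttl -= 1
        if ttl <= 0:
            return False  # dropped in transit before reaching the peer
    return ttl >= 1


print(reaches_peer(1, 1))    # eBGP default, directly connected peer
print(reaches_peer(1, 2))    # eBGP default, peer two hops away
print(reaches_peer(2, 2))    # multi-hop raised to cover two hops
print(reaches_peer(255, 5))  # iBGP default
```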

Connect

The Connect state means we were able to send a SYN packet to the peer, but we have yet to receive a SYN/ACK back. The important part of this step is the three-way handshake: we need to complete it (SYN -> SYN/ACK -> ACK) with the peer on TCP port 179.

If we are successfully able to complete the three-way handshake, we will go into the OpenSent state.

If we have any problems in this state, a retry timer (typically 30 seconds, though the default varies by platform) has to expire before we go into the Active state to indicate a problem.

Active

The Active state is one of the problematic states of BGP. Peers stuck in the Active state indicate there is an issue with the TCP connection. In this state the router “actively” tries to reach out to the peer, and if that keeps failing it eventually ends the attempt and goes back to the Connect state again.

If we see a peer stuck in active, we should check one of the following:

  1. Route to the peer is incorrect/down
  2. IP of the peer is incorrect
  3. An Access-Control List (ACL) or firewall is preventing the connection from establishing

OpenSent

The OpenSent state is when the two peers exchange the parameters of the BGP connection. So needless to say, we have to agree on the following parameters and make sure they are correct on both sides:

  • BGP Versions
  • Source IP of the packet is the same as neighbor statement
  • AS number must match neighbor statement
  • The RID is unique
  • Security Parameters (TTL/Password) are correct for what is expected

If any of these parameters is not what is expected, the router will send an error code back to the peer describing where the mismatch is, in the form of a “BGP Notification” packet. More information on this is in the “BGP Message Types” section.
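As a rough model of these checks, the sketch below compares a received Open against local expectations and returns the (error code, subcode) pair that would go into the Notification. The dictionary keys and the function name check_open are hypothetical, treating a duplicate RID as "Bad RID" is an assumption based on the uniqueness requirement above, and the source IP and password checks are left out because they happen at the TCP level.

```python
from typing import Optional, Tuple


def check_open(local: dict, received: dict) -> Optional[Tuple[int, int]]:
    """Return (error code, subcode) for the first mismatch, or None if OK."""
    if received["version"] != local["version"]:
        return (2, 1)  # Open Message Error, Unsupported Version Number
    if received["asn"] != local["neighbor_remote_as"]:
        return (2, 2)  # Open Message Error, Bad Peer AS
    if received["router_id"] == local["router_id"]:
        return (2, 3)  # Open Message Error, Bad RID (assumed: RID not unique)
    if received["hold_time"] in (1, 2):
        return (2, 6)  # Open Message Error, Unacceptable Hold Time
    return None


r1 = {"version": 4, "neighbor_remote_as": 65000, "router_id": "1.1.1.1"}
peer_open = {"version": 4, "asn": 65100, "router_id": "5.5.5.5", "hold_time": 180}
print(check_open(r1, peer_open))
```

Here the neighbor statement expects AS 65000 but the peer announces AS 65100, so the check stops at the Bad Peer AS case.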

OpenConfirm

The OpenConfirm state means we agree on the parameters exchanged in the OpenSent state, and we will begin sending keepalives to the peer. More information on keepalives can be found in the “BGP Message Types” section.

Once we receive a keepalive in this state from the peer, we will finally go into the established state.

Established

Once we start receiving keepalives from the peer, we go into Established, start sending “BGP Update” packets, and continue sending keepalives. These update packets contain a section called NLRI (Network Layer Reachability Information) that lists the networks being advertised, alongside the attributes that belong to those networks. More information on the update packets can be found in the BGP Message Types section.
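To show what the NLRI portion looks like on the wire, here is a small IPv4-only sketch. Each prefix is encoded as one length byte (in bits) followed by only as many address bytes as that length needs. The helper names encode_nlri and decode_nlri are made up for this example, and attributes are not covered.

```python
import ipaddress
from typing import List


def encode_nlri(prefix: str) -> bytes:
    """Encode one IPv4 prefix as <length in bits><minimal prefix bytes>."""
    net = ipaddress.IPv4Network(prefix)
    nbytes = (net.prefixlen + 7) // 8  # round the bit length up to whole bytes
    return bytes([net.prefixlen]) + net.network_address.packed[:nbytes]


def decode_nlri(data: bytes) -> List[str]:
    """Decode a run of back-to-back IPv4 NLRI entries."""
    prefixes = []
    i = 0
    while i < len(data):
        plen = data[i]
        if plen > 32:
            raise ValueError("IPv4 prefix length cannot exceed 32")
        nbytes = (plen + 7) // 8
        raw = data[i + 1:i + 1 + nbytes]
        if len(raw) != nbytes:
            raise ValueError("truncated NLRI entry")
        # Pad the trimmed prefix back out to a full 4-byte address.
        addr = ipaddress.IPv4Address(raw + b"\x00" * (4 - nbytes))
        prefixes.append(f"{addr}/{plen}")
        i += 1 + nbytes
    return prefixes


wire = encode_nlri("192.168.1.0/24") + encode_nlri("10.1.1.0/24")
print(wire.hex())
print(decode_nlri(wire))
```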

Once we get into the Established state, everything is working as intended. From here we just need to make sure we allow/redistribute the routes we want to send/receive.

From here, if there are any issues with the BGP session, we will declare the peer as down. Some of the common causes are the following:

  • Network migration, software, or hardware upgrades
  • Keepalives not being received due to drops in-between or transmission/receiving errors
  • High CPU causing packets, including the keepalives, to be dropped
  • The process controlling BGP crashed
  • Configuration changes in BGP needing to restart the connection
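The walk through the states described in the sections above can be summarized as a small transition table. This is a deliberately simplified sketch, not the full state machine from the BGP specification: the event names are invented for this example, and any event that is not listed is treated as an error that drops the peer back to Idle.

```python
from typing import Dict, List, Tuple

TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("Idle", "start"): "Connect",
    ("Connect", "tcp_established"): "OpenSent",
    ("Connect", "tcp_failed"): "Active",
    ("Active", "tcp_established"): "OpenSent",
    ("Active", "retry_timer_expired"): "Connect",
    ("OpenSent", "open_ok"): "OpenConfirm",
    ("OpenConfirm", "keepalive_received"): "Established",
    ("Established", "keepalive_received"): "Established",
    ("Established", "update_received"): "Established",
}


def next_state(state: str, event: str) -> str:
    """Look up the next state; unknown events fall back to Idle."""
    return TRANSITIONS.get((state, event), "Idle")


def run(events: List[str]) -> str:
    """Feed a sequence of events through the table, starting from Idle."""
    state = "Idle"
    for event in events:
        state = next_state(state, event)
    return state


print(run(["start", "tcp_established", "open_ok", "keepalive_received"]))
print(run(["start", "tcp_failed"]))
```

The second call models a peer whose TCP connection never completes, which is the situation that leaves a neighbor sitting in Active.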

BGP Message Types

In order to exchange information pertaining to BGP, there are four different message types that will be sent:

  • Keepalive
  • Open
  • Notification
  • Update

All four of these messages will be used in both iBGP and eBGP and will ultimately decide what BGP should do.

Keepalive

The keepalive message is sent every 60 seconds by default. This ensures BGP stays alive as long as we are sending and receiving these. By default, if we don’t receive a keepalive within 180 seconds (the hold timer), the BGP session will drop and we will have to reestablish everything again.
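On the wire, a keepalive is about as small as a BGP message gets: it is only the common 19-byte header (a 16-byte marker of all ones, a 2-byte length, and a 1-byte type) with no body. The sketch below builds and parses one; the function names are hypothetical.

```python
import struct
from typing import Tuple

MARKER = b"\xff" * 16  # every BGP message starts with 16 bytes of all ones
MESSAGE_TYPES = {1: "Open", 2: "Update", 3: "Notification", 4: "Keepalive"}


def build_keepalive() -> bytes:
    """A keepalive is just the header: marker, length 19, type 4."""
    return MARKER + struct.pack("!HB", 19, 4)


def parse_header(data: bytes) -> Tuple[int, str]:
    """Return (total message length, message type name) from a header."""
    if len(data) < 19:
        raise ValueError("a BGP header is 19 bytes")
    marker, length, msg_type = struct.unpack("!16sHB", data[:19])
    if marker != MARKER:
        raise ValueError("bad marker")
    return length, MESSAGE_TYPES.get(msg_type, "Unknown")


print(parse_header(build_keepalive()))
```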

Open

The open message is sent right after the three-way handshake is established. This essentially tells the peer, “Hey, since we have bidirectional communication, let’s go ahead and OPEN for BGP communication.” These will send some of the following parameters:

  • BGP Version
  • Hold Time
  • AS Number
  • Router-ID
  • Optional Parameters

The optional parameters can include Graceful Restart, Route-Refresh, and other capabilities. These will be discussed in another article, but since they are optional, they simply tell the peer, “I support the following parameters,” so it knows what enhancements to offer.
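To see how the fields listed above fit together, here is a minimal sketch that packs an Open message with no optional parameters. The function name build_open is made up, the version is assumed to be 4, and the AS field here is only 2 bytes wide, so 4-byte AS numbers (which are signalled through optional parameters) are out of scope.

```python
import socket
import struct


def build_open(asn: int, hold_time: int, router_id: str,
               opt_params: bytes = b"") -> bytes:
    """Pack version, AS, hold time, RID, and optional parameters."""
    if not 0 <= asn <= 65535:
        raise ValueError("this sketch only handles 2-byte AS numbers")
    body = struct.pack(
        "!BHH4sB",
        4,                            # BGP version (assumed to be 4)
        asn,                          # AS number
        hold_time,                    # hold time in seconds
        socket.inet_aton(router_id),  # RID, formatted like an IPv4 address
        len(opt_params),              # length of the optional parameters
    ) + opt_params
    # Common header: 16-byte marker, total length, message type 1 (Open).
    header = b"\xff" * 16 + struct.pack("!HB", 19 + len(body), 1)
    return header + body


msg = build_open(65500, 180, "1.1.1.1")
print(len(msg), msg.hex())
```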

Notification

When an error is detected during the establishment of the peering, or even after the peering is established, we will send an error message indicating there is an issue with the connection. These error messages can range from “Bad Peer AS” to “Cease – Connection Rejected”.

You don’t have to memorize these, but vendors will usually have a command to view what error codes were sent/received.

Error Code   Subcode   Description
01           00        Message Header Error
01           01        Message Header Error – Connection not Synchronized
01           02        Message Header Error – Bad Message Length
01           03        Message Header Error – Bad Message Type
02           00        Open Message Error
02           01        Open Message Error – Unsupported Version Number
02           02        Open Message Error – Bad Peer AS
02           03        Open Message Error – Bad RID
02           04        Open Message Error – Unsupported Optional Parameter
02           05        Open Message Error – Deprecated
02           06        Open Message Error – Unacceptable Hold Time
02           07        Open Message Error – No Supported Capability Value (Cisco)
02           08        Open Message Error – No Supported AFI/SAFI (Cisco)
02           09        Open Message Error – Grouping Conflict (Cisco)
02           0A        Open Message Error – Grouping Required (Cisco)
03           00        Update Message Error
03           01        Update Message Error – Malformed Attribute List
03           02        Update Message Error – Unrecognized Well-Known Attribute
03           03        Update Message Error – Missing Well-Known Attribute
03           04        Update Message Error – Attribute Flags Error
03           05        Update Message Error – Attribute Length Error
03           06        Update Message Error – Invalid Origin Attribute
03           07        (Deprecated)
04           00        Hold Timer Expired
05           00        Finite State Machine Error
06           00        Cease
06           01        Cease – Maximum Number of Prefixes Reached
06           02        Cease – Administrative Shutdown
06           03        Cease – Peer Deconfigured
06           04        Cease – Administrative Reset
06           05        Cease – Connection Rejected
06           06        Cease – Other Configuration Change
06           07        Cease – Connection Collision Resolution
06           08        Cease – Out of Resources
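Since nobody wants to memorize these, a lookup is the natural way to use the table. The sketch below decodes the first two bytes of a Notification body (error code, then subcode) using a handful of the entries above. The names are hypothetical and only a subset of subcodes is filled in.

```python
from typing import Dict, Tuple

ERROR_CODES: Dict[int, str] = {
    1: "Message Header Error",
    2: "Open Message Error",
    3: "Update Message Error",
    4: "Hold Timer Expired",
    5: "Finite State Machine Error",
    6: "Cease",
}

# Only a few subcodes from the table are included in this sketch.
SUBCODES: Dict[Tuple[int, int], str] = {
    (2, 1): "Unsupported Version Number",
    (2, 2): "Bad Peer AS",
    (2, 3): "Bad RID",
    (2, 6): "Unacceptable Hold Time",
    (6, 2): "Administrative Shutdown",
    (6, 5): "Connection Rejected",
}


def describe_notification(payload: bytes) -> str:
    """Translate the error code and subcode bytes into a readable string."""
    if len(payload) < 2:
        raise ValueError("need at least an error code and a subcode")
    code, subcode = payload[0], payload[1]
    name = ERROR_CODES.get(code, f"Unknown error code {code}")
    detail = SUBCODES.get((code, subcode))
    return f"{name} - {detail}" if detail else name


print(describe_notification(bytes([2, 2])))
print(describe_notification(bytes([4, 0])))
```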

Configuration – Cisco

In order to configure BGP peering, we must keep in mind several things:

  • iBGP must have a route to the peer, even if the peer is on a different subnet. It relies on an IGP (Static Route, OSPF, IS-IS, etc.) to get connectivity
  • Must have TCP Port 179 connectivity
  • Need multi-hop or ttl-security if using eBGP with a peer that’s more than one hop away.

The first thing to do is ensure that we have the correct route to the peer. On R1, we should have a route to all our peers, at least for iBGP. This means we should have a route to every router from R2 through R8. When we look at show ip route (below), we can see there is a static route to all the peers:

The next thing we need is a loopback carrying our RID, so there is only a small chance of it going down. Looking at the config below, in addition to the physical interface, we can see there is a loopback with the IP of 1.1.1.1 that will also be the RID:

Lastly, we will need to define our BGP peers and the router id. We can do this with the following commands:

We will use pretty much this same config on all routers throughout the AS. For example, here is the output of R7:

The one configuration we have to take note of is where R4 comes into play. Imagine we have a VPN or something along those lines between R4 and R5. The problem is that when the BGP packets leave R4, they leave with the IP of the external interface (192.0.2.4), meaning the source IP of the packets is 192.0.2.4. However, since we are working with iBGP, we want to use our private IP (192.168.1.4). To do this, we have to use a command to tell BGP where to source the packets from:

The two routers that will be somewhat different are R8 and R9, since they will be establishing an eBGP peering. All peers up to this point have been iBGP:

We will discuss more about redistribution of routes in another blog post. By doing redistribution wrong, we can easily cause routing loops.

After we configure, we can check to verify if everything is working correctly with the following section.

Verification

We have various commands to ensure we have connectivity and do basic troubleshooting. First, we can check the state with show ip bgp summary:

In the above command, if we had any issues with our neighbors, we would see it in the state information listed above. For example, if we shut down the interface on R4 with the IP of 192.0.2.4, you will notice that the neighbors on R1 go into the Active state, due to lack of TCP connectivity:

From the above output, you can see the peers 172.16.1.5, 172.16.1.6, 172.16.1.7, and 172.16.1.8 are flapping because the state is going between Active/Idle. If we go back to our notes above, we can see the Active state means the following:

The Active state is one of the problematic states of BGP. Peers stuck in the Active state indicate there is an issue with the TCP connection. In this state the router “actively” tries to reach out to the peer, and if that keeps failing it eventually ends the attempt and goes back to the Connect state again.

If we see a peer stuck in active, we should check one of the following:

  1. Route to the peer is incorrect/down
  2. IP of the peer is incorrect
  3. An Access-Control List (ACL) or firewall is preventing the connection from establishing

One way to verify whether we have TCP port 179 connectivity is by using telnet. By running telnet <IP> 179, we can check whether we have connectivity to the peer:
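If telnet is not available on the box we are testing from, the same reachability check can be scripted. This is a minimal sketch that only attempts the TCP three-way handshake; the function name tcp_port_open is made up, and it says nothing about whether BGP itself would come up.

```python
import socket


def tcp_port_open(host: str, port: int = 179, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be completed."""
    try:
        # create_connection performs the handshake and raises on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Calling tcp_port_open("172.16.1.8") from a host that should reach R8 would be the scripted equivalent of telnet 172.16.1.8 179. A False result points back at the same three suspects: routing, the wrong IP, or an ACL/firewall in the path.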

With show ip bgp neighbor <IP> routes, we can see the routes received from that peer:

We can see above that we are learning four routes from R8 (172.16.1.8): 0.0.0.0, 10.1.1.0/24, 192.0.2.0/24, and 192.168.1.0/24. To verify that R8 is sending these, we can run sh ip bgp neighbor <IP> advertised-routes on R8 to show all the routes it has been advertising.