Friday, January 25, 2019

Implementing BCP214 on Catalyst 6500

When it comes to running an Internet Exchange, at its most basic level, you're a metro Ethernet provider with a range of as little as 19". The most basic IXP is a single rack-mount Ethernet switch that you plug in, power on, and start plugging customers into.

Which is great, right up until you want to unplug or move any of those customers.

The problem is that an IXP switch sits in the middle, unlike a private interconnect between two routers. If you just yank the cord on one of the routers, that router can see its interface go down and start calculating better routes from BGP peers other than those on the IXP, but every other customer on the IXP sending traffic towards the poor sap you just disconnected won't see any change. When you unplug customer A, customer B will continue to see their own link to the IXP switch up, and will continue to send traffic towards customer A until the BGP session between them eventually times out when its hold timer expires, which can be on the order of minutes (Cisco's default hold time, for example, is 180 seconds).

So in an ideal world, before yanking the cord on an IXP peer, you'd like to be able to make it seem like you've yanked the cord (without actually doing it), give BGP the minutes it takes to reconverge around the soon-to-be-down link, and then finally unplug the physical cable only once all of the traffic has drained and doing so won't result in a minute or two of black-holing traffic.

The simplest way to do this is to send an email to the customer in question the day/week/month before and say "hey, I'm going to be unplugging your port, so turn down all of your BGP sessions with others first." But expecting that level of cooperation from another autonomous system is pretty unrealistic, and it wastes a lot of time on the customer's part manually turning down all their peering sessions before, and then turning them back up after.

A better way to do it is to actively force the BGP sessions to go down without disrupting any other traffic, then wait for the reconvergence that will happen because of that. This technique is called BGP session culling, is documented in BCP214 (RFC 8327), and basically involves using the IXP switches' ability to filter traffic to specifically drop the IPv4 and IPv6 BGP packets going between peers on the exchange.

I've been doing this "turn down to move the peer to another switch" action quite a bit in the last few months for FCIX, where we've been moving everyone off my Cisco 6506 to a much nicer Arista 7050S-64. The problem is that, while BCP214 helpfully provides some sample configs in the appendix to implement this technique on Cisco IOS, for some reason that is beyond my understanding of the history of IOS command syntax, the Cisco sample doesn't seem to work on my Cisco 6500 running IOS 15.1(2)SY11.

It took some digging to figure out the exact syntax needed to create the ACLs and then apply them to an interface on my 6506, so just in case anyone else needs it, enjoy:

The first part of implementing BCP214 is permanently creating two ACLs that specifically drop BGP traffic from one IXP address to another (one ACL for IPv4 and a second for IPv6). It's important to appreciate why you want to be specific and filter only BGP between IXP addresses: multi-hop BGP could be flowing over the IXP between two routers that aren't directly connected to the IXP for some reason, and that traffic shouldn't be dropped. These example ACLs use the FCIX subnets of 206.80.238.0/24 and 2001:504:91::/64, and need to be modified to match your own IXP subnets.

ip access-list extended acl-v4-bcp214
 deny   tcp 206.80.238.0 0.0.0.255 eq bgp 206.80.238.0 0.0.0.255
 deny   tcp 206.80.238.0 0.0.0.255 206.80.238.0 0.0.0.255 eq bgp
 permit ip any any
!
ipv6 access-list acl-v6-bcp214
 deny tcp 2001:504:91::/64 eq bgp 2001:504:91::/64
 deny tcp 2001:504:91::/64 2001:504:91::/64 eq bgp
 permit ipv6 any any
!

There are two deny lines in each ACL because you don't know whether this customer's router happened to be the initiator of the TCP connection for BGP or not, so either the source TCP port or the destination TCP port might be 179. You want to drop both of those.
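
As a quick sanity check after pasting the ACLs in, a minimal sketch like the following lists both of them back from exec mode so the entries can be compared against what was typed. These are standard IOS show commands, but this is an illustration rather than something taken from BCP214, and any match counters shown may not be reliable for hardware-switched traffic on this platform, so treat them as a hint at best.

! list the IPv4 ACL and confirm both deny entries plus the final permit
show ip access-lists acl-v4-bcp214
!
! same check for the IPv6 ACL
show ipv6 access-list acl-v6-bcp214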

With those ACLs now part of the config, when you need to drain the traffic from a port, you temporarily apply both ACLs to the interface and give it a few minutes. In that time the BGP sessions time out, the routers on both sides re-converge, and the rest of the Internet picks up the slack, so no traffic gets black-holed when you then shut down the interface.

interface GigabitEthernet1/3
 description Peering: FCIX Peer
 switchport
 switchport access vlan 100
 switchport mode access
 ip access-group acl-v4-bcp214 in
 ipv6 traffic-filter acl-v6-bcp214 in
!
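
The rest of the maintenance could then look roughly like the sketch below, using the same example port. This is an assumption about a reasonable sequence built from standard IOS commands, not something taken from BCP214: the rate check is only a rough way to see that traffic has drained, and how long to wait depends on the hold timers your peers are using.

! after applying the ACLs and waiting a few minutes, check that the
! input/output rates on the port have dropped off
show interfaces GigabitEthernet1/3 | include rate
!
! then take the port down for the move
configure terminal
interface GigabitEthernet1/3
 shutdown
 end
!
! once the port is back in service, remove the filters so the
! BGP sessions can re-establish
configure terminal
interface GigabitEthernet1/3
 no ip access-group acl-v4-bcp214 in
 no ipv6 traffic-filter acl-v6-bcp214 in
 no shutdown
 end

The ACLs themselves stay defined in the config; only their attachment to the interface comes and goes, which is what makes the definitions reusable for the next port that needs draining.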