Sunday, April 15, 2018

Creating an Internet Exchange for Even More Fun and Less Profit

Last quarter, I was pulled into the slightly odd underground of people running their own autonomous systems, and since then, our circle of friends running autonomous systems at Hurricane Electric's FMT2 has slowly been growing.
Which is great, except that we're all running autonomous systems, which means that we can set up peering links, and are you really friends with another network engineer if you're not running a cross connect between your two networks? This wasn't too bad for the first few networks joining our little cabal of networks, but due to that pesky quadratic growth issue, the number of new cross connects needed when the fifth or sixth person joined started getting ridiculous. (It's like, four or five!)
This is, of course, an issue that real networks have to deal with as well, so when we had an eighth friend sign a service agreement with Hurricane Electric this week, the idea was (half jokingly) floated that we should just start our own Internet Exchange Point to cut down on the number of cross connects we need for each new member.

An Internet Exchange is basically just a single L2 Ethernet switch which every network plugs into, such that every network can directly set up BGP peering / route packets to each other network on the fabric. Furthermore, to make it even easier to add new networks to an Internet Exchange, many IXs run "route servers," which are BGP peers which re-distribute all the connected routes. This is convenient because it means that only the IX operator and the new network need to adjust their BGP configuration when a network joins; everyone else is already peered with the route server and start getting the new routes (and which router on the switch to send that traffic to) as part of their already existing connection to the route server.

So we were all sitting there, contemplating the idea of ordering seven more cross connects and once again all logging into our routers to update our configs, and at that point, the idea of creating an Internet Exchange instead didn't seem too bad.

We could instead have all gotten cross connects into one of the existing Internet Exchanges in the HE FMT2 building, such as SFMIX, but they charge $995/year for a port on their fabric, which is more money than it's worth for all of us to cross connect for amusement's sake (most of us are amateurs and not making money on our networks). So screw it, hold my other beer, and away we go!

And that's how the Fremont Cabal Internet Exchange was born. 

We even made a website and everything.

We allocated a /64 IPv6 subnet from my /48 (which was originally allocated from another guy's /32), drummed up an IPv4 /24 that was currently between projects, and very carefully selected the private ASN 4244741280, and all that was left to get was a switch to all connect to.
Thankfully, my entire network in my cabinet is built on a Cisco 6506, which is technically a switch, so we called that close enough, and instead of having to find another piece of hardware, just allocated a VLAN on my 6506 as the switch fabric, and we were all set. Besides, we were getting a little worried that there were getting to be too few Internet Exchanges running on Cisco 6500s these days.

Now whenever someone wants to connect to the FCIX (Fremont Cabal Internet Exchange) fabric, they just get a cross connect to my cabinet, I set another port to be an access port to the FCIX VLAN, and they're hooked up to everyone.

It's only 1Gbps to each network, but most of us are only originating a few prefixes for a few servers, so we aren't really pushing the limits of single 1G links per participant yet, but just like in any real IX, as soon as someone starts saturating their link to FCIX, they can start setting up direct peering links to other networks to start shedding that traffic off their exchange links. You know... when that happens...

Ideally we would have applied for a public ASN for the exchange, but that $550 + $100/yr for a registered ASN kind of went against the objective of saving money on cross connects, and I figured the chances of someone connecting to FCIX already using one random 4 byte private ASN inside their network was pretty low. Since the IX ASN is never appended to any routes going through the exchange, there's also the fact that no one outside the exchange will ever see this ASN, so it seems like a pretty acceptable trade-off for a group of amateurs for now. (The biggest downside I can think of is that we might not be able to register this IX on peeringDB with a private ASN, to further prop up the facade that this is an Internet Exchange to be taken seriously)

Edit: OK, I stand corrected. peeringDB had no problem and we're now live on there as well. That was not expected.
The last piece to really make adding new members to this peering fabric convenient is setting up two route servers, so that each new member doesn't trigger everyone needing to log into their routers to add a new BGP peer. Instead, everyone peers with the route servers and they handle the full N-to-N exchange of routes. When a new member joins, they set up their router on the fabric's /24+/64, and peer with the two route servers, and the only other involvement needed is from one of the IX admins (which is really just me, currently) to add them to the route server. Every other member doesn't need to be involved and can just enjoy the new routes appearing on their router.
We have two BGP route servers so as I need to restart each one for maintenance reasons, everyone can still trade routes over the other one and I don't trigger a reconvergence every time I restart the daemon or VM. We even managed to get the second VM on a different hypervisor in Javier's cabinet instead of mine, for further fault tolerance.

We're still working to figure out exactly which route server software we want to use. I'm the most familiar with Quagga, but Quagga tries to emulate the Cisco model of all config changes are made on the fly through the console, where I don't want to be hand crafting config changes every time we add a member, so I'm currently taking a crash course in running BIRD as one of our route servers, and will likely be swapping various daemons in for each route server as we learn more.

Sunday, April 1, 2018

Measuring the Internet

Progress on the whole "running my own bit of the Internet" project has been going well. We've got a router, and some servers, and even a NAS, so one of the next questions is how well is our network behaving over time?

There are plenty of different ways to ask this question, and plenty of different metrics to look at. For example, to track my bandwidth usage, I'm using LibreNMS, which is a pretty good SNMP front-end to query my router every five minutes to see how many packets I'm moving.

One network monitoring tool that I've discovered as part of this project is the RIPE Atlas. It is a world-wide network of measurement probes spread across the Internet, which they use to measure the health of the Internet as a whole, but also allow others to request measurements on it.

To get started, you can request a probe and if approved, they mail you the simple hardware (clearly based on a TP-Link router running their custom firmware) to plug into your network. Once it's powered on, the probe starts taking measurements from your Internet connection, and you start earning credits to spend on your own custom measurements.
For example, I requested the probe, and although I never got any kind of email, a DHL package showed up about 6-8 weeks later with the probe + some cables inside. Once I plugged it in and registered its serial number to my account, I'm now accruing 21600 credits per day for keeping the probe online, plus another 5000-50000 credits per day for "results delivered" which I presume is running other people's custom measurements.
I haven't come up with any long term custom measurements yet, but to give you a sense of scale, a single traceroute costs 60 credits, so running a traceroute to my network from 10 random probes costs 600 credits, and RIPE's traceroute report is pretty slick.
The main reason I haven't programmed any periodic custom measurements yet is because the probe comes with a set of "built-in" measurements where it automatically measures the latency and packet loss to its first and second hop, all the root DNS servers, and some Atlas infrastructure, which already answers most of my questions on how well my network is doing. I really should set up some user-defined queries to monitor my HTTP servers, but for now I'm just accruing credits.

You can see all the public information coming from my probe here. You can even order Atlas measurements specifically from my network to your network if you specify the measurement should be sent to probe #34742, which I find rather amusing.

One thing I noticed right away is that I'm seeing 100% packet loss (solid red) to a few IPv6 root DNS servers... This is actually because Hurricane Electric and Cogent have been having a peering dispute over IPv6 for... pretty much forever, so the IPv6 Internet is actually split brain and I'm just not able to reach some parts of the IPv6 Internet from my Hurricane Electric transit...

One of the perks of running my own autonomous system is that I'm able to work on getting myself blended transit from someone other than Hurricane Electric and fix this problem myself... (Anyone with blended IPv6 transit on the west coast want to help me out?)

The probe uses about 3kbps to take its measurements, so the network load from it is what I would describe as "pretty much undetectable" considering my main transit link hovers around four orders of magnitude higher than that. This plot is from LibreNMS for my "DHCP" /29 subnet, which I use for my Atlas probe and plugging in my laptop to a spare port on my router when I'm standing in the datacenter working on my rack.

Monday, January 15, 2018

Off-the-Grid Raspbian Repositories

For one of my future projects, I'm looking at spinning up 15-25 Raspberry Pi desktops in the middle of nowhere, with somewhere between zero and awful Internet connectivity. We're making a best effort wag at what the OS image should be before-hand, but I expect we will inevitably get there and realize we want to install some other piece of software from the Raspbian software repo.

I figured the best solution to this was to pre-download ALL of the software available for Raspbian (it's only ~400GB) and sneaker-net it in. I also don't want to be making any changes to the config files on the Raspbian installs, since that would mean we'd need to remember to set things back to normal when we're done, so I want this local copy of the Raspbian repos to include some DNS trickery to make the Raspberry Pis think they're still actually talking to the actual repos on the Internet.

This means I need:

  1. A router to give the Pis IP addresses and respond to their DNS queries for and with the address of the local server instead of those server's actual public IP addresses.
  2. A laptop with at least 500GB of disk space running apache2 serving a local copy of and

My router of choice for these sorts of scratch projects where I need something to do DHCP/DNS/NAT but don't particularly care about performance or getting my router back afterwards is the WRT54GL. I've got a stack of them running mainline OpenWRT, not because OpenWRT has a nice interface, but because it is exceedingly flexible.

Beyond the basic configuration of picking a new subnet that isn't and setting a hostname for it, I made two changes:

  1. A static DHCP address allocation for my laptop running apache2.
  2. I sshed into the router and created a new hosts file that gives the laptop's IP address for the two repos I'm trying to fake.
Hosts files are the simplest way to override DNS records for single hostnames. I created /etc/hosts.rpimirror and it contains the two domains associated with my laptop's static IP address:
Once I created this extra hosts file, I added it to the web configuration via "Network > DHCP and DNS > Resolv and Hosts Files > Additional Hosts files", but of course your router's method to add additional hosts files will vary if you're not running OpenWRT. 
At this point, any Raspberry Pi plugged into this router will get faked answers back for the two repo domains, and will try and send all of those requests to the laptop, without having to edit any of the sources files on the Pis, so as soon as they get plugged into any other router, they'll behave just as normal.

Now all that's left to do is to create a local mirror of those two repos and make them available over http. 

To copy the mirror files to the laptop, I wrote a simple script which uses rsync to copy the Raspbian repo and debmirror to copy the needed parts of the repo (mainly because it doesn't support rsync for some reason). We're talking about something on the order of 400GB of files here, so strap in; this download is going to take a while regardless of how fast your Internet is. Once it's done, you should be able to rerun the same script and have it go relatively quickly since it will only be updating your existing local copy.

You will need to change the MIRRORBASE variable to somewhere on your computer that you want to store all of these files, since I have it hard-coded in my home directory.

I'm also only mirroring the stretch packages from, so you will need to edit the "release" variable if you're using a different release of Raspbian.

Script was based on the Ubuntu debmirror page and the rsync command at the bottom of the Raspbian mirror's list.

Once that download is done, you'll need to link the and folders into apache's /var/www/html root so both folders are visible:
sudo ln -s /home/kenneth/mirror/ /var/www/html/debian
sudo ln -s /home/kenneth/mirror/ /var/www/html/raspbian

At this point, you should be able to see the two repo's dists and pool folders by pointing your browser at "http://laptop's_address/raspbian" and "http://laptop's_address/debian", and any Raspberry Pis plugged into the router will be able to download any new software or updates entirely locally without a connection to the Internet.

While this does let you install any updates or software existing at the time of your repo snapshots, this doesn't get you any updates or software released after the snapshot. One option for that, assuming this local repo is somewhere entirely without Internet connectivity, is to carry it back to somewhere with Internet, rerun my repo mirror script above to download any changes to the repos, then carry the server back to where it's being used.