Sunday, April 15, 2018

Creating an Internet Exchange for Even More Fun and Less Profit

Last quarter, I was pulled into the slightly odd underground of people running their own autonomous systems, and since then, our circle of friends running autonomous systems at Hurricane Electric's FMT2 has slowly been growing.
Which is great, except that we're all running autonomous systems, which means we can set up peering links, and are you really friends with another network engineer if you're not running a cross connect between your two networks? This wasn't too bad for the first few networks joining our little cabal, but thanks to that pesky quadratic growth issue (a full mesh of n networks needs n(n-1)/2 links), the number of new cross connects needed when the fifth or sixth person joined started getting ridiculous. (It's like, four or five!)
This is, of course, an issue that real networks have to deal with as well, so when we had an eighth friend sign a service agreement with Hurricane Electric this week, the idea was (half jokingly) floated that we should just start our own Internet Exchange Point to cut down on the number of cross connects we need for each new member.

An Internet Exchange is basically just a single L2 Ethernet switch which every network plugs into, such that every network can directly set up BGP peering with / route packets to every other network on the fabric. Furthermore, to make it even easier to add new networks to an Internet Exchange, many IXs run "route servers," which are BGP speakers that redistribute all the connected networks' routes. This is convenient because it means that only the IX operator and the new network need to adjust their BGP configuration when a network joins; everyone else is already peered with the route server and starts getting the new routes (and learning which router on the fabric to send that traffic to) as part of their existing session with the route server.

So we were all sitting there, contemplating the idea of ordering seven more cross connects and once again all logging into our routers to update our configs, and at that point, the idea of creating an Internet Exchange instead didn't seem too bad.

We could instead have all gotten cross connects into one of the existing Internet Exchanges in the HE FMT2 building, such as SFMIX, but they charge $995/year for a port on their fabric, which is more money than it's worth for all of us to cross connect for amusement's sake (most of us are amateurs and not making money on our networks). So screw it, hold my other beer, and away we go!

And that's how the Fremont Cabal Internet Exchange was born. 

We even made a website and everything.

We allocated a /64 IPv6 subnet from my /48 (which was originally allocated from another guy's /32), drummed up an IPv4 /24 that was currently between projects, and very carefully selected the private ASN 4244741280; all that was left to get was a switch for everyone to connect to.
Thankfully, my entire network in my cabinet is built on a Cisco 6506, which is technically a switch, so we called that close enough, and instead of having to find another piece of hardware, just allocated a VLAN on my 6506 as the switch fabric, and we were all set. Besides, we were getting a little worried that there were getting to be too few Internet Exchanges running on Cisco 6500s these days.

Now whenever someone wants to connect to the FCIX (Fremont Cabal Internet Exchange) fabric, they just get a cross connect to my cabinet, I set another port to be an access port to the FCIX VLAN, and they're hooked up to everyone.

It's only 1Gbps to each network, but most of us are only originating a few prefixes for a few servers, so we aren't really pushing the limits of a single 1G link per participant yet. But just like in any real IX, as soon as someone starts saturating their link to FCIX, they can set up direct peering links to other networks to start shedding that traffic off their exchange link. You know... when that happens...

Ideally we would have applied for a public ASN for the exchange, but that $550 + $100/yr for a registered ASN kind of went against the objective of saving money on cross connects, and I figured the chances of someone connecting to FCIX already using one random 4-byte private ASN inside their network was pretty low. Since the IX ASN is never appended to any routes going through the exchange, no one outside the exchange will ever see it, so this seems like a pretty acceptable trade-off for a group of amateurs for now. (The biggest downside I can think of is that we might not be able to register this IX on PeeringDB with a private ASN, to further prop up the facade that this is an Internet Exchange to be taken seriously.)

Edit: OK, I stand corrected. PeeringDB had no problem with it, and we're now live on there as well. That was not expected.
The last piece to really make adding new members to this peering fabric convenient is setting up two route servers, so that each new member doesn't trigger everyone needing to log into their routers to add a new BGP peer. Instead, everyone peers with the route servers, and they handle the full N-to-N exchange of routes. When a new member joins, they configure their router with addresses from the fabric's /24 and /64, peer with the two route servers, and the only other involvement needed is from one of the IX admins (currently just me) to add them to the route servers. No other member needs to be involved; they can just enjoy the new routes appearing on their routers.
We have two BGP route servers so that when I need to restart one of them for maintenance, everyone can still trade routes over the other, and I don't trigger a reconvergence every time I restart the daemon or VM. We even managed to get the second VM on a different hypervisor, in Javier's cabinet instead of mine, for further fault tolerance.
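
To make that concrete, here's roughly what the member's side looks like on a Cisco router. This is just a sketch: the fabric addresses and the member's ASN 64496 are placeholders, not our real numbers.

router bgp 64496
 ! Sessions to both FCIX route servers on the exchange fabric
 neighbor 203.0.113.11 remote-as 4244741280
 neighbor 203.0.113.11 description FCIX route server 1
 neighbor 203.0.113.12 remote-as 4244741280
 neighbor 203.0.113.12 description FCIX route server 2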

We're still working out exactly which route server software we want to use. I'm most familiar with Quagga, but Quagga emulates the Cisco model where all config changes are made on the fly through the console, and I don't want to be hand-crafting config changes every time we add a member. So I'm currently taking a crash course in running BIRD as one of our route servers, and we'll likely swap various daemons in for each route server as we learn more.
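
Part of BIRD's appeal for this job is that a route server peer is just a small, templatable block of config. A sketch of what one member's block might look like in BIRD 1.x, with made-up addresses and a made-up member ASN:

router id 203.0.113.10;

# One block like this per member; easy to generate from a template
protocol bgp member_example {
    local as 4244741280;            # the FCIX (private) ASN
    neighbor 203.0.113.21 as 64496; # hypothetical member
    rs client;                      # route server mode: don't prepend our ASN
    import all;
    export all;
}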

Sunday, April 1, 2018

Measuring the Internet

Progress on the whole "running my own bit of the Internet" project has been going well. We've got a router, and some servers, and even a NAS, so one of the next questions is: how well is our network behaving over time?

There are plenty of different ways to ask this question, and plenty of different metrics to look at. For example, to track my bandwidth usage, I'm using LibreNMS, which is a pretty good SNMP front-end to query my router every five minutes to see how many packets I'm moving.


One network monitoring tool that I've discovered as part of this project is RIPE Atlas. It is a worldwide network of measurement probes spread across the Internet, which RIPE uses to measure the health of the Internet as a whole, but which also lets others request their own measurements on it.

To get started, you can request a probe and if approved, they mail you the simple hardware (clearly based on a TP-Link router running their custom firmware) to plug into your network. Once it's powered on, the probe starts taking measurements from your Internet connection, and you start earning credits to spend on your own custom measurements.
For example, I requested a probe, and although I never got any kind of email, a DHL package showed up about 6-8 weeks later with the probe plus some cables inside. Once I plugged it in and registered its serial number to my account, I started accruing 21600 credits per day for keeping the probe online, plus another 5000-50000 credits per day for "results delivered," which I presume is running other people's custom measurements.
I haven't come up with any long term custom measurements yet, but to give you a sense of scale, a single traceroute costs 60 credits, so running a traceroute to my network from 10 random probes costs 600 credits, and RIPE's traceroute report is pretty slick.
The main reason I haven't programmed any periodic custom measurements yet is that the probe comes with a set of "built-in" measurements, where it automatically measures the latency and packet loss to its first and second hops, all the root DNS servers, and some Atlas infrastructure, which already answers most of my questions about how well my network is doing. I really should set up some user-defined queries to monitor my HTTP servers, but for now I'm just accruing credits.
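
For the curious, ordering a one-off measurement is just an HTTP POST to the Atlas API. A sketch (the API key and target here are placeholders, and the exact parameters are worth double-checking against their docs):

curl -s -X POST "https://atlas.ripe.net/api/v2/measurements/?key=MY_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
          "definitions": [{"type": "traceroute", "af": 4,
                           "target": "example.net",
                           "description": "traceroute to my network"}],
          "probes": [{"requested": 10, "type": "area", "value": "WW"}],
          "is_oneoff": true
        }'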

You can see all the public information coming from my probe here. You can even order Atlas measurements specifically from my network to your network if you specify the measurement should be sent to probe #34742, which I find rather amusing.

One thing I noticed right away is that I'm seeing 100% packet loss (solid red) to a few IPv6 root DNS servers... This is because Hurricane Electric and Cogent have been having a peering dispute over IPv6 for... pretty much forever, so the IPv6 Internet is effectively split-brained, and I'm just not able to reach some parts of it from my Hurricane Electric transit...

One of the perks of running my own autonomous system is that I'm able to work on getting myself blended transit from someone other than Hurricane Electric and fix this problem myself... (Anyone with blended IPv6 transit on the west coast want to help me out?)

The probe uses about 3kbps to take its measurements, so the network load from it is what I would describe as "pretty much undetectable" considering my main transit link hovers around four orders of magnitude higher than that. This plot is from LibreNMS for my "DHCP" /29 subnet, which I use for my Atlas probe and plugging in my laptop to a spare port on my router when I'm standing in the datacenter working on my rack.

Monday, January 15, 2018

Off-the-Grid Raspbian Repositories

For one of my future projects, I'm looking at spinning up 15-25 Raspberry Pi desktops in the middle of nowhere, with somewhere between zero and awful Internet connectivity. We're making a best-effort wag at what the OS image should be beforehand, but I expect we will inevitably get there and realize we want to install some other piece of software from the Raspbian software repo.

I figured the best solution was to pre-download ALL of the software available for Raspbian (it's only ~400GB) and sneaker-net it in. I also don't want to make any changes to the config files on the Raspbian installs, since that would mean remembering to set things back to normal when we're done, so I want this local copy of the Raspbian repos to include some DNS trickery to make the Raspberry Pis think they're still talking to the actual repos on the Internet.

This means I need:

  1. A router to give the Pis IP addresses and respond to their DNS queries for mirrordirector.raspbian.org and archive.raspberrypi.org with the address of the local server instead of those servers' actual public IP addresses.
  2. A laptop with at least 500GB of disk space running apache2 serving a local copy of mirrordirector.raspbian.org/raspbian and archive.raspberrypi.org/debian


My router of choice for these sorts of scratch projects where I need something to do DHCP/DNS/NAT but don't particularly care about performance or getting my router back afterwards is the WRT54GL. I've got a stack of them running mainline OpenWRT, not because OpenWRT has a nice interface, but because it is exceedingly flexible.

Beyond the basic configuration of picking a new subnet that isn't 192.168.1.0/24 and setting a hostname for it, I made two changes:

  1. A static DHCP address allocation for my laptop running apache2.
  2. I sshed into the router and created a new hosts file that gives the laptop's IP address for the two repos I'm trying to fake.
Hosts files are the simplest way to override DNS records for single hostnames. I created /etc/hosts.rpimirror and it contains the two domains associated with my laptop's static IP address:
10.2.2.2    mirrordirector.raspbian.org
10.2.2.2    archive.raspberrypi.org
Once I created this extra hosts file, I added it to the web configuration via "Network > DHCP and DNS > Resolv and Hosts Files > Additional Hosts files", but of course your router's method to add additional hosts files will vary if you're not running OpenWRT. 
At this point, any Raspberry Pi plugged into this router will get faked answers for the two repo domains and will send all of those requests to the laptop, without any edits to the sources files on the Pis; as soon as they get plugged into any other router, they'll behave just as normal.
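
A quick sanity check from any client on the router's LAN (with the laptop at 10.2.2.2, as above):

nslookup mirrordirector.raspbian.org
# Should answer with 10.2.2.2 instead of the repo's real public address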

Now all that's left to do is to create a local mirror of those two repos and make them available over http. 

To copy the mirror files to the laptop, I wrote a simple script which uses rsync to copy the Raspbian repo and debmirror to copy the needed parts of the archive.raspberrypi.org repo (mainly because that repo doesn't support rsync for some reason). We're talking about something on the order of 400GB of files here, so strap in; this download is going to take a while regardless of how fast your Internet is. Once it's done, you should be able to rerun the same script and have it go relatively quickly, since it will only be updating your existing local copy.

You will need to change the MIRRORBASE variable to somewhere on your computer that you want to store all of these files, since I have it hard-coded in my home directory.

I'm also only mirroring the stretch packages from archive.raspberrypi.org, so you will need to edit the "release" variable if you're using a different release of Raspbian.


The script was based on the Ubuntu debmirror page and the rsync command at the bottom of the Raspbian mirrors list.
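
In outline, the script looks something like this. Treat it as a sketch: the rsync endpoint and debmirror flags here are approximations, so check them against the Raspbian mirrors list and the debmirror man page before kicking off a 400GB download.

#!/bin/bash
# Where the ~400GB of packages will land; change this for your machine
MIRRORBASE=/home/kenneth/mirror
# Which Raspbian release to pull from archive.raspberrypi.org
release=stretch

# Raspbian proper supports rsync directly (see the mirrors list for the
# canonical rsync URL)
rsync -av --delete \
    rsync://archive.raspbian.org/archive/raspbian/ \
    "$MIRRORBASE/mirrordirector.raspbian.org/raspbian/"

# archive.raspberrypi.org doesn't support rsync, so use debmirror over HTTP
debmirror --method=http --host=archive.raspberrypi.org --root=debian \
    --dist=$release --arch=armhf --section=main,ui \
    --ignore-release-gpg --no-source --progress \
    "$MIRRORBASE/archive.raspberrypi.org/debian"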

Once that download is done, you'll need to link the mirrordirector.raspbian.org/raspbian and archive.raspberrypi.org/debian folders into Apache's /var/www/html root so both folders are visible:
sudo ln -s /home/kenneth/mirror/archive.raspberrypi.org/debian /var/www/html/debian
sudo ln -s /home/kenneth/mirror/mirrordirector.raspbian.org/raspbian /var/www/html/raspbian


At this point, you should be able to see the two repos' dists and pool folders by pointing your browser at "http://laptop's_address/raspbian" and "http://laptop's_address/debian", and any Raspberry Pi plugged into the router will be able to download any new software or updates entirely locally, without a connection to the Internet.

While this does let you install any updates or software existing at the time of your repo snapshots, this doesn't get you any updates or software released after the snapshot. One option for that, assuming this local repo is somewhere entirely without Internet connectivity, is to carry it back to somewhere with Internet, rerun my repo mirror script above to download any changes to the repos, then carry the server back to where it's being used.

Wednesday, November 15, 2017

Creating an Autonomous System for Fun and Profit

At its core, the Internet is an interconnected fabric of separate networks. Each network which makes up the Internet is operated independently and only interconnects with other networks in clearly defined places.

For smaller networks like your home, the interaction between your network and the rest of the Internet is usually pretty simple: you buy an Internet service plan from an ISP (Internet Service Provider), they give you some kind of hand-off through something like a DSL or cable modem, and give you access to "the entire Internet". Your router (which is likely also a WiFi access point and Ethernet switch) then only needs to know about two things: your local computers and devices are on one side, and the ENTIRE Internet is on the other side of that network link given to you by your ISP.

For most people, that's the extent of what needs to be understood about how the Internet works. Pick the best ISP, buy a connection from them, and attach computers needing access to the Internet. And that's fine, as long as you're happy with only having one Internet connection from one vendor, who will lend you some arbitrary IP address(es) for the extent of your service agreement; but that starts not being good enough when you don't want to be beholden to a single ISP or a single connection for your connectivity to the Internet.

That also isn't good enough if you *are* an Internet Service Provider, so you are literally a part of the Internet. You can't assume that the entire Internet is "that way" when half of it is actually in the other direction.

This is when you really have to start thinking about the Internet and treating the Internet as a very large mesh of independent connected organizations instead of an abstract cloud icon on the edge of your local network map.

Which is pretty much never for most of us.

Almost no one needs to consider the Internet at this level. The long flight of steps from DSL for your apartment up to needing to be an integral part of the Internet means that pretty much regardless of what level of Internet service you need for your projects, you can probably pay someone else to provide it and don't need to sit down and learn how BGP works and what an Autonomous System is.

But let's ignore that for one second, and talk about how to become your own ISP.

To become your own Internet Service Provider with customers who pay you to access the Internet, or be your own web hosting provider with customers who pay you to be accessible from the Internet, or your own transit provider who has customers who pay you to move their customer's packets to other people's customers, you need a few things:
  1. Your own public IP address space allocated to you by an Internet numbering organization
  2. Your own Autonomous System Number (ASN) to identify your network as separate from everyone else's networks
  3. At least one router connected to a different autonomous system speaking the Border Gateway Protocol to tell the rest of the Internet that your address space is accessible from your autonomous system.
Once your router tells other networks that you're now the home of some specific range of IP addresses, and that advertisement propagates out through the rest of the Internet, everyone else's routers will have an entry in their routing tables, so if they see any packets with your address on them, they know which direction to send them so they eventually end up on your doorstep.
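
In Cisco IOS terms, that whole arrangement boils down to surprisingly little config. A sketch using documentation prefixes and a made-up ASN (Hurricane Electric really is AS6939, though):

router bgp 64496
 ! Announce our address space to the world
 network 203.0.113.0 mask 255.255.255.0
 ! One eBGP session to an upstream autonomous system
 neighbor 198.51.100.1 remote-as 6939
! IOS only originates a network statement if the prefix is in the routing
! table, so pin it there with a static route to Null0
ip route 203.0.113.0 255.255.255.0 Null0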

Wait, but why don't you need any of this for your home Internet?

So why doesn't your home router need to speak BGP or you need to own public IP space to be reachable from the rest of the Internet? Because your ISP takes care of that for you. In addition to funding the wiring from their data center to your house, the $50/month you pay to your ISP funds them getting address space allocated for you, advertising it to the rest of the Internet, and getting enough connectivity to the rest of the Internet that your packets can get where they're headed.

The same answer is true on the other end when you spin up a web server somewhere like Digital Ocean or Amazon Web Services; they handle IP addressing and BGP for you so all you need to worry about is setting up the software for your own corner of the Internet on the one specific address they set aside for you from their big pools of addresses that they manage.

Wait, but why am I blogging about BGP then?

If you've made it this far, you're probably pretty curious why I'm talking about BGP at all, and what this blog post is leading up to. 

So... I recently set up my own autonomous system... and I don't really have a fantastic justification for it...

My motivation was twofold:
  1. One of my friends and I sat down and figured out that splitting the cost of a rack in Hurricane Electric's FMT2 data center marginally lowered our monthly hosting expenses versus all the paid services we had scattered across the Internet, which could all be condensed into this one rack.
And this first reason on its own is a perfectly valid justification for paying for co-location space at a data center like Hurricane Electric's, but isn't actually a valid reason for running it as an autonomous system, because Hurricane Electric will gladly let you use their address space for your servers hosted in their building. That's usually part of the deal when you pay for space in a data center: power, cooling, Internet connectivity, and your own IP addresses.
  2. Another one of my friends challenged me to do it as an Autonomous System.
So admittedly, my justification for going through the additional trouble to set up this single rack of servers as an AS is a little more tenuous. I will readily admit that, more than anything else, this was a "hold my beer" sort of engineering moment, and not something that is at all needed to achieve what we actually needed (a rack to park all our servers in).

But what the hell; I've worked out how to do it, so I figured it would make an entertaining blog post. So here's how I set up a multi-homed autonomous system on a shoe-string budget:

Step 1. Found a Company

You're going to need a legal entity of some sort for a few steps here, so you're going to need a business name. I already happened to have one from other projects; at the minimum, you'll want to go to your local city hall and get a business license. My business license cost me the effort to come up with a kick-ass company name and about $33/year, and I've never even gotten around to doing anything fancy like incorporating, so it's really just a piece of paper that hangs in my hallway and allows me to file 1099-MISC forms on my tax returns within the city of Sunnyvale, CA. In the context of this project, the business license primarily just needs to look official enough to get me approved when I apply for the autonomous system number needed to set up my own network.

Step 2. Get Yourself Public Address Space

This step is, unfortunately, probably also the most difficult. You need to get yourself a block of public IP addresses big enough to be advertised over BGP (there are generally agreed-upon minimums to keep the global routing table from getting ridiculous) and allocated for you to advertise yourself. You'll probably want both IPv4 addresses, which means at least a /24 subnet (256 addresses), and IPv6 addresses, which means at least a /48 subnet (65536 subnets of /64).

The big problem is that there are no IPv4 addresses left. There were only 4 billion of them in the first place, and we've simply run out, so the "normal" procedure of going to your local Internet numbers organization like ARIN isn't that productive. If all you need is IPv6 space (which is unlikely) and you happen to be in North America, you actually can still go to ARIN and request resources, but you'll almost certainly still need IPv4 addresses too. There are other solutions, like buying IPv4 space on the second-hand market, but that's getting expensive, so here's probably the least helpful part of this whole blog post:

I just borrowed some address space from my friends.

For example, I've got another friend who, for a different project, got a /32 IPv6 allocation from ARIN, which is a metric TON of addresses, so I asked him if I could have a (relatively small) /48 sub-allocated from his /32, and he drafted me an official-looking "Letter of Authorization" on his company letterhead that literally just says:
"Dear Sirs,
"Please accept this letter of authority on behalf of [FRIEND'S COMPANY NAME] to permit the BGP announcement of [/48 IPv6 SUBNET INSIDE HIS /32 SUBNET] by [KENNETH'S COMPANY NAME].
"Sincerely, [FRIEND'S SIGNATURE]"
It's not as impressive as having IP space with my name on it in ARIN's database, but it's also a whole hell of a lot cheaper than even the smallest address allocation you can get from ARIN (a couple beers vs $250/year).

This letter of authorization is also the first instance where learning how the Internet actually works gets a little weird. That letter is literally all it took for me to take control of a sub-block of someone else's public address space and get it routed to my network instead of theirs. Some of my network peers later asked me to provide this LoA when we were setting up my network links, but that meant I just sent them a PDF scan of a letter with my friend's signature on it. And I mean an actual signature; not some kind of fancy cryptographic signature, but literally a blue scribble on a piece of paper.

To be fair, the letterhead looked very official.

Step 3. Find Yourself Multiple Other Autonomous Systems to Peer With

So the name of the game with the Internet is that you need to be interconnected with at least one other part of it to be able to reach any of it, but that isn't necessarily good enough here. If you were only peering with one other autonomous system, you probably wouldn't even need to run BGP, and if you did, you could even do it using a "private" autonomous system number (ASN) which your upstream provider could just replace with their own before passing your routes on to the rest of the Internet.

But that's not good enough here. I didn't want to use some kind of lousy non-public ASN! This project was a personal challenge from a friend and the network engineering equivalent of driving a pickup with a lift kit, so we need a public ASN. We'll later need to apply to ARIN to get one allocated, and we'll need to list at least two other autonomous systems we're going to be peering with to justify the "multi-homed" routing policy that gets ARIN to allocate us an ASN.

This multi-homed policy where we're peering with multiple other networks is also kind of neat because it means that if one of our upstream providers decides to take the day off, or only provide us a certain amount of bandwidth to the rest of the Internet, we have alternatives we can use from our peering links into other autonomous systems.

This whole concept of peering, and all the different types of peering policies you might want for your network, is a pretty deep rabbit hole, so I ended up buying a whole book just on peering, which was very helpful: The 2014 Internet Peering Playbook by Norton. He also has a website, which covers a significant fraction of the content of his book in a less curated form.

Peering is definitely one of these "how the sausage gets made" sorts of topics that a lot of networks tend not to like to talk about. Exactly how well connected one network is to other networks is hugely important to their customers' quality of service, so everyone wants to appear extremely well connected without showing their hand and letting others see their weaknesses. This means the peering community is rife with quiet backroom handshake deals that are never publicly disclosed, and you can spend hours digging through online looking glass servers (which show you the global BGP tables) trying to figure out what in the world networks are doing with their peering links.

Long story short, I'm getting a "paid transit" peering link from Hurricane Electric as part of renting one of their cabinets, and I managed to find a few friends in the Hurricane Electric FMT2 data center who had spare Ethernet ports on their routers and were willing to set up free peering links for what little traffic happens to go directly between our networks. Free peering tends to be common when both networks are at about the same level in the network provider / customer hierarchy: tier 1 transit providers peer for free to make the Internet happen, and lower-tier small networks peer for free to bypass both of them paying higher-tier ISPs to transit traffic they can move directly. But if either network thinks it can charge the other money, that might happen as well.

This is obviously where human networking becomes exceedingly important in computer networking, so start making friends with the peering coordinators for other networks which you expect to be trading a lot of traffic with. Every packet I'm able to shed off onto one of these lateral peering links with another AS is traffic that doesn't tie up my primary 1Gb hand-off from HE and makes my network faster.

Step 4. Apply for an Autonomous System Number

There are five Internet number organizations world-wide, and since I'm in North America the one I care about is ARIN, so I created an account on ARIN's website and:
  1. Created a Point of Contact Record for myself - Pretty much just a public profile for my identity: "Kenneth Finnegan, address, phone number, etc etc"
  2. Requested an Organization Identifier for "[MY COMPANY NAME]" and tied my point of contact record to it - This was done by opening a ticket and attaching my business license to prove that my company actually exists. Since my company isn't its own legal entity, ARIN internally tracks it as "Kenneth Finnegan (doing business as) [MY COMPANY NAME]", but this doesn't show up on the public listing, so it wasn't a big deal.
  3. Requested an ASN for my Organization Identifier - This is where I needed to be able to list at least two other ASes I was planning on peering with. 
  4. Paid the $550 fee for getting an ASN issued, per ARIN's current fee schedule for resources.
The whole process took about a week between setting up the orgID and requesting the ASN, mainly because I didn't quite get either support ticket right on the first try due to not quite knowing what I was doing, but in the end ARIN took my money and issued me an ASN all my own.

Step 5. Source a Router Capable of Handling the Entire Internet Routing Table

Remember how your home router only needs two routes? One for your local computers and one for the rest of the Internet, so the two routes are probably something like "192.168.1.0/24 (local)" and "0.0.0.0/0 (WAN)".

Processing a full BGP Internet routing table is a little more involved than that, and means you need a much more powerful router than one you can go buy at Office Depot. You could probably build a router yourself out of a server running a router operating system like pfSense, or just your favorite Linux distro with the right iptables voodoo and a BGP daemon like Quagga, but that wasn't part of the originally thrown gauntlet for this project.

The challenge was to use a real Cisco router capable of handling the entire Internet routing table, and I wanted one that can switch it at line speed. Hurricane Electric alone is giving us a 1Gb Ethernet hand-off, not including all the additional bandwidth out of our rack available due to other peering links, so we wanted a router that could definitely handle at least 1Gbps.
Meet the Cisco Catalyst 6506. Yes, it's rack mount, for a 19" rack. And it's 12U high, which, since a U is 1.75", means this router is almost two feet tall. And 150 pounds. And burns about 1kW.

Yes, its size is ridiculous. Which, for this project, isn't entirely out of line.

But it's also kind of a really awesome router, particularly for being a nearly two-decade-old product. The 6500 series is a line of switch/router chassis which support 3, 4, 6, 9, or 13 modular line cards/router supervisors. In the early 2000s this was the best switch money could buy, and it is definitely showing its age now, but that's perfect. Network engineers love to hate their 6500s because they're so old, but its relatively limited "only" 30 million packets per second of throughput is plenty for an autonomous system that fits in a single rack, and its age means I was able to pick up a 6506 configured with dual supervisors and three (3!) 48-port Gigabit Ethernet line cards on eBay for $625 shipped!

I probably could have found a more reasonably sized router for what I needed, but the 6506 has the appeal that it definitely has more switching horsepower than I'll ever need for this project, and its biggest downsides are its size and power draw, neither of which is that big of an issue since I've got a whole 44U rack for just a few servers and I don't get billed for my power usage. More desirable routers have the big downside that they're actually desirable, so other people are willing to spend a few thousand dollars on them, and I didn't really want to drop $2k on a well kitted out 3945.

The 6506 probably deserves blog posts of its own, but the main thing is that low-end configurations like this are cheap on eBay, with the one disadvantage that they don't come with a supervisor card with enough memory to handle a full Internet table. This means I did need to scrounge a sup720-BXL supervisor, which can handle 1 million routes in its routing table. Another few hundred bucks on eBay, or a friend with access to the right kind of e-waste pile, solves this problem.
Granted, a million routes isn't actually that much. The public IPv4 table is about 675,000 routes, and IPv6 is another 45,000, and they're both growing fast, so in another 2-3 years the Internet is going to exceed the capabilities of this router. When that happens, I'll need to either replace it with something more advanced than this ancient beast or start using some tricks to trim down the full table. If you'd like to follow along at home and watch the IPv4 routing table march towards the demise of the cat6500, you can find a link to a bunch of graphs here.
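
One such trick, sketched in IOS with made-up addresses: ask the upstream for a default route and filter out everything else, trading table size for less optimal path selection:

ip prefix-list DEFAULT-ONLY seq 5 permit 0.0.0.0/0
router bgp 64496
 ! Learn just a default route from this upstream instead of the full table
 neighbor 198.51.100.1 prefix-list DEFAULT-ONLY in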
I also added a 4 port 10Gb line card, because 10Gb is awesome, and took one of the 48x1Gb line cards out because I really didn't need 144 Ethernet ports on this thing. That's just a ridiculous number of Ethernet ports for a single rack.

So the final configuration is:

  1. 48x1Gb line card for my four copper peering links with other autonomous systems, including Hurricane Electric
  2. 4x10Gb line card for my peering link with one of my friends who happened to have a spare 10Gb port on his router, and who also thinks 10Gb is awesome. This will probably also serve some local 10Gb links in the rack once I grow beyond one server.
  3. A blankoff plate instead of my third 48x1Gb line card to save power.
  4. 48x1Gb line card for the local servers in the cabinet. Since we've only got two servers installed so far, there's currently only a 2x1Gb bonded link to my server and a 4x1Gb bonded link + out of band management to my friend's server.
  5. The sup720-BXL which does the actual router processing and makes this whole mess a BGP router. The one cable from this card runs up to a console server letting me manage this beast remotely from the comfort of my apartment without standing in a cold data center.
  6. One of my spare sup720s (not XL), which can't handle the full Internet table, pulled out an inch so it doesn't power up, because this seemed like the best place to store it until I figure out what to do with it.

Step 6. Turn it All On and Pray

Wait, I mean, carefully engineer your brand new network and swagger into the data center confident that your equipment is all correctly configured.

But seriously, I found a few textbooks on BGP network design and happened to have a 13 hour flight to China and back to take a crash course in BGP theory, and spent a week in my apartment with ear plugs in taking a crash course in how to interact with Cisco devices more sophisticated than just setting up VLANs on an Ethernet switch, which is about all my experience with Cisco IOS before this month.

After spending a week lovingly hand crafting my router configuration (while listening to networking podcasters bagging on how ridiculous it is that we still lovingly hand craft our routing configurations), I was ready to deploy my router plus all of our servers in the data center.

When I signed my service agreement with Hurricane Electric, it consisted of:

  • One full 44U rack with threaded posts.
  • Two single phase 208V 20A feeds.
  • A single 1Gbps copper network drop.
The network operations center then also emailed me and asked how many of Hurricane's IP addresses I needed, which was two: one for my router's uplink interface, and a second for a serial port console server, so if I ever manage to really bork my router's configuration I can still reach its console port without having to trek over to the data center and stand there in the cold. This means that my hand-off from HE is a /29, so I actually have five usable addresses; that Ethernet drop goes into a fixed eight-port GigE switch which breaks out the console server, then plugs into the 6506 for the majority of my Internet traffic.

Once I confirmed that my network feed from HE was live, I then opened a support ticket with HE saying "My BGP router is on [IPv4 ADDRESS] and [IPv6 ADDRESS] and will be advertising these specific routes per attached letters of authorization" and waited for them to set it up on their side, which took less than an hour before I got an email from them saying "we turned it on, and your router connected, so it looks good from here".
And we're off to the races.

At this point, Hurricane Electric is feeding us all ~700k routes for the Internet, we're feeding them our two routes for our local IPv4 and IPv6 subnets, and all that's left to do is order all our cross-connects to other ASes in the building willing to peer with us (mostly for fun) and load in all our servers to build our own personal corner of the Internet.

The only major goof so far has been accidentally feeding the full IPv6 table to the first other peer we turned on, but thankfully he has a much more powerful supervisor than the Sup720-BXL, so he just sent me an email to knock that off; a little fiddling with my BGP egress policies and we were all set.
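
The fix amounts to an explicit egress filter, so only my own prefixes ever go out to a peer regardless of what's in my table. Roughly, in IOS, with made-up addresses:

! Only ever announce our own /48 to this peer
ipv6 prefix-list MY-V6-PREFIXES seq 5 permit 2001:db8:100::/48
router bgp 64496
 address-family ipv6
  neighbor 2001:db8:ffff::1 prefix-list MY-V6-PREFIXES out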

In the end, setting up my own autonomous system wasn't exactly simple, and it was definitely not justified, but sometimes in life you just need to take the more difficult path. And there's a certain amount of pride in being able to claim that I'm part of the actual Internet. That's pretty neat.

And of course, thanks to all of my friends who variously contributed parts, pieces, resources, and know-how to this on-going project. I had to pull in a lot of favors to pull this off, and I appreciate it.

Tuesday, August 1, 2017

Building a Stand-Alone Raspberry Pi Dispatch Server

As part of my amateur radio hobby, I volunteer and provide communication support for a few events every year, the largest of which is the Wildflower Triathlon. Compared to most amateur radio communication support roles, the scale of Wildflower is pretty unprecedented with us having multiple active dispatchers on multiple repeaters handling calls at the same time. (Presentation on Wildflower)

One of the key pieces of technology which enables us to keep multiple dispatchers in sync with regards to what is going on is a system called Computer Aided Dispatch (CAD). Computer Aided Dispatch aids radio dispatchers by replacing what is typically local paper log sheets (or, on the low end, just a legal pad) with a web application. This is powerful since it means that every dispatcher can see every log entry and incident ticket as it is entered in real time, logged to the web server with automatic time-stamping, etc.

There are plenty of CAD software packages out there, and the open source option that comes up most often is TicketsCAD, but my impression is that it is a relatively complex system to configure and use; for events like Wildflower, where the CAD system is only deployed for a few hours to days, that complexity is burdensome. Most of our users have never interacted with a CAD system before, so simplicity on the user side is also an important motivator, which is why we use a very simple and clean CAD system called BlackFlower.

I don't know the entire history of the BlackFlower CAD package, but in short, there was an overlap of staff between Wildflower and Burning Man in the early 2000s, and they decided to develop a CAD system specifically for these two events. Since then, those of us running Wildflower have ended up not being involved with the development of BlackFlower, but we still use the package and just don't get hung up on it including Burning Man logos.

For smaller events, where we're not spinning up a multi-seat dispatch center, the overhead of lugging a Linux server into the field is usually prohibitive, but I thought it would be interesting to see if I could build a minimum viable product using just a Raspberry Pi with some extra hardware to make it more suitable for getting pulled out of a box in the field and turned on for either training events or really really small events.

I'm still hesitant to actually recommend using this as a CAD system running off a microSD card in a Raspberry Pi for real events where you're handling real medical calls. I see a CAD system running off a Raspberry Pi as most interesting as a way to lower the barrier to entry for live demonstrations and drills to evaluate the concept. Once an organization decides it wants to rely on CAD, I'd still recommend moving to a platform with a better track record for reliability than the Raspberry Pi (and things like a real hard drive and an on-board RTC).

Bill of materials

Installing Raspbian

Install Raspbian Lite (or full Raspbian if you want a GUI) per any of the guides available. My personal favorite is to just copy NOOBS onto an SD card and download Raspbian Lite from that on the Pi. Once installed and booted, you need to perform some of the configuration from the Pi itself with a monitor and keyboard, since SSH is disabled by default. Once I get SSH set up, I typically switch to using the Pi headless and SSHing into it from one of my other systems, but all of this guide is doable from the Pi itself if you prefer. While I have raspi-config open, I just make all of the needed changes in one go (disclaimer: the raspi-config menus tend to change periodically, so you might have to wing it vs what I have here):
  • Log in with username pi and the default password raspberry.
  • Run sudo raspi-config
  • 1 - Change User Password: Set it to something you'll remember. This is your "pi" account for logging into Linux on the system.
  • 2 - Hostname: I set mine to gobox.
  • 3 - Boot Options → B2 Wait for Network at Boot → No
  • 4 - Localization Options: Set your locale, timezone, and Wi-Fi country code. I use en_US.UTF-8, America → Los Angeles, and US respectively.
  • 5 - Interfacing Options: Enable SSH so we can log into Linux remotely and I2C for the Real Time Clock hat.
  • 7 - Advanced Options → A3 Memory Split → 16 because we're about to unplug the monitor for the last time so why reserve more memory for the GPU? If you're using full Raspbian for a more user friendly Linux experience, you might not want to do this.


Set up and configure your RTC (Real Time Clock) hat following Adafruit's tutorial. This is so that when the Raspberry Pi boots somewhere without an Internet connection, it can still know what day and time it is, instead of defaulting to Jan 1, 1970, which is what it does without an RTC hat. Since I'm using a DS3231 hat, the line I ended up adding to my /boot/config.txt is "dtoverlay=i2c-rtc,ds3231".
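
For reference, on my DS3231 the setup boiled down to something like the following (per the Adafruit tutorial; the details may have drifted since this was written):

# Enable the RTC overlay (ds3231 in my case)
echo "dtoverlay=i2c-rtc,ds3231" | sudo tee -a /boot/config.txt
# Remove the fake hardware clock that would otherwise fight the real RTC
sudo apt -y remove fake-hwclock
sudo update-rc.d -f fake-hwclock remove
# After a reboot with correct time (e.g. from NTP), write it into the RTC...
sudo hwclock -w
# ...and read it back to confirm the hat is alive
sudo hwclock -r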

Install software dependencies

# Quality of life utilities that I install on every Linux system
sudo apt install vim dstat git screen
# Basic LAMP Dependencies
sudo apt install apache2 php5 mysql-server mysql-client php5-mysql
# Set a password for the root MySQL user and remember it. This is the password to log into the MySQL server, and shouldn't be confused with your Linux user password set earlier (except I just use the same one because why not; I'm the same admin for both)
# BlackFlower dependencies
sudo apt install perl libnet-ssleay-perl openssl libauthen-pam-perl libpam-runtime libio-pty-perl python apt-show-versions
# I also like to install a MySQL database management tool like phpmyadmin so I can reach into the BlackFlower database and tune a bunch of the options myself using a nice web frontend. Be sure to tell it to auto reconfigure apache2 since that's the web server we're using. It will ask for the database root password you set two commands ago when you installed mysql-server
sudo apt install phpmyadmin

We now have a web server up and running, with phpmyadmin as a friendly MySQL frontend to manage databases (including BlackFlower's), available at http://gobox/phpmyadmin/ (or whatever you set your hostname to).

Next up is installing BlackFlower, which consists of copying all the web assets into the web server folder (/var/www/html/ in Debian-based systems) and running the included bash script to load information into the BlackFlower config file and copy a blank database into MySQL. The official release tarballs for BlackFlower are available here, but there are some issues with the official releases, so I actually maintain a mirror of those tarballs, plus a branch with my patches, on GitHub. For this deployment I'm using my "fix12" branch, which is all my latest patches against the most-recent-as-of-this-writing v1.12.3pre1 release.

In these instructions, I'm copying the BlackFlower files into /var/www/html/cad/, so that the cad system will be available at http://hostname/cad/, which means the same web server can be used for other things, but if you want to use a different URL, just change the part after /var/www/html/.

Install BlackFlower

mkdir -p ~/src
cd ~/src
git clone https://github.com/PhirePhly/blackflower.git -b fix12
# Copy the app files into the web server
sudo mkdir -p /var/www/html/cad
sudo cp -r ~/src/blackflower/* ~/src/blackflower/.htaccess /var/www/html/cad/
sudo chown -R www-data:www-data /var/www/html/cad
# Configure BlackFlower for the local environment
cd /var/www/html/cad
sudo bash ./initialize.sh

This initialize.sh script prompts you for all the information to access your MySQL server and what your main admin account name should be.

Host for MySQL database: Accept default (localhost)
Name of host to allow apps: Accept default (localhost)
Enter MySQL admin password: whatever you entered during the apt install of mysql-server
Enter database name to create for CAD: Enter something unique if you want to run multiple instances of BlackFlower on one system, but for stand-alone systems, the default is fine (cad)
MySQL username: Accept default (cad)
New password for this user: Enter something good
Enter CAD administrator username: This is where we're now talking about users inside the CAD system instead of MySQL users. The default is good (Administrator)
The Administrator user starts with the password "default-admin-pw", but it will prompt you for a unique password once you log in. This is the password for just logging into the BlackFlower CAD system, so depending on the scale of your event, you might want a unique password for the CAD admin vs the SQL admin vs Linux admin accounts, but I generally just use the same password for all three since our IT team is handling all computer systems at the same time.
At this point, you should have a functional CAD server. If you're only planning on using it from the local Raspberry Pi, or you've already got a functional LAN at your event to hang this server off of, you're done. Point a browser at it, log in with "Administrator:default-admin-pw" and reset the password to something unique, load in units for your event, create lower-level user accounts for your dispatchers, and you're ready to run an event. Exact details for using BlackFlower as a CAD system, and all the tricks / conventions we use for it, deserve their own blog post, but even just using its event logger system can be a step up for an event vs a pad of paper. [Insert philosophical debate about the pros and cons of a computer based system vs the elegant simplicity of a legal pad here]

Note: The root http://hostname/ URL at this point still points to the default "Apache is alive!" index.html file in /var/www/html/. You probably want to replace that with some sort of homepage for the event LAN in case someone accidentally forgets to add the /cad/ to the end. I usually just have a list of bare links to the network resources for the event, which at the very least is the CAD server, the phpmyadmin page, and maybe a folder of files like a PDF of the event map.

Note: The BlackFlower documentation also says this, but it's worth mentioning that this is just a small CRUD webapp built by a few guys volunteering for an event, so don't expect BlackFlower to be particularly hardened against adversarial attackers; exposing an instance of it to the whole Internet, or even just the event's free WiFi, may be a bad idea. Every event should distinguish between the trusted and the untrusted parts of their network, and this server should be sitting squarely in one of the trusted subnets.

WiFi Access Point

Take this running CAD server, add a consumer grade router/AP like the classic WRT54GL or a newer router like my favorite, the Asus 66U, and you have a reasonable CAD system. But we can still go deeper! What if we didn't need a separate WiFi access point and instead used the WiFi chipset on the Raspberry Pi itself?

The official documentation for setting up the Pi as an access point is pretty good, and I followed it up until setting up the bridge between WiFi and the Ethernet interface, since I don't need that and I'm most interested in just using it as a stand-alone offline server with no Internet connection, but your exact requirements for this server being a DHCP server or client on either WiFi or Ethernet and bridging vs routing between them will of course vary.

The way I have it set up, anything that connects to the Pi's WiFi AP will be issued an IP address and can reach the CAD server, but with no Internet access or routing for the clients. The Pi's Ethernet port will still request a DHCP lease from another router, so plugging it into any existing network will still work as usual. With a little more effort, I could at least set it up so that if the Pi had Internet via its Ethernet port, it could share it with any WiFi clients. There's plenty of good tutorials for building your own Linux-based router online, which is what you'd be doing at that point.

I was a little more assertive than the guide about getting local DNS working, and picked a more distinctive subnet than 192.168.0.0/24, since 192.168.0.0 and 192.168.1.0 are two subnets which tend to already be in use in places. A sketch of the corresponding config follows the list below.

  • Raspberry Pi wlan0 IP: 10.44.4.1/24
  • Local domain: eventlan.net
  • WiFi SSID: "GoBox" with password "dispatch"
  • Update the /etc/hosts file so gobox correctly resolves to 10.44.4.1 for other clients instead of 127.0.1.1, which is the default in Raspbian.
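
Concretely, the pieces of /etc/dnsmasq.conf and /etc/hostapd/hostapd.conf that encode those choices end up looking roughly like this (an abridged sketch; the official guide covers the full files and driver details):

# /etc/dnsmasq.conf (abridged)
interface=wlan0
dhcp-range=10.44.4.10,10.44.4.250,255.255.255.0,24h
domain=eventlan.net
local=/eventlan.net/

# /etc/hostapd/hostapd.conf (abridged)
interface=wlan0
ssid=GoBox
hw_mode=g
channel=7
wpa=2
wpa_key_mgmt=WPA-PSK
wpa_passphrase=dispatch
rsn_pairwise=CCMP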

So now we really have an entirely stand-alone box: power it up, connect a few laptops, point their browsers at http://eventlan.net/cad/, and you're up and running.

NTP Server

One last thing that would be nice to have, but that I haven't done for this project yet, is having the Pi run an NTP server to synchronize the clients to the Pi, which is a requirement for BlackFlower. This would consist of configuring an NTP server, then advertising it via DHCP by including "dhcp-option = option:ntp-server, 0.0.0.0" in the dnsmasq.conf file.

Two options for how to configure the NTP server would be to copy its time from the RTC hat we've already installed and configured, or attach a GPS receiver and set up the Pi as a stratum 1 server. The NTPsec guys have a good whitepaper on setting a Raspberry Pi up as a stratum 1 server.
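
I haven't tested it, but the RTC-backed option would look roughly like this with the classic ntpd (a sketch, using the 10.44.4.0/24 event LAN from above):

# Install the classic NTP daemon
sudo apt install ntp
# In /etc/ntp.conf, allow event LAN clients to sync from us:
#   restrict 10.44.4.0 mask 255.255.255.0 nomodify notrap
# ...and let ntpd serve its own (RTC-backed) clock when offline:
#   server 127.127.1.0
#   fudge 127.127.1.0 stratum 10
# Then advertise ourselves via DHCP in /etc/dnsmasq.conf:
#   dhcp-option=option:ntp-server,0.0.0.0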

For serious use at an event, I'd say a working NTP server is pretty important, but since I'm only planning on using my server for demos thus far, I haven't bothered to go through the steps for this. If I do this at some point, I'll come back and back-fill that information here.

Conclusion

At this point, this project is still in the workbench lab environment stage, so I have no experience deploying a CAD server this minimalist for a real event, and I can't make any guarantees about what that experience will look like until I try it. I can't take responsibility for what comes out of anyone trying to use this setup for a real event, but I would be interested in hearing stories from other amateur organizations about their experience using either BlackFlower or any other CAD system to support events.