Sunday, April 1, 2018

Measuring the Internet

Progress on the whole "running my own bit of the Internet" project has been going well. We've got a router, some servers, and even a NAS, so one of the next questions is: how well is our network behaving over time?

There are plenty of different ways to ask this question, and plenty of different metrics to look at. For example, to track my bandwidth usage, I'm using LibreNMS, a pretty good SNMP front-end that queries my router every five minutes to see how much traffic I'm moving.
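To make the five-minute polling idea concrete, here is a minimal sketch of how a bandwidth figure can be derived from two successive reads of an SNMP octet counter (such as `ifHCInOctets` from the IF-MIB). This is not LibreNMS's actual code; the function name and the sample counter values are made up for illustration.

```python
def counter_rate_bps(prev_octets, curr_octets, interval_s, counter_bits=64):
    """Average bits per second between two polls of an octet counter.

    Hypothetical helper for illustration; not taken from LibreNMS.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    modulus = 2 ** counter_bits
    # Interface counters only count up and wrap to zero at 2**counter_bits,
    # so take the difference modulo the counter size to survive one wrap.
    delta_octets = (curr_octets - prev_octets) % modulus
    # Octets are bytes, so multiply by 8 to get bits.
    return delta_octets * 8 / interval_s


if __name__ == "__main__":
    # Two made-up samples taken 300 seconds (five minutes) apart.
    print(counter_rate_bps(1_000_000, 1_112_500, 300))
```

With a 300-second interval, a counter that advanced by 112,500 octets works out to an average of 3,000 bits per second. Passing `counter_bits=32` handles the older 32-bit counters, which wrap much sooner on a busy link.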

One network monitoring tool that I've discovered as part of this project is RIPE Atlas. It is a worldwide network of measurement probes spread across the Internet, which the RIPE NCC uses to measure the health of the Internet as a whole, and on which others can request their own measurements.

To get started, you can request a probe, and if you're approved, they mail you the simple hardware (clearly based on a TP-Link router running their custom firmware) to plug into your network. Once it's powered on, the probe starts taking measurements from your Internet connection, and you start earning credits to spend on your own custom measurements.

For example, I requested the probe, and although I never got any kind of email, a DHL package showed up about 6-8 weeks later with the probe and some cables inside. Now that I've plugged it in and registered its serial number to my account, I'm accruing 21600 credits per day for keeping the probe online, plus another 5000-50000 credits per day for "results delivered", which I presume comes from running other people's custom measurements.

I haven't come up with any long-term custom measurements yet, but to give you a sense of scale, a single traceroute costs 60 credits, so running a traceroute to my network from 10 random probes costs 600 credits, and RIPE's traceroute report is pretty slick.

The main reason I haven't programmed any periodic custom measurements yet is that the probe comes with a set of "built-in" measurements, where it automatically measures the latency and packet loss to its first and second hop, all the root DNS servers, and some Atlas infrastructure. That already answers most of my questions about how well my network is doing. I really should set up some user-defined measurements to monitor my HTTP servers, but for now I'm just accruing credits.
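The credit arithmetic is easy to play with. The sketch below uses only the numbers quoted above (60 credits per traceroute, 21600 credits per day for uptime) and assumes a flat per-result price, which is a simplification; the function names are made up for illustration.

```python
TRACEROUTE_COST = 60          # credits per traceroute result (figure from this post)
DAILY_UPTIME_CREDITS = 21600  # credits per day for keeping the probe online


def measurement_cost(probes, cost_per_result=TRACEROUTE_COST, runs=1):
    """Credits spent running one measurement from `probes` probes, `runs` times."""
    return probes * cost_per_result * runs


def max_runs_per_day(probes, daily_credits=DAILY_UPTIME_CREDITS,
                     cost_per_result=TRACEROUTE_COST):
    """How many times per day the measurement fits in the daily credit income."""
    return daily_credits // measurement_cost(probes, cost_per_result)


if __name__ == "__main__":
    print(measurement_cost(10))   # one traceroute from 10 probes
    print(max_runs_per_day(10))   # repeats per day paid for by uptime credits alone
```

Under those assumptions, a 10-probe traceroute costs 600 credits and could be repeated 36 times a day (roughly every 40 minutes) on the uptime credits alone, before counting anything earned from "results delivered".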

You can see all the public information coming from my probe here. You can even order Atlas measurements specifically from my network to your network if you specify the measurement should be sent to probe #34742, which I find rather amusing.
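As a rough sketch of what "specify the probe" looks like in practice, the snippet below builds the kind of JSON body that a one-off traceroute request to the RIPE Atlas measurement API would carry. The field names follow my understanding of the v2 API and should be checked against the official documentation; the helper name and the `example.com` target are placeholders, and nothing is actually submitted (that would need an API key and credits).

```python
import json


def one_off_traceroute_spec(target, probe_id, af=4):
    """Build a measurement request body for a single traceroute from one probe.

    Hypothetical helper; field names are assumed from the Atlas v2 API docs.
    """
    return {
        "definitions": [
            {
                "type": "traceroute",
                "af": af,  # address family: 4 for IPv4, 6 for IPv6
                "target": target,
                "protocol": "ICMP",
                "description": f"Traceroute to {target} from probe {probe_id}",
            }
        ],
        # Ask for exactly one probe, selected by its ID.
        "probes": [{"type": "probes", "value": str(probe_id), "requested": 1}],
        "is_oneoff": True,
    }


if __name__ == "__main__":
    # Probe ID from this post; the target is a placeholder.
    print(json.dumps(one_off_traceroute_spec("example.com", 34742), indent=2))
```

The same request can be set up through the Atlas web interface by choosing "probe IDs" as the probe selection and entering 34742.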

One thing I noticed right away is that I'm seeing 100% packet loss (solid red) to a few IPv6 root DNS servers. This is because Hurricane Electric and Cogent have been having a peering dispute over IPv6 for... pretty much forever, so the IPv6 Internet is effectively split-brain: networks reachable only through Cogent can't be reached from networks that rely solely on Hurricane Electric, which means I can't reach some parts of the IPv6 Internet from my Hurricane Electric transit.

One of the perks of running my own autonomous system is that I'm able to work on getting myself blended transit from someone other than Hurricane Electric and fix this problem myself... (Anyone with blended IPv6 transit on the west coast want to help me out?)

The probe uses about 3 kbps to take its measurements, so the network load from it is what I would describe as "pretty much undetectable", considering my main transit link hovers around four orders of magnitude higher than that. This plot is from LibreNMS for my "DHCP" /29 subnet, which I use for my Atlas probe and for plugging my laptop into a spare port on my router when I'm standing in the datacenter working on my rack.
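To put "four orders of magnitude" into numbers, here is a tiny back-of-the-envelope calculation. The 30 Mbps link figure is not a measured value; it is just 3 kbps multiplied by 10^4, and the names are made up for illustration.

```python
PROBE_BPS = 3_000              # about 3 kbps of probe traffic (figure from this post)
LINK_BPS = PROBE_BPS * 10**4   # assumed: four orders of magnitude higher, ~30 Mbps


def load_fraction(probe_bps, link_bps):
    """Fraction of the link's traffic that is attributable to the probe."""
    return probe_bps / link_bps


if __name__ == "__main__":
    print(f"{load_fraction(PROBE_BPS, LINK_BPS):.4%}")
```

That works out to about one hundredth of a percent of the transit traffic, which is why it doesn't show up on the main link's graph.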
