Friday, August 1, 2014

Using Squid StoreIDs to optimize Steam's CDN

As part of my new router build, I'm playing around with transparent HTTP caching proxies.

Caching proxies are a really neat idea; when one computer has already downloaded a web page or image, why download it again when another device right next to it asks for the same image? Ideally, something between all of the local devices and the bottleneck in the network (namely, my DSL connection) would intercept every HTTP request, save all of the answers, and interject its own responses when it already knows the answer.

My setup is pretty typical for caching proxies. On my router, I have a rule in iptables that any traffic from my local 10.44.0.0/20 subnet headed for the Internet on port 80 should be redirected to port 3127 on my router, where I have a squid proxy running in "transparent" mode.

The basic transparent proxy deserves a post of its own once I finish polishing it, but for right now I'm writing this mainly as notes to myself, because the lead time on the next part is going to be pretty long.

My protocol-compliant caching proxy seems to be able to answer about 2-5% of HTTP requests from the local cache, which means that the responses are coming back in the 1-3ms range instead of 40-200ms. 2-5% isn't something to sneeze at, but it isn't particularly profound either. Squid does allow you to write all kinds of rules about when to violate a response's cacheable meta-data or how to completely make up your own. A common rule is:

refresh_pattern -i \.(gif|png|jpg|jpeg|ico)$ 3600 90% 43200

which tells Squid to treat any image that lacks explicit caching headers as fresh for 90% of its age at the time it was fetched (with lower and upper bounds of 3600 and 43200 minutes). This opens a whole rabbit hole of how deeply you want to abuse and mangle cacheable headers in the name of squeezing out a few more hits. I've played that game before, and it usually ends up causing a lot of pain because incorrectly cached items tend to break websites in very subtle ways...
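To make the three numbers in that rule concrete, here is a minimal Python sketch of the freshness decision for a response that carries no explicit expiry information. It follows the order of checks as the Squid documentation describes it to the best of my understanding (stale past the maximum, fresh under the percentage, fresh under the minimum, otherwise stale); the function name and arguments are made up for illustration and this is not Squid's actual code.

```python
import re

# Rule from the text: refresh_pattern -i \.(gif|png|jpg|jpeg|ico)$ 3600 90% 43200
PATTERN = re.compile(r"\.(gif|png|jpg|jpeg|ico)$", re.IGNORECASE)  # -i = ignore case
MIN_MINUTES = 3600    # lower bound, in minutes
PERCENT = 0.90        # 90% of the object's age when it was fetched
MAX_MINUTES = 43200   # upper bound, in minutes


def is_fresh(url, minutes_since_fetch, minutes_old_when_fetched):
    """Heuristic freshness for a response with no explicit expiry.

    minutes_since_fetch: how long the object has been sitting in the cache.
    minutes_old_when_fetched: Date minus Last-Modified at fetch time.
    Returns None when the rule does not apply to this URL at all.
    """
    if not PATTERN.search(url):
        return None
    if minutes_since_fetch > MAX_MINUTES:
        return False  # past the upper bound: always stale
    if minutes_old_when_fetched > 0:
        lm_factor = minutes_since_fetch / minutes_old_when_fetched
        if lm_factor < PERCENT:
            return True  # still within 90% of its age at fetch time
    # Otherwise it is fresh only while it is younger than the lower bound.
    return minutes_since_fetch < MIN_MINUTES
```

Note that the rule is anchored with `$`, so an image URL with a query string tacked on the end would not match it at all.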




Another problem with caching proxies is the opposite of the previously mentioned over-caching. While that was an issue of a single URL consecutively mapping to different content, there is the issue of multiple URLs mapping to the same content.

This is very common; any large content delivery network will have countless different servers each locally serving the same content. The apt repositories for Ubuntu or Debian are perfect examples of this: universityA.edu/mirror/ubuntu/packagename and universityB.edu/mirror/ubuntu/packagename are the same file, even though they have different URLs.

Squid, in version 3.4, has finally added a feature called StoreID which lets you work around this many-URLs-to-one-content problem. It allows you to have Squid pass every URL through an external filter program that mangles each URL to try and generate a one-to-one mapping between URLs and content. I decided to play with this on the Steam CDN.

When you download a game in Steam, it is actually downloaded as 1MB chunks from something on the order of four different servers at once. In the menu Steam - Settings - Downloads - Download Region you can tell Steam which set of servers to download from, but it still selects exactly which servers to use beyond your control.

A typical Steam chunk URL looks like this:

http://valveSERVERID.cs.steampowered.com/depot/GAMEID/chunk/CHUNKID

  • SERVERID is a relatively small number (two or three digits) and identifies which server this chunk is coming from. At any one point, a Steam client seems to be hitting about four different servers. valve48 and valve271 are two that I'm seeing a lot in San Jose, but the servers seem to come and go throughout the day.
  • GAMEID is a number assigned to each game, although I've seen some games move from one ID to another halfway through the download. The largest game ID I've seen is in the high 50,000s. I strongly suspect that these are sequentially issued.
  • CHUNKID is a 160 bit hex number. Presumably a SHA1 checksum of the chunk? I haven't bothered poking at it.
The main takeaway is that, even when I have three computers downloading the same update, since each one of them is going to hit different servers for each chunk, I'm only seeing 25-40% cache hits for three sets of the exact same {GAMEID, CHUNKID} pairs.
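
As a rough sanity check on that hit rate, here is a toy model in Python. It assumes every client independently picks one of the available servers uniformly at random for each chunk, and that the cache is keyed on the full URL. Both are simplifications for illustration, not something Steam documents, and the function name is made up.

```python
def expected_hit_rate(clients, servers):
    """Expected fraction of chunk requests answered from the cache.

    Assumes each client picks one of `servers` mirrors uniformly at random
    for a given chunk, and that the cache only hits on an identical URL.
    """
    total = 0.0
    for k in range(clients):
        # The k-th client (counting from 0) hits only if at least one of
        # the k earlier clients happened to use the same mirror.
        total += 1.0 - ((servers - 1) / servers) ** k
    return total / clients
```

For three clients and four servers this model gives about 23%, which is in the same neighborhood as the low end of what I measured. With the server taken out of the cache key (`servers=1`), it gives two hits out of three requests, which is the best any cache can do when the first download has to come from the Internet.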

Using Squid's new StoreID feature, I'm able to map each {SERVERID, GAMEID, CHUNKID} vector to the correct {GAMEID, CHUNKID} and now see 100% cache hits for every download after the first. With the VM I'm using for testing, I'm seeing about 20MBps throughput for anything that has already been accessed by any other system, and that is limited by the VM's NIC maxing out. I expect to be seeing close to Gigabit throughput once I move this to my router with its SSD.



In hindsight, I think rewriting all the URLs to a consistent steamX.cs.steampowered.com is a poor choice. If you're going to rewrite URLs, you may as well go all in and rewrite it as an invalid hostname so there isn't the chance to break some future change on Valve's part. A rewrite to something like valveX.cs.steampowered.squid likely prevents any future possible namespace problems. I really hope the documentation for StoreID catches up and starts presenting some best practices, because short of reading the code, I'm finding their documentation a little lacking...
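
For reference, here is a minimal sketch of what a StoreID helper using the invalid-hostname form could look like in Python. It is not the exact helper from my test VM. It assumes the helper is run without concurrency (so each input line starts with the URL rather than a channel ID), and it assumes the chunk URL format described above is the only one that matters; the function names are made up.

```python
import re
import sys

# http://valve<SERVERID>.cs.steampowered.com/depot/<GAMEID>/chunk/<CHUNKID>
STEAM_CHUNK = re.compile(
    r"^http://valve\d+\.cs\.steampowered\.com/depot/(\d+)/chunk/([0-9a-fA-F]+)$"
)


def store_id(url):
    """Return a server-independent cache key for a Steam chunk URL, else None."""
    match = STEAM_CHUNK.match(url)
    if match is None:
        return None
    game_id, chunk_id = match.groups()
    # Deliberately invalid hostname so it can never collide with a real one.
    return "http://valveX.cs.steampowered.squid/depot/%s/chunk/%s" % (game_id, chunk_id)


def serve(lines, out):
    """Answer each 'URL [extras...]' line with 'OK store-id=...' or 'ERR'."""
    for line in lines:
        fields = line.split()
        key = store_id(fields[0]) if fields else None
        out.write("OK store-id=%s\n" % key if key else "ERR\n")
        out.flush()  # Squid waits on every reply, so never leave it buffered


if __name__ == "__main__":
    serve(sys.stdin, sys.stdout)
```

Squid gets pointed at a script like this with the store_id_program directive in squid.conf. Anything that doesn't look like a Steam chunk gets an ERR reply, which leaves Squid using the original URL as the cache key.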

Related rant: I really wish the Internet DNS system codified a top level domain for site-local use like IPv4 did in RFC1918 for the 10.0.0.0/8, 192.168.0.0/16 and 172.16.0.0/12 subnets. There exists a draft RFC from 2002 proposing "private.arpa.", but I'd like to see a shorter TLD like "lan." I personally use "lan.", but with how ICANN keeps talking about making TLDs a free-for-all, I dread the day that they make "lan." live.

In the end, the drag here is that Squid3.4 is so new that there aren't any packages for it in Ubuntu or Debian yet. Even Debian bleeding edge is 3.3.8. It's obviously possible to compile and run squid3.4.6 on your own, but I really hate trying to maintain software outside of the package manager unless I really have to. I don't see myself using this new StoreID feature until Ubuntu 16.04 unless Debian packages it really soon in Jessie and I'm somehow convinced to switch my router back to Debian.

Saturday, July 12, 2014

WNDR3800 External Antenna Mod

How Size Really Matters

As I mentioned before, my previous primary router was a Netgear WNDR3800. A good piece of WiFi kit that I've simply outgrown as my main router, it now gets to serve as my network's main access point.
Unfortunately, I've always been a little disappointed with its range as an access point. Throughput inside the apartment is great, but 10' beyond the front door you're toast. Part of this is because of the anemic 50mW it puts out on 5GHz, but I like to blame the fact that it uses tiny internal antennas more than anything else. I can understand the appeal of the "slick" look from internal antennas, but I've never been one to go for popular aesthetic, so I figured I'd finally fix that.

This is a popular mod for the 3800. All you need is a pair of u.fl to RP-SMA pigtails and two dual-band RP-SMA WiFi antennas, both of which you can get on eBay for a total cost around $10. I've even seen some pre-packaged "WNDR3700/3800 external antenna mod" kits for sale.
Getting into the 3800 isn't quite as easy as the classic WRT54GL, but all you need is a T9 torx driver for six screws, and the four rubber feet are actually snap-in, not adhesive, so they go back in quite nicely.
The stock antennas are crazy small.
Seriously. They are tiny. They are foam taped to the inside of the case, so they're real easy, if a little destructive, to peel off.
I then drilled two 1/4" holes in the top of the case, centered in the short way and 1.3" from the two sides in the long way. The plastic is surprisingly soft and not brittle, which was a relief since I was afraid of shattering the case while drilling.

ANTENNA PLACEMENT HERE IS UTTERLY CRITICAL!

If you don't mount the antennas perfectly symmetrically, you'll always suffer from them fundamentally lacking symmetry, which would drive me nuts.
I had been under the impression that the 3800 had four internal antennas, and while there are four connectors on the board, it only uses two of them. The two red boxes are u.fl connectors as I expected, but the blue-boxed connectors left me a little befuddled.
They sort of look like u.fl connectors without their center pin populated?  Anyone know what these are?
In the end, I've taken a sleek router and given it that solid industrial two-giant-antennas-sticking-out-the-top look. What's not to like?

Quantitative measurements are... underwhelming. At a fixed distance, I saw no measurable change in RSSI from before to after... Qualitatively, it seems to have slightly longer range, but nowhere near what I expected.  The 15cm pigtails I got were a little long, so I might replace them with much shorter ones to reduce loss there, and I never put much faith in either $3 WiFi antennas I get from eBay or RSSI readings from devices. A little disappointing, but not a failure either. Now that I've moved out of San Luis Obispo, I need to figure out how to sweep these antennas now that I've lost access to Cal Poly's microwave lab...

Would I recommend performing this mod? Having failed to disprove the null hypothesis, I can't really say. If you have the parts sitting around, go for it, but it may not be worth going out and buying new parts for. Has anyone seen improvements from this kind of mod before?

Friday, July 11, 2014

Building My Own Router - Hardware

For a long time, my apartment's main router has been a WNDR3800. It's a great run-of-the-mill dual band Gigabit SOHO router that supports OpenWRT, and I've been playing with CeroWRT, but my network has been outgrowing its capabilities for a while now. With some assistance from my buddy Sean, I've put together an Atom-based router that I'm quite happy with.

This post will mainly be a hardware run-down for the build. Documenting my network topology and why it needs a router of this caliber is a post for another day. Until then, know that you can run pfsense as a great router OS on this box, but I personally am using Ubuntu as the basis for an entirely hand-configured software stack.

Parts list:



The heart of the router is the D2500CCE motherboard. This is a fantastic router board with its dual Intel Gigabit Ethernet adapters. It comes with a PCI and mini-PCIe expansion slot, which I loaded with a WiFi card that I'm going to use as one of the nodes in my ad-hoc mesh projects. The AR5B95 can't be an access point, so I still need my WNDR3800 to act as an access point which is trunked off of this router. If and when I upgrade to higher-end or more access points, they'll be a drop-in replacement.

The dual core Atoms come with a decent stock heat sink, which lets you almost get away with passive cooling, but I opted to install a single 4cm fan for peace of mind.
This thing is just loaded with IO too. 7 USB ports in total, 2 RS-232 ports on the back, another two serial ports on headers, a parallel port, and even an LVDS port for an LCD.

The graphics support in 64 bit Ubuntu for this motherboard is garbage. Not a huge deal since I only plugged a monitor into it to run the OS installer, but you won't have a good time trying to install any kind of GUI on this. To get anything to display on an external monitor at all after you run the installer, you need to disable LVDS by appending the "video=LVDS-1:d" kernel argument. Good luck doing that if you didn't install an SSH server...
The M350 case makes for a nice stand-alone router case. I thought long and hard about building this all around a 19" rack mount enclosure, but opted for the flexibility of a normal case. It comes with both a standard 1/4" mounting hole for the power barrel connector and a punchout for a WiFi antenna which I used.

The M350 does only come with one hard drive bracket, so I had to order a second one separately for the fan. The stock mounting screws for the brackets were garbage; I've already stripped one by hand and had to drill it out, which was annoying. You'll want to replace them if you have the spare hardware available.
The front of the M350 has a removable cover (Which only comes off after removing the top! You'll break the plastic tabs otherwise...) which exposes another 4cm fan mount and a daughter board with the power switch and two USB ports. These USB ports are meant for hidden USB dongles, since there's no way to get at them without disassembling the entire case. Since I've already got WiFi on the Mini-PCIe slot, I might eventually install a Bluetooth dongle on these.

Of course, the stock power LEDs were bright blue, so the first thing I did was desolder them and replace them with low intensity red 3mm LEDs.
The PicoPSU is an interesting little board that plugs straight into 20 pin ATX power connectors (which still works on the 24 pin socket on the D2500 motherboard). It comes with a 2.5x5.5mm barrel connector for 12V input, and then converts that to the multitude of voltages you need on an ATX power connector, as well as a single Molex and single SATA power connectors. You're going to need a pretty beefy 12V power supply to pull 80W from this thing, so the 2.5mm barrel connector isn't wrong, but my system only draws ~12W and I've standardized on 2.1mm barrel connectors for my apartment 12V system, so I replaced the input connector. I've already got a 12V 12A power supply I use to run my main networking stack; you'll need to make sure you have something capable of powering your system.

The network design and software stack I run on this deserves one if not several more of their own posts, but the basics of it is that one NIC is used as a WAN uplink, and the other is my primary LAN adapter, with each other subnet encapsulated in VLAN tags that get broken out by my 802.1Q aware managed switches and access point. I've been real happy with the performance so far; when performing a 100Mbps transfer between subnets routed through this single trunk line, the Atoms loaf along at about 88% idle.

Friday, July 4, 2014

Building a Stratum 1 NTP Server

Despite all indications to the contrary, I am in fact still alive. Things have been quiet on here because I've been spending the last two months recovering from the Wildflower Triathlon and writing my thesis, so not much to blog about other than writing about how I'm writing about my thesis, which would be a little meta.

As a quick sanity break, I went back to playing with my apartment's time network, which I've already shown how to configure multicast discovery on. Now that all of my NTP clients can automatically discover my NTP servers, even across subnets due to my apartment's network supporting multicast routing, the next logical step is to improve the actual NTP servers.

Since my NTP servers are drawing their time from a vote of multiple lower stratum time servers on the Internet, the limiting factor on how closely I can synchronize to them is my apartment's Internet connection. Highly variable latency and an asymmetric connection put the lower limit on how close my NTP servers can get to actually correct at about 5ms (which is perfectly FINE for any reasonable person, BTW). Given these limitations on Internet-based time servers, the next logical step is to locally define time myself. There are multiple ways to do this:
  • Building a WWV receiver 
  • Physically synchronizing an atomic clock and transferring time to my apartment
  • Using a GPS receiver to derive local time from the GPS constellation
This third option is one of the more reasonable solutions to an entirely unreasonable problem, so I decided to go with that as a first pass.
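
As an aside on why an asymmetric Internet connection limits accuracy in the first place: the standard NTP calculation assumes the outbound and return trips take the same amount of time, so any difference between them shows up as a phantom clock offset of half that difference. The Python sketch below shows the usual four-timestamp calculation plus a toy simulation; the `simulate` helper and its default server processing time are made up for illustration.

```python
def ntp_offset_and_delay(t1, t2, t3, t4):
    """Standard NTP on-wire calculation from the four packet timestamps.

    t1: client transmit, t2: server receive,
    t3: server transmit, t4: client receive.
    Returns (estimated clock offset, round-trip delay) in seconds.
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2.0
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


def simulate(true_offset, uplink_delay, downlink_delay, server_time=0.001):
    """Run one request/response over an asymmetric path; return the estimate."""
    t1 = 0.0                                  # read from the client's clock
    t2 = t1 + uplink_delay + true_offset      # read from the server's clock
    t3 = t2 + server_time                     # read from the server's clock
    t4 = t3 - true_offset + downlink_delay    # read from the client's clock
    return ntp_offset_and_delay(t1, t2, t3, t4)
```

With made-up numbers, a perfectly synchronized client (`true_offset=0`) behind a link with 30ms of uplink delay and 20ms of downlink delay would measure an offset of 5ms that isn't really there, and no amount of averaging removes it as long as the asymmetry stays put.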

I long ago somehow ended up with a Motorola OnCore GPS engine that was missing its antenna. Luckily, MCX active GPS antennas can be had on eBay for <$10, so once that problem was solved I could play with an ancient industrial-grade GPS receiver. This thing not only supports the standard TTL NMEA sentences that tell you where you are and what time it is, but also has a one pulse per second output and supports feeding in differential GPS corrections for a fixed "time-keeping" mode.


As a quick first test, I built a quick adapter board between the GPS engine's 2x5 pin header and a standard FTDI serial cable, with a diode and 0.3F capacitor (Yes, not micro Farads, 0.3 actual Farads) to keep the SRAM from losing the current GPS almanac every time I power cycle it (which cuts the GPS lock time down from 12.5 minutes to 20-30 seconds). The 1pps output from the receiver is connected to the CTS line on the FTDI cable.

Configuring GPSd is ridiculously simple: Tell it which serial port you have your GPS receiver connected to, and it figures out baud rate, which pin you have 1pps connected to, etc. Then just point your NTP server at two special local host addresses where GPSd makes the time information available.


The NMEA sentences are off by 200ms due to the delay through the serial port, so I'm thinking I might tag it as a stratum 8 source or something so that NTP avoids relying on something that confuses its algorithm that much.
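
One contributor to a delay of that size is simply the time it takes to clock a whole sentence out of the UART. A quick Python sketch of that arithmetic follows; the baud rates and the 80-character sentence length are assumptions picked for illustration, not measurements from this receiver.

```python
def serial_transit_ms(n_chars, baud, bits_per_char=10):
    """Time to clock n_chars out of a UART, in milliseconds.

    bits_per_char=10 assumes 8N1 framing: 1 start + 8 data + 1 stop bit.
    """
    return n_chars * bits_per_char * 1000.0 / baud


if __name__ == "__main__":
    # Hypothetical 80-character sentence at a few common baud rates.
    for baud in (4800, 9600, 38400):
        print(baud, "baud:", round(serial_transit_ms(80, baud), 1), "ms")
```

At 4800 baud, an 80-character sentence takes roughly 167ms to arrive, before counting whatever the receiver and the USB adapter add on top. The 1pps edge doesn't have this problem, which is why it is the one doing the real work.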

Reading 1pps over USB isn't ideal, and is likely the source of most of my clock jitter. Initial measurements are putting my clock jitter around the 0.1ms range, but I really haven't gotten a chance to do too much analysis of it.  Ideally, I'd use an actual hardware serial port, but I need to build the RS-232 translators first...  My new router I'm building has four hardware serial ports, so I won't have any excuse.

Wednesday, March 26, 2014

Range Testing NanoBridge M5s

In preparation for the Wildflower Triathlon in May, I've been testing many of the major components of the computer network I'm helping build for communications. One of the big unknowns this year is the unusually large number of mid-range microwave links, which I've never had too much experience with. WDS links across buildings or small parks I'm comfortable with, but we are looking at streaming HD video through WiFi hops that are multiple kilometers long, and I've just never played with links like that before.
To gain some experience before the final deployment, I bought a pair of Ubiquiti NanoBridge M5 nodes. These are 200mW 5GHz WiFi access points with optional high-gain dishes, and at $80 apiece are pretty reasonable for what you get.

The NanoBridges come with either 22dBi or 25dBi dishes, pole mounting hardware, and a passive PoE injector. Luckily, Ubiquiti uses the kind of standard passive PoE injector pinout, so I often use the cheap 2.1mm barrel PoE injectors you can get on eBay for $2 instead of using the OEM injectors from Ubiquiti so I can power them off 12V batteries for portable operations. I selected the lower gain 22dBi dishes because I plan to be moving these around a lot so the smaller dishes are a nice convenience.
I think the NanoBridges claim to have a 20km range, which isn't an unreasonable number if you look at the link margin calculations, but since I've never done a deployment like this before I did want to do a real-life deployment before trying to do it across Lake San Antonio. I gave one of my buddies, Robbie, a six pack of beer and one of the two dishes and told him to point it at Cuesta Ridge north of SLO. I then drove up the horrific road to the top of the ridge and pointed the second dish back at his apartment. Once we got both of them turned on and roughly pointed in the right direction, the link came right up at full speed.

At 6.5km, this test is longer than any of the links that we're going to need for Wildflower, so I was very happy when the link came up without any fine tuning of either dish's aim. I was even able to aim one of the dishes 30-40 degrees off center and the link stayed up, although with a smaller link margin.
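
The link margin math mentioned above is easy to sketch in Python using the standard free-space path loss formula. This assumes a perfectly clear line of sight, no cable or connector loss, and a 5800MHz center frequency (an assumption; the text only pins it down as 5GHz). The function names are made up.

```python
import math


def fspl_db(distance_km, freq_mhz):
    """Free-space path loss in dB (distance in km, frequency in MHz)."""
    return 20 * math.log10(distance_km) + 20 * math.log10(freq_mhz) + 32.44


def dbm(milliwatts):
    """Convert transmit power from mW to dBm."""
    return 10 * math.log10(milliwatts)


def received_dbm(tx_mw, tx_gain_dbi, rx_gain_dbi, distance_km, freq_mhz):
    """Received signal level for an ideal line-of-sight link."""
    return dbm(tx_mw) + tx_gain_dbi + rx_gain_dbi - fspl_db(distance_km, freq_mhz)


if __name__ == "__main__":
    # 200mW radios with 22dBi dishes on both ends, as described above.
    for km in (6.5, 20.0):
        print(km, "km:", round(received_dbm(200, 22, 22, km, 5800), 1), "dBm")
```

That works out to about -57dBm at 6.5km and about -67dBm at 20km. The link margin is whatever is left over after subtracting the receiver sensitivity for the data rate you want, which comes from the radio's datasheet and isn't reproduced here.
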
AirOS, which is what most Ubiquiti nodes run, has a really neat spectrum analyzer mode that I used to help select channels for the test. The first screen shot is from Cuesta Ridge looking at the second node down the hill beaconing its SSID.
This is a capture of my MacBook Air transferring files from my apartment's AP using a 40MHz channel.
This is another spectrum scan from the radio site on top of Cuesta. Interestingly, none of these have the distinctive 802.11 spectrums, but are other modulation types from non-802.11 equipment using the 5GHz ISM band. I thought they were really interesting.

So in short, I'm really happy with my NanoBridge M5s, which couldn't be much easier for setting up a ~80Mbps effective throughput link between any two points with line of sight. 6.5km was no problem, and shorter links will only enjoy a larger power margin to allow for non-perfect aiming, rain fade, etc.