Monday, November 29, 2010

Chumby Status CGI Script

After I wrote about my Chumby-based web server this weekend, the post worked its way onto Hacker News, which meant that my internet connection and alarm clock ended up on the sharp end of a good old-fashioned slashdotting. All things considered, they held up admirably well.
The statistics for the blog post itself are almost as good, but the web server's statistics come from a machine that I set up and have full control over, so I'm afraid I spent a lot of time this weekend running `watch -n 10 tail /mnt/storage/logs/lighttpd.access.log` and watching the new hits literally roll in, sometimes reaching into the tens per minute.

By the end of the weekend, having to ssh into my router, then ssh into my Chumby (obviously not publicly open, due to its lack of a root password), and then run the watch tail command had gotten a little tiresome.  I decided that it was time to learn some shell tools like awk and write my first useful CGI script, so that I could monitor the Chumby's health remotely from a web browser.

A CGI script is a program available at a URL.  When a web visitor accesses this URL in their browser, the web server runs the CGI script or program, and the web page the user sees is that program's output.  The language the script is written in is limited only by what the server can run, so one could invent an entirely new programming language and simply start their scripts with #!/bin/interpreter, or compile the code down to an executable binary and stick it in their web server's cgi-bin folder.

Since I'm mainly handling large text files and formatted output already available through such programs as df, free, and ifconfig, it seemed most reasonable to write this status script in shell.

The simplest shell CGI script would simply use echo to print first a MIME content-type header, then a blank line to mark the end of the headers, and then as much content of that type as one sees fit to pipe through echo, be it text/html, text/plain, or anything else for that matter.
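
As a minimal sketch of that idea (not the actual status.sh), a shell CGI script can be as short as the following. The `print_page` function name and the greeting text are made up for illustration.

```shell
#!/bin/sh
# Minimal CGI sketch: one header, a blank line, then the body.
# print_page is a hypothetical name used only for this example.
print_page() {
    # The Content-type header tells the browser how to treat the body.
    echo "Content-type: text/plain"
    # An empty line separates the headers from the body.
    echo ""
    echo "Hello from a shell CGI script"
}

print_page
```

Dropped into the server's cgi-bin folder and made executable, a script like this should show the greeting as plain text when its URL is visited.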

I decided to use text/html as my content type, with plenty of pre tags sprinkled throughout to preserve the monospaced type of the shell output, while allowing me to use HTML tags to set the page title, include analytics JavaScript, and eventually include graphs.
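
A rough sketch of that layout, again not the real script: a small helper wraps each command's output in pre tags. The `section` function, the h2 headings, the page title, and the HTML-escaping step are all assumptions added for illustration.

```shell
#!/bin/sh
# Print one titled section: the command's output wrapped in <pre> tags
# so the browser keeps the monospaced column alignment.
# "section" is a hypothetical helper, not taken from the original script.
section() {
    title=$1
    shift
    echo "<h2>$title</h2>"
    echo "<pre>"
    # Escape &, < and > so raw command output cannot break the markup.
    "$@" 2>&1 | sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
    echo "</pre>"
}

echo "Content-type: text/html"
echo ""
echo "<html><head><title>Status</title></head><body>"
section "Disk usage" df
section "Memory" free
echo "</body></html>"
```

Each further tool, such as ifconfig, would then be one more `section` line.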

The last fancy thing you may notice about my script below is that it uses a cache file for all of the status output.  This is because, when I was writing it, the daily access log was nearing 40,000 lines long, so even relatively simple commands such as `wc -l < lighttpd.access.log` (count the number of lines in the log file) started taking several seconds to complete.  When a single page load takes several seconds, the likelihood of a second person visiting the site while it's still working on the previous request increases, which makes each request take even longer, and pretty soon my Chumby is sitting in a pool of molten lead-free solder, crying.
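
To give a feel for the kind of log crunching involved, here is a hedged sketch that counts total hits and hits per minute. It assumes the timestamp is the fourth whitespace-separated field and looks like `[29/Nov/2010:12:34:56`, which may not match every lighttpd configuration. The function names are invented for this example.

```shell
#!/bin/sh
# LOG defaults to the access log path mentioned earlier in the post.
LOG=${LOG:-/mnt/storage/logs/lighttpd.access.log}

# Total number of requests: one log line per hit.
total_hits() {
    wc -l < "$1"
}

# Hits per minute. Assumes field 4 looks like "[29/Nov/2010:12:34:56";
# substr keeps "29/Nov/2010:12:34" (drops the "[" and the seconds).
hits_per_minute() {
    awk '{ minute = substr($4, 2, 17); count[minute]++ }
         END { for (m in count) print m, count[m] }' "$1" | sort
}

if [ -r "$LOG" ]; then
    echo "Total hits: $(total_hits "$LOG")"
    # Show the five most recent minutes of a single day's log.
    hits_per_minute "$LOG" | tail -n 5
fi
```

Both functions read the whole file on every call, which is exactly the cost that makes caching attractive on a small device.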

This caching is done by using `stat` to find the cache file's modification time as a number of seconds since the epoch, then subtracting that from the current time to get the cache's age.  The larger this number, the staler the cache.  By the time I finished, traffic had died down, so I decided to set the cache timeout as low as 20 seconds, which means it rarely serves the same stats twice, but does prevent someone with a hot CTRL-R finger from DoSing my alarm clock.  It also means the script can fire off the cached status for the current request, then worry about regenerating the entire output for the next visitor, so the results page loads much faster than it otherwise would. (There is a risk of a race condition where many people hit the page after the cache has expired but before any of their requests finish updating it, but the timing would be hard to hit, and the result would only be a very transient CPU spike.)
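
The caching logic might be sketched roughly like this. It is an illustration under assumptions, not the original status.sh: `CACHE`, `MAX_AGE`, `cache_age`, `generate_status`, and `serve_status` are invented names, the cache path is a placeholder, and `stat -c %Y` is the GNU/BusyBox spelling, which may not be available in every BusyBox build. Writing to a temporary file and renaming it is an extra precaution I added, not something the post describes.

```shell
#!/bin/sh
# Hypothetical settings: cache location and timeout in seconds.
CACHE=${CACHE:-/tmp/status.cache}
MAX_AGE=${MAX_AGE:-20}

# Age of a file in seconds: current time minus its modification time,
# both expressed as seconds since the epoch.
cache_age() {
    now=$(date +%s)
    mtime=$(stat -c %Y "$1")
    echo $((now - mtime))
}

# Stand-in for the expensive part (log counts, df, free, and so on).
generate_status() {
    echo "Generated at $(date)"
}

serve_status() {
    # No cache yet: build one so there is something to serve.
    if [ ! -f "$CACHE" ]; then
        generate_status > "$CACHE"
    fi
    age=$(cache_age "$CACHE")
    # Serve the cached copy first so this visitor is not kept waiting.
    cat "$CACHE"
    # If the copy was stale, rebuild it for the next visitor.
    if [ "$age" -gt "$MAX_AGE" ]; then
        # Write to a temp file, then rename, so a reader never sees
        # a half-written cache.
        generate_status > "$CACHE.tmp.$$" && mv "$CACHE.tmp.$$" "$CACHE"
    fi
}

serve_status
```

With a 20 second timeout, a visitor mashing reload gets the same cached text back until the file is more than 20 seconds old. That request still gets the old copy and triggers one rebuild.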

The problem with this caching system is that if no one visits the status page for a long time, the script doesn't run for a long time, and the cached page served to the first visitor could be horribly (several hours, or even days) old.  To prevent this, in addition to the runs triggered by visitors, I have cron run the CGI script once an hour with its output sent to /dev/null, quietly updating the cache for the next web visitor.

status.sh:


Because it was cute, I also included on the page the Chumby octopus from /etc/motd, which gets displayed every time you ssh into the Chumby.

A lot of credit has to be given to my Dad, who has been writing shell scripts much fancier than this since before I was born, and who nudged me in the right direction for a lot of the scripting I needed to get done this weekend.

3 comments:

  1. awsome blog. came across it googlin up how to make use of my lcd screen.

    I made one of those ring oscillator dealies with a 7404 after seeing yours outta mosfets. so simple yet so cool.

    keep up the great work.

  2. @Anonymous: Glad to hear it. I really try to document everything I do so that others can learn from it and move it forward.

  3. Ken:
    Even better. And isn't it nice to have at least something that you can discuss with your dad? In my case it's aircraft, and the odd typesetting issue.
