Offline Websites and Low Bandwidth Simulator in Go
Jon Thompson writes about Jeff Allen's interesting new work on tools for coping with low bandwidth:
Jeff continues to try and solve the low bandwidth/high latency problems that aid workers face in the field every day, and that we encountered in Indonesia. We all know the joy of VSAT networks that slow to a crawl because either some folks on the team are downloading stuff they shouldn’t be downloading or all the computers are infected with bandwidth-sucking viruses. It appears Jeff has moved one step closer to sorting out some of the problems surrounding bandwidth optimization by using the Go programming language.
Rather than try to explain what Jeff has done, I’ll let you read ‘A rate-limiting HTTP proxy in Go’ and ‘How to control your HTTP transactions in Go’ and sort out what he is talking about. Hopefully this post will bait Jeff into leaving a lengthy comment that explains exactly what the hell he is up to.
My understanding is that Jeff is developing two useful tools:
- A web proxy that simulates low-bandwidth connections, similar to the Loband Simulator. Jeff's version is probably more accurate than ours because it doesn't need to modify the web page, but ours might be easier for non-developers to try out, since there's no software to install. (A rough sketch of the idea appears after this list.)
- A web proxy that can be fed prepackaged content repositories to serve, so that you can take pre-prepared content (offline websites) with you into the field and browse it through the proxy as though you were online. (Also sketched below.)
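I haven't read Jeff's code, so what follows is my own guess at the core of the first tool, not his implementation: a bare-bones forward proxy for plain HTTP that relays each response body through a reader that sleeps between chunks to cap throughput. HTTPS, request headers and most response headers are ignored for brevity, and the 4 KB/s figure is an arbitrary stand-in for a badly congested VSAT link.

```go
package main

import (
	"io"
	"log"
	"net/http"
	"time"
)

// throttledReader wraps a reader and sleeps between reads so that the
// average throughput stays at roughly bytesPerSec.
type throttledReader struct {
	r           io.Reader
	bytesPerSec int
}

func (t *throttledReader) Read(p []byte) (int, error) {
	// Never hand out more than one second's worth of data per call.
	if len(p) > t.bytesPerSec {
		p = p[:t.bytesPerSec]
	}
	start := time.Now()
	n, err := t.r.Read(p)
	// Sleep long enough that n bytes took n/bytesPerSec seconds overall.
	wait := time.Duration(n) * time.Second / time.Duration(t.bytesPerSec)
	if elapsed := time.Since(start); wait > elapsed {
		time.Sleep(wait - elapsed)
	}
	return n, err
}

func main() {
	// Fetch the requested URL upstream and relay the body through the
	// throttle. Point a browser's HTTP proxy setting at localhost:8080.
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		resp, err := http.Get(r.URL.String())
		if err != nil {
			http.Error(w, err.Error(), http.StatusBadGateway)
			return
		}
		defer resp.Body.Close()
		w.Header().Set("Content-Type", resp.Header.Get("Content-Type"))
		w.WriteHeader(resp.StatusCode)
		io.Copy(w, &throttledReader{r: resp.Body, bytesPerSec: 4 * 1024})
	})
	log.Fatal(http.ListenAndServe(":8080", nil))
}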
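The second tool could be equally small at its core. Again a hypothetical sketch, not Jeff's design: a proxy that answers every request from a local directory of mirrored pages, assumed here to be laid out as cache/<host>/<path>.

```go
package main

import (
	"log"
	"net/http"
	"path/filepath"
	"strings"
)

func main() {
	// Serve proxied requests from ./cache/<host>/<path>, a hypothetical
	// layout for prepackaged content, e.g. ./cache/example.org/docs/index.html.
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		p := r.URL.Path
		if p == "" || strings.HasSuffix(p, "/") {
			p += "index.html" // directory requests fall back to an index page
		}
		local := filepath.Join("cache", r.URL.Host, filepath.FromSlash(p))
		http.ServeFile(w, r, local)
	})
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```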
People have been trying to make offlineable websites for a long time. Some of the best examples so far use entirely client-side (in-browser) technology, such as the Logistics Operational Guide, developed by the World Food Programme for the Logistics Cluster, which can run entirely offline using Google Gears.
Gears had a lot of potential for developers creating offlineable websites, but Google has abandoned its development in favour of the open standard HTML5, which is not ready yet. So there's currently no obvious, future-proof way to develop offlineable websites. Jeff's proxy, combined with a spidering system, could be one way to download an entire site, even if its developers never designed it to be downloaded.
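To illustrate what such a spidering system involves, here is a rough sketch of my own, under heavy simplifying assumptions (no politeness delays, no robots.txt, links extracted with a regexp rather than a real HTML parser such as golang.org/x/net/html). It mirrors one site into the same cache/<host>/<path> layout the proxy sketch above reads from.

```go
package main

import (
	"io"
	"net/http"
	"net/url"
	"os"
	"path/filepath"
	"regexp"
	"strings"
)

// Crude link extraction; fine for a sketch, wrong for real HTML.
var linkRe = regexp.MustCompile(`href="([^"#]+)"`)

// crawl fetches start and every same-host page reachable from it,
// saving each body under ./cache/<host>/<path>.
func crawl(start string) {
	seen := map[string]bool{start: true}
	queue := []string{start}
	for len(queue) > 0 {
		page := queue[0]
		queue = queue[1:]
		pageURL, err := url.Parse(page)
		if err != nil {
			continue
		}
		resp, err := http.Get(page)
		if err != nil {
			continue
		}
		body, _ := io.ReadAll(resp.Body)
		resp.Body.Close()
		save(pageURL, body)
		for _, m := range linkRe.FindAllStringSubmatch(string(body), -1) {
			u, err := pageURL.Parse(m[1]) // resolve relative links
			if err != nil || u.Host != pageURL.Host {
				continue // stay on the original site
			}
			u.Fragment, u.RawQuery = "", ""
			if !seen[u.String()] {
				seen[u.String()] = true
				queue = append(queue, u.String())
			}
		}
	}
}

func save(u *url.URL, body []byte) {
	p := u.Path
	if p == "" || strings.HasSuffix(p, "/") {
		p += "index.html"
	}
	local := filepath.Join("cache", u.Host, filepath.FromSlash(p))
	os.MkdirAll(filepath.Dir(local), 0o755)
	os.WriteFile(local, body, 0o644)
}

func main() {
	crawl("http://example.org/") // hypothetical site to mirror
}
```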
Another important possibility comes from content management systems (CMSes) such as WordPress, Drupal and Joomla. More and more websites are built on such systems rather than coded from scratch. The system knows all of the pages on the site and the links between them, and could easily build an offlineable version of the site for download into Gears, HTML5 or Jeff's proxy. A single plugin could potentially make thousands of sites offlineable, especially if it were included in the CMS distribution and enabled by default.
A few wikis, such as MediaWiki, MoinMoin, DokuWiki and JSPWiki, have a programming interface (XML-RPC or WebDAV) that allows a smart client to download pages in their original text format, which could make them more efficient to store offline and potentially even editable offline. Jeff's proxy could be extended to support sites built on such wikis automatically (see the sketch after this list). There are still some limitations to this approach:
- The pages would not look the same as the online versions, since the styling wouldn't be downloaded and the effects of CMS plugins would not be visible;
- It would probably still be quite slow to download an entire site this way, by spidering, without server-side support for downloading multiple pages at once;
- Few websites are built on wikis, so the maximum potential reach is limited compared to better support for WordPress, Drupal or Joomla.
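For what it's worth, fetching a page's source over such an interface takes very little code. The sketch below targets the WikiRPC-style wiki.getPage method that DokuWiki, MoinMoin and JSPWiki expose over XML-RPC; the endpoint URL and page name are placeholders, and a real client would use a proper XML-RPC library rather than this hand-rolled XML.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"regexp"
	"strings"
)

// getPage fetches a page's wikitext via the WikiRPC wiki.getPage method.
func getPage(endpoint, page string) (string, error) {
	req := fmt.Sprintf(`<?xml version="1.0"?>
<methodCall>
  <methodName>wiki.getPage</methodName>
  <params><param><value><string>%s</string></value></param></params>
</methodCall>`, page)
	resp, err := http.Post(endpoint, "text/xml", strings.NewReader(req))
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	// Crudely extract the single string result from the XML-RPC response.
	m := regexp.MustCompile(`(?s)<string>(.*)</string>`).FindSubmatch(body)
	if m == nil {
		return "", fmt.Errorf("unexpected XML-RPC response")
	}
	return string(m[1]), nil
}

func main() {
	// Hypothetical DokuWiki endpoint and page name.
	text, err := getPage("http://wiki.example.org/lib/exe/xmlrpc.php", "start")
	if err != nil {
		panic(err)
	}
	fmt.Println(text)
}
```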
Anyway, I wish I knew Go and had time to hack on Jeff's proxy tools.