The reality of Internet access in Africa
When we talk about making the Internet accessible over slow and unreliable connections, we speak from experience. Our recent work in Zambia, for example, taught us a lot about the realities of the Internet in the field in rural Africa.
We were helping to deliver hands-on computer training for 150 young Zambian women as part of our ongoing work with Camfed. The training was framed as an introduction to email; for many it was a first introduction to computers.
We did not expect great Internet connectivity. We set up a local email system for the training to avoid dependency on a good connection. While practical, this approach wasn't ideal as we wanted to offer as much real experience of the Internet as possible.
The rural high school where we were working has a small IT lab that is connected to the Internet through a directional WiFi link to a nearby youth skills centre, where there is a VSAT link to the rest of the world.
The environment was harsh for network equipment. The mains power supply was poor, with regular cuts and brownouts. The school and skills centre both had back-up generators, but these were started manually and not always at the same time. The route to the Internet could not have been called reliable. Neither building normally had UPSs to condition the power to their network equipment, though fortunately we had brought some with us.
When conditions conspired to supply power to all the right places, Internet access at the school worked, but it wasn't great. A VSAT connection has unavoidably high latency, and costs prohibit high-bandwidth connections. The Internet connection at the school was a nominal 1Mbit/s with a 1:10 contention ratio, which means we were sharing our satellite bandwidth with nine other user sites. This works well in countries with abundant Internet connectivity, as network use is effectively intermittent and sharing doesn't cause much degradation of service. In Zambia, and much of Africa, a connection like this is often shared between a number of schools, hospitals, telecentres, etc., meaning network use is almost constant. Our experience of bandwidth was therefore often the worst case: around 12Kbyte/s. We soon found, though, that there were more serious problems with the connection.
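As a back-of-the-envelope check, that worst-case figure follows directly from the numbers above: the nominal link divided by the contention ratio, converted from bits to bytes.

```shell
# Worst-case throughput: 1 Mbit/s shared ten ways, then bits -> bytes.
echo $(( 1000000 / 10 / 8 ))  # 12500 bytes/s, i.e. roughly 12 KByte/s
```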
The web is surprisingly forgiving of a lossy connection that corrupts data, and at first it was easy to dismiss slow page-load times as a limitation of the connection. It soon became obvious, though, that something else was wrong. The image above shows corruption to image data downloaded from a popular mapping site.
A web page with corrupted content can simply be reloaded, but when downloading Ubuntu packages to update a machine we found the md5sums weren't matching.
The following tests helped us uncover network faults:
- Pinging hosts along the route to the Internet using larger packets (ping's -s option sets the packet size).
- Downloading a large file with a known hash and comparing the computed value. If the hash didn't match, repeating the download and checking whether the computed value had changed.
- Using mtr to see at which router packet loss was occurring.
- Using the Network Diagnostic Tool from a Java-enabled browser to measure network performance.
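As a rough sketch, the first three checks look like this on the command line. The host name and download URL here are placeholders, not the actual hosts we tested against:

```shell
# 1. Ping a host on the route with large packets; -s sets the ICMP payload size.
ping -c 10 -s 1400 gateway.example.org

# 2. Fetch a file with a published checksum, then compare the computed value
#    against the published one. If they differ, repeat the download and see
#    whether the computed checksum changes between attempts.
wget -c http://archive.example.org/some-package.deb
md5sum some-package.deb

# 3. Report per-hop packet loss along the route.
mtr --report gateway.example.org
```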
"Can we get an engineer to look into this?"
"Our engineers only visit Samfya district as part of a regular inspection."
"How often do your engineers come out?"...
"Once a month."
No such visit was likely before we left Zambia. We needed to find a workaround.
When running a shell over SSH on the troubled connection, every so often the session would close with a "Corrupted MAC on input" error. This indicates that the last SSH packet failed verification, presumably because the underlying transport layers had failed to guarantee the integrity of the data. It seemed this characteristic could be useful in building a system to help download large files.
Being able to drop connections whenever packets are corrupted isn't in itself going to help download complete files. But it does mean that a file transfer application running over SSH will never unknowingly write incorrect data to the file (up to the point that SSH MAC checks can detect the problem). This led to the idea of a brute-force solution: persistently re-running a file transfer application over SSH until the whole file has been successfully downloaded. The file transfer application must be able to resume part-way through a file if the link is dropped. A suitable application is rsync.
If the files are available on a remote host that can be accessed using public-key authentication, so that the transfer can be left running unattended, the workaround takes the following shell-script form (`false` seeds `$?` with a non-zero status so the loop body runs at least once):

```sh
false
until [ $? -eq 0 ] ; do
    rsync -axrv --partial --inplace <remote-host>:directory .
done
```

With this method we were able to retrieve the Ubuntu packages we needed to update our servers.