
Author Archive for chrisw

AfNOG 2011, Part 2

Monday, May 30th, 2011
People sitting at computers in a lecture

Boot Camp

AfNOG boot camp was absolutely massive this year. I think they had 75 people when they were only expecting 40. They took over half our classroom as well, which made setup tricky as we had to work around people and ask them to move repeatedly, and we couldn’t get all of our tables in to cable them up.

It was followed by the obligatory welcome dinner, at the White Sands’ outdoor restaurant, with the requisite number of speeches and rounds of applause.

Today we had the first day of Scalable Services. Desktop installation hadn’t gone too well. My attempt to respin with fixes, wiping the unused space after the imaged partition, failed badly and resulted in a corrupted image, so we had to reimage those boxes.

People sitting around dinner tables in front of a stage on the beach

Welcome

Luckily it seems that everyone brought laptops, so the PCs aren’t really needed. And the virtual machines seem to be working well so far. We haven’t yet had to compile any software on the virtual machines, and I hope it won’t be too slow when we do. We’re using 34 out of the 35 virtual machines that we created.

Tomorrow is my first session, a 1 hour practical on virtualisation, installing VirtualBox and FreeBSD, after Joel’s theory session.

AfNOG 2011, Part 1

Saturday, May 28th, 2011
Alan Barrett laying cable

I’m in Dar es Salaam, Tanzania for AfNOG 2011. I arrived on Wednesday morning at 7am (on the red-eye flight from London Heathrow) and I’m here until Tuesday 7th June.

Until now we’ve been setting up the venue. We’ve been super busy, working until midnight every night so far. We had to run our own cables, quite a lot of them (over 600 metres).

Running them through the windows was tricky, since we needed to be able to close them for security, and to allow the air conditioning to work. Someone (Alan?) came up with the genius idea of using tough palm leaves wrapped around them to protect them as they pass through the narrow gap between window panes. Bio-degradable trunking!

To cope with the power failures, Geert Jan built a monster Power-over-Ethernet injector to power the wireless access points in each room and keep the wireless network running.

The training workshops start tomorrow, Sunday 29th May, with the Unix Boot Camp, an introduction to Unix and the command line. We expect that many of the participants will have little experience with Unix, as has been the case in previous years. These free tools have immense benefits, both for us running the workshops and for the participants when they return home. But they are very different to the Windows environments that the participants are most familiar with. Without basic skills, they would struggle and hold back the group during the rest of the workshops.

Feeding the cable monster

I’m not involved in the boot camp, but after it finishes, we move straight into the main tracks, which last for five days. This year we have some new tracks: Network Monitoring & Management, Advanced Routing Techniques and Computer Emergency Response Team training.

We have also cancelled the basic Unix System Administration track (SA-E) this year, as it has finally been localised to most African countries, and therefore people have the opportunity to attend it locally at lower cost and build local communities. However, this leaves us with nowhere to cover more advanced systems administration techniques, which are some of my favourite topics, including:

Geert Jan with his 8-way Power over Ethernet injector

Geert Jan and the Monster Injector

  • virtualisation (desktops, servers and thin clients, VirtualBox, Xen, KVM, jails, lxc)
  • system imaging (ghost, snapshots)
  • backups (snapshots, Rsync, Rdiff-backup, Duplicity)
  • file servers (NFS, Samba, sshfs, AFS, ZFS)
  • virtualised storage (iSCSI, ATAoE, Fibre Channel, DRBD)
  • cloud computing (Amazon and Linode virtual servers, scripting and APIs)
  • cluster computing (Mosix, virtual machine host clusters)
  • DHCP (for network management and booting)
  • network security (firewalls, port locking, 802.1x)
  • wireless networks (planning, monitoring, troubleshooting, WEP and WPA, 802.1x authentication)
  • Windows domains and security (including Samba 4)

If participants show enough interest in these topics, they could be added in future. I think it’s unfortunate that the course is arranged into week-long tracks rather than half-day or one-day sessions from which people could pick and choose, Bar Camp style. That would make it much easier for people to run sessions on many new topics.

Stacked up computers

Some of our 80 desktop computers

In the past this would have been difficult, because we provided desktop computers for participants. It used to take us 3-4 days to set up 80-odd desktop PCs with customised FreeBSD installations. We’ve noticed that more and more people are coming to the workshops with their laptops, and this time we’ve made a big effort to shift from dedicated to virtual platforms, to reduce setup time and costs in future.

The hardest track to do this for, in my opinion, was Scalable Services English (SS-E), the one I’m working on. We were all pretty much agreed to stay with desktop PCs this year, making us the only track to do so. But when we arrived, we discovered that the mains power situation here is pretty awful. On Wednesday we had four power failures. We only have five UPS, not nearly enough to protect every desktop.

Participants with laptops effectively have their own built-in UPS. If we give them virtual machines to work with, then we only have to protect the hosts. We can keep those in the NOC (Network Operations Centre), where the UPS are, so they’ll be protected for around 15 minutes of any power outage, which we have to hope will be enough for the hotel to start their generator.

Cannibalising RAM

Some participants will probably forget their laptops, so we’ll provide them with desktops, but we have no way to UPS them. These desktops will be set up with FreeBSD, as in previous years.

We rented 80 machines from a local company. Some had Windows, in varying states of repair, and some had no operating system installed. We decided to use some of these desktops as hosts for the participants’ virtual machines. They only had 2 GB of RAM each, but since we had plenty of machines, we cannibalised eight others for their RAM to upgrade our hosts to 4 GB each.

We decided to use VirtualBox for the virtual machines. It’s free, open source, can host on all major platforms (Windows, Mac, Linux and even FreeBSD), has a nice GUI and a command-line automation tool, supports bridged networking easily, and is relatively fast and efficient.

Backs of systems being imaged

Imaging backend

We configured the master (that we’ll copy onto the other machines) starting with the setup from last year. We then had to install VirtualBox and build our first virtual machine inside it. Then we discovered that the virtual machine was unable to access the network in bridged mode. Packets sent by the virtual machine were simply never sent by the host. We needed to use bridged mode so that participants could run services on their machines simply by installing them, without requiring extra configuration on the host.

We had no Internet access for most of that day, because all three of our redundant providers were down for different reasons. Eventually we managed to use Geert Jan’s 3G dongle to get online and research the problem. We found that bridged networking doesn’t work in the binary package of VirtualBox 3.2.12 that comes with FreeBSD 8.2, so we had to wait until Internet access was fixed to download 120 MB of software (ports updates and VirtualBox 4.0.8) like this:

pkg_add -r portupgrade
portsnap fetch extract update
portupgrade virtualbox-ose virtualbox-ose-kmod

Michuki Mwangi configuring a PC for imaging

Imaging frontend

This took a long time, as VirtualBox is a large piece of software which also required us to download and build a new version of Qt, but eventually it succeeded and the problem was solved.

We decided to put only five virtual machines on each host. Sometimes we would have the whole class compiling software from ports, which would slow down all of them significantly. We will use six or seven servers to host 30-35 virtual machines. On the master host, we created five copies of our master virtual machine by copying its hard disk like this:

cd .VirtualBox/HardDisks
for i in 1 2 3 4 5; do
	cp AfNOG\ SSE\ Master.vdi vm0$i.vdi
	VBoxManage internalcommands sethduuid vm0$i.vdi
done

Moving the systems to the NOC

Relocation

Then we created the virtual machines in the VirtualBox GUI and attached them to these new images. We needed to generate a new UUID for each disk image copy, using the undocumented sethduuid command above, otherwise VirtualBox would refuse to add the copies because it had a disk image already registered with the same UUID.
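An alternative that we did not use at the workshop is VBoxManage clonehd, which copies a disk image and writes a fresh UUID into the copy in one step, so no separate sethduuid call is needed. The following is only a sketch under that assumption: the function name clone_master_disks is made up for illustration, and the file names simply follow the ones used above.

```shell
#!/bin/sh
# Sketch: clone the master disk five times with "VBoxManage clonehd".
# Unlike a plain cp, clonehd gives every copy its own UUID, so VirtualBox
# will not complain about a duplicate when the copies are attached.
# clone_master_disks is a hypothetical helper name, not part of VirtualBox.
clone_master_disks() {
	master=${1:-"AfNOG SSE Master.vdi"}
	for i in 1 2 3 4 5; do
		# Stop at the first failure rather than leaving a partial set.
		VBoxManage clonehd "$master" "vm0$i.vdi" || return 1
	done
}
```

Run it from the directory that holds the master image, for example: cd .VirtualBox/HardDisks && clone_master_disks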

We could have created the virtual machines using the VBoxManage command as well, but it would have taken longer to work out how to use it than simply to create the five machines by hand. I later worked out the commands that we could have used:

cd ~/"VirtualBox VMs"
for i in {1..5}; do
	echo $i
	vmname=VM0$i
	diskimage="$vmname/FreeBSD.vdi"
	VBoxManage createvm --name "$vmname" --ostype FreeBSD --register
	VBoxManage modifyvm "$vmname" --memory 256 \
		--nic1 bridged --bridgeadapter1 bge0.219 \
		--nic2 bridged --bridgeadapter2 bge0.$((50 + i)) \
		--vram 4 --pae off --audio none --usb on \
		--uart1 0x3f8 4 --uartmode1 server /home/chris/"$vmname"-console.pipe
	VBoxManage storagectl "$vmname" --name "IDE Controller" --add ide
	cp VM01/FreeBSD.vdi "$diskimage"
	VBoxManage internalcommands sethduuid "$diskimage"
	VBoxManage storageattach "$vmname" --storagectl "IDE Controller" \
		--port 0 --device 0 --type hdd --medium "$diskimage"
done

We named the images VM01 to VM05, which was important for running automated scripts on them. Then we configured VirtualBox to start them automatically at boot time, in headless mode, by adding the following lines to /etc/rc.conf:

vboxheadless_enable="YES"
vboxheadless_machines="VM01 VM02 VM03 VM04 VM05"
vboxheadless_user="inst"

We wrote a short script to help us apply the same command to all five virtual machines:

#!/bin/sh
# script to control all five virtual machines

command=$1
shift

for i in 1 2 3 4 5; do
	VBoxManage $command VM0$i "$@"
done

This allows us to log into a machine and do things like:

  • ./manage controlvm acpipowerbutton to initiate a controlled shutdown of all five virtual machines
  • ./manage modifyvm --macaddress1 auto to generate new, random MAC addresses after cloning the host
  • ./manage startvm --type headless to get the virtual machines running again (headlessly, independent of the GUI)
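One thing to watch for when chaining these: the ACPI power button request returns immediately, while the guests take some time to shut down, and modifyvm will refuse to change a machine that is still running. A minimal sketch of a wait step follows, assuming the guests are named VM01 to VM05 as above. The function names count_running and wait_for_poweroff, and the 120-second default, are made up for illustration.

```shell
#!/bin/sh
# Sketch: wait until none of VM01..VM05 is still running.
# "VBoxManage list runningvms" prints one line per running guest, e.g.
#   "VM01" {uuid}

# Print how many of our five guests are still running.
count_running() {
	VBoxManage list runningvms | grep -c '^"VM0[1-5]"'
}

# Poll once a second; give up after $1 seconds (default 120).
# Returns 0 once all guests are off, 1 on timeout.
wait_for_poweroff() {
	remaining=${1:-120}
	while [ "$(count_running)" -gt 0 ]; do
		if [ "$remaining" -le 0 ]; then
			echo "timed out waiting for guests to shut down" >&2
			return 1
		fi
		sleep 1
		remaining=$((remaining - 1))
	done
	return 0
}
```

With this in place, a shutdown followed by a MAC address reset could be written as: ./manage controlvm acpipowerbutton && wait_for_poweroff && ./manage modifyvm --macaddress1 auto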
Room with desks around the edge, covered in computers and equipment

The NOC

We knew that we wouldn’t have space to attach monitors and keyboards to all the hosts, and we’d have to fiddle about with cables in the hot NOC room (without working aircon) if we needed access to their consoles, so we added the ability to log into them remotely using VNC and GDM. To do this, we had to install the VNC server:

pkg_add -r vnc

Which unfortunately doesn’t come with the nice xorg loadable module that adds a built-in VNC server to the X server, making a fast and stateless remote control session possible. So we had to hack about with inetd, first by adding a service name with a port number to /etc/services:

vnc		5900/tcp

And then a service line in /etc/inetd.conf:

vnc	stream	tcp	nowait		root	/usr/local/bin/Xvnc Xvnc -inetd :1 -query localhost -geometry 1024x768 -depth 24 -once -fp /usr/local/lib/X11/fonts/misc/ -securitytypes=none

This requires us to enable the XDMCP protocol in GDM, in order for VNC to communicate with it to present a GDM login screen. So we replaced the contents of /usr/local/etc/gdm/custom.conf with the following:

[xdmcp]
Enable=true

[security]
DisallowTCP=false

And then restarted GDM:

sudo /usr/local/etc/rc.d/gdm restart

And checked that we could connect from another machine and got a login prompt:

vncviewer 196.200.217.128

Which did indeed give us a working login screen:

VNC graphical login on a FreeBSD virtual machine host

This method is very slow. I wanted to find a better way to access the guests, especially if their network configuration was broken. I tried to connect a host serial port to a pipe and then access that pipe from a shell command, eventually over ssh, in a similar way to the text-only console offered by Xen (xm console). The above VBoxManage commands set up a pipe in my home directory, and I wrote the following short script to access it:

#!/bin/sh
set -x
echo "Console for $USER"
nc -U /home/chris/$USER-console.pipe

I created user accounts for each virtual machine, with the same name, and set their shells to this script, so that when they log in, they would automatically be connected to the pipe. However I was unable to make it work well. In particular, because of incompatible terminal emulations, I was unable to run vi to edit configuration files in the guest. If you find a way around this, please let me know. I haven’t tried it yet, but conman looks like it might be a good bet.

I spent a long time searching for the hidden VNC support in VirtualBox 4. It’s undocumented (the manual only talks about RDP) and people on the IRC channel say that it doesn’t exist, but it does, at least when starting the guests in headless mode. I added the following lines to /etc/rc.conf:

vboxheadless_VM01_flags="-n -m 5901"
vboxheadless_VM02_flags="-n -m 5902"
vboxheadless_VM03_flags="-n -m 5903"
vboxheadless_VM04_flags="-n -m 5904"
vboxheadless_VM05_flags="-n -m 5905"

And then, after starting the guests in headless mode, I could connect to these ports and access the virtual displays, much more conveniently and much faster than by shutting down the guests using VBoxManage and starting them again using the VirtualBox GUI.
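The five flag lines follow a simple pattern (guest N gets VNC port 5900+N), so they can be generated rather than typed. This is just a sketch: vnc_flags is a hypothetical helper name, and it assumes the guests keep the VM01, VM02, ... naming used above.

```shell
#!/bin/sh
# Sketch: print the vboxheadless flag lines for VM01..VM0N,
# giving guest N the VNC port 5900+N.
# vnc_flags is a made-up name; the output format mirrors the lines above.
vnc_flags() {
	count=${1:-5}
	i=1
	while [ "$i" -le "$count" ]; do
		printf 'vboxheadless_VM%02d_flags="-n -m %d"\n' "$i" $((5900 + i))
		i=$((i + 1))
	done
}
```

Running vnc_flags 5 prints the same five lines as above, which can be reviewed and then appended to /etc/rc.conf.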

We used multicast to image the six virtual machine hosts from the master. This took about three hours, so we left it running overnight.

In the morning we checked that the hosts had been imaged successfully by booting them with their newly installed images, and gave them unique hostnames (host1.sse.ws.afnog.org etc.) and IP addresses.
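A numbering plan like this is easy to write down as a table before touching the machines. The sketch below prints one line per host in /etc/hosts format. Note the assumption: it maps hostN to 196.200.217.(127+N), which matches the address range used in this post but is not necessarily the exact assignment we made, and host_table is a made-up name.

```shell
#!/bin/sh
# Sketch: print "address<TAB>hostname" for host1..hostN.
# ASSUMPTION: hostN lives at 196.200.217.(127+N); change "base" to suit.
host_table() {
	count=${1:-7}
	base=127
	n=1
	while [ "$n" -le "$count" ]; do
		printf '196.200.217.%d\thost%d.sse.ws.afnog.org\n' $((base + n)) "$n"
		n=$((n + 1))
	done
}
```

For example, host_table 7 lists seven hosts (the master plus six copies), ready to paste into /etc/hosts or to turn into DNS records.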

We used the manage script to reset the MAC addresses of the network cards of each virtual machine on each host:

for i in 128 129 130 131 132 133 134; do ssh 196.200.217.$i ./manage controlvm acpipowerbutton; done
for i in 128 129 130 131 132 133 134; do ssh 196.200.217.$i ./manage modifyvm --macaddress1 auto; done
for i in 128 129 130 131 132 133 134; do ssh 196.200.217.$i ./manage startvm --type headless; done
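Since the same loop is repeated for every step, it could be wrapped in a small helper. This is a sketch rather than what we ran: on_all_hosts and the HOSTS variable are made-up names, and the address list is the one from the loops above.

```shell
#!/bin/sh
# Sketch: run one command on every virtual machine host in turn.
# HOSTS holds the last octet of each host address.
HOSTS="128 129 130 131 132 133 134"

on_all_hosts() {
	status=0
	for i in $HOSTS; do
		# -n keeps ssh from reading stdin, so the loop is not disturbed.
		if ! ssh -n "196.200.217.$i" "$@"; then
			echo "196.200.217.$i: command failed" >&2
			status=1
		fi
	done
	return $status
}
```

The last step then becomes: on_all_hosts ./manage startvm --type headless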
Michuki Mwangi setting up a projector

Astral projection

Since they were all configured for DHCP, we could have got their IP addresses from the DHCP server, but we wanted to give them a nice naming scheme, so we logged in to each one (using the console and the VirtualBox GUI) and assigned it a unique hostname and a static IP address.

We checked that we could log into each virtual machine remotely using the SSH keys that we’d installed, and then we shut down the hosts and moved them to the NOC.

Boot camp starts tomorrow, next door, but we still have to arrange our room.

Michuki Mwangi surrounded by rows of desks covered with computers

Classroom

We may have up to 37 people, our biggest class ever, in a room that’s about eight metres on a side, so layout of the room is a real challenge.

I wanted to do something to facilitate working in groups, such as each table having four places (two each side) and with its long axis front-to-back. This was vetoed because participants would have to turn their heads to see the projected screen, and it might be hard for them to take notes as a result.

So we’re going to have long, cramped benches instead. I think this is unfortunate, and I hope I can persuade people to try something more imaginative in future.

Apptivate for Development

Friday, May 13th, 2011

As I’m at the Open Data for Development Conference, I thought I should write a quick note about some of our recent openness achievements:

  • SARPAM is a current Re-Action project to share drug price data across Southern Africa, helping countries negotiate a better deal from drug suppliers.
  • IHP+Results is another Re-Action project, just completed, to promote accountability among IHP signatories by sharing information on their health service improvements and progress towards meeting their Millennium Development Goals.

You can find the source code for both these applications on Aptivate’s GitHub. Our intention is to agree with our partners to publish full source code wherever possible.

Offline Websites and Low Bandwidth Simulator in Go

Wednesday, February 16th, 2011

Jon Thompson writes about Jeff Allen’s interesting new work on tools for working with low bandwidth:

Jeff continues to try and solve the low bandwidth/high latency problems that aid workers face in the field every day and that we encountered in Indonesia. We all know the joy of VSAT networks that slow to a crawl because either some folks on the team are downloading stuff they shouldn’t be downloading or all the computers are infected with bandwidth sucking viruses. It appears Jeff has moved one step closer to sorting out some of the problems surrounding bandwidth optimization by utilizing the Go programming language.

Rather than try and explain to you what Jeff has done I’ll let you read ‘A rate-limiting HTTP proxy in Go‘ and ‘How to control your HTTP transactions in Go‘ and sort out what he is talking about. Hopefully, this post will bait Jeff into leaving a lengthy comment that explains exactly what the hell he is up to.

My understanding is that Jeff is developing two useful tools:

People have been trying to make offlineable websites for a long time. Some of the best examples so far are using entirely client-side (in-browser) technology, such as the Logistics Operational Guide, developed by the World Food Programme for the Logistics Cluster, which can run entirely offline using Google Gears.

Gears had a lot of potential for developers to create offlineable websites, but Google has abandoned its future development in favour of the open standard HTML5, which is not ready yet. So there’s no obvious and future-proof way to develop offlineable websites at the moment. Jeff’s proxy, combined with a spidering system, could be one way to download an entire site, even if it wasn’t designed to be downloaded by the developers.
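To give an idea of what the spidering half might look like, here is a sketch using GNU wget rather than Jeff’s Go tools. It is an illustration only: mirror_site is a made-up function name, the 20k rate limit and one-second wait are arbitrary values chosen to be gentle on a slow link, and the URL in the usage line is a placeholder.

```shell
#!/bin/sh
# Sketch: spider a site into a local directory for offline reading.
# Uses GNU wget; mirror_site is a hypothetical helper name.
mirror_site() {
	url=$1
	dest=${2:-offline-copy}
	if [ -z "$url" ]; then
		echo "usage: mirror_site URL [DIRECTORY]" >&2
		return 2
	fi
	# --mirror           recurse through the site with timestamping
	# --page-requisites  also fetch the images and stylesheets pages need
	# --convert-links    rewrite links so the copy works from disk
	# --no-parent        never climb above the starting directory
	# --wait/--limit-rate  go slowly, to share a thin connection
	wget --mirror --page-requisites --convert-links \
		--no-parent --wait=1 --limit-rate=20k \
		--directory-prefix="$dest" "$url"
}
```

Usage might be: mirror_site http://example.org/guide/ guide-offline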

Another important potential comes from content management systems (CMS) such as WordPress, Drupal and Joomla. More and more websites are developed using such systems, rather than coded from scratch. The systems know all of the pages on the site, and the links between them, and could easily build an offlineable version of the site for download into Gears, HTML5 or Jeff’s proxy. And one plugin could potentially enable thousands of sites to be offlineable, especially if it was included in the CMS distribution and enabled by default.

A few wikis such as MediaWiki, MoinMoin, DokuWiki and JSPWiki have a programming interface (XML-RPC or WebDAV) that allows a smart client to download pages in their original text format, which could make them more efficient to store offline and also potentially editable offline. Jeff’s proxy could be extended to support sites built in such wikis automatically. There are still some limitations to this approach:

  • The pages would not look the same as the online versions, since the styling wouldn’t be downloaded and the effects of CMS plugins would not be visible;
  • It would probably still be quite slow to download an entire site this way, by spidering, without server-side support for downloading multiple pages at once;
  • Few websites are built out of Wikis, so the potential maximum reach is limited compared to better support for WordPress, Drupal or Joomla.

Anyway, I wish I knew Go, and had time to hack on Jeff’s proxy tools.

Network monitoring planning

Thursday, February 10th, 2011

People have been trying to monitor their networks with pmGraph, without understanding how they might need to change the network topology in order for that to work.

I think it is essential to understand this before attempting to install pmGraph, to avoid wasted time and frustration. But it seems that the relevant documentation was relegated to an easily-missed link off the installation instructions. It was also difficult to read and understand.

I’ve just completely rewritten the installation planning instructions. They are relevant to any kind of network traffic monitoring, not just using pmGraph or pmacct. Comments are most welcome.

Hibernate, EJB and the @Unique Constraint

Friday, November 26th, 2010

What are Hibernate and EJB

As a bit of background, Hibernate is a Java library that allows Java objects to be loaded from and saved to a database. (It is other things as well, but for simplicity I can ignore those for now.) It handles loading, creating, updating and searching for objects by generating SQL queries for us.

Hibernate is an implementation of IBM and Sun’s Enterprise Java Beans (EJB) specification. You can argue about which came first, Hibernate or EJB, but Hibernate is a key member of the EJB board and most new EJB-related standards follow Hibernate’s de facto lead, and key Hibernate developers like Emmanuel Bernard are leaders of the EJB specification teams.

Insanity Rules

Let’s start with the theory. I’m going to argue that EJB is insane. I mean it. I’ve been telling people that for nearly a year, and nobody has been able to prove me wrong.

It’s insane because it’s trying to solve the wrong problem, an impossible problem. It’s trying to keep your in-memory Java objects perfectly in sync with the database contents at all times. If you don’t believe me, check out the manual (under Do not treat exceptions as recoverable).

Of course that’s impossible because other people, and other instances of the application, can be modifying the database under your feet, and you have no way to know until you try to save the object, which is when it fails. But the only way to find out is to commit your transaction, and you might not want to do that because you might not be ready to actually save the object yet, or you might want to recover gracefully if it fails (see below).

Instead, EJB forces you to pretend that everything is just rosy, and let it throw an exception when the inevitable happens, and someone modified the record under your feet, or some other constraint is violated (such as uniqueness). The worst thing about this exception is that you can’t recover from it. That’s because the faulty object is still managed by EJB.

If you try to recover from the exception (for example to display a nice message to the user instead of dumping core all over the shop floor), and you touch the database session in any way, you risk that EJB will try to save the object again… which fails again… which throws another exception.

If you discard the session, you’d better not dare touch any object that you loaded from the database before, because it might be lazily loaded (not yet loaded), and throw another exception when you try to actually do anything with it.

There is no way out of this trap, at least officially:

Do not treat exceptions as recoverable: This is more of a necessary practice than a “best” practice. When an exception occurs, roll back the Transaction and close the Session. If you do not do this, Hibernate cannot guarantee that in-memory state accurately represents the persistent state.

We’ve implemented a workaround in RITA, which tries to identify the offending object when an exception occurs and evicts it from the cache, but it’s pretty scary and will never be officially supported by Hibernate.

Performance

The other problem with this approach is that it forces the EJB implementation to constantly check all of your objects to see if any of them have changed, and if so try to persist them to the database. Your application knows which objects have changed and is best placed to handle any errors in trying to save them to the database, but apparently the designers of EJB know better. Or something.

Validation

One of the nicer things about EJB is that it lets you annotate your data-storage classes with extra information that controls how they are saved to the database by EJB. For example, this allows you to specify the table and column names for your classes and properties, as well as information about indexes.

A sub-standard of EJB is Bean Validation (JSR 303), which allows you to write code that checks whether your object is valid before trying to save it to the database. In some cases, this can save you from falling into the trap above, because it allows you to validate your object when you want, before saving it, rather than following the whims of your EJB implementation.

So, what can you do to validate your object? Well, you can check that some fields are not null, or that their value follows a certain pattern. You can write your own custom validator that’s adapted to your specific objects, by checking for invalid combinations of field values. And… that’s about it.

Notably missing from this list is the ability to check anything in the database. The philosophical reason for this is that EJB is completely database-agnostic, and in fact it’s trying to pretend that there is no database and apart from one magical call to a save() method, your objects live forever in some kind of implementation-independent limbo. Of course there’s no universal way to access that limbo, so if you want to do it then you can kiss your platform-independence goodbye.

Validating Uniqueness

There’s not even a generic way to implement something like the @Unique validation, which is so obvious that people keep asking for it. It would simply ensure that a property is unique for that kind of object, so that for example you don’t have two User objects with the same name or emailAddress. But it doesn’t exist in the Bean Validation specification. The official reason is that:

@Unique cannot be tested at the Java level reliably but could generate a database unique constraint generation. @Unique is not part of the BV spec today.

In other words, “life isn’t perfect so we’re not going to bother trying to make it better.” Perhaps they’re trying to save us from our own foolishness (moral hazard), that we might actually believe that it’s enough to check for this and we’ll never fail when writing to the database, the server will never crash or explode in flames, etc…

Incidentally, committing the transaction to check for uniqueness might force your code to go through unreadable contortions to avoid saving an invalid object or inconsistent state in the database (a preference for academic perfection over clean, maintainable code seems to be common among designers of Java standards). And committing a transaction early is also dangerous because the database will no longer detect and warn you about conflicting changes in a concurrent transaction, so you could end up silently overwriting someone else’s changes.

Hibernate is more pragmatic, but @Unique doesn’t even exist there, at least not officially, although there is a sample implementation on the community wiki. I’m not clear exactly why it’s not official, although that page says that “accessing the Session/EntityManager during a valiation is opening yourself up for potential phantom reads”, whatever that means. It is true that:

  • executing many individual reads would be difficult to manage efficiently, if you had many of these annotations;
  • reading while a flush() is in progress may trigger another flush(), leading to an infinite loop;
  • it requires you to jump through hoops in an extremely ugly way just to get a usable Hibernate session object.

Anyway, even if Sun and Hibernate don’t want to write this validator because it’s not technically perfect, many people are going ahead and writing it themselves, even complaining that it’s “harder than you think”.

Our Implementation

So I wanted to talk about what we’ve done in RITA to work around this, apart from me swearing never again to use EJB or Hibernate. I don’t like the above approach because we wrap an object of our own around the Hibernate session, to keep some of this craziness locked away in a well-guarded cellar of the application. Their approach gives us no way to access our wrapper object. And it’s not a standard or anything so it doesn’t matter much if we ignore it.

We already have a Hibernate Interceptor, which already does the following:

  • logs object changes in the audit log; (this appears to be the most common use of interceptors in Hibernate)
  • uploads modified records from a local instance to the master, if working online;
  • goes offline if the upload fails;
  • updates version numbers of owned objects on a local instance;
  • updates version numbers of all objects on a master.

We added a checkUniqueConstraints function, which is about 40 lines long (much shorter than the example on the community wiki), that looks for @UniqueInDatabase annotations on properties and runs a quick and dirty Criteria query to verify that no conflicting values are present in the database at that time.

Further Work

It might be a good idea in future to separate these different functionalities into separate layers using something like Listeners. Interceptors are more convenient because they have access to the state of the object when it was loaded, and the current state, which is handy for audit logging.

I think it would be handy if Java (or Hibernate) would provide an easy way to iterate over an object’s properties (whether annotated on their fields or getters or setters) and retrieve a specific annotation, class of annotations, or all annotations. I think this code already exists in Hibernate’s AnnotationConfiguration, and it’s a shame to have to write it again. Our method would be half as long if it could reuse this.

Capturing Prepared Statement Parameters

Wednesday, November 3rd, 2010

I’m using Hibernate for a project, and sometimes I have problems with saving records because the values in the Java object don’t fit within the database columns (e.g. large floating point numbers in a DECIMAL column, or long strings).

Hibernate often executes the INSERT operations in batches, which means that the actual failing values are not visible, because the PreparedStatement API gives you no way to get them out, and Hibernate doesn’t let you intercept them being set. The insert can also happen long after you created the object. These facts make it very hard to find and fix the invalid data.

I decided to write a wrapper for PreparedStatement to capture the values being set by Hibernate, and a new Batcher to wrap the PreparedStatements returned by the driver in wrapper objects of my class. I was about to start laboriously writing yet another delegator class that does the boring work of implementing 100 methods and delegating each one individually to the wrapped class. I love Java so much.

Luckily I stopped and figured that someone might have done this before, and indeed I found an implementation by Holy. I adapted it slightly and integrated it into RITA.

To replace the default batcher with my own, to enable the wrapping of statements, I just had to add the following line to my Hibernate configuration properties:

hibernate.jdbc.factory_class = org.wfp.rita.db.CapturingBatcher$Factory

Thanks, Jakob Holy!

ICTs for Rural Development Seminar

Wednesday, October 27th, 2010

Just attended a very interesting seminar on The Rural Information Economy and ICTs, hosted by the UN Food and Agriculture Organisation (FAO), a major actor in this area, at their headquarters in Rome.

This is an area in which Aptivate is also very interested, and one in which I’ve done some research and been following developments. I still managed to learn quite a bit from three very interesting presentations:

Information Economy Report 2010 (UNCTAD)

The informational dimension of poverty, i.e. where information can help to alleviate or reduce poverty:

  • Market price information
  • Income-earning opportunities (e.g. jobs)
  • Weather information and warnings
  • Correct use of pesticides and fertilisers
  • Health information and education
  • Disaster risk reduction

Communication up and down the supply chain, and with peers and advisors, also helps.

There is an increasing trend towards the direct involvement of the beneficiaries in the production of ICTs:

  • As ICT workers
  • Manufacturing of ICTs (as an alternative occupation to subsistence farming)
  • Providing IT and ICT-enabled services (answering questions, finding information, running telecentres)

Mobile phone penetration has grown faster than that of any other ICT in developing countries. On average in the least developed countries (LDCs), it increased from 2% to 26% of the population between 2000 and 2009, a thirteen-fold increase. It is possibly the fastest-spreading technology in the history of the world.

Growth is uneven. There are still some LDCs where less than 10% of the population have a mobile phone. In Ethiopia, for example, only 5% have a phone. This was largely attributed to a lack of liberalisation of telecoms markets.

Half of the rural population in LDCs has no access to a mobile phone signal, which will limit the further growth of mobile usage. Many Universal Service Funds are sitting unused. In some cases this is because they are mandated only to be used on the fixed line network, which is nearly obsolete.

Mobile micro-insurance has become a big topic. For example:

  • Kilimo Salama in Kenya
  • Burkina Faso, Mali (index-based crop insurance)
  • Alliance Afrique

Kilimo Salama recently made their first payouts to farmers because weather conditions exceeded their thresholds. The payouts are automatic and don’t have to be claimed by the farmers. The largest was about $30.

Even those who don’t have access to ICTs themselves can benefit from more transparent markets when enough participants use ICTs.

Download the full report (PDF, 171 Pages, 1240Kb).

Enabling role of ICTs to transform smallholder farmers to entrepreneurs (IFAD)

IFAD offers grants and loans to governments for agricultural development programmes. They are starting to offer grants (but not loans) to the private sector as well.

Grameen and BRAC had limited success with mobile banking (so far), because most of their customers are groups, not individuals, and mobile phones tend to be personal devices.

IFAD and WFP are running a joint project called the Weather Risk Management Facility (WRMF), a micro-insurance project. Half of the insurance premiums are paid by the farmers, and half by the sellers of inputs (seeds, fertiliser, pesticides), as they benefit from farmers being willing to buy more of their products due to the reduced risk of crop failure.

ICTs enhancing plant production at the field level (FAO)

e-Locust2 uses vehicles with GPS, laptops and HF radio modems to send real-time information on locust swarms to governments, which can help to warn and prepare neighbouring villages and allow the targeted use of pesticides to control the pests. Time is critical to achieve this.

Digital Pens are being used to capture information entered on forms. The pen recognises what is being written, and where on the form, and captures the data for later upload. This makes it possible to have electronic filing with minimal training, minimal unreliable ICTs, an inherent fallback to paper-based methods, and hard copies of the forms that can be given to farmers or stored in local offices.

There are problems getting pest monitoring officials to enter high quality data when there is no incentive (reward) for accurate data, e.g. in one-way monitoring systems. If governments used this data to target their interventions, villagers would have a much more obvious incentive to ensure that the data was entered accurately and on time.

Thanks

Thanks to FAO for hosting this excellent seminar, and to the World Food Programme for allowing me time off to attend it.

Several of us expressed an interest in continuing the discussion online. We have been heard: Michael Riggs, lead facilitator of the e-Agriculture Community, is working on enabling this to happen. There will also be a follow-on discussion at the ICTD 2010 Conference in London.

Consistency, Portability and Backwards Compatibility

Wednesday, October 6th, 2010

Michał Purzyński reported a problem with our pmGraph software, when using a PostgreSQL database:

I’d like to report a bug – pmgraph is unable to get data from postgresql database to which nfacctd is writing…

javax.servlet.ServletException: org.postgresql.util.PSQLException:
ERROR: column "src_port" does not exist

It turns out that pmacct, up to and including version 0.12.4, uses a different column name for the source port if the database is MySQL or SQLite:

if (!strcmp(config.type, "mysql") || !strcmp(config.type, "sqlite3")) {
  strncat(insert_clause, "src_port", SPACELEFT(insert_clause));
  strncat(where[primitive].string, "src_port=%u", SPACELEFT(where[primitive].string));
}
else {
  strncat(insert_clause, "port_src", SPACELEFT(insert_clause));
  strncat(where[primitive].string, "port_src=%u", SPACELEFT(where[primitive].string));
}

I really wish applications wouldn’t change their behaviour arbitrarily like this. To work around it, we would have to hard-code the database types in pmGraph as well, or add an option to switch the column names. Since pmGraph uses JDBC to access the database, it’s not even obvious which driver names are accessing an (underlying) MySQL or SQLite database. So we need to switch the column names, but if we can get pmacct fixed then we can ease the pain for new users in future.

I reported a bug to Paolo Lucente, the lead developer of pmacct, through their mailing list. Paolo agreed to change this behaviour, even though it will break backwards compatibility. We spent some time discussing how to do this in a way that would minimise any impact on existing users.

To do this, we took advantage of pmacct’s existing system of database table versioning, which means that you can still use the oldest table structures, even with the most recent version of pmacct. Paolo agreed to create a new schema version that uses the same column names for all databases, so that pmacct will remain backwards-compatible for all users unless they deliberately choose to change their database schema version.

As we chose to standardise on the PostgreSQL column names, the column names will change for MySQL users between schema versions 7 and 8, so we’ll need to add a configuration option to pmGraph to allow users to choose whether they want the old or the new column names. This is the very same switch that I wanted to avoid in pmacct, but pmGraph has fewer users so it has less impact.
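The switch itself can be very small. Here is a minimal sketch of the idea; the Naming values, the class and the "acct" table name are hypothetical and not pmGraph’s actual configuration, and only the source port column discussed above is covered.

```java
class PortColumns {
    /** Hypothetical option values: LEGACY for old-style MySQL or SQLite tables, UNIFIED otherwise. */
    enum Naming { LEGACY, UNIFIED }

    /** Returns the name of the source port column for the chosen naming scheme. */
    static String sourcePortColumn(Naming naming) {
        return naming == Naming.LEGACY ? "src_port" : "port_src";
    }

    /** Builds a query using whichever column name the table actually has. */
    static String sourcePortQuery(String table, Naming naming) {
        String column = sourcePortColumn(naming);
        return "SELECT " + column + " FROM " + table + " WHERE " + column + " = ?";
    }

    public static void main(String[] args) {
        System.out.println(sourcePortQuery("acct", Naming.LEGACY));
        System.out.println(sourcePortQuery("acct", Naming.UNIFIED));
    }
}
```

Keeping the column name behind one method means the rest of the query-building code never needs to know which database or schema version is in use.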

Svelte Web Design with SVG

Wednesday, October 6th, 2010

Web designers who care about efficiency and speed might like to have a look at Sam Ruby’s Blog.

All images are embedded SVG in the XHTML. No bitmaps at all. Notice how fluid it is, how it scales with the browser’s zoom in and zoom out controls (Control + and Control – in Firefox) and as you resize the browser window.

Screenshot of Sam Ruby's Blog

The page is small, just 14.5k of HTML plus 6.6k of CSS. There’s 21k of JavaScript that isn’t required for the design. Even the drop-down menu at the top works with JavaScript disabled. Finally there is a WOFF web font that adds 40k (another nice technique). It would be nice to have web fonts hosted for cross-site caching.

One disadvantage of designing sites this way is that the page must be valid XHTML for inline SVG to work. This makes it difficult to support older browsers properly, because the server must send the content type as application/xhtml+xml, not text/html. This will cause older browsers to download the page instead of rendering it. You could work around that with user agent sniffing. I think that Internet Explorer might need that in any case.
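An alternative to sniffing the user agent is to look at the Accept header that the browser sends, and only serve the XHTML media type (application/xhtml+xml) to browsers that list it. The following is a rough sketch of that check, not something taken from Sam Ruby’s site; the class and method names are made up, and the parsing is deliberately simple.

```java
class XhtmlNegotiation {
    static final String XHTML = "application/xhtml+xml";
    static final String HTML = "text/html";

    /** Picks the content type to send, given the request's Accept header (may be null). */
    static String contentTypeFor(String acceptHeader) {
        if (acceptHeader == null) {
            return HTML;
        }
        for (String range : acceptHeader.split(",")) {
            String[] parts = range.trim().split(";");
            if (!parts[0].trim().equalsIgnoreCase(XHTML)) {
                continue;
            }
            // The browser lists XHTML; honour an explicit q=0, which means "not acceptable".
            for (int i = 1; i < parts.length; i++) {
                String parameter = parts[i].trim().toLowerCase();
                if (parameter.startsWith("q=")) {
                    try {
                        if (Double.parseDouble(parameter.substring(2).trim()) == 0.0) {
                            return HTML;
                        }
                    } catch (NumberFormatException e) {
                        // Malformed quality value: treat the type as accepted.
                    }
                }
            }
            return XHTML;
        }
        return HTML; // XHTML not mentioned at all, so fall back to plain HTML
    }

    public static void main(String[] args) {
        System.out.println(contentTypeFor("text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"));
        System.out.println(contentTypeFor("image/gif, image/jpeg, */*"));
    }
}
```

Browsers that fall back to text/html would still not render the inline SVG, so this only avoids the download prompt; the graphics themselves would need another fallback.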

Another disadvantage is that very few CMSs currently support generating valid XHTML, so it’s difficult to know what tool we could recommend to help you to build and manage a website with inline SVG. Massimiliano of the Habari Project says:

I don’t have any examples of blog software constructed this way… the only way to find out how many people would like having SVG on their blog is to provide a blogging tool which allows them to do it.

Both issues could be worked around by using external SVG (in separate files) instead of inline (embedded in XHTML). External SVG files are more cacheable but require additional HTTP requests to fetch from the server the first time.

Most older browsers do not support SVG images, so although the site degrades gracefully, it looks very plain without any graphics. You could work around this with a server-side renderer that converts the SVG to PNG for older browsers.

I think this is an excellent example of a great technique that we could be using for many more sites.