Archive for the ‘data collection’ Category

Rough Guide to rural data collection with ODK

Monday, December 5th, 2011

This post has three purposes, which I think overlap sufficiently to combine them:

  • A User Guide for the system that we developed for UNICEF, IDS and RuralNet Zambia
  • A Developers’ Guide for anyone wishing to build something similar
  • Notes on lessons learned that may assist future implementers

Project goals

Automate the data entry part of a long paper-based survey, by replacing the paper forms with electronic devices.

Hardware and application selection

The survey has several long and complex questions, and long sets of multiple-choice answers. The data collection needs to be done in dusty rural Zambia, and the devices might need to be used for a full day without power. Collected data should be sent wirelessly to a secure data repository at some time after collection.

Text entry is required for many fields. That means either a real keyboard with keys, or a sufficiently large touch screen to type comfortably on. Use of the device camera, and presentation of reports and graphs on the same device, might be required in future.

Two possible hardware platforms were identified:

  • Tablet laptops with touch screens
  • Tablet mobile devices (iPad or Android tablet)

We selected the latter for this project due to lower cost, lighter weight, better usability and longer battery life.

The available software options that we identified were:

  • EpiSurveyor (Java J2ME, partly closed source, we have used before and fixed bugs)
  • OpenXdata (Java J2ME, open source, developed and supported by an Aptivate alumnus among others)
  • Open Data Kit (ODK) (Android, open source, active community)
  • Bespoke online/offline survey in HTML5

Of these, we eliminated EpiSurveyor and OpenXdata due to lack of compatibility with the hardware platform(s) we had chosen.

We chose ODK over a bespoke system due to limited time available for development, and ability to easily take photos and record GPS coordinates using the device’s hardware.

Of the available Android tablet devices, we chose the Samsung Galaxy Tab for the pilot project, due to its high quality construction. For future projects we would probably use a lower cost device; see the lessons learned for details.

Form creation

Since the survey is quite long (about 230 questions) we wanted an easy way to enter the questions. The ODK application requires the form to be in XForms format. We identified the following tools for creating XForms:

We decided to use XLS2XForm, which enabled us to enter the large number of questions easily in Excel. The others all have graphical builders, which have advantages and disadvantages for less technical users:

  • More visually appealing
  • All available options presented visually (types of controls, groups, etc.)
  • Less likely to make a mistake and produce an invalid form
  • Cumbersome user interface slows down data entry

Unfortunately, none of these designers were able to import an existing form in XForms format, which means that the modifiable “source code” of the form must be maintained in a “proprietary” format in each case, and it’s difficult to switch between tools.

You can download the conversion tools, and the Excel spreadsheet with the completed questionnaire as we delivered it to RuralNet, here. RuralNet staff, please use the latest version of the spreadsheet that you can find locally. To use the tools, you will need to download and install Python 2.7 and Java (JRE). Then download the tools as a ZIP file and extract it somewhere. I recommend that you keep the master copy of the spreadsheet in Dropbox to ensure that it’s backed up, and it’s always clear what the latest version is.

For help in building surveys using XLS2XForm, please see the documentation. In addition to the question types listed there, we have used the following shortcuts, which also work in this customised version of XLS2XForm:

  • text is short for add text prompt (a text field, such as a person’s name)
  • note is short for add note prompt (a read-only field, providing additional information for the user)
  • time is a time field without a date (for example, survey start and end times)

To compile the spreadsheet into an XForms form, run the build_and_validate.py script by double-clicking on it. If it works, it will show the message “Success!”, otherwise it will show an error message, usually caused by a mistake in the Excel spreadsheet. If it works, it will create (replace) the file called zambia-ranq-round3.xml in the same directory. If your spreadsheet has a different name, you can create a shortcut to call build_and_validate_custom.py with the name of the spreadsheet on the command line.

Software components

ODK Aggregate is the software that powers the Internet server. It is a repository for blank forms (designs) and completed forms (data). Our server is located at http://partimob.appspot.com/. This server is currently paid for by us, and will need to transfer to RuralNet at some point.

ODK Collect is the application that runs on the device, and users interact with it to complete the survey. It’s essentially a user interface for XForms. It can download blank forms (designs) from an ODK Aggregate server, and upload completed forms (data) to the Aggregate server as well.

ODK Briefcase is the software that downloads completed forms (data) from the Aggregate server and converts them into CSV (spreadsheet) format, which can be loaded into

Customised ODK Collect

We are using a custom version of ODK Collect. You can download the source code for it here, or the compiled application here. You can also find it in the ZIP file download. If you prefer, you can use the latest official version of ODK Collect. The two are compatible, but our version adds the following useful features:

  • Use supplied login and password by default to save a round trip and a prompt.
  • Add keyboard navigation, useful for form filling on android-x86 because the mouse interface is pretty clunky.
  • Restore ability to modify completed and submitted forms on the device, which was removed from the official version in 1.1.7.
  • Improved error messages and progress indication during form uploads.
  • Allow setting the instance name on the first page of the survey.
  • Allow saving incomplete surveys on required questions (in case a survey is interrupted; almost all of our questions are required).

There are several ways to install ODK Collect on a device:

  • Download it from the Android Market (official version only, not our customised version)
  • Copy the APK file onto a microSD card, insert the card into the device, and use the My Files application to find and open it from the SD card.
  • Attach the USB cable from the device to a computer, enable mass storage mode on the device, and on the computer, drag and drop the APK file onto the device’s internal memory, then use the My Files application to find and open it.
  • Attach the USB cable from the device to a computer, and use ADB’s install command to install the APK file.
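
The ADB route in the list above can be scripted when there are several tablets to set up. This is only a minimal sketch: it assumes that the adb tool is on the PATH and that exactly one device is attached, and the APK file name is a placeholder rather than the real name of our build.

```python
import subprocess


def adb_install_command(apk_path, reinstall=True):
    """Build the adb command line that installs an APK on the attached device."""
    command = ["adb", "install"]
    if reinstall:
        # -r replaces an existing installation and keeps its data
        command.append("-r")
    command.append(apk_path)
    return command


def install_apk(apk_path):
    """Run adb; raises CalledProcessError if adb exits with an error."""
    subprocess.run(adb_install_command(apk_path), check=True)


if __name__ == "__main__":
    # "odk-collect-custom.apk" is a placeholder file name (an assumption)
    print(" ".join(adb_install_command("odk-collect-custom.apk")))
```

Keeping the command builder separate from the function that runs it makes it easy to check what would be executed before touching a device.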

It’s useful to put the application onto the device’s desktop. To do that, open the Applications list, find ODK Collect, and press and hold it with your finger for a few seconds. The background will change to the desktop; release your finger to drop the application there.

It’s also useful to remove all the other junk from the desktop. For each icon and widget on the desktop, press and hold it with your finger for a few seconds, until the trashcan icon appears, then drag your finger to the trashcan and release it there.

Form management on the device

There are several ways to put blank forms (designs) onto the tablets:

  • Download them from the ODK Aggregate server using ODK Collect.
  • Copy them onto a microSD card, insert the card into the device, and use the My Files application to copy them from the SD card to the /sdcard/odk/forms directory.
  • Attach the USB cable from the device to a computer, enable mass storage mode on the device, and on the computer, drag and drop the form into the /sdcard/odk/forms directory.
  • Attach the USB cable from the device to a computer, and use ADB or DDMS to push the file onto the device, into the /sdcard/odk/forms directory.

Of these methods, ADB or DDMS is recommended for rapid development, and using the Aggregate server is recommended for production use, since the form must be installed on the Aggregate server for it to be able to accept submissions.

Similarly there are several ways to copy completed forms (data) off the device:

  • Upload them to the ODK Aggregate server using ODK Collect.
  • Use the My Files application to copy them from /sdcard/odk/instances to a microSD card, then remove the card and connect it to the computer, and drop the files into the ODK Briefcase data directory.
  • Attach the USB cable from the device to a computer, enable mass storage mode on the device, and on the computer, drag and drop the files from the /sdcard/odk/instances directory to the ODK Briefcase data directory.
  • Attach the USB cable from the device to a computer, and use ADB or DDMS to pull the file from the device’s /sdcard/odk/instances directory to the ODK Briefcase data directory.

Of these methods, using ODK Aggregate is recommended for development and production use.

Since the Aggregate server is on the Internet, this method requires that the device have Internet access. So it needs either a valid SIM card with credit and a data bundle, or a connection to a WiFi network. We had many problems with using SIM cards for data, so WiFi is preferred if possible.

The directories mentioned above will not exist until ODK Collect is installed on the device and run for the first time. Forms downloaded from the Aggregate server will also be placed in the /sdcard/odk/forms directory. Forms completed on the device will be placed in the /sdcard/odk/instances directory.
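
The ADB transfers described above can be scripted in the same spirit. The sketch below only builds the adb push and adb pull command lines for the two directories named in this section. The local paths are placeholders, and the layout that Briefcase expects inside its data directory is not covered here, so treat the pull destination as an assumption.

```python
import os
import posixpath

# Device directories created by ODK Collect on first run (from this guide)
FORMS_DIR = "/sdcard/odk/forms"
INSTANCES_DIR = "/sdcard/odk/instances"


def push_form_command(local_form):
    """Build the adb command that copies a blank form onto the device."""
    remote_path = posixpath.join(FORMS_DIR, os.path.basename(local_form))
    return ["adb", "push", local_form, remote_path]


def pull_instances_command(destination_dir):
    """Build the adb command that copies all completed forms off the device."""
    # destination_dir is a local directory chosen by you (placeholder)
    return ["adb", "pull", INSTANCES_DIR, destination_dir]


if __name__ == "__main__":
    print(" ".join(push_form_command("zambia-ranq-round3.xml")))
    print(" ".join(pull_instances_command("briefcase-data")))
```

Each list can be passed to subprocess.run(command, check=True) once the printed commands look right.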

Configuring ODK Collect

Collect needs to know the details of the ODK Aggregate server to log into it, download blank forms and upload completed forms.

Open the ODK Collect application, press the Settings button and click on Change Settings. Click on URL and enter https://partimob.appspot.com. Similarly, complete the Username and Password using the details that you’ve been given by the Aggregate server operator, or the account that you’ve created on the Aggregate server. This account should only have Data Collector permissions, no more. Press the Back key to get back to the main menu of ODK Collect.

Downloading forms using ODK Collect

Open ODK Collect on the device, and click on the Get Blank Form button. Collect will try to log into the Aggregate server using the details that you’ve provided, and get a list of forms on the server that have the Downloadable box ticked. This is on by default for newly uploaded forms.

Tick the box next to all the forms that you want to download, and click on the Get Selected button.

Filling forms on the device

Open ODK Collect on the device, and click on the Fill Blank Form button. All the forms in the device’s /sdcard/odk/forms directory should be listed. Choose the form that you want to complete.

You will see an introductory screen showing how to move between questions by swiping your finger across the screen, from right to left or left to right. This screen has a text box at the bottom, which you can use to name the form that you’re completing. Naming forms is useful if your data collection is interrupted and you need to resume it later. It’s much easier to identify the form using its name, rather than opening it and flicking through to find some identifying information. You might name the form based on the household code that you’re surveying.

Depending on your answers to some questions, others may be hidden, or their text might change.

At the end of the form there is another chance to Name this form, and a tickbox to Mark form as finalized. Before you can upload the form to the Aggregate server, this box must be ticked, and you must press the Save Form and Exit button. Otherwise Collect will consider that the form is incomplete.

Sending completed forms to Aggregate

Open ODK Collect on the device, and click on the Send Finalized Form button on the main menu. Tick the box next to all the forms that you want to upload to Aggregate, and click on Send Selected. After the upload is complete, you should see the Upload Results message. Every form should have “Success” next to it, otherwise it was not sent successfully.

Downloading forms using Briefcase

We are using a customised version of ODK Briefcase with the following changes:

  • Fix the export of repeated groups, which before only worked for the first row (issue 461).
  • Shorten exported column names, to allow the CSV file to be imported into Access.
  • Allow the server name, username and password to be provided on the command line (or via a shortcut).
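
To illustrate why column names need shortening: Access limits field names to 64 characters, while exported names for nested groups can be much longer. The sketch below shows one possible way to shorten names while keeping them unique. It is not the code from our Briefcase changes, and the function name and suffix scheme are assumptions made for illustration.

```python
def shorten_column_names(names, max_length=64):
    """Truncate names to max_length, adding a numeric suffix to avoid clashes."""
    used = set()
    result = []
    for name in names:
        candidate = name[:max_length]
        counter = 1
        # Compare in lower case, because Access field names are case-insensitive
        while candidate.lower() in used:
            suffix = "_%d" % counter
            candidate = name[:max_length - len(suffix)] + suffix
            counter += 1
        used.add(candidate.lower())
        result.append(candidate)
    return result


if __name__ == "__main__":
    # Hypothetical column names, shortened to 10 characters for demonstration
    print(shorten_column_names(
        ["household_members_name", "household_members_age"], max_length=10))
```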

You can find the source code here and the pre-compiled version here, as an executable JAR file. You can also find it in the ZIP file download. If you make changes to the source and want to build the executable JAR again, install Maven and use the mvn package command.

To download the completed forms, open Briefcase by double-clicking on the briefcase-1.0-jar-with-dependencies.jar file. On the Transfer tab, click on the Connect button. For the URL, enter https://partimob.appspot.com, and for the user name and password, give the details of an ODK Aggregate account with Data Viewer permissions.

Then you should see a list of forms appear under the heading Forms to Transfer. Tick the box next to the one that your users have been completing, and then click on the Transfer button. If you do this after all the completed forms (data) have been submitted to the ODK Aggregate server, you will not need to do it again for that form template (design).

Now switch to the Transform tab and see if the form appears in the Form list. If it doesn’t, then exit and restart the Briefcase application (issue 464).

For Output Type, choose .csv and media files. For Output Directory, choose the directory where you’d like to save the CSV files. Note that any previous files exported to that directory from the same form will be overwritten without warning, even if they have been modified (cleaned). Click on the Output button to write the CSV files.

Cleaning data in Excel

You can find the Excel spreadsheet that we use for data storage and cleaning here. Note that Excel is a long way from the best way to store and manipulate data like this. Microsoft Access would be far more appropriate. Yet again I wish there was a sufficiently powerful open source alternative.

Because the spreadsheet contains cleaned data, which is “better” than the raw data which is included in the CSV export, we don’t want to overwrite existing rows. For the main section of the questionnaire (the so-called Single Responses) you can include only the new data like this:

  • Open the main spreadsheet and switch to the Single Responses tab
  • Highlight all rows from 3 down to the bottom, and Sort them by the SubmissionDate column.
  • Note the last submission date on this spreadsheet.
  • Open the newly exported CSV file for the single responses (something like RANQ-2011-Round-4-v5.csv).
  • Sort this file by the SubmissionDate column as well.
  • Highlight and copy all the rows whose submission date is later (more recent) than the last one in the main spreadsheet.
  • Paste them at the bottom of the Single Responses tab of the main spreadsheet, below the other data.
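
The steps above could also be automated if the Single Responses tab were saved as a CSV file. The following is a rough sketch rather than part of the delivered tools. It assumes that both files have a SubmissionDate column holding timestamps in the format shown in DATE_FORMAT, which you would need to adjust to match the real export.

```python
import csv
from datetime import datetime

DATE_COLUMN = "SubmissionDate"
DATE_FORMAT = "%Y-%m-%dT%H:%M:%S"  # assumption: adjust to the real export format


def parse_date(row):
    """Return the submission date of a row as a datetime."""
    return datetime.strptime(row[DATE_COLUMN], DATE_FORMAT)


def read_rows(path):
    """Read a CSV file into a list of dictionaries, one per row."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def select_new_rows(existing_rows, exported_rows):
    """Return exported rows submitted after the latest existing row, oldest first."""
    if existing_rows:
        last_seen = max(parse_date(row) for row in existing_rows)
        exported_rows = [row for row in exported_rows
                         if parse_date(row) > last_seen]
    return sorted(exported_rows, key=parse_date)
```

The rows returned by select_new_rows are the ones to paste below the existing, cleaned data, so nothing that has already been cleaned is overwritten.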

For the other tables, this process needs to be done completely manually at present.

You can then check and clean the data by viewing and modifying it in Excel. Note that each sheet has one or two columns at the end, which are filled by formulae that look up values from the Single Responses sheet, such as the Household Code.

Using the Android x86 Emulator

To be written.

Lessons learned

To be written.

ICTs for Rural Development Seminar

Wednesday, October 27th, 2010

Just attended a very interesting seminar on The Rural Information Economy and ICTs, hosted by the UN Food and Agriculture Organisation (FAO), a major actor in this area, at their headquarters in Rome.

This is an area in which Aptivate is also very interested, and one in which I’ve done some research and been following developments. I still managed to learn quite a bit from three very interesting presentations:

Information Economy Report 2010 (UNCTAD)

The informational dimension of poverty, i.e. where information can help to alleviate or reduce poverty:

  • Market price information
  • Income-earning opportunities (e.g. jobs)
  • Weather information and warnings
  • Correct use of pesticides and fertilisers
  • Health information and education
  • Disaster risk reduction

Communication up and down the supply chain, and with peers and advisors, also helps.

There is an increasing trend to direct involvement of the beneficiaries in the production of ICTs:

  • As ICT workers
  • Manufacturing of ICTs (as an alternative occupation to subsistence farming)
  • Providing IT and ICT-enabled services (answering questions, finding information, running telecentres)

Mobile phone penetration has exceeded all other ICTs in growth in developing countries. On average in the least developed countries, it has increased from 2% to 26% of the population (a thirteenfold increase, or 1200% growth) from 2000 to 2009. Possibly the fastest-spreading technology ever in the history of the world.

Growth is uneven. There are still some LDCs where less than 10% of the population have a mobile phone. In Ethiopia for example, only 5% have a phone. This was largely attributed to lack of liberalisation of telecoms markets.

Half of the rural population in LDCs have no access to a mobile phone signal, which will limit the further growth of mobile usage. Many Universal Service Funds are sitting unused. In some cases this is because they are mandated only to be used on the fixed line network, which is nearly obsolete.

Mobile micro-insurance has become a big topic. For example:

  • Kilimo Salama in Kenya
  • Burkina Faso, Mali (index-based crop insurance)
  • Alliance Afrique

Kilimo Salama recently made their first payouts to farmers because weather conditions exceeded their thresholds. The payouts are automatic and don’t have to be claimed by the farmers. The largest was about $30.

Even those who don’t have access to ICTs themselves can benefit from more transparent markets when enough participants use ICTs.

Download the full report (PDF, 171 Pages, 1240Kb).

Enabling role of ICTs to transform smallholder farmers to entrepreneurs (IFAD)

IFAD offers grants and loans to governments for agricultural development programmes. They are starting to offer grants (but not loans) to the private sector as well.

Grameen and BRAC had limited success with mobile banking (so far), because most of their customers are groups, not individuals, and mobile phones tend to be personal devices.

IFAD and WFP are running a joint project called the Weather Risk Management Facility (WRMF), a micro-insurance project. Half of the insurance premiums are paid by the farmers, and half by the sellers of inputs (seeds, fertilizer, pesticides) as they benefit from farmers being willing to buy more of their products due to reduced risk of crop failure.

ICTs enhancing plant production at the field level (FAO)

e-Locust2 uses vehicles with GPS, laptops and HF radio modems to send real-time information on locust swarms to governments, which can help to warn and prepare neighbouring villages and allow the targeted use of pesticides to control the pests. Time is critical to achieve this.

Digital Pens are being used to capture information entered on forms. The pen recognises what is being written, and where on the form, and captures the data for later upload. This makes it possible to have electronic filing with minimal training, minimal unreliable ICTs, an inherent fallback to paper-based methods, and hard copies of the forms that can be given to farmers or stored in local offices.

There are problems getting pest monitoring officials to enter high quality data when there is no incentive (reward) for accurate data, e.g. in one-way monitoring systems. If governments used this data to target their interventions, villagers would have a much more obvious incentive to ensure that the data was entered accurately and on time.

Thanks

Thanks to FAO for hosting this excellent seminar, and to the World Food Programme for allowing me time off to attend it.

Several of us expressed an interest in continuing the discussion online. We have been heard, and Michael Riggs, lead facilitator of the e-Agriculture Community, is working on enabling this to happen. There will also be a follow-on discussion at the ICTD 2010 Conference in London.

Mobiles for Scientific Research

Friday, July 9th, 2010

We know mobiles are very useful in areas where desktop computer and communications infrastructure is not easily available or affordable. And we’re very interested in mobile applications and scientific research in exactly these regions.

So I was very interested to see a new training workshop being run by the Science Dissemination Unit (SDU) of the Abdus Salam International Centre for Theoretical Physics (ICTP). The workshop is on Mobile Science: Sensing, Computing and Dissemination and the deadline for applications is tomorrow, July 10th.

Quoting from the announcement:

The Science Dissemination Unit (SDU) of the Abdus Salam International Centre for Theoretical Physics (ICTP), with the assistance of the University of Washington (USA) and of the UCLA Center for Embedded Networked Sensing (USA) will hold a Workshop on “Mobile Science: Sensing, Computing and Dissemination” in Trieste (Italy) from the 2 to the 5 of November 2010.

Mobile applications offer tremendous benefits to academic research and education, and to society as a whole throughout the world. This is an opportunity that deserves attention and promotion, especially in less developed areas where mobile phones are the first telecommunications technology in history to have more users than in the developed world.

The specific things that interested me were:

“The Mobile Science workshop aims to engage the scientific community in developing countries in the design, development, and deployment of the newest mobile scientific applications;”
I.e. advocating appropriate mobile applications in scientific research/academia.

“Participants will learn how to apply mobile technology tools to retrieve scientific data”
I.e. designing mobile apps for science data collection.

“how to apply appropriate web-based analysis to assimilate mobile data into scientific studies”
I.e. web-based statistical analysis and presentation, like a free online version of SPSS? As far as I know this doesn’t exist yet. The closest that I can think of is the Google Docs spreadsheet, which is of course just a spreadsheet, requires an internet connection and doesn’t allow plugins for additional scientific analysis functionality. But there could be a very interesting app to develop here.

“and how to share their scientific findings with a potentially large mobile audience.”
I.e. low bandwidth design with an emphasis on web standards for cross-platform compatibility, so that it works on the largest number of mobile devices.

If you want to apply, better get on your bike (or modem?) because the deadline is tomorrow. If you want to do mobile scientific research applications, please get in touch; we’d like to help you.

Translations, PDAs and Field Research

Wednesday, December 2nd, 2009

Translation can be a real headache.

PDA used for interviews in Tanzania

Identifying text for translation, finding individual strings and phrases to avoid duplications, contextual exceptions, keeping track of them, revisions, collaborating remotely, reviewing, back-translating, integrating translations back into a finished product – you name it, the translation workflow has got it.

But first, a bit of background:

Aptivate started working with Camfed about 2 years ago when they were planning a major baseline study of their work supporting women’s education and empowerment in Africa. As part of their broader monitoring and evaluation work they wanted to understand the impact of their programme on areas such as attitudes towards girls’ education, awareness of HIV and sexual health issues and the effectiveness of community structures.

We trained young women from rural areas in Tanzania, Zambia and Zimbabwe to use PDAs for face-to-face interviews with teachers, students, parents and officials in the education system.

We used Palm Tungsten E2 PDAs and Solar Bags from Voltaic to run the exercise. We customised and bug-fixed a version of the excellent EpiSurveyor for use in the education context (it was designed as a health tool).

The surveys that Camfed created for the study (50-60 questions for 6 different stakeholders, many common questions) were designed in English and had to be available in the following languages:

  • Swahili
  • Shona
  • Ndebele
  • Bemba
  • Lozi

The questionnaires were created in a spreadsheet – one sheet per stakeholder (e.g. parent, teacher) with a list of questions and optional responses on each sheet. We put together a tool in Excel to help with the translation process. Essentially it:

  1. Went around sheets indexing each cell with relevant text
  2. Built a single list of strings in a new sheet
  3. Presented only unique strings to a translator and locked the rest down
  4. Rebuilt the original surveys in the new language once the translation was completed
  5. Could repeat all of the above to allow for back-translations too
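
The core of that workflow can be sketched in a few lines. This is not the Excel tool itself, just an illustration of steps 2 to 4 in Python, under the assumption that each sheet has already been read into a list of rows of cell values. The function names and the sample questions are hypothetical.

```python
def collect_unique_strings(sheets):
    """Build a single list of unique text cells, in order of first appearance.

    sheets maps a sheet name (e.g. "parent") to a list of rows,
    where each row is a list of cell values.
    """
    seen = set()
    unique = []
    for rows in sheets.values():
        for row in rows:
            for cell in row:
                if isinstance(cell, str) and cell.strip() and cell not in seen:
                    seen.add(cell)
                    unique.append(cell)
    return unique


def translate_sheets(sheets, translations):
    """Rebuild every sheet, replacing each text cell that has a translation."""
    return {
        name: [
            [translations.get(cell, cell) if isinstance(cell, str) else cell
             for cell in row]
            for row in rows
        ]
        for name, rows in sheets.items()
    }


if __name__ == "__main__":
    # Hypothetical sample data: two stakeholders sharing the same answers
    sheets = {
        "parent": [["Do you agree?", "Yes", "No"]],
        "teacher": [["Do you teach?", "Yes", "No"]],
    }
    print(collect_unique_strings(sheets))
    print(translate_sheets(sheets, {"Yes": "Ndiyo", "No": "Hapana"}))
```

Because shared strings such as the answer options appear only once in the unique list, the translator sees each of them once, however many sheets use them. A back-translation is the same process run with the translated sheets as input.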

This is a good example of the agile approach – do the simplest thing you can to get the job done well (and on a deadline!). The translations got done, we scripted the automatic translation of the EpiSurveyor survey files (which are XML objects) and that was, as they say, that.

Until I had a chat with Camfed yesterday and they asked – “you know that translation tool you made, can we use it for some other things we’re doing?”

Fantastic!

It’s great to have built a tool that starts to get useful beyond its original remit. The Excel tool we made isn’t suitable for general use yet and after using it for 2 years, there is plenty of scope for improvement around issues of collaboration and revision management.

Enter the internet.

I posted a question to MetaFilter yesterday on this subject and I got some really interesting responses I thought I’d share in case anybody is thinking of doing this kind of thing.

In particular, check out:

Happy translating!