This post has three purposes, which I think overlap sufficiently to combine them:
- A User Guide for the system that we developed for UNICEF, IDS and RuralNet Zambia
- A Developers’ Guide for anyone wishing to build something similar
- Notes on lessons learned that may assist future implementers
Project goals
Automate the data entry part of a long paper-based survey, by replacing the paper forms with electronic devices.
Hardware and application selection
The survey has several long and complex questions, and long sets of multiple-choice answers. The data collection needs to be done in dusty rural Zambia, and the devices might need to be used for a full day without power. Collected data should be sent wirelessly to a secure data repository at some time after collection.
Text entry is required for many fields. That means either a real keyboard with keys, or a sufficiently large touch screen to type comfortably on. Use of the device camera, and presentation of reports and graphs on the same device, might be required in future.
Two possible hardware platforms were identified:
- Tablet laptops with touch screens
- Tablet mobile devices (iPad or Android tablet)
We selected the latter for this project due to lower cost, lighter weight, better usability and longer battery life.
The available software options that we identified were:
- EpiSurveyor (Java J2ME, partly closed source, we have used before and fixed bugs)
- OpenXdata (Java J2ME, open source, developed and supported by an Aptivate alumnus among others)
- Open Data Kit (ODK) (Android, open source, active community)
- Bespoke online/offline survey in HTML5
Of these, we eliminated EpiSurveyor and OpenXdata due to lack of compatibility with the hardware platform(s) we had chosen.
We chose ODK over a bespoke system due to limited time available for development, and ability to easily take photos and record GPS coordinates using the device’s hardware.
Of the available Android tablet devices, we chose the Samsung Galaxy Tab for the pilot project, due to its high quality construction. For future projects we would probably use a lower cost device; see the lessons learned for details.
Form creation
Since the survey is quite long (about 230 questions) we wanted an easy way to enter the questions. The ODK application requires the form to be in XForms format. We identified the following tools for creating XForms:
- ODK Build
- PurcForms
- Kobo Form Builder
- XLS2XForm (actually the pyxform fork)
We decided to use XLS2XForm, which enabled us to enter the large number of questions easily in Excel. The others all have graphical builders, which have advantages and disadvantages for less technical users:
- More visually appealing
- All available options presented visually (types of controls, groups, etc.)
- Less likely to make a mistake and produce an invalid form
- Cumbersome user interface slows down data entry
Unfortunately, none of these designers were able to import an existing form in XForms format, which means that the modifiable “source code” of the form must be maintained in a “proprietary” format in each case, and it’s difficult to switch between tools.
You can download the conversion tools, and the Excel spreadsheet with the completed questionnaire as we delivered it to RuralNet, here. RuralNet staff, please use the latest version of the spreadsheet that you can find locally. To use the tools, you will need to download and install Python 2.7 and Java (JRE). Then download the tools as a ZIP file and extract it somewhere. I recommend that you keep the master copy of the spreadsheet in Dropbox to ensure that it’s backed up, and it’s always clear what the latest version is.
For help in building surveys using XLS2XForm, please see the documentation. In addition to the question types listed there, we have used the following shortcuts, which also work in this customised version of XLS2XForm:
- text is short for add text prompt (a text field, such as a person’s name)
- note is short for add note prompt (a read-only field, providing additional information for the user)
- time is a time field without a date (for example, survey start and end times)
To compile the spreadsheet into an XForms form, run the build_and_validate.py script by double-clicking on it. If it works, it will show the message “Success!”, otherwise it will show an error message, usually caused by a mistake in the Excel spreadsheet. If it works, it will create (replace) the file called zambia-ranq-round3.xml in the same directory. If your spreadsheet has a different name, you can create a shortcut to call build_and_validate_custom.py with the name of the spreadsheet on the command line.
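If you prefer a script to a shortcut, the custom build could be wrapped roughly as follows. This is a minimal sketch, assuming that build_and_validate_custom.py takes the spreadsheet name as its only argument and lives in the current directory. The helper names and the example spreadsheet name are made up for illustration.

```python
import subprocess
import sys


def build_command(spreadsheet, script="build_and_validate_custom.py"):
    """Return the command line that compiles one spreadsheet.

    Assumption: the script takes the spreadsheet name as its only argument.
    """
    return [sys.executable, script, spreadsheet]


def build_form(spreadsheet):
    """Run the build script and return its exit status."""
    return subprocess.call(build_command(spreadsheet))


# Show the command without running it ("my-survey.xls" is a made-up name).
print(build_command("my-survey.xls")[1:])
```

Calling build_form("my-survey.xls") would then run the script with the same Python interpreter that runs the wrapper.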
Software components
ODK Aggregate is the software that powers the Internet server. It is a repository for blank forms (designs) and completed forms (data). Our server is located at http://partimob.appspot.com/. This server is currently paid for by us, and will need to be transferred to RuralNet at some point.
ODK Collect is the application that runs on the device, and users interact with it to complete the survey. It’s essentially a user interface for XForms. It can download blank forms (designs) from an ODK Aggregate server, and upload completed forms (data) to the Aggregate server as well.
ODK Briefcase is the software that downloads completed forms (data) from the Aggregate server and converts them into CSV (spreadsheet) format, which can be loaded into
Customised ODK Collect
We are using a custom version of ODK Collect. You can download the source code for it here, or the compiled application here. You can also find it in the ZIP file download. If you prefer, you can use the latest official version of ODK Collect. The two are compatible, but our version adds the following useful features:
- Use supplied login and password by default to save a round trip and a prompt.
- Add keyboard navigation, useful for form filling on android-x86 because the mouse interface is pretty clunky.
- Restore ability to modify completed and submitted forms on the device, which was removed from the official version in 1.1.7.
- Improved error messages and progress indication during form uploads.
- Allow setting the instance name on the first page of the survey.
- Allow saving incomplete surveys on required questions (in case a survey is interrupted; almost all of our questions are required).
There are several ways to install ODK Collect on a device:
- Download it from the Android Market (official version only, not our customised version)
- Copy the APK file onto a microSD card, insert the card into the device, and use the My Files application to find and open it from the SD card.
- Attach the USB cable from the device to a computer, enable mass storage mode on the device, and on the computer, drag and drop the APK file onto the device’s internal memory, then use the My Files application to find and open it.
- Attach the USB cable from the device to a computer, and use ADB’s install command to install the APK file.
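The ADB route is the easiest one to script when you have several devices to set up. Here is a minimal sketch, assuming that adb is on your PATH and exactly one device is attached. The APK file name is a placeholder, and the helper names are made up.

```python
import subprocess


def adb_install_command(apk_path, reinstall=False):
    """Build the adb command line that installs an APK on the attached device."""
    command = ["adb", "install"]
    if reinstall:
        # -r replaces an existing installation and keeps its data.
        command.append("-r")
    command.append(apk_path)
    return command


def install_apk(apk_path, reinstall=False):
    """Run adb and return its exit status (requires adb on the PATH)."""
    return subprocess.call(adb_install_command(apk_path, reinstall))


# Print the command instead of running it; the file name is a placeholder.
print(" ".join(adb_install_command("odk-collect.apk", reinstall=True)))
```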
It’s useful to put the application onto the device’s desktop. To do that, open the Applications list, find ODK Collect, and press and hold it with your finger for a few seconds. The background will change to the desktop; release your finger to drop the application there.
It’s also useful to remove all the other junk from the desktop. For each icon and widget on the desktop, press and hold it with your finger for a few seconds, until the trashcan icon appears, then drag your finger to the trashcan and release it there.
Form management on the device
There are several ways to put blank forms (designs) onto the tablets:
- Download them from the ODK Aggregate server using ODK Collect.
- Copy them onto a microSD card, insert the card into the device, and use the My Files application to copy them from the SD card to the /sdcard/odk/forms directory.
- Attach the USB cable from the device to a computer, enable mass storage mode on the device, and on the computer, drag and drop the form into the /sdcard/odk/forms directory.
- Attach the USB cable from the device to a computer, and use ADB or DDMS to push the file onto the device, into the /sdcard/odk/forms directory.
Of these methods, ADB or DDMS is recommended for rapid development, and using the Aggregate server is recommended for production use, since the form must be installed on the Aggregate server for it to be able to accept submissions.
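For the ADB method, a push can be scripted so that a freshly built form lands on the device in one step. This is a sketch under the assumption that adb is on your PATH and one device is attached. The helper names are hypothetical, and the form file name is the one produced by the build script above.

```python
import posixpath
import subprocess

FORMS_DIR = "/sdcard/odk/forms"


def adb_push_command(local_form, device_dir=FORMS_DIR):
    """Build the adb command that copies one form file into device_dir."""
    # Keep only the file name; split on both separators so Windows paths work.
    file_name = local_form.replace("\\", "/").split("/")[-1]
    return ["adb", "push", local_form, posixpath.join(device_dir, file_name)]


def push_form(local_form):
    """Run adb and return its exit status (requires adb on the PATH)."""
    return subprocess.call(adb_push_command(local_form))


# Print the command instead of running it.
print(" ".join(adb_push_command("zambia-ranq-round3.xml")))
```

Spelling out the full destination path, rather than only the directory, avoids relying on how a particular adb version treats a directory as the target.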
Similarly there are several ways to copy completed forms (data) off the device:
- Upload them to the ODK Aggregate server using ODK Collect.
- Use the My Files application to copy them from /sdcard/odk/instances to a microSD card, then remove the card and connect it to the computer, and drop the files into the ODK Briefcase data directory.
- Attach the USB cable from the device to a computer, enable mass storage mode on the device, and on the computer, drag and drop the files from the /sdcard/odk/instances directory to the ODK Briefcase data directory.
- Attach the USB cable from the device to a computer, and use ADB or DDMS to pull the file from the device’s /sdcard/odk/instances directory to the ODK Briefcase data directory.
Of these methods, using ODK Aggregate is recommended for development and production use.
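If you do need to fall back to ADB, for example when a device cannot get online, the pull can be scripted and the result checked afterwards. The sketch below assumes that adb is on your PATH. It also assumes that each completed form is saved as an XML file somewhere below the instances directory, which is why it searches recursively rather than relying on an exact layout. The helper names and the demonstration file names are made up.

```python
import os
import subprocess
import tempfile

INSTANCES_DIR = "/sdcard/odk/instances"


def adb_pull_command(local_dir, device_dir=INSTANCES_DIR):
    """Build the adb command that copies the instances directory to local_dir."""
    return ["adb", "pull", device_dir, local_dir]


def pull_instances(local_dir):
    """Run adb and return its exit status (requires adb on the PATH)."""
    return subprocess.call(adb_pull_command(local_dir))


def find_instance_files(local_dir):
    """Return every .xml file below local_dir, sorted by path."""
    found = []
    for folder, _subdirs, files in os.walk(local_dir):
        found.extend(os.path.join(folder, name)
                     for name in files if name.endswith(".xml"))
    return sorted(found)


# Demonstration on a throwaway directory standing in for a pulled copy.
demo_dir = tempfile.mkdtemp()
os.makedirs(os.path.join(demo_dir, "survey_001"))
open(os.path.join(demo_dir, "survey_001", "survey_001.xml"), "w").close()
demo_found = find_instance_files(demo_dir)
print([os.path.basename(path) for path in demo_found])
```

Counting the files that come back is a quick way to confirm that every survey completed on the device has actually been copied.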
Since the Aggregate server is on the Internet, this method requires that the device have Internet access. So it needs either a valid SIM card installed, with credit and a data bundle, or a connection to a WiFi network. We had many problems with using SIM cards for data, so WiFi is preferred if possible.
The directories mentioned above will not exist until ODK Collect is installed on the device and run for the first time. Forms downloaded from the Aggregate server will also be placed in the /sdcard/odk/forms directory. Forms completed on the device will be placed in the /sdcard/odk/instances directory.
Configuring ODK Collect
Collect needs to know the details of the ODK Aggregate server to log into it, download blank forms and upload completed forms.
Open the ODK Collect application, press the Settings button and click on Change Settings. Click on URL and enter https://partimob.appspot.com. Similarly, complete the Username and Password using the details that you’ve been given by the Aggregate server operator, or the account that you’ve created on the Aggregate server. This account should only have Data Collector permissions, no more. Press the Back key to get back to the main menu of ODK Collect.
Downloading forms using ODK Collect
Open ODK Collect on the device, and click on the Get Blank Form button. Collect will try to log into the Aggregate server using the details that you’ve provided, and get a list of forms on the server that have the Downloadable box ticked. This is on by default for newly uploaded forms.
Tick the box next to all the forms that you want to download, and click on the Get Selected button.
Filling forms on the device
Open ODK Collect on the device, and click on the Fill Blank Form button. All the forms in the device’s /sdcard/odk/forms directory should be listed. Choose the form that you want to complete.
You will see an introductory screen showing how to move between questions by swiping your finger across the screen, from right to left or left to right. This screen has a text box at the bottom, which you can use to name the form that you’re completing. Naming forms is useful if your data collection is interrupted and you need to resume it later. It’s much easier to identify the form using its name, rather than opening it and flicking through to find some identifying information. You might name the form based on the household code that you’re surveying.
Depending on your answers to some questions, others may be hidden, or their text might change.
At the end of the form there is another chance to Name this form, and a tickbox to Mark form as finalized. Before you can upload the form to the Aggregate server, this box must be ticked, and you must press the Save Form and Exit button. Otherwise Collect will consider that the form is incomplete.
Sending completed forms to Aggregate
Open ODK Collect on the device, and click on the Send Finalized Form button on the main menu. Tick the box next to all the forms that you want to upload to Aggregate, and click on Send Selected. After the upload is complete, you should see the Upload Results message. Every form should have “Success” next to it, otherwise it was not sent successfully.
Downloading forms using Briefcase
We are using a customised version of ODK Briefcase with the following changes:
- Fix the export of repeated groups, which before only worked for the first row (issue 461).
- Shorten exported column names, to allow the CSV file to be imported into Access.
- Allow the server name, username and password to be provided on the command line (or via a shortcut).
You can find the source code here and the pre-compiled version here, as an executable JAR file. You can also find it in the ZIP file download. If you make changes to the source and want to build the executable JAR again, install Maven and use the mvn package command.
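To give an idea of what shortening column names involves, here is an illustrative sketch. It is not the code from our Briefcase changes, which are written in Java. It simply truncates each name to a limit (Access allows field names of up to 64 characters) and adds a numeric suffix when two shortened names collide.

```python
def shorten_names(names, limit=64):
    """Shorten column names to at most `limit` characters, keeping them unique."""
    used = set()
    result = []
    for name in names:
        short = name[:limit]
        counter = 1
        # On a collision, make room for a numeric suffix and try again.
        while short in used:
            suffix = "_%d" % counter
            short = name[:limit - len(suffix)] + suffix
            counter += 1
        used.add(short)
        result.append(short)
    return result


# A tiny limit makes the collision handling visible.
print(shorten_names(["abcdef", "abcdxy", "ab"], limit=4))
```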
To download the completed forms, open Briefcase by double-clicking on the briefcase-1.0-jar-with-dependencies.jar file. On the Transfer tab, click on the Connect button. For the URL, enter https://partimob.appspot.com, and for the user name and password, give the details of an ODK Aggregate account with Data Viewer permissions.
Then you should see a list of forms appear under the heading Forms to Transfer. Tick the box next to the one that your users have been completing, and then click on the Transfer button. If you do this after all the completed forms (data) have been submitted to the ODK Aggregate server, you will not need to do it again for that form template (design).
Now switch to the Transform tab and see if the form appears in the Form list. If it doesn’t, then exit and restart the Briefcase application (issue 464).
For Output Type, choose .csv and media files. For Output Directory, choose the directory where you’d like to save the CSV files. Note that any previous files exported to that directory from the same form will be overwritten without warning, even if they have been modified (cleaned). Click on the Output button to write the CSV files.
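Because the export overwrites files without warning, it may be worth copying the output directory aside before each export. The following is a minimal sketch of that precaution, not part of Briefcase. The function name and the backup naming scheme are my own invention.

```python
import os
import shutil
import tempfile
import time


def backup_directory(output_dir, stamp=None):
    """Copy output_dir to a timestamped sibling and return the copy's path.

    Returns None if output_dir does not exist yet (nothing to protect).
    """
    if not os.path.isdir(output_dir):
        return None
    stamp = stamp or time.strftime("%Y%m%d-%H%M%S")
    backup = "%s-backup-%s" % (output_dir.rstrip("/\\"), stamp)
    shutil.copytree(output_dir, backup)
    return backup


# Demonstration on a throwaway directory with one made-up file.
demo_root = tempfile.mkdtemp()
demo_out = os.path.join(demo_root, "export")
os.mkdir(demo_out)
with open(os.path.join(demo_out, "data.csv"), "w") as handle:
    handle.write("SubmissionDate\n")
demo_backup = backup_directory(demo_out, stamp="test")
print(os.listdir(demo_backup))
```

Run something like this on the real output directory just before clicking the Output button, so that any cleaned files survive the export.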
Cleaning data in Excel
You can find the Excel spreadsheet that we use for data storage and cleaning here. Note that Excel is a long way from the best way to store and manipulate data like this. Microsoft Access would be far more appropriate. Yet again I wish there was a sufficiently powerful open source alternative.
Because the spreadsheet contains cleaned data, which is “better” than the raw data which is included in the CSV export, we don’t want to overwrite existing rows. For the main section of the questionnaire (the so-called Single Responses) you can include only the new data like this:
- Open the main spreadsheet and switch to the Single Responses tab
- Highlight all rows from 3 down to the bottom, and Sort them by the SubmissionDate column.
- Note the last submission date on this spreadsheet.
- Open the newly exported CSV file for the single responses (something like RANQ-2011-Round-4-v5.csv).
- Sort this file by the SubmissionDate column as well.
- Highlight and copy all the rows whose submission date is later (more recent) than the last one in the main spreadsheet.
- Paste them at the bottom of the Single Responses tab of the main spreadsheet, below the other data.
For the other tables, this process needs to be done completely manually at present.
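The copy-and-paste steps for the Single Responses sheet amount to keeping only the rows submitted after the last date already in the spreadsheet. Here is a rough sketch in Python 3, assuming the CSV has a SubmissionDate column in the date format shown. The format, the HouseholdCode column and the sample rows are all made up for illustration, so check them against your own export.

```python
import csv
import io
from datetime import datetime

DATE_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumption: adjust to match your export


def rows_after(rows, last_seen, column="SubmissionDate", date_format=DATE_FORMAT):
    """Return the rows submitted after last_seen, oldest first."""
    dated = [(datetime.strptime(row[column], date_format), row) for row in rows]
    dated.sort(key=lambda pair: pair[0])
    return [row for when, row in dated if when > last_seen]


# Made-up sample standing in for the exported CSV file.
sample = io.StringIO(
    "SubmissionDate,HouseholdCode\n"
    "2011-11-03 09:15:00,H002\n"
    "2011-11-01 14:30:00,H001\n"
    "2011-11-05 08:00:00,H003\n"
)
last_in_spreadsheet = datetime(2011, 11, 2)
new_rows = rows_after(csv.DictReader(sample), last_in_spreadsheet)
print([row["HouseholdCode"] for row in new_rows])  # ['H002', 'H003']
```

The selected rows could then be written out with csv.DictWriter and pasted at the bottom of the Single Responses tab as before.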
You can then check and clean the data by viewing and modifying it in Excel. Note that each sheet has one or two columns at the end, which are filled by formulae that look up values from the Single Responses sheet, such as the Household Code.
Using the Android x86 Emulator
To be written.
Lessons learned
To be written.