<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Aptivate &#124; A Blog for ICT4D</title>
	<atom:link href="http://blog.aptivate.org/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.aptivate.org</link>
	<description>International I.T. Development</description>
	<lastBuildDate>Wed, 01 Feb 2012 14:09:14 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.6</generator>
		<item>
		<title>Content indexing in Django using Apache Tika</title>
		<link>http://blog.aptivate.org/2012/02/01/content-indexing-in-django-using-apache-tika/</link>
		<comments>http://blog.aptivate.org/2012/02/01/content-indexing-in-django-using-apache-tika/#comments</comments>
		<pubDate>Wed, 01 Feb 2012 13:12:44 +0000</pubDate>
		<dc:creator>Chris Wilson</dc:creator>
				<category><![CDATA[Django]]></category>
		<category><![CDATA[Engineer's Log]]></category>
		<category><![CDATA[Infrastructure]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[Linux]]></category>
		<category><![CDATA[Microsoft]]></category>
		<category><![CDATA[Open Source]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[System Administration]]></category>

		<guid isPermaLink="false">http://blog.aptivate.org/?p=992</guid>
		<description><![CDATA[For the Documents module of our new open-source Generic Intranet, we need to be able to extract the text content and metadata from various kinds of documents: PDF files Microsoft Office DOC, XLS and PPT files and the new XML equivalents, DOCX, XLSX and PPTX. I found various tools online to help extract this text, [...]]]></description>
			<content:encoded><![CDATA[<p>For the Documents module of our new open-source <a href="https://github.com/aptivate/intranet">Generic Intranet</a>, we need to be able to extract the text content and metadata from various kinds of documents:</p>
<ul>
<li>PDF files</li>
<li>Microsoft Office DOC, XLS and PPT files</li>
<li>and the new XML equivalents, DOCX, XLSX and PPTX.</li>
</ul>
<p>I found various tools online to help extract this text, largely thanks to Stack Overflow <a href="http://superuser.com/questions/165978/command-line-tool-in-linux-to-extract-text-from-word-excel-powerpoint-or-co">here</a> and <a href="http://stackoverflow.com/questions/888784/extract-text-from-a-powerpoint-ppt-or-pptx-file">here</a>. This ended up with a hodgepodge of tools:</p>
<ul>
<li><a href="http://www.unixuser.org/~euske/python/pdfminer/">PDF Miner</a> for PDF files</li>
<li><a href="http://github.com/mikemaccana/python-docx/tarball/master">python-docx</a> for DOCX files</li>
<li><a href="http://silvercoders.com/en/products/doctotext/">DocToText</a> for PPTX, XLSX and PPT files</li>
<li><a href="http://www.winfield.demon.nl/">antiword</a> for DOC files</li>
</ul>
<p>There were a number of problems with this hodgepodge:</p>
<ul>
<li>I was <strong>unable to find</strong> any Python or command-line solution for <strong>old Excel (XLS) files</strong>;</li>
<li>These solutions did not extract metadata, only document text;</li>
<li>The choice of which tool to use depends on the MIME type returned by the <a href="http://www.darwinsys.com/file/">file(1)</a> command, which varies depending on the OS (Debian/Ubuntu or CentOS) and which version of the library is installed.</li>
</ul>
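<p>To make the MIME type problem concrete, here is a minimal sketch of the kind of dispatch this approach needs. It is an illustration rather than the code we used: the function names are hypothetical, the dictionary values are just labels for the tools listed above, and the MIME type strings are the standard registered ones, which is exactly the assumption that breaks when file(1) reports something different on another OS.</p>

```python
import subprocess

# Labels for the tools listed above, keyed by the MIME type we expect
# file(1) to report. There is deliberately no entry for old Excel (XLS)
# files, because no tool was found for them.
EXTRACTORS = {
    "application/pdf": "PDF Miner",
    "application/msword": "antiword",
    "application/vnd.ms-powerpoint": "DocToText",
    "application/vnd.openxmlformats-officedocument"
    ".wordprocessingml.document": "python-docx",
    "application/vnd.openxmlformats-officedocument"
    ".presentationml.presentation": "DocToText",
    "application/vnd.openxmlformats-officedocument"
    ".spreadsheetml.sheet": "DocToText",
}


def detect_mime_type(path):
    """Ask file(1) for the MIME type of a document."""
    output = subprocess.check_output(["file", "--brief", "--mime-type", path])
    return output.decode("ascii").strip()


def choose_extractor(mime_type):
    """Return the label of the tool to use, or None if nothing handles it."""
    return EXTRACTORS.get(mime_type)
```

<p>Any MIME type that is missing from the table, or spelled differently by a particular version of file(1), silently falls through to <code>None</code>, which is why format auto-detection in a single tool is attractive.</p>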
<p><a href="http://stackoverflow.com/questions/2239459/python-based-document-metadata-parser">Another Stack Overflow post</a> recommended Apache Tika for metadata extraction. It appears to support all the document formats that we need, and to have auto-detection of the document format, which solves all the MIME type problems as well. However, it introduces a new problem: it&#8217;s written in Java, which is hard to access from Python.</p>
<p>Luckily I found some <a href="http://redmine.djity.net/projects/pythontika/wiki">instructions</a> for building a Python wrapper around Tika, using some tools that I&#8217;d never heard of, and this seemed like a good approach. Unfortunately the installation process is very non-standard, which would not fit in with our fabric-based automated deployment process, and would make it harder for users to install the Intranet themselves.</p>
<p>The instructions are somewhat outdated at the time of writing, as they refer to Tika version 0.7, while 1.0 has been released. I was unable to register for an account to update that page, so I wrote to the author with the details that I discovered, and will also document here that the following command works for me:</p>
<pre>python ../jcc/jcc/__main__.py \
        --include /usr/share/java/org.eclipse.osgi.jar \
        --jar tika-parsers-1.0.jar \
        --jar tika-core-1.0.jar \
        java.io.File java.io.FileInputStream \
        java.io.StringBufferInputStream \
        --package org.xml.sax \
        --include tika-app-1.0.jar \
        --python tika --version 1.0 --reserved asm</pre>
<p>I was able to go further than this, and package Tika in a way that makes it easy to install with Pip, and thus integrate with our deployment process.</p>
<p>The wrapper is written using JCC, which works by generating and compiling C++ code that links to the Java classes, and then a Python wrapper around that C++. This means that it needs to be recompiled for each platform, so I couldn&#8217;t just distribute a binary blob with the Intranet (I had the same problem with DocToText above).</p>
<p>The version of setuptools on our servers doesn&#8217;t support JCC&#8217;s <a href="http://lucene.apache.org/pylucene/jcc/documentation/install.html#shared">shared library mode</a>. JCC dies with an error unless shared mode is explicitly disabled or the patch has been applied. I couldn&#8217;t do either of these as part of our standard deployment process, so I <a href="https://github.com/aptivate/jcc">patched JCC</a> to disable shared mode, since we don&#8217;t need it anyway. I also added some patches to allow various <code>setup.py</code> commands used by <code>pip</code> to be forwarded through JCC to the <code>setup</code> function call.</p>
<p>This seems to be enough to allow you to install JCC like this:</p>
<pre>pip install git+git://github.com/aptivate/jcc.git</pre>
<p>I also wrote a <code>setup.py</code> file that handles both pip&#8217;s command-line invocations, passing the necessary options to JCC, and JCC&#8217;s own invocation of the <code>setup</code> function. This seems to be enough to install the package using pip:</p>
<pre>pip install git+git://github.com/aptivate/python-tika.git</pre>
<p>and you can use the last parameter as a package specification in pip_packages.txt, or whatever you pass to pip -r.</p>
<p>You can find the pip-installable Tika package, complete with <a href="http://repo1.maven.org/maven2/org/apache/tika/">Tika 1.0 JAR files</a>, in our <a href="https://github.com/aptivate/python-tika">python-tika</a> repository on GitHub. This will save you the work of downloading and compiling Tika and all of its dependencies. I have started a <a href="http://mail-archives.apache.org/mod_mbox/lucene-pylucene-dev/201202.mbox/%3Calpine.DEB.2.02.1202011310060.5732%40lap-x201%3E">discussion</a> with the JCC developers about merging these changes into the upstream project.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.aptivate.org/2012/02/01/content-indexing-in-django-using-apache-tika/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Rough Guide to rural data collection with ODK</title>
		<link>http://blog.aptivate.org/2011/12/05/rough-guide-to-rural-data-collection-with-odk/</link>
		<comments>http://blog.aptivate.org/2011/12/05/rough-guide-to-rural-data-collection-with-odk/#comments</comments>
		<pubDate>Mon, 05 Dec 2011 17:58:43 +0000</pubDate>
		<dc:creator>Chris Wilson</dc:creator>
				<category><![CDATA[Appropriate Technology]]></category>
		<category><![CDATA[Database]]></category>
		<category><![CDATA[Engineer's Log]]></category>
		<category><![CDATA[Internet]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[Mobiles]]></category>
		<category><![CDATA[Open Source]]></category>
		<category><![CDATA[data collection]]></category>

		<guid isPermaLink="false">http://blog.aptivate.org/?p=986</guid>
		<description><![CDATA[This post has three purposes, which I think overlap sufficiently to combine them: A User Guide for the system that we developed for UNICEF, IDS and RuralNet Zambia A Developers&#8217; Guide for anyone wishing to build something similar Notes on lessons learned that may assist future implementers Project goals Automate the data entry part of [...]]]></description>
			<content:encoded><![CDATA[<p>This post has three purposes, which I think overlap sufficiently to combine them:</p>
<ul>
<li>A User Guide for the system that we developed for UNICEF, IDS and RuralNet Zambia</li>
<li>A Developers&#8217; Guide for anyone wishing to build something similar</li>
<li>Notes on lessons learned that may assist future implementers</li>
</ul>
<h3>Project goals</h3>
<p>Automate the data entry part of a long paper-based survey, by replacing the paper forms with electronic devices.</p>
<h3>Hardware and application selection</h3>
<p>The survey has several long and complex questions, and long sets of multiple-choice answers. The data collection needs to be done in dusty rural Zambia, and the devices might need to be used for a full day without power. Collected data should be sent wirelessly to a secure data repository at some time after collection.</p>
<p>Text entry is required for many fields. That means either a real keyboard with keys, or a sufficiently large touch screen to type comfortably on. Use of the device camera, and presentation of reports and graphs on the same device, might be required in future.</p>
<p>Two possible hardware platforms were identified:</p>
<ul>
<li>Tablet laptops with touch screens</li>
<li>Tablet mobile devices (iPad or Android tablet)</li>
</ul>
<p>We selected the latter for this project due to lower cost, lighter weight, better usability and longer battery life.</p>
<p>The available software options that we identified were:</p>
<ul>
<li>EpiSurveyor (Java J2ME, partly closed source, we have used before and fixed bugs)</li>
<li>OpenXdata (Java J2ME, open source, developed and supported by an Aptivate alumnus among others)</li>
<li>Open Data Kit (ODK) (Android, open source, active community)</li>
<li>Bespoke online/offline survey in HTML5</li>
</ul>
<p>Of these, we eliminated EpiSurveyor and OpenXdata due to lack of compatibility with the hardware platform(s) we had chosen.</p>
<p>We chose ODK over a bespoke system due to limited time available for development, and ability to easily take photos and record GPS coordinates using the device&#8217;s hardware.</p>
<p>Of the available Android tablet devices, we chose the Samsung Galaxy Tab for the pilot project, due to its high quality construction. For future projects we would probably use a lower cost device; see the lessons learned for details.</p>
<h3>Form creation</h3>
<p>Since the survey is quite long (about 230 questions) we wanted an easy way to enter the questions. The ODK application requires the form to be in XForms format. We identified the following tools for creating XForms:</p>
<ul>
<li><a href="http://opendatakit.org/use/build">ODK Build</a></li>
<li><a href="http://code.google.com/p/purcforms/">PurcForms</a></li>
<li><a href="https://sites.google.com/site/kobodesk/kobo-form-builder">Kobo Form Builder</a></li>
<li><a href="http://opendatakit.org/use/xls2xform">XLS2XForm</a> (actually the <a href="https://github.com/jbeorse/pyxform">pyxform</a> fork)</li>
</ul>
<p>We decided to use XLS2XForm, which enabled us to enter the large number of questions easily in Excel. The others all have graphical builders, which have advantages and disadvantages for less technical users:</p>
<ul>
<li>More visually appealing</li>
<li>All available options presented visually (types of controls, groups, etc.)</li>
<li>Less likely to make a mistake and produce an invalid form</li>
<li>Cumbersome user interface slows down data entry</li>
</ul>
<p>Unfortunately, none of these designers were able to import an existing form in XForms format, which means that the modifiable &#8220;source code&#8221; of the form must be maintained in a &#8220;proprietary&#8221; format in each case, and it&#8217;s difficult to switch between tools.</p>
<p>You can download the conversion tools, and the Excel spreadsheet with the completed questionnaire as we delivered it to RuralNet, <a href="https://github.com/aptivate/idspartimob">here</a>. RuralNet staff, please use the latest version of the spreadsheet that you can find locally. To use the tools, you will need to download and install <a href="http://www.python.org/getit/">Python 2.7</a> and <a href="http://www.oracle.com/technetwork/java/javase/downloads/jre-7u1-download-513652.html">Java</a> (JRE). Then download the tools <a title="ZIP download" href="https://github.com/aptivate/idspartimob/zipball/master">as a ZIP file</a> and extract it somewhere. I recommend that you keep the master copy of the spreadsheet in <a href="https://www.dropbox.com/home">Dropbox</a> to ensure that it&#8217;s backed up, and it&#8217;s always clear what the latest version is.</p>
<p>For help in building surveys using XLS2XForm, please see the <a href="http://opendatakit.org/help/form-design/xls2xform/">documentation</a>. In addition to the question types listed there, we have used the following shortcuts, which also work in this customised version of XLS2XForm:</p>
<ul>
<li><code>text</code> is short for <code>add text prompt</code> (a text field, such as a person&#8217;s name)</li>
<li><code>note</code> is short for <code>add note prompt</code> (a read-only field, providing additional information for the user)</li>
<li><code>time</code> is a time field without a date (for example, survey start and end times)</li>
</ul>
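<p>The shortcuts above amount to a simple lookup. The sketch below shows the idea only; the function name is hypothetical and this is not necessarily how the customised XLS2XForm implements it.</p>

```python
# Shortcut question types from the list above, mapped to their long names.
# "time" is a question type in its own right, so it has no expansion.
SHORTCUTS = {
    "text": "add text prompt",
    "note": "add note prompt",
}


def expand_question_type(question_type):
    """Return the long form of a shortcut; leave other types unchanged."""
    cleaned = question_type.strip()
    return SHORTCUTS.get(cleaned, cleaned)
```

<p>Stripping whitespace first guards against stray spaces typed into the spreadsheet cell.</p>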
<p>To compile the spreadsheet into an XForms form, run the <code>build_and_validate.py</code> script by double-clicking on it. If it works, it will show the message &#8220;Success!&#8221;, otherwise it will show an error message, usually caused by a mistake in the Excel spreadsheet. If it works, it will create (replace) the file called <code>zambia-ranq-round3.xml</code> in the same directory. If your spreadsheet has a different name, you can create a shortcut to call <code>build_and_validate_custom.py</code> with the name of the spreadsheet on the command line.</p>
<h3>Software components</h3>
<p><a href="http://opendatakit.org/use/aggregate/">ODK Aggregate</a> is the software that powers the Internet server. It is a repository for blank forms (designs) and completed forms (data). Our server is located at <a href="http://partimob.appspot.com/">http://partimob.appspot.com/</a>. This server is currently paid for by us, and will need to transfer to RuralNet at some point.</p>
<p><a href="http://opendatakit.org/use/collect/">ODK Collect</a> is the application that runs on the device, and users interact with it to complete the survey. It&#8217;s essentially a user interface for XForms. It can download blank forms (designs) from an ODK Aggregate server, and upload completed forms (data) to the Aggregate server as well.</p>
<p><a href="http://code.google.com/p/opendatakit/wiki/ODKBriefcase">ODK Briefcase</a> is the software that downloads completed forms (data) from the Aggregate server and converts them into CSV (spreadsheet) format, which can be loaded into</p>
<h3>Customised ODK Collect</h3>
<p>We are using a custom version of ODK Collect. You can download the source code for it <a href="http://code.google.com/r/chris-collect/">here</a>, or the compiled application <a href="https://github.com/aptivate/idspartimob/blob/master/ODK-Collect-trunk-111119-custom.apk">here</a>. You can also find it in the ZIP file download. If you prefer, you can use the <a href="http://opendatakit.org/use/collect/">latest official version of ODK Collect</a>. The two are compatible, but our version adds the following useful features:</p>
<ul>
<li>Use supplied login and password by default to save a round trip and a prompt.</li>
<li>Add keyboard navigation, useful for form filling on android-x86 because the mouse interface is pretty clunky.</li>
<li>Restore ability to modify completed and submitted forms on the device, which was removed from the official version in 1.1.7.</li>
<li>Improved error messages and progress indication during form uploads.</li>
<li>Allow setting the instance name on the first page of the survey.</li>
<li>Allow saving incomplete surveys on required questions (in case a survey is interrupted; almost all of our questions are required).</li>
</ul>
<p>There are several ways to install ODK Collect on a device:</p>
<ul>
<li>Download it from the Android Market (official version only, not our customised version)</li>
<li>Copy the APK file onto a microSD card, insert the card into the device, and use the <em>My Files</em> application to find and open it from the SD card.</li>
<li>Attach the USB cable from the device to a computer, enable mass storage mode on the device, and on the computer, drag and drop the APK file onto the device&#8217;s internal memory, then use the <em>My Files</em> application to find and open it.</li>
<li>Attach the USB cable from the device to a computer, and use <a href="http://developer.android.com/guide/developing/tools/adb.html">ADB</a>&#8217;s <code>install</code> command to install the APK file.</li>
</ul>
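<p>If you have many devices to set up, the ADB route is the easiest one to script. This is a minimal sketch, assuming that <code>adb</code> is on your PATH and that one device is attached; the function names are hypothetical.</p>

```python
import subprocess


def adb_install_command(apk_path, adb="adb"):
    """Build the argument list for ADB's install command."""
    return [adb, "install", apk_path]


def install_apk(apk_path):
    """Install the APK on the attached device.

    Raises subprocess.CalledProcessError if adb exits with an error status.
    """
    subprocess.check_call(adb_install_command(apk_path))
```

<p>For our customised version you would call <code>install_apk("ODK-Collect-trunk-111119-custom.apk")</code> from the directory where you extracted the ZIP file.</p>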
<p>It&#8217;s useful to put the application onto the device&#8217;s desktop. To do that, open the Applications list, find ODK Collect, and press and hold it with your finger for a few seconds. The background will change to the desktop; release your finger to drop the application there.</p>
<p>It&#8217;s also useful to remove all the other junk from the desktop. For each icon and widget on the desktop, press and hold it with your finger for a few seconds, until the trashcan icon appears, then drag your finger to the trashcan and release it there.</p>
<h3>Form management on the device</h3>
<p>There are several ways to put blank forms (designs) onto the tablets:</p>
<ul>
<li>Download them from the ODK Aggregate server using ODK Collect.</li>
<li>Copy them onto a microSD card, insert the card into the device, and use the <em>My Files</em> application to copy them from the SD card to the /sdcard/odk/forms directory.</li>
<li>Attach the USB cable from the device to a computer, enable mass storage mode on the device, and on the computer, drag and drop the form into the /sdcard/odk/forms directory.</li>
<li>Attach the USB cable from the device to a computer, and use <a href="http://developer.android.com/guide/developing/tools/adb.html">ADB</a> or <a href="http://developer.android.com/guide/developing/debugging/ddms.html">DDMS</a> to push the file onto the device, into the /sdcard/odk/forms directory.</li>
</ul>
<p>Of these methods, ADB or DDMS is recommended for rapid development, and using the Aggregate server is recommended for production use, since the form must be installed on the Aggregate server for it to be able to accept submissions.</p>
<p>Similarly there are several ways to copy completed forms (data) off the device:</p>
<ul>
<li>Upload them to the ODK Aggregate server using ODK Collect.</li>
<li>Use the <em>My Files</em> application to copy them from /sdcard/odk/instances to a microSD card, then remove the card and connect it to the computer, and drop the files into the ODK Briefcase data directory.</li>
<li>Attach the USB cable from the device to a computer, enable mass storage mode on the device, and on the computer, drag and drop the files from the /sdcard/odk/instances directory to the ODK Briefcase data directory.</li>
<li>Attach the USB cable from the device to a computer, and use <a href="http://developer.android.com/guide/developing/tools/adb.html">ADB</a> or <a href="http://developer.android.com/guide/developing/debugging/ddms.html">DDMS</a> to pull the file from the device&#8217;s /sdcard/odk/instances directory to the ODK Briefcase data directory.</li>
</ul>
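<p>The ADB options for both directions come down to a <code>push</code> and a <code>pull</code>. The sketch below builds those commands, assuming that <code>adb</code> is on your PATH; the function names are hypothetical, and the Briefcase data directory is whatever you pass in.</p>

```python
import os
import posixpath
import subprocess

FORMS_DIR = "/sdcard/odk/forms"
INSTANCES_DIR = "/sdcard/odk/instances"


def push_form_command(local_form, adb="adb"):
    """Build the command that copies a blank form onto the device."""
    remote = posixpath.join(FORMS_DIR, os.path.basename(local_form))
    return [adb, "push", local_form, remote]


def pull_instances_command(briefcase_dir, adb="adb"):
    """Build the command that copies all completed forms off the device."""
    return [adb, "pull", INSTANCES_DIR, briefcase_dir]


def run(command):
    """Run an adb command, raising CalledProcessError if it fails."""
    subprocess.check_call(command)
```

<p>For example, <code>run(push_form_command("zambia-ranq-round3.xml"))</code> puts the compiled form where ODK Collect will find it.</p>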
<p>Of these methods, using ODK Aggregate is recommended for development and production use.</p>
<p>Since the Aggregate server is on the Internet, this method requires that the device have Internet access. The device therefore needs either a valid SIM card with credit and a data bundle, or a connection to a WiFi network. We had many problems with using SIM cards for data, so WiFi is preferred if possible.</p>
<p>The directories mentioned above will not exist until ODK Collect is installed on the device and run for the first time. Forms downloaded from the Aggregate server will also be placed in the /sdcard/odk/forms directory. Forms completed on the device will be placed in the /sdcard/odk/instances directory.</p>
<h3>Configuring ODK Collect</h3>
<p>Collect needs to know the details of the ODK Aggregate server to log into it, download blank forms and upload completed forms.</p>
<p>Open the ODK Collect application, press the Settings button and click on <em>Change Settings</em>. Click on <em>URL</em> and enter <em>https://partimob.appspot.com</em>. Similarly, complete the Username and Password using the details that you&#8217;ve been given by the Aggregate server operator, or the account that you&#8217;ve created on the Aggregate server. This account should only have <em>Data Collector</em> permissions, no more. Press the Back key to get back to the main menu of ODK Collect.</p>
<h3>Downloading forms using ODK Collect</h3>
<p>Open ODK Collect on the device, and click on the <em>Get Blank Form</em> button. Collect will try to log into the Aggregate server using the details that you&#8217;ve provided, and get a list of forms on the server that have the <em>Downloadable</em> box ticked. This is on by default for newly uploaded forms.</p>
<p>Tick the box next to all the forms that you want to download, and click on the <em>Get Selected</em> button.</p>
<h3>Filling forms on the device</h3>
<p>Open ODK Collect on the device, and click on the <em>Fill Blank Form</em> button. All the forms in the device&#8217;s <em>/sdcard/odk/forms</em> directory should be listed. Choose the form that you want to complete.</p>
<p>You will see an introductory screen showing how to move between questions by swiping your finger across the screen, from right to left or left to right. This screen has a text box at the bottom, which you can use to name the form that you&#8217;re completing. Naming forms is useful if your data collection is interrupted and you need to resume it later. It&#8217;s much easier to identify the form using its name, rather than opening it and flicking through to find some identifying information. You might name the form based on the household code that you&#8217;re surveying.</p>
<p>Depending on your answers to some questions, others may be hidden, or their text might change.</p>
<p>At the end of the form there is another chance to <em>Name this form</em>, and a tickbox to <em>Mark form as finalized</em>. Before you can upload the form to the Aggregate server, this box must be ticked, and you must press the <em>Save Form and Exit</em> button. Otherwise Collect will consider that the form is incomplete.</p>
<h3>Sending completed forms to Aggregate</h3>
<p>Open ODK Collect on the device, and click on the <em>Send Finalized Form</em> button on the main menu. Tick the box next to all the forms that you want to upload to Aggregate, and click on <em>Send Selected</em>. After the upload is complete, you should see the <em>Upload Results</em> message. Every form should have &#8220;Success&#8221; next to it, otherwise it was not sent successfully.</p>
<h3>Downloading forms using Briefcase</h3>
<p>We are using a customised version of ODK Briefcase with the following changes:</p>
<ul>
<li>Fix the export of repeated groups, which before only worked for the first row (<a href="http://code.google.com/p/opendatakit/issues/detail?id=461">issue 461</a>).</li>
<li>Shorten exported column names, to allow the CSV file to be imported into Access.</li>
<li>Allow the server name, username and password to be provided on the command line (or via a shortcut).</li>
</ul>
<p>You can find the source code <a href="http://code.google.com/r/chris-briefcase/source/checkout">here</a> and the pre-compiled version <a href="https://github.com/aptivate/idspartimob/blob/master/briefcase-1.0-jar-with-dependencies.jar?raw=true">here</a>, as an executable JAR file. You can also find it in the ZIP file download. If you make changes to the source and want to build the executable JAR again, install Maven and use the <code>mvn package</code> command.</p>
<p>To download the completed forms, open Briefcase by double-clicking on the <code>briefcase-1.0-jar-with-dependencies.jar</code> file. On the Transfer tab, click on the Connect button. For the URL, enter <code>https://partimob.appspot.com</code>, and for the user name and password, give the details of an ODK Aggregate account with <em>Data Viewer</em> permissions.</p>
<p>Then you should see a list of forms appear under the heading <em>Forms to Transfer</em>. Tick the box next to the one that your users have been completing, and then click on the Transfer button. If you do this after all the completed forms (data) have been submitted to the ODK Aggregate server, you will not need to do it again for that form template (design).</p>
<p>Now switch to the <em>Transform</em> tab and see if the form appears in the <em>Form</em> list. If it doesn&#8217;t, then exit and restart the Briefcase application (<a href="http://code.google.com/p/opendatakit/issues/detail?id=464">issue 464</a>).</p>
<p>For <em>Output Type</em>, choose <em>.csv and media files</em>. For <em>Output Directory</em>, choose the directory where you&#8217;d like to save the CSV files. Note that any previous files exported to that directory from the same form will be overwritten without warning, even if they have been modified (cleaned). Click on the <em>Output</em> button to write the CSV files.</p>
<h3>Cleaning data in Excel</h3>
<p>You can find the Excel spreadsheet that we use for data storage and cleaning here. Note that Excel is a long way from the best way to store and manipulate data like this. Microsoft Access would be far more appropriate. Yet again I wish there was a sufficiently powerful open source alternative.</p>
<p>Because the spreadsheet contains cleaned data, which is &#8220;better&#8221; than the raw data which is included in the CSV export, we don&#8217;t want to overwrite existing rows. For the main section of the questionnaire (the so-called Single Responses) you can include only the new data like this:</p>
<ul>
<li>Open the main spreadsheet and switch to the <em>Single Responses</em> tab</li>
<li>Highlight all rows from 3 down to the bottom, and <em>Sort</em> them by the <em>SubmissionDate</em> column.</li>
<li>Note the last submission date on this spreadsheet.</li>
<li>Open the newly exported CSV file for the single responses (something like <em>RANQ-2011-Round-4-v5.csv</em>).</li>
<li>Sort this file by the <em>SubmissionDate</em> column as well.</li>
<li>Highlight and copy all the rows whose submission date is later (more recent) than the last one in the main spreadsheet.</li>
<li>Paste them at the bottom of the Single Responses tab of the main spreadsheet, below the other data.</li>
</ul>
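<p>The manual steps above could be scripted along the following lines. This is a sketch under assumptions, not part of the delivered tools: the date format is a guess that you would need to adjust to match the real export, and the <code>HouseholdCode</code> column and sample rows are made up for the example.</p>

```python
import csv
import io
from datetime import datetime

# Assumption: adjust this to the date format used in the exported CSV file.
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_date(value):
    return datetime.strptime(value, DATE_FORMAT)


def rows_after(csv_file, last_date, column="SubmissionDate"):
    """Return the rows submitted after last_date, oldest first."""
    reader = csv.DictReader(csv_file)
    newer = [row for row in reader if parse_date(row[column]) > last_date]
    return sorted(newer, key=lambda row: parse_date(row[column]))


# Made-up export with three submissions, not in date order.
sample_export = io.StringIO(
    "SubmissionDate,HouseholdCode\n"
    "2011-11-20 10:00:00,A1\n"
    "2011-11-22 09:30:00,B2\n"
    "2011-11-21 08:00:00,C3\n"
)

# The last submission date already present in the main spreadsheet.
last_seen = datetime(2011, 11, 20, 12, 0, 0)
new_rows = rows_after(sample_export, last_seen)
```

<p>Only the rows in <code>new_rows</code> would then be pasted below the existing data, so the cleaned rows are never overwritten.</p>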
<p>For the other tables, this process needs to be done completely manually at present.</p>
<p>You can then check and clean the data by viewing and modifying it in Excel. Note that each sheet has one or two columns at the end, which are filled by formulae that look up values from the Single Responses sheet, such as the <em>Household Code</em>.</p>
<h3>Using the Android x86 Emulator</h3>
<p>To be written.</p>
<h3>Lessons learned</h3>
<p>To be written.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.aptivate.org/2011/12/05/rough-guide-to-rural-data-collection-with-odk/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Embedding jinja2 templates in Django templates</title>
		<link>http://blog.aptivate.org/2011/11/15/embedding-jinja2-templates-in-django-templates/</link>
		<comments>http://blog.aptivate.org/2011/11/15/embedding-jinja2-templates-in-django-templates/#comments</comments>
		<pubDate>Tue, 15 Nov 2011 10:57:38 +0000</pubDate>
		<dc:creator>Martin Burchell</dc:creator>
				<category><![CDATA[Django]]></category>

		<guid isPermaLink="false">http://blog.aptivate.org/?p=910</guid>
		<description><![CDATA[We recently integrated the Askbot forum into the Django-based websites we developed for the RIMI4AC Project. Askbot uses the Jinja2 templating language but this was incompatible with the standard Django templates we had used up to this point. Here&#8217;s how we solved the problem. When we were asked to recommend a forum to be integrated [...]]]></description>
			<content:encoded><![CDATA[<p>
We recently integrated the <a href="http://askbot.org">Askbot</a> forum into the <a href="https://www.djangoproject.com/">Django</a>-based websites we developed for the <a href="http://www.rimi4ac.net/en/">RIMI4AC Project</a>. Askbot uses the <a href="http://jinja.pocoo.org/docs/">Jinja2</a> templating language but this was incompatible with the standard Django templates we had used up to this point. Here&#8217;s how we solved the problem.
</p>
<p>
When we were asked to recommend a forum to be integrated into the suite of websites we were developing for the RIMI4AC Project, Askbot was the clear favourite, due to its large feature set, ease of customisation, active development team and wide user base. The only drawback was that the templating engine used by Askbot was Jinja2, which would make it difficult for Askbot to be embedded into the websites. Up until then these websites had been developed using standard Django templates.
</p>
<p>
We came up with the following options to solve this problem:
</p>
<h2>Create an Askbot skin using Jinja2, which would mimic our existing templates</h2>
<p>
This would be easy to implement but would incur high maintenance costs, as any changes to the standard Django templates would also need to be made to the Jinja2 templates. This duplication could possibly be scripted to make it easier.
</p>
<h2>Embed Askbot in an iframe</h2>
<p>
Again this would be simple to implement, but iframes introduce problems of their own with navigation and rendering.
</p>
<h2>Rewrite Askbot to use Django templates</h2>
<p>
This would be a lot of work and as we would effectively be forking Askbot we would incur the costs of maintaining our own version.
</p>
<h2>Rewrite the rest of our websites to use Jinja2 templates</h2>
<p>
This would be quite a bit of work and any new components we wanted to integrate into our websites would also need to use Jinja2.
</p>
<h2>Choose a different forum that used Django templates</h2>
<p>
We really didn&#8217;t want to do this as we had had good reasons for choosing Askbot.
</p>
<h2>Some way of rendering Django templates from within Jinja2</h2>
<p>
Although Jinja2 supports extensions and we could possibly have written one to render Django templates, this seemed to be the opposite of what we wanted &#8211; to embed Askbot as a component of our website and not the other way around.
</p>
<h2>Some way of rendering Jinja2 templates from within Django templates</h2>
<p>
This looked like the most promising solution. Fortunately the Askbot developers were good enough to name their views, which meant that we could provide our own urls.py with views of the same name and then any reverse() lookups within Askbot would just work. Any view that rendered into a Jinja2 template by calling the function render_into_skin() would be replaced with this wrapper function:
</p>
<pre>
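from django.shortcuts import render_to_response
from django.template import RequestContext

def render_jinja2_into_django_template(request, jinja2_view, *args, **kwargs):
    # Call the original Askbot view, which renders its Jinja2 template
    response = jinja2_view(request, *args, **kwargs)

    # Pass the rendered HTML to our own Django container template
    return render_to_response("forum_container.html",
        {'forum_content': response.content},
        context_instance=RequestContext(request))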
</pre>
<p>
The wrapper function would call the original Jinja2 view function and we could pull the raw content out of the returned response object. This would go into the forum_content variable that would be passed to our own Django template and simply written out from within there.
</p>
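<p>
The wrapping idea can be sketched without Django at all. The following is an illustration only: <code>FakeResponse</code>, <code>askbot_questions</code>, <code>wrap_view</code>, the <code>VIEWS</code> table and the plain-text container are all hypothetical stand-ins, and <code>string.Template</code> takes the place of the real Django container template.
</p>

```python
from string import Template

# Hypothetical stand-in for forum_container.html; the real one is a Django template.
CONTAINER = Template("[site header]\n$forum_content\n[site footer]")


class FakeResponse:
    """Minimal stand-in for an HTTP response object with a .content attribute."""

    def __init__(self, content):
        self.content = content


def askbot_questions(request):
    """Hypothetical inner view: pretends to render a Jinja2 template."""
    return FakeResponse("forum markup for " + request)


def wrap_view(inner_view):
    """Return a view that embeds the inner view's output in the container."""

    def wrapped(request, *args, **kwargs):
        response = inner_view(request, *args, **kwargs)
        page = CONTAINER.substitute(forum_content=response.content)
        return FakeResponse(page)

    return wrapped


# Registering the wrapped view under the same name as the original is what
# keeps lookups by view name working in the real setup.
VIEWS = {"questions": wrap_view(askbot_questions)}

page = VIEWS["questions"]("/forum/").content
print(page)
```

<p>
The inner view never needs to know it is being embedded, which is why this approach avoids forking the forum code.
</p>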
<p>
Because we wanted Askbot to appear as a component within a page rather than its own standalone application, we would also need to remove the headers and footers from the Askbot Jinja2 templates. Askbot&#8217;s skin customisation made this straightforward. The Django container template would need to include any stylesheets or scripts that we had removed from the headers.
</p>
<p>
With some fixes to the CSS we had successfully embedded Askbot in our website. There would be some maintenance costs in that any future changes to the Askbot views would need to be reflected in our own Askbot views, but this would be far preferable to having to maintain our own set of templates.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.aptivate.org/2011/11/15/embedding-jinja2-templates-in-django-templates/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>How can a $35 tablet computer change the world?</title>
		<link>http://blog.aptivate.org/2011/10/21/how-can-a-35-tablet-computer-change-the-world/</link>
		<comments>http://blog.aptivate.org/2011/10/21/how-can-a-35-tablet-computer-change-the-world/#comments</comments>
		<pubDate>Fri, 21 Oct 2011 22:38:51 +0000</pubDate>
		<dc:creator>Chris Wilson</dc:creator>
				<category><![CDATA[Africa]]></category>
		<category><![CDATA[Appropriate Technology]]></category>
		<category><![CDATA[Education]]></category>
		<category><![CDATA[India]]></category>
		<category><![CDATA[Internet]]></category>
		<category><![CDATA[Mobiles]]></category>
		<category><![CDATA[Networking]]></category>
		<category><![CDATA[Teaching]]></category>
		<category><![CDATA[Web]]></category>

		<guid isPermaLink="false">http://blog.aptivate.org/?p=966</guid>
		<description><![CDATA[Osama Manzar poses some very interesting questions about India&#8217;s new $35 tablet computer &#8220;for the poor&#8221;. However he doesn&#8217;t attempt to answer these questions, leaving the reader in no doubt that he thinks the answer is No! in all cases. I must admit to being skeptical about any such innovation, and I&#8217;ve been listening to [...]]]></description>
			<content:encoded><![CDATA[<p>Osama Manzar poses some <a href="http://www.livemint.com/Articles/2011/10/17000845/Aakash-may-not-help-bridge-the.html">very interesting questions</a> about India&#8217;s new $35 tablet computer &#8220;for the poor&#8221;. However he doesn&#8217;t attempt to answer these questions, leaving the reader in no doubt that he thinks the answer is <strong>No!</strong> in all cases.</p>
<p>I must admit to being skeptical about any such innovation, and I&#8217;ve been listening to <a href="http://tech.groups.yahoo.com/group/bytesforall_readers/message/14910">both</a> <a href="http://tech.groups.yahoo.com/group/bytesforall_readers/message/14921">sides</a> of the <a href="http://tech.groups.yahoo.com/group/bytesforall_readers/message/14929">debate</a> on the <a href="http://tech.groups.yahoo.com/group/bytesforall_readers/">BytesForAll</a> mailing list. Despite my skepticism, Osama&#8217;s questions have some answers, and I&#8217;d like to present them for comment.</p>
<blockquote>
<ul>
<li>India has one of the lowest ratio of teachers—just 456 teachers per million people.</li>
<li>Seventy-two percent of our primary schools have only three teachers or less.</li>
<li>25% of teachers were absent from school, and only about half were teaching, during unannounced visits to a nationally representative sample of government primary schools.</li>
</ul>
<p>How is the $35 tablet going to solve any of these problems?
</p></blockquote>
<p>Of course technology on its own is not going to solve these problems. It is just a valuable weapon in the armoury of those who would launch an all-out war on poverty (and other abstract nouns).</p>
<p>Kentaro Toyama, an ex-Microsoft guru turned ICT4D researcher, <a href="https://plus.google.com/113254845719530983612/posts/gtju48L8bYq">says</a> that &#8220;technology is [just] an amplifier of human intent and capacity.&#8221; And when faced with a task that&#8217;s possible but simply too large, an amplifier is exactly what we need. It doesn&#8217;t need to be high tech. Tanzania did just fine with radio, one of the oldest, simplest and most inclusive ICTs:</p>
<blockquote><p>
About ten years after independence, Tanzania decided to move towards universal primary education, almost doubling the number of children in school. The government estimated that it needed an extra 40,000 teachers. As the existing training colleges were producing only 5,000 new teachers a year, it was decided to recruit secondary school leavers and train them on an apprenticeship model, partly on the job and partly through distance education. Over a period of three years, they were posted in schools where they had a reduced teaching load. They then followed correspondence courses backed by radio programmes; they were supervised and tested on their classroom practices, and passed their examinations. Two evaluations found that they ended up reasonably competent in the classroom (Chale, 1993; quoted by Perranton, 2000; retrieved from <a href="http://archive.unu.edu/africa/files/UNU_RevitalizingHigherEducation.pdf">UNU</a>)
</p></blockquote>
<p>If India were to launch a massive teacher education programme, they would find it cheaper to implement that programme using technology. For example, they might distribute radios, TVs, portable audio players or even (heaven forbid!) computers to trainee teachers. It might take longer for those teachers to reach high standards, and more might drop out, without the personal connection and feedback of face-to-face training. Even so, one could train more teachers for more time and achieve a similar number of fully trained teachers at a lower cost.</p>
<blockquote><p>
In the business sector, more than 70% micro, small and medium enterprises (MSMEs) are not connected to information society to leverage opportunities of business and efficiency. How will the $35 tablet help in the financial inclusion of MSMEs, which are largely situated in small towns and remote areas?
</p></blockquote>
<p>It&#8217;s unfortunate that the tablet doesn&#8217;t include a long-range wireless network (such as GPRS), which must surely cover most of India as it does Africa. Even without an Internet connection, it can still provide useful services such as record keeping, business accounting and stock tracking to small enterprises. The tablet is based on Android, but the marketplace has been disabled, and this is a serious limitation. I think it&#8217;s likely to be overcome soon. When that happens, India&#8217;s many skilled software developers will be free to create localised applications for a potentially huge local market.</p>
<blockquote><p>Most of India’s 3.3 million non-governmental organisations (NGOs) are also located in remote areas—70% of them lack any sort of information and communication technology (ICT) infrastructure or connectivity, and have no websites.</p>
<p>How can the $35 tablet help these NGOs’ global outreach efforts or aid the millions of people working with them in rural areas?</p></blockquote>
<p>You probably know the answer to this question as well as I do: <em>The same way as computers and phones can, only more so.</em> Helping people to communicate and to do their work is exactly what ICTs do. All of them. With the possible exception of Angry Birds. A computer can help us to make leaflets, track visits to patients and beneficiaries, diagnose illnesses, improve farming techniques, or learn about anything we wish to know in the whole world of knowledge.</p>
<blockquote><p>Can it bring transparency in governance at this level?</p></blockquote>
<p>Good question. Not by itself, sure. Transparency comes from open data. The people might get together to publish what the government would rather hide, or pressure the government to release the data, but a $35 tablet won&#8217;t help them much.</p>
<p>When they do release that data, however, the usual problem is how to make use of it. Government data tends to be massive and unwieldy, and answering difficult questions takes much time and significant skill even with the best of data. I think that free, open, widecast media provide the biggest opportunity to make real use of transparency, and our use of the Internet as an enabler of democracy is the best example of that.  Potentially, a simple but powerful Internet device could help bring people together to investigate and answer those difficult questions. But by the sound of it, this tablet is not quite there yet. Hopefully it will be soon.</p>
<blockquote><p>
Since a large population of our country communicate verbally, and cannot read and write with ease, their preferred medium of content consumption and content production is audio-visual&#8230; But to make use of good multimedia content, you need powerful machines, not cheap and underperforming ones.
</p></blockquote>
<p>I disagree with that. I grew up with &#8220;multimedia content&#8221; on BBC Micros: simple games, moving blocks around a screen, simple word processors and spreadsheets and databases and graphics. A picture is worth a thousand words, and a simple, clear diagram can be worth far more than a complex, confusing one. Advanced graphics are no substitute for a visual designer&#8217;s ingenuity and skill. Wikipedia is &#8220;multimedia content&#8221; that is perfectly suited to a $35 tablet.</p>
<blockquote><p>
If the $35 tablet can do anything good to education in India, the only way is by handing them to each and every teacher and school management staff to monitor the workings and functioning of the school and its teachers&#8230;
</p></blockquote>
<p>Monitoring is an interesting application, and a double-edged sword. <a href="http://www.ids.ac.uk/go/idsperson/professor-robert-chambers">Robert Chambers</a>, the inventor of <a href="http://en.wikipedia.org/wiki/Participatory_rural_appraisal">participatory rural appraisal</a>, told us a story at the recent <a href="http://ict4d-finale.eventbrite.com/">ICT4D Finale</a> event in Cambridge of a hospital in India where the nurses were given mobile phones &#8220;to collect data at the source.&#8221; But the director of the hospital used them to monitor what the nurses were doing, effectively spying on them. The nurses went on strike and eventually the director was fired. I think that for monitoring to have a positive benefit, it must be done with consent and a shared vision to use the data to improve performance, not to criticise and control.</p>
<blockquote><p>rather than assuming that each student will buy Aakash and India will become digitally literate overnight.</p></blockquote>
<p>I have to agree with that sentiment, although I&#8217;m not sure who raised it. Kapil Sibal, who takes the credit for inventing the $35 tablet, merely <a href="http://www.telegraphindia.com/1111016/jsp/7days/story_14628545.jsp">said</a>:</p>
<blockquote><p>This low cost device is part of our national mission on education through information and communication technology (NME-ICT) which will connect over 1,000 institutions across the country, enabling tonnes of web-based course content for free.</p></blockquote>
<p>Now that doesn&#8217;t sound so far-fetched, does it?</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.aptivate.org/2011/10/21/how-can-a-35-tablet-computer-change-the-world/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Jazz Talking: The Agile &amp; Participation Event</title>
		<link>http://blog.aptivate.org/2011/09/28/jazz-talking-the-agile-participation-event/</link>
		<comments>http://blog.aptivate.org/2011/09/28/jazz-talking-the-agile-participation-event/#comments</comments>
		<pubDate>Wed, 28 Sep 2011 22:50:40 +0000</pubDate>
		<dc:creator>alan</dc:creator>
				<category><![CDATA[agile]]></category>
		<category><![CDATA[participation]]></category>
		<category><![CDATA[Alistair Cockburn]]></category>
		<category><![CDATA[Jazz talk]]></category>
		<category><![CDATA[Robert Chambers]]></category>

		<guid isPermaLink="false">http://blog.aptivate.org/?p=952</guid>
		<description><![CDATA[A format for a multi-disciplinary conversation - two experts, on a sofa, in front of an audience. ]]></description>
			<content:encoded><![CDATA[<p><a rel="attachment wp-att-955" href="http://blog.aptivate.org/2011/09/28/jazz-talking-the-agile-participation-event/robert_and_alistair_360x216/"><img class="alignnone size-full wp-image-955" title="Robert and Alistair" src="http://blog.aptivate.org/wp-content/uploads/2011/09/robert_and_alistair_360x216.jpg" alt="Robert and Alistair" width="360" height="216" /></a></p>
<p>For a while I&#8217;ve felt that the Agile methodologies from the software development world share a similar outlook to the Participatory methodologies from the international development world.</p>
<p>So we came up with an idea for an event. Wouldn&#8217;t it be great to get an expert from each discipline and have them talk to each other, in front of an audience?</p>
<p>Last night, thanks to support from the <a href="http://www.humanitariancentre.org/">Humanitarian Centre</a>, and our two esteemed guests, our idea became <a href="http://ict4d-finale.eventbrite.com/">reality</a>.</p>
<p>Alistair Cockburn, Agile guru, sat on a sofa next to Robert Chambers, expert on Participatory approaches, in front of an audience.</p>
<p>I thought it was fantastic and we&#8217;ve had a lot of positive feedback about the event. It was so good, I found myself afterwards wondering if this is in general a good format for an event.</p>
<p>So I wanted to write a post about the form of the event, rather than the content.</p>
<p>After the event I was chatting with Alistair and he&#8217;d already been thinking along similar lines. We called it a <strong>&#8220;Jazz Talk&#8221;</strong>. We were drawing an analogy with two jazz musicians improvising.</p>
<h2>Jazz Talks</h2>
<p>Here&#8217;s the format -</p>
<p><strong>1)</strong> Get <strong>two affable speakers</strong> from different disciplines<br />
<strong>2)</strong> Sit them on a <strong>sofa</strong> in front of an audience<br />
<strong>3)</strong> Let them talk about the relationship between their disciplines<br />
<strong>4)</strong> Periodically interrupt them with <strong>&#8220;Kibitzers&#8221;</strong></p>
<h2>Kibitzer</h2>
<p>A &#8220;kibitzer&#8221; is a person who comments on the conversation.</p>
<p>&#8220;Kibitzer&#8221; was a term Alistair came up with. I had to look it up: literally, it means an observer of a card game who gives (unwanted) commentary.</p>
<p>There are two types of kibitzer. A &#8220;content kibitzer&#8221; comments on the content of the conversation. In the event last night I played the role of one of the kibitzers and asked the question <em>&#8220;How do we get funders to engage with agile / participatory proposals?&#8221;</em>. All of our kibitzers last night were content kibitzers.</p>
<p>Talking to Alistair afterwards, he was keen to push the idea of a &#8220;form kibitzer&#8221;. This is someone who gives a commentary on the form of the conversation, not the subject matter. For instance, <em>&#8220;I liked how speaker-A extended speaker-B&#8217;s questions to the audience&#8221;</em>, or <em>&#8220;Can we hear more from speaker-A?&#8221;</em>. I think form kibitzing is less natural but likely to be shorter. It also potentially plays a facilitatory role in guiding the conversation and could help address issues like one speaker dominating the conversation.</p>
<p>Perhaps a mix of both types could work. Each commentary would start with a short form kibitz followed by a content kibitz.</p>
<h2>Timing</h2>
<p>Here&#8217;s a suggested recipe:</p>
<ul>
<li>a <strong>90 minute conversation</strong></li>
<li><strong>kibitzing</strong> every <strong>15 minutes</strong> (e.g. 5 interruptions)</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://blog.aptivate.org/2011/09/28/jazz-talking-the-agile-participation-event/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Computers in Schools: Sound solutions</title>
		<link>http://blog.aptivate.org/2011/09/05/computers-in-schools-sound-solutions/</link>
		<comments>http://blog.aptivate.org/2011/09/05/computers-in-schools-sound-solutions/#comments</comments>
		<pubDate>Mon, 05 Sep 2011 11:55:50 +0000</pubDate>
		<dc:creator>Chris Wilson</dc:creator>
				<category><![CDATA[Africa]]></category>
		<category><![CDATA[Appropriate Technology]]></category>
		<category><![CDATA[Hardware]]></category>
		<category><![CDATA[Teaching]]></category>

		<guid isPermaLink="false">http://blog.aptivate.org/?p=928</guid>
		<description><![CDATA[Activities with sound are ideal for kids. Preferably lots of sound. Especially when it comes to teaching language, reading and writing. When you have a classroom full of children with computers, each working at their own pace on speech or language tasks, they need private sound rather than the built-in speakers of their laptops. Otherwise [...]]]></description>
			<content:encoded><![CDATA[<p>Activities with sound are ideal for kids. Preferably lots of sound. Especially when it comes to teaching language, reading and writing.</p>
<p>When you have a classroom full of children with computers, each working at their own pace on speech or language tasks, they need private sound rather than the built-in speakers of their laptops. Otherwise the cacophony would make learning much harder for all of them.</p>
<p>Headphones (or headsets) are the normal solution for language labs in UK schools. But they&#8217;re not great for use with primary school kids in a dirty, dusty environment. They&#8217;re extremely fragile, hard to clean, uncomfortable to wear for long periods, and can spread ear infections.</p>
<p>A Bluetooth headset would work, and would be nice in theory, but it would be much more expensive and would need charging often.</p>
<p>The most obvious solution seems to be something that looks like a mobile phone, but attaches to a computer with a cable. They&#8217;re very hard to find. It seems that everyone wants tiny, delicate, wireless or in-ear headsets. So manufacturers don&#8217;t bother making the kind of big, clunky, bulletproof handsets I&#8217;m thinking of.</p>
<p>First, after long and fruitless searching, I discovered that what I&#8217;m looking for is actually called a handset (because you hold it in your hands) and not a headset (that fits over your head).</p>
<p>And then I found them:</p>
<p><a href="http://www.maplin.co.uk/voip-usb-handset-46145"><img style="border: 1px solid black;" title="Maplin USB Handset" src="http://images.maplin.co.uk/300/a87cu.jpg" alt="USB Handset" width="250" height="250" /></a> <a href="http://www.eastel33.com/Trimline-Phone-E-720-12.html"><img style="border: 1px solid black;" alt="USB Handset" src="http://www.eastel33.com/upload/photo/d81439a2f6c9d713f61dccec076fe3f5.jpg" title="Eastel Trimline Phone E-720" class="alignnone" width="250" height="250" /></a> <a href="http://www.gd-wholesale.com/wholesale-dir/a66c/e3227f/usb-phone-s-1.html"><img style="border: 1px solid black;" alt="Nokia-like USB Handset" src="http://www.gd-wholesale.com/userimg/66/3227i1/usb-phone-180.jpg" title="GD Wholesale UGW-141906" class="alignnone" width="250" height="250" /></a> <a href="http://www.amazon.co.uk/gp/product/B000XRMB0W/ref=cm_cr_rev_prod_img"><img style="border: 1px solid black;" class="alignnone" title="USB handset (discontinued)" src="http://ecx.images-amazon.com/images/I/21A8-ymqT-L._SL500_AA300_.jpg" alt="Slim grey USB handset" width="250" height="250" /></a></p>
<p>Unfortunately the cheapest I&#8217;ve found so far is £10 ($14) through Maplin, which is about ten times the cost of the cheap, fragile headsets we&#8217;d like to replace.</p>
<p>If you know of any others, or a cheaper bulk supplier than Maplin (such as their supplier in China) please let us know!</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.aptivate.org/2011/09/05/computers-in-schools-sound-solutions/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Ubuntu Laptops in Schools</title>
		<link>http://blog.aptivate.org/2011/09/01/ubuntu-laptops-in-schools/</link>
		<comments>http://blog.aptivate.org/2011/09/01/ubuntu-laptops-in-schools/#comments</comments>
		<pubDate>Thu, 01 Sep 2011 08:24:10 +0000</pubDate>
		<dc:creator>Chris Wilson</dc:creator>
				<category><![CDATA[Africa]]></category>
		<category><![CDATA[Engineer's Log]]></category>
		<category><![CDATA[Linux]]></category>
		<category><![CDATA[Open Source]]></category>
		<category><![CDATA[System Administration]]></category>
		<category><![CDATA[Teaching]]></category>
		<category><![CDATA[security]]></category>

		<guid isPermaLink="false">http://blog.aptivate.org/?p=926</guid>
		<description><![CDATA[I&#8217;m currently working on a project that&#8217;s putting computers into Zambian schools to try to revolutionise education, making it more fun and interactive for kids, and reducing the problems of teacher absence. They&#8217;re using Intel Classmate style PCs, currently running Windows 7 Home Starter. I&#8217;m investigating whether Ubuntu would provide a better experience. It might [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m currently working on a project that&#8217;s putting computers into Zambian schools to try to revolutionise education, making it more fun and interactive for kids, and reducing the problems of teacher absence.</p>
<p>They&#8217;re using Intel Classmate style PCs, currently running Windows 7 Home Starter. I&#8217;m investigating whether Ubuntu would provide a better experience. It might be faster, more reliable, more manageable and easier to lock down than Windows.</p>
<p>Ubuntu 10.10 (Maverick) doesn&#8217;t boot on these computers, probably due to problems with the HPET. I don&#8217;t like Unity so I don&#8217;t want to try 11.04 just yet, which left me falling back to 10.04 (Lucid) with long-term support.</p>
<h3>Automatic Logins</h3>
<p>The computers should be in a kiosk-like mode for student use, where no login is required but they are locked down. They should also be usable by teachers (with a password and fewer restrictions) and administrators (with another password and no restrictions). So I created three user accounts. Student is set to log in by default with no password.</p>
<p>While this works, there are other places where a password is requested and none works, because the Student account doesn&#8217;t have a valid password:</p>
<ul>
<li>unlocking from screensaver</li>
<li>switching users</li>
<li>sudo from the command line</li>
</ul>
<p>The last one is less important because students should not be able to access the command line anyway, or have any administrative rights. But they need to unlock the screensaver and be able to switch users.</p>
<p>We solved the screensaver problem by telling the screensaver not to lock the screen for this user, just as we did for Camfed in the <a href="http://oer.aptivate.org/wiki/Classroom_LTSP_Configuration#Guest_User_Accounts">Zambia SRC with LTSP</a>:</p>
<pre>
# Disable locking the screen for users with no password to unlock it
sudo -u student gconftool-2 \
	--type boolean \
	--set /apps/gnome-screensaver/lock_enabled false
</pre>
<p>However, the user switching was trickier. Luckily I found a very helpful <a href="http://superuser.com/questions/119254/how-to-switch-users-without-entering-password/119813#119813">question and answer</a> on SuperUser. I improved on it slightly by reusing Ubuntu&#8217;s built-in <code>nopasswdlogin</code> group, so that users who can log in with no password can also be switched to with no password.</p>
<p>To achieve this, just add the following line at the <strong>beginning</strong> of <code>/etc/pam.d/gnome-screensaver</code>:</p>
<pre>
auth sufficient pam_succeed_if.so user ingroup nopasswdlogin
</pre>
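<p>
If this change has to be repeated on many laptops, it may be worth scripting. Here is a minimal sketch, assuming the script is allowed to rewrite the file it is given. The <code>prepend_line</code> helper is hypothetical, and the demonstration runs against a temporary file with placeholder contents rather than the real <code>/etc/pam.d/gnome-screensaver</code>.
</p>

```python
import os
import tempfile

PAM_LINE = "auth sufficient pam_succeed_if.so user ingroup nopasswdlogin"


def prepend_line(path, line):
    """Insert line at the top of the file unless it is already the first line."""
    with open(path) as handle:
        existing = handle.read().splitlines()
    if existing and existing[0].strip() == line:
        return False  # already configured, nothing to do
    with open(path, "w") as handle:
        handle.write("\n".join([line] + existing) + "\n")
    return True


# Demonstration on a throwaway file. The two lines written here are
# placeholder content, not a claim about the real Ubuntu defaults.
demo = tempfile.NamedTemporaryFile("w", suffix=".pam", delete=False)
demo.write("@include common-auth\nauth optional pam_gnome_keyring.so\n")
demo.close()

first_run = prepend_line(demo.name, PAM_LINE)
second_run = prepend_line(demo.name, PAM_LINE)
with open(demo.name) as handle:
    result = handle.read().splitlines()
os.remove(demo.name)
print(first_run, second_run, result[0])
```

<p>
Checking the first line before writing makes the script safe to run more than once, which matters because PAM evaluates the lines in order and a duplicated line would only add clutter.
</p>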
<h3>Firefox Kiosk Mode</h3>
<p>We want the browser to be fullscreen all the time, so we need to use some extensions:</p>
<ul>
<li><a href="https://addons.mozilla.org/en-US/firefox/addon/full-fullscreen/">Full Fullscreen</a> to make it start in fullscreen mode;</li>
<li><a href="http://kb.mozillazine.org/Keyconfig_extension">Keyconfig</a> to stop them exiting full screen mode with F11, or closing the browser with Alt-F4.</li>
</ul>
<p>We also change some preferences using <a href="about:config">about:config</a>:</p>
<dl>
<dt>xpinstall.enabled: false</dt>
<dd>to prevent installing more extensions;</dd>
<dt>app.update.auto: false</dt>
<dd>to stop Firefox checking for updates by itself;</dd>
<dt>browser.sessionstore.resume_from_crash: false</dt>
<dd> to prevent the <em>Restore previous session</em> prompt when starting Firefox;</dd>
<dt>extensions.update.enabled: false</dt>
<dd>to stop Firefox checking for updates to its installed extensions;</dd>
<dt>extensions.update.notifyUser: false</dt>
<dd>to avoid a prompt if an extension update is discovered;</dd>
<dt>browser.tabs.warnOnClose: false</dt>
<dd>to avoid the prompt to save your tabs on browser exit;</dd>
</dl>
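<p>
Setting these by hand in about:config on every laptop is tedious. Firefox also reads preferences from a <code>user.js</code> file in the profile directory at startup, so one option is to generate that file. This is a sketch only: the <code>render_user_js</code> helper is hypothetical, and where the Student profile directory lives is left as an assumption for you to fill in.
</p>

```python
import json

# Preferences from the list above. Booleans map to JavaScript true/false.
KIOSK_PREFS = {
    "xpinstall.enabled": False,
    "app.update.auto": False,
    "browser.sessionstore.resume_from_crash": False,
    "extensions.update.enabled": False,
    "extensions.update.notifyUser": False,
    "browser.tabs.warnOnClose": False,
}


def render_user_js(prefs):
    """Render preferences as user_pref() lines, one per preference, sorted by name."""
    lines = []
    for name in sorted(prefs):
        # json.dumps produces valid JavaScript literals for str, bool and int.
        lines.append("user_pref(%s, %s);" % (json.dumps(name), json.dumps(prefs[name])))
    return "\n".join(lines) + "\n"


user_js = render_user_js(KIOSK_PREFS)
print(user_js)
```

<p>
The printed text would then be saved as <code>user.js</code> in the profile directory of the account being locked down.
</p>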
<h3>Window Manager</h3>
<p>We want the students to have access to a restricted set of applications. The user interface also needs to be unbreakable (child-proof). Windows should always be maximised, as the laptops have quite small screens. All of this points to using a custom window manager/desktop instead of the standard Gnome or KDE.</p>
<p>Fluxbox and Openbox were recommended, but they seem to be aimed at highly-customised desktop environments (for geeks) rather than locked-down kiosks or embedded systems. <a href="http://matchbox-project.org/">Matchbox</a> looks like quite a good fit. It has a very simple front menu and an everything-maximised window manager, which sounds great for ease of use.</p>
<p>We&#8217;re using GDM for the user login, which offers users a choice of which session (window manager) to run. This is OK, and even quite good for administrators, as it provides a failsafe option in case the usual window manager is borked. But I can&#8217;t see how to disable or override this for particular users. Students have no-password logins, so they don&#8217;t even get the opportunity to choose a window manager.</p>
<p>The DefaultSession in <code>/etc/gdm/custom.conf</code> (chosen using <code>gdmsetup</code>) changes their window manager, but affects all users, and we don&#8217;t want to force everyone to use the restrictive kiosk window manager.</p>
<p>I found that GDM lets you <a href="http://projects.gnome.org/gdm/docs/2.14/configuration.html">specify your own Xsession script</a>, which GDM uses to start the session selected by the user. So I wrote a replacement:</p>
<pre>
#!/bin/sh

if [ "$USER" = "student" ]; then
	/etc/gdm/Xsession /usr/bin/matchbox-session
else
	/etc/gdm/Xsession "$@"
fi
</pre>
<p>All it does is call the original Xsession, overriding the name of the session manager if the current user is the special <code>student</code> user. For every other user it behaves exactly as normal.</p>
<p>Save it in <code>/usr/local/bin/GdmKioskSession</code>, make it executable, and add the following line to <code>/etc/gdm/custom.conf</code>:</p>
<pre>
BaseXSession=/usr/local/bin/GdmKioskSession
</pre>
<p>If you don&#8217;t even want the application menu, but want to force a particular application such as a web browser (true kiosk mode), replace <code>/usr/bin/matchbox-session</code> with <code>/usr/local/bin/kiosk-session</code>, create that file with the following contents and make it executable:</p>
<pre>
#!/bin/sh
matchbox-window-manager -use_titlebar no &#038;
exec /usr/bin/chromium-browser -kiosk -app=http://staging.ischool.zm/
</pre>
<p>More lockdown tips to follow.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.aptivate.org/2011/09/01/ubuntu-laptops-in-schools/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Traffic shaping with PF, ALTQ and HFSC</title>
		<link>http://blog.aptivate.org/2011/08/05/traffic-shaping-with-pf-altq-and-hfsc/</link>
		<comments>http://blog.aptivate.org/2011/08/05/traffic-shaping-with-pf-altq-and-hfsc/#comments</comments>
		<pubDate>Fri, 05 Aug 2011 11:00:32 +0000</pubDate>
		<dc:creator>Chris Wilson</dc:creator>
				<category><![CDATA[Engineer's Log]]></category>
		<category><![CDATA[Internet]]></category>
		<category><![CDATA[Networking]]></category>
		<category><![CDATA[Open Source]]></category>
		<category><![CDATA[System Administration]]></category>
		<category><![CDATA[bandwidth]]></category>

		<guid isPermaLink="false">http://blog.aptivate.org/?p=911</guid>
		<description><![CDATA[We usually use Linux firewalls for traffic shaping, because the power of the traffic control (tc) system exceeds FreeBSD&#8217;s dummynet in most ways. Dummynet can be used to create arbitrary delays and packet loss, which is very useful for simulating poor connections, but not for sharing bandwidth and prioritising packets between different traffic classes on [...]]]></description>
			<content:encoded><![CDATA[<p>We usually use Linux firewalls for traffic shaping, because the power of the traffic control (tc) system exceeds FreeBSD&#8217;s dummynet in most ways.</p>
<p>Dummynet can be used to create arbitrary delays and packet loss, which is very useful for <a href="http://blog.aptivate.org/2010/06/08/simulating-low-bandwidth-publishers-for-development/">simulating poor connections</a>, but not for sharing bandwidth and prioritising packets between different traffic classes on a real traffic shaper.</p>
<p>However, I&#8217;ve just been testing PF (the new standard packet filter) and ALTQ (the alternative queueing system) on FreeBSD, and I&#8217;m impressed by the capabilities. I prefer this combination (PF+ALTQ) over Linux TC because:</p>
<ul>
<li>PF and ALTQ are fully integrated and configured using the same file, whereas TC has its own (very hard to use) classifier. I normally use the iptables CLASSIFY target to classify traffic instead, but this is not integrated.</li>
<li>TC is very hard to use generally. The authors seem more concerned with functionality than usability.</li>
<li>ALTQ has named queues, which help usability enormously compared to TC&#8217;s hex-numbered classes.</li>
<li>ALTQ gives very low delay when the interface is not 100% saturated, which seems impossible to achieve with TC.</li>
</ul>
<p>It does annoy me that ALTQ is not enabled in the default kernel, so you have to <a href="http://www.freebsd.org/doc/en/books/handbook/kernelconfig-building.html">compile your own kernel</a>. I used the following commands to copy the default GENERIC configuration to a custom one, which I called ALTQ:</p>
<pre>
cd /boot
cp -Rp kernel GENERIC # back up the current kernel (/boot/kernel is a directory)
cd /usr/src/sys/i386/conf
cp GENERIC ~/ALTQ
ln -s ~/ALTQ .
vi ALTQ
</pre>
<p>and added the following lines to the new kernel configuration file, ALTQ:</p>
<pre>
options ALTQ
options ALTQ_RED
options ALTQ_RIO
options ALTQ_HFSC
options ALTQ_PRIQ
</pre>
<p>and then compiled and installed the new kernel:</p>
<pre>
cd /usr/src
make buildkernel KERNCONF=ALTQ
make installkernel KERNCONF=ALTQ
</pre>
<p>and then rebooted to load the new kernel. After that, we need to create a PF configuration. Some example configurations use CBQ queues, but I prefer HFSC because:</p>
<ul>
<li>HFSC is guaranteed accurate, whereas CBQ is approximate</li>
<li>CBQ requires you to guess the average packet size and its accuracy depends entirely on this</li>
<li>HFSC has service curves which allow you to deliver small files quickly <strong>and</strong> drop the priority of large connections (e.g. file downloads) with great ease.</li>
</ul>
<p>Here is a sample configuration of PF+ALTQ+HFSC that I used for testing on a transparent bridging firewall (bridge0 connecting em0 and em1):</p>
<pre style="border:1px dashed #bbf;background-color:#ddf;padding:.5em;">
altq on em1 hfsc bandwidth 1Mb queue { ftp, ssh, icmp, other }
queue ftp bandwidth 30% priority 0 hfsc (upperlimit 99%)
queue ssh bandwidth 30% priority 2 hfsc (upperlimit 99%)
queue icmp bandwidth 10% priority 2 hfsc (upperlimit 99%)
queue other bandwidth 30% priority 1 hfsc (default upperlimit 99%)
pass out quick on bridge0 inet proto tcp from any port 21 to any queue ftp
pass out quick on bridge0 inet proto tcp from any port 22 to any queue ssh
pass out quick on bridge0 inet proto icmp from any to any queue icmp
pass out quick on bridge0 all
</pre>
<p>We are only queueing on em1 here, which is the downstream, so we are only limiting downloads. We deliberately limit them to 1 Mbps for testing. The limit should always be lower than your actual download bandwidth, to ensure that the queue is on the FreeBSD firewall and not any other device.</p>
<p>We create four named queues under the root, which is implicitly named <code>root_em1</code>. We reserve 30% of bandwidth each for FTP, SSH and other traffic, and 10% for ICMP. However, any class can exceed its reserved bandwidth, up to its <code>upperlimit</code>, which defaults to 100%. At that default, one class can potentially delay traffic in other classes, so we override it to 99%.</p>
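<p>To make the percentages concrete, here is a small Python sketch that converts the shares from the sample configuration into rates. It assumes that <code>1Mb</code> means 1000 kbit/s; the names <code>LINK_KBPS</code>, <code>QUEUES</code> and <code>queue_rates</code> are made up for this illustration and are not part of pf.</p>

```python
# Sketch: turn the percentage shares from the sample altq rule into kbit/s.
# Assumption: "bandwidth 1Mb" on em1 is taken as 1000 kbit/s.
LINK_KBPS = 1000

# queue name -> (reserved share %, upperlimit %), copied from the sample config
QUEUES = {
    "ftp": (30, 99),
    "ssh": (30, 99),
    "icmp": (10, 99),
    "other": (30, 99),
}


def queue_rates(link_kbps, queues):
    """Return {name: (reserved_kbps, ceiling_kbps)} for each queue."""
    return {
        name: (link_kbps * share // 100, link_kbps * limit // 100)
        for name, (share, limit) in queues.items()
    }


rates = queue_rates(LINK_KBPS, QUEUES)
total_reserved = sum(reserved for reserved, _ in rates.values())

for name, (reserved, ceiling) in sorted(rates.items()):
    print("%-5s reserved %4d kbit/s, ceiling %4d kbit/s" % (name, reserved, ceiling))
print("total reserved: %d kbit/s" % total_reserved)
```

<p>The reserved shares add up to the whole link, and the 99% ceiling leaves 10 kbit/s that no single queue can take.</p>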
<p>Note that even though we create the queues on the em1 device, we must filter packets on bridge0, as otherwise our traffic does not match our pf rules.</p>
<p>Update: I found some more information about <a href="http://doc.pfsense.org/index.php/Traffic_Shaping_Guide">traffic shaping</a> and <a href="http://devwiki.pfsense.org/HFSCBandwidthShapingNotes">advanced usage of HFSC</a>, including realtime guaranteed classes for VoIP.</p>
<p>Update 2: For a simpler setup with ALTQ, try <a href="http://microsux.dk/?p=321">this guide</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.aptivate.org/2011/08/05/traffic-shaping-with-pf-altq-and-hfsc/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Checking missing translations automatically</title>
		<link>http://blog.aptivate.org/2011/07/26/checking-missing-translations-automatically/</link>
		<comments>http://blog.aptivate.org/2011/07/26/checking-missing-translations-automatically/#comments</comments>
		<pubDate>Tue, 26 Jul 2011 15:38:52 +0000</pubDate>
		<dc:creator>Martin Burchell</dc:creator>
				<category><![CDATA[Django]]></category>
		<category><![CDATA[Testing]]></category>

		<guid isPermaLink="false">http://blog.aptivate.org/?p=896</guid>
		<description><![CDATA[For our open source openconsent project, which uses the Django framework, we have recently added internationalisation support. Here&#8217;s how we&#8217;re testing it. Before any translations are in place, it&#8217;s difficult to ensure that all text is appropriately tagged for translation, either with {% trans %} tags in templates or using gettext() and its friends in [...]]]></description>
			<content:encoded><![CDATA[<p>For our open source <a href="https://github.com/aptivate/openconsent">openconsent</a> project, which uses the <a href="https://www.djangoproject.com/">Django</a> framework, we have recently added internationalisation support. Here&#8217;s how we&#8217;re testing it.</p>
<p>Before any translations are in place, it&#8217;s difficult to ensure that all text is appropriately tagged for translation, either with <tt>{% trans %}</tt> tags in templates or using <tt>gettext()</tt> and its friends in the code. Checking missing translations by eye is time-consuming and prone to error.</p>
<p>Inspired by the article <a href="http://www.technomancy.org/python/django-i18n-test-translation-by-manually-setting-translations/">Mocking gettext with Django Translations to test that your code is translating</a> by Rory McCann, we wrote an automated test to do this:</p>
<pre># coding: utf-8

from publicweb.tests.open_consent_test_case import OpenConsentTestCase
from django.core.urlresolvers import reverse
from django.utils import translation
from lxml.html.soupparser import fromstring
from lxml.cssselect import CSSSelector

class InternationalisationTest(OpenConsentTestCase):

    def setUp(self):
        self.login()

    def test_all_text_translated_when_viewing_decision_list(self):
        self.check_all_text_translated('decision_list')

    def test_all_text_translated_when_adding_decision(self):
        self.check_all_text_translated('decision_add')

    def check_all_text_translated(self, view):
        self.mock_get_text_functions_for_french()

        translation.activate("fr")

        response = self.client.get(reverse(view), follow=True)
        html = response.content

        root = fromstring(html)
        sel = CSSSelector('*')

        for element in sel(root):
            if self.has_translatable_text(element):
                self.assertTrue(self.contains(element.text, "XXX "),
                                "No translation for element " + \
                                str(element) + " with text '" + \
                                element.text + \
                                "' from view '" + view + "'")

    def has_translatable_text(self, element):
        if element.text is None or element.text.strip() == "" \
            or "not_translated" in element.attrib.get('class', '').split(" ") \
            or element.tag == 'script' \
            or element.text.isdigit():
            return False
        else:
            return True

    def contains(self, string_to_search, sub_string):
        return sub_string in string_to_search

    def mock_get_text_functions_for_french(self):
        # A decorator function that just adds 'XXX ' to the front of all
        # strings
        def wrap_with_xxx(func):
            def new_func(*args, **kwargs):
                output = func(*args, **kwargs)
                return "XXX "+output
            return new_func

        old_lang = translation.get_language()
        # Activate french, so that if the fr files haven't
        # been loaded, they will be loaded now.
        translation.activate("fr")

        french_translation = translation.trans_real._active.value

        # wrap the ugettext and ungettext functions so that 'XXX '
        # will prefix each translation
        french_translation.ugettext = \
            wrap_with_xxx(french_translation.ugettext)
        french_translation.ungettext = \
            wrap_with_xxx(french_translation.ungettext)

        # Turn back on our old translations
        translation.activate(old_lang)
        del old_lang</pre>
<p>We mock the French <tt>ugettext()</tt> and <tt>ungettext()</tt> to prefix any translated strings with XXX. Our automated tests now just need to ensure that any text on the page begins with XXX.</p>
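<p>The test above leaves the wrapped functions in place once it has run. As a rough sketch of one way to undo the mocking afterwards, the same decorator idea can be put in a context manager that restores the originals on exit. This is plain Python with no Django involved; <code>prefixed</code>, <code>wrap_with_prefix</code> and <code>FakeCatalogue</code> are hypothetical names used only for this illustration.</p>

```python
from contextlib import contextmanager


def wrap_with_prefix(func, prefix="XXX "):
    """Return a function that calls func and prefixes its result."""
    def wrapper(*args, **kwargs):
        return prefix + func(*args, **kwargs)
    return wrapper


@contextmanager
def prefixed(obj, method_names, prefix="XXX "):
    """Temporarily wrap the named methods of obj; put the originals back on exit."""
    originals = {name: getattr(obj, name) for name in method_names}
    try:
        for name, func in originals.items():
            setattr(obj, name, wrap_with_prefix(func, prefix))
        yield obj
    finally:
        for name, func in originals.items():
            setattr(obj, name, func)


class FakeCatalogue(object):
    """Hypothetical stand-in for a translation object; not a Django class."""

    def ugettext(self, message):
        return {"Save": "Enregistrer"}.get(message, message)


catalogue = FakeCatalogue()
with prefixed(catalogue, ["ugettext"]):
    inside = catalogue.ugettext("Save")   # wrapped: result carries the prefix
outside = catalogue.ugettext("Save")      # restored: plain translation again
print(inside)
print(outside)
```

<p>Whether this is worth doing in a real test suite depends on whether other tests share the same translation object.</p>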
<p>There are two tests in this class, one for each page that we want to check. Both call the method <tt>check_all_text_translated()</tt>, which sends a GET request for the given view. We use <a href="http://lxml.de/">lxml</a> to parse the response. The CSS selector &#8216;*&#8217; returns all elements.</p>
<p>Because our database is empty when running these tests, we can be sure that pretty much all of the text nodes should be translated. There are a number of exceptions that we filter out in the method <tt>has_translatable_text()</tt>:</p>
<ul>
<li>White space</li>
<li>JavaScript</li>
<li>Numbers</li>
<li>Anything in a tag with class &#8220;not_translated&#8221;</li>
</ul>
<p>The last category is a bit of a hack, as the class isn&#8217;t really used other than in our tests. We couldn&#8217;t think of a way around this. There are only a couple of places where we need to do this, for example when displaying the user name of the logged-in user.</p>
<p>If none of these exceptions applies and the text does not begin with XXX, we ensure our test fails with plenty of information to track down the missing translation.</p>
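<p>The filtering rules can be tried out without Django or lxml. The following sketch restates them as a plain function over (text, tag, class) tuples. The tuple layout, the <code>untranslated</code> helper and the sample data are assumptions made for this illustration; they are not part of the test class above.</p>

```python
def has_translatable_text(text, tag="p", css_class=""):
    """Apply the four exceptions from the list above to one element."""
    if text is None or text.strip() == "":
        return False  # white space only
    if tag == "script":
        return False  # JavaScript
    if text.isdigit():
        return False  # numbers
    if "not_translated" in css_class.split(" "):
        return False  # explicitly excluded by class
    return True


def untranslated(elements, marker="XXX "):
    """Return the texts that should carry the marker but do not."""
    return [
        text
        for (text, tag, css_class) in elements
        if has_translatable_text(text, tag, css_class) and marker not in text
    ]


# Hypothetical page content: (text, tag, class attribute)
sample = [
    ("XXX Decisions", "h1", ""),
    ("   ", "div", ""),
    ("42", "td", ""),
    ("alice", "span", "user not_translated"),
    ("var x = 1;", "script", ""),
    ("Add decision", "a", "button"),
]

missing = untranslated(sample)
print(missing)
```

<p>Only the last entry is reported, because it is ordinary text with no marker and no exemption.</p>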
]]></content:encoded>
			<wfw:commentRss>http://blog.aptivate.org/2011/07/26/checking-missing-translations-automatically/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Starting and Stopping VNC with Fabric</title>
		<link>http://blog.aptivate.org/2011/07/13/starting-and-stopping-vnc-with-fabric/</link>
		<comments>http://blog.aptivate.org/2011/07/13/starting-and-stopping-vnc-with-fabric/#comments</comments>
		<pubDate>Wed, 13 Jul 2011 17:26:34 +0000</pubDate>
		<dc:creator>hamish</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://blog.aptivate.org/?p=889</guid>
		<description><![CDATA[We wanted to have a VNC server for client demos and for remote pairing on tasks.  Our old VNC server was in the office, so the bandwidth available to anyone outside the office was pretty poor, leading to a pretty bad experience - lots of lag between doing anything and seeing the result, slow screen [...]]]></description>
			<content:encoded><![CDATA[<p>We wanted to have a VNC server for client demos and for remote pairing on tasks. Our old VNC server was in the office, so the bandwidth available to anyone outside the office was poor. The result was a bad experience: lots of lag between doing anything and seeing the result, slow screen updates, etc.</p>
<p>So we&#8217;ve now set up a server with <a title="Linode" href="http://linode.com/">Linode</a> for running VNC sessions. To make it easy for clients to connect for demos, we&#8217;ve set up two ways to connect to the VNC session through their web browser: a Java applet, and the excellent <a title="Guacamole server" href="http://guacamole.sourceforge.net/">guacamole server</a> (which uses the HTML 5 canvas element).</p>
<p>As the web connection is available, we don&#8217;t want to leave a VNC server running all the time, as that might be a security risk. So we want our staff to start the VNC server when they need it, and stop it when they&#8217;re finished. Of course they have other things on their minds, so we decided to write some scripts to make it easy for them to start and stop VNC.</p>
<p>The scripts need to be able to ssh into our VNC server, start the VNC server, then start the VNC client locally. When the VNC client exits the script should stop the VNC server. It would also be nice to be able to check what displays are available and give sensible options if the default display is already in use. Finally it would be good to be able to use an ssh tunnel &#8211; otherwise the traffic all goes in the clear.</p>
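<p>As a rough illustration of the sequence described above, the following Python sketch only builds the command lines for one tunnelled session; it does not run anything and it is not our actual script. The host name <code>vnc.example.org</code> and the function names are hypothetical. It relies on the convention that VNC display N listens on TCP port 5900 + N, and it assumes the <code>vncserver</code> and <code>vncviewer</code> commands are installed.</p>

```python
def vnc_port(display):
    """VNC display N conventionally listens on TCP port 5900 + N."""
    return 5900 + display


def session_commands(host, display=1):
    """Return the shell commands for one tunnelled VNC session, in order."""
    port = vnc_port(display)
    return [
        # 1. start the VNC server on the remote machine
        "ssh %s vncserver :%d" % (host, display),
        # 2. forward the local port to the server; -f -N keeps ssh in the
        #    background without running a remote command
        "ssh -f -N -L %d:localhost:%d %s" % (port, port, host),
        # 3. run the viewer against the local end of the tunnel
        "vncviewer localhost:%d" % display,
        # 4. stop the VNC server once the viewer has exited
        "ssh %s vncserver -kill :%d" % (host, display),
    ]


commands = session_commands("vnc.example.org", display=1)
for command in commands:
    print(command)
```

<p>A real script would also need to close the background tunnel and cope with a display that is already in use, which this sketch leaves out.</p>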
<p>We decided to use <a title="fabric" href="http://fabfile.org/">fabric</a>, a Python project that is all about setting up remote servers by running commands through ssh. And with a little code it can produce ssh tunnels.</p>
<p>We&#8217;ve posted a <a title="fabric script for VNC" href="https://gist.github.com/1080774">snapshot of our code</a> as a GitHub gist. (We&#8217;ve changed a few Aptivate-specific values in it.) You&#8217;ll also need to get <a title="ssh tunnel with fabric" href="https://gist.github.com/856179">tunnels.py</a> and generate your own copy of the vnc_passwd file with your own VNC password (see <a href="http://manpages.ubuntu.com/manpages/lucid/man1/vnc4passwd.1.html">man vncpasswd</a> for details of how to generate it). Lastly, here is the <a href="https://gist.github.com/1080780">README</a> for how to use it.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.aptivate.org/2011/07/13/starting-and-stopping-vnc-with-fabric/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

