Making an Arabic website
Last year, as part of our ongoing involvement with the United Nations' Research for Life Programme, we worked with the World Health Organisation to add Arabic and Russian translations to the HINARI, AGORA and OARE programmes. These websites give researchers in developing countries free or subsidised access to journals, books and other resources via a content access portal. These portals share a common underlying codebase and database, developed by WHO and more recently by Aptivate.
The work involved adding support for non-Latin scripts, requiring changes to the character encoding and, in the case of Arabic, text direction. Here are some of the problems we solved during the implementation.
These portals had previously been available in four European languages with Latin scripts, and all of the pages were encoded in Latin 1. The first step was to convert all of the pages of the portal to UTF-8, which supports other alphabets. Unfortunately the database that stores the text was also using Latin 1 encoding, and a number of other tools would have needed changing had we converted the database to UTF-8. So instead we chose to convert database text on the fly in PHP (using utf8_encode(), which converts Latin 1 to UTF-8) whenever it is displayed, for example in publisher names and journal titles.
The language selector at the top of the page previously used rollover images, with three images for each language. We replaced this with a lower-bandwidth text menu styled with CSS, removing the need for the images. The same rollover effect could have been achieved in CSS, but this wasn't required.
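As a rough illustration of how a text-only language menu with an optional rollover effect might look, here is a minimal sketch. The #languages id, the list markup and the colours are assumptions made for the example, not the portal's actual code.

```css
/* Assumes markup like:
   <ul id="languages"><li><a href="?lang=ar">Arabic</a></li> ... </ul>
   The id, URL scheme and colours are hypothetical. */
#languages {
  list-style: none;
  margin: 0;
  padding: 0;
}
#languages li {
  display: inline;            /* lay the languages out in a single row */
}
#languages a {
  padding: 2px 8px;
  color: #336699;
  text-decoration: none;
}
/* Optional rollover effect, with no images to download */
#languages a:hover {
  color: #ffffff;
  background-color: #336699;
}
```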
Changing the text direction and the direction of the page elements in CSS is surprisingly easy. In general, elements that have a float:left rule are changed to float:right and vice versa. Similarly, left and right margins, padding and border widths are mirrored. The text direction itself is switched with the direction property (direction: rtl).
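The mirroring described above could look something like the following sketch. The #sidebar selector, the file names and the pixel values are hypothetical, and the sketch assumes the Arabic style sheet is loaded after the common one so that its declarations win.

```css
/* common.css: left-to-right defaults (selector and values are hypothetical) */
#sidebar {
  float: left;
  margin-right: 20px;
  padding-left: 10px;
  border-right: 1px solid #cccccc;
}

/* arabic.css: assumed to load after common.css, mirrors the rules above */
body {
  direction: rtl;             /* text and inline content flow right-to-left */
}
#sidebar {
  float: right;               /* was float: left */
  margin-right: 0;            /* reset the left-to-right value */
  margin-left: 20px;          /* was margin-right */
  padding-left: 0;
  padding-right: 10px;        /* was padding-left */
  border-right: none;
  border-left: 1px solid #cccccc;
}
```

Note that each mirrored property needs two declarations: one to reset the original side and one to set the opposite side. Otherwise the left-to-right value from the common style sheet would still apply.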
We ended up with three style sheets per page:
- common rules for all portals and languages
- rules specific to each of the HINARI, AGORA and OARE portals
- language-specific rules (sometimes a piece of translated text was longer in one language and we had to move things to make it fit).
There were also a few rules that were specific to both portal and language (such as positioning of the footer elements in the Arabic translation). By defining a portal class on the outermost container element, these rules could be specified in the language style sheet.
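A minimal sketch of that approach, assuming the outermost container carries a class such as hinari or agora. All class names, ids and values here are invented for illustration.

```css
/* arabic.css (all class and id names are hypothetical) */

/* Applies to the Arabic version of every portal */
#footer {
  text-align: right;
}

/* Applies only to the Arabic version of one portal, assuming markup
   such as <div id="container" class="hinari"> around the whole page */
.hinari #footer .logo {
  float: left;
  margin-left: 15px;
}
.agora #footer .logo {
  float: left;
  margin-left: 30px;
}
```

Because the portal class sits on the outermost element, a descendant selector is enough to scope a rule to one portal without adding a fourth style sheet per portal and language combination.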
As the titles of the books and journals still use the Latin alphabet, WHO's Arabic expert advised us to keep these left-aligned and to keep the A-Z navigation menus running left-to-right. It was less clear what to do with the drop-down menus on the front page, where the options are in the Latin alphabet but the titles are in Arabic. We decided to keep everything right-aligned.
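One way to express these exceptions in the Arabic style sheet is sketched below. The .title-list and .az-nav class names are assumptions for the example, not the portal's real selectors.

```css
/* arabic.css (selectors are hypothetical) */
body {
  direction: rtl;
}

/* Latin-script journal and book titles: left-aligned, left-to-right */
.title-list {
  direction: ltr;
  text-align: left;
}

/* A-Z navigation keeps running from A on the left to Z on the right */
.az-nav {
  direction: ltr;
}

/* Front-page drop-downs stay right-aligned with the Arabic text.
   direction is inherited from body, so this rule only makes the choice explicit. */
select {
  direction: rtl;
}
```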
Currently Arabic is the only supported language with a right-to-left script, so we have put all right-to-left rules into the Arabic style sheet. If a second right-to-left language is needed in the future, we can move these rules into an additional style sheet shared by both languages.
As a further optimisation, the style sheets could be combined on the server dynamically into one CSS file, reducing the number of HTTP requests.