Arabic is often considered to be the hardest language to localise. This is partly because some Arabic-speaking countries still have a way to go when it comes to the development and diffusion of region-specific software and technology. But the key reason is that the Arabic script has proved extremely tricky to adapt to digital formats.
The Latin-based script is one of the simplest in the world, not least because it is alphabetic. Chinese, by contrast, is ideographic, which gave rise to the need for Simplified Chinese. Arabic, a consonantal script, is also extremely difficult to reproduce on a computer, particularly one which has been set up for a Latin script, as radical transformations are needed between the order in which text is stored and processed and the way it is presented on screen.
The Latin ‘character repertoire’, or number of letters, is considerably simpler than that of Arabic. Arabic is a cursive and context-dependent script, and it has more glyphs than characters, because most letters take a different shape depending on whether they stand alone or at the start, middle or end of a word. Context-dependent shaping is a key issue: when a word is being written, the previously entered characters may change shape or even be eliminated when a new character is entered, presenting a huge range of challenges to localisation professionals. In addition, some Arabic character sequences form obligatory ligatures, which is when two or more characters placed together are replaced by a single combined glyph. Ligatures occur frequently in Arabic, and they must be rendered correctly.
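
To make the "more glyphs than characters" point concrete, here is a minimal Python sketch using only the standard `unicodedata` module. It relies on Unicode's legacy Arabic presentation forms, which give each contextual shape its own code point. Real software normally stores only the base letters and lets a shaping engine pick glyphs from the font, so this is an illustration of the idea rather than a description of how any particular renderer works. The helper name `base_characters` is made up for this example.

```python
import unicodedata

# One character as it is stored in text: ARABIC LETTER BEH.
BEH = "\u0628"

# Four legacy "presentation form" code points for that one letter:
# isolated, final, initial and medial shapes.
BEH_FORMS = ["\uFE8F", "\uFE90", "\uFE91", "\uFE92"]

# A legacy code point for the obligatory lam + alef ligature (isolated form).
LAM_ALEF_LIGATURE = "\uFEFB"


def base_characters(glyph: str) -> str:
    """Map a presentation-form code point back to the stored character(s)."""
    return unicodedata.normalize("NFKC", glyph)


if __name__ == "__main__":
    for form in BEH_FORMS:
        # Each shape has its own name but normalises to the same base letter.
        print(unicodedata.name(form), "->", unicodedata.name(base_characters(form)))

    # The single ligature glyph stands for two underlying characters.
    for ch in base_characters(LAM_ALEF_LIGATURE):
        print(unicodedata.name(ch))
```

All four shapes of the letter normalise back to one stored character, while the single ligature glyph expands to two.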
Abbreviations and acronyms are rare in Arabic, so most terms have to be written out in full, which can have a serious, cumulative impact on the layout of any given text.
To make matters worse, Arabic text is bidirectional; it is written from right to left, but numerals and characters in the Latin script within the text are written from left to right. And don’t even ask about diacritical marks…
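
The mixed directions described above can be glimpsed with a short Python sketch. Every Unicode character carries a bidirectional class, and the standard library exposes it through `unicodedata.bidirectional`. The function below only groups a string into runs of the same class. It is not an implementation of the full Unicode Bidirectional Algorithm, which also handles nesting levels, neutral characters and mirroring, and the name `bidi_runs` and the sample string are assumptions made for illustration.

```python
import unicodedata
from itertools import groupby


def bidi_runs(text: str) -> list[tuple[str, str]]:
    """Group consecutive characters that share a Unicode bidi class."""
    return [
        (bidi_class, "".join(chars))
        for bidi_class, chars in groupby(text, key=unicodedata.bidirectional)
    ]


# The Arabic word for "book", followed by Latin letters and European digits,
# in the order the characters are stored (not the order they are displayed).
SAMPLE = "\u0643\u062A\u0627\u0628 ABC 123"

if __name__ == "__main__":
    for bidi_class, run in bidi_runs(SAMPLE):
        # AL = Arabic letter (right to left), L = left to right,
        # EN = European number, WS = whitespace.
        print(bidi_class, repr(run))

    # Diacritical marks are non-spacing marks that attach to the previous letter.
    print(unicodedata.bidirectional("\u064E"))  # ARABIC FATHA
```

A layout engine has to reorder these runs for display while keeping the stored order intact, which is why cursor movement, selection and line breaking all become harder in mixed text.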
These are a few representative examples of the problems presented in the localisation of Arabic. Hopefully they will serve to emphasise the importance of collaborating with experienced, qualified linguists on translation projects.
I just read this article on Arabizi, which seems to be a sort of nascent ‘simplified Arabic’.