THE WRITING of Arabic words in English
texts presents a number of difficulties, even for those who are familiar
with both languages.
In 1926, when T E Lawrence
("Lawrence of Arabia") sent his 130,000-word
manuscript of Revolt in the Desert to be typeset, a
sharp-eyed proof-reader spotted that it was "full of
inconsistencies in the spelling of proper names".
Among other things, the
proof-reader noted that "Jeddah" alternated with
"Jidda" throughout the book, while a man whose
name began as Sherif Abd el Mayin later became el Main, el
Mayein, el Muein, el Mayin and le Muyein.
Lawrence refused to change them.
"Arabic names," he
replied, "won't go into English, exactly, for their
consonants are not the same as ours, and their vowels, like
ours, vary from district to district."
Such inconsistencies may not
matter much in a literary work but in many other situations they
do matter. For instance, if you wanted to look up an Arab called
"Hassan al-Ghobashy" in the telephone directory, he
might be listed under A, E or G:
GHOBASHY Hassan al-
Difficulties arise whenever Arabic
names are listed alphabetically using the Roman alphabet, and when
they are used in databases or search engines. Efforts by the FBI
to track down Usama bin Laden's supporters, for instance, were severely hampered
by this problem.
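The directory problem can be sketched in a few lines of Python. This is a minimal illustration, not a recommended standard: the function name `directory_key` and the list of article prefixes are assumptions made for the example, and a real directory would need a much longer list of prefixes and variants.

```python
# Hypothetical list of romanised forms of the definite article.
ARTICLE_PREFIXES = ("al-", "el-")


def directory_key(name: str) -> str:
    """Return a sort key that ignores a leading article on the surname."""
    # Assumes the surname is the last space-separated word of the name.
    surname = name.split()[-1].lower()
    for prefix in ARTICLE_PREFIXES:
        if surname.startswith(prefix):
            surname = surname[len(prefix):]
            break
    return surname


names = ["Hassan al-Ghobashy", "Hassan el-Ghobashy", "Hassan Ghobashy"]
keys = [directory_key(n) for n in names]
print(keys)
```

With a key of this kind, all three forms of the name are filed under G, whichever spelling of the article the subscriber happens to use.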
Newspapers spell the Libyan leader's name in a
variety of ways, with the result that a researcher trying to find
articles about him would be likely to miss a significant number of
them. According to one website, there are 32
possible ways to spell his name.
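One crude way to reduce missed matches is to index each name under a key that ignores the letters most likely to vary. The sketch below is an illustration only: the `skeleton` function is hypothetical, and it is tried on the variants of "el Mayin" quoted from Lawrence's proof-reader above. It discards vowels (and "y", which here behaves like one) and collapses doubled letters.

```python
def skeleton(word: str) -> str:
    """Reduce a romanised word to a rough consonant skeleton."""
    # Drop vowels and "y"; this is a deliberate over-simplification.
    kept = [ch for ch in word.lower() if ch.isalpha() and ch not in "aeiouy"]
    collapsed = []
    for ch in kept:
        # Collapse doubled consonants such as "mm" into one letter.
        if not collapsed or collapsed[-1] != ch:
            collapsed.append(ch)
    return "".join(collapsed)


variants = ["Mayin", "Main", "Mayein", "Muein", "Muyein"]
keys = {skeleton(v) for v in variants}
print(keys)
```

All five spellings fall under the same key, but so would unrelated names with the same consonants, so a key like this is only useful for gathering candidates that a person then checks.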
There is no
ideal, all-purpose solution. There are, however, several different
approaches to a solution and the best choice depends largely on
the writer's purpose and intended audience.
The following notes
are an attempt to explain the issues involved. They should be
considered as "work in progress" and readers are
encouraged to send comments, questions or corrections by email.
ONE APPROACH is to take
Arabic words as they are pronounced and write down approximately
similar sounds in
the Roman alphabet. This is what early European
travellers to the Middle East usually did, and the results
were often bizarre or, in some cases, almost unrecognisable.
Spellings such as "Mecca" and "Koran" entered the
English language a long time ago and have become so entrenched
that they are now difficult to eradicate. In old books the
Prophet's name is frequently spelled as "Mahomet" and
this is still used to some extent today. There is no logical
reason for it because Muhammad is one Arabic name that can easily
be rendered in a way that is both phonetically accurate and
faithful to its written form.
The Roman alphabet, of course, is used by a number of European
languages, so phonetic representations of Arabic words vary
according to the mother tongue of the writer. Romanised spellings
adopted by Arabs themselves often reflect previous colonial
influences: an Arab in a country with strong English influence
might spell his surname as "Shaheen", while a cousin in
a French-influenced country would spell it as "Chahine".
In both cases, the original Arabic name is the same.
A further consideration is that
there are also significant regional
variations in pronunciation by Arabs. So a single
Arabic word, spoken by a Moroccan, an Egyptian and a Saudi could
easily appear as three different words if written phonetically in the Roman alphabet.
The spellings of Arabic words
found in the western mass media are often at least partly phonetic
but rarely do justice to the original.
In some circumstances, more
precise phonetic spelling is needed - in phrase books for tourists,
for instance, or in pronunciation guides for broadcasters. The
following examples come from a guide issued by Associated Press to help
American radio stations with their pronunciation:
mah-MOOD' ah-BAHS' (Mahmoud Abbas)
mah-MOOD' ab-DEHL'-BA'-set (Mahmoud Abdel-Baset)
shayk OH'-mahr AHB'-dehl RAHK'-mahn (Sheik Omar Abdel-Rahman)
This is not only thoroughly
unscientific but highly inaccurate. The guide happily inserts various
sounds that don't exist in the original Arabic (a K in
"Rahman", for example) and ignores several others that do exist.
It also offers two different
pronunciations of "Abdel-", for no logical reason.
Truly phonetic spelling follows the International
Phonetic Alphabet, which is used academically by linguists.
Its disadvantage in general use is that it requires characters
outside the normal alphabet and is therefore more or less incomprehensible
to the general reader.
A DIFFERENT approach is to start with
Arabic words in their written form
and transcribe (or "Romanise") them by replacing
individual Arabic letters with corresponding letters from the
Roman alphabet. This sounds
simple but is actually very difficult. For example:
Only eight Arabic letters have a clear equivalent in the Roman alphabet:
B, F, K, L, M, N, R, and Z.
Arabic has two distinct consonants
that approximate to the sound of S. The same applies to D, H and T.
There are two glottal sounds
that do not
obviously correspond to any Roman letter.
The ideal solution would be to have a standard, internationally
agreed, system. Several have been proposed but
unfortunately none has been universally accepted. A selection of
these can be viewed in PDF format at http://homepage.mac.com/sirbinks/pdf/Arabic.pdf.
Probably the earliest attempt at
standardisation was the Deutsche Morgenländische Gesellschaft
proposal, adopted by the International Convention of Orientalist
Scholars in 1936. It is the system used in the Hans
Wehr Arabic dictionary. Another standard was agreed in 1971 at
a conference of Arab experts in Beirut and, theoretically at
least, accepted by the countries of the Arab League. It has met
some resistance, particularly in those Arab countries where French
predominates over English. Other transcription/Romanisation
systems include:
Adopted by the US Library of Congress and the American Library Association
for cataloguing books, the ALA-LC system has found its way into wider
academic use. It covers a multitude
of languages: there are 54 Romanisation tables for more than 150 languages and
dialects written in non-Roman scripts. The table relating to
Arabic may be viewed in PDF format at the following sites:
Alternatively, a complete set of
the tables may be
purchased from amazon.com
Published by the International Standards Organisation. Copies
may be purchased here.
British Standard BS 4280: 1968
Not widely used - which is
hardly surprising since the British
Standards Institution holds the copyright (it cannot be
reproduced here) and copies are expensive to buy (about $39 for an
United Nations Romanization System for Geographical Names
Overseen by the Group of Experts on Geographical Names
(UNGEGN), this aims to promote "consistent use of accurate place names"
on maps and similar products. Work on the project has been
continuing since 1972. A progress report on Arabic romanization,
dated March 2000, can be viewed in PDF format here.
THE transcription/romanization systems described above all suffer from
the same disadvantages, to varying degrees:
1. They are difficult to memorise because they use special
characters or add special marks to normal characters.
2. They can cause ambiguity by using digraphs (two-letter
combinations) to represent single Arabic letters. For example,
there is a risk of confusing SH (as in
"sheep") with S H (as in "mis-hap")
or TH (as in "thin") with T H (as in "hot-house").
3. They cannot be used easily
with a standard computer keyboard.
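The digraph problem in point 2 can be made concrete with a short Python sketch. The mapping below is a toy table covering only five letters, and the `romanise` function and its `separator` parameter are assumptions made for the example; Arabic letters are looked up by their Unicode names to avoid display problems.

```python
import unicodedata


def letter(name: str) -> str:
    """Look up an Arabic letter by its Unicode name, e.g. 'SHEEN'."""
    return unicodedata.lookup(f"ARABIC LETTER {name}")


# Toy romanisation table (five letters only, for illustration).
ROMAN = {
    letter("SEEN"): "s",
    letter("HEH"): "h",
    letter("SHEEN"): "sh",
    letter("TEH"): "t",
    letter("THEH"): "th",
}
DIGRAPHS = {r for r in ROMAN.values() if len(r) == 2}


def romanise(text: str, separator: str = "") -> str:
    """Romanise letter by letter, optionally marking false digraphs."""
    out = []
    for ch in text:
        roman = ROMAN[ch]
        # Two separate letters that would read as a digraph get a separator.
        if out and out[-1] + roman in DIGRAPHS:
            out.append(separator)
        out.append(roman)
    return "".join(out)


sheen = letter("SHEEN")
seen_heh = letter("SEEN") + letter("HEH")
print(romanise(sheen), romanise(seen_heh))            # both print "sh"
print(romanise(sheen, "´"), romanise(seen_heh, "´"))  # "sh" and "s´h"
```

Without a separator the two different Arabic spellings produce the same Roman text, so the original cannot be recovered; with one, the distinction survives at the cost of an extra character.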
This last point is particularly important today,
though it could not be foreseen when most of the romanization
systems were devised. Currently, the most advanced approaches
involve precise letter-for-letter transcription systems which
allow text files originally produced in Arabic to be romanized
by a simple computer program and converted back again into perfect
Arabic. Beyond straightforward text files, this has important
implications for the use of databases.
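The idea of a reversible, letter-for-letter scheme can be sketched as follows. This is only an illustration: the table covers nine letters, using what are understood to be their single-character equivalents in Buckwalter's scheme, and the function names are hypothetical. The essential property is that each Arabic letter maps to exactly one Roman character and vice versa.

```python
import unicodedata


def letter(name: str) -> str:
    """Look up an Arabic letter by its Unicode name, e.g. 'KAF'."""
    return unicodedata.lookup(f"ARABIC LETTER {name}")


# Partial one-to-one table (assumed to follow Buckwalter's equivalents).
PAIRS = {
    letter("ALEF"): "A",
    letter("BEH"): "b",
    letter("TEH"): "t",
    letter("SEEN"): "s",
    letter("SHEEN"): "$",
    letter("KAF"): "k",
    letter("LAM"): "l",
    letter("MEEM"): "m",
    letter("HEH"): "h",
}
TO_ROMAN = str.maketrans(PAIRS)
TO_ARABIC = str.maketrans({roman: arabic for arabic, roman in PAIRS.items()})


def romanise(text: str) -> str:
    """Replace each mapped Arabic letter; other characters pass through."""
    return text.translate(TO_ROMAN)


def deromanise(text: str) -> str:
    """Invert romanise() for text made of mapped characters."""
    return text.translate(TO_ARABIC)


# The word "kitab" (book) is written kaf, teh, alef, beh.
kitab = letter("KAF") + letter("TEH") + letter("ALEF") + letter("BEH")
roman = romanise(kitab)
print(roman)
print(deromanise(roman) == kitab)
```

Because no digraphs are used, the conversion back to Arabic is unambiguous, which is what makes such schemes suitable for databases. The price is that symbols such as "$" make the Roman form unreadable to anyone who has not learned the table.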
Work in this area has been led by the Xerox company. For a detailed and
interesting discussion of the issues, see Romanization, Transcription and Transliteration
by Kenneth R. Beesley.
Buckwalter Transliteration, developed by Tim Buckwalter, a
lexicographer, is a system for "practical storage, display
and email transmission of Arabic text in environments where the
display of genuine Arabic characters is not possible or
convenient". It avoids special characters and can be used
quite simply by anyone with a knowledge of Arabic because the
Roman equivalents of the Arabic letters are easy to remember. For
details of the Buckwalter System see the encoding
Anyone interested in this field
should also explore ArabTeX,
devised by Professor Klaus Lagally, which he defines as "a
package extending the capabilities of TeX/LaTeX to generate the
Arabic writing from an ASCII transliteration".
THERE is a lot of disagreement in the English-language media about how
to spell Arabic words and names in the Roman alphabet. Apart from
variations in the spellings adopted by individual newspapers,
magazines and news agencies, many of these organisations have no
clear guidelines or fail to follow them consistently.
With increasing use of electronic
archives, the spelling variations can make it almost impossible to
retrieve all relevant articles with a reasonable degree of
certainty.
Variations in spelling can also
confuse readers, as well as journalists themselves, and leave them
wondering whether two (or more) apparently different names refer
to the same person.
The two existing standards that
seem most relevant to journalism are the ALA-LC and UN guidelines
(see above). Both are very similar but in some instances they
resort to special characters that are impractical for media usage
and would also baffle readers.
The romanisation scheme suggested
below is a simplified version of the ALA-LC and UN guidelines
which eliminates the need for special characters. It is proposed
here for the purposes of discussion and readers' comments
are welcome.
ROMANISATION FOR MEDIA USAGE
* when waw or ya is used as a consonant
Short vowels: u, a, i (e and o are unnecessary).
Long vowels: uu; aa; ii. The principle of doubling a short "a" to make
a long "aa" is well established (e.g. "salaam"). Logically, it could
be applied to the other vowels. Some may prefer "oo" to "uu", but
"ou" could be mispronounced as "ow". Again, "ee" may be preferred
to "ii".
Diphthongs: aw, ay. Some may prefer "au" and "ai".
Doubled consonants: normally write as double; omit in the case of
digraphs (gh, th, etc) for visual reasons. Doubling is not always
obvious from the written Arabic; omit if uncertain.
Digraphs: to avoid ambiguity, two-letter combinations which are not
digraphs but resemble them should ideally be separated by ´ (ctrl+',
space). Example: ad´ham.
Definite article: al- (no assimilation with "sun" letters, e.g.
"al-shams" not "ash-shams").
Capitalisation: the capitalisation of "bin" in Arabic names would
logically follow in-house capitalisation rules for "von" (German) or
"du" (French). Logically, "abu" and "abd al-" require the same
treatment as "bin".
ta marbouta: the options currently in use are: a, at, ah, eh, et.
Readers' views on the acceptability or otherwise of these options
would be welcome.
jim: in a colloquial context, g can replace j where that is the
local pronunciation.
qaf: in a colloquial context, g can replace q where that is the
local pronunciation.
hamzah: may be omitted at the beginning of a word; elsewhere use an
apostrophe (’).
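Two of the rules above lend themselves to mechanical checking, and a rough Python sketch may help to show what that involves. Everything here is an assumption made for illustration: the function name `apply_media_rules`, the list of romanised "sun" letters in the pattern, and the decision to rewrite "oo" and "ee" as "uu" and "ii". It is meant for single romanised words, not running English text.

```python
import re

# Assimilated article: "a" or "e", a romanised sun letter, a hyphen, and
# then the same letter again (e.g. "ash-sh...", "as-s...", "an-n...").
SUN_ARTICLE = re.compile(r"\b[ae](sh|th|dh|[dnrstz])-(?=\1)", re.IGNORECASE)


def apply_media_rules(word: str) -> str:
    """Apply two of the proposed rules to one romanised word."""
    # Rule: definite article is always written al-, with no assimilation.
    word = SUN_ARTICLE.sub("al-", word)
    # Rule: long vowels are written by doubling the short vowel.
    return word.replace("oo", "uu").replace("ee", "ii")


for sample in ["ash-shams", "as-salaam", "an-noor", "al-shams"]:
    print(sample, "->", apply_media_rules(sample))
```

The other rules, such as the treatment of ta marbouta or the exceptions for preferred spellings, depend on knowledge of the word itself and could not be applied safely by pattern matching alone.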
The point of setting a standard is
to apply it universally, or at least to make as few exceptions as
possible. However, this is difficult to achieve with Arabic words
because so many mis-transliterations have entered common usage. It
is suggested that the guidelines may be waived in the following cases:
1. Names, where a person or
organisation has clearly indicated a preferred spelling.
2. Places, where a particular
spelling has been adopted locally.
3. Religious terms, where a
particular spelling has been adopted locally by believers.
4. Colloquial terms and
expressions may be spelled phonetically.