www.al-bab.com

An open door to the Arab world

 


   

 
     

Arabic words and the Roman alphabet

   

Introduction

THE WRITING of Arabic words in English texts presents a number of difficulties, even for those who are familiar with both languages. 

In 1926, when T E Lawrence ("Lawrence of Arabia") sent his 130,000-word manuscript of Revolt in the Desert to be typeset, a sharp-eyed proof-reader spotted that it was "full of inconsistencies in the spelling of proper names".

Among other things, the proof-reader noted that "Jeddah" alternated with "Jidda" throughout the book, while a man whose name began as Sherif Abd el Mayin later became el Main, el Mayein, el Muein, el Mayin and le Muyein.

Lawrence refused to change the spellings. 

"Arabic names," he replied, "won't go into English, exactly, for their consonants are not the same as ours, and their vowels, like ours, vary from district to district."

Such inconsistencies may not matter much in a literary work but in many other situations they do matter. For instance, if you wanted to look up an Arab called "Hassan al-Ghobashy" in the telephone directory, he might be listed under A, E or G:

AL-GHOBASHY Hassan
EL-GHOBASHY Hassan
GHOBASHY Hassan al-

Difficulties arise whenever Arabic names are listed alphabetically using the Roman alphabet, and when they are used in databases or search engines. Efforts by the FBI to track down Usama bin Laden's supporters, for instance, were severely hampered by this problem.
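The directory problem can be illustrated with a short sketch. It assumes, purely for illustration, that the only prefixes to ignore are "al" and "el" followed by a hyphen or space; `directory_key` is a hypothetical helper, not part of any agreed standard.

```python
import re

# Assumption: only "al" / "el" followed by a hyphen or space counts as the article.
_ARTICLE = re.compile(r"^(?:al|el)[-\s]+", re.IGNORECASE)


def directory_key(surname: str) -> str:
    """Return a sort key that ignores a leading definite article."""
    return _ARTICLE.sub("", surname.strip()).casefold()


surnames = ["Green", "al-Ghobashy", "Edwards", "El-Ghobashy", "Adams", "Ghobashy"]
print(sorted(surnames, key=directory_key))
# ['Adams', 'Edwards', 'al-Ghobashy', 'El-Ghobashy', 'Ghobashy', 'Green']
```

With a key of this kind all three forms of the name are filed together under G, whichever way the article was written.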

Newspapers spell the name of the Libyan leader, Muammar Gaddafi, in a variety of ways, with the result that a researcher trying to find articles about him would be likely to miss a significant number of them. According to one website, there are 32 possible ways to spell his name.

There is no ideal, all-purpose solution. There are, however, several different approaches to a solution and the best choice depends largely on the writer's purpose and intended audience. 

The following notes are an attempt to explain the issues involved. They should be considered as "work in progress" and readers are encouraged to send comments, questions or corrections by email.



Phonetic spelling

ONE APPROACH is to take Arabic words as they are pronounced and write down approximately similar sounds in the Roman alphabet. This is what early European travellers to the Middle East usually did, and the results were often bizarre or, in some cases, almost unrecognisable. 

Inexact spellings such as "Mecca" and "Koran" entered the English language a long time ago and have become so entrenched that they are now difficult to eradicate. In old books the Prophet's name is frequently spelled as "Mahomet" and this is still used to some extent today. There is no logical reason for it because Muhammad is one Arabic name that can easily be rendered in a way that is both phonetically accurate and faithful to its written form.

The Roman alphabet, of course, is used by a number of European languages, so phonetic representations of Arabic words vary according to the mother tongue of the writer. Romanised spellings adopted by Arabs themselves often reflect previous colonial influences: an Arab in a country with strong English influence might spell his surname as "Shaheen", while a cousin in a French-influenced country would spell it as "Chahine". In both cases, the original Arabic name is the same.

A further consideration is that there are also significant regional variations in pronunciation by Arabs. So a single Arabic word, spoken by a Moroccan, an Egyptian and a Saudi, could easily appear as three different words if written phonetically in the Roman alphabet.

The spellings of Arabic words found in the western mass media are often at least partly phonetic but rarely do justice to the original.

In some circumstances, more precise phonetic spelling is needed - in phrase books for tourists, for instance, or in pronunciation guides for broadcasters. The following examples come from a guide issued by Associated Press to help American radio stations with their pronunciation:

mah-MOOD' ah-BAHS' (Mahmoud Abbas) 

mah-MOOD' ab-DEHL'-BA'-set (Mahmoud Abdel-Baset)

shayk OH'-mahr AHB'-dehl RAHK'-mahn (Sheik Omar Abdel-Rahman)

This is not only thoroughly unscientific but highly inaccurate. The guide happily inserts various sounds that don't exist in the original Arabic (a K in "Rahman", for example) and ignores several others that do exist. It also offers two different pronunciations of "Abdel-", for no logical reason.

Truly phonetic spelling follows the International Phonetic Alphabet, which is used academically by linguists. Its disadvantage in general use is that it requires characters outside the normal alphabet and is therefore more or less incomprehensible to non-specialists.


Transcription (Romanisation)

A DIFFERENT approach is to start with Arabic words in their written form and transcribe (or "Romanise") them by replacing individual Arabic letters with corresponding letters from the Roman alphabet. This sounds simple but is actually very difficult. For example:

  • Only eight Arabic letters have a clear equivalent in the Roman alphabet: B, F, K, L, M, N, R, and Z. 

  • Arabic has two distinct consonants that approximate to the sound of S. The same applies to D, H and T.

  • There are two glottal sounds that do not obviously correspond to any Roman letter.

The ideal solution would be to have a standard, internationally agreed, system. Several have been proposed but unfortunately none has been universally accepted. A selection of these can be viewed in PDF format at http://homepage.mac.com/sirbinks/pdf/Arabic.pdf.

Probably the earliest attempt at standardisation was the Deutsche Morgenländische Gesellschaft proposal, adopted by the International Convention of Orientalist Scholars in 1936. It is the system used in the Hans Wehr Arabic dictionary. Another standard was agreed in 1971 at a conference of Arab experts in Beirut and, theoretically at least, accepted by the countries of the Arab League. It has met some resistance, particularly in those Arab countries where French predominates over English. Other transcription/Romanisation systems include:

ALA-LC Romanization Tables 

Adopted by the US Library of Congress and the American Library Association for cataloguing books, the system has found its way into wider academic use. It covers a multitude of languages: there are 54 Romanisation tables for more than 150 languages and dialects written in non-Roman scripts. The table relating to Arabic may be viewed in PDF format at the following sites:

http://www.lib.umich.edu/area/Near.East/lcromanization.pdf 

http://archimedes.fas.harvard.edu/mdh/lcromanization.pdf   

http://lcweb.loc.gov/catdir/cpso/romanization/arabic.pdf   

Alternatively, a complete set of the tables may be purchased from amazon.com 

ISO 233 

Published by the International Organization for Standardization (ISO). Copies may be purchased from ISO.

British Standard BS 4280: 1968 

Not widely used, which is hardly surprising since the British Standards Institution holds the copyright (it cannot be reproduced here) and copies are expensive to buy (about $39 for an eight-page document).

United Nations Romanization System for Geographical Names

Overseen by the Group of Experts on Geographical Names (UNGEGN), this aims to promote "consistent use of accurate place names" on maps and similar products. Work on the project has been continuing since 1972. A progress report on Arabic romanization, dated March 2000, can be viewed in PDF format here.


Transliteration

THE transcription/romanization systems described above all suffer from the same disadvantages, to varying degrees:

1. They are difficult to memorise because they use special characters or add special marks to normal characters.

2. They can cause ambiguity by using digraphs (two-letter combinations) to represent single Arabic letters. For example, there is a risk of confusing SH (as in "sheep") with S H (as in "mis-hap") or TH (as in "thin") with T H (as in "hot-house").

3. They cannot be used easily with a standard computer keyboard.

This last point is particularly important today, though it could not be foreseen when most of the romanization systems were devised. Currently, the most advanced approaches involve precise letter-for-letter transcription systems which allow text files originally produced in Arabic to be romanized by a simple computer program and converted back again into perfect Arabic. Beyond straightforward text files, this has important implications for the use of databases.

Research in this area has been led by the Xerox company. For a detailed and interesting discussion of the issues, see Romanization, Transcription and Transliteration by Kenneth R. Beesley.

The Buckwalter Transliteration, developed by Tim Buckwalter, a lexicographer, is a system for "practical storage, display and email transmission of Arabic text in environments where the display of genuine Arabic characters is not possible or convenient". It avoids special characters and can be used quite simply by anyone with a knowledge of Arabic because the Roman equivalents of the Arabic letters are easy to remember. For details of the Buckwalter System see the encoding chart.

Anyone interested in this field should also explore ArabTeX, devised by Professor Klaus Lagally, which he defines as "a package extending the capabilities of TeX/LaTeX to generate the Arabic writing from an ASCII transliteration".


Media usage

THERE is a lot of disagreement in the English-language media about how to spell Arabic words and names in the Roman alphabet. Apart from variations in the spellings adopted by individual newspapers, magazines and news agencies, many of these organisations have no clear guidelines or fail to follow them consistently.

With increasing use of electronic archives, the spelling variations can make it almost impossible to retrieve all relevant articles with a reasonable degree of certainty.
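One way an archive might reduce the problem is to index each name under a crude "folded" key as well as its printed spelling. The sketch below is an illustration only: the folding rules are assumptions chosen to merge the spellings listed in this page's "Which spelling?" survey, and `fold_spelling` is a hypothetical helper, not an established algorithm.

```python
import re

# Assumption: g/k/q, y/i, e/i and o/u are treated as interchangeable.
_FOLD = str.maketrans("gkyeo", "qqiiu")


def fold_spelling(name: str) -> str:
    """Reduce a romanised name to a rough key so that common variants collide."""
    key = re.sub(r"[^a-z]", "", name.casefold())  # drop hyphens, apostrophes, spaces
    key = key.replace("dh", "d")                  # dhal written as "dh" or "d"
    key = key.translate(_FOLD)
    return re.sub(r"(.)\1+", r"\1", key)          # collapse doubled letters


variants = ["Gaddafi", "Qadhafi", "Gadafi", "Gadafy", "Qadhdhafi"]
print({fold_spelling(v) for v in variants})  # {'qadafi'}
```

The same rules merge Quran, Koran and Qur'an, and the four spellings of al-Qaeda, but they leave Muhammad and Mohammed apart and will sometimes merge names that are genuinely different. A key of this kind widens a search; it does not replace a consistent spelling policy.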

Variations in spelling can also confuse readers, as well as journalists themselves, and leave them wondering whether two (or more) apparently different names refer to the same person.

The two existing standards that seem most relevant to journalism are the ALA-LC and UN guidelines (see above). The two are very similar, but in some instances they resort to special characters that are impractical for media usage and would also baffle readers.

The romanisation scheme suggested below is a simplified version of the ALA-LC and UN guidelines which eliminates the need for special characters. It is proposed here for the purposes of discussion and readers’ comments are welcome.

SUGGESTED ROMANISATION FOR MEDIA USAGE

Letter    Romanisation
alif      a
ba        b
ta        t
tha       th
jim       j
ha        h
kha       kh
dal       d
dhal      dh
ra        r
zay       z
sin       s
shin      sh
sad       s
dad       d
tah       t
zah       z
ayn       ‘ (alt+ 0145)
ghayn     gh
fa        f
qaf       q
kaf       k
lam       l
mim       m
nun       n
ha        h
waw       w*
ya        y*

* when waw or ya is used as a consonant
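The cost of dropping special characters can be checked mechanically. The sketch below copies the letter names and romanisations from the table above and lists every romanisation shared by more than one Arabic letter; the names `SCHEME` and `collisions` are illustrative only.

```python
from collections import defaultdict

# (letter name, romanisation) pairs as given in the suggested scheme.
SCHEME = [
    ("alif", "a"), ("ba", "b"), ("ta", "t"), ("tha", "th"), ("jim", "j"),
    ("ha", "h"), ("kha", "kh"), ("dal", "d"), ("dhal", "dh"), ("ra", "r"),
    ("zay", "z"), ("sin", "s"), ("shin", "sh"), ("sad", "s"), ("dad", "d"),
    ("tah", "t"), ("zah", "z"), ("ayn", "\u2018"), ("ghayn", "gh"), ("fa", "f"),
    ("qaf", "q"), ("kaf", "k"), ("lam", "l"), ("mim", "m"), ("nun", "n"),
    ("ha", "h"), ("waw", "w"), ("ya", "y"),
]


def collisions(scheme):
    """Map each romanisation used for more than one letter to those letters."""
    groups = defaultdict(list)
    for letter, roman in scheme:
        groups[roman].append(letter)
    return {roman: letters for roman, letters in groups.items() if len(letters) > 1}


for roman, letters in collisions(SCHEME).items():
    print(roman, "<-", ", ".join(letters))
```

Five romanisations (t, h, d, z and s) each stand for two Arabic letters, so the scheme cannot be converted back into Arabic automatically. That is a deliberate trade-off in favour of readability for a general audience.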

NOTES:

Short vowels: u, a, i (e and o are unnecessary).

Long vowels: uu; aa; ii. The principle of doubling a short "a" to make a long "aa" is well established (e.g. "salaam"). Logically, it could be applied to the other vowels. Some may prefer "oo" to "uu"; "ou" is a further alternative but could be mispronounced as "ow". Again, "ee" may be preferred to "ii".

Diphthongs: aw, ay. Some may prefer "au" and "ai".

Doubled consonants: normally write as double; omit in the case of digraphs (gh, th, etc) for visual reasons. Doubling is not always obvious from the written Arabic; omit if uncertain.

Digraphs: to avoid ambiguity, two-letter combinations which are not digraphs but resemble them should ideally be separated by ´ (ctrl+ ', space). Example: ad´ham.

Definite article: al- (no assimilation with "sun" letters, e.g. "al-shams" not "ash-shams").

Capitalisation: capitalisation of "bin" in Arabic names would logically follow in-house capitalisation rules for "von" (German) or "du" (French). Logically, "abu" and "abd al-" require the same treatment as "bin".

ta marbouta: the options currently in use are: a, at, ah, eh, et. Readers' views on the acceptability or otherwise of these options are welcome.

jim: in a colloquial context, g can replace j where that is the normal pronunciation.

qaf: in a colloquial context, g can replace q where that is the normal pronunciation.

hamzah: may be omitted at beginning of word; elsewhere use apostrophe ’ (alt+ 0146).
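The note on digraphs can be made concrete with a small sketch that splits a romanised word into letter units. It assumes the five consonant digraphs of the suggested scheme (th, kh, dh, sh, gh) and the ´ separator described above; `split_letters` is a hypothetical helper, and long vowels such as "aa" are deliberately left as two units.

```python
DIGRAPHS = ("th", "kh", "dh", "sh", "gh")
SEPARATOR = "\u00b4"  # the acute accent suggested for splitting false digraphs


def split_letters(word: str) -> list[str]:
    """Split a romanised word into units, reading digraphs greedily."""
    units = []
    i = 0
    while i < len(word):
        if word[i] == SEPARATOR:
            i += 1                      # the separator itself is not a letter
        elif word[i:i + 2] in DIGRAPHS:
            units.append(word[i:i + 2])
            i += 2
        else:
            units.append(word[i])
            i += 1
    return units


print(split_letters("adham"))        # ['a', 'dh', 'a', 'm']
print(split_letters("ad\u00b4ham"))  # ['a', 'd', 'h', 'a', 'm']
```

Without the separator a program, like a reader, has no way of knowing that the d and h in "adham" are separate letters.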

The point of setting a standard is to apply it universally, or at least to make as few exceptions as possible. However, this is difficult to achieve with Arabic words because so many mis-transliterations have entered common usage. It is suggested that the guidelines may be waived in the following circumstances:

1. Names, where a person or organisation has clearly indicated a preferred spelling.

2. Places, where a particular spelling has been adopted locally.

3. Religious terms, where a particular spelling has been adopted locally by believers.

4. Colloquial terms and expressions may be spelled phonetically.

     


 

Which spelling?

Using the Google search engine, we checked for the most common spellings of some Arabic names ...
   
Muhammad     41%
Mohammed     32%
Mohamed      25%
Mahomet       3%

Gaddafi      72%
Qadhafi      16%
Gadafi        8%
Gadafy        2%
Qadhdhafi     1%

Quran        44%
Koran        37%
Qur'an       19%

al-Qaeda     61%
al-Qaida     36%
al-Qa'ida     2%
al-Qa'eda     1%

Mecca        85%
Makkah       14%
Mekkah        1%

 

 
 



 

Last revised on 18 June, 2009