The development of a standardized orthography for Azerbaijani using the Arabic script in Iran began in the late 20th century. Historically, the Persian alphabet has been used for Azerbaijani; however, linguists associated with the standardization movement, such as those contributing to the journal Varlıq (est. 1979), argue that the unmodified Persian system is a poor phonetic fit for the language. For example, the Persian script contains multiple letters for the same consonant sound, such as the letters ت and ط for the voiceless alveolar plosive [t], and it lacks dedicated characters or diacritics for several vowel phonemes specific to Turkic languages.

Efforts to formalize these conventions culminated in a series of linguistic seminars held in Tehran in 2001. Chaired by Javad Heyat, the founder of Varlıq, these sessions produced a document outlining a standardized orthography for the public.

While the Arabic-based script remains the most widespread medium for the language in Iran, its usage patterns have shifted in the 21st century. Although Article 15 of the Iranian Constitution provides for the use of regional and tribal languages in the press and mass media, as well as the teaching of their literature in schools, a formal state-wide curriculum for Azerbaijani has not been fully implemented. In recent decades, the adoption of the Latin alphabet has increased among younger speakers. This trend is often attributed to the influence of the Latin-based script used in the Republic of Azerbaijan and the technical convenience of Latin-based keyboard layouts on digital platforms and mobile devices.
== Vowels ==
In the Azerbaijani Arabic alphabet, nine vowels are defined. Six of those vowels are present in Persian, whereas three are missing. Diacritics (including hamza) in combination with the letters alef (ا), vav (و) or ye (ی) are used to mark each of these vowels.

Note that, as in the Persian alphabet, a vowel in initial position always requires an alef (ا), followed if needed by either vav (و) or ye (ی). This excludes Arabic loanwords that may start with ʿayn (ع).

Below are the six vowel sounds in common with Persian, with their representation in the Latin and Arabic alphabets.
• Ə-ə (اَ / ـَ / ـه / ه): A front vowel; marked only with the fatha (ـَ) diacritic, or with a he in middle or final positions in a word. Examples include: , ,
• E-e (ائ / ئ): A front vowel; marked with a hamza on top of a ye (ئ). Examples include: ,
• O-o (اوْ / وْ): A rounded back vowel; shown with vav (و), either unmarked or marked with sukun (zero-vowel) (وْ). Examples include: , , .
• A-a (آ / ا): A back vowel; shown with alef (ا) in middle and final positions, and alef-maddeh (آ) in initial position. Examples include: ,
• İ-i (ای / ی): A front vowel; shown with a ye (ی) and no diacritic. Examples include: ,
• U-u (اوُ / وُ): A back vowel; shown with a vav and a ḍammah (وُ). Examples include: ,

Below are the three vowels that do not exist in Persian and are marked with diacritics.
• Ö-ö (اؤ / ؤ): A front vowel; shown with a hamza on top of a vav (ؤ). Examples include: ,
• Ü-ü (اۆ / ۆ): A front vowel; shown with a "v" diacritic on top of a vav (ۆ). Examples include , ,
• I-ı (ایٛ / یٛ) (rarely used and usually substituted by ی): A back vowel; shown with an inverted "v" diacritic on top of a ye (یٛ). Examples include: , , ,
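The letter-plus-diacritic combinations described above can be written out as Unicode code point sequences. The following is a minimal sketch rather than an official encoding table: the names `MEDIAL_VOWELS`, `initial_form` and `describe` are hypothetical, and the choice of U+06C6 for the vav carrying a "v" mark and U+065B for the inverted "v" mark is an assumption about how these shapes are usually encoded.

```python
import unicodedata

# Base letters and diacritics named in the vowel descriptions.
ALEF, VAV, YE = "\u0627", "\u0648", "\u06cc"
FATHA, DAMMA, SUKUN = "\u064e", "\u064f", "\u0652"
INVERTED_V = "\u065b"  # assumed code point for the inverted "v" mark

# Medial/final spelling of each of the nine vowels, keyed by Latin letter.
MEDIAL_VOWELS = {
    "ə": FATHA,            # fatha on the preceding consonant
    "e": "\u0626",         # ye with hamza above
    "o": VAV + SUKUN,      # vav with sukun
    "a": ALEF,             # plain alef
    "i": YE,               # ye, no diacritic
    "u": VAV + DAMMA,      # vav with damma
    "ö": "\u0624",         # vav with hamza above
    "ü": "\u06c6",         # vav with a "v" mark (assumed encoding)
    "ı": YE + INVERTED_V,  # ye with inverted "v" mark
}


def initial_form(vowel):
    """Spell a word-initial vowel: it always starts with an alef."""
    if vowel == "a":
        return "\u0622"  # alef-maddeh replaces alef + alef
    return ALEF + MEDIAL_VOWELS[vowel]


def describe(text):
    """List the Unicode character names in a spelling, for inspection."""
    return [unicodedata.name(ch) for ch in text]


if __name__ == "__main__":
    for latin in MEDIAL_VOWELS:
        print(latin, describe(initial_form(latin)))
```

Keeping the diacritics as separate code points makes it easy to strip them later, which mirrors the convention that most of them are optional in everyday writing.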
=== Vowel harmony ===
Like other Turkic languages, Azerbaijani has a system of vowel harmony. It is primarily a front/back system: all vowels in a word must be pronounced either at the front or at the back of the mouth. In Azerbaijani there are two suffixes that form the plural, -lər and -lar, with front and back vowels respectively. The same variety of options for suffixes exists across the board in Azerbaijani. Here is how vowel harmony works, in an example of a word in which the vowels are all front:
• The word for is . The word for is . ( is incorrect.)
And below are examples for back vowels:
• The word for is , thus the word for is .

A secondary vowel harmony system exists in Azerbaijani, which is a rounded/unrounded system. This applies to some (but not all) of the suffixes. For example, there are four variations of the common suffix: -lı/-li and -lu/-lü.
• The word for is . The word for will be .
• In Azerbaijani, the city of Tabriz is Təbriz. The word for someone from Tabriz is Təbrizli.
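The two harmony systems can be illustrated with a short sketch in Latin script. It assumes that a suffix harmonizes with the last vowel of the stem and ignores loanwords that break harmony; the function names are hypothetical, and the sample words (ev "house", kitab "book", duz "salt", göz "eye") are chosen here for illustration and are not taken from the text above.

```python
FRONT = set("eəiöü")
BACK = set("aıou")
ROUNDED = set("oöuü")


def fold(ch):
    """Lowercase one letter using Azerbaijani casing for dotted/dotless i."""
    if ch == "I":
        return "ı"
    if ch == "İ":
        return "i"
    return ch.lower()


def last_vowel(word):
    """Return the last vowel of the word, which governs the suffix vowel."""
    for ch in reversed(word):
        ch = fold(ch)
        if ch in FRONT or ch in BACK:
            return ch
    raise ValueError(f"no vowel found in {word!r}")


def plural(word):
    """Two-way (front/back) harmony: -lər after front, -lar after back."""
    return word + ("lər" if last_vowel(word) in FRONT else "lar")


def with_li(word):
    """Four-way harmony: front/back combined with rounded/unrounded."""
    vowel = last_vowel(word)
    if vowel in FRONT:
        suffix = "lü" if vowel in ROUNDED else "li"
    else:
        suffix = "lu" if vowel in ROUNDED else "lı"
    return word + suffix


if __name__ == "__main__":
    for stem in ("ev", "kitab"):
        print(stem, plural(stem))
    for stem in ("Təbriz", "duz", "göz"):
        print(stem, with_li(stem))
```

The explicit `fold` helper is there because default Unicode lowercasing maps "I" to "i", which would misclassify the back vowel I-ı as a front vowel.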
=== Conventions on writing of vowels ===
In the Perso-Arabic script, and in Arabic scripts in general, diacritics are usually not written out, except in texts for beginners or to avoid confusion with a similarly written word. In the Azerbaijani Arabic alphabet, there are conventions regarding the writing of diacritics.

For A-a (آ / ا), the vowel is always written and shown with alef.

For Ə-ə (اَ / ـَ / ـه / ه), the initial vowel is written with an alef. Vowels in the middle of the word are written in one of two ways: either with a diacritic, which usually need not be written, or with a final he (ه). The former is used in closed syllables (CVC) and in the first open syllable of the word. The latter is used in open syllables (CV), with the exception of the first syllable of the word. Note that the vowel he (ه) is not attached to the following letter, but is separated from it with a zero-width non-joiner. For example, the word gələcəyim (gə-lə-cəy-im) is written as گله‌جگیم. Note that the first syllable of the word is open, but it is not marked. The second syllable is open, and thus the vowel is marked with he (ه), not attached to the following letter. Also note the breakdown of the word into syllables: this is because the word is made up of gələcək plus the possessive suffix -im.

For E-e (ائ / ئ), the sound is shown with a hamzeh on top of a ye in almost all cases. The exceptions are loanwords of Persian, Arabic, or European origin. For example, is written as . Writing it as is incorrect. Other examples include , , and . In words, for both Azerbaijani and loanwords, if and come side by side, both letters are written; e.g., , , , . Loanwords from Persian or Arabic which contain the sound , but are adopted in Azerbaijani with an sound, are shown with . Examples include , , .

For İ-i (ای / ی), the sound is always shown with ye (ی).

For I-ı (ایٛ / یٛ), the sound is always shown with ye (ی). Writing the diacritic is optional, and is done only in beginner language lesson books or to avoid confusion with a similarly written word. Native speakers can usually read words without the diacritic, as they are aware of vowel harmony rules (meaning that they can infer the correct pronunciation of ی from the other vowels in the word). In words like , familiarity with the vocabulary helps native speakers.

For the round vowels O-o (اوْ / وْ), U-u (اوُ / وُ), Ö-ö (اؤ / ؤ), and Ü-ü (اۆ / ۆ), it is recommended that the first syllable containing such a vowel be marked with a diacritic, while the rest can remain unmarked and written solely with a vav (و). This reduces the effort of marking vowels, while also giving readers a clue as to whether the vowels of the word are front or back. Examples include , , . However, it is recommended that new learners write diacritics on all round vowels, e.g., , , .

In daily practice, it is rare to see vowels other than Ö-ö (اؤ / ؤ) marked. This may be because hamza is the only one of these symbols that is frequently written in Persian as well, and because the inverted "v" diacritic for I-ı (ایٛ / یٛ) does not exist on typical Persian keyboards.
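The convention for Ə-ə in a non-initial open syllable, where a vowel he is followed by a zero-width non-joiner so that it does not connect to the next letter, can be shown at the code point level. This is a minimal sketch: the helper name `vowel_he` is hypothetical, and the spelling built here is the word gələcəyim as segmented in the text (gə-lə-cəy-im), with the unmarked vowels simply left out.

```python
import unicodedata

ZWNJ = "\u200c"  # zero-width non-joiner
HE = "\u0647"    # Arabic letter heh, used here as the vowel letter


def vowel_he(before, after):
    """Join two letter runs with a vowel he that does not connect forward."""
    return before + HE + ZWNJ + after


# gələcəyim: gaf + lam, then he + ZWNJ, then jim + gaf + ye + mim.
# The first syllable (gə) is open but unmarked; the second (lə) gets the he.
word = vowel_he("\u06af\u0644", "\u062c\u06af\u06cc\u0645")

if __name__ == "__main__":
    print(word)
    print([f"U+{ord(ch):04X}" for ch in word])
    print(unicodedata.name(ZWNJ))
```

Because the non-joiner is an invisible format character, text processing such as search or sorting may need to decide explicitly whether to keep or strip it, for example with `word.replace(ZWNJ, "")`.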
== Consonants ==
While the Azerbaijani Latin alphabet has nine vowels and twenty-three consonants, the Azerbaijani Arabic alphabet has thirty consonant letters, as some sounds are represented by more than one letter. Highlighted columns indicate letters from Persian or Arabic that are used exclusively in loanwords, and not in native Azerbaijani words.
=== Notes ===
• Arabic loanwords that in their original spelling end in ʿayn (ع), such as "طمع" (təmə') (meaning greed) or "متاع" (məta') (meaning baggage), are instead pronounced in Azerbaijani with a final [h]. Thus they are written with a "ح" (he), e.g. "طاماح" (tamah), "ماتاح" (matah). (Note that the vowels of these words are also changed in accordance with the vowel harmony system.) If the change in pronunciation of ʿayn (ع) happens mid-word, it is written as "ه / هـ". An example is "فعله" (fə'lə) (meaning worker), which is written as "فهله" (fəhlə).
• Loanwords that start with the consonant sequences "SK, ST, SP, ŞT, ŞP" are written in Azerbaijani Arabic script with an initial "ای" (i), e.g. "ایستئیک" (isteyk) (meaning steak), "ایسپورت" (isport) (meaning sports).
• There is a distinction between the pronunciation of "غ" and "ق" in Azerbaijani. Such a distinction does not exist in standard Iranian Persian. In any case, loanwords from Arabic or Persian keep their original spelling, regardless of how their "غ" and "ق" are pronounced. This is not a rule in the Latin alphabet. An example is the word for Afghan, "افغان" (Əfqan). The "غ" here is pronounced as a [g] in Azerbaijani, that is, as if it were a "ق", which is how it is written in Latin. But the spelling of the loanword in Azerbaijani Arabic remains the same.
• Loanwords whose original spelling had a "گ" (G g) but which are written in the Latin alphabet with a Q q are written with a "ق". Examples include "قاز" (Qaz) (meaning gas, written as "گاز" in Persian) and "اوْرتوقرافی" (Orfoqrafi) (meaning orthography, written as "اورتوگرافی" in Persian).
• When a suffix is added to a word ending in "ک" (K k) so that the "ک" falls between two vowels, its pronunciation changes to [j], equivalent to the letter "ی" (Y y). This change is reflected in Latin writing. However, in the Arabic script, in order to maintain the familiar shape of the word, the letter "گ" (G g) is used (in a role dubbed "soft G"), as it is similar in shape to "ک". Examples: "çörək+im" becomes "çörəyim" in Latin script (meaning my bread), but "چؤرک+یم" becomes "چؤرگیم"; "gələcək+im" becomes "gələcəyim" in Latin script (meaning my future), but "گله‌جک+یم" becomes "گله‌جگیم".
• Whenever the letter "ی" (Y) is placed between two "ای" (İ-i) vowels, it is written as "گ" (G g) (in the same "soft G" role). This is not done in Latin script. Example: "ایگیرمی" (iyirmi) (meaning twenty).
• The letters "و", "ه / هـ", and "ی" have a double function, as consonants and as parts of vowels. When used as consonants, they are written with no diacritic or marking.
• Shadda, the Arabic diacritic for gemination, is retained for loanwords from Arabic. Examples: "مۆکمّل" (mükəmməl) (meaning complementary), "مدنیّت" (mədəniyyət) (meaning civility). In native Azerbaijani words and in loanwords of European origin, double consonants are written twice. Examples: "یئددی" (yeddi) (meaning seven), "ساققال" (saqqal) (meaning beard), "اوْتللو" (Otello).

== Sample texts ==