Proto-Indo-European phonology

Proto-Indo-European is reconstructed as having the following phonemes. The phonemes are marked with asterisks to show that they belong to a reconstructed language. See the article on Indo-European sound laws for a summary of how these phonemes are reflected in the various Indo-European languages.

==Consonants==
The table uses Wiktionary's notation for transcribing Proto-Indo-European; variant transcriptions often seen elsewhere are provided for individual segments in the following sections. Raised ʰ stands for aspiration, and raised ʷ for labialization. The consonant *y is the palatal semivowel (whose IPA transcription is [j] and not [y]).

[Chart: alternative overview of the phoneme inventory, merging all diverging theories discussed in the sections below.]

===Stop series===
Proto-Indo-European was formerly reconstructed with four series of stops: voiceless unaspirated, voiceless aspirated, voiced unaspirated, and voiced aspirated (such as *t, *tʰ, *d, *dʰ). More recent reconstructions analyze voiceless aspirated stops as sequences of stop and laryngeal, and so the standard reconstruction now includes only three series of stops, with the traditional phonetic descriptions of voiceless, voiced and voiced aspirated. However, such a three-way system is not found in any daughter language (Sanskrit has a fourfold distinction, including a voiceless aspirated series), and it is typologically rare across attested languages. The absence or rarity of *b (see below) is also unusual. Additionally, Proto-Indo-European roots are subject to a constraint that forbids them from mixing voiceless and voiced aspirated stops or from containing two voiced stops.

These considerations have led some scholars to propose a glottalic theory of the PIE stop system, replacing the voiced stops with glottalized stops and the voiced aspirated stops with plain voiced stops. Direct evidence for glottalization is limited, but there is some indirect evidence, including Winter's law in Balto-Slavic.

===Labials and coronals===
PIE *p, *b, *bʰ are grouped with the cover symbol P.
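The root constraint noted in the stop-series discussion above can be checked mechanically. The following Python sketch is only an illustration: the tokenizing pattern, the function names `stop_series` and `violates_root_constraint`, and the example shapes marked as hypothetical are assumptions made here, not part of any standard tool. It models only the two restrictions as stated above and ignores any refinements discussed in the literature.

```python
import re
import unicodedata

# One stop per match; an optional ʷ and ʰ are consumed with their base letter.
# \u1e31 is ḱ and \u01f5 is ǵ (written as escapes to avoid encoding ambiguity).
STOP = re.compile("[ptk\u1e31]ʷ?|[bdg\u01f5]ʷ?ʰ?")


def stop_series(root):
    """Return the series of each stop in a bare root: 'T', 'D' or 'Dh'."""
    root = unicodedata.normalize("NFC", root)
    series = []
    for match in STOP.finditer(root):
        stop = match.group()
        if stop.endswith("ʰ"):
            series.append("Dh")   # voiced aspirated
        elif stop[0] in "ptk\u1e31":
            series.append("T")    # voiceless
        else:
            series.append("D")    # plain voiced
    return series


def violates_root_constraint(root):
    """True if the root has two plain voiced stops or mixes T with Dh."""
    series = stop_series(root)
    two_voiced = series.count("D") >= 2
    mixed = "T" in series and "Dh" in series
    return two_voiced or mixed


print(violates_root_constraint("*bʰewdʰ-"))  # False: two voiced aspirates
print(violates_root_constraint("*ped-"))     # False: voiceless plus voiced
print(violates_root_constraint("*deg-"))     # True: hypothetical shape, two voiced stops
print(violates_root_constraint("*tebʰ-"))    # True: hypothetical shape, T mixed with Dh
```

The input is assumed to be a bare root, so stops belonging to suffixes or endings should be stripped before the check.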
The phonemic status of *b is disputed: it seems not to appear as an initial consonant (except in a few dubious roots such as *bel-, noted below), while reconstructed roots with internal *b are usually restricted to Western branches, casting doubt on their validity for PIE. Some have attempted to explain away the few roots with *b as a result of later phonological developments. Suggested developments of this kind include:
• *ml- > *bl-, connecting the dubious root *bel- 'power, strength' (> Sanskrit bálam, Ancient Greek βελτίων) with mel- in Latin melior, and *h₂ebl-/*h₂ebōl 'apple' with a hypothetical earlier form *h₂eml-, which is attested in unmetathesized form in another reconstructible PIE word for apple, *méh₂lom (> Hittite maḫla-, Latin mālum, Ancient Greek μῆλον).
• In PIE *ph₃ the *p regularly gives *b; for example, the reduplicated present stem of *peh₃- 'to drink' > *pi-ph₃- > Sanskrit píbati.

At best, PIE *b remains a highly marginal phoneme.

The standard reconstruction identifies three coronal, or dental, stops: *t, *d, *dʰ. They are symbolically grouped with the cover symbol T.

===Dorsals===
According to the traditional reconstruction, such as the one laid out in Brugmann's Grundriß der vergleichenden Grammatik der indogermanischen Sprachen more than a century ago, three series of velars are reconstructed for PIE:
• "Palatovelars" (or simply "palatals"), *ḱ, *ǵ, *ǵʰ (also transcribed *k', *g', *g'ʰ or *k̑, *g̑, *g̑ʰ or *k̂, *ĝ, *ĝʰ).
• "Plain velars" (or "pure velars"), *k, *g, *gʰ.
• Labiovelars, *kʷ, *gʷ, *gʷʰ (also transcribed *kᵘ̯, *gᵘ̯, *gᵘ̯ʰ). The raised ʷ or ᵘ̯ stands for labialization (lip rounding) accompanying the velar articulation.

The actual pronunciation of these sounds in PIE is not certain. One current idea is that the "palatovelars" were in fact simple velars, i.e. [k], [ɡ], [ɡʱ], while the "plain velars" were pronounced farther back, perhaps as uvular consonants, i.e. [q], [ɢ], [ɢʱ].
If the labiovelars were just labialized forms of the "plain velars", they would then have been pronounced [qʷ], [ɢʷ], [ɢʷʱ], but the pronunciation of the labiovelars as [kʷ], [ɡʷ], [ɡʷʱ] would still be possible in the uvular theory if the satem languages first shifted the "palatovelars" and then later merged the "plain velars" and "labiovelars".

Another theory is that there may have been only two series (plain velar and labiovelar) in PIE, with the palatalized velars arising originally as a conditioned sound change in satem languages.

The satem languages merged the labiovelars *kʷ, *gʷ, *gʷʰ with the plain velar series *k, *g, *gʰ, while the palatovelars *ḱ, *ǵ, *ǵʰ became sibilant fricatives or affricates of various types, depending on the individual language. In some phonological conditions, depalatalization occurred, yielding what appears to be a centum reflex in a satem language. For example, in Balto-Slavic and Albanian, palatovelars were depalatalized before resonants unless the latter were followed by a front vowel. The reflexes of the labiovelars are generally indistinguishable from those of the plain velars in satem languages, but there are some words where the lost labialization has left a trace, such as by u-coloring the following vowel.

The centum group of languages, on the other hand, merged the palatovelars *ḱ, *ǵ, *ǵʰ with the plain velar series *k, *g, *gʰ, while the labiovelars *kʷ, *gʷ, *gʷʰ were in general kept distinct. Centum languages show delabialization of labiovelars when adjacent to *w (or its allophone *u), according to a rule known as the boukólos rule.

===Fricatives===
The only certain PIE fricative phoneme, *s, was a strident sound whose phonetic realization could range from [s] to palatalized variants. It had a voiced allophone *z that emerged by assimilation in words such as *nisdós ('nest'), and which later became phonemicized in some daughter languages. Some PIE roots have variants with *s appearing initially: such an *s is called s-mobile.
The "laryngeals" may have been fricatives, but there is no consensus as to their phonetic realization.

===Laryngeals===
The phonemes *h₁, *h₂, *h₃ (or *ə₁, *ə₂, *ə₃ and /ə/), marked with the cover symbol H (also denoting "unknown laryngeal"), stand for three "laryngeal" phonemes. The term laryngeal as a phonetic description is largely obsolete, retained only because its usage has become standard in the field.

The phonetic values of the laryngeal phonemes are disputed. Suggestions for their exact phonetic value range from cautious claims that all that can be said with certainty is that *h₂ represented a fricative pronounced far back in the mouth and that *h₃ exhibited lip-rounding, up to more definite proposals; e.g. Meier-Brügger writes that realizations of *h₁ = [h], *h₂ = [χ] and *h₃ = [ɣ] or [ɣʷ] "are in all probability accurate". Another commonly cited speculation for *h₁, *h₂, *h₃ is [ʔ ʕ ʕʷ] (e.g. Beekes). Simon (2013) has argued that the Hieroglyphic Luwian sign *19 stood for /ʔa/ (distinct from /a/) and represented the reflex of *h₁. It is possible, however, that all three laryngeals ultimately fell together as a glottal stop in some languages. Evidence for this development in Balto-Slavic comes from the eventual development of post-vocalic laryngeals into a register distinction commonly described as "acute" (vs. "circumflex" register on long vocalics not originally closed by a laryngeal) and marked in some fashion on all long syllables, whether stressed or not; furthermore, in some circumstances original acute register is reflected by a "broken tone" (i.e. glottalized vowel) in modern Latvian.

The schwa indogermanicum symbol *ə is sometimes used for a laryngeal between consonants, in a "syllabic" position.

===Sonorants===
In a phonological sense, sonorants in Proto-Indo-European were those segments that could appear both in the syllable nucleus (i.e. they could be syllabic) and out of it (i.e. they could be non-syllabic).
PIE sonorants consist of liquids, nasals and glides: more specifically, *r, *l, *n, *y (or *i̯) are non-labial sonorants, grouped with the cover symbol R, while the labial sonorants *m, *w (or *u̯) are marked with the cover symbol M. All of them had syllabic allophones, transcribed *r̥, *l̥, *n̥, *i, *m̥, *u, which generally were used between consonants, word-initially before a consonant, or word-finally after a consonant. Even though *i and *u were certainly phonetic vowels, they behave phonologically as syllabic sonorants. [ŋ] was an allophone of *n before velar consonants.

===Reflexes===
Some of the changes undergone by the PIE consonants in daughter languages are the following:
• Proto-Celtic, Albanian, Proto-Balto-Slavic and Proto-Iranian merged the voiced aspirated series *bʰ, *dʰ, *ǵʰ, *gʰ, *gʷʰ with the plain voiced series *b, *d, *ǵ, *g, *gʷ. (In Proto-Balto-Slavic this postdated Winter's law. Proto-Celtic retains the distinction between *gʷʰ and *gʷ: the former became *gʷ while the latter became *b.)
• Proto-Germanic underwent Grimm's law and Verner's law, changing voiceless stops into voiceless or voiced fricatives, devoicing unaspirated voiced stops, and fricativizing and deaspirating voiced aspirates.
• Grassmann's law (Tʰ-Tʰ > T-Tʰ, e.g. dʰi-dʰeh₁- > di-dʰeh₁-) and Bartholomae's law (TʰT > TTʰ, e.g. budʰ-to- > bud-dʰo-) describe the behaviour of aspirates in particular contexts in some early daughter languages.

Sanskrit, Greek, and Germanic, along with Latin to some extent, are the most important for reconstructing PIE consonants, as all of these languages keep the three series of stops (voiceless, voiced and voiced-aspirated) separate. In Germanic, Verner's law and changes to labiovelars (especially outside of Gothic) obscure some of the original distinctions; on the other hand, Germanic is not subject to the dissimilations of Grassmann's law, which affects both Greek and Sanskrit. Latin also keeps the three series separate, but mostly obscures the distinctions among voiced-aspirated consonants in initial position (all except *gʰ become /f/) and collapses many distinctions in medial position.
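The Grimm's law shifts listed earlier in this section can be illustrated as a simultaneous substitution. The following Python sketch is a simplification under stated assumptions: Verner's law and all conditioned exceptions are ignored, the palatovelars are assumed to have already merged with the plain velars, the fricative symbols on the right-hand side are one conventional notation, and the names `GRIMM` and `grimm_shift` are hypothetical.

```python
import re

# PIE stop -> early Proto-Germanic outcome (Grimm's law only, no Verner's law).
GRIMM = {
    # voiceless stops become voiceless fricatives
    "p": "f", "t": "þ", "k": "x", "kʷ": "xʷ",
    # plain voiced stops are devoiced
    "b": "p", "d": "t", "g": "k", "gʷ": "kʷ",
    # voiced aspirates become (deaspirated) voiced fricatives
    "bʰ": "β", "dʰ": "ð", "gʰ": "ɣ", "gʷʰ": "ɣʷ",
}

# Longest keys first, so that "gʷʰ" is not misread as "g" plus leftovers.
_PATTERN = re.compile(
    "|".join(sorted((re.escape(k) for k in GRIMM), key=len, reverse=True))
)


def grimm_shift(form):
    """Apply all shifts in a single pass so outputs are never shifted again."""
    return _PATTERN.sub(lambda m: GRIMM[m.group()], form)


print(grimm_shift("ped"))   # fet
print(grimm_shift("bʰer"))  # βer
```

A single pass matters here: applying the rules one after another would let *d > *t feed *t > *þ, which is not what the sound law describes.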
Greek is of particular importance for reconstructing labiovelars, as other languages tend to delabialize them in many positions.

Anatolian and Greek are the most important languages for reconstructing the laryngeals. Anatolian directly preserves many laryngeals, while Greek preserves traces of laryngeals in positions (e.g. at the beginning of a word) where they disappear in many other languages, and reflects each laryngeal differently from the others (the so-called triple reflex) in most contexts. Balto-Slavic languages are sometimes valuable in reconstructing laryngeals, since laryngeals are relatively directly represented in the distinction between "acute" and "circumflex" vowels. Old Avestan faithfully preserves numerous relics (e.g. laryngeal hiatus, laryngeal aspiration, laryngeal lengthening) triggered by ablaut alternations in laryngeal-stem nouns, but the paucity of the Old Avestan corpus prevents it from being more useful. Vedic Sanskrit preserves the same relics rather less faithfully, but in greater quantity, making it sometimes useful.

==Vowels==
It is disputed how many vowels Proto-Indo-European had, or even what counts as a "vowel" in the language. It is generally agreed that at least four vowel segments existed, which are typically denoted as *e, *o, *ē and *ō. All of them are morphologically conditioned to varying extents. The long vowels are less common than the short vowels, and their morphological conditioning is especially strong, suggesting that at an earlier stage there may not have been a length opposition, and a system with as few as two vowels (or even only one vowel, according to some researchers) may have existed.
The surface vowels *i and *u were extremely common, and syllabic sonorants *r̥, *l̥, *m̥, *n̥ existed, but these sounds are usually analyzed as syllabic allophones of the sonorant consonants *y, *w, *r, *l, *m, *n. The syllabic and non-syllabic versions of these sounds alternate in the inflectional paradigms of words such as *dóru ('tree, wood') (reconstructed with genitive singular *dréws and dative plural *drúmos) or in the derivation of words such as the noun *yugóm ('yoke') with *u, from the same root as the verb *yewg- ('to yoke, harness, join') with *w. Some authors have argued that there is substantial evidence for reconstructing a non-alternating phoneme *i in addition to an alternating phoneme *y, as well as weaker evidence for a non-alternating phoneme *u.

Furthermore, all the daughter languages have a segment /a/, and those with long vowels generally have long /aː/, /iː/, /uː/. Until the mid-20th century, PIE was reconstructed with all of those vowels. Modern versions incorporating the laryngeal theory, however, tend to view these vowels as later developments of sequences involving the PIE laryngeal consonants *h₁, *h₂, *h₃. For example, what used to be reconstructed as PIE *ā is now often reconstructed as *eh₂; *ī, *ū are now reconstructed as *iH, *uH (*H representing any laryngeal); and *a has various origins, among which are a "syllabic" laryngeal (any laryngeal not adjacent to a vowel) or an *e next to the "a-coloring" laryngeal *h₂. (Though such sequences may have phonetically contained the vowel [a] in spoken PIE, it would be an allophone of *e, not an independent phoneme.) Some researchers, however, have argued that an independent phoneme *a must be reconstructed, and that it cannot be traced back to any laryngeal.

Any sonorant consonant can form the second part of a complex syllable nucleus; all can form diphthongs with any of the vowels *e, *o, *ē, *ō (such as *ey, *ow, *en).

It is generally accepted that PIE did not allow vowels word-initially.
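The laryngeal-theory rewriting of the older long vowels and of *a described above can be sketched as a small set of ordered rewrite rules. The following Python sketch is an illustration under assumptions: it uses the common textbook simplification in which *h₂ colors an adjacent *e to *a, *h₃ colors it to *o, *h₁ leaves it unchanged, and a laryngeal after a vowel is lost with lengthening; the function name `apply_laryngeals` is hypothetical; and syllabic laryngeals, accent marks and language-specific outcomes are not modeled.

```python
import re

COLOR = {"h₁": "e", "h₂": "a", "h₃": "o"}
LONG = {"a": "ā", "e": "ē", "i": "ī", "o": "ō", "u": "ū"}
LARYNGEAL = "(?:h[₁₂₃]|H)"   # H is the cover symbol for any laryngeal
VOWEL = "[aeiouāēīōū]"


def apply_laryngeals(form):
    """Turn a laryngealist notation into the older 'post-laryngeal' vowels."""
    # 1. Coloring: *e next to a numbered laryngeal takes that laryngeal's color.
    form = re.sub("(h[₁₂₃])e", lambda m: m.group(1) + COLOR[m.group(1)], form)
    form = re.sub("e(h[₁₂₃])", lambda m: COLOR[m.group(1)] + m.group(1), form)
    # 2. A laryngeal after a short vowel (and not before another vowel)
    #    is lost, lengthening that vowel.
    form = re.sub(
        "([aeiou])" + LARYNGEAL + "(?!" + VOWEL + ")",
        lambda m: LONG[m.group(1)],
        form,
    )
    # 3. A remaining laryngeal directly before a vowel is lost without trace.
    form = re.sub(LARYNGEAL + "(?=" + VOWEL + ")", "", form)
    return form


print(apply_laryngeals("steh₂"))  # stā
print(apply_laryngeals("h₂ent"))  # ant
print(apply_laryngeals("bʰuH"))   # bʰū
```

The rule order is a design choice of this sketch: coloring has to apply before the laryngeal is deleted, since otherwise the information about which laryngeal stood next to *e would be lost.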
Vowel-initial words in earlier reconstructions are now usually reconstructed as beginning with one of the three laryngeals, which disappeared before a vowel (after coloring it, if possible) in all daughter languages except Hittite.

===Lengthened vowels===
Under particular morphological conditions (such as Proto-Indo-European ablaut) and phonological conditions (as in the last syllable of the nominative singular of a noun ending in a sonorant, in root syllables in the sigmatic aorist, etc.; compare Szemerényi's law, Stang's law), the vowels *e and *o would lengthen, yielding the respective lengthened-grade variants. The basic lexical forms of words therefore contained only short vowels; forms with the long vowels *ē and *ō arose from well-established morphophonological rules.

Lengthening of vowels may have been a phonologically conditioned change in Early Proto-Indo-European, but at the period just before the end of Proto-Indo-European, which is the stage usually reconstructed, it is no longer possible to predict the appearance of all long vowels phonologically, as the phonologically justified long vowels had begun to spread analogically to other forms in which they were not phonologically justified. The prosodically long *ē in *ph₂tḗr 'father' results from the application of Szemerényi's law, a synchronic phonological rule that operated within PIE, but the prosodically long *ō in *pṓds 'foot' was analogically levelled.

===/a/===
It is possible that Proto-Indo-European had a few morphologically isolated words with the vowel *a: *dap- 'sacrifice' (Latin daps, Ancient Greek δαπάνη, Old Irish dúas), or with *a appearing as the first part of a diphthong *ay: *laywos 'left' (Latin laevus, Ancient Greek λαιός, OCS lěvъ). The phonemic status of *a has been fiercely disputed; Beekes concludes: "There are thus no grounds for PIE phoneme *a"; his former student, Alexander Lubotsky, reaches the same conclusion.
After the discovery of Hittite and the development of the laryngeal theory, almost every instance of previous *a could be reduced to the vowel *e preceded or followed by the laryngeal *h₂ (rendering the previously reconstructed short and long *a respectively). The following arguments can be set forth against recognizing *a as a phoneme of PIE:
• it does not participate in ablaut alternations (it does not alternate with other vowels, as the "real" PIE vowels do),
• it makes no appearance in suffixes and endings,
• it appears in a very confined set of positions (usually after initial *k, which could be the result of that phoneme being a-coloring, particularly likely if it was uvular /q/),
• and words reconstructed with *a usually have reflexes in only a few Indo-European languages. For example, *bʰardʰéh₂ 'beard' is confined to the western and northern daughter families. Such words can therefore be ascribed to a late PIE dialectalism or regarded as expressive in character (like the interjection *wai 'alas'), which makes them unsuitable for comparative analysis, or they are argued to have been borrowed from some other language which had phonemic *a (like Proto-Semitic *θawru > PIE *táwros ('aurochs')).

However, others, like Manfred Mayrhofer, argue that *a and *ā phonemes existed independently of *h₂. This phoneme appears to be present in reconstructions such as *albʰós ("white"), *átta ("father"), or *apó ("away"), where the absence of a laryngeal is suggested by the respective Hittite descendants: 𒀠𒉺𒀸 (al-pa-aš, "cloud"), 𒀜𒋫𒀸 (at-ta-aš, "father"), 𒀀𒀊𒉺 (a-ap-pa, "behind").

===Reflexes===
Ancient Greek reflects the original late PIE vowel system (after the vowel-coloring and lengthening effects of the laryngeals) most faithfully, with few changes to vowels in any syllable, but its loss of certain consonants, especially *s, *w and *y, often triggered a compensatory lengthening or contraction of vowels in hiatus, which can complicate reconstruction.
Sanskrit and Avestan merge *e, *a and *o into a single vowel *a (with a corresponding merger in the long vowels) but reflect PIE length differences (especially from ablaut) even more faithfully than Greek, and they do not have the same issues with consonant loss as Greek. Furthermore, *o can often be reconstructed by Brugmann's law and *e by its palatalization of a preceding velar (see Proto-Indo-Iranian language).

Germanic languages show a merger of short *a and *o (to Proto-Germanic *a) and long *ā and *ō (to Proto-Germanic *ō) as well as a merger of *e and *i in non-initial syllables, but (especially in the case of Gothic) they are still important for reconstructing PIE vowels.

Evidence from Anatolian and Tocharian can be significant because of their conservatism, but it is often difficult to interpret. Tocharian, especially, has complex and far-reaching vowel innovations.

Italic languages and Celtic languages do not unilaterally merge any vowels, but have such far-reaching vowel changes (especially in Celtic and in the extreme vowel reduction of early Latin) that they are somewhat less useful. Albanian and Armenian are the least useful, as they are attested relatively late, have borrowed heavily from other Indo-European languages and have complex and poorly understood vowel changes.

In Proto-Balto-Slavic, short *o and *a were merged. A separate reflex of the original *o or *a is, however, argued to have been retained in some environments as a lengthened vowel because of Winter's law. Subsequently, Early Proto-Slavic merged *ō and *ā, which were retained in the Baltic languages. Additionally, accentual differences in some Balto-Slavic languages indicate whether the post-PIE long vowel originated from a genuine PIE lengthened grade or is a result of compensatory lengthening before a laryngeal.

==Accent==