Linguists traditionally recognize two primary divisions of Austroasiatic: the Mon–Khmer languages of Southeast Asia, Northeast India, and the Nicobar Islands, and the Munda languages of East and Central India and parts of Bangladesh and Nepal. However, no evidence for this classification has ever been published.

Each family written in boldface below is accepted as a valid clade. By contrast, the relationships between these families within Austroasiatic are debated. In addition to the traditional classification, two recent proposals are given, neither of which accepts traditional "Mon–Khmer" as a valid unit. However, little of the data used for the competing classifications has ever been published, so they cannot be evaluated by peer review.

There are also suggestions that additional branches of Austroasiatic might be preserved in substrata of Acehnese in Sumatra (Diffloth), the Chamic languages of Vietnam, and the Land Dayak languages of Borneo (Adelaar 1995).
=== Diffloth (1974) ===
Diffloth's widely cited original classification, now abandoned by Diffloth himself, is used in Encyclopædia Britannica and, except for the breakup of Southern Mon–Khmer, in Ethnologue.

• Austro‑Asiatic
  • Munda
    • North Munda
      • Korku
      • Kherwarian
    • South Munda
      • Kharia–Juang
      • Koraput Munda
  • Mon–Khmer
    • Eastern Mon–Khmer
      • Khmer (Cambodian)
      • Pearic
      • Bahnaric
      • Katuic
      • Vietic (Vietnamese, Muong)
    • Northern Mon–Khmer
      • Khasi (Meghalaya, India)
      • Palaungic
      • Khmuic
    • Southern Mon–Khmer
      • Mon
      • Aslian (Malaya)
      • Nicobarese (Nicobar Islands)
=== Peiros (2004) ===
Peiros's classification is lexicostatistic, based on percentages of shared vocabulary. This means that languages can appear to be more distantly related than they actually are due to language contact. Indeed, when Sidwell (2009) replicated Peiros's study with languages known well enough to account for loans, he did not find the internal (branching) structure below.

• Austro‑Asiatic
  • Nicobarese
  • Munda–Khmer
    • Munda
    • Mon–Khmer
      • Khasi
      • Nuclear Mon–Khmer
        • Mangic (Mang + Palyu) (perhaps in Northern MK)
        • Vietic (perhaps in Northern MK)
        • Northern Mon–Khmer
          • Palaungic
          • Khmuic
        • Central Mon–Khmer
          • Khmer dialects
          • Pearic
          • Asli-Bahnaric
            • Aslian
            • Mon–Bahnaric
              • Monic
              • Katu–Bahnaric
                • Katuic
                • Bahnaric

=== Diffloth (2005) ===
Diffloth compares reconstructions of various clades and attempts to classify them based on shared innovations, though, as with other classifications, the evidence has not been published. In detail, the proposed tree is:

• Austro‑Asiatic
  • Munda languages (India)
    • Koraput: 7 languages
    • Core Munda languages
      • Kharian–Juang: 2 languages
      • North Munda languages
        • Korku
        • Kherwarian: 12 languages
  • Khasi–Khmuic languages (Northern Mon–Khmer)
    • Khasian: 3 languages of northeastern India and the adjacent region of Bangladesh
    • Palaungo-Khmuic languages
      • Khmuic: 13 languages of Laos and Thailand
      • Palaungo-Pakanic languages
        • Pakanic or Palyu: 4 or 5 languages of southern China and Vietnam
        • Palaungic: 21 languages of Burma, southern China, and Thailand
  • Nuclear Mon–Khmer languages
    • Khmero-Vietic languages (Eastern Mon–Khmer)
      • Vieto-Katuic languages ?
        • Vietic: 10 languages of Vietnam and Laos, including Muong and Vietnamese, which has the most speakers of any Austroasiatic language
        • Katuic: 19 languages of Laos, Vietnam, and Thailand
      • Khmero-Bahnaric languages
        • Bahnaric: 40 languages of Vietnam, Laos, and Cambodia
        • Khmeric languages
          • The Khmer dialects of Cambodia, Thailand, and Vietnam
          • Pearic: 6 languages of Cambodia
    • Nico-Monic languages (Southern Mon–Khmer)
      • Nicobarese: 6 languages of the Nicobar Islands, a territory of India
      • Asli-Monic languages
        • Aslian: 19 languages of peninsular Malaysia and Thailand
        • Monic: 2 languages, the Mon language of Burma and the Nyahkur language of Thailand
=== Sidwell (2009–2015) ===
Paul Sidwell and Roger Blench propose that the Austroasiatic phylum dispersed via the Mekong River drainage basin.

Paul Sidwell (2009), in a lexicostatistical comparison of 36 languages that are known well enough to exclude loanwords, finds little evidence for internal branching. He did find an area of increased contact between the Bahnaric and Katuic languages: languages of all branches apart from the geographically distant Munda and Nicobarese show greater similarity to Bahnaric and Katuic the closer they are to those branches, without any noticeable innovations common to Bahnaric and Katuic. He therefore takes the conservative view that the thirteen branches of Austroasiatic should be treated as equidistant on current evidence.

Sidwell & Blench (2011) discuss this proposal in more detail and note that there is good evidence for a Khasi–Palaungic node, which could possibly also be closely related to Khmuic. If this is the case, Sidwell & Blench suggest that Khasic may have been an early offshoot of Palaungic that had spread westward. Sidwell & Blench (2011) also suggest Shompen as an additional branch and believe that a Vieto-Katuic connection is worth investigating. In general, however, the family is thought to have diversified too quickly for a deeply nested structure to have developed, since Proto-Austroasiatic speakers are believed by Sidwell to have radiated out from the central Mekong river valley relatively quickly.

Subsequently, Sidwell (2015a: 179) proposed that Nicobarese subgroups with Aslian, just as Khasian and Palaungic subgroup with each other.

A subsequent computational phylogenetic analysis (Sidwell 2015b) suggests that Austroasiatic branches may have a loosely nested structure rather than a completely rake-like structure, with an east–west division (Munda, Khasic, Palaungic, and Khmuic forming a western group, as opposed to all of the other branches) occurring possibly as early as 7,000 years before present. However, he still considers the subbranching dubious.

Integrating computational phylogenetic linguistics with recent archaeological findings, Paul Sidwell (2015c) further expanded his Mekong riverine hypothesis by proposing that Austroasiatic had ultimately expanded into Mainland Southeast Asia from the neighboring Lingnan area of southern China, with the subsequent Mekong riverine dispersal taking place after the initial arrival of Neolithic farmers from southern China. Sidwell (2015c) tentatively suggests that Austroasiatic may have begun to split up 5,000 years BP during the Neolithic transition era of mainland Southeast Asia, with all the major branches of Austroasiatic formed by 4,000 BP. Austroasiatic would have had two possible dispersal routes from the western periphery of the Pearl River watershed of Lingnan: either a coastal route down the coast of Vietnam, or a route downstream along the Mekong River via Yunnan.

Sidwell considers the Austroasiatic language family to have rapidly diversified around 4,000 years BP during the arrival of rice agriculture in Mainland Southeast Asia, but notes that the origin of Proto-Austroasiatic itself is older than that date. The lexicon of Proto-Austroasiatic can be divided into an early and a late stratum. The early stratum consists of basic lexicon, including body parts, animal names, natural features, and pronouns, while the names of cultural items (agricultural terms and words for cultural artifacts, which are reconstructible in Proto-Austroasiatic) form part of the later stratum.

Roger Blench (2017) suggests that vocabulary related to aquatic subsistence strategies (such as boats, waterways, river fauna, and fish capture techniques) can be reconstructed for Proto-Austroasiatic. Blench (2017) finds widespread Austroasiatic roots for 'river, valley', 'boat', 'fish', 'catfish sp.', 'eel', 'prawn', 'shrimp' (Central Austroasiatic), 'crab', 'tortoise', 'turtle', 'otter', 'crocodile', 'heron, fishing bird', and 'fish trap'. Archaeological evidence for the presence of agriculture in northern Mainland Southeast Asia (northern Vietnam, Laos, and other nearby areas) dates back to only about 4,000 years ago (2,000 BC), with agriculture ultimately being introduced from farther north in the Yangtze valley, where it has been dated to 6,000 BP.

Sidwell proposes that the locus of Proto-Austroasiatic was in the Red River Delta area about 4,000–4,500 years before present, instead of the Middle Mekong as he had previously proposed. Austroasiatic dispersed via coastal maritime routes and also upstream through river valleys. Khmuic, Palaungic, and Khasic resulted from a westward dispersal that ultimately came from the Red River valley. Based on their current distributions, about half of all Austroasiatic branches (including Nicobaric and Munda) can be traced to coastal maritime dispersals. Hence, this points to a relatively late riverine dispersal of Austroasiatic as compared to Sino-Tibetan, whose speakers had a distinct non-riverine culture. In addition to living an aquatic-based lifestyle, early Austroasiatic speakers would also have had access to livestock, crops, and newer types of watercraft. As early Austroasiatic speakers dispersed rapidly via waterways, they would have encountered speakers of older language families who were already settled in the area, such as Sino-Tibetan.

Sidwell (quoted in Sidwell 2021) gives a more nested classification of Austroasiatic branches, as suggested by his computational phylogenetic analysis of Austroasiatic languages using a 200-word list. Many of the tentative groupings are likely linkages. Pakanic and Shompen were not included.
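
The lexicostatistic comparisons cited in this section (Peiros 2004; Sidwell 2009) rest on percentages of shared vocabulary between pairs of languages. As a minimal sketch of that calculation, assuming cognate judgements have already been made, the following Python computes pairwise percentages and shows how setting aside a suspected loanword changes the result. The language names, meanings, cognate-class labels, and function names are all hypothetical, and this is not the procedure of either study.

```python
from itertools import combinations

# Hypothetical toy data: each language maps a meaning to a cognate-class label.
# The labels are invented for illustration; they are not real cognate judgements.
WORDLISTS = {
    "LangA": {"water": 1, "fish": 1, "eye": 1, "hand": 1},
    "LangB": {"water": 1, "fish": 1, "eye": 2, "hand": 1},
    "LangC": {"water": 2, "fish": 1, "eye": 3, "hand": 2},
}


def shared_percentage(a, b, skip=frozenset()):
    """Percentage of meanings, attested in both lists and not in `skip`,
    whose cognate-class labels match."""
    common = [m for m in a if m in b and m not in skip]
    if not common:
        raise ValueError("no comparable meanings")
    matches = sum(1 for m in common if a[m] == b[m])
    return 100.0 * matches / len(common)


def pairwise_table(wordlists, skip=frozenset()):
    """Shared-vocabulary percentage for every unordered pair of languages."""
    return {
        (x, y): shared_percentage(wordlists[x], wordlists[y], skip)
        for x, y in combinations(sorted(wordlists), 2)
    }


if __name__ == "__main__":
    for pair, pct in pairwise_table(WORDLISTS).items():
        print(pair, round(pct, 1))
    # Assumption for illustration: 'hand' in LangB is a loan from LangA,
    # so it is excluded before comparing the two.
    print(round(shared_percentage(WORDLISTS["LangA"], WORDLISTS["LangB"],
                                  skip={"hand"}), 1))
```

In this toy data, excluding the flagged item lowers the LangA–LangB figure, which illustrates why undetected borrowing can distort a classification built on such percentages.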
=== Possible extinct branches ===
Roger Blench (2009) also proposes that there might have been other primary branches of Austroasiatic that are now extinct, based on substrate evidence in modern-day languages.

• Pre-Chamic languages (the languages of coastal Vietnam before the Chamic migrations). Chamic has various Austroasiatic loanwords that cannot be clearly traced to existing Austroasiatic branches (Sidwell 2006, 2007). Larish (1999) also notes that Moklenic languages contain many Austroasiatic loanwords, some of which are similar to the ones found in Chamic.
• Acehnese substratum (Sidwell 2006).
• Bornean substratum. Blench cites Austroasiatic-origin words in modern-day Bornean branches such as Land Dayak (Bidayuh, Dayak Bakatiq, etc.), Dusunic (Central Dusun, Visayan, etc.), Kayan, and Kenyah, noting especially resemblances with Aslian. As further evidence for his proposal, Blench also cites ethnographic evidence such as musical instruments in Borneo shared with Austroasiatic-speaking groups in mainland Southeast Asia. Adelaar (1995) has also noticed phonological and lexical similarities between Land Dayak and Aslian. Kaufman (2018) presents dozens of lexical comparisons showing similarities between various Bornean and Austroasiatic languages.
• Lepcha substratum ("Rongic"). Many words of Austroasiatic origin have been noticed in Lepcha, suggesting a Sino-Tibetan superstrate laid over an Austroasiatic substrate. Blench (2013) calls this branch "Rongic" based on the Lepcha autonym Róng.

Other languages with proposed Austroasiatic substrata are:

• Jiamao, based on evidence from the register system of Jiamao, a Hlai language (Thurgood 1992). Jiamao is known for its highly aberrant vocabulary in relation to other Hlai languages.
• Kerinci: van Reijn (1974) notes that Kerinci, a Malayic language of central Sumatra, shares many phonological similarities with Austroasiatic languages, such as sesquisyllabic word structure and vowel inventory.

John Peterson (2017) suggests that "pre-Munda" languages (early languages related to Proto-Munda) may once have dominated the eastern Indo-Gangetic Plain and were then absorbed by Indo-Aryan languages at an early date as Indo-Aryan spread east. Peterson notes that eastern Indo-Aryan languages display many morphosyntactic features similar to those of Munda languages, while western Indo-Aryan languages do not.

== Writing systems ==