Dialectologist Jerry Norman estimated that there are hundreds of mutually unintelligible varieties of Chinese. These varieties form a dialect continuum, in which differences in speech generally become more pronounced as distances increase, although there are also some sharp boundaries. However, the rate of change in mutual intelligibility varies immensely depending on region. For example, the varieties of Mandarin spoken in all three northeastern Chinese provinces are mutually intelligible, but in the province of Fujian, where Min varieties predominate, the speech of neighbouring counties or even villages may be mutually unintelligible.
==Dialect groups==
Classifications of Chinese varieties in the late 19th century and early 20th century were based on impressionistic criteria. They often followed river systems, which were historically the main routes of migration and communication in southern China. The first scientific classifications, based primarily on correspondences with Middle Chinese voiced initials, were produced by Wang Li in 1936 and Li Fang-Kuei in 1937, with minor modifications by other linguists since. The conventionally accepted set of seven dialect groups first appeared in the second edition (1980) of Yuan Jiahua's dialectology handbook:

; Mandarin
: This group is spoken in northern and southwestern China and has by far the most speakers. It includes the Beijing dialect, which forms the basis for Standard Mandarin Chinese, colloquially called "Chinese" (outside Hong Kong and Macau) or "Mandarin" (everywhere) in English. In addition, the Dungan language of Kyrgyzstan and Kazakhstan is a Mandarin variety written in the Cyrillic script.
; Wu
: These varieties are spoken in Shanghai, most of Zhejiang and the southern parts of Jiangsu and Anhui. The group comprises hundreds of distinct spoken forms, many of which are not mutually intelligible. The Suzhou dialect is usually taken as representative, because Shanghainese features several atypical innovations. Wu varieties are distinguished by their retention of voiced or murmured obstruent initials (stops, affricates and fricatives).
; Gan
: These varieties are spoken in Jiangxi and neighbouring areas. The Nanchang dialect is taken as representative. In the past, Gan was viewed as closely related to Hakka because Middle Chinese voiced initials became voiceless aspirated initials in both, and the two were hence grouped under the umbrella term "Hakka–Gan dialects".
; Xiang
: The Xiang varieties are spoken in Hunan and southern Hubei. The New Xiang varieties, represented by the Changsha dialect, have been significantly influenced by Southwest Mandarin, whereas Old Xiang varieties, represented by the Shuangfeng dialect, retain features such as voiced initials.
; Min
: These varieties originated in the mountainous terrain of Fujian and eastern Guangdong, and form the only branch of Chinese that cannot be directly derived from Middle Chinese. Min is also the most diverse group, with many of the varieties used in neighbouring counties (and, in the mountains of western Fujian, even in adjacent villages) being mutually unintelligible. Early classifications divided Min into Northern and Southern subgroups, but a survey in the early 1960s found that the primary split was between inland and coastal groups. Varieties from the coastal region around Xiamen have spread to Southeast Asia, where they are known as Hokkien (named from a dialectal pronunciation of "Fujian"), and to Taiwan, where they are known as Taiwanese. Other offshoots of Min are found in Hainan and the Leizhou Peninsula, with smaller communities throughout southern China.
; Hakka
: The Hakka (literally "guest families") are a group of Han Chinese living in the hills of northeastern Guangdong, southwestern Fujian and many other parts of southern China, as well as Taiwan and parts of Southeast Asia such as Singapore, Malaysia and Indonesia. The Meixian dialect is the prestige form. Most Hakka varieties retain the full complement of nasal endings, -m, -n and -ŋ, and stop endings, -p, -t and -k, though there is a tendency for Middle Chinese velar codas -ŋ and -k to yield dental codas -n and -t after front vowels.
; Yue
: These varieties are spoken in Guangdong, Guangxi, Hong Kong and Macau, and have been carried by immigrants to Southeast Asia and many other parts of the world. The prestige variety, and by far the most commonly spoken, is Cantonese, from the city of Guangzhou (historically called "Canton"), which is also the native language of the majority in Hong Kong and Macau. Taishanese, from the coastal area of Jiangmen southwest of Guangzhou, was the most common Yue variety among overseas communities in the West until the late 20th century. Not all Yue varieties are mutually intelligible. Most Yue varieties retain the full complement of Middle Chinese word-final consonants (-p, -t, -k, -m, -n and -ŋ) and have rich inventories of tones.

The Language Atlas of China (1987) follows a classification of Li Rong, distinguishing three further groups:

; Jin
: These varieties, spoken in Shanxi and adjacent areas, were formerly included in Mandarin. They are distinguished by their retention of the Middle Chinese entering tone category.
; Huizhou
: The Hui varieties, spoken in southern Anhui, share different features with Wu, Gan and Mandarin, making them difficult to classify. Earlier scholars had assigned them to one or other of these groups, or to a group of their own.
; Pinghua
: These varieties are descended from the speech of the earliest Chinese migrants to Guangxi, predating the later influx of Yue and Southwest Mandarin speakers. Some linguists treat them as a mixture of Yue and Xiang.

Some varieties remain unclassified, including the Danzhou dialect (northwestern Hainan), Mai (southern Hainan), Waxiang (northwestern Hunan), Xiangnan Tuhua (southern Hunan), Shaozhou Tuhua (northern Guangdong), and the forms of Chinese spoken by the She people (She Chinese) and the Miao people. She Chinese, Xiangnan Tuhua, Shaozhou Tuhua and unclassified varieties of southwest Jiangxi appear to be related to Hakka.

Most of the vocabulary of the Bai language of Yunnan appears to be related to Chinese words, though many are clearly loans from the last few centuries. Some scholars have suggested that it represents a very early branching from Chinese, while others argue that it is a more distantly related Sino-Tibetan language overlaid with two millennia of loans.
==Dialect geography==
Jerry Norman classified the traditional seven dialect groups into three zones: Northern (Mandarin), Central (Wu, Gan, and Xiang) and Southern (Hakka, Yue, and Min). He argued that the varieties of the Southern zone are derived from a standard used in the Yangtze valley during the Han dynasty (206 BC – 220 AD), which he called Old Southern Chinese, while the Central zone was a transitional area of varieties that were originally of southern type, but overlain with centuries of Northern influence. Hilary Chappell proposed a refined model, dividing Norman's Northern zone into Northern and Southwestern areas, and his Southern zone into Southeastern (Min) and Far Southern (Yue and Hakka) areas, with Pinghua transitional between Southwestern and Far Southern areas.

The long history of migration of peoples and interaction between speakers of different varieties makes it difficult to apply the tree model to Chinese. Scholars account for the transitional nature of the central varieties in terms of wave models. Iwata argues that innovations have been transmitted from the north across the Huai River to the Lower Yangtze Mandarin area and from there southeast to the Wu area and westwards along the Yangtze River valley and thence to southwestern areas, leaving the hills of the southeast largely untouched.

Some dialect boundaries, such as between Wu and Min, are particularly abrupt, while others, such as between Mandarin and Xiang or between Min and Hakka, are much less clearly defined. Several east-west isoglosses run along the Huai and Yangtze Rivers. A north-south barrier is formed by the Tianmu and Wuyi Mountains.
==Intelligibility testing==
Most assessments of mutual intelligibility of varieties of Chinese in the literature are impressionistic. Functional intelligibility testing is time-consuming in any language family, and is usually not done when more than 10 varieties are to be compared. However, one 2009 study aimed to measure intelligibility between 15 Chinese provinces. In each province, 15 university students were recruited as speakers and 15 older rural inhabitants were recruited as listeners. The listeners were then tested on their comprehension of isolated words and of particular words in the context of sentences spoken by speakers from all 15 of the provinces surveyed. The results demonstrated significant levels of unintelligibility between areas, even within the Mandarin group. In a few cases, listeners understood fewer than 70% of words spoken by speakers from the same province, indicating significant differences between urban and rural varieties. As expected from the wide use of Standard Chinese, speakers from Beijing were better understood than speakers from elsewhere. The scores supported a primary division between northern groups (Mandarin and Jin) and all others, with Min as an identifiable branch.
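
A test design of this kind effectively fills a listener-by-speaker score matrix. As a rough illustration only, the following Python sketch shows how such a matrix might be summarised. The variety labels and scores are invented for the example and are not data from the 2009 study, and the function names are hypothetical. It computes the average score each speaker variety receives from listeners elsewhere, and a symmetric pairwise score obtained by averaging the two directions.

```python
from statistics import mean

# Hypothetical toy data: scores[listener][speaker] is the share of words
# (0.0 to 1.0) that listeners from one place understood when hearing
# speakers from another. The numbers are invented for illustration.
scores = {
    "A": {"A": 0.90, "B": 0.60, "C": 0.20},
    "B": {"A": 0.70, "B": 0.85, "C": 0.25},
    "C": {"A": 0.50, "B": 0.30, "C": 0.80},
}


def mean_understood(scores, speaker):
    """Average score a speaker variety receives from listeners of other varieties."""
    return mean(
        row[speaker] for listener, row in scores.items() if listener != speaker
    )


def mutual_intelligibility(scores, x, y):
    """Symmetric score: mean of 'x understands y' and 'y understands x'."""
    return (scores[x][y] + scores[y][x]) / 2


def most_understood(scores):
    """Speaker variety with the highest average score among outside listeners."""
    return max(scores, key=lambda s: mean_understood(scores, s))


if __name__ == "__main__":
    for variety in scores:
        print(variety, round(mean_understood(scores, variety), 3))
    print("A-B mutual:", round(mutual_intelligibility(scores, "A", "B"), 3))
    print("most understood:", most_understood(scores))
```

Averaging the two directions hides asymmetry (listeners from one place may understand another variety better than the reverse), so a fuller analysis would keep both directed scores.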
==Terminology==
Because speakers share a standard written form, i.e. written vernacular Mandarin Chinese, and have a common cultural heritage with long periods of political unity, the varieties are popularly perceived among native speakers as variants of a single Chinese language, and this is also the official position of the government of the People's Republic of China and formerly the position of the government of the Republic of China (Taiwan). Conventional English-language usage in Chinese linguistics is to use dialect for the speech of a particular place (regardless of status), with regional groupings like Mandarin and Wu called dialect groups. Reflecting its internal diversity, Chinese is usually considered a language family within the Sino-Tibetan phylum. Estimates of the number of languages implied by the criterion of mutual intelligibility range from dozens to hundreds, but no one has attempted to delimit them consistently. Some authors refer to each of the main groups, such as Wu or Yue, as a "language", but each of these groups contains mutually unintelligible varieties.
ISO 639-3 and the Ethnologue assign language codes to each of the top-level groups listed above except Min and Pinghua, whose subdivisions are assigned seven and two codes respectively. Some linguists refer to the local varieties as languages, numbering in the hundreds.

The Chinese term fāngyán (方言), literally 'place speech', was the title of the first work of Chinese dialectology in the Han dynasty, and has had a range of meanings in the millennia since. It is used for any regional subdivision of Chinese, from the speech of a village to major branches such as Mandarin and Wu, regardless of intelligibility. Linguists writing in Chinese often qualify the term to distinguish different levels of classification. All these terms have customarily been translated into English as dialect, a practice that has been criticized as confusing and inconsistent with typical usage. John DeFrancis proposed the neologism regionalect to serve as a translation for fāngyán when referring to the top-level groups, which are mutually unintelligible. Victor Mair coined the term topolect as a translation for all uses of fāngyán. The latter term appears in The American Heritage Dictionary of the English Language.

==Phonology==