Sinitic languages
| Sinitic | |
|---|---|
| Chinese | |
| Geographic distribution | East Asia, Southeast Asia, Central Asia, North Asia |
| Ethnicity | Sinitic peoples |
| Linguistic classification | Sino-Tibetan |
| Proto-language | Proto-Sinitic |
| Subdivisions | |
| Language codes | |
| ISO 639-5 | zhx |
| Glottolog | sini1245 (Sinitic), macr1275 (Macro-Bai) |
| Map of Sinitic languages in China | |
The Sinitic languages[a] (simplified Chinese: 汉语族; traditional Chinese: 漢語族; pinyin: Hànyǔ zú), often synonymous with the Chinese languages, are a group of East Asian analytic languages that constitute a major branch of the Sino-Tibetan language family. It is frequently proposed that there is a primary split between the Sinitic languages and the rest of the family (the Tibeto-Burman languages). This view is rejected by some researchers[4] but has found phylogenetic support among others.[5][6] The Macro-Bai languages, whose classification is difficult, may be an offshoot of Old Chinese and thus Sinitic;[7] otherwise, Sinitic is defined only by the many varieties of Chinese unified by a shared historical background, and usage of the term "Sinitic" may reflect the linguistic view that Chinese constitutes a family of distinct languages, rather than variants of a single language.[b]
Population
Over 91% of the Chinese population speaks a Sinitic language, of whom about three-quarters speak a Mandarin variety.[9] Estimates of the number of global speakers of Sinitic branches as of 2018–2019, both native and non-native, are listed below.[10] The figures are approximate, since the underlying population estimates for China are themselves uncertain.
| Branch | Speakers | pct. |
|---|---|---|
| Mandarin | 1,118,584,040 | 73.50% |
| Yue | 85,576,570 | 5.62% |
| Wu | 81,817,790 | 5.38% |
| Min | 75,633,810 | 4.97% |
| Jin | 47,100,000 | 3.09% |
| Hakka | 44,065,190 | 2.90% |
| Xiang | 37,400,000 | 2.46% |
| Gan | 22,200,000 | 1.46% |
| Huizhou | 5,380,000 | 0.35% |
| Pinghua | 4,130,000 | 0.27% |
| Dungan | 56,300 | 0.004% |
| Total | 1,521,943,700 | 100% |
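The percentage column can be recomputed from the speaker counts. The following is a minimal Python sketch, assuming the counts are copied exactly from the table above; the names `speakers`, `total`, and `shares` are illustrative and do not come from the cited source.

```python
# Speaker estimates copied from the table above (2018-2019, native and non-native).
speakers = {
    "Mandarin": 1_118_584_040,
    "Yue": 85_576_570,
    "Wu": 81_817_790,
    "Min": 75_633_810,
    "Jin": 47_100_000,
    "Hakka": 44_065_190,
    "Xiang": 37_400_000,
    "Gan": 22_200_000,
    "Huizhou": 5_380_000,
    "Pinghua": 4_130_000,
    "Dungan": 56_300,
}

# The total is the plain sum of all branches.
total = sum(speakers.values())

# Each branch's share of all Sinitic speakers, in percent.
shares = {branch: 100 * count / total for branch, count in speakers.items()}

# Print the branches from largest to smallest share.
for branch, pct in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{branch:<9}{speakers[branch]:>15,}{pct:>9.3f}%")
print(f"{'Total':<9}{total:>15,}")
```

Summing the branches this way reproduces the total given in the table, so the listed percentages are shares of that total rather than of the population of China.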
Languages
Dialectologist Jerry Norman estimated that there are hundreds of mutually unintelligible Sinitic languages.[11] They form a dialect continuum in which differences generally become more pronounced as distances increase, though there are also some sharp boundaries.[12] The Sinitic languages can be divided into Macro-Bai languages and Chinese languages, and the following is one of many potential ways of subdividing these languages. Some varieties, such as Shaozhou Tuhua, are hard to classify and are therefore not covered in the summaries below.
Macro-Bai languages
This is a language family first proposed by linguist Zhengzhang Shangfang[13] and later expanded to include Longjia and Luren.[14][15] It likely split off from the rest of Sinitic during the Old Chinese period.[16] The languages included are all considered minority languages in China and are spoken in the Southwest.[17][18] The languages are:
All other Sinitic languages are hereafter treated as Chinese.
Chinese
The Chinese branch of the family is classified into at least seven main families. These families are classified based on five main evolutionary criteria:[9]
- The evolution of the historical fully muddy (全浊; 全濁; quánzhuó) initials
- The distribution of rimes across the four tone qualities, as conditioned by voicing and aspiration of initials
- The evolution of the checked (入; rù) tone category
- The loss or retention of coda position plosives and nasals
- The palatalisation of the jiàn initial (見母; jiànmǔ) in front of high vowels
The varieties within one family may not be mutually intelligible with each other. For instance, Wenzhounese and Ningbonese are not highly mutually intelligible. The Language Atlas of China identifies ten groups:[19]
with Jin, Hui, Pinghua, and Tuhua not part of the seven traditional groups.
Mandarin
Varieties of Mandarin are used in the Western Regions, the Southwest, Huguang, Inner Mongolia, Central Plains and the Northeast,[19] by around three-quarters of the Sinitic-speaking population.[10] Historically, the prestige variety has always been Mandarin, which is still reflected today in Standard Chinese.[20] Standard Chinese is now an official language of the Republic of China, the People's Republic of China, Singapore and the United Nations.[9] Re-population efforts, such as that of the Qing dynasty in the Southwest, tended to involve Mandarin speakers.[21] Classification of Mandarin lects has undergone several significant changes, though nowadays it is commonly divided as follows, based on the distribution of the historical checked tone:[19]
- Northeastern
- Beijing (sometimes considered part of Northeastern)[22][23]
- Jiaoliao (sometimes "Peninsular")
- Jilu (sometimes "Northern")
- Central Plains (or "Zhongyuan")
- Lanyin (sometimes "Northwestern" and considered part of Central Plains)
- Jin (often considered a top-level group due to the Language Atlas of China)
- Southwestern (sometimes "Upper Yangtze")
- Jianghuai (or "Lower Yangtze", sometimes "Huai", "Southern" or "Southeastern")[20]
as well as other lects, which do not neatly fall into these categories, such as Mandarin Junhua varieties.
Varieties of Mandarin can be defined by their universally lost -m final, low number of tones, and smaller inventory of classifiers, among other features. Mandarin lects also often have rhotic erhua rimes, though the amount of its use may vary between lects.[9] Loss of checked tone is an often cited criterion for Mandarin languages, though lects such as Yangzhounese and Taiyuannese show otherwise.
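| | Beijing | Jinan | Zhengzhou | Xi'an | Taiyuan | Chengdu | Nanjing | Guangzhou | Meizhou | Xiamen | Anyi | Gloss |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Group | Mandarin | Mandarin | Mandarin | Mandarin | Mandarin | Mandarin | Mandarin | Non-Mandarin | Non-Mandarin | Non-Mandarin | Non-Mandarin | |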
| 音 | in | iẽ | iən | iẽ | iəŋ | in | in | iɐm | im | im | im | 'sound' |
| 心 | ɕin | ɕiẽ | siən | ɕiẽ | ɕiəŋ | ɕin | sin | sɐm | sim | sim | ɕim | 'heart' |
Northeastern and Beijing Mandarin
Northeastern Mandarin is spoken in Heilongjiang, Jilin, most of Liaoning and northeastern Inner Mongolia, whereas Beijing Mandarin is spoken in northern Hebei, most of Beijing, parts of Tianjin and Inner Mongolia.[19] The most notable features of the two groups are heavy use of rhotic erhua and a seemingly random distribution of the dark checked tone; they also generally have four tones, with high flat, rising, dipping, and falling contours.
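| | Harbin | Changchun | Shenyang | Beijing | Heyuan | Chaozhou | Suzhou | Hefei | Wuhan | Gloss |
|---|---|---|---|---|---|---|---|---|---|---|
| Group | Northeastern/Beijing | Northeastern/Beijing | Northeastern/Beijing | Northeastern/Beijing | Other | Other | Other | Other | Other | |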
| 客 | ˨˩˧ | ˥˧ | ˨˩˧ | ˥˧ | ˥ | ˨˩ | ˥˥ | ˥ | ˨˩˧ | 'guest' |
| 八 | ˦˦ | ˦˦ | ˧˧ | ˥˥ | ˥ | ˨˩ | ˥˥ | ˥ | ˨˩˧ | 'eight' |
| 北 | ˨˩˧ | ˨˩˧ | ˨˩˧ | ˨˩˧ | ˥ | ˨˩ | ˥˥ | ˥ | ˨˩˧ | 'north' |
Northeastern Mandarin, especially in Heilongjiang, contains many loanwords from Russian.[24]
| Term | Pronunciation | Meaning | Origin |
|---|---|---|---|
| 卜留克 | bǔliúkè | 'rutabaga' | брюква bryukva |
| 馬神 | mǎshén | 'machine' | машина mashina |
| 巴籬子 | bālízi | 'jail' | полиция politsiya |
Northeastern Mandarin lects can be divided into three main groups, namely Hafu (including Harbinnese and Changchunnese), Jishen (including Jilinnese and Shenyangnese), and Heisong. Notably, the extinct Taz language of Russia is also a Northeastern Mandarin language. Beijing is sometimes included in Northeastern Mandarin due to its distribution of the historical dark checked tone,[22][23] though it is listed as its own group by others, often due to its more regular light checked tones.[19]
Jilu Mandarin
Jilu Mandarin is spoken in southern Hebei and western Shandong,[19] and is often represented by Jinannese.[25] Notable cities that use Jilu Mandarin lects include Cangzhou, Shijiazhuang, Jinan and Baoding.[26][27] Characteristic Jilu Mandarin features include merging the dark checked into the dark level tone, merging the light checked into light level or departing based on the manner of articulation of the initial, and vowel breaking in checked-tone words of the tong rime series (通攝), among other features.
Jilu Mandarin can be classified into Baotang, Shiji, Canghui and Zhangli.[28] Zhangli is of note due to its preservation of a separate checked tone.
Jiaoliao Mandarin
Jiaoliao Mandarin is spoken in the Jiaodong and Liaodong Peninsulae, which includes the cities of Dalian and Qingdao, as well as several prefectures along the China-Korea border.[19] Like Jilu Mandarin, its light checked tone is merged into light level or departing based on the manner of articulation of the initial, though its dark checked is merged into the rising. Its rì initial (日母) terms are pronounced with a null initial (apart from open zhǐ rime series (止攝開口) finals), unlike the /ʐ/ of Northern and Beijing Mandarin.[29]
Based on, for example, the pronunciation of the palatalized jiàn initial (見母),[19] Jiaoliao Mandarin can be divided into Qingzhou, Denglian and Gaihuan areas.[28]
| | Yantai | Weihai | Qingdao | Dalian | Gloss |
|---|---|---|---|---|---|
| 交 | ciau | ciau | tɕiɔ | tɕiɔ | 'to hand in' |
| 見 | cian | cian | tɕiã | tɕiɛ̃ | 'to see' |
Central Plains and Lanyin Mandarin
Central Plains Mandarin is spoken in the Central Plains of Henan, southwestern Shanxi, southern Shandong and northern Jiangsu, as well as most of Shaanxi, southern Ningxia and Gansu and southern Xinjiang, in famous cities such as Kaifeng, Zhengzhou, Luoyang, Xuzhou, Xi'an, Xining and Lanzhou.[30][31][32] Central Plains Mandarin lects merge the historical checked tones with a lesser muddy (次濁) and clear (清) initial together with the rising tone, and those with a fully muddy (全濁) initial are merged with the light level tone.[19]
Lanyin Mandarin, spoken in northern Ningxia, parts of Gansu, and northern Xinjiang, is sometimes grouped with Central Plains Mandarin due to its merged lesser light and dark checked tones, though it is realised as a departing tone.
Subdivision of Central Plains Mandarin is not fully agreed upon, though one possible subdivision sees 13 divisions, namely Xuhuai, Zhengkai, Luosong, Nanlu, Yanhe, Shangfu, Xinbeng, Luoxiang, Fenhe, Guanzhong, Qinlong, Longzhong and Nanjiang.[33] Lanyin Mandarin, on the other hand, is divided as Jincheng, Yinwu, Hexi, and Beijiang. The Dungan language is a collection of Central Plains Mandarin varieties spoken in the former Soviet Union.
Jin
Jin is spoken in most of Shanxi, western Hebei, northern Shaanxi, northern Henan and central Inner Mongolia,[19] and is often represented by Taiyuannese.[25] Li Rong first proposed treating it as a group separate from the rest of Mandarin, defining it as the lects in and around Shanxi that retain a checked tone, though this stance is not without disagreement.[34][35] Jin varieties also often have disyllabic words derived from syllable splitting (分音詞), through the infixation of /(u)əʔ l/.[9]
- 笨 pəŋ꜄ → 薄愣 pəʔ꜇ ləŋ꜄ 'stupid'
- 滾 ꜂kʊŋ → 骨攏 kuəʔ꜆ ꜂lʊŋ 'to roll'
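The infixation pattern in the two examples above can be sketched in Python. This is only an illustration under simplifying assumptions: tones are ignored, the syllable is supplied already divided into onset and rime, and the `rounded` flag that triggers the optional /u/ is a hypothetical parameter, since the source does not state what conditions the /u/.

```python
from typing import Tuple

GLOTTAL_RIME = "əʔ"   # checked rime that closes the first syllable
LATERAL_ONSET = "l"   # onset that opens the second syllable


def split_syllable(onset: str, rime: str, rounded: bool = False) -> Tuple[str, str]:
    """Split one syllable into two by infixing /(u)əʔ l/ between onset and rime.

    `rounded` is an illustrative switch for the optional /u/; its real
    conditioning environment is not given in the text.
    """
    first = onset + ("u" if rounded else "") + GLOTTAL_RIME
    second = LATERAL_ONSET + rime
    return first, second


# 笨 'stupid': p + əŋ
print(split_syllable("p", "əŋ"))
# 滾 'to roll': k + ʊŋ, with the optional /u/
print(split_syllable("k", "ʊŋ", rounded=True))
```

The original onset stays on the first syllable and the original rime moves to the second, which matches the segmental shape of both examples.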
As per the Language Atlas by Li, Jin is divided into Dabao, Zhanghu, Wutai, Lüliang, Bingzhou, Shangdang, Hanxin, and Zhiyan branches.[19]
Southwestern Mandarin
Southwestern Mandarin is spoken in Yunnan, Guizhou, northern Guangxi, most of Sichuan, southern Gansu and Shaanxi, Chongqing, most of Hubei and bordering parts of Hunan, as well as Kokang of Myanmar and parts of northern Thailand. It covers the largest area and has the most speakers of all Mandarin groups, and would be the eighth most spoken language in the world if separated from the rest of Mandarin.[19] Southwestern Mandarin tends to lack retroflex consonants and merges all checked tone categories together. Except in Minchi, which has a standalone checked category, the checked tone is merged with another category. Representative lects include Wuhannese and Sichuanese, and sometimes Kunmingnese.[25]
Southwestern Mandarin tends to be split into Chuanqian, Xishu, Chuanxi, Yunnan, Huguang and Guiliu branches. Minchi is sometimes separated as a remnant of Old Shu.[36]
Huai
Huai is spoken in central Anhui, northern Jiangxi, far western and eastern Hubei and most of Jiangsu.[19] Due to its preservation of a checked tone, some linguists believe that Huai ought to be treated as a top-level group, like Jin. Representative lects tend to be Nanjingnese, Hefeinese and Yangzhounese.[25] The Huai of Nanjing likely served as the national prestige variety during the Ming and Qing periods,[37] though not all linguists support this viewpoint.[38]
The Language Atlas divides Huai into Tongtai, Huangxiao, and Hongchao areas, with the latter further split into Ninglu and Huaiyang. Tongtai, being geographically located furthest east, has the most significant Wu influence, such as in its distribution of the historical voiced plosive series.[19][39][40]
| | Nantong | Taizhou | Yangzhou | Hangzhou | Fuzhou | Huizhou |
|---|---|---|---|---|---|---|
| Group | Tongtai | Tongtai | Non-Tongtai | Non-Tongtai | Non-Tongtai | Non-Tongtai |
| 地 | tʰi | tʰi | ti | di | tei | ti |
| 病 | pʰeŋ | pʰiŋ | pin | biŋ | paŋ | piaŋ |
Yue
Yue Chinese is spoken by around 84 million people,[10] in western Guangdong, eastern Guangxi, Hong Kong, Macau and parts of Hainan, as well as overseas communities such as Kuala Lumpur and Vancouver.[19] Famous lects such as Cantonese and Taishanese belong to this family.[9] Yue Chinese lects generally possess long-short distinctions in their vowels, which is reflected in their almost universally split dark-checked and often split light-checked tones. They generally also tend to preserve all three checked plosive finals and three nasal finals. The status of Pinghua is uncertain, and some believe its two groups, Northern and Southern, should be listed under Yue,[41] though some reject this standpoint.[19]
| Tone | Dark | Light | ||
|---|---|---|---|---|
| Short | Long | Short | Long | |
| Examples | 北 | 八 | 入 | 白 |
| Guangzhou | ˥˥ | ˧˧ | ˨˨ | |
| Hong Kong | ˥˥ | ˧˧ | ˨˨ | |
| Dongguan | ˦˦ | ˨˨˦ | ˨˨ | |
| Shiqi | ˥ | ˧ | ||
| Taishan | ˥˥ | ˧˧ | ˨˩ | |
| Bobai | ˥˥ | ˧˧ | ˨˨ | |
| Yulin | ˥ | ˧ | ˨ | ˨˩ |
Yue is generally split into Cantonese (which itself contains Yuehai, Xiangshan, and Guanbao), Siyi, Gaoyang, Qinlian, Wuhua, Goulou (which includes Luoguang), Yongxun and the two Pinghua branches.[19] Siyi is generally agreed to be the most divergent, and Goulou is believed to be the one which is closest related to Pinghua.[41]
Hakka
Hakka Chinese is a direct result of several migration waves from Northern China to the South,[42] and is spoken in eastern Guangdong, parts of Taiwan, western Fujian, Hong Kong, southern Jiangxi, as well as scattered points in the rest of Guangdong, Hunan,
| 人 | ȵin | neŋ | ȵin | ŋɡin | niẽ |
| 日 | ȵit | ni | ȵit | ŋɡit | nie |
Hakka can be divided into Yuetai, Hailu, Yuebei, Yuexi, Tingzhou, Ninglong, Yuxin and Tonggui.[19] Meizhounese is often used as the representative variety of Hakka.[25]
Min
Min Chinese is a direct descendant of Old Chinese, and is spoken in Chaoshan and Zhanjiang of Guangdong, Hainan, Taiwan, most of Fujian and parts of Jiangxi and Zhejiang, by around 76 million people.[10] Due to significant amounts of migration, many people in Southeast Asia and Hong Kong are also able to speak Min varieties. Lects such as Teoswa, Hainanese, Hokkien (incl. Taiwanese) and Hokchiu are all Min varieties.[19]
Since Min descended from Old Chinese rather than Middle Chinese, it has some features that would be out of place in other varieties. For instance, some words with the cheng initial (澄母) are not affricates in Min. This, interestingly, has led to many languages, such as Occitan, Inuktitut, Latin, Māori and Telugu, borrowing the Sinitic word for 'tea' (茶) with a plosive. Min varieties also have a very large number of words with literary pronunciations.[9]
| | Fuzhou | Quanzhou | Chaozhou | Putian | Jian'ou | Haikou | Leizhou | Lanzhou | Guiyang | Changsha |
|---|---|---|---|---|---|---|---|---|---|---|
| Group | Min | Min | Min | Min | Min | Min | Min | Non-Min | Non-Min | Non-Min |
| 茶 | ta | te | te | tɒ | ta | ʔdɛ | te | tʂʰa | tsʰa | tsa |
| 陳 | tiŋ | tan | tʰiŋ | tɛŋ | teiŋ | ʔdaŋ | taŋ | tʂʰən | tsʰən | tsən |
Min can primarily be split into Coastal and Inland Min varieties. The former contains the Southern Min branches of Quanzhang (Hokkien), Chaoshan (Teoswa), Datian and Zhongshan, the Eastern Min branches of Houguan and Funing, Qionglei Min, as well as Puxian Min, whereas the latter includes Northern, Central and Shaojiang Min. Shaojiang Min acts as a transitional area between Min, Gan, and Hakka.[20][34]
Wu
Wu Chinese is spoken in most of Zhejiang, Shanghai, southern Jiangsu, parts of southern Anhui and eastern Jiangxi by around 82 million people.[19][10][43] Many large cities in the Yangtze Delta, such as Suzhou, Changzhou, Ningbo and Hangzhou, use a Wu variety. Wu varieties generally have a fricative initial in their negators, a three-way plosive distinction, as well as a checked coda preserved as a glottal stop, except for Oujiang lects, where it has become vowel length, and Xuanzhou.[43][40]
| | Shanghai | Suzhou | Changzhou | Shaoxing | Ningbo | Taizhou | Wenzhou | Jinhua | Lishui | Quzhou |
|---|---|---|---|---|---|---|---|---|---|---|
| 通 | tʰoŋ | tʰoŋ | tʰoŋ | tʰoŋ | tʰoŋ | tʰoŋ | tʰoŋ | tʰoŋ | tʰɔŋ | tʰaŋ |
| 東 | toŋ | toŋ | toŋ | toŋ | toŋ | toŋ | toŋ | toŋ | tɔŋ | taŋ |
| 同 | doŋ | doŋ | doŋ | doŋ | doŋ | doŋ | doŋ | doŋ | dɔŋ | daŋ |
Shanghainese, Suzhounese and Wenzhounese are usually used as representatives of Wu.[25] Wu Chinese varieties generally have very large vowel inventories, rivalling even those of the North Germanic languages.[44][45] The Dondac variety has been observed to have 20 phonemic monophthongal vowels, according to one analysis.[46]
Qian Nairong divides Wu into Taihu (or Northern Wu), Taizhou, Oujiang, Chuqu and Wuzhou. Northern Wu is further divided into Piling, Suhujia, Tiaoxi, Linshao, Yongjiang, and Hangzhou, though Hangzhou's classification is unclear.[40][43]
Hui
Huizhou Chinese is spoken in western Hangzhou, southern Anhui and parts of Jingdezhen, by around 5 million people.[19][10] It is identified as a top-level group by the Language Atlas, though some linguists favour other analyses, such as treating it as a Gan-influenced Wu variety, due to an identifiable basis of Old Wu features.[9][47][48][49] Hui varieties are phonologically diverse, and some features are shared with Wu, such as the simplification of diphthongs.[50] Hui can be divided into Jishe, Xiuyi, Qiwu, Jingzhan and Yanzhou branches, with Tunxinese and Jixinese being representatives.
Gan
Gan Chinese is spoken in northern and central Jiangxi, parts of Hubei and Anhui and eastern Hunan, by 22 million people,[19][10] and is sometimes believed to be related to Hakka.[51][52] Gan varieties tend not to palatalize terms with the jian initial (見母) and have an f-like initial in closed xiao and xia initial (合口曉匣兩母) terms, among other features.[53]
| | Nanchang | Yichun | Ji'an | Fuzhou | Yingtan |
|---|---|---|---|---|---|
| 灰 | ϕɨi | fi | fei | fai | fɛi |
| 胡 | ϕu | fu | fu | fu | fu |
Gan can also be divided into Northern and Southern groups. The Northern group was formed during the Tang dynasty, whereas the Southern group was developed based on Northern Gan.[9] The Language Atlas sees Gan divided into Changdu, Yiliu, Jicha, Fuguang, Yingyi, Datong, Dongsui, Huaiyue, and Leizi branches.[19] Nanchangnese is often chosen as the representative.[25] Shaojiang Min is considered to be influenced by, or even closely related to, Fuguang Gan.[54]
Xiang
Xiang Chinese is spoken in central and western Hunan and nearby parts of Guangxi and Guizhou by an estimated 37 million people.[19][10] Due to migrations, Xiang can be split into New and Old Xiang groups, with Old Xiang having fewer Mandarin-influenced features.[55][9] Xiang varieties have universally lost their checked codas, but the majority of them still have a unique preserved checked tone contour. Most also have a three-way plosive distinction, like Wu varieties.[19]
One way of dividing Xiang varieties sees five distinct families, namely Changyi, Hengzhou, Louzhao, Chenxu, and Yongzhou.[56] Changshanese and one of Shuangfengnese or Loudinese are usually taken as Xiang representatives.[25]
Internal classification
The traditional, dialectological classification of Chinese languages is based on the evolution of the sound categories of Middle Chinese. Little comparative work has been done (the usual way of reconstructing the relationships between languages), and little is known about mutual intelligibility. Even within the dialectological classification, details are disputed, such as the establishment in the 1980s of three new top-level groups: Huizhou, Jin and Pinghua, although Pinghua is itself a pair of languages and Huizhou maybe half a dozen.[58][59]
Like Bai, the Min languages are commonly thought to have split off directly from Old Chinese.[60] The evidence for this split is that all Sinitic languages apart from the Min group can fit into the structure of the Qieyun, a 7th-century rime dictionary.[61] However, this view is not universally accepted.
Points of contention
Like many other language families, the Sinitic languages pose classification problems. The following are a few examples.
Southern China
Traditionally, the lect of urban Hangzhou and New Xiang of eastern Hunan are not considered Mandarin.[19] However, linguists such as Richard VanNess Simmons and Zhou Zhenhe have observed that these two varieties possess more qualifying features of Mandarin languages.[40][62] For instance, the second-division vowels of the jia (假) rime series are often raised and backed in Wu and Xiang, while they are not in Hangzhounese and New Xiang.
| | Beijing | Nanjing | Nantong | Shanghai | Suzhou | Wenzhou | Hangzhou | Changsha | Shuangfeng | Gloss |
|---|---|---|---|---|---|---|---|---|---|---|
| Traditional group | Mandarin | Mandarin | Mandarin | Wu | Wu | Wu | Wu | Xiang | Xiang | |
| 花 | xua | xuɑ | xuo | ho | ho | kʰo | hua | fa | xo | 'flower' |
| 瓜 | kua | kuɑ | kuo | ko | ko | ko | kua | kua | ko | 'melon' |
| 下 | ɕia | ɕiɑ | xo | ɦo | ɦo | ɦo | ia | xa | ɣo | 'down' |
Nantongnese has heavy Wu influence, which has led to it also having raised and backed vowels.
Danzhounese and Maihua are both traditionally considered Yue lects.[19] Recent research, however, has noted that both are more likely unclassified.[63] Maihua, for example, may be a Yue-Hakka-Hainanese Min mixed language.[64]
Dongjiang Bendihua (東江本地話) is spoken in and around Huizhou and Heyuan. Its classification has always been unclear, though the most common standpoint is that it is considered Hakka.[19][65]
Northern China
The variety spoken in the Ganyu District of Lianyungang (贛榆話) is listed as a variety of Central Plains Mandarin in the Language Atlas of China,[19] though its tonal distribution is more similar to Peninsular Mandarin varieties.[66]
Relationships between groups
Jerry Norman classified the traditional seven dialect groups into three larger groups: Northern (Mandarin), Central (Wu, Gan, and Xiang), and Southern (Hakka, Yue, and Min). He argued that the Southern Group is derived from a standard used in the Yangtze valley during the Han dynasty (206 BC – 220 AD), which he called Old Southern Chinese, while the Central group was transitional between the Northern and Southern groups.[67] Some dialect boundaries, such as between Wu and Min, are particularly abrupt, while others, such as between Mandarin and Xiang or between Min and Hakka, are much less clearly defined.[12]
Scholars account for the transitional nature of the central varieties in terms of wave models. Iwata argues that innovations have been transmitted from the north across the Huai River to the Lower Yangtze Mandarin area and from there southeast to the Wu area and westwards along the Yangtze River valley and thence to southwestern areas, leaving the hills of the southeast largely untouched.[68]
A quantitative study
A 2007 study compared fifteen major urban dialects on the objective criteria of lexical similarity and regularity of sound correspondences, and subjective criteria of intelligibility and similarity. Most of these criteria show a top-level split with Northern, New Xiang, and Gan in one group and Min (samples at Fuzhou, Xiamen, Chaozhou), Hakka, and Yue in the other group. The exception was phonological regularity, where the one Gan dialect (Nanchang Gan) was in the Southern group and very close to Meixian Hakka, and the deepest phonological difference was between Wenzhounese (the southernmost Wu dialect) and all other dialects.[69]
The study did not find clear splits within the Northern and Central areas:[69]
- Changsha (New Xiang) was always within the Mandarin group. No Old Xiang dialect was in the sample.
- Taiyuan (Jin or Shanxi) and Hankou (Wuhan, Hubei) were subjectively perceived as relatively different from other Northern dialects but were very close in mutual intelligibility. Objectively, Taiyuan had substantial phonological divergence but little lexical divergence.
- Chengdu (Sichuan) was somewhat divergent lexically but very little on the other measures.
The two Wu dialects (Wenzhou and Suzhou) occupied an intermediate position, closer to the Northern/New Xiang/Gan group in lexical similarity and strongly closer in subjective intelligibility but closer to Min/Hakka/Yue in phonological regularity and subjective similarity, except that Wenzhou was farthest from all other dialects in phonological regularity. The two Wu dialects were close to each other in lexical similarity and subjective similarity but not in mutual intelligibility, where Suzhou was closer to Northern/Xiang/Gan than to Wenzhou.[69]
In the Southern subgroup, Hakka and Yue grouped closely together on the three lexical and subjective measures but not in phonological regularity. The Min dialects showed high divergence, with Min Fuzhou (Eastern Min) grouped only weakly with the Southern Min dialects of Xiamen and Chaozhou on the two objective criteria and was slightly closer to Hakka and Yue on the subjective criteria.[69]
Internal comparison
The following section compares the non-Bai and non-Cai–Long Sinitic languages. Though all stem from Old Chinese, they have diverged from one another.
Writing system
Typographically, the vast majority of Sinitic languages use Sinographs. However, some varieties, such as Dungan and Hokkien, have alternative scripts, namely Cyrillic and Latin alphabets. Even between varieties which use Sinographs, characters are repurposed or invented to cover for the difference in vocabulary. Examples include 靚; 'pretty' in Yue,[70] 𠊎; 'I', 'me' in Hakka,[71] 即; 'this' in Hokkien,[72] 覅; 'to not want' in Wu,[44] 莫; 'do not' in Xiang, and 嘎; 'ill-tempered' in Mandarin.[73][24] Note that both traditional and simplified characters can be used to write any lect.
Phonology
Phonologically speaking, though all Sinitic languages possess tones, their contours and the total number of tones vary widely, from Shanghainese, which can be analysed as having only two tones,[44] to Bobainese, which has ten.[74] Sinitic languages also differ greatly in their phonological inventories and phonotactics. Take for instance /mɭɤŋ/ (門兒; 'door (diminutive)') seen in Pingdingnese,[20] or /tʃɦɻʷəi/ (水; 'water') of Xuanzhounese,[75] both of which show syllables that do not follow the (single) consonant-glide-vowel-consonant syllable structure of more well-known lects. Tone sandhi is also a feature which not all lects share. Cantonese, for instance, only has a very weak system,[76] whereas Wu varieties not only have complex, intricate systems, which affect almost all syllables, but also use them to mark grammatical part of speech.[44][45] Take, for instance, this simplified analysis of Suzhounese tone sandhi:[77]
| 1st char tone cat ↓ / chain length → | 2 char | 3 char | 4 char |
|---|---|---|---|
| dark level (1) | ˦ ꜉ | ˦ ˦ ꜉ | ˦ ˦ ˦ ꜉ |
| light level (2) | ˨ ˧ | ˨ ˧ ꜊ | ˨ ˧ ˦ ꜉ |
| rising (3) | ˥ ˩ | ˥ ˩ ꜌ | ˥ ˩ ˩ ꜌ |
| dark departing (5) | ˥˨ ˧ | ˥˨ ˧ ꜊ | ˥˨ ˧ ˦ ꜉ |
| light departing (6) | ˨˧ ˩ | ˨˧ ˩ ꜌ | ˨˧ ˩ ˩ ꜌ |
| 2nd char tone cat ↓ | 1st char darkness ↓ | 2 char | 3 char | 4 char |
|---|---|---|---|---|
| level (1, 2) | dark (7) | ˦ ˨˧ | ˦ ˨˧ ꜊ | ˦ ˨˧ ˦ ꜉ |
| level (1, 2) | light (8) | ˨ ˧ | ˨ ˧ ꜊ | ˨ ˧ ˦ ꜉ |
| rising (3) | dark (7) | ˥ ˥˩ | ˥ ˥˩ ꜌ | ˥ ˥˩ ˩ ꜌ |
| rising (3) | light (8) | ˨ ˥˩ | ˨ ˥˩ ꜌ | ˨ ˥˩ ˩ ꜌ |
| departing (5, 6) | dark (7) | ˥ ˥˨˧ | ˥ ˥˨ ˧ | ˥ ˥˨ ˨ ˧ |
| departing (5, 6) | light (8) | ˨ ˥˨˧ | ˨ ˥˨ ˧ | ˨ ˥˨ ˨ ˧ |
| checked (7, 8) | dark (7) | ˦ ˦ | ˦ ˦ ꜉ | ˦ ˦ ˦ ˨ |
| checked (7, 8) | light (8) | ˧ ˦ | ˧ ˦ ꜉ | ˧ ˦ ˨ ꜋ |
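The first of the two tables above works as a lookup: the tone category of the first character and the length of the chain together select the contour for the whole chain. The following is a minimal Python sketch of that lookup, assuming the contours are copied exactly from that table. The names `SANDHI` and `chain_contour` are illustrative, and the sketch does not cover the second table, which is also keyed on the second character.

```python
# Contours copied from the first table above, keyed by the tone category of
# the first character and then by chain length (number of characters).
SANDHI = {
    "dark level":      {2: "˦ ꜉",  3: "˦ ˦ ꜉",  4: "˦ ˦ ˦ ꜉"},
    "light level":     {2: "˨ ˧",  3: "˨ ˧ ꜊",  4: "˨ ˧ ˦ ꜉"},
    "rising":          {2: "˥ ˩",  3: "˥ ˩ ꜌",  4: "˥ ˩ ˩ ꜌"},
    "dark departing":  {2: "˥˨ ˧", 3: "˥˨ ˧ ꜊", 4: "˥˨ ˧ ˦ ꜉"},
    "light departing": {2: "˨˧ ˩", 3: "˨˧ ˩ ꜌", 4: "˨˧ ˩ ˩ ꜌"},
}


def chain_contour(first_tone: str, length: int) -> list:
    """Return one tone contour per syllable for a chain of `length` characters."""
    try:
        pattern = SANDHI[first_tone][length]
    except KeyError:
        raise ValueError(f"no entry for {first_tone!r} with chain length {length}")
    # Contours are separated by spaces in the table, one per syllable.
    return pattern.split()


# A three-character chain whose first character has the rising tone.
print(chain_contour("rising", 3))
```

Only the first character's category appears as a key, reflecting that in this simplified analysis the tones of the later characters do not affect the result for these chains.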
Grammar
Apart from phonology, grammar is the area in which Sinitic languages differ the most. The majority of Sinitic languages do not possess tenses; exceptions include Northern Wu lects such as Shanghainese and Suzhounese, although the tense system is largely breaking down in Shanghainese due to Mandarin influence.[45][78] Sinitic languages generally also have no case marking, though lects such as Linxianese and Hengshannese do possess case particles, with the latter expressing case through tone change.[79][80] Sinitic languages generally have SVO word order and possess classifiers.
Verb usage may be different between Sinitic languages. Notice the double verb marking seen in lects such as Beijingese, in these sentences meaning "today I go to Guangzhou":[81]
Indirect object marking
Sinitic languages tend to vary greatly in how they mark indirect objects. The main point of variation is the relative placement of the indirect and direct objects.[9][20]
Mandarinic, Xiang, Hui, and Min languages often place the indirect object (IO) before the direct object (DO). Some lects have switched to IO-DO structure due to Mandarin influence, such as Nanchangese and Shanghainese, though Shanghainese also has the alternative word order.
- Beijingese: 他 tā 3SG 給 gěi give 了 le PERF 我 wǒ 1SG 一 yī one 盒 hé CL 糖。 táng sweets "He gave me a box of sweets."
- Taiyuanese: 給 kei53 give 我 ɣə53 1SG 一 iəʔ2 one 本 pəŋ53 CL 書。 su11 book "Give me a book."
- Changshanese: 媽 媽 ma33 ma ma 誒, ei SPEC 把 pa41 give 我 ŋo41 1SG 兩 lian41 two 塊 kʰuai41 CL 錢 tɕiɛ̃13 money 咯。 lo SPEC "Mama, give me two dollars please."
- Nanchangese: 你 人 ꜂n len 2SG.POL 接 ꜀tɕia lend 了 le PERF 佢 ꜂tɕie 3SG 三 ꜀san three 隻 tsaʔ꜆ CL 鍋。 ꜀wo pot "You lent him three pots."
On the other hand, Gan, Wu, Hakka, and Yue languages tend to place the DO in front of the IO.
- Yichunnese: 我 ŋo34 1SG 得 tɛ42⁻33 give 本 pun42 CL 書 ɕy34 book 你。 ȵi34 2SG "I give a book to you."
- Yining Pinghua: 分 fɐn34 give 個 ko33 CL 梨 子 lɐi31 tsə53 pear 你。 nə53 2SG "I'll give you a pear."
- Hong Kong Hakka (Lau's Romanization):[82] 分 bín give 塊 kuài CL 麪 包 mèn báu bread 𠊎。 ngāi 1SG "Give me a piece of bread."
Classifiers
Like other East Asian languages such as Japanese and Korean, Sinitic languages have a system of classifiers; however, the use of classifiers varies greatly in features such as definiteness.[20] In Cantonese, for instance, they can be used to mark possession, which is rare in Sinitic but common in Southeast Asia.[9]
| 我 | 本 | 書 |
|---|---|---|
| ngo5 | bun2 | syu1 |
| 1SG | CL | book |

'my book'
個 and 隻 are the most common generic classifiers cross-linguistically.[9] As previously mentioned, Mandarinic languages tend to have fewer classifiers whereas the Southern non-Mandarinic varieties tend to have more.[20]
Demonstratives
Sinitic languages can vary greatly in their system of demonstratives.[20] Standard Mandarin and other Northeastern varieties have a two-way system: 這; zhè (proximal) and 那; nà (distal), but this is not the only system found in Sinitic languages.
Wuhannese has a neutral demonstrative, which can be used regardless of the distance to the deictic center.[83][84] Similar systems are found in Northern Wu lects such as Suzhounese and Ningbonese.[45][20]
The Wuhannese neutral demonstrative /nɤ³⁵/ can be translated as either 'this' or 'that'. Although Wuhannese has this one-term neutral system, it also has a two-way proximal-distal system, as do most other lects with a one-term system.
Even within two-way systems, which are the most common type, cognate terms may have developed to mark opposite distances from the deictic center. Cantonese 嗰; go² (distal) and Shanghainese 搿; geq (proximal), for instance, both derive etymologically from 個.[70][44]
Many Sinitic languages have three-way systems, but the three distances are not always the same ones. For instance, whereas Guangshan Mandarin has a person-oriented proximal, medial, and distal system, Xinyu Gan has a distance-oriented close, proximal, and distal system. Gan especially has many varieties with a three-way system, sometimes even marked with tone and vowel length rather than just changing the term used.[20][85]
A small number of varieties possess even four- or five-term demonstrative systems. Take for instance the following:[20]
| | Dongxiang | Zhangshu |
|---|---|---|
| Close | ꜀ko | kọ꜆ |
| Proximal | ꜁ko | ko꜆ |
| Distal | ꜀e | ꜃hɛ |
| Yonder | ꜁e | ꜃hɛ̣ |
These two lects use tone change and vowel length respectively to distinguish between the four demonstratives.
Notes
- ^ From Late Latin Sīnae, 'the Chinese', probably from Arabic Ṣīn 'China', from the Chinese dynastic name Qin. (OED). In 1982, Paul K. Benedict proposed a subgroup of Sino-Tibetan called "Sinitic" comprising Bai and Chinese.[1] The precise affiliation of Bai remains uncertain[2] and the term Sinitic is usually used as a synonym for Chinese, especially when viewed as a language family rather than as a language.[3]
- ^ See Enfield (2003:69) and Hannas (1997) for examples. The Chinese terms often translated as 'language' and 'dialect' do not correspond well to those translations. These are 語言; yǔyán, corresponding to macrolanguage or language cluster, which is used for Chinese itself; 方言; fāngyán, which separates mutually unintelligible languages within a yǔyán; and 土語; tǔyǔ or 土話; tǔhuà, which corresponds better to the familiar Western linguistic use of 'dialect'.[8]
- ^ a b This term was not assigned a character.
References
Citations
- ^ Wang (2005), p. 107.
- ^ Wang (2005), p. 122.
- ^ Mair (1991), p. 3.
- ^ van Driem (2001), p. 351.
- ^ Zhang, Menghan; Yan, Shi; Pan, Wuyun; Jin, Li (2019). "Phylogenetic evidence for Sino-Tibetan origin in northern China in the Late Neolithic". Nature. 569 (7754): 112–115. Bibcode:2019Natur.569..112Z. doi:10.1038/s41586-019-1153-z. ISSN 1476-4687. PMID 31019300. S2CID 129946000.
- ^ Sagart et al. (2019).
- ^ van Driem (2001:403) states "Bái ... may form a constituent of Sinitic, albeit one heavily influenced by Lolo–Burmese."
- ^ Bradley (2012), p. 1.
- ^ a b c d e f g h i j k l m Chan, Sin-Wai; Chappell, Hilary; Li, Lan (2017). "Mandarin and other Sinitic languages". Routledge Encyclopedia of the Chinese language. Oxford: Routledge. pp. 605–628.
- ^ a b c d e f g h "Chinese". Ethnologue.
- ^ Norman (2003), p. 72.
- ^ a b Norman (1988), pp. 189–190.
- ^ Zhengzhang, Shangfang (2010). 蔡家话白语关系及词根比较. 研究之乐 (in Chinese) (2). Shanghai Educational Publishing House: 389–400.
- ^ 貴州省民族識別工作隊語言組 (1984). 蔡家的語言 (in Chinese).
- ^ 貴州省民族識別工作隊 (1984). 南龍人(南京-龍家)族別問題調查報告.
- ^ Gong, Xun (6 November 2015). "How Old is the Chinese in Bái?". Paris.
- ^ 貴州省志 民族志. Guiyang: 貴州民族出版社. 2002.
- ^ Xu, Lin; Zhao, Yansun (1984). 白语简志 (in Chinese). 民族印刷廠.
- ^ a b c d e f g h i j k l m n o p q r s t u v w x y z aa ab ac Li, Rong (2012). 中國語言地圖集.
- ^ a b c d e f g h i j k Chappell, Hilary M. (2015). Diversity in Sinitic Languages. Oxford University Press. ISBN 9780198723790.
- ^ Tsung, Linda (2014). Language Power and Hierarchy: Multilingual Education in China. Bloomsbury Publishing.
- ^ a b Lin, Tao (1987). "北京官话区的划分". 方言 (3): 166–172. ISSN 0257-0203.
- ^ a b Zhang, Shifang (2010). 北京官话语音研究. Beijing Language and Culture University Press. ISBN 978-7-5619-2775-5.
- ^ a b Yin, Shichao (1997). 哈爾濱方言詞典. 江蘇教育出版社.
- ^ a b c d e f g h 北京大學中國語言文學系 (1995). 漢語方言詞彙. 语文出版社.
- ^ Wu, Jizhang; Tang, Jianxiong; Chen, Shujing (2005). 河北省志 方言志. 方志出版社.
- ^ Qian, Zengyi (2002). "山東方言研究" (3).
- ^ a b Qian, Zengyi (2010). 漢語官話方言研究. 齊魯書社.
- ^ Luo, Futeng (1997). 牟平方言詞典. 江蘇教育出版社.
- ^ He, Wei (June 1993). 洛陽方言研究. 社會科學文獻出版社.
- ^ Su, Xiaoqing; Lü, Yongwei (December 1996). 徐州方言詞典. 江蘇教育出版社. ISBN 7534328837.
- ^ Zhang, Chengcai (December 1994). 西寧方言詞典. 江蘇教育出版社. ISBN 7534322936.
- ^ He, Wei. 中原官話分區. Beijing: 中國社會科學院語言研究所.
- ^ a b Hou, Jing (2002). 現代漢語方言概論. 上海教育出版社. p. 46.
- ^ Wang, Futang (1998). 漢語方言語音的演變和層次. Beijing: 語文研究.
- ^ Zhou, Jixu (2012). "南路話和湖廣話的語音特點". 語言研究 (3).
- ^ 漢語方言學大詞典. 廣東教育出版社. 2017. p. 150. ISBN 9787554816332.
- ^ Zeng, Xiaoyu (2014). "《西儒耳目資》音系基礎非南京方言補證". 語言科學 (4).
- ^ Tao, Guoliang. 南通方言詞典. Nanjing: 江蘇人民出版社.
- ^ a b c d Richard VanNess Simmons (1999). Chinese Dialect Classification: A comparative approach to Harngjou, Old Jintarn, and Common Northern Wu. John Benjamins Publishing Co.
- ^ a b Lin, Yi (2016). "廣西的粵方言". 欽州學院學報. 31 (6): 38–42.
- ^ "The Hakka People > Historical Background". edu.ocac.gov.tw. Archived from the original on 2019-09-09. Retrieved 2010-06-11.
- ^ a b c Qian, Nairong (1992). 當代吳語研究. 上海教育出版社.
- ^ a b c d e Qian, Nairong (2007). 上海話大詞典. 上海教育出版社.
- ^ a b c d Ye, Changling (1993). 蘇州方言詞典. 江蘇教育出版社.
- ^ "奉贤金汇学校首开"偒傣话"课(图)". 人民網. Archived from the original on 2022-07-22. Retrieved 2022-07-22.
- ^ Li, Rulong (2001). 漢語方言學. Beijing: 高等教育出版社. p. 17.
- ^ Zhengzhang, Shangfang (1986). "皖南方言的分區(稿)". 方言 (1).
- ^ Zhang, Guangyu (1999). "東南方言關係總論". 方言 (1).
- ^ Meng, Qinghui (2005). 徽州方言. Beijing: 安徽人民出版社.
- ^ Peng, Xinyi (2010). 江西客贛語的特殊音韻現象與結構變遷. 國立中興大學中國文學研究所.
- ^ Lu, Guoyao (2003). 魯國堯語言學論文集·客、贛、通泰方言源於南朝通語說. 江蘇教育出版社. pp. 123–135. ISBN 7534354994.
- ^ Sun, Yizhi; Chen, Changyi; Xu, Yangchun (2001). 江西贛方言語音的特點.
- ^ Chen, Zhangtai. 閩語研究.
- ^ Song, Diwu; Cao, Shuji. 中國移民史 第五卷:名師其.
- ^ Bao, Houxing; Chen, Hui (2005). 湘語的分區(稿).
- ^ Sagart et al. (2019), pp. 10319–10320.
- ^ Kurpaska (2010), pp. 41–53, 55–56.
- ^ Yan (2006), pp. 9–18, 61–69, 222.
- ^ Mei (1970), p. ?.
- ^ Pulleyblank (1984), p. 3.
- ^ Zhou, Zhenhe; You, Rujie (1986). Fāngyán yǔ zhōngguó wénhuà 方言与中国文化 [Dialects and Chinese culture]. Shanghai Renmin Chubanshe.
- ^ Kurpaska (2010), p. 73.
- ^ Jiang, Ouyang & Zou (2007)
- ^ Liu, Ruoyun (1991). 惠州方言志.
- ^ Liu, Chuanxian (2001). 赣榆方言志. Beijing: 中华书局.
- ^ Norman (1988), pp. 182–183.
- ^ Iwata (2010), pp. 102–108.
- ^ a b c d Tang & Van Heuven (2007), p. 1025.
- ^ a b Bai, Wanru (1998). 廣州方言詞典. 江蘇教育出版社出版. ISBN 9787534334344.
- ^ Huang, Xuezhen (December 1995). 梅縣方言詞典. 江蘇教育出版社. ISBN 7534325064.
- ^ Li, Rong (1993). 廈門方言詞典. 江蘇教育出版社出版. ISBN 9787534319952.
- ^ Bao, Houxing (December 1998). 長沙方言詞典. 江蘇教育出版社出版. ISBN 9787534319983.
- ^ Xie, Jianyou (2007). 廣西漢語方言研究. 廣西人民出版社.
- ^ Shen, Ming (2016). 安徽宣城(雁翅)方言. 中國社會科學出版社.
- ^ Zheng, Ding'ou (1997). 香港粵語詞典. 江蘇教育出版社. ISBN 9787534329425.
- ^ Wang, Ping (August 1996). 蘇州方言語音研究. 華中理工大學出版社. ISBN 7560911315.
- ^ Qian, Nairong (錢乃榮) (2010). 《從〈滬語便商〉所見的老上海話時態》 (Tenses and Aspects? Old Shanghainese as Found in the Book Huyu Bian Shang). Shanghai: The Chinese University of Hong Kong Press.
- ^ Zhang, Qiang (2021). "臨夏方言格標記「哈[XA⁴³]」探究". 淮南師範學院學報. 23 (2). Guangzhou.
- ^ Liu, Juan; Peng, Zerun (July 2019). "衡山方言人稱代詞領格變調現象的實質". 湘潭大學學報(哲學社會科學版). 43 (4).
- ^ Liu, Danqing (2001). 吳語的句法類型特點.
- ^ Lau, Chun-Fat (November 2021). 香港客家話研究. Hong Kong: 中華教育. ISBN 9789888760046.
- ^ Zhu, Jiansong (1992). 武漢方言研究.
- ^ Zhu, Jiansong (May 1995). 武漢方言詞典. 江蘇教育出版社. ISBN 7534323290.
- ^ Wei, Gangqiang (1995). 黎川方言詞典. 江蘇教育出版社.
- Sagart, Laurent; Jacques, Guillaume; Lai, Yunfan; Ryder, Robin; Thouzeau, Valentin; Greenhill, Simon J.; List, Johann-Mattis (2019), "Dated language phylogenies shed light on the history of Sino-Tibetan", Proceedings of the National Academy of Sciences of the United States of America, 116 (21): 10317–10322, doi:10.1073/pnas.1817972116, PMC 6534992, PMID 31061123.
- "Origin of Sino-Tibetan language family revealed by new research". ScienceDaily (Press release). May 6, 2019.
Works cited
- Bradley, David (2012), "Languages and Language Families in China", in Rint Sybesma (ed.), Encyclopedia of Chinese Language and Linguistics, Brill
- Enfield, N. J. (2003), Linguistic Epidemiology: Semantics and Grammar of Language Contact in Mainland Southeast Asia, Psychology Press, ISBN 0415297435
- Hannas, W. (1997), Asia's Orthographic Dilemma, University of Hawaii Press, ISBN 082481892X
- Iwata, Ray (2010), "Chinese Geolinguistics: History, Current Trend and Theoretical Issues", Dialectologia, Special issue I: 97–121.
- Jiang, Di 江荻; Ouyang, Jueya 欧阳觉亚; Zou, Jiayan 邹嘉彦 (2007), "Hǎinán Shěng Sānyà Shì Màihuà yīnxì" 海南省三亚市迈话音系, Fāngyán 方言 (in Chinese), 2007 (1): 23–34.
- Kurpaska, Maria (2010), Chinese Language(s): A Look Through the Prism of "The Great Dictionary of Modern Chinese Dialects", Walter de Gruyter, ISBN 978-3-11-021914-2
- Mair, Victor H. (1991), "What Is a Chinese 'Dialect/Topolect'? Reflections on Some Key Sino-English Linguistic terms" (PDF), Sino-Platonic Papers, 29: 1–31.
- Mei, Tsu-lin (1970), "Tones and prosody in Middle Chinese and the origin of the rising tone", Harvard Journal of Asiatic Studies, 30: 86–110, doi:10.2307/2718766, JSTOR 2718766
- Norman, Jerry (1988), Chinese, Cambridge: Cambridge University Press, ISBN 978-0-521-29653-3.
- Norman, Jerry (2003), "The Chinese dialects: Phonology", in Thurgood, Graham; LaPolla, Randy J. (eds.), The Sino-Tibetan languages, Routledge, pp. 72–83, ISBN 978-0-7007-1129-1
- Pulleyblank, Edwin G. (1984), Middle Chinese: A study in Historical Phonology, Vancouver: University of British Columbia Press, ISBN 978-0-7748-0192-8
- Tang, Chaoju; Van Heuven, Vincent J. (2007), "Predicting mutual intelligibility in Chinese dialects from subjective and objective linguistic similarity" (PDF), Interlingüística, 17: 1019–1028.
- Thurgood, Graham (2003), "The subgroup of the Tibeto-Burman languages: The interaction between language contact, change, and inheritance", in Thurgood, Graham; LaPolla, Randy J. (eds.), The Sino-Tibetan languages, Routledge, pp. 3–21, ISBN 978-0-7007-1129-1
- van Driem, George (2001), Languages of the Himalayas: An Ethnolinguistic Handbook of the Greater Himalayan Region, Brill, ISBN 90-04-10390-2
- Wang, Feng (2005), "On the genetic position of the Bai language", Cahiers de Linguistique Asie Orientale, 34 (1): 101–127, doi:10.3406/clao.2005.1728.
- Yan, Margaret Mian (2006), Introduction to Chinese Dialectology, LINCOM Europa, ISBN 978-3-89586-629-6
Sinitic languages
Overview
Definition and scope
The Sinitic languages constitute a primary branch of the Sino-Tibetan language family, comprising all varieties descended from a common ancestor known as Proto-Sinitic.[7] These languages, often collectively referred to as Chinese varieties, are spoken natively by over 1.3 billion people worldwide, making them the largest sub-branch within the family.[8] The scope of Sinitic encompasses major groups such as Mandarin (Northern Chinese), Wu (including Shanghainese), Yue (including Cantonese), Min (including Hokkien), Xiang, Gan, Hakka, and Jin, among others, but excludes the other Sino-Tibetan languages, such as Tibetan and Burmese in the Tibeto-Burman branch. As part of the broader Sino-Tibetan family, Sinitic languages share distant genetic ties with Tibeto-Burman varieties but form a distinct clade defined by their unique phonological and syntactic developments from Proto-Sinitic.[9]
The term "Sinitic" emerged in 20th-century Western linguistics to describe this group as a language family rather than treating "Chinese" as a monolithic entity, highlighting the mutual unintelligibility among its varieties and their status as separate languages.[10] This nomenclature arose amid growing recognition of linguistic diversity in China, influenced by comparative studies that emphasized genetic relationships and typological distinctions over political or cultural unity.[11] Prior to this, European scholars often lumped all varieties under "Chinese," but the adoption of "Sinitic" allowed for more precise classification within the Sino-Tibetan framework.[12]
Sinitic languages are characteristically tonal, with pitch contours on syllables distinguishing lexical meanings, and analytic in structure, relying on word order and particles rather than inflection for grammatical relations.[13] They predominantly follow a subject-verb-object (SVO) word order, which sets them apart from the more common SOV patterns in many Tibeto-Burman languages.[14] These features, inherited and elaborated from Proto-Sinitic, underscore the branch's typological profile as isolating languages with rich tonal systems.[15]
Terminology and nomenclature
The term "Sinitic" derives from the Late Latin Sinae, referring to the Chinese people or China, combined with the suffix -itic to form an adjectival descriptor for languages associated with this region.[16][17] This etymology traces back through Ancient Greek Sῖnai and possibly to the name of the Qin dynasty, reflecting early Western nomenclature for East Asian linguistic entities.[18] In linguistic scholarship, "Sinitic languages" serves as a neutral academic designation for the branch of the Sino-Tibetan language family encompassing various forms historically linked to China, distinct from the politically motivated term "Chinese language" or "varieties of Chinese" promoted by the People's Republic of China to emphasize national unity.[3][19] The latter framing underscores a single overarching language with regional dialects, aligning with state policies on cultural cohesion, whereas "Sinitic" highlights the independent linguistic status of its members based on structural criteria.[11]
A central controversy surrounds the classification of these varieties as "dialects" versus distinct "languages," a question largely driven by political considerations rather than by purely linguistic criteria such as mutual intelligibility.[3] For instance, Mandarin and Cantonese exhibit near-zero mutual intelligibility between monolingual speakers, comparable to the divide between French and Italian, supporting their recognition as separate languages.[20][21] This linguistic perspective is reflected in international standards: ISO 639-3 assigns separate identifiers to major varieties, such as cmn for Mandarin Chinese and yue for Yue Chinese (including Cantonese), under the macrolanguage zho for Chinese, thereby affirming their status as individual languages.[22][23]
Historical development
Origins in Proto-Sinitic
Proto-Sinitic, the reconstructed ancestral language of the Sinitic branch of Sino-Tibetan, is estimated to have been spoken around 1250 BCE in the context of early Bronze Age developments along the Yellow River. Its reconstruction relies on the comparative method, drawing primarily from the phonological data in Middle Chinese rime tables and dictionaries, such as the Qieyun (601 CE), combined with evidence from modern Sinitic dialects and oracle bone inscriptions. Scholars like William H. Baxter and Laurent Sagart have advanced this work by proposing a systematic phonology that accounts for rhyme categories and initial consonants, allowing backward projection to the proto-stage before significant sound changes occurred.
Key phonological features of Proto-Sinitic include the absence of lexical tones, which developed later during the transition to Middle Chinese due to the loss of final consonants. The language consisted of monosyllabic roots, each typically carrying a single morpheme, a trait that persists in descendant Sinitic languages and distinguishes them from more polysyllabic Sino-Tibetan relatives. Syllable structure was relatively simple, generally following a consonant-vowel (CV) or consonant-vowel-consonant (CVC) pattern, with possible prefixal elements but no complex clusters in the onset or coda beyond stops, nasals, and approximants.
The earliest direct evidence for Sinitic languages emerges from the Shang dynasty oracle bone inscriptions, dated to circa 1200 BCE, which represent an early form of logographic writing used for divination on animal bones and turtle shells.[24] These inscriptions, numbering over 150,000 fragments, include vocabulary related to rituals, astronomy, and administration, providing glimpses of a language already distinct from contemporaneous non-Sinitic tongues in the region. Proto-Sinitic likely expanded in the Yellow River valley amid interactions with non-Sinitic substrates, such as pre-existing Neolithic populations speaking possibly Austroasiatic or Hmong-Mien languages, which may have contributed loanwords and influenced early lexical development.
Evolution from Old to Middle Chinese
Old Chinese, spanning approximately 1250 to 200 BCE, represents the earliest well-attested stage of the Sinitic languages, primarily evidenced through the rhyming patterns in the Shijing (Classic of Poetry), a collection of ancient poems compiled around the 6th century BCE. Reconstructions of Old Chinese phonology, notably by Baxter and Sagart (2014), posit a system with post-glottalized initials such as *p', *t', and *k', alongside a rich inventory of initial consonant clusters and no lexical tones; instead, pitch variations were likely prosodic rather than phonemic. These features distinguished Old Chinese from later stages, with the glottalization contributing to subsequent aspiration patterns without altering core syllable structure.
The evolution to Middle Chinese, roughly 200 to 900 CE, marked a pivotal transformation driven by the simplification of syllable codas and the emergence of tonality. Final consonants in Old Chinese, including stops (*-p, *-t, *-k) and fricatives (*-s), were progressively lost, leading to compensatory pitch contours that developed into the four tones of Middle Chinese: level (píngshēng), rising (shǎngshēng), departing (qùshēng), and entering (rùshēng). This tonogenesis process is exemplified in how Old Chinese voiceless finals yielded level tones, while voiced finals produced rising or departing tones, with the entering tone preserving the brevity of stop-final syllables. The Qieyun rhyme dictionary, compiled in 601 CE under Lu Fayan, provides the primary documentation of this system, categorizing syllables into 193 rhymes and using the fanqie method to indicate pronunciations based on the Sui dynasty standard near Chang'an.[25][26]
Notable sound shifts underscore this transition, such as the regular correspondence of Old Chinese aspirated stops like *kʰ to Middle Chinese kh, maintaining aspiration while other initials simplified; for instance, *kʰaŋ (high) evolves to Middle Chinese kʰaŋ without further change in the initial. The four-tone split further diversified the system, with rising and departing tones arising from mergers of Old Chinese categories influenced by initial voicing. Buddhist texts, translated from Indic languages starting in the 2nd century CE, significantly aided documentation by necessitating precise phonetic glosses; these translations employed early fanqie notations and introduced loanwords that highlighted tonal contrasts, informing the phonological analyses in works like the Qieyun.[27]
Regional divergences in Sinitic speech emerged during the Han dynasty (206 BCE–220 CE), as migrations southward and westward, prompted by imperial expansion, warfare, and economic pressures, exposed northern varieties to substrate influences from non-Sinitic languages in the Yangtze and Pearl River basins. These population movements, involving millions of Han settlers, initiated subtle phonetic variations, such as differential treatment of initials and tones across emerging regional norms, setting the stage for later dialectal branching without yet fully fragmenting the literary standard.
Modern divergence and standardization
During the 19th and 20th centuries, a period spanning the late Qing dynasty and its fall in 1911, the divergence among Sinitic varieties intensified, driven by rapid urbanization and extensive population migrations triggered by wars, economic shifts, and colonial influences. These movements, including the Taiping Rebellion (1850–1864) and subsequent labor migrations to urban centers and overseas, resulted in increased dialect contact in some areas but also fostered the emergence of regionally distinct urban speech forms, as speakers adapted local varieties to new social contexts without a unifying standard. For instance, in southern cities like Guangzhou and Shanghai, urbanization reinforced Yue and Wu features amid influxes of northern migrants, exacerbating phonological and lexical differences from northern Mandarin varieties.[28][7]
In the Republican era (1912–1949), efforts to address this growing linguistic fragmentation culminated in the Baihua movement, launched as part of the May Fourth Movement in 1919, which advocated replacing classical wenyan with vernacular baihua to modernize education and national communication. This initiative standardized the written vernacular primarily on the Beijing dialect of Mandarin, promoting it as guoyu (national language) through school curricula, publications, and radio broadcasts, though implementation varied regionally due to political instability. The movement marked a pivotal shift toward phonological norms based on northern speech, influencing literary and administrative language across China.[29]
After the establishment of the People's Republic of China in 1949, state policies further centralized standardization to counter dialectal divergence and support national unity. The first national conference on Chinese language reform in October 1955 designated Putonghua, based on Beijing phonology, northern grammar, and modern vernacular vocabulary, as the official national standard, with mandatory use in education, media, and government by 1956. Complementing this, the State Council promulgated the Scheme for Simplifying Chinese Characters in January 1956, reducing stroke counts for over 2,000 characters to boost literacy rates among speakers of diverse varieties. These measures built on Middle Chinese foundations by prioritizing Mandarin as a lingua franca, though regional varieties persisted in informal domains.[30][31][32]
Divergence metrics underscore the deep historical separation among major Sinitic branches; for example, lexical similarity between Mandarin and Yue is approximately 24%, reflecting a split estimated around 2,000 years ago during the late Old Chinese to early Middle Chinese period. This ancient bifurcation, combined with 20th-century sociopolitical factors, highlights ongoing challenges in standardization efforts.[33][34]
Demographics and distribution
Global speaker population
As of 2025, the Sinitic languages collectively have approximately 1.3 billion native speakers worldwide, making them one of the largest language groups by native speaker population. This figure encompasses all major varieties spoken primarily in China, Taiwan, Singapore, and diaspora communities. Among these, Mandarin varieties account for the largest share, with about 990 million native speakers.[35][36] The speaker base has grown steadily due to historically high birth rates in China, which have sustained a population where over 90% are native Sinitic speakers, combined with expansion in overseas Chinese communities estimated at around 50 million individuals maintaining these languages.[37][38]
A breakdown by major variety shows their relative scale as of 2025: Yue (including Cantonese) has roughly 85 million native speakers, Wu (including Shanghainese) about 83 million, and Min (including Hokkien and Teochew) around 75 million. These groups, alongside smaller varieties like Hakka and Gan, contribute to the family's dominance.[39][40][41][38]
Additionally, non-native speakers extend the total reach by an estimated 200 million individuals who use Mandarin as a second language, largely driven by China's national education policies promoting it as the standard tongue across diverse linguistic regions.[35][2]
Regional variations and migration
The Sinitic languages display distinct regional distributions across China, shaped by historical settlement patterns and geographical barriers. Mandarin varieties dominate in northern and central China, encompassing provinces such as Hebei, Shandong, Henan, and extending into the southwest, where they serve as the primary lingua franca for over 950 million speakers.[36] In contrast, Yue varieties, including Cantonese, are concentrated in the southern provinces of Guangdong and Guangxi, as well as in Hong Kong and Macau, with around 70 million speakers maintaining these forms in daily communication.[42] Min varieties prevail in southeastern coastal areas, particularly Fujian province and Taiwan, where subgroups like Hokkien support approximately 75 million speakers in local contexts.[43]
Historical migrations have extended these regional varieties into diaspora communities, profoundly influencing global linguistic landscapes. From the mid-19th century onward, labor migrations driven by economic opportunities and domestic upheavals propelled millions of Chinese overseas, primarily from southern provinces.[44] In Southeast Asia, migrants from Fujian established vibrant Hokkien-speaking (Southern Min) communities, notably in Singapore, where it remains a key heritage language spoken in about 11% of ethnic Chinese households as of the 2020 census.[45] Similarly, 19th-century laborers from Guangdong carried Yue varieties to North America, shaping Chinatowns in cities like San Francisco and New York with Taishanese dialects that persist in older generations.[46] These movements also reached Europe, though on a smaller scale during the period, fostering pockets of Sinitic language use in port cities like Liverpool and Amsterdam through subsequent waves tied to colonial trade.[44]
Within China, rapid urbanization since the late 20th century has accelerated a shift toward Mandarin dominance in cities, diminishing the everyday role of traditional varieties. As rural populations migrate to urban centers for employment, Mandarin facilitates integration into diverse workforces and education systems, leading to intergenerational language changes.[47] Data from the 2020 national census indicate that Mandarin is spoken by 80.72% of the population overall, with usage exceeding this rate in urban areas, where over 900 million residents live, due to its role as the standardized medium for professional and social interactions.[48] By 2023, the urbanization rate had risen to 66.16%, with over 940 million urban residents, further reinforcing Mandarin as the primary language among city dwellers.[49][50]
Varieties and classification
Bai varieties
The Bai languages form a small group spoken primarily by the Bai ethnic group in northwest Yunnan Province, China, with an estimated 1–2 million speakers as of 2023. These languages are concentrated around the Erhai Lake region, including areas in Dali Bai Autonomous Prefecture and surrounding counties. The group encompasses several closely related varieties, with the main dialects being Jianchuan (also known as Central Bai), Dali (Southern Bai), and Bijiang (Northern Bai, sometimes including the Lemo subdialect).[51] Jianchuan and Dali dialects are the most prominent, serving as prestige forms within their respective subregions, while Bijiang shows greater divergence due to geographic isolation in the Nujiang Valley.[52]
Bai varieties exhibit distinctive phonological and lexical features that set them apart from core Sinitic languages. Their tonal systems typically feature 6 to 8 tones, varying by dialect; for instance, Jianchuan Bai has seven tones, including level, rising, falling, and checked contours, often with modal and breathy voice qualities distinguishing them.[53][54] Lexically, Bai retains conservative elements traceable to Old Chinese, such as archaic pronunciations and vocabulary items like the word for "red" (preserved as *t-qʰrAk > chì in Old Chinese parallels) and terms for natural phenomena like "sky" and "wind," which align closely with northwestern Old Chinese forms.[55] These retentions reflect historical layers of contact and substrate influence in the region.
As of 2024, the affiliation of Bai within the Sino-Tibetan family remains highly debated among linguists. Proponents of its Sinitic classification point to extensive lexical and phonological correspondences with Old Western Chinese, suggesting it diverged early from a northwestern Sinitic branch.[56][57] However, others argue it constitutes a separate Sino-Tibetan branch, heavily influenced by Tibeto-Burman languages, particularly Qiangic and Loloish subgroups, due to substrate effects and the presence of non-Sinitic morphological traits in its core vocabulary (e.g., up to 40% non-Chinese roots in basic lists).[58][59] This view is supported by analyses showing stratified Chinese borrowings overlaying a Tibeto-Burman base, complicating straightforward Sinitic assignment.[60]
Mandarin varieties
Mandarin varieties form the largest branch of the Sinitic languages, numerically and geographically, spoken by approximately 920 million native speakers as of 2023 and covering about 70% of China's territory.[10][35] These varieties are mutually intelligible to a significant degree and serve as the foundation for Putonghua, the modern standard form of Chinese. This dominance stems from historical migrations and administrative policies that promoted northern speech forms during the Ming and Qing dynasties.[10]
The classification of Mandarin varieties typically divides them into several subgroups based on phonological and lexical differences. The Beijing and Northeastern subgroup, centered in Beijing and the northeast (including Heilongjiang, Jilin, and Liaoning provinces), forms the core of standard Putonghua, with its pronunciation serving as the norm for the national language. The Southwestern subgroup, spoken in Sichuan, Chongqing, and surrounding areas, exhibits variations in tone realization and vocabulary influenced by local geography. The Jin subgroup, primarily in Shanxi province, is sometimes classified separately but shares key Mandarin traits, including innovative tone patterns from ancient entering tones. Other notable subgroups include Jilu (in Hebei and Shandong) and Jiaoliao (in Shandong and Liaoning), which feature transitional phonologies between northeastern and central forms. These subgroups, while mutually intelligible, show regional accents and lexical items that reflect historical settlement patterns.[10]
Phonologically, Mandarin varieties are distinguished by a four-tone system (high level, rising, low dipping, and falling) resulting from splits and mergers of the four tone categories of Middle Chinese, with the ancient entering tone distributed among the other categories. A hallmark feature is the lack of word-final stop consonants (-p, -t, -k), which were lost in northern varieties between the 12th and 16th centuries, leading to open syllables ending in vowels or nasals. Additionally, erhua (儿化), the retroflex suffix -r derived from the diminutive particle ér (儿), is prevalent in northern subgroups, adding r-coloring to syllable finals for expressive or diminutive effect, as in huār (花儿) for "flower." These features contribute to the relatively uniform phonology across Mandarin, facilitating comprehension despite regional variations.[61][62]
Mandarin's central role in language unification was formalized in 1955 when the People's Republic of China designated Putonghua as the standard language, based primarily on the Beijing dialect's phonology and northern Mandarin grammar and vocabulary. This policy, aimed at promoting national cohesion, has since elevated Mandarin varieties as the medium of education, media, and government, reinforcing their influence over other Sinitic branches.[63]
Wu varieties
The Wu varieties, also known as Wu Chinese, form a major branch of the Sinitic languages spoken primarily in the lower Yangtze River region, including Shanghai, southern Jiangsu, northern Zhejiang, and parts of Anhui provinces.[64] With approximately 82 million native speakers as of 2023, Wu constitutes one of the largest Sinitic groups, concentrated in urban centers like Shanghai and rural areas of the Jiangnan region.[65] These varieties are characterized by their retention of archaic phonological elements from Middle Chinese, distinguishing them from more innovative branches like Mandarin. Wu is broadly divided into Northern and Southern subgroups, with Northern Wu encompassing dialects such as Shanghainese and Suzhounese spoken around the Taihu Lake basin, including Shanghai and Suzhou, while Southern Wu includes the Oujiang subgroup, prominently represented by the Wenzhou dialect in southern Zhejiang.[66] Northern varieties tend to show greater mutual intelligibility among themselves, whereas Southern ones, like Wenzhou, exhibit significant divergence even within Wu.[67] A hallmark of Wu phonology is its rich tonal system, typically featuring 5 to 8 tones depending on the variety, with complex tone sandhi rules that alter contours in connected speech.[68] Unlike many northern Sinitic languages, Wu preserves the Middle Chinese entering tone as a distinct category, often realized as syllables ending in a glottal stop or short vowels, and maintains voiced initials (e.g., /b/, /d/, /ɡ/) that have devoiced in other branches.[67][69] These features contribute to a phonemic inventory that is conservative yet diverse, with voiced obstruents influencing tone register splits into yin (higher) and yang (lower) categories.[70] The Wenzhou dialect within Southern Wu is particularly noted for its extreme phonological complexity and low intelligibility with other Sinitic varieties, earning it a reputation as a "cryptologic" language historically used by merchants for 
secretive business dealings to exclude outsiders.[71]
Culturally, Wu varieties underpin traditional arts such as Pingtan, a narrative performance genre combining storytelling and ballad-singing in the Suzhou dialect, which preserves Jiangnan folklore and literary traditions.[72] They also form the basis for Wu literature, including vernacular short story collections like the Sanyan by Feng Menglong, which reflect the social and moral themes of the Wu-speaking regions during the Ming dynasty.[73]
Yue varieties
The Yue varieties, also known as Yue Chinese, constitute a major branch of the Sinitic language family, primarily spoken in the southern Chinese provinces of Guangdong, Guangxi, and parts of Hainan, as well as in diaspora communities worldwide. This group encompasses several subgroups, with the most prominent being Cantonese (or Yuehai, centered in Guangzhou and Hong Kong), Taishanese (or Hoisan, from the Siyi region), and Gaoyang (a transitional variety in western Guangdong). Collectively, Yue varieties are spoken by approximately 86 million native speakers as of 2023, making them one of the largest Sinitic subgroups by native speaker count.[74] These languages are characterized by their mutual intelligibility within core subgroups but significant divergence across broader dialects, influenced by regional geography and historical isolation. Linguistically, Yue varieties are distinguished by their rich tonal systems, typically featuring 6 to 9 tones depending on the specific dialect, which evolved from Middle Chinese through a process of tone splitting and merger distinct from northern Sinitic branches. They retain three stop finals (-p, -t, -k) in syllable codas, a conservative trait lost in many other modern Sinitic languages, allowing for closed syllables that contribute to their rhythmic complexity. Additionally, Yue employs elaborate diminutive suffixes, such as -zai, -lo, and -zi, which add nuanced expressiveness to nouns and verbs, reflecting a high degree of morphological innovation compared to more analytic Sinitic varieties. These features underscore Yue's role as a southern conservative in preserving archaic phonetic elements while developing unique lexical and syntactic patterns. 
Yue's global prominence stems from 19th-century emigration waves, particularly during the California Gold Rush and British colonial expansions, which carried Cantonese speakers to the Americas, Southeast Asia, the United Kingdom, and Australia, establishing it as one of the most widely spoken Sinitic varieties outside mainland China. Today, vibrant diaspora communities in cities like San Francisco, Vancouver, and London maintain Yue through family and cultural networks, often blending it with local languages. Furthermore, the influence of Hong Kong cinema and media since the mid-20th century has amplified Cantonese's reach, popularizing its idioms, songs, and slang in films and television that circulate internationally, fostering a sense of cultural identity among global speakers.
Min varieties
The Min varieties constitute one of the most diverse and internally fragmented branches of the Sinitic languages, encompassing several major subgroups that exhibit significant phonological and lexical differences. The primary subgroups include Southern Min (also known as Min Nan or Hokkien/Taiwanese), Eastern Min (Min Dong), and Northern Min (Min Bei), along with smaller divisions such as Central Min and Puxian Min.[75] These varieties are spoken by approximately 76 million native speakers as of 2023, primarily in southeastern China.[74] The dialects within these subgroups are largely mutually unintelligible, with speakers of one variety often unable to comprehend others without prior exposure, reflecting the branch's deep internal fragmentation.[33] For instance, Hokkien speakers in Taiwan may struggle to understand Northern Min varieties from northern Fujian.
Phonologically, Min varieties are characterized by complex tone systems typically ranging from 5 to 7 tones.[76] Southern Min in particular preserves the stop codas -p, -t, and -k alongside the nasal codas -m, -n, and -ŋ, contributing to an archaic sound profile and further complicating intelligibility across subgroups.[77] This tonal and coda structure underscores the varieties' retention of pre-Middle Chinese elements, setting them apart in the broader Sinitic family. 
The Min branch represents the earliest divergence among Sinitic languages, with Proto-Min splitting from the rest of Old Chinese around 2,500 years ago, prior to the establishment of Middle Chinese in the 6th century CE.[78] This ancient separation is evidenced by substrate influences from pre-Han Austroasiatic languages spoken in southern China, including lexical borrowings related to agriculture and local flora that persist in Min vocabularies.[79] As a result, Min varieties maintain conservative traits not found in northern Sinitic branches.
Hokkien, the most prominent Min Nan variety, has played a significant role in overseas communities due to historical maritime trade. Hokkien merchants from Fujian bartered goods with indigenous groups in the Philippines via southern Taiwan routes as early as the 13th century, and during the Ming dynasty (1368–1644) they established trading networks that facilitated migration to Southeast Asia, with intensified activity after the 1550s.[80] Similar trade links extended to Singapore, contributing to enduring Hokkien-speaking diaspora populations there. Min varieties are concentrated in Fujian province and Taiwan, with migrations shaping their global distribution.[10]
Hakka varieties
Hakka varieties, spoken by approximately 44 million native speakers worldwide as of 2023, are primarily concentrated in southern provinces of China such as Guangdong, Jiangxi, and Fujian, as well as in Taiwan.[74] These varieties form a distinct branch of the Sinitic languages, characterized by their relative homogeneity compared to other Sinitic groups, owing to the shared migratory history of Hakka speakers.[81] The major subgroups of Hakka include the Meixian (also known as Jiaying) dialect, which serves as the prestige or standard form and is centered in Meizhou, Guangdong; and the Dabu dialect, prominent in northeastern Guangdong and among migrant communities in Taiwan. Other notable subgroups encompass Hailu, Sixian, and Raoping, with Meixian and Dabu together representing a significant portion of speakers, particularly in Taiwan where Dabu influences local varieties. These subgroups exhibit minor phonological and lexical differences but remain mutually intelligible, facilitating communication across Hakka-speaking regions.[82] Linguistically, Hakka varieties are distinguished by their six-tone system, a feature shared by the majority of dialects, which contrasts with the more varied tonal profiles in neighboring Sinitic languages. They retain conservative initial consonants, including the preservation of the velar nasal *ŋ- (ng-) in words like ngai "I," reflecting archaic Middle Chinese phonology more closely than many southern varieties. The vocabulary of Hakka also bears traces of northern origins, incorporating terms and expressions from earlier Han migrations, such as kinship and agricultural lexicon that differ from southern Sinitic norms.[83][84][85] Hakka migrations, particularly those in the 19th century driven by economic opportunities and social unrest in southern China, established vibrant communities in Taiwan and Malaysia, where Hakka speakers were often designated as "guest people" (Hakka) by local populations. 
These migrations contributed to intergroup tensions, exemplified by the Punti-Hakka Clan Wars (1855–1867) in Guangdong, which arose from land disputes and cultural differences between Hakka newcomers and established Punti (Cantonese) residents.[86][87]
The resilience of Hakka varieties is closely tied to the community's strong clan structures, which have historically fostered endogamy, communal living in fortified tulou dwellings, and cultural transmission practices that prioritize language maintenance across generations. These social organizations continue to support Hakka linguistic vitality, even in diaspora settings, by reinforcing identity through festivals, education, and family-based language use.[88][89]
Gan varieties
Gan varieties, collectively known as Gan Chinese, constitute a major branch of the Sinitic languages spoken primarily in Jiangxi Province and adjacent regions of southeastern China, including parts of Hubei, Hunan, Anhui, and Fujian. With an estimated 22 million native speakers as of 2023, Gan serves as the dominant linguistic group in Jiangxi, where around 29 million individuals use it as their primary language including second-language speakers.[90] The varieties are geographically concentrated in central and northern Jiangxi, reflecting historical migrations and settlements that have shaped their distribution.[91] Key subgroups within Gan include the Nanchang variety, centered in the provincial capital, and the Yichun variety, spoken in the western part of Jiangxi. These subgroups exhibit internal diversity, with Nanchang Gan representing a more standardized form influenced by urban development, while Yichun Gan preserves more conservative rural traits. Phonologically, Gan varieties are characterized by 5 to 7 tones in most cases, though some dialects display up to 10 distinct tones due to historical tone splits from Middle Chinese. They feature a hybrid profile, with Mandarin-like initials such as the retroflex series (/ʈ/, /ʈʂ/, /ʂ/) and a relatively full set of stops, combined with southern finals that retain Middle Chinese codas like -p, -t, and -k in certain environments.[69][91][92] In Sinitic classification, Gan occupies a transitional position between northern and southern branches, often bridging Mandarin to the north with Xiang and Wu to the south through shared innovations and retentions. This intermediary role is evident in its moderate mutual intelligibility with Mandarin, driven by substantial lexical overlap estimated at around 60-70% for core vocabulary. 
Culturally, Gan varieties underpin traditional Gan opera (Ganju), a performative art form originating in Jiangxi that integrates music, dialogue, and dance, recognized as part of China's national intangible cultural heritage since 2008.[91][85][93]
Xiang varieties
The Xiang varieties, spoken primarily in Hunan province in south-central China, form one of the major subgroups of the Sinitic languages and are estimated to have around 37 million native speakers as of 2023.[74] These varieties exhibit significant internal diversity, reflecting layers of historical development influenced by neighboring Sinitic groups, while maintaining distinct phonological characteristics. The primary areas of concentration include the central and western parts of Hunan, with extensions into adjacent regions of Hubei, Guizhou, and Guangxi provinces.[94] Xiang varieties are conventionally divided into two main subgroups: New Xiang and Old Xiang. New Xiang, centered around Changsha (the provincial capital) and extending to areas like Zhuzhou and Xiangtan, represents the more innovative branch, with dialects showing substantial convergence with neighboring Mandarin varieties. Old Xiang, spoken in regions such as Loudi, Hengyang, and Xiangxiang in southwestern Hunan, preserves more conservative features and is less affected by external influences. This subgrouping, originally proposed by linguist Yuan Jiahua, is based on differences in initial consonant voicing and other phonological traits, with New Xiang having largely lost the voiced initials typical of earlier Sinitic stages.[95][96] Phonologically, Xiang varieties are characterized by 5 to 6 tones, often split into upper (yin) and lower (yang) registers that trace back to the voicing distinction in Middle Chinese syllable initials—voiceless initials yielding higher-pitched tones and voiced ones lower-pitched. A key conservative feature is the retention of Middle Chinese checked (entering) tones, which appear as short, abrupt syllables typically ending in a glottal stop or unreleased stop, distinguishing them from the tone mergers seen in Mandarin. 
For example, in the Changsha dialect of New Xiang, the tone system includes six contours, with the entering tone realized as a mid-rising but clipped contour. Old Xiang dialects, such as Xiangxiang, may merge some tones but still uphold the register split and checked syllables more robustly than their New Xiang counterparts.[96][95]
Compared to New Xiang, Old Xiang conserves more ancient phonological elements, including partial retention of voiced stops and nasals in initials, which contribute to its relative resistance to Mandarinization; in contrast, New Xiang displays devoicing and aspiration patterns akin to Southwestern Mandarin, reflecting prolonged contact in the northern Hunan basin. This layered conservatism in Xiang highlights its position as a transitional group between northern and southern Sinitic varieties, with Old Xiang embodying deeper historical strata from pre-Ming migrations. A prominent historical figure associated with Xiang is Mao Zedong, who was born in Shaoshan, Hunan, and was a native speaker of the local Xiang variety.[95][97]
Hui varieties
The Hui varieties, also known as Huizhou Chinese, form a distinct group of Sinitic languages spoken primarily in southern Anhui Province in eastern China, with some extension into adjacent areas of Zhejiang and Jiangxi provinces. These varieties are estimated to have around 5 million native speakers as of 2023, concentrated in the historical Huizhou region.[74] The Language Atlas of China classifies the Hui group as an independent branch of Sinitic languages, divided into five main subgroups: Ji–She, Tun–Xi, Yi–Jing, Dong–Qian, and Jing–De. Prominent among these are the Shexian and Tunxi subgroups, which represent key dialectal centers in Anhui.[98]
Phonologically, Hui varieties are characterized by 6 to 8 tones, typically including a glottalized checked tone that is often weakened in modern speech. They exhibit mergers of several Middle Chinese distinctions, particularly in initial consonants and rhyme categories, alongside retention of Wu-like voiced obstruent initials.[98] While frequently affiliated with Wu varieties due to shared phonological traits such as initial voicing, Hui displays notable Gan influences in areas like tone splits and consonant developments, positioning it as transitional between the two. Mutual intelligibility with standard Mandarin remains low, reflecting significant lexical and phonological divergence.
The Hui varieties are closely tied to the region's Huizhou merchant culture, a historically influential network of traders from the Ming and Qing dynasties that shaped local economy, architecture, and social structures through commerce and Confucian values.[99]
Pinghua and other minor varieties
Pinghua varieties are spoken primarily in the Guangxi Zhuang Autonomous Region of southern China, where they function as trade languages in multi-ethnic areas alongside Zhuang and other local tongues. These varieties are divided into Northern Pinghua (Guibei) and Southern Pinghua (Guinan), which are not mutually intelligible and exhibit distinct phonological and lexical profiles influenced by contact with surrounding non-Sinitic languages, such as borrowing from Northern Zhuang vocabulary while retaining relatively conservative Sinitic grammatical structures.[15][100] Spoken by approximately 4 million people as of 2023, Pinghua represents a fringe branch within Sinitic classifications, sometimes grouped with Yue varieties due to geographic proximity but often treated as a separate entity owing to its unique areal features and uncertain phylogenetic position.[101][102] Jin varieties, centered in Shanxi Province and extending into adjacent regions of Inner Mongolia, Shaanxi, and Hebei, form another major but debatably independent group within the Sinitic family, spoken by approximately 47 million native speakers as of 2023. As of 2024, linguistic consensus remains divided on whether Jin constitutes a separate primary branch from Mandarin, with some classifications including it within Mandarin due to mutual intelligibility, while others treat it independently based on phonological distinctions. 
Unlike standard Mandarin, Jin is distinguished by phonological features such as the retention of entering tones (short syllables with glottal stops) and patterns of vowel raising in palatalized contexts, where high front vowels trigger front-raising while non-palatalized ones lead to back-raising.[74] These features highlight Jin's transitional role between northern and southern Sinitic branches, with some dialects showing limited palatalization of velars compared to Mandarin, contributing to its ongoing debate over inclusion within the Mandarin continuum.[103]
Other minor Sinitic varieties include Waxiang, a conservative isolate spoken by around 320,000 people in the remote northwestern mountainous areas of Hunan Province. Waxiang maintains archaic syntactic and lexical elements, such as polyfunctional comitative markers derived from verbs like 'to follow', setting it apart as an unclassified member of the family amid heavy contact with neighboring Xiang and Southwestern Mandarin varieties.[104] These fringe varieties collectively account for roughly 5–10% of Sinitic linguistic diversity, often facing pressures from assimilation into dominant regional forms like Mandarin.[105]
Internal relationships
Major classification proposals
The traditional classification of Sinitic languages recognizes seven major dialect groups, a framework established by Yuan Jiahua in his 1960 work Hanyu fangyan gaiyao. These groups are Mandarin (Guānhuà 官话), Wu (Wú 吴), Min (Mǐn 闽), Xiang (Xiāng 湘), Gan (Gàn 赣), Hakka (Kèjiā 客家), and Yue (Yuè 粤); they are based primarily on phonological criteria derived from Middle Chinese, such as the treatment of initial and final consonants, tones, and vowel systems.[106] This schema has served as the foundation for much of subsequent dialectology in mainland China, emphasizing regional coherence within each group while acknowledging internal diversity.[107]
Modern proposals have refined and expanded this classification, incorporating additional subgroups to account for varieties that do not fit neatly into the original seven, resulting in schemes with 10 to 14 primary branches. For instance, the Language Atlas of China (Wurm et al., 1987) delineates 10 main branches, including the original seven plus Jin (Jìn 晋), Hui (Huī 徽), and Pinghua (Pínghuà 平话), with further subdivisions based on isoglosses for phonological and lexical features.[108] Jerry Norman, in his 1988 monograph Chinese, organizes these into three broader zones, Northern (Mandarin and Jin), Central (Wu, Gan, Xiang, and Hui), and Southern (Min, Hakka, and Yue), while advocating for the recognition of Pinghua and other transitional varieties as distinct due to their unique retention of archaic traits like preserved entering tones.[109] Ethnologue similarly lists over a dozen coordinate subgroups under the Chinese macrolanguage, encompassing these expansions and treating varieties like Dungan and Taiwanese Min as separate entries to reflect global distribution and mutual unintelligibility. 
Despite these advancements, there remains no consensus on the precise number of primary branches, with scholars proposing anywhere from 9 to 13 based on varying criteria such as shared innovations, borrowing patterns, and substrate influences.[11] This variability stems from the dialect continuum nature of Sinitic, where boundaries are often gradual rather than discrete.[108]
A particular point of contention involves Macro-Bai, a cluster of languages spoken in Yunnan Province, whose inclusion within Sinitic is debated. Proponents of Sinitic affiliation, such as those examining cognates and syntactic parallels with Old Chinese, argue it represents a conservative branch influenced by local Tibeto-Burman elements.[56] Conversely, others classify it as a distinct Tibeto-Burman offshoot with heavy Sinitic borrowing, citing phonological divergences like non-Sinitic tone splits and morphology.[60] This debate underscores the challenges in delineating Sinitic boundaries amid historical contact.[57]
Debates on northern vs. southern branches
The classification of Sinitic languages into northern and southern branches has been a central debate in Chinese dialectology, reflecting deep typological and historical divergences within the family. Northern varieties, primarily centered around the Yellow River basin, are characterized by fewer tones—typically four or five—and innovative syntactic features, such as a stronger preference for subject-verb-object word order and reduced use of classifiers, which align them more closely with neighboring Altaic languages. Southern varieties, spoken along the Yangtze River and further south, exhibit more complex tonal systems—often six or more tones—and more conservative phonological structures, preserving ancient initial consonants and final stops that have been lost in the north; these include groups like Wu, Min, and Yue. A key controversy surrounds the origins of these north-south differences: whether southern varieties reflect substrate influences from pre-existing non-Sinitic languages, such as Tai-Kadai (also known as Kra-Dai), or result from parallel internal evolution within Sinitic. Proponents of the substrate hypothesis argue that features like elaborate tone splits and head-initial tendencies in southern varieties stem from contact with indigenous Tai-Kadai languages during Han Chinese migrations southward, evidenced by shared lexical items and phonological patterns, such as the retention of certain syllable finals in Cantonese that mirror Tai structures.[110][111] In contrast, advocates for parallel evolution contend that these traits arose independently through divergence from a common proto-Sinitic ancestor, driven by geographic isolation and areal pressures rather than direct borrowing, with limited direct evidence of widespread lexical borrowing from Tai-Kadai substrates.[111] This debate is complicated by transitional central varieties, which blend northern and southern traits, challenging a strict binary division. 
Lexical similarity often serves as a practical metric in this discussion, with a threshold of around 70% commonly invoked to distinguish northern from southern branches, below which mutual intelligibility diminishes significantly; for instance, comparisons between representative northern (e.g., Beijing Mandarin) and southern (e.g., Guangzhou Cantonese) varieties yield similarities of only 20–30%, underscoring their separation.[112] Historical migrations of Sinitic speakers southward during periods of dynastic upheaval likely exacerbated these divides, incorporating local substrates without fully erasing proto-Sinitic foundations.[7] Overall, while the northern-southern framework provides a useful heuristic for understanding Sinitic diversity, ongoing typological analyses continue to refine its boundaries, emphasizing convergence over rigid genetic splits.[113]
Quantitative and phylogenetic analyses
Quantitative analyses of Sinitic languages have employed lexicostatistical methods, such as Swadesh lists, to measure lexical similarity and estimate divergence times among varieties. For instance, comparisons using a 200-item Swadesh list reveal that Mandarin shares approximately 31% lexical similarity with Wu varieties like Shanghainese, indicating substantial divergence while still reflecting a shared Sinitic heritage.[33] These methods, though controversial due to assumptions about uniform vocabulary retention rates, suggest that major Sinitic branches, including Mandarin and Wu, began diverging around 1,500 years ago from a common Middle Chinese ancestor, aligning with historical records of dialectal fragmentation during the Tang dynasty.[114]
Phylogenetic studies since 2019 have advanced these estimates through Bayesian models that incorporate lexical, phonological, and syntactic data to reconstruct family trees and divergence timelines. A 2019 Bayesian analysis of 50 Sino-Tibetan languages, including multiple Sinitic varieties, dated the family's origin to approximately 7,200 years before present, with Sinitic emerging as a primary branch around 5,900 years ago; within Sinitic, Min varieties are positioned as the earliest diverging group, preserving archaic features like complex tone systems.[115] A subsequent 2020 study using expanded datasets estimated the initial divergence between Sinitic and Tibeto-Burman at approximately 8,000 years before present, confirming Sinitic's position near the root of the Sino-Tibetan tree; internal Sinitic divergences, such as those separating Min as a basal branch from other groups like Mandarin and Wu, are estimated at around 2,000–3,000 years ago based on linguistic and historical evidence.[116] These models highlight reticulate evolution due to areal contacts, challenging strict tree-based assumptions. 
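The lexicostatistical comparison described in this section can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `lexical_similarity`, the five-concept word lists, and the integer cognate-class codes are all hypothetical and are not taken from any cited study. A real comparison would use a full Swadesh list with cognacy judgments made by specialists.

```python
from typing import Mapping


def lexical_similarity(list_a: Mapping[str, int], list_b: Mapping[str, int]) -> float:
    """Return the share of concepts coded as cognate in both word lists.

    Each mapping goes from a concept label to a cognate-class code;
    two varieties share a cognate for a concept when the codes match.
    Only concepts present in both lists are compared.
    """
    shared_concepts = [concept for concept in list_a if concept in list_b]
    if not shared_concepts:
        raise ValueError("the two lists have no concepts in common")
    matches = sum(1 for c in shared_concepts if list_a[c] == list_b[c])
    return matches / len(shared_concepts)


# Hypothetical cognate-class codes for two unnamed varieties (illustration only).
variety_x = {"I": 1, "you": 1, "water": 1, "eat": 1, "not": 1}
variety_y = {"I": 1, "you": 2, "water": 1, "eat": 2, "not": 2}

print(f"{lexical_similarity(variety_x, variety_y):.0%}")  # prints "40%"
```

The percentage depends entirely on how cognacy is coded, which is one reason such figures vary between studies.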
Recent interdisciplinary correlations between DNA and linguistics further support a northern origin for Sinitic varieties, linking them to ancient millet farmers in the Yellow River basin. Genomic analyses of Neolithic remains show that populations associated with millet agriculture (ca. 5,000–3,000 BCE) contributed significantly to the ancestry of northern Han Chinese speakers, who exhibit genetic affinities with these early farmers; this admixture pattern correlates with the spread of Sinitic linguistic features southward.[117] Post-2020 studies, including 2023–2024 research, have refined these findings with evidence of multiple agriculture-driven migrations from northern China, integrating linguistic phylogenies, archaeology, and genetics to explain Sino-Tibetan dispersal, including Sinitic branches.[118] Computational studies leveraging AI-driven methods like knowledge graphs and embedding models have further refined phonological reconstructions, particularly rhyme correspondences across dialects, yielding divergence estimates of 2,000–3,000 years ago for internal splits among the major branches, while accounting for substrate influences from non-Sinitic languages.[119]
Linguistic features
Phonological systems
Sinitic languages share a core phonological profile characterized by monosyllabic morphemes and a relatively simple syllable structure inherited from Middle Chinese, typically following the template (C)V(N), where the optional initial consonant (C) is followed by a vowel nucleus (V) and an optional coda (N) limited to nasals or, in certain varieties, stops.[108] This structure reflects an analytic trait, with stress and intonation playing minimal roles compared to lexical tone for lexical distinction. All varieties derive their tonal systems from the four Middle Chinese tones, level (píng), rising (shǎng), departing (qù), and entering (rù), through mergers and splits that yield 4 to 9 tones today, with the entering tone often preserved as a short, checked syllable in southern varieties.
A defining phonological evolution across Sinitic languages is the early loss of Old Chinese consonant codas, including liquids and fricatives, which simplified syllable endings and contributed to tone development through compensatory mechanisms.[120] However, southern varieties like Yue, Min, and Hakka retain the stop codas -p, -t, and -k from the entering tone, distinguishing them from northern Mandarin, which reduced codas to only nasals -n and -ŋ.[108] Initial consonants also show variation: northern varieties devoiced Middle Chinese voiced obstruents, resulting in aspirated or unaspirated voiceless stops, while southern groups such as Wu preserve voiced initials (e.g., /b/, /d/, /g/), enhancing consonant inventory diversity. 
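As a rough illustration of how four inherited tonal categories can grow into larger inventories, the sketch below models the traditional register split, in which each Middle Chinese tone divides into a yin (upper) category after voiceless initials and a yang (lower) category after voiced initials. The function name `tone_category` and the boolean treatment of voicing are simplifying assumptions made for this example. Actual varieties merge or further split these categories in different ways, so the eight-way scheme is an idealization rather than the inventory of any particular variety.

```python
# The four Middle Chinese tonal categories (level, rising, departing, entering).
MC_TONES = ("ping", "shang", "qu", "ru")


def tone_category(mc_tone: str, voiced_initial: bool) -> str:
    """Name the idealized category produced by the yin/yang register split.

    Syllables with voiceless initials fall in the yin (upper) register,
    syllables with voiced initials in the yang (lower) register.
    """
    if mc_tone not in MC_TONES:
        raise ValueError(f"unknown Middle Chinese tone: {mc_tone!r}")
    register = "yang" if voiced_initial else "yin"
    return f"{register} {mc_tone}"


# Four tones times two registers gives eight idealized categories.
categories = [tone_category(t, v) for t in MC_TONES for v in (False, True)]
print(categories)
```

A variety that keeps all eight categories apart has eight tones, while a variety that merges several of them ends up with a smaller inventory.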
Tone inventories differ markedly by branch: Mandarin features a canonical four-tone system (high level, high rising, low dipping, high falling), whereas Min varieties often have seven tones, incorporating splits from the departing tone and preserved entering distinctions.[108] Wu dialects typically exhibit seven or eight tones with complex sandhi rules, and Yue maintains six tones plus entering stops, as in Cantonese, where syllables like sap6 (十, "ten") end in -p. These variations underscore regional conservatism in the south versus simplification in the north. Tone sandhi, the contextual change of tones in connected speech, is ubiquitous but varies in scope, from Mandarin's third-tone reduction to more extensive right-dominant patterns in Min.[121]
Recent acoustic research in the 2020s has illuminated tone sandhi dynamics in understudied varieties. In Lishui Wu (southern Wu), a 2023 study measured fundamental frequency (F0) trajectories, revealing that sandhi applies progressively across trisyllabic sequences, with rising tones triggering mid-level realizations in preceding syllables, confirming phonological rules through precise durational and pitch height analyses.[122] Similarly, a 2022 acoustic investigation of Zhangzhou Southern Min demonstrated right-dominant sandhi, where the final tone spreads leftward, altering F0 contours in up to 80% of disyllables, with checked tones resisting full assimilation due to glottalization cues.[123] These findings highlight how acoustic properties reinforce lexical tone contrasts amid sandhi variability.
Grammatical structures
Sinitic languages are predominantly analytic in their grammatical structure, lacking inflectional morphology for tense, number, case, or gender, and relying instead on word order, particles, and context to convey grammatical relations.[124] They typically follow a subject-verb-object (SVO) word order in basic clauses, which distinguishes them from the more common SOV patterns in related Sino-Tibetan branches.[124] A hallmark of their syntax is the topic-comment structure, where the topic, often a noun phrase, is fronted to set the frame, followed by a comment providing new information about it, as seen in constructions like "Zhè běn shū, wǒ kàn guò" (This book, I have read) in Mandarin.[124] This organization prioritizes pragmatic prominence over strict subject-predicate alignment, allowing flexible topicalization across varieties.[125]
Numeral classifiers are mandatory in Sinitic languages when nouns are quantified or modified by demonstratives, serving to categorize and individuate referents based on shape, function, or other semantic properties.[126] In Mandarin, the general classifier gè is used for humans or abstract items (e.g., yī gè rén, one person), while běn is used for bound volumes such as books (yī běn shū, one book); these classifiers follow numerals or demonstratives in the noun phrase. 
Variations exist across varieties, such as in Yue (Cantonese), where go3 functions similarly to Mandarin gè but with distinct phonological and syntactic behaviors, and where classifiers can also appear in possessive constructions like ngo5 bun2 syu1 (my book), with bun2 as the classifier for books.[127] Classifiers also play a role in definiteness marking in some contexts, evolving from individuation functions.[126]
Aspect in Sinitic languages is marked through postverbal particles rather than verbal inflection, with shared markers across varieties indicating completion or ongoing states.[128] In Mandarin, le signals perfective aspect for completed actions (e.g., tā chī-le fàn, he ate the meal), while zhe denotes continuous or durative aspect (e.g., tā zuò-zhe, he is sitting).[128] Serialization, or chaining multiple verbs in a single clause without conjunctions, is common for expressing complex events, as in Mandarin qù mǎi shū (go buy book), where verbs share a subject and aspect.[129] This construction is areal, extending to southern varieties and facilitating compact expression of manner, direction, or purpose.[129]
Indirect objects are typically introduced by prepositions, such as Mandarin gěi for benefactive or dative roles (e.g., wǒ gěi tā mǎi shū, I buy a book for him).[130] In some southern varieties, forms like bei appear in dative contexts or passive constructions, reflecting regional divergence.[130] Demonstratives distinguish proximal (zhè in Mandarin, this) from distal (nà, that), often requiring classifiers for specificity (e.g., zhè běn shū, this book). Southern varieties tend to make greater use of postposed locative and relational elements, as in Cantonese, where the locative marker hai2 (in/at) commonly combines with a post-nominal localizer such as dou6, contributing to mixed word-order patterns compared to the preposition-dominant north.[130]
Writing systems and orthographies
The Sinitic languages share the Hanzi (Chinese characters) writing system, a logographic script that represents morphemes rather than sounds, enabling mutual intelligibility in written form despite spoken differences. Comprehensive dictionaries like the Zhonghua Zihai catalog over 85,000 distinct characters, though literacy typically requires mastery of only 3,000 to 5,000 for reading newspapers and modern texts. Approximately 81% of frequently used characters are semantic-phonetic compounds, featuring a radical that conveys semantic information (e.g., indicating the category of meaning) paired with a phonetic component that hints at pronunciation, a structure that originated in ancient oracle bone inscriptions and evolved through millennia. This compound design facilitates the script's adaptability across Sinitic varieties, as the characters maintain consistent visual forms while allowing diverse readings.
Non-Mandarin Sinitic varieties, while relying on the same Hanzi script, often incorporate romanization systems to capture their unique phonologies for pedagogical, digital, or literary purposes. For Yue (Cantonese), Jyutping, a standardized romanization scheme developed by the Linguistic Society of Hong Kong in 1993, uses plain Latin letters for initials and finals and a trailing digit for tone, covering the six tones that distinguish Cantonese from four-tone Mandarin. Similarly, Pe̍h-ōe-jī (POJ), a 19th-century orthography pioneered by Protestant missionaries for Hokkien (Southern Min), employs a modified Latin script with diacritics to transcribe the language's nasalized vowels and tonal contours, and was historically used in religious texts and Taiwanese vernacular literature. These adaptations highlight the script's flexibility but also underscore the need for supplementary tools, as Hanzi alone does not encode dialect-specific sounds.
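As a rough illustration of how a Jyutping syllable encodes sound, the sketch below splits a romanized syllable into initial, final, and tone number. It assumes the usual Jyutping convention of a trailing tone digit from 1 to 6 and the standard inventory of 19 initials; the function names `parse_jyutping` and `is_checked` are hypothetical, and the code does not validate finals against the full Jyutping inventory.

```python
import re

# Jyutping initials, with digraphs first so "ng", "gw", "kw" win over "n", "g", "k".
INITIALS = ["ng", "gw", "kw", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "w", "z", "c", "s", "j"]

# Lowercase letters followed by exactly one tone digit, 1 through 6.
SYLLABLE = re.compile(r"^([a-z]+)([1-6])$")


def parse_jyutping(syllable):
    """Split a Jyutping syllable into (initial, final, tone).

    Illustrative only: the final is whatever remains after the initial,
    so malformed finals are not rejected.
    """
    match = SYLLABLE.match(syllable.lower())
    if match is None:
        raise ValueError(f"not a Jyutping syllable: {syllable!r}")
    body, tone = match.group(1), int(match.group(2))
    initial = next((i for i in INITIALS if body.startswith(i)), "")
    final = body[len(initial):]
    if not final:
        # Syllabic nasals such as "m4" or "ng5" have no initial.
        initial, final = "", body
    return initial, final, tone


def is_checked(syllable):
    """True if the syllable ends in a stop (-p, -t, -k), i.e. an entering-tone syllable."""
    _, final, _ = parse_jyutping(syllable)
    return final[-1] in "ptk"


if __name__ == "__main__":
    for s in ["jan4", "ngo5", "syu1", "sap6", "ng5"]:
        print(s, parse_jyutping(s), "checked" if is_checked(s) else "open")
```

Keeping the tone as a separate integer rather than a diacritic is what makes Jyutping convenient for keyboards and text processing: the syllable stays within plain ASCII.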
A notable variation within the Hanzi system arises from the 1956 introduction of simplified characters by the People's Republic of China, which reduced stroke counts in over 2,000 characters to boost literacy rates, contrasting with the traditional forms retained in Taiwan, Hong Kong, and overseas communities. The same character can also be read very differently across varieties: the character for "person," 人, is read as rén (second tone) in Mandarin but jan4 in Cantonese, illustrating how the shared orthography masks profound phonological divergence. This multiplicity of readings preserves written unity but poses challenges for spoken-to-written transcription.
The logographic nature of Hanzi mitigates some ambiguities from spoken homophones, which are common in tonal Sinitic varieties where a single syllable may correspond to dozens of characters, but digital input methods for dialects amplify these issues. Early pinyin-based systems for Mandarin required manual selection from homophone lists, and analogous tools for other varieties, such as Jyutping keyboards for Cantonese, face similar disambiguation hurdles, often relying on context prediction. Recent advances, such as corpus-based adaptive algorithms, improve accuracy by analyzing surrounding text to resolve homophones in real time, facilitating broader digital expression of non-Mandarin varieties.
Cultural and sociolinguistic aspects
Language policy and endangerment
In the People's Republic of China (PRC), language policies have prioritized Putonghua, the standard form of Mandarin, over other Sinitic varieties since the 1980s through bilingual education initiatives that emphasize Mandarin proficiency in schools while providing limited support for regional languages.[30] The 1982 Constitution explicitly promotes the nationwide use of Putonghua to foster national unity and communication.[32] This approach was reinforced by the Law on the National Common Language and Writing System, in force since 2001, which mandates Putonghua as the primary medium of instruction in educational institutions and requires its use in government, media, and public services to standardize communication across diverse linguistic regions.[131]
These policies contribute to the endangerment of many Sinitic varieties, with smaller ones facing significant decline due to urbanization, internal migration, and the dominance of Putonghua in formal domains.[103] Many documented Sinitic varieties are considered moribund, spoken primarily by older generations with few or no younger speakers; notable examples include certain peripheral Min dialects in northern Fujian Province, where transmission has nearly ceased.[103] The UNESCO Atlas of the World's Languages in Danger classifies several Sinitic varieties as vulnerable, including some Jin and Min forms, highlighting risks from assimilation into dominant Mandarin norms.
Preservation efforts in the diaspora complement domestic challenges, with community schools in the United States and United Kingdom offering classes in heritage Sinitic varieties like Cantonese, Wu, and Hakka to maintain cultural ties among immigrant families.[132][133] In the 2020s, digital archiving projects have advanced documentation, such as the Shanghai Dialect Conversational Speech Corpus for Wu varieties, which provides transcribed audio resources for linguistic analysis, and Taiwan's Hakka Cultural Assets Digital Archives, launched in 2022 to digitize oral traditions, folklore, and historical materials.[134][135]
Socially, these dynamics have driven a generational shift, as younger speakers in China increasingly favor Putonghua for educational and economic mobility, leading to reduced fluency in ancestral varieties among urban youth.[103] This trend exacerbates endangerment, particularly in migrant-heavy areas where internal relocation disrupts local language use.[103]
Influence on other languages
Sinitic languages have profoundly influenced the lexical systems of neighboring East Asian languages through extensive borrowing of vocabulary, particularly via the adoption of Chinese characters and their associated readings. In Japanese, the kanji script incorporates thousands of Sino-Japanese terms derived from Middle Chinese, forming the basis for much of the formal and technical lexicon, including numbers (e.g., ichi for "one" from Chinese yī) and kinship terms (e.g., bo for "mother" from Chinese mǔ). Similarly, Korean hanja borrowings from Chinese contribute significantly to the Sino-Korean vocabulary, which comprises about 60% of the lexicon in historical texts, with examples like numbers (il for "one" from yī) and family relations (mo for "mother" from mǔ). In Vietnamese, chữ Hán loans form a substantial layer of Sino-Vietnamese words, estimated at around 60-70% of the vocabulary in classical literature, including numerals (một alongside Sino-Vietnamese nhất from yī) and kinship designations (e.g., mẫu for "mother" from mǔ). These borrowings, known collectively as Sino-Xenic vocabularies, reflect systematic adaptations of Chinese morphemes across phonological systems while preserving semantic content.[136][137][138]
Beyond lexicon, Sinitic languages have exerted structural influence on syntax in contact situations, notably promoting topic-prominent word order in Japanese. Japanese exhibits topic-comment structures marked by particles like wa, a feature shared with Sinitic languages such as Mandarin, where topics are fronted for discourse focus (e.g., "The book, I read it" paralleling Mandarin shū, wǒ kàn-le). This areal convergence likely arose from prolonged literary and cultural contact during the adoption of kanji, enhancing Japanese's predisposition toward topic prominence over strict subject-predicate alignment.
In Southeast Asia, southern Sinitic varieties have contributed to the development of numeral classifiers in languages like Thai and Burmese through historical migration and trade. Thai classifiers (e.g., lûuk for round objects) parallel those in southern Chinese dialects like Cantonese (e.g., the general classifier go3), suggesting diffusion via contact in the Mekong region; similarly, Burmese employs classifiers (e.g., yauk for humans) that align semantically with Sinitic systems, potentially originating from a single proto-classifier innovation in early Sinitic that spread to Tai-Kadai and Tibeto-Burman languages.[139][140][141]
Sinitic elements also appear in pidgin and creole languages worldwide, blending with European tongues in colonial trade contexts. In Singapore English (Singlish), Hokkien, a southern Min variety of Sinitic, provides substratal influence, contributing particles like lah for emphasis and lexical items such as kiasu ("fear of losing"), which integrate into the creole's grammar and vocabulary. Likewise, Chinook Jargon, a Pacific Northwest pidgin, incorporated Sinitic loanwords from Chinese laborers during 19th-century railroad construction, including terms like chop-chop (from Cantonese jōp-jōp, meaning "quickly") and basic numerals adapted for trade. These pidgins illustrate Sinitic's role in facilitating intercultural communication across diverse substrates.[142][143]
Recent linguistic analyses highlight substantial Sinitic substrate effects in Hmong-Mien languages, with studies estimating that 20% or more of the core vocabulary derives from Chinese loans accumulated over millennia of coexistence in southern China. For instance, Hmong dialects borrow terms for agriculture and kinship (e.g., White Hmong neeg for "person" from Chinese rén), reflecting layers of borrowing from Middle Chinese onward.
This lexical integration underscores the asymmetric influence of dominant Sinitic varieties on minority languages in the region.[144][145]
On a global scale, Mandarin and Cantonese terms have permeated international lexicons, particularly in diplomacy and culture, through modern exchanges. Words like tai chi (from Mandarin tàijíquán, referring to the martial art) and dim sum (from Cantonese dim2 sam1, cognate with Mandarin diǎnxīn, denoting small dishes) entered English via 20th-century migration and trade and are now standard in global cuisine and wellness contexts. In diplomatic spheres, Mandarin phrases such as hépíng juéqǐ ("peaceful rise") appear in international discourse on Chinese foreign policy, symbolizing soft power projection. These adoptions exemplify Sinitic's ongoing expansion beyond Asia.[146][147]
References
- https://en.wiktionary.org/wiki/Sinitic