Sindhi language
from Wikipedia
Sindhi
سِنڌِي
Sindhi written in Perso-Arabic script and Devanagari
Pronunciation: [sɪndʱiː]
Native to: Pakistan and India
Region: Sindh (Pakistan); Kutch, Marwar (India)
Ethnicity: Sindhis
Native speakers: 37 million (2011–2023)[a]
Writing system: Perso-Arabic and Devanagari;[1] Khojki, Khudabadi and Gurmukhi (historically)
Official status
Official language in
Regulated by
Language codes
ISO 639-1: sd
ISO 639-2: snd
ISO 639-3: snd
Glottolog: sind1272
Linguasphere: 59-AAF-f
The proportion of people with Sindhi as their mother tongue in each Pakistani District as of the 2017 Pakistan Census

Sindhi (/ˈsɪndi/ SIN-dee;[3] Sindhi: سِنڌِي (Perso-Arabic)) is an Indo-Aryan language spoken by more than 30 million people in the Pakistani province of Sindh, where it has official status, as well as by 1.7 million people in India, where it is a scheduled language without state-level official status. Sindhi is primarily written in the Perso-Arabic script in Pakistan, while in India, both the Perso-Arabic script and Devanagari are used.

Sindhi is a Northwestern Indo-Aryan language, and thus related to, but not mutually intelligible with, Saraiki and Punjabi. Sindhi has several regional dialects.

The earliest written evidence of modern Sindhi as a language can be found in a translation of the Qur’an into Sindhi dating back to 883 AD.[4] Sindhi was one of the first Indo-Aryan languages to encounter influence from Persian and Arabic following the Umayyad conquest in 712 AD. A substantial body of Sindhi literature developed during the Medieval period, the most famous of which is the religious and mystic poetry of Shah Abdul Latif Bhittai from the 18th century. Modern Sindhi was promoted under British rule beginning in 1843, which led to the current status of the language in independent Pakistan after 1947.

Sindhi is an inflected language, with five cases for nouns, three for personal pronouns, and four for third-person pronouns; eleven case markers; two genders (masculine, feminine); and two numbers (singular, plural). The base of its vocabulary is derived from Sanskrit in the form of Prakrit and Apabhraṃśa, while a significant portion of its high-register speech is derived from Persian and Arabic, along with a number of recent loanwords borrowed from English and, to a lesser extent, from Portuguese and French. It has also had minor influence from and on neighbouring languages such as Saraiki, Punjabi, Balochi, Brahui, Gujarati, and Marwari.[5]

Sindhi has a number of dialects and an established standard form, referred to as Standard Sindhi, which is based on the dialect of Hyderabad and surrounding areas of central Sindh. The primary regulatory agency for the development and promotion of the language is the Sindhi Language Authority, an autonomous institution of the government of Sindh.[6]

History

Cover of a book containing the epic Dodo Chanesar written in Hatvanki Sindhi or Khudabadi script.

Origins

The name "Sindhi" is derived from the Sanskrit síndhu, the original name of the Indus River, along whose delta Sindhi is spoken.[7] In the Bronze Age (c. 3300 – c. 1200 BCE), the primary language of this region was likely the Harappan language, but no records exist indicating when or how that language was replaced by the Indo-Aryan languages.[8]

Like other languages of the Indo-Aryan family, Sindhi is descended from Old Indo-Aryan (Sanskrit) via Middle Indo-Aryan (Pali, secondary Prakrits, and Apabhramsha). Twentieth-century Western scholars such as George Abraham Grierson believed that Sindhi descended specifically from the Vrācaḍa dialect of Apabhramsha (described by Markandeya as being spoken in Sindhu-deśa, corresponding to modern Sindh),[9][10] but later work has shown this to be unclear.[11]

The sound changes that characterise the development of Sindhi from Middle Indo-Aryan are:

  • Development of implosives from geminate and initial stops (e.g. g-, -gg > ɠ); this is a highly distinctive sound change in NIA[12]
  • Shortening of geminates (e.g. MIA akkhi > Sindhi akhi "eye")[13]
  • Voicing of post-nasal consonants (e.g. MIA danta > Sindhi ɗ̣andu "tooth")[13][14]
  • Debuccalization of intervocalic -s- > -h- (shared with Saraiki and some Punjabi varieties)[15]
  • Intervocalic -l- > -r- (likely via intermediate retroflex -ḷ-), -ll- > -l-,[16] -ḍ- > -ṛ-
  • Fronting of r from medial clusters to initial (e.g. OIA dīrgha > Sindhi ḍrigho "long")[12]
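
Two of the changes listed above come with fully spelled-out examples and can be illustrated roughly as string rewrites. The following is a minimal sketch, not a historical-phonology tool: the function names are hypothetical, it works on plain ASCII transliteration only, and it applies each rule in isolation. In particular, the attested Sindhi form ɗ̣andu reflects further changes (the initial implosive and the final vowel) that this sketch does not model.

```python
import re

# Voiceless stops and their voiced counterparts (ASCII transliteration only).
VOICED = {"p": "b", "t": "d", "k": "g"}


def shorten_geminates(word: str) -> str:
    """Collapse a doubled consonant letter into a single one (kk -> k)."""
    return re.sub(r"([^aeiou])\1", r"\1", word)


def voice_after_nasal(word: str) -> str:
    """Voice a voiceless stop that directly follows n or m (nt -> nd)."""
    return re.sub(
        r"([nm])([ptk])",
        lambda m: m.group(1) + VOICED[m.group(2)],
        word,
    )


# MIA akkhi "eye": shortening of the geminate.
print(shorten_geminates("akkhi"))
# MIA danta "tooth": post-nasal voicing only, an intermediate step.
print(voice_after_nasal("danta"))
```

Treating each change as a separate function keeps the rules independent, which makes it easy to see what a single change contributes before any others are layered on.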

Additionally, the following retentions distinguish Sindhi from other New Indo-Aryan languages:

  • Retention of MIA -ṇ-[16]
  • Retention of final short vowels -a, -i, -u,[17] but also insertion of these into loanwords[18]
  • Retention of long vowels before geminates (more archaic than e.g. Prakrit)[13]
  • Retention of stop + r clusters but with retroflexion, e.g. tr- > ṭr-[19][20]
  • Retention of v-[21]

Early Sindhi (–16th century)

Literary attestation of early Sindhi is sparse. The earliest written evidence of Sindhi as a language can be found in a translation of the Qur’an into Sindhi dating back to 883 AD.[4] Historically, Isma'ili religious literature and poetry in India, as old as the 11th century CE, used a language that was closely related to Sindhi and Gujarati; at that time, Sindhi was not clearly established as an independent literary language. Much of this work is in the form of ginans (a kind of devotional hymn).[22][23]

Sindhi was the first Indo-Aryan language to be in close contact with Arabic and Persian following the Umayyad conquest of Sindh in 712 CE. Arabic sources accordingly mention the language of Sindh in several places. The following excerpts are translated from The History of India, as Told by Its Own Historians by Henry Miers Elliot.[24]

The language of Sind is different than that of India. Sind is the country which is nearer the domains of the Moslims, India is farther from them.

— al-Masudi (c. 896–956 CE), The Meadows of Gold

The language of Mansúra, Multán, and those parts is Arabic and Sindian. In Makrán they use Persian and Makranic.

— Ibn Hawqal, Surat Al-Ard (977 CE)

Additionally, the Korean Buddhist monk Hyech'o mentions the unique language of Sindh in his travelogue:

From Takka I walked towards the West for another month and arrived at the country of Sindhukula. The dress, customs, climate, and temperature are similar to north India, although the language is slightly different.

— Hyech'o, Wang och'ŏnch'ukkuk chŏn (c. 723–728 CE)[25]

Medieval Sindhi (16th–19th centuries)

Medieval Sindhi literature is of a primarily religious genre, comprising a syncretic Sufi and Advaita Vedanta poetry, the latter in the devotional bhakti tradition. The format of this poetry is the bayt, indicating significant influence from Arabic and Persian. The earliest known Sindhi poet of the Sufi tradition is Qazi Qadan (1493–1551). Other early poets were Shah Inat Rizvi (c. 1613–1701) and Shah Abdul Karim Bulri (1538–1623). These poets had a mystical bent that profoundly influenced Sindhi poetry for much of this period.[22]

Another famous part of Medieval Sindhi literature is a wealth of folktales, adapted and readapted into verse by many bards at various times and possibly much older than their earliest literary attestations. These include romantic epics such as Sassui Punnhun, Sohni Mahiwal, Momal Rano, Noori Jam Tamachi, Lilan Chanesar, and others.[26]

The greatest poet of Sindhi was Shah Abdul Latif Bhittai (1689/1690–1752), whose verses were compiled into the Shah Jo Risalo by his followers. While primarily Sufi, his verses also recount traditional Sindhi folktales and aspects of the cultural history of Sindh.[22]

The first attested Sindhi translation of the Quran was done by Akhund Azaz Allah Muttalawi (1747–1824) and published in Gujarat in 1870. The first to appear in print was by Muhammad Siddiq in 1867.[27]

British India (1843–1947)

In 1843, the British conquest of Sindh led the region to become part of the Bombay Presidency. Soon after, in 1848, Governor George Clerk established Sindhi as the official language in the province, removing the literary dominance of Persian. Sir Bartle Frere, then commissioner of Sindh, issued orders on August 29, 1857, advising civil servants in Sindh to pass an examination in Sindhi. He also ordered the use of Sindhi in official documents.[28] In 1868, the Bombay Presidency assigned Narayan Jagannath Vaidya to replace the abjad used for Sindhi with the Khudabadi script. The Bombay Presidency decreed Khudabadi a standard script, which provoked strong unrest in the Muslim-majority region, after which twelve martial laws were imposed by the British authorities. The granting of official status to Sindhi, along with script reforms, ushered in the development of modern Sindhi literature.

The first printed works in Sindhi were produced at the Muhammadi Press in Bombay beginning in 1867. These included Islamic stories set in verse by Muhammad Hashim Thattvi, one of the renowned religious scholars of Sindh.[26]

Independent Pakistan and India (1947–)

The Partition of India in 1947 resulted in most Sindhi speakers ending up in the new state of Pakistan, commencing a push to establish a strong sub-national linguistic identity for Sindhi. This manifested in resistance to the imposition of Urdu and eventually Sindhi nationalism in the 1980s.[29]

The language and literary style of contemporary Sindhi writings in Pakistan and India were noticeably diverging by the late 20th century; authors from the former country were borrowing extensively from Urdu, while those from the latter were highly influenced by Hindi.[30]

Geographical distribution

Sindhi is the official language of the Pakistani province of Sindh[31][2] and one of the scheduled languages of India, where it does not have any state-level status.[32] Prior to the inception of Pakistan, Sindhi was the national language of Sindh.[33][34][35][36]

Sindhi is additionally spoken by many members of the Sindhi diaspora, particularly in Malaysia, Oman, Singapore, the UAE, the USA and the UK.

Pakistan

In Pakistan, Sindhi is the first language of 34.40 million people, or 14.6% of the country's population as of the 2023 census. 33.46 million of these are found in Sindh, where they account for 60% of the total population of the province.[37] There are 0.55 million speakers in the province of Balochistan, especially in the Kacchi Plain.

The Pakistan Sindh Assembly has ordered compulsory teaching of the Sindhi language in all private schools in Sindh.[38] According to the Sindh Private Educational Institutions Form B (Regulations and Control) 2005 Rules, "All educational institutions are required to teach children the Sindhi language."[39] Sindh Education and Literacy Minister, Syed Sardar Ali Shah, and Secretary of School Education, Qazi Shahid Pervaiz, have ordered the employment of Sindhi teachers in all private schools in Sindh so that the language can be easily and widely taught.[40] Sindhi is taught in all provincial private schools that follow the Matric system, but not in those that follow the Cambridge system.[41]

On the occasion of 'Mother Language Day' in 2023, the Sindh Assembly, under Culture Minister Sardar Ali Shah, passed a unanimous resolution to extend the use of the language to the primary level[42] and to elevate Sindhi to the status of a national language[43][44][45] of Pakistan.

There are many Sindhi language television channels broadcasting in Pakistan such as Time News, KTN, Sindh TV, Awaz Television Network, Mehran TV, and Dharti TV.

India

The Indian Government has legislated Sindhi as a scheduled language in India, making it an option for education. Despite lacking any state-level status, Sindhi is still a prominent minority language in the Indian states of Gujarat, Rajasthan and Maharashtra.[46] In India, Sindhi mother tongue speakers were distributed in the following states:

Sindhi diaspora

In Malaysia, Indonesia, and Singapore (where Sindhi has no official status), ethnic Sindhis are largely shifting to English as their first language, excepting some monolingual first-generation immigrants and second-generation speakers who use Sindhi at home. Codeswitching of varying degrees is observed in some speakers, usually with English but also with Malay and Indonesian.[48][49][50][51] A similar shift to English is found in the smaller Hong Kong Sindhi community.[52]

Sindhi speakers by country

Dialects

The dialects of Sindhi language shown on map.

Sindhi has many dialects, and forms a dialect continuum at some places with neighboring languages such as Saraiki and Punjabi to the north and Gujarati to the south, but not with Marwari to the east.[5] Some of the documented dialects of Sindhi are:[57][58][59][5][60]

Furthermore, Kutchi and Jadgali are sometimes classified as dialects of Sindhi rather than independent languages.

Sindhi dialects Comparison[72]
English Vicholi Lari Uttaradi Lasi Kutchi[73] Dhatki
I Aao(n) Aao(n) Mā(n) Ã Aau(n) Hu(n)
My Muhnjo Mujo Mānjo/Māhjo Mojo/Mājo Mujo Mānjo/Māhyo
You (singular/plural, formal) Awha(n)/Awhee(n)/Tawha(n)/Tawhee(n) Aa(n)/Aei(n) Taha(n)/Taa(n)/Tahee(n)/Taee(n) Awa(n)/Ai(n) Aa(n)/Ai(n) Awha/Ahee(n)/Aween
To me Mukhe Muke Mānkhe Mukh Muke Mina
We Asee(n) Asee(n), Pān Asā(n) Asee(n) Asee(n), Pān Asee(n), Asā(n)
What Chha/Kahirō Kujjāro/Kujja Chha/Shha Chho Kuro Kee
Why Chho Ko Chho/Shho Chhela Kolāi/Kurelāe Kayla
How Kiya(n) Kei(n) Kiya(n) Kee(n) Kiya(n)
No Na, Kōna, Kōn Nā(n), Kīna Na, Kōna, Kāna, Kon, Kān Nā(n), Ma Nā, Ni, Ko, Kon, Ma
Legs (plural, fem) Tangu(n), Jjanghu(n) Tangu(n), Jjangu(n) Tangā(n), Jjanghā(n)
Foot Pair Pair/Pagg/Pagulo Pair Pair Pag Pagg, Pair
Far Pare Ddoor Pare/Parte Ddor Chhete Ddor
Near Vejhō Vejo/Ōdō/Ōdirō/Ore Vejhō/Vejhe/Orte Ōddō Wat, bājūme Nerro
Good/Excellent Sutho, Chaṅō Khāso/Sutho/Thhāuko Sutho, Bhalo, Chango Khāsho Khāso, Laat Sutho, Phutro, Thhāuko
High Utāho Ucho Mathe Ucho Ucho Uncho
Silver Rupo Chādi/Rupo Chāndi Rupo Rupo
Father Piu Pay/Abo/Aba/Ada Pee/Babo/Pirhe(n) Pe Pe, Bapa, Ada
Wife Joe/Gharwāri Joe/Wani/Kuwār Zaal/Gharwāri Zaal Vahu/Vau Ddosi, Luggai
Man Mardu Māņu/Mārū/Mard/Murs/Musālu Mānhu/Musālo/Bhāi/Kāko/Hamra Mānhu Māḍū/Mārū Mārū
Woman Aurat Zāla/ōrat/ōlath Māi/Ran Zāla Bāeḍi/Bāyaḍī
Child/Baby Bbār/Ningar/Bbālak Bbār/Ningar/Gabhur/Bacho/Kako Bbār/Bacho/Adro/Phar (animal) Gabhar Bār/Gabhar/Chokro
Daughter Dhiu/Niyāni Dia/Niyāni/Kañā Dhee/Adri Dhia Dhi/Dhikri Dikri
Sun Siju Sij, Sūrij Sijhu Siju Sūraj Sūraj
Sunlight Kārro Oosa Tarko
Cat Billi Bili/Pusani Billi Phushini Minni
Rain Barsāt/Mee(n)h/Bārish Varsāt/Mee(n)/Mai(n) Barsāt/Mee(n)hu Varsāt Meh, Maiwla
And Aēi(n) Ãū(n)/Ãē(n)/Nē Aēi(n)/Aū(n)/Aen Ãē/Or Nē/Anē A'e(n)/Ān
Also Pin/Bhi Pin, Bee Bu/Pun Pin/Pan
Is Āhe Āye Aa/Āhe/Hai Āhe/Āye Āye Āhe/Āh/Āye/Hai
Fire Bāhe Bāē/āgg/jjērō Bāhe/Bāh Jjērō Jirō/lagāņō/āg
Water Pāņī Pāņī/Jal Pāņī Pāņī Pāņī/Jal Pāņī
Where Kithē Kithē Kithē, Kāthe, Kehda, Kāday, Kādah, Kidah, Kithrē Kith Kidhē/Kidhā Kith
Sleep Nindr(a) Nind(a) Nindr(a) Nind Ninder Oongh
Slap Thaparr/Chammāt Tārr Chamātu/Chapātu/Lapātu/Thapu Thapaat
To Wash Dhoain(u) Dhun(u) Dhoain(u)/Dhuan(u)/Dhowan(u) Dhowan Dhuwan(u)/Dhoon(u)

Will write (Masc) Likhandum, Likhandus Likhados Likhdum, Likhdus Likhdosī likhdos (m) / likhdis (f) Likhsā(n)
I Went Aao(n) Vius Aao(n) Vēs Ma(n) Vayus (m)/ Vayas (f) Ã viosī Aau vyos (m) / veyis (f) Hu Gios

Phonology

Sindhi has a relatively large inventory of both consonants and vowels compared to other Indo-Aryan languages.[74] Sindhi has 46 consonant phonemes and 10 vowels.[75] The consonant to vowel ratio is around average for the world's languages at 2.8.[76] All plosives, affricates, nasals, the retroflex flap, and the lateral approximant /l/ have aspirated or breathy voiced counterparts. The language also features four implosives.

Consonants

Sindhi consonants[77]
Labial Dental/alveolar Retroflex (Alveolo-)palatal Velar Glottal
Nasal plain m م n ن ɳ ڻ ɲ ڃ ŋ ڱ
breathy مهہ म्ह نهہ न्ह ɳʱ ڻهہ ण्ह
Stop/Affricate plain p پ b ب ت د ʈ ٽ ɖ ڊ چ ج k ڪ ɡ گ
breathy ڦ ڀ t̪ʰ ٿ d̪ʱ ڌ ʈʰ ٺ ɖʱ ڍ tɕʰ ڇ dʑʱ جهہ ک ɡʱ گهہ
Implosive ɓ ٻ ॿ ɗ ڏ ʄ ڄ ɠ ڳ
Fricative f ف फ़ s س z ز ज़ ʂ ش श ष x خ ख़ ɣ غ ग़ h ھ ه
Approximant plain ʋ و l ل j ي
breathy لهہ ल्ह
Rhotic plain r ر ɽ ڙ ड़
breathy ɽʱ ڙهہ ढ़

The retroflex consonants are apical postalveolar and do not involve curling back of the tip of the tongue,[78] so they could be transcribed [t̠, t̠ʰ, d̠, d̠ʱ, n̠ʱ, ɾ̠, ɾ̠ʱ] in phonetic transcription. The affricates /tɕ, tɕʰ, dʑ, dʑʱ/ are laminal post-alveolars with a relatively short release. It is not clear if /ɲ/ is similar, or truly palatal.[79] /ʋ/ is realized as labiovelar [w] or labiodental [ʋ] in free variation, but is not common, except before a stop.

The vowel phonemes of Sindhi on a vowel chart

Vowels

Front Central Back
Close i u
Near-close ɪ ʊ
Close-mid e o
Mid ə
Open-mid æ ɔ
Open ɑ

The vowels are modal-length /i e æ ɑ ɔ o u/ and short /ɪ ʊ ə/. Consonants following short vowels are lengthened: /pət̪o/ [pət̪ˑoː] 'leaf' vs. /pɑt̪o/ [pɑːt̪oː] 'worn'.

Grammar

Nouns

Sindhi nouns distinguish two genders (masculine and feminine), two numbers (singular and plural), and five cases (nominative, vocative, oblique, ablative, and locative). This is a similar paradigm to Punjabi. Almost all Sindhi noun stems end in a vowel, except for some recent loanwords. The declension of a noun in Sindhi is largely determined by its grammatical gender and its final vowel (or the absence of one). Generally, -o stems are masculine and -a stems are feminine, but the other final vowels can belong to either gender.

The different paradigms are listed below with examples.[80] The ablative and locative cases are used with only some lexemes in the singular number and hence not listed, but predictably take the suffixes -ā̃ / -aū̃ / -ū̃ (ABL) and -i (LOC).

SG PL Gloss
NOM VOC OBL NOM VOC OBL
M I ڇوڪِرو
छोकिरो
chokiro
ڇوڪِرا
छोकिरा
chokirā
ڇوڪِري
छोकिरे
chokire
ڇوڪِرا
छोकिरा
chokirā
ڇوڪِرا / ڇوڪِرَ
छोकिरा / छोकिर
chokirā / chokira
ڇوڪِرَنِ
छोकिरनि
chokirani
boy
II ٻارُ
ॿारु
ɓāru
ٻارَ
ॿार
ɓāra
ٻارو / ٻارَ
ॿार /ॿारो
ɓāra / ɓāro
ٻارَنِ
ॿारनि
ɓārani
child
III ساٿِي
साथी
sāthī
ساٿِيءَ
साथीअ
sāthīa
ساٿِي
साथी
sāthī
ساٿيئَرو
साथीअरो
sāthīaro
ساٿيَنِ
साथियनि
sāthyani
companion
رَھاڪُو
रहाकू
rahākū
رَھاڪُوءَ
रहाकूअ
rahākūa
رَھاڪُو
रहाकू
rahākū
رَھاڪُئو
रहाकूओ
rahākuo
رَھاڪُنِ
रहाकुनि
rahākuni
inhabitant
IV راجا
राजा
rājā
راجا / راجائتو
राजा / राजाइतो
rājā / rājāito
راجائُنِ
राजाउनि
rājāuni
king
سيٺُ
सेठु
seṭhu
سيٺَ
सेठ
seṭha
سيٺَنِ
सेठनि
seṭhani
merchant
F I زالَ
ज़ाल
zāla
زالُون
ज़ालूं
zālū̃
زالُنِ
ज़ालुनि
zāluni
woman, wife
سَسُ
ससु
sasu
سَسُون
ससूं
sasū̃
سَسُنِ
ससुनि
sasuni
mother-in-law
II دَوا
दवा
davā
دَوائُون
दवाऊं
davāū̃
دَوائُنِ
दवाउनि
davāuni
medicine
راتِ
राति
rāti
راتيُون
रातियूं
rātyū̃
راتيُنِ
रातियुनि
rātyuni
night
هوٽَل
होटल
hoṭal
هوٽَلُون
होटलूं
hoṭalū̃
هوٽَلُنِ
होटलुनि
hoṭaluni
hotel
III ڳَئُون
ॻऊं
ɠaū̃
ڳَئُونَ
ॻऊंअ
ɠaū̃a
ڳَئُون
ॻऊं
ɠaū̃
ڳَئُونِ
ॻऊनि
ɠaūni
cow
IV نَدِي
नदी
nadī
نَدِيءَ
नदीअ
nadīa
نَديُون
नदियूं
nadyū̃
نَديُنِ
नदियुनि
nadyuni
river
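
The first masculine paradigm in the table above (class I, "boy") is the one row in which all six cells are distinct in the flattened layout, so it can be restated as a stem plus a set of endings. The sketch below is only an illustration under that reading: the function name and data layout are hypothetical, the forms are in transliteration, and applying the endings to other -o nouns is an assumption rather than something the table demonstrates.

```python
# Endings for masculine class I, read off the "boy" row of the table:
# chokiro, chokirā, chokire (singular); chokirā, chokirā / chokira, chokirani (plural).
M1_ENDINGS = {
    ("SG", "NOM"): ["o"],
    ("SG", "VOC"): ["ā"],
    ("SG", "OBL"): ["e"],
    ("PL", "NOM"): ["ā"],
    ("PL", "VOC"): ["ā", "a"],
    ("PL", "OBL"): ["ani"],
}


def decline_m1(nominative_singular: str) -> dict:
    """Return every number/case form of a masculine class I noun in -o."""
    if not nominative_singular.endswith("o"):
        raise ValueError("masculine class I nouns end in -o")
    stem = nominative_singular[:-1]
    return {
        slot: [stem + ending for ending in endings]
        for slot, endings in M1_ENDINGS.items()
    }


for (number, case), forms in decline_m1("chokiro").items():
    print(number, case, " / ".join(forms))
```

Keeping the endings in a lookup keyed by number and case mirrors the layout of the table header (SG/PL over NOM, VOC, OBL), so other paradigms could be added as further lookups without changing the function.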

A few nouns representing familial relations take irregular declensions with an extension in -r- in the plural. These are the masculine nouns ڀاءُ भाउ bhāu "brother", پِيءُ पिउ pīu "father", and the feminine nouns ڌِيءَ धीअ dhīa "daughter", نُونھَن नूंहं nū̃hã "daughter-in-law", ڀيڻَ भेण bheṇa "sister", ماءُ माउ māu "mother", and جوءِ जोइ joi "wife".[80]

SG PL Gloss
NOM VOC OBL NOM VOC OBL
M ڀاءُ
भाउ
bhāu
ڀائُرُ / ڀائُرَ
भाउरु / भाउर
bhāuru / bhāura
ڀائُرَ / ڀائُرو
भाउर / भाउरो
bhāura / bhāuro
ڀائُرَنِ / ڀائُنِ
भाउरनि / भाउनि
bhāurani / bhāuni
brother
F ڌِيءَ / ڌِيءُ
धीअ / धीउ
dhīa / dhīu
ڌِيئَرُ / ڌِيئَرُون / ڌِيئُون
धीअरु / धीअरूं / धीऊं
dhīaru / dhīarū̃ / dhīū̃
ڌِيئَرُنِ / ڌِيئُنِ
धीअरुनि / धीउनि
dhīaruni / dhīuni
daughter

Pronouns

Personal pronouns

Personal pronouns
SG PL
1 2 1 2
NOM مَان‎ / آئُون
मां / आऊं
mā̃ / āū̃
تُون
तूं
tū̃
اَسِين
असीं
asī̃
تَوِھِين
तव्हीं
tavhī̃
OBL مُون
मूं
mū̃
تو
तो
to
اَسَان
असां
asā̃
تَوِھَان
तव्हां
tavhā̃
GEN مُنھِنجو
मुंहिंजो
mũhinjo
تُنھِنجو
तुंहिंजो
tũhinjo

Like other Indo-Aryan languages, Sindhi has first and second-person personal pronouns as well as several types of third-person proximal and distal demonstratives. These decline in the nominative and oblique cases. The genitive is a special form for the first and second-person singular, but is formed as usual with the oblique and the case marker جو जो jo for the rest. The personal pronouns are listed in the table above.[81][82]

The third-person pronouns are listed below. Besides the unmarked demonstratives, there are also "specific" and "present" demonstratives. In the nominative singular, the demonstratives are marked for gender. Some other pronouns which decline identically to ڪو को ko "someone" are ھَرڪو हरको har-ko "everyone", سَڀڪو सभको sabh-ko "all of them", جيڪو जेको je-ko "whoever" (relative), and تيڪو तेको te-ko "that one" (correlative).[81]

Third-person pronouns
Demonstrative Interrogative Relative Correlative
Unmarked Specific Present Indefinite
PROX DIST PROX DIST PROX DIST
SG NOM M ھِي
ही
ھُو
हू
اِھو
इहो
iho
اُھو
उहो
uho
اِجهو
इझो
ijho
اوجهو
ओझो
ojho
ڪو
को
ko
ڪيرُ
केरु
keru
جو
जो
jo
سو
सो
so
F ھِيءَ
हीअ
hīa
ھُوءَ
हूअ
hūa
اِھَا
इहा
ihā
اُھَا
उहा
uhā
اِجَها
इझा
ijhā
اوجَها
ओझा
ojhā
ڪَا
का
ڪيرَ
केर
kera
جَا
जा
سَا
सा
OBL ھِنَ
हिन
hina
ھُنَ
हुन
huna
اِنهين
इन्हें
inhẽ
اُنهين
उन्हें
unhẽ
ڪَنْھِن
कंहिं
kãhĩ
جَنْھِن
जंहिं
jãhĩ
تَنْھِن
तंहिं
tãhĩ
PL NOM ھِي
ही
ھُو
हू
اِھي
इहे
ihe
اُھي
उहे
uhe
اِجهي
इझे
ijhe
اوجهي
ओझे
ojhe
ڪي
के
ke
ڪيرَ
केर
kera
جي
जे
je
سي
से
se
OBL ھِنَنِ
हिननि
hinani
ھُنَنِ
हुननि
hunani
اِنَهنِ
इन्हनि
inhani
اُنَهنِ
उन्हनि
unhani
ڪِنِ
किनि
kini
جِنِ
जिनि
jini
تنِ
तिनि
tini

Numerals

Num. Cardinal
0 ٻُڙِي ॿुड़ी ɓuṛi
1 هِڪُ हिकु hiku
2 ٻَہ ॿ ɓa
3 ٽِي टी ṭī
4 چَارِ चारि cāri
5 پَنج पञ्ज pañja
6 ڇَھَہ छह chaha
7 سَتَ सत sata
8 اَٺَ अठ aṭha
9 نَوَ नव nava
10 ڏَھَہ ॾह ɗaha
11 يَارَنھَن यारंहं yārãhã
12 ٻَارَھَن ॿारहं ɓārahã
13 تيرَھَن तेरहं terahã
14 چوڏَھَن चोॾहं coɗahã
15 پَندرَھَن पन्द्रहं pandrahã
16 سورَھَن सोरहं sorahã
17 سَترَھَن सत्रहं satrahã
18 اَرِڙَھَن / اَٺَارَھَن अरिड़हं/ अठारहं ariṛahã / aṭhārahã
19 اُڻوِيھَہ उणवीह uṇvīha

Postpositions

Most nominal relations (e.g. the semantic role of a nominal as an argument to a verb) are indicated using postpositions, which follow a noun in the oblique case. The subject of the verb takes the bare oblique case, while the object may be in nominative case or in oblique case and followed by the accusative case marker کي खे khe.[83]

The postpositions are divided into case markers, which directly follow the noun, and complex postpositions, which combine with a case marker (usually the genitive جو जो jo).

Case markers

The case markers are listed below.[83]: 399 

The postpositions with the suffix -o decline in gender and number to agree with their governor, e.g. ڇوڪِري جو پِيءُ छोकिरे जो पीउ chokire j-o pīu "the boy's father" but ڇوڪِري جِي مَاءُ छोकिरे जी माउ chokire j-ī māu "the boy's mother".
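
The agreement pattern in the preceding paragraph can be sketched as a lookup keyed by the gender and number of the possessed noun. This is a rough illustration only: the function name is hypothetical, the possessor is given in the oblique form shown in the case-marker table (chokire), and only the two forms attested in the text (masculine singular j-o, feminine singular j-ī) are included, so plural agreement is deliberately left out rather than guessed.

```python
# Genitive marker forms attested in the text, keyed by the gender and
# number of the possessed noun (the "governor" of the postposition).
GENITIVE_FORMS = {
    ("M", "SG"): "jo",
    ("F", "SG"): "jī",
}


def genitive_phrase(possessor_oblique: str, possessed: str,
                    gender: str, number: str = "SG") -> str:
    """Join an oblique possessor and a possessed noun with an agreeing marker."""
    try:
        marker = GENITIVE_FORMS[(gender, number)]
    except KeyError:
        raise ValueError("form not attested in the text") from None
    return f"{possessor_oblique} {marker} {possessed}"


print(genitive_phrase("chokire", "pīu", "M"))  # the boy's father
print(genitive_phrase("chokire", "māu", "F"))  # the boy's mother
```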

Case markers
Case Marker Example English
Nominative ڇوڪِرو
छोकिरो
chokiro
the boy
Accusative
Dative
کي
खे
khe
ڇوڪِري کي
छोकिरे खे
chokire khe
the boy
to the boy
Genitive جو
जो
j-o
ڇوڪِري جو
छोकिरे जो
chokire jo
of the boy
سَندو
सन्दो
sand-o
ڇوڪِري سَندو
छोकिरे सन्दो
chokire sando
Sociative سُڌو
सुधो
sudh-o
ڇوڪِري سُڌو
छोकिरे सुधो
chokire sudho
along with the boy
Comitative
Instrumental
سَان
सां
sā̃
ڇوڪِري سَان
छोकिरे सां
chokire sā̃
with the boy
سَاڻُ
साणु
sāṇu
ڇوڪِري سَاڻُ
छोकिरे साणु
chokire sāṇu
Locative ۾
में
mẽ
ڇوڪِري ۾
छोकिरे में
chokire mẽ
in the boy
مَنجِهہ
मंझि
manjhi
ڇوڪِري مَنجِهہ
छोकिरे मंझि
chokire manjhi
Adessive تي
ते
te
ڇوڪِري تي
छोकिरे ते
chokire te
on the boy
وَٽِ
वटि
vaṭi
ڇوڪِري وَٽِ
छोकिरे वटि
chokire vaṭi
near the boy
the boy has...
Orientative ڏَانھَن
ॾांहं
ḍā̃hã
ڇوڪِري ڏَانھَن
छोकिरे ॾांहं
chokire ḍā̃hã
towards the boy
Terminative تَائيِن
ताईं
tāī̃
ڇوڪِري تَائيِن
छोकिरे ताईं
chokire tāī̃
up to the boy
Benefactive لاءِ
लाइ
lāi
ڇوڪِري لاءِ
छोकिरे लाइ
chokire lāi
for the boy
Semblative وَانگُرُ
वांगुरु
vānguru
ڇوڪِري وَانگُرُ
छोकिरे वांगुरु
chokire vānguru
like the boy
جَھْڙو
जहड़ो
jahṛ-o
ڇوڪِري جَھْڙو
छोकिरे जहड़ो
chokire jahṛo

There are several ablative case markers formed from the spatial postpositions and the ablative ending -ā̃. These indicate complex motion such as "from inside of".[83]: 400 

Ablative case markers
Marker Example English
کَان
खां
khā̃
ڇوڪِري کَان
छोकिरे खां
chokire khā̃
from the boy
مَان
मां
mā̃
ڇوڪِري مَان
छोकिरे मां
chokire mā̃
from inside the boy
تَان
तां
tā̃
ڇوڪِري تَان
छोकिरे तां
chokire tā̃
from upon the boy
ڏَانھَان
ॾांहां
ḍā̃hā̃
ڇوڪِري ڏَانھَان
छोकिरे ॾांहां
chokire ḍā̃hā̃
from the direction of the boy

Finally, some case markers are found in medieval Sindhi literature and/or modern poetic Sindhi, and otherwise not used in standard speech.

Obsolete/rare case markers
Case Marker Example English
Accusative
Adessive
ڪَني
कने
kane
ڇوڪِري ڪَني
छोकिरे कने
chokire kane
to/near the boy

Complex postpositions

The complex postpositions are formed with a case marker, usually the genitive but sometimes the ablative. Many are listed below.[83]: 405 

سِنڌِي सिन्धी Transliteration Explanation
جي اَڳيَان जे अॻ्यां je aɠyā̃ "ahead of, before"; apudessive
جي اَندَرِ जे अन्दरि je andari "inside of"; inessive
جي بَدِرَان जे बदिरां je badirā̃ "instead of, in place of"
جي بَرَابَر जे बराबर je barābar "equal to"
جي ٻَاھَرَان जे ॿाहरां je ɓāharā̃ "outside of"
کَان ٻَاھَرِ खां ॿाहरि khā̃ ɓāhari "outside of"
جي باري ۾ जे बारे में je bāre mẽ "about, concerning"
جي چَوڌَارِي जे चौधारी je caudhārī "around"
جي ھيٺَان जे हेठां je heṭhā̃ "below, under"
جي ڪَري जे करे je kare "for, on account of"
جي لَاءِ जे लाइ je lāi "for"
جي مَٿَان जे मथां je mathā̃ "above, on top of, upon"
کَان پَري खां परे khā̃ pare "far from"
جي پَارِ जे पारि je pāri "across, on the other side of"
جي پَاسي जे पासे je pāse "on the side of, near"
کَان پوءِ खां पोइ khā̃ poi "after"
جي پُٺيَان जे पुठियां je puṭhyā̃ "behind"
جي سَامهون जे साम्हों je sāmhõ "in front of, facing"
کَان سِوَاءِ खां सिवाइ khā̃ sivāi "besides, apart from"
جي وَاسطي जे वास्ते je vāste "for the sake of, on account of"
جي ويجهو जे वेझो je vejho "near"; adessive
جي وِچِ ۾ जे विचि में je vici mẽ "between, among"
جي خَاطِرِ जे ख़ातिरि je xātiri "for the sake of"
جي خِلَافِ जे ख़िलाफ़ि je xilāfi "against"
جي ذَرِيعي जे ज़रिये je zarī'e "via, through"; perlative

Vocabulary

According to historian Nabi Bux Baloch, most Sindhi vocabulary is from ancient Sanskrit. However, owing to the influence of the Persian language over the subcontinent, Sindhi has adopted many words from Persian and Arabic. It has also borrowed from English and Hindustani. Today, Sindhi in Pakistan is slightly influenced by Urdu[citation needed], with more borrowed Perso-Arabic elements, while Sindhi in India is influenced by Hindi[citation needed], with more borrowed tatsam Sanskrit elements.[84]

Writing systems

Sindhis in Pakistan use a version of the Perso-Arabic script with new letters adapted to Sindhi phonology, while in India a greater variety of scripts are in use, including Devanagari, Khudabadi, Khojki, and Gurmukhi.[85] Perso-Arabic for Sindhi was also made digitally accessible relatively early.[86]

The earliest attested records in Sindhi are from the 15th century.[30] Before the standardisation of Sindhi orthography, numerous forms of Devanagari and Laṇḍā scripts were used for trading. For literary and religious purposes, a Perso-Arabic script developed by Abul-Hasan as-Sindi and Gurmukhi (a subset of Laṇḍā) were used. Another two scripts, Khudabadi and Shikarpuri, were reforms of the Laṇḍā script.[87][88] During British rule in the late 19th century, the Perso-Arabic script was decreed standard over Devanagari.[89]

Perso-Arabic script

During the British Raj, a variant of the Persian alphabet was adopted for Sindhi in the 19th century. The script is used in Pakistan and India today. It has a total of 52 letters, augmenting the Persian with digraphs and eighteen new letters (ڄ ٺ ٽ ٿ ڀ ٻ ڙ ڍ ڊ ڏ ڌ ڇ ڃ ڦ ڻ ڱ ڳ ڪ) for sounds particular to Sindhi and other Indo-Aryan languages. Some letters that are distinguished in Arabic or Persian are homophones in Sindhi.

The table below presents the Sindhi Perso-Arabic alphabet. Letters shaded in yellow are used solely in writing loanwords, and the phonemes they represent are also represented by other letters in the alphabet. Letters and digraphs shaded in green are not usually considered part of the base alphabet. They are either commonly used digraphs representing aspirated consonants, or ligatures serving a grammatical function. These ligatures include ۽‎, which is pronounced [ãĩ̯] and means "and", and ۾‎, which is pronounced [mẽ] and creates a locative relationship between words.

Sindhi alphabet
Perso-Arabic
[IPA]
ا
[]/[ʔ] /[]
ب
[b]
ٻ
[ɓ]
ڀ
[]
ت
[t]
ٿ
[]
ٽ
[ʈ]
ٺ
[ʈʰ]
ث
[s]
پ
[p]
ج
[d͡ʑ]
ڄ
[ʄ]
جهہ
[d͡ʑʰ]
ڃ
[ɲ]
چ
[t͡ɕ]
ڇ
[t͡ɕʰ]
ح
[h]
خ
[x]
د
[d]
ڌ
[]
ڏ
[ɗ]
ڊ
[ɖ]
ڍ
[ɖʱ]
ذ
[z]
ر
[r]
ڙ
[ɽ]
ڙهہ
[ɽʰ]
ز
[z]
ژ
[ʒ]
س
[s]
ش
[ʂ]
ص
[s]
ض
[z]
ط
[t]
ظ
[z]
ع
[ɑː] /[] /[] /[ʔ] /[]
غ
[ɣ]
ف
[f]
ڦ
[]
ق
[q]
ڪ
[k]
ک
[]
گ
[ɡ]
ڳ
[ɠ]
گهہ
[ɡʱ]
ڱ
[ŋ]
ل
[l]
لهہ
[]
مـ
[m]
مهہ
[]
ن
[n] /[◌̃]
نهہ
[]
ڻ
[ɳ]
ڻهہ
[ɳʰ]
و
[ʋ] /[ʊ] /[] /[ɔː] /[]
ھ
[h]
هـ ه
[h]
ـہ ہ
[ə]/[əʰ]/[∅]
ء
[ʔ] /[]
ي
[j] /[]
۽
[ãĩ̯]
۾
[mẽ]

The orthography of the letter hāʾ in Sindhi, especially in typing as opposed to handwriting, has been a source of confusion. Arabic and Persian have a single letter for hāʾ, whereas in Urdu the letter has diverged into two distinct variants: gol he ("round he") and do-cašmi he ("two-eyed he"). The former is written round and zigzagged as "ہـ ـہـ ـہ ہ", and can impart the "h" (/ɦ/) sound anywhere in a word, or the long "a" or the "e" vowels (/ɑː/ or /eː/) at the end of a word. The latter is written in Arabic Naskh style (as a loop) (ھ), and is used in digraphs to create the aspirate consonants.

For most aspirated consonants, Sindhi relies on unique letters as opposed to the Urdu practice of digraphs. However, this does not apply to all aspirated consonants; some are still written as digraphs. The letter hāʾ is also used in Sindhi to represent the sound [h] in native Sindhi words and in Arabic and Persian loanwords, and to represent vowels (/ə/ or /əʰ/) at the end of a word. The notations and conventions in Sindhi differ from those of Persian, Arabic, and Urdu. Because Unicode provides several hāʾ characters designed for these different languages, a correct and consistent convention must be followed for the letters to display correctly when typing. The following table presents these in detail.[90][91]

Unicode Letter or Digraphs IPA Note Examples
Final Medial Initial Isolated
U+06BE ـھ ـھـ ھـ ھ [h] دوھَھُو⹁ مھينن⹁ ويھُ
U+0647 ـه [h] Used for borrowed words وحدهُ لا⹁ والله
U+062C + U+0647 ـجهہ ـجهـ جهـ جهہ [d͡ʑʰ] In isolated and final positions, an extra hāʾ ـہ‎ (U+06C1) is added ٻاجَهہ⹁ اُجِهي⹁ منجهان⹁ ڪُجهہ
U+06AF + U+0647 ـگهہ ـگهـ گهـ گهہ [ɡʱ] گهہگهوٽُ⹁ گهڻگُهرون⹁ سگهہ
U+0647 ـهہ ـهـ - [◌ʰ] Forming part of digraph for representation of other aspirated consonants ([ɽʰ], [lʱ], [mʰ], [nʰ], [ɳʰ]). In isolated and final positions, an extra hāʾ ـہ‎ (U+06C1) is added ٻنهي⹁ ٿالهہ
U+06C1 ـہ - ہ [ə] / [əʰ] / [∅] نہ
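The typing convention in the table can be illustrated with a short Python sketch. The code points are taken from the table; the role labels are informal paraphrases, and the function names `ha_code_points` and `close_final_digraph` are hypothetical helpers introduced here for illustration only. The second helper covers only the two digraph bases the table lists by code point (U+062C and U+06AF), and it does not skip vowel diacritics that may sit between the base letter and U+0647.

```python
# Code points come from the table above; the role labels are informal paraphrases.
HA_ROLES = {
    "\u06BE": "[h], used in all positions",
    "\u0647": "[h] in borrowed words, or second element of an aspirate digraph",
    "\u06C1": "final vowel, or the extra hāʾ that closes a digraph",
}

# Bases the table lists explicitly as forming digraphs with U+0647.
DIGRAPH_BASES = {"\u062C", "\u06AF"}


def ha_code_points(text):
    """Return (code point, role) pairs for every hāʾ character found in text."""
    return [(f"U+{ord(ch):04X}", HA_ROLES[ch]) for ch in text if ch in HA_ROLES]


def close_final_digraph(word):
    """Hypothetical helper: append U+06C1 if word ends in base + U+0647.

    Limitation: vowel diacritics between the base and U+0647 are not skipped.
    """
    if len(word) >= 2 and word[-1] == "\u0647" and word[-2] in DIGRAPH_BASES:
        return word + "\u06C1"
    return word


if __name__ == "__main__":
    # The word from the last table row: U+0646 followed by U+06C1.
    print(ha_code_points("\u0646\u06C1"))
    # A word ending in U+062C U+0647 gains the extra final hāʾ.
    print(close_final_digraph("\u06AA\u064F\u062C\u0647") == "\u06AA\u064F\u062C\u0647\u06C1")
```

Writing the strings as escapes rather than literal letters keeps the code points visible, which matters here because several of these characters look alike in many fonts.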

The punctuation of the Sindhi Perso-Arabic script differs slightly from that of Urdu, Persian, and Arabic. Instead of the typical inverted comma (،‎ [U+060C]) common to these alphabets, a reversed comma (⹁ [U+2E41]) is used, although many documents incorrectly use Urdu punctuation.[92]
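As a minimal sketch of the comma convention described above, the following Python snippet replaces the Arabic comma (U+060C) with the reversed comma (U+2E41). The function names are hypothetical, and the approach assumes that every Arabic comma in the input is meant as Sindhi punctuation, which may not hold for mixed-language text.

```python
ARABIC_COMMA = "\u060C"    # comma used in Urdu, Persian, and Arabic
REVERSED_COMMA = "\u2E41"  # comma used in Sindhi


def count_arabic_commas(text):
    """Count Arabic commas, e.g. to flag documents that use Urdu punctuation."""
    return text.count(ARABIC_COMMA)


def to_sindhi_commas(text):
    """Replace every Arabic comma with the reversed comma."""
    return text.replace(ARABIC_COMMA, REVERSED_COMMA)


if __name__ == "__main__":
    sample = "a\u060C b\u060C c"
    print(count_arabic_commas(sample))
    print(to_sindhi_commas(sample) == "a\u2E41 b\u2E41 c")
```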

Comparison of Punctuations
Full Stop Comma ‌ Semicolon
Sindhi .
Urdu ۔ ، ؛
Persian/Arabic .
Farsi (perso-Arabic) or Shikarpuri Sindhi.

Devanagari script

In India, the Devanagari script is also used to write Sindhi.[93] A modern version was introduced by the government of India in 1948; however, it did not gain full acceptance, so both the Sindhi-Arabic and Devanagari scripts are used. In India, a person may write a Sindhi language paper for a Civil Services Examination in either script.[94] Devanagari was seen as the most practical option for Sindhi language in India.[1] Diacritical bars below the letter are used to mark implosive consonants, and dots called nukta are used to form other additional consonants.

ə a ɪ i ʊ e ɛ o ɔ
ख़ ग़
k x ɡ ɠ ɣ ɡʱ ŋ
ज़
t͡ɕ t͡ɕʰ d͡ʑʰ ʄ z d͡ʑ ɲ
ड़ ढ़
ʈ ʈʰ ɖ ɗ ɽ ɖʱ ɽʱ ɳ
t d n
फ़ ॿ
p f b ɓ m
j r l ʋ
ʂ ʂ s h
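The nukta convention can be illustrated with Python's standard `unicodedata` module. This is a sketch rather than a description of any particular Sindhi input method: the helper name `with_nukta` is hypothetical. It relies on two facts about Unicode itself, namely that the nukta is the combining character U+093C, and that the implosive letters have their own code points (for example U+097F) rather than being built from a base letter plus a bar.

```python
import unicodedata

NUKTA = "\u093C"  # DEVANAGARI SIGN NUKTA, a combining dot below


def with_nukta(base):
    """Hypothetical helper: attach a nukta to a base letter and normalize to NFC."""
    return unicodedata.normalize("NFC", base + NUKTA)


if __name__ == "__main__":
    kha_nukta = with_nukta("\u0916")  # KHA + nukta
    # The precomposed letter U+0959 normalizes to the same two-code-point sequence.
    print(unicodedata.normalize("NFC", "\u0959") == kha_nukta)
    # The implosive letter is a single code point with its own character name.
    print(unicodedata.name("\u097F"))
```

Normalizing before comparison matters because the same visible letter can arrive either precomposed or as a base letter followed by the nukta.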

Laṇḍā scripts

Laṇḍā-based scripts, such as Gurmukhi, Khojki, and the Khudabadi script, were used historically to write Sindhi.

Khudabadi

Khudabadi or Sindhi
ISO 15924: Sind (318), Khudawadi, Sindhi
Unicode alias: Khudawadi
Unicode range: U+112B0–U+112FF

The Khudabadi alphabet was invented in 1550 CE and was used alongside other scripts by the Hindu community until the colonial era, when the exclusive use of the Arabic script for official purposes was legislated.

The script continued to be used on a smaller scale by the trader community until the Partition of India in 1947.[95]

ə a ɪ i ʊ e ɛ o ɔ
k ɡ ɠ ɡʱ ŋ
c ɟ ʄ ɟʱ ɲ
ʈ ʈʰ ɖ ɗ ɽ ɳ
t d n
p f b ɓ m
j r l ʋ
ʂ s h

Khojki

Khojki was employed primarily to record Muslim Shia Ismaili religious literature, as well as literature for a few secret Shia Muslim sects.[93][96]

Gurmukhi

The Gurmukhi script was also used to write Sindhi, mainly in India by Hindus.[95][93]

Roman Sindhi

The Sindhi-Roman script, or Roman-Sindhi script, is the contemporary Sindhi script commonly used by Sindhis when texting on mobile phones.[97][98]

Advocacy

In 1972, the provincial assembly of Sindh passed a bill giving Sindhi official status, making it the first provincial language in Pakistan to have its own official status.

  • The Language Bill made Sindhi the official language of Sindh.
  • The bill requires all educational institutions in Sindh to teach Sindhi.

Software

By 2001, Abdul-Majid Bhurgri[failed verification] had coordinated with Microsoft to develop Unicode-based software for the Perso-Arabic Sindhi script, which afterwards became the basis for computer-based communication by Sindhi speakers around the world.[99] In 2016, Google introduced the first automated translator for the Sindhi language.[100][101] In 2023, Google Translate introduced offline support,[102][103] and Microsoft Translator strengthened its support in May of the same year.[104][105]

In June 2014, the Khudabadi script of the Sindhi language was added to Unicode. However, the script still lacks proper rendering support on many devices.
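Whether a given text contains Khudabadi characters can be checked against the Unicode range given in the Khudabadi section (U+112B0–U+112FF). The following is a minimal sketch with hypothetical function names. It only tests code point membership in that range and says nothing about whether a device has a font able to display the characters.

```python
# Range for the Khudawadi block as listed in the Khudabadi section.
KHUDAWADI_FIRST = 0x112B0
KHUDAWADI_LAST = 0x112FF


def is_khudawadi(ch):
    """Return True if the single character ch falls inside the Khudawadi range."""
    return KHUDAWADI_FIRST <= ord(ch) <= KHUDAWADI_LAST


def khudawadi_chars(text):
    """Return the characters of text that fall inside the Khudawadi range."""
    return [ch for ch in text if is_khudawadi(ch)]


if __name__ == "__main__":
    sample = "Sindhi " + chr(0x112B0) + chr(0x112B1)
    print(len(khudawadi_chars(sample)))
```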

from Grokipedia
Sindhi is an Indo-Aryan language belonging to the Indo-European family, primarily spoken in the Pakistani province of Sindh, where it has official status and is the mother tongue of approximately 30 million people, representing about 14–15% of Pakistan's population according to census data. It is also used by around 2–3 million speakers in India, particularly among communities displaced during the 1947 partition, as well as by diaspora populations in other regions. In Pakistan, Sindhi is written in a modified Perso-Arabic script that incorporates 52 letters to accommodate its phonology, including implosive consonants and a range of vowels distinct from those of neighboring languages. The language exhibits significant dialectal variation, with major forms such as Vicholi (central), Lari (eastern), and Lasi (southern), reflecting geographic and historical influences from Persian, Arabic, and Dravidian substrates.

Sindhi possesses a venerable literary heritage, with the earliest extant works tracing to the 11th century in Sufi poetry composed by Ismaili missionaries and later enriched by poets like Shah Abdul Latif Bhittai in the 18th century, whose Shah Jo Risalo remains a cornerstone of the canon. This tradition emphasizes mystical themes and folk elements, evolving under Muslim rule and the British colonial introduction of print media, which spurred modern prose and journalism. Despite its antiquity (it is potentially rooted in Vedic-era Prakrits around 1500 BCE), Sindhi's development was shaped by successive invasions and cultural exchanges in the Indus Valley, yielding a lexicon blending an Indo-Aryan core with substantial Perso-Arabic borrowings. In India, Hindu Sindhi speakers historically employed scripts like Khudabadi or Devanagari, highlighting ongoing debates over standardization amid partition-era migrations.

Linguistic classification

Family affiliation and subgrouping

Sindhi belongs to the Indo-Aryan branch of the Indo-Iranian languages, which form part of the broader Indo-European language family. This affiliation traces its origins to Old Indo-Aryan (exemplified by Vedic Sanskrit), evolving through Middle Indo-Aryan stages such as Prakrit and Apabhramśa, before emerging as a New Indo-Aryan language around the 10th century CE. The language's core vocabulary, inflectional morphology, and phonological patterns, such as the retention of aspirated stops and retroflex consonants, align with these proto-forms and distinguish it from its non-Indo-Aryan neighbors.

Within Indo-Aryan, Sindhi is classified under the northwestern subgroup of New Indo-Aryan languages, a category that includes relatives like Punjabi, Lahnda (including Hindko and Saraiki), and sometimes Dardic varieties. This subgrouping reflects geographic contiguity in the Indus Valley and shared innovations, such as simplified case systems, periphrastic verb constructions, and implosive consonants in some dialects, which diverged from central and eastern Indo-Aryan branches (e.g., Hindi-Urdu or Bengali) after the Middle Indo-Aryan period. Linguistic reconstructions, drawing on the comparative method applied to attested texts from the 16th century onward, support this positioning, with Sindhi exhibiting transitional traits between inner (e.g., Gujarati) and outer (e.g., Pashto-influenced) western varieties.

Sindhi's internal subgrouping encompasses a dialect continuum rather than discrete branches, with principal varieties including Vicholo (central standard), Lari (eastern), Lasi (southern), and Thari (southeastern), unified by common isoglosses, including in syntax. These dialects form a cohesive unit within the northwestern group, though border varieties like Kachchi show partial convergence with Gujarati due to areal effects. Empirical phonological studies, including studies of consonant shifts, affirm the language's integrity as a single entity without requiring further subdivision at the family level.

External influences and substrate theories

The phonological inventory of Sindhi includes distinctive implosive consonants such as /ɓ/, /ɗ/, /ʄ/, and /ɠ/, which are rare among Indo-Aryan languages and may reflect either archaic retentions from Proto-Indo-European glottalics under the glottalic theory or local innovations potentially influenced by pre-Indo-Aryan substrates in the Indus region. These implosives involve ingressive airflow with glottal lowering, contrasting with pulmonic voiced stops in neighboring Indo-Aryan tongues like Punjabi or Hindi.

Substrate theories propose that Sindhi preserves traces of non-Indo-Aryan languages spoken by Indus Valley Civilization inhabitants prior to circa 2000–1500 BCE, with a Dravidian affiliation hypothesized due to the presence of Brahui, a Dravidian outlier, in nearby Balochistan and proposed etymologies linking Indus seals to Proto-Dravidian roots. Evidence includes Sindhi's retention of retroflex series and vowel-final structures, atypical for core Indo-Aryan but paralleled in Dravidian phonologies; however, such features could arise from areal diffusion rather than direct inheritance, as retroflexes spread widely in Indo-Aryan via Dravidian contact further south. This Dravidian substrate model remains speculative: it is supported by place-name analyses and substrate loans in early Indo-Aryan (e.g., terms absent in western Indo-European branches), but it lacks confirmation, since the Indus script remains undeciphered.

Post-conquest external adstrates from Arabic and Persian dominate Sindhi's lexical strata, introduced after the 712 CE Umayyad invasion led by Muhammad bin Qasim, which established Islamic rule and facilitated borrowing in religious, administrative, and scientific domains. Persian loans, amplified under later administrations including the Mughals (13th–19th centuries), number in the thousands and extend to syntax (e.g., pronominal suffixes) and literary forms such as poetry; Arabic contributions, often mediated via Persian, include core Islamic terminology (e.g., namaz for prayer). These superstrates comprise up to 20–30% of modern Sindhi vocabulary in formal registers, far exceeding Sanskrit-derived roots in borrowed abstract concepts, while basic vocabulary remains predominantly Indo-Aryan.

Areal contacts with neighboring languages like Balochi (Iranian) and Saraiki (Indo-Aryan) yield minor phonological and lexical exchanges in peripheral dialects, such as shared aspirates or pastoral terms, but do not alter Sindhi's core grammar. Claims of Sindhi as inherently Dravidian, citing suffixal morphology or verb derivations, represent fringe views without broad empirical support, as comparative reconstruction firmly affiliates it with Northwestern Indo-Aryan via Middle Indo-Aryan intermediaries.

Fringe classifications and empirical rebuttals

Some proponents of alternative classifications argue that Sindhi originated as a Dravidian language associated with the Indus Valley Civilization, later overlaid by Indo-Aryan elements due to migrations, citing phonological traits like implosives and certain lexical items as evidence of pre-Aryan substrate influence. This perspective, advanced by figures such as Kingrani, draws on comparative word lists to suggest proto-Dravidian roots persisting despite external contacts. Similarly, isolated claims link Sindhi to Semitic or Sumerian languages, positing an independent ancient lineage tied to Mesopotamian civilizations, as proposed in certain regional historical analyses to underscore pre-Indo-European antiquity in the Indus region. These theories often emerge from archaeological interpretations and nationalist efforts to assert cultural continuity beyond Vedic influences.

Such fringe views are empirically rebutted by the comparative method in historical linguistics, which demonstrates Sindhi's systematic descent from Old Indo-Aryan via Middle Indo-Aryan Prakrits, evidenced by regular sound shifts (e.g., Sanskrit kṣetra to Sindhi khetar 'field') and morphological paradigms shared with neighboring Indo-Aryan languages such as Gujarati. The Linguistic Survey of India (1919), based on extensive dialectal data and cognate analysis, firmly places Sindhi in the Indo-Aryan North-Western subgroup, noting its divergence from Central Indo-Aryan forms but retention of core fusional inflection, such as nominative-accusative alignment and tense-aspect markers absent in agglutinative Dravidian structures. Implosives, while atypical for core Indo-Aryan, represent areal innovations from multilingual contact in the Indus basin, paralleled in neighboring languages such as Dardic varieties, rather than a Dravidian inheritance, as Dravidian phonologies emphasize retroflex series without implosive stops.

Lexical overlap further undermines non-Indo-Aryan origins: approximately 70% of Sindhi core vocabulary traces to Sanskrit-Prakrit roots, with innovations attributable to Persian-Arabic loans after 712 AD, not to Semitic or Sumerian substrates, for which no regular correspondences exist despite claims. Dravidian parallels, when present, reflect borrowing (e.g., via Brahui in Balochistan) rather than genetic affiliation, as systematic etymological reconstruction favors Indo-Aryan proto-forms. Mainstream classifications, corroborated by phonological inventories and syntactic typology, affirm Sindhi's Indo-Aryan status; alternative theories remain unsubstantiated because they lack the shared innovations and regular sound laws required to establish family membership.

Historical development

Pre-Islamic origins and early forms

The Sindhi language belongs to the Indo-Aryan branch of the Indo-European family, with its roots in the Old Indo-Aryan Vedic Sanskrit introduced to the Indus Valley region via migrations around 1500 BCE. These early forms evolved amid interactions with pre-existing substrates, though Sindhi's core grammar and lexicon remain distinctly Indo-Aryan, showing minimal Dravidian or other non-Indo-European retention despite fringe theories linking it to undeciphered Indus Valley scripts. By the 3rd century BCE, Vedic Sanskrit had transitioned into Middle Indo-Aryan Prakrits, regional vernaculars spoken in Sindh under empires like the Mauryas, as evidenced by Prakrit inscriptions and texts from Buddhist sites in the region.

In the Sindh context, these Prakrits likely included Northwestern variants, adapted to local phonology, with features like retroflex sounds emerging from the region's linguistic environment. The Natyashastra, composed between 200 BCE and 200 CE, provides one of the earliest literary references to dialects in the broader Indus area, implying the existence of proto-Sindhi speech forms. Oral traditions dominated, supplemented by religious texts in Prakrit used by the Jains and Buddhists prevalent in pre-Islamic Sindh until the 7th century CE.

The late pre-Islamic phase saw the Prakrits develop into Apabhramsha around the 6th century CE, a transitional stage characterized by phonetic simplification and loss of complex inflections, setting the foundation for New Indo-Aryan languages. Specifically, Sindhi descends from the Vrachada (or Vracada) Apabhramsha dialect of the lower Indus Valley, distinguished by innovations such as implosive stops and other patterns reconstructible through comparative methods. This precursor is corroborated by 11th-century accounts like Al-Biruni's in Kitab al-Hind, which describe the local vernacular's divergence from Sanskritic norms prior to Arab incursions. No dedicated scripts for Vrachada survive, but Brahmi derivatives likely served for administrative or religious purposes, reflecting a continuum from earlier epigraphy. Reconstruction relies on comparative evidence rather than direct attestation, as Sindhi's distinct identity solidified after 712 CE amid external contacts.

Post-712 AD Islamic conquest impacts

The Arab conquest of Sindh in 712 CE by Muhammad bin Qasim under the Umayyad Caliphate established the first sustained Muslim rule in the region, initiating linguistic contact that primarily manifested in lexical enrichment rather than structural overhaul of Sindhi. Arabic introduced loanwords focused on Islamic terminology, administration, and jurisprudence; these borrowings occurred through elite bilingualism among rulers and converts, with Sindhi speakers adapting terms for local use without displacing the language's Indo-Aryan core.

Subsequent Abbasid oversight and the rise of local Muslim dynasties, such as the Habbari (9th–10th centuries) and Soomra (11th–14th centuries), amplified Persian's role as the administrative and literary medium, leading to deeper integration of Perso-Arabic vocabulary. This influence peaked under later rulers like the Sammas and Arghuns, where Persian court usage permeated Sindhi, particularly in formal domains; examples include darvāzō ("gate," from Persian darvāze) and dafnāińu ("to bury," calqued on Persian patterns), often retaining original orthographic forms with minor phonological adaptations to Sindhi's retroflex and implosive sounds. Such loans, concentrated in urban educated speech, constitute a substantial portion of Sindhi's lexicon, akin to patterns in neighboring languages, but did not alter syntax or morphology significantly because Sindhi persisted as the vernacular.

The Perso-Arabic script's adoption for Sindhi writing emerged during this era, evolving from Arabic models to accommodate the language's phonology via digraphs (e.g., for aspirates like jh and gh) and diacritics for implosives and retroflexes. Early attestations appear in medieval Sufi texts, such as those by Qāẓī Qādan (d. 1551 CE), reflecting Islamic literary traditions; this script supplanted pre-conquest indigenous systems like Khudabadi for official and religious purposes, though full standardization awaited British reforms in 1853 CE. Overall, the conquest's linguistic legacy emphasized vocabulary expansion, estimated through comparative analyses to involve thousands of terms and fostered by the cultural patronage of Persianate elites, while preserving Sindhi's distinct phonological profile against wholesale Arabization.

Medieval and pre-colonial evolution

During the Sumra dynasty (c. 1050–1350 CE), Sindhi began transitioning from its Vrachada Apabhramsa roots toward a more distinct New Indo-Aryan form, retaining core grammatical structures while showing limited early integration of Arabic terms, primarily for religious contexts, following the 712 CE conquest. This period marks the earliest recorded literary activity in Sindhi, with verses attributed to seven fakirs under Jam Tamachi II, evidencing the language's capacity for poetic expression in a vernacular mode minimally influenced by Persian at the time.

Under Samma rule (c. 1350–1520 CE), the language incorporated subtle phonological and lexical shifts, including nascent Persian vocabulary, as seen in surviving verses by poets like Shaikh Hamad bin Rashiduddin Jamali (d. 1362 CE) and Shaikh Ishaq Ahangar, whose works blend local elements with emerging Sufi themes. Script usage remained non-standardized, drawing from Landa derivatives and proto-Perso-Arabic forms adapted for Sindhi phonology, which preserved unique features like implosive consonants absent in neighboring languages. A poetic form characterized by two-line structures with end rhymes emerged, reflecting Hindi influences alongside indigenous patterns.

The period from c. 1520 to 1700 CE accelerated Persian influence due to courtly adoption of Persian as the administrative language, introducing approximately 8–29 loanwords per early poetic composition, as in Shah Abdul Karim's (1536–1620 CE) 94 preserved verses, which fuse bardic traditions with Sufi mysticism. Qazi Qadan (d. 1551 CE) represents the first poet with authenticated verses, numbering seven, demonstrating Sindhi's syntactic flexibility in adapting Arabic-Persian prosody while maintaining Prakrit-derived case endings and verb conjugations. Shah Inayat (d. c. 1718 CE) further synthesized these elements, blending indigenous motifs with Islamic esotericism in his kafis, evidencing the language's maturation into a vehicle for syncretic spiritual expression.

In the Kalhora (1701–1783 CE) and Talpur (1783–1843 CE) eras, preceding British annexation, Sindhi reached a literary zenith, with vocabulary expanding to 12,000–20,000 words suited for complex narrative and lyrical forms, as exemplified by Shah Abdul Latif Bhittai's Risalo (compiled posthumously from works c. 1689–1752 CE), which features 12 surs drawing on folk narratives, as in Sur Marui, and employs refined wai (precursor to kafi) structures rooted in local dialects. Sachal Sarmast (1739–1829 CE) and Sami (c. 1743–1850 CE) contributed to Upper Sindhi variants, with a denser Persian-Arabic lexicon in Sachal's kafis and simpler sloka forms in Sami's oeuvre, highlighting dialectal isoglosses such as Larvi's purity versus Lari's foreign admixtures. Script diversity persisted, with Muslims using Perso-Arabic adaptations and Hindus using other scripts, underscoring the language's oral primacy and resistance to full standardization until colonial intervention. These developments reflect causal pressures from dynastic and Sufi networks, prioritizing accessibility over elite Persian dominance.

Colonial standardization (1843–1947)

Following the British annexation of Sindh in 1843 after the defeat of the Talpur rulers at the Battle of Miani, colonial administrators sought to establish Sindhi as a functional language for administration and education, replacing Persian, which had been used under prior Muslim rule. In 1847, Sindh was incorporated into the Bombay Presidency, prompting early efforts to codify the language amid its dialectal diversity and lack of a standardized form. The Naskhi (Arabic) script was recommended in 1848 for its prevalence and adaptability to Sindhi phonemes, while George Stack compiled the first Sindhi-English dictionary and grammar using the Devanagari script, printed in 1849 at the American Mission Press in Bombay with 500 copies produced at a cost of Rs. 3 each.

Script standardization advanced rapidly in the early 1850s amid debates over Arabic, Devanagari, Khudabadi, and modified Hindustani variants. Bartle Frere, as Commissioner, promulgated Sindhi as the official language in 1851 via Bombay Government Circular No. 1825, mandating colloquial proficiency exams for civil officers and allocating Rs. 10,000 annually for education under Court of Directors Despatch No. 46 of December 8, 1852. A committee led by B.H. Ellis finalized the Arabic-Sindhi script in 1853, incorporating 52 letters with modifications like additional dots for unique sounds such as implosives, sanctioned by the East India Company's Court of Directors (Dispatch No. 216, December 8, 1852) and published in July 1853; this became the official standard by 1855, though Khudabadi (revised as Hindoo-Sindhi in 1856 and introduced in schools for Hindus in 1868) persisted as an alternative for non-Muslim communities. Ellis also oversaw printing of educational texts like Esop’s Fables (1854) and completed Stack’s dictionary (1855, 500 copies at Rs. 2,496 total), while Ernest Trumpp published editions of Shah Abdul Latif Bhitai’s Shah Jo Risalo (1866, lithographed in Leipzig) and a comprehensive Grammar of the Sindhi Language (1872), comparing it to Sanskrit-Prakrit roots.

Educational reforms emphasized Sindhi as the vernacular medium, with Frere establishing schools in 1851 and Ellis proposing a Rs. 20,000 annual budget in 1854 for expanded vernacular instruction across towns such as Hyderabad, reaching 27 schools with 7,443 scholars by 1856. By August 29, 1857, all official applications were required to be in Sindhi, enforcing its administrative use, and foreign officers were required to undergo 4–6 months of training. Printing infrastructure, initiated with the Sindh Advertiser press in 1844, supported the proliferation of texts like Hikayat-ul-Salheen (1851, the first Arabic-Sindhi book) and Bab Namah (1853), alongside dictionaries by George Shirt (1866). These measures spurred prose literature and neologisms for modern administration, though dialectal variations persisted; George Grierson’s Linguistic Survey of India (1919, Vol. VIII) later classified Sindhi as Indo-Aryan, documenting its phonological traits. Into the 20th century, Sindhi typewriter development (e.g., by Remington) and sustained school integration solidified the standardized form, facilitating its role in local courts, media, and bureaucracy until partition in 1947.

Post-partition trajectories (1947–present)

Following the partition of British India on August 14, 1947, the Sindhi language experienced divergent paths shaped by mass migrations and state policies, with Sindh province allocated to Pakistan and approximately 1.2 million Sindhi Hindus relocating to India, severing the language's primary territorial base for them. In Pakistan, Sindhi retained its status as the dominant language of Sindh, bolstered by provincial autonomy, while in India it became a minority tongue among dispersed communities, facing assimilation pressures and limited institutional support. These shifts influenced script usage, recognition, and literary production, with Pakistan emphasizing continuity of the Perso-Arabic orthography and India permitting dual scripts amid declining vitality.

In Pakistan, the Perso-Arabic script for Sindhi, standardized in 1853 under British administration with modifications for 52 letters including aspirated and implosive consonants, persisted after independence without major alteration, facilitating administrative and educational continuity in Sindh. Sindhi was reinstated as the province's official language in the early post-partition decade, with the 1972 Sindh Assembly resolution mandating its use in government operations alongside Urdu, countering centralizing Urdu policies that marginalized regional languages nationally. Educationally, Sindhi became the primary medium of instruction in Sindh's public schools from the primary level, with 2019 provincial orders requiring its compulsory teaching in private institutions up to class 5, though implementation varies due to the dominance of Urdu and English in higher education and among urban elites. Media expansion included Sindhi broadcasts on Radio Pakistan from 1947 and dedicated channels like Sindh TV by the 2000s, alongside newspapers such as Sindh Express, sustaining vernacular journalism despite national Urdu prioritization.

Modern Sindhi literature in Pakistan flourished after 1947, building on pre-partition foundations with progressive themes in poetry and prose; poets like Sheikh Ayaz (1923–1998) incorporated resistance motifs, publishing over 50 collections that critiqued state centralism. Prose evolved through novels addressing partition trauma and identity, such as Jamal Abro's works in the 1980s–1990s, while academic standardization efforts, including dictionaries and grammars by institutions like the Sindhi Adabi Board, established in 1951, supported linguistic codification.

In India, post-partition Sindhi speakers, numbering around 2.7 million by recent estimates but concentrated in urban enclaves, adopted both the Perso-Arabic and Devanagari scripts, with the latter promoted for Hindu-majority users to align with national linguistic norms. Initial exclusion from constitutional recognition (Sindhi was not listed in the Eighth Schedule until a 1967 amendment following community agitation) delayed institutionalization, contributing to domain loss in education and administration, where Hindi and English prevailed. Some states incorporated Sindhi in refugee-settlement schools sporadically, but intergenerational shift toward dominant languages eroded proficiency, with surveys indicating only 15–20% fluency among younger speakers.

Indian Sindhi literature after 1947 grappled with themes of displacement, as in the works of Krishna Kotwani and Popati Hiranandani, who chronicled displacement in their writing, including memoirs, yet faced marginalization from scarce infrastructure and audience fragmentation. Efforts like the P.G. Sindhi Library's digitization of over 2,000 post-partition titles aim to preserve output, but critics note linguistic hybridization and declining output, with production dropping from dozens of annual titles in earlier decades to fewer than ten by the 2000s. Overall, while Pakistan's trajectory reinforced Sindhi's regional robustness amid national multilingual tensions, India's fostered attrition, underscoring causal links between territorial continuity and linguistic vitality.

Geographic distribution and demographics

Primary regions in Pakistan

Sindhi is primarily spoken in Pakistan's Sindh province, where it holds official status alongside Urdu. The language predominates as the mother tongue in this region, reflecting its deep historical and cultural roots among the local population. According to the 2017 Pakistan Census, Sindhi is the mother tongue of 14.57% of the national population. Within Sindh, analyses of census data indicate that Sindhi speakers comprised approximately 62% of the provincial population in 2017, though this figure declined slightly to 60% by 2023 per survey estimates. The highest concentrations occur in rural districts, with urban centers like Karachi showing lower proportions due to linguistic diversity from migration.

In Balochistan province, Sindhi-speaking communities form a notable minority, estimated at around 5.6% of the provincial population based on earlier census distributions, primarily in southern districts like Lasbela, where the Lasi dialect prevails. These speakers trace their origins to historical migrations and shared cultural ties with Sindh. Sindhi presence in Punjab remains marginal, limited to border areas, with fewer than 0.2% of speakers nationally living outside Sindh and Balochistan. Overall, over 90% of Pakistan's Sindhi speakers reside in Sindh, underscoring the province's centrality to the language's demographic core.

Usage in India

In India, Sindhi is spoken primarily by Hindu communities displaced from Sindh during the 1947 partition, with concentrations in urban areas of states including Maharashtra, Gujarat, and Rajasthan (such as Barmer district), and to lesser extents elsewhere. As per the 2011 census, there were 2,772,264 native Sindhi speakers, constituting 0.23% of the national population, a figure reflecting stability from prior censuses but masking intergenerational shifts toward bilingualism in Hindi or regional languages.

Sindhi holds scheduled language status under the Eighth Schedule of the Indian Constitution, added via the 21st Amendment in 1967, granting it recognition alongside 21 other languages for purposes of cultural preservation and limited administrative use, though it lacks official status in any state and remains without a designated homeland region. In India, the language employs both the Perso-Arabic script (adapted for Sindhi phonology) and Devanagari, with the latter predominating in education and print media to align with Hindu linguistic traditions and facilitate integration with other Indic scripts.

Educationally, Sindhi is offered as a medium of instruction in select primary and secondary schools, particularly in Maharashtra and Gujarat, supported by the National Council for the Promotion of Sindhi Language, which advises on curriculum development and publishes materials; however, enrollment has declined steadily due to preferences for Hindi or English, with younger generations showing reduced proficiency amid urbanization and economic assimilation. Media usage includes community radio broadcasts, periodicals like the Sindhi Sansar, and digital platforms, but output is limited compared to dominant languages, contributing to a broader trend of language attrition in which daily spoken Sindhi persists in familial and religious contexts yet faces erosion from intermarriage and migration. This decline underscores causal factors such as the absence of institutional reinforcement in non-native regions and competition from nationally promoted languages, despite constitutional safeguards.

Diaspora communities

Sindhi-speaking diaspora communities, predominantly consisting of Hindu Sindhis with pre-existing trade networks, have formed in several countries outside and , driven by commercial migration that intensified after the 1947 partition of British India. These communities maintain the language primarily within families, religious institutions, and business dealings, though intergenerational shift toward English or local languages is prevalent due to assimilation pressures and lack of institutional support. Organizations such as the Sindhi Association of (SANA), established to promote cultural preservation including , facilitate community events and online resources for younger generations. In the United States, Sindhi speakers number approximately 8,965 according to the U.S. Census Bureau's (2009–2013), concentrated in urban centers like New York, , and where expatriate networks support vernacular use in private spheres. Similarly, Canada's 2016 reported 11,860 individuals with Sindhi as their mother tongue, mainly in and , where community centers and festivals sustain oral traditions despite predominant English dominance in public life. In both nations, Sindhi functions as a , with limited formal instruction available through private initiatives rather than public curricula. The hosts one of the largest Sindhi expatriate populations, estimated at over 100,000 individuals engaged in trade and retail, particularly in and ; here, Sindhi persists in interpersonal commerce and household settings amid and English prevalence. Comparable communities exist in the , with concentrations in fostering cultural associations that organize classes and media, and in , where smaller groups of around 4,000 maintain ties through business guilds. retention challenges are acute in these settings, as evidenced by studies on Malaysian showing and shift as adaptive strategies in multilingual environments. 
Overall, diaspora Sindhi speakers total several hundred thousand globally, but precise enumeration remains elusive due to fluid migration and underreporting in host-country censuses focused on rather than or . In Pakistan, the 2017 census recorded Sindhi as the mother tongue of 14.57% of the national population, equating to approximately 30.3 million speakers out of a total of 207.68 million people. The 2023 census reported a comparable national proportion of 14.3%, corresponding to roughly 34.5 million speakers amid a population exceeding 241 million, with the vast majority concentrated in Sindh province where Sindhi accounts for 60.14% of residents or about 33.5 million individuals. In India, the 2011 census identified 2,772,264 Sindhi speakers, representing 0.23% of the total population and primarily residing in states such as , , and following post-1947 partition migrations. Diaspora communities maintain smaller pockets of speakers in countries including the , the , the , and , with estimates suggesting several hundred thousand individuals, though intergenerational transmission often weakens due to host-language dominance and limited institutional support. Nationally in Pakistan, Sindhi speaker numbers have expanded in absolute terms alongside population growth since 1998, when they comprised 14.1% or about 21.8 million of 132.4 million total residents, but the provincial share in Sindh dipped marginally from 62% to 60% between 2017 and 2023 amid urbanization and influxes of Urdu- and Pashto-speaking migrants. In Karachi, the proportion of Sindhi speakers rose from 7.22% in 1998 to 10.67% in 2017, attributable to rural-to-urban Sindhi migration offsetting language shift pressures from Urdu as the national lingua franca. 
In India, speaker counts increased steadily from roughly 2.5 million in the 2001 census to 2.77 million in 2011, reflecting community efforts to preserve the language through education and media despite assimilation in urban Hindu-majority settings. Overall, while demographic expansion sustains Sindhi vitality in Pakistan's rural Sindh heartland, urban bilingualism and migration pose risks of gradual erosion in fluency among younger cohorts, with no evidence of acute decline but persistent challenges from Urdu's institutional precedence.
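The census shares and speaker counts quoted above can be cross-checked with simple arithmetic. The Python sketch below is illustrative only: the helper name `speakers_millions` is hypothetical, and the inputs are the rounded figures given in the text (including 241 million as a floor for the 2023 population), so results agree only to within rounding.

```python
def speakers_millions(share_percent: float, population_millions: float) -> float:
    """Return the implied number of speakers, in millions."""
    return share_percent / 100.0 * population_millions


# Figures as quoted in the text (rounded).
pakistan_2017 = speakers_millions(14.57, 207.68)  # text: about 30.3 million
pakistan_2023 = speakers_millions(14.3, 241.0)    # text: roughly 34.5 million

# Reverse check for India 2011: speakers and share imply a total population.
india_2011_speakers = 2_772_264
implied_india_population = india_2011_speakers / (0.23 / 100.0)

print(f"Pakistan 2017: {pakistan_2017:.1f} million")
print(f"Pakistan 2023: {pakistan_2023:.1f} million")
print(f"India 2011 implied population: {implied_india_population / 1e9:.2f} billion")
```

Because the shares are rounded to one or two decimal places, a check of this kind can confirm only that a count and a percentage are mutually plausible, not that either is exact.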

Dialectal variation

Major dialect groups

Sindhi features six principal dialect groups, primarily distinguished by geographic distribution within Sindh province and adjacent regions: Sireli (or Siraiki) in upper Sindh, Vicholi in central Sindh, Lari in lower Sindh, Thari in the Thar Desert area, Lasi in Lasbela and parts of Balochistan, and Kachhi (or Kutchi) in the Kutch region. These groupings reflect historical settlement patterns and substrate influences from neighboring languages like Balochi and Gujarati. Vicholi, centered around Hyderabad and the Vicholo region of central , serves as the prestige variety and forms the foundation for the standardized literary form of Sindhi. It exhibits relatively conservative phonological features compared to peripheral dialects and has been promoted through and media since the colonial . Lari predominates in lower , including districts like and Sujawal, where it is spoken by communities along the Indus Delta. This dialect incorporates some maritime and coastal lexical elements, distinguishing it from inland varieties. Sireli occupies upper Sindh, bordering , and shows partial convergence with adjacent Saraiki speech forms, though it retains core Sindhi grammar and vocabulary. Thari, prevalent in district, adapts to arid desert conditions with influences from Rajasthani dialects, featuring distinct phonetic shifts such as aspiration patterns. Lasi, found in and extending into Balochistan's Hub and areas, demonstrates substrate effects from Balochi, including retroflex enhancements and loanwords related to . Kachhi bridges Sindhi with Kutchi dialects in the , spoken across the Pakistan-India border, and preserves archaic Prakrit-derived terms amid Gujarati admixture. These dialects maintain high overall, with variations chiefly in lexicon and prosody rather than syntax.

Isoglosses and mutual intelligibility

Sindhi dialects exhibit a with isoglosses primarily aligned to geographic boundaries, separating phonological, lexical, and grammatical innovations influenced by neighboring languages. Northern isoglosses demarcate transitions to varieties like Siraiki, featuring clearer articulation and Punjabi lexical borrowings, while eastern boundaries with mark Thareli (or Thari) dialects through vigorous intonation and Marwari substrate effects. Southern isoglosses distinguish Kachhi from Gujarati influences, with blended vocabulary and reduced implosive contrasts, and southwestern lines separate Lasi from Balochi admixtures. Key phonological isoglosses include variations in aspiration and vowel quality: Lari dialects south of the Indus Delta show disaspiration of voiced stops (e.g., aspirated bh > b), contracted vowels, and softened consonants, contrasting with Vicholi's retention of double consonants and standard implosives (ḇ, ḋ, etc.). Lasi varieties exhibit transitional traits, with minor pitch excursions (F0 rise ~140-155 Hz) differing from Vicholi's stable contours, while Larri maintains higher vowel durations in closed syllables (CVCC: 0.033s vs. Vicholi's 0.021s). Lexical isoglosses bundle Dardic or Dravidian suffixes in peripheral dialects, such as Lari's pronominal endings resembling Dravidian patterns (e.g., -en for first-person singular). Mutual intelligibility is high among contiguous dialects, with Vicholi speakers readily understanding Lasi despite phonological shifts, but decreases toward peripheries due to substrate divergences and social barriers from historical settlements. Neighboring varieties like Siraiki maintain partial comprehension with northern Vicholi, yet Thareli's Rajasthani admixtures and Kachhi's Gujarati mixes reduce intelligibility for central speakers, often requiring accommodation. 
Acoustic variations in intonation and duration across Lasi, Lari, and Vicholi imply challenges in dialectal speech recognition, though no formal asymmetry tests quantify lexical overlap below 80% in fringe areas. Overall, while core dialects support fluid communication, peripheral isogloss bundles foster dialectal distinctiveness without full unintelligibility.

Standardization debates

In 1853, the British colonial administration in Bombay appointed an eight-member committee to standardize the Perso-Arabic script for Sindhi, resulting in a modified alphabet with additional graphemes to represent unique Sindhi phonemes, such as implosives and aspirates, which remains the official orthography in Pakistan. This decision resolved earlier inconsistencies among variants like Arabic-Sindhi and Lunda scripts but sparked colonial-era debates over adopting a Devanagari-based system instead, favored by some Hindu communities for its alignment with indigenous traditions. The standardization prioritized administrative efficiency and printing needs over phonetic completeness, leading to persistent graphematic variations, including inconsistent diacritic usage and allograph choices. Following the 1947 partition, script choice became a flashpoint tied to religious identity, with in advocating a shift to to disassociate from the Perso-Arabic script's Islamic connotations, culminating in its constitutional recognition as the 15th scheduled language on April 10, 1967, alongside official allowance of both scripts since a 1949 government resolution. In , Perso-Arabic retention reinforced national linguistic policy, but in , proponents of Perso-Arabic, such as the Sindhi Sangat organization, argue it better accommodates Sindhi's 52-letter inventory and accesses a larger literary corpus—approximately 10,000 titles in Perso-Arabic versus 1,000 in over the past 50 years—while criticizing adaptations for inadequate phonological mapping. This divide has communal undertones, with some viewing Perso-Arabic revival as essential for cultural continuity amid declining literacy, though others prioritize for educational integration. Contemporary debates center on orthographic uniformity and digital viability, with variations in both scripts affecting spelling, , and reduced vowels; for instance, the Perso-Arabic "Heh" cluster lacks full disambiguation, complicating encoding. 
In Pakistan, the Sindhi Language Authority has pursued reforms for spelling consistency in compound words and digital standards, addressing non-uniformities in baro-words and technical terms. India's 2024 controversy over textbooks, initially in Devanagari, highlighted demands for Perso-Arabic editions to boost youth engagement, underscoring unresolved tensions between preservation and accessibility. Standard Sindhi, based on the Vicholi dialect of the Hyderabad region, faces minimal dialect-specific contention compared to script issues, though phonological divergences in peripheral varieties like Lasi persist without formal resolution.

Phonological system

Consonant inventory

The consonant inventory of Sindhi is notably extensive among , comprising approximately 39 to 52 phonemes depending on the analysis, which accounts for dialectal variation and inclusion of marginal or sounds. This richness stems from a combination of inherited Indo-Aryan features, such as aspirated stops and retroflex consonants, alongside innovations like implosive stops produced via glottalic ingressive . A defining trait is the presence of four implosive consonants—/ɓ/ (bilabial), /ɗ/ (alveolar), /ʄ/ (palatal affricate), and /ɠ/ (velar)—which contrast phonemically with pulmonic egressive voiced stops (/b/, /d/, /ɖ/, /d͡ʒ/, /ɡ/) and occur natively rather than solely in borrowings. These implosives, rare outside South Asia and Africa, arise from historical sound shifts and contribute to Sindhi's phonological complexity, enabling distinctions in minimal pairs (e.g., /ɓaɾu/ "full" vs. /baru/ "child"). Implosives are more frequent in initial and medial positions but less common word-finally. The core obstruent series features voiceless unaspirated and aspirated stops/affricates at labial, dental, retroflex, palatal, and velar places of articulation, alongside voiced counterparts. Fricatives include both (/s/, /z/, /ʃ/, /ʒ/) and non-sibilants (/f/, /θ/, /x/, /ɣ/, /h/), with /θ/ and /f/ partly attributable to Perso-Arabic influence but integrated into the native system. Nasals exhibit a five-way contrast (/m/, /n/, /ɳ/, /ɲ/, /ŋ/), while (/l/, /ɭ/, /j/, /w/) and rhotics (/ɾ/, /ɽ/) complete the inventory, with retroflex variants marking a hallmark of Indo-Aryan .
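| Manner/Place | Labial | Dental/Alveolar | Retroflex | Palatal/Alveolo-palatal | Velar | Glottal |
|---|---|---|---|---|---|---|
| Stop (voiceless) | p | t | ʈ | | k | |
| Stop (voiceless aspirated) | | | ʈʰ | | | |
| Stop (voiced) | b | d | ɖ | | g | |
| Implosive | ɓ | ɗ | | ʄ (affricate) | ɠ | |
| Affricate (voiceless aspirated) | | | | tʃʰ | | |
| Nasal | m | n | ɳ | ɲ | ŋ | |
| Fricative (voiceless) | f | θ, s | | ʃ | x | h |
| Fricative (voiced) | | z | | ʒ | ɣ | |
| Lateral/Approximant | | l | ɭ | j | | |
| Rhotic/Flap | | ɾ | ɽ | | | |
| Glide | w | | | | | |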
This table represents the phonemic contrasts in standard (Vicholi) Sindhi, though peripheral dialects may merge or add sounds (e.g., breathy nasals or additional fricatives in eastern varieties). Allophones include aspirated nasals and retroflexion assimilation, but phonemic status prioritizes contrastive units over surface variants.

Vowel system

The Sindhi vowel system comprises ten monophthongal phonemes, distinguished primarily by tongue height, backness, and length, with three short vowels and seven long vowels forming the core inventory. The short vowels are the high front unrounded /ɪ/, high back rounded /ʊ/, and mid central unrounded /ə/, while the long vowels include high front unrounded /iː/, high back rounded /uː/, mid front unrounded /eː/, mid back rounded /oː/, low front unrounded /ɛː/, low back rounded /ɔː/, and low central unrounded /aː/. This asymmetrical length contrast reflects a historical development from Proto-Indo-European and Prakrit precursors, where short low /a/ merged or shifted, leaving no short counterpart to /aː/.
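| Height | Front | Central | Back |
|---|---|---|---|
| High long | /iː/ | | /uː/ |
| High short | /ɪ/ | | /ʊ/ |
| Mid long | /eː/ | | /oː/ |
| Lower-mid long | /ɛː/ | | /ɔː/ |
| Mid-central short | | /ə/ | |
| Low | | /aː/ | |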
Nasalization constitutes a phonemic feature across the vowel inventory, with nasal counterparts to each oral vowel serving to contrast meanings, as in minimal pairs like oral /paːl/ "moment" versus nasal /pãːl/ "to take care." Acoustic studies confirm distinct formant structures for nasalized vowels, typically showing lowered F1 and F2 frequencies compared to oral equivalents, with nasal murmur evident in spectrograms. Vowel length is contrastive and phonemic, particularly in non-final positions, though shortening occurs in rapid speech or before certain consonants; duration measurements from native speakers average 150-250 ms for short vowels and 250-400 ms for long ones. Allophonic variations include centralization of /eː/ and /oː/ to [ɛ̞ː] and [ɔ̞ː] before retroflex consonants, and raising of /ɛː/ to [eː]-like in some dialects, but these do not alter phonemic distinctions. Diphthongs are marginal, primarily /ai/ and /au/ arising from vowel plus glide sequences, but they are not core to the system and often analyze as disyllabic in phonological processes. The system's symmetry aligns with broader Indo-Aryan patterns, yet Sindhi's inclusion of /ɛː/ and /ɔː/ as phonemic long lowers distinguishes it from neighboring languages like Hindi-Urdu, which treat them as allophones.
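The ten-vowel inventory can be written down as a small feature table, which makes the short/long asymmetry easy to inspect. The sketch below is a minimal illustration, not a standard phonological toolkit: the `Vowel` class and its field names are assumptions introduced here, and the height labels follow the vowel chart rather than any particular analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vowel:
    symbol: str
    height: str    # "high", "mid", "lower-mid", or "low"
    backness: str  # "front", "central", or "back"
    long: bool


VOWELS = [
    Vowel("ɪ", "high", "front", False),
    Vowel("ʊ", "high", "back", False),
    Vowel("ə", "mid", "central", False),
    Vowel("iː", "high", "front", True),
    Vowel("uː", "high", "back", True),
    Vowel("eː", "mid", "front", True),
    Vowel("oː", "mid", "back", True),
    Vowel("ɛː", "lower-mid", "front", True),
    Vowel("ɔː", "lower-mid", "back", True),
    Vowel("aː", "low", "central", True),
]

short_vowels = [v for v in VOWELS if not v.long]
long_vowels = [v for v in VOWELS if v.long]

# Long vowels that have no short vowel at the same height and backness.
unpaired_long = [
    v.symbol
    for v in long_vowels
    if not any((s.height, s.backness) == (v.height, v.backness) for s in short_vowels)
]

print(len(short_vowels), "short;", len(long_vowels), "long")
print("long vowels without a short counterpart:", unpaired_long)
```

Under these feature labels only the two high long vowels have a short partner, and the low vowel /aː/ is among those without one, as the text describes.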

Suprasegmental features

Sindhi exhibits lexical stress as a primary suprasegmental feature, functioning at both word and sentence levels, with one prominent syllable per lexical item typically bearing primary stress. Stress placement is influenced by syllable weight, favoring heavy syllables containing long vowels or closed by consonants, and is realized acoustically through elevated fundamental frequency (F0), prolonged duration, and increased intensity in the stressed vowel compared to unstressed counterparts. Phonetic analyses of word pairs confirm statistically significant differences in these correlates, supporting classification of Sindhi as a stress-accent language rather than a tone language. Extra-heavy stress may occur for emphasis, while drawled variants signal confirmation or persuasion. Intonation contours in Sindhi operate independently of lexical stress, employing four pitch levels—low, mid, high, and extra high—along with three terminal patterns: level, falling, and rising. These elements convey syntactic distinctions, such as declarative statements versus interrogatives or exclamations, with rising intonation often marking questions (e.g., level statement /hənə khã pəcəndə/ versus rising /hənə khã pəcəndə↑/). Acoustic investigations reveal pitch (F0) modulates both stress prominence and broader prosodic phrasing, yet intonation remains separable, contributing to rhythm and discourse functions without altering lexical contrasts. Prosodic rhythm further integrates variable pitch and duration ranges, with adult speakers showing mean F0 spans of approximately 100-200 Hz and duration modulations aiding speech processing applications. Nasalization functions phonemically, contrasting oral and nasalized vowels (e.g., /a/ versus /ã/, /i/ versus /ĩ/), and spreads as a prosodic feature across adjacent vowels or semivowels in sequences, as in /ɡəʊə/ realized as [ɡə̃ʊə̃] 'cow (oblique)'. 
This regressive or progressive assimilation enhances phonological contrasts without segment-level specification. Juncture demarcates boundaries, with close juncture involving smooth transitions (e.g., /paŋɦi/ [paŋɦi] 'water') and open junctures signaling pauses or phrase edges, including internal, terminal falling, and terminal rising types. Phonetic effects include reduced aspiration pre-juncture and contrasts like /tʃoŋkiri/ 'girl' versus /tʃo + kiri/ 'why did she fall', where juncture resolves ambiguity. These features collectively underpin Sindhi's prosodic structure, influencing mutual intelligibility across dialects through variations in stress timing and intonational melody.

Orthographic systems

Perso-Arabic adaptations

The Perso-Arabic script for Sindhi, introduced during the Arab conquest of Sindh in the 8th century CE, represents a specialized adaptation of the Arabic abjad to accommodate the language's phonological inventory. This script evolved to include modifications for sounds absent in classical Arabic, such as implosives, retroflexes, and aspirated consonants, drawing on Persian influences while adding unique graphemes. It was formally standardized in 1853 by a committee appointed by the British colonial Government of Bombay, which regulated the alphabet's graphemes and promoted its use in printing and education. Sindhi's Perso-Arabic employs 49 basic letters plus 7 digraphs for aspirated sounds, totaling around 52 characters, significantly expanding the standard Perso-Arabic set of approximately 32-40 letters. Additional letters, such as ٻ for the implosive /ɓ/, ٺ for /ʈʰ/, ڍ for the retroflex /ɖ/, and ڙ for the retroflex flap /ɽ/, were introduced or modified with diacritics to distinguish these Indo-Aryan phonemes. Digraphs like جھ for /dʒʰ/ and digraph forms for aspiration (e.g., bh, dh) further adapt the script, though aspiration is inconsistently marked in practice. Three variants of "heh" (ه, ھ, ہ) are used, with ہ often reserved for aspirated /ɦ/ to avoid ambiguity. Vowel representation relies on matres lectionis— letters ا, و, ي, and sometimes ع standing for long vowels /aː/, /uː/, /iː/, and /ə/—while short vowels /ɪ/, /ʊ/, /ə/ are typically omitted in writing but can be indicated with diacritics (َ for /a/, ُ for /u/, ِ for /i/) in pedagogical texts. This defectiveness mirrors abjads but leads to ambiguities resolved contextually, as Sindhi omits schwa vowels more frequently than in fully vocalized scripts. Standalone vowels use carriers like ا for /a/ or ئ for /e/. The script is written right-to-left in a connected style, primarily using the Naskh form rather than , facilitating distinction from script despite shared Perso-Arabic roots. 
Orthographic variations persist in usage and final marking, particularly for loanwords and dialectal forms, though efforts since 1853 have promoted consistency in official Pakistani usage, where it remains the mandated script for Sindhi.
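For digital text, the additional Sindhi letters and the three heh forms mentioned above are separate Unicode characters, which matters for searching and sorting. The following Python sketch uses only the standard `unicodedata` module; the dictionary name `SINDHI_LETTERS` and the helper `describe` are hypothetical, and the sound values are simply those the text assigns to each letter.

```python
import unicodedata

# Letters cited in the text, mapped to the sound values the text gives them.
SINDHI_LETTERS = {
    "ٻ": "ɓ",
    "ٺ": "ʈʰ",
    "ڍ": "ɖ",
    "ڙ": "ɽ",
}

# The three heh forms named in the text.
HEH_VARIANTS = ["ه", "ھ", "ہ"]


def describe(ch: str) -> str:
    """Format one character as 'U+XXXX NAME' using the local Unicode database."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}"


for letter, ipa in SINDHI_LETTERS.items():
    print(f"/{ipa}/  {describe(letter)}")

for heh in HEH_VARIANTS:
    print(describe(heh))

# The heh forms look similar in print but are distinct code points.
distinct_heh = len({ord(h) for h in HEH_VARIANTS}) == len(HEH_VARIANTS)
print("heh variants are distinct code points:", distinct_heh)
```

Text-processing code that treats the heh forms as interchangeable would need an explicit normalization step of its own, since Unicode does not unify them.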

Devanagari and Landa-derived scripts

In India, following the partition of British India in 1947, the Devanagari script was widely adopted by the displaced Hindu Sindhi community for writing the language, reflecting a shift toward alignment with dominant Indic scripts amid resettlement in states like Maharashtra and Gujarat. This adaptation received formal constitutional recognition on April 10, 1967, via the 21st Amendment to the Indian Constitution, which designated Sindhi as the fifteenth scheduled language and endorsed Devanagari as its primary script for official purposes. The Sindhi variant of Devanagari modifies the standard form with diacritics—such as dots over letters for fricatives like /f/, /x/, /ɣ/, and /ʤ/ (/z/), and vertical lines for the four implosive consonants (/ɓ/, /ɗ/, /ʄ/, /ɠ/) distinctive to Sindhi phonology—enabling representation of its 52-letter inventory, including 10 vowels and 43 consonants. Despite these accommodations, Devanagari's abugida structure, derived from ancient Brahmi via Gupta and Nagari evolutions, has been critiqued for imperfectly capturing Sindhi's retroflex and implosive sounds compared to indigenous alternatives. Landa-derived scripts, rooted in the Brahmi tradition and employed by Sindhi Hindu merchants for centuries in the Indus region, offered a more phonetically tailored indigenous system prior to colonial standardization. These cursive, merchant scripts—lacking a unified form and used informally for trade ledgers, religious texts, and poetry—include variants like Lunda (also called Hatavaniki or Hatta Wanki), which evolved as an archaic offshoot resembling early Devanagari but adapted for Sindh's linguistic features. The most prominent Landa-based script for Sindhi, Khudabadi (or Khudawadi), originated in the mercantile hubs of Hyderabad and Khudabad, Sindh, with standardization efforts in the 1860s led by educator Narayan Jagannath Vaidya; it was formally documented and published in 1868 by the Government of Bombay Presidency. 
Comprising 37 consonants, 10 independent vowels, 9 vowel signs, and ancillary marks for a total of 69 glyphs (plus digits), Khudabadi facilitated writing Sindhi's full phonemic range, including implosives and aspirates, and saw application in commerce, early education, and literature such as the 13th-century epic Dodo Chanesar. By the late , British administrative preferences for the Perso-Arabic script—standardized in —marginalized Landa-derived systems, rendering largely obsolete as printing presses and formal schooling prioritized the Arabic adaptation. Post-partition, while dominated in , sporadic revival initiatives for emerged among communities, bolstered by its encoding in the Khudawadi block (U+112B0–U+112FF) approved in 2015, though active usage remains confined to cultural preservation rather than widespread . Both and Landa scripts underscore ongoing debates over script suitability, with indigenous forms like valued for historical authenticity but challenged by the practicality of established systems.
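The Khudawadi block range given above (U+112B0–U+112FF) can be used directly to detect Khudabadi text. The sketch below is a minimal illustration using Python's standard `unicodedata` module; the function names are hypothetical, and the number of named code points it reports depends on the Unicode version bundled with the Python build.

```python
import unicodedata
from typing import List

KHUDAWADI_START = 0x112B0
KHUDAWADI_END = 0x112FF


def in_khudawadi_block(ch: str) -> bool:
    """True if the single character falls inside the Khudawadi block range."""
    return KHUDAWADI_START <= ord(ch) <= KHUDAWADI_END


def assigned_khudawadi_characters() -> List[str]:
    """Characters in the block that the local Unicode database has a name for."""
    return [
        chr(cp)
        for cp in range(KHUDAWADI_START, KHUDAWADI_END + 1)
        if unicodedata.name(chr(cp), "")
    ]


assigned = assigned_khudawadi_characters()
print(len(assigned), "named code points in this build's Unicode data")
print("first:", unicodedata.name(assigned[0]) if assigned else "none")
```

A range check like `in_khudawadi_block` is enough for script detection; rendering the characters still requires a font that covers the block.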

Romanization and historical scripts

Prior to the mid-19th century standardization of the Perso-Arabic script, Sindhi employed indigenous Landa-derived writing systems, including the Khudabadi and Khojki scripts. The Khudabadi script, originating from the Sindhi Hindu goldsmith (Sonara) community in Khudabad around 1550 CE, evolved into a cursive form used for trade records, religious texts, and literature among Hindu Sindhis. It features 52 primary characters, written left-to-right without inherent vowel marks, relying on diacritics for vowels, and was promoted in British-era schools until supplanted. The Khojki script, developed by the Nizari Ismaili Khoja community in the 15th century, served for esoteric religious manuscripts like ginans, incorporating additional characters for Sindhi phonemes absent in standard Arabic scripts. In 1853, the British East India Company administration in Bombay Presidency officially adopted a modified Perso-Arabic script for Sindhi, overriding the Khudabadi script despite its prevalence among the Hindu majority, to align with Muslim usage and administrative efficiency following the 1843 annexation of Sindh. This decision, formalized by 1856, marginalized indigenous scripts; Khudabadi persisted in private Hindu use in India post-Partition but declined due to lack of institutional support. Romanization of Sindhi lacks a single standardized system, with ad hoc transliterations employed for digital input, diaspora communication, and inter-script conversion between Perso-Arabic and Devanagari variants. Proposals for phonetic Roman systems, such as those mapping Sindhi's 48-52 phonemes to Latin letters with diacritics (e.g., "aa" for long /aː/, "bh" for aspirated /bʰ/), emerged in the 20th century for accessibility, particularly among non-literate or bilingual users. A "Standardized Roman Sindhi Script" initiative in 2010 advocated simplified rules for learning and typing, emphasizing consistency in vowel length and retroflex sounds. 
Recent linguistic discussions, as of 2024, recommend Romanization for non-native readers in Pakistan, addressing challenges like inconsistent online romanized text prone to spelling variations.
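One way to see why romanized Sindhi needs consistent rules is to try converting it back to phonemes. The sketch below is a toy greedy tokenizer, not any published scheme: only the two mappings "aa" and "bh" come from the text, the function name `roman_to_phonemes` is hypothetical, and the input strings are made-up letter sequences rather than real words.

```python
# Digraph table: only "aa" (long /aː/) and "bh" (aspirated /bʰ/) are from the text.
DIGRAPHS = {
    "aa": "aː",
    "bh": "bʰ",
}


def roman_to_phonemes(word: str) -> list:
    """Greedy left-to-right scan that prefers two-letter digraphs over single letters."""
    word = word.lower()
    phonemes = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in DIGRAPHS:
            phonemes.append(DIGRAPHS[pair])
            i += 2
        else:
            # Unknown letters pass through unchanged.
            phonemes.append(word[i])
            i += 1
    return phonemes


print(roman_to_phonemes("bhaa"))
print(roman_to_phonemes("baa"))
```

Greedy matching is the simplest design choice here, but it cannot tell a digraph from two adjacent letters that happen to spell it, which is one source of the spelling variation noted for online romanized text.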

Script choice controversies

The choice of script for the Sindhi language has been contentious since the British colonial period, with debates centering on the suitability of Perso-Arabic, Devanagari, and indigenous Landa-derived scripts like Khudabadi for representing Sindhi phonology. In the 1850s, British officials such as Richard Francis Burton advocated for the Perso-Arabic script due to its prevalence among Muslim Sindhis and administrative familiarity, while others like Captain Stack supported Devanagari for its alignment with other Indian languages. These early disagreements highlighted tensions between religious-cultural affiliations and phonetic adequacy, as the Perso-Arabic script required extensive modifications—adding 17 extra characters to reach 52 letters—to accommodate Sindhi's implosive consonants and aspirates absent in standard Arabic. Post-partition in 1947, script selection became intertwined with national identity and migration dynamics. In Pakistan, the Perso-Arabic script was standardized and promoted through reforms by the Sindhi Adabi Board in the 1940s and 1950s, aligning Sindhi with Urdu and reinforcing Islamic linguistic heritage amid efforts to marginalize non-Muslim influences. This shift sidelined indigenous scripts like Khudabadi, which had been used by Hindu Sindhis for centuries and offered a left-to-right orientation better suited to Sindhi's Indic roots, leading to accusations of cultural erasure among Sindhi nationalists and Hindu communities. In India, the government recognized both Perso-Arabic and Devanagari scripts for Sindhi in 1960 under the Official Languages Act, but this dual system fragmented the refugee community's literary continuity, with Devanagari facilitating integration into Hindi-medium education while Perso-Arabic preserved access to pre-partition texts from Sindh. Contemporary controversies persist over practicality, technology, and unification. 
The cursive Perso-Arabic script poses challenges for optical character recognition (OCR) and digital input, with studies noting higher error rates in automated processing compared to angular scripts like Khudabadi. In India, surveys indicate preference for Devanagari among younger Sindhis for its compatibility with national scripts, yet older writers and those maintaining ties to Pakistani literature favor Perso-Arabic, exacerbating a generational and diasporic divide. Efforts to revive Khudabadi, such as advocacy by cultural groups in Pakistan and software bridges converting between scripts, aim to restore phonetic fidelity—Khudabadi's 46-52 characters directly map Sindhi sounds without diacritics—but face resistance due to entrenched habits and lack of institutional support. These debates underscore broader sociolinguistic tensions, where script choice reflects not only orthographic efficiency but also assertions of ethnic autonomy against state-imposed standardization.

Grammatical structure

Nominal morphology

Sindhi nouns inflect for two grammatical genders (masculine and feminine, applicable to both animate and inanimate referents), for number (singular and plural), and for a binary case distinction between direct and oblique forms. Gender assignment is largely predictable from phonological endings: masculine nouns typically end in short vowels such as /u/ or /o/ (e.g., pəṭu 'son'), and feminine nouns in /a/, /i/, or long vowels like /aː/ (e.g., bəhən 'sister'), though semantic and lexical exceptions persist, such as naturally feminine terms like zen 'woman' despite non-standard endings. Inflection occurs via suffix addition, vowel replacement, or occasional morpheme subtraction, affecting the noun stem to signal these categories.

Number marking differs by gender. Masculine singular forms, often ending in /o/, shift to /aː/ in the plural (e.g., ʧʰokro 'boy' → ʧʰokraː 'boys'), while feminine plurals append /-ũ/ or /-un/ to the singular stem (e.g., ʧʰokri 'girl' → ʧʰokrijũ 'girls'; hawaː 'wind' → hawaːũ 'winds'). Some nouns take zero affixation in the plural, retaining the singular form and relying on context, particularly irregular or abstract nouns. Nouns are categorized into concrete (common and proper) and abstract types, but pluralization rules apply uniformly across categories with gender-based variations.

The case system features a direct form for nominative use and an oblique stem for accusative, dative, genitive, and other oblique functions, realized through postpositions attached to the oblique (e.g., kən for ablative 'from'). The masculine singular oblique typically involves vowel replacement or the suffix /-i/ (e.g., ʧʰokro → ʧʰokri-), while the plural oblique adds /-ũ/ or /-in/ to the plural direct (e.g., ʧʰokraː → ʧʰokrũ). Feminine nouns show minimal stem change in the singular oblique, which is often identical to the direct form, but the plural oblique may append /-in/ (e.g., ʧʰokrijũ → ʧʰokrijũ).

The vocative prefixes an interjection such as o or aː to the direct form, varying by gender and familiarity (e.g., o ʧʰokraː 'O boy!'). This yields five functional cases (nominative, accusative-dative, postpositional, genitive, vocative), though stem inflection is structurally binary. The following table illustrates a representative declension paradigm for the masculine noun ʧʰokro 'boy' and feminine ʧʰokri 'girl', using Romanized forms with postpositional examples where relevant:
| Case | Masculine Singular | Masculine Plural | Feminine Singular | Feminine Plural |
|---|---|---|---|---|
| Nominative (Direct) | ʧʰokro | ʧʰokraː | ʧʰokri | ʧʰokrijũ |
| Oblique (e.g., Accusative: + nũ) | ʧʰokri-nũ | ʧʰokrũ-nũ | ʧʰokri-nũ | ʧʰokrijũ-nũ |
| Genitive (Oblique + to) | ʧʰokri-to | ʧʰokrũ-to | ʧʰokri-to | ʧʰokrijũ-to |
| Ablative (Oblique + kən) | ʧʰokri-kən | ʧʰokrũ-kən | ʧʰokri-kən | ʧʰokrijũ-kən |
| Vocative | o ʧʰokraː | o ʧʰokraː | aː ʧʰokri | aː ʧʰokrijũ |
This paradigm reflects standard patterns, with irregularities in categories like VII nouns that resist inflection. Adjectives agree with nouns in gender, number, and case, deriving from the nominal system.
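The paradigm above follows one mechanical pattern: the nominative uses the direct form, and every other case is the oblique stem plus a postposition. The Python sketch below encodes just that pattern for the two example nouns; the table names and the `decline` function are hypothetical, the forms are copied from the paradigm as given, and the vocative is left out because it uses a prefixed interjection rather than a postposition.

```python
# Direct and oblique forms copied from the paradigm; keys are (gender, number).
DIRECT = {
    ("m", "sg"): "ʧʰokro",
    ("m", "pl"): "ʧʰokraː",
    ("f", "sg"): "ʧʰokri",
    ("f", "pl"): "ʧʰokrijũ",
}
OBLIQUE = {
    ("m", "sg"): "ʧʰokri",
    ("m", "pl"): "ʧʰokrũ",
    ("f", "sg"): "ʧʰokri",
    ("f", "pl"): "ʧʰokrijũ",
}
POSTPOSITIONS = {"accusative": "nũ", "genitive": "to", "ablative": "kən"}


def decline(gender: str, number: str, case: str) -> str:
    """Return the direct form for the nominative, else oblique stem + postposition."""
    key = (gender, number)
    if case == "nominative":
        return DIRECT[key]
    return f"{OBLIQUE[key]}-{POSTPOSITIONS[case]}"


for case in ["nominative", "accusative", "genitive", "ablative"]:
    row = [decline(g, n, case) for g in ("m", "f") for n in ("sg", "pl")]
    print(case.ljust(11), row)
```

Keeping the oblique stems in a lookup table rather than deriving them by rule reflects the irregularities the text mentions; a rule-based version would need per-class exceptions.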

Verbal system

The verbal system of Sindhi is characterized by compound constructions typical of Indo-Aryan languages: finite verb forms combine a non-finite participle, which agrees in gender and number with the subject or object, with a copula auxiliary inflected for tense, person, and number. Verbs inflect richly for tense, aspect, mood, person, number, and gender, with suffixes and auxiliaries denoting these categories. Transitive verbs in perfective tenses exhibit split ergativity: the subject is marked in the oblique case with the postposition ne, and the participle agrees with the direct object's gender and number rather than the subject's. In imperfective aspects and non-perfective tenses, the verb instead agrees with the subject.

Aspect distinguishes imperfective (habitual or continuous, marked by affixes like -and- or -ī-) from perfective (completed action, marked by -y-), yielding ten primary aspectual tenses through combination with four copula bases: present āhē, past hō, presumptive hundō, and subjunctive hujē. Present habitual forms use the imperfective participle plus āhē (e.g., likhandō āhē "writes/he is writing"), while continuous aspects incorporate the auxiliary rahaṇu "to stay" (e.g., likhandō rahyo āhē "is continuing to write"). The past perfective employs the perfective participle plus hō (e.g., transitive mā(n) ne khat likhyo hō "I (erg.) wrote the letter," with likhyo agreeing in masculine singular with khat). Future tenses are formed with a future participle (e.g., -iṇō) plus the appropriate copula, and additional forms include the past conditional or counterfactual.

Moods include indicative (default in tensed forms), subjunctive (via the hujē copula for hypothetical or desiderative senses), imperative (bare stem with optional person markers, e.g., halu "go!"), and presumptive (for inference, via hundō). Passive voice derives from future (-ij-, e.g., sikhijaṇu "to be taught") or imperfective (-ibō) stems, often with an auxiliary meaning "to become". Causative verbs insert -ā- into the stem (e.g., sikhāiṇu "to teach" from sikhṇu "to learn").

Non-finite forms encompass the infinitive (-aṇu, e.g., halaṇu "to go"), imperfective participle (-andō), perfective participle (-yalu or -yō), future adjectival (-iṇō), adverbial imperfective (-andē), and conjunctive. Auxiliary verbs like ho- (copula) further specify tense and aspect in compounds, with full inflection across persons, numbers, and genders for simple tenses. Transitivity influences morphology: transitive verbs (e.g., likhṇu "to write") require object agreement in perfectives, while intransitives (e.g., sūmhṇu "to sleep") align with the subject.
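Two of the patterns described above are regular enough to sketch in code: building a present habitual from stem, participle suffix, and copula, and choosing which argument the participle agrees with. The Python below is a simplified illustration under stated assumptions: the function names are hypothetical, only the masculine singular participle in -andō is modeled, and the copula table holds just a subset of the bases named in the text.

```python
# A subset of the copula bases named in the text.
COPULA = {"present": "āhē", "presumptive": "hundō", "subjunctive": "hujē"}


def imperfective_participle(stem: str) -> str:
    """Masculine singular imperfective participle: stem + -andō."""
    return stem + "andō"


def present_habitual(stem: str) -> str:
    """Imperfective participle followed by the present copula."""
    return f"{imperfective_participle(stem)} {COPULA['present']}"


def agreement_controller(transitive: bool, perfective: bool) -> str:
    """Split ergativity: object agreement only for transitive perfective clauses."""
    return "object" if (transitive and perfective) else "subject"


print(present_habitual("likh"))
print(agreement_controller(transitive=True, perfective=True))
print(agreement_controller(transitive=True, perfective=False))
print(agreement_controller(transitive=False, perfective=True))
```

With the stem likh- this reproduces the present habitual form cited in the text; gender and number agreement on the participle would require a fuller table of endings than the text provides.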

Pronominal and numeral forms

Sindhi personal pronouns distinguish three persons, singular and plural number, and, in the third person singular, gender and sometimes proximity distinctions. They lack inherent gender in the first and second persons but exhibit distinct oblique forms, with personal pronouns typically inflected for three cases: nominative, oblique (used with postpositions), and a genitive form derived via suffixes. The third-person pronouns may show four cases, incorporating vocative elements in some analyses. Independent forms are used nominally, while enclitic variants attach to verbs or nouns for emphasis or agreement. The following table lists common personal pronouns in their nominative forms, with romanization in parentheses (Pakistani Sindhi variants predominant):
Person | Singular | Plural
1st | مانْ (mān) or آءُوْ (āū̃) | اسانْ (asān)
2nd | تُوْ (tū̃) | توھانْ (tohān) or توھينْ (tohīn)
3rd Masc. | ھُوْ (hū) | ھِيَ (hī) or ھُوَ (hū̃)
3rd Fem. | ھِيَ (hī) | ھِيَ (hī) or ھُوَ (hū̃)
In the oblique, forms shift: e.g., first singular مانْ becomes مَانْ (mān), attaching postpositions like ٿي (ṭhī, 'by'); third masculine ھُوْ becomes اُنْھِيَ (unhī̃). Possessive pronouns derive from genitive suffixes such as -جو (-jō) for masculine or -جي (-jī) for feminine, e.g., مُونجا (mūnjā, 'mine'). A distinctive feature is the pronominal suffix on verbs, where direct and indirect object pronouns cliticize as suffixes, reflecting a pronominalized verbal construction that is unusual in its complexity. For instance, suffixes like -ان (-ān) for first singular direct object or -تَ (-ta) for second singular indirect object attach to finite verbs, enabling pro-drop and object agreement without independent pronouns: e.g., دِيَانْ (dīyān, 'gave me'). This correlates with ergative alignment in past tenses but varies independently of subject agreement and case marking.

Sindhi employs a decimal numeral system for cardinals, with native Indo-Aryan roots for low numerals and compounds for higher ones (vigesimal influences are absent, unlike in some Dravidian systems). Cardinals precede nouns and agree in gender for 2–4 (masculine forms default in counting). The basic cardinals 1–10 are:
Numeral | Sindhi (Arabic script) | Romanization
1 | ھِڪُ | hiku
2 | ٻَھْ | bha
3 | ٽِي | ṭi
4 | چَارِ | cār
5 | پَنجُ | panj
6 | چَھْ | cha
7 | سَتْ | sat
8 | اَٺْ | aṭh
9 | نَوْ | naw
10 | دَھْ | dah
Higher numbers form via tens (e.g., 20 وِيسُ vis, 30 تِيسُ tis) plus units, or literal compounds like بِيھَرَندْ (bīharand, 21). Ordinals derive by suffixing -يُونْ (-iun) or -يَهْ (-yah) to cardinals, e.g., پَھِرِيُونْ (pahriyun, 'first'), used post-nominally. Zero is ٻُڙِي (buṛi), a non-native borrowing. Numeral agreement follows noun gender, with feminine forms for 2 (ٻِي bī), 3 (ٽِيَهْ ṭiyah), and 4 (چَارِيَهْ cāriyah).
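
As a small illustration of the gender agreement described above, the cardinals 1–10 and the feminine forms for 2–4 can be stored in lookup tables. This is a minimal sketch: the romanizations are copied from this section, while the function name and the fallback behaviour (masculine form when no feminine form is listed) are assumptions made for the example.

```python
# Romanized cardinals 1-10 as listed in the table (default/masculine forms).
CARDINALS = {
    1: "hiku", 2: "bha", 3: "ṭi", 4: "cār", 5: "panj",
    6: "cha", 7: "sat", 8: "aṭh", 9: "naw", 10: "dah",
}

# Feminine forms are given in the text only for 2, 3 and 4.
FEMININE = {2: "bī", 3: "ṭiyah", 4: "cāriyah"}


def cardinal(n: int, feminine: bool = False) -> str:
    """Return the romanized cardinal, using a feminine form where one is listed."""
    if feminine and n in FEMININE:
        return FEMININE[n]
    return CARDINALS[n]


if __name__ == "__main__":
    for n in range(1, 11):
        print(n, cardinal(n), cardinal(n, feminine=True))
```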

Postpositional cases and syntax

Sindhi nouns distinguish between direct and oblique forms to handle core grammatical relations, with the direct form serving nominative (subject) and accusative (direct object) functions, while the oblique form precedes postpositions to mark additional cases such as genitive, dative, locative, ablative, and instrumental. This binary case system aligns with broader Indo-Aryan patterns, where postpositions attach to oblique stems rather than inflecting nouns directly for peripheral cases. Oblique inflection involves stem modifications, primarily vowel alternations conditioned by gender, number, and phonological ending. Masculine singular nouns ending in short -u shift to -a (e.g., qalamu 'pen' → qalama); those in -o: shift to -e (e.g., ghoṛo: 'horse' → ghoṛe). Feminine singulars ending in -a or -i typically remain unchanged, though long-vowel forms like -ī: may insert a short -a for euphony. Plural obliques often add -ūn or -yūn to the stem, with postpositions following to specify the relation. Key postpositions include khē (dative, 'to/toward'), sā̃ (comitative/instrumental, 'with'), mē̃ (locative, 'in/at'), and ablative derivations like kā̃ ('from') formed via suffixation to locative bases. The genitive postposition (jo: or equivalents) is atypical, inflecting adjectivally to agree in gender, number, and case with the possessed noun (e.g., mātar-jo: 'mother's' agrees with a masculine head); for feminine nouns such as the borrowed term يونيورسٽي (yūniwarṣṭī 'university'), which is treated as feminine, the possessive form adds ءَ before جو, yielding يونيورسٽيءَ جو (yūniwarṣṭī-ē̃ jo: 'university's'). These markers total around 11 in common usage, governing oblique nominals exclusively. Syntactically, postpositional phrases are head-final, with the postposition concluding the phrase and the entire unit functioning adverbially or adnominally in subject-object-verb (SOV) clauses. 
Oblique-marked postpositional phrases denote agents in ergative alignments (e.g., perfective transitives, where the subject takes oblique + ergative postposition), while verbs agree in gender and number with the nominative subject or pivot. Word order is flexible for topicalization, but postpositions rigidly follow their nominals, enabling complex embeddings without preposed modifiers disrupting flow.
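
The two masculine singular oblique rules stated above (-u becomes -a, and -o: becomes -e) are regular enough to sketch as string rewrites. The following Python sketch is illustrative only: the function names are hypothetical, the input uses this section's romanization (with ":" marking the long vowel), and nouns matching neither ending are returned unchanged, which is an assumption rather than a claim about Sindhi grammar.

```python
def oblique_masc_sg(noun: str) -> str:
    """Apply the two masculine singular oblique rules given in the text."""
    if noun.endswith("o:"):
        # Long -o: shifts to -e, e.g. ghoṛo: -> ghoṛe
        return noun[:-2] + "e"
    if noun.endswith("u"):
        # Short -u shifts to -a, e.g. qalamu -> qalama
        return noun[:-1] + "a"
    # Assumption for this sketch: other endings are left unchanged.
    return noun


def postpositional_phrase(noun: str, postposition: str) -> str:
    """Head-final phrase: oblique noun first, postposition last."""
    return f"{oblique_masc_sg(noun)} {postposition}"


if __name__ == "__main__":
    for noun in ("qalamu", "ghoṛo:"):
        print(noun, "->", oblique_masc_sg(noun))
    print(postpositional_phrase("qalamu", "mē̃"))
```

Checking the long-vowel ending first keeps the rules from overlapping if further endings are added later.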

Lexical composition

Indigenous roots

The core lexicon of Sindhi, encompassing fundamental concepts such as kinship terms, numerals, body parts, and environmental descriptors, traces its origins to Proto-Indo-Aryan via Middle Indo-Aryan Prakrit dialects spoken in the lower Indus Valley since at least the early centuries BCE. This native vocabulary layer reflects phonological innovations unique to the region, such as the development of implosive consonants in words like bʱīn ('sister', cognate with Sanskrit bhaginī), while preserving semantic continuity with other Northwestern Indo-Aryan languages like Lahnda and Gujarati. Comparative reconstruction, drawing on attested Prakrit texts from the Sindh area, confirms that over 60% of basic Swadesh-list items in modern Sindhi derive directly from this inherited stock, underscoring its resilience amid later admixtures. Local evolution of this indigenous base occurred through phonetic shifts and semantic extensions adapted to the agrarian and fluvial ecology of Sindh, as evidenced by terms like dariyo ('river', from Prakrit dariyā, ultimately Sanskrit nadī). Unlike peripheral loanwords, these roots exhibit high-frequency usage in oral traditions and pre-Islamic folklore, maintaining morphological patterns such as oblique case marking in nouns (e.g., pāṇī 'water' becoming pāṇīyūn in oblique plural). Scholarly analyses attribute minimal substrate influence from hypothetical pre-Indo-Aryan Indus Valley languages to the lexicon, given the overwhelming dominance of Indo-Aryan forms and the absence of deciphered lexical parallels; any such elements, if present, likely pertain to onomatopoeic or specialized agricultural terms rather than core structures. This foundational layer, documented in 19th-century philological surveys, forms the bedrock for Sindhi's grammatical and expressive capacities, independent of exogenous impositions.

Loanwords from Persian, Arabic, and English

The Sindhi lexicon incorporates a substantial number of loanwords from Persian, primarily due to Persian's role as the administrative and cultural medium in Sindh from the 8th century through the Mughal era until the British conquest in 1843. These borrowings encompass nouns, adjectives, verbs, and other parts of speech, often adapted phonologically and morphologically to Sindhi patterns, such as shifting Persian nouns ending in -a to -ō (e.g., darvāzō "gate" from Persian darvāza). Examples include sugun "auspicious sign" from Persian shagun, vecharo "poor fellow" from bechāra, and pahryan "clothing" from pairahan. Verbs like dafnāińu "to bury" derive from Arabic dafn via Persian mediation, while ṭalbańu "to seek" comes from Arabic ṭalab.

Arabic loanwords entered Sindhi directly through the Umayyad conquest of Sindh in 711–712 CE and subsequent Islamic rule, but many arrived indirectly via Persian, particularly in religious, legal, and abstract domains. Common examples include umata "community" from Arabic ummat and hikmat "wisdom" from ḥikmah, reflecting influences on Sufi poetry and everyday Islamic terminology. These terms, numbering in the thousands across languages like Sindhi, often retain Perso-Arabic letters in the Sindhi script to preserve historical spellings, contributing to the language's 52-letter alphabet. The integration preserves Sindhi's core Indo-Aryan structure while enriching semantic fields such as religion and law.

English loanwords proliferated in Sindhi following British colonial rule from 1843 to 1947 and accelerated in the post-independence era due to education, media, and technology. They constitute approximately 9% of vocabulary in analyzed corpora of modern Sindhi texts, primarily in domains like science, transport, and consumer goods, with examples including ڊاڪٽر "doctor," ڪمپيوٽر "computer," ٽيليويزن "television," بس "bus," and موبائيل "mobile." These are often pluralized or inflected per Sindhi grammar, such as messaga for "messages," and appear frequently in print media through code-mixing. Unlike earlier borrowings, English terms show less phonological adaptation, reflecting ongoing globalization rather than deep assimilation.

Semantic fields and evolution

The Sindhi lexicon encompasses semantic fields typical of Indo-Aryan languages, with core domains such as kinship, agriculture, and body parts predominantly featuring indigenous terms derived from Prakrit and earlier Indo-Aryan strata, reflecting the language's ancient roots in the Indus valley. Kinship terminology, for instance, employs indigenous terms such as bap for father, preserving basic familial structures without significant foreign overlay in rural or traditional usage. Agricultural vocabulary, central to Sindh's riverine economy, includes indigenous words for cultivation activities, such as pokh for plowing a field, underscoring the language's adaptation to local agrarian practices over millennia. Body parts and natural phenomena similarly draw from proto-Indo-Aryan bases, maintaining semantic stability in everyday domains despite external contacts. In contrast, semantic fields related to religion, administration, and abstract concepts exhibit heavy borrowing from Arabic and Persian, introduced after the Umayyad conquest of Sindh in 711 AD, which established Arabic as a religious and official medium for approximately three centuries. Terms for Islamic theology and jurisprudence, such as those denoting prayer (namāz) or community (umma), entered via Arabic, reshaping religious semantics while core grammatical structures remained intact. Persian influence intensified during subsequent dynasties, including the Mughals, contributing loans to administrative and cultural fields (e.g., darvāzō for gate or bāzu for hawk), often adapted phonologically to Sindhi patterns, with nouns shifting to vowel endings like -ō in direct forms. This layering enriched high-register speech, particularly in urban and literary contexts, where Persian abstracts supplanted or coexisted with indigenous equivalents. 
The evolution of these fields reflects historical pressures: pre-Islamic Sindhi, evolving from Vrachada Apabhramsa around 600–1000 AD, prioritized concrete, local semantics tied to the Indus region. Post-conquest Islamization drove semantic expansion in spiritual and legal domains, with loans comprising thousands of items by the medieval period, though without altering basic syntax or morphology. Persian dominance as the administrative language until the British conquest in 1843 further stratified the lexicon, evident in Sufi works like Shah Abd al-Latif Bhitai's Risālō (18th century), where mystical and ethical fields blend Persian-derived abstractions with indigenous metaphors. British rule prompted purist efforts to revive Prakrit-derived terms, reducing some Persian density in modern prose, while English introduced technical vocabulary, a process that continued after 1947. Contemporary shifts among urban youth show simplification or English substitutions due to migration, indicating ongoing change without wholesale semantic rupture. Overall, Sindhi's semantic evolution demonstrates resilience in core fields amid accretive borrowing, yielding a hybrid where indigenous roots anchor daily usage and loans delineate elite or specialized domains.

Modern usage and technology

Literature and media

Sindhi literature emerged prominently through Sufi poetry during the medieval period, with Shah Abdul Latif Bhittai (1689–1752) authoring Shah Jo Risalo, a compilation of verses fusing mystical themes of divine love, local folklore such as the tales of Sasui Punhun and Sohni Mahiwal, and Indo-Islamic spiritual elements, establishing it as the foundational text of the tradition. Classical poets like Sachal Sarmast (1739–1829) and Sami (1743–1850) further advanced Sufi expression in verse, emphasizing spiritual ecstasy and cultural motifs. The modern era, initiated after the British conquest of Sindh in 1843, saw the rise of prose alongside poetry. Pioneering prose writers included Kauromal Khilnani (1844–1916), whose Ratnavali (1888) explored historical narratives, and Mirza Qalich Beg (1853–1929), known for Zinat (1890) and contributions to fiction and drama. Poets such as Kishin Chand Bewas (1885–1947) with Shirin Shair (1929) and Shaikh Ayaz (1923–1998) with Baghi (1945) addressed social reform, nationalism, and modernism, while Hyder Baksh Jatoi (1901–1970) produced influential works like Dariya Sah (1925). In post-Partition India, Sindhi writers like Krishin Rahi and Hari Dilgir focused on displacement and identity in poetry and prose. Sindhi print media thrives primarily in Pakistan, where Daily Kawish, established in 1991 by Kazi Aslam Akbar, holds the largest circulation among Sindhi dailies, alongside Daily Sindh Times and Pahenji Akhbar. In India, periodicals such as Sindhi Times sustain community readership among the diaspora. Electronic media includes Sindhi-language television channels in Pakistan licensed by PEMRA, such as KTN News, Mehran TV, and Sindh TV, offering news, cultural programs, and dramas. Radio services, including those from Pakistan Broadcasting Corporation stations, broadcast Sindhi content on folklore, music, and current affairs, while limited digital platforms extend reach to global audiences.

Digital tools and NLP advancements

Sindhi benefits from Unicode support: the letters of its Arabic-based script are encoded in the Arabic block (U+0600–U+06FF), which accommodates its orthographic needs, including Heh Doachashmee (U+06BE) for distinct phonetic rendering. This enables consistent digital representation across platforms, though full locale-specific implementations remain limited in many operating systems. Input methods have advanced with phonetic keyboards like the MBSindhi layout, originally designed by Abdul-Majid Bhurgri in 1988 for Macintosh and later standardized by the Sindhi Language Authority, available via Keyman software for Unicode-compliant typing. Mobile applications such as Easy Sindhi Keyboard support phonetic entry on mobile platforms such as Android, facilitating real-time composition in messaging and web applications. Online tools, including virtual keyboards on platforms like Branah, allow input without software installation. Fonts tailored for Sindhi, such as those recommended by the South Asia Language Resource Center, ensure proper glyph rendering for pedagogy and web use, with Unicode compliance promoting interoperability. Government and institutional initiatives, like the Indian Language Digital Corpus's free Unicode fonts and C-DAC's Unicode Typing Tool, provide standardized resources for content creation in Sindhi across Windows environments. These tools address historical challenges in script digitization, stemming from Sindhi's extended Arabic character set, enabling broader adoption in digital publishing and education.

In natural language processing (NLP), Sindhi has seen corpus development, including a corpus exceeding 1 million words annotated for parts of speech and morphology, supporting tasks like sentiment detection and language variation studies. Specialized resources, such as the Sindhi WordNet with synsets for semantic disambiguation and the 61-million-word corpus derived from web sources, facilitate vector-based models for tasks such as word analogies. 
Machine-learning applications include a parts-of-speech tagger using neural networks, achieving higher accuracy on morphologically rich Sindhi text compared to rule-based predecessors. Speech-to-text prototypes process audio inputs via recognition engines tailored to Sindhi, converting spoken forms to editable text for applications like transcription. Open-access initiatives, including AMBILE's curated corpora, promote further innovation in low-resource NLP for Sindhi. Public repositories aggregating such resources, like those from Sindhi-NLP projects, underscore community-driven progress amid sparse commercial support.
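
To make the encoding discussion concrete, Python's standard unicodedata module can report the code point and official character name of each letter in a Sindhi string. This is a minimal sketch: the helper names are hypothetical, the sample word is the language name سِنڌِي written with explicit escapes, and the block check covers only the basic Arabic block (U+0600–U+06FF), which is an assumption that happens to hold for this sample.

```python
import unicodedata

# Basic Arabic block, U+0600 through U+06FF inclusive.
ARABIC_BLOCK = range(0x0600, 0x0700)


def describe(text: str) -> list[tuple[str, str]]:
    """Return (code point, Unicode name) pairs for every character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN")) for ch in text]


def in_arabic_block(text: str) -> bool:
    """True if every character falls inside the basic Arabic block."""
    return all(ord(ch) in ARABIC_BLOCK for ch in text)


if __name__ == "__main__":
    word = "\u0633\u0650\u0646\u068c\u0650\u064a"  # سِنڌِي "Sindhi"
    for code_point, name in describe(word):
        print(code_point, name)
    print("All in Arabic block:", in_arabic_block(word))
    # Heh Doachashmee, used in Sindhi orthography:
    print(describe("\u06be"))
```

A check like this is a quick way to spot text that mixes look-alike letters from other blocks, a common source of search and sorting problems in Arabic-script corpora.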

Advocacy efforts and policy pushes

The Sindhi Language Authority (SLA) in Pakistan, operating under provincial legislation from the 1972 Sindh Assembly acts on teaching and promotion, coordinates key advocacy for expanded Sindhi usage. In February 2025, the SLA inaugurated a nationwide awareness campaign demanding Sindhi's elevation to national language status alongside Punjabi and Siraiki, the formation of a federal Language Commission, and mandatory implementation in official correspondence, courts, and public signage to counter Urdu's dominance. This push reflects ongoing provincial efforts to integrate Sindhi into broader governance, building on earlier mandates like the Sindh Education Department's 2022 directive requiring private schools to teach Sindhi as a compulsory subject from primary levels. Diaspora organizations amplify these domestic campaigns internationally. The World Sindhi Congress, focused on human rights for Sindhis, advocates for cultural preservation including language rights through global lobbying and initiatives. Similarly, a Sindhi diaspora association has pushed for accurate census representation of Sindhi speakers in events like its 2023 conference on Pakistan's census from a Sindh perspective, aiming to bolster demographic arguments for policy protections. In the United States, bipartisan congressional support in July 2022 secured funding for the launch of Voice of America's Sindhi-language service, enhancing external media access and cultural advocacy amid concerns over information suppression. In India, where Sindhi gained constitutional recognition via the 21st Amendment in 1967 adding it to the Eighth Schedule, advocacy centers on educational and media expansion despite speaker decline. The National Council for the Promotion of Sindhi Language reviewed strategic goals for preservation in a July 2025 meeting, emphasizing integration into curricula and digital resources. 
A 2024 Supreme Court petition by an NGO for a dedicated 24-hour Doordarshan Sindhi channel was dismissed, with justices noting alternative preservation methods like community programs suffice without state broadcasting mandates. These efforts highlight tensions between federal multilingual policies and practical usage, with Sindhi's official status not fully translating to widespread institutional adoption.
