Sotho phonology
from Wikipedia

The phonology of Sesotho and those of the other Sotho–Tswana languages are radically different from those of "older" or more "stereotypical" Bantu languages. Modern Sesotho in particular has very mixed origins (due to the influence of Difaqane refugees) inheriting many words and idioms from non-Sotho–Tswana languages.

There are in total 39 consonantal phonemes[1] (plus 2 allophones) and 9 vowel phonemes (plus two close raised allophones). The consonants include a rich set of affricates and palatal and postalveolar consonants, as well as three click consonants.

Historical sound changes

Probably the most radical sound innovation in the Sotho–Tswana languages is that the Proto-Bantu prenasalized consonants have become simple stops and affricates.[2] Thus isiZulu words such as entabeni ('on the mountain'), impuphu ('flour'), ezinkulu ('the big ones'), ukulanda ('to fetch'), ukulamba ('to become hungry'), and ukuthenga ('to buy') are cognates to Sesotho [tʰɑbeŋ̩] thabeng, [pʰʊfʊ] phofo, [t͡sʼexʊlʊ] tse kgolo, [hʊlɑtʼɑ] ho lata, [hʊlɑpʼɑ] ho lapa, and [hʊʀɛkʼɑ] ho reka, respectively (with the same meanings).

This is further intensified by the law of nasalization and nasal homogeneity, making derived and imported words have syllabic nasals followed by homogeneous consonants, instead of prenasalized consonants.

Another important sound change in Sesotho which distinguishes it from almost all other Sotho–Tswana languages and dialects is the chain shift from /x/ and /k͡xʰ/ to /h/ and /x/ (the shift of /k͡xʰ/ to /x/ is not yet complete).

In certain respects, however, Sesotho is more conservative than other Sotho–Tswana languages. For example, the language still retains the difference in pronunciation between /ɬ/, /t͡ɬʰ/, and /tʰ/.[3] Many other Sotho–Tswana languages have lost the fricative /ɬ/, and some Northern Sotho languages, possibly influenced by Tshivenda, have also lost the lateral affricate and pronounce all three historical consonants as /tʰ/ (they have also lost the distinction between /t͡ɬ/ and /t/ — thus, for example, speakers of the Northern Sotho language commonly called Setlokwa call their language "Setokwa").[4]

The existence of (lightly) ejective consonants (all unvoiced unaspirated stops) is highly unusual for a Bantu language and is thought to be due to Khoisan influence. These consonants occur in the Sotho–Tswana and Nguni languages (being over four times more common in Southern Africa than anywhere else in the world), and the ejective quality is strongest in isiXhosa, which has been greatly influenced by Khoisan phonology.

As with most other Bantu languages, almost all palatal and postalveolar consonants are due to some form of palatalization or other related phenomena which result from a (usually palatal) approximant or vowel being "absorbed" into another consonant (with a possible subsequent nasalization).

The Southern Bantu languages have lost the Bantu distinction between long and short vowels. In Sesotho the long vowels have simply been shortened without any other effects on the syllables; while sequences of two dissimilar vowels have usually resulted in the first vowel being "absorbed" into the preceding consonant, and causing changes such as labialization and palatalization.

As with most Southern African Bantu languages, the "composite" or "secondary" vowels *e and *o have become /ɛ, e/ and /ɔ, o/. These usually behave as two phonemes (conditioned by vowel harmony), although there are enough exceptions to justify the claim that they have become four separate phonemes in the Sotho–Tswana languages.

Additionally, the first-degree (or "superclose", "heavy") and second-degree vowels have not merged as in many other Bantu languages, resulting in a total of 9 phonemic vowels.

Almost uniquely among the Sotho–Tswana languages, Sesotho has adopted clicks.[5] There is one place of articulation, alveolar, and three manners and phonations: tenuis, aspirated, and nasalized. These most probably came with loanwords from the Khoisan and Nguni languages, though they also exist in various words which don't exist in these languages and in various ideophones.

These clicks also appear in environments which are rare or non-existent in the Nguni and Khoisan languages, such as a syllabic nasal followed by a nasalized click ([ŋ̩ǃn] written ⟨nnq⟩, as in [ŋ̩ǃnɑnɪ] nnqane 'that other side'), a syllabic nasal followed by a tenuis click ([ŋ̩ǃ], also written ⟨nq⟩, as in [sɪŋ̩ǃɑŋ̩ǃɑnɪ] senqanqane 'frog'; this is not the same as the prenasalized radical click written ⟨nkq⟩ in the Nguni languages),[clarification needed] and a syllabic nasal followed by an aspirated click ([ŋ̩ǃʰ] written ⟨nqh⟩, as in [sɪǃʰɪŋ̩ǃʰɑ] seqhenqha 'hunk').

Vowels

Sesotho has a large inventory of vowels compared with many other Bantu languages[citation needed]. However, the nine phonemic vowels are collapsed into only five letters in the Sesotho orthography. The two close vowels i and u (sometimes called "superclose" or "first-degree" by Bantuists)[citation needed] are very high (with advanced tongue root) and are better approximated by French vowels than English vowels[according to whom?]. That is especially true for /u/, which in English is often noticeably more fronted[citation needed] and can be transcribed as [u̟] or [ʉ] in the IPA; this fronting is absent from Sesotho (and French).

Vowels[6]
IPA Example Approximate English equivalent
/i/ [huˌbit͡sʼɑ] ho bitsa ('to call') beet
/u/ [tʼumɔ] tumo ('fame') boot
/ɪ/ [hʊlɪkʼɑ] ho leka ('to attempt') pit
/ʊ/ [pʼʊt͡sʼɔ] potso ('query') put
/e/ [hʊʒʷet͡sʼɑ] ho jwetsa ('to tell') cafe
/o/ [pʼon̩t͡sʰɔ] pontsho ('proof') yawn (RP, SAE)
/ɛ/ [hʊʃɛbɑ] ho sheba ('to look') bed
/ɔ/ [mʊŋɔlɔ] mongolo ('writing') board
/ɑ/ [hʊˈɑbɛlɑ] ho abela ('to distribute') spa
Approximate tongue positions for the 9 vowels, from Doke & Mofokeng (1974:?)

Consonants

The Sotho–Tswana languages are peculiar among the Bantu family in that most do not have any prenasalized consonants and have a rather large number of heterorganic compounds. Sesotho, uniquely among the recognised and standardised Sotho–Tswana languages, also has click consonants, which were acquired from Khoisan and Nguni languages.

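Consonant inventory (by manner, with place of articulation)
Click (postalveolar): glottalized ᵏǃʼ; aspirated ᵏǃʰ; nasal ᵑǃ
Nasal: m (labial), n (alveolar), ɲ (palatal), ŋ (velar)
Stop, ejective: pʼ (labial), tʼ (alveolar), kʼ (velar)
Stop, aspirated: pʰ (labial), tʰ (alveolar), kʰ (velar)
Stop, voiced: b (labial), (d) (alveolar; see note 1)
Affricate, ejective: tsʼ (alveolar), tɬʼ (alveolar lateral), tʃʼ (postalveolar)
Affricate, aspirated: tsʰ (alveolar), tɬʰ (alveolar lateral), tʃʰ (postalveolar), kxʰ ~ x (velar)
Fricative, voiceless: f (labial), s (alveolar), ɬ (alveolar lateral), ʃ (postalveolar), h ~ ɦ (glottal)
Fricative, voiced: ʒ ~ dʒ (postalveolar)
Approximant: l (alveolar lateral), j (palatal), w (labial-velar)
Trill: r (alveolar), ʀ (uvular)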
  1. [d] is an allophone of /l/, occurring only before the close vowels (/i/ and /u/). Dialectal evidence shows that in the Sotho–Tswana languages /l/ was originally pronounced as a retroflex flap [ɽ] before the two close vowels.

Sesotho makes a three-way distinction between lightly ejective, aspirated and voiced stops in several places of articulation.

Stops
Place of articulation IPA Notes Orthography Example
bilabial /pʼ/ unaspirated: spit p [pʼit͡sʼɑ] pitsa ('cooking pot')
/pʰ/   ph [pʰupʼut͡sʼɔ] phuputso ('investigation')
/b/ this consonant is fully voiced b [lɪbɪsɪ] lebese ('milk')
alveolar /tʼ/ unaspirated: stalk t [bʊtʼɑlɑ] botala ('greenness')
/tʰ/   th [tʰɑʀʊl̩lɔ] tharollo ('solution')
[d] an allophone of /l/, only occurring before the close vowels (/i/ and /u/) d [muˌdimʊ] Modimo ('God')
velar /kʼ/ unaspirated: skill k [buˌˈikʼɑʀɑbɛlɔ] boikarabelo ('responsibility')
/kʰ/ fully aspirated: kill; occurring mostly in old loanwords from Nguni languages and in ideophones kh [lɪkʰɔkʰɔ] lekhokho ('pap baked onto the pot')

Sesotho possesses four simple nasal consonants. All of these can be syllabic and the syllabic velar nasal may also appear at the end of words.

Nasals
Place of articulation IPA Notes Orthography Example
bilabial /m/   m [hʊmɑmɑʀet͡sʼɑ] ho mamaretsa ('to glue')
/m̩/ syllabic version of the above m [m̩pɑ] mpa ('stomach')
alveolar /n/   n [lɪnɑnɛˈɔ] lenaneo ('programme')
/n̩/ syllabic version of the above n [n̩nɑ] nna ('I')
alveolo-palatal /ɲ/ a bit like Spanish el niño ny [hʊɲɑlɑ] ho nyala ('to marry')
/ɲ̩/ syllabic version of the above n [ɲ̩ɲeʊ] nnyeo ('so-and-so')
velar /ŋ/ can occur initially ng [lɪŋɔlɔ] lengolo ('letter')
/ŋ̩/ syllabic version of the above n [hʊŋ̩kʼɑ] ho nka ('to take')

The following approximants occur. All instances of /w/ and /j/ most probably come from original close /ʊ/, /ɪ/, /u/, and /i/ vowels or Proto-Bantu *u, *i, *û, and *î (under certain circumstances).

Note that when ⟨w⟩ appears as part of a syllable onset this actually indicates that the consonant is labialized.

Approximants
Place of articulation IPA Notes Orthography Example
labial-velar /w/   w [sɪwɑ] sewa ('epidemic')
lateral /l/ never occurs before close vowels (/i/ and /u/), where it becomes [d] l [sɪlɛpʼɛ] selepe ('axe')
/l̩/ a syllabic version of the above; note that if the sequence [l̩l] is followed by the close [i] or [u] then the second [l] is pronounced normally, not as a [d] l [mʊl̩lɔ] mollo ('fire')
palatal /j/   y [hʊt͡sʼɑmɑjɑ] ho tsamaya ('to walk')

The following fricatives occur. The glottal fricative is often voiced between vowels, making it barely noticeable.[7] The alternative orthography used for the velar fricative is due to some loanwords from Afrikaans and ideophones which were historically pronounced with velar fricatives, distinct from the velar affricate. The voiced postalveolar affricate sometimes occurs as an alternative to the fricative.

Fricatives
Place of articulation IPA Notes Orthography Example
labiodental /f/   f [huˌfumɑnɑ] ho fumana ('to find')
alveolar /s/   s [sɪsʊtʰʊ] Sesotho
postalveolar /ʃ/   sh [mʊʃʷɛʃʷɛ] Moshweshwe ('Moshoeshoe I')
/ʒ/   j [mʊʒɑlɪfɑ] mojalefa ('heir')
lateral /ɬ/   hl [hʊɬɑɬʊbɑ] ho hlahloba ('to examine')
velar /x/ also written ⟨g⟩ in Gauta ('Gauteng') [xɑˈutʼɑ] and in some ideophones such as gwa ('of extreme whiteness') [xʷɑ] kg [sɪxɔ] sekgo ('spider')
glottal /h/ h [hʊˈɑhɑ] ho aha ('to build')

There is one trill consonant. Originally, this was an alveolar rolled lingual, but today most individuals pronounce it at the back of the tongue, usually at the uvular position. The uvular pronunciation is largely attributed to the influence of French missionaries at Morija in Lesotho. Just like the French version, the position of this consonant is somewhat unstable and often varies even in individuals, but it generally differs from the "r"'s of most other South African language communities. The most stereotypical French-like pronunciations are found in certain rural areas of Lesotho, as well as some areas of Soweto (where this has affected the pronunciation of Tsotsitaal).

Trill
Place of articulation IPA Notes Orthography Example
alveolar /r/ can also be a tap, similar to the Spanish perro r [ke.a u ɾata] kea o rata ('I love you')
uvular /ʀ/ soft Parisian-type r r [muˌʀiʀi] moriri ('hair')

Sesotho has a relatively large number of affricates. The velar affricate, which was standard in Sesotho until the early 20th century, now only occurs in some communities as an alternative to the more common velar fricative.[8]

Affricates
Place of articulation IPA Notes Orthography Example
alveolar /t͡sʼ/   ts [hʊt͡sʼʊkʼʊt͡sʼɑ] ho tsokotsa ('to rinse')
/t͡sʰ/ aspirated tsh [hʊt͡sʰʊhɑ] ho tshoha ('to become frightened')
lateral /t͡ɬʼ/   tl [hʊt͡ɬʼɑt͡sʼɑ] ho tlatsa ('to fill')
/t͡ɬʰ/ occurs only as a nasalized form of hl or as an alternative to it[3] tlh [t͡ɬʰɑhɔ] tlhaho ('nature')
postalveolar /t͡ʃʼ/   tj [ɲ̩t͡ʃʼɑ] ntja ('dog')
/t͡ʃʰ/   tjh [hʊɲ̩t͡ʃʰɑfɑt͡sʼɑ] ho ntjhafatsa ('to renew')
/d͡ʒ/ this is an alternative to the fricative /ʒ/ j [hʊd͡ʒɑ] ho ja ('to eat')
velar /k͡xʰ/ alternative to the velar fricative kg [k͡xʰɑlɛ] kgale ('a long time ago')

The following click consonants occur.[9] In common speech they are sometimes substituted with dental clicks. Even in standard Sesotho the nasal click is usually substituted with the tenuis click. ⟨nq⟩ is also used to indicate a syllabic nasal followed by an ejective click (/ŋ̩ǃkʼ/), while ⟨nnq⟩ is used for a syllabic nasal followed by a nasal click (/ŋ̩ǃŋ/).

Clicks
Place of articulation IPA Notes Orthography Example
postalveolar /ǃkʼ/ ejective[citation needed] q [hʊǃkʼɔǃkʼɑ] ho qoqa ('to chat')
/ᵑǃ/ nasal; this is often pronounced as an ejective click nq [hʊᵑǃʊsɑ] ho nqosa ('to accuse')
/ǃʰ/ aspirated qh [lɪǃʰekʼu] leqheku ('an elderly person')

The following heterorganic compounds occur. They are often substituted with other consonants, although there are a few instances when some of them are phonemic and not just allophonic. These are not considered consonant clusters.

In non-standard speech these may be pronounced in a variety of ways. bj may be pronounced /bj/ (followed by a palatal glide) and pj may be pronounced /pjʼ/. pj may also sometimes be pronounced /ptʃʼ/, which may alternatively be written ptj, though this is not to be considered standard.

Heterorganic compounds
Place of articulation IPA Notes Orthography Example
bilabial-palatal /pʃʼ/ alternative tj pj [hʊpʃʼɑt͡ɬʼɑ] ho pjatla ('to cook well')
/pʃʰ/ aspirated version of the above; alternative tjh pjh [m̩pʃʰe] mpjhe ('ostrich')
/bʒ/ alternative j bj [hʊbʒɑʀɑnɑ] ho bjarana ('to break apart')
labiodental-palatal /fʃ/ only found in short passives of verbs ending with [fɑ] fa; alternative sh fj [hʊbɔfʃʷɑ] ho bofjwa ('to be tied')

Syllable structure

Sesotho syllables tend to be open, with syllabic nasals and the syllabic approximant l also allowed. Unlike almost all other Bantu languages, Sesotho does not have prenasalized consonants (NC).

  1. The onset may be any consonant (C), a labialized consonant (Cw), an approximant (A), or a vowel (V).
  2. The nucleus may be a vowel, a syllabic nasal (N), or the syllabic l (L).
  3. No codas are allowed.

The possible syllables are:

  • V ho etsa ('to do') [hʊˈet͡sʼɑ]
  • CV fi! ('ideophone of sudden darkness') [fi]
  • CwV ho tswa ('to emerge') [hʊt͡sʼʷɑ]
  • AV wena ('you') [wɛnɑ]
  • N nna ('I') [n̩nɑ]
  • L lebollo ('circumcision rite') [lɪbʊl̩lɔ]

Note that heterorganic compounds count as single consonants, not consonant clusters.

Additionally, the following phonotactic restrictions apply:

  1. A consonant may not be followed by the palatal approximant /j/ (i.e. C+y is not a valid onset).[10]
  2. Neither the labio-velar approximant /w/ nor a labialized consonant may be followed by a back vowel at any time.
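The two restrictions above can be checked mechanically. The following is a minimal Python sketch, assuming each syllable has already been split into an onset and a nucleus written in IPA. The function name `violations` and the labels it returns are illustrative only and do not come from any existing tool.

```python
# Back vowels of Sesotho, as listed in the vowel inventory.
BACK_VOWELS = {"u", "ʊ", "o", "ɔ"}


def violations(onset: str, nucleus: str) -> list[str]:
    """Return labels for the phonotactic restrictions one syllable breaks."""
    broken = []
    # Restriction 1: a consonant may not be followed by /j/ in the onset.
    if len(onset) > 1 and onset.endswith("j"):
        broken.append("C+j onset")
    # Restriction 2: /w/ or a labialized consonant may not precede a back vowel.
    if (onset == "w" or onset.endswith("ʷ")) and nucleus in BACK_VOWELS:
        broken.append("labial onset before back vowel")
    return broken


print(violations("t͡sʼʷ", "ɑ"))  # onset and nucleus of the last syllable of ho tswa
print(violations("t͡sʼʷ", "ɔ"))  # the same onset before a back vowel
```

The first call returns an empty list and the second flags the labialized onset, in line with derived nouns such as [lɪt͡sʼɔ] letso, where the labialization is absent before /ɔ/.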

Syllabic l occurs only due to a vowel being elided between two l's:

[mʊlɪlɔ] *molelo (Proto-Bantu *mu-dido) > [mʊl̩lɔ] mollo ('fire') (cf Setswana molelo, isiZulu umlilo)
[hʊlɪlɑ] *ho lela (Proto-Bantu *-dida) > [hʊl̩lɑ] ho lla ('to cry') (cf Setswana go lela, isiXhosa ukulila, Tshivenda u lila)
isiZulu ukuphuma ('to emerge') > ukuphumelela ('to succeed') > Sesotho [hʊpʰʊmɛl̩lɑ] ho phomella

There are no contrastive long vowels in Sesotho, the rule being that juxtaposed vowels form separate syllables (which may sound like long vowels with undulating tones during natural fast speech).[11] Originally there might have been a consonant between the vowels (Proto-Bantu *g, and sometimes *j) which prevented coalescence or other phonological processes before it was eventually elided.

Other Bantu languages have rules against vowel juxtaposition, often inserting an intermediate approximant if necessary.

Sesotho [xɑˈutʼeŋ̩] Gauteng ('Gauteng') > isiXhosa Erhawudeni

Phonological processes

Vowels and consonants very often influence one another, resulting in predictable sound changes. Most of these changes are either vowels changing vowels, nasals changing consonants, or approximants changing consonants. The sound changes are nasalization, palatalization, alveolarization, velarization, vowel elision, vowel raising, and labialization. Sesotho nasalization and vowel raising are particularly unusual since, unlike most processes in most languages, they actually decrease the sonority of the phonemes.

Nasalization (alternatively Nasal permutation or Strengthening) is a process in Bantu languages by which, in certain circumstances, a prefixed nasal becomes assimilated to a succeeding consonant and causes changes in the form of the phone to which it is prefixed. In the Sesotho language series of articles it is indicated by ⟨N⟩.

In Sesotho it is a fortition process and usually occurs in the formation of class 9 and 10 nouns, in the use of the objectival concord of the first person singular, in the use of the adjectival and enumerative concords of some noun classes, and in the forming of reflexive verbs (with the reflexive prefix).

Very roughly speaking, voiced consonants become devoiced and fricatives (except /x/ [12]) lose their fricative quality.

Vowels and the approximant /w/ get a /kʼ/ in front of them.[13] The other changes are:

  • Voiced stops become ejective:
    /b/ > /pʼ/
    /l/ > /tʼ/
  • Fricatives become aspirated:
    /f/ > /pʰ/
    /ʀ/ > /tʰ/
    /s/ > /t͡sʰ/
    /ʃ/ > /t͡ʃʰ/
    /ɬ/ > /t͡ɬʰ/ (except with adjectives)
  • /h/ becomes /x/
  • /ʒ/ becomes /t͡ʃʼ/

The syllabic nasal causing the change is usually dropped, except for monosyllabic stems and the first person objectival concord. Reflexive verbs don't show a nasal.

[hʊˈɑʀɑbɑ] ho araba ('to answer') > [kʼɑʀɑbɔ] karabo ('response'), [hʊŋ̩kʼɑʀɑbɑ] ho nkaraba ('to answer me'), and [huˌˈikʼɑʀɑbɑ] ho ikaraba ('to answer oneself')
[hʊfɑ] ho fa ('to give') > [m̩pʰɔ] mpho ('gift'), [hʊm̩pʰɑ] ho mpha ('to give me'), and [huˌˈipʰɑ] ho ipha ('to give oneself')
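The stem-initial changes listed above amount to a lookup table. As a rough sketch in Python, assuming stems are given in IPA and ignoring the adjective exception for /ɬ/, the mapping could be applied as follows. The names `NASAL_PERMUTATION` and `nasalize` are hypothetical, and the function only rewrites the first segment of the stem; it does not add or drop the syllabic nasal.

```python
# Stem-initial outcomes under nasalization (N), copied from the list above.
NASAL_PERMUTATION = {
    "b": "pʼ", "l": "tʼ",                      # voiced stops become ejective
    "f": "pʰ", "ʀ": "tʰ", "s": "t͡sʰ",         # fricatives become aspirated
    "ʃ": "t͡ʃʰ", "ɬ": "t͡ɬʰ",
    "h": "x", "ʒ": "t͡ʃʼ",
}
VOWELS = set("iɪeɛɑɔoʊu")


def nasalize(stem: str) -> str:
    """Apply nasal permutation to the first segment of an IPA stem."""
    if not stem:
        return stem
    first = stem[0]
    if first in VOWELS or first == "w":
        # Vowels and /w/ get a /kʼ/ in front of them.
        return "kʼ" + stem
    if first in NASAL_PERMUTATION:
        return NASAL_PERMUTATION[first] + stem[1:]
    # Anything else (for example a stop that is already voiceless) is left alone.
    return stem


print(nasalize("ɑʀɑbɑ"))  # stem of ho araba
print(nasalize("fɑ"))     # stem of ho fa
```

The two calls reproduce the stems seen in [huˌˈikʼɑʀɑbɑ] ho ikaraba and [huˌˈipʰɑ] ho ipha.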

Other changes may occur due to contractions in verb derivations:

[hʊbɔnɑ] ho bona ('to see') > [hʊbon̩t͡sʰɑ] ho bontsha ('to cause to see') (causative [bɔn] -bon- + [isɑ] -isa)

Nasal homogeneity consists of two points:

  1. When a consonant is preceded by a (visible or invisible) nasal it will undergo nasalization, if it supports it.
  2. When a nasal is immediately followed by another consonant with no vowel between them, the nasal will change to a nasal in the same approximate position as the following consonant, after the consonant has undergone nasal permutation. If the consonant is already a nasal then the previous nasal will simply change to the same.
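The second point can be illustrated with a small Python sketch that picks a syllabic nasal from the first segment of the following onset. The place classes used here are an assumption inferred from forms cited in this article ([m̩pʰɔ], [n̩nɑ], [ɲ̩t͡ʃʼɑ], [hʊŋ̩kʼɑ], [ŋ̩ǃ]); the function name `homorganic_nasal` is illustrative.

```python
SYLLABIC = "\u0329"           # combining mark that makes a consonant syllabic
POSTALVEOLAR = "t\u0361ʃ"     # t͡ʃ, written with an explicit tie bar


def homorganic_nasal(onset: str) -> str:
    """Return the syllabic nasal matching the place of the following onset."""
    if not onset:
        raise ValueError("onset must not be empty")
    if onset.startswith((POSTALVEOLAR, "ɲ")):
        return "ɲ" + SYLLABIC
    if onset[0] in "pbm":      # bilabials
        return "m" + SYLLABIC
    if onset[0] in "kxŋǃ":     # velars and, per the cited forms, clicks
        return "ŋ" + SYLLABIC
    return "n" + SYLLABIC      # assumed default: alveolar


# Rebuild [m̩pʰɔ] mpho from the permuted stem pʰɔ.
print(homorganic_nasal("pʰɔ") + "pʰɔ")
```

Anything not covered by the explicit classes falls through to the alveolar nasal, which is a simplification rather than a claim about every possible onset.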


Palatalization is a process in certain Bantu languages where a consonant becomes a palatal consonant.

In Sesotho it usually occurs with the short form of passive verbs and the diminutives of nouns, adjectives, and relatives.

  • Labials:
    /pʼ/ > /pʃʼ/ / /t͡ʃʼ/
    /pʰ/ > /pʃʰ/ / /t͡ʃʰ/
    /b/ > /bʒ/ / /ʒ/
    /f/ > /fʃ/ / /ʃ/
  • Alveolars:
    /tʼ/ > /t͡ʃʼ/
    /tʰ/ > /t͡ʃʰ/
    /l/ > /ʒ/
  • The nasals become /ɲ/:
    /n/, /m/, and /ŋ/ > /ɲ/

For example:

[hʊlɪfɑ] ho lefa ('to pay') > [hʊlɪfʃʷɑ] ho lefjwa / [hʊlɪʃʷɑ] ho leshwa ('to be paid')
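As an illustration of how the labial rows of the list apply in the short passive, here is a minimal Python sketch. It assumes the verb root is given in IPA without its final vowel, and that roots ending in any other consonant are simply labialized; velarization of nasals is not handled. The names `PALATALIZED` and `short_passives` are hypothetical.

```python
# Labial consonants and their two attested palatalized outcomes (see list above).
PALATALIZED = {
    "pʼ": ("pʃʼ", "t͡ʃʼ"),
    "pʰ": ("pʃʰ", "t͡ʃʰ"),
    "b": ("bʒ", "ʒ"),
    "f": ("fʃ", "ʃ"),
}


def short_passives(root: str) -> list[str]:
    """Return the possible short passive forms of an IPA verb root."""
    for final, outcomes in PALATALIZED.items():
        if root.endswith(final):
            base = root[: -len(final)]
            # Each palatalized variant is labialized and takes the final vowel.
            return [base + consonant + "ʷɑ" for consonant in outcomes]
    # Simplifying assumption: other finals are labialized without further change.
    return [root + "ʷɑ"]


print(short_passives("lɪf"))  # root of ho lefa
```

For the root of ho lefa this yields the two variants given in the example above, ho lefjwa and ho leshwa.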


Alveolarization is a process whereby a consonant becomes an alveolar consonant. It occurs in noun diminutives, the diminutives of colour adjectives, and in the pronouns and concords of noun classes with a [di] di- or [di] di[N]- prefix. This results in either /t͡sʼ/ or /t͡sʰ/.

  • /pʼ/, /b/, and /l/ become /t͡sʼ/
  • /pʰ/, /f/, and /ʀ/ become /t͡sʰ/

Examples:

[xʷɑdi] -kgwadi ('black with white spots') > [xʷɑt͡sʼɑnɑ] -kgwatsana (diminutive)
[dikʼet͡sʼɔ  t͡sʼɑhɑˈʊ] diketso tsa hao ('your actions')

Other changes may occur due to phonological interactions in verbal derivatives:

[hʊbʊt͡sʼɑ] ho botsa ('to ask') > [hʊbʊt͡sʼet͡sʼɑ] ho botsetsa ('to ask on behalf of') (applied [bʊt͡sʼ] -bots- + [ɛlɑ] -ela)

The alveolarization which changes Sesotho /l/ to /t͡sʼ/ is by far the most commonly applied phonetic process in the language. It's regularly applied in the formation of some class 8 and 10 concords and in numerous verbal derivatives.


Velarization in Sesotho is a process whereby certain sounds become velar consonants due to the intrusion of an approximant. It occurs with verb passives, noun diminutives, the diminutives of relatives, and the formation of some class 1 and 3 prefixes.

  • /m/ becomes /ŋ/
  • /ɲ/ becomes /ŋ̩ŋ/[14]

For example:

[hʊsɪɲɑ] ho senya ('to destroy') > [hʊsɪŋ̩ŋʷɑ] ho senngwa ('to be destroyed') (short passive [sɪɲ] -seny- + [wɑ] -wa)
Class 1 [mʊ] mo- + [ɑhɑ] -aha > [ŋʷɑhɑ] ngwaha ('year') (cf Kiswahili mwaka; from Proto-Bantu *-jaka)


Elision of vowels occurs in Sesotho less often than in those Bantu languages which have vowel "pre-prefixes" before the noun class prefixes (such as isiZulu), but there are still instances where it regularly and actively occurs.

There are two primary types of regular vowel elision:

  1. The vowels /ɪ/, /ɛ/, and /ʊ/ may be removed from between two instances of /l/, thereby causing the first /l/ to become syllabic. This actively occurs with verbs, and has historically occurred with some nouns.
  2. When forming class 1 or 3 nouns from noun stems beginning with /b/ the middle /ʊ/ is removed and the /b/ is contracted into the /m/, resulting in [m̩m]. This actively occurs with nouns derived from verbs commencing with [b] and has historically occurred with many other nouns.

For example:

[bɑlɑ] -bala ('read') > [bɑl̩lɑ] -balla (applied verb suffix [ɛlɑ] -ela) ('read for'), and [m̩mɑdi] mmadi ('person who reads')
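Both elision types can be sketched in a few lines of Python. This is an illustration under simplifying assumptions: roots and stems are given in IPA, the applied suffix is taken to be -ɛlɑ, and no other process (such as alveolarization) is modelled. The function names `applied` and `class1_noun` are hypothetical.

```python
SYLLABIC = "\u0329"  # combining mark that makes the preceding consonant syllabic


def applied(root: str) -> str:
    """Attach the applied suffix -ɛlɑ to an IPA verb root."""
    if root.endswith("l"):
        # Type 1: the vowel between the two l's is elided and the
        # first l becomes syllabic.
        return root + SYLLABIC + "lɑ"
    return root + "ɛlɑ"


def class1_noun(stem: str) -> str:
    """Attach the class 1 prefix mʊ- to an IPA noun stem."""
    if stem.startswith("b"):
        # Type 2: the ʊ is elided and the b is contracted into the m.
        return "m" + SYLLABIC + "m" + stem[1:]
    return "mʊ" + stem


print(applied("bɑl"))       # -balla
print(class1_noun("bɑdi"))  # mmadi
```

Applying the same function to the root pʰʊmɛl gives the form cited earlier for ho phomella.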


Vowel raising is a form of vowel harmony where a non-open vowel (i.e. any vowel other than /ɑ/) is raised in position by a following vowel (in the same phonological word) at a higher position. The first variety — in which the open-mid vowels become close-mid — is commonly found in most Southern African Bantu languages (where the Proto-Bantu "mixed" vowels have separated). In the 9-vowel Sotho–Tswana languages, a much less common process also occurs where the near-close vowels become raised to a position slightly lower than the close vowels (closer to the English beat and boot than the very high Sesotho vowels i and u) without ATR (or, alternatively, with both [+ATR] and [+RTR]).


Mid vowel raising is a process where /ɛ/ becomes /e/ and /ɔ/ becomes /o/ under the influence of close vowels or consonants that contain "hidden" close vowels.

ho tsheha ('to laugh') [hʊt͡sʰɛhɑ] > ho tshehisa ('to cause to laugh') [hʊt͡sʰehisɑ]
ke a bona ('I see') [kʼɪˈɑbɔnɑ] > ke bone ('I saw') [kʼɪbonɪ]
ho kena ('to enter') [hʊkʼɛnɑ] > ho kenya ('to insert') [hʊkʼeɲɑ]

This raising is usually recursive to varying depths within the word, though, being a left-spreading rule, it is often bounded by the difficulty of "foreseeing" the raising syllable:

diphoofolo ('animals') [dipʰɔˈɔfɔlɔ] > diphoofolong ('by the animals') [dipʰɔˈɔfoloŋ̩]

Additionally, a right-spreading form occurs when a close-mid vowel is on the penultimate syllable (that is, the stressed syllable) and, due to some inflection or derivational process, is followed by an open-mid vowel. In this case the vowel on the final syllable is raised. This does not happen if the penultimate syllable is close (/i/ or /u/).

-besa ('roast') [besɑ] > subjunctive ke bese ('so I may roast...') [kʼɪbese]

but

-thola ('find') [tʰɔlɑ] > subjunctive ke thole ('so I may find...') [kʼɪtʰɔlɛ]

These vowels can occur phonemically, however, and may thus be considered to be separate phonemes:

maele ('wisdom') [mɑˈele]
ho retla ('to dismantle') [hʊʀet͡ɬʼɑ]

Close vowel raising is a process which occurs under much less common circumstances. Near-close /ɪ/ becomes [iˌ][15] and near-close /ʊ/ becomes [uˌ][15] when immediately followed by a syllable containing the close vowels /i/ or /u/. Unlike mid vowel raising, this process is not iterative and is only caused directly by the close vowels (it cannot be caused by any hidden vowels or by other raised vowels).

[hʊt͡sʰɪlɑ] ho tshela ('to pass over') > [hʊt͡sʰiˌdisɑ] ho tshedisa ('to comfort')
[hʊlʊmɑ] ho loma ('to itch') > [sɪluˌmi] selomi ('period pains')
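Because close vowel raising is non-iterative, it can be modelled as a single pass that only ever inspects the underlying (input) syllables. The Python sketch below assumes the word has already been split into IPA syllables; the names `RAISED` and `raise_close` are illustrative.

```python
# Near-close vowels and their raised allophones, as transcribed above.
RAISED = {"ɪ": "iˌ", "ʊ": "uˌ"}
CLOSE = ("i", "u")


def raise_close(syllables: list[str]) -> list[str]:
    """Raise near-close vowels that stand directly before a close-vowel syllable."""
    out = list(syllables)
    for idx in range(len(syllables) - 1):
        # Look at the *input* next syllable, so a raised vowel never
        # triggers further raising to its left.
        following = syllables[idx + 1]
        if any(vowel in following for vowel in CLOSE):
            for near, raised in RAISED.items():
                out[idx] = out[idx].replace(near, raised)
    return out


print("".join(raise_close(["hʊ", "t͡sʰɪ", "di", "sɑ"])))  # ho tshedisa
print("".join(raise_close(["sɪ", "lʊ", "mi"])))           # selomi
```

In the second word the first syllable keeps its /ɪ/: the syllable after it contains a raised allophone, not an underlying close vowel.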

Since these changes are allophonic, the Sotho–Tswana languages are rarely said to have 11 vowels.


Labialization is a modification of a consonant due to the action of a bilabial /w/ element which persists throughout the articulation of the consonant and is not merely a following semivowel. This labialization results in the consonant being pronounced with rounded lips[16] (but, in Sesotho, with no velarization) and with attenuated high frequencies (especially noticeable with fricatives and aspirated consonants).

It may be traced to an original /ʊ/ or /u/ being "absorbed" into the preceding consonant when the syllable is followed by another vowel. The consonant is labialized and the transition from the labialized syllable onset to the nucleus vowel sounds like a bilabial semivowel (or, alternatively, like a diphthong). Unlike in languages such as Chishona and Tshivenda, Sesotho labialization does not result in "whistling" of any consonants.

Almost all consonants may be labialized (indicated in the orthography by following the symbol with ⟨w⟩), the exceptions being labial stops and fricatives (which become palatalized), the bilabial and palatal nasals (which become velarized), and the voiced alveolar [d] allophone of /l/ (which would become alveolarized instead). Additionally, syllabic nasals (where nasalization results in a labialized [ŋ̩kʼ] instead) and the syllabic /l/ (which is always followed by the non-syllabic /l/) are never directly labialized. Note that the unvoiced heterorganic doubled articulant fricative /fʃ/ only occurs labialized (only as [fʃʷ]).
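The exceptions in the paragraph above can be summarised as a small decision function. This Python sketch only returns the name of the process that applies to a given consonant when a /w/ element would be absorbed; it does not compute the resulting form. The function name `w_outcome` and the returned labels are hypothetical.

```python
SYLLABIC = "\u0329"  # combining mark found on syllabic nasals and syllabic l
LABIAL_OBSTRUENTS = {"pʼ", "pʰ", "b", "f"}


def w_outcome(consonant: str) -> str:
    """Name the process a consonant undergoes instead of, or as, labialization."""
    if consonant.endswith(SYLLABIC):
        # Syllabic nasals and syllabic l are never directly labialized.
        return "none"
    if consonant in LABIAL_OBSTRUENTS:
        return "palatalization"
    if consonant in {"m", "ɲ"}:
        return "velarization"
    if consonant == "d":
        return "alveolarization"
    return "labialization"


for c in ("f", "ɲ", "d", "t͡sʼ"):
    print(c, w_outcome(c))
```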

Due to the inherent bilabial semivowel, labialized consonants never appear before back vowels:

[hʊlɑt͡sʼʷɑ] ho latswa ('to taste') > [tʼɑt͡sʼɔ] tatso ('flavour')
[hʊt͡sʼʷɑ] ho tswa ('to emerge') > [lɪt͡sʼɔ] letso ('a derivation')
[hʊnʷɑ] ho nwa ('to drink') > [sɪnɔ] seno ('a beverage')
[hʊˈɛlɛl̩lʷɑ] ho elellwa ('to realise') > [kʼɛlɛl̩lɔ] kelello ('the mind')

Tonology

Sesotho is a tonal language spoken using two contrasting tones: low and high; further investigation reveals, however, that in reality it is only the high tones that are explicitly specified on the syllables in the speaker's mental lexicon, and that low tones appear when a syllable is tonally under-specified. Unlike the tonal systems of languages such as Mandarin, where each syllable basically has an immutable tone, the tonal systems of the Niger–Congo languages are much more complex in that several "tonal rules" are used to manipulate the underlying high tones before the words may be spoken, and this includes special rules ("melodies") which, like grammatical or syntax rules that operate on words and morphemes, may change the tones of specific words depending on the meaning one wishes to convey.

Stress

The word stress system of Sesotho (often called "penultimate lengthening" instead, though there are certain situations where it doesn't fall on the penultimate syllable) is quite simple. Each complete Sesotho word has exactly one main stressed syllable.

Except for the second form of the first demonstrative pronoun, certain formations involving certain enclitics, polysyllabic ideophones, most compounds, and a handful of other words, there is only one main stress falling on the penult.

The stressed syllable is slightly longer and has a falling tone. Unlike in English, stress does not affect vowel quality or height.

This type of stress system occurs in most of those Eastern and Southern Bantu languages which have lost contrastive vowel length.

The second form of the first demonstrative pronoun has the stress on the final syllable. Some enclitics can leave the stress of the original word in place, causing the resultant word to have the stress on the antepenultimate syllable (or even earlier, if the enclitics are compounded). Ideophones, which tend not to obey the phonetic laws which the rest of the language abides by, may also have irregular stress.

There is even at least one minimal pair: the adverb fela ('only') [ˈfɛlɑ] has regular stress, while the conjunctive fela ('but') [fɛˈlɑ] (like many other conjunctives) has stress on the final syllable. This is certainly not enough evidence to justify making the claim that Sesotho is a stress accent language, though.
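The default placement can be written as a short Python sketch, assuming a word is supplied as a list of IPA syllables. The `final` flag is a hypothetical switch for the words described above that take stress on the last syllable; the other exceptions (enclitics, ideophones, compounds) are not modelled, and the name `mark_stress` is illustrative.

```python
def mark_stress(syllables: list[str], final: bool = False) -> str:
    """Join syllables, placing the IPA stress mark before the stressed one."""
    if not syllables:
        return ""
    if final or len(syllables) == 1:
        idx = len(syllables) - 1      # final stress (or the only syllable)
    else:
        idx = len(syllables) - 2      # regular penultimate stress
    return "".join(syllables[:idx]) + "ˈ" + "".join(syllables[idx:])


print(mark_stress(["fɛ", "lɑ"]))              # adverb fela ('only')
print(mark_stress(["fɛ", "lɑ"], final=True))  # conjunctive fela ('but')
```

The two calls reproduce the transcriptions of the minimal pair given above.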

Because the stress falls on the penultimate syllable, Sesotho, like other Bantu languages (and unlike many closely allied Niger–Congo languages), tends to avoid monosyllabic words and often employs certain prefixes and suffixes to make the word disyllabic (such as the syllabic nasal in front of class 9 nouns with monosyllabic stems, etc.).

from Grokipedia
Sesotho phonology refers to the systematic organization of sounds in Sesotho (Southern Sotho), a Southern Bantu language of the Sotho-Tswana group spoken primarily in Lesotho and South Africa by approximately 6.9 million people. As a tonal language, it features a rich inventory of 40 phonemes, including voiceless aspirated and ejective stops, affricates, fricatives, nasals, laterals, and a single click, alongside nine vowels distinguished by height, backness, and advanced tongue root (ATR) features. Key phonological processes include high-low tone contrasts for lexical and grammatical distinctions, ATR vowel harmony, labial palatalization, and penultimate vowel lengthening, with a predominant CV structure that permits vowel-initial syllables and syllabic nasals or laterals.

The consonant system of Sesotho is notable for its complexity relative to many Bantu languages, encompassing places of articulation from bilabial to glottal and including phonemic aspiration (e.g., /pʰ/ vs. /p/) and ejectives (e.g., /pʼ/). Affricates such as /t͡s, t͡ɬ, t͡ʃ/ and the click /ǃ/ (alveolar, from neighboring Khoisan influence) add to the diversity, while geminates of nasals and the lateral /l/ arise through morphophonological processes like vowel deletion. Syllabic consonants, particularly nasals, function as independent syllables in structures like N (nasal) or NC (nasal + consonant).

Sesotho's vowel system comprises nine monophthongs, /i, ɪ, e, ɛ, a, ʊ, u, o, ɔ/, with ATR distinguishing an advanced set ([i, u, e, o]) from a retracted set ([ɪ, ɛ, ʊ, ɔ, a]), and the low central /a/ exhibiting allophonic variation influenced by adjacent vowels through coarticulation. Unlike stress-accent languages, Sesotho lacks lexical stress, relying instead on tonal downdrift across phrases in declarative intonation, where high tones are realized prominently on key syllables.

These features contribute to the language's agglutinative morphology and its distinct profile among Bantu languages, influencing areas such as speech acquisition and orthographic representation.

Historical Development

Proto-Bantu Changes

The major diachronic shifts in Sotho phonology from Proto-Bantu occurred during the Bantu expansion into Southern Africa, approximately between 500 CE and 1500 CE, as Proto-Bantu speakers migrated southward, adapting their phonological system amid regional linguistic contacts. These changes transformed the relatively simple Proto-Bantu inventory into the more complex Sotho system, characterized by ejective consonants, an expanded vowel set, and simplified nasal compounds, while preserving core Bantu syllable structure.

A key innovation in Sotho-Tswana languages, including Sotho, was the loss of Proto-Bantu prenasalized stops and affricates, which simplified to plain voiceless stops and affricates, often realized as ejectives in post-nasal position due to devoicing. For instance, Proto-Bantu *mb, *nd, *ŋg shifted to voiceless [p, t, k] with ejective articulation (e.g., *mb > /pʼ/). Similarly, prenasalized *nt and *ŋt became plain or aspirated stops (e.g., [tʰ]), exemplified by Proto-Bantu *ntàbà 'mountain' evolving to Sotho thába 'mountain'. This post-nasal devoicing to ejectives developed from the glottalization of voiced stops in nasal environments, a widespread Southern Bantu feature distinguishing Sotho from Central and Eastern Bantu languages that retain prenasalization.

Fricatives and sibilants underwent chain shifts influenced by spirantization, where Proto-Bantu voiceless stops lenited to fricatives in intervocalic or pre-high vowel positions. Proto-Bantu *p, *t, *k often became [f, s, x] or further weakened to [h] in Sotho, but post-nasal blocking preserved stops (e.g., word-initial *p > f, as in *pìtà 'pass by' > Sotho féta 'pass by'). Sibilants derived from palatal *c and *j showed variable reflexes: *c typically shifted to /s/, retained in Sotho (e.g., *cí 'eat' causative > -s- in verbs), while *j strengthened to /ʃ/ or /z/ in certain contexts, contributing to the sibilant series /s, ʃ/. These shifts reflect a broader Southern Bantu pattern of fricative development during the expansion.

The Proto-Bantu seven-vowel system (*i, *ɪ, *e, *a, *o, *ʊ, *u) evolved in Sotho through mergers and height distinctions, resulting in a nine-vowel inventory (*i, *ɩ, *e, *ɛ, *a, *ɔ, *o, *ʊ, *u). Low-level vowels *ɪ and *ʊ raised to high central [ɩ, ʊ], while mid vowels *e and *o split into close-mid [e, o] and open-mid [ɛ, ɔ] based on tonal or environmental conditioning, as in *kɪ̀lɛ̀ 'tongue' > Sotho kélɛ́ (with open-mid /ɛ/). This expansion preserved more Proto-Bantu vowel contrasts than the typical five-vowel merger in other Bantu languages, allowing for advanced height gradations. Vowel length from Proto-Bantu diphthongs or long vowels often simplified via syncope during the same migratory period.

External Influences

The external influences on Sotho phonology stem largely from prolonged contact with Khoisan languages during the Bantu expansions and from intensified interactions amid the Difaqane migrations in the early 19th century. These migrations, known as the Mfecane in Nguni terminology, involved widespread warfare, displacement, and the integration of groups, including Khoisan speakers, into Sotho-Tswana communities. This historical upheaval fostered hybrid phonological traits, such as the notable absence of prenasalized consonants (a standard feature in most Bantu languages), which Sotho lacks owing to Khoisan substrate effects that prohibit nasal-consonant clusters.

A prominent outcome of this contact is the incorporation of the click consonant /ǃ/, with tenuis (/ǃkʼ/, written ⟨q⟩), aspirated (/ǃʰ/, written ⟨qh⟩), and nasalized (/ᵑǃ/, written ⟨nq⟩) variants, acquired from Khoisan substrates during the Difaqane period (ca. 1818–1830). The integration of refugees into Sotho communities during the Difaqane wars facilitated click adoption, primarily of the alveolar type from San languages. These non-native sounds, absent in Proto-Bantu, entered the lexicon through loanwords and ideophones and later spread to expressive or native terms as markers of social identity. For instance, the word qeta 'finish' derives from a Khoisan root, illustrating how clicks integrated into core vocabulary amid refugee assimilations.

Ejective consonants, including /pʼ/, /tʼ/, and /kʼ/, also developed under likely areal influence, distinguishing Sotho from typical Bantu systems that favor voiced or aspirated stops over glottalized ones. This innovation reflects broader Southern African phonological convergence, in which the ejective inventories of Khoisan languages shaped neighboring Bantu varieties through bilingualism and substrate interference. The uvular trill /ʀ/ (or [ʁ]), used in emphatic expressions or loanwords to convey intensity, likely entered via French influence during colonial contact.

Meanwhile, extensions of labialization (e.g., /t͡sw/) and palatalization (e.g., /pʲa/ → /t͡ʃa/) show influences from neighboring Sotho-Tswana dialects, where variational patterns in these processes, such as glide dissimilation feeding palatal shifts, arose through inter-dialectal contact and reinforced areal norms.

Phonemic Inventory

Vowels

Sesotho possesses a nine-vowel phonemic inventory, consisting of /i, ɪ, e, ɛ, a, ɔ, o, ʊ, u/, which distinguishes it from the more typical five- or seven-vowel systems found in many other Bantu languages. These vowels are articulated at various heights and backnesses: the front unrounded vowels are close /i/, near-close /ɪ/, close-mid /e/, and open-mid /ɛ/; the central vowel is open /a/; and the back rounded vowels are open-mid /ɔ/, close-mid /o/, near-close /ʊ/, and close /u/. Orthographically, this system is simplified to five basic letters (a, e, i, o, u), with diacritics employed where needed to indicate distinctions such as ê for /ɛ/ and ô for /ɔ/, while the near-close vowels /ɪ/ and /ʊ/ are typically written e and o without additional marking (as in phofo [pʰʊfʊ] 'flour').
| | Front | Central | Back |
|---|---|---|---|
| Close | i | | u |
| Near-close | ɪ | | ʊ |
| Close-mid | e | | o |
| Open-mid | ɛ | | ɔ |
| Open | | a | |
Vowel length in Sesotho is not generally considered phonemic, though some analyses propose limited contrasts. Lengthening is instead primarily allophonic, often resulting from compensatory processes such as the elision of consonants or moras in specific morphological contexts, which extend vowel duration without altering lexical meaning. For instance, penultimate vowels frequently undergo lengthening in phrase-final position as a prosodic feature.

Key allophonic variations include a breathy-voiced realization [ə̤] of /a/, typically in environments where a preceding depressor consonant imparts breathy phonation to the vowel. Additionally, the near-close vowels /ɪ/ and /ʊ/ centralize to [ɨ] and [ʉ] respectively before nasal consonants, contributing to the language's surface phonetic diversity. These allophones ease articulation but do not create new phonemic contrasts.

Contrasts among the vowels are demonstrated by pairs such as /kɛ́lɛ́/ 'strain (liquid)' versus /kíli/ 'anger', which highlight the distinction between open-mid /ɛ/ and close /i/. Other pairs, like /bɛ́la/ 'boil' and /béla/ 'count', underscore differences in mid vowel height (/ɛ/ vs. /e/).

Sesotho lacks phonemic diphthongs; any apparent diphthongal sequences on the surface emerge from phonological processes like glide insertion or vowel adjacency resolution. Vowel height harmony may influence realizations across morpheme boundaries, but such effects are detailed elsewhere.

Consonants

The consonant inventory of Sesotho consists of 39 phonemes, characterized by a three-way contrast in stops and affricates (ejective, aspirated, and voiced), a set of fricatives including a lateral, four nasal consonants, two liquids, two glides, and three click consonants borrowed from Khoisan languages. Unlike most Bantu languages, Sesotho lacks prenasalized consonants, with nasals functioning independently or syllabically.

The stops occur at bilabial, alveolar, and velar places of articulation. Ejective stops /pʼ tʼ kʼ/ are produced with a glottalic egressive airstream, distinguishing them from the aspirated series /pʰ tʰ kʰ/ (with post-release aspiration) and the voiced series /b l g/ (where /l/ is the alveolar lateral, with the allophone [d] before /i u/). Affricates appear at alveolar (/tsʼ tsʰ dz/, /tɬʼ tɬʰ dɮ/) and postalveolar (/tʃʼ tʃʰ dʒ/) places, following the same three-way contrast. Fricatives include labiodental /f/, alveolar /s/, postalveolar /ʃ/, glottal /h/, and alveolar lateral /ɬ/. Nasals are bilabial /m/, alveolar /n/, palatal /ɲ/, and velar /ŋ/. Liquids comprise alveolar lateral /l/ and alveolar trill /r/. Glides are labial-velar /w/ and palatal /j/. The clicks share a single place of articulation and differ in accompaniment: tenuis /ǃ/, aspirated /ǃʰ/, and nasalized /ᵑǃ/.
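| Manner/Place | Bilabial | Labiodental | Alveolar | Postalveolar | Palatal | Velar | Glottal | Click |
|---|---|---|---|---|---|---|---|---|
| Ejective stop | pʼ | | tʼ | | | kʼ | | |
| Aspirated stop | pʰ | | tʰ | | | kʰ | | |
| Voiced stop | b | | l | | | g | | |
| Ejective affricate | | | tsʼ, tɬʼ | tʃʼ | | | | |
| Aspirated affricate | | | tsʰ, tɬʰ | tʃʰ | | | | |
| Voiced affricate | | | dz, dɮ | dʒ | | | | |
| Fricative | | f | s, ɬ | ʃ | | | h | |
| Nasal | m | | n | | ɲ | ŋ | | |
| Lateral | | | l | | | | | |
| Trill | | | r | | | | | |
| Glide | w | | | | j | | | |
| Click | | | | | | | | ǃ, ǃʰ, ᵑǃ |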
In Sesotho orthography, ejective stops are written as plain p t k, aspirated stops as ph th kh, and the voiced series as b l g (with l realized as [d] before /i u/). Affricates follow suit: ejective ts tl, aspirated tsh tlh, voiced dz dl j. Fricatives use f s sh h hl for /f s ʃ h ɬ/. Nasals are m n ny ng. Liquids are l r, and glides are w y. Clicks are represented by q (tenuis), qh (aspirated), and nq (nasalized).

The ejective and aspirated stops are distinct phonemes, with ejectives showing glottalic initiation and aspirates post-release breathiness; positional variations may occur but do not interchange them phonemically. The trill /r/ has a uvular trill allophone [ʀ] in emphatic speech or loanwords. Additionally, /l/ alternates with [d] before the high vowels /i u/.

Contrasts among consonants are maintained through minimal pairs, such as /pʼá/ 'give' versus /pʰá/ 'finish' (ejective vs. aspirated at the bilabial place), and /t͡sʼá/ 'hide' versus /t͡sʰá/ 'burn' (the same contrast in alveolar affricates). Clicks occur mainly in loanwords and expressive terms, such as /ǃá/ (a click used for calling animals).
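The spelling conventions above can be sketched as a greedy longest-match lookup. This is only an illustration under stated assumptions: the table covers a subset of the consonant spellings described in this section, the velar nasal is assumed to be written ng, vowel letters are passed through without resolving their quality, and the names `CONSONANTS` and `tokenize` are hypothetical.

```python
# Hypothetical orthography-to-phoneme table for a subset of Sesotho consonants,
# following the spellings described in the text (not a complete inventory).
CONSONANTS = {
    "tsh": "tsʰ", "tlh": "tɬʰ",
    "ph": "pʰ", "th": "tʰ", "kh": "kʰ",
    "ts": "tsʼ", "tl": "tɬʼ", "sh": "ʃ", "hl": "ɬ",
    "ny": "ɲ", "ng": "ŋ",  # assumption: the velar nasal is spelled "ng"
    "p": "pʼ", "t": "tʼ", "k": "kʼ", "b": "b", "l": "l", "j": "dʒ",
    "f": "f", "s": "s", "h": "h", "m": "m", "n": "n",
    "r": "r", "w": "w", "y": "j", "q": "ǃ",
}
VOWEL_LETTERS = set("aeiou")  # vowel quality is deliberately left unresolved

# Try longer spellings first so that "tsh" wins over "ts" and "t".
_KEYS = sorted(CONSONANTS, key=len, reverse=True)


def tokenize(word):
    """Split an orthographic word into consonant phonemes and vowel letters."""
    segments = []
    i = 0
    while i < len(word):
        for key in _KEYS:
            if word.startswith(key, i):
                segments.append(CONSONANTS[key])
                i += len(key)
                break
        else:
            letter = word[i]
            if letter not in VOWEL_LETTERS:
                raise ValueError(f"unsupported letter: {letter!r}")
            segments.append(letter)
            i += 1
    return segments
```

Under these assumptions, `tokenize("thaba")` separates the aspirated digraph th from the plain letters that follow, while a bare t would map to the ejective /tʼ/. Spellings outside the table are not handled correctly and would need additional entries.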

Structural Organization

Syllable Patterns

In Sotho, syllables are predominantly open, following templates such as V (vowel only), CV (consonant plus vowel), and CwV (consonant plus glide plus vowel), with nasals (N) and laterals (L) occasionally serving as syllabic nuclei in compounds or specific phonetic contexts. These patterns reflect the language's Bantu heritage, in which vowels typically form the nucleus and consonants occupy the onset position without forming codas. For instance, the monosyllabic word ba 'they' exemplifies a simple CV structure, while the disyllabic bana 'children' illustrates CV-CV sequencing.

Sotho strictly prohibits closed syllables, meaning no true codas exist; any consonant that appears to close a syllable functions either as a syllabic nucleus or as the onset of a following syllable in compounds. Complex segments such as the affricate /t͡ɬ/ in tlala 'hunger' are analyzed as single onsets rather than as consonant clusters, which keeps the syllable open. Glides like /w/ and /j/ play a key role in these patterns, acting as semi-vowels that labialize or palatalize onsets within CV frameworks, as seen in forms like ngwana /ŋwana/ 'child', where /w/ integrates into the CwV template. Constraints on the formation of these complex onsets are further detailed in the Phonotactics section.
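The open-syllable templates described above can be sketched as a small syllabifier over a list of phoneme symbols. This is a minimal illustration, not an established algorithm for Sesotho: it assumes the input is already segmented (affricates as single symbols), treats a nasal as syllabic when no vowel or glide follows it, ignores syllabic laterals, and uses the hypothetical name `syllabify`.

```python
VOWELS = set("iɪeɛaɔoʊu")
NASALS = {"m", "n", "ɲ", "ŋ"}
GLIDES = {"w", "j"}


def syllabify(segments):
    """Group phoneme symbols into V, CV, CwV or syllabic-nasal syllables."""
    syllables = []
    onset = []
    for idx, seg in enumerate(segments):
        nxt = segments[idx + 1] if idx + 1 < len(segments) else None
        if seg in VOWELS:
            # A vowel closes the current syllable; codas are never created.
            syllables.append(onset + [seg])
            onset = []
        elif seg in GLIDES and len(onset) == 1:
            # Consonant + glide forms one complex onset (the CwV template).
            onset.append(seg)
        elif not onset:
            if seg in NASALS and nxt not in VOWELS and nxt not in GLIDES:
                # Assumption: a nasal with no following vowel or glide is syllabic.
                syllables.append([seg])
            else:
                onset = [seg]
        else:
            raise ValueError("consonant cluster is not a licit onset")
    if onset:
        raise ValueError("word-final consonant would form a coda")
    return syllables
```

For the examples in the text, bana splits into two CV syllables and /ŋwana/ into a CwV syllable followed by a CV syllable, while a hypothetical input ending in a stop is rejected because it would require a coda.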

Phonotactics

Sesotho phonotactics are characterized by strict constraints on syllable onsets and the absence of true consonant clusters, ensuring that most syllables conform to a basic CV template or variations like (C)V, CV(G), or a syllabic nasal followed by CV. Onsets typically consist of a single consonant, optionally followed by a glide such as /w/ or /j/, but apparent clusters like /kw/ are analyzed as labialized singletons rather than distinct segments. For instance, the form /kwa/ in words like kwala ('close') represents a labialized velar stop, not a separate /k/ and /w/.

Word-initial non-syllabic nasals before a consonant are prohibited in native vocabulary, though syllabic nasals (e.g., /n̩tja/ 'dog') are permitted as class 9/10 prefixes. Loanwords that violate native syllable structure undergo vowel epenthesis, as when English "bank" is adapted as /banka/, with a final vowel inserted so that the word does not end in a closed syllable. Clicks, borrowed from Khoisan substrates and primarily used in ideophones and interjections, occur only in CV structures and cannot form clusters. Illicit onsets like *nba (a hypothetical invalid form) are repaired via epenthesis to yield licit sequences such as /neba/, maintaining onset integrity.

Geminates are not underlying but emerge at morpheme boundaries through vowel deletion or assimilation, and are restricted to nasals (/m/, /n/, /ŋ/, /ɲ/) and the lateral /l/, as in mmusi ('ruler') from prefix assimilation. These do not constitute true clusters but function as lengthened singletons, with durations nearly three times that of non-geminate counterparts (e.g., 231.8 ms vs. 82.1 ms for nasals). No other sonorants, such as /r/, form geminates, and ejective consonants (e.g., /tʼ/, /kʼ/) do not combine with fricatives in onsets due to articulatory incompatibility.

At the word level, hiatus is generally avoided through glide insertion between adjacent vowels across boundaries, preventing unmediated VV sequences; for example, underlying /a-u/ may surface as /a-wu/ in rapid speech. Penultimate lengthening also conditions boundary phenomena, such as tone adjustments, that indirectly reinforce phonotactic well-formedness across words. Licit forms like /ma-me-l-lo/ ('patience') exemplify permitted syllables, while illicit clusters in loans (e.g., English "school" → /se-ko-lo/) are systematically broken up by epenthesis to conform to CV patterns.

Dialectal variations influence cluster tolerance. Urban South African Sesotho accommodates loanword adaptations more readily because of contact with English and Afrikaans, tolerating more complex onsets than rural varieties, which adhere more strictly to native CV constraints. For instance, rural dialects may further simplify urban loan forms like /tra-ŋkʰa/ ('trunk') to /ta-ra-ŋka/ with additional epenthesis. These differences arise from sociolinguistic factors, including higher exposure to external influences in urban settings.

Phonological Processes

Consonantal Alternations

In Sesotho phonology, consonantal alternations are systematic changes in consonant realization triggered by morphological processes, adjacent sounds, or phonological environments, primarily affecting place and manner of articulation. These alternations are prominent in verb conjugations, noun derivations, and prefixation, contributing to the language's rich morphophonology. Key processes include palatalization, alveolarization, and nasal assimilation of obstruents, which ensure euphonic integration within the predominantly CV structure.

Palatalization is a widespread alternation in Bantu languages like Sesotho, in which consonants shift toward a palatal or alveo-palatal articulation, often before front vowels or glides, or in specific derivations such as passives and diminutives. Velars /k/ and /g/ front to the alveo-palatal affricates /tʃ/ and /dʒ/ when followed by front vowels like /i/ or the palatal glide /j/, as part of a broader non-local feature-spreading process involving a floating coronal [cor] feature. Alveolars and labials also undergo palatalization in passive forms; for example, the labial /p/ in the verb root /ʃapa/ 'lash' becomes /tʃ/ in the passive derivation, yielding underlying /ʃapa/ → surface [ʃatʃwa] 'be lashed'. This process is constrained by faithfulness constraints like *MAX C-PL DOR, preventing overapplication in diminutives where velars may remain unchanged, as in /buka/ → [bukana] 'small book'. In noun diminutives, stem-final labials palatalize as well, with /b/ surfacing as /dʒ/, e.g., underlying /le-thaba/ → surface [le-thadʒana] 'small mountain'.

Alveolarization involves the shift of non-alveolar consonants to alveolar or alveo-palatal positions, particularly affecting liquids and sibilants in specific environments. The lateral /l/ is realized as the alveolar stop [d] before the high vowels /i/ and /u/, which serves as its primary allophone in such contexts; a stem such as /leleme/ 'tongue' keeps [l] because its vowels are not high, whereas the plural prefix written di- reflects underlying /li-/ surfacing as [di-]. This alternation is consistent in verb conjugations and noun stems across classes, though children acquiring Sesotho apply it inconsistently until age 3. Sibilants like /s/ palatalize to /ʃ/ before /i/, enhancing coarticulation in sequences such as underlying /sina/ → surface [ʃina] 'not have' in certain derivations.

Nasal assimilation of obstruents occurs primarily in prefix contexts, where a nasal prefix assimilates in place to the following consonant. The result is a syllabic nasal followed by a homorganic consonant rather than a prenasalized stop, and the process often triggers post-nasal strengthening. For instance, the class 9/10 nasal prefix /n-/ surfaces as [m̩] before labials. In verb object marking, underlying /n + fa/ 'give (me)' surfaces as mphe [m̩pʰɛ], where the fricative strengthens to an aspirated stop after the nasal. This process is morpheme-bound and applies in first-person singular object concords without delinking the nasal feature. Voiced obstruents following nasals neutralize to voiceless or ejective forms, e.g., /n + b/ → [m̩pʼ], as in /n + bopa/ → [m̩pʼopa] and in noun derivations like class 9/10 augmentatives.

Depalatalization appears in loanword adaptation, where non-native palatal affricates simplify to non-palatal stops or fricatives to fit Sesotho phonotactics, avoiding marked palatal clusters. For example, English /tʃ/ in "church" adapts to [tʃʰetʃʰi] but may depalatalize to [kʰetʃʰi] in casual speech or older loans, prioritizing velar stability over palatal retention. Click assimilation involves the single alveo-palatal click /ǃ/ (with nasal and aspirated variants) adjusting its manner according to the following vowel; before front vowels it may become nasalized or tenuis, e.g., underlying /ǃi/ → surface [ŋǃi] in interjections, though this is variable in standard forms. These alternations highlight Sesotho's adaptive phonology in derivations, with underlying forms often abstracting floating features that surface contextually.

Vocalic Modifications

In Sesotho, vocalic modifications primarily involve processes that resolve hiatus through elision or coalescence, partial vowel harmony via advanced tongue root (ATR) spreading, and vowel raising in morphological contexts, alongside limited reduction in certain positions. These changes ensure syllable well-formedness and feature assimilation across morpheme boundaries, and are often triggered by adjacency to high vowels or by morphological concatenation.

Vowel elision occurs in hiatus contexts to avoid adjacent vowels, typically resulting in coalescence or glide formation rather than complete deletion; full elision is less frequent than in other Bantu languages. For instance, when a stem-final low vowel meets a suffix-initial vowel, the sequence may fuse, as in the concatenation of a verb like /bon-a/ 'see' with a following vowel-initial element, yielding forms with centralization and tone preservation. This process maintains open syllables and may involve compensatory lengthening of the surviving vowel to preserve moraic weight. Dialectal variation exists, with urban varieties showing more coalescence and rural ones favoring glides in some /a + i/ sequences.

Partial vowel harmony in Sesotho manifests as ATR agreement between roots and suffixes, particularly affecting mid vowels, whose [-ATR] variants are raised to their [+ATR] counterparts when a high vowel follows. Within the nine-vowel system, the mid vowels divide into [+ATR] /e, o/ and [-ATR] /ɛ, ɔ/, and the feature spreads leftward from the triggering high vowel to ensure agreement. This is described for derived forms like tlola 'smear' becoming tlotsa in the causative, where the mid vowel raises under ATR influence from the suffix -isa. Acoustic studies confirm that such raising produces greater height shifts (via lower F1 formants) than the phonemic distinction between the mid pairs, supporting its phonological status over mere coarticulation. The scope of harmony is typically morpheme-bound, with Lesotho varieties extending it further in compounds than South African ones.

Vowel raising specifically targets mid vowels, converting /ɛ, ɔ/ to /e, o/ before the high vowels /i, u/ as part of the ATR system. Examples include reka 'buy' (with /ɛ/) raising to rekile 'have bought' (/e/ before /i/), and similar shifts in causatives like bopa 'tie' to bopisa. This process is morphologically driven, applying regressively within the verb stem-suffix domain, and acoustic measurements show consistent height advancement across speakers.

Reduction of /a/ to a central [ə]-like variant occurs in unstressed positions, particularly in rapid speech or non-prominent syllables, reflecting contextual modification rather than phonemic neutralization. Acoustic analyses reveal significant formant shifts for /a/ in some syntactic environments, lowering its perceptual distinctiveness without full centralization in all dialects. Compensatory lengthening follows elision in hiatus, as in nominal derivations where a deleted vowel lengthens the preceding one. These modifications prioritize prosodic integration over exhaustive feature change.
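The regressive raising of /ɛ, ɔ/ to /e, o/ before /i, u/ described above can be sketched as a single right-to-left pass over a segmented form. This is a simplified illustration under assumptions: the trigger is taken to be the vowel of the immediately following syllable, raising is non-iterative (a raised vowel does not itself trigger raising), morphological domains are ignored, and `raise_mid_vowels` is a hypothetical name.

```python
VOWELS = set("iɪeɛaɔoʊu")
HIGH_TRIGGERS = {"i", "u"}
RAISED = {"ɛ": "e", "ɔ": "o"}  # [-ATR] mid vowel -> [+ATR] counterpart


def raise_mid_vowels(segments):
    """Raise /ɛ, ɔ/ when the next vowel to the right is /i/ or /u/."""
    out = list(segments)
    next_vowel = None  # underlying vowel of the following syllable, if any
    for i in range(len(out) - 1, -1, -1):
        seg = segments[i]
        if seg in VOWELS:
            if seg in RAISED and next_vowel in HIGH_TRIGGERS:
                out[i] = RAISED[seg]
            # Track the underlying vowel so that raising does not chain.
            next_vowel = seg
    return out
```

Applied to a segmentation of rekile with underlying /ɛ/ in the root, the function returns the root vowel as /e/ because /i/ follows, whereas reka is returned unchanged because its next vowel is /a/.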

Prosodic Features

Tonology

Sesotho features a two-level tonal system with high (H) and low (L) tones, in which the low tone functions as the default or unspecified tone while high tones are explicitly marked in underlying lexical representations. This binary contrast is realized primarily on vowels or syllabic nasals, with high tones associated with elevated fundamental frequency (F0) and low tones following the phrase's baseline contour or exhibiting vocal fry in some contexts. Unlike contour tone languages, Sesotho does not permit rising or falling tones on single syllables, maintaining a level tone realization.

Tone assignment in Sesotho combines lexical and grammatical mechanisms. Lexically, high tones are specified on particular morphemes, such as certain verb roots (e.g., /kóm/ 'tie' with underlying H), while most other elements are toneless or low by default. Grammatically, high tones are introduced by morphological categories, including prefixes and tense affixes; for instance, subject markers often receive a rule-assigned H tone, and syntactic positions like subject or object can trigger specific H docking to the left edge of the noun, distinguishing functions such as focus or predicative roles through four variant H types. This grammatical overlay creates surface patterns that override or interact with lexical tones, as seen in forms like [gì-kómbó] (LHH) for subject position versus [gí-kómbó] (HHH) for predicative use of the noun 'broom'.

A primary phonological process governing tone distribution is high tone spreading (HTS), which propagates an underlying H rightward to adjacent syllables within bounded domains. Spreading is often limited to one syllable in core Sesotho but extends iteratively up to the penultimate syllable in longer verb forms across Sotho varieties. It adheres to a finality restriction, avoiding the phrase-final syllable, and contributes to the language's characteristic penultimate prominence. In sequences of adjacent high tones, particularly across word boundaries within the same phonological phrase, downstep (marked as ˋ or ꜜ) lowers a subsequent H relative to preceding Hs, enhancing tonal distinctions without altering the underlying inventory; Khoali notes this in phrasal contexts where the second H is downstepped.

Examples illustrate these rules: the minimal pair seba (HL, 'gossip') contrasts with seba (LL, 'do mischief'), while verb forms like go á gísanya ('to live in harmony') show HTS applying rightward from the root H. Tonal melodies also appear in ideophones, where fixed HL or HH patterns convey sensory impressions, such as sudden actions, independent of the surrounding tonal context. Tones interact closely with prosodic structure, being prominently realized on stressed syllables (typically the penultimate, owing to lengthening) without independent stress-tone conflicts; this alignment reinforces the language's rhythmic and pitch-based prosody.
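The bounded variant of high tone spreading can be sketched over a list of per-syllable tones. This is a minimal sketch under assumptions, not a full tonal grammar: it models only the one-syllable rightward spread with the finality restriction, treats the input as a phrase-final word, leaves out downstep and grammatical tone assignment, and uses the hypothetical name `spread_high`.

```python
def spread_high(tones, final_restriction=True):
    """Spread each underlying H one syllable to the right.

    `tones` is a list of "H" or "L", one entry per syllable. With
    `final_restriction` set, spreading never targets the last syllable.
    """
    out = list(tones)
    last = len(tones) - 1
    # Read from the underlying tones so that a derived H does not spread again.
    for i, tone in enumerate(tones):
        target = i + 1
        if tone != "H" or target > last:
            continue
        if final_restriction and target == last:
            continue
        out[target] = "H"
    return out
```

With these assumptions an H on the first of four syllables surfaces on the first two, while an H on the penultimate syllable stays in place because its only target is the final syllable.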

Stress

In Sesotho, the default placement of stress falls on the penultimate syllable of a word, particularly when the word appears in phrase-final position. This pattern forms a trochaic foot, with the stressed syllable exhibiting increased duration through vowel lengthening and slightly higher intensity, but without a concomitant rise in pitch, as pitch variations are governed by the independent tonal system. The phonetic realization of stress emphasizes duration as the primary cue: the vowel in the penultimate syllable is lengthened, often by 20–50% compared to unstressed vowels, contributing to rhythmic prominence without altering pitch in a stress-specific manner. For example, the form ho-ló-me 'to drink' is realized as [ho.ló.mɛ̀], with the lengthened /o/ in the penultimate syllable marking the stress.

Exceptions to the penultimate pattern include final-syllable stress in certain grammatical categories, in ideophones, and in short monosyllabic or disyllabic words, where an ostensive or emphatic function overrides the default rule; there is no evidence of initial-syllable stress in the language. For instance, the form glossed 'which' bears final stress and is pronounced [kɛ́]. Ideophones often exhibit this final stress for expressive effect, diverging from the norm.
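The default rule can be sketched as marking the penultimate syllable of an already syllabified, phrase-final word with the IPA length mark. This is only an illustration under assumptions: syllables are given as strings ending in their vowel, monosyllables are left untouched, the exceptional categories mentioned above are not modelled, and `lengthen_penult` is a hypothetical name.

```python
LENGTH_MARK = "ː"


def lengthen_penult(syllables):
    """Return a copy of `syllables` with the penultimate vowel marked long."""
    out = list(syllables)
    if len(out) >= 2:
        # Syllables are open, so appending the mark lengthens the vowel.
        out[-2] = out[-2] + LENGTH_MARK
    return out
```

For the three-syllable example in the text, the length mark lands on the middle syllable, leaving the first and last syllables unchanged.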
