Cedilla
from Wikipedia
◌̧
Cedilla
U+0327 ◌̧ COMBINING CEDILLA (diacritic)
See also
U+00B8 ¸ CEDILLA (symbol)

A cedilla (/sɪˈdɪlə/ sih-DIL; from Spanish cedilla, "small ceda", i.e. small "z"), or cedille (from French cédille, pronounced [sedij]), is a hook or tail (¸) added under certain letters (as a diacritical mark) to indicate that their pronunciation is modified. In Catalan (where it is called trenc), French, and Portuguese (where it is called a cedilha) it is used only under the letter ⟨c⟩ (to form ⟨ç⟩), and the entire letter is called, respectively, c trencada (i.e. "broken C"), c cédille, and c cedilhado (or c cedilha, colloquially). It is used to mark vowel nasalization in many languages of Sub-Saharan Africa, including Vute from Cameroon.

This diacritic is not to be confused with the ogonek (◌̨), which resembles the cedilla but mirrored. It also looks very similar to the diacritical comma, which is used in the Romanian and Latvian alphabets, and which is misnamed "cedilla" in the Unicode standard.

There is substantial overlap between the cedilla and a diacritical comma. The cedilla is traditionally centered on the letter, and when there is no stroke for it to attach to in that position, as in Ņ ņ, the connecting stroke is omitted, taking the form of a comma. However, the cedilla may instead be shifted left or right to attach to a descending leg. In some orthographies the comma form has been generalized even in cases where the cedilla could attach, as in Ḑ ḑ, but is still considered to be a cedilla. This produces a contrast between attached and non-attached (comma) glyphs, which is usually left to the font but in the cases of Ş ş Ţ ţ and Ș ș Ț ț is formalized by Unicode.

Origin

Origin of the cedilla from the Visigothic z
A conventional "ç" and 'modernist' cedilla "c̦" (right). (Helvetica and Akzidenz-Grotesk Book)

The tail originated in Spain as the bottom half of a miniature cursive z. The word cedilla is the diminutive of the Old Spanish name for this letter, ceda (zeta).[1] Modern Spanish and isolationist Galician no longer use this diacritic, although it is used in Reintegrationist Galician, Portuguese,[2] Catalan, Occitan, and French, which gives English the alternative spellings of cedille, from French "cédille", and the Portuguese form cedilha. An obsolete spelling of cedilla is cerilla.[2] The earliest use in English cited by the Oxford English Dictionary[2] is a 1599 Spanish-English dictionary and grammar.[3] Chambers' Cyclopædia[4] is cited for the printer-trade variant ceceril in use in 1738.[2] Its use in English is not universal and applies to loan words from French and Portuguese such as façade, limaçon and cachaça (often typed facade, limacon and cachaca because of lack of ç keys on English-language keyboards).

With the advent of typeface modernism, the calligraphic nature of the cedilla was thought somewhat jarring on sans-serif typefaces, and so some designers instead substituted a comma design, which could be made bolder and more compatible with the style of the text.[a] This reduces the visual distinction between the cedilla and the diacritical comma.

C

The most frequent character with cedilla is "ç" ("c" with cedilla, as in façade). It was first used for the sound of the voiceless alveolar affricate /ts/ in old Spanish and stems from the letter ⟨ꝣ⟩ (the Visigothic form of the letter ⟨z⟩), whose upper loop was lengthened and reinterpreted as a "c", whereas its lower loop became the diminished appendage, the cedilla.

It represents the "soft" sound /s/, the voiceless alveolar sibilant, where a "c" would normally represent the "hard" sound /k/ (before "a", "o", "u", or at the end of a word) in English and in certain Romance languages such as Catalan, Galician, French (where ç appears in the name of the language itself, français), Ligurian, Occitan, and Portuguese. In Occitan, Friulian, and Catalan, ç can also be found at the beginning of a word (Çubran, ço) or at the end (braç).

It represents the voiceless postalveolar affricate /tʃ/ (as in English "church") in Albanian, Azerbaijani, Crimean Tatar, Friulian, Kurdish, Tatar, Turkish (as in çiçek, çam, çekirdek, Çorum), and Turkmen. It is also sometimes used this way in Manx, to distinguish it from the velar fricative.

In the International Phonetic Alphabet, ⟨ç⟩ represents the voiceless palatal fricative.

S

The character "ş" represents the voiceless postalveolar fricative /ʃ/ (as in "show") in several languages, including many of the Turkic languages, where it is included as a separate letter in their alphabets.

In HTML, the numeric character references &#350; and &#351; can be used for Ş and ş.
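
As a quick check of the character references for Ş and ş, the following sketch decodes them with Python's standard html module. The numeric values 350 and 351 are simply the decimal forms of U+015E and U+015F; the named forms &Scedil; and &scedil; are taken from the HTML5 named-reference list and are an addition for illustration, not something the text above states.

```python
import html

# Numeric character references: decimal code points of U+015E and U+015F.
numeric = html.unescape("&#350; &#351;")

# Named references from the HTML5 entity list (assumed here for illustration).
named = html.unescape("&Scedil; &scedil;")

print(numeric)           # the two S-with-cedilla letters
print(numeric == named)  # both spellings decode to the same characters
```

Both forms decode to the cedilla letters, not to the Romanian comma-below letters Ș and ș, which have separate code points.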

T

Gagauz uses Ţ (T with cedilla), one of the few languages to do so, and Ş (S with cedilla). Besides being present in some Gagauz orthographies, T with cedilla also exists in the General Alphabet of Cameroon Languages, in the Kabyle language, in the Manjak and Mankanya languages, and possibly elsewhere.

In 1868, Ambroise Firmin-Didot suggested in his book Observations sur l'orthographe, ou ortografie, française (Observations on French Spelling) that French phonetics could be better regularized by adding a cedilla beneath the letter "t" in some words. For example, the suffix -tion is usually not pronounced as /tjɔ̃/ but as /sjɔ̃/. It has to be distinctly learned that in words such as diplomatie (but not diplomatique), it is pronounced /s/. A similar effect occurs with other prefixes or within words. Firmin-Didot surmised that a new character could be added to French orthography. A letter with the same description, T-cedilla (majuscule: Ţ, minuscule: ţ), is used in Gagauz. A similar letter, the T-comma (majuscule: Ț, minuscule: ț), exists in Romanian, but it has a comma accent, not a cedilla.

Languages with other characters with cedillas

Latvian

Comparatively, some consider the diacritics on the palatalized Latvian consonants "ģ", "ķ", "ļ", "ņ", and formerly "ŗ" to be cedillas. Although their Adobe glyph names are commas, their names in the Unicode Standard are "g", "k", "l", "n", and "r" with a cedilla. The letters were introduced to the Unicode standard before 1992, and their names cannot be altered. The uppercase equivalent "Ģ" sometimes has a regular cedilla.
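
The naming mismatch can be inspected directly from the Unicode character database. The sketch below uses Python's unicodedata module to print the formal names of the Latvian letters next to the Romanian comma-below letters; the helper name unicode_names is hypothetical and only serves the illustration.

```python
import unicodedata

# Latvian letters: g, k, l, n, r with the mark below (written as escapes).
LATVIAN_LETTERS = "\u0123\u0137\u013c\u0146\u0157"
# Romanian letters: s and t with comma below.
ROMANIAN_LETTERS = "\u0219\u021b"


def unicode_names(letters):
    """Map each character to its formal Unicode character name."""
    return {ch: unicodedata.name(ch) for ch in letters}


for ch, name in unicode_names(LATVIAN_LETTERS + ROMANIAN_LETTERS).items():
    print(f"U+{ord(ch):04X} {ch} {name}")
```

The Latvian letters report names ending in WITH CEDILLA even though fonts usually draw them with a comma, while the Romanian letters report WITH COMMA BELOW.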

Marshallese

In Marshallese orthography, four letters have cedillas: ⟨ļ m̧ ņ o̧⟩. In standard printed text they are always cedillas, and their omission or the substitution of comma below and dot below diacritics is nonstandard.[citation needed]

As of 2011, many font rendering engines do not display any of these properly, for two reasons:

  • "ļ" and "ņ" usually do not display properly at all, because of the use of the cedilla in Latvian. Unicode has precombined glyphs for these letters, but most quality fonts display them with comma below diacritics to accommodate the expectations of Latvian orthography. This is considered nonstandard in Marshallese. The use of a zero-width non-joiner between the letter and the diacritic can alleviate this problem: "l‌̧" and "n‌̧" may display properly, but may not; see below.
  • "m̧" and "o̧" do not currently exist in Unicode as precombined glyphs, and must be encoded as the plain Latin letters "m" and "o" with the combining cedilla diacritic. Most Unicode fonts issued with Windows, such as Tahoma and Times New Roman, do not display combining diacritics properly, showing them too far to the right of the letter. This mostly affects "m̧", and may or may not affect "o̧". But some common Unicode fonts like Arial Unicode MS, Cambria and Lucida Sans Unicode do not have this problem. When "m̧" is properly displayed, the cedilla is either underneath the center of the letter, or is underneath the right-most leg of the letter, but is always directly underneath the letter wherever it is positioned.
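
The encoding difference between the two groups of letters can be sketched with Unicode normalization. The example below, a minimal illustration using Python's unicodedata module, composes each sequence to NFC and lists the resulting code points; the helper name nfc_codepoints is hypothetical.

```python
import unicodedata


def nfc_codepoints(text):
    """Normalize to NFC and list the code points that remain."""
    return [f"U+{ord(ch):04X}" for ch in unicodedata.normalize("NFC", text)]


# l + combining cedilla has a precomposed form, so NFC yields one code point.
l_cedilla = nfc_codepoints("l\u0327")

# m + combining cedilla has no precomposed form and stays a two-code-point sequence.
m_cedilla = nfc_codepoints("m\u0327")

# A zero-width non-joiner (U+200C) between letter and mark blocks composition.
l_zwnj_cedilla = nfc_codepoints("l\u200c\u0327")

print(l_cedilla, m_cedilla, l_zwnj_cedilla)
```

Because the sequence with the non-joiner never collapses to the precomposed letter, a font cannot substitute its Latvian-style comma glyph for it; whether the combining cedilla is then positioned well still depends on the font and rendering engine.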

Because of these font display issues, it is not uncommon to find nonstandard ad hoc substitutes for these letters. The online version of the Marshallese-English Dictionary (the only complete Marshallese dictionary in existence)[citation needed] displays the letters with dot below diacritics, all of which do exist as precombined glyphs in Unicode: "ḷ", "ṃ", "ṇ" and "ọ". The first three exist in the International Alphabet of Sanskrit Transliteration, and "ọ" exists in the Vietnamese alphabet, and both of these systems are supported by the most recent versions of common fonts like Arial, Courier New, Tahoma and Times New Roman. This sidesteps most of the Marshallese text display issues associated with the cedilla, but is still inappropriate for polished standard text.

Vute

Vute, a Mambiloid language from Cameroon, uses cedilla for the nasalization of all vowel qualities (cf. the ogonek used in Polish and Navajo for the same purpose). This includes unconventional Roman letters that are formalized from the IPA into the official writing system. These include ⟨i̧ ȩ ɨ̧ ə̧ a̧ u̧ o̧ ɔ̧⟩.

Hebrew

The ISO 259 romanization of Biblical Hebrew uses Ȩ (E with cedilla) and Ḝ (E with cedilla and breve).

Saanich

Saanich uses a spacing cedilla ⟨¸⟩ as a letter. For fonts created or updated after 2025, a cedilla in the middle of a word should not trigger a word break at the end of a line, to accommodate Saanich.

Diacritical comma

Languages such as Romanian, Latvian and Livonian add a comma (virgula) to some letters, such as ș, which looks somewhat like a cedilla, but is more precisely a diacritical comma. This is particularly confusing with letters which can take either diacritic: for example, the consonant /ʃ/ is written as "ş" in Turkish but as "ș" in Romanian, and Romanian writers will sometimes use the former instead of the latter because of insufficient computer support.
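
Text that was typed with the Turkish-style cedilla letters can be converted to the Romanian comma-below letters with a simple character mapping. The following is a minimal sketch under the assumption that the input is known to be Romanian; the table and function names are hypothetical, and applying it to Turkish text would corrupt it.

```python
import unicodedata

# Cedilla letters (U+015E/015F, U+0162/0163) mapped to the
# comma-below letters (U+0218/0219, U+021A/021B).
CEDILLA_TO_COMMA = str.maketrans({
    "\u015e": "\u0218",  # capital S
    "\u015f": "\u0219",  # small s
    "\u0162": "\u021a",  # capital T
    "\u0163": "\u021b",  # small t
})


def to_romanian_comma(text):
    """Replace cedilla forms of s and t with comma-below forms.

    NFC is applied first so that a base letter followed by a combining
    cedilla (U+0327) is composed and then caught by the mapping.
    """
    return unicodedata.normalize("NFC", text).translate(CEDILLA_TO_COMMA)


print(to_romanian_comma("Bucure\u015fti"))
```

The mapping changes only the four affected letters and leaves every other character, including ç, untouched.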

Adobe names of the Latvian letters ("ģ", "ķ", "ļ", "ņ", and formerly "ŗ") use the word "comma", but in the Unicode Standard they are named "g", "k", "l", "n", and "r" with cedilla. The letters were introduced to the Unicode standard before 1992, and their names cannot be altered. Influenced by Latvian, Livonian has the same problem for "d̦", "ļ", "ņ", "ŗ" and "ț". The Polish letters "ą" and "ę" and Lithuanian letters "ą", "ę", "į", and "ų" are not made with the cedilla either, but with the unrelated ogonek diacritic.

Unicode

Unicode encodes a number of cases of "letter with cedilla" (so called, as explained above) as precomposed characters. In addition, several more letters in language orthographies are composed using the combining character facility (U+0327 ◌̧ COMBINING CEDILLA and U+0326 ◌̦ COMBINING COMMA BELOW).
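
Which of the two combining marks a given character is built on can be read from its canonical decomposition. The sketch below uses Python's unicodedata module; the function name below_mark is hypothetical, and the result reflects only how Unicode encodes the character, not how a particular font draws it.

```python
import unicodedata

COMBINING_CEDILLA = "\u0327"
COMBINING_COMMA_BELOW = "\u0326"


def below_mark(text):
    """Report which below-mark appears in the canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    if COMBINING_CEDILLA in decomposed:
        return "cedilla"
    if COMBINING_COMMA_BELOW in decomposed:
        return "comma below"
    return None


samples = ["\u00e7", "\u015f", "\u0219", "\u0146", "m\u0327", "c"]
for sample in samples:
    print(repr(sample), below_mark(sample))
```

Under this check the Latvian letter ņ (U+0146) reports a cedilla, which matches its Unicode name even though it is usually displayed with a comma.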

In ambiguous cases, typeface designers must choose whether to use a cedilla diacritic or comma-below diacritic for these codepoints, leaving it to others to provide the user with a method to achieve the other form (i.e., that relies on the combining character method). Here are three popular faces that demonstrate the choices made:

  • Arial: Ç ç Ḉ ḉ Ḑ ḑ Ȩ ȩ Ḝ ḝ Ģ ģ Ḩ ḩ Ķ ķ Ļ ļ M̧ m̧ Ņ ņ O̧ o̧ Ŗ ŗ Ş ş Ţ ţ Z̧ z̧
  • Times New Roman: Ç ç Ḉ ḉ Ḑ ḑ Ȩ ȩ Ḝ ḝ Ģ ģ Ḩ ḩ Ķ ķ Ļ ļ M̧ m̧ Ņ ņ O̧ o̧ Ŗ ŗ Ş ş Ţ ţ Z̧ z̧
  • Courier New: Ç ç Ḉ ḉ Ḑ ḑ Ȩ ȩ Ḝ ḝ Ģ ģ Ḩ ḩ Ķ ķ Ļ ļ M̧ m̧ Ņ ņ O̧ o̧ Ŗ ŗ Ş ş Ţ ţ Z̧ z̧

In each case, the diacritic displayed with D, G, K, L, N and R is a comma-below; in the other cases it is displayed as a cedilla. It may be that computer fonts are sold in the Romanian and Turkish markets that favour the national standard form of this diacritic.

from Grokipedia
The cedilla (¸) is a diacritical mark shaped like a small hook or tail, placed beneath certain letters, most commonly under the letter c to form ç, to modify their pronunciation, typically indicating a soft s sound (/s/) instead of a hard k sound before the vowels a, o, or u. This mark originated in medieval Spanish as a diminutive form of the letter z (from the Spanish cedilla, meaning "little z," derived from Late Latin zeta), where it was initially used under c to represent the affricate /ts/ sound in Old Spanish. By the 16th century, it had evolved and spread to other Romance languages, with its first documented English use dating to 1599.

In modern usage, the cedilla is essential in several languages to distinguish phonetic values and prevent ambiguity. In French, it appears in words like façade and garçon to produce the /s/ sound before back vowels. Similarly, in Portuguese, ç is employed before a, o, and u, as in açaí or começar, to indicate the alveolar /s/, a convention rooted in the language's historical development. It also features in Catalan for the same softening effect before a, o, u, or at word ends, such as in plaça. Beyond the Romance languages, the cedilla appears in Turkish under s (ş /ʃ/) and c (ç /tʃ/), while a similar diacritical comma appears under s (ș /ʃ/) and t (ț /ts/) in Romanian. These applications highlight the cedilla's role in adapting the Latin alphabet to diverse phonological needs.

History and Etymology

Origin

The cedilla, a diacritical mark resembling a small hook or tail placed beneath certain letters, emerged in the 15th and 16th centuries as a means to modify pronunciation in European vernacular scripts, particularly to denote a soft sound where a harder articulation might otherwise occur. Derived ultimately from a form of the letter z (termed "cedilla" or "little z" in Spanish, from Latin zeta via Greek zēta), this mark adapted earlier scribal conventions to the demands of movable-type printing, facilitating clearer representation of phonetic nuances in languages like Spanish, Portuguese, and French.

Its evolution stemmed from medieval scribal practices in the Visigothic script, prevalent in the Iberian Peninsula from the 8th to 13th centuries, where a tailed variant of z (ꝣ) served to indicate palatalization or the /ts/ sound in manuscripts. This comma-like descender, used for phonetic distinction in Visigothic texts, gradually simplified into a subscript hook as Iberian scribal traditions influenced early printers, transitioning the mark from a standalone letter form to a diacritic attached to c or other consonants.

The cedilla's integration into printing began in the late 15th century, with typesetters drawing on Iberian influences to incorporate it into vernacular works; one early example appears in Spanish orthographic texts, such as Antonio de Nebrija's Reglas de orthographía en la lengua castellana (1517), which formalized its role in denoting softened sounds. In French contexts, the mark gained prominence through the efforts of printer and orthographic reformer Geofroy Tory, who introduced it in his seminal 1529 treatise Champ Fleury: L'art et science de la vraye proportion des lettres, positioning it under c (as in françois) to signal an /s/ pronunciation before a, o, or u. Tory's innovations, inspired by humanistic ideals and classical proportions, were printed amid the expansion of French typography in Paris, marking a pivotal adoption in European book production.

Name and Terminology

The term cedilla derives from the Spanish cedilla, a diminutive of ceda (or zeda), the name for the letter z, ultimately from Latin zeta via Greek zêta. This reflects the mark's historical resemblance to a small, cursive form of the letter zeta (ζ), which in medieval Spanish manuscripts served to soften the pronunciation of c before certain vowels. The word entered English around 1599, its first known use as a term for the diacritical mark.

Variations in naming appear across the Romance languages, adapting the Spanish root to local orthographic traditions. In French, it is termed cédille, a direct borrowing that emphasizes the mark's role under the letter c to produce a soft sound, as in garçon. Portuguese employs cedilha, a form akin to its Spanish progenitor, while Italian uses cediglia, also derived from Spanish cedilla. These terms highlight the mark's dissemination through printing and scholarly exchanges in the 16th and 17th centuries, with English adopting cedilla primarily via French influence during the latter period.

Terminological debates center on distinguishing the cedilla, a curved, hook-shaped mark (Unicode U+0327, COMBINING CEDILLA), from visually similar marks with different functions or shapes. For instance, the "comma below" (U+0326, COMBINING COMMA BELOW), used in languages like Romanian for letters such as Ș and Ț, is not considered a true cedilla but a separate mark. Similarly, the hook attached to vowels in Vietnamese (e.g., ơ, ư) is termed a "horn", not a cedilla, underscoring the cedilla's specificity to consonant modification in Romance contexts. Scholarly discussions, particularly around encoding standards, advocate for precise nomenclature to resolve display inconsistencies, sometimes referring to the comma below variant as a "subcomma" in typographic analyses.

Historical name shifts trace back to early 16th-century European printing manuals, where the mark appeared before the term cedilla standardized. In French typographic texts, it was often described as a "comma below" (virgule sous la lettre) or "subscript apostrophe" (apostrophe souscrite), reflecting its initial perception as a modified punctuation element rather than a dedicated diacritic. These earlier designations evolved as the mark's role solidified in orthographic reforms, transitioning to the diminutive Z-based names by the late 1500s.

Primary Uses by Letter

With C

The cedilla under the letter C, denoted as ç, serves to modify the pronunciation of C from the velar stop /k/ to the alveolar /s/ when it precedes the back vowels a, o, or u in several Romance languages. This convention emerged in the late medieval period to distinguish palatalized consonants in evolving Romance phonologies, where the affricate /ts/ before back vowels gradually simplified to /s/ while spelling lagged behind spoken changes.

In French, the cedilla has been standard since the 16th century, with early adoption in printed texts to reflect the soft pronunciation; grammarian Louis Meigret formalized its use in his 1550 Tretté de la grammère françoèze, the first French grammar written in French, as part of phonetic reforms to align spelling with contemporary speech. The Académie Française, established in 1635 to regulate the language, codified ç in its 1694 Dictionnaire and subsequent editions, mandating its placement exclusively before a, o, or u to ensure /s/ (e.g., garçon for "boy," where plain garcon would imply /garkɔ̃/; façade for "facade"). Exceptions occur in proper names (e.g., Caron) or archaic spellings, but modern rules prohibit ç before e or i, where plain C already yields /s/, and its omission before back vowels is nonstandard. This convention influences English loanwords like façade and garçon, retaining ç etymologically despite anglicized pronunciations.

Portuguese employs ç identically before a, o, or u to produce /s/, as regulated by the 1990 Acordo Ortográfico da Língua Portuguesa under the Community of Portuguese Language Countries (CPLP), unifying Brazilian and European variants (e.g., ação for "action," praça for "square"). Loanwords adapt similarly, with ç used in terms like açúcar (ultimately from Arabic) to match native phonetics.

In Catalan, standardized by the Institut d'Estudis Catalans (IEC) since 1913, ç indicates /s/ before a, o, or u and at the end of a word, distinguishing it from hard /k/ (e.g., plaça for "square," dolç for "sweet"); this usage, inherited from medieval Occitano-Romance scripts, applies in both Central and Valencian norms without exceptions for loanwords, which are often respelled to fit.

With S

The cedilla under the letter S, forming ş, modifies its pronunciation from the voiceless alveolar sibilant /s/ to the voiceless postalveolar fricative /ʃ/, akin to the "sh" in English "ship." This adaptation allows for precise representation of the sound in languages where the plain S does not suffice for native phonemes. In non-Romance contexts, particularly the Turkic languages, this function emerged to bridge the phonetic gaps between traditional scripts and the Latin alphabet.

In Turkish, the letter ş was formalized during the 1928 alphabet reform led by Mustafa Kemal Atatürk, which replaced the Ottoman Perso-Arabic script with a phonetically tailored Latin-based system to promote literacy and secular modernization. The reform, overseen by a language council, incorporated ş alongside other diacritics like ç and ğ to capture distinct Turkish sounds, including the /ʃ/ phoneme, previously written with the Arabic letter ش (shīn). For instance, şeker ("sugar") begins with ş, denoting /ʃ/. The Turkish Language Association (TDK), established in 1932, upholds these orthographic standards; ş is not restricted to positions before particular vowels.

Romanian adopted ș in the 19th century amid the transition from Cyrillic to Latin script, particularly in Wallachian dialects that formed the basis of standard Romanian. This shift, driven by nationalistic efforts to emphasize Romance roots, introduced ș to represent /ʃ/ in loanwords and native terms, as seen in București, the capital's name, which reflects historical Wallachian orthographic practices dating back to transitional alphabets in the 1840s. The letter's use solidified in official orthography by the late 19th century, distinguishing Romanian from neighboring Slavic influences. The Romanian Standards Association adopted SR 13411 in 1999, standardizing ș as s-comma below (Ș/ș) rather than s-cedilla (Ş/ş), addressing inconsistencies in digital encoding where cedilla variants had been prevalent due to early ISO standards; however, cedilla forms persist in some fonts for visual compatibility while maintaining the /ʃ/ pronunciation.

With T

The cedilla placed beneath the letter T, forming ț, serves a phonetic purpose in rare orthographic systems to indicate palatalization of the voiceless alveolar stop /t/, typically softening it to a palatal stop [tʲ] or affricate such as /tʃ/ or /c/. In historical Breton orthography, the cedilla under T was introduced to mark palatalization of dentals, including /t/, particularly to represent /ts/ or a palatal variant in contexts like pre-vocalic positions before front vowels such as /i/ or /e/. This usage appeared in early modern systems for phonetic accuracy, though specific printed examples vary due to early printing limitations. This usage declined sharply with 19th- and 20th-century orthographic reforms, including the 1941 Peurunvan standardization, which favored alternative diacritics like apostrophes or digraphs (e.g., "tz" for /ts/); today, ț appears only in academic transliterations or historical linguistics studies of Breton texts. Livonian, a Finnic language of Latvia, employs ț in its modern 36-letter Latin alphabet to denote a fully palatalized voiceless plosive [tʲ], distinct from the partial palatalization in related Estonian. This sound occurs in native vocabulary and loanwords, often lengthening phonetically in word-final monosyllabic contexts with plain tone, as in "kațki" ('broken'), where it contrasts with plain /t/ to convey palatal articulation influenced by vowel harmony or adjacent front vowels. The orthography, standardized in the mid-20th century based on Latvian conventions, retains ț for this purpose despite the language's near-extinction, with usage persisting in educational materials, folklore collections, and the works of linguists like Valts Ernštreits. 
Overall, the cedilla with T remains marginal outside these contexts, supplanted by orthographic simplifications and digital encoding preferences for comma-below variants in related scripts, underscoring its role as a vestige of early modern European efforts to adapt Latin letters to minority language phonologies.

Uses in Other Languages and Scripts

Latvian

In Latvian orthography, the cedilla—often rendered as or visually resembling a comma below (known as apakškomats)—serves primarily to indicate palatalized consonants, distinguishing them from their non-palatal counterparts. This appears under the letters g, k, l, and n, producing ģ (pronounced /ɟ/, akin to the "g" in English "argue"), ķ (/c/, a voiceless palatal stop), ļ (/ʎ/, a palatal lateral approximant), and ņ (/ɲ/, a palatal nasal). These modifications reflect the language's rich inventory of palatal sounds, which were historically represented by digraphs or other conventions in earlier spelling systems. The use of the cedilla in Latvian was formalized during the 1908 orthographic reform, led by linguists Kārlis Mīlenbahs and Jānis Endzelīns as part of the Orthography Commission of the Riga Latvian Society. This reform aimed to create a more phonetic and standardized Latin-based script, replacing inconsistent German- and Polish-influenced orthographies prevalent in the 19th century. Although proposed in 1908, full implementation occurred after Latvia's independence in 1918, with official decrees in 1922 confirming the system's adoption and phasing out digraphs like gj for ģ or nj for ņ. For instance, the word daļa (meaning "part," with palatal /ʎ/) contrasts with dala (a non-palatal variant in some contexts), highlighting how the cedilla clarifies pronunciation and morphology. The diacritic's form evolved from a classic hooked cedilla in early 20th-century publications to a more comma-like shape by the mid-century, influenced by printing practices and typographic standards. Today, the cedilla (or comma below) remains a core element of official Latvian orthography, integral to state documents, education, and media, despite occasional technical challenges in digital rendering and discussions around diacritic harmonization in international standards. 
It frequently appears in surnames, such as Ozoliņš (featuring ņ for the palatal nasal) or Kalniņš, underscoring its everyday prevalence among Latvia's over two million speakers. Efforts to retain these marks, including Unicode updates distinguishing the Latvian comma from the traditional cedilla, have preserved linguistic identity amid broader European integration since EU accession in 2004.

Marshallese

In Marshallese orthography, the cedilla is used under the letters l, m, n, and o (rendered as ļ, m̧, ņ, o̧) to indicate secondary articulations, such as palatalization, velarization, or labiovelarization, which are crucial for distinguishing phonemes in this Micronesian language spoken by about 60,000 people in the Marshall Islands. For example, the cedilla under l (ļ) marks a palatalized lateral approximant, while under m and n it denotes labial-velar or palatal variants, and under o it signals a specific vowel quality or nasalization in certain contexts. These diacritics reflect the language's complex phonological system, including 13 vowels and four diphthongs, with the orthography standardized in the mid-20th century during U.S. administration after World War II.

The cedilla in Marshallese is always a true hooked form in standard printed text, distinct from comma-below variants used elsewhere, and its omission can lead to mispronunciation or ambiguity. For instance, words like ļōk (with palatal l) contrast with non-palatal forms. The system has been formalized in linguistic descriptions, with ongoing Unicode support to ensure proper rendering, as the language lacks tones but relies heavily on these diacritics. Today, it is taught in schools and used in official documents, though digital fonts sometimes display issues with combining cedillas.

Vute

In the Vute language, a Mambiloid Niger-Congo language spoken primarily in Cameroon with approximately 44,000 speakers there and 3,600 in Nigeria (as of 2020s estimates), the cedilla (¸) serves as a diacritical mark to indicate nasalization of vowels. This orthographic convention applies to all vowels in the language, which features a rich system of 32 vowel phonemes including oral and nasal variants, short and long forms. The cedilla is placed directly under the vowel letter to denote nasal quality, distinguishing nasalized vowels from their oral counterparts in a language where nasalization is phonemic and can affect meaning.

The use of the cedilla for vowel nasalization in Vute emerged as part of a standardized orthography developed in the late 20th century by SIL International linguists working on Cameroonian languages. This system draws on the General Alphabet of Cameroonian Languages proposed in 1978 and was documented in detail by Rhonda Thwing in 1981. In Vute, a tonal language with five contrastive tones (high, mid, low, rising, falling), the cedilla integrates with tone marks, which are typically placed above vowels; for instance, nasalized vowels can simultaneously bear tone diacritics without altering their positioning rules. This orthography aids in distinguishing nasalized forms in a language influenced by regional Bantoid phonological traits, though Vute itself lacks widespread click consonants.

Examples of cedilla usage appear in linguistic documentation, such as the word for "one" transcribed as lvhə̨ (with the cedilla under the central vowel to mark nasalization) or forms like kdə̨́də̨ ("deep"), where the nasalized central vowel combines with a high tone mark on the first and a mid tone on the second. Orthographic rules specify that the cedilla follows standard Latin vowel letters (a, e, ə, i, o, u, etc.) and precedes any tone indicators for clarity in writing. These conventions help capture Vute's complex prosody, where nasalization interacts with tone to convey lexical distinctions.

Today, the cedilla's application in Vute remains confined to academic and missionary linguistic materials produced by SIL International, with limited adoption in community literacy or digital media due to the language's endangered status and small speaker base. Challenges persist in rendering the mark accurately in digital fonts, as the combining cedilla (Unicode U+0327) is recommended for African language nasalization but often lacks consistent support in standard typography, leading to display issues in non-specialized software. Ongoing efforts by Unicode and SIL aim to improve encoding for such diacritics in lesser-resourced languages.

Hebrew and Other Scripts

In academic transliteration systems for Hebrew, the cedilla has been proposed or used under "c" to represent the phoneme /ts/ of the letter tsade (צ), as in a phonemic conversion scheme designed for reversibility across Hebrew dialects and periods. For example, the word צדק (justice) would be rendered as "çedeq" in this system. However, such usage remains rare in modern Israeli Hebrew romanization, which typically simplifies to "ts" without diacritics for broader accessibility. The cedilla under "s" (ş) appears in older scholarly systems for shin (ש) or sin (שׂ) to denote the /ʃ/ sound, though contemporary standards like the Society of Biblical Literature (SBL) prefer the caron (š) instead. In the SBL Handbook of Style, shin is consistently transliterated as š, reflecting a shift away from cedilla-based marks in biblical scholarship.

In Arabic transliteration, the cedilla was employed in some early systems for certain pharyngealized sounds, but the 1972 United Nations romanization system (Resolution II/8) used a sub-macron (s̱) for the emphatic /sˤ/ of sad (ص). This approach aimed to distinguish pharyngealized sounds but has been largely supplanted by dots below (e.g., ṣ) in later standards like BGN/PCGN. For other Semitic scripts like Ugaritic and Phoenician, the cedilla occasionally appears in European scholarly traditions to mark specific consonants, such as under "t" (ţ) for emphatic sounds in Semitic transliterations, though dots or carons are more common. In Phoenician studies, ç has been used sporadically for velar or /k/ variants, but such applications are not standardized.

Nineteenth-century biblical scholarship sometimes adapted the cedilla for guttural sounds (e.g., ḥ or ʿ) in Semitic reconstructions, drawing from early European phonetic notations to approximate pharyngeal fricatives. However, standards like those from the United Nations Group of Experts on Geographical Names (UNGEGN) now avoid the cedilla in favor of digraphs (e.g., kh for /x/) or other diacritics to promote consistency across Semitic languages. In modern applications for Semitic languages, including software tools for Yiddish romanization, the cedilla under "c" (ç) supports /ts/ or affricate representations in hybrid systems, facilitating digital processing of texts with Hebrew-origin words. For instance, Yiddish terms borrowing tsade may use ç in phonemic parsers to align with Unicode standards.

Diacritical Comma

The diacritical comma, also known as the comma below, is a comma-shaped diacritical mark positioned beneath a base letter to alter its pronunciation. It is typographically distinct from the cedilla, which has a more curved, hook-like form. Both marks share phonetic roles in modifying sounds, such as palatalization or affrication, but international standards have treated them separately: ISO 8859-2 (1987) initially encoded cedilla forms, while later standards such as ISO/IEC 8859-16 (2001) recognized comma variants for specific languages.

In Romanian and Moldovan orthography, the virgula (the official term for the diacritical comma) is applied to s and t to produce ș (/ʃ/) and ț (/t͡s/), as standardized by the Romanian Academy in 2003 and reaffirmed in the 2005 orthographic dictionary DOOM 2. Despite this, many digital fonts render these letters with cedilla-like glyphs due to legacy encoding practices, leading to widespread visual inconsistency in printed and online Romanian text. Latvian uses the comma below (apakškomats) primarily for palatalized consonants like g (ģ), k (ķ), l (ļ), n (ņ), and r (ŗ), where it indicates a softer articulation, separate from the caron marks on s (š) and z (ž).

Confusion between the diacritical comma and cedilla intensified in the 1990s digital transition, as TrueType fonts in early computing environments often merged their glyphs, treating the comma as a stylistic variant of the cedilla to conserve space in limited character sets. This merger stemmed from initial Unicode unifications, such as equating Turkish s-cedilla with Romanian s-comma, resulting in incorrect displays across platforms. The European Commission, through updates to language support standards around 2001–2003, advocated for distinct comma glyphs in Romanian to align with national orthography, influencing font developers and software vendors.
Efforts to resolve these issues began with Unicode's 1993 clarification in version 1.1, which separated the combining comma below (U+0326) from the combining cedilla (U+0327) as independent code points to support accurate rendering. Precomposed Romanian characters like Ș (U+0218) and Ț (U+021A) were introduced in Unicode 3.0 (1999), enabling proper distinction. Prior to 2000, Microsoft Windows systems frequently misrendered Romanian diacritics by defaulting to cedilla forms in fonts like Arial and Times New Roman, a problem persisting until improvements in Windows Vista (2007) via the European Union Expansion Font Update.
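The code-point distinction described above can be illustrated with a short Python sketch. It assumes the input is already known to be Romanian text (Turkish legitimately uses the cedilla forms, so a blanket conversion would be wrong there); the table `LEGACY_TO_COMMA` and the helper `to_comma_below` are hypothetical names introduced only for this example.

```python
# Legacy cedilla code points and their comma-below counterparts.
# Assumption: the text is Romanian; Turkish text must NOT be converted.
LEGACY_TO_COMMA = str.maketrans({
    "\u015E": "\u0218",  # S with cedilla (capital) -> S with comma below
    "\u015F": "\u0219",  # s with cedilla (small)   -> s with comma below
    "\u0162": "\u021A",  # T with cedilla (capital) -> T with comma below
    "\u0163": "\u021B",  # t with cedilla (small)   -> t with comma below
})


def to_comma_below(text: str) -> str:
    """Replace legacy s/t-cedilla letters with the comma-below forms."""
    return text.translate(LEGACY_TO_COMMA)


if __name__ == "__main__":
    legacy = "\u015Ftiin\u0163\u0103"  # "stiinta" typed with cedilla code points
    fixed = to_comma_below(legacy)
    print([f"U+{ord(ch):04X}" for ch in fixed])
```

Characters outside the four-entry table, including ç, pass through unchanged, so the helper only touches the letters affected by the Romanian encoding history.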

Evolution and Printing Issues

By the 15th century, as printing emerged, irregular scribal hooks were adapted into early type designs, often retaining variable, hand-like forms influenced by manuscript traditions. In the 18th century, type foundries began standardizing the cedilla into more consistent curved shapes to suit mechanical casting and uniformity.

Printing the cedilla presented challenges in the metal type era because of its sub-baseline position, which led to alignment issues, especially in multi-line compositions or when combined with varying letter heights. Early digital vectorization in PDFs exacerbated distortions, as low-resolution rendering and primitive font hinting caused the cedilla's fine curves to pixelate or warp, particularly in composite glyphs like ç.

Design variations reflect typeface styles: curved, calligraphic forms appear in serif fonts, evoking nib-pen influences, while 20th-century sans-serifs adopted a straighter, comma-like shape for geometric simplicity and legibility. Modern revivals often draw from calligraphy to infuse organic flow, as seen in contemporary type designs that prioritize expressive hooks over rigid standardization.

Contemporary issues include reduced accessibility on low-resolution screens, where aliasing distorts the cedilla's subtle form, potentially hindering readability in multilingual texts. Type designers recommend consistent baseline placement for cedillas to ensure optical alignment across weights and sizes.

Technical Encoding

Unicode Representation

The cedilla is encoded in Unicode through both precomposed characters and a combining diacritic, allowing for representation in various languages and scripts. The primary precomposed codepoint for the lowercase c with cedilla is U+00E7 (ç), named LATIN SMALL LETTER C WITH CEDILLA, located in the Latin-1 Supplement block (U+0080–U+00FF). Similarly, U+015F (ş) represents LATIN SMALL LETTER S WITH CEDILLA in the Latin Extended-A block (U+0100–U+017F), while U+0163 (ţ) denotes LATIN SMALL LETTER T WITH CEDILLA, also in Latin Extended-A. For uppercase forms, the corresponding codepoints are U+00C7 (Ç), U+015E (Ş), and U+0162 (Ţ).

The pairs U+015E (Ş) / U+015F (ş) and U+0162 (Ţ) / U+0163 (ţ) carry a cedilla glyph, with Ş and ş used in Turkish. For Romanian, while these codepoints were historically used, the preferred encoding for the comma below forms is U+0218 (Ș), U+0219 (ș), U+021A (Ț), U+021B (ț) in Latin Extended-B (U+0180–U+024F).

The combining cedilla is encoded at U+0327 (◌̧), named COMBINING CEDILLA, in the Combining Diacritical Marks block (U+0300–U+036F), and can be applied to base letters to form combinations such as e + U+0327 (ȩ). For the comma below, the combining form is U+0326 (◌̦) COMBINING COMMA BELOW, and the Romanian precomposed characters decompose to base + U+0326. The Latvian letters, by contrast, are encoded under cedilla names: ķ (U+0137), for example, is LATIN SMALL LETTER K WITH CEDILLA and decomposes to k + U+0327, even though it is conventionally drawn with a comma.

These encodings were introduced early in the Unicode Standard's development. The precomposed cedilla characters from Latin-1, including U+00E7, were included in Unicode 1.0, released in October 1991, to support compatibility with existing Western European encodings. The combining cedilla (U+0327) and additional precomposed forms like U+015F and U+0163 were added in Unicode 1.1 in 1993, expanding support for extended Latin scripts. The comma below precomposed characters (U+0218–U+021B) were added in Unicode 3.0 in 1999.
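The character names and the combining behaviour mentioned above can be checked with Python's standard `unicodedata` module. This is a minimal sketch; the helper names `describe` and `compose_with_cedilla` are assumptions made for illustration, not part of any standard API.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


def compose_with_cedilla(base: str) -> str:
    """Append COMBINING CEDILLA (U+0327) and compose to NFC where possible."""
    return unicodedata.normalize("NFC", base + "\u0327")


if __name__ == "__main__":
    # Precomposed letters and the two combining marks discussed in the text.
    for ch in ["\u00E7", "\u015F", "\u0219", "\u0327", "\u0326"]:
        print(describe(ch))

    # e + U+0327 composes to the single code point U+0229 under NFC.
    print(describe(compose_with_cedilla("e")))
```

When no precomposed character exists for a given base letter, `compose_with_cedilla` simply returns the two-code-point sequence, which is still valid text and is left to the font to render.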
Unicode 3.0, released in September 1999, provided clarifications on the handling of precomposed versus composed forms through updates to normalization algorithms, ensuring consistent decomposition and composition rules for diacritics like the cedilla to avoid ambiguities in text processing.

Compatibility with legacy standards arises from direct mappings between ISO/IEC 8859-1 (Latin-1) and Unicode. In ISO 8859-1, c with cedilla is at byte value 0xE7, which maps one-to-one to Unicode U+00E7, facilitating migration of Western European text without loss. Decomposition rules further support interoperability: for instance, the precomposed U+00E7 (ç) canonically decomposes to U+0063 (c) followed by U+0327 (◌̧), as defined in the Unicode Character Database's Decomposition_Mapping property. Similarly, U+015F (ş) decomposes to s + U+0327, while U+0219 (ș) decomposes to s + U+0326. This allows systems to normalize text into composed (NFC) or decomposed (NFD) forms as needed.

Font support for the cedilla requires coverage across the relevant Unicode blocks to ensure proper rendering. The Latin-1 Supplement handles basic forms like ç, while Latin Extended-A is essential for characters such as Ş and ş used in Turkish. Latin Extended-B (U+0180–U+024F) provides additional cedilla variants, such as U+0228 (Ȩ), LATIN CAPITAL LETTER E WITH CEDILLA, though these are less common, as well as the preferred Romanian forms Ș and ș.

Normalization can also affect the stored order of marks when multiple diacritics are present. For example, in a sequence containing a cedilla (canonical combining class 202, attached below) and another mark such as a dot below (class 220), canonical ordering places the cedilla first, because marks are sorted by ascending combining class. Comprehensive font families must include glyphs for these blocks to avoid fallback rendering issues.
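The decomposition, Latin-1 mapping, and canonical ordering rules described above can be verified with a short Python sketch using only the standard library. The helper name `nfd` is an assumption made for this example.

```python
import unicodedata


def nfd(text: str) -> str:
    """Return the canonical decomposition (NFD) of the text."""
    return unicodedata.normalize("NFD", text)


if __name__ == "__main__":
    # Precomposed letters split into base letter + combining mark.
    for ch in ["\u00E7", "\u015F", "\u0219"]:
        parts = " ".join(f"U+{ord(c):04X}" for c in nfd(ch))
        print(f"U+{ord(ch):04X} -> {parts}")

    # The Latin-1 byte 0xE7 maps one-to-one to U+00E7.
    print(hex(ord(b"\xe7".decode("latin-1"))))

    # Canonical combining classes of cedilla and dot below.
    print(unicodedata.combining("\u0327"), unicodedata.combining("\u0323"))

    # Canonical ordering: a dot below typed before a cedilla is reordered
    # so that the lower combining class (the cedilla) comes first.
    reordered = nfd("s\u0323\u0327")
    print([f"U+{ord(c):04X}" for c in reordered])
```

Comparing strings only after normalizing both sides to the same form (NFC or NFD) avoids false mismatches between precomposed and decomposed spellings of the same letter.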

Keyboard and Input Methods

In standard keyboard layouts for languages using the cedilla, dedicated keys or modifier combinations facilitate direct input. The French AZERTY layout provides 'ç' directly on the 9 key of the number row. The Turkish QWERTY layout has a dedicated key for 'ş'. In the Romanian Programmers layout, 'ț' is produced using right-Alt + 't'.

Software-based methods offer cross-layout alternatives for inserting cedilla characters. On Windows, the Character Map utility allows selection and insertion of 'ç', while holding Alt and typing 0231 on the numeric keypad directly inputs the lowercase form. macOS users can press Option + 'c' to type 'ç'. Linux systems support Compose key sequences, such as Compose + 'c' + ',' to yield 'ç'.

Mobile platforms integrate gesture-based input for diacritics. On iOS, long-pressing the 'c' key displays a menu of variants, from which users can slide to select 'ç'. For web applications, developers embed the cedilla using HTML entities like &ccedil; or the decimal reference &#231;.

Accessibility features cover cedilla input and rendering for assistive technologies. Screen readers such as JAWS and NVDA support pronunciation and navigation of cedilla characters in Unicode-compliant text across supported languages. Input Method Editors (IMEs) for languages like Latvian allow customization of layouts to handle cedilla-like diacritics (e.g., comma below in 'ģ') via system language settings. These methods align with Unicode representations, such as U+00E7 for 'ç'.
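The HTML character references mentioned above can be checked with Python's standard `html` module. This is a small sketch; the helper name `to_numeric_reference` is an assumption introduced for illustration.

```python
import html
from html.entities import codepoint2name, name2codepoint


def to_numeric_reference(text: str) -> str:
    """Encode non-ASCII characters as decimal HTML character references."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


if __name__ == "__main__":
    # Both the named entity and the decimal reference decode to U+00E7.
    print(html.unescape("&ccedil;") == html.unescape("&#231;"))

    # The standard entity table links the name "ccedil" and code point 231.
    print(name2codepoint["ccedil"], codepoint2name[0xE7])

    # Producing a decimal reference for ASCII-only output.
    print(to_numeric_reference("gar\u00E7on"))
```

On pages served as UTF-8 the literal character can be written directly; references remain useful mainly when the output channel is restricted to ASCII.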
