T-comma (majuscule: Ț, minuscule: ț) is a letter which consists of a t with a diacritical comma underneath it, and is distinct from t-cedilla. It is part of the Romanian alphabet, used to represent the Romanian language sound /t͡s/, the voiceless alveolar affricate (like the letter C in Slavic languages that use the Latin alphabet). The letter is also a part of the Finno-Ugric Livonian language alphabet, representing the /c/ sound.[1]
It is written as the letter T with a small comma below and it has both the lower-case (U+021B) and the upper-case variants (U+021A).
The letter was proposed in the Buda Lexicon, a book published in 1825, which included two texts by Petru Maior, Orthographia romana sive Latino-valachica una cum clavi and Dialogu pentru inceputul limbei române, introducing ș for /ʃ/ and ț for /t͡s/.[2]
Software support
T-comma was not part of early Unicode versions; it was introduced only in Unicode 3.0.0 (September 1999) at the request of the Romanian national standardization body. Thus, some legacy systems do not have fonts compatible with it; for example, Microsoft's Windows XP requires installing the European Union Expansion Font Update.[3] Full support of this letter has been available on Macintosh computers since Mac OS X and on PCs since Windows Vista. Although accessibility issues are a concern only on legacy systems, some newly produced Romanian texts still use Ţ (T-cedilla, available from Unicode version 1.1.0, June 1993) because of inertia, ignorance, or both.
The letter is placed in Unicode in the Latin Extended-B range, under "Additions for Romanian", as the "Latin capital letter T with comma below" (U+021A) and "Latin small letter t with comma below" (U+021B).[4] In HTML these can be encoded by the numeric character references `&#x21A;` and `&#x21B;`, respectively.

In Windows XP, most of the fonts including Arial Unicode MS render T-cedilla as T-comma because T-cedilla was not believed to be used in any language. (It is in fact used, but in very few languages. T with Cedilla exists as part of the General Alphabet of Cameroon Languages, in some Gagauz orthographies, in local spelling usages for the Kabyle language, and possibly elsewhere.) Technically, this is incorrect as a mismatching glyph is associated with a certain character code. Therefore, text written using S-cedilla and T-cedilla can often look as if it had been written using S-comma and T-comma. However, in order to correctly encode and render both S-comma and T-comma, one has to install the European Union Expansion Font Update. There is no official way to add keyboard support for these characters. In order to type them, one has to either install third-party keyboards, or use the Character Map.
Linux distributions have been able to render S-comma and T-comma correctly since at least 2005. If these characters are missing from a certain font, they are substituted with the glyph from another font. Although the X.Org Server has supported the correct keyboard (ro comma) since at least 2005, selecting this keyboard from the user interface (e.g. GNOME Keyboard Properties) became possible only later.
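Because fonts may draw T-cedilla and T-comma identically, as described above for Windows XP, the variant a text actually uses can only be determined by inspecting its code points. The following is a minimal sketch in Python, using only the standard library; the function name `count_t_variants` and the sample string are illustrative assumptions, not part of any standard tool.

```python
import unicodedata
from collections import Counter

CEDILLA_FORMS = {"\u0162", "\u0163"}  # T-cedilla, capital and small
COMMA_FORMS = {"\u021A", "\u021B"}    # T-comma, capital and small


def count_t_variants(text: str) -> Counter:
    """Count T-cedilla and T-comma characters in a text.

    The text is first normalized to NFC so that a plain t followed by a
    combining cedilla or combining comma below is counted as well.
    """
    counts = Counter(cedilla=0, comma=0)
    for ch in unicodedata.normalize("NFC", text):
        if ch in CEDILLA_FORMS:
            counts["cedilla"] += 1
        elif ch in COMMA_FORMS:
            counts["comma"] += 1
    return counts


# Hypothetical sample: the first word uses t-cedilla, the second t-comma.
sample = "\u0163ar\u0103 \u021Bar\u0103"
print(count_t_variants(sample))
```

A check of this kind only reports which code points are present; deciding whether a cedilla form is a mistake still depends on knowing that the text is Romanian.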
Character encoding
| Preview | Ț | | ț | |
|---|---|---|---|---|
| Unicode name | LATIN CAPITAL LETTER T WITH COMMA BELOW | | LATIN SMALL LETTER T WITH COMMA BELOW | |
| Encodings | decimal | hex | decimal | hex |
| Unicode | 538 | U+021A | 539 | U+021B |
| UTF-8 | 200 154 | C8 9A | 200 155 | C8 9B |
| Numeric character reference | `&#538;` | `&#x21A;` | `&#539;` | `&#x21B;` |
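The values in the table can be reproduced with a few lines of Python. This is a small sketch using only the standard library; the helper name `describe` is an assumption made for illustration.

```python
import unicodedata


def describe(ch: str) -> dict:
    """Return the encodings listed in the table for a single character."""
    cp = ord(ch)
    utf8 = ch.encode("utf-8")
    return {
        "name": unicodedata.name(ch),
        "code_point": f"U+{cp:04X}",
        "decimal": cp,
        "utf8_hex": utf8.hex(" ").upper(),  # bytes.hex(sep) needs Python 3.8+
        "utf8_decimal": list(utf8),
        "ncr_decimal": f"&#{cp};",
        "ncr_hex": f"&#x{cp:X};",
    }


# Capital and small t-comma.
for ch in ("\u021A", "\u021B"):
    print(describe(ch))
```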
See also
- T-cedilla (Ţ)
- D-comma (D̦)
- S-comma (Ș)
- Other diacritics confused with the cedilla
- C c : Latin letter C
References
1. Everson, Michael (2001-11-12). "Livonian" (PDF). The Alphabets of Europe. Retrieved 2015-06-01.
2. Marinella Lörinczi Angioni, "Coscienza nazionale romanza e ortografia: il romeno tra alfabeto cirillico e alfabeto latino", La Ricerca Folklorica, No. 5, La scrittura: funzioni e ideologie (Apr. 1982), pp. 75–85.
3. European Union Expansion Font Update.
4. Unicode code charts. Latin Extended-B: Range 0180–024F.
History
Origins
The letter Ț derives from the Latin letter T, adapted through the addition of a diacritic during the 19th-century shift from the Cyrillic to the Latin alphabet in Romania, as part of efforts to align the writing system with the language's Latin roots.[7] This transition, which began in earnest after 1830 and culminated in the official adoption of the Latin script in 1860, necessitated modifications to represent Romanian phonemes absent in standard Latin letters.[7] Diacritics for T, initially in the form of a cedilla or hook, emerged to distinguish its modified pronunciation from the plain T.[8]
French orthography significantly influenced the development of Romanian diacritics, including early variants of the mark under T, as Romanian intellectuals drew on French models for softening consonant sounds around the 1860s.[8] Proponents like Titu Maiorescu advocated for cedilla-like forms in 1866, citing their familiarity from French to promote phonetic clarity in the emerging Latin-based system.[8] These adaptations reflected broader European trends in orthographic reform, where diacritics were employed to bridge Latin scripts with local phonetic needs.
The first notable printed appearance of Ț occurred in the 1825 Lexiconul de la Buda, which reprinted Petru Maior's 1819 proposal for a cedilla under T, during early Latinization initiatives predating full alphabet standardization.[9] Such experimental forms appeared sporadically in mixed-script texts from the 1820s to the 1850s, as printers adapted Latin typefaces amid the uneven shift from Cyrillic.[7] The Romanian alphabet, ultimately comprising 31 letters, incorporated these innovations to fully represent the language.[7]
Adoption in Romanian
The adoption of the letter Ț into the Romanian alphabet marked a key step in the re-latinization efforts of the 19th century, aiming to better represent non-Latin phonemes through diacritics while transitioning from the Cyrillic script. In 1819, Petru Maior, a prominent figure in the Transylvanian School, proposed the use of diacritics such as Ț (t with cedilla) in his work Orthographia romana sive latino-valachica una cum clavi, to denote the affricate sound /ts/ and distinguish it from digraphs like "ts." This innovation was part of broader initiatives by Transylvanian scholars to align Romanian orthography with its Latin roots, replacing the inadequate Cyrillic representations that had been in use for centuries. Maior's proposal was reprinted in the influential Lexicon românesc-latin-unguresc-nemțesc (Buda Lexicon) of 1825, helping to disseminate the diacritic system among intellectuals.[8]
The official shift to the Latin alphabet in 1860, following the unification of Wallachia and Moldavia, incorporated diacritics like Ț to address unique phonetic needs, though initial implementations varied, with multiple spelling options persisting. This reform, driven by nationalistic movements emphasizing Romanian's Romance heritage, gradually supplanted the Cyrillic script, which the Romanian Orthodox Church continued using in publications until 1881. By 1881, under the influence of Titu Maiorescu's 1866 treatise Despre scrierea limbii române, which advocated a phonetic principle prioritizing pronunciation over etymology, Ț became more standardized as a single grapheme for /ts/, reducing reliance on transitional mixed alphabets.[8]
The 1904 Romanian Orthographic Regulations, formalized by the Romanian Academy, solidified Ț's role by enforcing the "one sound, one graphic sign" rule, explicitly replacing digraphs such as "ts" with the diacritic in standard literary usage. This reform, building on Maiorescu's phonetic advocacy, streamlined orthography and promoted uniformity across print media and education, making Ț essential for words like țară (country), where it accurately captured the affricate without ambiguity. The changes ensured Ț's integration as a core element of the modern Romanian alphabet, reflecting a balance between phonetic precision and Latin-inspired simplicity.[9][8]
Phonetics
Sound Representation
The letter Ț primarily represents the voiceless alveolar affricate sound in Romanian, transcribed in the International Phonetic Alphabet (IPA) as /t͡s/. This affricate combines a stop closure at the alveolar ridge followed by a sibilant fricative release, functioning as a single phoneme rather than a sequence of distinct sounds. It is akin to the "ts" cluster in English words like "cats," but integrated as a unitary consonant in Romanian phonetics. A representative example is the word țară, meaning "country," pronounced as /ˈt͡sarə/, where Ț initiates the syllable with the affricate articulation. Another instance appears in puțin ("little, few"), rendered as /puˈt͡sin/, highlighting the sound's occurrence in various lexical positions.[10][11]
This contrasts sharply with the letter T, which denotes the voiceless alveolar stop /t/, a pure plosive without the subsequent fricative component, as in tare ("hard, strong"). The affrication in Ț ensures it occupies a dedicated slot in the phonological inventory, avoiding confusion with the stop in minimal pairs or derivations.[10]
Phonological Role
The letter Ț represents the voiceless alveolar affricate phoneme /t͡s/ in Romanian, a distinct consonant that integrates into the language's phonological system as one of the twenty consonants in its inventory. This phoneme occurs in all positions within words: initially, as in țară (/ˈt͡sarə/, "country"); medially, as in cățel (/kəˈt͡sel/, "puppy"); and finally, as in drumeț (/druˈmet͡s/, "hiker"). The phonemic status of /t͡s/ is demonstrated by minimal pairs that contrast it with the stop /t/, such as tine (/ˈti.ne/, "you") versus ține (/ˈt͡si.ne/, "holds"), showing how the affricate creates meaningful distinctions in the lexical inventory. Although true minimal pairs are less frequent due to distributional constraints, such contrasts underscore /t͡s/'s role as a full phoneme rather than a variant of /t/ plus /s/.
In Romanian's phonological patterns, /t͡s/ interacts with stress assignment, which typically falls on one of the last three syllables and can fall on a syllable beginning with the affricate, as in țară, where the initial syllable bears primary stress. The phoneme also occurs in words subject to metaphony, such as țară (/ˈt͡sarə/), which forms the plural țări (/ˈt͡sərʲ/) with raising of the stressed vowel /a/ to /ə/ before the plural ending -i.[12]
Orthographic Development
Comma vs. Cedilla Variants
The cedilla variant of the letter, represented as Ţ (majuscule) and ţ (minuscule) with Unicode code points U+0162 and U+0163, employs a hook-like mark positioned below the stem of the T. This form was historically utilized in Romanian orthography, borrowed from its use in other Romance languages to indicate sound softening, but it has been deemed phonetically mismatched for the voiceless alveolar affricate /ts/ sound, as the cedilla traditionally denotes modifications toward sibilant pronunciations, such as the /s/ in French ç.[13][9]
In contrast, the comma below variant, Ț (majuscule) and ț (minuscule) with Unicode code points U+021A and U+021B, features a straight, vertical comma-shaped diacritic detached below the letter, providing a more accurate representation for affricates and aligning with conventions in the International Phonetic Alphabet for such sounds.[14] This adoption reflects a shift toward orthographic precision, as the comma avoids the sibilant connotations of the cedilla while better suiting the affricate articulation in Romanian.[9]
Visually, the cedilla curves to the left in a hook shape, often attaching more closely to the letter base, whereas the comma below remains a distinct, upright line without curvature, leading to clear glyph differentiation in typography. Substitution errors between the variants are common in digital and printed texts due to legacy font support; for instance, a 2011–2013 analysis of Romanian web content found the cedilla form appearing in over 91% of instances for words like "mulți" and "și," while the comma below appeared in only about 5%, causing inconsistencies in text processing and search accuracy.[14] The Romanian Academy affirmed the comma variant as standard in its 2005 orthographic guide.[9]
Standardization Efforts
In the late 1990s, efforts to standardize the orthographic form of Ț gained momentum through official technical norms. The Romanian Standards Association (ASRO) adopted SR 13411 in 1999, designating the comma-below variant (Ț/ț) as the official representation for the letter, distinguishing it from the cedilla form (Ţ/ţ) and aligning with historical typographical practices in Romanian printing. This standard provided a foundational framework for its use in formal documents, education, and publishing.[15]
The Romanian Academy further advanced these efforts in 2003 when its Linguistic Institute issued a formal declaration affirming the comma below as the correct diacritic for Ț (and Ș), rejecting the cedilla as non-standard for Romanian. This position was codified in the 2005 edition of the Dicționarul ortografic, ortoepic și morfologic al limbii române (DOOM2), coordinated by the Academy's Institute of Linguistics "Iorgu Iordan – Al. Rosetti." DOOM2 explicitly mandated Ț for all official and normative contexts, including typography and education, marking the culmination of prior proposals to unify the glyph across print and digital media.[9][16]
Despite these authoritative decisions, implementation faced resistance during a transitional period after 2005, driven by entrenched habits and technological limitations. In informal digital texts, the cedilla variant persisted into the 2010s due to incomplete font support in early software and keyboards, leading to mixed usage in online communication and legacy systems. Adoption accelerated with updates such as ISO/IEC 8859-16 in 2001 and Microsoft's inclusion of comma-below glyphs in Windows Vista (2007), gradually enforcing the standard in education and professional settings.[9][17]
Usage
In Romanian
In contemporary Romanian, the letter Ț plays a specific role in representing the voiceless alveolar affricate sound /t͡s/, which is integral to the language's phonetic inventory. This diacritic letter appears with a frequency of approximately 1.08% in Romanian texts, making it a relatively uncommon but essential component of the alphabet. It is prevalent in both native vocabulary and loanwords, such as țară (country), puțin (little), and informație (information), where it denotes the /ts/ phoneme consistently across standard orthography.[18]
According to Romanian orthographic rules, Ț is the mandatory grapheme for the /ts/ sound in all native words and most adapted loanwords, ensuring phonetic accuracy and uniformity in spelling. The digraph "ts" is not interchangeable with Ț in standard usage; it is reserved primarily for foreign proper names or unassimilated borrowings, such as Tsingtao or tsunami, to preserve original etymologies without diacritics. This distinction upholds the phonetic principle of Romanian writing, where single letters like Ț promote concise representation over digraphs.[19]
In Romanian literature and media, Ț features prominently in canonical works, reflecting its embedded role in everyday and poetic language. For instance, in Mihai Eminescu's Scrisoarea III (as rendered in post-1904 editions), the letter appears in phrases like "țară după țară," evoking themes of national expanse and historical journey. Contemporary media, including newspapers like Adevărul and broadcasts on Televiziunea Română, routinely employ Ț in reporting on topics such as țesături (textiles) or political discourse around țărani (peasants), underscoring its vitality in modern expression. These examples illustrate how Ț contributes to the rhythmic and semantic flow of Romanian prose and verse.[20]
In Other Languages
The letter Ţ/ţ (often rendered with a cedilla in Gagauz orthography) is employed in the Gagauz language, a Turkic language spoken primarily in Moldova and Ukraine, to represent the voiceless alveolar affricate /ts/. This usage appears mainly in loanwords and some native terms influenced by neighboring languages, distinguishing it from the standard Turkish orthography, which lacks a dedicated symbol for this sound. For instance, in place names and borrowed vocabulary, such as "ţara" (country, borrowed from Romanian), Ţ/ţ facilitates accurate phonetic representation in Gagauz texts. Although the normative Moldovan script prefers the comma-below variant (Ț/ț) for compatibility, cedilla forms predominate in many Gagauz publications due to historical and typographic conventions.[21]
In the extinct Finnic language Livonian, spoken historically along the Baltic coast in Latvia and Estonia, the letter Ț was incorporated into 20th-century orthographies to denote the palatal plosive /c/ (a palatalized [tʲ] sound, akin to "ty" in rapid English "hit you"). This diacritic emerged in the modern standardized system developed during the 1930s under Latvian influence, evolving from earlier 19th-century notations using acute accents for palatalization in works by linguists like Andreas Johan Sjögren and Ferdinand Johann Wiedemann. Limited to descriptive, onomatopoeic, and loanword contexts, such as Latvian borrowings like "leţ" (song), Ț helped capture palatal distinctions in Livonian's hybrid Latvian-Estonian script, though its use waned with the language's decline after World War II.[22][23]
Typography
Glyph Design
The glyph for the Romanian letter Ț features a capital T with a vertical comma diacritic positioned directly below the vertical stem of the T. This comma is horizontally centered on the glyph for visual balance and vertically aligned at a height consistent with other lowercase diacritics, such as the acute or grave, to maintain typographic harmony across a font family. In design practice, the comma's thickness is often based on the stem width of the T, and its size is adjusted to ensure legibility without overpowering the base letter, particularly by scaling it down from a full punctuation comma to approximately the size of a period.[25][26]
Historically, the diacritic evolved from the cedilla form, which was predominant in Romanian typography and early digital encodings before the official adoption of the comma below in 1998 by the Romanian Standards Association, formalized by the Romanian Academy in 2003, and mandated by law in public institutions from 2006 onward.[27] In 19th-century prints and earlier typefaces, the cedilla was often rendered with a more curved or slanted appearance, reflecting broader European typographic conventions, but post-2005 digital standards emphasized a straight, vertical comma to distinguish it clearly from the hook-shaped cedilla used in languages like Turkish.[27] This shift addressed legacy encoding issues where the two forms were conflated, promoting the comma as the invariant glyph for Romanian.[28]
Proportions of the comma relative to the T base prioritize optical balance, with the diacritic scaled so that it does not disrupt the ascender-descender rhythm in text setting.[25] In serif typefaces, the comma may include subtle terminal flourishes or height variations to integrate with the font's decorative elements, enhancing readability in body text, whereas sans-serif designs favor a simpler, unadorned vertical form for clean alignment and modern aesthetics.[26] The comma below differs from the cedilla primarily in its straighter, less hooked shape, ensuring distinct rendering in Romanian contexts.[28]
Font Rendering
Modern font families vary in their support for the Ț glyph, which requires a precise comma below the stem of the T to adhere to Romanian orthographic standards. Comprehensive typefaces like those in Google Fonts, such as Noto Sans, provide full support for the correct comma-below form at Unicode points U+021A (majuscule) and U+021B (minuscule), ensuring accurate rendering without fallback substitutions.[29] In contrast, older versions of widely used fonts like Arial, particularly those bundled with legacy Microsoft Office installations, often lack the dedicated comma-below glyph and instead fall back to a cedilla variant (U+0162/0163), resulting in visually incorrect diacritics that resemble Turkish forms rather than the required Romanian comma.[30]
Proper kerning is essential for Ț in Romanian text to maintain readability, especially when paired with adjacent vowels such as ă or â, where the descending comma can otherwise cause optical collisions or uneven spacing. Font designers address this by defining specific kerning pairs in the font's metrics tables, adjusting the space between Ț and these vowels to align the diacritic without overlapping stems or descenders; for instance, tools like Glyphs App recommend optical adjustments to anchors for precise positioning in composite glyphs.[25] Such pairs are particularly important in proportional sans-serif and serif faces to prevent the comma from clashing with curved elements in vowels like â.
Cross-platform rendering of Ț benefits significantly from OpenType font format capabilities compared to legacy TrueType limitations. OpenType fonts leverage the 'locl' (localized forms) feature to substitute cedilla glyphs with proper comma-below variants specifically for Romanian and Moldovan locales, enabling consistent display across applications that support OpenType layout tables, such as modern web browsers and desktop publishing software.[31] In contrast, pure TrueType fonts without these advanced features may default to incorrect cedilla rendering or combining marks, leading to positioning errors on platforms like older Windows systems or basic text engines that do not process OpenType substitutions.[32] This disparity highlights the importance of using OpenType-compliant fonts for reliable cross-platform support of Romanian diacritics.
Encoding and Technical Support
Unicode Assignment
The uppercase letter Ț is encoded at code point U+021A (LATIN CAPITAL LETTER T WITH COMMA BELOW), and the lowercase letter ț at U+021B (LATIN SMALL LETTER T WITH COMMA BELOW). These precomposed characters reside in the Latin Extended-B block (U+0180–U+024F), specifically in the sub-block designated for additions supporting Romanian.[33] Both characters were introduced in Unicode version 3.0, released in September 1999, to provide distinct encoding for the comma below diacritic used in Romanian orthography.[34] For compatibility, they include canonical decompositions: U+021A decomposes to U+0054 (LATIN CAPITAL LETTER T) followed by U+0326 (COMBINING COMMA BELOW), and U+021B to U+0074 (LATIN SMALL LETTER T) followed by U+0326.[33] This decomposition ensures rendering on systems lacking native support for the precomposed forms while preserving the intended glyph shape.
In Unicode version 4.0 (2003), the standard further clarified the preferred use of these comma below code points for Romanian, distinguishing them from the cedilla variants. Prior to Unicode 3.0, the Romanian letters with this diacritic had been unified with the cedilla forms Ţ (U+0162, LATIN CAPITAL LETTER T WITH CEDILLA) and ţ (U+0163, LATIN SMALL LETTER T WITH CEDILLA), but this unification was reversed due to orthographic and glyph rendering differences, the comma below being a straighter, tail-like mark versus the more curved cedilla.[34] This separation addressed requests from the Romanian national standards body to accurately represent the language's standardized diacritics.[34]
Legacy Encodings and Compatibility
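As a concrete illustration of the legacy byte assignments covered in this section (0xDE and 0xFE in ISO/IEC 8859-2 and Windows-1250), the sketch below decodes such bytes with Python's built-in codecs and then remaps the resulting cedilla code points to the comma-below ones. The names `CEDILLA_TO_COMMA` and `decode_legacy_romanian` are hypothetical, and the S-cedilla entries are an added assumption based on the parallel S-comma case; this is one possible converter, not a reference implementation.

```python
# Cedilla code points produced by legacy decoders, mapped to the
# comma-below code points used for Romanian.
CEDILLA_TO_COMMA = str.maketrans({
    "\u015E": "\u0218",  # S-cedilla -> S-comma (capital)
    "\u015F": "\u0219",  # s-cedilla -> s-comma (small)
    "\u0162": "\u021A",  # T-cedilla -> T-comma (capital)
    "\u0163": "\u021B",  # t-cedilla -> t-comma (small)
})


def decode_legacy_romanian(data: bytes, encoding: str = "iso8859_2") -> str:
    """Decode legacy bytes, then remap cedilla forms to comma-below forms.

    Only appropriate for text known to be Romanian, since other
    languages use the cedilla letters legitimately.
    """
    return data.decode(encoding).translate(CEDILLA_TO_COMMA)


# The word for "country", capitalized, as stored in ISO 8859-2:
# 0xDE is T-cedilla, 0xE3 is a-breve.
raw = bytes([0xDE, 0x61, 0x72, 0xE3])
print(decode_legacy_romanian(raw))
print(decode_legacy_romanian(raw, "cp1250"))
```

Remapping after decoding, rather than patching raw bytes, keeps the conversion independent of which legacy code page the data came from.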
In the ISO/IEC 8859-2 standard (Latin-2), established in 1987, the positions intended for Romanian diacritics were assigned to the cedilla variants Ţ (U+0162) at byte 0xDE and ţ (U+0163) at 0xFE, causing systems to display cedillas instead of the desired comma below forms.[35] This substitution arose from the standard's design for Central and Eastern European languages, where cedilla was available but comma below was not encoded separately.[36] This was later addressed in ISO/IEC 8859-16 (2001), which introduced dedicated positions for the comma below variants (U+0218–U+021B) to better support Romanian orthography.[34]
The Windows-1250 code page, an extension supporting Central European languages including Romanian, mirrored this mapping with Ţ at 0xDE and ţ at 0xFE, perpetuating the visual mismatch in pre-Unicode environments.[37] This issue, often termed the "Romanian bug," resulted in incorrect rendering of Ț as Ţ on Windows systems until UTF-8 became prevalent, as the code page lacked dedicated slots for the comma below variants.[38][36]
Following the 1999 Romanian standard (SR 13411) establishing comma below as the official diacritic, the 2004 standard (SR 13392:2004) reinforced this through keyboard layout specifications, but legacy documents such as PDFs and databases generated in ISO 8859-2 or Windows-1250 continued to embed cedilla bytes, complicating data interchange.[15] These migration challenges required specialized normalization tools, such as byte-replacement scripts or converters, to remap the legacy encodings to the distinct Unicode code points for comma below (e.g., U+021A for Ț), ensuring accurate representation without altering other text.[39] In contrast to these legacy assignments, Unicode provides separate code points for cedilla and comma below forms to resolve such ambiguities.
References
- https://learn.microsoft.com/en-us/typography/develop/character-design-standards/diacritics