Two dots (diacritic)
from Wikipedia
◌̈ ◌̤
Two dots
  • U+0308 ◌̈ COMBINING DIAERESIS[a]
  • U+0324 ◌̤ COMBINING DIAERESIS BELOW
  • U+07F3 ߳ NKO COMBINING DOUBLE DOT ABOVE

Diacritical marks of two dots ¨, placed side-by-side over or under a letter, are used in several languages for several different purposes. The most familiar to English-language speakers are the diaeresis and the umlaut, though there are numerous others. For example, in Albanian, ë represents a schwa. Such diacritics are also sometimes used for stylistic reasons (as in the family name Brontë or the band name Mötley Crüe).

In modern computer systems using Unicode, the two-dot diacritics are almost always encoded identically, having the same code point.[1] For example, U+00F6 ö LATIN SMALL LETTER O WITH DIAERESIS represents both o-umlaut and o-diaeresis. Their appearance in print or on screen may vary between typefaces but rarely within the same typeface.

The word trema (French: tréma), used in linguistics and also classical scholarship, describes the form of both the umlaut diacritic and the diaeresis rather than their function and is used in those contexts to refer to either.

Uses


Diaeresis


As the "diaeresis" diacritic, it is used to mark the separation of two distinct vowels in adjacent syllables when an instance of diaeresis (or hiatus) occurs, so as to distinguish them from a digraph or diphthong. For example, in the obsolete spelling coöperate, the diaeresis reminded the reader that the word has four syllables (co-op-er-ate), not three. It is used in several languages of western and southern Europe, though rarely now in English.[2] One well-known example comes from French: naïve carries a diaeresis, which is commonly dropped when the word is spelled in English. In French, however, it is obligatory, showing that the word is pronounced [na.iv] rather than [nɛv].

Umlaut


As the "umlaut" diacritic, it indicates a sound shift – also known as umlaut – in which a back vowel becomes a front vowel. It is a specific feature of German and other Germanic languages, affecting the graphemes ⟨a⟩, ⟨o⟩, ⟨u⟩ and ⟨au⟩, which are modified to ⟨ä⟩, ⟨ö⟩, ⟨ü⟩ and ⟨äu⟩.

It can be seen in the Sütterlin script, formerly used widely in German handwriting, in which the letter e is formed as two short parallel vertical lines very close together (see under Sütterlin#Characteristics).

Stylistic use


The two dot diacritic is also sometimes used for purely stylistic reasons. For example, the Brontë family's surname was derived from Gaelic and had been anglicised as "Prunty", or "Brunty", but at some point, the father of the sisters, Patrick Brontë (born Brunty), decided on the alternative spelling with a diaeresis diacritic over the terminal ⟨e⟩ to indicate that the name had two syllables.

Similarly, the "metal umlaut" is a diacritic that is sometimes used gratuitously or decoratively over letters in the names of hard rock or heavy metal bands – for example, those of Motörhead and Mötley Crüe, and of parody bands, such as Spın̈al Tap.

Other uses by language


A double dot is also used as a diacritic in cases where it functions as neither a diaeresis nor an umlaut. In the International Phonetic Alphabet (IPA), a double dot above a letter is used for a centralized vowel, a situation more similar to umlaut than to diaeresis. In other languages it is used for vowel length, nasalization, tone, and various other uses where diaeresis or umlaut was available typographically. The IPA uses a double dot below a letter to indicate breathy (murmured) voice.[3][b]

Vowels

  • In Albanian, Tagalog, Kashubian, and Luxembourgish ⟨ë⟩ represents a schwa [ə].
  • In Aymara, a double dot is used on ⟨ä⟩ ⟨ï⟩ ⟨ü⟩ for vowel length.
  • In the Basque dialect of Soule, ⟨ü⟩ represents [y].
  • In the DMG romanization of Tunisian Arabic, ⟨ä⟩, ⟨ö⟩, ⟨ṏ⟩, ⟨ü⟩, and ⟨ṻ⟩ represent [æ], [œ], [œ̃], [y], and [yː].
  • In Ligurian official orthography, ⟨ö⟩ is used to represent the sound [oː].
  • In Māori, a diaeresis (e.g. wähine) was often used on computers in the past instead of the macron to indicate long vowels, as the diaeresis was relatively easy to produce on many systems, and the macron difficult or impossible.[4][5]
  • In Seneca, ⟨ë⟩ ⟨ö⟩ are nasal vowels, though ⟨ä⟩ is [ɛ], as in German umlaut.
  • In Vurës (Vanuatu), ⟨ë⟩ and ⟨ö⟩ encode respectively [œ] and [ø].
  • In the Pahawh Hmong script, a double dot is used as one of several tone marks.
  • The double dot was used in the early Cyrillic alphabet, which was used to write Old Church Slavonic. The modern Cyrillic Belarusian and Russian alphabets include the letter ⟨ё⟩ (yo), although replacing it with the letter е without the diacritic is allowed in Russian.
  • Since the 1870s, ⟨Ї⟩, ⟨ї⟩ (Cyrillic letter yi) has been used in the Ukrainian alphabet for iotated [ji]; plain і is not iotated [i]. In Udmurt, ӥ is used for uniotated [i], with и for iotated [ji].
  • The form ⟨ÿ⟩ is common in Dutch handwriting and also occasionally used in printed text – but is a form of the digraph "ij" rather than a modification of the letter ⟨y⟩.
  • Komi and Udmurt use Ӧ (a Cyrillic O with two dots) for [ə].
  • The Swedish, Finnish and Estonian languages use ⟨Ä⟩ and ⟨Ö⟩ to represent [æ] and [ø].
  • In the languages of J.R.R. Tolkien's Middle-Earth novels, a diaeresis is used to separate vowels belonging to different syllables (e.g. in Eärendil) and on final e to mark it as not a schwa or silent (e.g. in Manwë, Aulë, Oromë, etc.). (There is no schwa in these languages but Tolkien wanted to make sure that readers wouldn't mistakenly pronounce one when speaking the names aloud.)[citation needed]

Consonants


Jacaltec (a Mayan language) and Malagasy are among the very few languages with a double dot on the letter "n"; in both, ⟨n̈⟩ is the velar nasal [ŋ].

In Udmurt, a double dot is also used with the consonant letters ӝ [dʒ] (from ж [ʒ]), ӟ [dʑ] (from з [z] ~ [ʑ]) and ӵ [tʃ] (from ч [tɕ]).

When the distinction is important, ⟨ḧ⟩ and ⟨ẍ⟩ are used to represent [ħ] and [ɣ] in the Kurdish Kurmanji alphabet (which are otherwise represented by "h" and "x"). These sounds are borrowed from Arabic.

Ÿ and ÿ: ⟨ÿ⟩ is generally a vowel, but it is used as the (semi-vowel) consonant [ɰ] (a [w] without the use of the lips) in Tlingit. This sound is also found in Coast Tsimshian, where it is written ⟨ẅ⟩.

A number of languages in Vanuatu use double dots on consonants to represent linguolabial (or "apicolabial") phonemes in their orthography. Thus Araki contrasts bilabial p [p] with linguolabial p̈ [t̼]; bilabial m [m] with linguolabial m̈ [n̼]; and bilabial v [β] with linguolabial v̈ [ð̼].

Seneca uses ⟨s̈⟩ for [ʃ].

In Arabic, the letter ⟨ẗ⟩ is used in the ISO 233 transliteration for the tāʾ marbūṭah [ة], used to mark feminine gender in nouns and adjectives.

Syriac uses two dots above a letter, called Siyame, to indicate that the word should be understood as plural. For instance, ܒܝܬܐ (bayta) means "house", while ܒܝ̈ܬܐ (bayte) means "houses". The sign is used especially when no vowel marks are present that could differentiate between the two forms. Although the origin of the Siyame is different from that of the diaeresis sign, in modern computer systems both are represented by the same Unicode character. This, however, often leads to incorrect rendering of Syriac text.

The N'Ko script, used to write the Mandé languages of West Africa, uses a two-dot diacritic (among others) to represent non-native sounds. The dots are slightly larger than those used for diaeresis or umlaut.

Diacritic underneath


The IPA specifies a "subscript umlaut", for example Hindi [kʊm̤ar] "potter";[3]: 25  the ALA-LC romanization system provides for its use and is one of the main schemes to romanize Persian (for example, rendering ض as ⟨z̤⟩). The notation was used to write some Asian languages in Latin script, for example Red Karen.

The double-dot underneath a vowel is still used in Fuzhou romanization of Eastern Min to indicate a modified vowel sound; placing the modifier diacritic underneath the vowel letter makes it easier to combine it with tonal diacritics above the letter, as in the word Mìng-dĕ̤ng-ngṳ̄ ("Eastern Min language").

Side dots


The diacritics 〮 and 〯, known as Bangjeom (방점; 傍點), were used to mark pitch accents in Hangul for Middle Korean. They were written to the left of a syllable in vertical writing and above a syllable in horizontal writing.

Computer encodings


In Unicode


Character encoding generally treats the umlaut and the diaeresis as the same diacritic mark. Unicode refers to both as diaereses without making any distinction, although the term itself has a more precise literary meaning. For example, U+00F6 ö LATIN SMALL LETTER O WITH DIAERESIS represents both o-umlaut and o-diaeresis, while similar codes are used to represent all such cases.

Unicode encodes a number of cases of "letter with a two dots diacritic" as precomposed characters and these are displayed below. (Unicode uses the term "Diaeresis" for all two-dot diacritics, irrespective of the actual term used for the language in question.) In addition, many more symbols may be composed using the combining character facility, U+0308 ◌̈ COMBINING DIAERESIS, that may be used with any letter or other diacritic to create a customised symbol, though only those with real-world use are shown below.

Both the combining character U+0308 and the precomposed code points may be regarded as an umlaut or a diaeresis according to context. Compound diacritics are possible, for example U+01DA ǚ LATIN SMALL LETTER U WITH DIAERESIS AND CARON, used as a tone mark in Hanyu Pinyin, which combines a two-dot diacritic with a caron. When the letter to be accented is an ⟨i⟩, the diacritic replaces the tittle, thus: ⟨ï⟩.
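The relationship between precomposed characters and the combining mark can be inspected with Python's standard `unicodedata` module. The following is a minimal sketch; the helper name `decompose` is chosen here for illustration and is not part of any standard.

```python
import unicodedata

def decompose(ch: str) -> list[str]:
    """Return the canonical decomposition (NFD) of ch as 'U+XXXX NAME' strings."""
    return [
        f"U+{ord(c):04X} {unicodedata.name(c)}"
        for c in unicodedata.normalize("NFD", ch)
    ]

# U+00F6 (o with diaeresis), U+00EF (i with diaeresis),
# U+01DA (u with diaeresis and caron), written as escapes to avoid ambiguity.
for ch in ("\u00f6", "\u00ef", "\u01da"):
    print(ch, decompose(ch))
```

Each precomposed letter breaks down into a base letter followed by U+0308, and the Pinyin letter additionally carries U+030C COMBINING CARON after the diaeresis.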

Sometimes, there's a need to distinguish between the umlaut sign and the diaeresis sign. For instance, either may appear in a German name. ISO/IEC JTC 1/SC 2/WG 2 recommends the following for these cases:[6]

  • To represent the umlaut use the Combining Diaeresis (U+0308)
  • To represent the diaeresis use Combining Grapheme Joiner (CGJ, U+034F) + Combining Diaeresis (U+0308)

The same advice can be found in the official Unicode FAQ.[7]
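As a rough illustration of the recommendation above, the sketch below builds both sequences in Python and checks how they behave under NFC normalization. The variable names are illustrative only. Because the Combining Grapheme Joiner sits between the base letter and the diaeresis, the second sequence is not folded into the precomposed letter, so the distinction survives normalization.

```python
import unicodedata

CGJ = "\u034f"        # COMBINING GRAPHEME JOINER
DIAERESIS = "\u0308"  # COMBINING DIAERESIS

umlaut_o = "o" + DIAERESIS           # recommended encoding for the umlaut
diaeresis_o = "o" + CGJ + DIAERESIS  # recommended encoding for the diaeresis

nfc_umlaut = unicodedata.normalize("NFC", umlaut_o)
nfc_diaeresis = unicodedata.normalize("NFC", diaeresis_o)

# The plain sequence composes to the single code point U+00F6.
print([f"U+{ord(c):04X}" for c in nfc_umlaut])
# The CGJ sequence keeps all three code points.
print([f"U+{ord(c):04X}" for c in nfc_diaeresis])
```

Whether a given font or renderer displays the two sequences differently is a separate matter; the sketch only shows that they remain distinct as data.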

Since version 3.2.0, Unicode also provides U+0364 ◌ͤ COMBINING LATIN SMALL LETTER E which can produce the older umlaut typography.

Unicode provides a combining double dot below as U+0324 ◌̤ COMBINING DIAERESIS BELOW.

For use with the N'Ko script, there is U+07F3 ◌߳ NKO COMBINING DOUBLE DOT ABOVE.

Finally (presumably for [theoretical] 'backspace and overtype' usage), Unicode encodes a free-standing two-dots character, U+00A8 ¨ DIAERESIS.
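The code points mentioned in this section can be looked up programmatically. This sketch uses Python's `unicodedata` module to print each character's name, general category, and canonical combining class; the list `TWO_DOT_MARKS` and the helper `describe` are names made up for this example.

```python
import unicodedata

# Code points named in the text above.
TWO_DOT_MARKS = [0x0308, 0x0324, 0x07F3, 0x0364, 0x00A8]

def describe(cp: int) -> tuple[str, str, str, int]:
    """Return (code point label, name, general category, combining class)."""
    ch = chr(cp)
    return (
        f"U+{cp:04X}",
        unicodedata.name(ch),
        unicodedata.category(ch),
        unicodedata.combining(ch),
    )

for cp in TWO_DOT_MARKS:
    print(describe(cp))
```

The combining marks report category `Mn` (nonspacing mark), whereas the free-standing U+00A8 is a spacing symbol with combining class 0.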

Pre-Unicode


ASCII, a seven-bit code with just 95 "printable" characters, has no provision for any kind of dot diacritic. Subsequent standardisation treated ASCII as the US national variant of ISO/IEC 646: the French, German and other national variants reassigned a few code points to specific vowels with diacritics, as precomposed characters. Some of these variants also defined the sequence e,backspace," as producing ë but few terminals supported this.

The subsequent (eight bit) ISO 8859-1 character encoding includes the letters ä, ë, ï, ö, ü, and their respective capital forms, as well as ÿ in lower case only, with Ÿ added in the revised edition ISO 8859-15 and Windows-1252.

Computer usage

Letters with umlaut on a German computer keyboard.


In countries where the local language(s) routinely include letters with diacritics, local keyboards are typically engraved with those symbols. If letters with double dots are not present on the keyboard, there are a number of ways to input them into a computer system. (For details, see local sources, computer system documentation and the article Unicode input.)

from Grokipedia
The two dots diacritic, known variously as the diaeresis, umlaut, or trema, is a combining mark (Unicode U+0308) consisting of two small dots placed above a letter, most commonly a vowel. It either modifies the letter's pronunciation through vowel mutation or indicates that two adjacent vowels belong to distinct syllables (a hiatus). The diacritic originated in Greek writing during the 2nd century BCE, when the dialytika (from Greek diaíresis, meaning "division") was formalized to mark breaks in vowel pairs, such as in αϊ (ai). In German, the umlaut mark arose separately: it was initially written as a small superscript e above a vowel to denote sound shifts caused by adjacent vowels and was later stylized as two dots, the form that became standard in print.

The primary distinction lies in function. The diaeresis or trema marks a vowel as pronounced separately without altering its inherent sound, as in French naïve (where the i is pronounced separately from the a) or Spanish vergüenza (where it shows that the u in güe is pronounced, giving /ɡw/). In contrast, the umlaut signals a phonemic change, such as the fronting of vowels in German (ä for /ɛ/, ö for /ø/, ü for /y/), where it distinguishes forms like Mann ("man") from Männer ("men").

The mark appears in numerous languages beyond these, including Catalan, Danish, Dutch, Swedish, Albanian, Welsh, Occitan, and Turkish, often in loanwords or to clarify pronunciation, though its use in English is sporadic and largely confined to proper names like Brontë or archaic hiatus markers like coöperate. In non-Latin scripts, analogous marks exist, such as the Greek dialytika on ϊ and ϋ, while in digital encoding the combining form ensures compatibility across writing systems. Some style guides recommend using the term diaeresis for separation and umlaut for mutation in English contexts.

History and Etymology

Origins in Scripts

The two dots diacritic, known in various forms as diaeresis or umlaut, traces its origins to ancient writing systems where it served as a marker for phonetic distinctions. In Greek papyri, dots placed over a vowel letter appear as early as circa 200 BCE, functioning as potential precursors to the modern diaeresis by indicating the separation of vowels for pronunciation clarity, particularly next to diphthongs or to avoid confusion with other letters. These marks were used inconsistently in post-classical documentary texts, often over ι or υ to highlight individual vowel sounds rather than contractions, as seen in examples from the Duke Databank of Documentary Papyri where the diaeresis appears above υ in forms like υἱ(¨)οῦ.

A significant early attestation of the two dots emerged in 15th-century Latin-script manuscripts of German texts, where scribes employed the diacritic to denote vowel mutations resulting from i-umlaut, a phonological process altering back vowels to front vowels before following high front vowels or glides. This usage marked a shift from earlier superscript e notations to the more compact two-dot form, aiding in the representation of sounds like /ɛ/ from /a/ in words such as fader becoming vater. Such markings first appear in monastic scriptoria, reflecting the need to standardize the spelling of irregular vowel shifts in Germanic languages.

In Slavic scripts, the two dots were adopted in Cyrillic writing around the 14th century for vowel notation in Church Slavonic, particularly in the Ustav style of manuscripts, where the mark known as kendema (two dots) indicated unstressed or other specific phonetic qualities of vowels, such as on a letter for /i/ pronounced without stress. This diacritic helped distinguish subtle prosodic features in liturgical texts derived from Old Church Slavonic, building on earlier Glagolitic influences but formalized in Cyrillic ustav up to the late medieval period.

The diacritic's development also drew influence from Greek prosodic traditions, where the rough breathing (dasía), originally a reversed comma-like mark denoting initial /h/, was used in Byzantine musical notations by the medieval era. In these notations, classical breathing marks were adapted for ecphonetic and chant purposes in Orthodox liturgical manuscripts.

Terminology and Evolution

The term "diaeresis" originates from the Greek word διαίρεσις (diaíresis), meaning "division" or "separation," and entered English usage to describe the diacritic's role in separating vowel sounds. In contrast, "umlaut," translating to "changed sound" or "around sound" in German, was coined by the linguist Jacob Grimm in 1819 within his foundational work Deutsche Grammatik, specifically to denote the historical vowel mutation in Germanic languages. These terms reflect distinct phonetic functions: diaeresis for hiatus (vowel separation), and umlaut for assimilation-based sound shifts.

The neutral term "trema," derived from the Greek τρῆμα (trêma) meaning "perforation" or "hole," emerged in French as a function-agnostic descriptor for the two dots mark and gained broader adoption in 19th-century linguistic scholarship to encompass both the diaeresis and umlaut roles. This terminology facilitated cross-linguistic analysis during the era of comparative philology, allowing scholars to discuss the diacritic's visual form independently of its interpretation in specific languages.

In printing history, the two dots form of the umlaut achieved standardization in 16th-century German typefaces, evolving from a handwritten superscript "e" (used since the medieval period) as a typographic simplification that saved space in early printing. This innovation marked a shift toward consistent representation in printed texts, bridging manuscript traditions with emerging orthographic norms. The 1901 German Orthographic Conference further refined these conventions by codifying the umlauts (ä, ö, ü) as integral letters in the alphabet, including their use in capitals, while reserving the diaeresis function for rare cases in loanwords, where it indicates a syllable break rather than a sound mutation.

Linguistic Uses

Diaeresis Function

The diaeresis, consisting of two dots placed over a vowel, serves primarily to mark the separation of two adjacent vowels into distinct syllables, thereby preventing them from forming a diphthong or blending in pronunciation. This separative role ensures clarity in pronunciation: the marked vowel is pronounced independently rather than as part of a combined sound. For instance, in English loanwords such as "naïve," the diaeresis over the "i" indicates pronunciation as two syllables (na-ïve) instead of a single diphthongal sound. Similarly, in French, it appears in words like "aigüe," marking the "u" as pronounced to maintain distinct articulation (/ɛ.ɡy/).

The origins of this diacritic trace back to ancient Greek, where the term diairesis, meaning "division" or "separation," derives from the verb diairein ("to divide"); the mark was introduced in the 3rd–2nd century BCE. In classical Greek manuscripts, the two dots (known as a trema) were placed over letters like iota (ι) or upsilon (υ) to denote separate vowel sounds, as in the example "Σαΐας" (Saias), where the diaeresis over the iota prevents it from forming a diphthong with the preceding alpha. This usage emphasized division without altering the inherent quality of the vowels involved.

Unlike the umlaut, which modifies a vowel's sound quality (as in German), the diaeresis is purely separative and does not indicate a change in the vowel's phonetic value. In English, its application has become rare and is largely confined to loanwords or proper names; spellings such as "coöperate" appeared in pre-20th-century texts to signal four syllables (co-op-er-ate), but the mark is now often omitted or replaced by a hyphen. This decline reflects broader trends toward simplified spelling, while the diaeresis persists in languages like French and Catalan for precise phonetic guidance.

Umlaut Function

The umlaut functions primarily as a diacritic indicating vowel fronting or quality modification in Germanic languages, resulting from historical sound changes known as i-mutation. This phonological process, which occurred in the early stages of Proto-Germanic and its descendants, involves the assimilation of a stem vowel to a following high front vowel or glide (/i/ or /j/) in an unstressed syllable, leading to front rounded vowels such as German ü (from Proto-Germanic u). For instance, the vowel /u/ shifts forward to /y/, represented orthographically by two dots over the u, preserving the rounded quality while moving the tongue position toward the front of the mouth.

Historically, the umlaut notation evolved from a superscript "e" placed above the base vowel to denote the mutation, a convention used in medieval manuscripts to indicate the influence of the following /i/ or /j/ without writing it out fully. Printers later simplified this superscript "e" (which resembled two small loops or dots) into the modern two-dot form for practicality in typesetting. Jacob Grimm formalized the term "Umlaut" (meaning "changed sound") in his Deutsche Grammatik (first volume published 1819), describing it as a systematic alteration analogous to but distinct from ablaut (internal gradation in verbs). The effect is evident in examples like German Mann (man) versus Männer (men), where the umlaut on ä reflects the historical i-mutation from the plural ending -iz.

In languages like Swedish, the umlaut diacritic serves as a historical remnant, marking letters such as ä and ö that originated from i-mutation but now function as independent phonemes with distinct pronunciations, sometimes appearing in stylized or decorative contexts in loanwords or proper names. In phonetic transcription, the combining diaeresis (Unicode U+0308) is employed in the International Phonetic Alphabet (IPA) to denote centralized vowels, such as a centralized schwa [ə̈] or centralized [ä], a role distinct from its umlaut role of indicating fronting.

Functions in Specific Languages

In Albanian, the two dots diacritic appears as ë (uppercase Ë), representing the schwa vowel /ə/, a mid-central unrounded sound essential for phonemic distinctions in the language. This usage is crucial for words like "këndon," meaning "sings," where the schwa differentiates the word from similar forms without it, such as "kendon," which would alter pronunciation and meaning. The diacritic was standardized in the modern Albanian alphabet developed in the late 19th and early 20th centuries to reflect the language's phonetic inventory accurately.

In Catalan, the diaeresis is placed over vowels such as i or u to indicate a hiatus, preventing diphthong formation and ensuring separate syllable pronunciation. For instance, in "aïllar" (to isolate), the dots on the i mark the break between vowels, distinguishing it from potential diphthongs like the one in "ai". This orthographic feature was formalized in the 1913 standardization of Catalan spelling by the Institut d'Estudis Catalans, which aimed to unify regional variations and align spelling with pronunciation.

Turkish employs the two dots diacritic on o and u to form ö and ü, denoting the front rounded vowels /ø/ and /y/ respectively, which are integral to the language's vowel harmony system, where vowels in suffixes must match the frontness or backness of the root. These letters were introduced as part of the 1928 Latin alphabet reform led by Mustafa Kemal Atatürk, which replaced the Ottoman Arabic script; prior to this, such sounds were approximated without dedicated letters.

In Swedish, the letters ä and ö derive from historical i-mutation and function as distinct letters in the alphabet. In Dutch, the trema separates adjacent vowels in hiatus, as in "reëel" (real), where the two e's belong to separate syllables. Welsh uses the diaeresis occasionally in loanwords to indicate vowel separation, such as in "naïve". In Occitan, the trema functions similarly to French, marking hiatus in words like "noël". Southern Sámi incorporates ä and ö (with two dots) to represent front vowels, including in loanwords, as part of its standardized orthography.

Non-Linguistic Applications

Typographic and Stylistic Roles

In the typographic history of the two dots diacritic, often referred to as the umlaut or diaeresis, its integration into blackletter scripts such as Fraktur marked a key evolution in German printing and handwriting from the 16th century onward. Fraktur, a bold, fractured style derived from medieval Gothic scripts, incorporated the two dots over vowels like ä, ö, and ü to denote phonetic shifts, with the diacritic evolving from an earlier superscript "e" form used in manuscripts. This adaptation ensured clarity in the dense, ornate text blocks typical of blackletter, where the dots provided visual distinction without disrupting the script's angular, interconnected letterforms.

A notable extension occurred with the Sütterlin script, introduced in 1915 by calligrapher Ludwig Sütterlin as a standardized form for German schools and official use, which remained prevalent until 1941. In Sütterlin, derived from Kurrent, the two dots retained their role as a simplified umlaut, replacing the medieval tiny "e" with paired dots for practicality in everyday penmanship, though alternatives like "ae" or "oe" were sometimes substituted in informal contexts. This script's emphasis on fluidity and legibility in personal and administrative documents highlighted the diacritic's adaptability to handwriting, influencing regional printing styles until its official phase-out during World War II.

Beyond functional typography, the two dots have served stylistic purposes in English-language branding and media to evoke a foreign flavor or add emphasis, often detached from phonetic intent. In rock and heavy metal, the "metal umlaut" emerged as a decorative flourish, with bands like Motörhead (formed in 1975 and fronted by Lemmy Kilmister) using it to lend a faux-Germanic edge to otherwise standard spellings, enhancing visual aggression in album art and logos. Similarly, Häagen-Dazs, launched in the 1960s but prominently featuring the umlaut in its 1980s branding, fabricated a pseudo-Scandinavian identity through the diacritic on the "ä," despite the name's American origins, to convey premium, "old-world" luxury in packaging and advertising.

In modern digital contexts, the two dots have seen a revival for aesthetic effects, particularly in design, where they contribute to "vintage" or "Nordic cool" visuals by mimicking European scripts in logos, interfaces, and marketing. This trend leverages Unicode's support for the combining character U+0308, introduced in version 1.1 (1993), which enables rendering in creative fonts and even experimental combinations, such as overlaying a diaeresis on emoji faces (e.g., 😊̈) for stylized social graphics. Such applications underscore the diacritic's shift toward ornamental versatility in pixel-based media, often prioritizing cultural allure over linguistic accuracy.

Uses in Other Writing Systems

In the Cyrillic script, the two dots diacritic appears above the letter е to form ё, which represents the sound /jo/ in Russian. This letter was popularized by the writer Nikolai Karamzin in 1797, marking a distinct sound that previously lacked a dedicated symbol. In Arabic and Persian scripts, two dots serve as i'jām marks to distinguish consonants, such as in letters like ت (tāʾ) and ي (yāʾ).

Within the International Phonetic Alphabet (IPA), the two dots diacritic, known as diaeresis (◌̈), is placed above a symbol to denote centralization of vowels, shifting the articulation toward the center of the vocal tract; for instance, it modifies rounded vowels to indicate a more centralized quality. These symbols are standardized in the 2020 IPA chart for precise phonetic transcription.

In Hebrew pointed texts (niqqud), two vertical dots positioned below a letter form the shva (ְ), functioning as a marker for a short /e/ sound (vocal shva) or indicating the absence of a vowel (silent shva), particularly in biblical and liturgical contexts where precise vocalization is required.

Digital Representation

Unicode Implementation

In Unicode, the two dots diacritic is primarily represented using the combining character U+0308 COMBINING DIAERESIS, a nonspacing mark from the Combining Diacritical Marks block (U+0300–U+036F) that applies above any preceding base character, such as forming ä from a + U+0308. This approach enables flexible composition across scripts and languages, supporting both diaeresis (vowel separation) and umlaut (vowel modification) functions without functional distinction in the encoding model.

For efficiency in common usage, Unicode provides numerous precomposed characters that integrate the diaeresis directly with Latin base letters, such as U+00EB LATIN SMALL LETTER E WITH DIAERESIS for ë in the Latin-1 Supplement block (U+0080–U+00FF). These extend across blocks like Latin Extended-A (U+0100–U+017F) and Latin Extended-B (U+0180–U+024F), encompassing over 100 specific code points for uppercase and lowercase forms in various linguistic contexts, including less common letters like ẗ (U+1E97), used in the ISO 233 transliteration of Arabic.

Unicode normalization ensures interoperability between precomposed and combining forms: under Normalization Form C (NFC), sequences like e + U+0308 are composed into ë, while Normalization Form D (NFD) decomposes ë into e + U+0308, facilitating consistent text processing and search across decomposed or composed representations. Diaeresis and umlaut glyphs share identical code points, with semantic interpretation left to linguistic context rather than encoding differentiation.

Unicode 17.0 (2025) added combining diacritics for historical notations, including variants from John Peabody Harrington's manuscripts such as U+1ADC COMBINING FALLING DIAGONAL DIAERESIS (a rotated two-dot mark), in the Combining Diacritical Marks Extended block (U+1AB0–U+1AFF), to support complex combinations in anthropological and linguistic transcriptions. The Latin Extended-D block (U+A720–U+A7FF) likewise holds precomposed characters used in transcriptions of ancient or minority languages.
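The NFC and NFD behaviour described above can be sketched with Python's `unicodedata` module. The helpers `same_text` and `strip_two_dots` are hypothetical names introduced for illustration: the first compares strings after normalization, the second removes U+0308 for diacritic-insensitive matching.

```python
import unicodedata

COMBINING_DIAERESIS = "\u0308"

def same_text(a: str, b: str) -> bool:
    """Compare two strings after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

def strip_two_dots(text: str) -> str:
    """Decompose, drop every U+0308, then recompose the remainder."""
    decomposed = unicodedata.normalize("NFD", text)
    kept = decomposed.replace(COMBINING_DIAERESIS, "")
    return unicodedata.normalize("NFC", kept)

precomposed = "\u00eb"  # single code point
combining = "e\u0308"   # base letter plus combining mark

print(precomposed == combining)           # raw comparison differs
print(same_text(precomposed, combining))  # equal after normalization
print(strip_two_dots("na\u00efve"))
```

Stripping the mark is lossy: it is a reasonable fallback for search, but it erases distinctions that matter in languages where the dotted letter is a separate phoneme.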

Pre-Unicode and Legacy Systems

The American Standard Code for Information Interchange (ASCII), standardized in 1963, is a 7-bit encoding limited to 128 characters. It was designed primarily for English-language text and has no native support for diacritics such as the two dots (diaeresis or umlaut). This limitation prompted workarounds in early digital systems, including substitutions such as replacing "ü" with "ue" or "ë" with "e:", which were common in 1960s teletype communications for European languages. Overstriking techniques on terminals also allowed approximate rendering by combining base letters with available symbols like colons or quotes.

The ISO 8859 series addressed these gaps by introducing 8-bit extensions to ASCII. Specifically, ISO 8859-1 (Latin-1), published in 1987, allocated positions in the upper byte range (0xA0–0xFF) for Western European characters, including letters carrying the two dots diacritic; for example, "ë" (lowercase e with diaeresis) is encoded at hexadecimal 0xEB. This standard became widely adopted for text processing and early internet applications, enabling direct representation of umlauts and diaereses in languages like German and French.

On IBM mainframes, the Extended Binary Coded Decimal Interchange Code (EBCDIC), introduced in 1964 with the System/360, provided variant mappings for the two dots across code pages from the 1960s to 1980s. In code page 037 (used for U.S. and international English variants), umlaut characters occupied positions such as hexadecimal 0x43 for "ä" (lowercase a with diaeresis), while other code pages, such as 273 for German, placed them elsewhere. These mappings supported business data processing but varied by region, complicating interoperability.

Prior to widespread digital encoding, mechanical typewriters were adapted to European languages through dead keys and overstrike methods. Dead keys, which imprinted a diacritic without advancing the carriage, allowed users to add two dots over a vowel by striking the accent key followed by the base letter; this technique, common on models for French and German, originated in early 20th-century designs to minimize key count. Overstriking with the period or quote keys served as a manual workaround on standard English keyboards.

As digital text transitioned to the web in the 1990s, HTML character entities bridged pre-Unicode gaps by naming ISO 8859-1 characters, such as &uuml; for "ü" (lowercase u with diaeresis), formalized in W3C specifications around 1996. This facilitated rendering of the two dots in browsers before full Unicode adoption.
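The byte values quoted above can be checked with Python's standard codecs. The following is a minimal sketch, not a reference implementation: the helper names `byte_for` and `ascii_fallback` are hypothetical, and the substitution table covers only the German umlaut letters as an illustration of the "ue"-style workaround.

```python
import html


def byte_for(char: str, codec: str) -> int:
    """Return the single byte that `codec` uses for `char`."""
    encoded = char.encode(codec)
    if len(encoded) != 1:
        raise ValueError(f"{char!r} is not a single byte in {codec}")
    return encoded[0]


def ascii_fallback(text: str) -> str:
    """Apply the 'ue'-style substitution used where only ASCII was available.

    The table is an illustrative assumption, not an exhaustive standard.
    """
    table = {"ä": "ae", "ö": "oe", "ü": "ue", "Ä": "Ae", "Ö": "Oe", "Ü": "Ue"}
    return "".join(table.get(ch, ch) for ch in text)


if __name__ == "__main__":
    print(hex(byte_for("ë", "latin-1")))  # ISO 8859-1 position of e-diaeresis
    print(hex(byte_for("ä", "cp037")))    # EBCDIC code page 037
    print(hex(byte_for("ä", "cp273")))    # EBCDIC code page 273 (German)
    print(html.unescape("&uuml;"))        # HTML named entity
    print(ascii_fallback("Müller"))       # ASCII-only workaround
```

Comparing the `cp037` and `cp273` results for the same letter shows the interoperability problem: the same byte means different characters depending on the code page in use.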

Computing Practices

Input Methods

The two dots diacritic, known as diaeresis or umlaut, is entered in computing environments through various keyboard layouts and software features designed to support accented characters. In the US International keyboard layout, commonly used on Windows systems, dead key sequences enable input by pressing a modifier key followed by the base letter; for instance, pressing the quotation mark key (") as a dead key and then "a" produces "ä". Similarly, on macOS with the US layout, pressing Option + U and then a vowel generates the umlaut, such as Option + U followed by "e" for "ë".

In Linux environments using the X Window System, the Compose key (often mapped to the right Alt key) allows multi-key sequences for diacritics; for example, pressing Compose, then ", and then "a" inputs "ä". This method is configurable via XKB options and supports a wide range of characters, including those with two dots, through files like ~/.XCompose.

On mobile devices, input methods emphasize touch gestures and predictive features. iOS keyboards support long-pressing vowels to reveal a popup menu with diaeresis options, such as holding "i" to select "ï"; autocorrect also suggests forms like "naïve" when typing "naive" in English or French contexts, provided the relevant language is enabled. Android's Gboard similarly offers long-press access to umlauts on letters like "u" for "ü", with autocorrect applying diacritics in multilingual typing; since the early 2020s, AI-powered predictions in Gboard have enhanced this by suggesting context-aware accented words based on user patterns and language models like Gemini Nano.

Language-specific layouts provide direct key access for frequent use. The German QWERTZ keyboard includes dedicated keys for "ä", "ö", and "ü" at the right-hand end of the letter rows, allowing single-keystroke input on physical hardware connected to Windows, macOS, or Linux systems. For scripts that use the diaeresis less often, such as polytonic Greek text, virtual on-screen keyboards like those in Lexilogos or TypeIt.org facilitate selection via clickable grids or combining sequences.

Voice-to-text systems have incorporated support for the diacritic in recent years. As of 2024, Apple's Siri and dictation features in iOS 18 use on-device processing to transcribe spoken words with umlauts or diaereses accurately in supported languages like German or French, such as rendering "naive" as "naïve" from phonetic input. Gboard's voice typing on Android similarly handles these through Google's Cloud Speech-to-Text, which recognizes diacritic pronunciation in over 120 languages.
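The dead key behaviour described above can be sketched as a small state machine. This is a simplified illustration, not how any operating system implements it: the function name `type_keys` and the `DEAD_KEYS` table are hypothetical, and the sketch composes with any letter that has a precomposed form in Unicode, whereas real layouts restrict the set of base letters.

```python
import unicodedata

COMBINING_DIAERESIS = "\u0308"

# Hypothetical table: the quotation mark acts as a dead key for the diaeresis.
DEAD_KEYS = {'"': COMBINING_DIAERESIS}


def type_keys(keys: str) -> str:
    """Simulate typing `keys` on a layout where '"' is a dead key."""
    out = []
    pending = None  # dead key waiting for its base letter, if any
    for key in keys:
        if pending is None:
            if key in DEAD_KEYS:
                pending = key      # remember it; emit nothing yet
            else:
                out.append(key)
            continue
        if key == " ":
            out.append(pending)    # dead key + space yields the literal symbol
        else:
            composed = unicodedata.normalize("NFC", key + DEAD_KEYS[pending])
            if len(composed) == 1:
                out.append(composed)       # e.g. 'a' + U+0308 -> 'ä'
            else:
                out.append(pending + key)  # no precomposed form: emit both keys
        pending = None
    if pending is not None:
        out.append(pending)        # input ended while a dead key was pending
    return "".join(out)


if __name__ == "__main__":
    print(type_keys('na"ive'))
    print(type_keys('"a'))
```

Using NFC composition as the lookup avoids a hand-written table of accented letters, at the cost of accepting some combinations a real layout would reject.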

Display and Rendering

The two dots diacritic, known as the diaeresis or umlaut mark, is represented in digital text through either precomposed characters, such as ä (U+00E4), ö (U+00F6), or ü (U+00FC), or as a sequence consisting of a base letter followed by U+0308 COMBINING DIAERESIS. Precomposed forms integrate the diacritic directly into the glyph design, ensuring fixed positioning, while combining sequences allow flexible attachment to any base character, supporting a wider range of languages and scripts.

In modern text rendering engines, such as those implementing the Unicode standard, the combining diaeresis is positioned as a detached mark above the base glyph. The horizontal placement centers it over the base character's bounding rectangle, calculated as

$$X_{\text{point}} = (X_{\text{base}} - X_{\text{mark}}) + \frac{W_{\text{base}} - W_{\text{mark}}}{2},$$

where $X$ and $W$ denote the left edge and width of the respective bounding rectangles. Vertically, it is offset by a gap of approximately one-eighth of the font's cap height from the top of the base, using the formula

$$Y_{\text{point}} = (Y_{\text{base}} + H_{\text{base}} + \text{Gap}) - Y_{\text{mark}},$$

with $\text{Gap} = \text{CapHeight} / 8$ for above-center detached marks like the diaeresis. This method, outlined in Unicode Technical Note #2, relies on metrics from the font to handle arbitrary combinations without predefined ligatures.

For italic and oblique styles, rendering adjusts the diaeresis position rightward by $a = c \times \sin \alpha$, where $c$ is the horizontal offset from the vertical center line and $\alpha$ is the italic angle, to compensate for the base glyph's slant and preserve visual alignment. OpenType font features, such as the 'mark' (mark positioning) feature, further refine this in advanced layout systems such as Core Text, attaching the diacritic to an anchor point on the base glyph for precise placement and avoidance of overlaps in multi-diacritic stacks.
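The placement formulas above can be written out directly. The sketch below is illustrative only: the `Box` type and the function names are hypothetical, the coordinate system is assumed to be y-up font units with `y` as the bottom edge of each bounding rectangle, and the sample metrics are invented rather than taken from a real font.

```python
import math
from dataclasses import dataclass


@dataclass
class Box:
    """Glyph bounding rectangle in font units (assumed y-up)."""
    x: float  # left edge
    y: float  # bottom edge (assumption; the text defines only X and W)
    w: float  # width
    h: float  # height


def mark_offset(base: Box, mark: Box, cap_height: float) -> tuple[float, float]:
    """Displacement to apply to a detached above-center mark."""
    gap = cap_height / 8
    dx = (base.x - mark.x) + (base.w - mark.w) / 2
    dy = (base.y + base.h + gap) - mark.y
    return dx, dy


def italic_shift(c: float, italic_angle_deg: float) -> float:
    """Rightward adjustment a = c * sin(alpha) for slanted styles."""
    return c * math.sin(math.radians(italic_angle_deg))


if __name__ == "__main__":
    base = Box(x=50, y=0, w=500, h=480)      # invented metrics for a base letter
    mark = Box(x=-350, y=520, w=300, h=100)  # invented metrics for U+0308
    print(mark_offset(base, mark, cap_height=700))
    print(italic_shift(100, 12))
```

With these sample metrics the offset is (500.0, 47.5): after the shift the mark's horizontal center coincides with the base's center at 300 units, and its bottom edge sits 87.5 units (one-eighth of the cap height) above the top of the base.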
Display challenges arise in legacy systems or fonts lacking robust combining mark support, where the diaeresis may appear misaligned, be stacked incorrectly, or fall back to separate spacing dots (U+00B7 or U+22C5) instead of the proper combining form. Inconsistent shapes across fonts, such as rounded dots in some designs versus slanted strokes evoking historical handwriting in others, can also affect perceived uniformity, particularly when rendering precomposed versus combining variants. Modern solutions emphasize font-wide metric consistency and Unicode normalization algorithms to mitigate these issues, ensuring reliable cross-platform display.
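The difference between precomposed and combining forms, and the role of normalization in reconciling them, can be seen with Python's standard `unicodedata` module. This is a minimal sketch; the helper names `code_points` and `same_text` are hypothetical.

```python
import unicodedata


def code_points(text: str) -> list[str]:
    """List the code points of `text` in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


def same_text(a: str, b: str) -> bool:
    """Compare two strings after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


precomposed = "\u00e4"   # ä as a single code point
combining = "a\u0308"    # a followed by COMBINING DIAERESIS

if __name__ == "__main__":
    print(code_points(precomposed), code_points(combining))
    print(precomposed == combining)          # raw comparison
    print(same_text(precomposed, combining)) # comparison after NFC
    print(code_points(unicodedata.normalize("NFD", precomposed)))
```

The two strings usually look identical on screen but compare unequal until normalized, which is why text is commonly normalized to one form before comparison or storage.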

References

  1. https://www.merriam-webster.com/dictionary/diaeresis
  2. https://en.wiktionary.org/wiki/trema