Unicode subscripts and superscripts
Unicode has subscripted and superscripted versions of a number of characters, including a full set of Arabic numerals. These characters allow any polynomial, chemical formula, and certain other equations to be represented in plain text without markup such as HTML or TeX.
The World Wide Web Consortium and the Unicode Consortium have made recommendations on the choice between using markup and using superscript and subscript characters:
When used in mathematical context (MathML) it is recommended to consistently use style markup for superscripts and subscripts [...] However, when super and sub-scripts are to reflect semantic distinctions, it is easier to work with these meanings encoded in text rather than markup, for example, in phonetic or phonemic transcription.
The intended use when these characters were added to Unicode was to produce true superscripts and subscripts so that chemical and algebraic formulas could be written without markup. Thus "H₂O" (using the subscript 2 character, U+2082) is supposed to look identical to "H<sub>2</sub>O" (an ordinary digit 2 inside subscript markup).
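As an illustration of writing such formulas without markup, the following minimal Python sketch maps ASCII digits onto the Unicode subscript and superscript digits. The names `SUB_TABLE`, `SUP_TABLE`, `chemical_formula`, and `power` are hypothetical helpers chosen for this example, not part of any standard library.

```python
# Subscript digits are contiguous at U+2080..U+2089.
SUB_TABLE = str.maketrans(
    "0123456789", "".join(chr(0x2080 + d) for d in range(10))
)

# Superscript digits are NOT contiguous: 1, 2, 3 live in Latin-1
# (U+00B9, U+00B2, U+00B3); the rest are at U+2070 and U+2074..U+2079.
# U+207B is SUPERSCRIPT MINUS, used for negative exponents.
SUP_TABLE = str.maketrans(
    "0123456789-",
    "\u2070\u00b9\u00b2\u00b3\u2074\u2075\u2076\u2077\u2078\u2079\u207b",
)


def chemical_formula(formula: str) -> str:
    """Turn every ASCII digit in a formula such as 'C6H12O6' into a subscript."""
    return formula.translate(SUB_TABLE)


def power(base: str, exponent: int) -> str:
    """Append an integer exponent to base using superscript characters."""
    return base + str(exponent).translate(SUP_TABLE)


print(chemical_formula("H2O"))                    # H₂O
print(power("x", 2) + " + " + power("y", 10))     # x² + y¹⁰
```

The sketch subscripts every digit it sees, so a leading stoichiometric coefficient (as in "2H2O") would also be subscripted; a real formatter would need to tell coefficients from atom counts.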
In reality, many fonts that include these characters ignore the Unicode definition and instead design the digits as mathematical numerator and denominator glyphs, which are aligned with the cap line and the baseline, respectively. When used with the solidus or the fraction slash, they produce an almost typographically correct diagonal fraction, such as ³/₄ for the ¾ glyph. Superscript and subscript markup does not produce a correct fraction (compare markup <sup>3</sup>/<sub>4</sub> with precomposed ¾). This design choice also makes the superscript letters useful as ordinal indicators, more closely matching the ª and º characters.
Unicode intended that diagonal fractions be rendered by a different mechanism: the fraction slash U+2044 is visually similar to the solidus, but when used with the ordinary digits (not the superscripts and subscripts), it instructs the layout system that a fraction such as ¾ is to be rendered using automatic glyph substitution. User-end support was quite poor for a number of years, but fonts, browsers, word processors, desktop publishing software and others increasingly support the intended Unicode behavior. The sequence digit 3, fraction slash, digit 4 is written here as ⟨3⁄4⟩; how it appears depends on the software and font used to view it. (See the Fractions section of the article Slash (punctuation) for rendering in various other fonts.)
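The two approaches to fractions described above can be compared at the code point level. The sketch below, which uses only Python's standard `unicodedata` module, builds the intended sequence with the fraction slash and the font-dependent workaround with a superscript, a solidus, and a subscript; the names `fraction` and `workaround` are hypothetical and exist only for this example.

```python
import unicodedata

FRACTION_SLASH = "\u2044"  # FRACTION SLASH, distinct from the solidus "/" (U+002F)


def fraction(numerator: int, denominator: int) -> str:
    """Ordinary digits around U+2044: the mechanism Unicode intended."""
    return f"{numerator}{FRACTION_SLASH}{denominator}"


three_quarters = fraction(3, 4)
print([f"U+{ord(c):04X}" for c in three_quarters])  # ['U+0033', 'U+2044', 'U+0034']

# The precomposed character U+00BE (vulgar fraction three quarters) has a
# compatibility decomposition to exactly this sequence.
print(unicodedata.normalize("NFKD", "\u00be") == three_quarters)  # True

# The workaround many fonts cater to: SUPERSCRIPT THREE, solidus, SUBSCRIPT FOUR.
workaround = "\u00b3/\u2084"
print([unicodedata.name(c) for c in workaround])
```

Whether either string is actually displayed as a diagonal fraction is decided by the font and layout engine, not by the string itself.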
The most common superscript digits (1, 2, and 3) were included in ISO-8859-1 and were therefore carried over into those code points in the Latin-1 range of Unicode. The remainder were placed along with basic arithmetical symbols, and later some Latin subscripts, in a dedicated block at U+2070 to U+209F. The table below shows these characters together. Each superscript or subscript character is preceded by a baseline x to show the height of subscripting/superscripting.
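The split between the Latin-1 range and the dedicated block can be inspected programmatically. The following sketch lists the superscript and subscript digits with their code points and character names, each preceded by a baseline x as in the description above. It assumes Python's standard `unicodedata` module; `describe` is a hypothetical helper, and the block labels are written by hand rather than looked up.

```python
import unicodedata

# Superscript digits 0-9 in numeric order; note that 1, 2, 3 sit in Latin-1.
SUPERSCRIPT_DIGITS = "\u2070\u00b9\u00b2\u00b3\u2074\u2075\u2076\u2077\u2078\u2079"
# Subscript digits 0-9 are contiguous from U+2080.
SUBSCRIPT_DIGITS = "".join(chr(0x2080 + d) for d in range(10))


def describe(ch: str) -> str:
    """One line per character: baseline x, code point, name, and block."""
    block = "Latin-1 Supplement" if ord(ch) <= 0xFF else "Superscripts and Subscripts"
    return f"x{ch}  U+{ord(ch):04X}  {unicodedata.name(ch)}  ({block})"


for ch in SUPERSCRIPT_DIGITS + SUBSCRIPT_DIGITS:
    print(describe(ch))

# Both sets carry a digit value, so they can be read back as numbers.
print([unicodedata.digit(ch) for ch in SUPERSCRIPT_DIGITS])
```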
Six code points in the "Superscripts and Subscripts" block are unassigned and remain available for future characters. As of November 2024, three of these (U+209D, U+209E, and U+209F) were provisionally assigned to new subscript characters, namely Latin lowercase w, y, and z.
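Which code points in the block are unassigned can be checked against the Unicode Character Database that ships with a given Python interpreter. This is a sketch only: the result depends on the Unicode version bundled with the interpreter (reported by `unicodedata.unidata_version`), so a newer release may list fewer unassigned code points than the six mentioned above.

```python
import unicodedata

BLOCK_START, BLOCK_END = 0x2070, 0x209F  # "Superscripts and Subscripts" block

# General category "Cn" means the code point is not assigned to any character.
unassigned = [
    cp
    for cp in range(BLOCK_START, BLOCK_END + 1)
    if unicodedata.category(chr(cp)) == "Cn"
]

print("Unicode version:", unicodedata.unidata_version)
print("Unassigned:", [f"U+{cp:04X}" for cp in unassigned])
```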