Halfwidth and fullwidth forms
from Wikipedia
A command prompt (cmd.exe) with Korean localisation, showing halfwidth and fullwidth characters

In CJK (Chinese, Japanese, and Korean) computing, graphic characters are traditionally classed into fullwidth[a] and halfwidth[b] characters. Unlike in a monospaced font, where every character has the same width, a halfwidth character occupies half the width of a fullwidth character, hence the name.

Halfwidth and Fullwidth Forms is also the name of a Unicode block U+FF00–FFEF, provided so that older encodings containing both halfwidth and fullwidth characters can have lossless translation to and from Unicode.

Rationale

Characters which appear in both JIS X 0201 (single byte) and JIS X 0208 / JIS X 0213 (double byte) have both a halfwidth and a fullwidth form in Shift JIS.

In the days of text mode computing, Western characters were normally laid out in a grid on the screen, often 80 columns by 24 or 25 lines. Each character was displayed as a small dot matrix, often about 8 pixels wide, and an SBCS (single-byte character set) was generally used to encode characters of Western languages.

For aesthetic reasons and readability, it is preferable for Chinese characters to be approximately square-shaped, therefore twice as wide as these fixed-width SBCS characters. As these were typically encoded in a DBCS (double-byte character set), this also meant that their width on screen in a duospaced font was proportional to their byte length. Some terminals and editing programs could not deal with double-byte characters starting at odd columns, only even ones (some could not even put double-byte and single-byte characters in the same line). So the DBCS sets generally included Roman characters and digits also, for use alongside the CJK characters in the same line.

On the other hand, early Japanese computing used a single-byte code page called JIS X 0201 for katakana. These would be rendered at the same width as the other single-byte characters, making them half-width kana characters rather than normally proportioned kana. Although the JIS X 0201 standard itself did not specify half-width display for katakana, this became the visually distinguishing feature in Shift JIS between the single-byte JIS X 0201 and double-byte JIS X 0208 katakana. Some IBM code pages used a similar treatment for Korean jamo,[1] based on the N-byte Hangul code and its EBCDIC translation.

In Unicode


For compatibility with existing character sets that contained both half- and fullwidth versions of the same character, Unicode allocated a single block at U+FF00–FFEF containing the necessary "alternative width" characters. This includes a fullwidth version of all the ASCII characters and some non-ASCII punctuation such as the Yen sign, halfwidth versions of katakana and hangul, and halfwidth versions of some other symbols such as circles. Only characters needed for lossless round trip to existing character sets were allocated, rather than (for instance) making a fullwidth version of every Latin accented character.
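Because the fullwidth ASCII variants (U+FF01–U+FF5E) are laid out in the same order as printable ASCII (U+0021–U+007E), the two ranges differ by a constant offset of 0xFEE0. The following is a minimal sketch of a converter built on that observation; the function names `to_fullwidth` and `to_ascii` are illustrative, and mapping the space to U+3000 (which lies outside this block) is a choice made for the example.

```python
# Offset between printable ASCII and the fullwidth ASCII variants:
# U+FF01 - U+0021 == 0xFEE0.
FULLWIDTH_OFFSET = 0xFEE0
IDEOGRAPHIC_SPACE = "\u3000"  # wide counterpart of U+0020, outside U+FF00-FFEF


def to_fullwidth(text: str) -> str:
    """Replace printable ASCII with its fullwidth form (hypothetical helper)."""
    out = []
    for ch in text:
        cp = ord(ch)
        if 0x21 <= cp <= 0x7E:
            out.append(chr(cp + FULLWIDTH_OFFSET))
        elif ch == " ":
            out.append(IDEOGRAPHIC_SPACE)
        else:
            out.append(ch)  # leave everything else untouched
    return "".join(out)


def to_ascii(text: str) -> str:
    """Inverse of to_fullwidth for the same character ranges."""
    out = []
    for ch in text:
        cp = ord(ch)
        if 0xFF01 <= cp <= 0xFF5E:
            out.append(chr(cp - FULLWIDTH_OFFSET))
        elif ch == IDEOGRAPHIC_SPACE:
            out.append(" ")
        else:
            out.append(ch)
    return "".join(out)


print(to_fullwidth("Hello, 2024!"))
```

Only the ASCII range is handled here; halfwidth katakana and hangul do not follow a fixed offset from their wide counterparts and need a lookup table or compatibility normalisation instead.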

Unicode assigns every code point an "East Asian width" property. This may be:[2]

Unicode character properties based on width
Abbreviation | Name | Description
W | Wide | Naturally wide character, e.g. Hiragana.
Na | Narrow | Naturally narrow character, e.g. ISO Basic Latin alphabet.
F | Fullwidth | Wide variant with compatibility normalisation to naturally narrow character, e.g. fullwidth Latin script.
H | Halfwidth | Narrow variant with compatibility normalisation to naturally wide character, e.g. half-width kana. Includes U+20A9 (₩) as an exception.
A | Ambiguous | Characters included in East Asian DBCS codes but also in European SBCS codes, e.g. Greek alphabet. Duospaced behaviour can consequently vary.
N | Neutral | Characters which do not appear in East Asian DBCS codes, e.g. Devanagari.

Terminal emulators can use this property to decide whether a character should consume one or two "columns" when figuring out tabs and cursor position.
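As a rough illustration of how a terminal might turn this property into cursor positions, the sketch below uses Python's standard `unicodedata.east_asian_width()` and counts two columns for W and F characters and one column for everything else. The helper names `cell_width` and `column_offsets` are made up for this example, and real emulators also have to deal with combining marks, control characters, and ambiguous-width settings, which are ignored here.

```python
import unicodedata


def cell_width(ch: str) -> int:
    """Columns used by one character: 2 for Wide/Fullwidth, else 1 (simplified)."""
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1


def column_offsets(text: str) -> list[int]:
    """Starting column of each character when drawn from column 0."""
    offsets = []
    column = 0
    for ch in text:
        offsets.append(column)
        column += cell_width(ch)
    return offsets


# 'a' starts at column 0, the hiragana at column 1, and 'b' at column 3
# because the hiragana character consumed two columns.
print(column_offsets("a\u3042b"))
```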

In OpenType


OpenType has the fwid, halt, hwid, and vhal feature tags to be used to reproduce fullwidth or halfwidth form of a character. CSS provides control over these features using font-variant-east-asian and font-feature-settings properties.[3]

from Grokipedia
Halfwidth and fullwidth forms are typographic variants of characters used primarily with East Asian scripts (Chinese, Japanese, and Korean). Halfwidth characters occupy approximately half the horizontal space (1/2 em) of a standard character cell, and fullwidth characters occupy the full cell (1 em), which keeps text aligned in fixed-width fonts. These forms make it possible to mix narrow scripts such as Latin letters with wider CJK (Chinese-Japanese-Korean) ideographs and kana, addressing visual spacing needs in legacy computing environments.

Historically, halfwidth forms emerged in early Japanese encodings to enable single-byte representation compatible with ASCII, as seen in JIS X 0201 (1976), which includes halfwidth katakana for efficient storage and display alongside Roman characters. Fullwidth forms, conversely, were standardized in double-byte encodings like JIS X 0208 (1978) to match the square proportions of ideographic glyphs. The terms hankaku (halfwidth) and zenkaku (fullwidth) originate from Japanese conventions for these display widths. Encodings such as Shift-JIS later combined the two, using single bytes for halfwidth elements and two bytes for fullwidth ones, and were widely adopted in Japanese text processing.

In the Unicode Standard, halfwidth and fullwidth forms are preserved in the dedicated Halfwidth and Fullwidth Forms block (U+FF00–U+FFEF), which includes fullwidth variants of ASCII and Latin letters (e.g., U+FF01 for the fullwidth exclamation mark), halfwidth katakana (U+FF65–U+FF9F), and other compatibility characters that support round-trip conversion with legacy systems. These characters carry compatibility decomposition mappings tagged as <wide> or <narrow>, which map them to their base forms under compatibility normalization. Additionally, the informative East Asian Width property classifies all characters into categories (Narrow, Wide, Ambiguous, Halfwidth, Fullwidth, and Neutral) to assist rendering and layout algorithms for East Asian text.

Overview

Definition and Purpose

Halfwidth and fullwidth forms are typographic variants of characters, primarily Latin letters, numerals, and punctuation, used in East Asian contexts to achieve balanced spacing in mixed-script text. Halfwidth forms occupy approximately half the width of a standard fullwidth character, typically 1/2 em in fixed-pitch East Asian fonts, while fullwidth forms span a full em width, aligning visually with the square proportions of CJK (Chinese, Japanese, Korean) ideographs.

The primary purpose of these forms is to give mixed-script text consistent spacing in fixed-width displays and terminals, where standard narrow Latin characters might appear cramped alongside wider CJK scripts and unbalance line layout and readability. Halfwidth variants support compact, monospaced usage for Latin-like elements in environments where efficiency matters, such as early computer systems, whereas fullwidth variants follow traditional East Asian typography by treating all characters as uniformly square for aesthetic consistency.

The terminology originates from early computing practices in Japan: "halfwidth" (Japanese hankaku) is associated with the single-byte encoding of legacy character sets like JIS X 0201, which allowed denser text representation, while "fullwidth" (zenkaku) describes characters that fill the whole cell on line printers and terminals and that often required two bytes. These forms are encoded in Unicode's Halfwidth and Fullwidth Forms block (U+FF00–U+FFEF) for compatibility with such legacy systems.

Historical Development

The origins of halfwidth and fullwidth forms trace back to the 1960s and 1970s in Japanese computing and printing technologies, where halfwidth characters were developed to enable denser text representation in fixed-width media such as teleprinters and early digital systems. These forms addressed the challenge of integrating Japanese scripts with limited character spaces, initially through standards like JIS X 0201, established in 1976 but rooted in 1969 developments, which encoded 63 halfwidth katakana characters alongside ASCII for single-byte efficiency. This allowed Japanese text to align with monospaced Latin fonts in resource-constrained environments, marking an early milestone in East Asian computing.

In the late 1970s, IBM advanced these concepts through code pages tailored for Japanese, notably CPGID 300 (IBM Japanese DBCS-Host), which integrated halfwidth variants to fit more characters within fixed byte limits while supporting double-byte sets for kanji. Drawing from a source standard that defined 6,151 characters, including halfwidth katakana from JIS X 0201 and fullwidth forms for broader compatibility, this facilitated data interchange in mainframe and PC systems. The development reflected IBM's efforts to adapt international standards for Japanese computing, with subsequent revisions in the 1980s and 1990s expanding the extended character sets, including non-kanji support of over 2,500 characters by 1999, beyond the core JIS standards.

The 1980s saw widespread adoption of halfwidth and fullwidth forms in ASCII-compatible systems for East Asian terminals and word processors, driven by the proliferation of personal computing in Japan. Platforms like the PC-9800 series, launched in 1982, relied on encodings such as Shift-JIS, which incorporated halfwidth katakana for efficient text processing and display in business applications. A key milestone was the 1986 edition of ISO 2022, which introduced escape sequences (e.g., ESC F for designation) to switch between halfwidth and fullwidth modes in 7-bit and 8-bit codes, enabling dynamic character set invocation for Japanese protocols and ensuring interoperability across telegraphic and computing networks.

By the 1990s, the transition to 16-bit encodings culminated in Unicode's inclusion of halfwidth and fullwidth forms as compatibility characters for legacy systems, with the Halfwidth and Fullwidth Forms block (U+FF00–U+FFEF) introduced in Unicode 1.1 (1993) to support round-trip conversion from standards like JIS X 0201 and Shift-JIS. The block spans 240 code points, including fullwidth Latin variants and halfwidth katakana, preserving visual distinctions for East Asian text while aligning with ISO/IEC 10646. These mappings provided compatibility without altering core Unicode principles, solidifying the forms' role in modern digital text processing.

Character Encoding

In Unicode

The Halfwidth and Fullwidth Forms block in Unicode spans the range U+FF00–U+FFEF and encompasses 240 code points, of which 225 are assigned to characters such as fullwidth variants of Latin letters, digits, and punctuation; halfwidth katakana and Hangul; and related punctuation forms. This block serves primarily as a compatibility zone, preserving distinctions from legacy East Asian encodings where character width variants were essential for efficient data storage and display in fixed-width contexts.

These characters are classified under the East_Asian_Width property, with fullwidth forms assigned the value "F" (Fullwidth) and halfwidth forms assigned "H" (Halfwidth), reflecting their intended rendering widths of 1 em and 1/2 em, respectively, in East Asian typography. Including these forms ensures round-trip compatibility with encodings like Shift-JIS, where halfwidth characters (e.g., single-byte katakana) map to their Unicode equivalents without loss of width-specific information, while fullwidth characters (often double-byte) correspond to widened variants of ASCII or other base forms. For instance, the fullwidth Latin capital letter A (U+FF21) has a compatibility decomposition mapping to the standard Latin capital letter A (U+0041).

In normalization, fullwidth and halfwidth characters are treated as compatibility equivalents rather than canonical equivalents. They are therefore left unchanged by Normalization Form C (NFC) and Form D (NFD), which rely solely on canonical mappings and so avoid unintended alterations in legacy data. In Normalization Form KC (NFKC) and Form KD (NFKD), compatibility decompositions apply, mapping these variants to their base forms (e.g., fullwidth A to standard A); this discards the width distinction and so breaks round-trip fidelity with systems that depend on it.

The block was introduced in Unicode version 1.1 in 1993 to accommodate early East Asian character set integrations, with subsequent expansions in later versions, such as Unicode 3.2, adding further CJK punctuation variants to enhance compatibility coverage.
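The difference between the canonical and compatibility normalization forms can be observed with Python's standard `unicodedata.normalize()`. This is a small sketch rather than a recommended conversion routine; the sample strings are chosen for illustration only.

```python
import unicodedata

fullwidth = "\uff21\uff22\uff23\uff11\uff12\uff13"  # fullwidth "ABC123"
halfwidth_ga = "\uff76\uff9e"  # halfwidth KA followed by halfwidth voiced sound mark

# NFC and NFD apply only canonical mappings, so width variants survive.
assert unicodedata.normalize("NFC", fullwidth) == fullwidth
assert unicodedata.normalize("NFD", halfwidth_ga) == halfwidth_ga

# NFKC and NFKD also apply the <wide>/<narrow> compatibility mappings.
print(unicodedata.normalize("NFKC", fullwidth))               # plain ASCII letters and digits
print(unicodedata.normalize("NFKD", halfwidth_ga).encode("unicode_escape"))
print(unicodedata.normalize("NFKC", halfwidth_ga).encode("unicode_escape"))
```

In the NFKD result the halfwidth pair becomes KATAKANA LETTER KA (U+30AB) plus the combining voiced sound mark (U+3099); NFKC then composes the pair into the single character KATAKANA LETTER GA (U+30AC). Once either form has been applied, the original width can no longer be recovered from the text alone.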

Compatibility with Other Standards

In Shift-JIS, implemented by Microsoft as Code Page 932, halfwidth katakana characters are encoded in the single-byte range from 0xA1 to 0xDF, drawing from JIS X 0201, while fullwidth characters, including katakana and kanji from JIS X 0208, are encoded as double-byte sequences whose lead byte falls in the ranges 0x81 to 0x9F and 0xE0 to 0xEF. This structure ensures backward compatibility with early Japanese computing systems that relied on single-byte efficiency for katakana in limited-storage environments.

EUC-JP, a variable-width encoding prevalent in Unix systems, supports halfwidth katakana through a two-byte sequence in which the lead byte 0x8E (equivalent to the SS2 control in ISO-2022) is followed by a trail byte in the 0xA1 to 0xDF range, aligning with the JIS X 0201 mappings. ISO-2022-JP, by contrast, is a 7-bit encoding that switches character sets with escape sequences and is used mainly in email and network protocols. Its base profile does not include halfwidth katakana; variants that support it designate the JIS X 0201 katakana set with an escape sequence (ESC ( I) rather than with the 8-bit SS2 byte. These mechanisms allow mode switching between ASCII, fullwidth kanji, and halfwidth katakana without disrupting legacy Unix text processing workflows.

Chinese standards like GB 2312 primarily encode fullwidth forms for hanzi in double-byte sequences, with halfwidth Latin characters handled via single-byte ASCII compatibility in the 0x00 to 0x7F range; extensions in GBK (Code Page 936) incorporate additional halfwidth Latin and symbols to support Windows environments, and these mappings are preserved in Unicode for compatibility. Big5, for traditional Chinese, follows a comparable approach, emphasizing fullwidth glyphs in double-byte ranges while retaining halfwidth Latin for basic interoperability.

In Korean standards, KS C 5601 (superseded by KS X 1001) includes halfwidth Hangul compatibility forms, which map to Unicode's U+FFA0–U+FFDC range for round-trip preservation during conversion from legacy systems.

Interoperability challenges arise from differences in byte length rather than byte order. UTF-8 is a byte-oriented, variable-length encoding with no byte-order dependence, and it encodes every character in U+FF00–U+FFEF as three bytes, whereas Shift-JIS and EUC-JP encode halfwidth katakana in one and two bytes respectively. Code that assumes byte length tracks display width, as it did in the legacy encodings, can therefore misalign text once the data is converted. CESU-8, the UTF-8 variant that serializes surrogate pairs in the manner of Java's modified UTF-8, differs from UTF-8 only for supplementary-plane characters, so it encodes halfwidth and fullwidth forms identically to UTF-8.
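The byte lengths described for the legacy encodings can be compared with UTF-8 using Python's built-in `shift_jis`, `euc_jp`, and `utf-8` codecs. The sketch below is illustrative only; the helper name `encoded_lengths` is hypothetical.

```python
HALFWIDTH_A = "\uff71"  # HALFWIDTH KATAKANA LETTER A
FULLWIDTH_A = "\u30a2"  # KATAKANA LETTER A (the ordinary, wide form)

ENCODINGS = ("shift_jis", "euc_jp", "utf-8")


def encoded_lengths(ch: str) -> dict[str, int]:
    """Number of bytes the character occupies in each encoding."""
    return {enc: len(ch.encode(enc)) for enc in ENCODINGS}


for label, ch in (("halfwidth", HALFWIDTH_A), ("fullwidth", FULLWIDTH_A)):
    for enc in ENCODINGS:
        print(label, enc, ch.encode(enc).hex())
```

In Shift-JIS the halfwidth character is the single byte 0xB1, and in EUC-JP it is the pair 0x8E 0xB1, matching the ranges given above. In UTF-8 both the halfwidth and the wide character take three bytes, so byte count no longer hints at display width.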

Font and Typography Support

OpenType Features

OpenType fonts support halfwidth and fullwidth forms through specific feature tags that enable glyph substitution and positioning adjustments, primarily for East Asian typography. The 'fwid' (Full Widths) feature replaces glyphs set on proportional or other widths with full-width variants occupying one em, ensuring uniform spacing in CJKV (Chinese, Japanese, Korean, Vietnamese) text; this is achieved via GSUB (Glyph Substitution) lookup type 1 tables. Similarly, the 'hwid' (Half Widths) feature substitutes proportional or full-width glyphs with half-width versions (half an em), commonly applied to Latin characters, figures, or kana in Japanese fonts for compact layouts.

The 'halt' (Alternate Half Widths) feature, distinct from 'hwid', repositions existing full-width glyphs onto half their nominal width without substitution, using GPOS (Glyph Positioning) lookup type 1; it is typically used so that full-width punctuation fits better in Japanese text, and it leaves proportional forms alone. These features are mutually exclusive with other width-related tags like 'pwid' (Proportional Widths), which substitutes uniform-width glyphs (typically full or half em) with proportional variants suitable for mixed Latin and CJK content. Fixed-width settings are generally used without kerning ('kern') so that spacing stays uniform.

Glyph substitution and positioning occur through the GSUB and GPOS tables and are often script-specific, for example under 'latn' (Latin script) for fullwidth variants of Roman letters within CJK fonts, allowing selection based on script and language context. Fonts like Noto Sans CJK implement these features to toggle between halfwidth and fullwidth glyphs, with 'fwid' enabling wide forms while narrow defaults are preserved otherwise.

Layout engines integrate these mechanisms for rendering: HarfBuzz processes the GSUB lookups to apply width substitutions in mixed East Asian and Latin text when the feature is enabled by the caller, supporting script tags like 'hani' (Han) or 'kana'. Core Text similarly exposes these features via its typography panel options for "Full Width" ('fwid') and "Half Width" ('hwid').

As an alternative for non-monospaced environments, the 'pwid' feature contrasts with the fixed half/fullwidth forms by providing proportional spacing, which can be combined with kerning for natural flow in proportional CJKV typography while staying out of contexts with monospaced needs, such as terminals. All of these features operate on glyphs after character encoding has been resolved; they are applied when the application or stylesheet requests them rather than being triggered by character properties such as East Asian Width.

Rendering Considerations

The Unicode Line Breaking Algorithm, as defined in UAX #14, assigns line-breaking classes to the characters in the Halfwidth and Fullwidth Forms block (U+FF00–U+FFEF). In the default property data, fullwidth letters and digits are treated as "ID" (Ideographic), which allows breaks before or after them (ID ÷ ID) as with CJK ideographs, while fullwidth punctuation takes punctuation classes such as opening, closing, or exclamation; implementations may tailor these defaults. Fullwidth forms are laid out with wide metrics, occupying the equivalent of two halfwidth units in East Asian typography, which influences justification by enabling even space distribution across CJK lines without irregular gaps.

In scenarios where fonts lack glyphs for halfwidth characters, operating systems like Windows employ font substitution mechanisms in GDI+ that select alternative fonts for the missing halfwidth glyphs, often resulting in inconsistent spacing and alignment because the fallback font's metrics differ. This fallback behavior prioritizes glyph availability over precise metrics, potentially disrupting monospaced rendering in applications handling mixed-width text.

Web rendering of halfwidth and fullwidth forms is managed through CSS properties such as font-variant-east-asian, whose keywords map to OpenType features: full-width and proportional-width select the corresponding width variants for Japanese typography, and additional values like jis78 select specific glyph forms. Browsers like Chrome adjust widths according to these OpenType features. This approach gives consistent display across devices but requires explicit author control to avoid default substitutions that alter intended spacing.

Platform-specific rendering varies significantly. Apple interfaces for Japanese text favor fullwidth forms to maintain monospaced alignment with kanji and kana, as recommended in Apple's input guidelines. In contrast, Android applications, particularly code editors, often default to halfwidth for compact, monospaced output in programming contexts, with metrics adapting to DPI scaling for readability on diverse screen sizes.

Accessibility considerations arise in how screen readers process these forms; tools like NVDA announce fullwidth characters as their standard semantic equivalents without denoting width, treating them identically to halfwidth counterparts for verbal output. Likewise, since fullwidth characters are compatibility equivalents, they are transcribed in braille identically to their base forms without width distinctions.

Usage Contexts

In East Asian Typography

In Japanese typography, fullwidth forms are predominantly used in vertical writing, known as tategaki, to ensure balanced columns and a uniform visual rhythm that aligns with the traditional grid-based layout of printed matter. This preference stems from the need for consistent character spacing in columnar formats, where fullwidth glyphs occupy a square em, preventing disruptions in line flow and enhancing readability in book spines, novels, and historical texts. Halfwidth katakana, referred to as hankaku, emerged as a practical alternative for space-constrained applications, particularly in technical diagrams and tables, following the establishment of the JIS X 0201 standard in 1976, which encoded these forms as single-byte characters for compatibility with early computing environments.

Chinese typographic conventions emphasize fullwidth punctuation in publications employing simplified script, such as the fullwidth comma (U+FF0C, ，) and the ideographic full stop (U+3002, 。), to harmonize with the square proportions of hanzi characters and maintain optical balance in body text. This approach is codified in style guides for digital and print media, where fullwidth variants integrate with the text without introducing irregular spacing. Halfwidth forms remain uncommon in mainland Chinese contexts but persist in Taiwan's Big5 encoding environments, where they support mode-switching between fullwidth Chinese and halfwidth Latin for bilingual interfaces and legacy systems.

In Korean practice, fullwidth Hangul dominates print media like newspapers, promoting aesthetic uniformity by rendering characters in consistent square cells that echo the modular structure of Hangul syllable blocks and facilitate grid alignment across pages. This monospaced approach enhances legibility in dense layouts, where varying widths could disrupt the visual harmony essential to journalistic design. Conversely, halfwidth Latin characters are entered via Hangul-compatible keyboards, enabling efficient input of Romanized terms or acronyms without requiring separate input modes, which streamlines multilingual composition in everyday computing.

Contemporary East Asian design, including manga and advertisements, often mixes halfwidth and fullwidth forms strategically for emphasis and compactness, as seen in the use of halfwidth English loanwords to create visual contrast or fit tight spaces within panels and headlines. For instance, brand names or technical terms may be set in halfwidth Roman letters to stand out against fullwidth Japanese or Korean text. More recently, the adoption of proportional fonts has reduced the need for rigidly fullwidth forms, allowing more flexible widths that adapt to digital screens and variable layouts.

In Computing and Terminals

Terminal emulators handle halfwidth and fullwidth forms to ensure proper display in grid-based interfaces, where fullwidth characters occupy two cells to match their visual width. Many emulators rely on a wcwidth-style function to determine character widths, treating fullwidth CJK and similar characters as double-width for accurate rendering, and some also support legacy encodings like Shift-JIS for input in older applications. Other emulators similarly provide support for fullwidth characters and Unicode, including normalization options that preserve fullwidth attributes during text processing.

In programming environments, Python's unicodedata module includes the east_asian_width() function, which returns a character's East Asian Width category: "F" for fullwidth, "W" for wide, "H" for halfwidth, "Na" for narrow, "A" for ambiguous, and "N" for neutral. This function, based on Unicode Standard Annex #11, can be used to compute display widths and prevent misalignment when handling strings with mixed-width characters in CLI tools, such as custom file listers or progress bars that process CJK filenames, much as the ls command relies on terminal width calculations.

Integrated development environments (IDEs) and text editors incorporate options to manage halfwidth and fullwidth forms for precise cursor positioning and text handling. In Vim, the 'ambiwidth' option controls the rendering of ambiguous-width characters (e.g., certain Latin or Greek symbols), with 'single' treating them as halfwidth like ASCII and 'double' rendering them at fullwidth to align with CJK contexts and avoid cursor misalignment over double-width glyphs. Visual Studio Code extensions, such as Zenkaku-Hankaku, enable toggling between fullwidth and halfwidth Japanese characters, useful for editing comments or code with embedded Japanese text while maintaining consistent formatting.

Web and mobile applications rely on input method editors (IMEs) that integrate halfwidth and fullwidth forms for natural language entry. The Japanese IME defaults to half-width alphanumeric input but allows switching to full-width modes (e.g., via F7 for full-width katakana) during composition, supporting both during romaji-to-kana conversion for Japanese text entry. Emoji rendering in these environments often applies fullwidth metrics: terminals and browsers treat many emoji as double-width to fit grid layouts, ensuring consistent spacing alongside CJK text.

Legacy systems like IBM z/OS maintain support for halfwidth characters in applications to ensure compatibility with historical data formats, including code pages that contain halfwidth katakana dating from the punch-card era, facilitating migrations in the 2020s without altering core logic. This backward compatibility allows older programs to process mixed-width text while integrating with modern Unicode handling.
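To make the alignment problem concrete, the sketch below pads strings to a target number of terminal columns using `unicodedata.east_asian_width()`. It is a simplified illustration, not a replacement for a full wcwidth implementation: combining marks, control characters, and emoji sequences are not handled. The helper names are hypothetical, and the `ambiwidth` parameter merely imitates the 'single'/'double' choice that Vim's option of the same name offers.

```python
import unicodedata


def display_width(text: str, ambiwidth: str = "single") -> int:
    """Approximate terminal columns: W and F count 2, A counts 2 only if 'double'."""
    wide = {"W", "F"}
    if ambiwidth == "double":
        wide.add("A")
    return sum(2 if unicodedata.east_asian_width(ch) in wide else 1 for ch in text)


def ljust_display(text: str, columns: int, ambiwidth: str = "single") -> str:
    """Like str.ljust, but pads by display columns instead of code points."""
    padding = max(0, columns - display_width(text, ambiwidth))
    return text + " " * padding


# Sample file names: ASCII, kanji, and halfwidth katakana.
rows = [
    ("report.txt", "12 KB"),
    ("\u5831\u544a\u66f8.txt", "8 KB"),
    ("\uff92\uff93.txt", "1 KB"),
]
for name, size in rows:
    print(ljust_display(name, 14) + size)
```

Using `str.ljust` directly would pad the kanji file name with too many spaces on a terminal, because it counts three code points where the screen shows six columns.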

Examples and Comparisons

Visual Differences

Halfwidth and fullwidth forms exhibit distinct visual characteristics primarily in their horizontal spacing and proportions, designed to integrate with East Asian conventions. Fullwidth characters, such as the Latin capital letter A at U+FF21 (Ａ), are rendered with a width approximating 1 em in fixed-pitch East Asian fonts, ensuring they occupy a square-like cell comparable to CJK ideographs; for instance, in a standard 12-point font at 96 dpi, this equates to about 16 pixels wide. In contrast, halfwidth forms like the katakana letter A at U+FF71 (ｱ) span roughly 0.5 em, or half that width, allowing two such characters to fit within the space of one fullwidth slot for denser layout in mixed-script text. This difference facilitates balanced line composition in documents combining Latin scripts with CJK elements, where the purpose of width balancing is to maintain optical evenness across character sets.

Punctuation marks further highlight these visual disparities, particularly in vertical alignment and baseline positioning. The fullwidth full stop at U+FF0E (．) is proportioned to sit within a full cell alongside CJK ideographs, creating a uniform horizontal rhythm in East Asian typesetting. Conversely, the ordinary period at U+002E (.) typically sits on the baseline in a narrow advance, following Western Latin conventions, which can disrupt the rhythm of mixed lines. These alignments let fullwidth punctuation integrate into ideographic blocks without protruding or receding, while halfwidth variants prioritize compatibility with narrower scripts.

In terms of script-specific visuals, fullwidth Latin characters emulate the squareness of CJK glyphs, filling the em square for a cohesive appearance in bilingual texts. Halfwidth forms for non-Latin scripts, such as those in the range U+FFA0–U+FFDC for compatibility jamo elements, are compressed into a narrower cell, suitable for inline insertions where space efficiency is key. This compression preserves legibility while reducing footprint, and is often used to embed symbols without expanding line length.

Alignment effects become evident in monospaced fonts, where fullwidth characters maintain straight, non-jagged edges in tabular layouts by matching the uniform cell width of surrounding ideographs. Halfwidth forms, by contrast, enable denser packing in lists or columns, accommodating more content per line without truncation or overflow. However, common pitfalls arise in proportional fonts, where halfwidth characters may appear disproportionately narrow due to variable proportions, which worsens legibility on low-resolution displays.

Specific Character Mappings

The Halfwidth and Fullwidth Forms block (U+FF00–U+FFEF) includes characters designed for compatibility with legacy East Asian encodings, where many have defined compatibility decompositions mapping them to their base forms in other blocks, such as Basic Latin or CJK Symbols and Punctuation. These mappings are specified in the UnicodeData.txt file and facilitate normalization processes like NFKC, allowing conversion between variant forms while preserving semantic equivalence. Representative mappings are cataloged below in a table, focusing on key categories including Latin letters, punctuation, kana, and symbols. Each entry shows the variant code point and name, its base form, and brief usage notes, along with the decomposition type (<wide> for fullwidth expansions, <narrow> for halfwidth contractions). These examples illustrate common correspondences used in text processing and legacy data conversion.

Variant Code Point | Variant Name | Base Code Point | Base Name | Decomposition Type | Usage Notes
U+FF21 | FULLWIDTH LATIN CAPITAL LETTER A | U+0041 | LATIN CAPITAL LETTER A | <wide> | Part of fullwidth Latin range U+FF21–U+FF3A, used in East Asian fixed-width contexts.
U+FF41 | FULLWIDTH LATIN SMALL LETTER A | U+0061 | LATIN SMALL LETTER A | <wide> | Corresponds to fullwidth lowercase range U+FF41–U+FF5A for compatibility.
U+FF01 | FULLWIDTH EXCLAMATION MARK | U+0021 | EXCLAMATION MARK | <wide> | Fullwidth U+FF01–U+FF20 mirrors Basic Latin U+0021–U+0040.
U+FF06 | FULLWIDTH AMPERSAND | U+0026 | AMPERSAND | <wide> | Employed in legacy Japanese encodings for wider visual spacing.
U+FF0E | FULLWIDTH FULL STOP | U+002E | FULL STOP | <wide> | Standard fullwidth period equivalent.
U+FF61 | HALFWIDTH IDEOGRAPHIC FULL STOP | U+3002 | IDEOGRAPHIC FULL STOP | <narrow> | Halfwidth CJK punctuation U+FF61–U+FF65 for compact layouts.
U+FF71 | HALFWIDTH KATAKANA LETTER A | U+30A2 | KATAKANA LETTER A | <narrow> | Within the halfwidth katakana letter range U+FF66–U+FF9D, associated with Japanese input systems.
U+FF94 | HALFWIDTH KATAKANA LETTER YA | U+30E4 | KATAKANA LETTER YA | <narrow> | Used in halfwidth katakana for terminal displays and older software.
U+FF66 | HALFWIDTH KATAKANA LETTER WO | U+30F2 | KATAKANA LETTER WO | <narrow> | First letter of the halfwidth katakana range; the range is followed by sound marks such as U+FF9E, which maps to U+3099 (COMBINING KATAKANA-HIRAGANA VOICED SOUND MARK).
U+FFE5 | FULLWIDTH YEN SIGN | U+00A5 | YEN SIGN | <wide> | Fullwidth compatibility characters including monetary symbols in U+FFE0–U+FFE6 for East Asian display.
U+FFE1 | FULLWIDTH POUND SIGN | U+00A3 | POUND SIGN | <wide> | Compatibility form for international currency in East Asian text.
U+FFA1 | HALFWIDTH HANGUL LETTER KIYEOK | U+3131 | HANGUL LETTER KIYEOK | <narrow> | Halfwidth Hangul jamo U+FFA0–U+FFBE for Korean legacy compatibility.
U+FF64 | HALFWIDTH IDEOGRAPHIC COMMA | U+3001 | IDEOGRAPHIC COMMA | <narrow> | Compact variant for CJK typography in narrow spaces.

These mappings ensure round-trip compatibility, with fullwidth forms expanding base characters to occupy the width of a CJK ideograph and halfwidth forms contracting them for ASCII-like layouts. For exhaustive lists, consult the full UnicodeData.txt.
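The compatibility decompositions listed above can be read programmatically. Python's standard `unicodedata.decomposition()` returns the raw mapping string from the Unicode Character Database, including the <wide> or <narrow> tag. The helper `width_mapping` below is a hypothetical name introduced for this sketch, and it assumes the mapping target is a single code point.

```python
import unicodedata


def width_mapping(ch: str):
    """Return (tag, base_character) for <wide>/<narrow> mappings, else None."""
    fields = unicodedata.decomposition(ch).split()
    if len(fields) == 2 and fields[0] in ("<wide>", "<narrow>"):
        return fields[0], chr(int(fields[1], 16))
    return None


for cp in (0xFF21, 0xFF71, 0xFFE5, 0xFFA1, 0xFF9E):
    ch = chr(cp)
    tag, base = width_mapping(ch)
    print(f"U+{cp:04X} {unicodedata.name(ch)}: {tag} U+{ord(base):04X}")
```

Characters without a width-related decomposition, such as plain ASCII letters, yield `None`, so the helper can also serve as a quick test for whether a character is a width variant.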
