Zero-width joiner
The zero-width joiner (ZWJ, /ˈzwɪdʒ/;[1] HTML entity: &#8205; or &zwj;) is a non-printing character used in the computerized typesetting of writing systems in which the shape or positioning of a grapheme depends on its relation to other graphemes (complex scripts), such as the Arabic script or any Indic script. The Roman script is sometimes also counted as complex, e.g. when set in a Fraktur typeface. When placed between two characters that would otherwise not be connected, a ZWJ causes them to be printed in their connected forms.
The exact behaviour of the ZWJ varies depending on whether the use of a conjunct consonant or ligature (where multiple characters are shown with a single glyph) is expected by default; for instance, it suppresses the use of conjuncts in Devanagari (whilst still allowing the use of the individual joining form of a dead consonant, as opposed to a halant form as would be required by the zero-width non-joiner), but induces the use of conjuncts in Sinhala (which does not use them by default).[2][3] Similarly to Sinhala, when a ZWJ is placed between two emoji characters (or interspersed between multiple), it can result in a single glyph being shown, such as the family emoji, made up of two adult emoji and one or two child emoji.[4]
In some cases, such as the second Devanagari example below, the ZWJ can be used to display a joining form in isolation, when included after the character and combining halant code.
The character's code point is U+200D ZERO WIDTH JOINER (&zwj;). In the InScript keyboard layout for Indian languages, it is typed with the key combination Ctrl+⇧ Shift+1. However, many layouts use the position of QWERTY's ']' key for this character.[5]
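The properties stated above can be checked with Python's standard `unicodedata` module. This is a minimal sketch; the helper name `describe` is an illustrative choice, not part of any library.

```python
import unicodedata

ZWJ = "\u200d"  # U+200D ZERO WIDTH JOINER


def describe(ch: str) -> dict:
    """Collect basic Unicode facts about a single character."""
    return {
        "code_point": f"U+{ord(ch):04X}",
        "name": unicodedata.name(ch),
        "category": unicodedata.category(ch),  # "Cf" means format character
        "utf8": ch.encode("utf-8"),
    }


info = describe(ZWJ)
for key, value in info.items():
    print(f"{key}: {value}")

# The ZWJ is invisible but still counts as one code point in a string.
joined = "a" + ZWJ + "b"
print(len(joined))
```

Because the character has no visible glyph, printing `joined` typically looks like "ab" even though the string holds three code points.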
Examples
Bengali

| Character sequence | Appearance |
|---|---|
| [ra র] [virāma ্ ] [ya য] | র্য |
| [ra র] [ZWJ] [virāma ্ ] [ya য] | র্য |

Devanagari

| Character sequence | Appearance |
|---|---|
| [ka क] [virāma ्] | क् |
| [ka क] [virāma ्] [ZWJ] | क् |
| [ka क] [virāma ्] [ṣa ष] | क्ष |
| [ka क] [virāma ्] [ZWJ] [ṣa ष] | क्ष |
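
The four Devanagari sequences in the table above can be spelled out by code point. The sketch below only builds the strings and lists their code points; how each one is displayed depends on the font and shaping engine, so no rendered output is asserted here. The helper name `code_points` is illustrative.

```python
import unicodedata

KA = "\u0915"      # DEVANAGARI LETTER KA
VIRAMA = "\u094d"  # DEVANAGARI SIGN VIRAMA (halant)
SSA = "\u0937"     # DEVANAGARI LETTER SSA
ZWJ = "\u200d"     # ZERO WIDTH JOINER

sequences = {
    "ka + virama": KA + VIRAMA,
    "ka + virama + ZWJ": KA + VIRAMA + ZWJ,
    "ka + virama + ssa": KA + VIRAMA + SSA,
    "ka + virama + ZWJ + ssa": KA + VIRAMA + ZWJ + SSA,
}


def code_points(text: str) -> list:
    """Return the code points of a string in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


for label, text in sequences.items():
    print(f"{label:26} {' '.join(code_points(text)):30} {text}")
```

The rows with and without the ZWJ differ by exactly one code point, which is why they can look identical in a renderer that ignores or lacks the relevant glyph forms.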

Kannada

| Character sequence | Appearance |
|---|---|
| [ra ರ] [virāma ್] [ka ಕ] | ರ್ಕ |
| [ra ರ] [ZWJ] [virāma ್] [ka ಕ] | ರ್ಕ |

Sinhala

| Character sequence | Appearance |
|---|---|
| [śa ශ] [virāma ්] [ra ර] | ශ්ර |
| [śa ශ] [virāma ්] [ZWJ] [ra ර] | ශ්ර |

Malayalam

| Character sequence | Appearance |
|---|---|
| [Na ണ] [virāma ്] [ZWJ] | ണ് |
| [na ന] [virāma ്] [ZWJ] | ന് |
| [ra ര] [virāma ്] [ZWJ] | ര് |
| [la ല] [virāma ്] [ZWJ] | ല് |
| [La ള] [virāma ്] [ZWJ] | ള് |

Emoji

| Character sequence | Appearance | Description |
|---|---|---|
| [Man] [ZWJ] [Woman] [ZWJ] [Boy] | 👨👩👦 | Family: Man, Woman, Boy |
| [Black flag] [ZWJ] [Skull and Crossbones] | 🏴☠️ | Pirate Flag |
| [Runner] [Emoji Modifier Fitzpatrick Type-1-2] [ZWJ] [Female Sign] | 🏃🏻♀️ | Woman Running: Light Skin Tone |
| [Runner] [Emoji Modifier Fitzpatrick Type-6] [ZWJ] [Female Sign] | 🏃🏿♀️ | Woman Running: Dark Skin Tone |
| [Man] [ZWJ] [Red hair] | 👨🦰 | Man: Red Hair |
| [Person] [ZWJ] [Sheaf of rice] | 🧑‍🌾 | Farmer |
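
The emoji rows above are ordinary strings with U+200D between the component emoji. The following sketch assembles three of them and splits one back into its parts; `zwj_join` and `zwj_split` are hypothetical helper names, and whether a sequence displays as a single glyph depends on the platform's emoji font.

```python
ZWJ = "\u200d"

MAN = "\U0001F468"
WOMAN = "\U0001F469"
BOY = "\U0001F466"
BLACK_FLAG = "\U0001F3F4"
SKULL_AND_CROSSBONES = "\u2620\ufe0f"  # with emoji variation selector
RUNNER = "\U0001F3C3"
LIGHT_SKIN_TONE = "\U0001F3FB"         # Fitzpatrick Type-1-2 modifier
FEMALE_SIGN = "\u2640\ufe0f"           # with emoji variation selector


def zwj_join(*parts: str) -> str:
    """Concatenate emoji components with a ZWJ between each pair."""
    return ZWJ.join(parts)


def zwj_split(sequence: str) -> list:
    """Split a ZWJ sequence back into its components."""
    return sequence.split(ZWJ)


family = zwj_join(MAN, WOMAN, BOY)
pirate_flag = zwj_join(BLACK_FLAG, SKULL_AND_CROSSBONES)
woman_running = zwj_join(RUNNER + LIGHT_SKIN_TONE, FEMALE_SIGN)

for seq in (family, pirate_flag, woman_running):
    print(seq, [f"U+{ord(ch):04X}" for ch in seq])

print(zwj_split(family) == [MAN, WOMAN, BOY])
```

A renderer without a glyph for a given sequence usually falls back to showing the components side by side, which is the behaviour `zwj_split` mirrors.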
References
- [1] "113 New Unicode Emoji (plus skin tones)". Unicode Blog. 2016-11-28. Retrieved 2021-01-14.
- [2] Constable, Peter (2004-06-30). "Proposal on Clarification and Consolidation of the Function of ZERO WIDTH JOINER in Indic Scripts" (PDF). Unicode Consortium. UTC L2/04-279, Public Review Issue #37.
- [3] "13.2. Sinhala (§ Virama (al-lakuna) and Consonant Forms)". The Unicode Standard, Core Specification. Unicode Consortium. Quote: "Unless combined with a U+200D ZERO WIDTH JOINER, an al-lakuna is always visible and does not join consonants to form orthographic consonant clusters. […] Note how the use of ZWJ in Sinhala differs from that of typical Indic scripts."
- [4] "Zero Width Joiner". Emojipedia. Retrieved 2015-09-21.
- [5] "ചിത്രം:Inscript.jpg – Malayalam Computing" (in Malayalam). Malayalam.kerala.gov.in. Archived from the original on 2011-10-11. Retrieved 2011-10-22.
- [6] "Changes related to Malayalam in Unicode 5.1.0 from 5.0" (PDF). Unicode.org. Retrieved 2015-06-12.
Fundamentals
Definition and Purpose
The zero-width joiner (ZWJ), encoded as U+200D in the Unicode Standard, is a non-printing format control character that influences the rendering of adjacent characters without introducing any visible space or width.[1] It has no visible glyph of its own and adds nothing to the horizontal extent of a line, which distinguishes it from printable characters.[1] The ZWJ therefore affects only how a rendering engine interprets the surrounding text, not the apparent content or structure.[1]

The primary purpose of the ZWJ is to request a more connected visual appearance between neighboring characters or graphemes, overriding default rendering rules that would otherwise separate them or select disconnected forms.[1] It does so by requesting a ligature, in which multiple characters combine into a single glyph, or by requesting cursive joining in scripts that support it.[5] The ZWJ is thus a tool for typographic control: it keeps a sequence connected where standard glyph selection would not.[6]

Mechanically, when placed between two characters (for example, in a sequence such as <base character X, ZWJ, base character Y>), the ZWJ asks the rendering engine to treat the pair as a unit for glyph selection and formation.[1] This can raise the connection level from unconnected to cursive or ligated forms; the engine uses the most connected form that the font provides. The ZWJ also attaches to the preceding character during grapheme cluster segmentation, so a cluster boundary never falls immediately before it.[6] Because it has no width, it does not itself disturb line breaking, kerning, or text flow, which makes it suited to subtle adjustments in complex compositions.[7] Beyond script typography, it is also used to build multi-part emoji sequences, though its core role remains character joining.[1]

Related Characters
The zero-width joiner (ZWJ, U+200D) is closely related to several other zero-width Unicode format characters that influence text rendering without adding visible width. Each serves a distinct purpose in controlling joining, breaking, or spacing.[1]

The zero-width non-joiner (ZWNJ, U+200C) does the opposite of the ZWJ: it prevents the joining of characters that would otherwise connect in cursive scripts such as Arabic. In Persian, for example, it breaks cursive forms in compound words or at morpheme boundaries to maintain readability and semantic clarity.[1][8]

The word joiner (WJ, U+2060) inhibits line breaks between adjacent characters or words without affecting character joining or visual rendering. It is the preferred replacement for the zero-width no-break space function previously handled by U+FEFF, which is now primarily a byte order mark.[1]

The zero-width space (ZWSP, U+200B) provides a line-break opportunity and can interrupt ligatures between characters, but it has no effect on joining behavior. It is often used for text justification or to indicate word boundaries in languages written without spaces.[1]

The key functional differences lie in the effect on text layout: the ZWJ promotes connections such as ligatures or cursive joining, the ZWNJ suppresses them, the ZWSP enables breaks while potentially disrupting ligatures, and the WJ prevents breaks without altering joining or segmentation.[1]

| Character | Code Point | Category | Primary Use Cases |
|---|---|---|---|
| Zero-width joiner (ZWJ) | U+200D | Format (Cf) | Promotes ligatures and cursive connections in scripts such as Arabic and the Indic scripts.[1] |
| Zero-width non-joiner (ZWNJ) | U+200C | Format (Cf) | Suppresses joining in cursive scripts, e.g., breaking forms in Arabic or Persian compound words.[1][8] |
| Word joiner (WJ) | U+2060 | Format (Cf) | Prevents line breaks between words without affecting joining.[1] |
| Zero-width space (ZWSP) | U+200B | Format (Cf) | Allows line breaks and interrupts ligatures; used for justification or word boundaries.[1] |
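
Since all four characters in the table are invisible, it can help to locate them programmatically when debugging text. The sketch below uses Python's standard `unicodedata` module to confirm the shared Cf category and to report where such characters occur in a string; `find_zero_width` is a hypothetical helper written for this illustration.

```python
import unicodedata

# The four related format characters from the table above.
ZERO_WIDTH = {
    "ZWSP": "\u200b",
    "ZWNJ": "\u200c",
    "ZWJ": "\u200d",
    "WJ": "\u2060",
}

for abbreviation, ch in ZERO_WIDTH.items():
    print(abbreviation, f"U+{ord(ch):04X}",
          unicodedata.category(ch), unicodedata.name(ch))


def find_zero_width(text: str) -> list:
    """Return (index, Unicode name) for each of the four characters in text."""
    targets = set(ZERO_WIDTH.values())
    return [(i, unicodedata.name(ch))
            for i, ch in enumerate(text) if ch in targets]


sample = "a\u200cb\u2060c"
print(find_zero_width(sample))
```

Reporting positions rather than stripping the characters is a deliberate choice here: removing a ZWJ or ZWNJ can change how Indic, Arabic-script, or emoji text is displayed.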