Shift Out and Shift In characters

Main reference articles

Shift Out and Shift In characters

from Wikipedia

Shift Out (SO) and Shift In (SI) are ASCII control characters 14 and 15, respectively (0x0E and 0x0F).^[1] These are sometimes also called "Control-N" and "Control-O".

The original purpose of these characters was to provide a way to shift a coloured ribbon, split longitudinally usually with red and black, up and down to the other colour in an electro-mechanical typewriter or teleprinter, such as the Teletype Model 38, to automate the same function of manual typewriters. Black was the conventional ambient default colour and so was shifted "in" or "out" with the other colour on the ribbon.

Later advancements in technology instigated use of this function for switching to a different font or character set and back. This was used, for instance, in the Russian character set known as KOI7-switched, where SO starts printing Russian letters, and SI starts printing Latin letters again. Similarly, they are used for switching between Katakana and Roman letters in the 7-bit version of the Japanese JIS X 0201.^[2]^[3]
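
The 7-bit JIS X 0201 switching described above can be sketched in a few lines of Python. This is a minimal illustration, not a conformant decoder: `decode_jis7` and `ROMAN_OVERRIDES` are hypothetical names, the Roman set is assumed to differ from ASCII only at 0x5C and 0x7E, and the Katakana set is mapped onto Unicode's halfwidth Katakana range.

```python
SO, SI = 0x0E, 0x0F

# Assumption: the two JIS X 0201 Roman positions that differ from ASCII.
ROMAN_OVERRIDES = {0x5C: "\u00A5", 0x7E: "\u203E"}  # yen sign, overline


def decode_jis7(data: bytes) -> str:
    """Decode 7-bit JIS X 0201 text that uses SO/SI (hypothetical helper)."""
    out = []
    katakana = False  # initial state: the Roman set is active
    for b in data:
        if b == SO:
            katakana = True       # locking shift: stays on until SI
        elif b == SI:
            katakana = False
        elif katakana and 0x21 <= b <= 0x5F:
            # Halfwidth Katakana U+FF61..U+FF9F mirrors this byte range.
            out.append(chr(0xFF61 + (b - 0x21)))
        else:
            # Roman set; bytes outside the Katakana range also land here.
            out.append(ROMAN_OVERRIDES.get(b, chr(b)))
    return "".join(out)


print(decode_jis7(b"AB\x0e\x31\x32\x0fCD"))
```

The single boolean is the whole decoder state, which is what makes the shift "locking": every byte after SO is read from the Katakana set until SI arrives.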

SO/SI control characters also are used to display VT100 pseudographics. Shift In is also used in the 2G variant^[4] of SoftBank Mobile's encoding for emoji.

The ISO/IEC 2022 standard (ECMA-35, JIS X 0202) standardises the generalized usage of SO and SI for switching between pre-designated character sets invoked over the 0x20–0x7F byte range. It refers to them respectively as Locking Shift One (LS1) and Locking Shift Zero (LS0) in an 8-bit environment, or as SO and SI in a 7-bit environment.^[5] In ISO-2022-compliant code sets where the 0x0E and 0x0F characters are used for the purpose of emphasis (such as an italic or red font) rather than a change of character set, they are referred to respectively as Upper Rail (UR) and Lower Rail (LR), rather than SO and SI.^[6]

References

[1] "The Linux Programmer's Manual". Retrieved 2012-11-16.
[2] Japanese Industrial Standards Committee (1975-12-01). The Japanese Katakana graphic set of characters (PDF). ITSCJ/IPSJ. ISO-IR-13.
[3] Japanese Industrial Standards Committee (1975-12-01). The Japanese Roman graphic set of characters (PDF). ITSCJ/IPSJ. ISO-IR-14.
[4] Kawasaki, Yusuke (2010). Emoji encodings and cross-mapping tables in pure Perl.
[5] ECMA (1994). "7.3: Invocation of character-set code elements". Character Code Structure and Extension Techniques (PDF) (ECMA Standard) (6th ed.). p. 14. ECMA-35.
[6] Sveriges Standardiseringskommission (1975-12-01). NATS Control set for newspaper text transmission (PDF). ITSCJ/IPSJ. ISO-IR-7.

Shift Out and Shift In characters

from Grokipedia

Shift Out (SO) and Shift In (SI) are control characters in the 7-bit coded character set for information interchange, as defined by the ECMA-6 standard (equivalent to ISO/IEC 646), with SO assigned the bit combination 0/14 (decimal 14, hexadecimal 0E, Unicode U+000E) and SI assigned 0/15 (decimal 15, hexadecimal 0F, Unicode U+000F).^[1] These characters belong to the C0 control set and are primarily used to enable code extension techniques by switching between alternate graphic character sets in data transmission and processing environments.^[2]

In the ISO/IEC 2022 framework for character code structure and extension (identical to ECMA-35), SO functions as a locking-shift mechanism to invoke the designated G1 graphic character set into the GL (graphics left) code table area (columns 02 to 07 in a 7-bit code), affecting all subsequent characters until another shift occurs.^[2] Conversely, SI invokes the G0 graphic character set (typically the basic ASCII set) back into the GL area, restoring the primary encoding state.^[2] This pairwise shifting allows extended character repertoires, such as national variants or additional symbols, to be represented without a full 8-bit code, and supports transformations between 7-bit and 8-bit codes while preserving shift states.^[2]

Originally included in the 1963 ASCII standard to facilitate ribbon shifting on early teleprinters and typewriters for bichrome output, SO and SI evolved into key components of international encoding standards for handling multilingual text over 7-bit channels, such as early email encodings (e.g., ISO-2022-KR).^[3] Their use persists in legacy systems, terminal emulations like VT100, and certain East Asian encodings, where they toggle between ASCII-compatible sets and denser graphic sets containing ideographs or special symbols.^[4] In modern contexts, while largely superseded by stateless encodings like UTF-8, they remain defined in Unicode for backward compatibility and round-trip preservation in protocols adhering to ISO 2022 variants.^[3]

Definition and Purpose

Shift Out Character

The Shift Out (SO) character is defined as ASCII control character number 14, with a hexadecimal value of 0x0E and decimal value of 14, and is commonly referred to as Control-N.^[5]^[2] Its primary function is to invoke or lock an alternate character set, such as the G1 graphic character set in ISO terminology, thereby enabling the interpretation of subsequent code combinations outside the standard character set until a Shift In character is received.^[5]^[2] Alternatively, in certain device contexts, SO can shift a physical mechanism, such as engaging an alternate color ribbon in printers.^[3] Operationally, SO functions as a locking shift mechanism, activating the alternate state persistently for following characters and allowing continuous use of the shifted configuration without repeated invocations, which is countered only by the paired Shift In character to revert to the primary state.^[5]^[2] The standard mnemonic and symbolic representations for SO include the abbreviation "SO" and the caret notation "^N" in various technical documentation and implementations.^[5]

Shift In Character

The Shift In (SI) character is a control function defined in the ASCII standard as code point 15, represented in hexadecimal as 0x0F and in decimal as 15, and commonly abbreviated as Control-O or ^O in mnemonic notations.^[6] This non-printing character invokes the primary (G0) character set, so that subsequent bit combinations are interpreted according to the default encoding until another control function alters the state. In other words, it reverts the active character set to the initially designated primary set, such as G0 in the context of code extension techniques, thereby ending any prior shift to an alternate set.

In operational terms, SI cancels the effects of a preceding Shift Out (SO) by switching back to the default state, with the change applying to the following characters in the data stream without affecting ongoing control sequences.^[6] In 7-bit environments, SI is specifically employed as a locking-shift function to the G0 set. SI is symbolically represented as SI in standards documentation and ^O in caret notation for control characters, facilitating its identification in programming and terminal contexts.^[6] In conjunction with SO, it enables selective invocation of alternate character sets (like G1) for a sequence of characters before returning to the primary set, supporting multilingual or specialized symbols within constrained code spaces.

Historical Development

Origins in Early Teleprinters

The origins of the Shift Out (SO) and Shift In (SI) characters trace back to the mechanical constraints of late 19th- and early 20th-century printing telegraphs, where limited bit widths necessitated mechanisms to toggle between character sets without expanding the code itself. Émile Baudot's pioneering 5-bit code, patented in 1874, laid the foundation by introducing shift functions to access letters, figures, and symbols within a 32-character repertoire, enabling efficient transmission over telegraph lines in devices like synchronous multiplex printers.^[7] This approach influenced subsequent systems by allowing dynamic mode switching, a concept directly ancestral to SO and SI for set selection.^[7]

In the 1900s and 1920s, electro-mechanical teleprinters such as those developed by the Morkrum Company (later Teletype Corporation) incorporated shift mechanisms to control printing operations, including the advancement of colored ribbons (typically red and black) for distinguishing text types in page printers. For instance, Donald Murray's variants of the Baudot code, patented starting in 1901, refined these shifts to optimize keyboard input and tape perforation, facilitating reliable operation in asynchronous start-stop systems used for stock tickers and news services.^[7] These innovations addressed the need for dual-mode printing in bandwidth-limited environments, with shifts mechanically locking the typebar or ribbon position until reset.^[8]

The International Telegraph Alphabet No. 2 (ITA2), formalized in the 1920s, standardized these shift operations for global use, employing FIGS (figures/symbols mode) and LTRS (letters mode) to expand the effective character set beyond 5 bits while maintaining compatibility with early teleprinters.^[7] ITA2's design, influenced by Murray's enhancements, allowed toggling between letters and numerals or punctuation, directly paralleling the later SO and SI functions for set invocation.^[8] This was crucial for handling mixed text in mechanical devices, where a single shift code could reconfigure the print head for an entire sequence.

Key milestones included the Comité Consultatif International Télégraphique (CCITT)'s adoption of ITA2 in 1930 as the international standard for telegraphy, promoting interoperability across borders.^[7] Earlier, in 1922, the U.S. Navy demonstrated radioteletype (RTTY) using Baudot-derived shifts to transmit messages from an airplane to ground stations, highlighting the codes' robustness in wireless applications.^[9] By 1924, the Radio Corporation of America (RCA) employed similar RTTY systems to send text from shore to ships, further embedding shift mechanisms in maritime and military communications.^[9]

Inclusion in ASCII Standard

The development of the Shift Out (SO) and Shift In (SI) characters within the ASCII standard began with proposals from the American Standards Association (ASA) X3.4 committee in 1961, as part of efforts to create a unified seven-bit code for information interchange. This initial proposal, influenced by the need to accommodate existing communication practices, assigned SO to position 0/14 and SI to 0/15 for enabling shifts between character sets, drawing directly from teleprinter conventions. The committee's work culminated in the publication of the first ASCII standard, ASA X3.4-1963, on June 17, 1963, which formalized the inclusion of SO and SI in the control character block at decimal positions 14 (0x0E) and 15 (0x0F), respectively.^[10]^[3]

The rationale for retaining SO and SI in the seven-bit ASCII design centered on backward compatibility with established teleprinter codes, such as the International Telegraph Alphabet No. 2 (ITA2), despite the standard's goal of assigning fixed positions to characters to eliminate the need for shifting mechanisms. Proponents argued that these control characters were essential for interoperability with legacy equipment used in data communications, allowing efficient switching between uppercase/lowercase or figures/letters without expanding the code beyond seven bits. This decision reflected compromises during committee debates, where non-shifting alternatives, like AT&T's earlier six-unit code proposal, were considered but ultimately rejected in favor of preserving compatibility for practical deployment.^[10]^[11]

Key influences on the inclusion came from teleprinter manufacturers, notably Teletype Corporation (a subsidiary of AT&T), whose equipment relied on similar shift functions for ribbon control and character interpretation in five-unit codes. Representatives from Teletype and Western Union participated in the X3.4 deliberations, advocating for SO and SI to ensure the new standard supported existing hardware without requiring widespread replacements. The 1967 revision (USAS X3.4-1967) added definitions for lowercase letters, while the 1968 revision (USAS X3.4-1968), published as ANSI X3.4-1968, maintained these assignments unchanged, solidifying their place in the control block (0x00-0x1F) with minor clarifications to control functions.^[10]^[3]^[12]

Technical Specifications

Code Points and Representation

The Shift Out (SO) character is encoded in the ASCII standard at decimal 14, hexadecimal 0x0E, and binary 0001110 (7-bit representation).^[13] The Shift In (SI) character follows immediately at decimal 15, hexadecimal 0x0F, and binary 0001111.^[13] These positions place SO and SI within the C0 control character range of ASCII, defined in the original ANSI X3.4-1968 standard and subsequent revisions.^[13] In Unicode, SO is mapped to the code point U+000E and SI to U+000F, both within the Basic Latin block (U+0000 to U+007F).^[14] These are classified as control characters with no assigned glyph in standard rendering, though they retain their ASCII semantics for compatibility.^[14] For visualization in debugging or documentation tools, symbolic representations are sometimes used, such as ␎ (U+240E, SYMBOL FOR SHIFT OUT) for SO and ␏ (U+240F, SYMBOL FOR SHIFT IN) for SI, drawn from the Control Pictures block. The encodings remain consistent in 8-bit extensions of ASCII, where the most significant bit is set to 0 for these 7-bit controls, preserving their positions without alteration.^[15] In ISO/IEC 8859 series standards, such as ISO/IEC 8859-1 (Latin-1), SO and SI occupy the same byte values (0x0E and 0x0F) to ensure backward compatibility with 7-bit ASCII.^[15] Similarly, in EBCDIC, the encodings align at hexadecimal 0x0E for SO and 0x0F for SI, maintaining equivalence to ASCII control codes despite differences elsewhere in the code page.^[16]

Standard	SO Code Point	SI Code Point
ASCII (7-bit)	Dec: 14, Hex: 0x0E, Bin: 0001110	Dec: 15, Hex: 0x0F, Bin: 0001111
Unicode	U+000E	U+000F
ISO 8859-1	0x0E	0x0F
EBCDIC	0x0E	0x0F
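
The values in the table, along with the caret and Control Pictures notations mentioned above, can be checked with a short Python sketch. The helper names `caret` and `control_picture` are illustrative, not part of any standard library.

```python
import unicodedata

SO, SI = "\x0e", "\x0f"


def caret(ch: str) -> str:
    """Caret notation for a C0 control: XOR with 0x40 gives the letter."""
    return "^" + chr(ord(ch) ^ 0x40)


def control_picture(ch: str) -> str:
    """Map a C0 control (U+0000..U+001F) to its Control Pictures symbol."""
    return chr(0x2400 + ord(ch))


for name, ch in (("SO", SO), ("SI", SI)):
    print(
        name,
        ord(ch),                    # decimal value
        f"0x{ord(ch):02X}",         # hexadecimal value
        format(ord(ch), "07b"),     # 7-bit binary
        caret(ch),                  # ^N or ^O
        control_picture(ch),        # U+240E or U+240F
        unicodedata.category(ch),   # Unicode general category
    )
```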

Locking Shift Mechanisms

In character encoding protocols such as those defined in ECMA-35, locking shift mechanisms enable persistent invocation of designated graphic character sets into specific positions within the code table. The Shift Out (SO) control function, equivalently known as Locking Shift One (LS1), invokes the alternate graphic character set G1 into the left half (GL) of the code table, allowing subsequent characters to be interpreted from this set until another locking shift occurs.^[2] Conversely, the Shift In (SI) control function, or Locking Shift Zero (LS0), invokes the primary graphic character set G0 back into the GL area, restoring interpretation to the default set for all following characters.^[2] This persistence distinguishes locking shifts from temporary single-shift functions, such as Single Shift Two (SS2) or Single Shift Three (SS3), which only affect the interpretation of the immediately following character before reverting to the prior state.^[2]

The application of these mechanisms varies between 7-bit and 8-bit environments. In 7-bit codes, SO and SI directly toggle between G0 and G1 in the GL area, providing a simple mechanism for alternating sets without additional designation sequences.^[2] In 8-bit environments, LS0 and LS1 perform analogous functions, but character sets must first be designated using escape sequences to specific positions (e.g., via ESC sequences for G0 or G1), after which the locking shifts invoke them persistently into GL or, in some cases, the right half (GR) using rightward variants like LS1R.^[2]

Error conditions arise if locking shifts are employed without prior designation of the relevant character sets. In such cases, the behavior is undefined, and no graphic characters may be invoked until a valid designation occurs, potentially leading to garbled output or fallback to default interpretations.^[2] Advanced ISO modes introduce further complexity through invocation state management, where shifts interact with a conceptual stack for designating and selecting sets across multiple levels (e.g., G2 or G3), though the core locking persists on a per-area basis until explicitly changed.^[2]
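
The difference between locking and single shifts can be illustrated with a small state machine. The sketch below is a deliberate simplification: it handles only a 7-bit stream, treats SS2 and SS3 as the two-byte sequences ESC N and ESC O, ignores designation escape sequences, and merely labels each graphic byte with the set it would be read from. The function name `tag_bytes` is hypothetical.

```python
SO, SI, ESC = 0x0E, 0x0F, 0x1B


def tag_bytes(data: bytes):
    """Label each graphic byte with the G-set it would be read from."""
    locked = "G0"   # set currently invoked into GL by a locking shift
    single = None   # set selected for the next graphic byte only
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b == SO:
            locked = "G1"                      # LS1: persists
        elif b == SI:
            locked = "G0"                      # LS0: persists
        elif b == ESC and i + 1 < len(data) and data[i + 1] in (0x4E, 0x4F):
            single = "G2" if data[i + 1] == 0x4E else "G3"   # SS2 / SS3
            i += 1                             # consume the final byte
        elif 0x21 <= b <= 0x7E:
            out.append((single or locked, chr(b)))
            single = None                      # single shift is used up
        i += 1
    return out


print(tag_bytes(b"a\x0ebc\x1bNde\x0ff"))
```

In the sample stream, the bytes after SO stay in G1 until SI, whereas ESC N redirects only the one byte that follows it.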

Usage in Standards and Systems

Role in ASCII and Early Computing

The Shift Out (SO) and Shift In (SI) characters, assigned code points 14 (0x0E) and 15 (0x0F) in the ASCII standard, were incorporated to enable temporary shifts between the primary character set and an alternate one, allowing early systems to access additional symbols or modes without expanding the fixed 7-bit code space.^[11] In the ASCII-1963 specification, SO initiated the alternate interpretation of subsequent characters, while SI reverted to the standard set, a mechanism retained from earlier 5-bit teleprinter codes like ITA2 to facilitate compatibility with existing hardware.^[11] This design supported basic extensions for graphics or special characters in text streams, though implementations varied, such as altering typeface or encapsulating legacy codes until SI was encountered.^[3]

In ASCII-based terminals of the 1970s and 1980s, such as the DEC VT100, SO and SI functioned as locking shift controls to toggle between character sets, with SO selecting the G1 set (often for line-drawing or pseudographic symbols) and SI returning to the G0 set (standard ASCII).^[17] This usage enabled efficient display of alternate fonts or modes in early digital environments, including minicomputers and serial-connected displays, where SO/SI sequences allowed pseudographics without dedicated escape codes.^[18] For instance, VT100 documentation specifies that SO (octal 016) invokes the G1 set for subsequent rendering, supporting applications like simple diagrams in text-based interfaces.^[17]

To ensure backward compatibility with legacy teleprinter hardware like the Teletype ASR-33, ASCII retained SO and SI for ribbon or case control, where SO could shift to an upper-case or alternate mode, and SI restored the lower-case default, mirroring telegraphic practices.^[11] The ASR-33, a staple in 1960s computing labs, transmitted these as 7-bit ASCII controls over serial lines, allowing integration with early systems like PDP minicomputers while preserving functionality for bichrome printing or figure shifts from 5-bit origins.^[19] This compatibility was crucial during the transition from 5-bit ITA2 to 7-bit ASCII, as teletypes formed the backbone of early data communication.^[11]

Despite their inclusion, SO and SI saw limited practical adoption in early computing due to ASCII's emphasis on a non-shifting, unambiguous 7-bit structure, which prioritized fixed interpretations over dynamic sets to avoid parsing complexity in serial protocols.^[11] Usage was confined to specific peripherals or international variants requiring occasional mode switches, as the standard's design favored simplicity for broad interoperability in environments like time-sharing systems.^[3] Their role diminished further as escape-sequence-based alternatives emerged, though they persisted in terminal emulations for legacy support.^[19]

Implementation in ISO/IEC 2022

In the ISO/IEC 2022 standard, as defined in ECMA-35, the Shift Out (SO) character at code position 00/14 functions as Locking Shift One (LS1) to invoke the G1 graphic character set into the GL (graphics left) code element area, while the Shift In (SI) character at 00/15 serves as Locking Shift Zero (LS0) to invoke the G0 set into GL.^[2] In both 7-bit and 8-bit environments, the invocation persists until another locking shift occurs.^[2] These mechanisms are integral to the standard's code structure, which divides an 8-bit code table into the C0 control area (columns 00 to 01), the GL graphic area (columns 02 to 07), the C1 control area (columns 08 to 09), and the GR graphic area (columns 10 to 15), enabling dynamic invocation of multiple character sets without redesignation.^[2]

Support for SO and SI is central to 7-bit variants of ISO/IEC 2022 such as ISO-2022-KR, where they operate alongside escape sequences for set designation. For instance, in ISO-2022-KR, an initial escape sequence ESC $ ) C designates the KS X 1001 (KSC 5601) set to G1, after which SO shifts to this Korean set for double-byte characters, and SI returns to the G0 ASCII set.^[20] Similarly, sequences such as ESC ( B designate the ISO-IR 6 (ASCII) set to G0, ensuring compatibility with 7-bit transport while allowing national character sets to be mixed in.^[2]

The use of these shifts extends to multilingual applications, particularly in encodings like ISO-2022-CN defined in RFC 1922, where SO enables switching from ASCII (G0) to Chinese sets such as GB 2312 or CNS 11643 after designation (e.g., ESC $ ) A for GB 2312 to G1), and SI reverts to ASCII, supporting mixed Latin text and Chinese ideographs in a single 7-bit stream.^[21] This approach invokes designated sets on demand, with SI often required at line ends to maintain ASCII compliance.^[21]

Conformance to ISO/IEC 2022 mandates SO and SI for 7-bit implementations at Level 2 and above, where support for G0 and G1 shifting is required, ensuring interoperability in text transport.^[2] In 8-bit environments, however, these locking shifts are optional, with single-shift functions like SS2 (ESC N) and SS3 (ESC O) providing alternatives for invoking G2 or G3 sets temporarily without full locking.^[2]

Applications and Examples

In Printing and Ribbon Control

In early printing devices equipped with two-color ribbons, typically split lengthwise into black and red sections, the Shift Out (SO) control character raised the ribbon to the red half for emphasis or color variation in printed text, while the Shift In (SI) character lowered it back to the black default position.^[22]^[23] This mechanism allowed automated control over ribbon positioning without manual intervention, originating from telegraphy standards for teleprinters.^[24] Devices such as the Teletype Model 38 implemented SO and SI sequences to produce underlined or colored text effects; for instance, SO would engage the red ribbon to highlight sections, enabling operators to print emphasized passages in messages transmitted over telegraph lines.^[22] In these systems, the characters following SO were rendered in the alternate color until an SI restored the standard black output. A practical sequence involved transmitting SO immediately before a block of characters intended for emphasis, printing them in red, followed by SI to revert to black, thus alternating ribbon colors seamlessly during continuous operation.^[23]
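
As a rough software model of the ribbon behaviour described above, the following Python sketch splits a character stream into runs by ribbon colour. It assumes black is the default and that SO selects red; `ribbon_runs` is a hypothetical helper, not a reconstruction of any particular teleprinter's logic.

```python
SO, SI = "\x0e", "\x0f"


def ribbon_runs(stream: str):
    """Split a character stream into (colour, text) runs."""
    runs, colour, buf = [], "black", []
    for ch in stream:
        if ch in (SO, SI):
            new = "red" if ch == SO else "black"
            if new != colour and buf:
                runs.append((colour, "".join(buf)))   # flush the old run
                buf = []
            colour = new
        else:
            buf.append(ch)
    if buf:
        runs.append((colour, "".join(buf)))
    return runs


print(ribbon_runs("PAY \x0eNOW\x0f TODAY"))
```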

In Character Set Switching

The Shift Out (SO) and Shift In (SI) control characters enable dynamic switching between different graphic character sets within a single 7-bit code space, allowing systems to access extended repertoires without requiring full 8-bit encodings.^[25] In this mechanism, SO invokes the G1 set (often an alternate repertoire like symbols or non-Latin scripts) into the GL position for all subsequent characters, while SI reverts to the G0 set (typically ASCII or Latin basics).^[26] This locking-shift approach, defined in ISO/IEC 2022, supports mixing of character sets in constrained environments such as early networks or terminals.

A prominent example is the KOI7-switched encoding, developed for Soviet systems to handle mixed Russian and Latin text over 7-bit channels. In KOI7-switched, the initial state uses the G0 set based on ISO 646 (Latin characters akin to US-ASCII); SO (octal 016) shifts to the G1 set from ISO 5427 (Cyrillic characters for Russian letters), enabling Cyrillic insertion, and SI (octal 017) shifts back to G0 for Latin continuation.^[25] This allows interleaving, such as in bilingual documents, though it is less efficient than 8-bit alternatives when switches are frequent.

In Japanese text processing, SO and SI shift between Romaji (Latin alphabet) and half-width Katakana in the 7-bit JIS X 0201 encoding, particularly in terminal modes like JIS7. Here, the G0 set holds JIS X 0201 Roman characters (ISO-IR 14), while G1 contains Katakana (ISO-IR 13); SO invokes G1 for phonetic Katakana output, and SI returns to G0 for Romaji, supporting compact representation of mixed scripts in early computing and communication protocols.^[27]

VT100 terminals use SO and SI for pseudographics, designating the DEC Special Graphics set to G1 via an initial escape sequence, then using SO to invoke it for box-drawing characters. This shifts from the standard UK or US ASCII in G0 to line-drawing symbols (which replace codes 0x5F to 0x7E), enabling UI elements like borders in text-based interfaces, with SI restoring the primary set.^[26] Such usage persists in emulators for rendering legacy applications.
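
The VT100 sequence described above can be sketched as a small Python helper that builds the control string for a box. It assumes the DEC Special Graphics set is designated to G1 with ESC ) 0 and relies on the usual line-drawing positions (l, k, m, j for corners, q for horizontal and x for vertical lines); `box` is a hypothetical name, and whether the output renders as lines depends on the terminal and its mode.

```python
ESC, SO, SI = "\x1b", "\x0e", "\x0f"

DESIGNATE_G1_SPECIAL_GRAPHICS = ESC + ")0"   # ESC ) 0


def box(width: int, height: int) -> str:
    """Return a VT100 control string that draws an empty box."""
    top = "l" + "q" * width + "k"
    mid = "x" + " " * width + "x"
    bot = "m" + "q" * width + "j"
    rows = [top] + [mid] * height + [bot]
    # SO invokes G1 for each row; SI restores G0 before the line ends.
    body = "".join(SO + row + SI + "\r\n" for row in rows)
    return DESIGNATE_G1_SPECIAL_GRAPHICS + body


print(repr(box(2, 1)))
```

Writing the returned string to a VT100-compatible terminal (for example with `sys.stdout.write(box(10, 3))`) should draw the box; returning to G0 at the end of every row keeps ordinary text that follows from being shown as line-drawing symbols.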

Modern Relevance and Legacy

Deprecated Uses and Alternatives

The use of Shift Out (SO, U+000E) and Shift In (SI, U+000F) control characters has largely been phased out since the late 1990s, coinciding with the widespread adoption of the Unicode Standard and its UTF-8 encoding form.^[28] Introduced in early standards like ASCII and ISO/IEC 2022 for character set switching, these characters became obsolete as Unicode provided a unified encoding without the need for shifting between code sets, with UTF-8 emerging as the dominant internet standard by the early 2000s.^[29] UTF-8 has no use for them because its design is stateless and self-synchronizing: the meaning of a byte sequence never depends on a shift seen earlier in the stream.^[28]

The decline of SO and SI can be attributed to the inherent complexity of their locking shift mechanisms, which require parsers to maintain and track shifting states across streams, increasing error proneness and implementation overhead compared to stateless encodings.^[29] In contrast, standards like the ISO/IEC 8859 series (e.g., ISO 8859-1 for Western European languages) favored single-byte fixed character sets without shifts, simplifying processing while supporting common scripts. This preference extended to legacy systems, such as Windows code pages (e.g., CP1252), which rely on predefined, non-shifting mappings for compatibility.

In modern text processing, Unicode offers alternatives that render SO and SI unnecessary. For diacritical marks and accents historically handled via shifts, Unicode employs dedicated combining characters (e.g., U+0300 COMBINING GRAVE ACCENT), allowing inline composition without state changes. Bidirectional text, once managed through set switching, now uses explicit directional formatting characters such as U+202A LEFT-TO-RIGHT EMBEDDING together with the Unicode Bidirectional Algorithm for deterministic rendering.

Specific implementations have further marginalized SO and SI. XML 1.0 permits only tab, line feed, and carriage return among the C0 controls, so U+000E and U+000F are prohibited in document content and cause parsing errors if present.^[30] HTML parsers, per the HTML5 specification, pass most control characters, including SO and SI, through to text nodes as-is while flagging a "control-character-in-input-stream" parse error; the characters have no shifting effect there.^[31] Modern terminal emulators, such as those based on xterm or iTerm2, process SO and SI in legacy emulation modes like VT100, where they enable character set switching, but may treat them as inert in non-emulation contexts.
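
The XML 1.0 restriction mentioned above can be demonstrated with Python's standard `xml.etree.ElementTree` parser. The filter below follows the `Char` production of XML 1.0; the helper names `is_xml10_char` and `strip_invalid_xml_chars` are illustrative, and simply dropping the characters is one possible policy, not a recommendation from any specification.

```python
import xml.etree.ElementTree as ET


def is_xml10_char(ch: str) -> bool:
    """True if ch is allowed by the XML 1.0 Char production."""
    cp = ord(ch)
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def strip_invalid_xml_chars(text: str) -> str:
    """Drop characters (such as SO and SI) that XML 1.0 forbids."""
    return "".join(ch for ch in text if is_xml10_char(ch))


raw = "plain \x0eshifted\x0f plain"

try:
    ET.fromstring(f"<doc>{raw}</doc>")
    parsed_raw = True
except ET.ParseError:
    parsed_raw = False            # the parser rejects SO/SI in content

clean = strip_invalid_xml_chars(raw)
root = ET.fromstring(f"<doc>{clean}</doc>")
print(parsed_raw, repr(root.text))
```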

Remaining Implementations

Despite their deprecation in most modern systems, Shift Out (SO) and Shift In (SI) characters persist in specific legacy and specialized implementations for compatibility and niche applications.^[32]

In terminal emulators, SO and SI remain supported to maintain compatibility with VT100 and similar protocols. The xterm emulator, a standard X Window System terminal, recognizes and processes these control characters as part of its VT100 emulation, enabling invocation of alternate character sets via locking shifts.^[33] Similarly, PuTTY, a widely used SSH and Telnet client, emulates VT100 behavior including SO and SI for handling alternate character sets in legacy sessions.

Within email protocols, SO and SI are retained in some ISO-2022 variants designed for East Asian languages. ISO-2022-KR and ISO-2022-CN use SO (0x0E) and SI (0x0F) to shift between ASCII and a designated double-byte set within 7-bit environments. RFC 1468's ISO-2022-JP, by contrast, switches sets with escape sequences alone and excludes SO and SI from its text.^[32] In Japan's mobile ecosystem, SoftBank's legacy emoji system from the late 1990s to 2010s utilized SO and SI to access proprietary pictographs in 7-bit SMS and i-mode services, embedding them alongside Shift JIS extensions.^[23]

Debugging and analysis tools continue to display SO and SI for inspecting legacy data streams. Hex editors such as HxD render these bytes visibly in ASCII columns or as symbolic representations (e.g., ^N for SO), aiding reverse engineering of old files or serial protocols. Network analyzers like Wireshark expose SO and SI in packet dissections for protocols such as Telnet or raw TCP, where they appear in byte views during traffic capture from vintage systems.

In experimental and niche domains, SO and SI occasionally appear in custom 7-bit serial protocols for embedded devices or historical recreations. For instance, retro computing projects in the 2020s, such as simulations of 1970s minicomputers, incorporate these characters to emulate original Baudot-to-ASCII conversions in amateur radio or museum exhibits.
