Shift Out and Shift In characters
from Wikipedia
Shift In and Shift Out used in a Linux terminal to access a variant DEC Special Graphics set

Shift Out (SO) and Shift In (SI) are ASCII control characters 14 and 15, respectively (0x0E and 0x0F).[1] These are sometimes also called "Control-N" and "Control-O".

The original purpose of these characters was to provide a way to shift a coloured ribbon, split longitudinally usually with red and black, up and down to the other colour in an electro-mechanical typewriter or teleprinter, such as the Teletype Model 38, to automate the same function of manual typewriters. Black was the conventional ambient default colour and so was shifted "in" or "out" with the other colour on the ribbon.

Later advancements in technology instigated use of this function for switching to a different font or character set and back. This was used, for instance, in the Russian character set known as KOI7-switched, where SO starts printing Russian letters, and SI starts printing Latin letters again. Similarly, they are used for switching between Katakana and Roman letters in the 7-bit version of the Japanese JIS X 0201.[2][3]

SO/SI control characters also are used to display VT100 pseudographics. Shift In is also used in the 2G variant[4] of SoftBank Mobile's encoding for emoji.

The ISO/IEC 2022 standard (ECMA-35, JIS X 0202) standardises the generalized usage of SO and SI for switching between pre-designated character sets invoked over the 0x20–0x7F byte range. It refers to them respectively as Locking Shift One (LS1) and Locking Shift Zero (LS0) in an 8-bit environment, or as SO and SI in a 7-bit environment.[5] In ISO-2022-compliant code sets where the 0x0E and 0x0F characters are used for the purpose of emphasis (such as an italic or red font) rather than a change of character set, they are referred to respectively as Upper Rail (UR) and Lower Rail (LR), rather than SO and SI.[6]

from Grokipedia
Shift Out (SO) and Shift In (SI) are control characters in the 7-bit coded character set for information interchange defined by the ECMA-6 standard (equivalent to ISO/IEC 646). SO is assigned the bit combination 0/14 (decimal 14, hexadecimal 0E, Unicode U+000E) and SI is assigned 0/15 (decimal 15, hexadecimal 0F, Unicode U+000F). Both belong to the C0 control set and are primarily used for code extension, that is, switching between alternate graphic character sets in data transmission and processing.

In the ISO/IEC 2022 framework for character code structure and extension (identical to ECMA-35), SO is a locking shift that invokes the designated G1 graphic character set into the GL (graphics left) area of the code table (columns 02 to 07 in a 7-bit code), affecting all subsequent characters until another shift occurs. SI invokes the G0 graphic character set (typically the basic ASCII set) back into the GL area, restoring the primary encoding state. This paired shifting allows extended character repertoires, such as national variants or additional symbols, to be represented without requiring a full 8-bit code, and supports transformations between 7-bit and 8-bit codes while preserving shift states.

SO and SI were originally included in the 1963 ASCII standard to drive ribbon shifting on early teleprinters and typewriters for two-colour output. They later became key components of international encoding standards for multilingual text over 7-bit channels, such as early email encodings (e.g., ISO-2022-KR and ISO-2022-CN). Their use persists in legacy systems, terminal emulations like VT100, and certain East Asian encodings, where they toggle between ASCII-compatible sets and denser graphic sets containing ideographs or special symbols. In modern contexts they have largely been superseded by stateless encodings like UTF-8, but they remain defined in Unicode for backward compatibility and round-trip preservation in protocols adhering to ISO 2022 variants.

Definition and Purpose

Shift Out Character

The Shift Out (SO) character is ASCII character number 14 (hexadecimal 0x0E) and is commonly referred to as Control-N. Its primary function is to invoke an alternate character set, such as the G1 graphic character set in ISO terminology, so that subsequent code combinations are interpreted outside the standard character set until a Shift In character is received. In certain device contexts, SO instead shifts a physical mechanism, such as engaging the alternate colour of a printer ribbon.

Operationally, SO is a locking shift: the alternate state stays active for all following characters without repeated invocation, and only the paired Shift In character reverts to the primary state. The standard representations for SO are the abbreviation "SO" and the caret notation "^N", both of which appear in technical documentation and implementations.

Shift In Character

The Shift In (SI) character is a control function defined in the ASCII standard as character number 15, represented in hexadecimal as 0x0F and in decimal as 15, and commonly referred to as Control-O or ^O in caret notation. This non-printing character invokes the primary (G0) character set, so that subsequent bit combinations are interpreted according to the default encoding until another control function alters the state. In other words, it reverts the active character set to the initially designated primary set, ending any prior shift to an alternate set.

In operational terms, SI cancels the effect of a preceding Shift Out (SO) by switching back to the default state. The change applies to the characters that follow it in the data stream and does not affect control sequences. In 7-bit environments, SI is specifically the locking-shift function to the G0 set. It is written SI in standards documentation and ^O in caret notation, which makes it easy to identify in programming and terminal contexts. Together with SO, it allows an alternate character set (such as G1) to be invoked for a run of characters before returning to the primary set, supporting multilingual or specialized symbols within a constrained code space.
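The relationship between the code values and the names Control-N and Control-O can be checked with a few lines of Python. This is only an illustrative sketch; the helper names `caret` and `control_key` are made up for this example and do not come from any standard.

```python
SO = 0x0E  # Shift Out
SI = 0x0F  # Shift In


def caret(code: int) -> str:
    """Return caret notation for a C0 control code (0x00-0x1F)."""
    if not 0 <= code <= 0x1F:
        raise ValueError("not a C0 control code")
    # XOR with 0x40 maps 0x0E to 0x4E ('N') and 0x0F to 0x4F ('O').
    return "^" + chr(code ^ 0x40)


def control_key(letter: str) -> int:
    """Return the code produced by Control+letter (keeps the low 5 bits)."""
    return ord(letter.upper()) & 0x1F


print(caret(SO), caret(SI))                 # ^N ^O
print(control_key("N"), control_key("O"))   # 14 15
```

Masking a letter with 0x1F is the conventional way terminals derive a control code from a key, which is why SO and SI line up with N and O, the 14th and 15th letters of the alphabet.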

Historical Development

Origins in Early Teleprinters

The origins of the Shift Out (SO) and Shift In (SI) characters trace back to the mechanical constraints of late 19th- and early 20th-century printing telegraphs, where limited bit widths required a way to toggle between character sets without expanding the code itself. Émile Baudot's 5-bit code, patented in 1874, laid the foundation by introducing shift functions to reach letters, figures, and symbols within a 32-combination repertoire, enabling efficient transmission over telegraph lines in devices like synchronous multiplex printers. This kind of dynamic mode switching is the direct ancestor of SO and SI for set selection.

In the 1900s and 1920s, electro-mechanical teleprinters such as those developed by the Morkrum Company (later Teletype Corporation) incorporated shift mechanisms to control printing operations, including the positioning of two-colour ribbons (typically red and black) for distinguishing text types in page printers. Murray's variants of the Baudot code, patented starting in 1901, refined these shifts to suit keyboard input and punched tape, facilitating reliable operation in the asynchronous start-stop systems used for stock tickers and news services. These designs addressed the need for dual-mode printing in bandwidth-limited environments, with a shift mechanically locking the typebar or ribbon position until reset.

The International Telegraph Alphabet No. 2 (ITA2), formalized in the 1920s, standardized these shift operations for global use. It employs FIGS (figures/symbols mode) and LTRS (letters mode) to expand the effective character set beyond what 5 bits can address directly while remaining compatible with early teleprinters. ITA2's design, influenced by Murray's enhancements, toggles between letters and numerals or punctuation, directly paralleling the later SO and SI functions for set invocation. This was crucial for handling mixed text in mechanical devices, where a single shift code reconfigured the printing mechanism for an entire sequence.

Key milestones included the adoption of ITA2 in 1930 by the Comité Consultatif International Télégraphique (CCITT) as an international standard, which promoted interoperability across borders. Earlier, in 1922, the U.S. Navy demonstrated radioteletype (RTTY) using Baudot-derived shifts to transmit messages from an airplane to ground stations, showing that the codes were robust enough for radio use. The Radio Corporation of America (RCA) later employed similar RTTY systems to send text from shore to ships, further embedding shift mechanisms in maritime communications.

Inclusion in ASCII Standard

The development of the Shift Out (SO) and Shift In (SI) characters within the ASCII standard began with proposals from the American Standards Association (ASA) X3.4 committee in 1961, as part of efforts to create a unified seven-bit code for information interchange. This initial proposal, influenced by the need to accommodate existing communication practices, assigned SO to position 0/14 and SI to 0/15 for shifting between character sets, drawing directly on teleprinter conventions. The committee's work culminated in the publication of the first ASCII standard, ASA X3.4-1963, on June 17, 1963, which formalized SO and SI in the control character block at decimal positions 14 (0x0E) and 15 (0x0F), respectively.

The rationale for retaining SO and SI in the seven-bit ASCII design centered on compatibility with established codes, such as the International Telegraph Alphabet No. 2 (ITA2), even though the standard aimed to give characters fixed positions and so remove the need for shifting. Proponents argued that these control characters were essential for interoperability with legacy equipment used in data communications, allowing switching between uppercase and lowercase or between figures and letters without expanding the code beyond seven bits. The decision reflected compromises in committee debates, where non-shifting alternatives (such as AT&T's earlier six-unit code proposal) were considered but ultimately rejected in favor of preserving compatibility for practical deployment.

Teleprinter manufacturers were a key influence, notably Teletype Corporation, whose equipment relied on similar shift functions for ribbon control and character interpretation in five-unit codes. Teletype representatives took part in the X3.4 deliberations and advocated for SO and SI so that the new standard would support existing hardware without requiring widespread replacement.
The 1967 revision (USAS X3.4-1967) added definitions for lowercase letters, while the 1968 revision (USAS X3.4-1968), published as ANSI X3.4-1968, maintained these assignments unchanged, solidifying their place in the control block (0x00-0x1F) with minor clarifications to control functions.

Technical Specifications

Code Points and Representation

The Shift Out (SO) character is encoded in the ASCII standard at decimal 14, hexadecimal 0x0E, and binary 0001110 (7-bit representation). The Shift In (SI) character follows immediately at decimal 15, hexadecimal 0x0F, and binary 0001111. These positions place SO and SI within the C0 control range of ASCII, defined in the original ANSI X3.4-1968 standard and subsequent revisions.

In Unicode, SO is mapped to the code point U+000E and SI to U+000F, both within the Basic Latin block (U+0000 to U+007F). They are classified as control characters with no assigned glyph in standard rendering, though they retain their ASCII semantics for compatibility. For visualization in debugging or documentation tools, symbolic representations are sometimes used, such as ␎ (U+240E, SYMBOL FOR SHIFT OUT) for SO and ␏ (U+240F, SYMBOL FOR SHIFT IN) for SI, drawn from the Control Pictures block.

The encodings remain the same in 8-bit extensions of ASCII, where the most significant bit is 0 for these 7-bit controls. In the ISO/IEC 8859 series, such as ISO/IEC 8859-1 (Latin-1), SO and SI occupy the same byte values (0x0E and 0x0F) to ensure compatibility with 7-bit ASCII. In EBCDIC, the values are likewise 0x0E for SO and 0x0F for SI, matching the ASCII control codes despite differences elsewhere in the encoding.

Standard | SO Code Point | SI Code Point
ASCII (7-bit) | Dec: 14, Hex: 0x0E, Bin: 0001110 | Dec: 15, Hex: 0x0F, Bin: 0001111
Unicode | U+000E | U+000F
ISO 8859-1 | 0x0E | 0x0F
EBCDIC | 0x0E | 0x0F
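
The values in the table can be reproduced from Python's standard library. The sketch below is illustrative only: the function name `describe` is invented here, and `cp037` is used as one representative EBCDIC code page, which is an assumption since EBCDIC has many variants.

```python
import unicodedata

SO, SI = "\x0e", "\x0f"


def describe(ch: str) -> dict:
    """Collect the representations of a C0 control character."""
    cp = ord(ch)
    # The Control Pictures block mirrors the C0 controls at U+2400 + code.
    picture = chr(0x2400 + cp)
    return {
        "decimal": cp,
        "hex": f"0x{cp:02X}",
        "binary7": f"{cp:07b}",
        "unicode": f"U+{cp:04X}",
        "category": unicodedata.category(ch),      # 'Cc' = control
        "picture_name": unicodedata.name(picture),
        "latin1": ch.encode("latin-1"),
        "ebcdic_cp037": ch.encode("cp037"),        # assumed EBCDIC variant
    }


for ch in (SO, SI):
    print(describe(ch))
```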

Locking Shift Mechanisms

In character encoding protocols such as those defined in ECMA-35, locking shift mechanisms invoke a designated graphic character set into a specific area of the code table and keep it there. The Shift Out (SO) control function, equivalently known as Locking Shift One (LS1), invokes the alternate graphic character set G1 into the left half (GL) of the code table, so that subsequent characters are interpreted from this set until another locking shift occurs. Conversely, the Shift In (SI) control function, or Locking Shift Zero (LS0), invokes the primary graphic character set G0 back into the GL area, restoring the default interpretation for all following characters. This persistence distinguishes locking shifts from single-shift functions, such as Single Shift Two (SS2) or Single Shift Three (SS3), which affect only the immediately following character before the prior state resumes.

The application of these mechanisms differs between 7-bit and 8-bit environments. In 7-bit codes, SO and SI directly toggle between G0 and G1 in the GL area, providing a simple way to alternate sets without additional designation sequences. In 8-bit environments, LS0 and LS1 perform the analogous functions, but character sets must first be designated to G0 or G1 using escape sequences, after which the locking shifts invoke them persistently into GL or, with right-hand variants such as LS1R, into the right half (GR).

Error conditions arise if locking shifts are used before the relevant character sets have been designated. In such cases the behavior is undefined, and no graphic characters may be invoked until a valid designation occurs, potentially leading to garbled output or fallback to default interpretations. More elaborate ISO 2022 configurations add further invocation state, with additional sets (e.g., G2 or G3) designated and selected independently, though each locking shift still persists on a per-area basis until explicitly changed.
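
The locking behavior described here amounts to a small state machine. The following Python sketch is a simplified illustration rather than a conforming ISO/IEC 2022 decoder: the function names are invented, G0 is assumed to be plain ASCII, and G1 is assumed to be the JIS X 0201 katakana set mapped to the Unicode half-width katakana range. Designation escape sequences and single shifts are left out.

```python
SO, SI = 0x0E, 0x0F


def g0_ascii(b: int) -> str:
    """Assumed G0 set: plain ASCII."""
    return chr(b)


def g1_katakana(b: int) -> str:
    """Assumed G1 set: JIS X 0201 katakana, 0x21-0x5F -> U+FF61-U+FF9F."""
    if 0x21 <= b <= 0x5F:
        return chr(0xFF61 + (b - 0x21))
    return "\ufffd"  # position not assigned in this set


def decode_locking(data: bytes, g0=g0_ascii, g1=g1_katakana) -> str:
    gl = g0  # set currently invoked into GL; starts at G0
    out = []
    for b in data:
        if b == SO:
            gl = g1          # locking shift: stays until SI
        elif b == SI:
            gl = g0
        elif b <= 0x20 or b == 0x7F:
            out.append(chr(b))   # controls, SPACE and DELETE are not shifted
        elif b >= 0x80:
            out.append("\ufffd")  # not valid in a 7-bit stream
        else:
            out.append(gl(b))
    return "".join(out)


print(decode_locking(b"AB\x0e\x31\x32\x0fCD"))  # ABｱｲCD
```

Because the only state is the single variable `gl`, a decoder that starts reading in the middle of a stream cannot know which set is active, which is the main practical weakness of locking shifts.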

Usage in Standards and Systems

Role in ASCII and Early Computing

The Shift Out (SO) and Shift In (SI) characters, assigned code points 14 (0x0E) and 15 (0x0F) in the ASCII standard, were incorporated to allow shifts between the primary character set and an alternate one, giving early systems access to additional symbols or modes without expanding the fixed 7-bit code space. In the ASCII-1963 specification, SO initiated the alternate interpretation of subsequent characters and SI reverted to the standard set, a mechanism retained from earlier 5-bit teleprinter codes like ITA2 for compatibility with existing hardware. This design supported basic extensions for graphics or special characters in text streams, though implementations varied, for example by altering the typeface or by encapsulating legacy codes until SI was encountered.

In ASCII-based terminals of the 1970s and 1980s, such as the VT100, SO and SI functioned as locking shift controls to toggle between character sets, with SO selecting the G1 set (often line-drawing or pseudographic symbols) and SI returning to the G0 set (standard ASCII). This enabled alternate fonts or modes on minicomputers and serial-connected displays, where a single control character was enough to switch sets once they had been designated. For instance, terminal documentation specifies that SO (octal 016) invokes the G1 set for subsequent rendering, supporting applications like simple diagrams in text-based interfaces.

To remain compatible with legacy teleprinter hardware like the Teletype ASR-33, ASCII retained SO and SI for ribbon or case control, where SO could shift to an upper-case or alternate mode and SI restored the default, mirroring telegraphic practice. The ASR-33, a staple of 1960s computing labs, transmitted these as 7-bit ASCII controls over serial lines, allowing integration with early systems like PDP minicomputers while preserving functionality for two-colour printing or figure shifts inherited from 5-bit codes. This compatibility mattered during the transition from 5-bit ITA2 to 7-bit ASCII, as teletypes formed the backbone of early data communication.

Despite their inclusion, SO and SI saw limited practical adoption in early computing, because ASCII emphasized a non-shifting, unambiguous 7-bit code that favored fixed interpretations over dynamic sets and so avoided complexity in serial protocols. Usage was confined to specific peripherals or international variants that needed occasional mode switches. Their role diminished further as escape-sequence-based alternatives emerged, though they persisted in terminal emulations for legacy support.

Implementation in ISO/IEC 2022

In the ISO/IEC 2022 standard, as defined in ECMA-35, the Shift Out (SO) character at code position 00/14 functions as Locking Shift One (LS1) to invoke the G1 graphic character set into the GL (graphics left) code element area, while the Shift In (SI) character at 00/15 serves as Locking Shift Zero (LS0) to invoke the G0 set into GL. In both 7-bit and 8-bit environments the invocation persists until another locking shift occurs. These mechanisms are integral to the standard's code structure, which divides the code table into the areas C0 (control characters, columns 00 to 01), GL (columns 02 to 07), C1 (columns 08 to 09), and GR (columns 10 to 15), so that several character sets can be invoked in turn without being redesignated.

Support for SO and SI is central to 7-bit variants of ISO/IEC 2022 such as ISO-2022-KR and ISO-2022-CN, where they work alongside escape sequences for set designation. (ISO-2022-JP, by contrast, switches sets with escape sequences alone.) In ISO-2022-KR, an initial escape sequence ESC $ ) C designates the KS X 1001 (KSC 5601) set to G1, after which SO shifts to this Korean set for double-byte characters and SI returns to the G0 ASCII set. Similarly, designations such as ESC ( B assign the ISO-IR 6 (ASCII) set to G0, ensuring compatibility with 7-bit transport while allowing national character sets to be mixed in.

The same approach serves multilingual applications, particularly encodings like ISO-2022-CN defined in RFC 1922, where SO switches from ASCII (G0) to CJK sets such as GB 2312 or CNS 11643 after designation (e.g., ESC $ ) A for GB 2312 to G1) and SI reverts to ASCII, supporting mixed Latin text and Chinese ideographs in a single 7-bit stream. Designated sets are invoked on demand, with SI often required before line ends so that each line finishes in ASCII.

Conformance to ISO/IEC 2022 mandates SO and SI for 7-bit implementations at Level 2 and above, where support for G0 and G1 shifting is required, ensuring interoperability in text transport. In 8-bit environments, however, these locking shifts are optional, with single shifts like SS2 (ESC N) and SS3 (ESC O) providing alternatives that invoke the G2 or G3 sets for one character without locking.
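
As a concrete illustration of the ISO-2022-KR sequence described above, the sketch below assembles a stream by hand and decodes it with Python's built-in `iso2022_kr` codec. The helper `ksx1001_7bit` is an invented name; it assumes its input consists only of characters that EUC-KR encodes as two bytes (for example Hangul syllables) and relies on EUC-KR being KS X 1001 with the high bit set on each byte.

```python
ESC, SO, SI = b"\x1b", b"\x0e", b"\x0f"

# ESC $ ) C designates KS X 1001 to G1 (the ISO-2022-KR header).
DESIGNATE_KSX1001_TO_G1 = ESC + b"$)C"


def ksx1001_7bit(text: str) -> bytes:
    """Return the GL (7-bit) form of double-byte KS X 1001 text.

    Assumption: `text` contains only characters that EUC-KR encodes
    as two bytes; clearing the high bit moves them from GR to GL.
    """
    return bytes(b & 0x7F for b in text.encode("euc_kr"))


stream = (
    DESIGNATE_KSX1001_TO_G1
    + b"Hello "
    + SO + ksx1001_7bit("한글") + SI   # shift out, Korean bytes, shift in
    + b"!\n"
)

decoded = stream.decode("iso2022_kr")
print(stream)
print(decoded)
```

Every byte in `stream` is below 0x80, which is the point of the scheme: the Korean text survives a 7-bit transport, at the cost of the receiver having to track the shift state.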

Applications and Examples

In Printing and Ribbon Control

In early printing devices equipped with two-color ribbons, typically split lengthwise into black and red sections, the Shift Out (SO) control character raised the ribbon to the red half for emphasis or color variation in printed text, while the Shift In (SI) character lowered it back to the black default position. This mechanism allowed automated control over ribbon positioning without manual intervention, originating from standards for teleprinters. Devices such as the Teletype Model 38 implemented SO and SI sequences to produce underlined or colored text effects; for instance, SO would engage the red ribbon to highlight sections, enabling operators to print emphasized passages in messages transmitted over telegraph lines. In these systems, the characters following SO were rendered in the alternate color until an SI restored the standard black output. A practical sequence involved transmitting SO immediately before a block of characters intended for emphasis, printing them in red, followed by SI to revert to black, thus alternating ribbon colors seamlessly during continuous operation.

In Character Set Switching

The Shift Out (SO) and Shift In (SI) control characters enable dynamic switching between different graphic character sets within a single 7-bit code space, allowing systems to access extended repertoires without requiring full 8-bit encodings. SO invokes the G1 set (often an alternate repertoire like symbols or a non-Latin script) into the GL position for subsequent characters, while SI reverts to the G0 set (typically ASCII or a Latin national variant). This locking shift approach, defined in ISO/IEC 2022, supports mixing of character sets in constrained environments such as early networks or terminals.

A prominent example is the KOI7-switched encoding, developed for Soviet systems to handle mixed Russian and Latin text over 7-bit channels. In KOI7-switched, the initial state uses a G0 set based on ISO 646 (Latin characters akin to US-ASCII); SO (octal 16) shifts to the G1 set from ISO 5427 (Cyrillic characters for Russian letters), and SI (octal 17) shifts back to G0 for Latin text. This allows the two scripts to be interleaved, as in bilingual documents, though it is less efficient than 8-bit alternatives when switches are frequent.

In Japanese text processing, SO and SI shift between Roman letters and half-width katakana in the 7-bit form of JIS X 0201, particularly in terminal modes like JIS7. Here the G0 set holds the JIS X 0201 Roman characters (ISO-IR 14), while G1 contains the katakana (ISO-IR 13); SO invokes G1 for katakana output and SI returns to G0 for Roman letters, supporting compact representation of mixed scripts in early computing and communication protocols.

VT100 terminals use SO and SI for pseudographics: an initial escape sequence designates the DEC Special Graphics set to G1, and SO then invokes it for box-drawing characters. This shifts from the standard UK or US ASCII set in G0 to line-drawing symbols (e.g., horizontal and vertical lines replacing codes 0x5F to 0x7E), enabling UI elements like borders in text-based interfaces, with SI restoring the primary set. Such usage persists in emulators for rendering legacy applications.
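
The VT100 usage can be sketched as a function that emits the control sequence for a rectangular border. The name `box` is invented for this example. The letters used (`l`, `q`, `k`, `x`, `m`, `j`) are the DEC Special Graphics positions for the corner and line glyphs, and ESC ) 0 is the sequence that designates that set to G1. Whether a particular emulator honors the sequence depends on its settings, so treat the output as something to try rather than a guarantee.

```python
ESC, SO, SI = "\x1b", "\x0e", "\x0f"

# ESC ) 0 designates DEC Special Graphics to G1.
DESIGNATE_DEC_GRAPHICS_G1 = ESC + ")0"


def box(width: int, height: int) -> str:
    """Return a VT100 control string that draws a width x height border."""
    if width < 2 or height < 2:
        raise ValueError("box needs at least 2 columns and 2 rows")
    top = "l" + "q" * (width - 2) + "k"      # upper-left, horizontals, upper-right
    middle = "x" + " " * (width - 2) + "x"   # vertical bars
    bottom = "m" + "q" * (width - 2) + "j"   # lower-left, horizontals, lower-right
    rows = [top] + [middle] * (height - 2) + [bottom]
    # One SO/SI pair per row, so no line ends while G1 is still invoked.
    return DESIGNATE_DEC_GRAPHICS_G1 + "".join(
        SO + row + SI + "\r\n" for row in rows
    )


if __name__ == "__main__":
    print(box(12, 4), end="")
```

Wrapping each row in its own SO/SI pair costs two extra bytes per line but means that an interrupted program leaves the terminal in its normal character set.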

Modern Relevance and Legacy

Deprecated Uses and Alternatives

The use of the Shift Out (SO, U+000E) and Shift In (SI, U+000F) control characters has largely been phased out since the late 1990s, coinciding with the widespread adoption of the Unicode Standard and its UTF-8 encoding form. Introduced in early standards like ASCII and ISO/IEC 2022 for character set switching, these characters became obsolete as Unicode provided a unified encoding with no need to shift between code sets, and UTF-8 went on to become the dominant encoding. UTF-8 avoids them because it is stateless and self-synchronizing: the meaning of a byte sequence never depends on an earlier shift.

The decline of SO and SI can be attributed to the inherent complexity of locking shifts, which require parsers to maintain and track shift state across a stream, increasing error proneness and implementation overhead compared with stateless encodings. The ISO/IEC 8859 series (e.g., ISO 8859-1 for Western European languages) instead favored single-byte fixed character sets without shifts, simplifying processing while supporting common scripts. The same preference is visible in legacy Windows code pages (e.g., CP1252), which rely on predefined, non-shifting mappings.

In modern text processing, Unicode offers alternatives that render SO and SI unnecessary. For diacritical marks and accents historically handled via shifts, Unicode employs dedicated combining characters (e.g., U+0300 COMBINING GRAVE ACCENT), allowing inline composition without state changes. Bidirectional text, once managed through set switching, now uses explicit directional formatting characters like U+202A LEFT-TO-RIGHT EMBEDDING together with the Unicode Bidirectional Algorithm for deterministic rendering.

Specific implementations have further marginalized SO and SI. In XML 1.0, the C0 control characters other than tab, line feed, and carriage return are prohibited in document content, so U+000E and U+000F cause parsing errors if present. HTML parsers, per the HTML specification, pass most control characters, including SO and SI, through to text nodes unchanged while flagging a "control-character-in-input-stream" parse error; the characters carry no shift semantics there. Modern terminal emulators process SO and SI in legacy emulation modes such as VT100, where they switch character sets, but may treat them as inert in other contexts.
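
The XML restriction is easy to observe with Python's standard `xml.etree.ElementTree` parser. In this sketch the helper names `strip_forbidden_controls` and `parses` are invented, and the regular expression covers only the C0 range (every C0 control except tab, line feed, and carriage return), not the other code points XML 1.0 excludes.

```python
import re
import xml.etree.ElementTree as ET

# C0 controls that XML 1.0 does not allow: all except TAB, LF and CR.
_FORBIDDEN_C0 = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def strip_forbidden_controls(text: str) -> str:
    """Drop C0 controls (including SO and SI) that XML 1.0 rejects."""
    return _FORBIDDEN_C0.sub("", text)


def parses(xml_text: str) -> bool:
    """Report whether the standard-library parser accepts the text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


raw = "<msg>plain \x0eshifted\x0f plain</msg>"
print(parses(raw))                             # False
print(parses(strip_forbidden_controls(raw)))   # True
```

Stripping the bytes makes the document well-formed but also discards the shift information, so data that really depends on SO and SI should be decoded to Unicode before it is placed in XML.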

Remaining Implementations

Despite their deprecation in most modern systems, the Shift Out (SO) and Shift In (SI) characters persist in specific legacy and specialized implementations for compatibility and niche applications.

In terminal emulators, SO and SI remain supported to maintain compatibility with VT100 and similar protocols. The xterm emulator, a standard X Window System terminal, recognizes these control characters as part of its emulation, enabling alternate character sets to be invoked via locking shifts. Similarly, PuTTY, a widely used SSH and Telnet client, emulates this terminal behavior, including SO and SI, for legacy sessions.

Within email, SO and SI are retained in the ISO-2022 variants designed for 7-bit mail transport that rely on locking shifts. ISO-2022-KR uses SO (0x0E) and SI (0x0F) to shift between ASCII and the Korean double-byte set, and ISO-2022-CN (RFC 1922) does the same for Chinese sets. RFC 1468, which specifies ISO-2022-JP for Japanese text in messages, instead switches sets with escape sequences only and excludes SO and SI from its text. In each case the goal is to carry multilingual content through mail systems without exceeding 7-bit constraints.

In Japan's mobile ecosystem, SoftBank's legacy emoji system, in use from the late 1990s to the 2010s, utilized SO and SI to access proprietary pictographs in 7-bit messaging services.

Debugging and analysis tools continue to display SO and SI for inspecting legacy data streams. Hex editors show these bytes in their ASCII columns or as symbolic representations (e.g., ^N for SO), which helps when analyzing old files or serial protocols. Network analyzers expose SO and SI in packet dissections and raw byte views during traffic capture from vintage systems.

In experimental and niche domains, SO and SI occasionally appear in custom 7-bit serial protocols for embedded devices or in historical recreations. For instance, retro computing projects in the 2020s, such as simulations of minicomputers, incorporate these characters to emulate original Baudot-to-ASCII conversions for museum exhibits.
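
When inspecting a legacy byte stream without a dedicated tool, a few lines of Python can make SO and SI visible in the way such tools do. This is a minimal sketch with invented function names (`visible`, `shift_report`); it substitutes Control Pictures glyphs for control bytes and lists where each shift occurs.

```python
def visible(data: bytes) -> str:
    """Render bytes as text, showing control codes as Control Pictures."""
    out = []
    for b in data:
        if b < 0x20:
            out.append(chr(0x2400 + b))   # e.g. 0x0E -> U+240E
        elif b == 0x7F:
            out.append("\u2421")          # SYMBOL FOR DELETE
        elif b < 0x7F:
            out.append(chr(b))
        else:
            out.append(".")               # non-ASCII byte
    return "".join(out)


def shift_report(data: bytes) -> list:
    """Return (offset, name) for every SO or SI byte in the data."""
    names = {0x0E: "SO", 0x0F: "SI"}
    return [(i, names[b]) for i, b in enumerate(data) if b in names]


sample = b"AB\x0e12\x0fCD"
print(visible(sample))
print(shift_report(sample))
```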
