Hubbry Logo
search
logo

ISO/IEC 2022

logo
Community Hub0 Subscribers
Write something...
Be the first to start a discussion here.
Be the first to start a discussion here.
See all
ISO/IEC 2022

ISO/IEC 2022 Information technology—Character code structure and extension techniques, is an ISO/IEC standard in the field of character encoding. It is equivalent to the ECMA standard ECMA-35, the ANSI standard ANSI X3.41 and the Japanese Industrial Standard JIS X 0202. Originating in 1971, it was most recently revised in 1994.

ISO 2022 specifies a general structure which character encodings can conform to, dedicating particular ranges of bytes (0x00–1F and 0x7F–9F) to be used for non-printing control codes for formatting and in-band instructions (such as line breaks or formatting instructions for text terminals), rather than graphical characters. It also specifies a syntax for escape sequences, multiple-byte sequences beginning with the ESC control code, which can likewise be used for in-band instructions. Specific sets of control codes and escape sequences designed to be used with ISO 2022 include ISO/IEC 6429, portions of which are implemented by ANSI.SYS and terminal emulators.

ISO 2022 itself also defines particular control codes and escape sequences which can be used for switching between different coded character sets (for example, between ASCII and the Japanese JIS X 0208) so as to use multiple in a single document, effectively combining them into a single stateful encoding (a feature less important since the advent of Unicode). It is designed to be usable in both 8-bit environments and 7-bit environments (those where only seven bits are usable in a byte, such as e-mail without 8BITMIME).

The ASCII character set supports the ISO Basic Latin alphabet (equivalent to the English alphabet), and does not provide good support for languages which use additional letters, or which use a different writing system altogether. Other writing systems with relatively few characters, such as Greek, Cyrillic, Arabic or Hebrew, as well as forms of the Latin script using diacritics or letters absent from the ISO Basic Latin alphabet, have historically been represented on personal computers with different 8-bit, single byte, extended ASCII encodings, which follow ASCII when the most significant bit is 0 (i.e. bytes 0x00–7F, when represented in hexadecimal), and include additional characters for a most significant bit of 1 (i.e. bytes 0x80–FF). Some of these, such as the ISO 8859 series, conform to ISO 2022, while others such as DOS code page 437 do not, usually due to not reserving the bytes 0x80–9F for control codes.

Certain East Asian languages, specifically Chinese, Japanese, and Korean (collectively "CJK"), are written using far more characters than the maximum of 256 which can be represented in a single byte, and were first represented on computers with language-specific double-byte encodings or variable-width encodings; some of these (such as the Simplified Chinese encoding GB 2312) conform to ISO 2022, while others (such as the Traditional Chinese encoding Big5) do not. Control codes in ISO 2022 are always represented with a single byte, regardless of the number of bytes used for graphical characters. CJK encodings used in 7-bit environments which use ISO 2022 mechanisms to switch between character sets are often given names starting with "ISO-2022-", most notably ISO-2022-JP, although some other CJK encodings such as EUC-JP also make use of ISO 2022 mechanisms.

Since the first 256 code points of Unicode were taken from ISO 8859-1, Unicode inherits the concept of C0 and C1 control codes from ISO 2022, although it adds other non-printing characters besides the ISO 2022 control codes. However, Unicode transformation formats such as UTF-8 generally deviate from the ISO 2022 structure in various ways, including:

ISO 2022 escape sequences do, however, exist for switching to and from UTF-8 as a "coding system different from that of ISO 2022", which are supported by certain terminal emulators such as xterm.

ISO/IEC 2022 specifies the following:

See all
User Avatar
No comments yet.