Newline

A newline (frequently called line ending, end of line (EOL), next line (NEL) or line break) is a control character or sequence of control characters in character encoding specifications such as ASCII, EBCDIC, Unicode, etc. This character (newline character), or a sequence of characters, is used to signify the end of a line of text and the start of a new one.[1]
History
In the mid-1800s, long before the advent of teleprinters and teletype machines, Morse code operators or telegraphists invented and used Morse code prosigns to encode white space text formatting in formal written text messages. In particular, the Morse prosign BT (mnemonic break text), represented by the concatenation of literal textual Morse codes "B" and "T" characters, sent without the normal inter-character spacing, is used in Morse code to encode and indicate a new line or new section in a formal text message.
Later, in the age of modern teleprinters, standardized character set control codes were developed to aid in white space text formatting. ASCII was developed simultaneously by the International Organization for Standardization (ISO) and the American Standards Association (ASA), the latter being the predecessor organization to American National Standards Institute (ANSI). During the period of 1963 to 1968, the ISO draft standards supported the use of either CR+LF or LF alone as a newline, while the ASA drafts supported only CR+LF.
The sequence CR+LF was commonly used on many early computer systems that had adopted Teletype machines—typically a Teletype Model 33 ASR—as a console device, because this sequence was required to position those printers at the start of a new line. The separation of newline into two functions concealed the fact that the print head could not return from the far right to the beginning of the next line in time to print the next character. Any character printed after a CR would often print as a smudge in the middle of the page while the print head was still moving the carriage back to the first position. "The solution was to make the newline two characters: CR to move the carriage to column one, and LF to move the paper up."[2] In fact, it was often necessary to send extra padding characters—extraneous CRs or NULs—which are ignored but give the print head time to move to the left margin. Many early video displays also required multiple character times to scroll the display.
On such systems, applications had to talk directly to the Teletype machine and follow its conventions since the concept of device drivers hiding such hardware details from the application was not yet well developed. Therefore, text was routinely composed to satisfy the needs of Teletype machines. Most minicomputer systems from DEC used this convention. CP/M also used it in order to print on the same terminals that minicomputers used. From there MS-DOS (1981) adopted CP/M's CR+LF in order to be compatible, and this convention was inherited by Microsoft's later Windows operating system.
The Multics operating system began development in 1964 and used LF alone as its newline. Multics used a device driver to translate this character to whatever sequence a printer needed (including extra padding characters), and the single byte was more convenient for programming. What seems like a more obvious choice – CR – was not used, as CR provided the useful function of overprinting one line with another to create boldface, underscore and strikethrough effects. Perhaps more importantly, the use of LF alone as a line terminator had already been incorporated into drafts of the eventual ISO/IEC 646 standard. Unix followed the Multics practice, and later Unix-like systems followed Unix. This created conflicts between Windows and Unix-like operating systems, whereby files composed on one operating system could not be properly formatted or interpreted by another operating system (for example a UNIX shell script written in a Windows text editor like Notepad[3][4]).
Representation
The concepts of carriage return (CR) and line feed (LF) are closely associated and can be considered either separately or together. In the physical media of typewriters and printers, two axes of motion, "down" and "across", are needed to create a new line on the page. Although the design of a machine (typewriter or printer) must consider them separately, the abstract logic of software can combine them as one event. This is why a newline in character encoding can be defined as CR and LF combined into one (commonly called CR+LF or CRLF).
Some character sets provide a separate newline character code. EBCDIC, for example, provides an NL character code in addition to the CR and LF codes. Unicode, in addition to providing the ASCII CR and LF control codes, also provides a "next line" (NEL) control code, as well as control codes for "line separator" and "paragraph separator" markers. Unicode also contains printable characters for visually representing line feed ␊, carriage return ␍, and other C0 control codes (as well as a generic newline, ␤) in the Control Pictures block.
| Operating system | Character encoding | Abbreviation | Hex value | Dec value | Escape sequence |
|---|---|---|---|---|---|
| Multics; POSIX standard oriented systems: Unix and Unix-like systems (Linux, macOS, *BSD, AIX, Xenix, etc.), QNX 4+; others: BeOS, Amiga, RISC OS, and others[5] | ASCII | LF | 0A | 10 | \n |
| Windows, MS-DOS compatibles, Atari TOS, DEC TOPS-10, RT-11, CP/M, MP/M, OS/2, Symbian OS, Palm OS, Amstrad CPC, and most other early non-Unix and non-IBM operating systems | ASCII | CR LF | 0D 0A | 13 10 | \r\n |
| Commodore 64, Commodore 128, Acorn BBC, ZX Spectrum, TRS-80, Apple II, Oberon, classic Mac OS, HP Series 80, MIT Lisp Machine, and OS-9 | ASCII | CR | 0D | 13 | \r |
| Acorn BBC[6] and RISC OS spooled text output[7] | ASCII | LF CR | 0A 0D | 10 13 | \n\r |
| QNX pre-POSIX implementation (version < 4) | ASCII | RS | 1E | 30 | \036 |
| Atari 8-bit computers | ATASCII | EOL | 9B | 155 | |
| IBM mainframe systems, including z/OS (OS/390) and IBM i (OS/400) | EBCDIC | NL | 15 | 21 | \025 |
| ZX80 and ZX81 (home computers from Sinclair Research Ltd) | ZX80/ZX81 proprietary encoding | | 76 | 118 | |
- EBCDIC systems, mainly IBM mainframe systems including z/OS (OS/390) and IBM i (OS/400), use NL (New Line, 0x15)[8] as the character combining the functions of line feed and carriage return. The equivalent Unicode character (0x85) is called NEL (Next Line). EBCDIC also has control characters called CR and LF, but the numerical value of LF (0x25) differs from the one used by ASCII (0x0A). Additionally, some EBCDIC variants also use NL but assign a different numeric code to the character. However, those operating systems use a record-based file system, which stores text files as one record per line. In most file formats, no line terminators are actually stored.
- Operating systems for the CDC 6000 series defined a newline as two or more zero-valued six-bit characters at the end of a 60-bit word. Some configurations also defined a zero-valued character as a colon character, with the result that multiple colons could be interpreted as a newline depending on position.
- RSX-11 and OpenVMS also use a record-based file system, which stores text files as one record per line. In most file formats, no line terminators are actually stored, but the Record Management Services facility can transparently add a terminator to each line when it is retrieved by an application. The records themselves can contain the same line terminator characters, which can either be considered a feature or a nuisance depending on the application. RMS not only stores records, but also stores metadata about the record separators in different bits for the file to complicate matters even more (since files can have fixed length records, records that are prefixed by a count or records that are terminated by a specific character). The bits are not generic, so while they can specify that CRLF or LF or even CR is the line terminator, they can not substitute some other code.
- Fixed line length was used by some early mainframe operating systems. In such a system, an implicit end-of-line was assumed every 72 or 80 characters, for example. No newline character was stored. If a file was imported from the outside world, lines shorter than the line length had to be padded with spaces, while lines longer than the line length had to be truncated. This mimicked the use of punched cards, on which each line was stored on a separate card, usually with 80 columns on each card, often with sequence numbers in columns 73–80. Many of these systems added a carriage control character to the start of the next record; this could indicate whether the next record was a continuation of the line started by the previous record, or a new line, or should overprint the previous line (similar to a CR). Often this was a normal printing character such as # that thus could not be used as the first character in a line. Some early line printers interpreted these characters directly in the records sent to them.
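The fixed-line-length scheme in the last item can be sketched in a few lines of Python. This is only an illustration: the 80-character width and the function names `to_fixed_records` and `from_fixed_records` are assumptions made for the example, not part of any historical system.

```python
RECORD_LENGTH = 80  # assumed card-image width; the text mentions 72 or 80


def to_fixed_records(text, width=RECORD_LENGTH):
    """Pad or truncate each line to `width` characters; no newline is stored."""
    records = []
    for line in text.splitlines():
        records.append(line[:width].ljust(width))
    return "".join(records)


def from_fixed_records(data, width=RECORD_LENGTH):
    """Cut the data every `width` characters and strip the space padding."""
    return [data[i:i + width].rstrip(" ") for i in range(0, len(data), width)]


stored = to_fixed_records("HELLO\nWORLD")
print(len(stored))                 # 160: two 80-character records
print(from_fixed_records(stored))  # ['HELLO', 'WORLD']
```

Note that the round trip loses trailing spaces and anything beyond the record width, which matches the padding and truncation described above.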
Communication protocols
Many communications protocols have some sort of new line convention. In particular, protocols published by the Internet Engineering Task Force (IETF) typically use the ASCII CRLF sequence.
In some older protocols, the new line may be followed by a checksum or parity character.
Unicode
The Unicode standard defines a number of characters that conforming applications should recognize as line terminators:[9]
| LF: | Line Feed, U+000A |
| VT: | Vertical Tab, U+000B |
| FF: | Form Feed, U+000C |
| CR: | Carriage Return, U+000D |
| CR+LF: | CR (U+000D) followed by LF (U+000A) |
| NEL: | Next Line, U+0085 |
| LS: | Line Separator, U+2028 |
| PS: | Paragraph Separator, U+2029 |
This may seem overly complicated compared to converting all line terminators to a single character such as LF. However, Unicode is designed to preserve all information when a text file is converted from any existing encoding to Unicode and back (round-trip integrity), so it must make the same distinctions between line breaks that other encodings make. For instance, EBCDIC has NL, CR, and LF characters, so all three must also exist in Unicode.
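As an illustration of software that does recognize the full list, Python's built-in `str.splitlines()` splits on every terminator in the table above (it additionally splits on the FS, GS and RS controls, which are not in the table). The dictionary name `TERMINATORS` is just a label chosen for this sketch.

```python
# Each terminator from the table, written as a Python string escape.
TERMINATORS = {
    "LF": "\n",
    "VT": "\x0b",
    "FF": "\x0c",
    "CR": "\r",
    "CR+LF": "\r\n",
    "NEL": "\x85",
    "LS": "\u2028",
    "PS": "\u2029",
}

for name, seq in TERMINATORS.items():
    # Every sequence, including the two-character CR+LF, yields exactly one break.
    parts = ("first" + seq + "second").splitlines()
    print(f"{name:6} -> {parts}")
```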
Most newline characters and sequences are in ASCII's C0 controls (i.e. have Unicode code points up to 0x1F). The three newline characters outside of this range—NEL, LS and PS—are often not recognized as newlines by software. For example:
- JSON recognizes CR and LF as whitespace, but not any other newline characters.[10] C0 controls cannot appear unescaped within strings, but any other line break characters can.[11]
- ECMAScript only recognizes CR, LF, LS and PS as line terminators.[12] Historically, unescaped line terminators were not permitted in string literals,[13] but this was changed in ES2019 to allow unescaped LS and PS in strings[12] for compatibility with JSON.[14]
- YAML 1.1 recognized all three as line breaks; YAML 1.2 no longer recognizes them as line breaks in order to be compatible with JSON.[15]
- Windows Notepad, the default text editor of Microsoft Windows, does not treat any of NEL, LS, or PS as line breaks.
- gedit, the default text editor of the GNOME desktop environment, treats LS and PS as line breaks, but not NEL.
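The JSON behaviour in the first item can be checked with Python's standard `json` module, which follows these rules in its default strict mode. The helper name `accepts` is hypothetical and exists only for this sketch.

```python
import json


def accepts(document):
    """Return True if the standard-library parser accepts the JSON text."""
    try:
        json.loads(document)
        return True
    except json.JSONDecodeError:
        return False


print(accepts("[1,\r\n 2]"))     # True: CR and LF between tokens are whitespace
print(accepts("[1,\u0085 2]"))   # False: NEL is not JSON whitespace
print(accepts('"a\nb"'))         # False: a raw LF inside a string is a C0 control
print(accepts('"a\u2028b"'))     # True: a raw LS inside a string is allowed
```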
Unicode includes some glyphs intended to present a user-visible symbol for a newline to the reader of the document; these are not themselves recognized as newlines:
- U+23CE ⏎ RETURN SYMBOL
- U+240A ␊ SYMBOL FOR LINE FEED
- U+240D ␍ SYMBOL FOR CARRIAGE RETURN
- U+2424 ␤ SYMBOL FOR NEWLINE
HTML
In HTML, line breaks are whitespace and are generally[a] treated no differently from spaces.[16] Paragraphs are created using separate instances of the HTML element <p>, with the physical separation of paragraphs controlled by the rendering engine.[17]
Line breaks can be explicitly created using the HTML element <br>. To help screen readers interpret pages, HTML documentation recommends against using this element for paragraphing. Instead, sources including MDN Web Docs suggest using this element for poems.[18]
In programming languages
To facilitate creating portable programs, programming languages provide some abstractions to deal with the different types of newline sequences used in different environments.
The C language provides the escape sequences \n (newline) and \r (carriage return). However, these are not required to be equivalent to the ASCII LF and CR control characters. The C standard only guarantees two traits:
- Each of these escape sequences maps to a unique implementation-defined number that can be stored in one char value.
- When writing to a file, device node, or socket/fifo in text mode, \n is transparently translated to the native newline sequence used by the system, which may be longer than one character. When reading in text mode, the native newline sequence is translated back to \n. In binary mode, no translation is performed, and the internal representation produced by \n is output directly.
On Unix operating system platforms, where C originated, the native newline sequence is ASCII LF (0x0A), so \n was simply defined to be that value. With the internal and external representation being identical, the translation performed in text mode is a no-op, and Unix has no notion of text mode or binary mode. This has caused many programmers who developed their software on Unix systems simply to ignore the distinction completely, resulting in code that is not portable to different platforms.
The C standard library function fgets() is best avoided in binary mode because any file not written with the Unix newline convention will be misread. Also, in text mode, any file not written with the system's native newline sequence (such as a file created on a Unix system, then copied to a Windows system) will be misread as well.
Another common problem is the use of \n when communicating using an Internet protocol that mandates the use of ASCII CR+LF for ending lines. Writing \n to a text mode stream works correctly on Windows systems, but produces only LF on Unix, and something completely different on more exotic systems. Using \r\n in binary mode is slightly better.
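The same advice carries over to other languages: build protocol lines as bytes with an explicit CR+LF rather than relying on a text-mode newline. The following Python sketch assembles a minimal HTTP/1.1 request; HTTP is picked here only as a familiar CR+LF protocol, and the function name `build_request` and the header set are assumptions made for the example.

```python
CRLF = b"\r\n"  # explicit bytes: 0x0D 0x0A on every platform


def build_request(host, path="/"):
    """Assemble a minimal HTTP/1.1 request with CR+LF line endings."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
    ]
    # Each line ends in CR+LF; an empty line terminates the header block.
    return CRLF.join(line.encode("ascii") for line in lines) + CRLF + CRLF


print(build_request("example.org"))
```

Because the result is a `bytes` object, it can be written to a socket or a binary-mode stream without any newline translation taking place.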
Many languages, such as C++, Perl,[19] and Haskell provide the same interpretation of \n as C. C++ has an alternative input/output (I/O) model where the manipulator std::endl can be used to output a newline (and flushes the stream buffer).
Java, PHP,[20] and Python[21] provide the \r\n sequence (for ASCII CR+LF). In contrast to C, these are guaranteed to represent the values U+000D and U+000A, respectively.
The Java Class Library input/output (I/O) methods do not transparently translate these into platform-dependent newline sequences on input or output. Instead, they provide functions for writing a full line that automatically add the native newline sequence, and functions for reading lines that accept any of CR, LF, or CR+LF as a line terminator (see BufferedReader.readLine()). The System.lineSeparator() method can be used to retrieve the underlying line separator.
Example:
String eol = System.lineSeparator();
String lineColor = "Color: Red" + eol;
Python has a "universal newline support" feature enabled by default, which translates all three commonly found line ending conventions (\n, \r, \r\n) into Python's standard \n convention when opening a file for reading, when importing modules, and when executing a file. This feature can be controlled using the newline argument in the open() function when opening the file.[22][23]
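A short sketch of the `newline` argument in action is given below. It writes a file with mixed line endings in binary mode and reads it back twice; the file name `mixed.txt` and the sample content are arbitrary choices for the illustration.

```python
import os
import tempfile

raw = b"one\ntwo\rthree\r\nfour"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "mixed.txt")
    with open(path, "wb") as f:          # binary mode: bytes are written untouched
        f.write(raw)

    with open(path, "r", newline=None) as f:   # default: universal newlines
        translated = f.read()
    with open(path, "r", newline="") as f:     # line endings left as stored
        untranslated = f.read()

print(repr(translated))    # 'one\ntwo\nthree\nfour'
print(repr(untranslated))  # 'one\ntwo\rthree\r\nfour'
```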
Some languages have created special variables, constants, and subroutines to facilitate newlines during program execution. In some languages such as PHP and Perl, double quotes are required to perform escape substitution for all escape sequences, including \n and \r. In PHP, to avoid portability problems, newline sequences should be issued using the PHP_EOL constant.[24]
Example in C#:
string eol = Environment.NewLine;
string lineColor = "Color: Red" + eol;
string eol2 = "\n";
string lineColor2 = "Color: Blue" + eol2;
Issues with different newline formats
The different newline conventions cause text files that have been transferred between systems of different types to be displayed incorrectly.
Text in files created with programs common on Unix-like systems or classic Mac OS appears as a single long line in most programs common to MS-DOS and Microsoft Windows, because these programs do not display a single line feed or a single carriage return as a line break.
Conversely, when viewing a file originating from a Windows computer on a Unix-like system, the extra CR may be displayed as a second line break, as ^M, or as <cr> at the end of each line.
Furthermore, programs other than text editors may not accept a file, e.g. some configuration file, encoded using the foreign newline convention, as a valid file.
The problem can be hard to spot because some programs handle the foreign newlines properly while others do not. For example, a compiler may fail with obscure syntax errors even though the source file looks correct when displayed on the console or in an editor. Modern text editors generally recognize all the common CR and LF newline conventions and allow users to convert between them. Web browsers are usually also capable of displaying text files and websites which use different types of newlines.
Even if a program supports different newline conventions, these features are often not sufficiently labeled, described, or documented. Typically a menu or combo-box enumerating different newline conventions will be displayed to users without an indication if the selection will re-interpret, temporarily convert, or permanently convert the newlines. Some programs will implicitly convert on open, copy, paste, or save—often inconsistently.
Most textual Internet protocols (including HTTP, SMTP, FTP, IRC, and many others) mandate the use of ASCII CR+LF (\r\n, 0x0D 0x0A) on the protocol level, but recommend that tolerant applications recognize lone LF (\n, 0x0A) as well. Despite the dictated standard, many applications erroneously use the C newline escape sequence \n (LF) instead of the correct combination of carriage return escape and newline escape sequences \r\n (CR+LF) (see the section In programming languages above). This accidental use of the wrong escape sequences leads to problems when trying to communicate with systems adhering to the stricter interpretation of the standards instead of the suggested tolerant interpretation. One such intolerant system is the qmail mail transfer agent that actively refuses to accept messages from systems that send bare LF instead of the required CR+LF.[25]
The standard Internet Message Format[26] for email states: "CR and LF MUST only occur together as CRLF; they MUST NOT appear independently in the body". Differences between SMTP implementations in how they treat bare LF and/or bare CR characters have led to SMTP spoofing attacks referred to as "SMTP smuggling".[27]
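A strict receiver can test for the condition quoted above before accepting a message. The check below is a minimal sketch of that idea, not the logic of any particular mail server; the names `BARE_CR_OR_LF` and `has_bare_cr_or_lf` are invented for the example.

```python
import re

# Matches a CR not followed by LF, or an LF not preceded by CR.
BARE_CR_OR_LF = re.compile(rb"\r(?!\n)|(?<!\r)\n")


def has_bare_cr_or_lf(message):
    """Return True if CR or LF appears anywhere other than as a CR+LF pair."""
    return BARE_CR_OR_LF.search(message) is not None


print(has_bare_cr_or_lf(b"Subject: hi\r\n\r\nbody\r\n"))  # False
print(has_bare_cr_or_lf(b"body line\nnext line\r\n"))     # True (bare LF)
print(has_bare_cr_or_lf(b"body line\rnext line\r\n"))     # True (bare CR)
```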
The File Transfer Protocol can automatically convert newlines in files being transferred between systems with different newline representations when the transfer is done in "ASCII mode". However, transferring binary files in this mode usually has disastrous results: any occurrence of the newline byte sequence—which does not have line terminator semantics in this context, but is just part of a normal sequence of bytes—will be translated to whatever newline representation the other system uses, effectively corrupting the file. FTP clients often employ some heuristics (for example, inspection of filename extensions) to automatically select either binary or ASCII mode, but in the end it is up to users to make sure their files are transferred in the correct mode. If there is any doubt as to the correct mode, binary mode should be used, as then no files will be altered by FTP, though they may display incorrectly.[28]
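The corruption is easy to reproduce with a deliberately simplified model of an ASCII-mode transfer, shown below. The two functions are hypothetical stand-ins for what the receiving side does, not an FTP implementation. The eight-byte PNG file signature makes a convenient test subject because it contains both a CR+LF pair and a lone LF.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first eight bytes of every PNG file


def ascii_mode_to_unix(data):
    """Simplified model: arriving on an LF system, CR+LF becomes LF."""
    return data.replace(b"\r\n", b"\n")


def ascii_mode_to_windows(data):
    """Simplified model: arriving on a CR+LF system, every line end becomes CR+LF."""
    return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")


print(ascii_mode_to_unix(PNG_SIGNATURE))     # b'\x89PNG\n\x1a\n' (7 bytes)
print(ascii_mode_to_windows(PNG_SIGNATURE))  # b'\x89PNG\r\n\x1a\r\n' (10 bytes)
```

In both directions the result no longer matches the original eight bytes, so a PNG decoder would reject the file.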
Conversion between newline formats
Text editors are often used for converting a text file between different newline formats; most modern editors can read and write files using at least the different ASCII CR/LF conventions.
For example, the editor Vim can make a file compatible with the Windows Notepad text editor. Within Vim:
:set fileformat=dos
:wq
Editors can be unsuitable for converting larger files or bulk conversion of many files. For larger files (on Windows NT) the following command is often used:
D:\>TYPE unix_file | FIND /V "" > dos_file
Special purpose programs to convert files between different newline conventions include unix2dos and dos2unix, mac2unix and unix2mac, mac2dos and dos2mac, and flip.[29] The tr command is available on virtually every Unix-like system and can be used to perform arbitrary replacement operations on single characters. A DOS/Windows text file can be converted to Unix format by simply removing all ASCII CR characters with
$ tr -d '\r' < inputfile > outputfile
or, if the text has only CR newlines, by converting all CR newlines to LF with
$ tr '\r' '\n' < inputfile > outputfile
The same tasks are sometimes performed with awk, sed, or in Perl if the platform has a Perl interpreter:
$ awk '{sub("$","\r\n"); printf("%s",$0);}' inputfile > outputfile # UNIX to DOS (adds CRs; for Linux and BSD systems without GNU extensions)
$ awk '{gsub("\r",""); print;}' inputfile > outputfile # DOS to UNIX (removes CRs; for Linux and BSD systems without GNU extensions)
$ sed -e 's/$/\r/' inputfile > outputfile # UNIX to DOS (adds CRs; requires GNU sed)
$ sed -e 's/\r$//' inputfile > outputfile # DOS to UNIX (removes CRs; requires GNU sed)
$ perl -pe 's/\r?\n|\r/\r\n/g' inputfile > outputfile # Convert to DOS
$ perl -pe 's/\r?\n|\r/\n/g' inputfile > outputfile # Convert to UNIX
$ perl -pe 's/\r?\n|\r/\r/g' inputfile > outputfile # Convert to old Mac
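The Perl one-liners translate directly into other languages. A Python equivalent might look like the sketch below, which works on bytes so that no text-mode translation interferes; the function name `convert_newlines` is an assumption for the example.

```python
import re

# CR+LF is listed first so that a pair is never split into CR and LF.
_ANY_NEWLINE = re.compile(rb"\r\n|\r|\n")


def convert_newlines(data, target=b"\n"):
    """Replace every CR+LF, CR or LF in `data` with `target`."""
    # A function is used as the replacement so `target` is inserted literally.
    return _ANY_NEWLINE.sub(lambda _match: target, data)


mixed = b"a\r\nb\rc\nd"
print(convert_newlines(mixed))            # b'a\nb\nc\nd'      (to Unix)
print(convert_newlines(mixed, b"\r\n"))   # b'a\r\nb\r\nc\r\nd' (to DOS)
print(convert_newlines(mixed, b"\r"))     # b'a\rb\rc\rd'      (to old Mac)
```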
The file command can identify the type of line endings:
$ file myfile.txt
myfile.txt: ASCII English text, with CRLF line terminators
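Where the file command is not available, a rough tally of line endings can be computed directly from the bytes. The following Python sketch counts each convention; `count_line_endings` is a hypothetical helper and does not reproduce the heuristics that file itself uses.

```python
def count_line_endings(data):
    """Count CR+LF pairs, lone LFs and lone CRs in a bytes object."""
    crlf = data.count(b"\r\n")
    return {
        "CRLF": crlf,
        "LF": data.count(b"\n") - crlf,  # LFs that are not part of a CR+LF pair
        "CR": data.count(b"\r") - crlf,  # CRs that are not part of a CR+LF pair
    }


print(count_line_endings(b"a\r\nb\nc\rd\r\n"))  # {'CRLF': 2, 'LF': 1, 'CR': 1}
```

A file with non-zero counts in more than one category has mixed line endings, which is often the first clue to the display problems described earlier.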
The Unix grep command can be used to print filenames of Unix or DOS files (assuming Unix and DOS-style files only, no classic Mac OS-style files). The pattern matches a CR at the end of a line; the $'...' quoting that turns \r into a literal CR character is a feature of shells such as bash, zsh, and ksh:
$ grep -L $'\r$' myfile.txt # show UNIX style file (LF terminated)
$ grep -l $'\r$' myfile.txt # show DOS style file (CRLF terminated)
Other tools permit the user to visualise the EOL characters:
$ od -a myfile.txt
$ cat -e myfile.txt
$ cat -v myfile.txt
$ hexdump -c myfile.txt
Interpretation
Two ways to view newlines, both of which are self-consistent, are that newlines either separate lines or that they terminate lines. If a newline is considered a separator, there will be no newline after the last line of a file. Some programs have problems processing the last line of a file if it is not terminated by a newline. On the other hand, programs that expect newline to be used as a separator will interpret a final newline as starting a new (empty) line. Conversely, if a newline is considered a terminator, all text lines including the last are expected to be terminated by a newline. If the final character sequence in a text file is not a newline, the final line of the file may be considered to be an improper or incomplete text line, or the file may be considered to be improperly truncated.
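Both interpretations are visible in Python's standard string methods, which makes for a compact illustration. The wrapper names below are invented for this sketch: `str.split("\n")` behaves like the separator view and `str.splitlines()` like the terminator view.

```python
def lines_as_separated(text):
    """Separator view: a final newline starts a new, empty line."""
    return text.split("\n")


def lines_as_terminated(text):
    """Terminator view: a final newline merely ends the last line."""
    return text.splitlines()


text = "alpha\nbeta\n"
print(lines_as_separated(text))   # ['alpha', 'beta', '']
print(lines_as_terminated(text))  # ['alpha', 'beta']
```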
In text intended primarily to be read by humans using software which implements the word wrap feature, a newline character typically only needs to be stored if a line break is required independent of whether the next word would fit on the same line, such as between paragraphs and in vertical lists. Therefore, in the logic of word processing and most text editors, newline is used as a paragraph break and is known as a "hard return", in contrast to "soft returns" which are dynamically created to implement word wrapping and are changeable with each display instance. In many applications a separate control character called "manual line break" exists for forcing line breaks inside a single paragraph. The glyph for the control character for a hard return is usually a pilcrow (¶), and for the manual line break is usually a carriage return arrow (↵).
Reverse and partial line feeds
RI (U+008D REVERSE LINE FEED,[30] ISO/IEC 6429 8D, decimal 141) is used to move the printing position back one line (by reverse feeding the paper, or by moving a display cursor up one line) so that other characters may be printed over existing text. This may be done to make them bolder, or to add underlines, strike-throughs or other characters such as diacritics. The reverse line feed was called a line starve – a pun on line feed – in the Hacker's Dictionary.[31]
Similarly, PLD (U+008B PARTIAL LINE FORWARD, decimal 139) and PLU (U+008C PARTIAL LINE BACKWARD, decimal 140) can be used to advance or reverse the text printing position by some fraction of the vertical line spacing (typically, half). These can be used in combination for subscripts (by advancing and then reversing) and superscripts (by reversing and then advancing), and may also be useful for printing diacritics.
Notes
- ^ Line breaks within, e.g., <PRE>...</PRE> are honored.
References
- ^ "What is a Newline?". www.computerhope.com. Retrieved 10 May 2021.
- ^ Qualline, Steve (2001). Vi Improved - Vim (PDF). Sams Publishing. p. 120. ISBN 9780735710016. Archived from the original (PDF) on 8 April 2022. Retrieved 4 January 2023.
- ^ Duckett, Chris. "Windows Notepad finally understands everyone else's end of line characters". ZDNet. Archived from the original on 13 May 2018. Retrieved 4 January 2023.
[A]fter decades of frustration, and having to download a real text editor to change a single line in a config file from a Linux box, Microsoft has updated Notepad to be able to handle end of line characters used in Unix, Linux, and macOS environments.
- ^ Lopez, Michel (8 May 2018). "Introducing extended line endings support in Notepad". Windows Command Line. Archived from the original on 6 April 2019. Retrieved 4 January 2023.
As with any change to a long-established tool, there's a chance that this new behavior may not work for your scenarios, or you may prefer to disable this new behavior and return to Notepad's original behavior. To do this, you can change [...registry keys...] to tweak how Notepad handles pasting of text, and which EOL character to use when Enter/Return is hit
- ^ Kahn-Greene, Will Guaraldi. "ASCII chart". bluesock.org.
- ^ Bray, Andrew C.; Dickens, Adrian C.; Holmes, Mark A. (1983). The Advanced User Guide for the BBC Microcomputer (PDF). Cambridge Microcomputer Centre. pp. 103, 104. ISBN 978-0946827008. Retrieved 30 January 2019.
- ^ "Character Output". RISC OS 3 Programmers' Reference Manual. 3QD Developments Ltd. 3 November 2015. Retrieved 18 July 2018.
- ^ IBM System/360 Reference Data Card, Publication GX20-1703, IBM Data Processing Division, White Plains, NY
- ^ Heninger, Andy (20 September 2013). "UAX #14: Unicode Line Breaking Algorithm". The Unicode Consortium.
- ^ Bray, Tim (March 2014). "JSON Grammar". The JavaScript Object Notation (JSON) Data Interchange Format. sec. 2. doi:10.17487/RFC7159. RFC 7159.
- ^ Bray, Tim (March 2014). "Strings". The JavaScript Object Notation (JSON) Data Interchange Format. sec. 7. doi:10.17487/RFC7159. RFC 7159.
- ^ a b "ECMAScript 2019 Language Specification". ECMA International. June 2019. 11.3 Line Terminators.
- ^ "ECMAScript 2018 Language Specification". ECMA International. June 2018. 11.3 Line Terminators.
- ^ "Subsume JSON (a.k.a. JSON ⊂ ECMAScript)". GitHub. 22 May 2018.
- ^ "5.4. Line Break Characters". YAML Ain't Markup Language revision 1.2.2. 1 October 2021.
- ^ "HyperText Mark-up Language". info.cern.ch. CERN. Retrieved 15 August 2025.
- ^ "<p>: The Paragraph element - HTML". MDN. Mozilla. 13 August 2025.
- ^ "<br>: The Line Break element - HTML". MDN. Mozilla. 9 July 2025. Retrieved 15 August 2025.
- ^ "binmode". Perl documentation. Perl 5 Porters.
- ^ "PHP: Strings - Manual". PHP Manual. The PHP Group.
- ^ "2. Lexical analysis". The Python Language Reference. The Python Foundation.
- ^ Jansen, Jack (14 January 2002). "PEP 278 – Universal Newline Support". Python Enhancement Proposals. Python Software Foundation. sec. Specification.
- ^ "What's new in Python 2.3". Python 2.3. Python Software Foundation. sec. General, Universal newlines.
Universal newlines - files opened for reading with the special mode "U" (instead of "r") translate all three commonly found line ending conventions (\n, \r, \r\n) into Python's standard \n convention. Contributed by Jack Jansen. (PEP 278)
- ^ "PHP: Predefined Constants - Manual". PHP Manual. The PHP Group.
- ^ Bernstein, D. J. "Bare LFs in SMTP".
- ^ Resnick, Pete (April 2001). Internet Message Format. doi:10.17487/RFC2822. RFC 2822.
- ^ Longin, Timo (18 December 2023). "SMTP Smuggling - Spoofing E-Mails Worldwide". SEC Consult.
- ^ Zeil, Steven (19 January 2015). "File Transfer". Old Dominion University. Archived from the original on 14 May 2016.
When in doubt, transfer in binary mode.
- ^ Sapp, Craig Stuart. "ASCII text converstion between UNIX, Macintosh, MS-DOS". Center for Computer Research in Music and Acoustics. Archived from the original on 9 February 2009.
- ^ "C1 Controls and Latin-1 Supplement" (PDF). unicode.org. Retrieved 13 February 2016.
- ^ Spears, Richard A.; Steele, Guy L. (1986). Woods, Donald R.; Finkel, Raphael A.; Crispin, Mark R.; Stallman, Richard M.; Goodfellow, Geoffrey S.; Abel, Ernest L. (eds.). "Two Specialty Dictionaries". American Speech. 61 (3): 273–277. doi:10.2307/454671. ISSN 0003-1283.
A few entries are typical computer terminology. .... Line feed is listed to explain the punning entry line starve.
External links
- The Unicode reference; see paragraph 5.8 in Chapter 5 of the Unicode 4.0 standard (PDF)
- "The [NEL] Newline Character".
- The End of Line Puzzle
- Understanding Newlines at the Wayback Machine (archived 20 August 2006)
- "The End-of-Line Story"
Newline
History
Origins in Typewriters and Teleprinters
The typewriter, a pivotal invention in mechanical writing devices, was patented on June 23, 1868, by Christopher Latham Sholes, along with Carlos Glidden and Samuel W. Soule, marking the first practical model known as the "Type-Writer."[4] This device featured a carriage (a movable frame holding the paper) that advanced incrementally as keys were struck, thanks to an escapement mechanism ensuring precise letter spacing. At the end of each line, the typist manually operated a carriage return lever, which retracted the carriage to the left margin, while a separate line feed lever or platen knob advanced the paper upward by one line to prepare for the next row of text.[5] These physical operations, driven by springs and gears, addressed the need for organized linear text production on paper, preventing overlap and maintaining readability without digital aids.
The introduction of electric typewriters in the 1930s further refined these mechanisms, automating actions for greater efficiency. IBM's Electromatic model, released in 1935 after acquiring the Northeast Electric Company, incorporated an electric motor to power the carriage return and line feed, reducing manual effort and enabling faster operation compared to purely mechanical predecessors.[6] Earlier attempts at electrification dated back to Thomas Edison's 1872 printing wheel design, but practical office models emerged only in this decade, with Royal introducing its first electric typewriter in 1950.[6][7] These innovations preserved the core principles of carriage return (resetting the print position horizontally) and line feed (vertical paper advancement) while enhancing reliability for professional use.
Teleprinters, or teletypewriters, emerged in the early 1900s as electromechanical devices for transmitting typed messages over telegraph lines, building directly on typewriter mechanics for remote printing. Émile Baudot's five-bit telegraph code, patented in 1874, enabled efficient character transmission but initially lacked dedicated line control signals.[8] This changed with Donald Murray's 1901 adaptation of the Baudot code for English-language use, which introduced specific control characters for carriage return (CR) and line feed (LF); these simulated the typewriter's physical actions by signaling the receiving device's carriage to shift left and the platen to advance the paper, respectively.[9] Teletype machines, first commercialized by the Morkrum Company from 1906 onward and later by the Teletype Corporation, standardized the CR+LF sequence to ensure complete line transitions over asynchronous telegraph connections, allowing synchronized printing at both ends.[10][11]
A key feature of teleprinters was the ability to perform overstriking (reprinting on the same line for emphasis or correction) by issuing a CR without a subsequent LF, which returned the print head to the line's start without advancing the paper, thus enabling manipulation of text on a single line before feeding to the next.[12] This capability, rooted in the separate mechanical controls of typewriters, foreshadowed flexible line handling in later communication systems and highlighted the practical need for distinct CR and LF operations in noisy telegraph environments.
Evolution in Early Computing
As early computers emerged in the 1940s and 1950s, the newline concept transitioned from mechanical teleprinters to digital terminals, where repurposed teleprinters served as input/output devices for timesharing systems, allowing multiple users to interact with a single machine over phone lines.[13] Devices like the IBM 026 printing card punch, introduced in 1949, adapted typewriter mechanisms for data entry, incorporating carriage return and line feed operations to print punched cards while advancing the paper feed.[14] Early line printers, such as the IBM 1403 from the 1950s, extended this by using carriage control characters in the first column of each line to manage paper advancement, spacing, and form feeds, ensuring efficient output formatting without dedicated newline sequences.[15]
The standardization of newline in computing advanced significantly with the development of the American Standard Code for Information Interchange (ASCII) in 1963 by the American Standards Association (ASA) X3 committee.[16] ASCII defined line feed (LF, hexadecimal 0x0A, decimal 10) as a control character to advance the paper or cursor to the next line, and carriage return (CR, 0x0D, decimal 13) to move the cursor to the line's starting position, drawing from teleprinter conventions to support data transmission and display.[16] These definitions, published as ASA X3.4-1963, provided a common framework for text handling across systems, influencing subsequent protocols and software.[17]
In the 1960s and 1970s, operating systems diverged in newline adoption: the TECO text editor, developed in 1962 for Digital Equipment Corporation systems, treated line breaks as single LF characters internally, automatically appending LF to input carriage returns for buffer storage.[18] Multics, an influential timesharing system from the late 1960s, stored text with LF alone but inserted CR before LF during output to terminals or printers for compatibility with teleprinters.[19] In contrast, UNIX, developed in the early 1970s at Bell Labs, standardized on LF-only for line endings to enhance storage efficiency by avoiding redundant CR characters, particularly beneficial on limited media like tapes and disks.[12]
The ARPANET, launched in 1969, further emphasized consistent newline handling through its reliance on ASCII control characters in early protocols like the 1822 interface message processor protocol, ensuring reliable text transmission across heterogeneous hosts by standardizing LF and CR for line demarcation in network messages.[20] This approach influenced subsequent internet standards, promoting interoperability in data exchange.
Technical Representation
ASCII Control Characters
The American Standard Code for Information Interchange (ASCII), formalized in 1963 as ASA X3.4-1963, established a 7-bit character encoding scheme that reserved the code positions from 0 to 31 (and 127) for control characters, which lack visual glyphs and serve to control text layout, transmission, and peripheral devices rather than represent printable symbols.[21] These controls were influenced by earlier teleprinter codes, such as the International Telegraph Alphabet No. 2 (ITA2) from the 1930s, which introduced non-printing signals for formatting in mechanical systems like Baudot code derivatives.[22]
Central to newline operations are the Line Feed (LF) and Carriage Return (CR) control characters. LF, assigned code 10 (hexadecimal 0A, octal 012, binary 0001010), instructs devices to advance the active position to the next line, performing a vertical movement without horizontal reset, as originally defined for teleprinter paper feed mechanisms.[21] CR, with code 13 (hex 0D, octal 015, binary 0001101), returns the active position to the start of the current line, resetting the horizontal position to the left margin while leaving the vertical position unchanged, emulating the mechanical action of a typewriter carriage.[21] In 7-bit ASCII streams, these appear as non-printable bytes; for example, a sequence ending a line might embed LF as the byte 0x0A in a binary data flow, invisible to direct display but interpreted by parsers to format output.[23]
Related control characters include Vertical Tabulation (VT, code 11 or hex 0B, octal 013, binary 0001011), which advances the position to the next vertical tab stop for multi-line spacing, and Form Feed (FF, code 12 or hex 0C, octal 014, binary 0001100), which ejects the current page and advances to the start of a new one, both supporting structured text progression in early printing and display systems.[21]
Subsequent extensions to ASCII, such as the ISO 8859 family of standards (e.g., ISO 8859-1 from 1987), preserved the core 7-bit structure unchanged for these control characters in the 0-127 range, ensuring compatibility while assigning additional printable symbols for regional variants to code positions 160-255.[21]
End-of-Line Sequences Across Systems
In computing environments, end-of-line sequences represent the transition to a new line in text data, with variations arising from historical and technical considerations across systems. The most prevalent are the line feed (LF, ASCII 0x0A) used alone in Unix, Linux, and macOS (Mac OS X and later, from 2001); the carriage return followed by line feed (CR+LF, ASCII 0x0D followed by 0x0A) in Windows and DOS-derived systems; and carriage return (CR, ASCII 0x0D) alone in classic Mac OS (pre-OS X). These sequences build on ASCII control characters for carriage positioning and paper advancement.
The LF-only approach emerged in early Unix implementations for storage efficiency and standardization, as a single character adequately advanced the cursor on line-buffered terminals without needing separate return and feed operations.[12] In contrast, CR+LF was the convention in CP/M, inherited from earlier teleprinter-based systems, and was adopted by MS-DOS to ensure compatibility with mechanical teletypes and printers, where CR reset the print head to the line start and LF advanced the paper.[24] Classic Mac OS employed CR-only for its straightforward text rendering model, simplifying file processing on resource-constrained hardware.[25] Less common variants include the Next Line (NEL) control, encoded as 0x85 in ISO-8859-1 and equivalent to EBCDIC's NL (0x15), primarily used in IBM mainframe environments as a single character combining carriage return and line feed.[26]
| Sequence | Systems | Rationale |
|---|---|---|
| LF (0x0A) | Unix/Linux/macOS (Mac OS X and later, from 2001) | Efficiency in file size and terminal handling |
| CR+LF (0x0D 0x0A) | Windows/DOS/CP/M | Compatibility with teletype mechanics |
| CR (0x0D) | Classic Mac OS (pre-2001) | Simplicity in text display |
| NEL (0x85 in ISO-8859-1; NL 0x15 in EBCDIC) | IBM mainframes (EBCDIC) | Vertical movement in legacy encodings |
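The three ASCII-based conventions in the table can be told apart by scanning raw bytes. The following is a minimal sketch, not a reference implementation; the function names count_line_endings and dominant_line_ending are hypothetical and chosen only for this illustration.

```python
from collections import Counter
from typing import Optional


def count_line_endings(data: bytes) -> Counter:
    """Count CRLF pairs, lone CRs, and lone LFs in a byte string."""
    counts = Counter({"CRLF": 0, "CR": 0, "LF": 0})
    i = 0
    while i < len(data):
        byte = data[i]
        if byte == 0x0D:  # CR
            if i + 1 < len(data) and data[i + 1] == 0x0A:
                counts["CRLF"] += 1
                i += 1  # skip the LF that belongs to this CRLF pair
            else:
                counts["CR"] += 1
        elif byte == 0x0A:  # LF that was not preceded by CR
            counts["LF"] += 1
        i += 1
    return counts


def dominant_line_ending(data: bytes) -> Optional[str]:
    """Return the most frequent convention, or None if there are no line breaks."""
    name, n = count_line_endings(data).most_common(1)[0]
    return name if n > 0 else None


print(count_line_endings(b"a\r\nb\nc\rd\r\n"))
print(dominant_line_ending(b"x\r\ny\r\n"))
```

Counting rather than stopping at the first terminator makes mixed files visible, which is the case that tends to cause trouble in practice.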
Encoding in Unicode
In Unicode, newline functionality is represented through several dedicated control characters and separators, each serving specific roles in text formatting and line progression. The primary characters include Line Feed (LF, U+000A), which advances the cursor to the next line while maintaining the horizontal position; Carriage Return (CR, U+000D), which returns the cursor to the line start; and Next Line (NEL, U+0085), a control from ISO 6429 that combines both effects in some legacy systems. Additionally, Unicode defines Line Separator (LS, U+2028) for breaking lines within paragraphs without implying a new paragraph, and Paragraph Separator (PS, U+2029) for separating entire paragraphs, both aiding in structured text processing.
These characters trace their origins to early Unicode versions, with LF and CR included as part of the basic C0 control set in Unicode 1.0 released in 1991, inheriting from ASCII and ISO standards to ensure compatibility with existing text processing. LS and PS have likewise been present since the early versions of the standard (the Unicode Character Database lists both with an age of 1.1), giving text an unambiguous way to mark line and paragraph boundaries regardless of platform conventions; PS, for instance, acts as a paragraph boundary in the bidirectional algorithm used for scripts such as Arabic and Hebrew.
Unicode normalization forms, such as Normalization Form C (NFC) and Normalization Form D (NFD), preserve these line break characters without alteration, as they are neither decomposable nor composed with other characters; for instance, LF, CR, LS, and PS remain unchanged during canonical decomposition or composition to maintain text integrity. This stability is crucial for applications involving text transformation, where unintended splitting or merging of lines could disrupt formatting.
In multi-byte encodings like UTF-8 and UTF-16, these characters must be treated as atomic units to avoid splitting sequences; for example, in UTF-8, LF encodes as the single byte 0x0A, whereas LS requires the three-byte sequence 0xE2 0x80 0xA8, so a stream processor must not split that sequence across reads. In UTF-16, characters in the BMP such as LS (U+2028) are encoded directly as a single two-byte code unit (0x20 0x28 in big-endian order), whereas characters in higher planes (U+10000 and above) use surrogate pairs, which parsers must handle as atomic units to preserve line semantics.[31]
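The byte sequences quoted above can be checked directly. This is a small illustrative sketch using Python's built-in codecs; the dictionary LINE_BREAKS and the helper encoded_forms are names invented for the example. It also shows that Python's str.splitlines() treats LS and PS as line boundaries, which is a property of that method rather than of every text API.

```python
# Unicode line-break characters discussed in the text.
LINE_BREAKS = {
    "LF": "\u000A",
    "CR": "\u000D",
    "NEL": "\u0085",
    "LS": "\u2028",
    "PS": "\u2029",
}


def encoded_forms(ch: str) -> dict:
    """Return the UTF-8 and big-endian UTF-16 bytes of one character as hex."""
    return {
        "utf-8": ch.encode("utf-8").hex(" "),
        "utf-16-be": ch.encode("utf-16-be").hex(" "),
    }


for name, ch in LINE_BREAKS.items():
    print(name, encoded_forms(ch))

# str.splitlines() recognises LS and PS in addition to CR, LF, and CRLF.
sample = "one\u2028two\u2029three\nfour"
print(sample.splitlines())
```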
Usage Contexts
Operating Systems and Text Files
In Unix-like operating systems, including Linux, the native end-of-line sequence for text files is the line feed (LF) character, as standardized by POSIX for portability across systems. Editors such as vi and Emacs handle this by detecting the file's line ending format upon opening and optionally converting to LF for editing; for instance, Vim uses the :set fileformat=unix command to ensure LF consistency, while Emacs employs set-buffer-file-coding-system with the "unix" argument to normalize endings without altering content encoding.
Windows traditionally employs the carriage return plus line feed (CR+LF) sequence in text files, which is the default for applications like Notepad when creating or saving plain text files.[32] PowerShell Core (version 6 and later, now PowerShell 7+), first released as open source in 2016, uses LF line endings by default across all platforms to support cross-platform compatibility, though this can cause issues with Windows tools expecting CRLF, as discussed in ongoing compatibility reports.[33]
macOS underwent a significant shift in newline handling with the transition to OS X in 2001, moving from the classic Mac OS's single carriage return (CR) to LF for compliance with POSIX standards and Unix heritage, ensuring seamless integration with Unix-based tools and file systems.[34]
Plain text files with a .txt extension exhibit newline variations depending on the originating system or application, leading to potential interoperability challenges. Structured formats like JSON and XML define their own rules for line endings: the XML specification requires processors to normalize all line breaks to LF during input parsing, while JSON accepts CR and LF as whitespace between tokens but forbids unescaped line breaks inside string values, so a multi-line value must carry its breaks as \n escapes.
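The XML and JSON rules can be observed with standard-library parsers. The sketch below assumes Python's xml.etree.ElementTree (backed by Expat) and json modules as stand-ins for "an XML processor" and "a JSON parser"; other implementations may differ in error reporting.

```python
import json
import xml.etree.ElementTree as ET

# XML: CRLF and lone CR in the input are normalized to LF by the parser.
root = ET.fromstring(b"<note>line one\r\nline two\rline three</note>")
xml_text = root.text
print(repr(xml_text))

# JSON: line breaks between tokens are plain whitespace, whatever the convention.
parsed = json.loads('{"a": 1,\r\n "b": 2\r}')
print(parsed)

# JSON: a raw line break inside a string value is rejected...
try:
    json.loads('"first\nsecond"')
    raw_newline_accepted = True
except json.JSONDecodeError:
    raw_newline_accepted = False
print(raw_newline_accepted)

# ...whereas the escaped form \n is valid and decodes to LF.
escaped = json.loads('"first\\nsecond"')
print(repr(escaped))
```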
Version control systems like Git, first released in 2005, address these discrepancies by storing text files internally with LF endings regardless of the originating platform, then converting to the system's native format (such as CR+LF on Windows) during checkout via the core.autocrlf configuration setting, which can be set to true for automatic handling or input to enforce LF on commit.[35]
A practical example arises with CSV files generated by Microsoft Excel, which enforces CR+LF as the row delimiter to align with Windows conventions, often resulting in parsing issues when these files are opened in Unix tools like csvkit or awk without prior conversion, as the extra CR may be interpreted as embedded data rather than a line separator.[30]
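The stray-CR problem described above can be reproduced in a few lines. This is an illustrative sketch using Python's csv module; the sample data and the helper name read_rows are made up for the example. Opening the stream with newline="" leaves row terminators for the csv module to interpret.

```python
import csv
import io


def read_rows(raw: bytes) -> list:
    """Parse CSV bytes, letting the csv module interpret the row terminators."""
    # newline="" disables newline translation so csv.reader sees CR and LF itself.
    text = io.TextIOWrapper(io.BytesIO(raw), encoding="utf-8", newline="")
    return list(csv.reader(text))


excel_style = b"name,qty\r\nbolt,4\r\n"  # CR+LF row delimiter
unix_style = b"name,qty\nbolt,4\n"       # LF row delimiter

# A naive split on LF leaves a stray CR glued to the last field of each row.
naive_rows = [
    line.split(",")
    for line in excel_style.decode("utf-8").split("\n")
    if line
]

print(read_rows(excel_style))
print(read_rows(unix_style))
print(naive_rows)
```

A CSV-aware reader yields identical rows for both inputs, while the naive split shows the embedded \r that line-oriented tools may treat as data.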
Programming Languages
In programming languages, newlines are commonly represented using escape sequences within string literals to insert line feed (LF) or carriage return (CR) characters. For instance, the sequence \n denotes LF in languages such as C, Java, and Python, while \r represents CR, and \r\n can be specified explicitly for the combined sequence used on Windows systems.[36]
Python 3 opens text files in universal newlines mode by default, translating LF, CR, and CRLF sequences in file input to \n regardless of the platform's native convention, a behaviour originally proposed in PEP 278 for Python 2.3.[37] In contrast, Java provides System.lineSeparator(), a method that returns the platform-specific newline string (such as \n on Unix-like systems or \r\n on Windows) to ensure compatibility with operating system text file conventions in input and output operations.[38]
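Universal newline translation can be demonstrated without touching the file system. The sketch below wraps an in-memory byte buffer in io.TextIOWrapper, whose newline parameter has the same meaning as in open(); the helper name read_text and the sample bytes are assumptions made for the example.

```python
import io

RAW = b"alpha\r\nbeta\rgamma\n"  # one line each of CRLF, CR, and LF


def read_text(raw: bytes, newline):
    """Decode bytes through a text wrapper using the given newline mode."""
    with io.TextIOWrapper(io.BytesIO(raw), encoding="ascii", newline=newline) as f:
        return f.read()


translated = read_text(RAW, None)  # universal newlines: everything becomes "\n"
untouched = read_text(RAW, "")     # endings are recognised but left as they are

print(repr(translated))
print(repr(untouched))
```

Passing newline="" is the usual choice when the original terminators must survive a round trip, for example when a tool should report or preserve a file's existing convention.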
Modern languages address variability in line endings through flexible APIs; for example, Rust's std::io::BufRead trait, via its lines() method, recognizes both LF and CRLF as line terminators, stripping the delimiter (including the optional CR) without including it in the resulting string, and supports custom line-ending configurations through iterator adaptations.[39][40] Similarly, Go's bufio.Scanner with the default ScanLines function splits on an optional CR followed by a mandatory LF (matching the regex \r?\n), allowing developers to define custom split functions for other endings.[41] In JSON strings, the escape sequence \n is interpreted as a literal LF character (Unicode U+000A), preserving the newline in serialized data across language implementations.
Newline handling in SQL varies by database system; for instance, PostgreSQL preserves newlines in string literals and text fields when inserted using escape sequences like E'\n', though certain functions such as trim() may collapse leading or trailing whitespace including newlines.[42] In regular expressions, languages like Perl treat \n as matching only the LF character by default, requiring modifiers such as /s (dotall) to make . match newlines or explicit patterns like \r?\n for broader line-ending support.[43]
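The regular-expression behaviour described for Perl has a close analogue in Python's re module, which is used here purely as a stand-in: re.DOTALL plays the role of the /s modifier. The variable names and sample strings are illustrative.

```python
import re

text = "first\r\nsecond\nthird\rfourth"

# By default "." does not match LF, so the match stops at the first line.
dot_default = re.match(r".*", "ab\ncd").group(0)
# With DOTALL (Perl's /s), "." also matches LF.
dot_all = re.match(r".*", "ab\ncd", re.DOTALL).group(0)

# Splitting on LF alone leaves CRs behind and misses CR-only breaks.
lf_only = text.split("\n")
# Listing CRLF first keeps the pair from being split into two breaks.
portable = re.split(r"\r\n|\r|\n", text)

print(repr(dot_default), repr(dot_all))
print(lf_only)
print(portable)
```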
A practical example in C++ is the std::getline function from <string>, which reads input until it encounters the delimiter (\n by default), consumes the delimiter to advance the stream, but excludes it from the output string, helping prevent residual characters in subsequent reads.
Web Technologies and Markup
In HTML, whitespace characters, including newlines, are collapsed into a single space during rendering in normal text flow, preventing multiple spaces or line breaks from affecting layout unless explicitly preserved.[44] The <br> element provides a mechanism for inserting a single line break, equivalent to a newline in visual rendering, and is commonly used to simulate the effect of newlines in non-preformatted content. However, within the <pre> element, all whitespace, including newlines, is preserved exactly as authored, rendering fixed-width text with explicit line breaks.[45] Numeric character references such as &#10; (representing LF, U+000A) allow authors to embed line feeds directly in markup where needed.
CSS extends control over newline handling through the white-space property, where the pre-line value collapses consecutive whitespace sequences but preserves newlines as line breaks, allowing text to wrap while respecting authored line separations.[46] This behavior applies to standard LF characters, enabling dynamic formatting in web layouts. Gaps exist in handling Unicode-specific separators like U+2028 (line separator, LS) and U+2029 (paragraph separator, PS), which are treated as non-collapsible segment breaks in pre-line mode but may not always render consistently across browsers in internationalized content.[47]
In XML-based formats like SVG, parsers normalize all line endings—whether CR, LF, or CR+LF—to a single LF (U+000A) before processing, ensuring consistent internal representation regardless of the source file's platform.[29] Similarly, JSON used in web APIs escapes newlines within strings as \n (denoting LF), adhering to the format's strict syntax for control characters to maintain parsability across systems.[48] HTTP protocol specifications mandate CR+LF (CRLF) as the line terminator for header fields, separating name-value pairs in requests and responses.[49]
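The CRLF framing of HTTP header fields mentioned above can be illustrated with plain byte handling. This is a simplified sketch, not a conforming HTTP implementation: build_response and parse_headers are hypothetical helpers, and real parsers handle many cases (repeated fields, size limits, malformed input) that are ignored here.

```python
CRLF = b"\r\n"


def build_response(status_line: str, headers: dict, body: bytes) -> bytes:
    """Join the status line and header fields with CRLF, then add a blank line."""
    lines = [status_line.encode("ascii")]
    lines += [f"{name}: {value}".encode("ascii") for name, value in headers.items()]
    return CRLF.join(lines) + CRLF + CRLF + body


def parse_headers(message: bytes):
    """Split a message at the first empty line and parse the header block."""
    head, _, body = message.partition(CRLF + CRLF)
    start_line, *header_lines = head.split(CRLF)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(b":")
        headers[name.decode("ascii").lower()] = value.strip().decode("ascii")
    return start_line.decode("ascii"), headers, body


message = build_response(
    "HTTP/1.1 200 OK",
    {"Content-Type": "text/plain", "Content-Length": "5"},
    b"hi\nyo",  # the body is opaque bytes; its own line endings are untouched
)
print(message)
print(parse_headers(message))
```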
Markdown, as defined in the CommonMark specification (version 0.31.2, released January 2024), treats single newlines in paragraphs as soft breaks that are ignored for rendering, requiring either two trailing spaces followed by a newline or a blank line to produce a hard line break or paragraph separation.[50] In code blocks, however, raw newlines are preserved literally as LF, maintaining the original formatting for embedded code snippets.
Interpretation and Processing
Software Parsing Behaviors
Software applications and systems interpret newline sequences differently based on their design, platform conventions, and standards, which influences how text is parsed, processed, and rendered during reading and display. In web browsers, HTML parsing collapses sequences of whitespace characters, including newlines (LF or CR+LF), into a single space, except within elements like <pre> or when the CSS white-space property is set to pre or pre-wrap. This behavior ensures consistent layout rendering across documents but can obscure original formatting unless preserved explicitly.
Terminal emulators, such as xterm, map the LF control character to advancing the cursor to the next line while maintaining the horizontal position, and the CR character to moving the cursor to the start of the current line without vertical movement. These mappings align with legacy teletype behaviors and enable precise cursor control in command-line interfaces.
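The cursor semantics described above can be modelled with a toy screen buffer. The following sketch is a deliberately simplified model (no scrolling, wrapping, or escape sequences) and the function name render is hypothetical; it only shows why CR alone overwrites a line and LF alone produces a "staircase".

```python
def render(stream: str, width: int = 20) -> list:
    """Apply CR and LF as pure cursor movements on a small character grid."""
    rows = [[" "] * width]
    row = col = 0
    for ch in stream:
        if ch == "\r":        # CR: back to column 0, same row
            col = 0
        elif ch == "\n":      # LF: next row, same column
            row += 1
            if row == len(rows):
                rows.append([" "] * width)
        elif col < width:     # printable character: write and advance
            rows[row][col] = ch
            col += 1
    return ["".join(r).rstrip() for r in rows]


print(render("hello\rJ"))   # CR without LF overwrites the start of the line
print(render("ab\ncd"))     # LF without CR keeps the column
print(render("ab\r\ncd"))   # CR+LF starts a fresh line at the left margin
```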
Language runtimes often implement flexible parsing to handle cross-platform compatibility. In Java, the BufferedReader.readLine() method operates in a universal newline mode, recognizing any of \r (CR), \n (LF), or \r\n (CR+LF) as a line terminator and returning the line without the terminator.[51] Similarly, in .NET Framework and .NET Core, the StreamReader.ReadLine() method detects and consumes \r\n, \n, or \r as line endings, normalizing them during text stream processing.[52]
Modern development tools address parsing inconsistencies by auto-detecting and managing newline variants. Visual Studio Code, released in 2015, automatically detects line ending types (LF, CRLF, or CR) upon opening files and displays the current format in the status bar, allowing users to configure detection and normalization to prevent display artifacts. Version control systems like Git handle mixed newline sequences in diff operations through settings such as core.autocrlf, which normalize endings during checkout and commit to ensure consistent comparisons across environments.
In POSIX-compliant environments, as defined by IEEE Std 1003.1, a text line consists of zero or more non-newline characters terminated by a newline character. Text editors on these systems commonly provide file-format modes, such as a dos mode for CRLF and a mac mode for CR, to detect and internally convert foreign newline formats to Unix LF for editing, while preserving the original format on save.
Format Conversion Methods
Command-line tools provide straightforward methods for converting newline formats, particularly between Unix-style LF and Windows-style CRLF sequences. The dos2unix and unix2dos utilities, originating in the early 1990s, convert files by removing or adding carriage returns as needed; for instance, dos2unix strips trailing \r characters from lines ending in \r\n, while unix2dos inserts \r before existing \n terminators.[54] These tools are available in most Unix-like systems and process files in batch mode, preserving content while normalizing line endings.
Stream editors like sed and awk offer scriptable alternatives for targeted conversions without dedicated binaries. A common sed command to remove carriage returns from DOS-formatted files is sed 's/\r$//' (with GNU sed, which understands the \r escape), which substitutes any \r at the end of a line with nothing, effectively converting CRLF to LF. Similarly, awk can process and rewrite lines, such as awk '{sub(/\r$/,""); print}' to strip trailing \r before outputting.[55] The tr utility simplifies deletion of carriage returns across an entire file using tr -d '\r' < input > output, which removes all instances of the \r character (ASCII 13) from input and redirects to output.[56]
In programming environments, APIs facilitate programmatic newline handling for cross-platform compatibility. Python's os module provides os.linesep, a string representing the native line separator (\r\n on Windows, \n on Unix), which can be used with str.replace() to normalize text; for example, text.replace('\n', os.linesep) converts Unix newlines to the local format before the data is written in binary mode (a file opened in text mode performs this translation itself, so plain \n should be written there).[57] Node.js's fs module, when reading files with 'utf8' encoding, preserves original byte sequences including mixed newlines, allowing conversion via string methods like text.replace(/\r\n/g, '\n') to unify to LF for processing. Perl supports in-place editing through the $^I variable, set to an extension for backups (e.g., $^I = ".bak"), enabling one-liners like perl -i.bak -pe 's/\r\n?/\n/g' to rewrite CRLF or CR endings as LF directly in the file.[58]
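The conversions above share one idea: match every known terminator as a unit and emit the target sequence. The sketch below expresses that in Python on raw bytes; convert_newlines is a hypothetical helper, roughly comparable in effect to dos2unix or unix2dos but not a replacement for them.

```python
import re

# CRLF is listed first so that a CR+LF pair is treated as one terminator.
_ANY_NEWLINE = re.compile(rb"\r\n|\r|\n")


def convert_newlines(data: bytes, target: bytes = b"\n") -> bytes:
    """Rewrite every CRLF, CR, or LF terminator in data as target."""
    if target not in (b"\n", b"\r\n", b"\r"):
        raise ValueError("target must be LF, CRLF, or CR")
    # A function replacement avoids any escape processing of the target bytes.
    return _ANY_NEWLINE.sub(lambda match: target, data)


mixed = b"a\r\nb\rc\n"
print(convert_newlines(mixed))              # everything becomes LF
print(convert_newlines(mixed, b"\r\n"))     # everything becomes CRLF
```

Working on bytes and writing the result in binary mode sidesteps the text-mode translation layer, so the output contains exactly the terminators requested. Matching CRLF as a unit also means converting an already-CRLF file to CRLF does not double the carriage returns.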
Integrated development environments (IDEs) and cloud services address conversion gaps through automation. IntelliJ IDEA allows configuration of line separators per file or globally via Editor > Code Style settings, with options to change existing files' endings (e.g., from CRLF to LF) and apply normalization during saves if tied to code style schemes.[59] In cloud storage like AWS S3, objects are stored as immutable bytes, preserving native newline formats without alteration, but transformations can be applied via AWS Lambda functions or S3 Select queries for on-demand conversion during retrieval or processing.
Version control systems like Git incorporate line ending filters to manage conversions in cross-platform repositories. Git's smudge and clean filters, assigned in .gitattributes files, process files during checkout (smudge: apply the local convention, such as CRLF) and commit (clean: normalize to LF); for example, the attribute line *.txt filter=crlf routes matching files through a user-defined filter driver named crlf, whose commands are configured under filter.crlf.clean and filter.crlf.smudge, ensuring consistent storage while adapting to developer platforms.[35] This approach mitigates compatibility issues by automating transformations at the repository level.[60]
Common Compatibility Issues
One prevalent compatibility issue arises in version control systems like Git, where files with mixed or platform-specific newline sequences, such as CRLF on Windows versus LF on Unix-like systems, can produce misleading commit diffs that appear to show unnecessary changes to entire files.[35] This occurs because Git normalizes line endings during commits based on configuration settings like core.autocrlf, leading developers to inadvertently introduce or propagate false modifications across repositories.[35]
In email systems, mixed newline sequences can disrupt automatic line wrapping, causing text to render incorrectly in clients that expect uniform CRLF delimiters as per MIME standards, where any occurrence of CRLF must represent a line break and isolated CR or LF usage is prohibited.[61] For instance, a message composed with LF-only lines on a Unix system may result in broken formatting or unintended reflow when viewed on Windows-based email software.[61]
Cross-platform deployment exacerbates these problems; for example, shell scripts authored on Windows with CRLF endings often fail on Linux servers because the shebang line (e.g., #!/bin/bash) becomes #!/bin/bash\r, rendering the interpreter path invalid and preventing execution.[62] Similarly, although RFC 8259 lists CR and LF among the insignificant whitespace allowed between JSON tokens, a raw CR or LF inside a string value is an unescaped control character that a strict parser rejects, and line-oriented JSON tooling that expects LF or CRLF separation may misparse documents using CR-only line endings.[63]
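The shebang failure can be made visible by looking at the first line the way a loader roughly would: everything up to the first LF. This is only an approximation of kernel behaviour (which also splits off an optional argument), and the helper names are hypothetical.

```python
def shebang_interpreter(script: bytes):
    """Return the bytes after "#!" on the first LF-terminated line, or None."""
    first_line = script.split(b"\n", 1)[0]
    if not first_line.startswith(b"#!"):
        return None
    return first_line[2:].lstrip(b" ")


def has_stray_cr(script: bytes) -> bool:
    """True if the interpreter path ends in a CR left over from a CRLF ending."""
    interpreter = shebang_interpreter(script)
    return interpreter is not None and interpreter.endswith(b"\r")


windows_script = b"#!/bin/bash\r\necho hi\r\n"
unix_script = b"#!/bin/bash\necho hi\n"

print(shebang_interpreter(windows_script))  # the path carries a trailing \r
print(has_stray_cr(windows_script), has_stray_cr(unix_script))
```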
Post-2020 developments in containerization, particularly with Docker, have introduced practices enforcing LF endings in Linux-based images to mitigate portability issues, as CRLF files mounted from Windows hosts can cause runtime errors in scripts or configurations within the container environment. This standardization helps avoid inconsistencies but highlights ongoing challenges in hybrid development workflows.
Security risks also stem from unnormalized newline inputs; in web forms, failure to sanitize user-supplied data containing CRLF sequences can enable injection attacks, allowing attackers to append arbitrary HTTP headers and facilitate response splitting or cache poisoning.[64]
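A minimal sketch of the injection mechanism and one common mitigation follows. The function names naive_header and safe_header_value are invented for this example, and rejecting the input outright is only one possible policy; frameworks may instead strip or encode the offending characters.

```python
def naive_header(name: str, value: str) -> str:
    """Build a header line without validating the value (unsafe)."""
    return f"{name}: {value}\r\n"


def safe_header_value(value: str) -> str:
    """Reject values containing CR or LF so they cannot start a new header line."""
    if "\r" in value or "\n" in value:
        raise ValueError("header value must not contain CR or LF")
    return value


user_input = "/home\r\nSet-Cookie: session=evil"

# The unvalidated value smuggles a second header line into the output.
injected_lines = naive_header("Location", user_input).split("\r\n")
print(injected_lines)

try:
    naive_header("Location", safe_header_value(user_input))
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```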
RFC 2046, which defines MIME media types, recommends CRLF as the standard line break in text parts while acknowledging tolerance for legacy systems using other conventions, yet deviations persist and cause interoperability failures.[61] A notable example is Microsoft Excel's handling of CSV files, where LF-only endings from Unix sources are often mangled during import, resulting in data appearing in a single row or column misalignment due to improper line detection.[65]
Tools like Vim address detection challenges via options such as ++ff=dos when editing files, which forces interpretation as DOS (CRLF) format to prevent display artifacts from mismatched endings.[66] Additionally, regular expressions in programming languages may fail if the escape sequence \n (matching LF) is used on CRLF files without accounting for the preceding CR, leading to incomplete pattern matches or parsing errors across platforms. These issues can typically be resolved through format conversion methods that normalize endings to a consistent standard.
Specialized Variants
Reverse Line Feeds
Reverse line feeds, designated as the Reverse Line Feed (RI) control function in standards like ECMA-48 and also known as reverse index, enable the printing or cursor position to move upward by one line, countering the downward movement of a standard line feed. This capability facilitates overstriking or overprinting, where subsequent characters are printed over previous ones to simulate effects such as bolding (by reprinting the same text) or underlining (by printing underscore characters beneath the original line) in hardware without dedicated formatting features. The process typically involves a carriage return (CR, ASCII 13) to reposition to the line's start, followed by the RI control (C1 code 141, hex 8D, U+008D in Unicode, or ESC M in 7-bit environments) to shift upward, and then outputting the overstrike characters; in some implementations, combinations of backspace (BS, ASCII 8) with line feeds approximate this upward and leftward motion.[67]
In historical contexts, reverse line feeds were integral to dot-matrix printers prevalent from the 1970s through the 1990s, where escape sequences like Epson's ESC j n allowed partial reverse feeding (n/216 inch increments) to align for precise overprinting and emphasis without advanced graphics modes.[68] Early terminals also utilized them within ANSI escape sequences or direct control characters for text formatting in line-oriented interfaces, supporting applications like document preparation where visual enhancements were achieved through mechanical repetition rather than fonts. The Teletype Model 37 teleprinter, released in 1966, incorporated support for reverse line feed alongside half-forward and half-reverse feeds, enabling sophisticated output like charts and emphasized text on friction or sprocket-fed paper at 100 words per minute.[69] Modern terminal emulators, including xterm, process these operations via ECMA-48-compliant controls, preserving compatibility for legacy software that relies on RI for formatting.
Today, reverse line feeds see limited application primarily in retro computing recreations of vintage systems or emulations of period printers, where they recreate authentic overstrike behaviors. In digital text, similar effects are often emulated using Unicode combining characters, such as U+0332 COMBINING LOW LINE to simulate underlining over existing glyphs without positional reversal. For instance, in early Microsoft BASIC environments supporting extended ASCII, a carriage return could be issued via PRINT CHR$(13); to reposition without advancing, followed by backspaces CHR$(8) to enable overprinting on the current line, though true reverse line feed required the RI character CHR$(141) on 8-bit systems for upward movement.[70]
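The contrast between positional overstriking and Unicode combining characters can be sketched as two small string builders. Both function names are hypothetical; the first produces the teleprinter-style byte sequence (text, CR, underscores), the second attaches U+0332 to each character. How either renders depends on the output device or font.

```python
COMBINING_LOW_LINE = "\u0332"


def overstrike_underline(text: str) -> str:
    """Teleprinter style: print the text, return the carriage, print underscores."""
    return text + "\r" + "_" * len(text)


def combining_underline(text: str) -> str:
    """Unicode style: follow each character with COMBINING LOW LINE."""
    return "".join(ch + COMBINING_LOW_LINE for ch in text)


print(repr(overstrike_underline("ab")))
print(repr(combining_underline("ab")))
```

The combining form keeps the underline attached to the text when it is copied or reflowed, whereas the overstrike form only works on devices that actually print the second pass over the first.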
