| TeX | |
|---|---|
| Developer | Donald Knuth |
| Initial release | 1978 |
| Stable release | TeX Live 2025[1] / 8 March 2025 |
| Repository | |
| Written in | WEB/Pascal |
| Operating system | Cross-platform |
| Type | Typesetting |
| License | Permissive free software |
| Website | tug.org |
| TeX | |
|---|---|
| Filename extension | .tex |
| Internet media type | |
| Initial release | 1978 |
| Type of format | Document file format |
TeX (/tɛx/), stylized within the system as TeX, is a typesetting program which was designed and written by computer scientist and Stanford University professor Donald Knuth[2] and first released in 1978. The term now refers to the system of extensions – which includes software programs called TeX engines, sets of TeX macros, and packages which provide extra typesetting functionality – built around the original TeX language. TeX is a popular means of typesetting complex mathematical formulae; it has been noted as one of the most sophisticated digital typographical systems.[3]
TeX is widely used in academia, especially in mathematics, computer science, economics, political science, engineering, linguistics, physics, statistics, and quantitative psychology. It has long since displaced troff, the previously favored formatting system, in most Unix installations, although troff remains the default formatter for UNIX documentation. It is also used for many other typesetting tasks, especially in the form of LaTeX, ConTeXt, and other macro packages.
TeX was designed with two main goals in mind: to allow anybody to produce high-quality books with minimal effort, and to provide a system that would give exactly the same results on all computers, at any point in time. Together with the Metafont language for font description and the Computer Modern family of typefaces, it forms a complete typesetting system.[4] TeX is free software, which has made it accessible to a wide range of users.
History
When the first paper volume of Knuth's The Art of Computer Programming was published in 1968,[5] it was typeset using hot metal typesetting on a Monotype machine. This method, dating back to the 19th century, produced a "classic style" appreciated by Knuth.[6] When the second edition was published, in 1976, the whole book had to be typeset again because the Monotype technology had been largely replaced by phototypesetting, and the original fonts were no longer available. When Knuth received the galley proofs of the new book on 30 March 1977, he found them inferior. Disappointed, Knuth set out to design his own typesetting system.
Knuth saw for the first time the output of a high-quality digital typesetting system, and became interested in digital typography. On 13 May 1977, he wrote a memo to himself describing the basic features of TeX.[7] He planned to finish it on his sabbatical in 1978 (though ultimately the language's syntax was not frozen until 1989[citation needed]). Guy Steele happened to be at Stanford during the summer of 1978, when Knuth was developing his first version of TeX. When Steele returned to the Massachusetts Institute of Technology that autumn, he rewrote TeX's input/output (I/O) to run under the Incompatible Timesharing System (ITS) operating system. The first version of TeX, called TeX78, was written in the SAIL programming language to run on a PDP-10 under Stanford's WAITS operating system.[8]
For later versions of TeX, Knuth invented the concept of literate programming, a way of producing compilable source code and cross-linked documentation typeset in TeX from the same original file. The language used is called WEB and produces programs in DEC PDP-10 Pascal.
TeX82, a new version of TeX rewritten from scratch, was published in 1982. Among other changes, the original hyphenation algorithm was replaced by a new algorithm written by Frank Liang. TeX82 also uses fixed-point arithmetic instead of floating-point, to ensure reproducibility of the results across different computer hardware,[9] and includes a real, Turing-complete programming language, following intense lobbying by Guy Steele.[10] In 1989, Donald Knuth released new versions of TeX and Metafont.[11] Despite his desire to keep the program stable, Knuth realized that 128 different characters for the text input were not enough to accommodate foreign languages; the main change in version 3.0 of TeX is thus the ability to work with 8-bit inputs, allowing 256 different characters in the text input. TeX 3.0 was released on March 15, 1990.[12]
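The fixed-point representation TeX82 adopted is documented in The TeXbook: dimensions are stored as integer multiples of a "scaled point", with 65536 scaled points to one printer's point. A minimal Python sketch (illustrative only, not TeX's actual Pascal code) of the idea:

```python
# TeX stores all dimensions as integer "scaled points" (sp); 1 pt = 65536 sp,
# so the smallest representable length is 1/65536 pt. Every internal
# calculation is an integer operation, which is why a given document produces
# bit-identical output on any machine, unlike floating point, where rounding
# behavior could differ across hardware of the era.
SP_PER_PT = 65536

def pt_to_sp(points):
    """Convert a decimal point value to the nearest integer scaled point."""
    return round(points * SP_PER_PT)

print(pt_to_sp(10))    # a 10 pt font size, expressed in scaled points
print(pt_to_sp(0.5))
```

Once converted, all subsequent arithmetic (adding glue, measuring boxes) stays in integers, so no platform-dependent rounding can creep in.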
Since version 3, TeX has used an idiosyncratic version numbering system, where updates have been indicated by adding an extra digit at the end of the decimal, so that the version number asymptotically approaches π. This is a reflection of the fact that TeX is now very stable, and only minor updates are anticipated. The current version of TeX is 3.141592653; it was last updated in 2021.[13] The design was frozen after version 3.0, and no new feature or fundamental change will be added, so all newer versions will contain only bug fixes.[14] Even though Donald Knuth himself has suggested a few areas in which TeX could have been improved, he indicated that he firmly believes that having an unchanged system that will produce the same output now and in the future is more important than introducing new features. For this reason, he has stated that the "absolutely final change (to be made after my death)" will be to change the version number to π, at which point all remaining bugs will become features.[15] Likewise, versions of Metafont after 2.0 asymptotically approach e (currently at 2.7182818), and a similar change will be applied after Knuth's death.[14]
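The numbering scheme is simple enough to state as code: each update after version 3 appends the next decimal digit of π, so the sequence converges to π without ever reaching it. A small Python sketch:

```python
# Sketch of TeX's version numbering: the k-th update after version 3.0
# appends the k-th decimal digit of pi (3.1, 3.14, 3.141, ...). The final
# change of the version number to pi itself is reserved for after Knuth's
# death, at which point all remaining bugs become features.
PI_DIGITS = "3.14159265358979"

def tex_version(updates):
    """Version string after `updates` post-3.0 releases (updates >= 1)."""
    return PI_DIGITS[:2 + updates]

print(tex_version(9))  # the 2021 release described above
```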
Pronunciation and spelling
The name TeX is intended by its developer to be pronounced /tɛx/, with the final consonant of loch.[16] The letters of the name are meant to represent the capital Greek letters tau, epsilon, and chi, as TeX is an abbreviation of τέχνη (ΤΕΧΝΗ technē), Greek for both "art" and "craft", which is also the root word of technical. English speakers often pronounce it /tɛk/, like the first syllable of technical. Knuth instructs that it be typeset with the "E" below the baseline and reduced spacing between the letters. This is done, as Knuth mentions in his TeXbook, to distinguish TeX from other system names such as TEX, the Text EXecutive processor (developed by Honeywell Information Systems).[17]
Syntax
TeX commands commonly start with a backslash and are grouped with curly braces. Almost all of TeX's syntactic properties can be changed on the fly, which makes TeX input hard to parse by anything but TeX itself. TeX is a macro- and token-based language: many commands, including most user-defined ones, are expanded on the fly until only unexpandable tokens remain, which are then executed. Expansion itself is practically free from side effects. Tail recursion of macros takes no memory, and if-then-else constructs are available. This makes TeX a Turing-complete language even at the expansion level.[18]

The system can be divided into four levels. In the first, characters are read from the input file and assigned a category code (sometimes called "catcode", for short). Combinations of a backslash (actually, any character of category zero) followed by letters (characters of category 11) or a single other character are replaced by a control-sequence token. In this sense, this stage is like lexical analysis, although it does not form numbers from digits. In the next stage, expandable control sequences (such as conditionals or defined macros) are replaced by their replacement text. The input for the third stage is then a stream of characters (including the ones with special meaning) and unexpandable control sequences (typically assignments and visual commands). Here, the characters get assembled into a paragraph, and TeX's paragraph breaking algorithm works by optimizing breakpoints over the whole paragraph. The fourth stage breaks the vertical list of lines and other material into pages.
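The first of these stages can be illustrated with a toy lexer. The Python sketch below assumes the default category codes (backslash as the escape character, letters in category 11) and, unlike real TeX, cannot cope with catcodes being reassigned mid-document, which is exactly why TeX input cannot in general be parsed by anything but TeX itself:

```python
import re

# Toy model of TeX's first stage: a control sequence is a backslash followed
# by a run of letters, or a backslash followed by a single non-letter;
# everything else becomes a one-character token. This assumes the *default*
# catcodes; real TeX lets the document reassign any of them at any point.
TOKEN = re.compile(r"\\[A-Za-z]+|\\.|.", re.DOTALL)

def tokenize(source):
    """Split TeX source into control-sequence and character tokens."""
    return TOKEN.findall(source)

print(tokenize(r"\def\x{ab}"))
```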
The TeX system has precise knowledge of the sizes of all characters and symbols, and using this information, it computes the optimal arrangement of letters per line and lines per page. It then produces a DVI file ("DeVice Independent") containing the final locations of all characters. This DVI file can then be printed directly given an appropriate printer driver, or it can be converted to other formats. Nowadays, pdfTeX is often used, which bypasses DVI generation altogether.[19]

The base TeX system understands about 300 commands, called primitives.[20] These low-level commands are rarely used directly by users, and most functionality is provided by format files (predumped memory images of TeX after large macro collections have been loaded). Knuth's original default format, which adds about 600 commands, is Plain TeX.[21] The most widely used format is LaTeX, originally developed by Leslie Lamport, which incorporates document styles for books, letters, slides, etc., and adds support for referencing and automatic numbering of sections and equations. Another widely used format, AMS-TeX, is produced by the American Mathematical Society and provides many more user-friendly commands, which can be altered by journals to fit with their house style. Most of the features of AMS-TeX can be used in LaTeX by using the "AMS packages" (e.g., amsmath, amssymb) and the "AMS document classes" (e.g., amsart, amsbook). This is then referred to as AMS-LaTeX.[22] Other formats include ConTeXt, used primarily for desktop publishing and written mostly by Hans Hagen at Pragma.
Design
The TeX software incorporates several aspects that were not available, or were of lower quality, in other typesetting programs at the time when TeX was released. Some of the innovations are based on interesting algorithms and have led to several theses by Knuth's students. While some of these discoveries have now been incorporated into other typesetting programs, others, such as the rules for mathematical spacing, are still unique.
Mathematical spacing
Since the primary goal of the TeX language is high-quality typesetting for publishers of books, Knuth gave a lot of attention to the spacing rules for mathematical formulae.[23][24] He took three bodies of work that he considered to be standards of excellence for mathematical typography: the books typeset by the Addison-Wesley Publishing house (the publisher of The Art of Computer Programming) under the supervision of Hans Wolf; editions of the mathematical journal Acta Mathematica dating from around 1910; and a copy of Indagationes Mathematicae, a Dutch mathematics journal. Knuth examined these printed works closely to derive a set of rules for spacing.[25][26] While TeX provides some basic rules and the tools needed to specify proper spacing, the exact parameters depend on the font used to typeset the formula. For example, the spacing for Knuth's Computer Modern fonts has been precisely fine-tuned over the years and is now set; but when other fonts, such as AMS Euler, were used by Knuth for the first time, new spacing parameters had to be defined.[27]
The typesetting of math in TeX is not without criticism, particularly with respect to technical details of the font metrics, which were designed in an era when significant attention was paid to storage requirements. This resulted in some "hacks" overloading some fields, which in turn required other "hacks". On an aesthetic level, the rendering of radicals has also been criticized.[28] The OpenType math font specification largely borrows from TeX, but includes some new features and enhancements.[29][30][31]
Hyphenation and justification
In comparison with manual typesetting, the problem of justification is easy to solve with a digital system such as TeX, which, provided that good points for line breaking have been defined, can automatically spread the spaces between words to fill in the line. The problem is thus to find the set of breakpoints that will give the most visually pleasing result. Many line-breaking algorithms use a first-fit approach, where the breakpoints for each line are determined one after the other, and no breakpoint is changed after it has been chosen.[32] Such a system is not able to choose a breakpoint depending on the effect that it will have on the following lines. In comparison, the total-fit line-breaking algorithm used by TeX and developed by Donald Knuth and Michael Plass[33] considers all the possible breakpoints in a paragraph, and finds the combination of line breaks that will produce the most globally pleasing arrangement.
Formally, the algorithm defines a value called badness associated with each possible line break; the badness is increased if the spaces on the line must stretch or shrink too much to make the line the correct width. Penalties are added if a breakpoint is particularly undesirable: for example, if a word must be hyphenated, if two lines in a row are hyphenated, or if a very loose line is immediately followed by a very tight line. The algorithm will then find the breakpoints that will minimize the sum of squares of the badness (including penalties) of the resulting lines. If the paragraph contains n possible breakpoints, the number of situations that must be evaluated naively is 2^n. However, by using the method of dynamic programming, the complexity of the algorithm can be brought down to O(n^2) (see Big O notation). Further simplifications (for example, not testing extremely unlikely breakpoints such as a hyphenation in the first word of a paragraph, or very overfull lines) lead to an efficient algorithm whose running time is O(n·w), where w is the width of a line. A similar algorithm is used to determine the best way to break paragraphs across two pages, in order to avoid widows or orphans (lines that appear alone on a page while the rest of the paragraph is on the following or preceding page). However, a thesis by Michael Plass shows that the page-breaking problem can be NP-complete because of the added complication of placing figures.[34] TeX's line-breaking algorithm has been adopted by several other programs, such as Adobe InDesign (a desktop publishing application)[35] and the GNU fmt Unix command line utility.[36]
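The dynamic-programming idea behind total-fit breaking can be sketched compactly. The Python toy below measures lines in characters and uses a simple cubic cost in place of TeX's real badness, penalty, and stretch/shrink machinery; it illustrates the principle of optimizing over the whole paragraph, not TeX's actual algorithm:

```python
# Minimal sketch of total-fit line breaking in the spirit of Knuth-Plass.
# Widths are in characters; `line_width` is a hypothetical target. Real TeX
# adds hyphenation points, penalties, and glue that stretches and shrinks.

def badness(words, line_width):
    """Cost of setting `words` on one line; looser lines cost more."""
    length = sum(len(w) for w in words) + len(words) - 1  # letters + spaces
    if length > line_width:
        return float("inf")  # overfull lines are forbidden
    return (line_width - length) ** 3

def break_paragraph(words, line_width):
    """Dynamic programming over all breakpoints: O(n^2) instead of 2^n."""
    n = len(words)
    best = [0.0] + [float("inf")] * n  # best[i] = min cost of words[:i]
    choice = [0] * (n + 1)
    for i in range(1, n + 1):
        for j in range(i):
            cost = badness(words[j:i], line_width)
            if i == n and cost < float("inf"):
                cost = 0.0  # the last line may be short at no cost
            if best[j] + cost < best[i]:
                best[i] = best[j] + cost
                choice[i] = j
    # walk the choices backwards to recover the chosen lines
    lines, i = [], n
    while i > 0:
        lines.append(" ".join(words[choice[i]:i]))
        i = choice[i]
    return lines[::-1]

print(break_paragraph("the quick brown fox jumps over the lazy dog".split(), 15))
```

A first-fit breaker would commit to each line greedily; here a poor early break is rejected because its cost propagates into `best[i]` for every later position.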
If no suitable line break can be found for a line, the system will try to hyphenate a word. The original version of TeX used a hyphenation algorithm based on a set of rules for the removal of prefixes and suffixes of words, and for deciding if it should insert a break between the two consonants in a pattern of the form vowel–consonant–consonant–vowel (which is possible most of the time).[37] TeX82 introduced a new hyphenation algorithm, designed by Frank Liang in 1983, to assign priorities to breakpoints in letter groups. A list of hyphenation patterns is first generated automatically from a corpus of hyphenated words (a list of 50,000 words). If TeX must find the acceptable hyphenation positions in the word encyclopedia, for example, it will consider all the subwords of the extended word .encyclopedia., where . is a special marker to indicate the beginning or end of the word. The list of subwords includes all the subwords of length 1 (., e, n, c, y, etc.), of length 2 (.e, en, nc, etc.), etc., up to the subword of length 14, which is the word itself, including the markers. TeX will then look into its list of hyphenation patterns, and find subwords for which it has calculated the desirability of hyphenation at each position. In the case of our word, 11 such patterns can be matched, namely 1c4l4, 1cy, 1d4i3a, 4edi, e3dia, 2i1a, ope5d, 2p2ed, 3pedi, pedia4, y1c. For each position in the word, TeX will calculate the maximum value obtained among all matching patterns, yielding en1cy1c4l4o3p4e5d4i3a4. Finally, the acceptable positions are those indicated by an odd number, yielding the acceptable hyphenations en-cy-clo-pe-di-a. This system based on subwords allows the definition of very general patterns (such as 2i1a), with low indicative numbers (either odd or even), which can then be superseded by more specific patterns (such as 1d4i3a) if necessary. 
These patterns find about 90% of the hyphens in the original dictionary; more importantly, they do not insert any spurious hyphen. In addition, a list of exceptions (words for which the patterns do not predict the correct hyphenation) is included with the Plain TeX format; additional ones can be specified by the user.[38][page needed][39]
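The pattern-matching step described above is mechanical enough to reproduce. The Python sketch below uses only the 11 patterns quoted for encyclopedia (a real pattern file contains thousands and adds rules such as minimum hyphenation distances from word edges); it implements Liang's rule that the highest value wins at each inter-letter position and odd values permit a hyphen:

```python
import re

# Sketch of Frank Liang's pattern-based hyphenation (TeX82), restricted to
# the 11 patterns matched by ".encyclopedia." in the text above.
PATTERNS = ["1c4l4", "1cy", "1d4i3a", "4edi", "e3dia",
            "2i1a", "ope5d", "2p2ed", "3pedi", "pedia4", "y1c"]

def parse(pattern):
    """Split '1d4i3a' into letters 'dia' and position values (1, 4, 3, 0)."""
    chunks = re.split(r"([a-z.])", pattern)
    letters, values = "", []
    for i in range(0, len(chunks), 2):
        values.append(int(chunks[i]) if chunks[i] else 0)
        if i + 1 < len(chunks):
            letters += chunks[i + 1]
    return letters, values

def hyphenate(word):
    marked = "." + word + "."          # '.' marks the word boundaries
    scores = [0] * (len(marked) + 1)   # one score per inter-letter position
    for pat in PATTERNS:
        letters, values = parse(pat)
        start = marked.find(letters)
        while start != -1:             # apply at every occurrence
            for k, v in enumerate(values):
                scores[start + k] = max(scores[start + k], v)
            start = marked.find(letters, start + 1)
    out = ""                           # odd scores mark acceptable breaks
    for i, ch in enumerate(word):
        if i > 0 and scores[i + 1] % 2 == 1:
            out += "-"
        out += ch
    return out

print(hyphenate("encyclopedia"))  # -> en-cy-clo-pe-di-a
```

Taking the maximum value at each position is what lets a specific pattern such as 1d4i3a override a general one such as 2i1a.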
Metafont
Metafont, not strictly part of TeX, is a font description system which allows the designer to describe characters algorithmically. It uses Bézier curves in a fairly standard way to generate the actual characters to be displayed, but Knuth devotes substantial attention to the rasterizing problem on bitmapped displays. Another thesis, by John Hobby, further explores this problem of digitizing "brush trajectories". This term derives from the fact that Metafont describes characters as having been drawn by abstract brushes (and erasers). It is commonly believed that TeX is based on bitmap fonts but, in fact, these programs "know" nothing about the fonts that they are using other than their dimensions. It is the responsibility of the device driver to appropriately handle fonts of other types, including PostScript Type 1 and TrueType. Computer Modern (commonly known as "the TeX font") is freely available in Type 1 format, as are the AMS math fonts. Users of TeX systems that output directly to PDF, such as pdfTeX, XeTeX, or LuaTeX, generally never use Metafont output at all.
Macro language
TeX documents are written and programmed using an unusual macro language. Broadly speaking, the running of this macro language involves expansion and execution stages which do not interact directly. Expansion includes both literal expansion of macro definitions as well as conditional branching, and execution involves such tasks as setting variables/registers and the actual typesetting process of adding glyphs to boxes.
The definition of a macro not only includes a list of commands but also the syntax of the call. It differs from most widely used lexical preprocessors, such as M4, in that the body of a macro is tokenized at definition time.
The TeX macro language has been used to write larger document production systems, most notably including LaTeX and ConTeXt.
Operation
A sample Hello world program in plain TeX is:
Hello, World
\bye % marks the end of the file; not shown in the final output
This might be in a file myfile.tex, as .tex is a common file extension for plain TeX files. By default, everything that follows a percent sign on a line is a comment, ignored by TeX. Running TeX on this file (for example, by typing tex myfile.tex in a command-line interpreter, or by calling it from a graphical user interface) will create an output file called myfile.dvi, representing the content of the page in a device independent format (DVI). A DVI file could then be either viewed on screen or converted to a suitable format for any of the various printers for which a device driver existed (printer support was generally not an operating system feature at the time that TeX was created). Knuth has said that there is nothing inherent in TeX that requires DVI as the output format, and later versions of TeX, notably pdfTeX, XeTeX, and LuaTeX, all support output directly to PDF.
TeX provides a different text syntax specifically for mathematical formulas. For example, the quadratic formula (which is the solution of the quadratic equation) appears as:
| Source code | Renders as |
|---|---|
| The quadratic formula is $-b \pm \sqrt{b^2 - 4ac} \over 2a$ \bye | |
The formula is written much as a person would write it by hand or typeset it. In a document, mathematics mode is entered with a $ symbol, followed by a formula in TeX syntax, and closed with another $. Knuth explained in jest that he chose the dollar sign to indicate the beginning and end of mathematical mode in plain TeX because typesetting mathematics was traditionally supposed to be expensive.[40] Display mathematics (mathematics presented centred on a new line) is similar but uses $$ instead of a single $ symbol. For example, the above with the quadratic formula in display math:
| Source code | Renders as |
|---|---|
| The quadratic formula is $$-b \pm \sqrt{b^2 - 4ac} \over 2a$$ \bye | |
Uses
In several technical fields such as computer science, mathematics, engineering and physics, TeX has become a de facto standard. Many thousands of books have been published using TeX, including books published by Addison-Wesley, Cambridge University Press, Elsevier, Oxford University Press, and Springer. Numerous journals in these fields are produced using TeX or LaTeX, allowing authors to submit their raw manuscript written in TeX.[41] While many publications in other fields, including dictionaries and legal publications, have been produced using TeX, it has not been as successful as in the more technical fields, as TeX was primarily designed to typeset mathematics.
When he designed TeX, Donald Knuth did not believe that a single typesetting system would fit everyone's needs; instead, he designed many hooks inside the program so that it would be possible to write extensions, and released the source code, hoping that publishers would design versions tailored to their own needs. While such extensions have been created (including some by Knuth himself),[42] most people have extended TeX only using macros, and it has remained a system associated with technical typesetting.[43][44][26]
Chemistry notation
Packages such as mhchem, XyMTeX, chemfig, and chemmacros provide support for typesetting chemical equations, structures, and reactions in LaTeX.[45]
XML publication
It is possible to use TeX for automatic generation of sophisticated layout for XML data. The differences in syntax between the two description languages can be overcome with the help of TeXML. In the context of XML publication, TeX can thus be considered an alternative to XSL-FO. TeX allowed scientific papers in mathematical disciplines to be reduced to relatively small files that could be rendered client-side, allowing fully typeset scientific papers to be exchanged over the early Internet and emerging World Wide Web, even when sending large files was difficult. This paved the way for the creation of repositories of scientific papers such as arXiv, through which papers could be 'published' without an intermediary publisher.[46]
Development
The original source code for the current TeX software is written in WEB, a mixture of documentation written in TeX and a Pascal subset, in order to ensure readability and portability. For example, TeX does all of its dynamic allocation itself from fixed-size arrays and uses only fixed-point arithmetic for its internal calculations. As a result, TeX has been ported to almost all operating systems, usually by using the web2c program to convert the source code into C instead of directly compiling the Pascal code. Knuth has kept a very detailed log of all the bugs he has corrected and changes he has made in the program since 1982; as of 2021, the list contains 440 entries, not including the version modification that should be done after his death as the final change in TeX.[47][48] Knuth offers monetary awards to people who find and report a bug in TeX. The award per bug started at US$2.56 (one "hexadecimal dollar"[49]) and doubled every year until it was frozen at its current value of $327.68. Knuth has lost relatively little money as there have been very few bugs claimed. In addition, recipients have been known to frame their check as proof that they found a bug in TeX rather than cashing it.[50][51]
Due to scammers finding scanned copies of his checks on the internet and using them to try to drain his bank account, Knuth no longer sends out real checks, but those who submit bug reports can get credit at The Bank of San Serriffe instead.[52]
Distributions and extensions
TeX is usually provided in the form of an easy-to-install bundle of TeX itself along with Metafont and all the necessary fonts, document formats, and utilities needed to use the typesetting system. On UNIX-compatible systems, including Linux and Apple macOS, TeX is distributed as part of the larger TeX Live distribution. (Prior to TeX Live, the teTeX distribution was the de facto standard on UNIX-compatible systems.) On Microsoft Windows, there is the MiKTeX distribution (enhanced by proTeXt) and the Microsoft Windows version of TeX Live.
Several document processing systems are based on TeX, notably jadeTeX, which uses TeX as a backend for printing from James Clark's DSSSL Engine, the Arbortext publishing system, and Texinfo, the GNU documentation processing system. TeX has been the official typesetting package for the GNU operating system since 1984.
Numerous extensions and companion programs for TeX exist, among them BibTeX for bibliographies (distributed with LaTeX); pdfTeX, a TeX-compatible engine which can directly produce PDF output (as well as continuing to support the original DVI output); XeTeX, a TeX-compatible engine that supports Unicode and OpenType; and LuaTeX, a Unicode-aware extension to TeX that includes a Lua runtime with extensive hooks into the underlying TeX routines and algorithms. Most TeX extensions are available for free from CTAN, the Comprehensive TeX Archive Network.
License
Donald Knuth has indicated several times[53][54][55][26] that the source code of TeX has been placed into the "public domain", and he strongly encourages modifications or experimentations with this source code. However, since Knuth highly values the reproducibility of the output of all versions of TeX, any changed version must not be called TeX, or anything confusingly similar. To enforce this rule, any implementation of the system must pass a test suite called the TRIP test[56] before being allowed to be called TeX. The question of license is somewhat confused by the statements included at the beginning of the TeX source code,[57] which indicate that "all rights are reserved. Copying of this file is authorized only if ... you make absolutely no changes to your copy". This restriction should be interpreted as a prohibition on changing the source code as long as the file is called tex.web. The copyright note at the beginning of tex.web (and mf.web) was changed in 2021 to state this explicitly. This interpretation is confirmed later in the source code when the TRIP test is mentioned ("If this program is changed, the resulting system should not be called 'TeX'").[58] The American Mathematical Society tried in the early 1980s to claim a trademark for TeX. This was rejected because at the time "TEX" (all caps) was registered by Honeywell for the "Text EXecutive" text processing system.[citation needed]
Since the source code of TeX is essentially in the public domain, other programmers are allowed (and explicitly encouraged) to improve the system, but are required to use another name to distribute the modified TeX, meaning that the source code can still evolve. For example, the Omega project was developed after 1991, primarily to enhance TeX's multilingual typesetting abilities.[59] Knuth created "unofficial" modified versions, such as TeX-XeT, which allows a user to mix texts written in left-to-right and right-to-left writing systems in the same document.[42]
Community
Notable entities in the TeX community include the TeX Users Group (TUG) and the Deutschsprachige Anwendervereinigung TeX (DANTE), a large user group in Germany. TUG was founded in 1980 for educational and scientific purposes; it provides an organization for those who have an interest in typography and font design and are users of the TeX typesetting system invented by Knuth, and it represents the interests of TeX users worldwide. TUG publishes the journal TUGboat three times per year[60] and formerly published The PracTeX Journal, both covering a wide range of topics in digital typography relevant to TeX; DANTE publishes Die TeXnische Komödie four times per year. Other user groups include DK-TUG in Denmark, GUTenberg in France, GuIT in Italy, NTG in the Netherlands and UK-TUG in the United Kingdom; the user groups jointly maintain a complete list.[61]
Editors
There are a variety of editors designed to work with TeX:
- The TeXmacs text editor is a WYSIWYG-WYSIWYM scientific text editor, inspired by both TeX and Emacs. It uses Knuth's fonts and can generate TeX output.
- Overleaf is a partial-WYSIWYG, online editor that provides a cloud-based solution to TeX along with additional features in real-time collaborative editing.
- LyX is a WYSIWYM document processor which runs on a variety of platforms, including Linux, Microsoft Windows (newer versions require Windows 2000 or later), and Apple macOS (using a non-native Qt front-end).
- TeXShop (for macOS), TeXworks (for Linux, macOS and Windows) and WinShell (for Windows) are similar tools and provide an integrated development environment (IDE) for working with LaTeX or TeX. For KDE/Qt, Kile provides such an IDE.
- Texmaker is the pure Qt equivalent of Kile, with a user interface that is nearly the same as Kile's.
- TeXstudio is an open-source fork (2009) of Texmaker that offers a different approach to configurability and features. Free downloadable binaries are provided for Windows, Linux, macOS, OS/2, and FreeBSD.
- GNU Emacs has various built-in and third-party packages with support for TeX, the major one being AUCTeX.
- Visual Studio Code, for which a notable extension is LaTeX Workshop.
- For Vim, possible plugins include Vim-LaTeX Suite,[62] Automatic TeX,[63] and TeX-9.[64]
- For Apache OpenOffice and LibreOffice, iMath and TexMaths extensions can provide mathematical TeX typesetting.[65][66]
- For MediaWiki, the Math extension provides mathematical TeX typesetting, but the code needs to be surrounded by a <math> tag.
See also
- List of TeX typesetting software and tools
- Comparison of document markup languages
- Formula editor – Computer program used to typeset mathematical works or formulae
- List of document markup languages
- MathJax – Cross-browser JavaScript library that displays mathematical equations in web browsers
- MathTime – Typeface for TeX
- New Typesetting System
- PGF/TikZ – Graphics languages
- PSTricks
- xdvi – Open Source Computer Program
Notes
- ^ "TeX Live - TeX Users Group". tug.org. Retrieved 8 August 2025.
- ^ "Per Bothner (attendee at TeX Project meetings) discusses authorship".
Knuth definitely wrote most of the code himself, at least for the Metafont re-write, for which I have pe[r]sonal knowledge. However, some of his students (such as Michael Plass and John Hobby) did work on the algorithms used in TeX and Metafont.
- ^ Yannis Haralambous. Fonts & Encodings (Translated by P. Scott Horne). Beijing; Sebastopol, Calif: O'Reilly Media, 2007, pp. 235.
- ^ Gaudeul, Alexia (2006). "Do Open Source Developers Respond to Competition?: The (La)TeX Case Study". SSRN Electronic Journal. doi:10.2139/ssrn.908946. ISSN 1556-5068.
- ^ Knuth, Donald E. "Less brief biography". Don Knuth's Home Page. Archived from the original on 5 December 2016. Retrieved 9 January 2017.
- ^ Knuth, Donald E. "Commemorative lecture of the Kyoto Prize, 1996" (PDF). Kyoto Prize. Archived from the original (PDF) on 27 January 2018. Retrieved 18 August 2018.
- ^ Knuth, Donald Ervin, TEXDR.AFT, archived from the original on 12 January 2015
- ^ This article is based on material taken from TeX at the Free On-line Dictionary of Computing prior to 1 November 2008 and incorporated under the "relicensing" terms of the GFDL, version 1.3 or later.
- ^ Knuth & Plass 1981, p. 144.
- ^ Knuth, Donald E. Knuth meets NTG members, NTG: MAPS. 16 (1996), 38–49. Reprinted as Questions and Answers, III, chapter 33 of Digital Typography, p. 648.
- ^ Knuth, Donald E. The New Versions of TeX and METAFONT, TUGboat 10 (1989), 325–328; 11 (1990), 12. Reprinted as chapter 29 of Digital Typography.
- ^ Hoenig, Alan (1998). TeX Unbound: LaTeX & TeX Strategies for Fonts, Graphics, & More. Oxford University Press. ISBN 978-0-19-509686-6.
- ^ "TeX 21 release". Retrieved 5 January 2022.
- ^ a b "What is the future of TeX?". The TeX FAQ. 27 May 2018. Archived from the original on 28 April 2019. Retrieved 21 July 2019.
- ^ Knuth, Donald E. The future of TeX and METAFONT, NTG journal MAPS (1990), 489. Reprinted as chapter 30 of Digital Typography, p. 571.
- ^ Knuth, Donald E. The TeXbook, Ch. 1: The Name of the Game, p. 1.
- ^ Knuth, Donald E. The TeX Logo in Various Fonts, TUGboat 7 (1986), 101. Reprinted as chapter 6 of Digital Typography.
- ^ Jeffrey, Alan (1990), "Lists in TeX's Mouth" (PDF), TUGboat, 11 (2): 237–45
- ^ "CTAN: Package pdftex". ctan.org. Retrieved 21 July 2019.
- ^ Knuth 1984, p. 9.
- ^ Plain TeX (source code), CTAN, archived from the original on 9 May 2021
- ^ "What are the AMS packages (amsmath, etc.)?". The TeX FAQ. 27 May 2018. Archived from the original on 28 April 2019. Retrieved 21 July 2019.
- ^ Slater, Robert (1989), Portraits in Silicon, MIT Press, p. 349, ISBN 9780262691314
- ^ Syropoulos, Apostolos; Tsolomitis, Antonis; Sofroniou, Nick (2003), Digital Typography Using LaTeX, Springer, p. 93, ISBN 9780387952178
- ^ Knuth, Donald E (1996). "Questions and Answers II". TUGboat. 17: 355–367.
- ^ a b c Knuth, Donald E. (1999). Digital Typography. Stanford, Calif: Center for the Study of Language and Information Publications. ISBN 978-1-57586-010-7.
- ^ Knuth, Donald E. Typesetting Concrete Mathematics, TUGboat 10 (1989), pp. 31–36, 342. Reprinted as chapter 18 of Digital Typography, pp. 367–378.
- ^ Vieth, Ulrik. "Math typesetting in TEX: The good, the bad, the ugly" (PDF). Archived from the original (PDF) on 20 January 2022.
- ^ "High-Quality Editing and Display of Mathematical Text in Office 2007".
- ^ "LineServices".
- ^ "Map" (PDF). ntg.nl.
- ^ Barnett, Michael P (1965), Computer Typesetting: Experiments and Prospects, Cambridge, MA: MIT Press
- ^ http://svn.tug.org/interviews/plass.html
- ^ Knuth & Plass 1981.
- ^ "Donald E. Knuth", TUGboat (interview), 21, Advogato: 103–10, 2000, archived from the original on 22 January 2009, retrieved 26 December 2005
- ^ "4.1 fmt: Reformat paragraph text", Core GNU utilities (GNU coreutils) manual, GNU Project, 2016
- ^ Liang 1983, p. 3.
- ^ Liang 1983.
- ^ "Appendix H: Hyphenation", The TeXbook, pp. 449–55
- ^ Knuth 1984, p. 127, Ch. 16: Typing Math Formulas.
- ^ Beebe 2004, p. 10.
- ^ a b Knuth, Donald E; MacKay, Pierre (1987), "Mixing Right-to-Left Texts with Left-to-Right Texts" (PDF), TUGboat, 8: 14–25.
- ^ Knuth, Donald E (1996), "Questions and Answers I", TUGboat, 17: 7–22.
- ^ Knuth, Donald E (1996), "Questions and Answers II", TUGboat, 17: 355–367.
- ^ "Helpful LaTeX Packages for Chemistry | Computational Chemistry Resources".
- ^ O'Connell, Heath (2002). "Physicists Thriving with Paperless Publishing". High Energy Physics Libraries Webzine. 6: 3. arXiv:physics/0007040.
- ^ Knuth, Donald E. List of updates to the TeX82 listing published in September 1982, available on CTAN.
- ^ Knuth, Donald E. Appendix to the Errors of TeX paper, available on CTAN, last modified in January 2003.
- ^ Knuth, Donald E. "Knuth: Frequently Asked Questions". Stanford Computer Science. Archived from the original on 6 March 2008. Retrieved 28 November 2019.
- ^ Kara Platoni (May–June 2006). "Love at First Byte". Stanford Magazine. Archived from the original on 4 June 2006.
- ^ "History of TeX". TeX Users Group. Retrieved 28 November 2019.
- ^ Knuth, Donald E (2008). "Knuth: Recent News – Financial Fiasco". Stanford Computer Science. Archived from the original on 29 November 2019. Retrieved 29 November 2019.
- ^ Knuth, Donald E. (1986). "Remarks to Celebrate the Publication of Computers & Typesetting" (PDF). TUGboat. 7 (2): 95–98. Retrieved 6 December 2024.
- ^ Knuth, Donald E. (1990). "The Future of TeX and METAFONT" (PDF). TUGboat. 11 (4): 489. Retrieved 6 December 2024.
- ^ Knuth, Donald Ervin (1 November 1997). "Digital Typography: 1996 Kyoto Prize Lecture" (PDF). Kyoto Prize. The Inamori Foundation. Retrieved 6 December 2024.
- ^ "Trip", CTAN (source code), archived from the original (TeX) on 12 June 2021
- ^ Knuth, Donald E (1986), TeX: The Program, Computers and Typesetting, vol. B, Reading, MA: Addison-Wesley, ISBN 0-201-13437-3
- ^ Open Source: Technology and Policy by Fadi P. Deek, James A. M. McHugh "Public domain", page 227 (2008)
- ^ "TeX Engine development". The TeX FAQ. 24 May 2018. Archived from the original on 28 April 2019. Retrieved 21 July 2019.
- ^ "The Communications of the TeX Users Group". tug.org. TeX Users Group. Retrieved 15 March 2019.
- ^ "All TeX User Groups". tug.org. TeX Users Group. Retrieved 17 November 2019.
- ^ Vim-LaTex, SourceForge, archived from the original on 8 March 2024
- ^ Automatic TeX plugin, Launchpad
- ^ TeX-9, Vim.org
- ^ TexMaths Homepage, free.fr
- ^ iMath, SourceForge
References
- Beebe, Nelson HF (2004), "25 Years of TeX and METAFONT: Looking Back and Looking Forward" (PDF), TUGboat, 25: 7–30.
- Knuth, Donald Ervin (1984), The TeXbook, Computers and Typesetting, vol. A, Reading, MA: Addison-Wesley, ISBN 0-201-13448-9. The source code of the book in TeX (and a needed set of macros) is available online on CTAN. It is provided only as an example and its use to prepare a book like The TeXbook is not allowed.
- ——— (1986), TeX: The Program, Computers and Typesetting, vol. B, Reading, MA: Addison-Wesley, ISBN 0-201-13437-3. The full source code of TeX; also available on CTAN. Being written using literate programming, it contains plenty of human-readable documentation.
- ——— (1999), Digital Typography, Lecture notes, Center for the Study of Language and Information, ISBN 1-57586-010-4.
- ———; Plass, Michael F (1981). "Breaking Paragraphs into Lines". Software: Practice and Experience. 11 (11): 1119–84. doi:10.1002/spe.4380111102. S2CID 206508107.
- ———, TeX (source code), archived from the original (WEB) on 27 September 2011 contains extensive documentation about the algorithms used in TeX.
- Lamport, Leslie (1994), LaTeX: A Document Preparation System (2nd ed.), Reading, MA: Addison-Wesley, ISBN 0-201-52983-1.
- Liang, Franklin Mark (August 1983), Word Hy-phen-a-tion by Com-put-er (PhD thesis), Department of Computer Science, Stanford University.
- Salomon, David (1995), The Advanced TeXbook, Springer, Bibcode:1995adte.book.....S, ISBN 0-387-94556-3.
- Spivak, MD (1990), The Joy of TeX (reference) (2nd ed.), American Mathematical Society, ISBN 0-8218-2997-1 on AMS-TeX.
- Vulis, Michael (1992), Modern TeX and Its Applications, CRC Press, ISBN 0-8493-4431-X.
External links
- TeX Users Group
- TeX (questions and answers), StackExchange.
- Eijkhout, Victor. TeX by Topic Archived 25 February 2021 at the Wayback Machine
- TeX for the Impatient
- Donald Knuth discusses developing the software for TeX at Xerox PARC, 21 February 1980: https://archive.org/details/xerox-parc-tapes-v49
Formulas are entered as plain-text markup; the input $a^2 + b^2 = c^2$, for example, produces the corresponding typeset equation. TeX was distributed under a permissive free license from its inception, fostering widespread adoption and community extensions; notably, in the early 1980s, Leslie Lamport built LaTeX as a set of macros atop TeX to simplify document preparation for non-experts, making it the dominant way of using TeX in fields like mathematics, physics, and computer science.[5][6][7]
TeX's enduring influence stems from its portability across computing platforms, its rigorous documentation in Knuth's The TeXbook (1984), and its role as the foundation for modern tools like pdfTeX and XeTeX, which extend it to produce PDF output and support Unicode. It remains the preferred choice for authoritative publications by organizations such as the American Mathematical Society, powering millions of academic papers annually while inspiring advances in digital typography.[2][8][1]
History
Origins and Creation
Donald Knuth's motivation for creating TeX stemmed from his dissatisfaction with the typesetting quality of the galley proofs he received on 30 March 1977 for the second edition of The Art of Computer Programming, Volume 2: the proofs had been produced with phototypesetting equipment and fell well short of the hot-metal quality of the first edition.[4] This experience, combined with his first encounter earlier that year, on 1 February 1977, with the output of a high-resolution digital typesetting machine, convinced Knuth that a computer-based system could achieve superior precision and automation in document production, especially for complex mathematical content.[9] In response, Knuth initiated development of TeX in 1977, initially implementing it in the SAIL (Stanford Artificial Intelligence Language) programming language to run on a PDP-10 computer under Stanford's WAITS operating system.[10] The first version, known retrospectively as TeX78, was completed in 1978 and marked the system's debut capability for handling sophisticated layout tasks.[11] Knuth's primary aim was to design a tool that automated high-quality typesetting while granting users fine-grained control over formatting through programmable macros, thereby addressing the limitations of existing systems in rendering mathematical expressions accurately and aesthetically.[12] Recognizing the need for broader accessibility, Knuth shifted the implementation to a portable subset of Pascal in the early 1980s, culminating in the release of TeX82 in 1982.[11] This version emphasized stability and cross-platform compatibility, solidifying TeX's foundation as a reliable system for producing professional-grade documents with rigorous mathematical typography.[10]
Key Milestones and Stabilization
In 1982, Donald Knuth released TeX82, a major revision of the TeX typesetting system that incorporated the WEB literate programming system, allowing the source code to be intertwined with documentation for improved readability and maintenance.[13] This version marked a significant refinement, passing extensive testing before public distribution and establishing TeX as a robust tool for high-quality typesetting.[13] Parallel to TeX's evolution, Knuth introduced Metafont between 1977 and 1979 as a programmable font description language designed for creating custom typefaces parametrically, which was bundled with TeX to enable precise control over character shapes and ensure typographic consistency.[4] In the 1980s, Knuth utilized Metafont to develop the Computer Modern font family, a comprehensive set of scalable fonts optimized for mathematical and scientific documents, serving as the default typeface for TeX outputs.[7] By 1989, Knuth decided to freeze TeX's core development at version 3.0, declaring it a finished product where only bug fixes would be applied to maintain stability, with subsequent updates reflected solely in the version number approximating π (e.g., 3.14159...).[14] This stabilization ensured TeX's long-term reliability as a foundational system for digital typography.[15] TeX's influence extended to derivative projects in the 1990s, such as TeX Live, a cross-platform distribution originating in 1996 under the collaboration of TeX user groups and led initially by Sebastian Rahtz, which simplified installation and management of TeX components without Knuth's direct involvement.[16] Similarly, pdfTeX emerged in 1997 as an extension enabling direct PDF output from TeX sources, developed by Hàn Thế Thành independently of Knuth to address evolving document format needs.[17] A key outcome of TeX's refinement was its role in inspiring LaTeX, developed by Leslie Lamport in the 1980s as a macro package atop TeX to facilitate structured document authoring, 
particularly for complex technical writing, thereby broadening TeX's accessibility.[18]
Fundamentals
Pronunciation and Spelling
TeX derives its name from the Greek word τέχνη (technē), meaning "art" or "craft"; the letters of the logo represent the Greek capitals tau (Τ), epsilon (Ε), and chi (Χ), so the final letter is a chi rather than a Latin X.[19] The name was selected by Donald E. Knuth in 1978 during the initial development of the system.[1] Knuth specified the standard pronunciation as /tɛx/, with the final consonant like the "ch" in the Scottish "loch" or the German "ach", reflecting the Greek chi.[19] Common mispronunciations include /tiːɛks/, as if spelling out the letters, or /tɛks/, resembling the start of "Texas", though Knuth explicitly favored /tɛx/ to align with the etymology.[1] The spelling is conventionally rendered as TeX, with the "E" lowered in the typeset logo, and it is always capitalized this way in official documentation; variants like pdfTeX or XeTeX retain the TeX core while adding prefixes.[19]
Basic Syntax
TeX employs a markup syntax based on plain text input, where ordinary characters form the content and special commands, known as control sequences, direct the typesetting process. Control sequences begin with a backslash (\) followed by letters or a single non-letter character, such as \def for defining macros or \font for selecting fonts. These primitives form the core of TeX's command set, enabling users to customize output without altering the underlying program. For instance, \font\tenrm=cmr10 selects the Computer Modern Roman font at 10-point size for subsequent text. TeX processes input in distinct modes that govern how material is arranged: vertical mode, which handles paragraph breaks and page composition; horizontal mode, for assembling lines of text within paragraphs; and math mode, dedicated to rendering mathematical expressions. Transitions between modes occur automatically or via delimiters; entering inline math mode uses a single $ character, while display math mode employs $$. Exiting math mode reverses these delimiters. This modal structure ensures precise control over layout, with TeX switching to horizontal mode upon encountering printable characters in vertical mode and to math mode only when explicitly invoked. Characters in TeX input are categorized by category codes (catcodes), integer values from 0 to 15 that dictate interpretation during tokenization. Common defaults include 11 for letters (which may form control sequence names), 12 for digits and other symbols (treated as literal), and 10 for spaces (runs of spaces collapse to one, and spaces after control words are ignored). The primitive \catcode allows reassigning these, such as \catcode`#=6 to designate # as a parameter token in macro definitions, enabling flexible parsing of input streams.
Other basic categories encompass 0 for \ (escape character, introducing control sequences), 1 for { (begin-group), 2 for } (end-group), 3 for $ (math shift), 4 for & (alignment tab), 5 for the end-of-line character, 6 for # (macro parameter), 7 for ^ (superscript), 8 for _ (subscript), 9 for ignored characters, 13 for active characters such as ~ (which behave like macros), 14 for % (comment), and 15 for invalid characters. Spaces, digits, and letters maintain their roles in forming tokens, with catcode changes typically confined to local scopes via grouping. A minimal TeX document follows a straightforward structure: an optional preamble sets global parameters like \hsize for line width, followed by the main body of text and commands in horizontal or vertical mode, concluding with the primitive \bye to signal end of input and trigger final output generation. This skeleton avoids complex formatting packages, relying on plain TeX's defaults for simplicity:
\hsize=6in
\font\rm=cmr10 \rm
Hello, world!
\bye
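Category codes can also be reassigned at run time. The following plain TeX fragment (an illustrative sketch; the choice of @ is arbitrary) makes a character active so that it behaves like a macro:

```tex
\catcode`\@=13     % give @ category code 13 (active): it now acts as a command
\def@{ at }        % define the active @ to expand to the word "at"
Write user@example to typeset "user at example".
\bye
```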
Design Principles
Macro Language
TeX's macro language serves as a programmable layer atop the core typesetting engine, enabling users to define custom commands for document customization and automation. Macros are primarily defined using the \def primitive, which specifies a control sequence name, an optional parameter template, and replacement text. For parameterized macros, the syntax \def\cmd#1{replacement} allows the macro \cmd to accept one argument substituted as #1 in the replacement text; multiple parameters use #1, #2, and so on, up to #9. Aliases can be created with \let, such as \let\oldcmd\newcmd, which binds \oldcmd directly to the current meaning of \newcmd without expansion. These mechanisms, introduced by Donald Knuth, form the foundation for extending TeX's functionality beyond plain text processing.[20][21]
The macro system operates through distinct phases of expansion and execution, where control sequences are first replaced by their definitions before further processing or typesetting occurs. The \edef primitive defines a macro by fully expanding its replacement text at definition time and storing the result locally, useful for creating pre-expanded content such as strings. In contrast, \xdef performs the same expansion but stores the definition globally, affecting all scopes. This expansion model allows macros to generate dynamic content, such as inserting parameters or nested macro calls, while preventing premature execution of typesetting commands during definition. Conditionals enhance programmability: \if compares character codes, \ifnum compares integers, and \ifdim compares dimensions, enabling branching logic within macros. Loops are supported by the plain TeX construct \loop ⟨body⟩ \if⟨condition⟩ \repeat, which repeats the body for as long as the conditional before \repeat remains true.[20][21][22]
TeX's macro language is Turing complete, meaning it can simulate any computable function through combinations of conditionals, loops, and register manipulations, such as using counters or dimensions as variables for arithmetic and control flow. For example, integer registers can store values manipulated via \advance and tested with \ifnum, allowing emulation of a universal Turing machine. A simple illustrative macro is \def\hello{Hello, world!\par}, which upon invocation \hello outputs the greeting followed by a new paragraph. The language has its own conventions for control flow and scoping: recursion is permitted and is in fact the standard idiom for iteration (a macro re-invokes itself, usually in tail position, until a conditional terminates the expansion), but a macro that unconditionally expands to itself loops forever; assignments default to local scope but can be made global with the \global prefix, as in \global\def (abbreviated \gdef) or \global\let, to persist changes across groups. These features make TeX's macros suitable for complex document logic while maintaining predictable runtime behavior.[20][23][21]
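As an illustration of these mechanisms, the following plain TeX fragment (a minimal sketch; the macro names are invented for the example) combines a parameterized \def, a pre-expanded \edef, and a counting loop over an integer register:

```tex
\newcount\n                    % allocate an integer register
\def\greet#1{Hello, #1!\par}   % macro taking one parameter
\edef\snapshot{\the\n}         % \the\n expands now, freezing the current value

\n=1
\loop
  Item \the\n.\par             % typeset the current counter value
  \advance\n by 1
\ifnum\n<4 \repeat             % repeat the body while n < 4

\greet{world}
\bye
```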
Mathematical Typography
TeX distinguishes between inline and display modes for typesetting mathematical expressions to accommodate different contextual needs. Inline math mode, entered by enclosing a formula in single dollar signs ($ ... $), integrates formulas seamlessly into paragraphs, using compact sizing and no additional vertical separation to maintain text flow. Display math mode, invoked with double dollar signs ($$ ... $$), centers the expression on a dedicated line with enlarged elements like limits and added vertical space above and below for emphasis and clarity. These modes ensure mathematical content aligns with surrounding text while preserving legibility, as implemented in TeX's core engine.[24] For nested expressions, TeX employs \left and \right commands to automatically scale delimiters like parentheses or brackets around subformulas, adapting their size to the enclosed content in either mode; for instance, \left( {a \over b} \right) produces a properly proportioned enclosure without manual adjustment. This feature supports complex hierarchical structures common in mathematics.[24] Spacing rules in TeX's mathematical typography mimic traditional printing conventions, automatically inserting space according to symbol classes (thick space around relations such as = and <, medium space around binary operators such as + and −), while providing manual controls for precision. The command \, inserts a thin space (3/18 of a quad), \> a medium space (4/18 quad), and \; a thick space (5/18 quad), allowing users to fine-tune intervals around operators or punctuation for optimal readability.
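A short plain TeX fragment (an illustrative sketch) shows the two modes, growable delimiters, and manual spacing in use:

```tex
The identity $e^{i\pi} + 1 = 0$ sits inline with the text, while
$$
\left( {a \over b} \right)^2 \le \sum_{n=1}^{\infty} {1 \over n^2}
$$
is displayed and centered. A thin space sets off the differential in
$$\int_0^1 f(x)\,dx.$$
\bye
```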
These conventions, rooted in established typographic practices, prevent overcrowding and enhance expression clarity without requiring explicit user intervention in most cases.[25][24] Font selection in math mode defaults to math italic for variables and symbols, promoting distinction from text, with switches like \rm for upright roman type in operator names (for example, the upright "sin" of the sine function, as opposed to italic letters that would read as a product of variables) and \it for italic variants; bold styles can be achieved via \bf within math. The Computer Modern font family exemplifies TeX's integrated support, offering dedicated math fonts with slanted italics, upright symbols, and varying weights to suit these selections across modes.[24] Superscripts and subscripts are positioned via ^ and _, respectively, attaching to the base symbol with offsets scaled by style (smaller in inline mode), while \limits explicitly places limits below and above operators like sums or integrals, as in \sum\limits_{n=1}^{\infty} a_n, for prominent visibility in display settings. This stacking prioritizes hierarchical clarity, adjusting heights and depths to avoid overlap in stacked elements.[24] Kerning adjustments in mathematical typesetting address subtle overlaps, such as tightening the space between an integral sign and the expression it governs, or fitting superscripts against slanted letters, following rules derived from manual composition to achieve balanced visual density. These pairwise corrections, applied during box assembly, ensure operators integrate fluidly with adjacent atoms.[25][24] TeX's math typesetting algorithm processes input as a sequence of atoms (classified as ordinary, operator, relation, etc.) and nuclei, building a math list that incorporates spacing, kerning, and positioning rules before converting it to horizontal boxes.
This approach, emphasizing aesthetic judgment over rigid metrics (such as variable limit placement based on operator width), stems from Knuth's design goal of replicating high-quality printed mathematics, handling ambiguities through context-sensitive heuristics for superior readability.[24]
Hyphenation and Justification
TeX supports flexible paragraph justification, allowing users to choose between ragged-right alignment and full justification. In ragged-right mode, the right margin is uneven, with interword spaces fixed at their natural width and no stretching applied; this is typically achieved by giving \rightskip infinite stretchability (for example, 0pt plus 1fil) so that leftover space accumulates at the right margin. This avoids excessive variation in word spacing but produces a less formal appearance. Full justification, the default for most documents, evenly distributes extra space across the line by stretching or shrinking the interword glue, whose stretch and shrink components can also be supplied explicitly with commands like \hskip.[20] To evaluate line quality during justification, TeX computes a badness measure for each potential line, quantifying how far the glue must be stretched or shrunk relative to the stretch or shrink available. High badness indicates overfull or underfull lines with uneven word spacing. Break points between lines incur penalties, controlled by the \penalty primitive, which assigns a numerical cost to discourage or forbid breaks at specific locations; a penalty of 10000 or more forbids a break entirely, a penalty of −10000 forces one, and intermediate values add a finite cost to the overall optimization.[26][20] Hyphenation in TeX employs the Knuth–Liang algorithm, a rule-based method that identifies permissible break points within words using a compact trie of language-specific patterns rather than a full dictionary lookup. These patterns, short strings encoding syllable boundaries and compatibility codes, are precomputed from hyphenated word lists to achieve high accuracy with minimal storage; for English in the original TeX82 implementation, approximately 4,500 patterns suffice to find about 89% of the hyphen points in a standard dictionary. Patterns for a given language are loaded via the \patterns primitive during initialization, enabling efficient matching against word fragments during processing.
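The justification controls described above can be set directly in plain TeX; the following fragment is an illustrative sketch:

```tex
\tolerance=500              % accept lines up to badness 500 when justifying
\rightskip=0pt plus 1fil    % infinite stretch at the right margin: ragged right
This paragraph is set ragged right.\par
\rightskip=0pt              % restore full justification
In justified text a break can be encouraged here\penalty-100\
by a negative penalty at that point.\par
\bye
```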
Users can override or supplement these with manual exceptions using the \hyphenation command, which adds specific words with marked break points, such as \hyphenation{man-u-script}.[27][20] Paragraph breaking integrates justification and hyphenation through a dynamic programming approach, exploring all feasible ways to divide the text into lines while considering glue adjustments, hyphenation opportunities, and penalties. TeX maintains a table of partial solutions, computing the minimal total demerits for prefixes of the paragraph by selecting break points that balance local line quality against global aesthetics. The demerits of each line combine its badness b with any break penalty p; for an ordinary break with 0 ≤ p < 10000 they are computed as (l + b)^2 + p^2, where l is the value of \linepenalty, and badness itself grows roughly as 100 times the cube of the ratio of glue adjustment used to glue adjustment available. This cubic scaling penalizes extreme spacing imbalances more severely, ensuring the final paragraph minimizes cumulative demerits while respecting constraints such as the maximum tolerated badness (\tolerance).[26]
Metafont Font System
Metafont is a programming language and software system developed by Donald Knuth for creating scalable bitmap fonts, serving as a companion to TeX since its first version in 1979.[4] It allows users to define fonts parametrically through source files with the .mf extension, which are compiled into font metric files (.tfm) and generic font files (.gf); the .gf files are then typically converted to packed bitmap files (.pk) using the gftopk utility for efficient storage and use by TeX.[28] This process enables the generation of high-quality raster fonts tailored to specific device resolutions, emphasizing mathematical precision in character shapes. The core of Metafont's parametric design lies in its use of paths, pens, and equations to simulate the drawing process, where characters are constructed by moving an abstract pen along defined trajectories rather than by directly specifying outlines. Paths combine straight segments and smooth splines; an expression such as z1..z2..z3 joins points with curves whose curvature can be adjusted through tension specifications for natural shapes. Pens are defined as shapes, often elliptical for serifs and stems, that are transformed and stroked along paths, while equations between coordinates provide constraints for consistency across glyphs, such as alignment or proportionality. This approach permits fine-tuned variations by adjusting numeric parameters, allowing a single .mf program to produce families of related fonts.[29]
A prominent example is the Computer Modern font family, which comprises over 75 fonts generated from Metafont sources using parameters like slant (for italic angles), x-height (for lowercase proportions), and extension factors for spacing and boldness. These parameters enable the creation of roman, italic, bold, and sans-serif variants across multiple sizes and weights, all derived from a unified set of glyph definitions in files like roman.mf.
Metafont operates in various device modes to adapt output to different rendering contexts: proof mode produces large, annotated character images for visual inspection during font development, while production modes are tuned to the resolution and ink-spread characteristics of particular printers. Metafont programs can also introduce controlled randomness into point positions (via primitives such as uniformdeviate), giving characters a deliberately less mechanical appearance without changing the underlying parametric design.[28]
Integration with TeX is seamless, as generated .pk and .tfm files are loaded via declarations like \font\tenrm=cmr10, allowing TeX to reference Computer Modern Roman at 10-point size directly from Metafont output. Despite its strengths, Metafont is inherently bitmap-oriented, producing raster fonts optimized for fixed resolutions rather than scalable outlines, which limits portability to modern outline formats; this has led to successors like MetaPost, a vector-graphics extension of Metafont's language for generating PostScript paths.
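A complete Metafont program for a single glyph can be very small; the following sketch (file name and dimensions invented for illustration) draws a circular "O" by stroking an elliptical pen along a path:

```metafont
% o.mf -- a minimal one-glyph font (illustrative sketch)
mode_setup;
beginchar("O", 10pt#, 7pt#, 0);   % character code, width, height, depth
  pickup pencircle xscaled 0.8pt yscaled 0.5pt rotated 30;  % elliptical pen
  draw fullcircle scaled 6pt shifted (w/2, h/2);            % stroke the bowl
endchar;
end
```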
Operation
Input Processing
TeX processes its input through a structured runtime execution divided into distinct phases: tokenization (also known as scanning), expansion (macro replacement), and execution (building of boxes). In the tokenization phase, the input processor reads characters from the input stream and categorizes them into tokens based on current category codes, forming the basic units of TeX's language. This scanning occurs line by line, with the input buffer (fixed at 500 characters in the original implementation) holding the current line; in Knuth's digestive metaphor, this tokenizing stage is TeX's "mouth". Tokens then pass through the "gullet", where macros are expanded, to the "stomach", which executes them.[11] During the expansion phase, TeX replaces expandable tokens, such as macros and certain primitives, with their definitions or results, effectively filtering the token list until only primitive or unexpandable tokens remain. This process handles nested expansions and uses mechanisms like \expandafter to control the order of replacement, ensuring that the input is transformed into a sequence ready for execution. The execution phase then interprets these tokens to construct the document's structure using TeX's box model, where horizontal lists (for text lines) and vertical lists (for paragraphs and pages) are built into hboxes (horizontal boxes) or vboxes (vertical boxes). Math lists are similarly assembled in math mode. Completed boxes are eventually "shipped out" to form pages, though the actual output generation occurs separately.[11] TeX maintains state in registers: \count registers for integers, \dimen registers for dimensions, and \skip registers for glue (stretchable spaces), with 256 of each available in the original TeX82 implementation. These registers store values used during processing, such as counters for page numbers or dimensions for spacing.
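The registers can be manipulated directly; this plain TeX fragment (an illustrative sketch, with invented register names) allocates and uses one register of each kind:

```tex
\newcount\chapterno  \chapterno=3                     % integer register
\newdimen\gutter     \gutter=1.5pc                    % dimension register
\newskip\parspace    \parspace=6pt plus 2pt minus 1pt % glue register

\advance\chapterno by 1        % arithmetic on a count register
Chapter \the\chapterno.\par    % \the converts a register's value to text
\vskip\parspace                % use the glue as stretchable vertical space
\bye
```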
Memory management in TeX is constrained by fixed limits to ensure portability and predictability; for instance, main memory is capped at about 30,000 words in the original TeX82 implementation, though modern variants support much larger limits to accommodate complex documents, and exceeding a limit during box building or token storage produces a "capacity exceeded" overflow error. Such limits prevent runaway memory usage but require users to optimize complex documents.[11] For error recovery, TeX operates in an interactive mode by default, pausing at errors to let the user respond with single-keystroke commands such as h (show help), i (insert corrective tokens), or x (terminate the run), or to suppress further stops with modes such as \scrollmode and \nonstopmode.
Output Generation
TeX generates its primary output in the Device Independent (DVI) format, a binary file designed to be portable across different devices and resolutions without relying on specific hardware details. The DVI file consists of a sequence of commands that describe pages, boxes, and fonts, allowing precise positioning of glyphs and graphical elements. Each page is built as a vertical box (\vbox) that encapsulates the content, with the \shipout command instructing TeX to write the completed page to the file. Vertical justification within pages is achieved through glue such as \vfil, which distributes extra space to balance the layout. The DVI format uses absolute coordinates for baseline positioning, measured in scaled points (sp), where 1 TeX point equals 65,536 sp, enabling sub-point precision for alignment and spacing.[30] This coordinate system supports the stacking of horizontal and vertical boxes to form complex layouts, with content shipped out page by page as the document is processed. Box construction, as handled earlier in the typesetting pipeline, feeds into this output phase where final assembly occurs. To convert DVI to printable or viewable formats, external drivers are employed: dvips translates DVI to PostScript for high-quality printing, while dvipdfmx and similar tools generate PDF output. Extensions to TeX have enhanced output capabilities beyond the original DVI limitations, which lack native support for color, hyperlinks, or embedded multimedia. pdfTeX, introduced in the late 1990s, bypasses DVI entirely by producing PDF directly during processing, incorporating features like embedded fonts and compression for digital distribution. For Unicode and non-Latin scripts, XeTeX (released in 2004) integrates system fonts and outputs directly to PDF, supporting bidirectional text and complex glyph shaping.
Similarly, Omega, developed in the 1990s, extends TeX's output to handle multilingual typesetting through its virtual-font system and improved character encoding. These extensions address TeX's original constraints while preserving the core engine's precision.

Applications
Mathematical and Scientific Typesetting
TeX excels in mathematical and scientific typesetting thanks to its sophisticated handling of formulas, symbols, and layouts, enabling the production of documents that meet the exacting standards of academic publishing. Its macro-based system lets users define complex expressions with precision, ensuring consistent rendering across output formats such as PDF and DVI. This capability has made TeX a cornerstone of fields with dense mathematical content, from pure mathematics to theoretical physics.

The American Mathematical Society (AMS) pioneered TeX's institutional adoption, training staff at Stanford University in 1979 and putting the system into journal and book production by the early 1980s. AMS-TeX, an extension of plain TeX tailored for advanced mathematics, added predefined symbols and environments for theorems, proofs, and multiline equations, addressing limitations of plain TeX and easing the preparation of rigorous scholarly work. Its LaTeX successor, the amsmath package, carries these features forward with commands such as \DeclareMathOperator, which defines custom operators (for example, \DeclareMathOperator{\argmax}{argmax} renders \argmax in upright type with appropriate spacing) to streamline the notation of specialized functions.

TeX's dominance in academic dissemination is evident in its status as the preferred format for arXiv preprints, where submissions are processed from TeX/LaTeX source to ensure high-fidelity rendering of mathematical content. Over 90% of mathematics and statistics papers on platforms like arXiv are authored using TeX derivatives, reflecting its near-universal acceptance in the field due to its superior handling of equations and references, and underpinning the majority of peer-reviewed mathematical literature.

In scientific visualization, plain TeX supports rudimentary diagrams through macro expansion, though practical use typically relies on extensions for line drawings and simple shapes.
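A minimal LaTeX sketch shows the operator declaration in use (the operator name and formula here are illustrative):

```latex
\documentclass{article}
\usepackage{amsmath}
% The starred form places subscripts beneath the operator in display math.
\DeclareMathOperator*{\argmax}{arg\,max}
\begin{document}
\[ \hat{\theta} = \argmax_{\theta} L(\theta) \]
\end{document}
```

Without the declaration, typing "argmax" in math mode would be set in italics as a product of variables rather than as a single upright operator name.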
For more advanced needs, LaTeX integrates TikZ, a package that generates vector-based diagrams, such as coordinate plots or schematic illustrations, directly from code, allowing seamless embedding within mathematical text without external image files. This approach keeps documents consistent and scalable, ideal for illustrating concepts like geometric proofs or data flows in scientific papers.

TeX's typographic finesse shows in subtle spacing adjustments that bolster readability, particularly in integrals, where a thin space separates the integrand from the differential, as in ∫ f(x) dx (written \int f(x)\,dx in TeX), preventing the symbols from fusing visually while adhering to established mathematical convention. Such rules, encoded in TeX's math mode, automatically adjust kerning and glue around operators, yielding output that rivals traditional typesetting in clarity and elegance.

Knuth's Concrete Mathematics (1989), co-authored with Ronald Graham and Oren Patashnik, exemplifies TeX's application: it was composed at Stanford using plain TeX with custom Concrete Roman text fonts and AMS Euler for the mathematics, and became a benchmark for integrated algorithmic and analytic exposition. TeX's reach extends to modern data-science workflows, where Jupyter notebooks leverage LaTeX rendering via MathJax to display mathematical output from Python computations, enabling interactive documents that blend code, visualizations, and formatted equations for exploratory analysis.

Specialized Notations
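Both the thin-space convention and an embedded TikZ drawing fit in a short LaTeX sketch (the axis labels and plotted curve are arbitrary):

```latex
\documentclass{article}
\usepackage{tikz}
\begin{document}
% \, inserts the thin space before the differential:
\[ \int_0^1 f(x)\,dx \]
% A small vector diagram generated directly from code:
\begin{tikzpicture}
  \draw[->] (0,0) -- (3,0) node[right] {$x$};
  \draw[->] (0,0) -- (0,2) node[above] {$y$};
  \draw[domain=0:2.6,smooth] plot (\x,{0.25*\x*\x});
\end{tikzpicture}
\end{document}
```

Because the diagram is described as code rather than included as an image, it scales with the document and inherits its fonts.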
TeX supports specialized notations through dedicated packages that extend its macro language to handle domain-specific symbols and structures, often requiring custom fonts generated via Metafont or its derivatives.[31][32][33] In chemistry, the mhchem package facilitates the typesetting of molecular formulae and reaction equations with intuitive commands, such as \ce{H2O} for water.[34] It also supports basic structural representations, though more complex diagrams typically integrate with tools like chemfig. For balanced equations, users can input \ce{2H2 + O2 -> 2H2O} to produce properly formatted output with subscripts and arrows.[34][35]
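A minimal mhchem sketch:

```latex
\documentclass{article}
\usepackage[version=4]{mhchem}
\begin{document}
\ce{H2O}               % water: subscript applied automatically
\ce{2H2 + O2 -> 2H2O}  % balanced equation with a reaction arrow
\ce{SO4^2-}            % ion with a charge superscript
\end{document}
```

The \ce command parses the plain-text formula and applies the subscripting, superscripting, and arrow typography without any explicit math-mode markup.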
For music notation, MusiXTeX offers a comprehensive macro set for engraving scores, building on earlier systems like MusicTeX.[31] It employs a three-pass process for optimal spacing and includes commands like \Notes for placing notes and \Space for adjusting bar spacing, enabling the creation of multi-staff compositions with beams, slurs, and dynamics.[31]
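An illustrative MusiXTeX fragment, sketched under the assumption of a single staff (the pitches chosen are arbitrary):

```latex
\documentclass{article}
\usepackage{musixtex}
\begin{document}
\begin{music}
\startextract                  % begin an unjustified excerpt
\Notes\qu c\qu d\qu e\qu f\en  % four stem-up quarter notes
\zendextract                   % end the excerpt
\end{music}
\end{document}
```

Each \Notes ... \en group places material on the staff; \qu typesets a quarter note with the stem up at the named pitch.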
In linguistics, the qtree package draws syntactic trees from bracket notation, ideal for the hierarchical structures of syntax analysis.[32] For phonetics, the tipa package provides fonts and macros for the International Phonetic Alphabet (IPA), covering vowels, consonants, and diacritics via commands such as \textipa{abc}.[33] These rely on Metafont-generated encodings for precise diacritics.[36]
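A combined sketch of both packages (the example sentence and transcription are illustrative):

```latex
\documentclass{article}
\usepackage{qtree}
\usepackage{tipa}
\begin{document}
% qtree: each [.Label ... ] pair is one node of the tree.
\Tree [.S [.NP the cat ] [.VP sat ] ]

% tipa: an IPA transcription using the package's shortcut characters.
\textipa{[k\ae t]}
\end{document}
```

In tipa's input convention, ordinary letters map to their IPA counterparts while escapes such as \ae produce the special vowel symbols.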
Biology notations benefit from packages like TeXshade, which processes multiple sequence alignments in formats such as FASTA or MSF, applying shading for identities and similarities.[37] It generates labeled alignments, for example, highlighting conserved residues in protein sequences.[38]
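A hedged TeXshade sketch; the alignment filename is a placeholder for a user-supplied FASTA or MSF file:

```latex
\documentclass{article}
\usepackage{texshade}
\begin{document}
% "alignment.fasta" must be provided by the user.
\begin{texshade}{alignment.fasta}
  \shadingmode{similar}   % shade residues by similarity
\end{texshade}
\end{document}
```

The texshade environment reads the alignment file directly, so the shaded figure stays synchronized with the underlying sequence data.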
Physics applications include Feynman diagrams via the feynmf package, which uses MetaPost for vector graphics integrated with TeX, allowing lines, vertices, and labels for particle interactions.[39]
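A minimal feynmf sketch of a fermion-photon vertex (graph dimensions and vertex names are arbitrary; the generated MetaPost file must be processed in a separate run):

```latex
\documentclass{article}
\usepackage{feynmf}
\begin{document}
\begin{fmffile}{vertex}        % MetaPost output goes to vertex.mp
\begin{fmfgraph*}(80,60)
  \fmfleft{i1,i2}              % incoming fermion endpoints
  \fmfright{o1}                % outgoing photon endpoint
  \fmf{fermion}{i1,v1,i2}      % fermion line through the vertex
  \fmf{photon}{v1,o1}          % wavy photon line
  \fmfdot{v1}                  % mark the interaction vertex
\end{fmfgraph*}
\end{fmffile}
\end{document}
```

The diagram is laid out automatically from the topology; only the external endpoints and internal connections need to be declared.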
A common challenge in these adaptations is the need for custom fonts to render specialized symbols, frequently addressed by Metafont programs that parameterize shapes for scalability across notations like IPA diacritics or musical accidentals.[36][39]
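A toy Metafont program illustrates the parameterization the text refers to; the character code and dimensions here are arbitrary:

```metafont
% Toy glyph: a centered dot whose size scales with the design size.
mode_setup;
beginchar("A", 5pt#, 5pt#, 0);   % width, height, depth
  pickup pencircle scaled (w/2); % pen diameter tied to glyph width
  drawdot (w/2, h/2);            % dot at the glyph's center
endchar;
end
```

Because the pen and coordinates are expressed in terms of the glyph's own dimensions (w, h), the same program yields consistent shapes at any design size, which is exactly what specialized symbol sets such as IPA diacritics or musical accidentals require.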