Web (programming system)
Web, traditionally styled WEB, is a computer programming system created by Donald Knuth as the first implementation of what he called "literate programming"[1]: his idea that one could create software as works of literature, by embedding source code in descriptive text, rather than the reverse. Unlike standard programming practice, which relegates documentation to comments, the WEB approach is to write an article to document the making of the source code, and to include all the source code in that article, so as to be compilable therefrom.
Philosophy
The common practice in most programming languages is that the primary text is source code, optionally supplemented by descriptive text in the form of comments. Knuth proposed that making the descriptive text primary was putting things in an order more convenient for human readers, rather than the order demanded by compilers.[2]
Much like TeX articles, the Web source text is divided into sections according to documentation flow. For example, in CWEB, code sections are seamlessly intermixed in the line of argumentation.[3]
Implementations
The original WEB system depends on Pascal and comprises two programs:
- TANGLE, which produces compilable Pascal code from the source texts, and
- WEAVE, which through the use of TeX produces nicely-formatted, printable documentation from the same source texts.
Others:
- CWEB (below) is a version of Web for the C programming language, while
- noweb is a separate literate programming tool, which is inspired by Web (as reflected in the name) and which is language agnostic.
The most significant programs written in Web are TeX and Metafont. Modern TeX distributions however use another program called Web2C to convert Web source to C.
CWEB
| CWEB | |
|---|---|
| Paradigm | Literate, imperative (procedural), structured |
| Designed by | Donald Knuth |
| Developer | Donald Knuth & Silvio Levy |
| First appeared | 1987 |
| Stable release | 3.67 / October 24, 2006 |
| Typing discipline | Static, weak, manifest, nominal |
| OS | Cross-platform (multi-platform) |
| License | custom free-software license |
| Filename extensions | .w |
| Website | www-cs-faculty |
| Influenced by | WEB, TeX |
| Influenced | noweb |
CWEB is a computer programming system created by Donald Knuth and Silvio Levy as a follow-up to Knuth's WEB literate programming system, using the C programming language (and to a lesser extent the C++ and Java programming languages) instead of Pascal.
Like WEB, it consists of two primary programs:
- CTANGLE, which produces compilable C code from the source texts, and
- CWEAVE, which produces nicely-formatted printable documentation using TeX.
Features
- Can enter manual TeX code as well as automatic.
- Makes formatting of C code suitable for pretty-printing.
- Can define sections, and can contain documentation and codes, which can then be included into other sections.
- Writes the header code and main C code in one file, and can reuse the same sections, and then it can be tangled into multiple files for compiling.
- Uses the #line directive so that any warnings or errors refer to the .w source.
- Include files.
- Change files, which can be automatically merged into the code when compiling/printing.
- Produces index of identifiers and section names in the printout.
See also
- Documentation generators – While comparable with Web's WEAVE, these generally follow the standard practice of source code first, the opposite of the Web approach.
References
- "Knuth and Levy: CWEB". cs.stanford.edu. Retrieved 2025-10-27.
- Knuth, Donald E. (1992). Literate Programming. CSLI Lecture Notes. Vol. 27. Stanford, California: Center for the Study of Language and Information.
- Silvio Levy (12 June 2004). "An example of CWEB" (PDF). Archived from the original (PDF) on 20 October 2021.
External links
- The TeX Catalogue entry for Web
- CWEB homepage
- Examples of programs written in Web, By Donald Knuth (1981 and onward)
Web (programming system)
Introduction
Definition and Purpose
The Web system is a computer programming tool developed by Donald Knuth as the pioneering implementation of literate programming, wherein a single source file integrates both executable code and its explanatory documentation in a cohesive manner.[4] This approach treats the program as a form of literature, allowing the source material to function dually as input for compilation into runnable software and as formatted text for human readers.[4]
The primary purpose of Web is to empower programmers to author software in a narrative, essay-like structure that prioritizes readability and logical exposition over the conventional disjointed separation of code from comments or external manuals.[4] By intertwining documentation with code, Web reverses the typical workflow of programming languages, where code is written primarily for machines and documentation is an afterthought; instead, it emphasizes communication to humans, fostering programs that are as comprehensible as well-written prose.[4]
Web was initially released in a preliminary version in November 1981, emerging directly from Knuth's efforts to implement the TeX typesetting system, for which he sought a method to document complex algorithms transparently. In its basic workflow, a source file with the .web extension serves as input, which is then processed to generate both compilable code files and documentation outputs, such as TeX-formatted documents.[4]
Historical Context
During the late 1970s, Donald E. Knuth, while revising the second volume of The Art of Computer Programming and encountering poor-quality galleys from the publisher, initiated the development of the TeX typesetting system to achieve precise control over mathematical notation and typography.[5] This project highlighted Knuth's frustrations with conventional programming documentation practices, which typically isolated code from explanatory prose, resulting in documents that were difficult to comprehend and maintain for both readers and future developers.[2] Knuth sought a method to intertwine narrative descriptions with code in a natural, book-like format, viewing programs primarily as communications to humans rather than mere instructions for machines.[3]
These challenges during TeX's creation, which began in 1978, directly inspired the invention of WEB as a pioneering tool for what Knuth termed literate programming. A prototype system called DOC emerged in spring 1979, but WEB was formalized shortly thereafter, with version 1.0 released in 1982 specifically tailored for the Pascal language.[2] The system's evolution remained intertwined with TeX and the concurrent Metafont font design project, both of which Knuth implemented using WEB to ensure comprehensive, self-documenting source materials that facilitated debugging, portability, and scholarly analysis.[3]
WEB drew influences from the structured programming movement, notably Edsger W. Dijkstra's emphasis on clarity and modularity in code organization, as well as Knuth's broader advocacy for elevating software documentation to an artistic level comparable to literature.[2] Knuth's experiments with WEB over several years culminated in its first public description in the 1984 paper "Literate Programming," published in The Computer Journal, where he outlined the system's philosophy and demonstrated its application through examples from TeX. This milestone marked WEB's introduction to the broader computing community, underscoring its role in addressing longstanding deficiencies in program readability and verifiability.[2]
Philosophy and Principles
Literate Programming Concept
Literate programming, as pioneered by Donald E. Knuth in the development of the Web system, reimagines software creation by treating programs as works of literature intended primarily for human readers rather than mere instructions for machines.[1] In this paradigm, the programmer interweaves natural language prose to explain the underlying algorithms and logic, positioning code segments as illustrative examples that support the narrative explanation.[4] This approach inverts the traditional focus of programming, where source code is often dense, executable text that prioritizes computational efficiency over clarity, resulting in opaque artifacts difficult for others to comprehend or maintain.[1] By contrast, Web's literate programs form coherent, annotated narratives that enhance accessibility and understanding.[4]
Knuth drew an explicit analogy between literate programs and mathematical expositions, where authors present concepts in a logical flow that may employ forward references to later-defined elements, allowing for modular and intuitive explanations unbound by strict sequential constraints.[1] Just as a mathematician might outline a proof by discussing high-level ideas before delving into details or lemmas, a literate programmer can structure the document to reflect the "stream of consciousness" in which the solution was conceived, fostering a more natural progression of ideas.[4]
The benefits of this concept, as realized in Web, include significantly improved program maintainability through its emphasis on comprehensive documentation embedded directly within the source, making modifications more straightforward for teams or future developers.[1] Readability is elevated by integrating explanatory prose, which clarifies intent and reduces errors in interpretation, while the collaborative potential grows as the literate format encourages shared understanding akin to co-authoring a technical paper.[4] Overall, these advantages lead to more robust and enjoyable programming experiences, as evidenced by Knuth's own implementations.[1]
Documentation-Code Integration
In the Web programming system, documentation and code are integrated within a single source file, typically with a .web extension, which alternates between explanatory text written in TeX markup and programmatic elements in a Pascal-like syntax. This structure organizes content into discrete sections, each beginning with a control sequence: @ followed by a space for an ordinary section, or @* for a major division. Named modules of code within a section are introduced with @<name@>=, allowing the narrative to unfold logically while embedding executable code. The TeX portions provide detailed commentary, mathematical expressions, and diagrams, seamlessly interspersed with code snippets that are delimited by specific markers to distinguish them from prose.[4][6]
Section names play a crucial role in this integration, serving as labeled identifiers enclosed in angle brackets (e.g., @<Clear the arrays@>), which enable non-sequential referencing of code chunks throughout the document. These names, often descriptive and imperative in style, allow programmers to define modules out of execution order, prioritizing the explanatory flow over strict linear compilation requirements; for instance, a section defining a subroutine might appear early in the narrative but be invoked later via its name. This mechanism supports modular reuse, as references to other sections can be inserted directly into code or text, fostering a web-like interconnection without disrupting readability.[4][6]
The duality of output from a single .web file exemplifies the system's efficiency: processing yields both a compilable Pascal program and formatted documentation suitable for typesetting. The code is extracted and assembled into a standalone source file for compilation, while the documentation, including all TeX markup, is rendered into a printable document—often via TeX processing—to produce indexed, cross-linked prose with embedded code listings. This ensures that changes to the source automatically update both artifacts, maintaining synchronization between implementation and explanation.[4][6]
Cross-references are handled automatically to enhance navigability, with the system generating an index that maps code modules to their defining sections and all referencing locations in the documentation. Section numbers (e.g., §145) are assigned sequentially and underlined in the output to denote definitions, while uses are listed below, providing a comprehensive overview of dependencies without manual intervention. This feature, integral to the literate programming approach, allows readers to trace code evolution through the narrative, linking abstract descriptions to concrete implementations.[4][6]
System Architecture
Tangle Process
The Tangle utility serves as the code extraction component of the WEB literate programming system, processing a .web input file to generate a compilable source file in a standard programming language, such as Pascal. It parses the bilingual .web file, which intermingles documentation and code sections delimited by specific markers like @<name@> for modules and @d for macro definitions, identifying and separating the code portions while ignoring TeX-formatted documentation. This parsing divides the file into modules, where each module's Pascal (or equivalent) code is collected independently of its textual order in the source.[7]
Tangle handles dependencies by resolving forward references through a substitution process: it begins with unnamed modules to form an initial code segment, then iteratively replaces all named module references (e.g., @<section name@>) with their corresponding code chunks until the entire program is assembled in execution order. This mechanism allows programmers to define code sections non-sequentially in the .web file, ensuring that the output reflects the logical program flow without requiring manual reordering. The resulting file, typically with a .pas extension for Pascal, consists solely of concatenated plain text code, stripped of all documentation, converted to uppercase, and formatted with lines no longer than 72 characters, making it directly compilable by standard language compilers. It also inserts comments indicating the section numbers (e.g., {1:} and {:1}) to facilitate linking the code back to the documentation sections.[7][4]
During processing, Tangle performs error checking to maintain program integrity, reporting issues such as undefined section names (e.g., a reference to a non-existent @<module@>), unmatched parentheses in code chunks, or invalid macro expansions. If such errors are detected, Tangle halts execution and outputs diagnostic messages to aid debugging, preventing the generation of malformed code. This is complementary to the Weave process, which focuses on documentation output; Tangle ensures that the extracted code is syntactically valid and ready for compilation.[7]
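The substitution and the undefined-name check described above can be illustrated with a small Python sketch. It is not Knuth's algorithm: the function name `tangle`, the data layout (a list of unnamed chunks plus a dictionary of named chunks), and the use of recursion instead of iterative replacement are all assumptions made for illustration, and the sketch leaves out uppercase conversion, line breaking, and section-number comments.

```python
import re

# Matches a section reference such as @<Clear the arrays@>.
REF = re.compile(r"@<(.+?)@>")


def tangle(unnamed, named):
    """Expand named-section references until only plain code remains.

    unnamed: code strings from the unnamed sections, in file order.
    named:   dict mapping a section name to its list of code strings
             (several sections may continue the same name).
    Both parameters are a hypothetical, pre-parsed view of a .web file.
    """
    def expand(text, active):
        def substitute(match):
            name = match.group(1)
            if name not in named:
                raise KeyError(f"undefined section name: {name}")
            if name in active:
                raise ValueError(f"section refers to itself: {name}")
            body = "\n".join(named[name])
            # Expand references nested inside the substituted body.
            return expand(body, active | {name})
        return REF.sub(substitute, text)

    return "\n".join(expand(chunk, frozenset()) for chunk in unnamed)


sections = {
    "Clear the arrays": ["for i := 1 to n do a[i] := 0;"],
    "Print the result": ["writeln(a[1]);"],
}
program = tangle(
    ["begin", "@<Clear the arrays@>", "@<Print the result@>", "end."],
    sections,
)
print(program)
```

Because the named chunks are looked up only when a reference is expanded, the order in which they appear in the dictionary (that is, in the source file) has no effect on the order of the assembled program.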
Weave Process
The Weave process in Donald Knuth's WEB system is a utility that transforms a WEB source file into a formatted TeX document, interweaving the program's documentation and code excerpts to produce readable output suitable for human consumption.[2] This process emphasizes the narrative structure of the literate program, presenting sections in a logical, explanatory order rather than the linear code sequence required for execution.[2] Weave takes a file with the extension .web as input and generates a corresponding .tex file, which is then processed by the TeX typesetting system to yield a device-independent file such as .dvi, ultimately convertible to formats like PDF.[2] The output preserves the WEB file's sectional organization, numbering modules sequentially (e.g., §1, §2) and embedding explanatory text alongside relevant code modules.[8] This flow ensures that the resulting document reads like a technical article, with code serving as illustrative examples within the prose.[2]
Key formatting features of Weave include typesetting code so that it stands apart from the surrounding text, with boldface for reserved words, italics for identifiers, and mathematical symbols for operators (e.g., ∧ for logical "and" or ≥ for greater-than-or-equal).[8] Automatic indentation aligns structured elements like begin-end blocks or if-then-else statements, improving visual clarity without manual intervention.[8] These typographic conventions make the interwoven code both aesthetically pleasing and easy to follow in the context of the documentation.[8]
Weave also generates navigational aids, including a table of contents listing section titles and an index of identifiers (e.g., variables, functions, and keywords) with references to their defining and using sections, where definitions are underlined for quick identification.[8] Cross-references are highlighted through module numbers, allowing readers to trace connections such as "used in section 27" or "defined in §14," which facilitates understanding of the program's modular relationships.[2]
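The kind of bookkeeping behind "defined in" and "used in" cross-references can be sketched in Python. This is only an illustration under simplifying assumptions: the function name `cross_reference` is hypothetical, each section is given as one string, a name followed by `=` counts as a definition, and any other occurrence counts as a use. Weave's real index also covers identifiers and abbreviated section names, which this sketch ignores.

```python
import re
from collections import defaultdict

# A section name, optionally followed by '=' when it is being defined.
REF = re.compile(r"@<(.+?)@>(=?)")


def cross_reference(sections):
    """Map each section name to the section numbers that define and use it.

    sections: list of source strings, one per section, numbered from 1.
    """
    defined = defaultdict(list)
    used = defaultdict(list)
    for number, text in enumerate(sections, start=1):
        for name, equals in REF.findall(text):
            target = defined if equals else used
            if number not in target[name]:
                target[name].append(number)
    return defined, used


sections = [
    "@p begin @<Clear the arrays@>; @<Print the result@> end.",
    "@<Clear the arrays@>= for i := 1 to n do a[i] := 0",
    "@<Print the result@>= writeln(a[1])",
]
defined, used = cross_reference(sections)
for name in sorted(defined):
    print(f"{name}: defined in {defined[name]}, used in {used[name]}")
```

Keeping definitions and uses in separate maps mirrors the distinction the printed index draws between underlined (defining) and plain (referencing) section numbers.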
Customization in Weave is achieved via embedded TeX macros within the WEB file, enabling users to define commands for styling specific documentation elements, such as font variations or section layouts (e.g., \def\WEB{{\tt WEB}} for consistent typesetting of the system name).[2] Programmers can override default code formatting or introduce additional TeX directives to tailor the output to particular publishing needs, ensuring flexibility while maintaining the system's core literate philosophy.[8]
Implementations
Original WEB
The original WEB system was developed by Donald Knuth at Stanford University, with initial design work beginning in September 1981 as part of the effort to implement and document the TeX typesetting system in Pascal.[4] This system emerged from an earlier prototype called DOC created in spring 1979, evolving into WEB Version 0 by 1981 and Version 1 by September 1982, specifically tailored to address the challenges of maintaining complex programs like TeX through integrated documentation.[4] Knuth's motivation was to create a tool that would allow programmers to express ideas in a natural, narrative order while generating both readable documentation and compilable code, thereby improving software reliability and comprehension for the TeX project.[4]
WEB was designed for Pascal, incorporating built-in support for its syntax, including modules and procedures, to facilitate structured programming within the literate framework.[7] The system used macro expansion to extend Pascal's capabilities, such as handling strings and other limitations inherent to the language at the time, ensuring that TeX's intricate algorithms could be expressed clearly.[4] Users were required to be proficient in both Pascal and TeX, as WEB intertwined code sections with documentation formatted for TeX output.[7]
The original WEB was distributed freely from Stanford University, with key files like WEAVE.WEB and TANGLE.WEB made available for academic and research use.[4] It was primarily employed in the development of TeX and the companion Metafont font design system, where Knuth documented over 14,000 lines of Pascal code across multiple volumes.[7] This distribution model supported the open dissemination of the TeX project materials, fostering early adoption in computational typography.[4]
Despite its innovations, the original WEB had significant limitations, being tightly coupled to Pascal and the early versions of TeX, which restricted its applicability beyond those contexts.[7] Portability was a major issue, as it depended on specific Pascal compilers and TeX implementations, making adaptation to other programming languages challenging without substantial modifications.[4] These constraints later prompted extensions like CWEB for broader language support.[4]
CWEB
CWEB is an extension of the original WEB system, developed by Donald E. Knuth and Silvio Levy in 1987 specifically to support literate programming in the C language, with later extensions for C++ and Java.[3][9] This adaptation allowed programmers to document and structure C code using WEB's principles while addressing the syntactic needs of C, such as its use of the preprocessor for macros and conventions for representing numbers in octal and hexadecimal formats.[9] Key adaptations in CWEB include robust handling of C-specific elements like pointers (e.g., int *pa) and structures, ensuring they are properly formatted in both code output and documentation, all while leveraging TeX for high-quality typesetting of explanatory text.[9] The system simplifies some aspects of WEB by relying on C's built-in preprocessor capabilities, reducing the need for custom string and macro handling.[9]
CWEB employs a dual-mode structure in its input files (typically with a .w extension), where TeX serves as the default for documentation and C code is inserted via explicit mode-switching directives, such as @c to begin a code section, @t to embed TeX material within code, and @d for defining macros that expand to C preprocessor directives.[9] This integration enables seamless mixing of narrative explanations and executable code in a single file, with tools like CTANGLE extracting compilable C and CWEAVE producing formatted TeX documents.[9]
CWEB has been notably applied in TeX-related tools, including the self-documenting source code for CTANGLE and CWEAVE, as well as significant portions of the LuaTeX engine, which rewrote parts of the TeX core in C using CWEB for maintainability.[9][10] It also appears in various open-source software projects, such as example implementations on CTAN, demonstrating its utility for producing well-documented, modular C programs.[11]
Other Variants
Noweb is a language-agnostic literate programming tool developed by Norman Ramsey in the 1990s, designed for simplicity and extensibility while supporting multiple output formats such as TeX, LaTeX, HTML, and troff.[12] Unlike Knuth's original WEB system, noweb uses only five control sequences and emphasizes a pipelined architecture that allows users to customize behavior or add features with minimal code, such as 250 lines for hypertext support.[12] It works out of the box with any programming language by treating code chunks verbatim, without requiring language-specific prettyprinting, which sacrifices some formatting fidelity for broader applicability.[12]
FWEB extends the literate programming paradigm to Fortran and other languages, building directly on an early version (0.5) of Silvio Levy's CWEB adaptation of WEB.[13] Developed by John A. Krommes starting in 1993, it supports C, C++, Fortran (from F77 to 2023 standards), Ratfor, and TeX, maintaining documentation and source code in a unified web file while enabling TeX-based typesetting and cross-referencing.[14] Key additions include a C-like macro preprocessor for conditional compilation, customizable style files for output control, and Fortran-specific features like symbolic statement labels and built-in Ratfor translation.[14][13]
Modern ports of WEB concepts include adaptations for Java, such as ongoing efforts to create JWEB as a dedicated variant, often integrated within broader CWEB extensions that already handle Java programs alongside C and C++.[15][3] Literate programming tools like Emacs Org-mode provide integrations inspired by WEB, using its Babel subsystem to embed and tangle code from multiple languages within structured documents, facilitating portability across environments without direct dependence on Knuth's original tools.[16][17]
Community-driven open-source forks have addressed post-1990s portability issues in WEB and CWEB implementations, such as compiling on modern systems and supporting additional languages or standards.[18] Examples include repositories maintaining revised CWEB versions for C/C++ with enhanced cross-platform compatibility, and independent tools like Literate, which preserve core WEB features like macro expansion while extending to arbitrary languages via open-source development.[19][20] These contributions, often hosted on platforms like GitHub, focus on reducing dependencies on legacy TeX setups and improving extensibility for contemporary workflows.[18]
Syntax and Features
Macro Expansion
In the Web programming system, macros are defined using the @d command to create reusable code blocks that facilitate modular programming. A basic macro is specified as @d identifier = constant for numeric values or @d identifier == Pascal text for textual substitutions, where the identifier must consist of more than one character.[7] Parametric macros extend this with @d identifier(#) == Pascal text, allowing the # placeholder to be replaced by arguments during expansion.[7] These definitions occur in the definition part of a module, enabling the assembly of complex programs from simpler, parameterized components without redundant code.[7]
Macro expansion follows textual substitution rules during the Tangle process, where the macro body replaces all instances of the identifier in the module's Pascal text.[7] For parametric macros, the # is substituted with the provided argument, requiring balanced parentheses to ensure proper nesting and avoid parsing errors.[7] To prevent name clashes, TANGLE verifies uniqueness by considering the first seven characters of identifiers after removing underlines, thus maintaining distinct macro scopes even in large programs.[7]
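The uniqueness rule just described can be sketched in Python. The helper names `tangle_key` and `find_clashes` are hypothetical, and folding to uppercase before comparing is an assumption based on Tangle's uppercase output; the seven-character limit and the removal of underlines come from the description above.

```python
def tangle_key(identifier, significant=7):
    """Reduce an identifier to the part assumed to matter for uniqueness:
    underlines removed, folded to uppercase, cut to `significant` characters."""
    return identifier.replace("_", "").upper()[:significant]


def find_clashes(identifiers, significant=7):
    """Return pairs of distinct identifiers that share the same key."""
    seen = {}
    clashes = []
    for ident in identifiers:
        key = tangle_key(ident, significant)
        if key in seen and seen[key] != ident:
            clashes.append((seen[key], ident))
        else:
            seen.setdefault(key, ident)
    return clashes


print(find_clashes(["buffer_size", "buffer_start", "buf_ptr"]))
```

Here `buffer_size` and `buffer_start` both reduce to `BUFFERS`, so they are reported as a clash, while `buf_ptr` reduces to `BUFPTR` and is accepted.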
Macros defined in the definition part of a module are available globally throughout the Pascal text, with uniqueness rules and modular structure preventing interference and supporting hierarchical code organization.[7]
For example, a simple textual macro might be defined as:
@d upper_case_Y == "Y"
Tangle then substitutes "Y" wherever upper_case_Y appears.[7] A parametric example could be:
@d two_cases(#) == case j of
  1: #(1);
  2: #(2);
end
When invoked as two_cases(upper), it expands to handle cases for the upper argument (upper(1) when j is 1, upper(2) when j is 2), demonstrating how macros build intricate logic from basic templates.[7]
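The expansion of a parametric macro such as two_cases can be mimicked with a short Python sketch. The function `expand_parametric` is a hypothetical helper, not part of Tangle; it assumes a single macro, a single argument, and plain textual replacement of `#`, and it uses the balanced-parentheses requirement to find where the argument ends.

```python
import re


def expand_parametric(source, name, body):
    """Replace each call name(arg) in source with body, putting arg in place of '#'."""
    pattern = re.compile(r"\b" + re.escape(name) + r"\(")
    out = []
    i = 0
    while True:
        match = pattern.search(source, i)
        if match is None:
            out.append(source[i:])
            break
        out.append(source[i:match.start()])
        # Scan to the parenthesis that closes the argument.
        depth = 1
        j = match.end()
        while j < len(source) and depth:
            if source[j] == "(":
                depth += 1
            elif source[j] == ")":
                depth -= 1
            j += 1
        if depth:
            raise ValueError("unbalanced parentheses in macro argument")
        argument = source[match.end():j - 1]
        out.append(body.replace("#", argument))
        i = j
    return "".join(out)


body = "case j of 1: #(1); 2: #(2); end"
expanded = expand_parametric("two_cases(upper);", "two_cases", body)
print(expanded)
# prints: case j of 1: upper(1); 2: upper(2); end;
```

Counting parentheses rather than stopping at the first `)` lets an argument such as `f(x)` pass through intact, which is why unbalanced arguments have to be rejected.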
Change Files
Change files in the WEB system provide a mechanism for modifying WEB programs without directly altering the original source files, typically named with a .web extension. These auxiliary files, often denoted with a .ch extension, allow users to specify targeted overrides or insertions to adapt the program for specific environments, such as porting to different operating systems or applying bug fixes. This approach preserves the integrity of the master WEB file while enabling localized customizations, which is particularly valuable for maintaining large, complex programs across multiple versions or platforms.[7]
The syntax of change files employs special control sequences beginning with the "@" symbol to delineate modifications. A typical change is structured as @x followed by one or more lines to match in the original WEB file, then @y introducing the replacement or inserted lines, and @z to conclude the change; lines that fall outside an @x ... @z group are ignored during processing. This format ensures precise, line-by-line substitutions or additions, with the change file being scanned sequentially alongside the WEB file. TANGLE and WEAVE processors integrate the change file by replacing matched sections in the input stream before generating the respective Pascal code or TeX documentation outputs.[7]
Donald Knuth employed change files extensively in the development and maintenance of TeX, using them to incorporate bug fixes and version updates without overhauling the core tex.web source. For instance, system-dependent adjustments, such as memory allocation or file handling, were handled via files like tex.ch, allowing TeX to be adapted across diverse computing environments while keeping the master file unchanged; this supported a version control-like workflow for iterative improvements documented in Knuth's error logs. Such applications extended to extensions like e-TeX, implemented as change files applied to the original TeX source.[7][21]
Despite their utility, change files have inherent limitations, including the requirement for exact line matches in the original file, which can complicate maintenance if the base source evolves significantly. They support only sequential application during a single processing run, precluding automatic merging of multiple concurrent change files without additional tools, potentially leading to manual reconciliation efforts for complex updates.[7]
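The @x/@y/@z mechanism and its two constraints (exact line matches, strictly sequential application) can be sketched in Python. The function `apply_changes` is a hypothetical helper and simplifies the real processors: it compares lines exactly, works on lists of lines already read into memory, and treats the sample lines below as made-up input rather than excerpts from tex.web or tex.ch.

```python
def apply_changes(web_lines, change_lines):
    """Return web_lines with each @x ... @y ... @z replacement applied in order."""
    # Parse the change file into (lines_to_match, replacement_lines) pairs.
    changes = []
    state = None  # None (outside a group), "x" (match part), or "y" (replacement part)
    old, new = [], []
    for line in change_lines:
        tag = line[:2].lower()
        if tag == "@x" and state is None:
            state, old, new = "x", [], []
        elif tag == "@y" and state == "x":
            state = "y"
        elif tag == "@z" and state == "y":
            changes.append((old, new))
            state = None
        elif state == "x":
            old.append(line)
        elif state == "y":
            new.append(line)
        # Anything else lies outside a group and is ignored.

    result = []
    pos = 0
    for old, new in changes:
        if not old:
            raise ValueError("a change must list at least one line to match")
        # Copy unchanged lines until the lines to match are found.
        while pos < len(web_lines) and web_lines[pos:pos + len(old)] != old:
            result.append(web_lines[pos])
            pos += 1
        if pos >= len(web_lines):
            raise ValueError("change did not match: " + repr(old))
        result.extend(new)
        pos += len(old)
    result.extend(web_lines[pos:])
    return result


web = ["@d mem_max=30000", "@d buf_size=500", "begin end."]
change = [
    "Notes outside a group are skipped.",
    "@x",
    "@d mem_max=30000",
    "@y",
    "@d mem_max=262140",
    "@z",
]
print(apply_changes(web, change))
```

Because the scan position only moves forward, a change that refers to lines earlier in the file than the previous change will fail to match, which is the sequential-application constraint noted above.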
Usage and Applications
Practical Examples
One practical illustration of the WEB system involves tangling a simple Pascal program from a .web file that embeds explanatory documentation. Consider a basic program that reads an integer n and prints "Hello world" n times, structured as follows in hello.web:
@* Introduction.
This program takes an integer $n$ as input, and prints ``Hello world'' $n$ times.
@p program HELLO(input, output);
var
  n: integer;
  i: integer;
begin
  read(n);
  for i := 1 to n do
  begin
    writeln('Hello world');
  end;
end.
@* Index. Not much to it. Everything occurs in section 1.
Running TANGLE on this file produces hello.p, stripping documentation and reordering sections for compilation, while preserving comments with section numbers (e.g., {1:}). The resulting program can then be compiled with a Pascal compiler to produce an executable that functions as described.[7]
For a more complex case, the source of TeX itself demonstrates WEB's ability to section algorithms with integrated descriptions. An excerpt from tex.web shows the initialization procedure, where explanatory text precedes modular code definitions:
procedure initialize; {this procedure gets things started properly}
  var @<Local variables for initialization@>@/
  begin @<Initialize whatever \TeX\ might access@>@;
  end;
@<Initialize whatever \TeX\ might access@>=
@<Set init...@>
@!init @<Initialize table entries (done by \.{INITEX} only)@>@;@+tini
The excerpt uses @<...@> macros referencing code modules that TANGLE expands into sequential Pascal, embedding explanations directly in the logic for clarity during development and review. The full tex.web file spans 24,863 lines, with only 523 containing WEB commands, highlighting efficient integration of ~25,000 lines of Pascal code across modular sections.[22][23]
In practice, WEB's approach reduced errors in large projects like Metafont by promoting structured documentation and code reuse; for instance, approximately 15% of Metafont's lines were reused from TeX's WEB source with adaptations, minimizing redundant implementation and facilitating error detection through detailed, typeset explanations. Knuth's meticulous indexing further ensured maintainability, with change files allowing portability across systems without introducing bugs.[24][23]
The step-by-step process for using WEB begins with writing a .web file that interleaves prose, macros, and code sections. TANGLE is then invoked (e.g., tangle hello.web) to generate compilable Pascal code, which is fed to a compiler for an executable. Simultaneously, WEAVE (e.g., weave hello.web) produces a .tex file for typesetting documentation with cross-references and indices via TeX, yielding a formatted report that mirrors the program's structure. This dual output supports iterative refinement, as seen in porting TeX82's ~60,000 lines across 15 hours using change files for system-specific adjustments.[7][23]
