M4 (computer language)
| m4 | |
|---|---|
| Paradigm | macro |
| Designed by | Brian Kernighan, Dennis Ritchie |
| First appeared | 1977 |
| Major implementations | |
| GNU m4 | |
m4 is a general-purpose macro processor included in most Unix-like operating systems, and is a component of the POSIX standard.
The language was designed by Brian Kernighan and Dennis Ritchie for the original versions of UNIX. It is an extension of an earlier macro processor, m3, written by Ritchie for an unknown AP-3 minicomputer.[1]
The macro preprocessor operates as a text-replacement tool. It is employed to re-use text templates, typically in computer programming applications, but also in text editing and text-processing applications. Most users require m4 as a dependency of GNU autoconf and GNU Bison.
History
Macro processors became popular when programmers commonly used assembly language. In those early days of programming, programmers noted that much of their programs consisted of repeated text, and they invented simple means for reusing this text. Programmers soon discovered the advantages not only of reusing entire blocks of text, but also of substituting different values for similar parameters. This defined the usage range of macro processors at the time.[2]
In the 1960s, an early general-purpose macro processor, M6, developed by Douglas McIlroy, Robert Morris and Andrew Hall, was in use at AT&T Bell Laboratories.[3]
Kernighan and Ritchie developed m4 in 1977, basing it on the ideas of Christopher Strachey. The distinguishing features of this style of macro preprocessing included:
- free-form syntax (not line-based like a typical macro preprocessor designed for assembly-language processing)
- the high degree of re-expansion (a macro's arguments get expanded twice: once during scanning and once at interpretation time)
The implementation of Rational Fortran used m4 as its macro engine from the beginning, and most Unix variants ship with it.
As of 2024, many applications continue to use m4 as part of the GNU Project's autoconf. It also appears in the configuration process of sendmail (a widespread[citation needed] mail transfer agent) and for generating footprints in the gEDA toolsuite. The SELinux Reference Policy relies heavily on the m4 macro processor.
m4 has many uses in code generation, but (as with any macro processor) problems can be hard to debug.[4]
Features
m4 offers these facilities:
- a free-form syntax, rather than line-based syntax
- a high degree of macro expansion (arguments get expanded during scan and again during interpretation)
- text replacement
- parameter substitution
- file inclusion
- string manipulation
- conditional evaluation
- arithmetic expressions
- system interface
- programmer diagnostics
- programming language independent
- human language independent
- provides programming language capabilities
Unlike most earlier macro processors, m4 does not target any particular computer or human language; historically, however, its development originated for supporting the Ratfor dialect of Fortran. Unlike some other macro processors, m4 is Turing-complete as well as a practical programming language.
Unquoted identifiers which match defined macros are replaced with their definitions. Placing identifiers in quotes suppresses expansion until possibly later, such as when a quoted string is expanded as part of macro replacement. Unlike most languages, strings in m4 are quoted using the backtick (`) as the starting delimiter, and apostrophe (') as the ending delimiter. Separate starting and ending delimiters allow quotation marks to be nested arbitrarily within strings, giving fine control over how and when macro expansion takes place in different parts of a string.
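As a minimal sketch of these quoting rules, the following input shows an unquoted name, a quoted name, and a doubly quoted name. The macro name NAME is purely illustrative and is not part of m4.

```m4
define(`NAME', `m4')dnl
NAME is expanded here.
`NAME' is left alone here.
dnl One level of quotes is stripped per scan, so the inner pair survives.
``NAME'' keeps one pair of quotes.
```

Under the rules described above, this should print "m4 is expanded here.", then "NAME is left alone here.", then "`NAME' keeps one pair of quotes.", because each scan removes exactly one level of quotes.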
Example
The following fragment gives a simple example that could form part of a library for generating HTML code. It defines a commented macro to number sections automatically:
divert(-1)
m4 has multiple output queues that can be manipulated with the
`divert' macro. Valid queues range from 0 to 10, inclusive, with
the default queue being 0. As an extension, GNU m4 supports more
diversions, limited only by integer type size.
Calling the `divert' macro with an invalid queue causes text to be
discarded until another call. Note that even while output is being
discarded, quotes around `divert' and other macros are needed to
prevent expansion.
# Macros aren't expanded within comments, meaning that keywords such
# as divert and other built-ins may be used without consequence.
# HTML utility macro:
define(`H2_COUNT', 0)
# The H2_COUNT macro is redefined every time the H2 macro is used:
define(`H2',
`define(`H2_COUNT', incr(H2_COUNT))<h2>H2_COUNT. $1</h2>')
divert(1)dnl
dnl
dnl The dnl macro causes m4 to discard the rest of the line, thus
dnl preventing unwanted blank lines from appearing in the output.
dnl
H2(First Section)
H2(Second Section)
H2(Conclusion)
dnl
divert(0)dnl
dnl
<HTML>
undivert(1)dnl One of the queues is being pushed to output.
</HTML>
Processing this code with m4 generates the following text:
<HTML>
<h2>1. First Section</h2>
<h2>2. Second Section</h2>
<h2>3. Conclusion</h2>
</HTML>
Implementations
FreeBSD, NetBSD, and OpenBSD provide independent implementations of the m4 language. Furthermore, the Heirloom Project Development Tools includes a free version of the m4 language, derived from OpenSolaris.
M4 has been included in the Inferno operating system. This implementation is more closely related to the original m4 developed by Kernighan and Ritchie in Version 7 Unix than its more sophisticated relatives in UNIX System V and POSIX.[5]
GNU m4 is an implementation of m4 for the GNU Project.[6][7] It is designed to avoid many kinds of arbitrary limits found in traditional m4 implementations, such as maximum line lengths, maximum size of a macro and number of macros. Removing such arbitrary limits is one of the stated goals of the GNU Project.[8]
The GNU Autoconf package makes extensive use of the features of GNU m4.
GNU m4 is currently maintained by Gary V. Vaughan and Eric Blake.[6] GNU m4 is free software, released under the terms of the GNU General Public License.
See also
- AWK – text processing programming language
- C preprocessor
- Macro (computer science)
- Make
- Template processor
- Web template system
References
- ^ Brian W. Kernighan and Dennis M. Ritchie. The m4 macro processor. Technical report, Bell Laboratories, Murray Hill, New Jersey, USA, 1977. pdf Archived August 5, 2004, at the Wayback Machine
- ^ History of GNU m4
- ^ Hall, Andrew D. (1972). The M6 Macro Processor. Computing Science Technical Report #2 (PDF) (Report). Bell Labs.
- ^ Kenneth J. Turner. Exploiting the m4 macro language. Technical Report CSM-126, Department of Computing Science and Mathematics, University of Stirling, Scotland, September 1994. pdf
- ^ – Inferno General commands Manual
- ^ a b GNU m4 web site "GNU M4" Archived July 25, 2016, at the Wayback Machine, accessed January 25, 2020.
- ^ GNU m4 manual, online and for download in HTML, PDF, and other forms. "GNU M4 — GNU macro processor" Archived August 17, 2023, at the Wayback Machine, accessed January 25, 2020.
- ^ "GNU Coding Standards: Writing Robust Programs" Archived April 16, 2016, at the Wayback Machine. quote: "Avoid arbitrary limits on the length or number of any data structure".
M4 (computer language)
Overview
Definition and Purpose
m4 is a general-purpose macro processor and Turing-complete macro language designed for text preprocessing and template expansion. It operates by reading input text, identifying macro invocations, and replacing them with predefined expansions to generate output, thereby facilitating the creation of reusable text patterns.[2][1][5]
The core purpose of m4 is to serve as a front-end tool for compilers and other software utilities, enabling the generation of code or configurations through macro definitions that abstract repetitive or parameterized text. It was originally developed to support code generation in programming languages such as C, Fortran, and Ratfor, allowing developers to enhance readability and adaptability without altering the target languages themselves.[1][6]
In practice, m4 processes input files sequentially, substituting macro calls with their corresponding expansions while preserving non-macro text, ultimately producing plain text output suitable for further processing by downstream tools. This mechanism underscores its role in streamlining development workflows within the Unix ecosystem.[2][1]
Role in Computing
m4 serves as a versatile macro processor primarily employed for text preprocessing in software development, enabling the generation of code in languages such as C by expanding macros to produce compilable source files.[4] It facilitates the creation of configuration files, where macros allow for dynamic substitution of parameters, as seen in systems like Sendmail, which uses m4 to generate its configuration files from higher-level macro definitions.[7] Additionally, m4 supports documentation generation by processing templates to produce formatted text or HTML, streamlining the maintenance of repetitive content structures.[8]
As an integral component of Unix-like systems, m4 integrates seamlessly with tools for portable scripting and build automation, forming part of the POSIX standard to ensure consistent behavior across compliant environments.[2][9] This standardization enables its use in generating portable shell scripts and Makefiles, where m4's macro expansion helps abstract platform-specific details into reusable templates.[10]
In modern software ecosystems, m4 plays a central role in the GNU Autotools suite, particularly through Autoconf, which leverages m4 macros to produce configure scripts that detect system characteristics and adapt build processes accordingly.[11] These scripts ensure portability across diverse Unix variants by automating the detection of compilers, libraries, and headers, thereby generating customized Makefiles without manual intervention.[12]
m4's language-agnostic design, stemming from its origins as a general-purpose text processor, permits its application beyond traditional programming contexts, including as a front-end preprocessor for languages like Ratfor, C, and COBOL.[2] This neutrality extends to non-programming tasks, such as HTML templating, where m4 expands macros to assemble web pages from modular components.[8]
History
Origins and Development
The m4 macro processor was developed in 1977 by Brian Kernighan and Dennis Ritchie at Bell Laboratories as part of the early suite of Unix tools.[1] It emerged from the need for a versatile text manipulation utility to support language preprocessing, particularly inspired by the requirements of Ratfor, a structured dialect of Fortran created by Kernighan that demanded a robust macro expansion mechanism.[6] m4 built upon earlier macro processors, extending M3, a system Ritchie wrote for the AP-3 minicomputer, and drawing on the ideas behind GPM, the General Purpose Macrogenerator developed by Christopher Strachey.[1]
m4 was first distributed with Version 7 Unix, released in 1979, marking its integration into the Unix ecosystem as a general-purpose tool for macro expansion and text processing. Key design goals included simplicity in syntax to facilitate user-defined macros, portability across systems like Unix and GCOS, and extensibility through built-in primitives that enabled complex text substitutions without tying it to a specific programming language.[1] These principles allowed m4 to serve as a frontend for diverse applications, emphasizing efficient handling of input streams and output generation.
In 1977, Kernighan and Ritchie published "The M4 Macro Processor," a seminal document that detailed the tool's architecture and described its suitability for languages that use functional call notation, such as PL/I and C.[1] The paper highlighted m4's ability to improve code readability and adaptability, positioning it as an essential utility for software development in resource-constrained environments like early Unix systems.[6] This publication solidified m4's foundational role in text-based preprocessing, influencing subsequent Unix tools and standards.
Standardization and Evolution
m4 was included in the initial POSIX.1-1988 standard, marking its formal recognition as a portable utility and defining a minimal set of built-in macros to ensure consistent behavior across Unix-like systems. This specification outlined essential primitives such as define, undefine, eval, ifelse, include, divert, and undivert, along with requirements for at least nine macro arguments, nine diversion buffers, and quote strings of at least five characters, prioritizing interoperability for software portability without mandating extensions.[13]
The language evolved in the late 1980s through its refinement in AT&T's System V Release 4 (SVR4), which introduced enhancements for enhanced compatibility in system-level programming and configuration. These SVR4 updates built on earlier System V versions, providing a more robust foundation that influenced subsequent Unix implementations and emphasized m4's utility in text processing for compilers and build tools.[2]
In 1990, René Seindal released GNU m4 to overcome limitations in traditional implementations, including caps on macro names and directives at 1024 characters, maximum line lengths, macro sizes, and the overall number of macros, as well as restricted file handling that hindered processing of larger inputs. GNU m4 maintained SVR4 compatibility while eliminating these constraints, supporting unlimited nesting (beyond the traditional 1024 levels) and larger-scale operations, thereby enabling more ambitious applications in open-source development.[6]
POSIX has continued to maintain and refine m4 through periodic updates, such as the 2008 revision (POSIX.1-2008), which added the & (bitwise AND) operator to the supported operators in the eval macro, clarified diversion buffer management to exactly nine buffers, and introduced the mkstemp built-in for secure temporary files while deprecating the obsolescent maketemp. As of 2025, m4 remains a stable component of POSIX-compliant systems, exerting ongoing influence in modern toolchains as the core engine for GNU Autoconf, which generates portable configure scripts for software builds across diverse platforms.[13][14]
Core Concepts
Macros and Expansion
In m4, user-defined macros form the core mechanism for text substitution, allowing users to define reusable patterns that replace tokens during processing. The primary primitive for this is define, which associates a name with a value that becomes the macro's defining text. The syntax is define(name, value), where the first argument specifies the macro name and the second provides the text to substitute upon invocation.[13]
Macro expansion occurs recursively as m4 processes input from left to right, scanning for macro names and replacing them with their defining text, which is then rescanned for further expansions until no more substitutions are possible. This recursive nature enables nested macros but can lead to infinite loops if not managed carefully. To control expansion and prevent premature substitution of arguments or embedded macros, m4 employs quoting: text enclosed between the quote delimiters is passed through with one level of quotes removed instead of being expanded. Separately, the dnl primitive discards all input up to and including the next newline, which makes it useful for comments and for suppressing unwanted newlines.[13]
When a macro is invoked with arguments, such as name(arg1, arg2), the defining text substitutes positional parameters: $1 for the first argument, $2 for the second, and so on up to $9. Special variables include $0, which expands to the macro's own name, and $#, which represents the number of arguments provided as a string. These features support parameterized substitutions without requiring complex parsing.[13]
Although m4 includes built-in primitives like ifelse that enable conditional logic, it is fundamentally a macro processor rather than a full programming language, emphasizing linear text substitution over general computation.[13]
For example, defining a macro to insert a greeting might use:
define(`greet', `Hello, $1!')
Invoking greet(World) expands to Hello, World!, with recursive scanning applied if the value contains other macros.[13]
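The effect of rescanning can be sketched with a second macro passed as the argument. The name planet is a hypothetical helper introduced only for this illustration.

```m4
define(`planet', `World')dnl
define(`greet', `Hello, $1!')dnl
greet(planet)
greet(`planet')
greet(``planet'')
```

The first call expands planet while the argument is collected, and the second expands it when the result of greet is rescanned, so both should print "Hello, World!". The third call keeps one level of quotes through the rescan and should print "Hello, planet!".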
Built-in Primitives
m4 includes a collection of built-in primitives, which are predefined macros that provide essential functionality for macro processing, output manipulation, string handling, arithmetic evaluation, and debugging. These primitives form the core toolkit for users and are required to be supported in all POSIX-conforming implementations, ensuring portability across Unix-like systems.[13] The standard set encompasses macros for defining and undefining names, controlling output flow, and performing basic computations, with extensions like stacked definitions and tracing available in compliant systems.[13]
The foundational primitives for macro management include define, which assigns a text replacement to a given name, overwriting any prior definition; undefine, which removes all definitions associated with a name; pushdef, which defines a name while stacking the previous definition for later retrieval; and popdef, which removes the current top-level definition, restoring the prior one if it exists.[13] These primitives interact with user-defined macros by providing the mechanisms for their creation, modification, and invocation during text expansion.[13]
Output control is managed through several built-ins: dnl discards input characters up to and including the next newline, commonly used to suppress comment-like lines; include inserts the contents of a specified file into the input stream; divert redirects output to one of nine numbered buffers (or discards it with a negative number), with buffer 0 resuming normal output; and undivert empties and outputs the contents of a specified buffer, or all buffers if no argument is given.[13]
For string and arithmetic operations, len returns the decimal length of its argument string, while eval performs integer arithmetic evaluation on an expression using C-style operators such as addition (+), subtraction (-), multiplication (*), division (/), modulus (%), bitwise shifts (<<, >>), comparisons (<, <=, >, >=, ==, !=), logical operations (&&, ||, !), and bitwise operations (&, ^, |, ~), with standard precedence and associativity.[13] The eval primitive supports at least 32-bit signed integer precision and recognizes octal (leading 0) and hexadecimal (leading 0x) notations, but it does not handle floating-point arithmetic; optional arguments specify the radix (default 10, range 2-36) and minimum output digits.[13]
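A few eval calls illustrate the operators, radix, and width arguments described above. This is a sketch rather than an exhaustive test, and the results quoted afterwards assume GNU m4.

```m4
eval(`7 + 3 * 2')
eval(`(7 + 3) * 2')
eval(`17 % 5')
eval(`1 << 4')
eval(`255', 16)
eval(`5', 2, 8)
eval(`010 + 0x10')
```

With GNU m4 these lines should yield 13, 20, 2, 16, ff, 00000101, and 24 respectively. The last value follows from reading 010 as octal 8 and 0x10 as hexadecimal 16.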
Conditional logic is provided by ifelse, which compares two strings and outputs a third if they match, or a fourth if they differ (or nothing if only three arguments are supplied); additional argument pairs can be processed sequentially if the initial comparison fails.[13]
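The multi-branch form can be sketched as follows. The macro name classify is hypothetical and serves only to show how additional argument groups are tried in turn.

```m4
define(`classify',
`ifelse(`$1', `0', `zero',
        eval(`$1 < 0'), `1', `negative',
        `positive')')dnl
classify(0)
classify(-4)
classify(12)
```

These calls should print zero, negative, and positive. Note that every argument of ifelse, including the eval call, is expanded while arguments are collected, before any comparison takes place, so branches with side effects need to be quoted.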
Debugging capabilities include traceon, which enables tracing of macro expansions to standard error (for a specific name or all if unspecified), and traceoff, which disables such tracing; errprint outputs its arguments to standard error for diagnostic purposes.[13]
The following table summarizes the POSIX-required and core built-in primitives, their syntax, and primary functions:
| Primitive | Syntax | Function |
|---|---|---|
| define | define(name, text) | Assigns text as the definition of name, replacing prior ones.[13] |
| undefine | undefine(name) | Removes all definitions of name.[13] |
| pushdef | pushdef(name, text) | Stacks a new definition for name, preserving the old one.[13] |
| popdef | popdef(name) | Removes the top definition of name, restoring the previous if any.[13] |
| dnl | dnl | Discards input up to and including the next newline.[13] |
| include | include(file) | Inserts the contents of file into the input stream.[13] |
| divert | divert(num) | Diverts output to buffer num (0-9; negative discards).[13] |
| undivert | undivert([num]) | Outputs and clears buffer num (or all if omitted).[13] |
| len | len(string) | Returns the length of string as a decimal number.[13] |
| eval | eval(expr[, radix[, digits]]) | Evaluates integer arithmetic expression expr.[13] |
| ifelse | ifelse(str1, str2, if-equal[, if-not]) | Outputs if-equal if str1 equals str2, else if-not or null.[13] |
| traceon | traceon([name]) | Enables tracing of expansions for name or all macros.[13] |
| traceoff | traceoff([name]) | Disables tracing for name or all macros.[13] |
| errprint | errprint(args) | Prints args to standard error.[13] |
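The stacking behaviour of pushdef and popdef listed in the table can be sketched with a hypothetical macro named color:

```m4
define(`color', `red')dnl
color
pushdef(`color', `blue')dnl
color
popdef(`color')dnl
color
```

This should print red, blue, and red again: pushdef hides the earlier definition without destroying it, and popdef uncovers it.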
Syntax and Usage
Macro Definitions and Arguments
In m4, macros are defined using the built-in define macro, which takes the syntax define(name, [expansion]), where name is the identifier to associate with the provided expansion text, and the expansion is optional (defaulting to empty if omitted).[15] The name argument must typically be quoted to prevent unintended macro expansion during definition; the default quoting delimiters are the grave accent (`) as the opening quote and the apostrophe (') as the closing quote. For example, define(`greet', `Hello, $1!') defines a macro named greet that expands to incorporate its first argument. The expansion of define itself is void, meaning it produces no output, and redefining an existing macro replaces its prior definition.[15]
When invoking a macro, arguments are passed within unquoted parentheses, separated by commas, as in macro_name(arg1, arg2, ...); the opening parenthesis must immediately follow the macro name.[16] Within the macro's expansion text, the nth argument is accessed via $n (where n starts from 1); these placeholders are replaced with the argument text, and the resulting expansion is then rescanned for further macro calls.[16] If fewer arguments are provided than expected, the missing ones default to empty strings; for instance, in the greet example above, greet alone expands to "Hello, !".[16] Special argument notations include $0 for the macro's own name, $# for the total number of arguments supplied (expanding to 0 if none), $* for all arguments joined by commas without quoting, and $@ for all arguments joined by commas with each one individually quoted.[16]
define(`info', `Macro `$0' called with $# args: $1, $2 (all: $*)')
info(`first', `second') # Expands to: Macro info called with 2 args: first, second (all: first,second)
info(`only') # Expands to: Macro info called with 1 args: only,  (all: only)
info # Expands to: Macro info called with 0 args: ,  (all: )
This example demonstrates argument access and defaults, with $# providing the count and $* yielding the empty string when no arguments are given.[16]
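The practical difference between $* and $@ shows up when an argument is itself a macro name. The sketch below uses hypothetical helper macros (count, a, via_star, via_at) that are not part of m4.

```m4
define(`count', `$#')dnl
define(`a', `x,y')dnl
define(`via_star', `count($*)')dnl
define(`via_at', `count($@)')dnl
via_star(`a', `b')
via_at(`a', `b')
```

The first call should print 3, because $* passes a unquoted, so it expands to x,y and the comma splits it into two arguments. The second should print 2, because $@ keeps each argument quoted until count has collected it.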
Quoting is essential in m4 to control expansion timing and prevent premature substitution, particularly in macro definitions and arguments, where unquoted text is scanned for macro names before being passed to the outer macro.[17] By default, the grave accent (`) opens a quoted string and the apostrophe (') closes it. During argument collection, unquoted leading whitespace is stripped from each argument, unquoted parentheses within an argument must balance, and commas outside quotes and nested parentheses separate arguments.[17] Each scan removes one level of quotes: unquoted arguments are expanded before the macro is called, singly quoted arguments are expanded only when the macro's result is rescanned, and doubly quoted arguments survive that rescan as literal text, which helps avoid unintended recursion.[17]
The changequote built-in alters these delimiters with the syntax changequote([start], [end]), where omitting both restores defaults, an empty start disables quoting (non-portable), and delimiters can be multi-character or non-ASCII.[18] This is useful for adapting to input containing the default quotes, as in changequote(`[', `]') followed by define([alert], [Warning: $1]), allowing bracket-based quoting without conflict.[18] For character-level substitution, the translit macro has the syntax translit(string, chars, [replacement]): it maps each character of string that appears in chars to the character at the same position in replacement, or deletes it if replacement is omitted or too short, and GNU m4 accepts ranges such as a-z.[19] It aids in preprocessing arguments; for example, translit(`hello', `l', `L') yields "heLLo". A literal hyphen must be placed first or last in chars so that it is not read as a range.[19]
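A short sketch combining the two builtins is shown below. The macro names upcase and alert are illustrative, and the a-z range relies on the GNU m4 behaviour mentioned above.

```m4
define(`upcase', `translit(`$1', `a-z', `A-Z')')dnl
upcase(`hello, world')
changequote(`[', `]')dnl
define([alert], [Warning: don't panic, $1])dnl
alert([disk full])
changequote([`], ['])dnl
```

Under GNU m4 this should print "HELLO, WORLD" followed by "Warning: don't panic, disk full". The apostrophe in the second definition is harmless only because the quote delimiters are brackets at that point; the final line restores the default delimiters.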
To manage defined macros, undefine removes a definition using undefine(name), where name should be quoted, restoring the macro to undefined state without output.[20] This supports dynamic adjustment in argument-heavy contexts, ensuring clean handling of nested quotes by preventing recursive expansions through proper layering.[17]
Diversions and File Handling
Diversions in m4 provide a mechanism for temporarily redirecting output to numbered buffers, allowing for deferred or reordered insertion into the main output stream. This feature enables complex output restructuring, such as conditional generation of content or producing multiple output files from a single input by diverting sections to separate buffers. In traditional Unix m4 implementations, diversions are limited to ten buffers, numbered from 0 to 9, where 0 represents the normal output stream.[13] GNU m4 extends this by supporting an unlimited number of diversions, constrained only by available memory (with a default aggregate limit of 512 KB across all buffers) and disk space when spilling to temporary files.[21]
The primary macro for managing diversions is divert(number), which redirects subsequent output to the specified buffer number; if the number is negative, output is discarded. To restore diverted content, undivert([number...]) appends the contents of the specified buffer(s) (or all non-empty buffers if no arguments are given) to the current output position, emptying the buffer afterward. For example, the following m4 input diverts a greeting to buffer 1, outputs a prefix, then undiverts the greeting (each dnl removes the newline that would otherwise follow the macro call):
divert(1)dnl
Hello, world!
divert(0)dnl
This is the prefix.
undivert(1)dnl
This produces: "This is the prefix.\nHello, world!\n".[21]
Additional control is available through divnum, which returns the current diversion number (0 for normal output), and cleardivert(number...), which discards the contents of the specified buffers without reinserting them; the GNU m4 manual presents cleardivert as a macro composed from divert and undivert rather than as a builtin. These primitives facilitate advanced uses, such as in GNU Autoconf for topological sorting of dependencies by collecting and reordering output sections. In traditional m4, diverting to a number above 9 is not portable: POSIX leaves the behavior implementation-defined, and some implementations discard the output.[13][21]
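Reordering with several buffers, together with divnum, can be sketched as follows. The text lines are arbitrary placeholders.

```m4
divert(1)dnl
body text
divert(2)dnl
footer text
divert(0)dnl
header text (diversion divnum)
undivert(2)dnl
undivert(1)dnl
```

This should print "header text (diversion 0)", then "footer text", then "body text": the buffers are emitted in the order in which undivert names them, not the order in which they were filled.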
File handling in m4 supports modular input by incorporating external files into the processing stream via the include(file) and sinclude(file) builtins. The include macro reads the contents of the specified file, expands any macros within it, and inserts the result into the input stream at the call site; it signals an error if the file cannot be found, is a directory, or is unreadable. In contrast, sinclude performs the same insertion but silently fails (returning an empty expansion) if the file is inaccessible, making it suitable for optional inclusions. Both macros treat an empty string argument as a nonexistent file and search for files along paths specified by the -I option or the M4PATH environment variable.[22]
For instance, assuming a file greet.m4 contains define(`foo', `Hello'), invoking include(`greet.m4') followed by foo expands to "Hello". This seamless concatenation allows files to be included mid-macro or comment without disruption. File inclusion integrates with diversions, enabling buffered output to incorporate external content before reinsertion. Traditional m4 shares these behaviors but may impose stricter path resolution limits compared to GNU extensions.[13][22]
Examples
Basic Macro Usage
Basic macro usage in m4 involves defining simple macros that expand to predefined text upon invocation, allowing for text substitution and basic parameterization. The primary mechanism for defining a macro is the built-in define macro, which associates a name with an expansion string. For instance, the input define(`HELLO', `Hello, World!') followed by HELLO will expand to output "Hello, World!" when processed by the m4 processor.[2]
Macros can accept arguments to enable dynamic content generation, where positional parameters are referenced using $n notation (with $1 for the first argument, $2 for the second, and so on). An example definition is define(`GREET', `$1 says: $2'), which, when invoked as GREET(`User', `Hi'), expands to "User says: Hi". This allows macros to function as parameterized templates, substituting actual values at expansion time.[2]
To control output formatting, such as suppressing extraneous newlines introduced by macro definitions, the dnl (discard to next line) primitive is commonly used immediately after a definition. For example, define(`PI', `3.14')dnl defines the macro without adding a newline to the output stream, enabling seamless inline expansions in larger texts.[2]
m4 is typically invoked from the command line as m4 input.m4 > output.txt, where the processor reads from the input file (or standard input if none specified) and writes expanded output to standard output or a redirected file. Errors during processing, such as an unreadable input file, produce diagnostic messages on standard error and a non-zero exit status, facilitating integration into scripts and build processes. A name that has no macro definition is not an error; it is simply copied to the output unchanged.[2]
Practical Applications in Build Systems
m4 plays a pivotal role in build systems by enabling the generation of portable configuration and Makefile scripts through macro expansion. In GNU Autotools, particularly Autoconf, m4 processes input files like configure.ac to produce configure scripts that probe the host system for features, ensuring software builds adapt to diverse environments.[23]
A key example is the AC_CHECK_HEADERS macro, which uses m4 to generate shell code that compiles and links test programs to detect the availability of specific header files, such as unistd.h. If the header is found, the macro defines a corresponding preprocessor symbol (e.g., HAVE_UNISTD_H) in a configuration header file, allowing conditional compilation based on system capabilities. This mechanism automates feature detection without manual intervention, supporting cross-platform development for Unix-like systems.[24]
Beyond configuration scripts, m4 facilitates direct Makefile generation by incorporating conditional logic for platform-specific settings. Using built-in primitives like esyscmd to capture shell command output (e.g., operating system details) and sysval to check exit status, macros can define variables dynamically. For instance, a probe can capture system information and apply ifelse for branching:
dnl esyscmd output ends with a newline; translit deletes it so ifelse can match.
define(`OS', translit(esyscmd(`uname -s'), `
'))dnl
ifelse(OS, `Linux', `define(`PATH_PREFIX', `/usr/local')',
OS, `Darwin', `define(`PATH_PREFIX', `/usr/local')',
`define(`PATH_PREFIX', `C:\Program Files')')dnl
This snippet captures the operating system name and sets a platform-specific path prefix accordingly (assuming non-Unix systems lack uname), enabling tailored compilation flags or include directories in the resulting Makefile. Such techniques allow m4 to automate OS detection and conditional assembly without hardcoding assumptions.[25][26]
m4's integration in GNU Autotools continues to support cross-platform projects in 2025, powering tools like Automake for generating standards-compliant Makefiles, even as alternatives like CMake gain popularity for their simplicity.
Implementations
Traditional Unix Implementations
The original implementation of m4 was developed by Brian W. Kernighan and Dennis M. Ritchie at Bell Laboratories as a general-purpose macro processor for the Unix operating system. It first appeared in the Seventh Edition of Unix (Version 7), released in January 1979, where it served primarily as a front-end preprocessor for languages like Ratfor, C, and others lacking built-in macro capabilities.[1] This version included core features such as macro definitions with up to nine arguments (accessible via $1 to $9), arithmetic operations via the eval built-in with 32-bit integer precision, string manipulation functions like len, substr, index, and translit, and output control through diversions and file inclusion with include and sinclude.[1] The design emphasized simplicity and portability, with macro names restricted to alphanumeric characters starting with a letter or underscore, and it was documented in the 1977 Bell Labs memorandum "The M4 Macro Processor."[1] The implementation was also ported to GCOS systems, with system-specific predefined macros like unix and gcos for conditional processing.[1]
In the 1980s, m4 evolved through AT&T's System V releases, culminating in the System V Release 4 (SVR4) implementation around 1989, which standardized a set of built-in macros to promote consistency across Unix variants. SVR4 m4 retained the core functionality from Version 7 while formalizing built-ins such as define, undefine, ifelse for conditionals, divert and undivert for output buffering, and syscmd for executing shell commands, ensuring compatibility with emerging portability standards.[2] Traditional implementations like SVR4 imposed practical limits, including support for only ten diversions (numbered 0 through 9, with 0 representing normal output) and argument handling capped at nine per macro, reflecting hardware constraints of the era such as limited memory for buffering.[2] These versions prioritized efficiency, resulting in a lightweight tool suitable for text processing tasks in software development environments.[27]
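A minimal sketch of output buffering with divert and undivert follows; the two text lines are placeholders. Text sent to diversion 1 is held back until it is explicitly brought back, so it appears after text that comes later in the input.

```m4
dnl Hold the first line in diversion 1.
divert(1)dnl
footer text
dnl Return to normal output (diversion 0).
divert(0)dnl
body text
dnl Append the buffered text at this point.
undivert(1)dnl
```

With this input, the body line is emitted first and the footer line second, even though they appear in the opposite order in the source.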
Berkeley Software Distribution (BSD) variants incorporated m4 from early releases, with the 4.4BSD version in the 1990s providing full compatibility with the POSIX.1 standard while adhering closely to the traditional Unix baseline without significant extensions. Included as a standard utility in 4.4BSD's Programmer's Supplementary Documents, this implementation supported the required POSIX built-ins—such as changequote for quote customization, incr and decr for arithmetic, errprint for diagnostics, and m4wrap for cleanup actions—and enforced at least 32-bit precision for eval along with a minimum of nine diversion buffers.[13][28] The BSD m4 emphasized POSIX conformance for portability, enabling seamless use in academic and research environments, and maintained the lightweight footprint of earlier Unix versions by avoiding non-essential features.[13] Overall, traditional Unix m4 implementations focused on POSIX compliance, delivering a compact, standards-oriented tool for macro expansion in build processes and configuration files.[13]
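The POSIX built-ins mentioned above can be sketched together as follows. The macro name count and the message strings are assumptions made for the example.

```m4
dnl Switch the quote characters to square brackets.
changequote([, ])dnl
dnl count is a hypothetical macro holding a number.
define([count], [3])dnl
incr(count) decr(count)
dnl errprint writes to standard error rather than to the output.
errprint([note: count is ]count[
])dnl
dnl m4wrap saves text that is expanded once the end of input is reached.
m4wrap([end of input
])dnl
```

Standard output should contain the line `4 2` followed by `end of input`, while the diagnostic goes to standard error.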
GNU m4 and Extensions
GNU m4 is the primary implementation of the m4 macro processor maintained by the GNU Project, with its initial version 1.0 released in 1990 and the current stable version 1.4.20 issued on May 10, 2025.[14][29] This version incorporates several years of portability enhancements, minor optimizations, and bug fixes while remaining backward-compatible with prior releases in the 1.4.x series.[29] Development is led by maintainers Gary V. Vaughan and Eric Blake, ensuring ongoing support under the GNU General Public License.[14] Beyond traditional m4 constraints, GNU m4 extends functionality by removing artificial limits on macro size, line length, and the number of macros, which facilitates processing of larger, more complex inputs in contemporary environments.[2] Key additions include built-in regular expression primitives such as regexp for pattern matching and patsubst for substitution, enabling advanced string manipulation without external tools.[2] Loadable modules to add custom built-ins dynamically are planned for version 2.0.[14]
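A brief sketch of the two regular expression built-ins follows, assuming GNU m4 (they are not available in strictly traditional implementations). The input strings are invented for illustration.

```m4
dnl With two arguments, regexp expands to the index of the first match.
regexp(`GNU m4 1.4.20', `[0-9]+\.[0-9]+')
dnl With a third argument, it expands to that text, with \1 and \2 replaced by the groups.
regexp(`GNU m4 1.4.20', `\([0-9]+\)\.\([0-9]+\)', `major=\1 minor=\2')
dnl patsubst replaces every match in the string.
patsubst(`main.c util.c', `\.c', `.o')
```

The three lines should expand to `7`, `major=1 minor=4`, and `main.o util.o`. Note that grouping uses `\(` and `\)`, since GNU m4 follows an Emacs-like regular expression syntax.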
GNU m4 provides a compatibility mode to align with POSIX and SVR4 behavior, invoked via the --traditional flag (short form -G), which disables GNU-specific extensions like extended positional arguments (beyond nine parameters) for stricter adherence to original Unix behavior.[30] Debugging is supported by the traceon and traceoff built-ins, together with debugmode for selecting the level of detail; traced macro invocations, arguments, and expansions are reported on standard error in lines prefixed with m4trace: to aid in troubleshooting complex scripts.[2] These features, combined with its robustness, position GNU m4 as a core tool in the Autotools ecosystem, where it underpins GNU Autoconf for automated configuration script generation in software builds.[14]
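Tracing a single macro might look like the sketch below. The macro double is hypothetical, and the exact layout of the trace lines depends on the active debug flags.

```m4
dnl double is a hypothetical macro used only to show tracing.
define(`double', `eval($1 * 2)')dnl
dnl Enable tracing for this one macro.
traceon(`double')dnl
double(21)
dnl Disable tracing again.
traceoff(`double')dnl
```

The expansion 42 is written to standard output as usual, while GNU m4 reports the traced call on standard error in a line beginning with `m4trace:`.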