Macro (computer science)
In computer programming, a macro (short for "macro instruction"; from Greek μακρο- 'long, large'[1]) is a rule or pattern that specifies how a certain input should be mapped to a replacement output. Applying a macro to an input is known as macro expansion.
The input and output may be a sequence of lexical tokens or characters, or a syntax tree. Character macros are supported in software applications to make it easy to invoke common command sequences. Token and tree macros are supported in some programming languages to enable code reuse or to extend the language, sometimes for domain-specific languages.
Macros are used to make a sequence of computing instructions available to the programmer as a single program statement, making the programming task less tedious and less error-prone.[2][3] Thus, they are called "macros" because a "big" block of code can be expanded from a "small" sequence of characters. Macros often allow positional or keyword parameters that dictate what the conditional assembler program generates and have been used to create entire programs or program suites according to such variables as operating system, platform or other factors. The term derives from "macro instruction", and such expansions were originally used in generating assembly language code.
Keyboard and mouse macros
Keyboard macros and mouse macros allow short sequences of keystrokes and mouse actions to transform into other, usually more time-consuming, sequences of keystrokes and mouse actions. In this way, frequently used or repetitive sequences of keystrokes and mouse movements can be automated. Separate programs for creating these macros are called macro recorders.
During the 1980s, macro programs – originally SmartKey, then SuperKey, KeyWorks, Prokey – were very popular, first as a means to automatically format screenplays, then for a variety of user-input tasks. These programs were based on the terminate-and-stay-resident mode of operation and applied to all keyboard input, no matter in which context it occurred. They have to some extent fallen into obsolescence following the advent of mouse-driven user interfaces and the availability of keyboard and mouse macros in applications, such as word processors and spreadsheets, making it possible to create application-sensitive keyboard macros.
Keyboard macros can be used in massively multiplayer online role-playing games (MMORPGs) to perform repetitive, but lucrative tasks, thus accumulating resources. As this is done without human effort, it can skew the economy of the game. For this reason, use of macros is a violation of the TOS or EULA of most MMORPGs, and their administrators spend considerable effort to suppress them.[4]
Application macros and scripting
Keyboard and mouse macros that are created using an application's built-in macro features are sometimes called application macros. They are created by carrying out the sequence once and letting the application record the actions. An underlying macro programming language, most commonly a scripting language, with direct access to the features of the application may also exist.
The programmers' text editor Emacs (short for "editing macros") follows this idea to a conclusion. In effect, most of the editor is made of macros. Emacs was originally devised as a set of macros in the editing language TECO; it was later ported to dialects of Lisp.
Another programmers' text editor, Vim (a descendant of vi), also has an implementation of keyboard macros. It can record what a person types on the keyboard into a register, and the recording can then be replayed or edited, much like VBA macros for Microsoft Office. Vim also has a scripting language called Vimscript[5] to create macros.
Visual Basic for Applications (VBA) is a programming language included in Microsoft Office from Office 97 through Office 2019 (although it was available in some components of Office prior to Office 97). Its function evolved from, and replaced, the macro languages that were originally included in some of these applications.
XEDIT, running on the Conversational Monitor System (CMS) component of VM, supports macros written in EXEC, EXEC2 and REXX, and some CMS commands were actually wrappers around XEDIT macros. The Hessling Editor (THE), a partial clone of XEDIT, supports Rexx macros using Regina and Open Object REXX (oorexx). Many common applications, and some on PCs, use Rexx as a scripting language.
Macro virus
VBA has access to most Microsoft Windows system calls and executes when documents are opened. This makes it relatively easy to write computer viruses in VBA, commonly known as macro viruses. In the mid-to-late 1990s, this became one of the most common types of computer virus. However, since the late 1990s, Microsoft has been patching and updating its programs.[citation needed] In addition, current anti-virus programs are designed to detect and counteract such attacks.
Parameterized and parameterless macro
A parameterized macro is a macro that is able to insert given objects into its expansion. This gives the macro some of the power of a function.
As a simple example, in the C programming language, this is a typical macro that is not a parameterized macro, i.e., a parameterless macro:
#define PI 3.14159
This causes PI to always be replaced with 3.14159 wherever it occurs. An example of a parameterized macro, on the other hand, is this:
#define pred(x) ((x)-1)
What this macro expands to depends on what argument x is passed to it. Here are some possible expansions:
pred(2) → ((2)-1)
pred(y+2) → ((y+2)-1)
pred(f(5)) → ((f(5))-1)
Parameterized macros are a useful source-level mechanism for performing in-line expansion, but in languages such as C where they use simple textual substitution, they have a number of severe disadvantages over other mechanisms for performing in-line expansion, such as inline functions.
The parameterized macros used in languages such as Lisp, PL/I and Scheme, on the other hand, are much more powerful, able to make decisions about what code to produce based on their arguments; thus, they can effectively be used to perform run-time code generation.
Text-substitution macros
Languages such as C and some assembly languages have rudimentary macro systems, implemented as preprocessors to the compiler or assembler. C preprocessor macros work by simple textual substitution at the token level rather than the character level. However, the macro facilities of more sophisticated assemblers, e.g., IBM High Level Assembler (HLASM), can't be implemented with a preprocessor; the code for assembling instructions and data is interspersed with the code for assembling macro invocations.
A classic use of macros is in the computer typesetting system TeX and its derivatives, where most functionality is based on macros.[6]
MacroML is an experimental system that seeks to reconcile static typing and macro systems. Nemerle has typed syntax macros, and one productive way to think of these syntax macros is as a multi-stage computation.
Other examples:
- m4 is a sophisticated stand-alone macro processor.
- TRAC
- Macro Extension TAL, accompanying Template Attribute Language
- SMX: for web pages
- ML/1 (Macro Language One)
- troff and nroff: for typesetting and formatting Unix manpages.
- CMS EXEC: for command-line macros and application macros
- EXEC 2 in Conversational Monitor System (CMS): for command-line macros and application macros
- CLIST in IBM's Time Sharing Option (TSO): for command-line macros and application macros
- REXX: for command-line macros and application macros in, e.g., AmigaOS, CMS, OS/2, TSO
- SCRIPT: for formatting documents
- Various shells for, e.g., Linux
Some major applications have been written as text macros invoked by other applications, e.g., by XEDIT in CMS.
Embeddable languages
Some languages, such as PHP, can be embedded in free-format text, or the source code of other languages. The mechanism by which the code fragments are recognised (for instance, being bracketed by <?php and ?>) is similar to a textual macro language, but they are much more powerful, fully featured languages.
Procedural macros
Macros in the PL/I language are written in a subset of PL/I itself: the compiler executes "preprocessor statements" at compilation time, and the output of this execution forms part of the code that is compiled. The ability to use a familiar procedural language as the macro language gives power much greater than that of text substitution macros, at the expense of a larger and slower compiler. Macros in PL/I, as well as in many assemblers, may have side effects, e.g., setting variables that other macros can access.
Frame technology's frame macros have their own command syntax but can also contain text in any language. Each frame is both a generic component in a hierarchy of nested subassemblies, and a procedure for integrating itself with its subassembly frames (a recursive process that resolves integration conflicts in favor of higher level subassemblies). The outputs are custom documents, typically compilable source modules. Frame technology can avoid the proliferation of similar but subtly different components, an issue that has plagued software development since the invention of macros and subroutines.
Most assembly languages have less powerful procedural macro facilities, for example allowing a block of code to be repeated N times for loop unrolling; but these have a completely different syntax from the actual assembly language.
Syntactic macros
Macro systems, such as the C preprocessor described earlier, that work at the level of lexical tokens cannot preserve the lexical structure reliably. Syntactic macro systems work instead at the level of abstract syntax trees, and preserve the lexical structure of the original program. The most widely used implementations of syntactic macro systems are found in Lisp-like languages. These languages are especially suited for this style of macro due to their uniform, parenthesized syntax (known as S-expressions). In particular, uniform syntax makes it easier to determine the invocations of macros. Lisp macros transform the program structure itself, with the full language available to express such transformations. While syntactic macros are often found in Lisp-like languages, they are also available in other languages such as Prolog,[7] Erlang,[8] Dylan,[9] Scala,[10] Nemerle,[11] Rust,[12] Elixir,[13] Nim,[14] Haxe,[15] and Julia.[16] They are also available as third-party extensions to JavaScript[17] and C#.[18]
Early Lisp macros
Before Lisp had macros, it had so-called FEXPRs, function-like operators whose inputs were not the values computed by the arguments but rather the syntactic forms of the arguments, and whose outputs were values to be used in the computation. In other words, FEXPRs were implemented at the same level as EVAL, and provided a window into the meta-evaluation layer. This was generally found to be a difficult model to reason about effectively.[19]
In 1963, Timothy Hart proposed adding macros to Lisp 1.5 in AI Memo 57: MACRO Definitions for LISP.[20]
Anaphoric macros
An anaphoric macro is a type of programming macro that deliberately captures some form supplied to the macro which may be referred to by an anaphor (an expression referring to another). Anaphoric macros first appeared in Paul Graham's On Lisp and their name is a reference to linguistic anaphora: the use of words as a substitute for preceding words.
Hygienic macros
In the mid-eighties, a number of papers[21][22] introduced the notion of hygienic macro expansion (syntax-rules), a pattern-based system where the syntactic environments of the macro definition and the macro use are distinct, allowing macro definers and users not to worry about inadvertent variable capture (cf. referential transparency). Hygienic macros have been standardized for Scheme in the R5RS, R6RS, and R7RS standards. A number of competing implementations of hygienic macros exist such as syntax-rules, syntax-case, explicit renaming, and syntactic closures. Both syntax-rules and syntax-case have been standardized in the Scheme standards.
Recently, Racket has combined the notions of hygienic macros with a "tower of evaluators", so that the syntactic expansion time of one macro system is the ordinary runtime of another block of code,[23] and showed how to apply interleaved expansion and parsing in a non-parenthesized language.[24]
A number of languages other than Scheme either implement hygienic macros or implement partially hygienic systems. Examples include Scala, Rust, Elixir, Julia, Dylan, Nim, and Nemerle.
Applications
- Evaluation order
- Macro systems have a range of uses. Being able to choose the order of evaluation (see lazy evaluation and non-strict functions) enables the creation of new syntactic constructs (e.g. control structures) indistinguishable from those built into the language. For instance, in a Lisp dialect that has cond but lacks if, it is possible to define the latter in terms of the former using macros. For example, Scheme has both continuations and hygienic macros, which enables a programmer to design their own control abstractions, such as looping and early exit constructs, without the need to build them into the language.
- Data sub-languages and domain-specific languages
- Next, macros make it possible to define data languages that are immediately compiled into code, which means that constructs such as state machines can be implemented in a way that is both natural and efficient.[25]
- Binding constructs
- Macros can also be used to introduce new binding constructs. The most well-known example is the transformation of let into the application of a function to a set of arguments.
Felleisen conjectures[26] that these three categories make up the primary legitimate uses of macros in such a system. Others have proposed alternative uses of macros, such as anaphoric macros in macro systems that are unhygienic or allow selective unhygienic transformation.
The interaction of macros and other language features has been a productive area of research. For example, components and modules are useful for large-scale programming, but the interaction of macros and these other constructs must be defined for their use together. Module and component-systems that can interact with macros have been proposed for Scheme and other languages with macros. For example, the Racket language extends the notion of a macro system to a syntactic tower, where macros can be written in languages including macros, using hygiene to ensure that syntactic layers are distinct and allowing modules to export macros to other modules.
Macros for machine-independent software
Macros are normally used to map a short string (macro invocation) to a longer sequence of instructions. Another, less common, use of macros is to do the reverse: to map a sequence of instructions to a macro string. This was the approach taken by the STAGE2 Mobile Programming System, which used a rudimentary macro compiler (called SIMCMP) to map the specific instruction set of a given computer into machine-independent macros. Applications (notably compilers) written in these machine-independent macros can then be run without change on any computer equipped with the rudimentary macro compiler. The first application run in such a context is a more sophisticated and powerful macro compiler, written in the machine-independent macro language. This macro compiler is applied to itself, in a bootstrap fashion, to produce a compiled and much more efficient version of itself. The advantage of this approach is that complex applications can be ported from one computer to a very different computer with very little effort (for each target machine architecture, just the writing of the rudimentary macro compiler).[27][28] The advent of modern programming languages, notably C, for which compilers are available on virtually all computers, has rendered such an approach superfluous. This was, however, one of the first instances (if not the first) of compiler bootstrapping.
Assembly language
While macro instructions can be defined by a programmer for any set of native assembler program instructions, macros are typically associated with macro libraries delivered with the operating system, allowing access to operating system functions such as
- peripheral access by access methods (including macros such as OPEN, CLOSE, READ and WRITE)
- operating system functions such as ATTACH, WAIT and POST for subtask creation and synchronization.[29]
Typically such macros expand into executable code (e.g., for the EXIT macroinstruction), a list of define constant instructions (e.g., for the DCB macro, or DTF (Define The File) for DOS[30]), or a combination of code and constants, with the details of the expansion depending on the parameters of the macro instruction (such as a reference to a file and a data area for a READ instruction). The executable code often terminates in either a branch and link register instruction to call a routine, or a supervisor call instruction to call an operating system function directly.
- Generating a Stage 2 job stream for system generation in, e.g., OS/360. Unlike typical macros, sysgen stage 1 macros do not generate data or code to be loaded into storage, but rather use the PUNCH statement to output JCL and associated data.
In older operating systems such as those used on IBM mainframes, full operating system functionality was only available to assembler language programs, not to high level language programs (unless assembly language subroutines were used, of course), as the standard macro instructions did not always have counterparts in routines available to high-level languages.
History
In the mid-1950s, when assembly language programming was the main way to program a computer, macro instruction features were developed to reduce source code (by generating multiple assembly statements from each macro instruction) and to enforce coding conventions (e.g. specifying input/output commands in standard ways).[31] A macro instruction embedded in the otherwise assembly source code would be processed by a macro compiler, a preprocessor to the assembler, to replace the macro with one or more assembly instructions. The resulting code, pure assembly, would be translated to machine code by the assembler.[32]
Two of the earliest programming installations to develop macro languages for the IBM 705 computer were at Dow Chemical Corp. in Delaware and the Air Materiel Command, Ballistics Missile Logistics Office in California.
Some consider macro instructions as an intermediate step between assembly language programming and the high-level programming languages that followed, such as FORTRAN and COBOL.
By the late 1950s, macro languages were followed by macro assemblers, in which a single program served both functions, that of a macro preprocessor and that of an assembler.[32][failed verification] Early examples are FORTRAN Assembly Program (FAP)[33] and Macro Assembly Program (IBMAP)[34] on the IBM 709, 7094, 7040 and 7044, and Autocoder[35] on the 7070/7072/7074.
In 1959, Douglas E. Eastwood and Douglas McIlroy of Bell Labs introduced conditional and recursive macros into the popular SAP assembler,[36] creating what is known as Macro SAP.[37] McIlroy's 1960 paper was seminal in the area of extending any (including high-level) programming languages through macro processors.[38][36]
Macro Assemblers allowed assembly language programmers to implement their own macro-language and allowed limited portability of code between two machines running the same CPU but different operating systems, for example, early versions of MS-DOS and CP/M-86. The macro library would need to be written for each target machine but not the overall assembly language program. Note that more powerful macro assemblers allowed use of conditional assembly constructs in macro instructions that could generate different code on different machines or different operating systems, reducing the need for multiple libraries.[citation needed]
In the 1980s and early 1990s, desktop PCs were only running at a few MHz and assembly language routines were commonly used to speed up programs written in C, Fortran, Pascal and others. These languages, at the time, used different calling conventions. Macros could be used to interface routines written in assembly language to the front end of applications written in almost any language. Again, the basic assembly language code remained the same, only the macro libraries needed to be written for each target language.[citation needed]
In modern operating systems such as Unix and its derivatives, operating system access is provided through subroutines, usually provided by dynamic libraries. High-level languages such as C offer comprehensive access to operating system functions, obviating the need for assembler language programs for such functionality.[citation needed]
Moreover, the standard libraries of several newer programming languages, such as Go, actively discourage the direct use of syscalls where it is not necessary, favoring platform-agnostic libraries to improve portability and security.[39]
See also
- Anaphoric macro
- Assembly language § Macros – Backstory of macros
- Compound operator – Basic programming language construct
- Extensible programming – Style of computer programming
- Fused operation – Basic programming language construct
- Hygienic macro – Macros whose expansion is guaranteed not to cause the capture of identifiers
- Macro and security – Rule for substituting a set input with a set output
- Programming by demonstration – Technique for teaching a computer or a robot new behaviors
- String interpolation – Replacing placeholders in a string with values
References
- ^ Oxford English Dictionary, s.v. macro, macro-instruction, and macro-
- ^ Greenwald, Irwin D.; Kane, Maureen (April 1959). "The Share 709 System: Programming and Modification". Journal of the ACM. 6 (2). New York, NY, USA: ACM: 128–133. doi:10.1145/320964.320967. S2CID 27424222.
One of the important uses of programmer macros is to save time and clerical-type errors in writing sequence of instructions which are often repeated in the course of a program.
- ^ Strachey, Christopher (October 1965). "A General Purpose Macrogenerator". Computer Journal. 8 (3): 225–241. doi:10.1093/comjnl/8.3.225.
- ^ "Runescape: The Massive Online Adventure Game by Jagex Ltd". Retrieved 2008-04-03.
- ^ "scripts: vim online". www.vim.org.
- ^ Fine, Johnathan. "TeX forever!" (PDF). Tex Users Group. p. 141. Retrieved 6 December 2024.
TeX has a macro programming language, which allows features to be added.
- ^ "Prolog Macros". www.metalevel.at. Retrieved 2021-04-05.
- ^ "Erlang -- Preprocessor". erlang.org. Retrieved 2021-05-24.
- ^ "The Dylan Macro System — Open Dylan". opendylan.org. Retrieved 2021-04-05.
- ^ "Def Macros". Scala Documentation. Retrieved 2021-04-05.
- ^ "About - Nemerle programming language official site". nemerle.org. Retrieved 2021-04-05.
- ^ "Macros - The Rust Programming Language". doc.rust-lang.org. Retrieved 2021-04-05.
- ^ "Macros". elixir-lang.github.com. Retrieved 2021-04-05.
- ^ "macros". nim-lang.org. Retrieved 2021-04-05.
- ^ "Macros". Haxe - The Cross-platform Toolkit.
- ^ "Metaprogramming · The Julia Language". docs.julialang.org. Retrieved 2021-04-05.
- ^ "Sweet.js - Hygienic Macros for JavaScript". www.sweetjs.org.
- ^ "LeMP Home Page · Enhanced C#". ecsharp.net.
- ^ Marshall, Joe. "untitled email". Retrieved May 3, 2012.
- ^ Hart, Timothy P. (October 1963). "MACRO Definitions for LISP". AI Memos. hdl:1721.1/6111. AIM-057.
- ^ Kohlbecker, Eugene; Friedman, Daniel; Felleisen, Matthias; Duba, Bruce (1986). "Hygienic Macro Expansion". LFP '86: Proceedings of the 1986 ACM conference on LISP and functional programming. pp. 151–161. doi:10.1145/319838.319859. ISBN 0897912004.
- ^ Clinger, Rees. "Macros that Work"
- ^ Flatt, Matthew. "Composable and compilable macros: you want it when?" (PDF).
- ^ Rafkind, Jon; Flatt, Matthew. "Honu: Syntactic Extension for Algebraic Notation through Enforestation" (PDF).
- ^ "Automata via Macros". cs.brown.edu.
- ^ Felleisen, Matthias. LL1 mailing list posting
- ^ Orgass, Richard J.; Waite, William M. (September 1969). "A base for a mobile programming system". Communications of the ACM. 12 (9). New York, NY, USA: ACM: 507–510. doi:10.1145/363219.363226. S2CID 8164996.
- ^ Waite, William M. (July 1970). "The mobile programming system: STAGE2". Communications of the ACM. 13 (7). New York, NY, USA: ACM: 415–421. doi:10.1145/362686.362691. S2CID 11733598.
- ^ "University of North Florida" (PDF). Archived from the original (PDF) on 2017-08-29. Retrieved 2018-08-15.
- ^ "DTF (DOS/VSE)". IBM.
- ^ "IBM Knowledge Center". IBM Knowledge Center. 16 August 2013.
- ^ a b "Assembler Language Macro Instructions". Cisco.
- ^ FORTRAN ASSEMBLY PROGRAM (FAP) for the IBM 709/7090 (PDF). 709/7090 Data Processing System Bulletin. IBM. 1961. J28-6098-1.
- ^ IBM 7090/7094 Programming Systems: - Macro Assembly Program (MAP) Language (PDF). Systems Reference Library. 1964. C28-6311-4. Retrieved January 12, 2025.
- ^ Reference Manual - IBM 7070 Series Programming Systems - Autocoder (PDF). IBM Systems Reference Library (First ed.). IBM Corporation. 1961. C28-6121-0.
- ^ a b Holbrook, Bernard D.; Brown, W. Stanley. "Computing Science Technical Report No. 99 – A History of Computing Research at Bell Laboratories (1937–1975)". Bell Labs. Archived from the original on September 2, 2014. Retrieved February 2, 2020.
- ^ "Macro SAP – Macro compiler modification of SAP". HOPL: Online Historical Encyclopaedia of Programming Languages. Archived from the original on August 13, 2008.
- ^ Layzell, P. (1985). "The History of Macro Processors in Programming Language Extensibility". The Computer Journal. 28 (1): 29–33. doi:10.1093/comjnl/28.1.29.
- ^ "syscall package - syscall - Go Packages". pkg.go.dev. Retrieved 2024-06-06.
Macro (computer science)
#define PI 3.14159) and function-like macros that mimic functions (e.g., #define MAX(a,b) ((a) > (b) ? (a) : (b))), where the preprocessor replaces the macro name with its expansion to reduce code duplication and improve readability.[1][6] In contrast, syntactic macros, prominent in languages like Lisp and Scheme, treat macros as functions that manipulate syntax during compilation, allowing programmers to create domain-specific languages or new control structures, such as defining a while loop in a functional paradigm.[5][2]
Modern languages continue to leverage macros for metaprogramming—writing code that generates other code—with Rust's declarative and procedural macros enabling traits like automatic derivation of methods (e.g., #[derive(Debug)]), and similar features in languages like Elixir and Julia for enhancing expressiveness and performance.[7] While macros offer benefits like conciseness and portability across compilers, they can introduce challenges such as debugging difficulties due to expansion opacity and potential for unintended side effects in function-like forms.[1][5] Overall, macros remain a foundational tool for language extensibility, influencing compiler design and software development practices.[4]
Fundamentals
Definition and Purpose
In computer science, a macro is a rule or pattern that specifies how a certain input sequence, such as a command or code snippet, should be mapped to an output sequence, such as expanded instructions, with the expansion automated during compilation, interpretation, or execution.[8] This mechanism allows for the abstraction of repetitive or complex operations into simpler invocations, treating the macro as a shorthand for the underlying sequence.[9] The primary purposes of macros include reducing tedium in programming or user tasks by automating repetitive actions, enabling abstraction for complex operations that would otherwise require verbose code, and facilitating code reuse and portability across different contexts.[10] For instance, macros can shorten verbose code in languages like C by defining constants or inline expressions, or automate user interface actions such as sequences of keystrokes in applications.[11] Keyboard macros serve as a simple example of this in user applications, while syntactic macros enable advanced code transformations in languages like Lisp.[2] Macros offer benefits such as improved programmer productivity through reduced manual repetition, enhanced code readability by hiding implementation details, and better maintainability via modular abstractions that promote reuse.[12] However, they introduce trade-offs, including potential errors during expansion that can lead to subtle bugs, challenges in debugging due to the pre-compilation substitution process, and increased code size from inlining, which may complicate optimization.[11] The general workflow of a macro involves three key stages: definition, where the mapping rule is specified (e.g., via a directive like #define in C); invocation, where the macro name is used in the source code as a placeholder; and expansion, where the processor or interpreter replaces the invocation with the corresponding output sequence, often performing substitutions for parameters if applicable.[9] This 
process occurs before full compilation or execution, ensuring the expanded code integrates seamlessly into the program.[13]
Classification of Macros
Macros in computer science are broadly classified into several primary categories based on their mechanisms and application domains. User interface macros, such as keyboard and mouse macros, automate sequences of input actions like keystrokes or cursor movements, operating at a low level to simulate device interactions across applications without requiring deep integration with the host software.[14] In contrast, textual macros perform simple text-substitution, replacing predefined patterns with fixed or parameterized strings during preprocessing, as seen in assembly language macros where mnemonic abbreviations expand to machine instructions.[15] Syntactic macros extend this by transforming the abstract syntax tree (AST) of code, enabling structure-aware modifications that preserve program semantics, while procedural macros go further by executing arbitrary code to generate or manipulate program fragments dynamically.[16] Within these categories, macros can be further distinguished by scope and behavior. 
Parameterless macros involve fixed expansions without inputs, suitable for constant substitutions, whereas parameterized macros accept arguments to enable reusable, context-dependent transformations.[15] Hygiene addresses variable binding: hygienic macros isolate generated identifiers to prevent unintended name capture in the surrounding code, promoting safer expansions, in opposition to anaphoric macros that deliberately share lexical context for more expressive but riskier integrations.[16] Hybrid forms arise in embeddable languages, where macros integrate seamlessly with the host language's runtime or compilation environment, allowing code generation that blends macro expansion with interpreted or compiled execution.[16] Key distinctions among macro types include expansion timing (runtime for dynamic adaptation in user interface macros, versus compile-time for textual and syntactic macros to optimize performance) and power level, ranging from basic replacement in textual macros to domain-specific language (DSL) extension in procedural ones.[15]
User Interface and Application Macros
Keyboard and Mouse Macros
Keyboard and mouse macros consist of recorded sequences of keystrokes, mouse clicks, movements, or drags that can be replayed on demand to automate repetitive interactions with graphical user interfaces (GUIs). These macros simulate human input at the device level, capturing and reproducing actions such as typing text, navigating menus, or performing precise cursor operations without requiring direct code-level intervention. Unlike higher-level scripting, they focus on low-level input events, making them accessible for non-programmers to handle routine UI tasks efficiently.[17]
Implementation typically involves either direct recording of user actions or scripted definitions using specialized tools. For instance, AutoHotkey, a free open-source scripting language for Windows (current version v2.0.19 as of January 2025), enables users to create macros by recording inputs via add-ons like Pulover's Macro Creator or by writing simple scripts that send simulated keystrokes and mouse events to target windows. On Windows, the Microsoft Mouse and Keyboard Center supports macro recording directly within its interface, allowing users to assign sequences to programmable hardware buttons on compatible mice and keyboards, with playback triggered by hardware presses. In macOS, Automator's "Watch Me Do" action provides a built-in recording mode that captures mouse and keyboard events during a demonstration, compiling them into a replayable workflow that can be triggered via shortcuts or schedules. On Linux, tools like AutoKey allow similar recording and playback of keyboard and mouse events using Python scripting. These tools often support two modes: direct capture, which records raw events for exact replay, and scripted modes, which allow editing for conditional logic or loops.[17][18][19]
Common use cases span gaming, productivity, and accessibility applications. In gaming, macros bind complex sequences (such as rapid-fire key combinations or coordinated mouse movements) to a single trigger, enabling players to execute intricate maneuvers without manual repetition. For productivity, they automate form filling in applications like web browsers or office software, where a macro might simulate tabbing through fields and entering predefined data to streamline data entry tasks. In accessibility scenarios, macros simplify interactions for users with motor impairments by mapping multi-step gestures to single inputs, such as replacing a series of clicks and drags with one keystroke to enhance usability in standard GUIs.[17][20]
Despite their utility, keyboard and mouse macros face limitations related to hardware dependencies and adaptability. They are inherently tied to specific operating systems and input devices, often failing across platforms or with varying screen resolutions due to absolute coordinate reliance in recordings. Moreover, they struggle with dynamic interfaces where UI elements shift, such as in responsive web pages, requiring manual scripting extensions for robustness against changes like window resizing or layout updates. These constraints highlight the need for hybrid approaches combining recording with visual or structural UI recognition for more reliable automation.[20][21]
Application Macros and Scripting
Application macros and scripting refer to programmable sequences defined within specific software applications using their built-in APIs or dedicated scripting languages to automate complex workflows and tasks. These macros go beyond simple recordings by allowing users to write custom code that interacts directly with the application's object model, enabling manipulation of data, user interfaces, and internal states. For instance, in spreadsheet software like Microsoft Excel, Visual Basic for Applications (VBA) serves as the primary scripting language, embedded within the Office suite to create macros that perform calculations, format data, or generate reports programmatically.[22] Similarly, Adobe applications such as Photoshop and InDesign utilize ExtendScript, an extension of JavaScript, to script operations like batch image processing or document layout automation.[23]
Implementation of application macros typically involves event-driven scripting, where code responds to user actions, application events, or triggers such as file openings or button clicks. Developers access the application's GUI elements through object-oriented APIs, allowing scripts to read, modify, or create components like worksheets in Excel or layers in Adobe Illustrator. In integrated development environments (IDEs), scripting consoles provide similar capabilities; for example, IntelliJ IDEA's IDE Scripting Console uses languages like Groovy or Kotlin to automate refactoring, code generation, or project inspections without developing full plugins.[24] Macros are often stored in dedicated files (such as .bas modules in VBA or .jsx files in ExtendScript) and can be executed via menus, keyboard shortcuts, or automated schedules, integrating seamlessly with the host application's runtime environment.[22][23]
These macros offer significant advantages through their context-aware nature, adapting to the current state of the application, such as dynamic data ranges in a spreadsheet or selected objects in a design tool. Extensibility is a key benefit, as scripting supports control structures like loops, conditional statements, and error handling, enabling sophisticated automation that scales with user needs; for example, VBA scripts can loop through thousands of rows to apply conditional formatting, while ExtendScript can iterate over image batches for resizing and exporting.[22] This integration also facilitates inter-application communication, such as transferring data between Excel and Outlook or coordinating actions across Adobe Creative Cloud apps via BridgeTalk messaging.[23] Overall, they enhance productivity by reducing manual effort in repetitive tasks and allowing customization tailored to domain-specific workflows.[24]
Despite these benefits, application macros and scripting present challenges, particularly vendor lock-in, where reliance on proprietary APIs ties users to a specific software ecosystem, complicating migrations to alternative tools due to incompatible scripting syntax or object models.[25] Maintenance issues arise as application updates frequently alter APIs or behaviors, requiring script revisions to ensure compatibility; for instance, VBA macros developed for older Excel versions may fail in newer releases without adjustments.[22] Security restrictions further complicate deployment, as macro-enabled files demand explicit user trust settings to mitigate risks of malicious code execution, often leading to organizational policies that disable scripting by default.[22] These factors can increase long-term costs and limit portability across platforms.
Textual Macros
Parameterless Text-Substitution Macros
Parameterless text-substitution macros, also referred to as object-like macros, consist of a single identifier that the preprocessor replaces with a fixed sequence of tokens, without accepting any input parameters.[26] These macros are defined using a directive such as #define in languages like C, where the identifier is followed by the expansion text, enabling straightforward symbolic representation of constants or repeated code fragments.[6] The initial implementation of such macros in the C preprocessor, developed in the early 1970s, provided basic string replacement capabilities as a foundational feature for code abstraction.[27]
In practice, parameterless macros are commonly employed to define numeric constants or short phrases that appear frequently in source code. For instance, in C programming, the directive #define PI 3.14159 instructs the preprocessor to substitute every subsequent occurrence of the token PI with 3.14159, as shown in the following example:
#define PI 3.14159
double circumference = 2 * PI * radius;
After preprocessing, the second line reads double circumference = 2 * 3.14159 * radius;, eliminating the need to hardcode the value and reducing errors from manual repetition.[26] Similarly, in build systems like GNU Make, parameterless macros function as simple variables for text substitution; for example, defining CC = gcc allows $(CC) to expand to gcc in rules, such as
program: main.o
	$(CC) -o program main.o
which Make expands to
program: main.o
	gcc -o program main.o
In C, a replacement list can also span several source lines when each line except the last ends with a backslash:
#define NUMBERS 1, \
                2, \
                3
int array[] = { NUMBERS };
Here the declaration expands to int array[] = { 1, 2, 3 };.[26] By centralizing repeated elements, these macros reduce boilerplate code, enhance readability, and facilitate maintenance, as changes to the expansion propagate across the entire codebase.[30] Empirical studies of C preprocessor usage confirm that object-like macros are prevalent for defining constants, comprising a significant portion of macro definitions in large projects to avoid scattering literal values.[31]
Despite their simplicity, parameterless macros introduce rigidity due to their fixed expansions, preventing adaptation to varying contexts without redefinition.[26] A key drawback is the potential for unintended substitutions arising from name clashes; if a macro identifier coincides with a variable or function name in the code, the preprocessor may expand it erroneously, leading to compilation errors or altered semantics.[32] For example, defining #define MAX 100 could unexpectedly replace a function parameter named MAX in void func(int MAX), resulting in invalid code after expansion. Such issues underscore the need for careful naming conventions to avoid global interference, as macros lack scoping mechanisms.[6] This foundational approach later evolved into parameterized variants to address these limitations by incorporating input handling for greater flexibility.
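The name-clash problem described above is usually handled by convention rather than by the preprocessor. The following is a minimal sketch of two common mitigations, a project prefix and #undef; the APP_ prefix and all function names are hypothetical and chosen only for illustration.

```c
/* A project prefix (APP_ is a hypothetical choice) keeps the macro name from
   colliding with ordinary identifiers such as a parameter called MAX. */
#define APP_MAX_ITEMS 100

/* The parameter can safely be named MAX because no macro has that name. */
static int clamp_to_limit(int value, int MAX)
{
    return value > MAX ? MAX : value;
}

static int clamp_item_count(int n)
{
    return clamp_to_limit(n, APP_MAX_ITEMS);
}

/* A macro needed only for a few lines can be removed again with #undef,
   which gives a manual substitute for scoping. */
#define SCALE 3
static int scaled(int x) { return x * SCALE; }
#undef SCALE

/* After the #undef, SCALE is an ordinary identifier again. */
static int use_scale_as_variable(void)
{
    int SCALE = 7;
    return SCALE;
}
```

Neither technique is enforced by the language: a prefix only lowers the chance of a collision, and #undef has to be written by hand at the end of the region where the macro is wanted.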
Parameterized Text-Substitution Macros
Parameterized text-substitution macros, also known as function-like macros, extend parameterless macros by accepting formal parameters that enable dynamic substitution of arguments into a predefined template during preprocessing.[33] These macros are defined using the #define directive followed by the macro name, a parenthesized list of parameters, and the replacement text, allowing for reusable code snippets that mimic function calls without runtime overhead.[6] For instance, in C, the macro #define MAX(a, b) ((a) > (b) ? (a) : (b)) substitutes the provided arguments a and b into the conditional expression, expanding to the appropriate comparison when invoked as MAX(x, y).[34]
The expansion process involves several mechanics to handle argument substitution accurately. Upon invocation, actual arguments replace the formal parameters in the macro body, after which the resulting text is rescanned for further macro expansions, adhering to strict rules to prevent unintended interactions. Token pasting, using the ## operator, concatenates adjacent tokens (often involving macro parameters) to form new identifiers; for example, #define PASTE(a, b) a##b expands PASTE(foo, bar) to foobar, useful for generating variable names dynamically.[35] Similarly, stringification with the # operator converts a macro parameter into a string literal without further expansion; #define STR(x) #x turns STR(hello) into "hello", aiding in debugging or logging by enclosing arguments in quotes.[36] However, arguments with side effects pose risks because an argument may be evaluated more than once in the expansion; for example, with #define SQUARE(x) ((x) * (x)), the call SQUARE(i++) expands to ((i++) * (i++)), which modifies i twice without sequencing and therefore has undefined behavior in C, whereas a true function would evaluate its argument exactly once.[37]
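The three mechanisms above can be observed in a short sketch. The helper names (read_pasted, next_value, square_fn and so on) are hypothetical; a function call is used as the side-effecting argument instead of i++ so that the double evaluation is well defined and can be counted.

```c
#include <string.h>

#define PASTE(a, b) a##b        /* token pasting: PASTE(foo, bar) -> foobar */
#define STR(x) #x               /* stringification: STR(hello) -> "hello"   */
#define SQUARE(x) ((x) * (x))   /* the argument appears twice in the body   */

static int foobar = 42;         /* hypothetical variable reached via PASTE  */

static int read_pasted(void) { return PASTE(foo, bar); }

static const char *stringified(void) { return STR(hello); }

/* Counts how often the argument expression is evaluated. */
static int calls = 0;
static int next_value(void) { calls++; return 3; }

/* Expands to ((next_value()) * (next_value())): two calls. */
static int square_of_call(void)
{
    calls = 0;
    return SQUARE(next_value());
}

/* An inline function evaluates its argument once. */
static inline int square_fn(int x) { return x * x; }

static int square_fn_of_call(void)
{
    calls = 0;
    return square_fn(next_value());
}
```

After square_of_call() the counter calls holds 2, while after square_fn_of_call() it holds 1, which is the practical difference between the macro and the function.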
In practice, parameterized macros are widely used in C and C++ for defining inline computations, constants, and simple utilities that avoid function call overhead while promoting code reuse.[38] Common applications include min/max functions, like the MAX example, or generic patterns such as swapping variables with #define SWAP(a, b) do { typeof(a) _t = (a); (a) = (b); (b) = _t; } while(0), which uses typeof (a GNU extension, standardized in C23) to give the temporary the type of its argument and wraps the body in do { ... } while(0) so that the macro behaves as a single statement; each argument still appears more than once in the expansion.[37] In older systems without templates or generics, such macros facilitated portable code generation, such as conditional includes or platform-specific adaptations, enhancing maintainability in pre-standardized environments.[33]
Despite their utility, several pitfalls complicate their use. Macro arguments are fully expanded before they are substituted into the body (unless they are operands of # or ##), and the result is then rescanned; if the body or the arguments are not parenthesized, the expanded text can regroup under operator precedence in unexpected ways.[32] Debugging expanded code is challenging due to the preprocessor's textual nature, though the -E flag of compilers such as GCC reveals the fully expanded source for inspection. Additionally, recursion is not supported: a macro name that appears within its own expansion is not expanded again, which prevents infinite expansion. (The limit of 200 levels that GCC enforces, above the C standard's minimum of 15, applies to nested #include files rather than to macros.)[39] These issues underscore the need for careful design, often favoring inline functions in modern C++ over macros for better type checking and safety.[38]
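Two of these rules are easy to observe directly: whether an argument is expanded before it is used, and what happens when a macro mentions its own name. The sketch below assumes a C99 compiler; VERSION, XSTR, and counter are hypothetical names introduced for the illustration.

```c
#include <string.h>

#define VERSION 7
#define STR(x) #x        /* operand of #: the argument is NOT expanded first   */
#define XSTR(x) STR(x)   /* extra level: x is expanded before STR receives it  */

static const char *direct(void)   { return STR(VERSION);  }
static const char *indirect(void) { return XSTR(VERSION); }

/* A macro that names itself is not expanded again inside its own expansion,
   so the inner counter below refers to the variable, not to the macro. */
static int counter = 1;
#define counter (4 + counter)

static int read_counter(void) { return counter; }  /* becomes (4 + counter) */

#undef counter
```

Here direct() yields the text VERSION as a string, indirect() yields the string for 7, and read_counter() evaluates (4 + counter) once using the variable's value instead of expanding without end.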
Macros in Embeddable Languages
Macros in embeddable languages refer to lightweight domain-specific languages (DSLs) integrated into host languages or tools, allowing users to define custom abstractions through textual substitution mechanisms that expand during preprocessing or parsing. These macros enable the embedding of extensible syntax within scripting or configuration environments without requiring modifications to the core parser of the host language. For instance, the m4 macro processor serves as a general-purpose embeddable macro system for Unix tools, where it preprocesses input files by expanding macro definitions before passing them to the target application. Similarly, Jinja, a templating engine for Python, incorporates macros as reusable blocks that facilitate dynamic content generation in web development and configuration files.
Implementation of these macros typically involves tight integration with the host language's parser, where expansion occurs at parse time to substitute defined patterns with their corresponding text. This process ensures that macros act as preprocessors, transforming source code or templates into valid host language constructs prior to compilation or execution. A notable example is noweb, a literate programming tool that embeds macros for tangling and weaving documentation and code, allowing authors to define custom extraction rules that expand during document processing to separate prose from executable code. Such integration maintains the host language's syntax while permitting user-defined extensions, often leveraging simple pattern matching for substitution. Text-substitution serves as the underlying mechanism for these expansions, providing a straightforward way to parameterize and reuse code snippets.
The primary benefits of macros in embeddable languages include clear separation of concerns, where domain-specific logic can be encapsulated without altering the host language's core, and ease of customization for end-users in specialized applications. This approach promotes modularity, as macros allow for the injection of boilerplate reduction or conditional logic directly into templates or scripts, enhancing maintainability in environments like configuration management. For example, the ECPG preprocessor in PostgreSQL enables macro-like directives such as EXEC SQL DEFINE for reusable query patterns that expand at preprocessing time, simplifying complex database interactions while preserving SQL's native syntax.[40] In configuration languages, YAML parsers with macro support, like those in Ansible's Jinja2 integration, permit variable substitutions and loops that expand macros to generate dynamic infrastructure definitions, improving scalability in deployment automation.
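The preprocessing step that such tools share, replacing named placeholders with stored text before the host tool sees the input, can be sketched in a few lines. This is an illustration only and does not reproduce the syntax or behavior of m4, Jinja, or ECPG; the $(NAME) placeholder form, struct macro, and expand are assumptions made for the example, and replacement text is not rescanned.

```c
#include <stddef.h>
#include <string.h>

/* One macro definition: a name and its replacement text. */
struct macro { const char *name; const char *body; };

/* Looks up a name of the given length; returns NULL when it is undefined. */
static const char *lookup(const struct macro *tab, size_t n,
                          const char *name, size_t len)
{
    for (size_t i = 0; i < n; i++)
        if (strlen(tab[i].name) == len && strncmp(tab[i].name, name, len) == 0)
            return tab[i].body;
    return NULL;
}

/* Copies src to out, replacing each $(NAME) with its body. Unknown names and
   unterminated references are copied through unchanged. Returns 0 on success
   and -1 if out (capacity cap, including the terminator) is too small. */
static int expand(const char *src, const struct macro *tab, size_t n,
                  char *out, size_t cap)
{
    size_t used = 0;
    while (*src) {
        const char *piece = src;   /* default: copy one character verbatim */
        size_t len = 1;
        const char *body = NULL;
        const char *end = NULL;
        if (src[0] == '$' && src[1] == '(') {
            end = strchr(src + 2, ')');
            if (end)
                body = lookup(tab, n, src + 2, (size_t)(end - (src + 2)));
        }
        if (body) {
            piece = body;          /* substitute the replacement text */
            len = strlen(body);
            src = end + 1;
        } else {
            src++;
        }
        if (used + len + 1 > cap)
            return -1;
        memcpy(out + used, piece, len);
        used += len;
    }
    if (cap == 0)
        return -1;
    out[used] = '\0';
    return 0;
}
```

With a table that defines CC as gcc and OUT as program, expanding "$(CC) -o $(OUT) main.o" produces "gcc -o program main.o". Production macro processors add parameters, quoting, and rescanning of the output, which this sketch deliberately leaves out.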
Syntactic and Procedural Macros
Syntactic Macros
Syntactic macros represent an advanced form of metaprogramming in computer science, where macros manipulate the abstract syntax tree (AST) of source code to enable structural transformations beyond simple text replacement. Unlike earlier textual macros that substitute strings directly, syntactic macros treat code as data, allowing programmers to define new syntactic constructs that integrate seamlessly with the host language. This capability is most notably realized in Lisp-family languages, such as Common Lisp, where the defmacro form defines a macro by specifying a name, parameters, and a body that computes the expansion.[41]
The process of syntactic macro expansion involves capturing the macro invocation as an s-expression, which serves as a direct representation of the AST in homoiconic languages like Lisp. The macro's body then evaluates this input to generate a new s-expression, which is inserted in place of the original form during compilation or interpretation. To mitigate issues like variable capture—where identifiers from the macro unintentionally bind to those in the surrounding context—hygiene techniques are employed, such as generating unique symbols (e.g., via gensym in Common Lisp) or using built-in hygienic expansion in languages like Scheme and Racket. This ensures that the expanded code preserves the intended scoping without accidental identifier collisions.[41][42]
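Textual macro systems have no built-in hygiene, but the gensym idea of generating a fresh name per expansion can be loosely imitated. The sketch below does this in C by pasting __LINE__ onto a prefix; it is an analogy only, not how Lisp or Scheme implement hygiene, and the names CONCAT, UNIQUE, SWAP_IMPL, and swap_tmp_ are hypothetical.

```c
/* Two-level paste so that __LINE__ is expanded before concatenation. */
#define CONCAT_(a, b) a##b
#define CONCAT(a, b) CONCAT_(a, b)
#define UNIQUE(prefix) CONCAT(prefix, __LINE__)

/* Unhygienic version, shown for comparison and not invoked here: the
   temporary is always called tmp, so SWAP_NAIVE(tmp, x) would declare an
   inner tmp that shadows the caller's variable of the same name. */
#define SWAP_NAIVE(a, b) do { int tmp = (a); (a) = (b); (b) = tmp; } while (0)

/* gensym-like version: the temporary's name is built from the line number
   of the invocation, so it is unlikely to match a name in the caller. */
#define SWAP_IMPL(a, b, t) do { int t = (a); (a) = (b); (b) = t; } while (0)
#define SWAP(a, b) SWAP_IMPL(a, b, UNIQUE(swap_tmp_))

/* Swaps two ints, one of which is itself named tmp. */
static int swap_with_tmp(void)
{
    int tmp = 1, x = 2;
    SWAP(tmp, x);
    return tmp * 10 + x;   /* tmp is now 2 and x is now 1 */
}
```

This only lowers the probability of a clash: a caller variable that happened to be named swap_tmp_ followed by the same line number would still be captured, which is the gap that true hygienic expansion closes.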
One key advantage of syntactic macros lies in their support for metaprogramming, enabling the creation of domain-specific languages (DSLs) tailored to particular problem domains. For instance, Common Lisp's loop macro provides a declarative syntax for complex iterations, expanding to lower-level iteration and conditional primitives, thereby abstracting repetitive control structures. Similarly, macros like with-open-file manage resource acquisition and release, expanding into an unwind-protect form (the Common Lisp counterpart of a try-finally block) to ensure files are closed even if errors occur, thus promoting safer and more concise code for I/O operations. These examples illustrate how syntactic macros extend the language's expressiveness without altering its core semantics.[43][10]
However, syntactic macros introduce challenges, particularly in debugging, as the expanded code can obscure the original intent, making it difficult to trace errors back to the macro invocation. Tools like macro steppers or expanders help by visualizing the transformation sequence, but the process remains non-trivial for complex macros. Additionally, since macro bodies operate within the full power of the host language, expansions can be Turing-complete, potentially leading to non-terminating computations during macro processing if not carefully designed.[44]
Procedural Macros
Procedural macros represent an advanced form of metaprogramming in which macros are treated as functions executed at compile time to generate code dynamically based on input tokens or attributes, producing output code snippets that are inserted into the program.[7] This approach enables arbitrary computation during expansion, distinguishing it from simpler substitution mechanisms.[7] In implementation, procedural macros receive a TokenStream (a sequence of syntactic tokens from the source code) as input, process it using the host language's logic, and return a new TokenStream as output, which the compiler integrates seamlessly.[7] They operate during the compilation phase, after parsing but before type checking of the generated code, ensuring the output undergoes full validation.[7]
Procedural macros come in three forms: derive macros, invoked via the #[derive] attribute to automate trait implementations for data structures; attribute macros, which expand custom attributes applied to items like functions or modules; and function-like macros, called similarly to regular functions to generate code inline.[7] For instance, in Rust, a derive macro might be defined as follows:
use proc_macro::TokenStream;
use quote::quote;
use syn::{parse_macro_input, DeriveInput};
#[proc_macro_derive(HelloMacro)]
pub fn hello_macro_derive(input: TokenStream) -> TokenStream {
    // Parse the annotated item into a syntax tree.
    let ast = parse_macro_input!(input as DeriveInput);
    let name = &ast.ident;
    // Build the trait implementation for the annotated type.
    // (The binding is not called gen, which is reserved in the 2024 edition.)
    let expanded = quote! {
        impl HelloMacro for #name {
            fn hello_macro() {
                println!("Hello, Macro! My name is {}!", stringify!(#name));
            }
        }
    };
    expanded.into()
}
Applying #[derive(HelloMacro)] to a type then generates the implementation built in the quote! block. Derives are commonly used for traits such as Debug or Clone in Rust, where derive macros produce efficient, tailored code without manual repetition.[7] In Scala, similar facilities via macros enable compile-time derivation of type class instances, for example, generating serialization code for JSON encoding in libraries that inspect case class structures to produce encoders and decoders.[45]
The primary benefits of procedural macros lie in their ability to ensure type safety in generated code, as the compiler verifies the output just like handwritten code, and their support for modular library design, allowing third-party extensions to language features without core modifications.[7] They promote code reuse and reduce verbosity for complex patterns, such as custom serialization or API routing definitions.[45]
Limitations include increased compile-time overhead from executing the macro logic, which can extend build durations for large projects, and the added complexity in authoring and debugging, as macro code runs in a separate procedural context with limited access to the broader program state compared to syntactic approaches.[7] Additionally, in Rust they must be defined in a dedicated crate of the proc-macro crate type, separate from the code that uses them.[7]
Hygienic and Anaphoric Macros
Hygienic macros are a class of syntactic macros designed to prevent unintended variable capture during expansion, ensuring that identifiers introduced by the macro do not accidentally bind to variables in the surrounding code. This property, known as hygiene, is achieved through automatic renaming of bound identifiers to unique symbols, typically using techniques like scope sets or mark propagation. The concept was formalized in the 1986 paper "Hygienic Macro Expansion" by Kohlbecker, Friedman, Felleisen, and Duba, which introduced an algorithm to enforce hygiene by tracking binding relationships during macro transformation.[46] In languages like Scheme, hygienic macros are implemented via syntax-rules, a declarative macro system that treats binders and references as scoped units, automatically generating fresh names to avoid name clashes. This approach promotes modularity by isolating macro-generated code from the lexical context, reducing bugs from accidental interactions and enabling safer code reuse.[46]
For example, in Racket—a dialect of Scheme—a hygienic macro defining a local binding like let ensures that internal variables do not interfere with external ones. Consider a macro dbl that doubles its argument using a temporary variable y:
(define-syntax dbl
  (syntax-rules ()
    [(dbl x) (let ([y 1]) (* 2 x y))]))
When the macro is invoked as (let ([y 7]) (dbl 3)), the expansion becomes (let ([y 7]) (let ([y1 1]) (* 2 3 y1))), where y1 is a renamed hygienic variant, preserving the outer y binding. This automatic hygiene simplifies macro authoring, as programmers need not manually manage identifier uniqueness, leading to more reliable and maintainable extensions in block-structured languages.[46]
In contrast, anaphoric macros deliberately share bindings between the macro's expansion and the caller's scope, using anaphors—implicit references like the symbol it—to create concise, context-dependent constructs. This intentional capture allows for expressive shortcuts but requires careful use to avoid unintended side effects. Anaphoric macros originated in Lisp traditions, where they leverage the language's homoiconicity to inject shared variables, as detailed in Paul Graham's 1993 book "On Lisp." A classic example is aif, an anaphoric if that binds the test result to it for use in consequent and alternate clauses:
(defmacro aif (test then &optional else)
  `(let ((it ,test))
     (if it ,then ,else)))
With this definition, a call such as (aif (find-if #'evenp nums) (print it)) avoids recomputing the test or binding its result to an explicitly named variable, streamlining code for common patterns.[47]
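Deliberate capture of this kind is also possible in a purely textual macro system, which is unhygienic by nature. The following is a rough C analogue of aif, offered as an illustration only; AIF and comma_offset are hypothetical names, and unlike the Lisp version the type of it is fixed to const char * here.

```c
#include <string.h>

/* AIF deliberately introduces a variable named `it` that the caller's
   statement can refer to, mirroring the Lisp aif shown above. */
#define AIF(test, then_stmt)              \
    do {                                  \
        const char *it = (test);          \
        if (it) { then_stmt; }            \
    } while (0)

/* Returns the offset of the first comma in s, or -1 if there is none. */
static long comma_offset(const char *s)
{
    long result = -1;
    AIF(strchr(s, ','), result = (long)(it - s));
    return result;
}
```

The caller never declares it; the name exists only because the macro injects it, which is exactly the behavior a hygienic system would prevent unless asked otherwise.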
Hygienic and anaphoric macros represent opposing philosophies within syntactic macro systems: hygiene prioritizes safety through isolation via renaming algorithms, making it the default in modern Scheme implementations for robust modularity. Anaphora, conversely, embraces deliberate capture for brevity and expressiveness, and is typically implemented by intentionally leaking bindings in unhygienic Lisp macros, though it demands programmer vigilance to prevent capture errors. While hygiene mitigates risks in large-scale codebases, anaphora suits domain-specific idioms, and some hygienic systems accommodate both by offering optional unhygienic escapes.
Domain-Specific Macros
Assembly Language Macros
Assembly language macros provide a mechanism in macro assemblers for defining reusable blocks of code that expand directly into machine instructions, enabling abstraction and automation at the low level. These macros typically support parameters to customize the generated code, such as register names or immediate values, and are processed during the assembly phase to produce object code without runtime overhead. In tools like the Microsoft Macro Assembler (MASM), macros incorporate advanced features including looping, arithmetic operations, and string manipulation to generate complex instruction sequences. Similarly, the GNU Assembler (GAS) uses directives like .macro and .endm to define macros that output assembly instructions, supporting optional parameters with defaults or requirements.[48][49]
Implementation of assembly macros relies on inline expansion, where the assembler substitutes the macro definition at each invocation site, replacing parameters with actual values to form valid instructions. Local labels are often employed to avoid naming conflicts across multiple expansions, particularly in GAS's alternative macro mode (.altmacro), which generates unique identifiers for each instance. For example, a GAS macro to reserve a block of memory might be defined as follows:
.macro reserve size
    .space \size
.endm
An invocation such as reserve 16 expands to .space 16, reserving 16 bytes (zero-filled by default). Another common use defines reusable subroutines or data initialization routines; for instance, a parameterized macro in MASM could generate a loop to clear a memory block:
CLEAR_MEM MACRO dest, count
    LOCAL loop_start
    mov di, OFFSET dest      ; DI = address of the block
    mov cx, count            ; CX = number of bytes to clear
loop_start:
    mov byte ptr [di], 0     ; store a zero byte
    inc di                   ; advance to the next byte
    loop loop_start          ; decrement CX and repeat until it reaches zero
ENDM
It is invoked as CLEAR_MEM buffer, 100, producing the corresponding x86 instructions without creating a separate procedure call. Such expansions facilitate the creation of data blocks or instruction patterns tailored to hardware specifics, like interrupt handlers or peripheral configurations.[49]
Historically, assembly macros emerged in the early 1960s as part of macro assemblers to enhance assembly language productivity, originating from efforts at Bell Labs and implementations like MIDAS for the PDP-1 in 1963. They played a key role in improving portability by abstracting assembler-specific syntax differences, allowing code to be adapted across variants from different vendors without full rewrites. Macros also reduced repetitive opcodes by encapsulating common sequences, such as arithmetic operations or I/O routines, which streamlined development in resource-constrained environments like the ARPANET IMP system in 1969.[50]
In modern contexts, assembly macros remain vital in embedded systems for hardware-specific optimizations, where they enable precise control over interrupts, timing-critical code, and resource allocation without the abstraction layers of higher languages. For example, in microcontroller programming with assemblers like those for AVR or ARM, macros wrap the instructions that enable or disable interrupts (for example, an INTR_ON macro that expands to the AVR sei instruction) so that critical sections in real-time applications are marked consistently. Optimizing compilers, such as those for embedded targets, often output assembly code incorporating predefined macros from system headers to fine-tune performance, balancing code size and execution speed in constrained devices like automotive controllers or IoT sensors.[51][52]
Macros for Machine-Independent Software
Macros for machine-independent software employ conditional or parameterized mechanisms to abstract differences in operating systems, hardware architectures, and compilers, thereby enabling a single codebase to compile and run across diverse platforms. These macros typically operate at the preprocessor level or within build systems, using feature detection to select appropriate code paths or configurations. A prominent example is the GNU Autoconf tool, which leverages the M4 macro processor to generate configure scripts that probe the build environment for platform-specific attributes, such as available headers, functions, or libraries, and adjust the build accordingly.[53]
Implementation often involves feature tests that expand macros into compatible code variants. In C programming, conditional compilation directives like #ifdef and #if defined() are used to detect platform characteristics via predefined macros, such as _WIN32 for Windows or __linux__ for Linux, allowing the preprocessor to include OS-specific implementations. For instance, to handle endianness variations, where byte order differs between big-endian (e.g., PowerPC) and little-endian (e.g., x86) systems, a macro can perform a runtime check: #define IS_BIG_ENDIAN (*(char*)&(int){1} == 0) inspects the first byte of an int holding 1, and code that uses it can swap bytes only when necessary, ensuring portable data handling in files or networks.[54]
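A minimal sketch of how such an endianness probe might guard byte swapping is shown below, assuming a C99 compiler with compound literals. The helper names swap32 and to_big_endian32 are hypothetical; production code would more often use an existing routine such as htonl.

```c
#include <stdint.h>

/* Runtime probe: inspects the first byte of an int holding the value 1.
   On a big-endian machine that byte is 0; on a little-endian one it is 1. */
#define IS_BIG_ENDIAN (*(const unsigned char *)&(int){1} == 0)

/* Reverses the byte order of a 32-bit value. */
static uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

/* Converts a host-order value to big-endian (network) byte order,
   swapping only when the host is little-endian. */
static uint32_t to_big_endian32(uint32_t v)
{
    return IS_BIG_ENDIAN ? v : swap32(v);
}
```

Whatever the host, the bytes of to_big_endian32(0x11223344) laid out in memory are 11 22 33 44, which is what makes the value safe to write to a file or socket.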
API shims further enhance portability by mapping platform-specific functions to a unified interface through macro expansion. In cross-platform C code, a conditional definition such as
#ifdef _WIN32
#define READDIR(dirent, dir) FindNextFileA((HANDLE)(dir)->handle, &(dirent))
#else
#define READDIR(dirent, dir) readdir((dir))
#endif
abstracts directory reading between Windows' FindNextFileA and POSIX's readdir, so that callers use one name on both platforms; the definition is simplified, since the two functions return different types that a complete shim would also have to reconcile. Build systems integrate these techniques; for example, CMake employs its scripting language with functions akin to macros (e.g., target_compile_definitions) to detect features via check_include_file tests and generate platform-independent build files, supporting targets from Unix-like systems to Windows and embedded devices.[55][56]
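The same pattern applies to small platform differences such as the path separator. The sketch below selects definitions from the predefined macros _WIN32 and __linux__ and hides the choice behind one function; PLATFORM_NAME, PATH_SEP, platform_name, and join_path are hypothetical names used only for illustration.

```c
#include <stddef.h>
#include <stdio.h>

/* Select platform-specific definitions once, in one place. */
#if defined(_WIN32)
#  define PLATFORM_NAME "windows"
#  define PATH_SEP '\\'
#elif defined(__linux__)
#  define PLATFORM_NAME "linux"
#  define PATH_SEP '/'
#else
#  define PLATFORM_NAME "other"
#  define PATH_SEP '/'
#endif

/* Reports which branch the preprocessor selected. */
static const char *platform_name(void) { return PLATFORM_NAME; }

/* Joins dir and file with the platform separator. Returns 0 on success,
   -1 if the result (including the terminator) does not fit in out. */
static int join_path(char *out, size_t cap, const char *dir, const char *file)
{
    int n = snprintf(out, cap, "%s%c%s", dir, PATH_SEP, file);
    return (n >= 0 && (size_t)n < cap) ? 0 : -1;
}
```

Keeping the conditionals in one header-like block and exposing only join_path to the rest of the program limits how far the #if logic spreads through the codebase.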
The primary benefits include maintaining a unified codebase that reduces development effort for multi-platform support, as seen in large projects like the Linux kernel or GNU software, where portability is achieved without duplicating source files. This approach minimizes errors from manual porting and leverages automated detection to adapt to evolving hardware, such as new CPU architectures.[53]
However, challenges arise in maintaining these conditionals, as proliferating #ifdef blocks can obscure code logic and increase complexity. Additionally, branches excluded by the preprocessor are not compiled at all, so they add nothing to the binary but can break unnoticed until the code is built on the platform that selects them.
Modern Language-Specific Macros
In contemporary programming languages, macros have advanced to support sophisticated metaprogramming, enabling developers to extend syntax and automate boilerplate while maintaining safety and expressiveness. These features address post-2010 needs for safer, more ergonomic code generation in systems and scientific computing contexts.
Rust distinguishes between declarative macros, which use pattern-matching via macro_rules! to transform syntax declaratively, and procedural macros, which execute arbitrary Rust code at compile time to generate or analyze syntax trees.[57] Procedural macros encompass derive macros, a subtype that automatically implements traits for structs and enums; for instance, the #[derive(Serialize)] attribute from the Serde library generates serialization code for JSON and other formats, reducing manual implementation errors. Derive macros were first stabilized in Rust edition 2015 with version 1.15 on February 2, 2017, and procedural macros beyond derives stabilized in edition 2018 with version 1.30 on October 25, 2018.[58]
The Rust 2024 edition, released with version 1.85.0 on February 20, 2025, updated macro fragment specifiers so that expr now matches const and _ expressions, introducing expr_2021 for previous behavior compatibility.[59]
Julia employs expression-based macros that manipulate abstract syntax trees (ASTs) to create custom syntax for metaprogramming, particularly useful in numerical and scientific domains.[60] The @time macro, for example, wraps code to measure execution time and memory usage, providing benchmarking without altering the original expression's semantics. This approach allows seamless integration of domain-specific languages, such as for linear algebra operations, directly into Julia code.
In JavaScript, Sweet.js provides hygienic macros through a source-to-source transpiler, enabling syntax extensions like pattern matching or custom control structures while preserving variable scoping.[61] Elixir builds on its Lisp heritage with hygienic macros by default, using quote to capture unevaluated expressions as ASTs and unquote to inject dynamic values, facilitating safe code generation for concurrent and distributed systems.[62][63]
Emerging trends in language-specific macros emphasize tight integration with type systems for safer expansions, as in Rust's derive macros that leverage trait bounds to ensure generated code respects type constraints.[7] Macros also enhance cross-platform ecosystems, such as Rust's use in WebAssembly via the wasm-bindgen macro, which automates bindings between Rust and JavaScript for browser-based applications. Additionally, they bolster library ecosystems by automating repetitive patterns, like trait derivations in Rust crates, promoting reusable and maintainable codebases without sacrificing performance.[7]
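The interaction between derive macros and trait bounds can be sketched with the standard library's built-in derives, so that no external crate is required. The type Pair and the function duplicate are hypothetical names for this example. For a generic type, the derived implementations carry matching bounds on the type parameter: Pair<T> is Clone only when T is Clone.

```rust
// Built-in derives; `Pair` is a hypothetical type used only for illustration.
#[derive(Debug, Clone, PartialEq, Default)]
struct Pair<T> {
    left: T,
    right: T,
}

// Relies on the derived `Clone` impl, which requires `T: Clone`.
fn duplicate<T: Clone>(p: &Pair<T>) -> Pair<T> {
    p.clone()
}

fn main() {
    let p = Pair { left: 1, right: 2 };
    let q = duplicate(&p);
    assert_eq!(p, q); // uses the derived `PartialEq` and `Debug`
    println!("{:?}", q); // prints Pair { left: 1, right: 2 }
}
```

Calling duplicate on a Pair whose element type does not implement Clone is rejected at compile time, which is the kind of type-constrained expansion the paragraph above describes.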
Security Considerations
Macro Viruses
Macro viruses are a type of computer virus that embeds malicious code within the macros of office application files, such as Microsoft Word documents or Excel spreadsheets, exploiting macro languages like Visual Basic for Applications (VBA) to execute harmful actions.[64] These viruses leverage the automation capabilities of application macros to infect documents and spread across systems.[65] The rise of macro viruses occurred in the 1990s alongside the proliferation of macro-rich office applications, marking a significant shift in malware targeting productivity software rather than operating systems.[66]
The first known macro virus, WM/Concept, emerged in July 1995 and targeted Microsoft Word by infecting the Normal.dot template, allowing it to propagate to new documents created on infected systems.[64] This virus demonstrated the potential for self-replication within document files and soon became one of the most widespread viruses at the time.[66] A prominent example is the Melissa virus, which appeared in March 1999 and rapidly spread via email attachments containing infected Word documents, overwhelming corporate email servers and causing widespread disruptions.[67]
In terms of mechanics, macro viruses typically auto-execute upon opening an infected file, as the macro code is triggered by built-in events like document load in applications such as Word or Excel.[65] Propagation often occurs through email attachments, where the virus emails copies of itself to contacts listed in the victim's address book, facilitating rapid dissemination without user intervention.[68] Payloads can include data theft, such as harvesting email addresses for further spread, file corruption, or downloading additional malware, all executed via the macro's access to system resources.[68]
To mitigate macro viruses, modern Microsoft Office versions disable macros by default, requiring explicit user enablement for execution.[69] Additional protections include digital signatures, which verify macro authenticity from trusted publishers before allowing execution, and Protected View, a sandboxing feature that opens potentially unsafe files in an isolated read-only mode to prevent automatic code running.[70][71] Antivirus software further detects and blocks known macro virus signatures during file scans.[69]
Code Injection and Expansion Risks
Untrusted macro expansion in programming languages like C and Rust can introduce significant security risks, particularly through code injection and unintended side effects during compilation. In C, the preprocessor's textual substitution mechanism allows macros to generate code that may evaluate arguments multiple times, leading to subtle defects exploitable in security-critical contexts, such as buffer overflows if a macro inadvertently alters array bounds or pointer arithmetic. For instance, a macro defined as #define ABS(x) (((x) < 0) ? -(x) : (x)) used with ABS(++n) increments n twice, because the argument text is substituted once into the comparison and once into the selected branch; this can cause off-by-one errors in loops that manage memory, which attackers could leverage if combined with untrusted inputs. Similarly, malicious #include directives from tainted sources can inject arbitrary code snippets, amplifying supply-chain vulnerabilities in header libraries.[72]
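The double-evaluation hazard is not specific to the C preprocessor; it can be reproduced in any macro system that substitutes an argument expression more than once. The following sketch is a Rust analogue of the ABS example, not the C code itself. The names abs_naive!, abs_once!, bump, run_naive and run_once are hypothetical, and bump stands in for the side-effecting ++n.

```rust
use std::cell::Cell;

// Hypothetical analogue of the C `ABS` macro: `$x` appears in the condition
// and again in the selected branch, so the argument is evaluated twice.
macro_rules! abs_naive {
    ($x:expr) => {
        if $x < 0 { -$x } else { $x }
    };
}

// Safer variant: bind the argument once, then reuse the binding.
macro_rules! abs_once {
    ($x:expr) => {{
        let value = $x;
        if value < 0 { -value } else { value }
    }};
}

// Side-effecting helper standing in for `++n`: increments and returns the counter.
fn bump(counter: &Cell<i32>) -> i32 {
    counter.set(counter.get() + 1);
    counter.get()
}

// Each function returns (macro result, final counter value).
fn run_naive() -> (i32, i32) {
    let n = Cell::new(0);
    let r = abs_naive!(bump(&n));
    (r, n.get())
}

fn run_once() -> (i32, i32) {
    let n = Cell::new(0);
    let r = abs_once!(bump(&n));
    (r, n.get())
}

fn main() {
    println!("{:?}", run_naive()); // prints (2, 2): the counter was bumped twice
    println!("{:?}", run_once()); // prints (1, 1): the counter was bumped once
}
```

Binding the argument to a local variable, as in abs_once!, gives the single-evaluation behavior that an ordinary function call would provide.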
In Rust, procedural macros pose even greater risks due to their ability to execute arbitrary Rust code at build time, enabling direct code injection into the compiled binary. If a dependency crate contains a malicious procedural macro, it can generate unsafe code or exfiltrate secrets during expansion, as the macro runs with the privileges of the build environment, including file and network access. An example involves a proc macro that parses untrusted input to derive traits; tainted inputs could coerce the generation of unsafe blocks, bypassing Rust's memory safety guarantees and introducing vulnerabilities like use-after-free. This is exacerbated in supply-chain attacks, where compromised crates on crates.io propagate malicious macros to downstream projects.[73][74] As of early 2025, such risks persist, with ongoing discussions in the Rust community about enhanced sandboxing to address supply-chain threats.[75]
To mitigate these risks, developers should prioritize input validation in macros that process external data, ensuring strict parsing to prevent injection of malicious constructs. In C, best practices include avoiding function-like macros altogether in favor of inline or static functions, which provide single-evaluation semantics and type safety, reducing the chance of exploitable side effects. For Rust, sandboxing procedural macros—such as through WebAssembly-based execution—limits their access to system resources, while tools like Clippy can lint macro expansions for hidden unsafety or suspicious patterns during development. Additionally, IDE integrations with macro hygiene checks help detect potential injection points early.[72][75][76]
As of October 2025, broader concerns involve AI-generated code, which can include macro definitions produced by tools like code assistants that inadvertently embed vulnerabilities; reports indicate AI-assisted coding contributes to one-in-five reported breaches overall.[77] These risks are heightened in open-source derives, where AI-suggested procedural macros may overlook hygiene, leading to unvalidated expansions in shared libraries. While related to macro viruses in document contexts, these compile-time threats focus on static exploits rather than runtime propagation.
Historical Development
Early Origins
The concept of macros in computer science emerged in the late 1940s and 1950s as a means to simplify programming through text substitution and code reuse, particularly in the era of limited memory and manual input methods like punched cards. Early forms appeared as open subroutines, which expanded inline rather than calling closed routines, effectively acting as proto-macros to avoid overhead. For instance, the EDSAC computer, operational from 1949 at the University of Cambridge, supported open subroutines that programmers could define and expand on the fly to conserve storage, reducing the tedium of entering repetitive instruction sequences via paper tape.[78] This approach addressed the practical challenges of early stored-program machines, where every instruction counted toward resource limits.
In the early 1950s, assembly language development further formalized macro-like features for instruction shorthand and code generation. Grace Hopper, working at Remington Rand, played a pivotal role by proposing in 1951 a library of subroutines stored on punched cards that could be automatically incorporated into programs as needed, an idea that evolved into the notion of open subroutines or macro expansions.[50] The IBM 701's assembler, developed by Nathaniel Rochester and introduced in 1954, incorporated symbolic notation and basic abbreviation mechanisms that served as precursors to full macros, enabling programmers to define shorthand for common instruction patterns and thereby streamline scientific computations on vacuum-tube machines. These innovations marked the transition from pure machine code to more expressive assembly tools, significantly easing the labor of programming large-scale calculations.
By the late 1950s and into the 1960s, macros began appearing in higher-level contexts, building on subroutine libraries. Lisp 1.5, documented in its 1962 Programmer's Manual by John McCarthy and colleagues at MIT, provided user-defined functions via LAMBDA expressions, with FEXPRs allowing unevaluated argument passing, a mechanism that functioned as proto-macros by enabling runtime code transformation without automatic evaluation. In planning systems, the STRIPS framework, developed in the early 1970s at SRI International, extended this through ABSTRIPS, which automatically generated macro-operators from prior solution plans to abstract sequences of actions, improving efficiency in robot problem-solving tasks.[79]
Assembly language macros continued to evolve in the 1970s, exemplified by Digital Equipment Corporation's MACRO-11 assembler for the PDP-11 minicomputers, which supported conditional and parametric macro definitions to generate code snippets, drastically cutting down on the punch-card volume required for complex programs.[80] This tool, integral to real-time systems and embedded applications, highlighted macros' role in mitigating the physical and error-prone nature of data entry in an era before widespread interactive computing. Hopper's early advocacy for reusable code modules influenced these developments, laying groundwork for macros as a foundational abstraction in programming languages.[50]
Evolution and Modern Advances
The standardization of the C preprocessor in the ANSI X3.159-1989 standard formalized macro definitions and expansion rules, enabling portable textual substitution across C implementations.[81] Concurrently, the m4 macro processor, originally developed in 1977 by Brian Kernighan and Dennis Ritchie as a general-purpose tool for UNIX, was extended by the GNU m4 implementation released in 1990 by René Seindal, influencing build systems and configuration scripts through the 1990s and beyond.[82]
In the early 1990s, Scheme's Revised^4 Report (R4RS), published in 1991, introduced optional hygienic macros via the define-syntax form, addressing variable capture issues in macro expansion and promoting safer syntactic extensions.[83] This innovation, building on Lisp's foundational metaprogramming, emphasized hygiene to prevent unintended bindings during expansion.
During the 2000s, Common Lisp saw incremental macro extensions in libraries and implementations, enhancing domain-specific languages while maintaining its defmacro-based system from the 1994 ANSI standard.[41] The 2010s brought further shifts toward hygienic and procedural approaches: Julia incorporated expression-based macros from its 2012 release, allowing runtime-like code generation with hygiene checks.[60] Rust's procedural macros were proposed in 2016 in RFC 1566, enabling compile-time syntax manipulation through token streams for safer, attribute-driven derivations.[84] Projects like Sweet.js (2013) extended hygienic macros to JavaScript, using pattern matching to mitigate lexical pitfalls in non-Lisp environments.[61]
By the 2020s, trends emphasized safe metaprogramming, with a pronounced shift toward hygiene for security and modularity, as detailed in historical analyses of macro systems.[85]