Factor (programming language)
| Factor | |
|---|---|
| Paradigm | multi-paradigm: concatenative (stack-based), functional, object-oriented |
| Developer | Slava Pestov |
| First appeared | 2003 |
| Stable release | 0.100 / September 11, 2024 |
| Typing discipline | strong, dynamic |
| OS | Windows, macOS, Linux |
| License | BSD license |
| Website | factorcode.org |
| Influenced by | Joy, Forth, Lisp, Self |
Factor is a stack-oriented programming language created by Slava Pestov. Factor is dynamically typed and has automatic memory management, as well as powerful metaprogramming features. The language has a single implementation featuring a self-hosted optimizing compiler and an interactive development environment. The Factor distribution includes a large standard library.
History
Slava Pestov created Factor in 2003 as a scripting language for a video game.[1] The initial implementation, now referred to as JFactor, was implemented in Java and ran on the Java Virtual Machine. Though the early language resembled modern Factor superficially in terms of syntax, the modern language is very different in practical terms and the current implementation is much faster.
The language has changed significantly over time. Originally, Factor programs centered on manipulating Java objects with Java's reflection capabilities. From the beginning, the design philosophy has been to modify the language to suit programs written in it. As the Factor implementation and standard libraries grew more detailed, the need for certain language features became clear, and they were added. JFactor did not have an object system where the programmer could define their own classes, and early versions of native Factor were the same; the language was similar to Scheme in this way. Today, the object system is a central part of Factor. Other important language features such as tuple classes, combinator inlining, macros, user-defined parsing words and the modern vocabulary system were only added in a piecemeal fashion as their utility became clear.
The foreign function interface was present from very early versions of Factor, and an analogous system existed in JFactor. This was chosen over creating a plugin to the C part of the implementation for each external library that Factor should communicate with, and has the benefit of being more declarative, faster to compile and easier to write.
The Java implementation initially consisted of just an interpreter, but a compiler to Java bytecode was later added. This compiler only worked on certain procedures. The Java version of Factor was replaced by a version written in C and Factor. Initially, this consisted of just an interpreter, but the interpreter was replaced by two compilers, used in different situations. Over time, the Factor implementation has grown significantly faster.[2]
Description
Factor is a dynamically typed, functional and object-oriented programming language. Code is structured around small procedures, called words. In typical code, these are 1–3 lines long, and a procedure more than 7 lines long is very rare. Something that would idiomatically be expressed with one procedure in another programming language would be written as several words in Factor.[3]
Each word takes a fixed number of arguments and has a fixed number of return values. Arguments to words are passed on a data stack, using reverse Polish notation. The stack is used just to organize calls to words, and not as a data structure. The stack in Factor is used in a similar way to the stack in Forth; for this reason, both are considered stack languages. For example, below is a snippet of code that prints out "hello world" to the current output stream:
"hello world" print
print is a word in the io vocabulary that takes a string from the stack and returns nothing. It prints the string to the current output stream (by default, the terminal or the graphical listener).[3]
The factorial function can be implemented in Factor in the following way:
: factorial ( n -- n! ) dup 1 > [ [1..b] product ] [ drop 1 ] if ;
Not all data has to be passed around only with the stack. Lexically scoped local variables let one store and access temporaries used within a procedure. Dynamically scoped variables are used to pass things between procedure calls without using the stack. For example, the current input and output streams are stored in dynamically scoped variables.[3]
Factor emphasizes flexibility and the ability to extend the language.[3] There is a system for macros, as well as for arbitrary extension of Factor syntax. Factor's syntax is often extended to allow for new types of word definitions and new types of literals for data structures. It is also used in the XML library to provide literal syntax for generating XML. For example, the following word takes a string and produces an XML document object which is an HTML document emphasizing the string:
: make-html ( string -- xml )
dup
<XML
<html>
<head><title><-></title></head>
<body><h1><-></h1></body>
</html>
XML> ;
The word dup duplicates the top item on the stack. The <-> stands for filling in that part of the XML document with an item from the stack.
Implementation and libraries
Factor includes a large standard library, written entirely in the language. It includes:
- A cross-platform GUI toolkit, built on top of OpenGL and various windowing systems, used for the development environment.[4]
- Bindings to several database libraries, including PostgreSQL and SQLite.[5]
- An HTTP server and client, with the Furnace web framework.[6]
- Efficient homogeneous arrays of integers, floats and C structs.[7]
- A library implementing regular expressions, generating machine code to do the matching.[8]
A foreign function interface is built into Factor, allowing for communication with C, Objective-C and Fortran programs. There is also support for executing and communicating with shaders written in GLSL.[3][9]
Factor is implemented in Factor and C++. It was originally bootstrapped from an earlier Java implementation. Today, the parser and the optimizing compiler are written in the language. Certain basic parts of the language are implemented in C++ such as the garbage collector and certain primitives.
Factor uses an image-based model, analogous to many Smalltalk implementations, where compiled code and data are stored in an image.[10] To compile a program, the program is loaded into an image and the image is saved. A special tool assists in the process of creating a minimal image to run a particular program, packaging the result into something that can be deployed as a standalone application.[3][11]
The Factor compiler implements many advanced optimizations and has been used as a target for research in new optimization techniques.[3][12]
References
- ^ Pestov, Slava. "Slava Pestov's corner of the web".
- ^ "Concatenative.org wiki: Factor/Implementation History".
- ^ a b c d e f g Pestov, Sviatoslav; Ehrenberg, Daniel (2010). "Factor: a dynamic stack-based programming language". ACM SIGPLAN Notices. 45 (12). ACM: 43–58. doi:10.1145/1899661.1869637.
- ^ Pestov, Slava. "Factor documentation: UI framework".
- ^ Coleman, Doug. "Factor documentation: Database library".
- ^ Pestov, Slava. "Factor documentation: HTTP server".
- ^ Pestov, Slava. "Factor documentation: Specialized arrays".
- ^ Coleman, Doug; Ehrenberg, Daniel. "Factor documentation: Regular expressions".
- ^ Pestov, Slava (28 July 2010). "Overhauling Factor's C library interface".
- ^ Pestov, Slava (10 January 2010). "Factor's bootstrap process explained".
- ^ Pestov, Slava (5 July 2008). "On shaking trees".
- ^ Ehrenberg, Daniel (2010). "Closure elimination as constant propagation" (PDF). Archived from the original (PDF) on 2011-07-26.
External links
- Official website
- Slava Pestov (October 27, 2008). Factor: An Extensible Interactive Language (flv) (Tech talk). Google. Archived from the original on 2021-12-22.
- Zed Shaw (2008). The ACL is Dead (flv) (CUSEC 2008). CUSEC. – a presentation written in Factor which mentions and praises Factor
Factor (programming language)
( path encoding -- seq ) to ensure type safety and clarity during interactive use.[1] It supports purely object-oriented programming with single and multiple dispatch on generic functions, alongside low-level capabilities like foreign function interfaces (FFI), manual memory management, and binary data handling for systems programming.[1] The language emphasizes expressiveness through vocabularies (modular libraries that extend its core) covering areas from mathematics and strings to user interfaces and networking, with over 37,000 words in the standard library.[1]
Factor's implementation includes a virtual machine written in C++, generational garbage collection, and two compilers: a baseline for quick iteration and an optimizing one for production code, achieving competitive performance on benchmarks such as the Computer Language Benchmarks Game.[1] Its ecosystem facilitates standalone application deployment and interactive exploration via a listener and tools like the paved listener for debugging.[3] The community engages through a Discord server and contributes to extensions, maintaining Factor's focus on practicality for domains like scientific computing, web development, and tooling.[3]
Overview
Core Characteristics
Factor is a dynamically typed programming language that employs garbage collection for automatic memory management, allowing developers to focus on code logic without manual allocation and deallocation concerns.[1] This approach uses a generational garbage collector, optimizing for workloads with many short-lived objects through mark-sweep-compact in the oldest generation and copying collection in younger ones.[1] The language's type system supports dynamic dispatch while providing optional static checks via stack effect declarations, enhancing reliability in interactive development.[1]
At its core, Factor operates on a stack-oriented execution model that utilizes reverse Polish notation (RPN), where operations are postfix and arguments are passed via a data stack rather than named parameters.[2] This concatenative paradigm enables terse, composable code but requires familiarity with stack manipulation to avoid errors.[1]
Factor supports multiple programming paradigms, blending concatenative programming with functional elements like higher-order functions and quotations (anonymous code blocks) and object-oriented features such as classes, inheritance, and polymorphism.[1] These capabilities make it versatile for general-purpose application development, drawing influences from languages like Forth, Lisp, and Smalltalk.[1]
As an open-source project licensed under the BSD license, Factor is freely available and encourages community contributions.[4] It provides native implementations for major platforms, including Windows, macOS, and Linux on x86 and x86-64 architectures, ensuring portability without runtime dependencies.[3] Deployment follows an image-based model, where the entire runtime state (including compiled code, data, and libraries) is saved to a single executable image file, facilitating quick startups and distribution of self-contained applications.[3] This method, akin to systems in Smalltalk, supports incremental development and easy versioning of program states.[2]
Design Philosophy
Factor's design philosophy draws heavily from Forth, adopting its stack-based, concatenative model for simplicity and expressiveness, while incorporating high-level abstractions such as dynamic typing and automatic garbage collection to mitigate the manual memory management and low-level complexities often associated with Forth implementations.[1] This evolution aims to balance the elegance of postfix notation with modern conveniences, enabling developers to focus on problem-solving rather than boilerplate concerns.[2]
At its core, Factor seeks to foster an interactive and extensible environment optimized for rapid prototyping and full-scale application development, where code can be incrementally built, tested, and refined in a live REPL session without the need for frequent recompilation.[1] The language emphasizes programmer productivity through a rich ecosystem of libraries and tools that support seamless integration across platforms, allowing for quick iteration from idea to deployment.[2]
Central to this philosophy is the concept of "words": short, composable functions that serve as the fundamental building blocks of code, promoting readability and modularity within the concatenative paradigm by clearly declaring their stack effects (e.g., ( x y -- z ) for a word consuming two inputs and producing one output).[1] This approach encourages the creation of domain-specific vocabularies, where words can be combined like Lego bricks to form expressive pipelines, reducing cognitive overhead in complex computations.[2]
Factor explicitly rejects rigid, context-free syntaxes typical of imperative languages, instead favoring a macro system that enables runtime and compile-time extensions for defining custom, declarative notations tailored to specific problem domains, such as embedded DSLs for parsing or UI layout.[1] By treating syntax as programmable, Factor empowers users to evolve the language itself, aligning with its goal of ultimate flexibility without sacrificing the underlying stack discipline.[2]
History
Origins and Early Development
Factor was created in 2003 by Slava Pestov as an embedded scripting language named JFactor, initially implemented in Java for a video game project. The language was designed to provide a simple mechanism for scripting game logic within a larger Java application, leveraging a stack-based evaluation model to enable concise and efficient code for tasks like handling game entities and behaviors. This early incarnation emphasized ease of embedding and rapid prototyping, drawing inspiration from Forth's concatenative paradigm while incorporating dynamic typing and garbage collection from the outset.[5]
From its inception, Factor adopted a Forth-like syntax centered on stack manipulation, but Pestov integrated object-oriented elements, making every value an object to support modern programming practices. The stack-based approach allowed for postfix notation without explicit variables in many cases, facilitating a focus on composable operations suitable for scripting environments. This blend addressed limitations in existing languages for game development, where traditional syntaxes often complicated dynamic behaviors.[1]
The Java-hosted implementation proved limiting for broader applications, lacking features like tail-call optimization and efficient dynamic dispatch, which prompted a transition to a standalone native compiler in 2005. This shift was driven by performance requirements for more demanding programs and the goal of achieving self-hosting, enabling Factor to compile its own codebase independently of the JVM. The native version bootstrapped from the Java prototype, marking Factor's evolution from a niche scripting tool to a general-purpose language.[1]
Major Milestones and Releases
In 2005, Factor's development shifted from its initial Java-based interpreter to a native runtime implemented in a combination of C and Factor code, enabling the first self-hosted compiler and replacing the pure Java implementation for improved performance and independence. This transition laid the groundwork for Factor's bootstrapping process, where core components like the parser and compiler are written in the language itself.[6][1]
Between 2008 and 2010, significant enhancements included the introduction of an optimizing compiler fully implemented in Factor, which generates machine code for architectures like x86 and PowerPC through extensive data flow analysis. A cross-platform UI toolkit based on OpenGL was added, supporting interactive development environments, while the standard library underwent major expansions for practical applications. Slava Pestov detailed the bootstrap process in contemporary blog posts, highlighting how an existing Factor image compiles updates to produce new images.[1][7]
Pestov led the project until 2010; since then, it has been driven by an active community, with key releases including version 0.97 in November 2014, which incorporated over 1,400 commits addressing bug fixes, library refinements, and Emacs integration improvements. Version 0.98, released in July 2018, further emphasized stability enhancements and concurrency support via coroutine-based multithreading in the single-threaded runtime. In August 2023, version 0.99 was released, adding features such as a Guided Tour for new users, Unicode 15 support, OpenSSL 3.1.2 and SQLite 3.42.0 bindings on Windows, and re-added FreeBSD support. These efforts focused on robustness for real-world use without altering core language semantics.[8][9][10][11][12]
In September 2024, version 0.100 was released, featuring modern platform compatibility like preliminary ARM64 support in the non-optimizing compiler, Unicode 15.1 integration, improved floating-point output, and automatic light/dark theme detection on Windows, alongside fixes for XML namespace handling and library optimizations. This update reduced image sizes through compressed format support and resolved long-standing issues across Windows, macOS, and Linux platforms.[13][2]
Ongoing maintenance of Factor is sustained by a community of contributors via the GitHub repository, ensuring compatibility updates and incremental improvements as of November 2025.[3]
Language Fundamentals
Syntax and Stack Model
Factor employs a postfix notation, also known as Reverse Polish Notation (RPN), in which operands precede operators on a single data stack, eliminating the need for parentheses to denote precedence.[1] For instance, the expression 2 3 + . pushes the literals 2 and 3 onto the stack, applies the addition word to pop them and push their sum (5), and then prints the result using the output word . (a single period).[1] This stack-based execution model treats the data stack as the primary mechanism for passing arguments to and returning results from words (Factor's term for functions or procedures), with literals such as numbers, strings, and sequences directly pushed onto the stack when encountered.[1]
To document and verify the behavior of words with respect to the stack, Factor uses stack effect declarations in the form ( inputs -- outputs ), where inputs and outputs are listed from bottom to top of the stack (rightmost being the top).[14] These declarations specify the number and types of values consumed from and produced onto the stack; for example, the addition word + is declared as ( x y -- z ), indicating it pops two numbers (y on top, then x) and pushes their sum.[14] Stack effects support row variables like ..a for sequences of unknown length and nested declarations for quotations, enabling the compiler's stack checker to enforce correctness at compile time for inline words.[14]
Literals in Factor include primitive types like integers, floats, booleans (true as t, false as f), and strings, all of which are pushed directly onto the stack without additional syntax.[1] Quotations, denoted by square brackets [ ... ], create anonymous functions or delayed code blocks that are also pushed as first-class values onto the stack; they can be executed later using words like call or passed to combinators such as map.[1] For example, 3 [ 2 + ] call pushes 3, then the quotation that adds 2, and invokes it to yield 5. Basic control structures operate directly on stack values, such as conditionals where a boolean on top selects between two quotations via if: flag [ true-branch ] [ false-branch ] if, with the chosen quotation called and the other dropped.[1]
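The pieces described above (literals, quotations, call, and if) can be tried together in the listener. The following is a minimal sketch; the word name sign-name is a hypothetical helper introduced only for illustration.

```factor
USING: io kernel math prettyprint ;
IN: scratchpad  ! the listener's default vocabulary; needed when run as a file

! Push 2 and 3, add them, and print the sum.
2 3 + .             ! prints 5

! A quotation is an ordinary value; call runs it against the current stack.
3 [ 2 + ] call .    ! prints 5

! if pops a boolean and two quotations, then calls exactly one of them.
! sign-name is a hypothetical word, not part of the standard library.
: sign-name ( n -- str )
    0 < [ "negative" ] [ "non-negative" ] if ;

-4 sign-name print  ! prints negative
7 sign-name print   ! prints non-negative
```

Both branches of the if leave exactly one string on the stack, which is what lets the stack checker accept the declared effect ( n -- str ).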
Vocabularies serve as namespaces for organizing words, allowing code to be modularized into hierarchical modules stored in directories.[15] To access words from a vocabulary, one can import the entire vocabulary using USE: vocab-name for single imports or USING: vocab1 vocab2 ; for multiple, making words directly accessible without qualification, or declare QUALIFIED: vocab-name and then refer to words explicitly as vocab-name:word (e.g., math:+).[15]
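As a small illustration of the import forms, the sketch below brings three vocabularies into scope with USING: and refers to a fourth through a qualified name. It assumes the standard QUALIFIED: parsing word, which makes a vocabulary's words available only in vocab-name:word form.

```factor
USING: kernel math prettyprint ;
QUALIFIED: sequences

! map is not imported directly, so it is named with its vocabulary prefix.
{ 1 2 3 } [ 1 + ] sequences:map .   ! prints { 2 3 4 }
```

Qualified access is mostly useful when two vocabularies define words with the same name; for everyday code a plain USING: list is the common choice.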
Factor favors stack-based computation over named variables for local state, although lexically scoped locals are available when stack shuffling becomes awkward. Persistent or global state is managed through object slots (fields in objects) or dynamically scoped variables that thread values across word calls without stack manipulation.[1] Dynamically scoped variables, such as those for input/output streams, are accessed via words like get and set, providing a way to share context implicitly during execution.[1]
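A minimal sketch of dynamic scoping, using the namespaces vocabulary. The names indent-level and show-indent are hypothetical and exist only for this example.

```factor
USING: kernel namespaces prettyprint ;
IN: scratchpad  ! the listener's default vocabulary; needed when run as a file

! A symbol serves as the key of a dynamically scoped variable.
SYMBOL: indent-level

! Reads whatever binding is innermost at the time of the call.
: show-indent ( -- ) indent-level get . ;

0 indent-level set-global
show-indent                                   ! prints 0

! with-variable rebinds the variable only while the quotation runs.
4 indent-level [ show-indent ] with-variable  ! prints 4
show-indent                                   ! prints 0
```

The word show-indent never receives the value on the stack; it picks up whichever binding is active, which is the same mechanism the text describes for the current input and output streams.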
Words and Vocabularies
In Factor, the basic unit of code organization is the word, which serves as a named procedure equivalent to functions in other languages. Words are defined using the colon definition syntax : word-name ( stack-effect ) word-body ;, where the stack effect declares the inputs and outputs on the data stack, and the word-body consists of Factor expressions that manipulate the stack.[16] For example, the word to compute the square of a number is defined as : square ( n -- n ) dup * ;, which duplicates the top stack item and multiplies it by itself.[16] This syntax encapsulates reusable stack manipulations, promoting modular code construction.[17]
Factor supports words with lexical variables using the :: definition syntax, which binds the input names in the stack effect so they can be used within the word body. These provide a more imperative style within the concatenative framework. For example: :: square ( x -- y ) x x * ; defines a word where the input is bound to x and can be referenced by name inside the body; the output name y is documentation only, and the result is simply whatever the body leaves on the stack. Local computations within a word can be encapsulated using anonymous quotations, such as : process ( data -- result ) [ 2 * ] call ;, which doubles the input data and leaves the result on the stack.[16][1]
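To compare the two definition styles, here is a short sketch that computes the same value with stack shuffling and with named inputs. The word names sum-sq and sum-sq-locals are hypothetical.

```factor
USING: kernel locals math prettyprint ;
IN: scratchpad  ! the listener's default vocabulary; needed when run as a file

! Stack-based version: square both inputs with bi@, then add.
: sum-sq ( a b -- c ) [ sq ] bi@ + ;

! Locals version: a and b are bound by name; c only documents the output.
:: sum-sq-locals ( a b -- c ) a a * b b * + ;

3 4 sum-sq .          ! prints 25
3 4 sum-sq-locals .   ! prints 25
```

Both words have the same stack effect, so callers cannot tell which style was used; the choice is a matter of readability inside the definition.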
Words are organized into vocabularies, which act as modular namespaces grouping related functionality to avoid naming conflicts and enhance code maintainability. Each vocabulary corresponds to a directory in the Factor source tree, containing files with definitions that load together.[15] For instance, the math vocabulary collects arithmetic operations like + and *, while sequences handles sequence manipulations such as map.[15]
In interactive development, the listener—a terminal-based REPL—evaluates expressions line-by-line, displaying the data stack after each input and facilitating rapid testing and loading of vocabularies with commands like USE: math or reload.[18] This environment allows developers to define and invoke words incrementally.[18]
Programming Paradigms
Concatenative and Functional Aspects
Factor's concatenative programming paradigm allows programs to be composed by simply juxtaposing words, which are the language's fundamental units of code, each performing a transformation on the data stack. This composition is implicit and point-free, meaning functions are applied without explicitly naming arguments or intermediate variables; instead, the stack serves as the medium for data flow, enabling concise expressions like dup square + to compute the square of a number plus the number itself, where dup duplicates the top stack item, square squares it, and + adds the results.[19]
Higher-order functions are supported through quotations, which are anonymous code blocks enclosed in square brackets that can be passed as arguments to combinators. For instance, the map combinator applies a quotation to each element of a sequence, producing a new sequence, as in { 1 2 3 } [ sq ] map, which squares each number to yield { 1 4 9 }. Similarly, reduce folds a sequence using an initial value and a binary quotation, such as { 1 2 3 } 0 [ + ] reduce summing the elements to 6. These features facilitate functional-style programming by treating code as data.[20][21]
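The combinators mentioned here can be run directly in the listener. This sketch uses map, filter, and reduce from the sequences vocabulary; reduce is the folding word used below, and it takes an explicit initial value.

```factor
USING: kernel math prettyprint sequences ;

! Apply a quotation to every element, collecting a new sequence.
{ 1 2 3 } [ sq ] map .           ! prints { 1 4 9 }

! Fold the sequence with a starting value of 0 and a binary quotation.
{ 1 2 3 } 0 [ + ] reduce .       ! prints 6

! Keep only the elements for which the quotation returns true.
{ 1 2 3 4 } [ even? ] filter .   ! prints { 2 4 }
```

Each call returns a fresh sequence or value and leaves the original array literal unchanged.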
Data in Factor is immutable by default in its functional core, with sequences and other structures typically returned as new values rather than modified in place, promoting referential transparency. Functional constructs like reduce and higher-order combinators encourage pure transformations, while recursion is optimized through tail-call elimination, allowing efficient implementation of loops without stack overflow; for example, a recursive factorial can be defined using a tail-called word that accumulates the product.[19][22]
Pattern matching and destructuring are achieved through stack manipulation words, such as swap to exchange the top two items or dup to duplicate, enabling selective access and binding without explicit variable declarations; this stack-oriented approach integrates seamlessly with the concatenative model for concise data decomposition.[19]
To maintain purity, Factor supports subsets of code that avoid side effects by relying on stack transformations and combinators, excluding I/O or mutable state operations, which allows for verifiable functional behavior in critical sections.[19]
Object-Oriented Features
Factor integrates object-oriented programming concepts into its concatenative, stack-based paradigm through a lightweight object system centered on tuples and generic functions, enabling encapsulation, inheritance, and polymorphism without traditional message-passing semantics.[23] This design draws inspiration from systems like CLOS, where behavior is dispatched based on object types rather than explicit method calls, allowing seamless extension within the stack model.[1]
Classes in Factor are primarily defined using tuples, which serve as user-defined data structures with named slots for storing state. The TUPLE: word declares a tuple class, specifying its name and slots in the form TUPLE: class-name slot1 slot2 ... ;, where each slot is either a bare name or a specifier such as { name class } that declares the class of values the slot may hold.[24] For example, a simple point class might be defined as TUPLE: point { x number } { y number } ;, creating a class word point for type checks and instantiation, along with automatic accessor methods for the slots. Instances are constructed explicitly, often via the boa word, which binds values to slots from the stack: 2 3 point boa yields a point instance with those coordinates.[24][25]
Slots encapsulate object state, with access controlled through generated reader (slot>>) and writer (>>slot) methods that push or set values on the stack while preserving type safety.[26] Read-only slots can be declared to prevent modification, enhancing data integrity in object designs.[26]
Methods in Factor are implemented as generic words, which dispatch to specialized implementations based on the classes of their inputs, providing polymorphism integrated with the stack effects.[23] A generic word is declared with GENERIC: and methods added using M:, specializing on tuple classes; for instance, the + operation on numbers is a generic that selects methods matching the operand types.[23] This approach enables protocol-based polymorphism, where objects adhere to behavioral protocols through shared generic methods rather than explicit interfaces, allowing extensible and composable object interactions without subclass proliferation.[23]
Inheritance supports hierarchical class relationships. Tuple classes use single inheritance: a subclass is defined as TUPLE: subclass < superclass new-slots... ;, inheriting all slots and methods from the superclass while adding slots or overriding methods as needed.[27] Mixin classes, declared with MIXIN: mixin-name, provide reusable behavior across otherwise unrelated hierarchies; an existing class is added to a mixin with INSTANCE: class mixin-name, after which methods defined on the mixin apply to it. Union classes, declared with UNION: union-class class1 class2 ... ;, group several existing classes so that one method can be specialized on all of them, covering many cases where other languages use multiple inheritance, with ambiguities between applicable methods resolved by class ordering.[27] Type checking is facilitated by predicate words, such as point? for the point class, which verify instance membership and support dispatch in generics; these predicates extend to subclasses and unions for polymorphic type queries.[24][27]
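The object system described in this section can be illustrated with a small sketch that combines tuple classes, a generic word, methods, single inheritance, and class predicates. The class names circle, rect, and square and the generic area are hypothetical examples, not standard library definitions.

```factor
USING: accessors kernel math math.constants prettyprint ;
IN: scratchpad  ! the listener's default vocabulary; needed when run as a file

TUPLE: circle radius ;
TUPLE: rect width height ;
TUPLE: square < rect ;   ! inherits the width and height slots from rect

! One generic word, dispatching on the class of the top stack item.
GENERIC: area ( shape -- n )

M: circle area radius>> sq pi * ;
M: rect area [ width>> ] [ height>> ] bi * ;

3 4 rect boa area .      ! prints 12
5 5 square boa area .    ! prints 25, via the method inherited from rect
5 5 square boa rect? .   ! prints t, since the predicate accepts subclasses
```

No method is defined for square, so dispatch falls back to the rect method; adding M: square area ... later would override it without touching the existing definitions.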
Extensibility and Tools
Macros and Syntax Extension
Factor provides powerful mechanisms for extending its syntax through macros and parsing words, allowing users to define custom syntactic constructs that integrate seamlessly with the language's concatenative model. Macros, defined using the MACRO: word, perform compile-time transformations by replacing invocations with expanded quotations, enabling the movement of computations from runtime to compile-time for improved efficiency and abstraction. This facility supports the creation of higher-level idioms while preserving the stack-based evaluation semantics.[28][1]
Parsing words, declared with SYNTAX:, further extend syntax by executing during the parsing phase to read input tokens and manipulate the parse tree, such as by adding deferred actions via suffix!. These words, conventionally named in uppercase, operate on an accumulator to build domain-specific syntax without altering the core parser. The POSTPONE: word plays a crucial role in this process by appending the subsequent word to the parse tree literally, even if it is a parsing word itself, thus delaying its execution until runtime and preventing premature evaluation during macro or parser definition. For instance, POSTPONE: { can be used within a macro to include array literal parsing at expansion time rather than parse time. This combination allows for arbitrary syntax extensions, aligning with Factor's design goal of extensibility.[29][30][1]
Symbol and scanner words facilitate the construction of domain-specific languages (DSLs) by tokenizing input and generating custom literals or structures. Scanner words process sequences of tokens, while symbols serve as identifiers that can trigger specialized parsing; together, they enable embedded DSLs for tasks like XML literals, where <XML parses markup until a closing tag, producing a structured quotation. Integration with quotations is central to meta-programming: macros and parsing words generate or compose quotations that are spliced into the code, allowing dynamic code generation while the stack checker infers effects for optimization.[29][1]
To ensure hygiene and avoid name capture in macro expansions, Factor employs unique symbols generated by gensym, which creates uninterned words guaranteed not to conflict with existing bindings. This practice prevents unintended variable shadowing, as the generated symbols are distinct from any user-defined names in the surrounding scope. For example, in defining a macro that introduces temporary variables, gensym ensures isolation, as in this schematic definition:
MACRO: temp-macro ( x -- y )
gensym temp {
[ temp set x get + temp get ]
} ;
Here, temp is uniquely named, safeguarding against capture.[31][28]
Practical examples illustrate these features. For pattern matching syntax, Factor extends the language via macros that invert quotations using the undo combinator, enabling declarative matching on stack values. The case construct chains such inverses:
{ { [ <nil> ] [ 0 ] } { [ <cons> ] [ sum + ] } } case
This matches a list against constructor patterns (<nil> for empty or <cons> for non-empty), recursively summing elements by applying the corresponding quotation if the pattern succeeds, with inverses defined in word properties for compile-time or memoized runtime evaluation. Such extensions create Lisp-like pattern matching without native variables, leveraging the stack for binding.[32]
Similarly, syntax for loops can be extended using parsing words to define iterative constructs beyond basic combinators like each. For instance, a custom FOR: parsing word might scan a range and body quotation, suffixing an expansion that uses each internally:
SYNTAX: FOR: ... parse-until "do" [ ... ] parse-literal suffix! ;
A word defined this way could accept input such as FOR: i 1..10 [ i sq . ], transforming it at parse time into a quotation that applies each over the range and prints the square of each index. This demonstrates how parsing words build familiar control structures on Factor's foundational model. Infix-like notation can also be introduced, such as a ... parsing word for ranges that reads the next token and expands 12 ... 18 to [12,18) for concise sequence generation. These mechanisms, grounded in quotations, let users tailor syntax to specific domains while the stack checker continues to verify the stack effects of the generated code.[29][1]
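A complete parsing word can be very small. The sketch below is not the FOR: word described above; it is a deliberately simpler, hypothetical word named SQ: that shows the same mechanism of reading a token at parse time and appending a value to the parse accumulator.

```factor
USING: math parser sequences ;
IN: scratchpad

! SQ: reads the next token as a number at parse time and appends
! its square to the parse accumulator as a literal.
SYNTAX: SQ: scan-number sq suffix! ;
```

Once this definition has been loaded, SQ: 7 in later code parses as the literal 49. A parsing word cannot be used in the same compilation unit that defines it, so the definition and its first use need to be entered separately.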
Foreign Function Interface
Factor's Foreign Function Interface (FFI) enables integration with code written in other languages, primarily by allowing Factor programs to call functions from dynamically linked libraries and vice versa.[33] The FFI is implemented in the alien vocabulary and supports direct interaction with C libraries, with extensions for handling data types, pointers, and structs that align Factor's stack-based model with C's conventions.[33] This interface is inspired by systems like Common Lisp's CFFI, emphasizing runtime loading and type-safe wrappers to minimize boilerplate.[1]
To load shared objects, Factor uses the add-library word or syntax extensions like LIBRARY: to dynamically link libraries such as .dll, .so, or .dylib files at runtime, supporting platform-specific paths and calling conventions like cdecl.[33] Function wrappers are defined using FUNCTION: or the lower-level define-declared, which specify the return type, parameters, and ABI, automatically generating Factor words that handle stack-to-argument conversion.[34] For example, to interface with a C function int add(int a, int b), one defines:
FUNCTION: int add ( int a, int b ) ;
This defines a Factor word add that pops two integers from the stack, calls the C function, and pushes the result.[34] Calling conventions map Factor values to C types: integers and floats convert directly, while aggregates like structs use STRUCT: declarations for layout and passing by value or pointer.[35] Pointers are handled via c-ptr types, with the >c-ptr word converting Factor objects like byte-arrays or structs to C pointers, enabling in-place modification.[36]
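The struct support mentioned above can be sketched as follows. The struct point and the constructor <point> are hypothetical names introduced for illustration; the example assumes the classes.struct vocabulary, which supplies STRUCT: and <struct>.

```factor
USING: accessors alien.c-types classes.struct kernel ;
IN: scratchpad

! Mirrors the C declaration: struct point { int x; int y; };
STRUCT: point
    { x int }
    { y int } ;

! Allocate a zero-filled point and set its two fields.
: <point> ( x y -- point )
    point <struct> swap >>y swap >>x ;
```

Fields are read and written with ordinary accessors such as x>> and >>x, so C structs can be manipulated like other Factor objects before being handed to a wrapped C function.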
Support for Fortran is provided through the alien.fortran vocabulary, which adapts the FFI for Fortran's calling conventions and data passing, such as handling arrays and common blocks in shared libraries.[37] Similarly, Objective-C integration uses the cocoa or objc vocabularies, allowing Factor to invoke methods on Objective-C objects via runtime type introspection and message passing, facilitating development for macOS and iOS applications.[1]
Memory management across language boundaries requires caution because Factor's garbage collector moves objects, so a byte-array passed to C as a pointer can be relocated.[38] To avoid interference, developers use pinned objects or memory allocated manually with the malloc and free words of the libc vocabulary, ensuring pointers remain valid during C execution or callbacks.[39] Pinned objects, such as certain alien instances, are fixed in memory to prevent GC movement, providing safe buffers for long-lived C references.[1] Callbacks from C to Factor are supported by registering quotations as function pointers, with the runtime handling invocation and stack setup.[34]
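Manual allocation outside the garbage-collected heap might look like the following minimal sketch. The word name malloc-roundtrip is hypothetical; the example assumes malloc and &free from the libc vocabulary, with-destructors from destructors, and the raw accessors from alien.accessors.

```factor
USING: alien.accessors destructors kernel libc ;
IN: scratchpad

: malloc-roundtrip ( -- n )
    [
        4 malloc &free                      ! 4 bytes outside the GC heap
        [ 42 swap 0 set-alien-signed-4 ]    ! store a 32-bit int at offset 0
        [ 0 alien-signed-4 ]                ! read it back
        bi
    ] with-destructors ;
```

Registering the block with &free ties its lifetime to the with-destructors scope, so the memory is released even if the quotation throws an error. A buffer that must outlive the scope would instead be released with an explicit call to free.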
Embedding Factor in C applications involves linking the Factor VM as a library (e.g., libfactor.so) and using the embedding API defined in vm/master.h.[40] Initialization occurs via init_factor_from_args, which loads an image file and processes arguments, after which Factor code can be evaluated from C using factor_eval_string to run expressions and retrieve string results.[41] Conversely, C applications can be embedded in Factor by loading their libraries and invoking functions, as in the OpenSSL example where platform-specific libraries are loaded and functions like SSL_new are wrapped for secure socket operations.[1] This bidirectional capability supports hybrid applications, such as games using SDL via FFI for graphics while leveraging Factor for logic.[42]
Implementation Details
Compiler and Runtime Environment
Factor's implementation features a self-hosted optimizing compiler written primarily in the language itself, augmented by a C++ virtual machine (VM) that handles low-level operations such as primitive dispatch and memory management.[43] The compiler supports ahead-of-time compilation to native machine code, enabling high performance while preserving dynamic features like runtime code reloading.[44] This architecture separates the high-level optimizing compiler, implemented in Factor, from the base non-optimizing compiler embedded in the C++ VM, which is used for rapid compilation during interactive sessions.[43]
The compilation process involves multiple stages beginning with parsing and macro expansion, followed by optimizations such as stack effect checking with row polymorphism, quotation inlining, sparse conditional constant propagation (SCCP), escape analysis, and dead code elimination.[43] These optimizations operate on both high-level and low-level intermediate representations (IR), including static single assignment (SSA) form, to enable advanced transformations like value numbering, common subexpression elimination, and register allocation.[43] The final code generation stage produces native binaries, with support for platform-specific features like SIMD instructions, ensuring efficient execution without relying on just-in-time (JIT) compilation.[43]
For interactive development, Factor employs a listener (a read-eval-print loop) that leverages the non-optimizing compiler for quick evaluation of expressions, compiling them into executable quotations on the fly.[44] This allows seamless experimentation while deferring full optimization to word-level compilation, where entire definitions are analyzed for better performance.[44]
The runtime environment, powered by the C++ VM, includes a generational garbage collector that uses a copying collector for the young generation and a mark-sweep-compactor for the old generation, facilitating automatic memory management and supporting runtime code modifications through object compaction.[43] Factor's fast startup is achieved via image saving, where the entire runtime state, including compiled code and loaded vocabularies, is serialized to a file and reloaded instantly on launch.[43]
The bootstrap process begins with a minimal C++-generated base image containing the VM and essential primitives, which is then used to compile the full Factor system, including the optimizing compiler and parser, culminating in a self-hosted development image.[43] This partial self-hosting mirrors approaches in languages like Common Lisp, ensuring the core language can rebuild itself from source.[43]
Standard Library Overview
Factor's standard library, known as the "basis," provides a rich collection of vocabularies that support a wide range of programming tasks, from basic data manipulation to advanced system interactions.[45] These vocabularies are modular and extensible, allowing developers to load only what is needed for a given application. Key components include core utilities for data structures and mathematics, graphical interfaces, networking protocols, system-level operations, and specialized tools for databases, cryptography, and scientific computation.
Core vocabularies handle fundamental data structures and utilities. The sequences vocabulary offers operations on ordered collections like lists and arrays, including mapping, filtering, and slicing functions. Associative data structures follow the assocs protocol, implemented by vocabularies such as hashtables for key-value storage with expected constant-time lookups, while vectors provides dynamic, growable arrays.[45] Utility vocabularies like strings provide string manipulation, formatting, and encoding/decoding routines, while math delivers arithmetic operations, trigonometric functions, and basic statistical tools.
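The sequence and assoc protocols described above can be sketched with two small words. The names squares-of-evens and count-words are hypothetical; the library words used (filter, map, each, inc-at) come from the sequences and assocs vocabularies.

```factor
USING: assocs kernel math sequences ;
IN: scratchpad

! Keep the even elements, then square each one.
: squares-of-evens ( seq -- seq' )
    [ even? ] filter [ sq ] map ;

! Build a hashtable mapping each element to its number of occurrences.
: count-words ( words -- assoc )
    H{ } clone [ [ inc-at ] curry each ] keep ;
```

Because these words are written against the generic protocols, squares-of-evens accepts arrays, vectors, and other sequence types alike, and the hashtable in count-words could be swapped for another assoc implementation.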
For user interfaces and graphics, Factor includes a comprehensive GUI toolkit in the ui vocabulary, supporting windows, gadgets (such as buttons, sliders, and text fields), event handling, and layout management.[46] Graphics capabilities extend to opengl for 3D rendering and hardware-accelerated visuals, and cairo for 2D vector graphics, enabling cross-platform drawing of shapes, paths, and text.
Networking features encompass protocols and data exchange formats. The http vocabulary implements both client and server functionality, including request/response handling and status code management. Support for data serialization includes json for parsing and generating JSON structures, and xml for XML document manipulation. Lower-level connectivity is provided by sockets for TCP/UDP communication and the io streams for binary and text I/O over networks.
System-level operations cover file handling, concurrency, and platform interactions. The io.files sub-vocabulary manages file reading, writing, and directory traversal, with support for paths and permissions. Concurrency primitives include the threads vocabulary for cooperative multitasking and channels for message passing between threads. The system vocabulary queries OS details like CPU architecture and memory usage, while io.launcher enables process spawning and environment variable access.
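Message passing between cooperative threads might be sketched as follows. The word name channel-roundtrip and the thread name "sender" are hypothetical; the example assumes <channel>, to, and from from the channels vocabulary and spawn from threads.

```factor
USING: channels kernel threads ;
IN: scratchpad

: channel-roundtrip ( -- value )
    <channel>
    [ [ 42 swap to ] curry "sender" spawn drop ]   ! sender thread writes 42
    [ from ]                                       ! current thread waits for it
    bi ;
```

Channel operations are synchronous: the sender blocks in to until a receiver calls from, at which point the cooperative scheduler switches between the two threads.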
Specialized vocabularies address domain-specific needs. Database support features the db abstraction for relational queries and the sqlite binding for lightweight, embedded SQL databases. Cryptography is handled by the crypto suite, offering hashing (e.g., SHA-2 via checksums.sha), symmetric/asymmetric encryption, and digital signatures, with bindings to openssl for advanced features. Scientific computing basics are available in math extensions like blas for linear algebra operations, including matrix multiplication and vector computations.
