Parrot virtual machine
from Wikipedia
Final release: 8.1.0 / February 16, 2016[1]
Written in: C
Operating system: Cross-platform
Successor: MoarVM (for Raku)
Type: Virtual machine
License: Artistic License 2.0
Website: www.parrot.org

Parrot is a discontinued register-based process virtual machine designed to run dynamic languages efficiently. It is possible to compile Parrot assembly language and Parrot intermediate representation (PIR, an intermediate language) to Parrot bytecode and execute it. Parrot is free and open-source software.[2]

Parrot was started by the Perl community and developed with help from the open-source and free software communities. As a result, it was focused on license compatibility with Perl (Artistic License 2.0), platform compatibility across a broad array of systems, processor architecture compatibility across most modern processors, speed of execution, small size (around 700k depending on platform), and the flexibility to handle the varying demands made by Raku and other modern dynamic languages.

Version 1.0, with a stable application programming interface (API) for development, was released on March 17, 2009.[3] The last version is release 8.1.0 "Andean Parakeet".[1] Parrot was officially discontinued in August 2021, after being supplanted by MoarVM in its main use (Raku) and never becoming a mainstream VM for any of its other supported languages.[4]

History

The name Parrot came from an April Fool's joke which announced a hypothetical language, named Parrot, that would unify Python and Perl.[5][6] The name was later adopted by the Parrot project (initially a part of the Raku development effort) which aimed to support Raku, Python, and other programming languages.

The Parrot Foundation was dissolved in 2014.[7] The Foundation was created in 2008 to hold the copyright and trademarks of the Parrot project, to help drive development of language implementations and the core codebase, to provide a base for growing the Parrot community, and to reach out to other language communities.[8]

Historical design decisions are documented in the form of Parrot Design Documents, or PDDs, in the Parrot repository.[9]

Until late 2005, Dan Sugalski was the lead designer and chief architect of Parrot. Chip Salzenberg, a longtime Perl, Linux kernel, and C++ hacker, took over until mid-2006, when he became the lead developer. Allison Randal, the lead developer of Punie and chief architect of Parrot's compiler tools, was the chief architect until mid-October 2010 when she stepped down and chose Christoph Otto as the new chief architect.[10]

Languages

The goal of the Parrot virtual machine was to host client languages and allow inter-operation between them. Several hurdles exist in accomplishing this goal, in particular the difficulty of mapping high-level concepts, data, and data structures between languages.

Static and dynamic languages

The differing properties of statically and dynamically typed languages motivated the design of Parrot. Current popular virtual machines such as the Java virtual machine and the Common Language Runtime (for the .NET platform) have been designed for statically typed languages, while the languages targeted by Parrot are dynamically typed.

Virtual machines such as the Java virtual machine and the current Perl 5 virtual machine are also stack-based. Parrot developers chose a register-based design, reasoning that it more closely resembles a hardware design, allowing the vast literature on compiler optimization to be used in generating bytecode for the Parrot virtual machine that could run at speeds closer to machine code.[citation needed] Other register-based virtual machines inspired parts of Parrot's design, including LLVM, the Lua VM and Inferno's Dis.

Functional concepts

Parrot has rich support for several features of functional programming including closures and continuations, both of which can be particularly difficult to implement correctly and portably, especially in conjunction with exception handling and threading. The biggest advantage is the dynamic extendability of objects with methods, which allows for polymorphic containers (PMCs) and associated opcodes. Implementing solutions to these problems at the virtual machine level obviates the need to solve them in the individual client languages.

Compiler tools

Parrot provides a suite of compiler-writing tools[11] which includes the Parser Grammar Engine (PGE), a hybrid parser-generator that can express a recursive descent parser as well as an operator-precedence parser, allowing free transition between the two in a single grammar. The PGE feeds into the Tree Grammar Engine (TGE) which further transforms the parse-tree generated by PGE for optimization and ultimately for code generation.
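The idea of mixing a recursive descent parser with an operator-precedence parser in one grammar can be illustrated outside Parrot. The following Python sketch is not PGE and does not use its rule syntax; the names `Parser`, `primary`, and `expression` are hypothetical. It uses recursive descent for parenthesized terms and precedence climbing for binary operators, switching between the two freely.

```python
import re

TOKEN = re.compile(r"\s*(\d+|[-+*/()])")
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


def tokenize(text):
    """Split the input into number, operator, and parenthesis tokens."""
    text = text.strip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def next(self):
        token = self.peek()
        if token is None:
            raise SyntaxError("unexpected end of input")
        self.pos += 1
        return token

    def primary(self):
        """Recursive descent part: a number or a parenthesized expression."""
        token = self.next()
        if token == "(":
            node = self.expression(1)
            if self.next() != ")":
                raise SyntaxError("expected ')'")
            return node
        if token.isdigit():
            return int(token)
        raise SyntaxError(f"unexpected token {token!r}")

    def expression(self, min_precedence):
        """Operator-precedence part: left-associative precedence climbing."""
        left = self.primary()
        while True:
            op = self.peek()
            precedence = PRECEDENCE.get(op)
            if precedence is None or precedence < min_precedence:
                return left
            self.next()
            right = self.expression(precedence + 1)
            left = (op, left, right)


def parse(text):
    parser = Parser(tokenize(text))
    tree = parser.expression(1)
    if parser.peek() is not None:
        raise SyntaxError("trailing input")
    return tree
```

Calling `parse("(1 + 2) * 3")` yields a nested tuple tree that a later stage could transform further, loosely analogous to handing a parse tree to a tree-transformation tool.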

Implementations

The most complete language implementations targeting the Parrot VM were Raku (known at the time as Rakudo Perl 6), Lua and Winxed.[12] Projects to implement many other languages were started, including PHP, Python, and Ruby; along with esoteric and demonstration languages such as Befunge and the "squaak" tutorial language.[13] None of these projects were successful in becoming the primary implementation of their respective languages.[4]

Internals

There are three forms of program code for Parrot:

  • Bytecode[14] is binary and is natively interpreted by Parrot. Bytecode is usually stored in files with the filename extension ".pbc".
  • Parrot assembly language (PASM) is the low level language that compiles down to bytecode. PASM code is usually stored in files with the filename extension ".pasm".
  • Parrot intermediate representation (PIR[15]) is a slightly higher level language than PASM and also compiles down to bytecode. It is the primary target of language implementations. PIR transparently manages Parrot's inter-routine calling conventions, provides improved syntax, register allocation, and more. PIR code is usually stored in files with the filename extension ".pir".

Examples

Registers

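Parrot is register-based, like most hardware CPUs and unlike most virtual machines, which are stack-based. Parrot provides four types of registers:

  • I: native integer type
  • N: floating-point numbers
  • S: advanced string registers with Unicode support
  • P: PMC, or Polymorphic Container (the Parrot object type)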

Parrot provides an arbitrary number of registers; this number is fixed at compile time per subroutine.
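As a rough mental model, a subroutine's registers can be pictured as four typed banks whose sizes are chosen once, when the frame is created. The Python sketch below is an illustration only, not Parrot's implementation; the class name `RegisterFrame` and its constructor parameters are hypothetical. The operations at the end mirror the arithmetic example that follows in this article.

```python
class RegisterFrame:
    """Registers for one subroutine: four typed banks of fixed size."""

    def __init__(self, n_int, n_num, n_str, n_pmc):
        # Bank sizes are chosen once, when the frame is created,
        # mirroring "fixed at compile time per subroutine".
        self.I = [0] * n_int      # native integers
        self.N = [0.0] * n_num    # floating-point numbers
        self.S = [""] * n_str     # strings
        self.P = [None] * n_pmc   # PMCs (arbitrary objects)


# A subroutine that only needs I0..I1 and N0..N1.
frame = RegisterFrame(n_int=2, n_num=2, n_str=0, n_pmc=0)

frame.I[1] = 4        # set I1, 4
frame.I[1] += 1       # inc I1
frame.I[1] += 2       # add I1, 2
frame.N[1] = 42.0     # set N1, 42.0
frame.N[1] -= 1       # dec N1
frame.N[1] -= 2.0     # sub N1, 2.0
```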

Arithmetic operations

In PASM

    set I1, 4
    inc I1        # I1 is now 5
    add I1, 2     # I1 is now 7
    set N1, 42.0
    dec N1        # N1 is now 41.0
    sub N1, 2.0   # N1 is now 39.0
    print I1
    print ', '
    print N1
    print "\n"
    end

In PIR

    .sub 'main' :main
        $I1 = 4
        inc $I1     # $I1 is now 5
        $I1 += 2    # $I1 is now 7
        $N1 = 42.0
        dec $N1     # $N1 is now 41.0
        $N1 -= 2.0  # $N1 is now 39.0
        print $I1
        print ', '
        print $N1
        print "\n"
    .end

mod_parrot

mod_parrot is an optional module for the Apache web server. It embeds a Parrot virtual machine interpreter into the Apache server and provides access to the Apache API to allow handlers to be written in Parrot assembly language, or any high-level language targeted to Parrot.

from Grokipedia
The Parrot virtual machine is a register-based virtual machine aimed at efficiently compiling and executing bytecode for dynamic programming languages such as Perl 6 (now Raku) and Python, among others. Designed to serve as a common runtime environment for multiple high-level languages (HLLs), it supports features like continuations for advanced control flow and a next-generation regular expression engine, distinguishing it from stack-based virtual machines like the Java Virtual Machine (JVM).

Development of Parrot began in 2001 as part of the Perl 6 project, with its first release occurring in September of that year and version 1.0 following in March 2009. The project, initially led by contributors like Dan Sugalski, produced monthly releases on the third Tuesday of each month, with supported versions in January and July; notable releases include 7.9.0 through 8.1.0 between 2015 and 2016. Parrot's architecture includes compiler tools such as the Intermediate Code Compiler (IMCC), Parser Grammar Engine (PGE), and Tree Grammar Engine (TGE) to facilitate HLL development and code generation.

Although Parrot supported a variety of languages and optimizations, including garbage collection improvements and performance enhancements like StringBuilder, active development ceased around 2017. For Raku, it was superseded by MoarVM, and it did not become the dominant VM for other dynamic languages. The codebase remains publicly available under the Parrot Foundation, requiring a C compiler and optional libraries like ICU for building.

History and Development

Origins and Initial Goals

The Parrot virtual machine traces its humorous origins to an April Fool's Day joke published on April 1, 2001, by Perl developer Simon Cozens, which fictitiously announced the merger of the Perl and Python programming languages into a unified language named Parrot. This satirical piece, published online, imagined Parrot as a hybrid scripting language combining features from both, complete with mock syntax examples and endorsements from Perl creator Larry Wall and Python creator Guido van Rossum.

Shortly after the joke, in mid-2001, the name was adopted for a serious project under the leadership of Dan Sugalski, who served as Parrot's chief designer and architect. Sugalski, drawing from discussions in the Perl 6 development community, aimed to create a versatile, high-performance virtual machine capable of serving as a common runtime for multiple dynamic languages, including Perl, Python, and Tcl, among others. The initial emphasis was on supporting Perl 6 (renamed Raku in 2019), but the design prioritized broad applicability to avoid language-specific limitations and foster interoperability among dynamic language ecosystems.

A key aspect of Parrot's foundational goals was its adoption of a register-based execution model over the more common stack-based approach, chosen for greater efficiency in dispatch and alignment with modern CPU architectures like RISC systems. This decision, inspired by successful register-oriented emulators such as Apple's 68K implementation for PowerPC Macs, sought to minimize overhead from stack manipulations and enable optimizations similar to those available at the hardware level. By focusing on these principles, Parrot aimed to provide a flexible foundation for compiling and executing dynamic languages without the performance bottlenecks seen in earlier virtual machines.

Key Milestones and Releases

The Parrot virtual machine project began in 2001 under the leadership of Dan Sugalski, who served as the initial lead designer and chief architect from 2001 to late 2005, guiding its early conceptualization as a target for dynamic languages like Perl 6. In 2006, Chip Salzenberg, a veteran Perl and Linux kernel contributor, assumed the role of lead developer, focusing on stabilizing the core architecture during a period of intensive prototyping and feature implementation. Allison Randal then took over as chief architect around mid-2006, continuing until mid-October 2010, during which she oversaw significant advancements in compiler tools. Christoph Otto succeeded her as architect starting in 2010, leading efforts to refine the virtual machine's performance and extensibility through subsequent years.

A major milestone came with the release of Parrot version 1.0 on March 17, 2009, designated "Haru Tatsu," which established a stable API after over seven years of alpha and beta development phases characterized by iterative improvements to register-based execution. This version enabled reliable compilation and execution for multiple dynamic languages, marking Parrot's transition from experimental to production-ready status.

To support ongoing development and community activities, the Parrot Foundation was formed as a non-profit organization, providing grants, legal structure, and organizational backing; however, it dissolved in 2014 amid persistent funding challenges that limited sustained contributions and maintenance efforts. Development continued with regular releases emphasizing optimization and platform support, culminating in the final version, 8.1.0 "Andean Parakeet," issued on February 16, 2016, which incorporated enhancements to just-in-time (JIT) compilation and threading capabilities for better concurrency in dynamic language runtimes.

Throughout its lifecycle, Parrot was implemented in C for cross-platform portability, facilitating deployment across diverse operating environments while keeping its footprint small.

Discontinuation and Legacy

The official discontinuation of the Parrot virtual machine was announced in August 2021, following its replacement by MoarVM as the primary runtime for Raku (formerly Perl 6) and its inability to achieve widespread adoption for other dynamic languages. Several factors contributed to this outcome, including a decline in active contributors after 2016, with the last code commit occurring in 2017, as well as competition from established language-specific virtual machines such as the JVM and V8, and a reorientation of priorities within the Perl and Raku communities toward more specialized runtimes like MoarVM. The project's repository remains publicly available but confirms that no further development is planned, preserving the codebase for reference.

Despite its discontinuation, Parrot's legacy endures through its influence on open-source compilation and execution tools, notably the Parrot Compiler Toolkit (PCT), which facilitated the development of compilers for multiple dynamic languages. It also continues to serve as an educational resource for understanding register-based architectures, with its documentation and examples illustrating core concepts of dynamic language runtimes. As of 2025, Parrot remains inactive, with its source code accessible for historical and academic study but without ongoing maintenance or updates.

Design and Architecture

Core Principles and Features

The Parrot virtual machine is a register-based virtual machine specifically designed to execute dynamic languages, optimizing for features such as runtime type flexibility, concurrency, and just-in-time (JIT) compilation to enhance performance in interpreted environments. This architecture targets languages that require extensive runtime modifications, including code extension and dynamic type systems, by providing a unified platform that reduces the overhead typically associated with traditional interpreters.

Parrot incorporates support for multiple object models, including a multithreading model to facilitate concurrency, alongside high-level abstractions such as continuations and coroutines that enable advanced control flow in dynamic programs. These elements allow for efficient handling of parallel tasks and non-linear execution paths, aligning with the needs of languages that emphasize flexibility over static compilation.

Implemented in C for broad cross-platform compatibility across major operating systems, including Windows and macOS, Parrot emphasizes speed and low overhead, making it suitable for embedding within larger applications via a straightforward API. Its core features include automatic garbage collection through a dead object detection mechanism, structured exception handling with throw-and-catch operations, and extensibility via plugins that allow integration of custom instruction sets without altering the core runtime.

Unlike stack-based virtual machines such as the JVM, Parrot's register-based model minimizes the number of instructions required for operations, leading to reduced dispatch overhead and improved execution efficiency on modern hardware architectures. This design choice results in fewer dynamic instructions (potentially up to 46% fewer than equivalent stack-based code) while maintaining comparable code density, thereby prioritizing performance for dynamic workloads.
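The garbage collection mentioned above can be pictured with a conventional mark-and-sweep sketch. The Python below is a generic illustration under that assumption, not Parrot's collector; `Obj`, `mark`, `sweep`, and `collect` are hypothetical names. Objects reachable from the roots are marked live, and everything left unmarked is treated as dead and dropped.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.refs = []     # outgoing references to other objects
        self.live = False  # mark bit, set during the mark phase


def mark(roots):
    """Mark every object reachable from the roots (iterative traversal)."""
    worklist = list(roots)
    while worklist:
        obj = worklist.pop()
        if not obj.live:
            obj.live = True
            worklist.extend(obj.refs)


def sweep(heap):
    """Keep marked objects and clear their mark bits for the next cycle."""
    survivors = [obj for obj in heap if obj.live]
    for obj in survivors:
        obj.live = False
    return survivors


def collect(heap, roots):
    mark(roots)
    return sweep(heap)


# a -> b is reachable from the root; c is unreferenced; d <-> e is a dead cycle.
a, b, c, d, e = (Obj(n) for n in "abcde")
a.refs.append(b)
d.refs.append(e)
e.refs.append(d)
heap = collect([a, b, c, d, e], roots=[a])
```

Because liveness is decided by reachability rather than reference counts, the unreachable cycle between `d` and `e` is reclaimed along with `c`.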

Register-Based Execution Model

The Parrot virtual machine employs a register-based execution model, diverging from traditional stack-based virtual machines by utilizing a set of typed registers to hold operands and intermediate results during instruction execution. This approach emulates the register architecture of physical CPUs, where operations directly reference and modify registers rather than pushing and popping values onto a stack. Parrot supports four primary register types: integer registers (denoted as I), numeric registers (N for floating-point values), string registers (S), and polymorphic container registers (P, for PMCs, originally "Parrot Magic Cookies", which handle complex objects). Each subroutine can allocate an arbitrary number of these registers, determined at compile time and limited to a maximum of 256 per type to balance flexibility with efficiency.

Instructions in Parrot's execution engine operate directly on these registers, avoiding the overhead of stack manipulation. For instance, an operation like add I0, I1, I2 computes the sum of the values in registers I1 and I2 and stores the result in I0, all within the current register frame without intermediate transfers. The virtual machine's interpreter, typically implemented as a threaded interpreter for dispatching opcodes, or optionally via just-in-time (JIT) compilation for further optimization, processes these operations sequentially or in optimized bursts. This direct register access reduces the instruction count compared to stack-based alternatives, as operands remain in fast-access locations throughout computation.

Subroutine invocation in Parrot preserves the calling context through a call frame stack, where each subroutine or lexical block maintains its own dedicated register frame. Upon entry to a subroutine, the previous frame's registers are saved implicitly via the stack, and a new frame is allocated with the subroutine's specified registers; parameters are passed by copying or referencing values between frames. This mechanism supports efficient handling of recursion, particularly tail recursion, where a .tailcall directive allows the current frame to be reused for the subsequent call, preventing stack growth and enabling unbounded recursion depth without overflow. The call frame stack thus ensures register state isolation while facilitating low-overhead transitions between execution contexts.

Overall, Parrot's register-based model enhances performance by aligning closely with hardware execution patterns, reducing the number of instructions needed for common operations and simplifying potential optimizations in JIT scenarios. Benchmarks from early implementations demonstrated fewer instructions executed than for equivalent stack-based code, underscoring the model's efficiency for dynamic language workloads.
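To make the instruction-count argument concrete, the following Python sketch runs the same addition on a toy stack machine and a toy register machine and counts dispatched instructions. Both interpreters and their instruction encodings are hypothetical illustrations, not Parrot's runloop; only the `add I0, I1, I2` form is taken from the text.

```python
def run_stack(program, variables):
    """Execute stack-machine instructions; return how many were dispatched."""
    stack = []
    dispatched = 0
    for op, *args in program:
        dispatched += 1
        if op == "push":
            stack.append(variables[args[0]])
        elif op == "add":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == "pop":
            variables[args[0]] = stack.pop()
        else:
            raise ValueError(f"unknown op {op!r}")
    return dispatched


def run_register(program, registers):
    """Execute three-address register instructions; return the dispatch count."""
    dispatched = 0
    for op, dest, src1, src2 in program:
        dispatched += 1
        if op == "add":
            registers[dest] = registers[src1] + registers[src2]
        else:
            raise ValueError(f"unknown op {op!r}")
    return dispatched


# a = b + c on the stack machine: two pushes, one add, one pop.
variables = {"a": 0, "b": 2, "c": 3}
stack_count = run_stack(
    [("push", "b"), ("push", "c"), ("add",), ("pop", "a")], variables
)

# The same computation as a single register instruction: add I0, I1, I2.
registers = {"I0": 0, "I1": 2, "I2": 3}
register_count = run_register([("add", "I0", "I1", "I2")], registers)
```

Here the stack version dispatches four instructions and the register version one. Real programs show a smaller gap, since register instructions are individually larger and not every operation collapses this neatly.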

Bytecode and Intermediate Representations

The compilation process for programs targeting the Parrot virtual machine begins with source code from dynamic languages, which is parsed and transformed into Parrot Intermediate Representation (PIR) using compiler tools such as NQP (Not Quite Perl), a lightweight language designed for generating PIR routines. This intermediate step allows compilers to abstract high-level language constructs into a form suitable for further optimization and execution on Parrot. The resulting PIR, or the lower-level Parrot Assembly Language (PASM), is then assembled into Parrot bytecode (PBC), the executable format interpreted by the virtual machine.

PIR serves as a higher-level, human-readable intermediate representation that includes macros and directives for structured control flow, such as .sub and .end to define subroutines, enabling easier expression of complex behaviors without manual management of low-level details like register allocation. In contrast, PASM is a low-level, assembly-like language that directly maps to Parrot's opcodes, requiring explicit specification of operations like add for arithmetic or print for output, and it uses literal register references such as I0 for integers. Opcodes in both PIR and PASM are numbered internally for efficient dispatch, with Parrot supporting over 200 core instructions that handle operations including branching (e.g., if and branch) and object invocation (e.g., callmethod and invoke).

The primary executable format, PBC, is a serialized, platform-independent binary representation of the compiled program, consisting of a fixed 18-byte header followed by aligned segments for bytecode, constants, fixups, and debug information. The header encodes metadata such as magic bytes (PBC\x0d\x0a\x1a\x0a), wordsize (4 or 8 bytes), byteorder (little or big endian), floating-point type, version numbers, and a UUID for packfile identification, ensuring compatibility across architectures via loader-time conversion. This design allows PBC files, typically with a .pbc extension, to be efficiently loaded and executed by the interpreter, supporting the virtual machine's register-based model during runtime.

Parrot implements predereferencing as a key performance improvement technique, involving a bytecode transformation that amortizes the cost of dereferencing function and data pointers to enable faster execution in the interpreter's inner loop. This process pre-converts opcode numbers into pointers to their corresponding opfuncs and transforms register and constant numbers in opcode arguments into direct pointers, thereby reducing pointer dereferencing overhead during runtime. Developed as part of the early Parrot core, this feature provides an instructive intermediate optimization for other virtual machine designers, bridging the gap between a naive runloop implementation and full just-in-time (JIT) compilation.
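The predereferencing idea can be sketched in a few lines of Python. This is an analogy rather than Parrot's C implementation: opcode numbers, the `OPTABLE` layout, and the function names are all hypothetical, and Python function objects stand in for C function pointers. A one-time pass replaces each opcode number with its handler, so the inner loop no longer performs a table lookup per instruction. The sketch handles straight-line code only (no branches).

```python
OP_SET, OP_ADD, OP_END = 0, 1, 2


def op_set(regs, dest, value):
    regs[dest] = value


def op_add(regs, dest, src1, src2):
    regs[dest] = regs[src1] + regs[src2]


# Opcode number -> (handler, number of arguments that follow it).
OPTABLE = {OP_SET: (op_set, 2), OP_ADD: (op_add, 3)}


def predereference(bytecode):
    """One-time pass: resolve each opcode number to its handler function."""
    resolved = []
    pc = 0
    while bytecode[pc] != OP_END:
        handler, argc = OPTABLE[bytecode[pc]]
        args = tuple(bytecode[pc + 1 : pc + 1 + argc])
        resolved.append((handler, args))
        pc += 1 + argc
    return resolved


def run(resolved, regs):
    """Inner loop: no opcode-table lookup, just call the stored handler."""
    for handler, args in resolved:
        handler(regs, *args)


# set R1, 4 ; set R2, 3 ; add R0, R1, R2 ; end
bytecode = [OP_SET, 1, 4, OP_SET, 2, 3, OP_ADD, 0, 1, 2, OP_END]
regs = [0, 0, 0]
run(predereference(bytecode), regs)
```

The resolved form can be run repeatedly, which is where the one-time cost of the conversion pass is amortized.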

Language Support and Implementations

Targeted Dynamic Languages

The Parrot virtual machine was primarily designed as the runtime environment for Perl 6 (now known as Raku), with the Rakudo compiler serving as its flagship implementation targeting Parrot's bytecode. Rakudo, initiated in 2009, leveraged Parrot to execute Perl 6 programs, providing support for advanced dynamic features such as multiple dispatch and metaclasses through Parrot's object model and calling conventions. This integration allowed Rakudo to compile Perl 6 into Parrot Intermediate Representation (PIR) and execute it efficiently on the VM. However, Rakudo's reliance on Parrot ended with the January 2014 release when the project switched to MoarVM due to performance and stability limitations in Parrot, marking a shift away from Parrot as the primary backend for Raku development.

Among other dynamic languages, Parrot saw partial implementations for Lua, with a compiler that translated Lua 5.1 source for Parrot using around 4,000 lines of code, demonstrating the VM's suitability for lightweight scripting but remaining incomplete for full coverage. Experimental efforts included Cardinal, a Ruby 1.8-compatible implementation that achieved a fairly complete parser but was otherwise underdeveloped, highlighting Parrot's potential for object-oriented dynamic languages while underscoring implementation challenges. For Python, a prototype targeted Python 3 syntax, focusing on core features like dictionaries but stalling early without full maturity or production viability. Tcl influences shaped Parrot's design, particularly in extensibility and embedding, with Partcl offering a from-scratch Tcl 8.5 implementation that compiled to Parrot, though it saw no updates after 2012.

Niche implementations further illustrated Parrot's versatility for dynamic paradigms. Winxed, a Parrot-native language with JavaScript-like syntax, mapped directly to Parrot's register types (integers, floats, strings, and PMCs) and served as a tool for VM development and testing. Esoteric languages like Befunge received a semistable PIR-based interpreter, achieving 100% feature coverage for the Befunge specification on Parrot 3.3.0. Squaak, a functional demonstration language, was included in Parrot's examples to showcase compiler construction using Parrot Compiler Tools, emphasizing higher-order functions and closures on the VM.

Parrot's runtime provided foundational support for dynamic features across these languages, including metaclasses for introspective object manipulation and multiple dispatch for polymorphic method selection based on argument types. Despite these efforts, many projects on Parrot stalled at proof-of-concept stages due to limited developer resources and the VM's ongoing evolution, which prioritized breadth over depth in language support. This incomplete coverage contributed to Parrot's development ceasing in 2017, with an official announcement of inactivity in 2021, as alternative VMs like MoarVM proved more sustainable for ongoing dynamic ecosystems.

Compiler Tools and Ecosystems

The Parrot Compiler Toolkit (PCT) serves as a foundational framework for developing compilers targeting the Parrot virtual machine, enabling the creation of high-level language (HLL) compilers that parse source code, perform optimizations, and generate Parrot Intermediate Representation (PIR). PCT integrates the Parser Grammar Engine (PGE) for defining grammars in a Perl 6-inspired rules format and Not Quite Perl (NQP) for implementing action methods that transform parse trees into abstract syntax trees (ASTs). The toolkit's HLLCompiler class orchestrates the compilation pipeline, supporting modes such as interactive evaluation and runtime compilation, which streamlines the development of new language implementations on Parrot.

Not Quite Perl (NQP) functions as a lightweight language within the Parrot ecosystem, providing a minimal subset of Perl 6 syntax for writing parser actions. In compiler development, NQP enables the construction of AST nodes during parsing, which PCT subsequently converts to PIR for execution, making it essential for languages like Rakudo Perl 6 that compile atop Parrot. Its syntax includes sigils for variables, the binding operator (:=) for references, and match objects ($/) to access parsed data, facilitating efficient syntax-directed translation.

The Parser Grammar Engine (PGE) is a key component for grammar specification in Parrot compilers, compiling declarative rules and tokens into PIR-based parser modules that support recursive descent and operator precedence parsing. PGE rules, such as rule TOP { <record> }, define language syntax patterns, with embedded actions triggered by {*} to invoke NQP methods for building parse trees or ASTs during compilation. This engine powers the parsing phase in PCT-based compilers, allowing developers to generate executable parsers from grammar files (e.g., .pg) via commands like parrot Perl6Grammar.pbc --output=example.pir example.pg.

High-level language tools in the Parrot ecosystem include utilities like Rosella, a collection of portable libraries that abstract low-level Parrot operations for enhanced developer productivity. Rosella provides testing frameworks such as its Test library (inspired by xUnit and Test::More) and MockObject for simulating dependencies, alongside utilities for file system operations, string manipulation, and text templating to support interoperation and validation in compiler projects. These tools are designed to be language-agnostic and free of C dependencies, promoting reusable patterns across Parrot-based implementations.

The Intermediate Code Compiler (imcc) forms a core part of the development workflow, serving as Parrot's primary front-end to assemble and optimize PIR or PASM code into bytecode for execution. Developers typically use imcc to compile high-level sources through PCT-generated grammars and actions, producing .pbc files in a single step. Tools like mk_language_shell.pir automate the initial setup by generating skeleton files for grammars, actions, and main entry points.

Parrot's compiler ecosystem integrates with Perl's Comprehensive Perl Archive Network (CPAN) for distribution and reuse of components, such as PGE libraries and related modules that facilitate parsing and compilation tasks. This connection allows Perl developers to leverage Parrot tools within broader workflows, exemplified by PGE's availability for building extensible grammars in dynamic language projects.

Static Language Experiments

Parrot's design as a virtual machine optimized for dynamic languages posed significant challenges for supporting static languages, yet several experimental efforts explored ports of statically typed or C-like languages to demonstrate versatility. One notable project was a C implementation, aimed at compiling a subset of the C standard to Parrot bytecode, primarily to facilitate automated generation of Native Call Interface (NCI) signatures for library bindings and extensions. This port highlighted Parrot's potential beyond dynamic paradigms but remained in early development stages, with volunteers contributing sporadically.

These experiments revealed inherent limitations of Parrot's dynamic foundation when applied to static typing. The VM's reliance on Polymorphic Containers (PMCs) for flexible data representation introduced overhead unsuitable for the performance demands of static compilation, where type information is resolved at compile time rather than at runtime. To address this, developers had to create custom PMC types to enforce type checking and signatures manually, loading modules and extracting type data for storage, efforts that lacked built-in VM support. Attempts to emulate Java subsets faced similar hurdles, with no complete implementations emerging due to the mismatch between Parrot's register-based, dynamically typed model and Java's stack-based, statically verified model.

A key approach in these experiments involved leveraging Parrot's meta-programming capabilities, particularly through the KnowHOW objects in Perl 6 implementations like Rakudo, to simulate static behaviors. KnowHOW served as the foundational meta-object for classes and roles, enabling developers to add runtime type constraints or parametric generics that mimicked static typing. For instance, declaring methods with initial carets (^) customized the containing class's KnowHOW to include type-aware methods, allowing experimental static-like enforcement in otherwise dynamic code. However, such simulations remained inefficient compared to native static VMs.

Overall, these static language experiments yielded few successful, production-ready implementations, underscoring Parrot's niche suitability for dynamic paradigms such as Ruby's Cardinal port. The overhead and custom workarounds contributed to Parrot's limited adoption outside dynamic ecosystems, reinforcing its legacy as a specialized VM rather than a general-purpose platform for static languages.

Internals and Operations

Data Types and Registers

Parrot supports a set of primitive data types designed for efficient low-level operations. Integers are signed values sized to the machine word, typically 32 bits on 32-bit systems or 64 bits on 64-bit systems, providing native performance for arithmetic tasks. Floating-point numbers use double-precision representation. Strings are richer data structures supporting Unicode, encoded primarily in UTF-8 or ASCII for broad character handling. Keys function as identifiers for hash operations, accepting either integer or string forms to index aggregate structures.

For more sophisticated data handling, Parrot employs Polymorphic Containers (PMCs), which encapsulate complex types such as objects, arrays, and hashes. PMCs extend beyond primitives by representing aggregate and behavioral structures, including resizable arrays for dynamic collections and associative hashes for key-value storage. These containers are extensible, allowing developers to define custom types that integrate with Parrot's runtime.

The register system maps directly to these data types, using four distinct register sets in its register-based execution model. I-registers store integers, N-registers hold floating-point values, S-registers manage strings, and P-registers contain PMCs. Each set has a variable number of registers, determined at compile time per subroutine, and access is typed so that an operation cannot be applied to a register of the wrong kind.

Parrot's type system is fundamentally dynamic, permitting flexible runtime typing while supporting optional hints through PMC metadata for performance optimization. Memory management relies on garbage collection via a mark-and-sweep mechanism: the Dead Object Detection phase marks live objects starting from registers and stacks, and a sweep then reclaims the unmarked memory held by PMCs and strings.

A key feature of PMCs is their use of vtables for extensibility and method dispatch, central to Parrot's object model. Vtables act as abstract interfaces defining operations such as value retrieval or modification, dispatched polymorphically based on argument types to enable language-specific behaviors without altering the core interpreter. This design allows inheritance from default vtables and dynamic loading of custom implementations, so that data-handling code can be reused and adapted across languages.
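The vtable idea described above can be modelled in a few lines of Python. This is only a conceptual sketch: Parrot's real vtables are tables of C functions, and the class names, the slot names `get_integer`, `set_integer`, and `get_string`, and the helper `set_int_register` are assumptions chosen for illustration rather than Parrot's actual definitions.

```python
# Conceptual model only: Parrot's real vtables are C function tables.
# All class, slot, and helper names below are illustrative assumptions.

class DefaultVTable:
    """Fallback slots that every PMC type inherits unless it overrides them."""

    def get_integer(self, pmc):
        raise TypeError(f"{pmc.type_name} has no integer value")

    def set_integer(self, pmc, value):
        raise TypeError(f"{pmc.type_name} cannot store an integer")

    def get_string(self, pmc):
        # Default stringification is built on the integer slot.
        return str(self.get_integer(pmc))


class IntegerVTable(DefaultVTable):
    """Overrides the integer slots; inherits get_string from the default."""

    def get_integer(self, pmc):
        return pmc.data

    def set_integer(self, pmc, value):
        pmc.data = int(value)


class ArrayVTable(DefaultVTable):
    """In this model, an array in integer context reports its element count."""

    def get_integer(self, pmc):
        return len(pmc.data)


class PMC:
    def __init__(self, type_name, vtable, data):
        self.type_name = type_name
        self.vtable = vtable
        self.data = data


def set_int_register(pmc):
    """Model of 'set I0, P0': the caller only knows the vtable slot to call."""
    return pmc.vtable.get_integer(pmc)


number = PMC("Integer", IntegerVTable(), 0)
number.vtable.set_integer(number, 42)
array = PMC("ResizableArray", ArrayVTable(), [1, 2, 3])

int_from_number = set_int_register(number)
int_from_array = set_int_register(array)
```

The same `set_int_register` call yields different behavior for each PMC because the lookup goes through the object's vtable, which is the property that lets a new type be added without touching the code that dispatches on it.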

Instruction Set and Arithmetic

The Parrot virtual machine employs a comprehensive instruction set known as opcodes, which form the native operations executed by its register-based runtime. These opcodes number over 1,200 in a standard installation, counting variants for different data types such as integers (I), numbers (N), strings (S), and polymorphic containers (P or PMC). They are implemented in C and compiled into the core, enabling efficient execution of bytecode from dynamic languages.

Opcodes are broadly categorized into groups including core operations for basic computation, control flow opcodes for program branching, and I/O opcodes for interaction with external resources. Core opcodes handle fundamental tasks like data manipulation and arithmetic, control opcodes manage execution paths, and I/O opcodes handle operations such as reading from files or printing output. The set can be extended modularly through dynamic opcodes (dynops), while the core set provides the foundational functionality.

Arithmetic operations in Parrot support a range of numerical computations across scalar types and PMCs, with syntax typically following the form result = operand1 operator operand2. For integers and floats, binary operations include addition (add I0, I1, I2 sets I0 to I1 + I2), subtraction (sub), multiplication (mul), division (div), and modulus (mod). Unary operations cover negation (neg), absolute value (abs), and trigonometric functions such as sine (sin N0, N1 computes the sine of N1 in radians and stores it in N0), cosine (cos), and tangent (tan). Exponentiation is available via pow. These operations are type-specific, with variants like add_i_i for integer-integer addition, so each variant operates on known operand types without implicit conversions. For arbitrary-precision arithmetic, including big integers, Parrot relies on PMC-based types such as the BigInt PMC, which implement these operations for values of unlimited digit length through underlying libraries such as GMP.
Control flow instructions enable conditional and unconditional branching, subroutine invocation, and lexical scoping within Parrot Intermediate Representation (PIR). Basic branching uses opcodes like goto LABEL for unconditional jumps and if I0, LABEL to branch to a label if the integer register I0 is true (non-zero). More flexible variants include branch OFFSET for relative jumps by instruction offset and unless I0, LABEL for the inverse condition. Subroutines are defined in PIR using .sub name to begin a block and .end to close it, supporting lexical scoping through opcodes such as store_lex 'var', P0 to bind a PMC to a lexical name and find_lex P0, 'var' to retrieve it. Invocation occurs via invoke P0 for calling a subroutine PMC or call LABEL for direct jumps to labels, with returns handled by return. These mechanisms provide the building blocks for loops, conditionals, and other structured constructs in higher-level languages targeting Parrot.

Exception handling in Parrot integrates with its opcode set through a handler-based system using PMC objects for error representation. The throw P0 opcode raises an exception stored in PMC register P0, propagating it up the call stack until caught. Handlers are established with push_eh LABEL to register an exception handler at a label and pop_eh to remove it, allowing structured error recovery. PMCs enable extensible error types, such as exceptions with attributes for messages or types, supporting language-specific error semantics without altering the core VM.

While Parrot's runtime focuses on direct interpretation, PIR compilation applies basic optimizations such as dead-code elimination, which removes unreachable instructions before the code is packed into Parrot Bytecode (PBC) format.
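To make the register-and-branch model described above concrete, here is a minimal Python sketch of an interpreter loop for a handful of Parrot-like instructions. It is an assumption-laden toy: the tuple encoding, the `run` function, and the explicit `label` pseudo-instruction are invented for illustration, and only the mnemonics `set`, `add`, `sub`, `if`, `unless`, `goto`, `print`, and `end` are borrowed from the text.

```python
def run(program):
    """Execute (opcode, *operands) tuples; return (registers, printed output)."""
    # Resolve label names to instruction indexes before execution starts.
    labels = {ins[1]: i for i, ins in enumerate(program) if ins[0] == "label"}
    regs = {}
    out = []
    pc = 0

    def val(operand):
        # An operand is either a register name such as "I0" or an int constant.
        return regs[operand] if isinstance(operand, str) else operand

    while pc < len(program):
        op, *args = program[pc]
        pc += 1
        if op == "label":
            continue
        elif op == "set":
            regs[args[0]] = val(args[1])
        elif op == "add":
            regs[args[0]] = val(args[1]) + val(args[2])
        elif op == "sub":
            regs[args[0]] = val(args[1]) - val(args[2])
        elif op == "if":        # branch when the register is non-zero
            if val(args[0]) != 0:
                pc = labels[args[1]]
        elif op == "unless":    # branch when the register is zero
            if val(args[0]) == 0:
                pc = labels[args[1]]
        elif op == "goto":
            pc = labels[args[0]]
        elif op == "print":
            out.append(str(val(args[0])))
        elif op == "end":
            break
        else:
            raise ValueError(f"unknown opcode: {op}")
    return regs, out


# Sum the integers 5, 4, 3, 2, 1 with a countdown loop.
program = [
    ("set", "I0", 0),            # accumulator
    ("set", "I1", 5),            # loop counter
    ("label", "loop"),
    ("unless", "I1", "done"),    # leave the loop once the counter is zero
    ("add", "I0", "I0", "I1"),
    ("sub", "I1", "I1", 1),
    ("goto", "loop"),
    ("label", "done"),
    ("print", "I0"),
    ("end",),
]

registers, output = run(program)
```

Running the program leaves 15 in `I0`. A real virtual machine would dispatch on numeric opcodes in compiled bytecode rather than on strings, but the fetch, decode, and branch structure is the same.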

Memory Management and Threading

Parrot employs a pluggable garbage collection subsystem designed to support multiple models, including mark-and-sweep and compacting collectors, with options for incremental, concurrent, and generational variants to balance throughput against pause times. The system uses a tri-color marking scheme, classifying objects as white (unvisited, potentially dead), gray (visited but with unmarked children), or black (fully marked and live), so that marking and collection can proceed without halting the interpreter excessively. While most Polymorphic Containers (PMCs) rely on reference counting for immediate deallocation, the tracing GC intervenes for cyclic references, forming a hybrid approach that minimizes overhead for acyclic structures while ensuring completeness.

Memory allocation in Parrot uses pools to limit fragmentation and speed up allocation, distinguishing between fixed-size and variable-size needs. Fixed-size pools, such as those for PMCs and strings, use arenas (pre-allocated blocks sized for a specific number of objects) to enable rapid, contiguous allocation without individual malloc calls, reducing overhead in high-frequency scenarios like object creation in dynamic languages. Variable-size pools handle buffers and strings with more flexible backing stores, and the overall system allows pool thresholds and GC triggers to be configured to suit the workload, for example infrequent collections for latency-sensitive applications.

Parrot's threading model centers on a per-interpreter concurrency scheduler implemented as a Scheduler PMC, which manages tasks (lightweight units of execution abstracted as Task PMCs) to support models such as threads and event-based programming. Green threads, realized through continuations that capture and restore execution state (including stacks and instruction pointers), enable multitasking within a single OS thread. Tasks are preempted after a quantum (for example, via checks at branches) to simulate concurrency without OS involvement, which suits I/O-bound workloads but is limited to one core.
Experimental native threading extends this with an N:M hybrid model, mapping green tasks to OS threads (up to one per core) for true parallelism, using proxies for inter-thread data sharing and Parrot_thread_create for spawning interpreters with independent GC contexts. Interpreter locking employs mutexes in the lock-based model, requiring PMCs to acquire locks before mutation, while software transactional memory (STM) offers lock-free alternatives through atomic transactions with conflict validation. Actor-like behavior emerges via message-passing PMCs in STM or hybrid setups, where tasks communicate through shared, proxied objects without direct state mutation.

By default, Parrot operates single-threaded, initializing a thread pool scaled to CPU cores only when concurrency is explicitly invoked, a design choice that simplifies embedding and reduces complexity in non-parallel code. Despite these capabilities, concurrency features remained experimental and underutilized when active development ended around 2017; focus had shifted elsewhere, and full multi-threaded stability was never achieved.
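The tri-color marking scheme described in this section can be sketched in Python as follows. This is a generic textbook version under simplifying assumptions (a stop-the-world pass over an explicit heap list); the names `Obj`, `collect`, `heap`, and `roots` are hypothetical and do not correspond to Parrot's C implementation.

```python
from collections import deque

WHITE, GRAY, BLACK = "white", "gray", "black"


class Obj:
    """A heap object holding references to other heap objects."""

    def __init__(self, name):
        self.name = name
        self.refs = []
        self.color = WHITE


def collect(heap, roots):
    """Mark everything reachable from roots, then sweep; return the survivors."""
    for obj in heap:
        obj.color = WHITE            # everything starts as potentially dead

    worklist = deque()
    for root in roots:
        if root.color == WHITE:
            root.color = GRAY        # reached, children not yet scanned
            worklist.append(root)

    while worklist:
        obj = worklist.popleft()
        for child in obj.refs:
            if child.color == WHITE:
                child.color = GRAY
                worklist.append(child)
        obj.color = BLACK            # all children have been queued or marked

    # Sweep: anything still white was never reached and can be reclaimed.
    return [obj for obj in heap if obj.color == BLACK]


a, b, c, d = Obj("a"), Obj("b"), Obj("c"), Obj("d")
a.refs.append(b)     # a -> b is reachable from the root
c.refs.append(d)     # c <-> d form an unreachable cycle
d.refs.append(c)

survivors = [obj.name for obj in collect([a, b, c, d], roots=[a])]
```

The unreachable cycle between `c` and `d` is reclaimed because neither object is ever colored gray, which is the case that pure reference counting cannot handle. An incremental collector would interleave the worklist loop with program execution instead of running it to completion in one call.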

Examples and Usage

Basic Code Snippets

The Parrot virtual machine supports low-level programming through Parrot Assembly (PASM), a register-based assembly language, and Parrot Intermediate Representation (PIR), a higher-level syntax layered over it; both compile to Parrot bytecode. These languages enable direct interaction with the VM's registers and instructions for basic operations. The following snippets demonstrate fundamental syntax for arithmetic, output, and string manipulation, focusing on integer registers (written I in PASM or $I in PIR) and string registers (S or $S).

A simple PASM example adds two integer constants and prints the result. This uses immediate values directly in the add instruction and the print opcode for output, followed by end to terminate execution:

    add I0, 5, 3    # I0 = 5 + 3
    print I0        # write the value of I0 to standard output
    end             # stop execution

This code adds the immediates 5 and 3, stores the sum (8) in register I0, prints the value, and ends the program.

In PIR, code is organized into subroutines delimited by .sub and .end, with assignment operators for clarity. The following defines a main subroutine that adds two integers stored in registers and prints the sum using say, which appends a newline:

    .sub main :main
        $I0 = 10
        $I1 = 20
        add $I2, $I0, $I1    # $I2 = $I0 + $I1
        say $I2              # print the sum followed by a newline
    .end

This assigns 10 to $I0 and 20 to $I1, stores their sum in $I2 (yielding 30), and outputs the result. Alternatively, PIR supports infix notation such as $I2 = $I0 + $I1 for the addition.

For string handling, PASM uses the set instruction to load a constant into a string register, then prints the string followed by a trailing newline:

    set S0, "Hello"
    print S0
    print "\n"      # emit the trailing newline
    end

This stores the string "Hello" in register S0, prints it, adds a newline, and terminates. In PIR, the equivalent is $S0 = "Hello" followed by say $S0, using the same register types inside a subroutine block.

To execute these snippets, save the code in a file with a .pir or .pasm extension (PIR is more common for beginners) and run it with the Parrot interpreter from the command line: parrot example.pir. This runs the source file directly, with no separate compilation step. To compile to bytecode first, use parrot -o example.pbc example.pir and then run parrot example.pbc.

As Parrot is no longer actively maintained (last update 2017), obtain the interpreter by cloning https://github.com/parrot/parrot and building from source following the README instructions, which require a C compiler and optional dependencies like ICU.

Advanced Programming Patterns

Advanced programming in the Parrot virtual machine often involves polymorphic containers (PMCs) for object-oriented patterns, structured control flow for iteration, exception handling for robust error management, coroutines for cooperative multitasking, and the Native Call Interface (NCI) for interoperability with native code. These patterns build upon the register-based architecture to enable efficient, high-level abstractions in dynamic languages targeting Parrot.

Object creation in Parrot PASM uses the new opcode to instantiate PMCs, which serve as versatile objects capable of holding various data types. For instance, to create an Integer PMC and assign a value, the code new P0, ['Integer'] followed by set P0, 42 allocates a new PMC of type Integer in register P0 and sets its value to 42. To retrieve the value into an integer register, use set I0, P0, which assigns the PMC's value to I0. This pattern underlies the implementation of classes and instances in higher-level languages compiled to Parrot bytecode.

Conditional loops provide fine-grained control over repetition, using integer registers and branching opcodes for efficiency. A basic counted loop can be expressed as:

    .local int i
    i = 0
  loop:
    if i >= 10 goto endloop
    inc i
    goto loop
  endloop:

Here a local integer variable i is initialized, compared against the limit with if, incremented via inc, and sent back to the top with goto until the condition triggers the exit branch. This structure avoids deep recursion and is suitable for algorithms requiring iterative processing.

Exception handling in Parrot employs a handler stack to manage runtime errors gracefully. The pattern begins with push_eh handler to register an exception handler, followed by potentially failing code such as a call to divide_by_zero(), and pop_eh to remove the handler afterwards; the handler itself sits at the label handler: and runs say "Error caught" to print a message when it is reached. This mechanism uses exception objects to propagate details, allowing resumption or cleanup without halting the program, and is particularly useful in libraries where error-prone operations must be contained.

Coroutines enable non-preemptive multitasking through continuations, where a subroutine yields control back to the caller while preserving its state. A subroutine can be defined with .sub coro containing yield 1 and closed with .end, and called in another sub as $I0 = coro() to execute until the yield, returning a value (here, 1) and allowing re-invocation to resume from that point. This pattern supports generators and cooperative scheduling in languages like Perl 6, using Parrot's continuation PMCs for lightweight concurrency without full threads.

For integrating with external libraries, the Native Call Interface (NCI) allows direct invocation of C functions by specifying signatures and loading shared objects, without writing and compiling wrapper code. A typical usage involves loading a shared library with loadlib P_lib, 'library.so', getting the function with $P_func = dlfunc P_lib, 'function_name', 'iii', and invoking $P_func with arguments from registers, enabling interop for performance-critical extensions such as mathematical routines or system calls. The prototype string supplied to dlfunc tells Parrot how to convert arguments and the return value, which is how NCI maintains type safety across the boundary.
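The load-library, look-up-symbol, declare-signature, call sequence described for NCI has a close analogue in Python's standard ctypes module, which may make the pattern easier to picture. The sketch below is not Parrot code; it assumes a POSIX system where the C runtime can be located, and the helper name `dlfunc` is borrowed from the text purely as a hypothetical wrapper.

```python
import ctypes
import ctypes.util

# Step 1: load the shared library (the role played by loadlib in the text).
# find_library may return None, in which case CDLL(None) exposes the symbols
# already loaded into the current process; this fallback is POSIX-only.
libc = ctypes.CDLL(ctypes.util.find_library("c"))


def dlfunc(lib, name, restype, argtypes):
    """Hypothetical helper: look up a symbol and attach its C signature."""
    func = getattr(lib, name)
    func.restype = restype      # return type, like the first signature character
    func.argtypes = argtypes    # argument types, like the remaining characters
    return func


# Step 2: fetch int abs(int) from the C library with an explicit signature.
c_abs = dlfunc(libc, "abs", ctypes.c_int, [ctypes.c_int])

# Step 3: invoke it with an ordinary value; ctypes converts at the boundary.
result = c_abs(-42)
```

Declaring the signature up front serves the same purpose as the prototype string passed to dlfunc in Parrot: the runtime learns how to marshal each argument, so a mismatched call is rejected rather than silently corrupting the stack.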

Integrations and Applications

Web Server Modules

mod_parrot is an Apache 2 module designed to integrate the Parrot virtual machine with web servers, enabling the execution of Parrot Intermediate Representation (PIR) or precompiled bytecode (.pbc) files as custom handlers. This module exposes the Apache API and data structures directly to the Parrot interpreter, allowing developers to create dynamic web content in Parrot-based languages without requiring extensive C code. It functions similarly to mod_perl by providing a persistent runtime environment, which improves performance over traditional CGI scripts by avoiding repeated interpreter initialization.

Key features include access to Apache request objects through Polymorphic Containers (PMCs) that wrap the request and support methods for reading headers, arguments, and content while handling authentication and content generation. Output buffering is managed via Parrot's string handling and methods like puts, so responses are sent to the client without direct C intervention. The module also serves as a common layer for higher-level languages (HLLs) on Parrot, such as early Perl 6 implementations via Rakudo or PHP via Pipp, minimizing the need for language-specific wrappers.

Configuration involves loading the module with LoadModule parrot_module modules/mod_parrot.so in the httpd.conf file, followed by directives to initialize and map handlers. For example, ParrotInit /path/to/lib/ModParrot/init.pbc sets up the initial environment, while ParrotLoad /path/to/script.pbc loads specific bytecode; URI mapping uses <Location /example> SetHandler parrot-code ParrotHandler ExampleScript </Location> to associate paths with handlers. These directives plug Parrot scripts into Apache's request processing pipeline.

In practice, mod_parrot supports use cases like generating dynamic web pages with personalized content or implementing custom logic in PIR scripts, akin to Perl scripts in mod_perl environments. For instance, a handler could process form data from a request and output responses built with Parrot's string handling.

Development of mod_parrot culminated in version 0.5, released on January 4, 2009, after which it received no further updates. Although functional at the time, the module became unmaintained once development of the Parrot project itself ended around 2017.

Embedding in Other Systems

The Parrot virtual machine provides an embedding API that enables integration of its runtime into C-based applications, allowing bytecode execution within larger programs. This supports creating interpreters, loading and running bytecode, and managing resources without requiring a full installation.

The original embedding interface, documented in embed.pod, includes functions such as Parrot_new(Parrot_Interp parent) for initializing a new interpreter instance (passing NULL for the root interpreter) and Parrot_runcode(PARROT_INTERP, int argc, char *argv[]) for executing loaded bytecode with command-line arguments. Bytecode loading is handled via Parrot_pbc_read for files or Parrot_pbc_load for in-memory packfiles, so that dynamic language scripts can be incorporated into host applications.

A newer, more stable API outlined in PDD 10 replaces the original, emphasizing opaque data types like Parrot_PMC and Parrot_String for portability. Key functions include Parrot_api_make_interpreter(NULL, 0, args, &interp) to create an interpreter, Parrot_api_load_bytecode_file(interp, filename, &pbc) to load bytecode, and Parrot_api_run_bytecode(interp, main, argc, argv) to execute it. This interface prioritizes stability over direct assembly calls, with all entry points exposed through a single header, parrot/api.h, and error handling that propagates exceptions back to the host application without abrupt termination. Resource management, such as destroying interpreters via Parrot_api_destroy_interpreter, remains the responsibility of the embedding application.

In media and graphics applications, Parrot integrates with libraries like OpenGL through its Native Call Interface (NCI), enabling scripted rendering. Parrot's experimental OpenGL bindings, added in releases such as 1.4.0 in 2009 and expanded in later versions like 3.4.0 in 2011, use NCI to call OpenGL functions directly from Parrot-hosted code, supporting features like shader programming with GLSL and operations for image processing. This allows dynamic scripting in tools for video effects or scientific visualization, where Parrot scripts can manipulate GPU resources alongside C++ or other native code, achieving performance comparable to native implementations. Similar NCI-based extensions could theoretically extend to other frameworks, though practical examples remain limited to experimental prototypes due to Parrot's niche adoption.

For standalone applications, embedding makes it possible to bundle bytecode with a minimal interpreter stub into self-contained executables, avoiding runtime dependencies. This approach, as described in early documents, involves compiling host code that instantiates the interpreter and runs the embedded bytecode, and is suitable for desktop tools requiring dynamic behavior like configuration scripting. However, creating such binaries requires linking against libparrot, and no dedicated tooling such as an automated bundler was ever widely standardized.

Embedding Parrot presents challenges, particularly in multi-threaded environments, where creating child interpreters without parenting them to the root can lead to unpredictable errors and resource conflicts. The project's development ceased after 2017, with the last commit in October 2017, limiting ongoing support and updates for embedded use cases. As a result, actual deployments in desktop or media applications are rare, overshadowed by more actively maintained virtual machines like the JVM or MoarVM.
