Purely functional programming
In computer science, purely functional programming usually designates a programming paradigm—a style of building the structure and elements of computer programs—that treats all computation as the evaluation of mathematical functions.
Program state and mutable objects are usually modeled with temporal logic, as explicit variables that represent the program state at each step of a program execution: a variable state is passed as an input parameter of a state-transforming function, which returns the updated state as part of its return value. This style handles state changes without losing the referential transparency of the program expressions.
Purely functional programming consists of ensuring that functions, inside the functional paradigm, will only depend on their arguments, regardless of any global or local state. A pure functional subroutine only has visibility of changes of state represented by state variables included in its scope.
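The distinction can be made concrete with a short Scala sketch (the names `area`, `scaledArea`, and `scale` are illustrative, not from any particular library): the first function depends only on its arguments, while the second also reads mutable state outside its parameter list.

```scala
// A pure function: its result depends only on its arguments.
def area(width: Double, height: Double): Double = width * height

// An impure variant: it also reads `scale`, mutable state outside its
// parameter list, so the same call can return different results over time.
var scale: Double = 1.0
def scaledArea(width: Double, height: Double): Double = width * height * scale
```

Calling `area(2.0, 3.0)` always yields `6.0`, whereas the result of `scaledArea(2.0, 3.0)` changes whenever some other code reassigns `scale`.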
Difference between pure and impure functional programming
The exact difference between pure and impure functional programming is a matter of controversy. Sabry's proposed definition of purity is that all common evaluation strategies (call-by-name, call-by-value, and call-by-need) produce the same result, ignoring strategies that error or diverge.[1]
A program is usually said to be functional when it uses some concepts of functional programming, such as first-class functions and higher-order functions.[2] However, a first-class function need not be purely functional, as it may use techniques from the imperative paradigm, such as arrays or input/output methods that use mutable cells, which update their state as side effects. In fact, the earliest programming languages cited as being functional, IPL and Lisp,[3][4] are both "impure" functional languages by Sabry's definition.
Properties of purely functional programming
Strict versus non-strict evaluation
Every evaluation strategy that terminates on a purely functional program returns the same result. In particular, this guarantees that the programmer does not have to consider the order in which programs are evaluated, since eager evaluation will return the same result as lazy evaluation. However, it is still possible that an eager evaluation may not terminate while the lazy evaluation of the same program halts. An advantage of this is that lazy evaluation can be implemented much more easily: since every expression returns the same result at any moment, regardless of program state, its evaluation can be delayed as much as necessary.
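A small Scala sketch makes the termination difference tangible (the names are illustrative): a by-name parameter (`=> Int`) approximates lazy evaluation of an argument, so a diverging argument is harmless as long as it is never demanded.

```scala
// `firstStrict` evaluates both arguments before the call; `firstLazy`
// takes its second argument by name (`=> Int`), so it is evaluated only
// if the body demands it.
def firstStrict(a: Int, b: Int): Int = a
def firstLazy(a: Int, b: => Int): Int = a

// A computation that never terminates if evaluated.
def diverge: Int = diverge

val ok = firstLazy(1, diverge) // returns 1: the second argument is never forced
// firstStrict(1, diverge) would loop forever, because the argument is
// evaluated eagerly before the call.
```

Both functions return the same result whenever both terminate; the strategies differ only in which programs terminate at all.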
Parallel computing
In a purely functional language, the only dependencies between computations are data dependencies, and computations are deterministic. Therefore, to program in parallel, the programmer need only specify the pieces that should be computed in parallel, and the runtime can handle all other details such as distributing tasks to processors, managing synchronization and communication, and collecting garbage in parallel. This style of programming avoids common issues such as race conditions and deadlocks, but offers less control than an imperative language.[5]
To ensure a speedup, the granularity of tasks must be carefully chosen to be neither too big nor too small. In theory, it is possible to use runtime profiling and compile-time analysis to judge whether introducing parallelism will speed up the program, and thus automatically parallelize purely functional programs. In practice, such analyses have had limited success, and fully automatic parallelization remains impractical.[5]
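The idea that the programmer only names the parallel pieces can be sketched in Scala with futures (a hedged illustration using the standard `scala.concurrent` API, not a claim about any particular parallel runtime): the two halves of the fold share no mutable state, so they can run on separate threads without locks.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// A pure computation over a range: no shared mutable state, so the two
// halves below may safely run concurrently.
def sumRange(lo: Int, hi: Int): Long = (lo to hi).foldLeft(0L)(_ + _)

val left  = Future(sumRange(1, 500000))        // programmer marks the pieces...
val right = Future(sumRange(500001, 1000000))  // ...runtime schedules the threads
val total = Await.result(left.zip(right).map { case (a, b) => a + b }, 10.seconds)
```

Because `sumRange` is pure, splitting the work cannot change the answer; only the schedule differs.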
Data structures
Purely functional data structures are persistent. Persistence is required for functional programming; without it, the same computation could return different results. Impure functional programming may use persistent data structures whose implementations are not purely functional, but those implementations may not be used in purely functional programs.
Purely functional data structures are often represented differently from their imperative counterparts.[6] For example, the array, with constant-time access and update, is a basic component of most imperative languages, and many imperative data structures, such as the hash table and binary heap, are based on arrays. Arrays can be replaced by maps or random-access lists, which admit purely functional implementations but have logarithmic access and update times. Therefore, purely functional data structures can be used in non-functional languages, but they may not be the most efficient tool available, especially if persistence is not required.
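Scala's immutable `Vector` is a convenient way to see persistence in action (a sketch using the standard collections): an update returns a new version sharing most of its structure with the old one, and the original is left intact.

```scala
// `Vector` is a persistent sequence: `updated` returns a new version
// sharing most structure with the old one, which remains unchanged.
// Access and update are effectively logarithmic, not constant time.
val v1 = Vector(1, 2, 3)
val v2 = v1.updated(0, 99)
```

After the update, `v1` is still `Vector(1, 2, 3)`; an imperative array update would instead have destroyed the old contents.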
In general, conversion of an imperative program to a purely functional one also requires ensuring that the formerly-mutable structures are now explicitly returned from functions that update them, a program structure called store-passing style.
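A minimal Scala sketch of store-passing style (the `Store` and `increment` names are illustrative): the formerly mutable counter becomes an explicit value, and each "update" returns the new store alongside its result.

```scala
// Store-passing style: the store is an explicit value; an "update"
// returns a result together with the new store instead of mutating.
final case class Store(counter: Int)

def increment(s: Store): (Int, Store) = {
  val next = s.counter + 1
  (next, Store(next))
}

val s0      = Store(0)
val (a, s1) = increment(s0)   // a == 1, s1 holds the updated store
val (b, _)  = increment(s1)   // b == 2; s0 is untouched throughout
```

Threading the store explicitly is what preserves referential transparency: every state the program passes through has a name.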
Purely functional language
A purely functional language is a language which admits only purely functional programming. However, purely functional programs can also be written in languages which are not purely functional.
References
- ^ Sabry, Amr (January 1998). "What is a purely functional language?". Journal of Functional Programming. 8 (1): 1–22. CiteSeerX 10.1.1.27.7800. doi:10.1017/S0956796897002943. S2CID 30595712.
- ^ Atencio, Luis (18 June 2016). Functional Programming in JavaScript. Manning Publications. ISBN 978-1617292828.
- ^ The memoir of Herbert A. Simon (1991), Models of My Life, pp. 189–190, ISBN 0-465-04640-1, claims that he, Al Newell, and Cliff Shaw are "commonly adjudged to be the parents of [the] artificial intelligence [field]", for writing Logic Theorist, a program which proved theorems from Principia Mathematica automatically. To accomplish this, they had to invent a language and a paradigm which, viewed retrospectively, embeds functional programming.
- ^ McCarthy, John (June 1978). "History of LISP". The first ACM SIGPLAN conference on History of programming languages - HOPL-1. pp. 217–223. doi:10.1145/800025.808387.
- ^ a b Marlow, Simon (18 June 2013). Parallel and Concurrent Programming in Haskell: Techniques for Multicore and Multithreaded Programming. O'Reilly Media. pp. 5–6. ISBN 978-1449335946.
- ^ Purely functional data structures by Chris Okasaki, Cambridge University Press, 1998, ISBN 0-521-66350-4
Fundamentals
Definition of Purity
In purely functional programming, a pure function is defined as one that, for any given input, always produces the same output and has no observable side effects, such as input/output operations, mutation of state, or interactions with external environments.[7][8] This ensures that the function's behavior is deterministic and solely dependent on its arguments, mirroring the properties of mathematical functions.[9] At its core, purely functional programming treats computations as the evaluation of mathematical functions, where programs are constructed as compositions of expressions rather than sequences of imperative commands that alter state.[4] This approach emphasizes immutability and the absence of hidden dependencies, allowing programs to be reasoned about equationally, much like algebraic manipulations.[7]

The foundational principles of purity trace their origins to the lambda calculus, developed by Alonzo Church in the 1930s as a formal system for expressing computation through function abstraction and application without any mutable state.[10][11] Church's work provided a theoretical basis for modeling all computable functions in a purely applicative manner, influencing the design of modern purely functional languages.[10]

For instance, a simple arithmetic operation like addition exemplifies a pure function: given inputs 2 and 3, it invariably returns 5, with no alteration to external variables or generation of side effects.[7] In contrast, a random number generator is impure, as it may yield different results for the same input due to reliance on internal or external state, violating purity.[12] Purity in functions also enables referential transparency, where an expression can be substituted with its value without altering the program's meaning.[7]

Referential Transparency
Referential transparency is a fundamental property in purely functional programming, where an expression can be replaced by its evaluated value at any point in the program without altering the overall behavior or meaning. This substitutability ensures that the value of an expression depends solely on its subexpressions and not on external factors such as mutable state or execution context. As defined in early programming language theory, referential transparency means that to evaluate an expression containing a subexpression, only the value of the subexpression is needed, preserving mathematical-like predictability.[13] Purity in functional programming, characterized by the absence of side effects and deterministic evaluation given the same inputs, directly implies referential transparency, since expressions remain consistent under substitution. Formally, in the lambda calculus, this is illustrated by the beta-reduction equivalence (λx. e) a = e[a/x], where the application can be substituted by its result without affecting the program's semantics, demonstrating how pure functions enable such equational substitutions.[14][13]

The benefits of referential transparency are profound for program analysis and optimization. It facilitates techniques like common subexpression elimination, where repeated occurrences of the same expression are computed only once, improving efficiency without changing results. Moreover, it supports equational reasoning, treating programs as algebraic expressions that can be manipulated and proven correct using mathematical laws, as seen in the algebra of functional programs, where combining forms allow transformations such as the associativity of composition, f ∘ (g ∘ h) = (f ∘ g) ∘ h.
This equational approach contrasts with imperative styles by enabling formal verification and optimization at a high level.[14][15] In practical terms, referential transparency simplifies debugging and testing by ensuring functions produce predictable outputs solely from their inputs, independent of order or context, thereby reducing errors from hidden dependencies and allowing modular composition with confidence.[14]

Differences from Impure Functional Programming
Side Effects and Impurity
In purely functional programming, functions produce no observable changes beyond their return values, but side effects introduce modifications to the external environment that violate this principle. Common types of side effects include input/output (I/O) operations, such as reading from or writing to files or consoles; modifications to global variables, which alter shared state across function invocations; throwing exceptions, which abruptly terminates the normal flow of execution; and non-local control flow mechanisms, like continuations or gotos, that jump execution to arbitrary points in the program. These effects make function behavior dependent on context, execution order, or external state, contrasting sharply with the deterministic nature of pure functions.[16][17]

Impure functional programming languages, such as Lisp and Scala, permit side effects alongside functional constructs like higher-order functions and recursion, fostering hybrid imperative-functional styles that blend purity with practicality. In Lisp, for instance, primitives like rplaca and rplacd enable mutation of list structures, allowing destructive updates within otherwise applicative code. Similarly, Scala integrates functional features with object-oriented elements, supporting mutable variables and I/O directly in functional expressions, which enables efficient real-world applications but compromises strict purity. This impurity allows developers to leverage functional abstractions for expressiveness while incorporating imperative idioms for performance-critical tasks.[18]
The introduction of side effects leads to several consequences that undermine the benefits of functional programming, including loss of predictability due to non-deterministic outcomes from timing or order dependencies, increased difficulty in debugging because effects can propagate unpredictably across modules, and reduced composability as functions become harder to reuse without considering their environmental impact. For example, consider an impure function that computes a sum and prints it to the console:
def impureSum(a: Int, b: Int): Int = {
val result = a + b
println(s"Sum is: $result") // Side effect: I/O
result
}
This function's behavior changes based on when it is called, potentially interleaving output with other prints, whereas a pure equivalent simply returns the value without external interaction:
def pureSum(a: Int, b: Int): Int = a + b
The pure version maintains referential transparency—replacing calls with their results does not alter program behavior—while the impure one breaks it, complicating reasoning and testing. Side effects thus introduce bugs related to execution order and state interference, which pure functions avoid entirely.[19]
To mitigate impurity in such languages, monads serve as a conceptual technique to encapsulate side effects, structuring computations so that effects like I/O or state changes are isolated within a composable wrapper, preserving much of the pure style's modularity without eliminating impurity outright. This approach, originally developed for pure languages but adaptable to impure ones, sequences effects predictably while allowing controlled interaction with the environment.[20]
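A minimal IO-style wrapper in Scala shows the shape of this technique (an illustrative sketch, not a library API; real implementations such as those in effect libraries are considerably more elaborate): an effect is a description, a thunk, that runs only when explicitly executed, and `flatMap` sequences descriptions without performing them.

```scala
// A minimal IO-style wrapper: an effect is a description (a thunk) that
// executes only when `run()` is called; `map` and `flatMap` compose
// descriptions without performing any effects.
final case class IO[A](run: () => A) {
  def map[B](f: A => B): IO[B]         = IO(() => f(run()))
  def flatMap[B](f: A => IO[B]): IO[B] = IO(() => f(run()).run())
}

def putLine(s: String): IO[Unit] = IO(() => println(s))

// Building the program performs no I/O; the effect happens only at `run()`.
val program: IO[Unit] = IO(() => 2 + 3).flatMap(n => putLine(s"Sum is: $n"))
program.run()
```

Until `run()` is invoked, `program` is just an inert value, so it can be stored, composed, and tested like any other pure datum.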
Mutable State Management
In impure functional programming, mutable state is managed through mechanisms like reference cells and arrays that allow in-place modifications, enabling imperative-style updates within a largely functional framework. For instance, in languages such as Standard ML (SML), reference cells of type 'a ref are created using the ref constructor, dereferenced with the ! operator, and updated via the := assignment operator, permitting variables to change over time, unlike immutable values in pure paradigms. Arrays, provided via the Array structure, similarly support mutable operations like Array.update, which alters elements at specific indices without creating new structures. These features extend functional languages with imperative capabilities, as seen in SML's design, where such primitives integrate seamlessly with higher-order functions but introduce observable changes to program state.
A key challenge of mutable state in impure functional programming is the introduction of aliasing, where multiple identifiers can refer to the same underlying storage location, leading to unexpected interactions during updates. For example, consider a counter implemented with a shared reference: val counter = ref 0; fun increment () = (counter := !counter + 1; !counter);, where calling increment() twice yields 1 and then 2, but if another alias val alias = counter is used to set alias := 0 midway, subsequent increments start from the reset value, complicating reasoning about the program's behavior. This aliasing breaks referential transparency, as the meaning of an expression can depend on external mutations rather than solely on its subexpressions, making equational reasoning difficult and increasing the risk of subtle bugs. In concurrent settings, mutable state exacerbates issues like race conditions, where unsynchronized access to shared references can produce nondeterministic outcomes, though SML itself lacks built-in concurrency primitives to mitigate this.
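The same aliasing hazard appears in Scala, the language of the earlier examples (a sketch using a one-element array as a stand-in for a mutable cell): two names bound to the same storage make updates through either name visible through both.

```scala
// A one-element array serves as a mutable cell; two names share it.
val counter = Array(0)
val alias   = counter   // second name for the same storage

counter(0) += 1         // counter(0) == 1
alias(0) = 0            // a reset through the alias is visible to `counter`
counter(0) += 1         // counter(0) == 1 again, not 2
```

Nothing in the text of the final increment reveals the intervening reset; that is exactly the loss of local reasoning that aliasing causes.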
To address these challenges while retaining some functional benefits, impure functional languages employ workarounds such as state-passing patterns, where state is explicitly threaded through function arguments to avoid hidden mutations, or closures that encapsulate mutable references for localized control. In state-passing, a function might take the current state as input and return an updated state alongside the result, simulating purity by making dependencies explicit, though this incurs overhead from repeated copying or allocation compared to direct mutation. Closures can capture references in their environment, allowing incremental updates within a scoped context, but they still risk aliasing if the captured state is shared externally. These techniques aim to localize impurity but often require careful discipline to prevent proliferation of side effects, which encompass broader external interactions beyond internal state changes.
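The closure-based workaround can be sketched in Scala (illustrative names): the mutable cell is captured inside `makeCounter` and is invisible outside it, so the impurity is confined to one scope, though each call still observes the counter's history.

```scala
// A closure localizing a mutable reference: the cell `n` cannot be
// aliased from outside `makeCounter`, so the impurity stays in one scope.
def makeCounter(): () => Int = {
  var n = 0            // private mutable state, captured by the closure
  () => { n += 1; n }
}

val tick = makeCounter()
tick()                  // 1
tick()                  // 2; a fresh counter would start again from its own state
```

Contrast this with explicit state-passing, where the state has a name at every step; the closure trades that visibility for convenience while keeping the mutation local.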
The trade-offs of mutable state in impure functional programming favor efficiency for algorithms benefiting from in-place updates, such as certain graph traversals or accumulators, where avoiding data duplication reduces time and space complexity. However, this comes at the cost of reliability, as aliasing and loss of transparency hinder debugging, testing, and parallelization, potentially leading to programs that are harder to verify than their immutable counterparts.
Core Properties
Evaluation Strategies
In strictly evaluated purely functional languages, such as PureScript, function arguments are evaluated to values prior to function application, a strategy known as call-by-value.[21] This approach simplifies the operational semantics by ensuring a predictable evaluation order, and it evaluates each argument exactly once, avoiding the repeated evaluation of arguments that call-by-name can incur. The cost is that a strict evaluator may fail to terminate on a program whose diverging argument a non-strict strategy would never evaluate, and in general it promotes more straightforward debugging and implementation.

In contrast, non-strict or lazy evaluation defers the computation of arguments until their values are actually demanded by the function body, typically implemented via call-by-need to share results and avoid recomputation.[22] This strategy, central to Haskell, enables the construction and manipulation of infinite data structures, such as lazy lists, without immediate resource exhaustion, as only the required portions are computed on demand.[19] Additionally, it supports natural short-circuiting in control structures like conditionals, where unevaluated branches are discarded if not selected.[22]

To illustrate the difference, consider a conditional expression if cond then expr1 else expr2, treated as an ordinary function of three arguments. A strict evaluator would compute both expr1 and expr2 regardless of cond's value, wasting effort if, for example, expr2 is an expensive or non-terminating computation. A lazy evaluator computes only the selected branch, mirroring logical short-circuiting and, thanks to referential transparency, permitting such optimizations without altering program meaning.
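A user-defined conditional in Scala makes this concrete (the names `select` and `boom` are illustrative): declaring the branches by name means only the branch picked by the condition is ever evaluated, just as a lazy evaluator would behave.

```scala
// A user-defined conditional with by-name branches: only the branch
// selected by `cond` is ever evaluated, mimicking lazy short-circuiting.
def select[A](cond: Boolean, ifTrue: => A, ifFalse: => A): A =
  if (cond) ifTrue else ifFalse

// A branch that fails loudly if it is ever forced.
def boom: Int = sys.error("dead branch was evaluated")

val chosen = select(true, 1, boom) // boom is never forced
```

Had `select` taken ordinary (strict) parameters, the call would crash while evaluating `boom` before the function body even ran.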
Comparatively, strict evaluation may perform unnecessary work on discarded arguments, reducing efficiency for higher-order functions where arguments might go unused, while lazy evaluation excels in compositional settings but can incur higher memory overhead from thunk storage and sharing mechanisms.[19] Referential transparency in purely functional programs further aids lazy strategies by allowing safe reordering and common subexpression elimination during optimization.[23]
Lazy evaluation was popularized in the 1990s through Haskell, where it became a defining feature to support modular, declarative programming with infinite structures and higher-order combinators.[23] Its theoretical foundations trace to denotational semantics, as explored in natural semantics models that separate evaluation from sharing to capture lazy behavior precisely.[24]