Recent from talks
Nothing was collected or created yet.
Proof assistant
View on WikipediaThis article is missing information about automated proof checking. (February 2024) |
This article includes a list of general references, but it lacks sufficient corresponding inline citations. (November 2018) |

In computer science and mathematical logic, a proof assistant or interactive theorem prover is a software tool to assist with the development of formal proofs by human–machine collaboration. This involves some sort of interactive proof editor, or other interface, with which a human can guide the search for proofs, the details of which are stored in, and some steps provided by, a computer.
A recent effort within this field is making these tools use artificial intelligence to automate the formalization of ordinary mathematics.[1]
System comparison
[edit]| Name | Latest version | Developer(s) | Implementation language | Features | |||||
|---|---|---|---|---|---|---|---|---|---|
| Higher-order logic | Dependent types | Small kernel |
Proof automation | Proof by reflection |
Code generation | ||||
| ACL2 | 8.3 | Matt Kaufmann, J Strother Moore | Common Lisp | No | Untyped | No | Yes | Yes[2] | Already executable |
| Agda | 2.6.4.3[3] | Ulf Norell, Nils Anders Danielsson, and Andreas Abel (Chalmers and Gothenburg)[3] | Haskell[3] | Yes [citation needed] |
Yes [4] |
Yes [citation needed] |
No [citation needed] |
Partial [citation needed] |
Already executable [citation needed] |
| Albatross | 0.4 | Helmut Brandl | OCaml | Yes | No | Yes | Yes | Unknown | Not yet implemented |
| F* | repository | Microsoft Research and INRIA | F* | Yes | Yes | No | Yes | Yes[5] | Yes |
| HOL Light | repository | John Harrison | OCaml | Yes | No | Yes | Yes | No | No |
| HOL4 | Kananaskis-13 (or repo) | Michael Norrish, Konrad Slind, and others | Standard ML | Yes | No | Yes | Yes | No | Yes |
| Idris | 2 0.6.0 | Edwin Brady | Idris | Yes | Yes | Yes | Unknown | Partial | Yes |
| Isabelle | Isabelle2025 (March 2025) | Larry Paulson (Cambridge), Tobias Nipkow (München) and Makarius Wenzel | Standard ML, Scala | Yes | No | Yes | Yes | Yes | Yes |
| Lean | v4.23.0[6] | Leonardo de Moura (Microsoft Research) | C++, Lean | Yes | Yes | Yes | Yes | Yes | Yes |
| LEGO | 1.3.1 | Randy Pollack (Edinburgh) | Standard ML | Yes | Yes | Yes | No | No | No |
| Metamath | v0.198[7] | Norman Megill | ANSI C | ||||||
| Mizar | 8.1.11 | Białystok University | Free Pascal | Partial | Yes | No | No | No | No |
| Nqthm | |||||||||
| NuPRL | 5 | Cornell University | Common Lisp | Yes | Yes | Yes | Yes | Unknown | Yes |
| PVS | 6.0 | SRI International | Common Lisp | Yes | Yes | No | Yes | No | Unknown |
| Rocq | 9.0 | INRIA | OCaml | Yes | Yes | Yes | Yes | Yes | Yes |
| Twelf | 1.7.1 | Frank Pfenning, Carsten Schürmann | Standard ML | Yes | Yes | Unknown | No | No | Unknown |
- ACL2 – a programming language, a first-order logical theory, and a theorem prover (with both interactive and automatic modes) in the Boyer–Moore tradition.
- Rocq (former name: Coq) – Allows the expression of mathematical assertions, mechanically checks proofs of these assertions, helps to find formal proofs, and extracts a certified program from the constructive proof of its formal specification.
- HOL theorem provers – A family of tools ultimately derived from the LCF theorem prover. In these systems the logical core is a library of their programming language. Theorems represent new elements of the language and can only be introduced via "strategies" which guarantee logical correctness. Strategy composition gives users the ability to produce significant proofs with relatively few interactions with the system. Members of the family include:
- HOL4 – The "primary descendant", still under active development. Support for both Moscow ML and Poly/ML. Has a BSD-style license.
- HOL Light – A thriving "minimalist fork". OCaml based.
- ProofPower – Went proprietary, then returned to open source. Based on Standard ML.
- IMPS, An Interactive Mathematical Proof System.[8]
- Isabelle is an interactive theorem prover, successor of HOL. The main code-base is BSD-licensed, but the Isabelle distribution bundles many add-on tools with different licenses.
- Jape – Java based.
- Lean
- LEGO
- Matita – A light system based on the Calculus of Inductive Constructions.
- MINLOG – A proof assistant based on first-order minimal logic.
- Mizar – A proof assistant based on first-order logic, in a natural deduction style, and Tarski–Grothendieck set theory.
- PhoX – A proof assistant based on higher-order logic which is eXtensible.
- Prototype Verification System (PVS) – a proof language and system based on higher-order logic.
- Theorem Proving System (TPS) and ETPS – Interactive theorem provers also based on simply typed lambda calculus, but based on an independent formulation of the logical theory and independent implementation.
User interfaces
[edit]A popular front-end for proof assistants is the Emacs-based Proof General, developed at the University of Edinburgh.
Rocq includes RocqIDE, which is based on OCaml/Gtk. Isabelle includes Isabelle/jEdit, which is based on jEdit and the Isabelle/Scala infrastructure for document-oriented proof processing. More recently, Visual Studio Code extensions have been developed for Rocq,[9] Isabelle by Makarius Wenzel,[10] and for Lean 4 by the leanprover developers.[11]
Formalization extent
[edit]Freek Wiedijk has been keeping a ranking of proof assistants by the amount of formalized theorems out of a list of 100 well-known theorems. As of September 2025, only six systems have formalized proofs of more than 70% of the theorems, namely Isabelle, HOL Light, Lean, Rocq, Metamath and Mizar.[12][13]
Notable formalized proofs
[edit]The following is a list of notable proofs that have been formalized within proof assistants.
| Theorem | Proof assistant | Year |
|---|---|---|
| Four color theorem[14] | Rocq | 2005 |
| Feit–Thompson theorem[15] | Rocq | 2012 |
| Fundamental group of the circle[16] | Rocq | 2013 |
| Erdős–Graham problem[17][18] | Lean | 2022 |
| Polynomial Freiman-Ruzsa conjecture over [19] | Lean | 2023 |
| BB(5) = 47,176,870[20] | Rocq | 2024 |
See also
[edit]- Automated theorem proving – Subfield of automated reasoning and mathematical logic
- Computer-assisted proof – Mathematical proof at least partially generated by computer
- Formal verification – Proving or disproving the correctness of certain intended algorithms
- QED manifesto – Proposal for a computer-based database of all mathematical knowledge
- Satisfiability modulo theories – Logical problem studied in computer science
- Prover9 – is an automated theorem prover for first-order and equational logic
Notes
[edit]- ^ Ornes, Stephen (August 27, 2020). "Quanta Magazine – How Close Are Computers to Automating Mathematical Reasoning?".
- ^ Hunt, Warren; Kaufmann, Matt; Krug, Robert Bellarmine; Moore, J.; Smith, Eric W. (2005). "Meta Reasoning in ACL2" (PDF). Theorem Proving in Higher Order Logics. Lecture Notes in Computer Science. Vol. 3603. pp. 163–178. doi:10.1007/11541868_11. ISBN 978-3-540-28372-0.
- ^ a b c "agda/agda: Agda is a dependently typed programming language / interactive theorem prover". GitHub. Retrieved 31 July 2024.
- ^ "The Agda Wiki". Retrieved 31 July 2024.
- ^ Search for "proofs by reflection": arXiv:1803.06547
- ^ "Lean 4 Releases Page". GitHub. Retrieved 22 September 2025.
- ^ "Release v0.198 metamath/Metamath-exe". GitHub.
- ^ Farmer, William M.; Guttman, Joshua D.; Thayer, F. Javier (1993). "IMPS: An interactive mathematical proof system". Journal of Automated Reasoning. 11 (2): 213–248. doi:10.1007/BF00881906. S2CID 3084322. Retrieved 22 January 2020.
- ^ "coq-community/vscoq". July 29, 2024 – via GitHub.
- ^ Wenzel, Makarius. "Isabelle". Retrieved 2 November 2019.
- ^ "VS Code Lean 4". GitHub. Retrieved 15 October 2023.
- ^ Wiedijk, Freek (22 September 2025). "Formalizing 100 Theorems".
- ^ Geuvers, Herman (February 2009). "Proof assistants: History, ideas and future". Sādhanā. 34 (1): 3–25. doi:10.1007/s12046-009-0001-5. hdl:2066/75958. S2CID 14827467.
- ^ Gonthier, Georges (2008), "Formal Proof—The Four-Color Theorem" (PDF), Notices of the American Mathematical Society, 55 (11): 1382–1393, MR 2463991, archived (PDF) from the original on 2011-08-05
- ^ "Feit thomson proved in coq - Microsoft Research Inria Joint Centre". 2016-11-19. Archived from the original on 2016-11-19. Retrieved 2023-12-07.
- ^ Licata, Daniel R.; Shulman, Michael (2013). "Calculating the Fundamental Group of the Circle in Homotopy Type Theory". 2013 28th Annual ACM/IEEE Symposium on Logic in Computer Science. pp. 223–232. arXiv:1301.3443. doi:10.1109/lics.2013.28. ISBN 978-1-4799-0413-6. S2CID 5661377.
- ^ "Math Problem 3,500 Years In The Making Finally Gets A Solution". IFLScience. 2022-03-11. Retrieved 2024-02-09.
- ^ Avigad, Jeremy (2023). "Mathematics and the formal turn". arXiv:2311.00007 [math.HO].
- ^ Sloman, Leila (2023-12-06). "'A-Team' of Math Proves a Critical Link Between Addition and Sets". Quanta Magazine. Retrieved 2023-12-07.
- ^ "We have proved "BB(5) = 47,176,870"". The Busy Beaver Challenge. 2024-07-02. Retrieved 2024-07-09.
References
[edit]- Barendregt, Henk; Geuvers, Herman (2001). "18. Proof-assistants using Dependent Type Systems" (PDF). In Robinson, Alan J. A.; Voronkov, Andrei (eds.). Handbook of Automated Reasoning. Vol. 2. Elsevier. pp. 1149–. ISBN 978-0-444-50812-6. Archived from the original (PDF) on 2007-07-27.
- Pfenning, Frank. "17. Logical frameworks" (PDF). Handbook vol 2 2001. pp. 1065–1148.
- Pfenning, Frank (1996). "The practice of logical frameworks". In Kirchner, H. (ed.). Trees in Algebra and Programming – CAAP '96. Lecture Notes in Computer Science. Vol. 1059. Springer. pp. 119–134. doi:10.1007/3-540-61064-2_33. ISBN 3-540-61064-2.
- Constable, Robert L. (1998). "X. Types in computer science, philosophy and logic". In Buss, S. R. (ed.). Handbook of Proof Theory. Studies in Logic. Vol. 137. Elsevier. pp. 683–786. ISBN 978-0-08-053318-6.
- Wiedijk, Freek (2005). "The Seventeen Provers of the World" (PDF). Radboud University Nijmegen.
External links
[edit]This article's use of external links may not follow Wikipedia's policies or guidelines. (December 2022) |
- Theorem Prover Museum
- "Introduction" in Certified Programming with Dependent Types.
- Introduction to the Coq Proof Assistant (with a general introduction to interactive theorem proving)
- Interactive Theorem Proving for Agda Users
- A list of theorem proving tools
- Catalogues
- Digital Math by Category: Tactic Provers
- Automated Deduction Systems and Groups
- Theorem Proving and Automated Reasoning Systems
- Database of Existing Mechanized Reasoning Systems
- NuPRL: Other Systems
- "Specific Logical Frameworks and Implementations". Archived from the original on 10 April 2022. Retrieved 15 February 2024. (By Frank Pfenning).
- DMOZ: Science: Math: Logic and Foundations: Computational Logic: Logical Frameworks
Proof assistant
View on GrokipediaFundamentals
Definition and Purpose
A proof assistant is a software system designed to aid in the construction, verification, and checking of formal mathematical proofs through mechanized reasoning.[7] These tools allow users to encode mathematical statements and proofs in a formal language, where the system rigorously validates each step against a specified logical foundation, ensuring that the proof is correct relative to the axioms and inference rules employed.[8] The core purpose of proof assistants is to mitigate human error inherent in manual proof-writing, facilitate interactive development of proofs where users guide the process with high-level commands, and support the machine-checked formalization of intricate theorems that might otherwise be prone to subtle mistakes.[9] By providing a trusted environment for proof validation, these systems enable reliable certification of mathematical results, which is particularly valuable in domains requiring absolute certainty, such as foundational mathematics, software verification, and cryptography.[7] Proof assistants differ from automated theorem provers, which primarily automate the search for proofs with minimal user intervention, by prioritizing user-directed, verifiable proof steps that allow for detailed control and inspection.[10] In contrast to computer algebra systems, which focus on symbolic manipulation and computation of expressions (e.g., solving equations or simplifying polynomials), proof assistants emphasize logical deduction and proof certification rather than mere algebraic evaluation.[11] These systems trace their origins to early efforts in automated deduction in the late 1960s and 1970s, such as investigations into mechanical proof generation, with more structured interactive frameworks emerging in the 1980s.[12]Core Components
Proof assistants rely on a modular architecture centered around several key components that facilitate the construction and verification of formal proofs. Many such systems, particularly those in the LCF tradition or based on type theory, feature a kernel as the trustworthy core responsible for proof checking, implementing a minimal set of primitive rules and type-checking algorithms to validate all user-provided content.[13] This small kernel ensures reliability by reducing complex proofs to basic, manually verifiable operations, adhering to the de Bruijn criterion, which requires generating independently checkable proof objects that can be verified by a simple program regardless of the system's complexity.[14] Other systems, such as declarative ones like Mizar, use alternative verification mechanisms without a minimal kernel.[1] In interactive proof assistants, tactics act as procedural commands that guide the proof process by transforming proof goals into subgoals, such as applying lemmas or performing case analysis, allowing users to build proofs interactively without directly manipulating low-level proof terms.[1] Declarative systems instead emphasize writing proofs in a style close to natural mathematical language, with the system checking justifications against a library. Libraries provide pre-built collections of theorems, definitions, and axioms, enabling reuse of established mathematical results to accelerate proof development; for instance, Coq's standard libraries include modules like Arith for arithmetic operations.[13] Proof scripts consist of user-written sequences of commands, including tactic applications and declarations, that orchestrate the overall proof construction in a structured, reproducible manner.[1] In systems with a kernel, the proof verification process hinges on its role in validating large-scale proofs through efficient type checking, ensuring consistency even as proofs grow to megabytes in size.[14] By the de Bruijn criterion, the kernel independently confirms that each proof step adheres to the system's logical rules, minimizing trust assumptions to just this core component and preventing inconsistencies from propagating through higher-level abstractions.[1] This separation allows elaborate user interactions and tactics to be untrusted, as long as they produce verifiable proof objects for kernel inspection.[13] A typical workflow in interactive proof assistants begins with the user stating a theorem in the system's formal language, followed by applying a series of tactics to decompose the goal into simpler subgoals, ultimately reducing it to primitive axioms or admitted facts that the kernel verifies step by step.[1] For example, in Coq, a user might declare a lemma, use theapply tactic to invoke a relevant theorem from a library, and then employ intros or destruct to handle variables and cases, with the kernel checking each transformation for type correctness.[13] Upon completion, the proof script is executed to generate a proof term, which the kernel type-checks to confirm validity.[1] Declarative systems follow a different process, where users provide detailed justifications that the system verifies against predefined rules and the library.
Automation plays a crucial role in handling routine subtasks through built-in procedures for decidable fragments of the underlying logic, such as linear arithmetic or propositional tautologies, which can resolve subgoals without user intervention.[1] Tactics like Coq's auto or omega leverage these decidable solvers to automatically discharge simple obligations, integrating seamlessly with interactive proof construction while still submitting results to kernel verification.[13] Many proof assistants, including those based on dependent type theory, incorporate such mechanisms to balance human guidance with computational efficiency.[1]
Historical Development
Origins in Automated Reasoning
The origins of proof assistants can be traced to foundational developments in mathematical logic and computability theory during the 1930s and 1950s, which established the theoretical limits of formal systems and computation. Kurt Gödel's incompleteness theorems, published in 1931, demonstrated that in any sufficiently powerful consistent formal system capable of expressing basic arithmetic, there exist true statements that cannot be proved within the system itself, highlighting the inherent incompleteness of axiomatic mathematics. This result underscored the challenges of fully automating mathematical reasoning and influenced later efforts in formal verification by emphasizing the need for systems that could handle undecidable propositions. Complementing Gödel's work, Alan Turing's 1936 paper on computable numbers introduced the concept of a universal computing machine and proved the undecidability of the halting problem, showing that no general algorithm exists to determine whether an arbitrary program will terminate.[15] Alonzo Church's lambda calculus, developed in the early 1930s and applied in his 1936 demonstration of the unsolvability of the Entscheidungsproblem, provided a formal framework for expressing computable functions and higher-order abstractions, serving as a precursor to typed logics used in verification.[16] These contributions collectively laid the groundwork for automated reasoning by revealing the boundaries of mechanical proof and inspiring computational models for logic. In the 1960s, the field advanced toward practical automated theorem proving, driven by efforts to mechanize first-order logic for machine implementation. A pivotal development was J.A. Robinson's introduction of the resolution principle in 1965, a complete and sound inference rule for first-order logic that enabled efficient refutation-based theorem proving by reducing clauses through unification and resolution steps.[17] This method, designed explicitly for computer use, became a cornerstone of automated deduction systems and facilitated the integration of logic into early artificial intelligence programs. Concurrently, early AI logic programs, such as extensions of the 1956 Logic Theorist by Newell and Simon, evolved in the 1960s to explore heuristic search in proof generation, though they often struggled with scalability for non-trivial theorems.[18] These systems marked the shift from theoretical foundations to programmatic attempts at automation, yet they primarily operated in propositional or simple first-order settings, with emerging interest in higher-order logic for more expressive reasoning. The 1970s saw a transition from fully automated provers to semi-automated, interactive approaches, exemplified by the Boyer-Moore theorem prover, which originated in projects at the University of Texas and Xerox PARC starting around 1971. This system emphasized inductive proof techniques and user-guided simplification strategies within a first-order logic framework, allowing human intervention to resolve ambiguities in complex deductions.[19] By focusing on semi-automated deduction, it addressed the practical limitations of pure automation, where exhaustive search often failed due to combinatorial explosion in proof spaces for mathematical theorems.[18] Researchers recognized that fully automatic systems were insufficient for intricate mathematics, as undecidability results and the complexity of real-world proofs necessitated human insight to select lemmas, guide tactics, and manage assumptions, paving the way for interactive proof assistants.[19]Key Milestones and Systems
The development of proof assistants gained momentum in the 1980s with the emergence of early interactive theorem provers that emphasized human-guided proof construction. Earlier, in the late 1960s, the Automath project at Eindhoven University developed the first interactive proof assistant for formalizing mathematical texts. In the 1970s, Robin Milner's LCF at the University of Edinburgh introduced a foundational interactive theorem prover based on higher-order logic, emphasizing a trusted kernel for proof validation.[1] One pivotal system was Nqthm, developed by Robert S. Boyer and J Strother Moore starting in the late 1970s and continuing through the 1980s at the University of Texas at Austin; it introduced automated reasoning techniques for program verification and became influential in industrial applications, such as hardware and software correctness proofs. Concurrently, the HOL system, initiated in 1986 by Mike Gordon at the University of Cambridge, built on the LCF framework to support higher-order logic, enabling more expressive formalizations of mathematics and computer science concepts.[19] Entering the 1990s, proof assistants evolved toward more robust logical foundations and user-friendly interfaces. Coq, developed from 1989 at Inria by Christine Paulin-Mohring and others, was based on the Calculus of Inductive Constructions—a dependent type theory that allowed for concise definitions of inductive structures and proofs—and quickly became a standard for formalizing complex mathematics. Isabelle, originating in 1986 under Larry Paulson at the University of Cambridge and maturing through the 1990s, adopted an LCF-style architecture with a generic framework for classical and intuitionistic logics, facilitating theorem proving across diverse domains. The 2000s and 2010s saw broader adoption in academia and industry, with systems tailored for specific verification needs. PVS, released in 1992 by SRI International under NASA sponsorship, integrated higher-order logic with automated decision procedures, proving effective for aerospace software analysis. ACL2, a successor to Nqthm launched in the mid-1990s by Boyer and Moore, focused on industrial-scale hardware and software verification, supporting executable specifications in a first-order logic with induction. Agda, introduced in 2007 at Chalmers University, advanced dependently typed programming and proof, drawing from Martin-Löf type theory to enable interactive formalization of functional programs as proofs. In the 2010s and 2020s, proof assistants emphasized mathematical formalization and automation enhancements. Lean, developed starting in 2013 by Leonardo de Moura at Microsoft Research, targeted large-scale mathematics libraries with its dependent type theory kernel and integration of tactics for efficient proof scripting. During this period, many systems incorporated SAT solvers and other automated tools; for instance, Isabelle's Sledgehammer (from 2008 onward) translated goals to external solvers like Vampire, significantly boosting proof automation. Key milestones underscored these advancements: the Four Color Theorem was fully formalized in Coq in 2005 by Georges Gonthier, resulting in approximately 60,000 lines of Coq code.[20] Similarly, the Kepler conjecture—affirming the densest sphere packing—was verified in 2014 using HOL Light by the Flyspeck project led by Thomas Hales, confirming Hales' approximately 250-page proof through extensive formal checks.Logical Foundations
Higher-Order Logic Systems
Higher-order logic (HOL) extends first-order logic by permitting quantification over functions and predicates, treating them as first-class entities alongside individuals. This allows for the direct expression of complex mathematical concepts, such as properties of properties or functions that take other functions as arguments, which are inexpressible in first-order systems.[21] The formal semantics of HOL is grounded in the simply typed lambda calculus, where terms are assigned types from a hierarchy starting with base types like bool for propositions and ind for individuals, and built up via function types σ → τ. Logical connectives and quantifiers are encoded as higher-typed constants, with key axioms including extensionality (stating that functions are equal if they agree on all inputs), infinity (ensuring the existence of infinite domains, such as the natural numbers), and a choice axiom via the Hilbert ε-operator, which selects an element satisfying a predicate. This framework ensures a conservative extension of classical first-order logic, preserving its theorems while enabling richer expressiveness without introducing inconsistencies.[22][23] A core feature of HOL is its support for shallow embeddings of mathematics, where domains like numbers, sets, and structures are defined directly using the type system and lambda abstractions, avoiding deep encodings into a separate meta-language. For instance, sets are represented as predicates of type α → bool, allowing primitive treatment of set-theoretic operations within the logic itself. This approach facilitates concise formalizations of classical mathematics.[24] Representative HOL-based proof assistants include HOL Light, which employs a minimal kernel with just 10 primitive inference rules to enforce soundness, making it ideal for foundational work in verification. In contrast, Isabelle/HOL provides an extensible framework for building theories, integrating automated tactics and a rich library for higher-order unification and classical proof procedures.[22][24] The strengths of HOL systems lie in their alignment with classical reasoning, enabling efficient proofs in domains requiring non-constructive methods, such as hardware and software verification. For example, HOL's primitive support for set theory and equality axioms has been leveraged to verify floating-point arithmetic algorithms at scale, demonstrating its practicality for industrial applications.[25][26] These systems typically incorporate interactive theorem proving to guide users through complex derivations.Dependent Type Theory Systems
Dependent type theory provides a foundation for proof assistants where types can depend on values, enabling the expression of propositions as types through the Curry-Howard isomorphism. This correspondence, first articulated in the context of typed lambda calculi, identifies proofs of propositions with programs of corresponding types, allowing logical statements to be computationally inhabited.[27] In this framework, introduced by Per Martin-Löf in his intuitionistic type theory, propositions are treated as types, and proofs are terms that inhabit those types, fostering a constructive approach to mathematics where existence implies constructibility.[28] Key features of dependent type theory include the identification of proofs with programs, which supports the extraction of executable code from formal proofs, and inductive types that model recursive data structures such as natural numbers or lists.[28] Some systems incorporate the univalence axiom from homotopy type theory (HoTT), which equates paths (identities) between types with equivalences, enabling higher-dimensional structures that capture homotopy-theoretic interpretations of equality.[29] To prevent paradoxes like Girard's paradox arising from self-referential universes, these theories employ cumulative or non-cumulative universe levels, stratifying types into a hierarchy where each level classifies types of lower levels. Equality is typically handled via identity types, which form a propositional equality that can be refined in HoTT to include higher paths, or via more intensional variants in core systems.[30] Representative proof assistants based on dependent type theory include Coq, which implements the Calculus of Inductive Constructions (CIC), extending the Calculus of Constructions with inductive definitions to support both proofs and program extraction.[31] Agda realizes Martin-Löf's intuitionistic type theory, emphasizing dependent pattern matching and termination checking for practical programming alongside theorem proving.[32] Lean employs a dependent type theory with a focus on automation through tactics and a kernel based on CIC-like rules, facilitating large-scale formalizations in mathematics and verification.[33] These systems excel in supporting functional programming paradigms and constructive mathematics, as seen in Coq's ability to extract certified algorithms, such as verified implementations of sorting or cryptographic primitives, directly from proofs.[31] This integration bridges formal verification with software development, ensuring that mathematical correctness translates to runtime guarantees without introducing non-constructive axioms by default.System Design and Features
Interactive Theorem Proving
Interactive theorem proving represents the core paradigm in most proof assistants, where users collaboratively construct formal proofs by guiding the system through a series of decomposition steps, rather than relying solely on full automation. In this process, the user begins with a stated theorem, which the system treats as an initial goal, and applies tactics—high-level commands that transform the current proof state into simpler subgoals or resolve them entirely. The proof state encompasses the current goals, hypotheses, and context, which the system maintains and updates after each tactic application, allowing users to focus on high-level strategy while the assistant verifies local correctness. This backward reasoning workflow proceeds goal-oriented, starting from the theorem and working towards primitive axioms or admitted facts, enabling the handling of complex, undecidable problems that evade complete automation.[34] Tactics serve as the primary mechanism for goal decomposition, with users choosing between ad-hoc applications for specific steps and scripted sequences for repetitive or structured proofs. Common examples include induction, which generates subgoals for base cases and inductive steps on recursively defined structures like natural numbers or lists; rewriting, which replaces terms in goals or hypotheses using equalities or definitions to simplify expressions; and case analysis (via destruct in Coq or case_tac in Isabelle), which splits goals based on constructors of inductive types, such as distinguishing empty and non-empty lists. These tactics can be combined using tacticals, such as sequencing (e.g., applying one after another) or repetition, to build proof scripts that mirror mathematical reasoning while ensuring type-theoretic consistency. For instance, proving list reversal properties often involves induction followed by case analysis and rewriting to handle recursive calls.[24] To balance user interaction with computational power, proof assistants integrate external automated solvers, such as SMT solvers (e.g., Z3 or veriT) and equality-matching engines, particularly for proving lemmas in decidable fragments like linear arithmetic or propositional logic. These tools are invoked via dedicated tactics, like Sledgehammer in Isabelle, which translates goals to external provers and reconstructs proofs using verifiable certificates checked by the assistant, or bounded invocations in Coq that limit solver exploration to avoid non-termination. This hybrid approach automates routine subproofs while deferring creative decomposition to the user, as seen in verifications where SMT handles quantifier-free constraints but induction addresses recursive aspects.[34][35] The interactive paradigm offers key benefits, including the ability to manage undecidable logics like higher-order logic by leveraging human insight for non-trivial choices, such as selecting induction hypotheses or case splits, which fosters deeper understanding of proof structures. It also promotes reliability, as the system guarantees that only valid steps advance the proof, reducing errors in formalizing intricate theorems like the Four Color Theorem or Kepler Conjecture. Overall, this user-guided method scales to large formalizations by modularizing proofs into verifiable components, enhancing both mathematical discovery and software verification.[34]User Interfaces and Tools
Proof assistants offer diverse user interfaces to support interactive proof construction, catering to users ranging from novices to experts. These interfaces emphasize ergonomic interaction, enabling step-by-step verification while managing complex proof states. Text-based environments, integrated with extensible editors like Emacs and Vim, form a foundational approach for many systems. Proof General, an Emacs-based framework, provides a unified interface for assistants such as Coq and Isabelle, featuring syntax highlighting, script parsing, and real-time feedback on proof progress.[36] Agda's Emacs mode similarly facilitates interactive type-checking and goal refinement through structured editing and Unicode input methods.[37] Dedicated IDEs enhance usability with graphical elements tailored to proof development. CoqIDE serves as Coq's native standalone interface, offering windows for script editing, goal inspection, and query evaluation, with support for backtracking and error localization.[38] Isabelle/jEdit, built on the jEdit editor, delivers asynchronous processing, markup rendering, and plugin extensibility for seamless proof exploration.[39] For Lean, the VS Code extension acts as the primary IDE, incorporating language server protocol features like hover information and code navigation to streamline theorem stating and tactic application.[40] Visualization tools play a crucial role in presenting proof dynamics clearly. Goal displays, common across interfaces, show current subgoals, hypotheses, and context, as seen in Proof General's dedicated goals buffer that updates incrementally during tactic execution.[41] Error highlighting identifies syntax or logical issues with visual cues, reducing debugging time. Specialized visualizers, such as Prooftree for Coq in Proof General, render proof trees as layered diagrams, with nodes representing tactics and branches indicating subgoal evolution—fully proved paths appear in green for at-a-glance verification.[42] Auxiliary tools augment core interfaces with practical utilities. Proof state debuggers, like Coq's interactive stepping mode, allow users to inspect and retract commands mid-proof, aiding in error diagnosis.[43] Version control integration leverages editor capabilities, such as Git hooks in VS Code or Emacs packages, to track proof script changes collaboratively. Export functions enable documentation generation; Coq's coqdoc utility converts scripts to LaTeX or HTML, producing polished PDFs for sharing formalized results.[44] To improve accessibility, especially in educational settings, notebook-style interfaces promote exploratory learning. Lean's lean4-jupyter kernel integrates with Jupyter notebooks, allowing users to interleave code execution, theorem proofs, and visualizations in a literate programming format.[45] User interfaces have evolved from rudimentary command-line prompts to immersive, web-accessible platforms. Initial systems depended on terminal interactions for tactic input and output, but contemporary developments like jsCoq compile Coq to JavaScript, providing a browser-native IDE with scratchpad editing, key bindings for proof stepping, and embeddable demos—facilitating remote collaboration without installation.[46]Comparison Across Systems
Feature and Capability Overview
Proof assistants vary significantly in their core features and capabilities, reflecting their underlying logical foundations and design goals. Major systems such as Coq, Isabelle/HOL, Lean, and HOL4 support distinct logics—ranging from intuitionistic type theories to classical higher-order logics—while providing tactic-based proof construction, varying degrees of automation, and mechanisms for code extraction and proof reuse. These differences influence their suitability for formalizing mathematics, verifying software, or developing readable proofs.[47] The following table summarizes key comparison criteria across these systems, highlighting supported logics, proof languages, automation, libraries, definition mechanisms, extraction, expressiveness, and reuse.| System | Supported Logic | Tactic Language | Automation Levels | Library Size/Name | Language for Definitions | Extraction Capabilities | Theorem Expressiveness | Proof Reuse via Modules |
|---|---|---|---|---|---|---|---|---|
| Coq | Calculus of Inductive Constructions (intuitionistic) | Ltac (procedural) | High (e.g., auto, omega for arithmetic) | Large (standard library + contrib, thousands of theorems) | Gallina (functional) | Strong to OCaml, Haskell, Scheme | High via dependent types for precise specifications | Modules and functors for parameterized proofs |
| Isabelle/HOL | Classical higher-order logic | Isar (declarative), ML-based | Very high (Sledgehammer integrates external ATPs)[47] | Extensive (Archive of Formal Proofs, 934 entries as of November 2025)[48] | HOL (functional) | Code generation to Scala, Haskell, SML[47] | Supports classical axioms like Peirce's law natively | Locales and theories for modular extensions |
| Lean | Dependent type theory (CIC with quotients, choice) | Native tactics (e.g., simp, rw) | High (simp, linarith, ring; typed tactic framework) | Very large (mathlib: ~120,000 definitions, ~240,000 theorems as of November 2025)[49] | Lean (functional with dependent types) | To C, JavaScript; focuses on verified executables | Universe polymorphism for scalable hierarchies | Type classes and sections for reusable abstractions |
| HOL4 | Classical higher-order logic | ML-based tactics (e.g., REWRITE_TAC) | Moderate (HOLyHammer for premise selection, Metis) | Moderate (standard library for analysis, sets; ~7,000 theorems) | ML (functional) | To SML, OCaml via embeddings | Balanced for automation in classical settings | Theories and lemmas for incremental building |
