Hubbry Logo
Canonical formCanonical formMain
Open search
Canonical form
Community hub
Canonical form
logo
8 pages, 0 posts
0 subscribers
Be the first to start a discussion here.
Be the first to start a discussion here.
Contribute something
Canonical form
Canonical form
from Wikipedia
Algorithmic anagram test using multisets as canonical forms: The strings "madam curie" and "radium came" are given as C arrays. Each one is converted into a canonical form by sorting. Since both sorted strings literally agree, the original strings were anagrams of each other.

In mathematics and computer science, a canonical, normal, or standard form of a mathematical object is a standard way of presenting that object as a mathematical expression. Often, it is one which provides the simplest representation of an object and allows it to be identified in a unique way. The distinction between "canonical" and "normal" forms varies from subfield to subfield. In most fields, a canonical form specifies a unique representation for every object, while a normal form simply specifies its form, without the requirement of uniqueness.[1]

The canonical form of a positive integer in decimal representation is a finite sequence of digits that does not begin with zero. More generally, for a class of objects on which an equivalence relation is defined, a canonical form consists in the choice of a specific object in each class. For example:

In computer science, and more specifically in computer algebra, when representing mathematical objects in a computer, there are usually many different ways to represent the same object. In this context, a canonical form is a representation such that every object has a unique representation (with canonicalization being the process through which a representation is put into its canonical form).[2] Thus, the equality of two objects can easily be tested by testing the equality of their canonical forms.

Despite this advantage, canonical forms frequently depend on arbitrary choices (like ordering the variables), which introduce difficulties for testing the equality of two objects resulting on independent computations. Therefore, in computer algebra, normal form is a weaker notion: A normal form is a representation such that zero is uniquely represented. This allows testing for equality by putting the difference of two objects in normal form.

Canonical form can also mean a differential form that is defined in a natural (canonical) way.

Definition

[edit]

Given a set S of objects with an equivalence relation R on S, a canonical form is given by designating some objects of S to be "in canonical form", such that every object under consideration is equivalent to exactly one object in canonical form. In other words, the canonical forms in S represent the equivalence classes, once and only once. To test whether two objects are equivalent, it then suffices to test equality on their canonical forms. A canonical form thus provides a classification theorem and more, in that it not only classifies every class, but also gives a distinguished (canonical) representative for each object in the class.

Formally, a canonicalization with respect to an equivalence relation R on a set S is a mapping c:SS such that for all s, s1, s2S:

  1. c(s) = c(c(s))   (idempotence),
  2. s1 R s2 if and only if c(s1) = c(s2)   (decisiveness), and
  3. s R c(s)   (representativeness).

Property 3 is redundant; it follows by applying 2 to 1.

In practical terms, it is often advantageous to be able to recognize the canonical forms. There is also a practical, algorithmic question to consider: how to pass from a given object s in S to its canonical form s*? Canonical forms are generally used to make operating with equivalence classes more effective. For example, in modular arithmetic, the canonical form for a residue class is usually taken as the least non-negative integer in it. Operations on classes are carried out by combining these representatives, and then reducing the result to its least non-negative residue. The uniqueness requirement is sometimes relaxed, allowing the forms to be unique up to some finer equivalence relation, such as allowing for reordering of terms (if there is no natural ordering on terms).

A canonical form may simply be a convention, or a deep theorem. For example, polynomials are conventionally written with the terms in descending powers: it is more usual to write x2 + x + 30 than x + 30 + x2, although the two forms define the same polynomial. By contrast, the existence of Jordan canonical form for a matrix is a deep theorem.

History

[edit]

According to OED and LSJ, the term canonical stems from the Ancient Greek word kanonikós (κανονικός, "regular, according to rule") from kanṓn (κᾰνών, "rod, rule"). The sense of norm, standard, or archetype has been used in many disciplines. Mathematical usage is attested in a 1738 letter from Logan.[3] The German term kanonische Form is attested in a 1846 paper by Eisenstein,[4] later the same year Richelot uses the term Normalform in a paper,[5] and in 1851 Sylvester writes:[6]

"I now proceed to [...] the mode of reducing Algebraical Functions to their simplest and most symmetrical, or as my admirable friend M. Hermite well proposes to call them, their Canonical forms."

In the same period, usage is attested by Hesse ("Normalform"),[7] Hermite ("forme canonique"),[8] Borchardt ("forme canonique"),[9] and Cayley ("canonical form").[10]

In 1865, the Dictionary of Science, Literature and Art defines canonical form as:

"In Mathematics, denotes a form, usually the simplest or most symmetrical, to which, without loss of generality, all functions of the same class can be reduced."

Examples

[edit]

Note: in this section, "up to" some equivalence relation E means that the canonical form is not unique in general, but that if one object has two different canonical forms, they are E-equivalent.

Large number notation

[edit]

Standard form is used by many mathematicians and scientists to write extremely large numbers in a more concise and understandable way, the most prominent of which being the scientific notation.[11]

Number theory

[edit]

Linear algebra

[edit]
Objects A is equivalent to B if: Normal form Notes
Normal matrices over the complex numbers for some unitary matrix U Diagonal matrices (up to reordering) This is the Spectral theorem
Matrices over the complex numbers for some unitary matrices U and V Diagonal matrices with real non-negative entries (in descending order) Singular value decomposition
Matrices over an algebraically closed field for some invertible matrix P Jordan normal form (up to reordering of blocks)
Matrices over an algebraically closed field for some invertible matrix P Weyr canonical form (up to reordering of blocks)
Matrices over a field for some invertible matrix P Frobenius normal form
Matrices over a principal ideal domain for some invertible matrices P and Q Smith normal form The equivalence is the same as allowing invertible elementary row and column transformations
Matrices over the integers for some unimodular matrix U Hermite normal form
Matrices over the integers modulo n Howell normal form
Finite-dimensional vector spaces over a field K A and B are isomorphic as vector spaces , n a non-negative integer

Algebra

[edit]
Objects A is equivalent to B if: Normal form
Finitely generated R-modules with R a principal ideal domain A and B are isomorphic as R-modules Primary decomposition (up to reordering) or invariant factor decomposition

Geometry

[edit]

In analytic geometry:

  • The equation of a line: Ax + By = C, with A2 + B2 = 1 and C ≥ 0
  • The equation of a circle:

By contrast, there are alternative forms for writing equations. For example, the equation of a line may be written as a linear equation in point-slope and slope-intercept form.

Convex polyhedra can be put into canonical form such that:

  • All faces are flat,
  • All edges are tangent to the unit sphere, and
  • The centroid of the polyhedron is at the origin.[12]

Integrable systems

[edit]

Every differentiable manifold has a cotangent bundle. That bundle can always be endowed with a certain differential form, called the canonical one-form. This form gives the cotangent bundle the structure of a symplectic manifold, and allows vector fields on the manifold to be integrated by means of the Euler-Lagrange equations, or by means of Hamiltonian mechanics. Such systems of integrable differential equations are called integrable systems.

Dynamical systems

[edit]

The study of dynamical systems overlaps with that of integrable systems; there one has the idea of a normal form (dynamical systems).

Three dimensional geometry

[edit]

In the study of manifolds in three dimensions, one has the first fundamental form, the second fundamental form and the third fundamental form.

Functional analysis

[edit]
Objects A is equivalent to B if: Normal form
Hilbert spaces If A and B are both Hilbert spaces of infinite dimension, then A and B are isometrically isomorphic. sequence spaces (up to exchanging the index set I with another index set of the same cardinality)
Commutative C*-algebras with unit A and B are isomorphic as C*-algebras The algebra of continuous functions on a compact Hausdorff space, up to homeomorphism of the base space.

Classical logic

[edit]

Set theory

[edit]

Game theory

[edit]

Proof theory

[edit]

Rewriting systems

[edit]

The symbolic manipulation of a formula from one form to another is called a "rewriting" of that formula. One can study the abstract properties of rewriting generic formulas, by studying the collection of rules by which formulas can be validly manipulated. These are the "rewriting rules"—an integral part of an abstract rewriting system. A common question is whether it is possible to bring some generic expression to a single, common form, the normal form. If different sequences of rewrites still result in the same form, then that form can be termed a normal form, with the rewrite being called a confluent. It is not always possible to obtain a normal form.

Lambda calculus

[edit]
  • A lambda term is in beta normal form if no beta reduction is possible; lambda calculus is a particular case of an abstract rewriting system. In the untyped lambda calculus, for example, the term does not have a normal form. In the typed lambda calculus, every well-formed term can be rewritten to its normal form.

Graph theory

[edit]

In graph theory, a branch of mathematics, graph canonization is the problem of finding a canonical form of a given graph G. A canonical form is a labeled graph Canon(G) that is isomorphic to G, such that every graph that is isomorphic to G has the same canonical form as G. Thus, from a solution to the graph canonization problem, one could also solve the problem of graph isomorphism: to test whether two graphs G and H are isomorphic, compute their canonical forms Canon(G) and Canon(H), and test whether these two canonical forms are identical.

Computing

[edit]

In computing, the reduction of data to any kind of canonical form is commonly called data normalization.

For instance, database normalization is the process of organizing the fields and tables of a relational database to minimize redundancy and dependency.[13]

In the field of software security, a common vulnerability is unchecked malicious input (see Code injection). The mitigation for this problem is proper input validation. Before input validation is performed, the input is usually normalized by eliminating encoding (e.g., HTML encoding) and reducing the input data to a single common character set.

Other forms of data, typically associated with signal processing (including audio and imaging) or machine learning, can be normalized in order to provide a limited range of values.

In content management, the concept of a single source of truth (SSOT) is applicable, just as it is in database normalization generally and in software development. Competent content management systems provide logical ways of obtaining it, such as transclusion.

See also

[edit]

Notes

[edit]

References

[edit]
Revisions and contributorsEdit on WikipediaRead on Wikipedia
from Grokipedia
In mathematics, a canonical form, also known as a normal form, is a standardized representation of a mathematical object—most commonly a matrix associated with a linear transformation on a finite-dimensional vector space—chosen from among equivalent representations to ensure uniqueness up to specified equivalences, such as permutation of blocks. This standardization facilitates the classification, comparison, and analysis of the object's intrinsic properties, such as eigenvalues, minimal polynomials, and invariant factors, by reducing complex structures to simpler, revealing formats. Canonical forms are particularly prominent in linear algebra, where they arise from similarity transformations or other equivalences that preserve essential characteristics like the characteristic polynomial. The most notable canonical forms include the Jordan canonical form and the rational canonical form, each suited to different field characteristics and providing distinct insights into linear operators. The Jordan canonical form decomposes a matrix into a block diagonal structure consisting of Jordan blocks—upper triangular matrices with a single eigenvalue on the diagonal and ones on the superdiagonal—unique up to the order of blocks, and it is defined over algebraically closed fields like the complex numbers. This form highlights the geometric multiplicity of eigenvalues and the sizes of generalized eigenspaces, aiding in the study of solvability for systems of differential equations and stability analysis. In contrast, the rational canonical form expresses the matrix as a of companion matrices corresponding to the elementary divisors of the minimal polynomial, applicable over any field, and it emphasizes the cyclic decomposition of the into invariant subspaces. Both forms are unique in their block structures (up to ordering), ensuring that isomorphic linear transformations yield identical canonical representations. Beyond linear algebra, canonical forms appear in other mathematical domains, such as integer matrices via the , which reduces a matrix to a diagonal form using elementary row and column operations over the integers, revealing invariant factors for modules over principal ideal domains. In and logic, canonical forms like the (sum of products) or (product of sums) provide complete, minimal expressions for Boolean functions based on their truth tables. These representations underscore a unifying : selecting a "natural" or simplest form to canonicalize objects, promoting consistency in proofs, computations, and theoretical developments across algebra, geometry, and beyond.

Fundamentals

Definition

In , a canonical form is a standard representation of a mathematical or logical object that serves as a unique representative for its under a specified . It functions as a mapping from the set of objects to a where each class is represented exactly once, enabling efficient identification, comparison, and manipulation of equivalent structures. This uniqueness is typically preserved up to or other defined equivalences, making canonical forms essential for algorithmic and theoretical purposes across various mathematical domains. Canonical forms differ from normal forms in that the latter provide a standardized structure without necessarily enforcing uniqueness—multiple non-equivalent normal forms may exist for the same object—while canonical forms mandate a single, distinguished representative per class to resolve ambiguities. The general process for computing a canonical form involves a algorithm that systematically selects a representative from each , often through reduction steps like normalization followed by disambiguation rules (e.g., lexicographic ordering). For instance, consider polynomials over a field where equivalence is defined modulo : equivalent polynomials are mapped to their monic form, with the leading normalized to 1, providing a unique representative for each class.

Historical Development

The term "" derives from the Greek word kanōn, meaning "rule" or "standard," evolving through canonicus to denote something conforming to an established rule or norm; it entered English in the late primarily through contexts referring to texts accepted as authoritative by the church. In , the "canonical form" first appeared in in J. J. Sylvester's "Sketch of a on Elimination, Transformation, and Canonical Forms," marking the transition of the term from general standardization to specific mathematical representation, emphasizing a unique or preferred expression for objects under equivalence relations. By the mid-19th century, the concept gained formal traction in and , with applications to forms and invariants. In the 1870s, advanced canonical forms within , critiquing and refining approaches to normal forms for linear substitutions while emphasizing arithmetic generality over algebraic abstraction in his exchanges with Camille . Concurrently, introduced the canonical form for matrices in his 1870 treatise Traité des substitutions et des équations algébriques, providing a decomposition into Jordan blocks that revealed the structure of linear transformations over algebraically closed fields. Early in the 20th century, extended the notion to in his 1910 collaboration with Young, using canonical definitions for subspaces and transformations to axiomatize projective spaces rigorously. The mid-20th century saw the concept's expansion into logic and computing, as in the 1920s promoted formal systems with canonical representations of proofs and sentences to secure mathematics' foundations against paradoxes. In the 1930s, Alan Turing's seminal work on formalized "standard descriptions" of mechanical processes, akin to forms, enabling the precise modeling of algorithmic via Turing machines. From the 1960s onward, forms became integral to computer algebra systems, such as early implementations like FORMAC, where they facilitated symbolic simplification and equivalence checking in and expression manipulation. In theoretical computer science and category theory, the term evolved further in the 1970s with the development of canonical models in categorical logic, providing universal constructions for interpreting logical systems and modalities within functorial frameworks. This period highlighted a shift toward abstract, structure-preserving representations, influencing modern applications in type theory and semantics.

Canonical Forms in Discrete Mathematics

Large Number Notation

Canonical large number notation refers to standardized, recursive systems designed to represent extremely large integers in a compact and unambiguous manner, particularly by avoiding ambiguities in the evaluation of towers and higher hyperoperations. These notations ensure uniqueness through right-associative parsing rules and recursive definitions, allowing precise expression of numbers that exceed practical . Knuth's up-arrow notation, introduced by , serves as a canonical form for and higher hyperoperations, extending beyond standard to denote iterated operations systematically. A single up-arrow (↑) represents , such that ab=aba \uparrow b = a^b; double up-arrows (↑↑) denote , where aba \uparrow\uparrow b is a power tower of bb copies of aa; and additional arrows indicate further iterations of the previous operation. This notation is defined recursively: anb=an1(an(b1))a \uparrow^n b = a \uparrow^{n-1} (a \uparrow^n (b-1)) with base cases an1=aa \uparrow^n 1 = a and a0b=a×ba \uparrow^0 b = a \times b, ensuring right-to-left evaluation to eliminate ambiguity in tower expressions. For instance, 242 \uparrow\uparrow 4 evaluates as 2(2(22))=2(24)=216=65,5362^{ (2^{ (2^2) }) } = 2^{ (2^4) } = 2^{16} = 65{,}536, where the power tower is parsed from the top down to maintain uniqueness. This example illustrates how the notation compactly captures iterated exponentiation without requiring explicit stacking of exponents. The Conway-Wechsler system provides another canonical approach for naming large numbers, extending the traditional -illion suffix (e.g., million for 10610^6) to arbitrarily large powers of 10 through a systematic combination of Latin roots, ensuring unique verbal representations for numbers up to 103×10610^{3 \times 10^{6}} and beyond via recursive rules. Developed by John Horton Conway and Allan Wechsler, the system constructs names like "centillion" for 1060010^{600} by concatenating roots for the exponent's digits, with adjustments for consistency in English naming conventions, thus avoiding ad hoc inventions for immense quantities. In , up-arrow notation finds application in expressing bounds for rapidly growing functions, such as values of the , which enumerates primitive recursive functions and yields numbers like A(4,1)=243=65,5363=65,533A(4,1) = 2 \uparrow\uparrow 4 - 3 = 65{,}536 - 3 = 65{,}533, used to model hyperoperations in counting problems. Similarly, it defines , an upper bound in for the minimum dimensions avoiding certain monochromatic subgraphs in colorings, constructed as g64g_{64} where g1=33g_1 = 3 \uparrow\uparrow\uparrow\uparrow 3 and each subsequent gkg_k iterates the arrow count, highlighting the notation's role in quantifying .

Number Theory

In , the canonical form of a positive greater than 1 is its , expressed as a unique product of prime powers n=p1e1p2e2pkekn = p_1^{e_1} p_2^{e_2} \cdots p_k^{e_k}, where the primes pip_i are distinct and the exponents ei1e_i \geq 1. This uniqueness, up to the order of factors, follows from the , which guarantees that every such has exactly one such decomposition. The theorem underpins much of arithmetic, ensuring that provides a standard, invariant representation for computational and theoretical purposes. Binary quadratic forms, expressed as ax2+bxy+cy2ax^2 + bxy + cy^2 with integer coefficients a,b,ca, b, c and discriminant d=b24acd = b^2 - 4ac, admit a canonical reduced form through Gauss's reduction algorithm. For positive definite forms (where d<0d < 0 and a>0a > 0), the algorithm transforms the form via integer linear substitutions to achieve the reduced conditions: bac|b| \leq a \leq c, and b0b \geq 0 if b=a|b| = a or a=ca = c. These bounds ensure uniqueness within each equivalence class under SL2(Z)\mathrm{SL}_2(\mathbb{Z})-action, with the discriminant dd preserved and satisfying d3|d| \geq 3 for primitive forms. Gauss's method, detailed in Section V of his Disquisitiones Arithmeticae, counts the number of such reduced forms to determine class numbers. In quadratic fields K=Q(d)K = \mathbb{Q}(\sqrt{d})
Add your contribution
Related Hubs
Contribute something
User Avatar
No comments yet.