Recent from talks
Nothing was collected or created yet.
Hill cipher
View on WikipediaThis article includes a list of general references, but it lacks sufficient corresponding inline citations. (February 2012) |

In classical cryptography, the Hill cipher is a polygraphic substitution cipher based on linear algebra. Invented by Lester S. Hill in 1929, it was the first polygraphic cipher in which it was practical (though barely) to operate on more than three symbols at once.
The following discussion assumes an elementary knowledge of matrices.
Encryption
[edit]Each letter is represented by a number modulo 26. Though this is not an essential feature of the cipher, this simple scheme is often used:
| Letter | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Number | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 |
To encrypt a message, each block of n letters (considered as an n-component vector) is multiplied by an invertible n × n matrix, against modulus 26. To decrypt the message, each block is multiplied by the inverse of the matrix used for encryption.
The matrix used for encryption is the cipher key, and it should be chosen randomly from the set of invertible n × n matrices (modulo 26). The cipher can, of course, be adapted to an alphabet with any number of letters; all arithmetic just needs to be done modulo the number of letters instead of modulo 26.
Consider the message 'ACT', and the key below (or GYB/NQK/URP in letters):
Since 'A' is 0, 'C' is 2 and 'T' is 19, the message is the vector:
Thus the enciphered vector is given by:
which corresponds to a ciphertext of 'POH'. Now, suppose that our message is instead 'CAT', or:
This time, the enciphered vector is given by:
which corresponds to a ciphertext of 'FIN'. Every letter has changed. The Hill cipher has achieved Shannon's diffusion, and an n-dimensional Hill cipher can diffuse fully across n symbols at once.
Decryption
[edit]In order to decrypt, we turn the ciphertext back into a vector, then simply multiply by the inverse matrix of the key matrix (IFK/VIV/VMI in letters). We find that, modulo 26, the inverse of the matrix used in the previous example is:
Taking the previous example ciphertext of 'POH', we get:
which gets us back to 'ACT', as expected.
One complication exists in picking the encrypting matrix:
- Not all matrices have an inverse. The matrix will have an inverse if and only if its determinant is inversible modulo n, where n is the modular base.
Thus, if we work modulo 26 as above, the determinant must be nonzero, and must not be divisible by 2 or 13. If the determinant is 0, or has common factors with the modular base, then the matrix cannot be used in the Hill cipher, and another matrix must be chosen (otherwise it will not be possible to decrypt). Fortunately, matrices which satisfy the conditions to be used in the Hill cipher are fairly common.
For our example key matrix:
So, modulo 26, the determinant is 25. Since and , 25 has no common factors with 26, and this matrix can be used for the Hill cipher.
The risk of the determinant having common factors with the modulus can be eliminated by making the modulus prime. Consequently, a useful variant of the Hill cipher adds 3 extra symbols (such as a space, a period and a question mark) to increase the modulus to 29.
Example
[edit]Let
be the key and suppose the plaintext message is 'HELP'. Then this plaintext is represented by two pairs
Then we compute
- and
and continue encryption as follows:
The matrix K is invertible, hence exists such that . The inverse of K can be computed by using the formula
This formula still holds after a modular reduction if a modular multiplicative inverse is used to compute . Hence in this case, we compute
Then we compute
- and
Therefore,
- .
Security
[edit]The basic Hill cipher is vulnerable to a known-plaintext attack because it is completely linear. An opponent who intercepts plaintext/ciphertext character pairs can set up a linear system which can (usually) be easily solved; if it happens that this system is indeterminate, it is only necessary to add a few more plaintext/ciphertext pairs. Calculating this solution by standard linear algebra algorithms then takes very little time.
While matrix multiplication alone does not result in a secure cipher it is still a useful step when combined with other non-linear operations, because matrix multiplication can provide diffusion. For example, an appropriately chosen matrix can guarantee that small differences before the matrix multiplication will result in large differences after the matrix multiplication. Indeed, some modern ciphers use a matrix multiplication step to provide diffusion. For example, the MixColumns step in AES is a matrix multiplication. The function g in Twofish is a combination of non-linear S-boxes with a carefully chosen matrix multiplication (MDS).
Key space size
[edit]The key space is the set of all possible keys. The key space size is the number of possible keys. The effective key size, in number of bits, is the binary logarithm of the key space size.
There are matrices of dimension n × n. Thus or about is an upper bound on the key size of the Hill cipher using n × n matrices. This is only an upper bound because not every matrix is invertible and thus usable as a key. The number of invertible matrices can be computed via the Chinese Remainder Theorem. I.e., a matrix is invertible modulo 26 if and only if it is invertible both modulo 2 and modulo 13. The number of invertible n × n matrices modulo 2 is equal to the order of the general linear group GL(n,Z2). It is
Equally, the number of invertible matrices modulo 13 (i.e. the order of GL(n,Z13)) is
The number of invertible matrices modulo 26 is the product of those two numbers. Hence it is
Additionally it seems to be prudent to avoid too many zeroes in the key matrix, since they reduce diffusion. The net effect is that the effective keyspace of a basic Hill cipher is about . For a 5 × 5 Hill cipher, that is about 114 bits. Of course, key search is not the most efficient known attack.
Mechanical implementation
[edit]When operating on 2 symbols at once, a Hill cipher offers no particular advantage over Playfair or the bifid cipher, and in fact is weaker than either, and slightly more laborious to operate by pencil-and-paper. As the dimension increases, the cipher rapidly becomes infeasible for a human to operate by hand.
A Hill cipher of dimension 6 was implemented mechanically. Hill and a partner were awarded a patent (U.S. patent 1,845,947) for this device, which performed a 6 × 6 matrix multiplication modulo 26 using a system of gears and chains.
Unfortunately the gearing arrangements (and thus the key) were fixed for any given machine, so triple encryption was recommended for security: a secret nonlinear step, followed by the wide diffusive step from the machine, followed by a third secret nonlinear step. (The much later Even–Mansour cipher also uses an unkeyed diffusive middle step). Such a combination was actually very powerful for 1929, and indicates that Hill apparently understood the concepts of a meet-in-the-middle attack as well as confusion and diffusion. Unfortunately, his machine did not sell.[citation needed]
See also
[edit]Other practical "pencil-and-paper" polygraphic ciphers include:
References
[edit]- Lester S. Hill, Cryptography in an Algebraic Alphabet, The American Mathematical Monthly Vol.36, June–July 1929, pp. 306–312. (PDF)
- Lester S. Hill, Concerning Certain Linear Transformation Apparatus of Cryptography, The American Mathematical Monthly Vol.38, 1931, pp. 135–154.
- Jeffrey Overbey, William Traves, and Jerzy Wojdylo, On the Keyspace of the Hill Cipher, Cryptologia, Vol.29, No.1, January 2005, pp59–72. (CiteSeerX) (PDF)
External links
[edit]- "Hill Cipher Web App" implements the Hill cipher and shows the matrices involved
- "Hill Cipher Explained" illustrates the linear algebra behind the Hill Cipher
- "Hill's Cipher Calculator" outlines the Hill Cipher with a Web page
Hill cipher
View on GrokipediaIntroduction
History and Development
The Hill cipher was invented by American mathematician Lester S. Hill and first described in his 1929 paper "Cryptography in an Algebraic Alphabet," published in The American Mathematical Monthly. In this work, Hill introduced a polygraphic substitution method based on linear transformations, marking an early application of matrix algebra to cryptographic systems.[5] Lester Sanders Hill (1890–1961), who earned his B.A. from Columbia University in 1911 and his Ph.D. from Yale University in 1926, pursued a career in applied mathematics with a focus on communications and cryptography during the interwar period.[6] His motivation stemmed from a desire to leverage algebraic structures for secure message encoding, building on his broader research into mathematical models for error detection and transmission systems.[7] Hill later patented a mechanical device in 1932 (U.S. Patent No. 1,845,947) to implement a 6×6 version of his cipher using gears and wheels, aiming to make the computationally intensive process practical without electronic aids.[8] Despite these innovations, the Hill cipher saw limited practical adoption in the 1930s and during World War II, overshadowed by more operational mechanical systems like the Enigma machine, which better suited wartime needs for speed and simplicity.[9] Its reliance on manual matrix operations rendered it tedious and error-prone without computational support, confining it largely to theoretical exploration.[10]Overview and Basic Principles
The Hill cipher, invented by Lester S. Hill in 1929, is a polygraphic substitution cipher that applies linear algebra to encrypt messages in blocks, treating the process as a transformation over a finite alphabet. As a symmetric-key algorithm, it uses the same key for both encryption and decryption, relying on modular arithmetic modulo 26 to map the 26 letters of the English alphabet to integers from 0 to 25. This modular framework ensures that all operations remain within the alphabet's bounds, forming a ring structure that supports the necessary algebraic manipulations.[11] At its core, the cipher divides the plaintext into fixed-size blocks of length n, where each block is represented as an n-dimensional vector over the integers modulo 26—for instance, n=2 corresponds to digraphs (pairs of letters). Encryption proceeds by multiplying each plaintext vector by an invertible n × n key matrix, also with entries modulo 26, to yield a corresponding ciphertext vector; the resulting components are then converted back to letters. This matrix multiplication embodies the cipher's polygraphic nature, substituting multiple letters simultaneously rather than individually, which enhances diffusion across the message.[11] The symmetry of the algorithm stems from the invertibility of the key matrix: decryption reverses the process by multiplying the ciphertext vectors by the modular inverse of the key matrix, recovering the original plaintext provided the matrix's determinant is coprime to 26. This reliance on matrix inverses highlights the cipher's foundation in linear transformations within the vector space , though the method assumes only basic familiarity with such structures without requiring advanced derivations. Larger block sizes n increase the key space and security potential, but the core principles remain consistent across implementations.[11]Mathematical Foundations
Linear Algebra Prerequisites
In the context of cryptographic systems like the Hill cipher, linear algebra operations are performed over the ring of integers modulo , denoted , where is typically the size of the alphabet, such as 26 for the English alphabet. A vector is a column (or row) of elements from , representing a point in . A matrix is an array of such elements, with square matrices () being particularly relevant for transformations. These structures generalize real linear algebra but restrict entries and operations to modular arithmetic to ensure finite, discrete computations.[12] Vector addition and scalar multiplication in are defined componentwise: for vectors and , the sum is ; for a scalar , the product is . These operations preserve the modular structure, ensuring results remain in . For example, with , adding and yields , and multiplying by scalar 4 gives . Such operations form the basis for combining modular elements efficiently.[13][12] Matrix multiplication extends these via row-by-column dot products with modular reduction. For an matrix and matrix , the product has entry . Each dot product is computed over integers before reducing modulo . Consider a simple example with : The (1,1) entry is ; similarly, (1,2) is , (2,1) is , and (2,2) is . Thus, This row-by-column process ensures associativity and distributivity hold modulo .[13][12] A square matrix is invertible if there exists such that , the identity matrix; this requires the determinant to be coprime to (i.e., ), ensuring has a modular inverse. Individual elements have inverses if , found via the extended Euclidean algorithm, as solutions to exist uniquely modulo under this condition. For , a prime power product, invertibility is complicated because is not a field—not every nonzero element (or determinant) is invertible, as units are only those coprime to 26 (e.g., 1, 3, 5, ..., 25 excluding multiples of 2 or 13). Thus, matrices with sharing factors with 26 (like 2 or 13) lack inverses, unlike in prime modulus cases where nonzero determinants suffice. Focus on coprime determinants ensures reliable reversibility in applications.[12][13]Key and Message Representation
In the Hill cipher, plaintext messages are encoded using the standard English alphabet, where each letter is mapped to an integer from 0 to 25, with A corresponding to 0, B to 1, and so on up to Z at 25.[3][14] Spaces and punctuation are typically omitted from the plaintext or replaced with padding characters to ensure the message consists solely of letters, facilitating numerical conversion.[15] This mapping allows the cipher to operate within the finite field of integers modulo 26, aligning with the 26-letter alphabet.[3] The encoded plaintext is then divided into fixed-length blocks to form column vectors suitable for matrix multiplication. For a key matrix of size , the message is segmented into vectors of length , such as pairs of letters for (e.g., "AB" becomes the vector ).[14][15] If the message length is not a multiple of , it is padded with additional letters, often 'X' (encoded as 23), to complete the final block; this padding is later removed during decryption if recognizable.[3][15] The resulting ciphertext vectors are converted back to letters by mapping the modular integers to A-Z (e.g., 0 to A, 1 to B).[14] The key in the Hill cipher is an matrix with integer entries between 0 and 25, selected randomly from the set of such matrices.[3] For the cipher to be reversible, must be invertible modulo 26, which requires that the greatest common divisor of its determinant and 26 is 1 (i.e., ).[3][14] This condition ensures the existence of a modular inverse matrix for decryption.[15] Matrices failing this invertibility criterion are invalid keys, as they would render decryption impossible. For instance, consider the 2×2 matrix ; its determinant is , so , making it singular and unsuitable.[3] To verify a key, compute and check the gcd with 26.[14]Encryption and Decryption
Encryption Algorithm
The encryption algorithm of the Hill cipher transforms blocks of plaintext into ciphertext through matrix multiplication over the integers modulo 26, treating the English alphabet as numerical vectors where A=0, B=1, ..., Z=25. The plaintext message is first divided into consecutive blocks of length , where is the dimension of the square key matrix , with each block represented as a column vector .[16] If the message length is not a multiple of , it is padded with pre-arranged null letters, such as "X", to form complete blocks. For each plaintext block vector , the corresponding ciphertext block is computed as . Component-wise, this is given by for , where are the entries of the key matrix . Each resulting is then mapped back to a letter in the alphabet (e.g., 0=A, 1=B, ..., 25=Z) to form the ciphertext block.[3] The full ciphertext is obtained by concatenating the encrypted blocks in order.[16] The process can be outlined in the following steps:- Convert the plaintext string to a sequence of numerical values modulo 26.
- Pad the sequence if necessary to ensure its length is a multiple of .
- Partition the numerical sequence into -dimensional column vectors .
- For each to , compute .
- Convert each back to letters and concatenate to form the ciphertext string.[3]
Decryption Algorithm
The decryption process in the Hill cipher reverses the encryption by applying the modular inverse of the key matrix to the ciphertext blocks. For successful decryption, the key matrix must be invertible modulo 26, which requires that the greatest common divisor of its determinant and 26 is 1 (i.e., ).[3] If this condition is not met, the matrix has no inverse modulo 26, and a different key must be selected to ensure invertibility.[3] To decrypt, first compute the inverse matrix such that , where is the identity matrix. One method to find is the adjugate formula: where is the adjugate matrix (the transpose of the cofactor matrix), and is the modular multiplicative inverse of modulo 26, which exists precisely when .[17] Alternatively, Gaussian elimination (or row reduction) can be applied to the augmented matrix modulo 26, using only elementary row operations with multipliers that are invertible modulo 26, to transform it into .[18] Once is obtained, the plaintext blocks are recovered as follows. Divide the ciphertext into vectors of length equal to the dimension of (e.g., column vectors for an matrix), where each entry is the numerical representation of a letter (A=0, B=1, ..., Z=25). For each block , compute yielding the plaintext vector . Convert the resulting numerical values back to letters. For the full message, process all blocks sequentially and concatenate the plaintext letters; if padding was added during encryption to make the message length a multiple of , remove the extraneous characters at the end (typically 'X').[18][17]Examples and Applications
Numerical Example
To illustrate the Hill cipher's matrix operations, consider a simple case with block size over the modulus 26, using abstract numerical vectors rather than letter mappings. The key matrix is The determinant of is . Since , the determinant is coprime to 26, ensuring that is invertible modulo 26 and thus suitable as a key for encryption and decryption. Let the plaintext vector be . Encryption computes the ciphertext vector . Perform the matrix-vector multiplication step by step:- First component: , and .
- Second component: , and (since ).
- Top-left: (since ).
- Top-right: (since ).
- Bottom-left: (since ).
- Bottom-right: (since ).
- First component: (since ).
- Second component: (since ).
Text-Based Example
To illustrate the application of the Hill cipher to textual messages, consider a block size of and an encryption key matrix , where letters are mapped to numbers with A=0, B=1, ..., Z=25, and all operations are performed modulo 26. A simple example uses the plaintext "HI". Convert to numerical vectors: H=7, I=8, forming the column vector . Encrypt by computing : The resulting ciphertext numbers 11 and 13 correspond to L and N, yielding "LN". To decrypt, first find the inverse matrix . The determinant of is , and the modular inverse of 11 modulo 26 is 19 (since ). The adjugate is , so where negative entries are adjusted modulo 26 (-2 ≡ 24, -5 ≡ 21). Now decrypt: : recovering H and I. For a longer message, consider the plaintext "PAYMOREMONEY" (length 12, a multiple of ), with no padding required. Convert to numbers: P=15, A=0, Y=24, M=12, O=14, R=17, E=4, M=12, O=14, N=13, E=4, Y=24. Form blocks as column vectors and encrypt each:- PA: → (T=19, X=23) → "TX"
- YM: → (S=18, W=22) → "SW"
- OR: → (Y=24, H=7) → "YH"
- EM: → (K=10, A=0) → "KA"
- ON: → (Q=16, F=5) → "QF"
- EY: → (I=8, G=6) → "IG"