Recent from talks
Nothing was collected or created yet.
SHA-1
View on Wikipedia| Secure Hash Algorithms | |
|---|---|
| Concepts | |
| hash functions, SHA, DSA | |
| Main standards | |
| SHA-0, SHA-1, SHA-2, SHA-3 | |
| General | |
|---|---|
| Designers | National Security Agency |
| First published | 1993 (SHA-0), 1995 (SHA-1) |
| Series | (SHA-0), SHA-1, SHA-2, SHA-3 |
| Certification | FIPS PUB 180-4, CRYPTREC (Monitored) |
| Cipher detail | |
| Digest sizes | 160 bits |
| Block sizes | 512 bits |
| Structure | Merkle–Damgård construction |
| Rounds | 80 |
| Best public cryptanalysis | |
| A 2011 attack by Marc Stevens can produce hash collisions with a complexity between 260.3 and 265.3 operations.[1] The first public collision was published on 23 February 2017.[2] SHA-1 is prone to length extension attacks. | |
In cryptography, SHA-1 (Secure Hash Algorithm 1) is a hash function which takes an input and produces a 160-bit (20-byte) hash value known as a message digest – typically rendered as 40 hexadecimal digits. It was designed by the United States National Security Agency, and is a U.S. Federal Information Processing Standard.[3] The algorithm has been cryptographically broken[4][5][6][7][8][9][10] but is still widely used.
Since 2005, SHA-1 has not been considered secure against well-funded opponents;[11] as of 2010 many organizations have recommended its replacement.[12][10][13] NIST formally deprecated use of SHA-1 in 2011 and disallowed its use for digital signatures in 2013, and declared that it should be phased out by 2030.[14] As of 2020[update], chosen-prefix attacks against SHA-1 are practical.[6][8] As such, it is recommended to remove SHA-1 from products as soon as possible and instead use SHA-2 or SHA-3. Replacing SHA-1 is urgent where it is used for digital signatures.
All major web browser vendors ceased acceptance of SHA-1 SSL certificates in 2017.[15][9][4] In February 2017, CWI Amsterdam and Google announced they had performed a collision attack against SHA-1, publishing two dissimilar PDF files which produced the same SHA-1 hash.[16][2] However, SHA-1 is still secure for HMAC.[17]
Microsoft has discontinued SHA-1 code signing support for Windows Update on August 3, 2020,[18] which also effectively ended the update servers for versions of Windows that have not been updated to SHA-2, such as Windows 2000 up to Vista, as well as Windows Server versions from Windows 2000 Server to Server 2003.
Development
[edit]
- A, B, C, D and E are 32-bit words of the state;
- F is a nonlinear function that varies;
- denotes a left bit rotation by n places;
- n varies for each operation;
- Wt is the expanded message word of round t;
- Kt is the round constant of round t;
denotes addition modulo 232.
SHA-1 produces a message digest based on principles similar to those used by Ronald L. Rivest of MIT in the design of the MD2, MD4 and MD5 message digest algorithms, but generates a larger hash value (160 bits vs. 128 bits).
SHA-1 was developed as part of the U.S. Government's Capstone project.[19] The original specification of the algorithm was published in 1993 under the title Secure Hash Standard, FIPS PUB 180, by U.S. government standards agency NIST (National Institute of Standards and Technology).[20][21] This version is now often named SHA-0. It was withdrawn by the NSA shortly after publication and was superseded by the revised version, published in 1995 in FIPS PUB 180-1 and commonly designated SHA-1. SHA-1 differs from SHA-0 only by a single bitwise rotation in the message schedule of its compression function. According to the NSA, this was done to correct a flaw in the original algorithm which reduced its cryptographic security, but they did not provide any further explanation.[22][23] Publicly available techniques did indeed demonstrate a compromise of SHA-0, in 2004, before SHA-1 in 2017 (see §Attacks).
Applications
[edit]Cryptography
[edit]SHA-1 forms part of several widely used security applications and protocols, including TLS and SSL, PGP, SSH, S/MIME, and IPsec. Those applications can also use MD5; both MD5 and SHA-1 are descended from MD4.
SHA-1 and SHA-2 are the hash algorithms required by law for use in certain U.S. government applications, including use within other cryptographic algorithms and protocols, for the protection of sensitive unclassified information. FIPS PUB 180-1 also encouraged adoption and use of SHA-1 by private and commercial organizations. SHA-1 is being retired from most government uses; the U.S. National Institute of Standards and Technology said, "Federal agencies should stop using SHA-1 for...applications that require collision resistance as soon as practical, and must use the SHA-2 family of hash functions for these applications after 2010",[24] though that was later relaxed to allow SHA-1 to be used for verifying old digital signatures and time stamps.[24]
A prime motivation for the publication of the Secure Hash Algorithm was the Digital Signature Standard, in which it is incorporated.
The SHA hash functions have been used for the basis of the SHACAL block ciphers.
Data integrity
[edit]Revision control systems such as Git, Mercurial, and Monotone use SHA-1, not for security, but to identify revisions and to ensure that the data has not changed due to accidental corruption. Linus Torvalds said about Git in 2007:
- If you have disk corruption, if you have DRAM corruption, if you have any kind of problems at all, Git will notice them. It's not a question of if, it's a guarantee. You can have people who try to be malicious. They won't succeed. [...] Nobody has been able to break SHA-1, but the point is the SHA-1, as far as Git is concerned, isn't even a security feature. It's purely a consistency check. The security parts are elsewhere, so a lot of people assume that since Git uses SHA-1 and SHA-1 is used for cryptographically secure stuff, they think that, Okay, it's a huge security feature. It has nothing at all to do with security, it's just the best hash you can get. ...
- I guarantee you, if you put your data in Git, you can trust the fact that five years later, after it was converted from your hard disk to DVD to whatever new technology and you copied it along, five years later you can verify that the data you get back out is the exact same data you put in. [...]
- One of the reasons I care is for the kernel, we had a break in on one of the BitKeeper sites where people tried to corrupt the kernel source code repositories.[25]
However Git does not require the second preimage resistance of SHA-1 as a security feature, since it will always prefer to keep the earliest version of an object in case of collision, preventing an attacker from surreptitiously overwriting files.[26] The known attacks (as of 2020) also do not break second preimage resistance.[27]
Cryptanalysis and validation
[edit]For a hash function for which L is the number of bits in the message digest, finding a message that corresponds to a given message digest can always be done using a brute force search in approximately 2L evaluations. This is called a preimage attack and may or may not be practical depending on L and the particular computing environment. However, a collision, consisting of finding two different messages that produce the same message digest, requires on average only about 1.2 × 2L/2 evaluations using a birthday attack. Thus the strength of a hash function is usually compared to a symmetric cipher of half the message digest length. SHA-1, which has a 160-bit message digest, was originally thought to have 80-bit strength.
Some of the applications that use cryptographic hashes, like password storage, are only minimally affected by a collision attack. Constructing a password that works for a given account requires a preimage attack, as well as access to the hash of the original password, which may or may not be trivial. Reversing password encryption (e.g. to obtain a password to try against a user's account elsewhere) is not made possible by the attacks. However, even a secure password hash can't prevent brute-force attacks on weak passwords. See Password cracking.
In the case of document signing, an attacker could not simply fake a signature from an existing document: The attacker would have to produce a pair of documents, one innocuous and one damaging, and get the private key holder to sign the innocuous document. There are practical circumstances in which this is possible; until the end of 2008, it was possible to create forged SSL certificates using an MD5 collision.[28]
Due to the block and iterative structure of the algorithms and the absence of additional final steps, all SHA functions (except SHA-3)[29] are vulnerable to length-extension and partial-message collision attacks.[30] These attacks allow an attacker to forge a message signed only by a keyed hash – SHA(key || message), but not SHA(message || key) – by extending the message and recalculating the hash without knowing the key. A simple improvement to prevent these attacks is to hash twice: SHAd(message) = SHA(SHA(0b || message)) (the length of 0b, zero block, is equal to the block size of the hash function).
SHA-0
[edit]At CRYPTO 98, two French researchers, Florent Chabaud and Antoine Joux, presented an attack on SHA-0: collisions can be found with complexity 261, fewer than the 280 for an ideal hash function of the same size.[31]
In 2004, Biham and Chen found near-collisions for SHA-0 – two messages that hash to nearly the same value; in this case, 142 out of the 160 bits are equal. They also found full collisions of SHA-0 reduced to 62 out of its 80 rounds.[32]
Subsequently, on 12 August 2004, a collision for the full SHA-0 algorithm was announced by Joux, Carribault, Lemuet, and Jalby. This was done by using a generalization of the Chabaud and Joux attack. Finding the collision had complexity 251 and took about 80,000 processor-hours on a supercomputer with 256 Itanium 2 processors (equivalent to 13 days of full-time use of the computer).
On 17 August 2004, at the Rump Session of CRYPTO 2004, preliminary results were announced by Wang, Feng, Lai, and Yu, about an attack on MD5, SHA-0 and other hash functions. The complexity of their attack on SHA-0 is 240, significantly better than the attack by Joux et al.[33][34]
In February 2005, an attack by Xiaoyun Wang, Yiqun Lisa Yin, and Hongbo Yu was announced which could find collisions in SHA-0 in 239 operations.[5][35]
Another attack in 2008 applying the boomerang attack brought the complexity of finding collisions down to 233.6, which was estimated to take 1 hour on an average PC from the year 2008.[36]
In light of the results for SHA-0, some experts[who?] suggested that plans for the use of SHA-1 in new cryptosystems should be reconsidered. After the CRYPTO 2004 results were published, NIST announced that they planned to phase out the use of SHA-1 by 2010 in favor of the SHA-2 variants.[37]
Attacks
[edit]In early 2005, Vincent Rijmen and Elisabeth Oswald published an attack on a reduced version of SHA-1 – 53 out of 80 rounds – which finds collisions with a computational effort of fewer than 280 operations.[38]
In February 2005, an attack by Xiaoyun Wang, Yiqun Lisa Yin, and Hongbo Yu was announced.[5] The attacks can find collisions in the full version of SHA-1, requiring fewer than 269 operations. (A brute-force search would require 280 operations.)
The authors write: "In particular, our analysis is built upon the original differential attack on SHA-0, the near collision attack on SHA-0, the multiblock collision techniques, as well as the message modification techniques used in the collision search attack on MD5. Breaking SHA-1 would not be possible without these powerful analytical techniques."[39] The authors have presented a collision for 58-round SHA-1, found with 233 hash operations. The paper with the full attack description was published in August 2005 at the CRYPTO conference.
In an interview, Yin states that, "Roughly, we exploit the following two weaknesses: One is that the file preprocessing step is not complicated enough; another is that certain math operations in the first 20 rounds have unexpected security problems."[40]
On 17 August 2005, an improvement on the SHA-1 attack was announced on behalf of Xiaoyun Wang, Andrew Yao and Frances Yao at the CRYPTO 2005 Rump Session, lowering the complexity required for finding a collision in SHA-1 to 263.[7] On 18 December 2007 the details of this result were explained and verified by Martin Cochran.[41]
Christophe De Cannière and Christian Rechberger further improved the attack on SHA-1 in "Finding SHA-1 Characteristics: General Results and Applications,"[42] receiving the Best Paper Award at ASIACRYPT 2006. A two-block collision for 64-round SHA-1 was presented, found using unoptimized methods with 235 compression function evaluations. Since this attack requires the equivalent of about 235 evaluations, it is considered to be a significant theoretical break.[43] Their attack was extended further to 73 rounds (of 80) in 2010 by Grechnikov.[44] In order to find an actual collision in the full 80 rounds of the hash function, however, tremendous amounts of computer time are required. To that end, a collision search for SHA-1 using the volunteer computing platform BOINC began August 8, 2007, organized by the Graz University of Technology. The effort was abandoned May 12, 2009 due to lack of progress.[45]
At the Rump Session of CRYPTO 2006, Christian Rechberger and Christophe De Cannière claimed to have discovered a collision attack on SHA-1 that would allow an attacker to select at least parts of the message.[46][47]
In 2008, an attack methodology by Stéphane Manuel reported hash collisions with an estimated theoretical complexity of 251 to 257 operations.[48] However he later retracted that claim after finding that local collision paths were not actually independent, and finally quoting for the most efficient a collision vector that was already known before this work.[49]
Cameron McDonald, Philip Hawkes and Josef Pieprzyk presented a hash collision attack with claimed complexity 252 at the Rump Session of Eurocrypt 2009.[50] However, the accompanying paper, "Differential Path for SHA-1 with complexity O(252)" has been withdrawn due to the authors' discovery that their estimate was incorrect.[51]
One attack against SHA-1 was Marc Stevens[52] with an estimated cost of $2.77M (2012) to break a single hash value by renting CPU power from cloud servers.[53] Stevens developed this attack in a project called HashClash,[54] implementing a differential path attack. On 8 November 2010, he claimed he had a fully working near-collision attack against full SHA-1 working with an estimated complexity equivalent to 257.5 SHA-1 compressions. He estimated this attack could be extended to a full collision with a complexity around 261.
The SHAppening
[edit]On 8 October 2015, Marc Stevens, Pierre Karpman, and Thomas Peyrin published a freestart collision attack on SHA-1's compression function that requires only 257 SHA-1 evaluations. This does not directly translate into a collision on the full SHA-1 hash function (where an attacker is not able to freely choose the initial internal state), but undermines the security claims for SHA-1. In particular, it was the first time that an attack on full SHA-1 had been demonstrated; all earlier attacks were too expensive for their authors to carry them out. The authors named this significant breakthrough in the cryptanalysis of SHA-1 The SHAppening.[10]
The method was based on their earlier work, as well as the auxiliary paths (or boomerangs) speed-up technique from Joux and Peyrin, and using high performance/cost efficient GPU cards from Nvidia. The collision was found on a 16-node cluster with a total of 64 graphics cards. The authors estimated that a similar collision could be found by buying US$2,000 of GPU time on EC2.[10]
The authors estimated that the cost of renting enough of EC2 CPU/GPU time to generate a full collision for SHA-1 at the time of publication was between US$75K and $120K, and noted that was well within the budget of criminal organizations, not to mention national intelligence agencies. As such, the authors recommended that SHA-1 be deprecated as quickly as possible.[10]
SHAttered – first public collision
[edit]On 23 February 2017, the CWI (Centrum Wiskunde & Informatica) and Google announced the SHAttered attack, in which they generated two different PDF files with the same SHA-1 hash in roughly 263.1 SHA-1 evaluations. This attack is about 100,000 times faster than brute forcing a SHA-1 collision with a birthday attack, which was estimated to take 280 SHA-1 evaluations. The attack required "the equivalent processing power of 6,500 years of single-CPU computations and 110 years of single-GPU computations".[2]
Birthday-Near-Collision Attack – first practical chosen-prefix attack
[edit]On 24 April 2019 a paper by Gaëtan Leurent and Thomas Peyrin presented at Eurocrypt 2019 described an enhancement to the previously best chosen-prefix attack in Merkle–Damgård–like digest functions based on Davies–Meyer block ciphers. With these improvements, this method is capable of finding chosen-prefix collisions in approximately 268 SHA-1 evaluations. This is approximately 1 billion times faster (and now usable for many targeted attacks, thanks to the possibility of choosing a prefix, for example malicious code or faked identities in signed certificates) than the previous attack's 277.1 evaluations (but without chosen prefix, which was impractical for most targeted attacks because the found collisions were almost random)[1] and is fast enough to be practical for resourceful attackers, requiring approximately $100,000 of cloud processing. This method is also capable of finding chosen-prefix collisions in the MD5 function, but at a complexity of 246.3 does not surpass the prior best available method at a theoretical level (239), though potentially at a practical level (≤249).[55] This attack has a memory requirement of 500+ GB.
On 5 January 2020 the authors published an improved attack called "shambles".[8] In this paper they demonstrate a chosen-prefix collision attack with a complexity of 263.4, that at the time of publication would cost US$45K per generated collision.
Official validation
[edit]Implementations of all FIPS-approved security functions can be officially validated through the CMVP program, jointly run by the National Institute of Standards and Technology (NIST) and the Communications Security Establishment (CSE). For informal verification, a package to generate a high number of test vectors is made available for download on the NIST site; the resulting verification, however, does not replace the formal CMVP validation, which is required by law for certain applications.
As of December 2013[update], there are over 2000 validated implementations of SHA-1, with 14 of them capable of handling messages with a length in bits not a multiple of eight (see SHS Validation List Archived 2011-08-23 at the Wayback Machine).
Examples and pseudocode
[edit]Example hashes
[edit]These are examples of SHA-1 message digests in hexadecimal and in Base64 binary to ASCII text encoding.
SHA1("The quick brown fox jumps over the lazy dog")
Even a small change in the message will, with overwhelming probability, result in many bits changing due to the avalanche effect. For example, changing dog to cog produces a hash with different values for 81 of the 160 bits:
SHA1("The quick brown fox jumps over the lazy cog")
The hash of the zero-length string is:
SHA1("")
SHA-1 pseudocode
[edit]Pseudocode for the SHA-1 algorithm follows:
Note 1: All variables are unsigned 32-bit quantities and wrap modulo 232 when calculating, except for ml, the message length, which is a 64-bit quantity, and hh, the message digest, which is a 160-bit quantity. Note 2: All constants in this pseudo code are in big endian. Within each word, the most significant byte is stored in the leftmost byte position Initialize variables: h0 = 0x67452301 h1 = 0xEFCDAB89 h2 = 0x98BADCFE h3 = 0x10325476 h4 = 0xC3D2E1F0 ml = message length in bits (always a multiple of the number of bits in a character). Pre-processing: append the bit '1' to the message e.g. by adding 0x80 if message length is a multiple of 8 bits. append 0 ≤ k < 512 bits '0', such that the resulting message length in bits is congruent to −64 ≡ 448 (mod 512) append ml, the original message length in bits, as a 64-bit big-endian integer. Thus, the total length is a multiple of 512 bits. Process the message in successive 512-bit chunks: break message into 512-bit chunks for each chunk break chunk into sixteen 32-bit big-endian words w[i], 0 ≤ i ≤ 15 Message schedule: extend the sixteen 32-bit words into eighty 32-bit words: for i from 16 to 79 Note 3: SHA-0 differs by not having this leftrotate. w[i] = (w[i-3] xor w[i-8] xor w[i-14] xor w[i-16]) leftrotate 1 Initialize hash value for this chunk: a = h0 b = h1 c = h2 d = h3 e = h4 Main loop:[3][56] for i from 0 to 79 if 0 ≤ i ≤ 19 then f = (b and c) or ((not b) and d) k = 0x5A827999 else if 20 ≤ i ≤ 39 f = b xor c xor d k = 0x6ED9EBA1 else if 40 ≤ i ≤ 59 f = (b and c) or (b and d) or (c and d) k = 0x8F1BBCDC else if 60 ≤ i ≤ 79 f = b xor c xor d k = 0xCA62C1D6 temp = (a leftrotate 5) + f + e + k + w[i] e = d d = c c = b leftrotate 30 b = a a = temp Add this chunk's hash to result so far: h0 = h0 + a h1 = h1 + b h2 = h2 + c h3 = h3 + d h4 = h4 + e Produce the final hash value (big-endian) as a 160-bit number: hh = (h0 leftshift 128) or (h1 leftshift 96) or (h2 leftshift 64) or (h3 leftshift 32) or h4
The number hh is the message digest, which can be written in hexadecimal (base 16).
The chosen constant values used in the algorithm were assumed to be nothing up my sleeve numbers:
- The four round constants
kare 230 times the square roots of 2, 3, 5 and 10. However they were incorrectly rounded to the nearest integer instead of being rounded to the nearest odd integer, with equilibrated proportions of zero and one bits. As well, choosing the square root of 10 (which is not a prime) made it a common factor for the two other chosen square roots of primes 2 and 5, with possibly usable arithmetic properties across successive rounds, reducing the strength of the algorithm against finding collisions on some bits. - The first four starting values for
h0throughh3are the same with the MD5 algorithm, and the fifth (forh4) is similar. However they were not properly verified for being resistant against inversion of the few first rounds to infer possible collisions on some bits, usable by multiblock differential attacks.
Instead of the formulation from the original FIPS PUB 180-1 shown, the following equivalent expressions may be used to compute f in the main loop above:
Bitwise choice between c and d, controlled by b. (0 ≤ i ≤ 19): f = d xor (b and (c xor d)) (alternative 1) (0 ≤ i ≤ 19): f = (b and c) or ((not b) and d) (alternative 2) (0 ≤ i ≤ 19): f = (b and c) xor ((not b) and d) (alternative 3) (0 ≤ i ≤ 19): f = vec_sel(d, c, b) (alternative 4) [premo08] Bitwise majority function. (40 ≤ i ≤ 59): f = (b and c) or (d and (b or c)) (alternative 1) (40 ≤ i ≤ 59): f = (b and c) or (d and (b xor c)) (alternative 2) (40 ≤ i ≤ 59): f = (b and c) xor (d and (b xor c)) (alternative 3) (40 ≤ i ≤ 59): f = (b and c) xor (b and d) xor (c and d) (alternative 4) (40 ≤ i ≤ 59): f = vec_sel(c, b, c xor d) (alternative 5)
It was also shown[57] that for the rounds 32–79 the computation of:
w[i] = (w[i-3] xor w[i-8] xor w[i-14] xor w[i-16]) leftrotate 1
can be replaced with:
w[i] = (w[i-6] xor w[i-16] xor w[i-28] xor w[i-32]) leftrotate 2
This transformation keeps all operands 64-bit aligned and, by removing the dependency of w[i] on w[i-3], allows efficient SIMD implementation with a vector length of 4 like x86 SSE instructions.
Comparison of SHA functions
[edit]In the table below, internal state means the "internal hash sum" after each compression of a data block.
| Algorithm and variant | Output size (bits) |
Internal state size (bits) |
Block size (bits) |
Rounds | Operations | Security (bits) |
Performance on Skylake (median cpb)[58] | First published | ||
|---|---|---|---|---|---|---|---|---|---|---|
| Long messages | 8 bytes | |||||||||
| MD5 (as reference) | 128 | 128 (4 × 32) |
512 | 4 (16 operations in each round) |
And, Xor, Or, Rot, Add (mod 232) | ≤ 18 (collisions found)[59] |
4.99 | 55.00 | 1992 | |
| SHA-0 | 160 | 160 (5 × 32) |
512 | 80 | And, Xor, Or, Rot, Add (mod 232) | < 34 (collisions found) |
≈ SHA-1 | ≈ SHA-1 | 1993 | |
| SHA-1 | < 63 (collisions found)[60] |
3.47 | 52.00 | 1995 | ||||||
| SHA-2 | SHA-224 SHA-256 |
224 256 |
256 (8 × 32) |
512 | 64 | And, Xor, Or, Rot, Shr, Add (mod 232) |
112 128 |
7.62 7.63 |
84.50 85.25 |
2004 2001 |
| SHA-384 | 384 | 512 (8 × 64) |
1024 | 80 | And, Xor, Or, Rot, Shr, Add (mod 264) |
192 | 5.12 | 135.75 | 2001 | |
| SHA-512 | 512 | 256 | 5.06 | 135.50 | 2001 | |||||
| SHA-512/224 SHA-512/256 |
224 256 |
112 128 |
≈ SHA-384 | ≈ SHA-384 | 2012 | |||||
| SHA-3 | SHA3-224 SHA3-256 SHA3-384 SHA3-512 |
224 256 384 512 |
1600 (5 × 5 × 64) |
1152 1088 832 576 |
24[61] | And, Xor, Rot, Not | 112 128 192 256 |
8.12 8.59 11.06 15.88 |
154.25 155.50 164.00 164.00 |
2015 |
| SHAKE128 SHAKE256 |
d (arbitrary) d (arbitrary) |
1344 1088 |
min(d/2, 128) min(d/2, 256) |
7.08 8.59 |
155.25 155.50 | |||||
Implementations
[edit]Below is a list of cryptography libraries that support SHA-1:
Hardware acceleration is provided by the following processor extensions:
- Intel SHA extensions: Available on some Intel and AMD x86 processors.
- VIA PadLock
- IBM z/Architecture: Available since 2003 as part of the Message-Security-Assist Extension[62]
Collision countermeasure
[edit]In the wake of SHAttered, Marc Stevens and Dan Shumow published "sha1collisiondetection" (SHA-1CD), a variant of SHA-1 that detects collision attacks and changes the hash output when one is detected. The false positive rate is 2−90.[63] SHA-1CD is used by GitHub since March 2017 and git since version 2.13.0 of May 2017.[64]
See also
[edit]Notes
[edit]- ^ a b Stevens, Marc (June 19, 2012). Attacks on Hash Functions and Applications (PDF) (PhD thesis). Leiden University. hdl:1887/19093. ISBN 9789461913173. OCLC 795702954.
- ^ a b c Stevens, Marc; Bursztein, Elie; Karpman, Pierre; Albertini, Ange; Markov, Yarik (2017). Katz, Jonathan; Shacham, Hovav (eds.). The First Collision for Full SHA-1 (PDF). Advances in Cryptology – CRYPTO 2017. Lecture Notes in Computer Science. Vol. 10401. Springer. pp. 570–596. doi:10.1007/978-3-319-63688-7_19. ISBN 9783319636870. Archived from the original (PDF) on May 15, 2018. Retrieved February 23, 2017.
- Marc Stevens; Elie Bursztein; Pierre Karpman; Ange Albertini; Yarik Markov; Alex Petit Bianco; Clement Baisse (February 23, 2017). "Announcing the first SHA1 collision". Google Security Blog.
- ^ a b "Secure Hash Standard (SHS)" (PDF). National Institute of Standards and Technology. 2015. doi:10.6028/NIST.FIPS.180-4. Federal Information Processing Standards Publication 180-4. Archived from the original (PDF) on 2020-01-07. Retrieved 2019-09-23.
- ^ a b "The end of SHA-1 on the Public Web". Mozilla Security Blog. 23 February 2017. Retrieved 2019-05-29.
- ^ a b c "SHA-1 Broken – Schneier on Security". www.schneier.com.
- ^ a b "Critical flaw demonstrated in common digital security algorithm". Nanyang Technological University, Singapore. 24 January 2020.
- ^ a b "New Cryptanalytic Results Against SHA-1 – Schneier on Security". www.schneier.com.
- ^ a b c Leurent, Gaëtan; Peyrin, Thomas (2020-01-05). "SHA-1 is a Shambles First Chosen-Prefix Collision on SHA-1 and Application to the PGP Web of Trust" (PDF). Cryptology ePrint Archive, Report 2020/014.
- ^ a b "Google will drop SHA-1 encryption from Chrome by January 1, 2017". VentureBeat. 2015-12-18. Retrieved 2019-05-29.
- ^ a b c d e Stevens, Marc; Karpman, Pierre; Peyrin, Thomas. "The SHAppening: freestart collisions for SHA-1". Retrieved 2015-10-09.
- ^ Schneier, Bruce (February 18, 2005). "Schneier on Security: Cryptanalysis of SHA-1".
- ^ "NIST.gov – Computer Security Division – Computer Security Resource Center". Archived from the original on 2011-06-25. Retrieved 2019-01-05.
- ^ Schneier, Bruce (8 October 2015). "SHA-1 Freestart Collision". Schneier on Security.
- ^ "NIST Retires SHA-1 Cryptographic Algorithm" (Press release). NIST. 2022-12-15.
- ^ Goodin, Dan (2016-05-04). "Microsoft to retire support for SHA1 certificates in the next 4 months". Ars Technica. Retrieved 2019-05-29.
- ^ "CWI, Google announce first collision for Industry Security Standard SHA-1". Retrieved 2017-02-23.
- ^ Barker, Elaine (May 2020). Recommendation for Key Management: Part 1 – General, Table 3 (Technical Report). NIST. p. 56. doi:10.6028/NIST.SP.800-57pt1r5.
- ^ "SHA-1 Windows content to be retired August 3, 2020". techcommunity.microsoft.com. Retrieved 2024-02-28.
- ^ "RSA FAQ on Capstone".
- ^ Selvarani, R.; Aswatha, Kumar; T V Suresh, Kumar (2012). Proceedings of International Conference on Advances in Computing. Springer Science & Business Media. p. 551. ISBN 978-81-322-0740-5.
- ^ Secure Hash Standard, Federal Information Processing Standards Publication FIPS PUB 180, National Institute of Standards and Technology, 11 May 1993
- ^ Kramer, Samuel (11 July 1994). "Proposed Revision of Federal Information Processing Standard (FIPS) 180, Secure Hash Standard". Federal Register.
- ^ fgrieu. "Where can I find a description of the SHA-0 hash algorithm?". Cryptography Stack Exchange.
- ^ a b Computer Security Division, Information Technology Laboratory (2017-01-04). "NIST Policy on Hash Functions – Hash Functions". CSRC, NIST. Retrieved 2023-08-27.
- ^ "Tech Talk: Linus Torvalds on git". YouTube. Retrieved November 13, 2013.
- ^ Torvalds, Linus. "Re: Starting to think about sha-256?". marc.info. Retrieved 30 May 2016.
- ^ Walfield, Neal H. (2020). "openpgp: Pass the hash algo's security reqs to Policy::signature". gitlab.com/sequoia-pgp. – see section "Background" in the rendered documentation
- ^ Sotirov, Alexander; Stevens, Marc; Appelbaum, Jacob; Lenstra, Arjen; Molnar, David; Osvik, Dag Arne; de Weger, Benne (December 30, 2008). "MD5 considered harmful today: Creating a rogue CA certificate". Retrieved March 29, 2009.
- ^ "Strengths of Keccak – Design and security". The Keccak sponge function family. Keccak team. Retrieved 20 September 2015.
Unlike SHA-1 and SHA-2, Keccak does not have the length-extension weakness, hence does not need the HMAC nested construction. Instead, MAC computation can be performed by simply prepending the message with the key.
- ^ "Schneier on Security: Cryptography Engineering". www.schneier.com. Retrieved 2023-08-27.
- ^ Chabaud, Florent; Joux, Antoine (October 3, 1998). "Differential collisions in SHA-0". In Krawczyk, Hugo (ed.). Advances in Cryptology – CRYPTO '98. Lecture Notes in Computer Science. Vol. 1462. Springer. pp. 56–71. doi:10.1007/BFb0055720. ISBN 978-3-540-64892-5 – via Springer Link.
- ^ Biham, Eli; Chen, Rafi. "Near-Collisions of SHA-0" (PDF).
- ^ "Report from Crypto 2004". Archived from the original on 2004-08-21. Retrieved 2004-08-23.
- ^ Grieu, Francois (18 August 2004). "Re: Any advance news from the crypto rump session?". Newsgroup: sci.crypt. Event occurs at 05:06:02 +0200. Usenet: fgrieu-05A994.05060218082004@individual.net.
- ^ Efficient Collision Search Attacks on SHA-0 Archived 2005-09-10 at the Wayback Machine, Shandong University
- ^ Manuel, Stéphane; Peyrin, Thomas (2008-02-11). Collisions on SHA-0 in One Hour (PDF). Fast Software Encryption 2008. Lecture Notes in Computer Science. Vol. 5086. pp. 16–35. doi:10.1007/978-3-540-71039-4_2. ISBN 978-3-540-71038-7.
- ^ "NIST Brief Comments on Recent Cryptanalytic Attacks on Secure Hashing Functions and the Continued Security Provided by SHA-1". 23 August 2017. Retrieved 2022-03-16.
- ^ Rijmen, Vincent; Oswald, Elisabeth (2005). "Update on SHA-1". Cryptology ePrint Archive.
- ^ Collision Search Attacks on SHA1 Archived 2005-02-19 at the Wayback Machine, Massachusetts Institute of Technology
- ^ Lemos, Robert. "Fixing a hole in security". ZDNet.
- ^ Cochran, Martin (2007). "Notes on the Wang et al. 263 SHA-1 Differential Path". Cryptology ePrint Archive.
- ^ De Cannière, Christophe; Rechberger, Christian (2006-11-15). "Finding SHA-1 Characteristics: General Results and Applications". Advances in Cryptology – ASIACRYPT 2006. Lecture Notes in Computer Science. Vol. 4284. pp. 1–20. doi:10.1007/11935230_1. ISBN 978-3-540-49475-1.
- ^ "IAIK Krypto Group — Description of SHA-1 Collision Search Project". Archived from the original on 2013-01-15. Retrieved 2009-06-30.
- ^ "Collisions for 72-step and 73-step SHA-1: Improvements in the Method of Characteristics". Retrieved 2010-07-24.
- ^ "SHA-1 Collision Search Graz". Archived from the original on 2009-02-25. Retrieved 2009-06-30.
- ^ "heise online – IT-News, Nachrichten und Hintergründe". heise online. 27 August 2023.
- ^ "Crypto 2006 Rump Schedule". www.iacr.org.
- ^ Manuel, Stéphane. "Classification and Generation of Disturbance Vectors for Collision Attacks against SHA-1" (PDF). Cryptology ePrint Archive. Retrieved 2011-05-19.
- ^ Manuel, Stéphane (2011). "Classification and Generation of Disturbance Vectors for Collision Attacks against SHA-1". Designs, Codes and Cryptography. 59 (1–3): 247–263. doi:10.1007/s10623-010-9458-9. S2CID 47179704. the most efficient disturbance vector is Codeword2 first reported by Jutla and Patthak
- ^ "SHA-1 collisions now 2^52" (PDF).
- ^ McDonald, Cameron; Hawkes, Philip; Pieprzyk, Josef (2009). "Differential Path for SHA-1 with complexity O(252)". Cryptology ePrint Archive. (withdrawn)
- ^ "Cryptanalysis of MD5 & SHA-1" (PDF).
- ^ "When Will We See Collisions for SHA-1? – Schneier on Security". www.schneier.com.
- ^ "Google Code Archive – Long-term storage for Google Code Project Hosting". code.google.com.
- ^ Leurent, Gaëtan; Peyrin, Thomas (2019). "From Collisions to Chosen-Prefix Collisions Application to Full SHA-1" (PDF). In Yuval Ishai; Vincent Rijmen (eds.). Advances in Cryptology – EUROCRYPT 2019 (PDF). 38th Annual International Conference on the Theory and Applications of Cryptographic Techniques, Darmstadt, Germany, May 19–23, 2019. Lecture Notes in Computer Science. Vol. 11478. Springer. pp. 527–555. doi:10.1007/978-3-030-17659-4_18. ISBN 978-3-030-17658-7. S2CID 153311244.
- ^ "RFC 3174 - US Secure Hash Algorithm 1 (SHA1) (RFC3174)". www.faqs.org.
- ^ Locktyukhin, Max (2010-03-31), "Improving the Performance of the Secure Hash Algorithm (SHA-1)", Intel Software Knowledge Base, retrieved 2010-04-02
- ^ "Measurements table". bench.cr.yp.to.
- ^ Tao, Xie; Liu, Fanbao; Feng, Dengguo (2013). Fast Collision Attack on MD5 (PDF). Cryptology ePrint Archive (Technical report). IACR.
- ^ Stevens, Marc; Bursztein, Elie; Karpman, Pierre; Albertini, Ange; Markov, Yarik. The first collision for full SHA-1 (PDF) (Technical report). Google Research.
- Marc Stevens; Elie Bursztein; Pierre Karpman; Ange Albertini; Yarik Markov; Alex Petit Bianco; Clement Baisse (February 23, 2017). "Announcing the first SHA1 collision". Google Security Blog.
- ^ "The Keccak sponge function family". Retrieved 2016-01-27.
- ^ IBM z/Architecture Principles of Operation, publication number SA22-7832. See KIMD and KLMD instructions in Chapter 7.
- ^ Stevens, Marc (2017). "cr-marcstevens/sha1collisiondetection: Library and command line tool to detect SHA-1 collision in a file".
- ^ King, Jeff (10 May 2017). "Git 2.13 has been released". The GitHub Blog.
References
[edit]- Eli Biham, Rafi Chen, Near-Collisions of SHA-0, Cryptology ePrint Archive, Report 2004/146, 2004 (appeared on CRYPTO 2004), IACR.org
- Xiaoyun Wang, Hongbo Yu and Yiqun Lisa Yin, Efficient Collision Search Attacks on SHA-0, Crypto 2005
- Xiaoyun Wang, Yiqun Lisa Yin and Hongbo Yu, Finding Collisions in the Full SHA-1, Crypto 2005
- Henri Gilbert, Helena Handschuh: Security Analysis of SHA-256 and Sisters. Selected Areas in Cryptography 2003: pp175–193
- An Illustrated Guide to Cryptographic Hashes
- "Proposed Revision of Federal Information Processing Standard (FIPS) 180, Secure Hash Standard". Federal Register. 59 (131): 35317–35318. 1994-07-11. Retrieved 2007-04-26.[permanent dead link]
- A. Cilardo, L. Esposito, A. Veniero, A. Mazzeo, V. Beltran, E. Ayugadé, A CellBE-based HPC application for the analysis of vulnerabilities in cryptographic hash functions, High Performance Computing and Communication international conference, August 2010
- SECURE HASH STANDARD. (1995). https://nvlpubs.nist.gov/nistpubs/Legacy/FIPS/fipspub180-1.pdf
External links
[edit]- CSRC Cryptographic Toolkit – Official NIST site for the Secure Hash Standard
- FIPS 180-4: Secure Hash Standard (SHS)
- RFC 3174 (with sample C implementation)
- Interview with Yiqun Lisa Yin concerning the attack on SHA-1
- Explanation of the successful attacks on SHA-1 (3 pages, 2006)
- Cryptography Research – Hash Collision Q&A
- Lecture on SHA-1 (1h 18m) on YouTube by Christof Paar Archived 2017-04-24 at the Wayback Machine
SHA-1
View on GrokipediaHistory
Origins and Development
SHA-1, or Secure Hash Algorithm 1, was developed by the National Security Agency (NSA) as part of the U.S. Government's Capstone project to establish robust cryptographic standards for federal use.[8] The algorithm's design drew principles from Ronald L. Rivest's MD4 message-digest algorithm, aiming to create a more secure hashing function modeled after MD4 and its successor MD5.[9] While specific individual designers are not publicly attributed, the effort was led by the NSA with significant input from the National Institute of Standards and Technology (NIST) to ensure compatibility with emerging digital signature requirements.[8] The primary design goal of SHA-1 was to produce a 160-bit message digest, providing enhanced collision resistance compared to the 128-bit output of MD5 by making it computationally infeasible to find two distinct messages with the same hash value.[9] This longer digest length was intended to support secure applications such as digital signatures, where even minor message alterations should be detectable with high probability. NIST formalized SHA-1 through the Secure Hash Standard, publishing it as Federal Information Processing Standard (FIPS) PUB 180-1 on April 17, 1995, with an effective date of October 2, 1995.[1] SHA-1 saw initial adoption as a core component of the Digital Signature Standard (DSS), specified in FIPS PUB 186, where it is required for use with the Digital Signature Algorithm (DSA) to generate and verify signatures.[1] This integration positioned SHA-1 as a foundational element in U.S. federal cryptography protocols from the mid-1990s onward, superseding the earlier SHA specification in FIPS PUB 180 from 1993.[1]Relation to SHA-0
SHA-0 was initially developed by the National Security Agency (NSA) and announced by the National Institute of Standards and Technology (NIST) in April 1993 as a draft version of the Secure Hash Standard (SHS) intended to succeed MD5 due to emerging weaknesses in the latter.[10] However, shortly after its publication in May 1993, the NSA identified an undisclosed weakness in SHA-0 and requested that NIST withdraw it, limiting its distribution and preventing widespread release or adoption.[10] This precursor version remained largely undocumented publicly, with only limited details emerging later through reverse-engineering and analyses. To address the flaw, the NSA modified SHA-0 to produce SHA-1, which was subsequently published in April 1995 as FIPS PUB 180-1. The key change involved altering the message schedule in the compression function by introducing a left circular shift (rotation) of one bit on each expanded word, effectively changing the rotation constant from 0 in SHA-0 to 1 in SHA-1; this adjustment was explicitly stated to correct the identified weakness without altering the overall structure or output size. All other aspects of the algorithm, including the 160-bit output and the 80-round compression process, remained consistent. Public understanding of SHA-0's specific vulnerabilities was limited until analytical work began in the late 1990s, as the NSA never disclosed details of the original flaw. The first published collision attack on full SHA-0 appeared in 1998, demonstrating that collisions could be found with approximately 2^{61} operations using differential cryptanalysis techniques.[11] This work by Chabaud and Joux highlighted structural differences that made SHA-0 more susceptible than SHA-1 to such attacks, though it did not directly reveal the NSA's undisclosed issue.[11] As a result, SHA-1 was positioned by NIST as the secure, corrected iteration of the design, suitable for federal use and standardization in cryptographic protocols, while SHA-0 was effectively abandoned and never formalized in any subsequent FIPS publication. This transition underscored early efforts to balance rapid deployment with rigorous security validation in hash function development.[10]Algorithm Description
Input Preparation
The input preparation phase of SHA-1 transforms an arbitrary-length message into a sequence of fixed-size blocks suitable for processing by the hash function. This involves padding the message to ensure its length is a multiple of 512 bits, allowing it to be divided into 512-bit (64-byte) blocks, with each block subsequently processed in 80 rounds during the compression phase.[12] The padding rule begins by appending a single '1' bit to the message, followed by a sequence of zero bits. The number of zero bits, denoted as , is the smallest non-negative integer such that the total length after appending the '1' bit and zeros satisfies , where is the original message length in bits. This ensures 64 bits remain for the length field. Following the padding bits, the 64-bit binary representation of the original message length (in big-endian byte order) is appended. SHA-1 supports messages up to bits in length, and the resulting padded message length is always a multiple of 512 bits.[12] Prior to processing the blocks, SHA-1 initializes five 32-bit registers, through , with specific hexadecimal constants:,
,
,
,
.
These values are derived from the first 32 bits of the fractional parts of the square roots of the first five prime numbers: , , , , and .[12]
Compression Function
The compression function of SHA-1 processes each 512-bit message block in conjunction with the current hash value to produce an updated 160-bit hash value, forming the core of the algorithm's iterative mixing process.[1] This function operates on five chaining variables, denoted as , , , , and , each a 32-bit word. For the initial block, these are initialized to specific constant values: , , , , and (in hexadecimal). For subsequent blocks, the chaining variables are set to the intermediate hash values from the previous compression.[1] The input 512-bit block is first divided into sixteen 32-bit words, labeled through , typically in big-endian byte order. These are then expanded into an 80-word message schedule to using bitwise operations and rotations. Specifically, for to , each is computed as the bitwise XOR of , , , and , followed by a 1-bit left circular rotation (denoted as ). All operations are performed modulo .[1] The compression proceeds through 80 iterative rounds, grouped into four 20-round phases, each employing a distinct nonlinear round function and a phase-specific constant :- For rounds : and .
- For rounds : and .
- For rounds : and .
- For rounds : and .
Output Generation
After processing all message blocks through the compression function, the final 160-bit hash digest is formed by adding the resulting working variables—A, B, C, D, and E—from the last compression step to the corresponding chaining variables H₀, H₁, H₂, H₃, and H₄, with each addition performed modulo 2³².[13] These updated chaining variables, each 32 bits wide, represent the accumulated hash state. The digest is then produced by concatenating the five final 32-bit chaining variables in order: H₀ || H₁ || H₂ || H₃ || H₄, yielding a 160-bit value.[13] This value is typically represented as a 40-character hexadecimal string for readability and transmission. The byte order used in the output representation is big-endian, where the most significant byte of each 32-bit word appears first.[13] For the special case of an empty message (length 0 bits), the SHA-1 digest isda39a3ee5e6b4b0d3255bfef95601890afd80709.[13]
Applications
Cryptographic Protocols
SHA-1 has been widely employed in cryptographic protocols for digital signatures, where it hashes messages to produce a fixed-size digest before applying signature algorithms, thereby reducing computational overhead while relying on the hash's collision resistance for security. In the Digital Signature Algorithm (DSA), SHA-1 is used to compute the hash of the message, which is then signed using modular exponentiation over finite fields, as specified in the original Digital Signature Standard. Similarly, for RSA signatures under PKCS #1 v1.5, SHA-1 serves as the hashing mechanism to digest the message prior to encryption with the recipient's public key, enabling efficient verification in protocols requiring non-repudiation.[14] In Secure Sockets Layer (SSL) 3.0 and in Transport Layer Security (TLS) protocols versions 1.0 and 1.1, SHA-1 was the default hash function for signing certificates and messages, providing integrity and authenticity during key exchanges and handshakes. These versions integrated SHA-1 into the pseudorandom function (PRF) for deriving keys and into signature schemes for server authentication, assuming its resistance to collisions to prevent forgery attacks. However, due to advancing cryptanalysis, SHA-1-signed certificates were deprecated in TLS 1.3, which mandates stronger hashes like SHA-256 to mitigate risks in certificate validation.[15][16][17] SHA-1 also played a key role in email security protocols such as Pretty Good Privacy (PGP) and Secure/Multipurpose Internet Mail Extensions (S/MIME), where it ensured message authentication and integrity in signed and encrypted communications. In earlier versions of OpenPGP, such as specified in RFC 2440, SHA-1 was used to generate digests for signatures in the message format, supporting both detached and inline signing for email and file verification, though modern implementations deprecate it for new signatures while maintaining verification compatibility for interoperability.[18][19] For S/MIME versions 2 and 3, sending and receiving agents must support SHA-1 as the digest algorithm in Cryptographic Message Syntax (CMS) structures, though later versions and current practices deprecate SHA-1 in favor of SHA-256, often paired with RSA or DSA for signing MIME parts to protect against tampering.[20][21][22][23][24] Historically, SHA-1's deployment in these protocols assumed a collision resistance security level of approximately 2^80 operations, derived from its 160-bit output size via the birthday paradox, which was considered adequate against practical attacks until demonstrated otherwise. This assumption underpinned its use in signatures and TLS for over two decades, with protocols designed under the belief that finding collisions would require infeasible computational resources. In 2017, researchers achieved the first practical collision attack on SHA-1 using specialized hardware, reducing the effective cost to around 2^63 operations and prompting widespread deprecation in security protocols to avoid forgery vulnerabilities.[25][26][27]Data Integrity Checks
SHA-1 has been widely employed in non-cryptographic data integrity verification to detect accidental alterations or transmission errors in files and data structures, where deliberate attacks are not the primary concern. In such contexts, its role is to produce a fixed 160-bit digest that serves as a unique fingerprint for the input data, allowing users to confirm that the received content matches the original without requiring secret keys or collision-resistant security guarantees.[28] One prominent application is in Git version control systems, where SHA-1 serves as the default hashing algorithm for identifying commits, trees, blobs, and tags, enabling efficient detection of changes in repository contents.[29] This usage persisted from Git's inception until efforts began to transition to SHA-256 for enhanced resilience, with experimental SHA-256 repositories introduced to support gradual migration while maintaining backward compatibility with SHA-1 identifiers.[28] In Git, SHA-1's integrity checks ensure that objects remain unaltered during storage and transfer, facilitating reliable version tracking without the overhead of stronger cryptographic primitives.[30] For general file verification, tools like sha1sum, part of the GNU Coreutils package, compute SHA-1 digests to safeguard against tampering or corruption in downloaded files, such as software distributions or archives. Users typically generate a SHA-1 checksum for the source file and compare it against the provided value after download; a mismatch indicates potential issues like bit flips during transfer.[31] This method is particularly useful for open-source projects and large binaries, where sha1sum's output format includes the digest, filename, and input mode for straightforward validation. In package management systems, SHA-1 has been utilized for integrity checks in distributions like Debian and RPM-based systems, verifying that downloaded packages or their components have not been altered.[32] For instance, Debian historically included SHA-1 hashes in package metadata to confirm the unaltered state of .deb files during installation, prioritizing error detection over adversarial resistance.[33] Similarly, RPM packages incorporate SHA-1 digests for headers and payloads, allowing tools like rpm to validate file integrity post-download without relying on signatures for basic checks.[34] These implementations highlight SHA-1's role in ensuring package reliability in trusted repository environments. SHA-1 offers practical advantages for these integrity tasks, including faster computation speeds compared to successors like SHA-256, making it suitable for processing large files where quick verification is essential.[35] Its 160-bit output further minimizes the likelihood of false positives from random collisions, with an accidental match probability on the order of 1 in 2^160, sufficient for non-security-critical error detection.Security Analysis
Initial Validation
Upon its publication, SHA-1 underwent initial validation through formal certification by the National Institute of Standards and Technology (NIST) as part of Federal Information Processing Standard (FIPS) 180-1, issued on April 17, 1995, and effective October 2, 1995, establishing it as the Secure Hash Standard for federal use in applications requiring data integrity and digital signatures.[1] This certification affirmed SHA-1's design as a revision of SHA-0, incorporating modifications to enhance resistance against known cryptanalytic techniques, with NIST deeming its 160-bit output suitable for providing high-level security comparable to contemporary symmetric ciphers.[2] The standard was reaffirmed and expanded in FIPS 180-2, published on August 1, 2002, and effective February 1, 2003, which retained SHA-1 as the core algorithm while introducing additional variants like SHA-256, thereby endorsing its continued reliability based on ongoing evaluations up to that point.[36] This reaffirmation reflected NIST's assessment that no practical weaknesses had emerged in SHA-1's structure during the intervening years. Early independent cryptanalytic reviews, including applications of differential cryptanalysis, confirmed the absence of collisions for the full SHA-1 algorithm, as demonstrated in analyses of reduced-round variants that highlighted design strengths without compromising the complete function.[37] Based on its 160-bit output length and the Merkle-Damgård construction, SHA-1 was assumed to offer a security margin of approximately operations against collision attacks via the birthday paradox and against preimage attacks, aligning with theoretical bounds for ideal hash functions of that size.[2] SHA-1's trustworthiness was further validated through international adoption, notably its inclusion as a dedicated hash function in the ISO/IEC 10118-3 standard for information technology security techniques, published in 2004, which specified it alongside other approved algorithms for producing hash-codes up to 160 bits.[38] This standardization by the International Organization for Standardization (ISO) and International Electrotechnical Commission (IEC) signified broad consensus on SHA-1's initial robustness for cryptographic protocols and data integrity applications.Collision Attacks
The vulnerability of SHA-1 to collision attacks was first demonstrated theoretically in 2005 by Wang, Yin, and Yu, who introduced a differential path that reduced the complexity of finding collisions from the expected 2^{80} to approximately 2^{69} hash operations, later refined to 2^{63} in unpublished work.[39] This breakthrough relied on a structured differential analysis of SHA-1's compression function, marking the initial practical threat to its collision resistance and prompting reevaluation of its security margins.[5] Subsequent advancements focused on chosen-prefix collisions, a more powerful variant where attackers control distinct input prefixes leading to the same hash value. In 2013, Stevens et al. presented the first such attack using collision fragments, achieving an estimated cost of around 2^{52} operations for key components, building on optimized local-collision analysis to connect arbitrary prefixes efficiently.[40] This approach distinguished itself by targeting specific message blocks, enabling targeted applications in protocols vulnerable to prefix manipulation. The theoretical progress culminated in a practical demonstration with the SHAttered attack in 2017, by Stevens et al., who constructed the first publicly known full collision for SHA-1 by generating two distinct PDF files sharing the same hash value.[26] The computation required an effort equivalent to 2^{63} SHA-1 operations, performed using approximately 110 GPU-years on a large-scale cloud infrastructure, highlighting the feasibility of real-world exploitation despite high resource demands.[41] Further refinement in 2019 by Leurent and Peyrin introduced a birthday-near-collision technique, enabling a chosen-prefix collision attack with complexity between 2^{66.9} and 2^{69.4}, sufficient to forge Git commit histories by altering repository contents without changing the hash chain.[42] This attack exploited near-collisions to bridge prefix differences, demonstrating direct impact on version control systems still using SHA-1. In 2020, Leurent and Roy demonstrated a practical chosen-prefix collision based on this method, with a complexity of 2^{63.4} GPU operations, by creating colliding PGP certificates that could undermine the PGP web of trust.[43] While collision attacks have progressed to practicality, no feasible preimage attacks—recovering an input from a given hash—exist for SHA-1, with the best theoretical complexity remaining at 2^{160} operations due to the lack of exploitable weaknesses in the function's one-way properties. Thus, security analyses continue to emphasize collision resistance as the primary concern for deprecation.Deprecation Timeline
In response to theoretical and practical advances in cryptanalysis demonstrating reduced collision resistance for SHA-1, the National Institute of Standards and Technology (NIST) issued guidance in 2005 advising federal agencies to plan a transition to the SHA-2 family of hash functions, particularly SHA-256, for applications requiring high levels of security.[3] During the 2010s, major web browsers began deprecating SHA-1-signed certificates to mitigate risks in TLS connections. Google Chrome removed support for SHA-1 certificates in version 56, released at the end of January 2017, preventing affected sites from loading securely.[44] Similarly, Microsoft Edge and Internet Explorer 11 blocked loading of sites protected by SHA-1 certificates starting May 9, 2017, with enterprise and self-signed exceptions encouraged to migrate to SHA-2 promptly.[45] The Internet Engineering Task Force (IETF) further advanced deprecation through standards updates prohibiting SHA-1 in TLS protocols. RFC 9155, published in December 2021, formally deprecated MD5 and SHA-1 signature hashes in TLS 1.2 and DTLS 1.2 due to their vulnerability to collision attacks, recommending stronger alternatives for digital signatures.[46] This built on earlier prohibitions in TLS 1.3 (RFC 8446, August 2018), which required servers to avoid offering SHA-1-signed certificates unless no valid alternative chain exists. In December 2022, NIST announced a comprehensive phase-out of SHA-1 across all cryptographic applications, mandating a full transition by December 31, 2030, in favor of the more secure SHA-2 and SHA-3 algorithms to address ongoing collision vulnerabilities.[6] After this date, FIPS 140-validated modules using SHA-1 as an approved algorithm will be relegated to historical status, prohibiting new federal procurements.[3] As of 2025, SHA-1 persists in some legacy systems for non-critical data integrity checks, though its use is strongly discouraged due to established classical collision weaknesses. Emerging quantum computing threats, such as those enabled by Grover's algorithm potentially reducing preimage resistance, underscore the urgency of migration, even absent new classical attacks.[3][47]Examples
Hash Computations
SHA-1 processes arbitrary-length input messages to produce a fixed 160-bit (20-byte) digest, which is conventionally represented as a 40-character lowercase hexadecimal string. This deterministic function ensures that even minor changes in the input result in significantly different outputs, demonstrating its sensitivity to input variations. Common test cases highlight SHA-1's behavior. For the empty message (zero length), the hash isda39a3ee5e6b4b0d3255bfef95601890afd80709.[48] For the simple string "hello" (five ASCII bytes), the resulting digest is aaf4c61ddcc5e8a2dabede0f3b482cd9aea9434d. A more illustrative example is the pangram "The quick brown fox jumps over the lazy dog" (43 bytes including spaces and period), which yields 2fd4e1c67a2d28fced849ee1bb76e7391b93eb12. These examples, verified through standard implementations, underscore how SHA-1 transforms diverse inputs into unique fixed-size values.
To demonstrate processing at the byte level, consider the short message "abc" (three bytes: 0x61, 0x62, 0x63 in ASCII, total length 24 bits). The algorithm first pads the message by appending a single byte 0x80 (representing the '1' bit followed by zeros), then adds 52 zero bytes to align to 448 bits, followed by an 8-byte big-endian representation of the message length (0x0000000000000018). This forms a single 512-bit (64-byte) block. The block is divided into sixteen 32-bit big-endian words, which are expanded into an 80-word message schedule using predefined expansion rules. These words are then fed into the compression function, which iteratively updates five chained 32-bit variables over 80 steps—grouped into four rounds of 20 iterations each—involving bitwise logical operations (AND, OR, XOR, NOT), left rotations, and modular additions based on constants derived from the square roots of primes. The final values of these variables, concatenated, form the 160-bit digest a9993e364706816aba3e25717850c26c9cd0d89d.[9] This high-level flow illustrates SHA-1's block-based structure without exhaustive numerical details, as the output mechanism concatenates the working variables into the final hexadecimal string (detailed in the Output Generation section).
Pseudocode
The SHA-1 algorithm processes an input message to produce a 160-bit hash value through a series of steps including message padding, initialization of hash values, and iterative processing of 512-bit blocks via a message schedule expansion and a compression function consisting of 80 rounds.[12] Below is a high-level pseudocode representation of SHA-1, based on the Secure Hash Standard specification. All operations are performed modulo , with bitwise functions defined as follows:- (also denoted as for certain rounds)
- (left rotation by bits)
SHA-1(Message M):
λ ← length of M in bits
// Step 1: Pad the message
Append a '1' bit to M
Append k '0' bits, where k is the smallest non-negative integer such that (λ + 1 + k) ≡ 448 mod 512
Append the 64-bit big-endian representation of λ as a block
Divide the padded message into N sequential 512-bit blocks M^(1), ..., M^(N)
// Step 2: Initialize hash values (five 32-bit words, hexadecimal)
H0 ← 0x67452301
H1 ← 0xEFCDAB89
H2 ← 0x98BADCFE
H3 ← 0x10325476
H4 ← 0xC3D2E1F0
// Step 3: Process each 512-bit block
for i = 1 to N do:
// 3.1: Prepare the message schedule (80 32-bit words W_t)
for t = 0 to 15 do:
W_t ← the t-th 32-bit word of M^(i) // big-endian
for t = 16 to 79 do:
W_t ← ROTL_1( W_(t-16) ⊕ W_(t-14) ⊕ W_(t-8) ⊕ W_(t-3) )
// 3.2: Initialize working variables with previous hash value
a ← H0
b ← H1
c ← H2
d ← H3
e ← H4
// 3.3: Compression function (80 rounds)
for t = 0 to 79 do:
if 0 ≤ t ≤ 19 then:
f ← Ch(b, c, d)
K_t ← 0x5A827999
elif 20 ≤ t ≤ 39 then:
f ← Parity(b, c, d)
K_t ← 0x6ED9EBA1
elif 40 ≤ t ≤ 59 then:
f ← Maj(b, c, d)
K_t ← 0x8F1BBCDC
else: // 60 ≤ t ≤ 79
f ← Parity(b, c, d)
K_t ← 0xCA62C1D6
T ← ROTL_5(a) + f + e + W_t + K_t
e ← d
d ← c
c ← ROTL_30(b)
b ← a
a ← T
// 3.4: Add this block's compression to current hash
H0 ← H0 + a
H1 ← H1 + b
H2 ← H2 + c
H3 ← H3 + d
H4 ← H4 + e
// Step 4: Produce the final 160-bit message digest
return (H0 || H1 || H2 || H3 || H4) // concatenation in big-endian order
SHA-1(Message M):
λ ← length of M in bits
// Step 1: Pad the message
Append a '1' bit to M
Append k '0' bits, where k is the smallest non-negative integer such that (λ + 1 + k) ≡ 448 mod 512
Append the 64-bit big-endian representation of λ as a block
Divide the padded message into N sequential 512-bit blocks M^(1), ..., M^(N)
// Step 2: Initialize hash values (five 32-bit words, hexadecimal)
H0 ← 0x67452301
H1 ← 0xEFCDAB89
H2 ← 0x98BADCFE
H3 ← 0x10325476
H4 ← 0xC3D2E1F0
// Step 3: Process each 512-bit block
for i = 1 to N do:
// 3.1: Prepare the message schedule (80 32-bit words W_t)
for t = 0 to 15 do:
W_t ← the t-th 32-bit word of M^(i) // big-endian
for t = 16 to 79 do:
W_t ← ROTL_1( W_(t-16) ⊕ W_(t-14) ⊕ W_(t-8) ⊕ W_(t-3) )
// 3.2: Initialize working variables with previous hash value
a ← H0
b ← H1
c ← H2
d ← H3
e ← H4
// 3.3: Compression function (80 rounds)
for t = 0 to 79 do:
if 0 ≤ t ≤ 19 then:
f ← Ch(b, c, d)
K_t ← 0x5A827999
elif 20 ≤ t ≤ 39 then:
f ← Parity(b, c, d)
K_t ← 0x6ED9EBA1
elif 40 ≤ t ≤ 59 then:
f ← Maj(b, c, d)
K_t ← 0x8F1BBCDC
else: // 60 ≤ t ≤ 79
f ← Parity(b, c, d)
K_t ← 0xCA62C1D6
T ← ROTL_5(a) + f + e + W_t + K_t
e ← d
d ← c
c ← ROTL_30(b)
b ← a
a ← T
// 3.4: Add this block's compression to current hash
H0 ← H0 + a
H1 ← H1 + b
H2 ← H2 + c
H3 ← H3 + d
H4 ← H4 + e
// Step 4: Produce the final 160-bit message digest
return (H0 || H1 || H2 || H3 || H4) // concatenation in big-endian order
Comparisons
With SHA-2 Family
SHA-1 produces a 160-bit digest, whereas SHA-256, a prominent member of the SHA-2 family, generates a 256-bit digest, providing greater resistance to brute-force attacks such as preimage searches, where the effort scales with half the digest length (2^80 for SHA-1 versus 2^128 for SHA-256).[49] Both algorithms employ the Merkle-Damgård construction, processing input messages in 512-bit blocks through iterative compression functions to produce the final hash value.[49] However, SHA-2 variants like SHA-256 utilize 64 rounds of processing per block, compared to SHA-1's 80 rounds, and rely on distinct primitive operations—including bitwise majority (Maj), choice (Ch), and rotation-based sigma functions (Σ0, Σ1, σ0, σ1)—without the nonlinear f functions (f0 through f3) characteristic of SHA-1's design, which is loosely based on a modified Davies-Meyer structure.[49] In terms of performance, SHA-1 typically executes faster on legacy hardware due to its smaller output size and simpler per-round computations, making it more efficient for resource-constrained environments, though modern implementations show diminishing differences. Despite this, SHA-2 algorithms are recommended for all new cryptographic designs owing to their enhanced security margins against known attacks. The transition from SHA-1 to the SHA-2 family is formalized in NIST standards, with FIPS 180-4 (published in 2015) specifying both but emphasizing SHA-2 for ongoing use, while SP 800-131A Revision 2 (2019) prohibits SHA-1 for generating digital signatures after 2013 and allows limited use in other applications until December 31, 2015; NIST later mandated a full phase-out by December 31, 2030.[49][50][6]With MD5
SHA-1 was developed as an improvement over MD5, which was designed by Ronald Rivest in 1991 and specified in RFC 1321 the following year, producing a 128-bit hash value intended for enhanced collision resistance compared to its predecessor MD4.[51] In contrast, SHA-1, published by the National Institute of Standards and Technology (NIST) in 1995 as FIPS 180-1, outputs a longer 160-bit hash to provide greater security against brute-force attacks and potential collisions.[1] A key design difference lies in their processing structures: MD5 operates through 64 rounds divided into four distinct phases, each employing a unique nonlinear function—F for bitwise operations in the first phase, G in the second, H using XOR in the third, and I with additional bitwise variations in the fourth—to mix the input data.[51] SHA-1, however, uses 80 rounds organized into four phases of 20 rounds each, incorporating phase-specific functions such as the choice function (Ch) in the initial phase, parity (XOR-based) in the second and fourth, and majority (Maj) in the third, along with expanded message scheduling to further diffuse the input bits.[2] These extensions in round count and function diversity aimed to bolster SHA-1's resistance to cryptanalytic attacks relative to MD5's more compact structure. In terms of security history, MD5 faced early scrutiny with Hans Dobbertin demonstrating a collision in its compression function in 1996, though this did not fully compromise the hash algorithm itself. A practical full collision attack on MD5 was achieved in 2004 by Xiaoyun Wang and colleagues, enabling the creation of distinct inputs with identical hashes in feasible computational time, which severely undermined its trustworthiness for cryptographic purposes. SHA-1, by comparison, withstood significant cryptanalysis for two decades longer, with the first practical collision demonstrated only in 2017 by researchers from Google and the CWI Institute, marking a delayed but inevitable vulnerability due to its extended design. The usage shift reflected these security timelines, as MD5's vulnerabilities prompted earlier deprecation—NIST ceased approving it for new applications by the early 2000s, with major vendors like Microsoft fully retiring support by 2014—while SHA-1 persisted in legacy systems until NIST's formal deprecation in 2011 for digital signatures and a mandated phase-out by 2030.[52][3] This progression highlighted SHA-1's interim role as a more robust alternative before the advent of stronger successors.Implementations
Core Algorithms
The SHA-1 algorithm is implemented in standard cryptographic libraries across multiple programming languages, providing straightforward access to its core functionality for computing message digests. In C, a reference implementation is available as part of RFC 3174, which includes header files, source code, and a test driver utilizing standard integer types and bitwise operations for portability across compliant compilers.[53] OpenSSL, a widely used open-source library, exposes SHA-1 through functions like SHA1() and EVP_sha1(), enabling efficient hashing of data buffers in C and C++ applications. Python's built-in hashlib module supports SHA-1 via the sha1() constructor, allowing incremental updates and finalization of digests from byte strings or files.[54] Similarly, Java's java.security.MessageDigest class provides SHA-1 support through MessageDigest.getInstance("SHA-1"), integrating seamlessly with the Java Cryptography Architecture for secure one-way hashing.[55] Hardware acceleration for SHA-1 is available via Intel SHA Extensions, introduced in 2013 as part of the x86 instruction set to optimize performance for cryptographic workloads using specialized instructions like SHA1RNDS4.[56] These extensions, implemented in processors such as those based on Goldmont and later architectures, reduce computation time for SHA-1 rounds and are available in many modern Intel processors, including low-power and high-performance architectures such as Alder Lake and later.[56][57] SHA-1 implementations rely on bitwise operations (AND, OR, XOR, NOT, and rotations) that are standardized in languages like C via <stdint.h> for 32-bit unsigned integers (uint32_t), ensuring consistent behavior across platforms without endianness issues when handling big-endian word ordering as specified.[53] The algorithm assumes 32-bit word processing, which aligns with most modern architectures but may require emulation on non-32-bit systems for full fidelity.[53] Prior to its retirement, SHA-1 was included in numerous FIPS 140-2 validated cryptographic modules, with certifications available until April 2022 when NIST ceased accepting new submissions under the standard, after which modules transitioned to FIPS 140-3, where SHA-1 is approved only for specific legacy uses like integrity checks at lower security levels until its full retirement on December 31, 2030.[58][59][6]Security Enhancements
To mitigate length extension attacks on SHA-1, which exploit the Merkle-Damgård construction to append data to a hashed message without knowing the original input, implementations commonly employ HMAC-SHA-1. This keyed hash-based message authentication code integrates a secret key into the hashing process through nested applications of the hash function, rendering length extensions computationally infeasible even if an attacker possesses the hash output and message length.[60] Following the 2017 SHAttered collision attack, which demonstrated practical collisions in SHA-1 by producing two distinct PDFs with identical hashes, Google collaborated with cryptanalyst Marc Stevens to develop a collision detection library. This open-source tool, released in 2017, serves as a drop-in replacement for standard SHA-1 libraries and detects collisions by identifying unavoidable conditions inherent to known attack vectors, such as specific differential patterns in the hash computation.[4][61] Integrated into GitHub's infrastructure, the library scans incoming content and rejects any exhibiting SHAttered-like collision artifacts, preventing malicious repository manipulations. The approach, detailed in a USENIX Security paper, accelerates detection without significantly impacting performance, covering a broader range of attack classes than initial countermeasures.[62] To counter chosen-prefix collisions, where attackers craft colliding messages with arbitrary prefixes, some protocols incorporate randomized hashing via salt or nonce addition. By prepending a unique, unpredictable value—such as a random serial number in X.509 certificates or a per-session salt in authentication schemes—the effective input becomes attacker-unpredictable, invalidating precomputed collision pairs and forcing recomputation.[63][64] This technique, while not eliminating all collision risks, substantially raises the attack cost in scenarios like certificate forgery or Git object tampering.[65] Post-2017, major implementations introduced enhancements to phase out SHA-1 reliance. OpenSSL 1.1.1, released in 2018, elevated security levels to disallow SHA-1 signatures by default in certificate validation and TLS handshakes, enforcing at least SHA-256 for new operations while supporting legacy verification only at reduced levels. Similarly, Git began experimental SHA-256 support in version 2.29 (2020), enabling repositories with the--object-format=sha256 flag to use the stronger hash for object identification, with full protocol compatibility maturing in subsequent releases like 2.42 (2023), where warnings were toned down. As of Git 2.51 (August 2025), SHA-256 is designated as the default hash algorithm in preparation for Git 3.0, expected in 2026.[66][29][67] These updates, motivated by practical breaks like SHAttered, facilitate gradual migration without immediate disruption.[30]