SHA-1
from Wikipedia
Secure Hash Algorithms
Concepts
hash functions, SHA, DSA
Main standards
SHA-0, SHA-1, SHA-2, SHA-3
SHA-1
General
Designers: National Security Agency
First published: 1993 (SHA-0), 1995 (SHA-1)
Series: (SHA-0), SHA-1, SHA-2, SHA-3
Certification: FIPS PUB 180-4, CRYPTREC (Monitored)
Cipher detail
Digest sizes: 160 bits
Block sizes: 512 bits
Structure: Merkle–Damgård construction
Rounds: 80
Best public cryptanalysis
A 2011 attack by Marc Stevens can produce hash collisions with a complexity between 2^60.3 and 2^65.3 operations.[1] The first public collision was published on 23 February 2017.[2] SHA-1 is prone to length extension attacks.

In cryptography, SHA-1 (Secure Hash Algorithm 1) is a hash function which takes an input and produces a 160-bit (20-byte) hash value known as a message digest – typically rendered as 40 hexadecimal digits. It was designed by the United States National Security Agency, and is a U.S. Federal Information Processing Standard.[3] The algorithm has been cryptographically broken[4][5][6][7][8][9][10] but is still widely used.

Since 2005, SHA-1 has not been considered secure against well-funded opponents;[11] as of 2010 many organizations have recommended its replacement.[12][10][13] NIST formally deprecated use of SHA-1 in 2011 and disallowed its use for digital signatures in 2013, and declared that it should be phased out by 2030.[14] As of 2020, chosen-prefix attacks against SHA-1 are practical.[6][8] As such, it is recommended to remove SHA-1 from products as soon as possible and instead use SHA-2 or SHA-3. Replacing SHA-1 is urgent where it is used for digital signatures.

All major web browser vendors ceased acceptance of SHA-1 SSL certificates in 2017.[15][9][4] In February 2017, CWI Amsterdam and Google announced they had performed a collision attack against SHA-1, publishing two dissimilar PDF files which produced the same SHA-1 hash.[16][2] However, SHA-1 is still secure for HMAC.[17]

Microsoft discontinued SHA-1 code signing support for Windows Update on August 3, 2020,[18] which also effectively ended the update servers for versions of Windows that had not been updated to SHA-2, such as Windows 2000 up to Vista, as well as Windows Server versions from Windows 2000 Server to Server 2003.

Development

One iteration within the SHA-1 compression function:
  • A, B, C, D and E are 32-bit words of the state;
  • F is a nonlinear function that varies;
  • <<<n denotes a left bit rotation by n places;
  • n varies for each operation;
  • W_t is the expanded message word of round t;
  • K_t is the round constant of round t;
  • ⊞ denotes addition modulo 2^32.

SHA-1 produces a message digest based on principles similar to those used by Ronald L. Rivest of MIT in the design of the MD2, MD4 and MD5 message digest algorithms, but generates a larger hash value (160 bits vs. 128 bits).

SHA-1 was developed as part of the U.S. Government's Capstone project.[19] The original specification of the algorithm was published in 1993 under the title Secure Hash Standard, FIPS PUB 180, by U.S. government standards agency NIST (National Institute of Standards and Technology).[20][21] This version is now often named SHA-0. It was withdrawn by the NSA shortly after publication and was superseded by the revised version, published in 1995 in FIPS PUB 180-1 and commonly designated SHA-1. SHA-1 differs from SHA-0 only by a single bitwise rotation in the message schedule of its compression function. According to the NSA, this was done to correct a flaw in the original algorithm which reduced its cryptographic security, but they did not provide any further explanation.[22][23] Publicly available techniques did indeed demonstrate a compromise of SHA-0, in 2004, before SHA-1 in 2017 (see §Attacks).

Applications

Cryptography

SHA-1 forms part of several widely used security applications and protocols, including TLS and SSL, PGP, SSH, S/MIME, and IPsec. Those applications can also use MD5; both MD5 and SHA-1 are descended from MD4.

SHA-1 and SHA-2 are the hash algorithms required by law for use in certain U.S. government applications, including use within other cryptographic algorithms and protocols, for the protection of sensitive unclassified information. FIPS PUB 180-1 also encouraged adoption and use of SHA-1 by private and commercial organizations. SHA-1 is being retired from most government uses; the U.S. National Institute of Standards and Technology said, "Federal agencies should stop using SHA-1 for...applications that require collision resistance as soon as practical, and must use the SHA-2 family of hash functions for these applications after 2010",[24] though that was later relaxed to allow SHA-1 to be used for verifying old digital signatures and time stamps.[24]

A prime motivation for the publication of the Secure Hash Algorithm was the Digital Signature Standard, in which it is incorporated.

The SHA hash functions have been used for the basis of the SHACAL block ciphers.

Data integrity

Revision control systems such as Git, Mercurial, and Monotone use SHA-1, not for security, but to identify revisions and to ensure that the data has not changed due to accidental corruption. Linus Torvalds said about Git in 2007:

If you have disk corruption, if you have DRAM corruption, if you have any kind of problems at all, Git will notice them. It's not a question of if, it's a guarantee. You can have people who try to be malicious. They won't succeed. [...] Nobody has been able to break SHA-1, but the point is the SHA-1, as far as Git is concerned, isn't even a security feature. It's purely a consistency check. The security parts are elsewhere, so a lot of people assume that since Git uses SHA-1 and SHA-1 is used for cryptographically secure stuff, they think that, Okay, it's a huge security feature. It has nothing at all to do with security, it's just the best hash you can get. ...
I guarantee you, if you put your data in Git, you can trust the fact that five years later, after it was converted from your hard disk to DVD to whatever new technology and you copied it along, five years later you can verify that the data you get back out is the exact same data you put in. [...]
One of the reasons I care is for the kernel, we had a break in on one of the BitKeeper sites where people tried to corrupt the kernel source code repositories.[25]

However Git does not require the second preimage resistance of SHA-1 as a security feature, since it will always prefer to keep the earliest version of an object in case of collision, preventing an attacker from surreptitiously overwriting files.[26] The known attacks (as of 2020) also do not break second preimage resistance.[27]
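
As an illustration of how a revision control system can use SHA-1 as a content identifier, the following minimal sketch computes the name Git gives to a file's contents. It assumes Git's blob object format (the ASCII header "blob <size>" followed by a zero byte, then the content), which is not described in this article; the helper name `git_blob_id` is hypothetical.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object name for a blob with this content.

    Assumes Git's blob format: b"blob <size>\\0" followed by the raw content.
    """
    header = b"blob " + str(len(content)).encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


# Any change to the content changes the identifier, which is how
# accidental corruption is noticed.
print(git_blob_id(b""))
print(git_blob_id(b"hello world\n"))
print(git_blob_id(b"hello world!\n"))
```

Because the identifier depends only on the content, two copies of the same file always get the same name, and a corrupted copy no longer matches the name it is stored under.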

Cryptanalysis and validation

For a hash function for which L is the number of bits in the message digest, finding a message that corresponds to a given message digest can always be done using a brute force search in approximately 2^L evaluations. This is called a preimage attack and may or may not be practical depending on L and the particular computing environment. However, a collision, consisting of finding two different messages that produce the same message digest, requires on average only about 1.2 × 2^(L/2) evaluations using a birthday attack. Thus the strength of a hash function is usually compared to a symmetric cipher of half the message digest length. SHA-1, which has a 160-bit message digest, was originally thought to have 80-bit strength.
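
The birthday bound can be observed on a scaled-down digest. The sketch below is only an illustration under stated assumptions: it truncates SHA-1 to its first 24 bits (so L = 24) and hashes the decimal strings "0", "1", "2", ... until two of them share a truncated digest. The function names are hypothetical, and nothing here attacks full SHA-1.

```python
import hashlib
from itertools import count


def truncated_sha1(data: bytes, bits: int) -> int:
    """Return the first `bits` bits of SHA-1(data) as an integer."""
    digest = int.from_bytes(hashlib.sha1(data).digest(), "big")
    return digest >> (160 - bits)


def birthday_collision(bits: int):
    """Hash distinct messages until two share a truncated digest."""
    seen = {}  # truncated digest -> first message that produced it
    for i in count():
        msg = str(i).encode("ascii")
        tag = truncated_sha1(msg, bits)
        if tag in seen:
            return seen[tag], msg, i + 1
        seen[tag] = msg


m1, m2, tries = birthday_collision(24)
print(m1, m2, tries)
print("birthday estimate:", round(1.2 * 2 ** (24 / 2)))
```

The number of tries is typically within a small factor of the 1.2 × 2^12 estimate, far below the 2^24 evaluations a preimage search on the same truncated digest would need.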

Some of the applications that use cryptographic hashes, like password storage, are only minimally affected by a collision attack. Constructing a password that works for a given account requires a preimage attack, as well as access to the hash of the original password, which may or may not be trivial. Reversing password encryption (e.g. to obtain a password to try against a user's account elsewhere) is not made possible by the attacks. However, even a secure password hash can't prevent brute-force attacks on weak passwords. See Password cracking.

In the case of document signing, an attacker could not simply fake a signature from an existing document: The attacker would have to produce a pair of documents, one innocuous and one damaging, and get the private key holder to sign the innocuous document. There are practical circumstances in which this is possible; until the end of 2008, it was possible to create forged SSL certificates using an MD5 collision.[28]

Due to the block and iterative structure of the algorithms and the absence of additional final steps, all SHA functions (except SHA-3)[29] are vulnerable to length-extension and partial-message collision attacks.[30] These attacks allow an attacker to forge a message signed only by a keyed hash – SHA(key || message), but not SHA(message || key) – by extending the message and recalculating the hash without knowing the key. A simple improvement to prevent these attacks is to hash twice: SHA_d(message) = SHA(SHA(0^b || message)) (0^b is a block of zero bits whose length b equals the block size of the hash function).
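
The length-extension property can be demonstrated with a small sketch. It assumes the attacker knows the tag SHA(key || message), the message, and the key length, but not the key itself; the key, the messages, and the helper names (`_compress`, `sha1_padding`, `extend`) are hypothetical and chosen only for illustration. The digest is loaded back into the five state words and hashing simply continues.

```python
import hashlib
import struct


def _rotl(x, n):
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF


def _compress(state, block):
    """One SHA-1 compression of a 64-byte block from a given 5-word state."""
    w = list(struct.unpack(">16I", block))
    for i in range(16, 80):
        w.append(_rotl(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1))
    a, b, c, d, e = state
    for i in range(80):
        if i < 20:
            f, k = (b & c) | (~b & d), 0x5A827999
        elif i < 40:
            f, k = b ^ c ^ d, 0x6ED9EBA1
        elif i < 60:
            f, k = (b & c) | (b & d) | (c & d), 0x8F1BBCDC
        else:
            f, k = b ^ c ^ d, 0xCA62C1D6
        temp = (_rotl(a, 5) + f + e + k + w[i]) & 0xFFFFFFFF
        a, b, c, d, e = temp, a, _rotl(b, 30), c, d
    return tuple((s + v) & 0xFFFFFFFF for s, v in zip(state, (a, b, c, d, e)))


def sha1_padding(length_bytes):
    """Padding SHA-1 appends to a message of the given length in bytes."""
    zeros = (55 - length_bytes) % 64
    return b"\x80" + b"\x00" * zeros + struct.pack(">Q", length_bytes * 8)


def extend(digest, known_len, suffix):
    """From SHA1(m) and len(m), compute SHA1(m || padding || suffix)."""
    glue = sha1_padding(known_len)
    state = struct.unpack(">5I", digest)  # resume from the published digest
    total = known_len + len(glue) + len(suffix)
    data = suffix + sha1_padding(total)
    for off in range(0, len(data), 64):
        state = _compress(state, data[off:off + 64])
    return glue, struct.pack(">5I", *state)


key = b"secret-key"  # hypothetical; never used by the attacker's code path
message = b"amount=10"
tag = hashlib.sha1(key + message).digest()  # naive keyed hash SHA(key || message)

suffix = b"&amount=1000000"
glue, forged_tag = extend(tag, len(key) + len(message), suffix)
forged_message = message + glue + suffix

# The verifier, who does know the key, accepts the forged pair.
print(hashlib.sha1(key + forged_message).digest() == forged_tag)
```

The forged message contains the original padding bytes in the middle, which is why the attack works against SHA(key || message) but not against constructions that process the inner digest again, such as the hash-twice scheme above.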

SHA-0

At CRYPTO 98, two French researchers, Florent Chabaud and Antoine Joux, presented an attack on SHA-0: collisions can be found with complexity 2^61, fewer than the 2^80 for an ideal hash function of the same size.[31]

In 2004, Biham and Chen found near-collisions for SHA-0 – two messages that hash to nearly the same value; in this case, 142 out of the 160 bits are equal. They also found full collisions of SHA-0 reduced to 62 out of its 80 rounds.[32]

Subsequently, on 12 August 2004, a collision for the full SHA-0 algorithm was announced by Joux, Carribault, Lemuet, and Jalby. This was done by using a generalization of the Chabaud and Joux attack. Finding the collision had complexity 2^51 and took about 80,000 processor-hours on a supercomputer with 256 Itanium 2 processors (equivalent to 13 days of full-time use of the computer).

On 17 August 2004, at the Rump Session of CRYPTO 2004, preliminary results were announced by Wang, Feng, Lai, and Yu, about an attack on MD5, SHA-0 and other hash functions. The complexity of their attack on SHA-0 is 2^40, significantly better than the attack by Joux et al.[33][34]

In February 2005, an attack by Xiaoyun Wang, Yiqun Lisa Yin, and Hongbo Yu was announced which could find collisions in SHA-0 in 2^39 operations.[5][35]

Another attack in 2008 applying the boomerang attack brought the complexity of finding collisions down to 2^33.6, which was estimated to take 1 hour on an average PC from the year 2008.[36]

In light of the results for SHA-0, some experts suggested that plans for the use of SHA-1 in new cryptosystems should be reconsidered. After the CRYPTO 2004 results were published, NIST announced that they planned to phase out the use of SHA-1 by 2010 in favor of the SHA-2 variants.[37]

Attacks

In early 2005, Vincent Rijmen and Elisabeth Oswald published an attack on a reduced version of SHA-1 – 53 out of 80 rounds – which finds collisions with a computational effort of fewer than 2^80 operations.[38]

In February 2005, an attack by Xiaoyun Wang, Yiqun Lisa Yin, and Hongbo Yu was announced.[5] The attacks can find collisions in the full version of SHA-1, requiring fewer than 2^69 operations. (A brute-force search would require 2^80 operations.)

The authors write: "In particular, our analysis is built upon the original differential attack on SHA-0, the near collision attack on SHA-0, the multiblock collision techniques, as well as the message modification techniques used in the collision search attack on MD5. Breaking SHA-1 would not be possible without these powerful analytical techniques."[39] The authors have presented a collision for 58-round SHA-1, found with 2^33 hash operations. The paper with the full attack description was published in August 2005 at the CRYPTO conference.

In an interview, Yin states that, "Roughly, we exploit the following two weaknesses: One is that the file preprocessing step is not complicated enough; another is that certain math operations in the first 20 rounds have unexpected security problems."[40]

On 17 August 2005, an improvement on the SHA-1 attack was announced on behalf of Xiaoyun Wang, Andrew Yao and Frances Yao at the CRYPTO 2005 Rump Session, lowering the complexity required for finding a collision in SHA-1 to 2^63.[7] On 18 December 2007 the details of this result were explained and verified by Martin Cochran.[41]

Christophe De Cannière and Christian Rechberger further improved the attack on SHA-1 in "Finding SHA-1 Characteristics: General Results and Applications,"[42] receiving the Best Paper Award at ASIACRYPT 2006. A two-block collision for 64-round SHA-1 was presented, found using unoptimized methods with 2^35 compression function evaluations. Since this attack requires the equivalent of about 2^35 evaluations, it is considered to be a significant theoretical break.[43] Their attack was extended further to 73 rounds (of 80) in 2010 by Grechnikov.[44] In order to find an actual collision in the full 80 rounds of the hash function, however, tremendous amounts of computer time are required. To that end, a collision search for SHA-1 using the volunteer computing platform BOINC began August 8, 2007, organized by the Graz University of Technology. The effort was abandoned May 12, 2009 due to lack of progress.[45]

At the Rump Session of CRYPTO 2006, Christian Rechberger and Christophe De Cannière claimed to have discovered a collision attack on SHA-1 that would allow an attacker to select at least parts of the message.[46][47]

In 2008, an attack methodology by Stéphane Manuel reported hash collisions with an estimated theoretical complexity of 2^51 to 2^57 operations.[48] However, he later retracted that claim after finding that local collision paths were not actually independent, and ultimately cited as most efficient a collision vector that was already known before this work.[49]

Cameron McDonald, Philip Hawkes and Josef Pieprzyk presented a hash collision attack with claimed complexity 2^52 at the Rump Session of Eurocrypt 2009.[50] However, the accompanying paper, "Differential Path for SHA-1 with complexity O(2^52)" has been withdrawn due to the authors' discovery that their estimate was incorrect.[51]

One attack against SHA-1 was by Marc Stevens,[52] with an estimated cost of $2.77M (2012) to break a single hash value by renting CPU power from cloud servers.[53] Stevens developed this attack in a project called HashClash,[54] implementing a differential path attack. On 8 November 2010, he claimed he had a fully working near-collision attack against full SHA-1, with an estimated complexity equivalent to 2^57.5 SHA-1 compressions. He estimated this attack could be extended to a full collision with a complexity around 2^61.

The SHAppening

On 8 October 2015, Marc Stevens, Pierre Karpman, and Thomas Peyrin published a freestart collision attack on SHA-1's compression function that requires only 2^57 SHA-1 evaluations. This does not directly translate into a collision on the full SHA-1 hash function (where an attacker is not able to freely choose the initial internal state), but undermines the security claims for SHA-1. In particular, it was the first time that an attack on full SHA-1 had been demonstrated; all earlier attacks were too expensive for their authors to carry them out. The authors named this significant breakthrough in the cryptanalysis of SHA-1 The SHAppening.[10]

The method was based on their earlier work, as well as the auxiliary paths (or boomerangs) speed-up technique from Joux and Peyrin, and using high performance/cost efficient GPU cards from Nvidia. The collision was found on a 16-node cluster with a total of 64 graphics cards. The authors estimated that a similar collision could be found by buying US$2,000 of GPU time on EC2.[10]

The authors estimated that the cost of renting enough of EC2 CPU/GPU time to generate a full collision for SHA-1 at the time of publication was between US$75K and $120K, and noted that was well within the budget of criminal organizations, not to mention national intelligence agencies. As such, the authors recommended that SHA-1 be deprecated as quickly as possible.[10]

SHAttered – first public collision

On 23 February 2017, the CWI (Centrum Wiskunde & Informatica) and Google announced the SHAttered attack, in which they generated two different PDF files with the same SHA-1 hash in roughly 2^63.1 SHA-1 evaluations. This attack is about 100,000 times faster than brute forcing a SHA-1 collision with a birthday attack, which was estimated to take 2^80 SHA-1 evaluations. The attack required "the equivalent processing power of 6,500 years of single-CPU computations and 110 years of single-GPU computations".[2]
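
The "about 100,000 times faster" figure follows from the two exponents quoted above. A minimal sketch of the arithmetic (the helper name `speedup` is hypothetical):

```python
def speedup(log2_old: float, log2_new: float) -> float:
    """Ratio of work between two attacks given as log2 of their evaluation counts."""
    return 2.0 ** (log2_old - log2_new)


# Birthday attack (2^80) versus SHAttered (2^63.1): a factor of 2^16.9.
shattered = speedup(80.0, 63.1)
print(f"{shattered:,.0f}")  # on the order of 10^5
```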

Birthday-Near-Collision Attack – first practical chosen-prefix attack

On 24 April 2019 a paper by Gaëtan Leurent and Thomas Peyrin presented at Eurocrypt 2019 described an enhancement to the previously best chosen-prefix attack in Merkle–Damgård–like digest functions based on Davies–Meyer block ciphers. With these improvements, this method is capable of finding chosen-prefix collisions in approximately 2^68 SHA-1 evaluations, compared with 2^77.1 evaluations for the previous attack,[1] an improvement by a factor of about 2^9.1 (roughly 550). Because the attacker can choose the prefix, for example malicious code or faked identities in signed certificates, the collisions are usable for many targeted attacks, whereas collisions found without a chosen prefix are almost random and impractical for most targeted attacks. The attack is fast enough to be practical for resourceful attackers, requiring approximately $100,000 of cloud processing. This method is also capable of finding chosen-prefix collisions in the MD5 function, but at a complexity of 2^46.3 does not surpass the prior best available method at a theoretical level (2^39), though potentially at a practical level (≤ 2^49).[55] This attack has a memory requirement of 500+ GB.

On 5 January 2020 the authors published an improved attack called "shambles".[8] In this paper they demonstrate a chosen-prefix collision attack with a complexity of 2^63.4, that at the time of publication would cost US$45K per generated collision.

Official validation

Implementations of all FIPS-approved security functions can be officially validated through the CMVP program, jointly run by the National Institute of Standards and Technology (NIST) and the Communications Security Establishment (CSE). For informal verification, a package to generate a high number of test vectors is made available for download on the NIST site; the resulting verification, however, does not replace the formal CMVP validation, which is required by law for certain applications.

As of December 2013, there were over 2000 validated implementations of SHA-1, with 14 of them capable of handling messages with a length in bits not a multiple of eight (see SHS Validation List).

Examples and pseudocode

Example hashes

These are examples of SHA-1 message digests in hexadecimal and in Base64 binary to ASCII text encoding.

  • SHA1("The quick brown fox jumps over the lazy dog")
    • Outputted hexadecimal: 2fd4e1c67a2d28fced849ee1bb76e7391b93eb12
    • Outputted Base64 binary to ASCII text encoding: L9ThxnotKPzthJ7hu3bnORuT6xI=

Even a small change in the message will, with overwhelming probability, result in many bits changing due to the avalanche effect. For example, changing dog to cog produces a hash with different values for 81 of the 160 bits:

  • SHA1("The quick brown fox jumps over the lazy cog")
    • Outputted hexadecimal: de9f2c7fd25e1b3afad3e85a0bd17d9b100db4b3
    • Outputted Base64 binary to ASCII text encoding: 3p8sf9JeGzr60+haC9F9mxANtLM=

The hash of the zero-length string is:

  • SHA1("")
    • Outputted hexadecimal: da39a3ee5e6b4b0d3255bfef95601890afd80709
    • Outputted Base64 binary to ASCII text encoding: 2jmj7l5rSw0yVb/vlWAYkK/YBwk=
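
The digests above can be reproduced with Python's standard `hashlib` and `base64` modules. This is a minimal sketch; the helper names are hypothetical, and the bit-difference count is printed rather than assumed.

```python
import base64
import hashlib


def sha1_hex_and_b64(text: str):
    """Return the SHA-1 digest of an ASCII string as (hex, Base64)."""
    digest = hashlib.sha1(text.encode("ascii")).digest()
    return digest.hex(), base64.b64encode(digest).decode("ascii")


def differing_bits(a: str, b: str) -> int:
    """Count the bit positions in which the SHA-1 digests of a and b differ."""
    da = int.from_bytes(hashlib.sha1(a.encode("ascii")).digest(), "big")
    db = int.from_bytes(hashlib.sha1(b.encode("ascii")).digest(), "big")
    return bin(da ^ db).count("1")


dog = "The quick brown fox jumps over the lazy dog"
cog = "The quick brown fox jumps over the lazy cog"

for text in (dog, cog, ""):
    print(repr(text), *sha1_hex_and_b64(text))

print("bits changed by dog -> cog:", differing_bits(dog, cog))
```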

SHA-1 pseudocode

Pseudocode for the SHA-1 algorithm follows:

Note 1: All variables are unsigned 32-bit quantities and wrap modulo 2^32 when calculating, except for
        ml, the message length, which is a 64-bit quantity, and
        hh, the message digest, which is a 160-bit quantity.
Note 2: All constants in this pseudo code are in big endian.
        Within each word, the most significant byte is stored in the leftmost byte position

Initialize variables:

h0 = 0x67452301
h1 = 0xEFCDAB89
h2 = 0x98BADCFE
h3 = 0x10325476
h4 = 0xC3D2E1F0

ml = message length in bits (always a multiple of the number of bits in a character).

Pre-processing:
append the bit '1' to the message e.g. by adding 0x80 if message length is a multiple of 8 bits.
append 0 ≤ k < 512 bits '0', such that the resulting message length in bits
   is congruent to −64 ≡ 448 (mod 512)
append ml, the original message length in bits, as a 64-bit big-endian integer. 
   Thus, the total length is a multiple of 512 bits.

Process the message in successive 512-bit chunks:
break message into 512-bit chunks
for each chunk
    break chunk into sixteen 32-bit big-endian words w[i], 0 ≤ i ≤ 15

    Message schedule: extend the sixteen 32-bit words into eighty 32-bit words:
    for i from 16 to 79
        Note 3: SHA-0 differs by not having this leftrotate.
        w[i] = (w[i-3] xor w[i-8] xor w[i-14] xor w[i-16]) leftrotate 1

    Initialize hash value for this chunk:
    a = h0
    b = h1
    c = h2
    d = h3
    e = h4

    Main loop:[3][56]
    for i from 0 to 79
        if 0 ≤ i ≤ 19 then
            f = (b and c) or ((not b) and d)
            k = 0x5A827999
        else if 20 ≤ i ≤ 39
            f = b xor c xor d
            k = 0x6ED9EBA1
        else if 40 ≤ i ≤ 59
            f = (b and c) or (b and d) or (c and d) 
            k = 0x8F1BBCDC
        else if 60 ≤ i ≤ 79
            f = b xor c xor d
            k = 0xCA62C1D6

        temp = (a leftrotate 5) + f + e + k + w[i]
        e = d
        d = c
        c = b leftrotate 30
        b = a
        a = temp

    Add this chunk's hash to result so far:
    h0 = h0 + a
    h1 = h1 + b 
    h2 = h2 + c
    h3 = h3 + d
    h4 = h4 + e

Produce the final hash value (big-endian) as a 160-bit number:
hh = (h0 leftshift 128) or (h1 leftshift 96) or (h2 leftshift 64) or (h3 leftshift 32) or h4

The number hh is the message digest, which can be written in hexadecimal (base 16).
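
The pseudocode translates almost line for line into Python. The following is a teaching sketch for byte-aligned messages, not an optimized or hardened implementation; it is checked against `hashlib.sha1`, and the names `rotl` and `sha1` are chosen here for illustration.

```python
import hashlib
import struct


def rotl(x: int, n: int) -> int:
    """Rotate a 32-bit word left by n places."""
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF


def sha1(message: bytes) -> bytes:
    h = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0]

    # Pre-processing: append bit '1' (as byte 0x80), zero bits up to
    # 448 mod 512, then the message length in bits as a 64-bit big-endian integer.
    ml = len(message) * 8
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded)) % 64)
    padded += struct.pack(">Q", ml)

    for off in range(0, len(padded), 64):
        # Sixteen big-endian words, extended to eighty by the message schedule.
        w = list(struct.unpack(">16I", padded[off:off + 64]))
        for i in range(16, 80):
            w.append(rotl(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1))

        a, b, c, d, e = h
        for i in range(80):
            if i < 20:
                f, k = (b & c) | (~b & d), 0x5A827999
            elif i < 40:
                f, k = b ^ c ^ d, 0x6ED9EBA1
            elif i < 60:
                f, k = (b & c) | (b & d) | (c & d), 0x8F1BBCDC
            else:
                f, k = b ^ c ^ d, 0xCA62C1D6
            temp = (rotl(a, 5) + f + e + k + w[i]) & 0xFFFFFFFF
            a, b, c, d, e = temp, a, rotl(b, 30), c, d

        # Add this chunk's result to the running hash, modulo 2^32.
        h = [(x + y) & 0xFFFFFFFF for x, y in zip(h, (a, b, c, d, e))]

    return struct.pack(">5I", *h)


for sample in (b"", b"abc", b"a" * 1000):
    print(sha1(sample).hex(), sha1(sample) == hashlib.sha1(sample).digest())
```

In practice a library implementation such as `hashlib.sha1` should be used; the sketch only makes the data flow of the pseudocode concrete.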

The chosen constant values used in the algorithm were assumed to be nothing up my sleeve numbers:

  • The four round constants k are 2^30 times the square roots of 2, 3, 5 and 10. However, they were truncated to an integer instead of being rounded to the nearest odd integer, with equilibrated proportions of zero and one bits. In addition, because 10 is not a prime, its square root is the product of two of the other chosen values, the square roots of the primes 2 and 5, with possibly usable arithmetic properties across successive rounds, reducing the strength of the algorithm against finding collisions on some bits.
  • The first four starting values for h0 through h3 are the same as in the MD5 algorithm, and the fifth (for h4) is similar. However, they were not properly verified for being resistant against inversion of the first few rounds to infer possible collisions on some bits, usable by multiblock differential attacks.
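
The relationship between the round constants and the square roots can be checked with exact integer arithmetic. The sketch below assumes the constants are the integer parts of 2^30·√n, computed as the integer square root of n·2^60; the helper name `round_constant` is hypothetical.

```python
from math import isqrt

# Round constants from the pseudocode, keyed by the number whose root is taken.
K = {2: 0x5A827999, 3: 0x6ED9EBA1, 5: 0x8F1BBCDC, 10: 0xCA62C1D6}


def round_constant(n: int) -> int:
    """floor(2**30 * sqrt(n)), computed exactly as isqrt(n * 2**60)."""
    return isqrt(n << 60)


for n, k in K.items():
    print(n, hex(round_constant(n)), hex(k), round_constant(n) == k)
```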

Instead of the formulation from the original FIPS PUB 180-1 shown, the following equivalent expressions may be used to compute f in the main loop above:

Bitwise choice between c and d, controlled by b.
(0  ≤ i ≤ 19): f = d xor (b and (c xor d))                (alternative 1)
(0  ≤ i ≤ 19): f = (b and c) or ((not b) and d)           (alternative 2)
(0  ≤ i ≤ 19): f = (b and c) xor ((not b) and d)          (alternative 3)
(0  ≤ i ≤ 19): f = vec_sel(d, c, b)                       (alternative 4)
 [premo08]
Bitwise majority function.
(40 ≤ i ≤ 59): f = (b and c) or (d and (b or c))          (alternative 1)
(40 ≤ i ≤ 59): f = (b and c) or (d and (b xor c))         (alternative 2)
(40 ≤ i ≤ 59): f = (b and c) xor (d and (b xor c))        (alternative 3)
(40 ≤ i ≤ 59): f = (b and c) xor (b and d) xor (c and d)  (alternative 4)
(40 ≤ i ≤ 59): f = vec_sel(c, b, c xor d)                 (alternative 5)
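
Because every operation involved is bitwise, the equivalence of these alternatives can be checked exhaustively on single-bit inputs. In the sketch below, `vec_sel(a, b, mask)` is modelled as "take bits from b where mask is 1, from a elsewhere", which is an assumption about the intrinsic's semantics; the list and helper names are hypothetical.

```python
from itertools import product


def vec_sel(a, b, mask):
    # Assumed semantics: bits of b where mask is 1, bits of a where mask is 0.
    return (a & ~mask) | (b & mask)


CHOICE = [
    lambda b, c, d: d ^ (b & (c ^ d)),
    lambda b, c, d: (b & c) | (~b & d),
    lambda b, c, d: (b & c) ^ (~b & d),
    lambda b, c, d: vec_sel(d, c, b),
]

MAJORITY = [
    lambda b, c, d: (b & c) | (d & (b | c)),
    lambda b, c, d: (b & c) | (d & (b ^ c)),
    lambda b, c, d: (b & c) ^ (d & (b ^ c)),
    lambda b, c, d: (b & c) ^ (b & d) ^ (c & d),
    lambda b, c, d: vec_sel(c, b, c ^ d),
]


def matches(candidates, reference) -> bool:
    """True if every candidate equals the reference on all 8 single-bit inputs."""
    return all(
        (f(b, c, d) & 1) == reference(b, c, d)
        for f in candidates
        for b, c, d in product((0, 1), repeat=3)
    )


print(matches(CHOICE, lambda b, c, d: c if b else d))             # True
print(matches(MAJORITY, lambda b, c, d: int(b + c + d >= 2)))     # True
```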

It was also shown[57] that for the rounds 32–79 the computation of:

w[i] = (w[i-3] xor w[i-8] xor w[i-14] xor w[i-16]) leftrotate 1

can be replaced with:

w[i] = (w[i-6] xor w[i-16] xor w[i-28] xor w[i-32]) leftrotate 2

This transformation keeps all operands 64-bit aligned and, by removing the dependency of w[i] on w[i-3], allows efficient SIMD implementation with a vector length of 4 like x86 SSE instructions.
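
The equivalence of the two message-schedule recurrences for rounds 32–79 can be checked numerically. This sketch compares both schedules on random 16-word blocks; it illustrates the identity only and does not use SIMD. The function names are hypothetical.

```python
import random


def rotl(x: int, n: int) -> int:
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF


def schedule_standard(block):
    """Message schedule as in the pseudocode, for all rounds 16..79."""
    w = list(block)
    for i in range(16, 80):
        w.append(rotl(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1))
    return w


def schedule_alternative(block):
    """Same schedule, using the rotate-by-2 recurrence from round 32 on."""
    w = list(block)
    for i in range(16, 32):
        w.append(rotl(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1))
    for i in range(32, 80):
        w.append(rotl(w[i - 6] ^ w[i - 16] ^ w[i - 28] ^ w[i - 32], 2))
    return w


rng = random.Random(0)
blocks = [[rng.getrandbits(32) for _ in range(16)] for _ in range(100)]
print(all(schedule_standard(b) == schedule_alternative(b) for b in blocks))
```

The identity holds because substituting the original recurrence into each of its four operands cancels all intermediate terms in pairs, leaving four words each rotated twice by one place.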

Comparison of SHA functions

In the table below, internal state means the "internal hash sum" after each compression of a data block.

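Comparison of SHA functions

Performance figures are median cycles per byte (cpb) on Skylake.[58]

| Algorithm and variant | Output size (bits) | Internal state size (bits) | Block size (bits) | Rounds | Operations | Security (bits) | Performance, long messages (cpb) | Performance, 8 bytes (cpb) | First published |
|---|---|---|---|---|---|---|---|---|---|
| MD5 (as reference) | 128 | 128 (4 × 32) | 512 | 4 (16 operations in each round) | And, Xor, Or, Rot, Add (mod 2^32) | ≤ 18 (collisions found)[59] | 4.99 | 55.00 | 1992 |
| SHA-0 | 160 | 160 (5 × 32) | 512 | 80 | And, Xor, Or, Rot, Add (mod 2^32) | < 34 (collisions found) | ≈ SHA-1 | ≈ SHA-1 | 1993 |
| SHA-1 | 160 | 160 (5 × 32) | 512 | 80 | And, Xor, Or, Rot, Add (mod 2^32) | < 63 (collisions found)[60] | 3.47 | 52.00 | 1995 |
| SHA-2: SHA-224 | 224 | 256 (8 × 32) | 512 | 64 | And, Xor, Or, Rot, Shr, Add (mod 2^32) | 112 | 7.62 | 84.50 | 2004 |
| SHA-2: SHA-256 | 256 | 256 (8 × 32) | 512 | 64 | And, Xor, Or, Rot, Shr, Add (mod 2^32) | 128 | 7.63 | 85.25 | 2001 |
| SHA-2: SHA-384 | 384 | 512 (8 × 64) | 1024 | 80 | And, Xor, Or, Rot, Shr, Add (mod 2^64) | 192 | 5.12 | 135.75 | 2001 |
| SHA-2: SHA-512 | 512 | 512 (8 × 64) | 1024 | 80 | And, Xor, Or, Rot, Shr, Add (mod 2^64) | 256 | 5.06 | 135.50 | 2001 |
| SHA-2: SHA-512/224 | 224 | 512 (8 × 64) | 1024 | 80 | And, Xor, Or, Rot, Shr, Add (mod 2^64) | 112 | ≈ SHA-384 | ≈ SHA-384 | 2012 |
| SHA-2: SHA-512/256 | 256 | 512 (8 × 64) | 1024 | 80 | And, Xor, Or, Rot, Shr, Add (mod 2^64) | 128 | ≈ SHA-384 | ≈ SHA-384 | 2012 |
| SHA-3: SHA3-224 | 224 | 1600 (5 × 5 × 64) | 1152 | 24[61] | And, Xor, Rot, Not | 112 | 8.12 | 154.25 | 2015 |
| SHA-3: SHA3-256 | 256 | 1600 (5 × 5 × 64) | 1088 | 24[61] | And, Xor, Rot, Not | 128 | 8.59 | 155.50 | 2015 |
| SHA-3: SHA3-384 | 384 | 1600 (5 × 5 × 64) | 832 | 24[61] | And, Xor, Rot, Not | 192 | 11.06 | 164.00 | 2015 |
| SHA-3: SHA3-512 | 512 | 1600 (5 × 5 × 64) | 576 | 24[61] | And, Xor, Rot, Not | 256 | 15.88 | 164.00 | 2015 |
| SHA-3: SHAKE128 | d (arbitrary) | 1600 (5 × 5 × 64) | 1344 | 24[61] | And, Xor, Rot, Not | min(d/2, 128) | 7.08 | 155.25 | 2015 |
| SHA-3: SHAKE256 | d (arbitrary) | 1600 (5 × 5 × 64) | 1088 | 24[61] | And, Xor, Rot, Not | min(d/2, 256) | 8.59 | 155.50 | 2015 |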

Implementations

Below is a list of cryptography libraries that support SHA-1:

Hardware acceleration is provided by the following processor extensions:

Collision countermeasure

In the wake of SHAttered, Marc Stevens and Dan Shumow published "sha1collisiondetection" (SHA-1CD), a variant of SHA-1 that detects collision attacks and changes the hash output when one is detected. The false positive rate is 2^−90.[63] SHA-1CD has been used by GitHub since March 2017 and by git since version 2.13.0 of May 2017.[64]

from Grokipedia
SHA-1 (Secure Hash Algorithm 1) is a cryptographic hash function that takes an input message of any length less than 2^64 bits and produces a fixed 160-bit (20-byte) hash value, known as a message digest, typically expressed as 40 hexadecimal digits. Developed by the National Security Agency (NSA), it was first published by the National Institute of Standards and Technology (NIST) on April 17, 1995, as Federal Information Processing Standard (FIPS) PUB 180-1, superseding an earlier version from 1993. The algorithm processes the input data in 512-bit blocks after padding, performing 80 rounds of bitwise operations, modular additions, and rotations on five 32-bit words to generate the digest, making it suitable for applications requiring data integrity and authenticity, such as digital signatures with the Digital Signature Algorithm (DSA).

SHA-1 follows design principles similar to those of Ron Rivest's MD4 and MD5, and differs from its 1993 predecessor by a one-bit rotation in the message schedule. It quickly became a cornerstone of cryptographic protocols, adopted in TLS/SSL for certificate validation, in SSH, PGP, S/MIME, and IPsec, and in tools such as Git for integrity checks. By the early 2000s, SHA-1 was among the most widely used hash functions, and it was carried through the FIPS 180 updates up to version 4 in 2012, though its security properties were increasingly scrutinized.

Cryptanalytic advances revealed vulnerabilities in SHA-1, starting with theoretical collision attacks in 2005 that reduced its effective security below 80 bits, prompting NIST to deprecate its use for digital signature generation in 2011 and disallow it for that purpose after 2013, while allowing continued use in other applications until further transitions. A landmark practical collision was demonstrated in February 2017 by researchers from Google and CWI Amsterdam, who generated two different PDF files with identical SHA-1 hashes using significant computational resources, confirming the algorithm's practical break.

In response, NIST announced the full retirement of SHA-1 in December 2022, mandating its phase-out by the end of 2030 in all remaining legacy applications, such as hash-based message authentication, and recommending migration to the more secure SHA-2 and SHA-3 families. Despite its obsolescence, SHA-1 persists in some non-critical legacy systems, underscoring the importance of timely cryptographic updates.

History

Origins and Development

SHA-1, or Secure Hash Algorithm 1, was developed by the National Security Agency (NSA) as part of the U.S. Government's Capstone project to establish robust cryptographic standards for federal use. The algorithm's design drew principles from Ronald L. Rivest's MD4 message-digest algorithm, aiming to create a more secure hashing function modeled after MD4 and its successor MD5. While specific individual designers are not publicly attributed, the effort was led by the NSA with significant input from the National Institute of Standards and Technology (NIST) to ensure compatibility with emerging requirements.

The primary design goal of SHA-1 was to produce a 160-bit message digest, providing enhanced collision resistance compared to the 128-bit output of MD5 by making it computationally infeasible to find two distinct messages with the same hash value. This longer digest length was intended to support secure applications such as digital signatures, where even minor message alterations should be detectable with high probability.

NIST formalized SHA-1 through the Secure Hash Standard, publishing it as Federal Information Processing Standard (FIPS) PUB 180-1 on April 17, 1995, with an effective date of October 2, 1995. SHA-1 saw initial adoption as a core component of the Digital Signature Standard (DSS), specified in FIPS PUB 186, where it is required for use with the Digital Signature Algorithm (DSA) to generate and verify signatures. This integration positioned SHA-1 as a foundational element in U.S. federal cryptography protocols from the mid-1990s onward, superseding the earlier SHA specification in FIPS PUB 180 from 1993.

Relation to SHA-0

SHA-0 was initially developed by the National Security Agency (NSA) and announced by the National Institute of Standards and Technology (NIST) in April 1993 as a draft version of the Secure Hash Standard (SHS), intended to succeed earlier message-digest algorithms in which weaknesses were emerging. However, shortly after its publication in May 1993, the NSA identified an undisclosed weakness in SHA-0 and requested that NIST withdraw it, limiting its distribution and preventing widespread release or adoption. This precursor version remained largely undocumented publicly, with only limited details emerging later through reverse-engineering and analyses.

To address the flaw, the NSA modified SHA-0 to produce SHA-1, which was subsequently published in April 1995 as FIPS PUB 180-1. The key change involved altering the message schedule in the compression function by introducing a left circular shift (rotation) of one bit on each expanded word, effectively changing the rotation constant from 0 in SHA-0 to 1 in SHA-1; this adjustment was explicitly stated to correct the identified weakness without altering the overall structure or output size. All other aspects of the algorithm, including the 160-bit output and the 80-round compression process, remained the same.

Public understanding of SHA-0's specific vulnerabilities was limited until analytical work began in the late 1990s, as the NSA never disclosed details of the original flaw. The first published collision attack on full SHA-0 appeared in 1998, demonstrating that collisions could be found with approximately 2^{61} operations using differential cryptanalysis techniques. This work by Chabaud and Joux highlighted structural differences that made SHA-0 more susceptible than SHA-1 to such attacks, though it did not directly reveal the NSA's undisclosed issue.

As a result, SHA-1 was positioned by NIST as the secure, corrected version of the algorithm, suitable for federal use and in cryptographic protocols, while SHA-0 was effectively abandoned and never formalized in any subsequent FIPS publication. This transition underscored early efforts to balance rapid deployment with rigorous validation in standards development.

Algorithm Description

Input Preparation

The input preparation phase of SHA-1 transforms an arbitrary-length message into a sequence of fixed-size blocks suitable for processing by the hash function. This involves padding the message to ensure its length is a multiple of 512 bits, allowing it to be divided into 512-bit (64-byte) blocks, with each block subsequently processed in 80 rounds during the compression phase.

The padding rule begins by appending a single '1' bit to the message, followed by a sequence of zero bits. The number of zero bits, denoted $k$, is the smallest non-negative integer such that the total length after appending the '1' bit and $k$ zeros satisfies $(\lambda + 1 + k) \equiv 448 \pmod{512}$, where $\lambda$ is the original message length in bits. This ensures 64 bits remain for the length field. Following the padding bits, the 64-bit binary representation of the original message length $\lambda$ (in big-endian byte order) is appended. SHA-1 supports messages up to $2^{64} - 1$ bits in length, and the resulting padded message length is always a multiple of 512 bits.

Prior to processing the blocks, SHA-1 initializes five 32-bit registers, $H_0$ through $H_4$, with specific hexadecimal constants:
$H_0$ = 0x67452301,
$H_1$ = 0xefcdab89,
$H_2$ = 0x98badcfe,
$H_3$ = 0x10325476,
$H_4$ = 0xc3d2e1f0.
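These values follow simple byte-counting patterns rather than being derived from square roots: stored least-significant byte first, the first four registers spell out 01 23 45 67 89 ab cd ef fe dc ba 98 76 54 32 10, the same initial values as MD5, and the fifth continues with f0 e1 d2 c3.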
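
The padding rule can be sketched in a few lines. The code below assumes byte-aligned messages, in which case the '1' bit and the first seven zero bits together form the byte 0x80; the function names `zero_pad_bits` and `pad` are hypothetical.

```python
import struct


def zero_pad_bits(length_bits: int) -> int:
    """Smallest k >= 0 such that (length_bits + 1 + k) is 448 mod 512."""
    return (448 - (length_bits + 1)) % 512


def pad(message: bytes) -> bytes:
    """Return the message with SHA-1 padding appended (byte-aligned input only)."""
    lam = len(message) * 8
    k = zero_pad_bits(lam)
    # 0x80 carries the '1' bit plus seven zero bits; the remaining k - 7
    # zero bits are whole bytes because lam is a multiple of 8.
    zero_bytes = (k - 7) // 8
    return message + b"\x80" + b"\x00" * zero_bytes + struct.pack(">Q", lam)


for msg in (b"", b"abc", b"a" * 55, b"a" * 56, b"a" * 64):
    padded = pad(msg)
    print(len(msg), zero_pad_bits(len(msg) * 8), len(padded) // 64)
```

A 55-byte message still fits in one block, while a 56-byte message leaves no room for the 64-bit length field and spills into a second block.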