Recent from talks
Contribute something
Nothing was collected or created yet.
| General | |
|---|---|
| Designers | Ronald Rivest |
| First published | October 1990[1] |
| Series | MD2, MD4, MD5, MD6 |
| Cipher detail | |
| Digest sizes | 128 bits |
| Block sizes | 512 bits |
| Rounds | 3 |
| Best public cryptanalysis | |
| A collision attack published in 2007 can find collisions for full MD4 in less than two hash operations.[2] | |
The MD4 Message-Digest Algorithm is a cryptographic hash function developed by Ronald Rivest in 1990.[3] The digest length is 128 bits. The algorithm has influenced later designs, such as the MD5, SHA-1 and RIPEMD algorithms. The initialism "MD" stands for "Message Digest".

The security of MD4 has been severely compromised. The first full collision attack against MD4 was published in 1995, and several newer attacks have been published since then. As of 2007, an attack can generate collisions in less than two MD4 hash operations.[2] A theoretical preimage attack also exists.
A variant of MD4 is used in the ed2k URI scheme to provide a unique identifier for a file in the popular eDonkey2000 / eMule P2P networks. MD4 was also used by the rsync protocol (prior to version 3.0.0).
MD4 is used to compute NTLM password-derived key digests on Microsoft Windows NT, XP, Vista, 7, 8, 10 and 11.[4]
Security
[edit]Weaknesses in MD4 were demonstrated by Den Boer and Bosselaers in a paper published in 1991.[5] The first full-round MD4 collision attack was found by Hans Dobbertin in 1995, which took only seconds to carry out at that time.[6] In August 2004, Wang et al. found a very efficient collision attack, alongside attacks on later hash function designs in the MD4/MD5/SHA-1/RIPEMD family. This result was improved later by Sasaki et al., and generating a collision is now as cheap as verifying it (a few microseconds).[2]
In 2008, the preimage resistance of MD4 was also broken by Gaëtan Leurent, with a 2102 attack.[7] In 2010 Guo et al published a 299.7 attack.[8]
In 2011, RFC 6150 stated that RFC 1320 (MD4) is historic (obsolete).
MD4 hashes
[edit]The 128-bit (16-byte) MD4 hashes (also termed message digests) are typically represented as 32-digit hexadecimal numbers. The following demonstrates a 43-byte ASCII input and the corresponding MD4 hash:
MD4("The quick brown fox jumps over the lazy dog")
= 1bee69a46ba811185c194762abaeae90
Even a small change in the message will (with overwhelming probability) result in a completely different hash, e.g. changing d to c:
MD4("The quick brown fox jumps over the lazy cog")
= b86e130ce7028da59e672d56ad0113df
The hash of the zero-length string is:
MD4("") = 31d6cfe0d16ae931b73c59d7e0c089c0
MD4 test vectors
[edit]The following test vectors are defined in RFC 1320 (The MD4 Message-Digest Algorithm)
MD4 ("") = 31d6cfe0d16ae931b73c59d7e0c089c0
MD4 ("a") = bde52cb31de33e46245e05fbdbd6fb24
MD4 ("abc") = a448017aaf21d8525fc10ae87aa6729d
MD4 ("message digest") = d9130a8164549fe818874806e1c7014b
MD4 ("abcdefghijklmnopqrstuvwxyz") = d79e1c308aa5bbcdeea8ed63df412da9
MD4 ("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789") = 043f8582f241db351ce627e153e7f0e4
MD4 ("12345678901234567890123456789012345678901234567890123456789012345678901234567890") = e33b4ddc9c38f2199c3e7b164fcc0536
MD4 collision example
[edit]Let:
k1 = 839c7a4d7a92cb5678a5d5b9eea5a7573c8a74deb366c3dc20a083b69f5d2a3bb3719dc69891e9f95e809fd7e8b23ba6318edd45e51fe39708bf9427e9c3e8b9 k2 = 839c7a4d7a92cbd678a5d529eea5a7573c8a74deb366c3dc20a083b69f5d2a3bb3719dc69891e9f95e809fd7e8b23ba6318edc45e51fe39708bf9427e9c3e8b9
MD4(k1) = MD4(k2) = 4d7e6a1defa93d2dde05b45d864c429b
Note that two hex-digits of k1 and k2 define one byte of the input string, whose length is 64 bytes .
See also
[edit]References
[edit]- Bert den Boer, Antoon Bosselaers: An Attack on the Last Two Rounds of MD4. Crypto 1991: 194–203
- Hans Dobbertin: Cryptanalysis of MD4. Fast Software Encryption 1996: 53–69
- Hans Dobbertin, 1998. Cryptanalysis of MD4. J. Cryptology 11(4): 253–271
- Xiaoyun Wang, Xuejia Lai, Dengguo Feng, Hui Chen, Xiuyuan Yu: Cryptanalysis of the Hash Functions MD4 and RIPEMD. Eurocrypt 2005: 1–18
- Yu Sasaki, Lei Wang, Kazuo Ohta, Noboru Kunihiro: New Message Difference for MD4. Fast Software Encryption 2007: 329–348
- ^ Rivest, Ronald L. (October 1990). "The MD4 Message Digest Algorithm". Network Working Group. Retrieved 2011-04-29.
- ^ a b c Yu Sasaki; et al. (2007). "New message difference for MD4" (PDF).
{{cite journal}}: Cite journal requires|journal=(help) - ^ "What are MD2, MD4, and MD5?". Public-Key Cryptography Standards (PKCS): PKCS #7: Cryptographic Message Syntax Standard: 3.6 Other Cryptographic Techniques: 3.6.6 What are MD2, MD4, and MD5?. RSA Laboratories. Archived from the original on 2011-09-01. Retrieved 2011-04-29.
- ^ "5.1 Security Considerations for Implementors". Retrieved 2011-07-21.
Deriving a key from a password is as specified in [RFC1320] and [FIPS46-2].
- ^ Bert den Boer, Antoon Bosselaers (1991). "An Attack on the Last Two Rounds of MD4" (PDF). Archived from the original (PDF) on 2003-05-23.
{{cite journal}}: Cite journal requires|journal=(help) - ^ Hans Dobbertin (1995-10-23). "Cryptanalysis of MD4". Journal of Cryptology. 11 (4): 253–271. doi:10.1007/s001459900047. S2CID 7462235.
- ^ Gaëtan Leurent (2008-02-10). "MD4 is Not One-Way" (PDF). Fast Software Encryption. Lecture Notes in Computer Science. Vol. 5086. Springer. pp. 412–428. doi:10.1007/978-3-540-71039-4_26. ISBN 978-3-540-71038-7. Archived from the original (PDF) on 2011-06-11.
{{cite book}}:|journal=ignored (help) - ^ Guo, Jian; Ling, San; Rechberger, Christian; Wang, Huaxiong (2010). "Advanced Meet-in-the-Middle Preimage Attacks: First Results on Full Tiger, and Improved Results on MD4 and SHA-2". Advances in Cryptology - ASIACRYPT 2010. Lecture Notes in Computer Science. Vol. 6477. pp. 56–75. doi:10.1007/978-3-642-17373-8_4. hdl:10356/94168. ISBN 978-3-642-17372-1.
External links
[edit]- RFC 1320 - Description of MD4 by Ron Rivest
- RFC 6150 - MD4 to Historic Status
- Rivest, Ronald (1991). "The MD4 Message Digest Algorithm". Advances in Cryptology-CRYPT0' 90. Lecture Notes in Computer Science. Vol. 537. Springer Berlin / Heidelberg. pp. 303–311. doi:10.1007/3-540-38424-3_22. ISBN 978-3-540-54508-8.
Collision attacks
[edit]Overview and History
Description and Purpose
MD4 is a cryptographic hash function developed by Ronald Rivest in 1990.[2] It takes an input message of arbitrary length and produces a fixed 128-bit output, typically represented as a 32-character hexadecimal digest.[2] The primary purpose of MD4 is to provide a secure and efficient method for compressing messages, particularly for use in digital signature applications where large files must be "compressed" before encryption with a private key.[2] It was designed to support data integrity checks by generating a unique fingerprint that verifies whether a message has been altered.[2] Optimized for fast computation on 32-bit processors, MD4 achieves high throughput, such as approximately 1,450,000 bytes per second on a SUN Sparc workstation.[2] MD4 processes input messages in 512-bit blocks and maintains an internal state consisting of four 32-bit words, labeled A, B, C, and D.[2] These state variables are initialized to fixed constants: A = 0x67452301, B = 0xefcdab89, C = 0x98badcfe, and D = 0x10325476.[2] Compared to its predecessor MD2, which was optimized for 8-bit processors, MD4 offers significant improvements in speed and simplicity while maintaining the 128-bit output size and targeting enhanced security against collision attacks at the time of its release.[2] Rivest later developed MD5 in 1991 as a strengthened successor to address emerging concerns with MD4.[4]Development and Key Milestones
MD4 was developed by Ronald L. Rivest, a prominent cryptographer known for co-inventing the RSA public-key cryptosystem in 1977 alongside Adi Shamir and Leonard Adleman, and for creating the MD2 message-digest algorithm in 1989 as an early cryptographic hash function optimized for 8-bit processors.[5][6] Building on these foundations, Rivest designed MD4 in late 1990 as a faster and more secure successor to MD2, specifically tailored for 32-bit architectures to address the growing need for efficient one-way hash functions in digital signature schemes.[7] The primary motivation for MD4 stemmed from the requirements of emerging internet protocols and applications, such as Pretty Good Privacy (PGP), which demanded a secure method to compress arbitrary-length messages into a fixed 128-bit digest for integrity verification and signing with public-key systems like RSA, without relying on large lookup tables that could hinder performance.[7] Rivest placed the algorithm in the public domain to encourage review and potential standardization, reflecting the collaborative ethos of early cryptographic development in the internet era.[1] Key milestones include the first public description in RFC 1186 in October 1990, which outlined an initial version of the algorithm, followed by the definitive specification in RFC 1320 published by the Internet Engineering Task Force (IETF) in April 1992, providing a portable reference implementation.[8][1] MD4 saw rapid adoption shortly thereafter, notably in the initial release of PGP software in 1991, where it served as the core hashing mechanism for message authentication, and in various early cryptographic libraries by 1993, underscoring its role in facilitating secure email and data exchange protocols.[9] This widespread integration was short-lived, however, as emerging security concerns prompted Rivest to introduce MD5 in 1991 as a strengthened variant.[10]Algorithm Design
Message Preparation and Padding
In the MD4 algorithm, the input message is first prepared by appending padding bits to ensure it can be divided into 512-bit blocks for processing. This padding scheme begins with the addition of a single '1' bit to the end of the message, followed by enough '0' bits to make the total length congruent to 448 modulo 512 bits.[11] The purpose of this padding is to align the message length with MD4's block-based design, facilitating the compression function's operation on fixed-size inputs.[11] Padding is applied regardless of the message's original length; if the length is already congruent to 448 modulo 512 bits, a full 512 bits of padding (starting with the '1' bit) are still appended to create an additional block. At least one bit is always added, and the maximum padding added in a single step is 512 bits. Following the bit padding, a 64-bit little-endian integer representing the original message length in bits is appended as the final 64 bits, resulting in a total length that is a multiple of 512 bits.[11][12] For messages of 0 to 447 bits, the padding results in exactly one 512-bit block; for 448 to 511 bits, it results in two 512-bit blocks. This ensures all inputs are processed through the compression function.[11][12] MD4 supports a maximum original message length of bits to avoid overflow in the 64-bit length field; if the message exceeds this, only the low-order 64 bits of the length are used, though such cases are not standard.[12]Initialization and State Management
The MD4 hash function begins with the initialization of a state consisting of four 32-bit registers, denoted as A, B, C, and D. These registers are set to specific initial values: A = 0x67452301, B = 0xefcdab89, C = 0x98badcfe, and D = 0x10325476.[1] These constants are chosen arbitrarily but fixed to ensure consistent hashing across implementations.[1] The state is maintained as a chain of these four 32-bit words throughout the computation. For each 512-bit message block, the current values of A, B, C, and D serve as the input to the compression function. To preserve the chaining, the pre-compression values are temporarily saved (as AA, BB, CC, and DD), the compression is performed to update A, B, C, and D, and then the saved values are added to the updated registers: A ← AA + A, B ← BB + B, C ← CC + C, and D ← DD + D.[1] This modular addition (modulo 2^32) ensures that the output state of one block directly becomes the input state for the subsequent block, enabling the processing of messages longer than 512 bits.[1] MD4 operates on data in little-endian byte order, where the least significant byte of each 32-bit word is stored first.[1] This convention applies to both the representation of the state registers and the parsing of message blocks into 32-bit words, ensuring portability across different system architectures.[1]Compression Function Steps
The MD4 compression function processes each 512-bit message block to update the 128-bit chaining variables, producing an intermediate hash value that is carried forward. Each block is divided into 16 32-bit words, denoted as M through M[13], which serve as inputs to a series of 48 computational steps organized into three rounds of 16 steps each. These steps perform nonlinear transformations, modular additions, and bit rotations to achieve diffusion and mixing across the state variables A, B, C, and D (initialized from the chaining values). All arithmetic operations are performed modulo 2^{32}, ensuring the values remain within 32 bits.[14] In the first round (steps 0 to 15), the function F is applied, defined as F(X, Y, Z) = (X \land Y) \lor (\lnot X \land Z), which selectively combines bits based on the value of X. Each step updates one of the state variables according to the formula: where k indexes the message word (following a specific sequence: 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15), and s is the left rotation amount, cycling through 3, 7, 11, 7, 11, 19, 3, 7, 11, 7, 11, 19, 3, 7, 11, 7 for the respective steps. No constant is added in this round (equivalent to adding 0). The state variables are cyclically permuted after each update: for example, after updating A, the registers become D, A, B, C for the next step. This round emphasizes bitwise selection to propagate changes from the message into the state.[14] The second round (steps 16 to 31) employs the function G, defined as G(X, Y, Z) = (X \land Y) \lor (X \land Z) \lor (Y \land Z), which provides a more balanced mixing by including majority-like operations among the inputs. The update formula becomes: Here, k follows the sequence 0, 4, 8, 12, 1, 5, 9, 13, 2, 6, 10, 14, 3, 7, 11, 15, and rotation amounts s cycle through 3, 5, 9, 13, 3, 5, 9, 13, 3, 5, 9, 13, 3, 5, 9, 13. The constant 0x5A827999, derived from the sine of fractions of π to promote avalanche effects, is added to each step. State permutation follows the same cyclic pattern as in the first round, enhancing nonlinear diffusion.[14] The third round (steps 32 to 47) uses the function H, defined as H(X, Y, Z) = X \oplus Y \oplus Z, a linear XOR operation that further spreads bits across the state for parity-like mixing. The formula is: with k sequencing as 0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15, and s values of 3, 9, 11, 15, 3, 9, 11, 15, 3, 9, 11, 15, 3, 9, 11, 15. The constant 0x6ED9EBA1, similarly based on trigonometric values, is incorporated. Cyclic permutation of the state continues, completing the block's transformation.[14] Upon completing the 48 steps, the updated state values are added to the initial chaining variables: A \leftarrow A + AA, B \leftarrow B + BB, C \leftarrow C + CC, D \leftarrow D + DD, where AA through DD are the saved copies from the block's start. This produces the chaining value for the next block or the final 128-bit hash if it is the last block. The design of these rounds, with varying functions, word orders, and rotations, aims to ensure thorough avalanche and irreversibility in the compression.[14]Security Analysis
Known Weaknesses and Theoretical Flaws
MD4 employs the Merkle-Damgård construction, which appends the message length after padding to the input blocks processed by the compression function; this structural choice renders it vulnerable to length extension attacks, where an adversary can extend a known hash value to forge a valid hash for a longer message without knowledge of the original secret. The early rounds of MD4 exhibit weak diffusion properties, with insufficient bit mixing that allows differences introduced in the input to propagate predictably through the state registers, facilitating the application of differential cryptanalysis techniques.[15] This limited avalanche in the initial steps enables attackers to construct low-probability differential paths that exploit the linear nature of the operations, compromising the function's resistance to chosen-difference manipulations.[15] MD4's compression function consists of 48 steps across three rounds, a design choice that was deemed inadequate for providing full 128-bit security even by mid-1990s standards, as it failed to achieve the expected level of non-linearity and mixing required to withstand emerging cryptanalytic methods.[4] The reliance on relatively simple boolean functions and fixed shift operations in these steps contributes to predictable behavior under analysis, falling short of the robustness needed for long-term collision resistance.[4] Theoretically, MD4 was intended to offer collision resistance on the order of 2^{64} operations due to its 128-bit output size, but the specific choices in its nonlinear functions and round structure result in a failure to attain this bound, allowing collisions to be found with significantly lower complexity through structured differences.[15] In comparison to its successor MD5, MD4's three-round structure lacks the additional fourth round introduced in MD5, which enhances the avalanche effect by further propagating bit changes across the state and improving overall diffusion to better resist differential attacks.[4] This simpler round design in MD4 prioritizes computational efficiency over security margins, contributing to its earlier vulnerability revelations.[4]Practical Attacks and Exploits
The first practical collision attack on MD4 was demonstrated by Hans Dobbertin in 1996, utilizing differential cryptanalysis to identify two distinct 512-bit blocks that produce identical hash outputs. This attack targets the full compression function and requires roughly 2^{20} evaluations of the MD4 function, executable in seconds on contemporary 1990s hardware such as a standard PC.[13][3] Building on earlier partial attacks, including those on reduced rounds by den Boer and Bosselaers from 1991, Dobbertin's method extended the approach to the complete algorithm, confirming MD4's vulnerability to collisions with high probability. The feasibility of generating such collisions in minimal time underscored MD4's inadequacy for collision-resistant applications, prompting immediate scrutiny of its deployment in protocols like digital signatures and integrity checks.[3] In 2004, Xiaoyun Wang and colleagues further advanced collision attacks on MD4, achieving practical results with a complexity of 2^8 MD4 computations (about 256 operations, verifiable by hand) through refined differential paths. This improvement rendered collisions trivial to compute even on modest resources, amplifying concerns over MD4's security in legacy systems.[3] Regarding second-preimage resistance, weaknesses were first exploited in 2005 by Hongbo Yu, Gaoli Wang, and others, who developed a second-preimage attack with complexity reduced compared to the brute-force 2^{128}. For first preimage resistance, Gaëtan Leurent presented an attack in 2008 with complexity approximately 2^{102}. These were further improved in 2010 by Jian Guo et al., reducing first preimage attacks to 2^{78.4} and second preimage attacks to 2^{69.4} with pre-computations. Although not as inexpensive as collisions, these attacks demonstrated that finding colliding messages is feasible with significant but achievable resources, further eroding trust in MD4 for authentication purposes.[17][18][3] Although no high-profile malware like Flame (which targeted MD5 weaknesses for certificate forgery in 2012) directly leveraged MD4, its vulnerabilities have contributed to exploits in outdated protocols, such as legacy NTLM authentication in Windows environments where MD4 hashing enables offline attacks on captured challenges.[19][20] MD4's demonstrated flaws led to its deprecation by NIST in 2011 via RFC 6150, which retired the algorithm to historic status and explicitly advised against its use in any new or existing security contexts due to the ease of practical attacks. Today, MD4 is considered obsolete, with modern systems migrating to secure alternatives like SHA-256 to mitigate collision and preimage risks.[3]Examples and Implementations
Standard Test Vectors
Standard test vectors for the MD4 algorithm are defined in RFC 1320 to validate the correctness of implementations by comparing computed hashes against expected outputs for specific inputs. These vectors form part of a comprehensive test suite, ensuring interoperability and adherence to the algorithm's specifications across different computing environments.[1] The following table presents key test vectors from RFC 1320, including the empty message, a single-byte input, and a 62-byte message:| Input | Description | MD4 Hash (hexadecimal) |
|---|---|---|
| (empty string) | No input bytes | 31d6cfe0d16ae931b73c59d7e0c089c0 |
| "a" | Single ASCII byte (0x61) | bde52cb31de33e46245e05fbdbd6fb24 |
| "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789" | 62 ASCII bytes covering uppercase, lowercase letters, and digits | 043f8582f241db351ce627e153e7f0e4 |
Collision Demonstration
A prominent demonstration of MD4's collision vulnerability was provided by Hans Dobbertin in 1996, who developed an attack to construct colliding messages for the full MD4 algorithm. The attack exploits structural weaknesses through differential cryptanalysis, allowing collisions to be found with low complexity (around 220 operations) in seconds on contemporary hardware.[21] Dobbertin's method adapts techniques from attacking related functions like RIPEMD, propagating differences through MD4's rounds to produce distinct messages with the same hash output. This example highlights MD4's lack of collision resistance, as small input modifications can lead to identical digests, undermining its use in integrity checks.[21]References
- Sep 1, 1998 · ... full MD4, while previously only partial attacks were known. An implementation of our attack allows us to find collisions for MD4 in a few ...