Key derivation function
from Wikipedia

Example of a Key Derivation Function chain as used in the Signal Protocol. The output of one KDF function is the input to the next KDF function in the chain.

In cryptography, a key derivation function (KDF) is a cryptographic algorithm that derives one or more secret keys from a secret value such as a master key, a password, or a passphrase using a pseudorandom function (which typically uses a cryptographic hash function or block cipher).[1][2][3] KDFs can be used to stretch keys into longer keys or to obtain keys of a required format, such as converting a group element that is the result of a Diffie–Hellman key exchange into a symmetric key for use with AES. Keyed cryptographic hash functions are popular examples of pseudorandom functions used for key derivation.[4]

History

The first[citation needed] deliberately slow (key stretching) password-based key derivation function was called "crypt" (or "crypt(3)" after its man page), and was invented by Robert Morris in 1978. It would encrypt a constant (zero), using the first 8 characters of the user's password as the key, by performing 25 iterations of a modified DES encryption algorithm (in which a 12-bit number read from the real-time computer clock is used to perturb the calculations). The resulting 64-bit number is encoded as 11 printable characters and then stored in the Unix password file.[5] While it was a great advance at the time, increases in processor speeds since the PDP-11 era have made brute-force attacks against crypt feasible, and advances in storage have rendered the 12-bit salt inadequate. The crypt function's design also limits the user password to 8 characters, which limits the keyspace and makes strong passphrases impossible.

Although high throughput is a desirable property in general-purpose hash functions, the opposite is true in password security applications in which defending against brute-force cracking is a primary concern. The growing use of massively parallel hardware such as GPUs, FPGAs, and even ASICs for brute-force cracking has made the selection of a suitable algorithm even more critical, because a good algorithm should not only enforce a certain computational cost on CPUs but also resist the cost/performance advantages of modern massively parallel platforms for such tasks. Various algorithms have been designed specifically for this purpose, including bcrypt, scrypt and, more recently, Lyra2 and Argon2 (the latter being the winner of the Password Hashing Competition). The large-scale Ashley Madison data breach, in which roughly 36 million password hashes were stolen by attackers, illustrated the importance of algorithm selection in securing passwords. Although bcrypt was employed to protect the hashes (making large-scale brute-force cracking expensive and time-consuming), a significant portion of the accounts in the compromised data also contained a password hash based on the fast, general-purpose, and insecure MD5 algorithm, which made it possible for over 11 million of the passwords to be cracked in a matter of weeks.[6]

In June 2017, the U.S. National Institute of Standards and Technology (NIST) issued a new revision of its digital authentication guidelines, NIST SP 800-63B-3,[7]: 5.1.1.2  stating that: "Verifiers SHALL store memorized secrets [i.e. passwords] in a form that is resistant to offline attacks. Memorized secrets SHALL be salted and hashed using a suitable one-way key derivation function. Key derivation functions take a password, a salt, and a cost factor as inputs then generate a password hash. Their purpose is to make each password guessing trial by an attacker who has obtained a password hash file expensive and therefore the cost of a guessing attack high or prohibitive."

Modern password-based key derivation functions, such as PBKDF2,[2] are based on a recognized cryptographic hash, such as SHA-2, use a longer salt (at least 64 bits and chosen randomly), and apply a high iteration count. NIST recommends a minimum iteration count of 10,000.[7]: 5.1.1.2  "For especially critical keys, or for very powerful systems or systems where user-perceived performance is not critical, an iteration count of 10,000,000 may be appropriate."[8]: 5.2

Key derivation

The original use for a KDF is key derivation, the generation of keys from secret passwords or passphrases. Variations on this theme include:

  • In conjunction with non-secret parameters to derive one or more keys from a common secret value (which is sometimes also referred to as "key diversification"). Such use may prevent an attacker who obtains a derived key from learning useful information about either the input secret value or any of the other derived keys. A KDF may also be used to ensure that derived keys have other desirable properties, such as avoiding "weak keys" in some specific encryption systems.
  • As components of multiparty key-agreement protocols. Examples of such key derivation functions include KDF1, defined in IEEE Std 1363-2000, and similar functions in ANSI X9.42.
  • To derive keys from secret passwords or passphrases (a password-based KDF).
  • To derive keys of different length from the ones provided. KDFs designed for this purpose include HKDF and SSKDF. These take an optional 'info' bit string as an additional parameter, which may be crucial to bind the derived key material to application- and context-specific information.[9]
  • Key stretching and key strengthening.

Key stretching and key strengthening

Key derivation functions are also used in applications to derive keys from secret passwords or passphrases, which typically do not have the desired properties to be used directly as cryptographic keys. In such applications, it is generally recommended that the key derivation function be made deliberately slow so as to frustrate brute-force attack or dictionary attack on the password or passphrase input value.

Such use may be expressed as DK = KDF(key, salt, iterations), where DK is the derived key, KDF is the key derivation function, key is the original key or password, salt is a random number which acts as cryptographic salt, and iterations refers to the number of iterations of a sub-function. The derived key is used instead of the original key or password as the key to the system. The values of the salt and the number of iterations (if it is not fixed) are stored with the hashed password or sent as cleartext (unencrypted) with an encrypted message.[10]

The difficulty of a brute force attack is increased with the number of iterations. A practical limit on the iteration count is the unwillingness of users to tolerate a perceptible delay in logging into a computer or seeing a decrypted message. The use of salt prevents the attackers from precomputing a dictionary of derived keys.[10]
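
As a minimal sketch of the relation DK = KDF(key, salt, iterations), the following uses PBKDF2-HMAC-SHA256 from Python's standard library. The helper name `derive_key`, the 16-byte salt, and the iteration count of 100,000 are illustrative assumptions rather than values taken from the text above.

```python
import hashlib
import os

def derive_key(password: bytes, salt: bytes, iterations: int, length: int = 32) -> bytes:
    """DK = KDF(key, salt, iterations), instantiated here with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password, salt, iterations, dklen=length)

salt = os.urandom(16)      # random and non-secret; stored alongside the result
iterations = 100_000       # illustrative cost; tune to the delay users will tolerate
dk = derive_key(b"correct horse battery staple", salt, iterations)

# The same inputs always reproduce the same key; a fresh salt yields a different one.
assert dk == derive_key(b"correct horse battery staple", salt, iterations)
assert dk != derive_key(b"correct horse battery staple", os.urandom(16), iterations)
```

Because the salt and the iteration count are needed to recompute the key, they would be stored or transmitted in the clear next to the protected data, as described above.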

An alternative approach, called key strengthening, extends the key with a random salt, but then (unlike in key stretching) securely deletes the salt.[11] This forces both the attacker and legitimate users to perform a brute-force search for the salt value.[12] Although the paper that introduced key stretching[13] referred to this earlier technique and intentionally chose a different name, the term "key strengthening" is now often (arguably incorrectly) used to refer to key stretching.
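
The difference from key stretching can be made concrete with a toy sketch of key strengthening. Everything here is an assumption chosen for illustration: the function names, the use of SHA-256, and the 12-bit salt. Python also cannot guarantee secure deletion, so the salt is simply never stored.

```python
import hashlib
import hmac
import secrets

SALT_BITS = 12  # illustrative; the salt size sets the extra work per verification

def strengthen(password: bytes, salt_bits: int = SALT_BITS) -> bytes:
    """Hash the password with a random salt and return only the hash."""
    salt = secrets.randbelow(1 << salt_bits)
    # The salt is deliberately not returned or stored anywhere.
    return hashlib.sha256(salt.to_bytes(4, "big") + password).digest()

def verify(password: bytes, stored: bytes, salt_bits: int = SALT_BITS) -> bool:
    """Try every possible salt; succeed if any of them reproduces the stored hash."""
    for salt in range(1 << salt_bits):
        candidate = hashlib.sha256(salt.to_bytes(4, "big") + password).digest()
        if hmac.compare_digest(candidate, stored):
            return True
    return False

stored = strengthen(b"hunter2")
print(verify(b"hunter2", stored))   # True
print(verify(b"hunter3", stored))   # False
```

A legitimate user pays up to 2^12 hash evaluations per check in this sketch, and an attacker pays the same factor for every password guess.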

Password hashing

Despite their original use for key derivation, KDFs are possibly better known for their use in password hashing (password verification by hash comparison), as used by the passwd file or shadow password file. Password hash functions should be relatively expensive to calculate in case of brute-force attacks, and KDFs are designed with this characteristic built in.[14] The non-secret parameters are called "salt" in this context.
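
Password verification by hash comparison can be sketched as follows. The record layout (salt, iteration count, hash), the function names, and the choice of PBKDF2-HMAC-SHA256 with 100,000 iterations are assumptions made for this example, not a prescribed storage format.

```python
import hashlib
import hmac
import os

def hash_password(password: str, iterations: int = 100_000) -> tuple[bytes, int, bytes]:
    """Return (salt, iterations, hash) for storage; the salt and count are not secret."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, iterations, digest

def verify_password(password: str, record: tuple[bytes, int, bytes]) -> bool:
    """Recompute the hash with the stored parameters and compare in constant time."""
    salt, iterations, expected = record
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(candidate, expected)

record = hash_password("s3cret")
print(verify_password("s3cret", record))   # True
print(verify_password("guess", record))    # False
```

Storing the iteration count with each record lets a system raise the cost for new hashes without invalidating old ones.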

In 2013 a Password Hashing Competition was announced to choose a new, standard algorithm for password hashing. On 20 July 2015 the competition ended and Argon2 was announced as the final winner. Four other algorithms received special recognition: Catena, Lyra2, Makwa and yescrypt.[15]

As of May 2023, the Open Worldwide Application Security Project (OWASP) recommends the following KDFs for password hashing, listed in order of priority:[16]

  1. Argon2id
  2. scrypt if Argon2id is unavailable
  3. bcrypt for legacy systems
  4. PBKDF2 if FIPS-140 compliance is required

from Grokipedia
A key derivation function (KDF) is a cryptographic algorithm that derives secret keying material from a secret value, such as a password or key, together with other information, generating a binary string suitable for use as additional cryptographic keys. These functions are designed to produce pseudorandom outputs that are computationally indistinguishable from true random bits, ensuring the derived keys maintain high entropy even if the input secret has lower entropy. KDFs play a critical role in cryptographic protocols by enabling the secure generation of multiple keys from a single master secret, which is essential for key establishment schemes, session key derivation, and protecting against key reuse attacks.

In password-based systems, KDFs incorporate mechanisms like salting and iteration to perform key stretching, deliberately slowing down computation to resist brute-force and dictionary attacks on low-entropy inputs such as human-chosen passwords. For instance, the PBKDF2 algorithm, approved by NIST for password-based key derivation, uses a pseudorandom function (PRF) such as HMAC with an approved hash to iteratively derive keys, with the iteration count serving as a tunable cost factor to balance security and performance.

Standards such as NIST SP 800-108 specify families of KDFs based on PRFs including HMAC, CMAC, and KMAC, supporting modes like counter, feedback, and double-pipeline for deriving keying material in various contexts. Another prominent example is HKDF, defined in RFC 5869, which employs an extract-then-expand paradigm to first distill a uniform PRF key from the input and then expand it to the desired output length, making it suitable for protocols like TLS. These standardized KDFs ensure interoperability and compliance with federal security requirements, such as those in FIPS 140, while addressing evolving threats like side-channel attacks through careful design.

Overview

Definition

A key derivation function (KDF) is a cryptographic algorithm that derives one or more cryptographic keys from an input secret, such as a master key, password, or passphrase, by applying a pseudorandom function (PRF), typically based on a cryptographic hash function or block cipher, to produce keying material suitable for use in cryptographic algorithms. The process ensures that the output keys are cryptographically strong, even if the input secret has limited entropy or a non-uniform structure.

The primary inputs to a KDF include the secret input key material (often denoted $Z$ or IKM), an optional salt (a non-secret random value to prevent precomputation attacks), and contextual information (such as a label or info string providing application-specific details). An iteration count may also be specified to apply the PRF multiple times, increasing computational resistance. The output consists of one or more derived keys (DK), typically as a bit string of a specified length $L$, which can be partitioned for multiple uses. In general mathematical form, the operation is represented as:

$$DK = \mathrm{KDF}(Z, \text{salt}, \text{info}, L)$$

where $Z$ is the input secret, salt and info are optional parameters, and $L$ defines the output length.

Unlike directly using a secret as a key, which may expose it to risks from low entropy or predictable patterns, a KDF transforms potentially weak or structured inputs into uniformly distributed, high-entropy keys that mimic the properties of randomly generated keys. This transformation is essential for key expansion and management in protocols.

Purpose and Benefits

Key derivation functions (KDFs) primarily serve to generate one or more cryptographic keys from a single source of initial keying material, such as a shared secret or master key, enabling secure key management in various protocols. They also expand short or weak inputs, like user passwords, into full-length keys suitable for symmetric algorithms, thereby transforming low-entropy secrets into cryptographically robust outputs. For instance, in password-based scenarios, KDFs produce keys of appropriate length for applications like data encryption, ensuring the derived material meets the security requirements of the target algorithm.

A key benefit of KDFs is their ability to enhance resistance to brute-force attacks by increasing the computational effort required to derive keys from the input secret, without compromising the underlying secret itself. They promote key uniformity and independence, making derived keys computationally indistinguishable from random and independent of one another, even when generated from the same input. Additionally, KDFs format keys to align with specific cryptographic algorithms, such as deriving AES-compatible keys, which streamlines integration in security systems.

Passwords often suffer from low entropy and predictability, making them vulnerable to guessing or dictionary attacks, as users tend to choose memorable but common phrases with limited randomness. KDFs address these weaknesses by processing such inputs into secure keys without exposing the original secret, thus mitigating offline attacks that exploit the input's guessability.

In key hierarchies, KDFs facilitate key separation by deriving distinct keys for different purposes from a master secret, such as an encryption key and a message authentication code (MAC) key within the same protocol. This approach ensures that compromise of one derived key does not affect others, enhancing overall protocol security through compartmentalized key usage.

History

Early Developments

The origins of key derivation functions (KDFs) trace back to early efforts in securing password storage against brute-force attacks. In 1979, Robert Morris and Ken Thompson described a deliberately slow password hashing mechanism in their paper "Password Security: A Case History," implemented as the Unix crypt function. This system used the first eight characters of a user's password as a key for the Data Encryption Standard (DES) algorithm, encrypting a constant 64-bit block of zeros and iterating the process 25 times to produce an 11-character output stored in the password file. By leveraging software-based DES encryption, which was computationally intensive at the time, and adding iterations, this approach increased the time required for password guessing on hardware like the PDP-11/70 from milliseconds to seconds per attempt, serving as an early form of key stretching to derive secure keys from weak passwords.

During the 1990s, the advent of faster cryptographic hash functions influenced the evolution of key derivation techniques, emphasizing the need for computational slowness to counter advancing hardware capabilities. MD5, proposed by Ronald Rivest in 1991, and SHA-1, standardized by NIST in 1995, were increasingly adopted in password-based systems due to their efficiency, often combined with iterations or salts to derive keys from user inputs. These hashes replaced or augmented earlier DES-based methods in variants of Unix crypt implementations, highlighting an early recognition that rapid hashing alone was insufficient against brute-force attacks, as dictionary and exhaustive searches could exploit hardware speedups without deliberate delays. For instance, systems began iterating these functions multiple times to amplify derivation time, building on the 1979 principles to make offline attacks more resource-intensive.

Key milestones in the 1990s underscored the growing application of simple KDFs in protocols. The Kerberos Version 5 protocol, specified in RFC 1510 in 1993, employed a basic string-to-key derivation for user passwords: the password string (with realm and principal appended) was padded to an 8-byte boundary, fan-folded, and XORed to form a parity-corrected DES key; that key was then used to compute a DES-CBC checksum over the padded string, and the result was parity-corrected and checked against weak keys. This method derived an 8-octet DES key for authenticating clients to the Key Distribution Center (KDC), prioritizing simplicity for network environments while incorporating basic protections against direct password exposure. By 2000, PKCS #5 Version 2.0 from RSA Laboratories, republished as RFC 2898, formalized PBKDF2 as a recommended function, applying a pseudorandom function like HMAC-SHA-1 iteratively (with a recommended minimum of 1,000 iterations) alongside a salt to derive keys of variable lengths, addressing limitations of prior ad-hoc methods.

This period marked a conceptual shift from relying on inherently fast cryptographic primitives, such as single-pass hashes, for password protection to intentionally incorporating computation delays via iterations and salts, ensuring that derived keys resisted brute-force and dictionary attacks even as computing power grew exponentially. Key stretching, the core technique enabling this, aimed to bring the effort of attacking a key through its password closer to that of guessing the key directly, thereby strengthening weak human-memorable inputs without requiring perfect secrecy.

Standardization and Evolution

The standardization of key derivation functions (KDFs) took shape around 2000 with the publication of RFC 2898, which specified PBKDF2 as a password-based KDF using a pseudorandom function like HMAC to apply salt and iterations for key stretching. In 2010, NIST released Special Publication 800-132, providing recommendations for password-based key derivation in storage applications and emphasizing the use of approved pseudorandom functions and minimum iteration counts to enhance security against brute-force attacks. These standards built upon earlier foundational work, such as the Unix crypt function introduced in the 1970s, which first incorporated salting to prevent precomputed dictionary attacks.

Post-2000s evolution in KDF design was driven by real-world incidents and advances in computational hardware, highlighting vulnerabilities in weaker hashing practices. The 2015 Ashley Madison breach exposed over 36 million password hashes, many of them also stored with outdated or insufficiently iterated methods like MD5 combined with programming flaws, underscoring the need for more robust, memory-intensive KDFs to resist GPU-accelerated cracking. The shift toward memory-hard functions had already begun: scrypt was introduced in 2009 as a KDF designed to require significant RAM, making parallelized attacks on specialized hardware more costly. Further momentum came from the 2013–2015 Password Hashing Competition, where Argon2 emerged as the winner for its balanced resistance to both time- and memory-based attacks through configurable parameters for parallelism, memory, and iterations.

Later standards reflect ongoing refinements to address emerging threats, including hardware advancements and quantum computing risks. In 2010, RFC 5869 defined HKDF, an HMAC-based extract-and-expand KDF tailored for deriving keys from high-entropy sources in protocols like TLS, prioritizing simplicity and provable security properties. OWASP's 2023 Password Storage Cheat Sheet prioritizes Argon2id, a hybrid variant of Argon2 combining data-dependent and data-independent modes, for new implementations, recommending minimum parameters of 19 MiB of memory, 2 iterations, and 1 degree of parallelism to balance security and performance. In the 2023 proposal to revise SP 800-132, NIST plans to incorporate memory-hard functions like Argon2 alongside PBKDF2, with updated guidance on parameters such as iteration counts.

Post-2023 developments increasingly explore quantum-safe KDF adaptations, particularly those based on lattice problems, to withstand attacks from quantum algorithms like Grover's that could halve the effective security level of symmetric primitives. For example, lattice-based constructions such as those derived from Learning With Errors (LWE) problems, which are central to NIST's post-quantum standards like ML-KEM, enable key derivation with hardness assumptions resilient to quantum adversaries, though full integration into KDF standards remains in early research stages. In 2024, NIST finalized FIPS 203, 204, and 205, standardizing post-quantum algorithms such as ML-KEM; hybrid key encapsulation schemes built on them rely on KDFs to derive the final keys. These efforts address gaps in prior guidelines, aiming to prepare KDFs for anticipated quantum threats by 2030.

Core Principles

Key Stretching

Key stretching is a cryptographic technique employed in key derivation functions to enhance the security of weak inputs, such as low-entropy passwords or passphrases, by deliberately increasing the computational workload required to produce the derived key. This is accomplished through the iterative application of a pseudorandom function, typically a cryptographic hash, which transforms the input into a stronger key by amplifying its resistance to brute-force attacks. The core goal is to make each derivation attempt sufficiently resource-intensive, thereby deterring exhaustive searches that would otherwise exploit the limited entropy of human-chosen secrets.

The mathematical foundation of key stretching relies on repeated function evaluations, where the output after $N$ iterations is computed as $\text{Output} = \text{Hash}^N(\text{Input} \parallel \text{Salt})$, with the superscript $N$ denoting $N$ sequential applications and $\parallel$ denoting concatenation. The computational cost scales linearly with $N$, as each step demands a complete execution of the underlying function, resulting in a total cost of approximately $O(N)$ operations. This linear scaling enables tunable security: practitioners select $N$ to achieve a target derivation time, often calibrated to one second on typical hardware, so that every guess costs an attacker $N$ times more than a single hash evaluation. Salt usage serves as a complementary measure to prevent precomputation attacks, though stretching primarily focuses on computational delay.

Historically, key stretching emerged as a response to the accelerating computational power predicted by Moore's law, which posits that processing capabilities roughly double every 18 to 24 months, thereby halving the time needed to search a fixed-entropy key space over each such period. By design, the technique allows for adjustable iteration counts to maintain consistent derivation slowness amid hardware advancements, preserving security margins without requiring key redesign. This adaptability addresses the vulnerability of static protections to exponential growth in attack feasibility.

In contrast to plain hashing, which prioritizes rapid computation for efficient lookups or integrity checks, key stretching intentionally introduces delay during key derivation to mitigate offline threats. Plain hashing enables quick verification but offers minimal protection against captured data, as attackers can perform rapid trials; key stretching, however, targets the derivation phase, ensuring that generating candidate keys from guesses becomes prohibitively slow, thus shifting the economic burden to the adversary.
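
The iterated-hash construction described above can be sketched directly. This is a teaching example only: the function name `stretch` and the choice of SHA-256 are assumptions, and a vetted KDF such as PBKDF2 would normally be used instead of hand-rolled hash chaining.

```python
import hashlib

def stretch(secret: bytes, salt: bytes, n: int) -> bytes:
    """Compute Hash^N(secret || salt): n sequential SHA-256 applications."""
    if n < 1:
        raise ValueError("n must be at least 1")
    state = hashlib.sha256(secret + salt).digest()   # first application
    for _ in range(n - 1):                           # remaining n - 1 applications
        state = hashlib.sha256(state).digest()
    return state

key = stretch(b"password", b"per-user-salt", 100_000)
```

Each loop step needs the previous digest, so the work for one derivation grows linearly with `n` and cannot be split across cores.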

Salt and Iteration Mechanisms

In key derivation functions (KDFs), a salt is a non-secret, randomly generated binary value, typically at least 128 bits in length, that is unique to each derivation instance or user. It serves to prevent precomputed attacks, such as rainbow tables, by ensuring that the same input secret produces distinct outputs across different derivations, thereby defeating precomputed dictionary attacks on common passwords and hiding the fact that multiple users share the same secret. Salts are generated using an approved random bit generator and must be stored alongside the derived key or hash, as they are not secret and cannot be reconstructed.

Iteration mechanisms enhance security by applying a pseudorandom function (PRF) repeatedly an adjustable number of times, often 100,000 or more, to amplify the computational workload required for derivation. NIST SP 800-132 recommends a minimum of 1,000 iterations, but current best practices call for significantly higher values, such as at least 600,000 for PBKDF2-HMAC-SHA256, to counter modern attack capabilities using specialized hardware. This count is tunable to balance security needs against performance constraints, such as user-perceived delays, with higher values for critical applications. Because each iteration depends on the output of the previous one, a single derivation cannot be parallelized, although an attacker can still test many candidate inputs in parallel. These mechanisms support key stretching by deliberately prolonging derivation time from low-entropy inputs like passwords.

In many constructions, the derived key results from iterating the PRF $N$ times on the concatenation of the secret, salt, and optional context information:

$$\text{Derived key} = \text{PRF}^N(\text{secret} \parallel \text{salt} \parallel \text{info})$$

where $\parallel$ denotes concatenation and $N$ is the iteration count.

Advanced variants include peppers, which are secret, application-wide values added to the input before derivation and stored separately from the database (e.g., in a secrets vault or hardware security module) rather than with individual salts. Unlike salts, peppers are not unique per user and provide an extra barrier against offline attacks if the primary storage is breached, though their compromise necessitates widespread key rotation. Domain separation further refines these mechanisms by incorporating non-secret context information, such as labels or identifiers, into the 'info' parameter to ensure keys derived for different purposes (e.g., encryption versus authentication) remain cryptographically independent, preventing cross-use vulnerabilities.
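
One possible way to combine a pepper with a per-user salt is sketched below. Mixing in the pepper with an HMAC before stretching is an assumption made for this example, as are the function name and the parameter sizes; other arrangements exist.

```python
import hashlib
import hmac
import os

def hash_with_pepper(password: bytes, salt: bytes, pepper: bytes,
                     iterations: int = 100_000) -> bytes:
    """Mix the secret pepper into the password, then stretch with the per-user salt."""
    peppered = hmac.new(pepper, password, hashlib.sha256).digest()
    return hashlib.pbkdf2_hmac("sha256", peppered, salt, iterations)

pepper = os.urandom(32)   # application-wide secret; would be kept outside the database
salt = os.urandom(16)     # per-user and non-secret; stored next to the hash
stored = hash_with_pepper(b"hunter2", salt, pepper)
```

An attacker who obtains only the database sees `salt` and `stored` but lacks `pepper`, so offline guessing cannot start until the pepper is also compromised.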

Constructions and Algorithms

Hash-Based and HMAC-Based KDFs

Hash-based key derivation functions (KDFs) utilize cryptographic hash functions, such as SHA-256, in iterated chains to transform input keying material into derived keys of desired length. These constructions typically involve repeatedly hashing the input combined with a salt to achieve key stretching, ensuring that even low-entropy sources produce longer outputs that are more costly to attack. For example, a basic hash-based KDF might compute the derived key as the concatenation of multiple hash iterations: DK = Hash(salt || IKM) || Hash(Hash(salt || IKM)) || ... for a specified number of rounds. This approach is simple to implement and relies solely on the collision and preimage resistance of the underlying hash function. However, plain hash iterations are cheap to compute and need almost no memory, so although each individual derivation is sequential, adversaries can test many candidate inputs in parallel across multiple processors or GPUs to accelerate brute-force or dictionary searches. Such constructions also lack the keyed structure and security analysis of PRF-based designs, which reduces the effective security margin against hardware-accelerated cracking.

To mitigate these issues, HMAC-based KDFs employ the Hash-based Message Authentication Code (HMAC) as a pseudorandom function (PRF), leveraging HMAC's proven security properties for keyed hashing. HMAC constructs a PRF from a hash function H by nesting two hash invocations keyed with inner- and outer-padded versions of the key, providing resistance to length-extension attacks inherent in Merkle-Damgård hashes. This makes HMAC-based designs suitable for deriving keys in both password and general cryptographic contexts.

A foundational HMAC-based KDF is PBKDF2, introduced in the PKCS #5 v2.0 standard and published as RFC 2898 in 2000. PBKDF2 derives a key DK of length dkLen from a password P, salt S (at least 8 octets), and iteration count c using a PRF such as HMAC-SHA256 (HMAC-SHA1 is discouraged for new uses); current OWASP guidance suggests at least 600,000 iterations for HMAC-SHA256, depending on hardware capabilities. The algorithm proceeds in blocks: for each block i from 1 to l = ceil(dkLen / hLen), compute U_1 = PRF(P, S || INT(i)) and U_k = PRF(P, U_{k-1}) for k = 2 to c; then T_i = U_1 XOR U_2 XOR ... XOR U_c. The final DK is the concatenation T_1 || T_2 || ... || T_l, truncated to dkLen octets. The chaining of the U_k values forces the c PRF evaluations to run sequentially, which enforces computational work and slows down attackers. PBKDF2 supports variable-length outputs up to (2^32 - 1) * hLen octets and is widely implemented for its balance of security and performance.

For non-password scenarios, such as deriving keys from Diffie-Hellman exchanges or entropy sources, HKDF provides a more modular HMAC-based alternative, specified in RFC 5869 (2010) following the extract-then-expand paradigm. The extract step first produces a fixed-length pseudorandom key PRK from input keying material IKM and an optional salt (defaulting to hLen zero octets if omitted):

PRK = HMAC-Hash(salt, IKM)

This step "extracts" uniformity from potentially biased or low-entropy IKM, assuming the hash function's properties. The expand step then generates output keying material OKM of length L using PRK, contextual info (to bind the derivation to a specific use), and a counter: initialize T_0 as empty string, then for i = 1 to N = ceil(L / hLen),

T_i = HMAC-Hash(PRK, T_{i-1} || info || i), where the counter i is encoded as a single octet (0x01 through 0xFF)

Finally, OKM is the first L octets of T_1 || T_2 || ... || T_N. HKDF's design ensures derived keys are computationally independent and context-specific, making it well suited to protocols like TLS or IKE, where the input is not a low-entropy password; it was motivated by the need for a simple, provably secure KDF under minimal hash assumptions.

HMAC-based KDFs like PBKDF2 and HKDF excel in CPU efficiency, enabling fast derivation on general-purpose hardware, and PBKDF2 incorporates salts and iterations to thwart offline attacks. Nonetheless, because they need very little memory, attackers can run many instances in parallel on GPUs or ASICs and test many candidates simultaneously.
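
The extract-then-expand steps can be sketched with Python's standard `hmac` and `hashlib` modules, following RFC 5869, where the block counter i is appended as a single octet. The function names, the fixed choice of SHA-256, and the sample inputs are assumptions made for this illustration.

```python
import hashlib
import hmac
import math

HASH = hashlib.sha256
HASH_LEN = HASH().digest_size   # hLen = 32 octets for SHA-256

def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """PRK = HMAC-Hash(salt, IKM); an empty salt defaults to hLen zero octets."""
    if not salt:
        salt = bytes(HASH_LEN)
    return hmac.new(salt, ikm, HASH).digest()

def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """OKM = first `length` octets of T_1 || T_2 || ... || T_N."""
    n = math.ceil(length / HASH_LEN)
    if n > 255:
        raise ValueError("requested output too long for HKDF")
    okm, t = b"", b""                    # T_0 is the empty string
    for i in range(1, n + 1):
        # T_i = HMAC-Hash(PRK, T_{i-1} || info || i), with i as a single octet
        t = hmac.new(prk, t + info + bytes([i]), HASH).digest()
        okm += t
    return okm[:length]

prk = hkdf_extract(b"example salt", b"shared secret from key exchange")
enc_key = hkdf_expand(prk, b"encryption", 32)
mac_key = hkdf_expand(prk, b"authentication", 32)
```

Using a different `info` string for each purpose yields independent keys from the same PRK, which is how the contextual binding described above is obtained in practice.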

Memory-Hard and Specialized Functions

Memory-hard functions represent an evolution in key derivation functions (KDFs) designed to impose significant memory requirements on computations, thereby increasing the cost of hardware-accelerated attacks such as those using ASICs or GPUs. These functions aim to level the playing field between general-purpose hardware and specialized attack devices by forcing sequential memory access patterns that are inefficient to parallelize.

A seminal example is scrypt, introduced in 2009, which requires substantial memory allocation, ranging from roughly 16 MiB for interactive logins up to about 1 GiB for the most demanding parameter choices, making it resistant to cost-effective parallelization on GPUs while remaining feasible on standard CPUs. Scrypt operates by first mixing the password and salt using PBKDF2 with HMAC-SHA256, then performing a sequential memory-hard operation via the SMix function, which fills and accesses large blocks of memory in a dependent manner to thwart optimization. This design ensures that attackers cannot economically scale brute-force attempts, as memory capacity and bandwidth become the primary bottleneck rather than computational speed alone.

Bcrypt, proposed in 1999, predates fully memory-hard designs but incorporates adaptive hardness through an exponential cost factor in its Blowfish cipher setup phase, effectively stretching computation time while using modest memory. By iteratively expanding the Blowfish key schedule with the password and salt, bcrypt allows tunable work factors (e.g., a cost of 12 or higher for modern security), providing a foundation for resource-intensive derivation that adapts to advancing hardware threats.

Argon2, selected as the winner of the 2015 Password Hashing Competition, advances memory-hard KDFs with configurable parameters for time cost (t), memory cost (m, in KiB), and parallelism (p), enabling fine-tuned security trade-offs. It builds its compression function on the BLAKE2b round function and fills memory blocks across parallel lanes, using either data-dependent or data-independent memory access depending on the variant. Argon2 offers three variants: Argon2d uses data-dependent indexing to maximize resistance to GPU cracking and time-memory trade-offs, Argon2i uses data-independent access to mitigate timing leaks in side-channel scenarios, and the hybrid Argon2id combines both for broad applicability in password-based key derivation. The process involves generating pseudorandom blocks, compressing them, and extracting the final key. RFC 9106 (2021) recommends, for memory-constrained environments, Argon2id with m = 2^16 KiB (64 MiB), t = 3, and p = 4, balancing usability and security.

Specialized KDFs address emerging threats, such as those from quantum computing, by integrating post-quantum primitives. For instance, the ML-KEM standard (derived from CRYSTALS-Kyber) incorporates a simple hash-based derivation step to produce symmetric keys from encapsulated post-quantum public-key exchanges, ensuring IND-CCA security against quantum adversaries without relying on vulnerable classical assumptions like discrete logarithms. These constructions prioritize lattice-based hardness assumptions, using modules over rings to generate keys resistant to Shor's algorithm, though they accept higher computational overhead in exchange for quantum resistance.

The primary trade-off in memory-hard and specialized functions is increased resource demands, in both memory and time, which enhance ASIC resistance but can strain legitimate users on resource-constrained devices, necessitating careful parameter selection to maintain practical usability.
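
As a rough illustration of how a memory-hard function is parameterized, the sketch below calls scrypt through Python's `hashlib.scrypt`, which is available when Python is built against an OpenSSL version with scrypt support. The parameter values are illustrative assumptions. Argon2 is not part of the Python standard library, so it is not shown.

```python
import hashlib
import os

# Illustrative scrypt parameters: N (CPU/memory cost), r (block size), p (parallelism).
N, R, P = 2**14, 8, 1

# scrypt's large working array occupies roughly 128 * N * r bytes.
memory_bytes = 128 * N * R
print(memory_bytes // 2**20, "MiB")   # prints: 16 MiB

salt = os.urandom(16)
key = hashlib.scrypt(
    b"correct horse battery staple",
    salt=salt, n=N, r=R, p=P,
    maxmem=64 * 2**20,   # headroom above the ~16 MiB working set
    dklen=32,
)
```

Doubling `N` doubles both the memory and the time per guess, which is what makes large-scale parallel cracking expensive compared with a purely iteration-based KDF.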

Applications

Password-Based Derivation

Password-based key derivation functions (KDFs) are designed to transform human-memorable passwords, which typically offer low entropy of approximately 20 to 40 bits due to predictable patterns in user choices, into cryptographically secure keys or hashes suitable for encryption or password verification. These inputs are inherently weak compared to high-entropy secrets, necessitating KDFs that incorporate mechanisms like salting and key stretching to strengthen resistance against brute-force and dictionary attacks. By applying repeated hashing or computationally intensive operations, password-based KDFs produce fixed-length outputs that can serve as keys for symmetric ciphers or as stored hashes for verification, which distinguishes them from simple hashing and suits them to broader cryptographic use.

The core process for password-based derivation begins with generating a unique random salt for each password to prevent precomputation attacks, followed by computing the output as KDF(password, salt, parameters), where parameters often include iteration counts or memory costs. For verification in authentication systems, the KDF is recomputed from the submitted password and the stored salt, and the result is compared with the stored output; a match confirms validity without the system ever storing the original password. In key derivation scenarios, such as full-disk encryption, the KDF output directly yields a master key used to encrypt data blocks with algorithms like AES. For instance, the Linux Unified Key Setup (LUKS) standard originally employed PBKDF2 in LUKS1, but LUKS2 uses Argon2id by default to derive encryption keys from user passphrases, so that even low-entropy inputs receive meaningful protection.

Common examples include PBKDF2, standardized in RFC 2898 and updated in PKCS #5 v2.1 (RFC 8018), which uses a pseudorandom function like HMAC-SHA256 iterated many thousands of times to derive keys from passwords. In web applications, bcrypt is widely adopted for storing password hashes, as it adaptively increases computational cost via a configurable work factor to maintain security against hardware advances.
More recent systems favor Argon2, the winner of the 2015 Password Hashing Competition, for its resistance to parallel attacks in login and derivation workflows.

A key challenge in password-based derivation is balancing security against offline attacks with usability, as excessive computation can degrade login performance. Guidelines from OWASP recommend configuring KDFs with work factors such as a minimum of 600,000 iterations for PBKDF2 with HMAC-SHA256, providing resistance to high-speed cracking without introducing unacceptable delays for legitimate users. This tension is particularly acute in resource-constrained environments, where memory-hard functions help mitigate GPU-accelerated brute-forcing by demanding significant RAM alongside CPU cycles.
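The salt, derive, store, and verify steps described in this section can be sketched with Python's standard library. This is a minimal illustration rather than a production scheme: the function names `hash_password` and `verify_password` and the 16-byte salt length are assumptions, and the default iteration count is simply the 600,000 figure quoted above.

```python
import hashlib
import hmac
import os

ITERATIONS = 600_000  # figure quoted in the text for PBKDF2-HMAC-SHA256


def hash_password(password: str,
                  iterations: int = ITERATIONS) -> tuple[bytes, int, bytes]:
    """Return (salt, iterations, derived) for storage next to the account."""
    salt = os.urandom(16)  # fresh random salt for every password
    derived = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                                  salt, iterations)
    return salt, iterations, derived


def verify_password(password: str, salt: bytes, iterations: int,
                    expected: bytes) -> bool:
    """Recompute the KDF with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                                    salt, iterations)
    return hmac.compare_digest(candidate, expected)


if __name__ == "__main__":
    salt, iterations, stored = hash_password("hunter2")
    print(verify_password("hunter2", salt, iterations, stored))
    print(verify_password("hunter3", salt, iterations, stored))
```

Storing the iteration count with each record lets the work factor be raised later without invalidating existing entries.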

General Cryptographic Uses

Key derivation functions (KDFs) play a crucial role in cryptographic protocols for generating session keys from shared secrets obtained through key agreement mechanisms, such as Diffie-Hellman exchanges. In these scenarios, the shared secret serves as the input keying material (IKM), which the KDF processes to produce cryptographically suitable keys for encryption, authentication, and other purposes. This application is particularly prominent in secure communication protocols where high-entropy inputs are available, allowing KDFs to expand and diversify the secret without the entropy limitations inherent to password-based contexts. For instance, in the Transport Layer Security (TLS) Protocol Version 1.3, HKDF, an HMAC-based extract-and-expand construction, is employed to derive all session keys from the shared secret established via ephemeral Diffie-Hellman (DHE) or elliptic curve Diffie-Hellman (ECDHE).

KDFs also facilitate key hierarchies, where a master key is systematically derived into multiple child keys to support various protocol functions while maintaining independence between them. This is achieved through parameters like nonces, sequence numbers, or context information that provide domain separation, preventing the same derived key from being reused for different purposes. In the Internet Key Exchange Protocol Version 2 (IKEv2) for IPsec, the Diffie-Hellman shared secret and nonces are first used to derive the IKE security association (SA) keys, including SK_d, which acts as a master key; subsequent child SA keys for IPsec tunnels are then derived from SK_d using a pseudorandom function (PRF), with the exchange nonces (and optionally a fresh Diffie-Hellman secret) as input to keep each child SA's keys separate. Standardized examples illustrate these uses in key establishment schemes.
The NIST Special Publication 800-56C Revision 2 outlines KDF methods, including one-step and two-step (extract-then-expand) constructions, for deriving keying material from shared secrets, while ANSI X9.63 specifies a hash-based KDF for Diffie-Hellman key agreement. In cryptocurrency applications, hierarchical deterministic (HD) wallets employ KDFs to generate private keys from a master seed, as defined in Bitcoin Improvement Proposal 32 (BIP-32); this uses HMAC-SHA512 for child key derivation along specified paths, enabling organized key trees for multiple addresses without exposing the root seed.

Used in key agreement and key hierarchies, KDFs support forward secrecy when the shared secrets they consume are ephemeral and discarded after derivation, which protects past sessions even if long-term keys are compromised later. Additionally, the domain separation provided by KDF inputs promotes key independence, mitigating risks from key reuse across protocol components and allowing secure expansion from a single high-entropy secret to multiple independent keys.
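The extract-then-expand pattern and the use of context labels for domain separation can be sketched as follows. This is a compact rendering of HKDF as specified in RFC 5869 using only Python's standard library; the shared secret, salt, and labels in the usage section are made-up placeholders, not values taken from TLS or IKEv2.

```python
import hashlib
import hmac

HASH = hashlib.sha256
HASH_LEN = HASH().digest_size  # 32 bytes for SHA-256


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """Extract step: concentrate the input keying material into a fixed-size PRK."""
    if not salt:
        salt = bytes(HASH_LEN)  # RFC 5869 default: a string of HASH_LEN zero bytes
    return hmac.new(salt, ikm, HASH).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """Expand step: stretch the PRK into `length` bytes bound to `info`."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested output is too long for HKDF")
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        # T(i) = HMAC(PRK, T(i-1) || info || i)
        block = hmac.new(prk, block + info + bytes([counter]), HASH).digest()
        okm += block
        counter += 1
    return okm[:length]


if __name__ == "__main__":
    shared_secret = bytes(range(32))           # placeholder for a DH output
    prk = hkdf_extract(b"example-salt", shared_secret)
    enc_key = hkdf_expand(prk, b"example encryption key", 32)
    mac_key = hkdf_expand(prk, b"example mac key", 32)
    print(enc_key != mac_key)                  # distinct labels, distinct keys
```

Each distinct `info` label yields an unrelated-looking key from the same pseudorandom key, which is the domain separation the text describes.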

Security Considerations

Vulnerabilities and Attacks

Key derivation functions (KDFs) are susceptible to offline brute-force attacks when an attacker obtains stored derived keys or hashes, allowing exhaustive guessing without online restrictions. Modern hardware, particularly GPUs, significantly accelerates these attacks by parallelizing computations, reducing the effectiveness of iteration-based slowing mechanisms; for instance, a single high-end GPU like the NVIDIA RTX 4090 can compute over 80 billion hashes per second as of 2025, enabling rapid cracking of weak passwords protected by outdated KDFs. Rainbow table attacks exploit precomputed chains of hash values to reverse derived keys efficiently, but the inclusion of unique salts in KDFs defeats these by requiring recomputation for each salt, rendering precomputed tables ineffective. Side-channel attacks further threaten KDF implementations by analyzing unintended leaks such as execution timing variations or power consumption patterns during derivation, potentially revealing key material without direct access to outputs.

Parallelism in KDF designs introduces vulnerabilities to specialized hardware; for example, scrypt, intended to be memory-hard to resist such threats, has faced ASIC-based attacks in practice, as demonstrated by custom chips developed for scrypt computations in cryptocurrency mining that could also accelerate brute-force efforts. Quantum computing poses an emerging threat via Grover's algorithm, which provides a quadratic speedup for brute-force searches, effectively halving the bit-security of symmetric keys derived by most KDFs; standard constructions do not address this directly. Specific KDFs exhibit targeted weaknesses: PBKDF2 relies solely on CPU-intensive iterations with negligible memory use, a weakness exploitable by GPUs, with attacks achieving speeds over 100,000 derivations per second on commodity hardware, far outpacing CPU defenses.
Analyses as of 2025 highlight side-channel vulnerabilities in Argon2 variants, particularly Argon2d, where data-dependent memory access enables timing or power-based key recovery in unprotected environments. The incompleteness of quantum resistance in current KDFs underscores a critical gap, as Grover's speedup necessitates larger output key sizes or hybrid post-quantum designs to maintain equivalent security margins against exhaustive searches. Memory-hard functions like scrypt aim to mitigate ASIC parallelism by inflating memory costs, though real-world hardware adaptations have partially undermined this.
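The arithmetic behind these threats can be made concrete with a small calculation. The sketch below assumes an attacker who must search on average half of the guess space; the two rates are the figures quoted above (about 80 billion fast hashes per second and about 100,000 PBKDF2 derivations per second), and the function names are hypothetical.

```python
def expected_crack_seconds(entropy_bits: float,
                           guesses_per_second: float) -> float:
    """Average time to hit the right guess: half the space over the guess rate."""
    return (2 ** entropy_bits / 2) / guesses_per_second


def grover_effective_bits(key_bits: int) -> float:
    """Grover's quadratic speedup leaves roughly half the bits of security."""
    return key_bits / 2


if __name__ == "__main__":
    bits = 40  # upper end of the password entropy range cited in this article
    fast = expected_crack_seconds(bits, 80e9)   # fast, unstretched hash
    slow = expected_crack_seconds(bits, 1e5)    # iterated PBKDF2
    print(f"fast hash: {fast:.1f} s")
    print(f"PBKDF2:    {slow / 86400:.1f} days")
    print(f"256-bit key under Grover: ~{grover_effective_bits(256):.0f} bits")
```

Under these assumptions a 40-bit password falls in under ten seconds against a fast hash but takes about two months against the stretched derivation, which is why the work factor rather than the hash output size dominates password security.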

Recommendations and Best Practices

When selecting a key derivation function (KDF), Argon2id is recommended as the preferred option for password-based key derivation due to its resistance to both side-channel and GPU-based attacks. For Argon2id, minimum parameters include 19 MiB of memory, 2 iterations, and 1 degree of parallelism to achieve adequate security while balancing performance. If Argon2id is unavailable, scrypt serves as a suitable alternative, with PBKDF2 reserved for legacy or FIPS-compliant systems.

Implementations of KDFs should incorporate constant-time operations to mitigate timing side-channel attacks, ensuring that execution time does not vary based on secret input values. Salts must be generated randomly, must be unique per derived key, and can be stored in plaintext alongside the derived keys, since they do not need to be secret. Peppers, as application-wide secrets, require secure storage separate from the database, such as in hardware security modules (HSMs) or secrets vaults.

Parameters like iteration counts should be periodically increased to account for advances in hardware, such as greater GPU parallelism, so that derivation times stay at their target. Standards from NIST advise using at least 10,000 iterations for PBKDF2 in password verifiers, with the count tuned as high as server performance permits to resist brute-force attacks, targeting a derivation time of approximately 100-500 ms on standard hardware; the July 2025 revision of SP 800-63B emphasizes dynamic adjustments to these parameters for emerging threats. For PBKDF2-HMAC-SHA256, recommendations as of 2023 call for 600,000 iterations to meet modern security needs while staying within these timing targets. Auditing KDF implementations involves simulating attacks with password-cracking tools that support cracking derived keys from various KDFs, in order to estimate offline attack resistance based on real hardware performance.
Future-proofing involves selecting parameters that improve quantum resistance, such as doubling the output size of the underlying hash function (for example, moving from SHA-256 to SHA-512) to maintain security against Grover's algorithm. Post-2023 recommendations emphasize integrating multi-factor authentication into KDFs via constructions like MFKDF2, which derives keys from factors such as passwords, TOTP codes, and hardware tokens, offering flexible and provably secure multi-factor key derivation without relying on centralized servers. This approach addresses limitations in single-factor derivations while supporting upgrades to stronger parameters as needed.
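One way to act on the advice to tune iteration counts to a target derivation time is to benchmark on the deployment hardware. The sketch below is a hypothetical calibration helper built on the standard library: the function names, the probe size, and the 250 ms default target are assumptions for illustration, and the floor defaults to the 600,000 figure mentioned above.

```python
import hashlib
import os
import time


def calibrate_pbkdf2_iterations(target_seconds: float = 0.25,
                                probe_iterations: int = 50_000,
                                floor: int = 600_000) -> int:
    """Estimate an iteration count that takes about target_seconds here."""
    salt = os.urandom(16)
    start = time.perf_counter()
    hashlib.pbkdf2_hmac("sha256", b"benchmark-password", salt, probe_iterations)
    elapsed = max(time.perf_counter() - start, 1e-9)  # guard against zero
    # PBKDF2 cost is linear in the iteration count, so scale the probe result.
    estimated = int(probe_iterations * target_seconds / elapsed)
    return max(estimated, floor)  # never drop below the chosen minimum


def needs_rehash(stored_iterations: int, current_iterations: int) -> bool:
    """True when a stored record was derived with an outdated work factor."""
    return stored_iterations < current_iterations


if __name__ == "__main__":
    current = calibrate_pbkdf2_iterations()
    print("iterations to use:", current)
    print("upgrade old record:", needs_rehash(10_000, current))
```

A record flagged by `needs_rehash` can be re-derived with the new count the next time the user logs in, since that is the only moment the plaintext password is available.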
