Recent from talks
Nothing was collected or created yet.
Chosen-plaintext attack
View on WikipediaThis article needs additional citations for verification. (November 2015) |
A chosen-plaintext attack (CPA) is an attack model for cryptanalysis which presumes that the attacker can obtain the ciphertexts for arbitrary plaintexts.[1] The goal of the attack is to gain information that reduces the security of the encryption scheme.[2]
Modern ciphers aim to provide semantic security, also known as ciphertext indistinguishability under chosen-plaintext attack, and they are therefore, by design, generally immune to chosen-plaintext attacks if correctly implemented.
Introduction
[edit]In a chosen-plaintext attack the adversary can (possibly adaptively) ask for the ciphertexts of arbitrary plaintext messages. This is formalized by allowing the adversary to interact with an encryption oracle, viewed as a black box. The attacker’s goal is to reveal all or a part of the secret encryption key.
It may seem infeasible in practice that an attacker could obtain ciphertexts for given plaintexts. However, modern cryptography is implemented in software or hardware and is used for a diverse range of applications; for many cases, a chosen-plaintext attack is often very feasible (see also In practice). Chosen-plaintext attacks become extremely important in the context of public key cryptography where the encryption key is public and so attackers can encrypt any plaintext they choose.
Different forms
[edit]There are two forms of chosen-plaintext attacks:
- Batch chosen-plaintext attack, where the adversary chooses all of the plaintexts before seeing any of the corresponding ciphertexts. This is often the meaning intended by "chosen-plaintext attack" when this is not qualified.
- Adaptive chosen-plaintext attack (CPA2), where the adversary can request the ciphertexts of additional plaintexts after seeing the ciphertexts for some plaintexts.
General method of an attack
[edit]A general batch chosen-plaintext attack is carried out as follows [failed verification]:
- The attacker may choose n plaintexts. (This parameter n is specified as part of the attack model, it may or may not be bounded.)
- The attacker then sends these n plaintexts to the encryption oracle.
- The encryption oracle will then encrypt the attacker's plaintexts and send them back to the attacker.
- The attacker receives n ciphertexts back from the oracle, in such a way that the attacker knows which ciphertext corresponds to each plaintext.
- Based on the plaintext–ciphertext pairs, the attacker can attempt to extract the key used by the oracle to encode the plaintexts. Since the attacker in this type of attack is free to craft the plaintext to match his needs, the attack complexity may be reduced.
Consider the following extension of the above situation. After the last step,
- The adversary outputs two plaintexts m0 and m1.
- A bit b is chosen uniformly at random .
- The adversary receives the encryption of mb, and attempts to "guess" which plaintext it received, and outputs a bit b'.
A cipher has indistinguishable encryptions under a chosen-plaintext attack if after running the above experiment the adversary can't guess correctly (b=b') with probability non-negligibly better than 1/2.[3]
Examples
[edit]The following examples demonstrate how some ciphers that meet other security definitions may be broken with a chosen-plaintext attack.
Caesar cipher
[edit]The following attack on the Caesar cipher allows full recovery of the secret key:
- Suppose the adversary sends the message:
Attack at dawn, - and the oracle returns
Nggnpx ng qnja. - The adversary can then work through to recover the key in the same way as a Caesar cipher. The adversary could deduce the substitutions
A→N,T→Gand so on. This would lead the adversary to determine that 13 was the key used in the Caesar cipher.
With more intricate or complex encryption methodologies the decryption method becomes more resource-intensive, however, the core concept is still relatively the same.
One-time pads
[edit]The following attack on a one-time pad allows full recovery of the secret key. Suppose the message length and key length are equal to n.
- The adversary sends a string consisting of n zeroes to the oracle.
- The oracle returns the bitwise exclusive-or of the key with the string of zeroes.
- The string returned by the oracle is the secret key.
While the one-time pad is used as an example of an information-theoretically secure cryptosystem, this security only holds under security definitions weaker than CPA security. This is because under the formal definition of CPA security the encryption oracle has no state. This vulnerability may not be applicable to all practical implementations – the one-time pad can still be made secure if key reuse is avoided (hence the name "one-time" pad).
In practice
[edit]In World War II US Navy cryptanalysts discovered that Japan was planning to attack a location referred to as "AF". They believed that "AF" might be Midway Island, because other locations in the Hawaiian Islands had codewords that began with "A". To prove their hypothesis that "AF" corresponded to "Midway Island" they asked the US forces at Midway to send a plaintext message about low supplies. The Japanese intercepted the message and immediately reported to their superiors that "AF" was low on water, confirming the Navy's hypothesis and allowing them to position their force to win the battle.[3][4]
Also during World War II, Allied codebreakers at Bletchley Park would sometimes ask the Royal Air Force to lay mines at a position that didn't have any abbreviations or alternatives in the German naval system's grid reference. The hope was that the Germans, seeing the mines, would use an Enigma machine to encrypt a warning message about the mines and an "all clear" message after they were removed, giving the allies enough information about the message to break the German naval Enigma. This process of planting a known-plaintext was called gardening.[5] Allied codebreakers also helped craft messages sent by double agent Juan Pujol García, whose encrypted radio reports were received in Madrid, manually decrypted, and then re-encrypted with an Enigma machine for transmission to Berlin.[6] This helped the codebreakers decrypt the code used on the second leg, having supplied the original text.[7]
In modern day, chosen-plaintext attacks (CPAs) are often used to break symmetric ciphers. To be considered CPA-secure, the symmetric cipher must not be vulnerable to chosen-plaintext attacks. Thus, it is important for symmetric cipher implementors to understand how an attacker would attempt to break their cipher and make relevant improvements.
For some chosen-plaintext attacks, only a small part of the plaintext may need to be chosen by the attacker; such attacks are known as plaintext injection attacks.
Relation to other attacks
[edit]A chosen-plaintext attack is more powerful than known-plaintext attack, because the attacker can directly target specific terms or patterns without having to wait for these to appear naturally, allowing faster gathering of data relevant to cryptanalysis. Therefore, any cipher that prevents chosen-plaintext attacks is also secure against known-plaintext and ciphertext-only attacks.
However, a chosen-plaintext attack is less powerful than a chosen-ciphertext attack, where the attacker can obtain the plaintexts of arbitrary ciphertexts. A CCA-attacker can sometimes break a CPA-secure system.[3] For example, the El Gamal cipher is secure against chosen plaintext attacks, but vulnerable to chosen ciphertext attacks because it is unconditionally malleable.
See also
[edit]References
[edit]- ^ Ross Anderson, Security Engineering: A Guide to Building Dependable Distributed Systems. The first edition (2001): http://www.cl.cam.ac.uk/~rja14/book.html
- ^ Barrera, John Fredy; Vargas, Carlos; Tebaldi, Myrian; Torroba, Roberto (2010-10-15). "Chosen-plaintext attack on a joint transform correlator encrypting system". Optics Communications. 283 (20): 3917–3921. Bibcode:2010OptCo.283.3917B. doi:10.1016/j.optcom.2010.06.009. ISSN 0030-4018.
- ^ a b c Katz, Jonathan; Lindell, Yehuda (2007). Introduction to Modern Cryptography: Principles and Protocols. Boca Raton: Chapman and Hall/CRC. ISBN 978-1584885511. OCLC 893721520.
- ^ Morris, Christopher (1993), "Navy Ultra's Poor Relations", in Hinsley, F.H.; Stripp, Alan (eds.), Codebreakers: The inside story of Bletchley Park, Oxford: Oxford University Press, p. 235, ISBN 978-0-19-280132-6
- ^ Kelly, Jon (27 January 2011). "The piece of paper that fooled Hitler". BBC. Retrieved 1 January 2012.
The Nazis believed Pujol, whom they code named Alaric Arabel, was one of their prize assets
- ^ Seaman (2004). "The first code which Garbo was given by the Germans for his wireless communications turned out to be the identical code which was currently in use in the German circuits"
Chosen-plaintext attack
View on GrokipediaFundamentals
Definition
A chosen-plaintext attack (CPA) is a model of cryptanalysis in which an adversary can select arbitrary plaintext messages and obtain the corresponding ciphertexts produced by an encryption scheme under an unknown secret key, with the objective of recovering the key or enabling decryption of additional ciphertexts.[4] This attack assumes active interaction with the encryption process, distinguishing it from passive eavesdropping, where the adversary only observes existing ciphertexts without influencing the inputs.[1] Key terms in this context include plaintext, which refers to the original, unencrypted message; ciphertext, the encrypted output; and encryption oracle, a conceptual black-box interface that encrypts chosen plaintexts without disclosing the underlying key.[5] The adversary, denoted as , interacts with this oracle by submitting plaintexts and receiving ciphertexts , where is the encryption function and is the secret key hidden from . The concept of chosen-plaintext attacks was first formalized in the 1970s amid the development of public-key cryptography, notably in the seminal work by Diffie and Hellman, who highlighted such attacks as a critical threat to encryption systems beyond mere ciphertext-only analysis.[1]Adversarial Model
In the chosen-plaintext attack (CPA) model, the adversary is classified as active, interacting directly with the encryption system by submitting plaintexts of its choice to obtain corresponding ciphertexts, in contrast to passive adversaries who only observe existing communications. This active role enables the attacker to gather targeted information about the encryption process. Adversaries are further distinguished as adaptive or non-adaptive: adaptive adversaries may choose subsequent plaintexts based on responses to prior queries, allowing dynamic strategy refinement, while non-adaptive (or batch) adversaries submit all plaintexts in advance without feedback. The adaptive variant is the standard assumption in formal CPA definitions, as it captures more powerful attacks.[6] Adversaries in the CPA model are constrained to be probabilistic polynomial-time (PPT) algorithms, meaning they run in time polynomial in the security parameter , and their behavior incorporates randomness to model realistic computational limitations. They are permitted a polynomial number of queries to the encryption oracle, ensuring the attack remains feasible only under bounded resources. These bounds reflect the foundational assumption in modern cryptography that no efficient algorithm can break secure systems with non-negligible probability.[7][6] The primary goal of a CPA adversary is to distinguish between encryptions of two distinct messages or, more broadly, to recover partial information about plaintexts, such as satisfying a predicate on the message. Success is quantified by the distinguishing advantage , defined as the absolute difference between the probability of correctly guessing the encrypted message and random guessing (), where a secure scheme ensures is negligible in for all PPT adversaries. Secondary goals like full key recovery imply CPA vulnerability but are not the minimal requirement for the model.[7][6] The adversary gains black-box access to an encryption oracle , which computes ciphertexts for chosen plaintexts under a secret key without revealing or internal details of the algorithm. This oracle simulates real-world scenarios where an attacker might control plaintext submission, such as in a compromised encryption device or service, but lacks decryption capabilities. In the indistinguishability formulation, the oracle may be augmented to a left-or-right variant that encrypts one of two submitted messages, formalizing the distinguishing challenge.[6]Attack Methods
General Procedure
A chosen-plaintext attack (CPA) follows a structured methodology that leverages access to an encryption oracle, allowing the adversary to select plaintexts and observe corresponding ciphertexts under a fixed secret key. This general procedure applies across various cryptosystems, particularly block ciphers, and aims to recover the key or forge new ciphertexts by exploiting structural weaknesses. The process is divided into distinct phases to systematically probe and analyze the encryption function.[8] The initialization phase involves setting up the adversarial environment, including establishing access to the encryption oracle and understanding the target cryptosystem's parameters, such as block size, key length, and round structure. During this setup, the adversary may perform preliminary tests with simple plaintexts to confirm oracle functionality and identify any observable behaviors, such as error messages or timing leaks, without yet committing to a full attack strategy. This phase ensures efficient resource allocation for subsequent steps, as seen in standard cryptanalytic frameworks for symmetric ciphers.[9] In the query phase, the adversary adaptively or non-adaptively selects plaintexts to submit to the oracle, receiving the resulting ciphertexts. Plaintext selection strategies are tailored to the anticipated weakness; for instance, choosing all-zero plaintexts can probe substitution patterns in the cipher's S-boxes, revealing key-dependent mappings, while structured inputs like those with specific bit differences target differential properties. The number of queries varies by attack type—representative examples include 2^{43} plaintexts for linear cryptanalysis on DES or fewer for simpler systems—but the goal is to generate plaintext-ciphertext pairs that amplify exploitable biases or correlations. This phase may involve multiple iterations, with each new plaintext potentially informed by prior responses in adaptive scenarios.[9] The analysis phase processes the collected ciphertexts using techniques such as statistical analysis to detect non-random distributions, linear approximations to approximate key bits via high-probability equations over plaintext-ciphertext pairs, or exhaustive search over reduced key spaces derived from partial information. For example, in linear cryptanalysis, the adversary computes correlations between linear combinations of input and output bits across queries to hypothesize subkey values, verifying them against observed data. These methods reduce the effective key space, enabling feasible computation even for large keys.[9] Finally, the exploitation phase utilizes insights from analysis to achieve the attack's objective, such as full key recovery, decryption of arbitrary ciphertexts, or forgery of valid encryptions. Success is measured by the attack's ability to outperform random guessing, typically achieving key recovery with probability significantly greater than 1/2^{key length}, often exceeding 90% in optimized cases like linear attacks on reduced-round ciphers. The attack breaks the system if this advantage allows practical decryption or impersonation.[9] To illustrate the query phase's iterative nature, the following pseudocode depicts a basic loop for submitting chosen plaintexts and collecting responses:initialize empty list PT = [], CT = []
for i from 1 to num_queries:
select plaintext m_i based on strategy (e.g., all zeros or differential pair)
query [oracle](/page/Oracle): c_i = Enc_k(m_i)
append m_i to PT
append c_i to CT
return PT, CT for analysis
initialize empty list PT = [], CT = []
for i from 1 to num_queries:
select plaintext m_i based on strategy (e.g., all zeros or differential pair)
query [oracle](/page/Oracle): c_i = Enc_k(m_i)
append m_i to PT
append c_i to CT
return PT, CT for analysis
Oracle-Based Techniques
In chosen-plaintext attacks (CPAs), oracle-based techniques exploit the encryption oracle by submitting carefully crafted plaintexts to obtain corresponding ciphertexts, thereby inferring key material or structural weaknesses in the cipher. These methods prioritize strategic query selection to maximize information gain per interaction, distinguishing them from brute-force or passive approaches. Seminal works emphasize the encryption oracle's role in enabling controlled experimentation, as seen in differential cryptanalysis where specific plaintext differences are chosen to propagate through the cipher rounds. Query optimization forms a core element of these techniques, particularly through adaptive querying, which refines subsequent plaintext choices based on prior oracle responses. This iterative refinement allows attackers to efficiently explore the key space or plaintext dependencies, reducing overall query needs compared to non-adaptive strategies. A representative example occurs in cryptanalysis of order-preserving encryption (OPE) schemes, where an adversary uses binary search on the plaintext domain to recover the ordering of encrypted values. Starting with a midpoint plaintext (where is the domain size), the attacker queries its encryption and compares it against known points; depending on the result, the search interval halves, achieving key dependency isolation in queries, with the domain size.[10] Efficiency metrics underscore the practicality of oracle-based techniques, with query complexity often expressed as , where is the number of required plaintext-ciphertext pairs. In differential cryptanalysis, for instance, breaking the full 16-round DES demands approximately chosen plaintexts, paired with a time complexity of equivalent encryptions for analysis, illustrating how query volume trades against computational resources. Space requirements scale linearly with for storing pairs, while adaptive optimizations like binary search minimize (e.g., logarithmic reductions) at the expense of heightened per-query processing. These trade-offs are critical, as excessive queries risk detection, whereas intensive computation may exceed feasible limits.[11] Common pitfalls arise from the encryption oracle's inherent constraints in pure CPA scenarios, which limit outputs to ciphertexts alone, excluding decryption access or feedback on validity—unlike chosen-ciphertext settings that enable padding oracle exploitation through error signals. Consequently, techniques simulating padding-like leaks via edge-case plaintexts (e.g., malformed inputs causing internal validation failures) remain constrained, often yielding incomplete information. To mitigate detection, attackers adopt avoidance strategies such as randomizing query patterns or mimicking legitimate plaintext distributions, ensuring queries blend with normal oracle usage without triggering rate limits or logging anomalies.[12]Historical and Illustrative Examples
Caesar Cipher
The Caesar cipher, dating to the 1st century BCE, was employed by Julius Caesar for securing military and official communications, as documented in ancient accounts by Suetonius and later analyzed in comprehensive histories of cryptography. This simple substitution cipher uses a fixed rotation amount (typically between 1 and 25) as the key, where encryption shifts each plaintext letter by positions in the alphabet, mathematically expressed as , with letters represented numerically (A = 0, B = 1, ..., Z = 25).[13] The Caesar cipher exemplifies a basic vulnerability to chosen-plaintext attacks (CPAs) due to its deterministic encryption and limited key space of 25 possibilities. In a CPA scenario, the attacker accesses an encryption oracle to submit chosen plaintexts and receive corresponding ciphertexts, enabling key recovery without needing intercepted messages.[13] To execute the attack, the attacker begins by querying the oracle with the single-character plaintext "A" (p = 0). The resulting ciphertext reveals the key directly: if the output is "D" (c = 3), then , as . With the key known, the attacker can compute the decryption function for any ciphertext, fully compromising the system. For a step-by-step illustration assuming the attacker verifies the mapping across the alphabet (though unnecessary for key recovery), subsequent queries could include "B" (expected ciphertext "E" for k=3), "C" ("F"), and so on up to "Z" ("C"), confirming the uniform shift and exposing the entire substitution table:A→D, B→E, C→F, ..., W→Z, X→A, Y→B, Z→C. In the worst case, exhaustively querying all 26 letters maps the full permutation, but a single targeted query like "A" suffices in practice.[13] This vulnerability arises from the cipher's linear, key-dependent structure, which allows complete key exposure through minimal chosen plaintexts, rendering it insecure against even basic adversarial access to the encryption process. Early 20th-century cryptography texts, such as those by William Friedman, first detailed such systematic attacks on classical ciphers like the Caesar in pedagogical contexts.
One-Time Pad
The one-time pad is a symmetric-key encryption system in which the plaintext message is encrypted by bitwise XOR with a secret key of equal length, producing the ciphertext . Decryption reverses the process using the same key, as , since XOR is its own inverse. This construction requires the key to be generated from a truly uniform random source and used only once for a single message of matching length.[14] Theoretically, the one-time pad provides perfect secrecy against chosen-plaintext attacks, meaning that even an adversary with unlimited computational resources and access to an encryption oracle cannot gain any information about the plaintext from the ciphertext. Claude Shannon proved in 1949 that this security holds if the key is truly random, uniformly distributed over all possible bit strings of the message length, and never reused, ensuring that every possible plaintext is equally likely given any ciphertext. In a CPA setting, unlimited queries to the oracle yield only random-looking ciphertexts indistinguishable from encryptions of arbitrary chosen plaintexts, preserving the key's secrecy.[14] In practice, however, implementation flaws such as key reuse introduce severe vulnerabilities exploitable via chosen-plaintext techniques. During World War II, the Venona project revealed how Soviet intelligence reused one-time pads under wartime production pressures, allowing U.S. cryptanalysts to break numerous diplomatic and espionage messages. By identifying reused pads and employing cribs—guessed common phrases or structures in plaintexts akin to chosen-plaintext submissions—the analysts recovered keys and full messages. A classic two-time pad attack demonstrates this: if the same key encrypts two distinct messages and to yield and , then XORing the ciphertexts gives . Guessing one plaintext segment (e.g., via a crib) allows recovery of the other, cascading to the full messages and key.[15] Similar reuse issues compromised U.S. diplomatic communications in the 1940s, where the State Department repeatedly used daily keys an average of nine times across messages, enabling enemy cryptanalysts to exploit patterns and recover sensitive content through methods analogous to CPA recoveries. These breaches led to significant intelligence leaks, underscoring how deviations from one-time use undermine the system's theoretical invulnerability.[16]Security Notions and Implications
Indistinguishability under CPA
Indistinguishability under chosen-plaintext attack (IND-CPA) is a fundamental security notion in modern cryptography that formalizes the requirement for an encryption scheme to resist adversaries who can obtain encryptions of chosen plaintexts.[6] In this model, the adversary interacts with an encryption oracle during a query phase, submitting any plaintexts of its choice to receive corresponding ciphertexts, but without access to the secret key.[17] Following this phase, the adversary selects two distinct plaintext messages and of equal length and receives the encryption of one of them, chosen uniformly at random (say where ). The adversary then outputs a guess for , aiming to determine which message was encrypted based solely on the challenge ciphertext and prior oracle queries.[6] The IND-CPA game proceeds in two main phases: the query phase, where the adversary makes polynomially many encryption queries to the oracle , and the challenge phase, where receives and attempts to guess . The advantage of in this game is defined as , where is the security parameter and is the encryption scheme. An encryption scheme is IND-CPA secure if, for all probabilistic polynomial-time adversaries , this advantage is negligible in .[6] To establish IND-CPA security, cryptosystems are typically proven secure via reductions to well-studied hardness assumptions. For the ElGamal encryption scheme, security reduces to the decisional Diffie-Hellman (DDH) assumption in a cyclic group of prime order. The proof proceeds by a hybrid argument over game hops, where the real IND-CPA game is shown to be indistinguishable from a game where the challenge ciphertext uses a random group element instead of the true encryption. Specifically, consider the following games for an adversary :- Game 0: The standard IND-CPA game, where wins if .
- Game 1: Identical to Game 0, except the second component of the challenge ciphertext is replaced by a random (independent of ).
