TCP sequence prediction attack
A TCP sequence prediction attack is an attempt to predict the sequence number used to identify the packets in a TCP connection, which can be used to counterfeit packets.[1]
The attacker hopes to correctly guess the sequence number to be used by the sending host. If they can do this, they will be able to send counterfeit packets to the receiving host which will seem to originate from the sending host, even though the counterfeit packets may in fact originate from some third host controlled by the attacker. One possible way for this to occur is for the attacker to listen to the conversation occurring between the trusted hosts, and then to issue packets using the same source IP address. By monitoring the traffic before an attack is mounted, the malicious host can figure out the correct sequence number. After the IP address and the correct sequence number are known, it is basically a race between the attacker and the trusted host to get the correct packet sent. One common way for the attacker to send it first is to launch another attack on the trusted host, such as a denial-of-service attack. Once the attacker has control over the connection, they are able to send counterfeit packets without getting a response.[2]
If an attacker can cause delivery of counterfeit packets of this sort, they may be able to cause various sorts of mischief, including the injection into an existing TCP connection of data of the attacker's choosing, and the premature closure of an existing TCP connection by the injection of counterfeit packets with the RST bit set, a TCP reset attack.
Theoretically, other information such as timing differences or information from lower protocol layers could allow the receiving host to distinguish authentic TCP packets from the sending host from counterfeit TCP packets with the correct sequence number sent by the attacker. If such other information is available to the receiving host, if the attacker cannot also fake that other information, and if the receiving host gathers and uses the information correctly, then the receiving host may be fairly immune to TCP sequence prediction attacks. Usually, this is not the case, so the TCP sequence number is the primary means of protection of TCP traffic against these types of attack.
Another mitigation is to configure routers and firewalls to drop packets that arrive from an external network but carry an internal source IP address. Although this does not fix the underlying weakness, it prevents such spoofed packets from reaching their targets.[2]
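The filtering rule described above can be sketched in a few lines of Python. This is only an illustration of the decision a router or firewall would make: the function name `should_drop`, the flag `arrived_on_external_interface`, and the address ranges in `INTERNAL_NETWORKS` are assumptions made for the example, not part of any real firewall API.

```python
import ipaddress

# Hypothetical internal prefixes; a real site would list its own ranges here.
INTERNAL_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def should_drop(source_ip: str, arrived_on_external_interface: bool) -> bool:
    """Return True when a packet from outside claims an internal source address."""
    if not arrived_on_external_interface:
        # Traffic that originates inside the network is not affected by this rule.
        return False
    addr = ipaddress.ip_address(source_ip)
    return any(addr in net for net in INTERNAL_NETWORKS)


if __name__ == "__main__":
    print(should_drop("10.1.2.3", True))     # spoofed internal address from outside
    print(should_drop("203.0.113.5", True))  # ordinary external address
```

The check only looks at where a packet arrived and what source address it claims, which is why it blocks spoofing of internal hosts from outside without addressing sequence number predictability itself.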
References
1. Bellovin, S.M. (1 April 1989). "Security Problems in the TCP/IP Protocol Suite". ACM SIGCOMM Computer Communication Review. 19 (2): 32–48. doi:10.1145/378444.378449. Retrieved 6 May 2011.
2. "TCP Sequence Prediction Attack". 6 April 2019.
TCP Fundamentals
Sequence Numbers in TCP
Transmission Control Protocol (TCP) employs 32-bit sequence numbers to manage the reliable delivery of data across a connection-oriented network. These sequence numbers, which range from 0 to 2^{32} - 1 and wrap around using modulo arithmetic, assign a unique identifier to every byte of data transmitted. This mechanism enables the receiver to order incoming packets correctly, acknowledge the receipt of specific bytes, and detect issues such as duplicates, losses, or out-of-order arrivals.[7]
During the TCP three-way handshake, sequence numbers are initially established to synchronize the endpoints. The client initiates the connection by sending a SYN segment containing its Initial Sequence Number (ISN) as the sequence number (SEQ). The server responds with a SYN-ACK segment, which includes its own ISN as SEQ and acknowledges the client's ISN by incrementing it by 1 (ACK = client's ISN + 1). Finally, the client sends an ACK segment with SEQ equal to its ISN + 1 and ACK equal to the server's ISN + 1, confirming the synchronization. This process ensures both sides agree on the starting point for their respective byte streams.[8]
In ongoing data transmission, sequence numbers track the position of data within the byte stream. Each TCP segment's SEQ field specifies the sequence number of its first data byte, allowing the receiver to reconstruct the original stream by reassembling segments in order. Acknowledgments are cumulative: an ACK value of X indicates that all bytes up to but not including X have been successfully received and buffered. The sender advances its next sequence number (SND.NXT) after transmitting data and uses the oldest unacknowledged sequence number (SND.UNA) to manage the send window, ensuring flow control and reliability.[7]
For illustration, consider a simple connection where the client selects an ISN of 1000:
- Client → Server: SEQ=1000, SYN
- Server → Client: SEQ=3000, ACK=1001, SYN-ACK
- Client → Server: SEQ=1001, ACK=3001, ACK
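The arithmetic in the exchange above can be reproduced with a short Python sketch. It models only the SEQ and ACK fields, not real packets; the helper names `seq_add` and `three_way_handshake` and the dictionary layout are assumptions made for this illustration.

```python
SEQ_SPACE = 2**32  # sequence numbers are 32-bit and wrap around


def seq_add(seq: int, n: int) -> int:
    """Advance a sequence number by n, wrapping modulo 2^32."""
    return (seq + n) % SEQ_SPACE


def three_way_handshake(client_isn: int, server_isn: int) -> list[dict]:
    """Return the SEQ/ACK values of the three handshake segments."""
    syn = {"flags": "SYN", "seq": client_isn}
    # The SYN consumes one sequence number, so the server acknowledges ISN + 1.
    syn_ack = {"flags": "SYN-ACK", "seq": server_isn, "ack": seq_add(client_isn, 1)}
    ack = {"flags": "ACK", "seq": seq_add(client_isn, 1), "ack": seq_add(server_isn, 1)}
    return [syn, syn_ack, ack]


if __name__ == "__main__":
    for segment in three_way_handshake(1000, 3000):
        print(segment)
```

Calling `three_way_handshake(1000, 3000)` yields the same SEQ and ACK values as the list above, and the modulo in `seq_add` covers the case where an ISN sits at the top of the 32-bit range.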
Initial Sequence Number Selection
In the original TCP specification outlined in RFC 793, the initial sequence number (ISN) serves a critical purpose: to ensure that sequence numbers for a new connection are sufficiently distant from those of any prior connection using the same port pair, thereby preventing the acceptance of old or duplicate packets that could confuse the protocol state machine. This mechanism addresses potential issues arising from network delays or system crashes, where lingering packets from terminated sessions might otherwise be misinterpreted as valid. To achieve this, each TCP endpoint selects its own ISN independently at the start of a connection, which becomes the starting point for all subsequent sequence numbers in that direction.[9]
RFC 793 recommends generating the ISN using a 32-bit clock-based mechanism to promote uniqueness over time. Specifically, the ISN generator is tied to a 32-bit clock whose least significant bit increments roughly every 4 microseconds, resulting in the full sequence space wrapping around approximately every 4.55 hours, as specified in the RFC (a literal calculation using exactly 4 microseconds per tick yields about 4.77 hours). This design provides steady progression through the sequence space rather than unpredictability: because the clock is deterministic, an attacker who can observe or estimate the system's timing can estimate the ISN.[9]
Early TCP implementations often deviated from this ideal, introducing even greater predictability in ISN selection. For instance, the 4.2BSD Unix TCP/IP stack employed a global ISN counter that incremented linearly: by 128 every second of system uptime and by an additional 64 after each new connection initiation. This approach, while simple, exacerbated vulnerability to prediction, as the ISN for a new connection could be estimated by monitoring prior connections and applying the fixed increments.
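The two figures above, the wrap time of a 4-microsecond clock and the linear 4.2BSD counter, can be checked with a small Python sketch. The function names `wrap_hours` and `bsd42_isn` are hypothetical, and the counter is a simplified model of the increments described in the text rather than the actual 4.2BSD code.

```python
SEQ_SPACE = 2**32  # size of the 32-bit sequence number space


def wrap_hours(tick_seconds: float) -> float:
    """Hours for a counter ticking every tick_seconds to cover the full space."""
    return SEQ_SPACE * tick_seconds / 3600


def bsd42_isn(base_isn: int, elapsed_seconds: int, new_connections: int) -> int:
    """Simplified model: +128 per second of uptime, +64 per new connection."""
    return (base_isn + 128 * elapsed_seconds + 64 * new_connections) % SEQ_SPACE


if __name__ == "__main__":
    # Literal calculation with exactly 4 microseconds per tick.
    print(round(wrap_hours(4e-6), 2))
    # ISN ten seconds and three connections after a counter value of 1000.
    print(bsd42_isn(1000, 10, 3))
```

The model makes the weakness concrete: every input to `bsd42_isn` is either observable or easy to estimate from outside the host.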
Other early systems similarly relied on fixed starting values or purely linear progressions without sufficient randomization, further compromising the intended protections of the RFC.
Attack Mechanism
Predicting Sequence Numbers
In a TCP sequence prediction attack targeting an ongoing session, the off-path attacker predicts the sequence numbers based on known vulnerabilities in the target's TCP implementation, such as predictable Initial Sequence Number (ISN) generation algorithms or patterns inferred from probing the system.[1] Attackers may establish probe connections to the target host to observe ISN values and model the generation process, for example, by noting time-based increments without accessing the actual session traffic.[10]
Prediction methods rely on exploiting predictable patterns in vulnerable TCP implementations, such as those deriving sequence numbers from system clocks. In early systems like 4.2BSD Unix, sequence numbers incremented linearly by a fixed amount (typically 128 per second plus 64 per new connection), enabling attackers to estimate future numbers by observing probe values and applying the known offset.[10] More generally, attackers guess based on clock-derived increments, adding time-based offsets (e.g., microseconds since boot) or assuming linear progression in flawed generators that lack sufficient randomness.[11]
Key challenges include sequence number wrapping, where the 32-bit field overflows after reaching 4,294,967,295, potentially disrupting predictions if the session spans the wrap point.[9] Variable increments further complicate accuracy, as retransmissions, packet delays, or out-of-order delivery can alter the expected progression beyond simple linear estimates.[1] Success depends on low network latency to enable real-time prediction and injection, minimizing the gap between prediction and action.
For instance, using data from probes, the attacker can predict the current sequence number by adding estimated increments based on elapsed time and connection activity to a baseline ISN value.
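The extrapolation described above amounts to fitting a line through observed values and handling the 32-bit wrap. The following Python sketch shows that arithmetic against a purely linear legacy generator; the names `seq_diff`, `estimate_rate`, and `predict_isn`, and the representation of a probe as a `(time, isn)` pair, are assumptions for illustration, and the model ignores the per-connection increments and network jitter mentioned in the text.

```python
SEQ_SPACE = 2**32  # 32-bit sequence numbers wrap at this value


def seq_diff(later: int, earlier: int) -> int:
    """Forward distance between two sequence numbers, tolerant of wraparound."""
    return (later - earlier) % SEQ_SPACE


def estimate_rate(probe1: tuple[float, int], probe2: tuple[float, int]) -> float:
    """Estimate ISN growth per second from two (time, isn) observations."""
    (t1, isn1), (t2, isn2) = probe1, probe2
    return seq_diff(isn2, isn1) / (t2 - t1)


def predict_isn(last_probe: tuple[float, int], rate: float, at_time: float) -> int:
    """Extrapolate linearly from the most recent probe to a later time."""
    t, isn = last_probe
    return (isn + round(rate * (at_time - t))) % SEQ_SPACE


if __name__ == "__main__":
    first = (0.0, 4294967200)  # observed just before the counter wraps
    second = (2.0, 160)        # observed just after the wrap
    rate = estimate_rate(first, second)
    print(rate, predict_isn(second, rate, 5.0))
```

Because `seq_diff` works modulo 2^32, the two probes straddling the wrap point still give a rate of 128 per second in this example.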
This approach succeeds in vulnerable systems where patterns remain consistent, but modern randomized implementations render it ineffective.[11]
Session Hijacking via Injection
Once the attacker has successfully predicted the TCP sequence numbers for an ongoing session, the hijacking proceeds by injecting forged packets that exploit the legitimacy checks based on these numbers. The attacker injects forged packets using the predicted sequence number (SEQ), the estimated acknowledgment number (ACK), and the legitimate source IP address and port to impersonate the sender. For instance, a data packet carrying malicious payload, such as commands in a telnet session, can be injected directly into the stream. The packet includes the attacker's controlled payload while maintaining TCP header details like flags (e.g., PSH/ACK) and the predicted SEQ to pass validation; upon receipt, the receiver accepts it as valid if the SEQ falls within the current window, effectively incorporating the injected content.[12][2]
The attack culminates in the receiver processing the injected payload, leading to outcomes such as session takeover. In vulnerable services like telnet or rlogin, this enables command injection, allowing the attacker to execute arbitrary instructions on the target host as if from the legitimate user, or to manipulate data flows, such as redirecting traffic or altering session state. For example, injecting a command like "whoami" in a shell session could reveal user privileges, escalating the compromise.[1][2]
However, the attack has inherent limitations that constrain its feasibility. It requires the attacker to have network visibility or routing that allows spoofed packets to reach the target without being filtered, often necessitating proximity to the same subnet or control over intervening routers. Additionally, if the predicted SEQ deviates from the actual value by more than the receiver's current window size (typically 1 to 65,535 bytes), the forged packet will be rejected as out-of-sequence, rendering the injection ineffective.[1][12]
Historical Context
Early Vulnerabilities
TCP reached wide deployment through the TCP/IP protocol suite shipped in the 4.2BSD release in August 1983, marking a significant milestone in the development of the ARPANET and early internetworking.[13] This implementation, derived from the specifications in RFC 793 published in 1981, established TCP as a reliable transport protocol but contained inherent design and implementation flaws that exposed it to sequence prediction attacks.[14] Initial vulnerabilities became evident shortly after deployment, with the first detailed analysis appearing in 1985. This vulnerability was further analyzed publicly by Steve Bellovin in his 1989 paper, which described sequence number spoofing in conjunction with IP spoofing.[12]
A primary flaw stemmed from the predictable generation of Initial Sequence Numbers (ISNs) in early TCP implementations. According to RFC 793, ISNs were intended to be selected from a clock-driven 32-bit counter that incremented approximately every 4 microseconds, cycling every 4.55 hours to ensure uniqueness across connection incarnations.[14] However, the 4.2BSD implementation rendered this mechanism highly predictable by incrementing the global ISN by 128 units every second and by an additional 64 units after each new connection was established, allowing an attacker to observe ongoing traffic and compute future ISNs with relative ease.[10] This lack of randomization meant that sequence numbers for new connections could be guessed accurately. A forged segment still had to fall within the receiver's window (typically 1 KB to 32 KB in early systems, such as the default 2 KB receive buffer in 4.2BSD), which bounds the range of acceptable sequence numbers, but with ISNs this predictable that constraint was easy to satisfy.[10][15]
These vulnerabilities affected early UNIX variants, particularly 4.2BSD and subsequent releases up to the early 1990s, which were widely adopted in academic, research, and defense networks.[10] Systems running these implementations, including those at Bell Labs and U.S. Department of Defense sites, relied solely on sequence numbers for data integrity and connection validation, without any cryptographic protections or additional authentication mechanisms in the protocol design.[14] As noted by Robert Morris in his 1985 technical report, this dependence on guessable sequence numbers enabled off-path attackers to forge packets and hijack sessions, highlighting a fundamental weakness in the trust model of early TCP.[10]
Notable Exploits
One of the earliest notable demonstrations of a TCP sequence prediction attack was detailed by Robert T. Morris in his 1985 paper on vulnerabilities in the 4.2BSD Unix TCP/IP implementation. Morris illustrated how the predictable generation of initial sequence numbers (ISNs) allowed an off-path attacker to forge packets and hijack ongoing sessions by guessing the next expected sequence number with high probability after observing prior connections from the target host.[10] This proof-of-concept highlighted the potential for session hijacking without direct network access, establishing the foundational risks of non-random ISNs.
In the 1990s, hacker Kevin Mitnick applied TCP sequence prediction techniques to hijack sessions in networks including DECNET, most famously during his 1994 intrusion into Tsutomu Shimomura's systems. By combining IP spoofing with accurate prediction of sequence numbers, Mitnick gained unauthorized access to workstations, demonstrating the attack's real-world viability against production environments without on-path presence.[16] This exploit, part of Mitnick's broader activities, evaded detection long enough to steal proprietary software and data, amplifying concerns over TCP's authentication weaknesses.
Early proof-of-concept tools emerged in the mid-1990s to test and demonstrate sequence prediction vulnerabilities, such as a 1995 implementation shared on the Bugtraq mailing list that achieved high success rates on systems with linear or incremental ISN generators by brute-forcing guesses during connection establishment.[17] These tools, often scripted in C or Perl, targeted legacy Unix variants and revealed that many deployed systems remained susceptible, with prediction difficulty varying based on ISN algorithms. The exploits collectively drove widespread awareness, as documented in academic literature and CERT advisories, which quantified risks like session hijacking probabilities and urged protocol improvements to prevent off-path attacks.[18]
Defenses and Mitigations
Randomization of ISNs
To mitigate TCP sequence prediction attacks, the randomization of Initial Sequence Numbers (ISNs) emerged as a core defense mechanism, evolving through IETF standards to ensure unpredictability. In 1996, RFC 1948 proposed an initial approach using a combination of a fine-grained timer (advancing every 4 microseconds) and a cryptographic hash function applied to the connection tuple (source and destination IP addresses and ports) along with a secret value, aiming to produce ISNs that resist guessing by off-path attackers.[11] This approach sought to avoid the predictable increments used in earlier TCP implementations, such as those incrementing by a fixed value per connection or time unit. By 2012, RFC 6528 advanced this to standards track status, specifying a cryptographically secure pseudorandom function (PRF) for ISN generation (typically MD5 or a stronger hash like SHA-1) that incorporates the same connection details plus a periodically refreshed secret key (at least 128 bits) and the timer offset, thereby formalizing robust protection against sequence number inference.[1]
Modern TCP implementations in operating systems have adopted these standards using per-connection cryptographic computations to derive 32-bit ISNs from larger internal states, ensuring high entropy without maintaining connection-specific state.
For instance, the Linux kernel has employed hash-based ISN generation since version 2.6.12 (released in 2005), leveraging functions like secure_tcp_seq that apply a keyed hash (initially MD5-based, later upgraded to SipHash for efficiency) over the connection parameters and system secrets, often combined with a 64-bit monotonic counter to prevent reuse across rapid connections.[19] Similarly, Microsoft Windows implementations from Windows Vista (2007) onward integrate strong randomization using per-connection hashing mechanisms, aligning with RFC 6528 guidelines to produce unpredictable ISNs.[20] These approaches typically reseed secrets on boot or at intervals, balancing security with performance in high-throughput environments.
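The general scheme described above, a 4-microsecond timer plus a keyed hash of the connection tuple, can be sketched in Python as follows. This is a minimal illustration and not the code of any operating system: the function names are hypothetical, and HMAC-SHA-256 is used here as a stand-in keyed function in place of the MD5-based or SipHash-based constructions the text mentions.

```python
import hashlib
import hmac
import ipaddress
import struct

SEQ_SPACE = 2**32  # ISNs are 32-bit values


def timer_ticks(now_seconds: float) -> int:
    """Timer component: a counter that advances once every 4 microseconds."""
    return int(now_seconds * 1_000_000) // 4


def generate_isn(local_ip: str, local_port: int, remote_ip: str,
                 remote_port: int, secret: bytes, now_seconds: float) -> int:
    """ISN = (timer + keyed_hash(connection tuple, secret)) mod 2^32."""
    tuple_bytes = (
        ipaddress.ip_address(local_ip).packed + struct.pack("!H", local_port)
        + ipaddress.ip_address(remote_ip).packed + struct.pack("!H", remote_port)
    )
    # Stand-in keyed function (an assumption of this sketch): HMAC-SHA-256,
    # truncated to 32 bits to form the per-connection offset.
    digest = hmac.new(secret, tuple_bytes, hashlib.sha256).digest()
    offset = int.from_bytes(digest[:4], "big")
    return (timer_ticks(now_seconds) + offset) % SEQ_SPACE


if __name__ == "__main__":
    secret = b"k" * 16  # illustrative only; a real secret would be random
    print(generate_isn("192.0.2.1", 12345, "198.51.100.7", 80, secret, 1.0))
```

For a fixed connection tuple the ISN still advances with the timer, which preserves the separation between successive incarnations of the same connection, while an observer without the secret cannot relate the ISN seen on one tuple to the ISN used on another.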
The effectiveness of ISN randomization lies in expanding the prediction space to the full 32-bit range of TCP sequence numbers, reducing the success probability of blind guessing to approximately 1 in 4.29 billion (2^{32}), which effectively thwarts off-path injection attacks by making valid sequence prediction computationally infeasible without observing prior traffic. This is exemplified by the common formula for ISN computation: