Handshake (computing)
In computing, a handshake is a process in which two devices establish a communication link by authenticating and validating each other's signals. An example is the handshaking between a hypervisor and an application in a guest virtual machine.
In telecommunications, a handshake is an automated process of negotiation between two participants (for example, "Alice and Bob") through the exchange of information that establishes the protocols of a communication link at the start of the communication, before full communication begins.[1] The handshaking process usually takes place in order to establish rules for communication when a computer attempts to communicate with another device. Signals are usually exchanged between two devices to establish a communication link. For example, when a computer communicates with another device such as a modem, the two devices signal each other that they are switched on and ready to work, and agree on which protocols to use.[2]
Handshaking can negotiate parameters that are acceptable to equipment and systems at both ends of the communication channel, including information transfer rate, coding alphabet, parity, interrupt procedure, and other protocol or hardware features. Handshaking is a technique of communication between two entities. However, within TCP/IP RFCs, the term "handshake" is most commonly used to reference the TCP three-way handshake. For example, the term "handshake" is not present in RFCs covering FTP or SMTP. One exception is the setup of Transport Layer Security (TLS) for FTP, described in RFC 4217. In place of the term "handshake", FTP RFC 3659 substitutes the term "conversation" for the passing of commands.[3][4][5]
A simple handshaking protocol might only involve the receiver sending a message meaning "I received your last message and I am ready for you to send me another one." A more complex handshaking protocol might allow the sender to ask the receiver if it is ready to receive or for the receiver to reply with a negative acknowledgement meaning "I did not receive your last message correctly, please resend it" (e.g., if the data was corrupted en route).[6]
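The acknowledgement and negative-acknowledgement exchange described above can be sketched in a few lines of Python. This is a minimal illustration, not a real protocol: the frame layout (payload followed by a CRC-32 checksum), the `flaky_channel` helper, and the retry limit are all assumptions introduced here.

```python
import zlib


def make_frame(payload: bytes) -> bytes:
    """Append a CRC-32 checksum so the receiver can detect corruption."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def receive(frame: bytes) -> str:
    """Reply "ACK" if the checksum matches the payload, otherwise "NAK"."""
    payload, checksum = frame[:-4], frame[-4:]
    ok = zlib.crc32(payload).to_bytes(4, "big") == checksum
    return "ACK" if ok else "NAK"


def send_with_retry(payload: bytes, channel, max_attempts: int = 3) -> int:
    """Resend until the receiver replies ACK; return the attempts used."""
    for attempt in range(1, max_attempts + 1):
        reply = receive(channel(make_frame(payload)))
        if reply == "ACK":
            return attempt
    raise RuntimeError("receiver never acknowledged the frame")


def flaky_channel():
    """Hypothetical channel that corrupts only the first frame it carries."""
    state = {"first": True}

    def channel(frame: bytes) -> bytes:
        if state["first"]:
            state["first"] = False
            # Flip every bit of the first byte to simulate line noise.
            return bytes([frame[0] ^ 0xFF]) + frame[1:]
        return frame

    return channel


attempts = send_with_retry(b"hello", flaky_channel())
print(f"delivered after {attempts} attempt(s)")
```

Here the first transmission is corrupted, the receiver answers with a NAK, and the sender's second attempt is acknowledged.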
Handshaking facilitates connecting relatively heterogeneous systems or equipment over a communication channel without the need for human intervention to set parameters.
Example
TCP three-way handshake
Establishing a normal TCP connection requires three separate steps:
- The first host (Alice) sends the second host (Bob) a "synchronize" (SYN) message with its own sequence number x, which Bob receives.
- Bob replies with a synchronize-acknowledgment (SYN-ACK) message with its own sequence number y and acknowledgement number x + 1, which Alice receives.
- Alice replies with an acknowledgment (ACK) message with acknowledgement number y + 1, which Bob receives and to which he doesn't need to reply.
In this setup, the synchronize messages act as service requests from one host to the other, while the acknowledgement messages return to the requesting host to let it know the message was received.
The reason for the client and server not using a default sequence number such as 0 for establishing the connection is to protect against two incarnations of the same connection reusing the same sequence number too soon, which means a segment from an earlier incarnation of a connection might interfere with a later incarnation of the connection.
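The sequence-number bookkeeping of the three steps can be traced with a short Python sketch. It only models the numbers, not real packets or sockets; the function name, the dictionary layout, and the use of `secrets` to pick initial sequence numbers are assumptions made for illustration. Here `x` and `y` denote the initial sequence numbers chosen by Alice and Bob.

```python
import secrets

MOD = 2**32  # TCP sequence numbers are 32 bits wide and wrap around


def three_way_handshake(isn_client=None, isn_server=None):
    """Return the three handshake segments as simple dictionaries."""
    # Each side picks an unpredictable initial sequence number unless given one.
    x = secrets.randbelow(MOD) if isn_client is None else isn_client
    y = secrets.randbelow(MOD) if isn_server is None else isn_server

    syn = {"flags": "SYN", "seq": x}
    # The SYN occupies one sequence number, so Bob acknowledges x + 1.
    syn_ack = {"flags": "SYN-ACK", "seq": y, "ack": (x + 1) % MOD}
    # Alice's next sequence number is x + 1 and she acknowledges y + 1.
    ack = {"flags": "ACK", "seq": syn_ack["ack"], "ack": (y + 1) % MOD}
    return [syn, syn_ack, ack]


for segment in three_way_handshake(isn_client=100, isn_server=300):
    print(segment)
```

With the fixed starting values 100 and 300, the SYN-ACK carries acknowledgement number 101 and the final ACK carries acknowledgement number 301.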
SMTP
The Simple Mail Transfer Protocol (SMTP) is the key Internet standard for email transmission. It includes handshaking to negotiate authentication, encryption and maximum message size.
TLS handshake
When a Transport Layer Security (SSL or TLS) connection starts, the record encapsulates a "control" protocol: the handshake messaging protocol (content type 22). This protocol is used to exchange all the information required by both sides for the exchange of the actual application data by TLS. It defines the format of the messages containing this information and the order of their exchange. These may vary according to the demands of the client and server; that is, there are several possible procedures to set up the connection. This initial exchange results in either a successful TLS connection (both parties ready to transfer application data with TLS) or an alert message.
The protocol is used to negotiate the secure attributes of a session. (RFC 5246, p. 37)[7]
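To make the "content type 22" detail concrete, the sketch below parses the 5-byte TLS record header and, for an unencrypted handshake record, the 4-byte handshake message header that follows it. The function name and the sample bytes are made up for illustration, and only a few handshake message types are listed; it is a reading aid rather than a TLS implementation.

```python
import struct

HANDSHAKE = 22  # record content type used for handshake messages

# A few handshake message types defined for TLS 1.2 (not exhaustive).
HANDSHAKE_TYPES = {1: "ClientHello", 2: "ServerHello", 11: "Certificate", 20: "Finished"}


def parse_record(data: bytes) -> dict:
    """Parse one TLS record header and, if present, the handshake header."""
    # Record header: content type (1 byte), version (2 bytes), length (2 bytes).
    content_type, major, minor, length = struct.unpack("!BBBH", data[:5])
    info = {"content_type": content_type, "version": (major, minor), "length": length}
    if content_type == HANDSHAKE and length >= 4:
        # Handshake header: message type (1 byte), body length (3 bytes).
        msg_type = data[5]
        info["handshake_type"] = HANDSHAKE_TYPES.get(msg_type, f"unknown({msg_type})")
        info["handshake_length"] = int.from_bytes(data[6:9], "big")
    return info


# Hypothetical, truncated record: a handshake record holding a 1-byte ClientHello body.
sample = bytes.fromhex("16030100050100000100")
print(parse_record(sample))
```

Once encryption is active, handshake message headers are no longer readable on the wire, so this kind of inspection only applies to the cleartext start of a connection.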
WPA2 wireless
The WPA2 standard for wireless uses a four-way handshake defined in IEEE 802.11i-2004.
Dial-up access modems
One classic example of handshaking is that of dial-up modems, which typically negotiate communication parameters for a brief period when a connection is first established, and thereafter use those parameters to provide optimal information transfer over the channel as a function of its quality and capacity. The "squealing" noises (actually a sound that changes in pitch 100 times every second) made by some modems with speaker output immediately after a connection is established are in fact the sounds of modems at both ends engaging in a handshaking procedure; once the procedure is completed, the speaker might be silenced, depending on the settings of the operating system or the application controlling the modem.
Serial "Hardware Handshaking"
This frequently used term describes the use of RTS and CTS signals over a serial interconnection. It is, however, not quite correct;[citation needed] it's not a true form of handshaking, and is better described as flow control.
Mobile device charging
In mobile device chargers offering special quick-charge abilities to supported devices, the charging process will switch up to a higher output voltage for increased power transfer. But this could cause serious damage to an unsupported device or even result in a fire. It is therefore very important for the device and charger to first perform a handshake to "agree" on mutually supported charge parameters. If such a charger can't identify the connected device or determine its compatibility, it will default to normal but much slower charge parameters within the USB standard.
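The "agree or fall back" logic of such a charging handshake can be sketched as follows. This does not model any real quick-charge protocol: the profile values, the (volts, amps) tuple format, and the 5 V / 0.5 A fallback are assumptions chosen only to illustrate the idea.

```python
# Assumed safe fallback used when no agreement is reached: 5 V at 0.5 A.
DEFAULT_PROFILE = (5.0, 0.5)


def negotiate_charge(charger_profiles, device_profiles):
    """Pick the highest-power (volts, amps) profile both sides support.

    device_profiles is None or empty when the charger cannot identify
    the device, in which case the slow default profile is returned.
    """
    if not device_profiles:
        return DEFAULT_PROFILE
    common = set(charger_profiles) & set(device_profiles)
    if not common:
        return DEFAULT_PROFILE
    # Power in watts is volts times amps; choose the largest shared option.
    return max(common, key=lambda profile: profile[0] * profile[1])


charger = [(5.0, 2.0), (9.0, 2.0), (12.0, 1.5)]
print(negotiate_charge(charger, [(5.0, 2.0), (9.0, 2.0)]))  # shared fast profile
print(negotiate_charge(charger, None))                       # unidentified device
```

The important design point is that the higher voltage is only ever selected from the intersection of both lists, so an unidentified or incompatible device is never exposed to it.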
References
- [1] "What is handshaking? - Definition from WhatIs.com". SearchNetworking. Retrieved 2018-02-19.
- [2] Ware, Peter; Chivers, Bill; Cheleski, Paul (2001). Jacaranda Information Processes and Technology: HSC Course. Australia: John Wiley & Sons Australia. pp. 92–93. ISBN 978-0701634728.
- [3] TCP: RFC 793, 2581.
- [4] SMTP: RFC 821, 5321, 2821, 1869, 6531, 2822.
- [5] FTP: RFC 959, 3659 (conversation), 2228, 4217 (TLS handshake), 5797.
- [6] "handshaking". TheFreeDictionary's Encyclopedia.
- [7] The Transport Layer Security (TLS) Protocol, version 1.2. IETF. August 2008. doi:10.17487/RFC5246. RFC 5246.
Overview
Definition
In computing, a handshake refers to an initial exchange of signals or messages between two devices, programs, or systems to establish, negotiate, or verify the parameters necessary for subsequent communication or data transfer. This process ensures that both parties are ready and capable of interacting under agreed-upon conditions, such as transmission speed or protocol compatibility.[5][6]
Key characteristics of a computing handshake include its bidirectional nature, which facilitates synchronization between the communicating entities; negotiation of operational parameters, potentially encompassing data rates, authentication credentials, or session keys; and incorporation of mechanisms for detecting errors or incompatibilities during the setup phase. These elements collectively prepare a stable channel, preventing mismatches that could lead to failed interactions. Unlike unilateral signals, handshakes require mutual confirmation to proceed.[7][8]
The term "handshake" derives from the human physical gesture symbolizing agreement or trust, and its analogy was adopted in early computing for data communication systems and network protocols.[9][10]
A handshake differs from a simple acknowledgment (ACK), which is a basic, one-way confirmation of data receipt during an active session; in contrast, a handshake constitutes a structured, multi-step sequence dedicated to initial connection establishment rather than ongoing verification. In protocols like TCP, handshakes play a foundational role in synchronizing endpoints before data exchange begins.[11][12]
Purpose and Importance
Handshakes in computing primarily serve to establish reliable connections between communicating entities, such as devices or software processes, by synchronizing their operational states and confirming mutual readiness for data exchange.[7] This process allows the parties to negotiate essential parameters, including data transfer rates, encoding formats, and error-checking mechanisms, ensuring compatibility before any substantive communication occurs.[10] Additionally, handshakes facilitate authentication to verify the identities of participants and detect potential incompatibilities early, preventing wasted resources on mismatched interactions.[13]
The benefits of handshakes are substantial, as they reduce transmission errors by incorporating mechanisms for detection and correction, such as acknowledgments, thereby enhancing overall data integrity.[10] By verifying identities and enabling encryption negotiation, handshakes bolster security against unauthorized access, while also promoting efficient resource allocation through synchronized flow control that avoids overwhelming slower components.[13] Furthermore, they support backward compatibility, allowing newer systems to interoperate with legacy hardware or software by dynamically adjusting to supported capabilities.[7]
Despite these advantages, handshakes introduce drawbacks, including overhead in terms of time and bandwidth due to the multiple signal exchanges required, which can introduce latency in high-speed environments.[7] They are also susceptible to certain risks, such as resource exhaustion attacks that exploit the state-holding nature of the process, exemplified by SYN flooding where incomplete handshakes consume server memory and processing capacity.[14] Failure modes, like timeouts or mismatched responses, can lead to abrupt connection drops, necessitating retry mechanisms that further amplify overhead.[15]
In modern computing, handshakes are indispensable for the scalability of distributed systems, where myriad devices must interoperate seamlessly across networks.[13] Their role is particularly vital in Internet of Things (IoT) ecosystems and cloud environments, enabling secure, synchronized data flows among heterogeneous devices while supporting real-time applications in industrial automation and beyond.[13] Without effective handshaking, the reliability and efficiency of these expansive, interconnected infrastructures would be severely compromised.[10]
Types of Handshakes
Software Handshakes
Software handshakes, also known as software flow control, involve the use of special control characters embedded within the data stream to coordinate data transmission between devices, without requiring dedicated hardware control lines. This method relies on in-band signaling over the existing transmit and receive data lines, typically in serial communications.[16][17]
The primary mechanism uses ASCII control characters: XON (Transmit On, DC1, hexadecimal 0x11) to resume transmission and XOFF (Transmit Off, DC3, hexadecimal 0x13) to pause it. When a receiving device's buffer approaches capacity, it sends an XOFF character to the sender, which halts data flow until an XON is received. This process ensures synchronization and prevents buffer overflows, though it can be less reliable if control characters are corrupted or misinterpreted as data.[16][18]
Software handshakes operate at the physical or data link layer of the OSI model and are commonly applied in asynchronous serial interfaces, such as RS-232 connections between computers and peripherals like printers or modems. They are particularly useful in scenarios with limited cabling, as no extra wires are needed beyond TX, RX, and ground. However, they introduce potential latency from processing control characters and risk data disruption if the protocol lacks error checking.[16]
Compared to hardware handshakes, software methods offer flexibility for software-configurable systems but may incur higher CPU overhead for parsing control characters. In some setups, both can be combined, with hardware taking precedence if enabled.[19]
The origins of software handshaking date to the early 1960s, coinciding with the development of ASCII (1963) and asynchronous teletypewriter systems, where control characters were used to manage transmission over telephone lines. It became standardized in serial communication protocols and remains in use for legacy and embedded systems.[16][20]
Hardware Handshakes
Hardware handshakes involve signal-based exchanges between devices using dedicated control pins or electrical lines at the physical or data link layer, enabling direct coordination without reliance on higher-layer software protocols.[21] These mechanisms ensure reliable data flow by signaling device readiness and managing transmission timing through voltage level changes, where +3 V to +15 V represents logic 0 (SPACE) and -3 V to -15 V represents logic 1 (MARK) in standards like RS-232.[22] This approach is particularly suited to environments where precise, low-level synchronization is required to prevent data loss or buffer overflows.[23]
Common mechanisms include the Request-to-Send (RTS) and Clear-to-Send (CTS) signals, which provide hardware flow control in serial interfaces. In this protocol, the data terminal equipment (DTE), such as a computer, asserts the RTS line (pin 4 on a DB-25 connector) to indicate readiness to transmit data, prompting the data circuit-terminating equipment (DCE), like a modem, to assert CTS (pin 5) when it can receive, thereby initiating data transfer.[21] Additional signals, such as Data Terminal Ready (DTR) and Data Set Ready (DSR), may complement RTS/CTS by confirming overall connection status before communication begins.[22] These voltage-driven interchanges allow for simple synchronization without embedding control information in the data stream itself.[23]
Hardware handshakes find primary application in point-to-point serial links, such as RS-232 connections between computers and peripherals like printers or sensors, where software cannot reliably predict timing due to variable processing delays.[21] For instance, in environmental control systems interfacing with devices like thermostats, RTS/CTS ensures half-duplex communication proceeds only when both ends are prepared, avoiding transmission errors in unreliable timing scenarios.[23] This is essential for legacy and embedded systems relying on direct electrical signaling for basic coordination.[22]
Compared to software handshakes, hardware methods offer lower latency through immediate electrical responses, eliminating the need to send and parse in-band control characters and reducing CPU overhead.[24] They impose no additional data overhead, making them efficient for real-time flow control, though their scope is limited to straightforward negotiations such as readiness signaling rather than complex parameter exchanges like baud rate or parity settings.[21] In hybrid systems, hardware handshakes can integrate with software approaches to provide layered control for more robust communication.[24]
The origins of hardware handshaking trace back to the 1960s, evolving from teletypewriter systems that required reliable signal coordination for early data transmission over telephone lines. These mechanisms were formalized in the EIA RS-232 standard, first published in 1962 by the Electronic Industries Association to standardize interfaces between data terminals and modems, with subsequent revisions ensuring compatibility across serial ports.[21][22]
Handshakes in Networking Protocols
TCP Three-Way Handshake
The TCP three-way handshake is a fundamental mechanism in the Transmission Control Protocol (TCP) for establishing a reliable, connection-oriented communication session between a client and a server. It consists of three sequential steps, Synchronize (SYN), Synchronize-Acknowledge (SYN-ACK), and Acknowledge (ACK), designed to synchronize sequence numbers and verify bidirectional reachability, ensuring both endpoints can reliably exchange data without prior state assumptions. This process prevents issues like old or duplicate packets from previous connections interfering with new ones, as each side independently generates and confirms its initial sequence number (ISN). Defined in the original TCP specification, the handshake operates at the transport layer and is essential for TCP's reliability features, such as ordered delivery and error recovery.[25]
The process begins with the client (active opener) initiating a connection by sending a SYN segment to the server (passive opener). This SYN packet includes the client's 32-bit ISN, randomly selected to avoid predictability, and sets the SYN control flag while leaving the acknowledgment number undefined. Upon receipt, the server responds with a SYN-ACK segment, which includes its own 32-bit ISN, sets both SYN and ACK flags, and specifies an acknowledgment number equal to the client's ISN plus one (ACK = client's ISN + 1) to confirm receipt of the SYN; this step consumes one sequence number from the server's side. Finally, the client sends an ACK segment acknowledging the server's ISN plus one (ACK = server's ISN + 1), completing the handshake and transitioning both endpoints to the ESTABLISHED state, where data transmission can begin. Each segment may carry TCP options, and the SYN and SYN-ACK segments normally carry no application data, which keeps the handshake focused on connection setup.[25][26][27]
During the handshake, several key parameters are exchanged to optimize the connection. The Maximum Segment Size (MSS) is advertised via a TCP option in the SYN and SYN-ACK segments, allowing each side to inform the other of its maximum receivable data payload size (typically derived from the interface MTU minus headers), though it is not formally negotiated but unilaterally stated for the peer to respect. The receive window size, a 16-bit field in the TCP header, is included in every segment, including the SYN and SYN-ACK, to enable flow control by indicating the amount of buffer space available for incoming data. Additionally, support for Selective Acknowledgments (SACK) can be proposed through the SACK-Permitted option in the SYN segment, permitting the receiver to acknowledge non-contiguous blocks of data later in the session if both sides agree during the handshake. These parameters ensure efficient data transfer tailored to the network path.[28][29][30]
Sequence number synchronization relies on a simple yet robust rule: each endpoint generates a random 32-bit ISN at the start, with the SYN segment advancing the sequence by one (as SYN consumes a slot), so subsequent data begins at ISN + 1; acknowledgments confirm this by setting ACK = peer's ISN + 1 in the response. This mutual confirmation (the client acknowledging the server's ISN + 1 in the final ACK, and the server having already acknowledged the client's) ensures both sides agree on the starting point for byte-stream numbering, preventing desynchronization. In mathematical terms, for client ISN_c and server ISN_s:
\[
\begin{aligned}
\text{SYN:} &\quad \mathrm{SEQ} = \mathrm{ISN}_c \\
\text{SYN-ACK:} &\quad \mathrm{SEQ} = \mathrm{ISN}_s,\ \mathrm{ACK} = \mathrm{ISN}_c + 1 \\
\text{ACK:} &\quad \mathrm{SEQ} = \mathrm{ISN}_c + 1,\ \mathrm{ACK} = \mathrm{ISN}_s + 1
\end{aligned}
\]
TLS Handshake
The TLS handshake is a multi-phase cryptographic exchange protocol that establishes a secure communication session between a client and a server over a reliable transport layer, such as TCP, by authenticating the parties, negotiating cryptographic parameters, and deriving symmetric session keys for subsequent encrypted data transfer.[37] This process ensures confidentiality, integrity, and authenticity while preventing eavesdropping and tampering during session initiation.[38]
The handshake begins with the ClientHello message, in which the client specifies supported TLS versions (e.g., up to 1.3), a list of preferred cipher suites defining encryption algorithms and key exchange methods, a 32-byte random nonce for freshness, and optional extensions such as supported key exchange groups (e.g., secp256r1 or X25519).[39] The server responds with a ServerHello message selecting the highest mutually supported version and cipher suite, its own 32-byte random nonce, and relevant extensions, followed by its X.509 certificate chain for public-key authentication.[40] The key exchange phase then occurs, typically using ephemeral Diffie-Hellman (DHE) or elliptic curve Diffie-Hellman (ECDHE) for forward secrecy, where the client and server exchange public values to compute a shared premaster secret without transmitting it directly. As of November 2025, hybrid post-quantum key exchanges, which combine classical methods like X25519 with post-quantum algorithms such as ML-KEM (Kyber), are widely adopted for quantum-resistant security, with over 50% of traffic on major platforms using such protections. Alternatively, in TLS 1.2 and earlier, an RSA-based exchange encrypts the premaster secret with the server's public key from the certificate.[41][42] The handshake concludes with Finished messages from both parties, each containing a message authentication code (MAC) computed over the entire handshake transcript using the newly derived keys to verify integrity and prevent replay attacks.[43]
Central to the TLS handshake are concepts from public-key cryptography, which facilitate initial server (and optionally client) authentication via digital certificates, transitioning to efficient symmetric cryptography for the session's bulk encryption using algorithms like AES.[44] Extensions such as Server Name Indication (SNI) allow the client to specify the target hostname in the ClientHello, enabling virtual hosting on shared IP addresses without compromising security.[45] Key derivation combines the premaster secret with the client and server random nonces through a pseudorandom function (PRF); in earlier versions like TLS 1.2, this uses a PRF based on HMAC-SHA256, while TLS 1.3 employs HKDF (HMAC-based key derivation function) for enhanced security.[46] For instance, in TLS 1.3, the traffic secrets are derived from the handshake secret and a hash of the handshake transcript using HKDF-Expand-Label.
SMTP Handshake
The SMTP handshake is the initial command-based exchange in the Simple Mail Transfer Protocol (SMTP) that establishes a session between a client Mail Transfer Agent (MTA) and a server MTA for email delivery, enabling capability negotiation and identity verification before message transfer begins.[51] This process occurs at the application layer over a TCP connection and differs from lower-layer handshakes by relying on textual commands and responses rather than binary flags or cryptographic keys. It ensures both parties agree on protocol features, such as support for extended capabilities, while identifying the domains involved in the transaction.[52]
The handshake begins when the client initiates a TCP connection to the server on port 25, prompting the server to issue a 220 "Service ready" greeting, which may include the server's domain and software details.[53] The client then sends an EHLO (Extended SMTP) command followed by its fully qualified domain name (FQDN) or address literal, signaling support for SMTP extensions; alternatively, it uses the simpler HELO command for basic SMTP without extensions.[54] In response, the server replies with a multiline 250 "OK" status listing supported extensions, such as AUTH for authentication or STARTTLS for opportunistic TLS encryption, in a parameterized format that allows the client to select compatible features.[55] If EHLO fails or extensions are unavailable, the client falls back to HELO, and the server confirms with a single 250 response, clearing any prior state.[54]
Key parameters in the handshake include the client's domain, given in EHLO or HELO to identify the sending host, and the server's advertised features, such as 8BITMIME for transporting 8-bit text content without alteration or PIPELINING for sending multiple commands without waiting for individual responses to improve efficiency.[52] These extensions are registered with IANA and defined in separate RFCs, allowing modular enhancement of the base protocol.[56] Following the greeting exchange, the client may negotiate authentication using AUTH parameters or initiate encryption with STARTTLS if advertised, upgrading the session to TLS without altering the core handshake flow.[57]
Error handling during the handshake involves standardized reply codes; for instance, a 421 "Service not available" response indicates the server is busy or shutting down, prompting the client to close the connection and implement retry logic with exponential backoff.[58] Other errors, such as 500 for a syntax error or 503 for a bad sequence of commands, reject the offending command; reply codes distinguish temporary (4xx) from permanent (5xx) failures to guide client behavior.[59]
The SMTP handshake is standardized in RFC 5321, published in 2008, which consolidates and updates the original RFC 821 from 1982 by incorporating Extended SMTP (ESMTP) mechanisms for modern extensions while maintaining backward compatibility.[60][61] This evolution ensures robust session setup across diverse email infrastructures. In the broader email flow, the handshake establishes sender and receiver identities through domain parameters, paving the way for subsequent MAIL FROM and RCPT TO commands that specify paths without re-verifying basics.[62]
For illustration, a typical EHLO exchange might appear as follows:
C: EHLO client.example.com
S: 250-server.example.com Hello client.example.com
S: 250-8BITMIME
S: 250-PIPELINING
S: 250 STARTTLS
This format highlights the server's multiline response, enabling the client to proceed with feature-aware commands.[63]
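A client needs to turn a reply like the one above into a list of usable features. The sketch below shows one way to do that in Python, assuming the server lines have already been read from the socket and stripped of the "S: " prefix and line endings; the function name and return shape are choices made for this example, not part of any standard API.

```python
def parse_ehlo_response(lines):
    """Return (greeting, extensions) from the lines of an EHLO reply.

    extensions maps each upper-cased keyword to its list of parameters.
    """
    greeting, extensions = None, {}
    for index, line in enumerate(lines):
        code, separator, rest = line[:3], line[3:4], line[4:]
        if code != "250":
            raise ValueError(f"unexpected reply: {line!r}")
        if index == 0:
            greeting = rest  # first line carries the server's greeting
        else:
            parts = rest.split()
            if parts:
                extensions[parts[0].upper()] = parts[1:]
        if separator == " ":
            break  # "250 " (space, not hyphen) marks the final line
    return greeting, extensions


reply = [
    "250-server.example.com Hello client.example.com",
    "250-8BITMIME",
    "250-PIPELINING",
    "250 STARTTLS",
]
greeting, extensions = parse_ehlo_response(reply)
print(greeting)
print(sorted(extensions))
```

A client would then check, for example, whether `"STARTTLS"` is among the parsed keywords before attempting to upgrade the connection. Python's standard `smtplib` performs comparable parsing when `SMTP.ehlo()` is called and exposes the result as `esmtp_features`.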
