| SOAP | |
|---|---|
| Status | Active |
| First published | Initially as XML-RPC in June 1998 |
| Latest version | 1.2 27 April 2007 |
| Domain | Text-based protocol |
| License | Copyright by W3C, implementations are royalty-free |
| Website | w3 |
SOAP (originally an acronym for Simple Object Access Protocol)[a] is a messaging protocol specification for exchanging structured information in the implementation of web services in computer networks. It uses XML Information Set for its message format, and relies on application layer protocols, most often Hypertext Transfer Protocol (HTTP), although some legacy systems communicate over Simple Mail Transfer Protocol (SMTP), for message negotiation and transmission.
Characteristics
SOAP provides the Messaging Protocol layer of a web services protocol stack for web services. It is an XML-based protocol consisting of three parts:
- an envelope, which defines the message structure[1] and how to process it
- a set of encoding rules for expressing instances of application-defined datatypes
- a convention for representing procedure calls and responses
SOAP has three major characteristics:
- extensibility (security and WS-Addressing are among the extensions under development)
- neutrality (SOAP can operate over any protocol such as HTTP, SMTP, TCP, UDP)
- independence (SOAP allows for any programming model)
As an example of what SOAP procedures can do, an application can send a SOAP request to a server that has web services enabled—such as a real-estate price database—with the parameters for a search. The server then returns a SOAP response (an XML-formatted document) with the resulting data, e.g., prices, location, features. Since the generated data comes in a standardized machine-parsable format, the requesting application can then integrate it directly.
The SOAP architecture consists of several layers of specifications for:
- message format
- Message Exchange Patterns (MEP)
- underlying transport protocol bindings
- message processing models
- protocol extensibility
SOAP evolved as a successor of XML-RPC, though it borrows its transport and interaction neutrality from Web Service Addressing[2] and the envelope/header/body from elsewhere (probably from WDDX).[citation needed]
History
SOAP was designed as an object-access protocol and released as XML-RPC in June 1998 as part of Frontier 5.1 by Dave Winer, Don Box, Bob Atkinson, and Mohsen Al-Ghosein for Microsoft, where Atkinson and Al-Ghosein were working.[3] The specification was not made available until it was submitted to IETF 13 September 1999.[4][5] According to Don Box, this was due to politics within Microsoft.[6] Because of Microsoft's hesitation, Dave Winer shipped XML-RPC in 1998.[7]
The submitted Internet Draft did not reach RFC status and is therefore not considered a "web standard" as such. Version 1.1 of the specification was published as a W3C Note on 8 May 2000.[8] Since version 1.1 did not reach W3C Recommendation status, it cannot be considered a "web standard" either. Version 1.2 of the specification, however, became a W3C Recommendation on June 24, 2003. SOAP originally stood for "Simple Object Access Protocol", but version 1.2 of the standard dropped this acronym.[9]
The SOAP specification[10] was maintained by the XML Protocol Working Group[11] of the World Wide Web Consortium until the group was closed 10 July 2009.
After SOAP was first introduced, it became the underlying layer of a more complex set of web services, based on WSDL, XSD and UDDI. These different services, especially UDDI, have proved to be of far less interest,[citation needed] but an appreciation of them gives a complete understanding of the expected role of SOAP compared to how web services have actually evolved.[according to whom?]
SOAP terminology
The SOAP specification can be broadly divided into three conceptual components: protocol concepts, encapsulation concepts and network concepts.[12]
Protocol concepts
- SOAP
- This is a set of rules formalizing and governing the format and processing rules for information exchanged between a SOAP sender and a SOAP receiver.
- SOAP nodes
- These are physical/logical machines with processing units which are used to transmit/forward, receive and process SOAP messages. These are analogous to nodes in a network.
- SOAP roles
- Over the path of a SOAP message, all nodes assume a specific role. The role of the node defines the action that the node performs on the message it receives. For example, a header block targeted at the role "none" is not processed by any node; nodes simply transmit it along the message path.
- SOAP protocol binding
- A SOAP message needs to work in conjunction with other protocols to be transferred over a network. For example, a SOAP message could use TCP as a lower layer protocol to transfer messages. These bindings are defined in the SOAP protocol binding framework.[13]
- SOAP features
- SOAP provides a messaging framework only. However, it can be extended to add features such as reliability, security etc. There are rules to be followed when adding features to the SOAP framework.
- SOAP module
- A collection of specifications regarding the semantics of SOAP header to describe any new features being extended upon SOAP. A module needs to realize zero or more features. SOAP requires modules to adhere to prescribed rules.[14]
Data encapsulation concepts
- SOAP message
- Represents the information being exchanged between two SOAP nodes.
- SOAP envelope
- It is the enclosing element of an XML message identifying it as a SOAP message.
- SOAP header block
- A SOAP header can contain more than one of these blocks, each being a discrete computational block within the header. In general, the SOAP role information is used to target nodes on the path. A header block is said to be targeted at a SOAP node if the SOAP role for the header block is the name of a role in which the SOAP node operates. For example, a SOAP header block whose role attribute is ultimateReceiver is targeted only at the destination node, which has this role, whereas a header block whose role attribute is next is targeted at each intermediary as well as the destination node.
- SOAP header
- A collection of zero or more header blocks, each of which may be targeted at any SOAP receiver on the message path.
- SOAP body
- Contains the body of the message intended for the SOAP receiver. The interpretation and processing of SOAP body is defined by header blocks.
- SOAP fault
- In case a SOAP node fails to process a SOAP message, it adds the fault information to the SOAP fault element. This element is contained within the SOAP body as a child element.
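The nesting of the fault element inside the body can be illustrated with a short sketch. It assumes the SOAP 1.2 fault layout (a Code/Value and a Reason/Text child); the function name `extract_fault` and the sample message are illustrative and not part of any specification or library.

```python
import xml.etree.ElementTree as ET

ENV = "http://www.w3.org/2003/05/soap-envelope"  # SOAP 1.2 envelope namespace


def extract_fault(envelope_xml):
    """Return (code, reason) if the Body carries a Fault child, else None."""
    root = ET.fromstring(envelope_xml)
    # The Fault element, when present, is a direct child of the Body.
    fault = root.find(f"{{{ENV}}}Body/{{{ENV}}}Fault")
    if fault is None:
        return None
    code = fault.findtext(f"{{{ENV}}}Code/{{{ENV}}}Value")
    reason = fault.findtext(f"{{{ENV}}}Reason/{{{ENV}}}Text")
    return code, reason


# Illustrative fault message; the reason text is made up for this sketch.
FAULT_MESSAGE = f"""<env:Envelope xmlns:env="{ENV}">
  <env:Body>
    <env:Fault>
      <env:Code><env:Value>env:Sender</env:Value></env:Code>
      <env:Reason><env:Text xml:lang="en">Unknown ticker symbol</env:Text></env:Reason>
    </env:Fault>
  </env:Body>
</env:Envelope>"""

result = extract_fault(FAULT_MESSAGE)  # ("env:Sender", "Unknown ticker symbol")
```

A receiver would typically run a check like this before attempting to interpret the body as a normal response.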
Message sender and receiver concepts
- SOAP sender
- The node that transmits a SOAP message.
- SOAP receiver
- The node receiving a SOAP message. (Could be an intermediary or the destination node).
- SOAP message path
- The path consisting of all the nodes that the SOAP message traversed to reach the destination node.
- Initial SOAP sender
- This is the node which originated the SOAP message to be transmitted. This is the root of the SOAP message path.
- SOAP intermediary
- Any node between the initial SOAP sender and the ultimate SOAP receiver. It processes the SOAP header blocks targeted at it and forwards the SOAP message towards an ultimate SOAP receiver.
- Ultimate SOAP receiver
- The destination receiver of the SOAP message. This node is responsible for processing the message body and any header blocks targeted at it.
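How intermediaries and the ultimate receiver select the header blocks targeted at them can be sketched as follows. This is a simplified illustration under stated assumptions: it uses the SOAP 1.2 role URIs, treats a missing role attribute as ultimateReceiver, and inspects the same message for both kinds of node (a real intermediary would normally remove the blocks it processed before forwarding). The `x:` header blocks and the function name are hypothetical.

```python
import xml.etree.ElementTree as ET

ENV = "http://www.w3.org/2003/05/soap-envelope"
ROLE_NEXT = ENV + "/role/next"
ROLE_NONE = ENV + "/role/none"
ROLE_ULTIMATE = ENV + "/role/ultimateReceiver"


def targeted_blocks(envelope, is_ultimate_receiver, extra_roles=()):
    """Return the header blocks targeted at a node on the message path."""
    # Every node acts in the "next" role; no node ever acts in "none".
    roles = {ROLE_NEXT, *extra_roles}
    if is_ultimate_receiver:
        roles.add(ROLE_ULTIMATE)
    header = envelope.find(f"{{{ENV}}}Header")
    if header is None:
        return []
    blocks = []
    for block in header:
        # A block without a role attribute is targeted at the ultimate receiver.
        role = block.get(f"{{{ENV}}}role", ROLE_ULTIMATE)
        if role in roles:
            blocks.append(block)
    return blocks


# Hypothetical message with three application-defined header blocks.
MESSAGE = f"""<env:Envelope xmlns:env="{ENV}" xmlns:x="http://www.example.org/ext">
  <env:Header>
    <x:Trace env:role="{ROLE_NEXT}"/>
    <x:Audit env:role="{ROLE_NONE}"/>
    <x:Session/>
  </env:Header>
  <env:Body/>
</env:Envelope>"""

envelope = ET.fromstring(MESSAGE)
at_intermediary = [b.tag.split("}")[1] for b in targeted_blocks(envelope, False)]
at_receiver = [b.tag.split("}")[1] for b in targeted_blocks(envelope, True)]
# at_intermediary == ["Trace"]; at_receiver == ["Trace", "Session"]
```

The `Audit` block is never selected because it is targeted at the "none" role.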
Specification
The SOAP specification defines the messaging framework, which consists of:
- The SOAP processing model, defining the rules for processing a SOAP message[15]
- The SOAP extensibility model defining the concepts of SOAP features and SOAP modules[15]
- The SOAP underlying protocol binding framework describing the rules for defining a binding to an underlying protocol that can be used for exchanging SOAP messages between SOAP nodes[15]
- The SOAP message construct defining the structure of a SOAP message[15]
SOAP building blocks
A SOAP message is an ordinary XML document containing the following elements:
| Element | Description | Required |
|---|---|---|
| Envelope | Identifies the XML document as a SOAP message. | Yes |
| Header | Contains header information. | No |
| Body | Contains call and response information. | Yes |
| Fault | Provides information about errors that occurred while processing the message. | No |
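As a minimal sketch of how these elements fit together, the following uses Python's standard `xml.etree.ElementTree` to assemble an envelope with a required Body and an optional Header. The helper name `build_envelope` and the `http://www.example.org` application namespace are assumptions made for illustration, and the SOAP 1.2 envelope namespace is assumed.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://www.w3.org/2003/05/soap-envelope"  # SOAP 1.2 envelope namespace
APP_NS = "http://www.example.org"  # illustrative application namespace


def build_envelope(payload, header_blocks=()):
    """Return an Envelope element with an optional Header and a required Body."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    if header_blocks:
        # The Header is only emitted when there is something to put in it.
        header = ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
        header.extend(header_blocks)
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    body.append(payload)
    return envelope


# Application payload: a stock price request for ticker symbol "T".
request = ET.Element(f"{{{APP_NS}}}GetStockPrice")
ET.SubElement(request, f"{{{APP_NS}}}StockName").text = "T"

envelope = build_envelope(request)
xml_text = ET.tostring(envelope, encoding="unicode")
```

ElementTree chooses its own namespace prefixes when serializing, so the output is equivalent to, but not character-for-character identical with, a hand-written envelope.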
Transport methods
Both SMTP and HTTP are valid application layer protocols used as transport for SOAP, but HTTP has gained wider acceptance as it works well with today's internet infrastructure; specifically, HTTP works well with network firewalls. SOAP may also be used over HTTPS (which is the same protocol as HTTP at the application level, but uses an encrypted transport protocol underneath) with either simple or mutual authentication; this is the advocated WS-I method to provide web service security as stated in the WS-I Basic Profile 1.1.
This is a major advantage over other distributed protocols such as GIOP/IIOP or DCOM, which firewalls normally filter. SOAP over AMQP is another option that some implementations support. Unlike DCOM, SOAP is also unaffected by machine-level security rights whose configuration requires knowledge of both the transmitting and receiving nodes, which lets SOAP be loosely coupled in a way that is not possible with DCOM. There is also an OASIS standard for SOAP over UDP.
Message format
XML Information Set was chosen as the standard message format because of its widespread use by major corporations and open source development efforts. Typically, XML Information Set is serialized as XML. A wide variety of freely available tools significantly eases the transition to a SOAP-based implementation. The somewhat lengthy syntax of XML can be both a benefit and a drawback. While it facilitates error detection and avoids interoperability problems such as byte-order (endianness), it can slow processing speed and can be cumbersome. For example, CORBA, GIOP, ICE, and DCOM use much shorter, binary message formats. On the other hand, hardware appliances are available to accelerate processing of XML messages.[16][17] Binary XML is also being explored as a means for streamlining the throughput requirements of XML. XML messages by their self-documenting nature usually have more 'overhead' (e.g., headers, nested tags, delimiters) than actual data in contrast to earlier protocols where the overhead was usually a relatively small percentage of the overall message.
In financial messaging SOAP was found to result in a 2–4 times larger message than previous protocols FIX (Financial Information Exchange) and CDR (Common Data Representation).[18]
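The size difference between an XML envelope and a compact binary encoding can be made concrete with a small sketch. It does not reproduce the cited measurement or the FIX and CDR formats: the 12-byte binary layout, the `Quote` element names and both function names are hypothetical, chosen only to show where the extra bytes come from.

```python
import struct
import xml.etree.ElementTree as ET

SOAP_NS = "http://www.w3.org/2003/05/soap-envelope"
APP_NS = "http://www.example.org"  # illustrative application namespace


def as_soap(symbol, price):
    """Serialize a quote as a SOAP envelope (UTF-8 bytes)."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    quote = ET.SubElement(body, f"{{{APP_NS}}}Quote")
    ET.SubElement(quote, f"{{{APP_NS}}}Symbol").text = symbol
    ET.SubElement(quote, f"{{{APP_NS}}}Price").text = repr(price)
    return ET.tostring(envelope, encoding="utf-8")


def as_binary(symbol, price):
    """Hypothetical fixed layout: 4-byte symbol field plus 8-byte double, network byte order."""
    return struct.pack("!4sd", symbol.encode("ascii"), price)


soap_bytes = as_soap("T", 42.5)
binary_bytes = as_binary("T", 42.5)
ratio = len(soap_bytes) / len(binary_bytes)  # tags and namespaces dominate the XML form
```

For such a tiny payload the ratio is far larger than for realistic messages, because the fixed cost of the envelope and namespace declarations is spread over very little data.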
XML Information Set does not have to be serialized in XML. For instance, CSV and JSON XML-infoset representations exist. There is also no need to specify a generic transformation framework. The concept of SOAP bindings allows for specific bindings for a specific application. The drawback is that both the senders and receivers have to support this newly defined binding.
Example message (encapsulated in HTTP)
The message below requests a stock price for AT&T (stock ticker symbol "T").
POST /InStock HTTP/1.1
Host: www.example.org
Content-Type: application/soap+xml; charset=utf-8
Content-Length: 299
SOAPAction: "http://www.w3.org/2003/05/soap-envelope"

<?xml version="1.0"?>
<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope" xmlns:m="http://www.example.org">
<soap:Header>
</soap:Header>
<soap:Body>
<m:GetStockPrice>
<m:StockName>T</m:StockName>
</m:GetStockPrice>
</soap:Body>
</soap:Envelope>
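A client could assemble such a request with Python's standard library, as in the sketch below. The request is only constructed, not sent, because www.example.org/InStock is a fictional endpoint; the function name `make_soap_request` is an assumption for illustration, and the Content-Length is computed from the actual payload rather than hard-coded.

```python
import urllib.request

# Envelope equivalent to the example message (empty Header omitted for brevity).
SOAP_BODY = """<?xml version="1.0"?>
<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope" xmlns:m="http://www.example.org">
  <soap:Body>
    <m:GetStockPrice>
      <m:StockName>T</m:StockName>
    </m:GetStockPrice>
  </soap:Body>
</soap:Envelope>"""


def make_soap_request(url, envelope_xml):
    """Build (but do not send) an HTTP POST carrying a SOAP 1.2 envelope."""
    data = envelope_xml.encode("utf-8")
    headers = {
        "Content-Type": "application/soap+xml; charset=utf-8",
        # Content-Length counts the encoded bytes of the body, not characters.
        "Content-Length": str(len(data)),
    }
    return urllib.request.Request(url, data=data, headers=headers, method="POST")


request = make_soap_request("http://www.example.org/InStock", SOAP_BODY)
# urllib.request.urlopen(request) would transmit it to a real endpoint.
```

The response from a real service would be another SOAP envelope in the HTTP response body, which the client would then parse.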
Technical critique
Advantages
- SOAP's neutrality characteristic explicitly makes it suitable for use with any transport protocol. Implementations often use HTTP as a transport protocol, but other popular transport protocols can be used. For example, SOAP can also be used over SMTP,[19] JMS[20][21] and message queues.
- SOAP, when combined with HTTP post/response exchanges, tunnels easily through existing firewalls and proxies, and consequently doesn't require modifying the widespread computing and communication infrastructures that exist for processing HTTP post/response exchanges.
- SOAP has available to it all the facilities of XML, including easy internationalization and extensibility with XML Namespaces.
Disadvantages
- When using standard implementation and the default SOAP/HTTP binding, the XML infoset is serialized as XML. To improve performance for the special case of XML with embedded binary objects, the Message Transmission Optimization Mechanism was introduced.
- When relying on HTTP as a transport protocol and not using Web Services Addressing or an Enterprise Service Bus, the roles of the interacting parties are fixed. Only one party (the client) can use the services of the other.
- SOAP is less "simple" than the name would suggest. The verbosity of the protocol, low parsing speed of XML, and lack of a standardized interaction model led to the dominance of services using the HTTP protocol more directly. See, for example, REST.
- Being protocol-agnostic, SOAP is unable to take advantage of protocol-specific features and optimizations such as REST's Uniform Interface or caching – instead having to reimplement them (as with WS-Addressing).
See also
- SOAP with Attachments
- SOAP with Attachments API for Java
- SOAP-over-UDP
- List of web service protocols
- Message Transmission Optimization Mechanism (MTOM)
- XML-binary Optimized Packaging (XOP)
- Extensible User Interface Protocol (XUP) – a SOAP-based UI protocol
- WebSocket
- Web Services Security
- WS-Security based products and services
Notes
References
- ^ Hirsch, Frederick; Kemp, John; Ilkka, Jani (2007-01-11). Mobile Web Services: Architecture and Implementation. John Wiley & Sons (published 2007). p. 27. ISBN 9780470032596. Retrieved 2014-09-15. "Simple Object Access Protocol (SOAP) defines a messaging envelope structure designed to carry application payload in one portion of the envelope (the message body) and control information in another (the message header)."
- ^ "Web Services Addressing (WS-Addressing)". www.w3.org. Archived from the original on 2016-09-25. Retrieved 2016-09-15.
- ^ "Exclusive .NET Developer's Journal "Indigo" Interview with Microsoft's Don Box". Dotnet.sys-con.com. Archived from the original on 2019-01-06. Retrieved 2012-10-04.
- ^ "XML Cover Pages on the history of SOAP". Coverpages.org. Archived from the original on 2001-03-03. Retrieved 2003-07-22.
- ^ "SOAP: Simple Object Access Protocol". Ietf Datatracker. September 1999. Archived from the original on 2021-02-25. Retrieved 2015-09-20.
- ^ "Don Box on the history of SOAP". XML.com. 2001-04-04. Archived from the original on 2015-06-18. Retrieved 2015-09-20.
- ^ "XML-RPC for Newbies". 1998-07-14. Archived from the original on October 12, 1999.
- ^ "W3C Note on Simple Object Access Protocol (SOAP) 1.1". W3C. 2000-05-08. Archived from the original on 2021-03-04. Retrieved 2015-09-20.
- ^ "SOAP Version 1.2 Part 1: Messaging Framework (Second Edition)". W3C. April 27, 2007. Archived from the original on 2012-06-19. Retrieved 2012-06-15.
Note: In previous versions of this specification the SOAP name was an acronym. This is no longer the case. (Underneath section 1. Introduction)
- ^ "SOAP Specifications". W3C. Archived from the original on 2021-04-15. Retrieved 2014-03-29.
- ^ "W3C XML Protocol Working Group". W3C. Archived from the original on 2018-12-25. Retrieved 2014-03-29.
- ^ "SOAP Version 1.2 Part 1: Messaging Framework (Second Edition)". www.w3.org. Archived from the original on 2016-09-20. Retrieved 2016-09-14.
- ^ "Binding Framework Proposal". www.w3.org. Archived from the original on 2017-07-11. Retrieved 2016-09-14.
- ^ "SOAP Version 1.2 Part 1: Messaging Framework (Second Edition)". www.w3.org. Archived from the original on 2016-09-20. Retrieved 2016-09-14.
- ^ a b c d "SOAP Version 1.2 Part 1: Messaging Framework (Second Edition)". www.w3.org. Archived from the original on 2017-04-02. Retrieved 2020-06-24.
- ^ "IBM Datapower". 306.ibm.com. 2011-11-30. Archived from the original on 2008-06-22. Retrieved 2012-10-04.
- ^ "IBM Zurich XML Accelerator Engine" (PDF). Archived from the original (PDF) on 2012-09-30. Retrieved 2012-10-04.
- ^ "Evaluating SOAP for High Performance Business Applications: Real-Time Trading Systems". Tenermerx Pty Ltd University of Technology, Sydney. 2011-11-30. Archived from the original on 2013-08-10. Retrieved 2013-03-14.
- ^ Jonathan Chawke (March 9, 2001). "Making Apache SOAP Invocations using SMTP". Apache-SOAP User's FAQ.
- ^ "SOAP over JMS protocol". IBM. Archived from the original on March 22, 2020. Retrieved March 22, 2020.
- ^ "SOAP-JMS FAQ". SOAP-JMS Binding Working Group. Archived from the original on July 17, 2017. Retrieved March 22, 2020.
Further reading
- Benoît Marchal, "Soapbox: Why I'm using SOAP", IBM
- Uche Ogbuji, "Tutorial: XML messaging with SOAP", Principal Consultant, Fourthought, Inc.
Overview
Definition and Purpose
SOAP is a lightweight, XML-based messaging protocol intended for the exchange of structured information in decentralized, distributed environments.[2] It defines a standardized way to encode messages and perform remote interactions between applications, ensuring compatibility across diverse systems without reliance on specific vendor technologies.[1]
The primary purpose of SOAP is to enable remote procedure calls (RPC) and document-style messaging, allowing applications to invoke methods or exchange entire documents over networks in heterogeneous environments.[6] By standardizing this communication, SOAP promotes interoperability between services built with different programming languages, operating systems, or platforms, facilitating seamless integration in web services architectures.[7] This platform-independent approach addresses key challenges in distributed computing, such as firewall traversal via HTTP and reduced configuration complexity.[7]
SOAP was developed to overcome limitations of earlier protocols like DCOM and CORBA, which were hindered by vendor dependencies and poor suitability for internet-scale deployment.[7] It uses XML as the foundational format for encapsulating messages, providing a flexible and extensible structure for data representation.[2]
Key Characteristics
SOAP employs a stateless, one-way messaging model that forms the foundation of its communication paradigm, enabling the transmission of messages without maintaining session state between sender and receiver.[8] This model supports higher-level patterns such as request-response through the composition of multiple one-way messages, allowing for flexible interaction in distributed systems.[9] Additionally, SOAP's design ensures binding neutrality, meaning it is not tied to any specific underlying transport protocol and can operate over various transports via a standardized binding framework.[10]
A core feature of SOAP is its extensibility mechanism, achieved through the use of header blocks within the message envelope. These headers allow the addition of protocol features (such as security, transaction management, or routing) without modifying the core payload, promoting decentralized extensibility by different participants in the message path.[11] To manage potential naming conflicts in heterogeneous distributed environments, SOAP leverages XML namespaces throughout its message structure, ensuring unambiguous identification of elements and attributes.[12] This XML-based foundation, detailed further in message format specifications, underpins the protocol's structured information exchange.[2]
SOAP standardizes error reporting directly in the message format. When processing errors occur, such as a failure to understand a required header or invalid message content, a SOAP Fault element is generated, containing structured details like fault codes (e.g., env:MustUnderstand or env:DataEncodingUnknown) and reasons, which facilitates reliable error handling across intermediaries and endpoints.[13]
History and Development
Origins and Initial Specification
The development of SOAP began in 1998 at Microsoft, where engineers including Satish Thatte and others sought a simpler mechanism for remote procedure calls (RPC) over the web, positioning it as an alternative to the more complex Distributed Component Object Model (DCOM).[14] Key contributors such as Don Box of DevelopMentor collaborated closely with Microsoft, drawing on earlier ideas like Dave Winer's XML-RPC specification released that summer amid internal delays at Microsoft.[15] This effort aimed to create a lightweight protocol that could leverage the growing ubiquity of HTTP and XML for distributed computing, avoiding the proprietary and platform-bound nature of DCOM.[3]
Early motivations for SOAP centered on addressing the limitations of binary protocols like DCOM, which struggled with firewall traversal due to their non-standard ports and formats, often requiring tunneling or special configurations that hindered interoperability.[14] By adopting XML for message encoding and HTTP as the transport layer, SOAP enabled seamless passage through firewalls, since requests are treated as standard web traffic, while promoting XML-driven interoperability across diverse platforms, languages, and vendors without reliance on binary data exchanges.[15] This approach was seen as essential for enabling web-based RPC in enterprise environments, where existing protocols like CORBA and DCOM had faltered in achieving broad adoption over the internet.[3]
The initial formal specification, SOAP 1.1, emerged as a de facto standard in early 2000 through collaboration between Microsoft, IBM, DevelopMentor, and others, before deeper W3C involvement.[1] On May 8, 2000, this version was jointly submitted to the W3C as a technical note by authors including Don Box, Satish Thatte, David Ehnebuske of IBM, and Dave Winer of UserLand Software, proposing the formation of an XML Protocol Working Group to standardize it.[1] Published under the title "Simple Object Access Protocol (SOAP) 1.1," the note outlined a lightweight XML-based protocol for exchanging structured information in decentralized systems, marking SOAP's transition from an internal Microsoft initiative to an open, industry-backed effort.[1]
Evolution of Versions
SOAP 1.2 marked a significant evolution in the SOAP protocol, achieving W3C Recommendation status on June 24, 2003, which formalized its role as a lightweight protocol for exchanging structured information in distributed environments. This version introduced the standardized media type "application/soap+xml" for identifying SOAP messages serialized in XML 1.0, improving upon the ambiguous "text/xml" used in earlier iterations and enhancing interoperability across systems.[16]
Fault handling saw substantial improvements in SOAP 1.2, with the fault element restructured to include mandatory Code and Reason sub-elements, along with optional Node, Role, Detail, and hierarchical Subcode for more granular error reporting and diagnostics.[17] These changes addressed limitations in prior versions by providing clearer mechanisms for conveying error semantics, such as version mismatches via the Upgrade header.[18]
SOAP 1.2 also enhanced support for non-HTTP transports by defining an abstract binding framework that allows the protocol to operate over diverse underlying protocols, including SMTP and TCP, without tying it exclusively to HTTP.[10] RPC conventions were refined to specify precise rules for representing remote procedure calls, parameters, and return values in the message body, promoting consistency in procedural-style interactions.[19]
Practical interoperability was addressed separately by the WS-I Basic Profile, a set of guidelines from the Web Services Interoperability Organization that includes conformance tests and best practices for implementing SOAP with WSDL. Basic Profile 1.0 (2004) and 1.1 were written against SOAP 1.1; SOAP 1.2 was adopted in Basic Profile 2.0.[20] The second edition of SOAP 1.2, released as a W3C Recommendation on April 27, 2007, incorporated errata fixes and minor clarifications without altering the core specification.[21]
Post-2007, active development of the core SOAP protocol declined, with standardization efforts redirecting toward the WS-* stack of extensions (such as WS-Security, WS-ReliableMessaging, and WS-Addressing) to address advanced features like security and transactionality. The 2007 second edition remains the last major W3C update to SOAP itself.[22]
As of 2025, SOAP 1.2 remains the current and maintained W3C standard, continuing to support legacy enterprise systems requiring robust, standardized messaging.[2] However, it has been largely superseded by RESTful APIs in new projects, which offer greater simplicity, lighter payloads, and better alignment with modern web architectures.[23]
Core Concepts and Terminology
Protocol Fundamentals
SOAP serves as a lightweight protocol designed for the exchange of structured information to facilitate the implementation of web services in decentralized and distributed environments. It enables communication between applications across various underlying protocols by defining a standardized messaging framework that supports extensibility and interoperability. This protocol operates at a high level of abstraction, allowing messages to traverse multiple nodes without being tied to specific transport mechanisms.[2]
Central to SOAP's protocol layer are the concepts of nodes, roles, and relay mechanisms, which govern how messages are processed and forwarded. A SOAP node represents any entity (such as a server, client, or intermediary) that participates in message handling, including the initial sender, which originates the message, and the ultimate receiver, which performs the final processing. Intermediaries are nodes that receive, potentially modify, and forward messages along the path, acting in a relay capacity to support features like routing, security transformations, or aggregation. Each node acts in one or more roles, identified by URIs. SOAP 1.2 defines three: the "next" role (http://www.w3.org/2003/05/soap-envelope/role/next), in which every intermediary and the ultimate receiver must act; the "none" role, in which no node may act, so header blocks targeted at it are never formally processed; and the "ultimateReceiver" role, in which only the endpoint for message delivery acts. These roles ensure targeted processing of message components, with relay mechanisms allowing intermediaries to forward unprocessed portions of the message to subsequent nodes.[2]
The message path in SOAP follows a linear progression from the initial sender through zero or more intermediaries to the ultimate receiver, where each node adheres to a defined processing model. Upon receipt, a node evaluates header blocks targeted to its roles, applying any necessary actions before relaying the message if applicable. An optional mustUnderstand attribute on header blocks mandates processing by nodes assuming the relevant role; if a node cannot comprehend or process a header with this attribute set to true, it must generate a fault rather than proceed. This attribute promotes reliability by enforcing comprehension of critical extensions, such as security or transaction headers, while allowing optional elements to be ignored without disruption. SOAP messages are encapsulated in XML format to represent this structure, though detailed formatting is specified elsewhere.[2]
Fault generation in SOAP follows strict rules to handle errors consistently across the protocol layer, using a dedicated SOAP fault element within the message envelope. Standardized fault codes provide precise diagnostics: the top-level value of the Code element is one of five names in the envelope namespace (VersionMismatch, MustUnderstand, DataEncodingUnknown, Sender, or Receiver), and optional Subcode elements add application-specific granularity. For instance, the env:VersionMismatch fault code is generated when a node detects an envelope in a namespace it does not support, such as a SOAP 1.1 message processed by a node that supports only SOAP 1.2. Similarly, the env:MustUnderstand fault code arises if a mandatory header block targeted at the node is not understood, ensuring intermediaries and receivers signal processing failures explicitly. Other codes, like env:DataEncodingUnknown, address issues with unrecognized data encodings. In every case the node stops normal processing of the message and reports the fault. These mechanisms enhance robustness in distributed systems by standardizing error reporting and recovery.[2]
Message Structure and Encapsulation
A SOAP message is fundamentally structured as an XML document encapsulated within a mandatory SOAP Envelope element, which serves as the root container for all message content and ensures consistent processing across distributed systems.[2] This envelope allows for the wrapping of arbitrary data, including non-SOAP payloads, by providing a standardized framework that supports extensibility without imposing rigid constraints on the underlying application information.[2] The envelope's namespace, "http://www.w3.org/2003/05/soap-envelope", identifies it uniquely and facilitates interoperability in decentralized environments.[2] Within the envelope, the message distinguishes between an optional SOAP Header and a mandatory SOAP Body, enabling flexible organization of control and payload data.[2] Header blocks, contained in the header element, are extensible XML elements designed for auxiliary information such as metadata or processing directives, which may be targeted at intermediaries via attributes like role and mustUnderstand.[5] These blocks support optional processing and can include application-defined extensions, but they are not required for basic message transmission.[2] In contrast, the body element exclusively carries the core application data intended for the ultimate receiver, ensuring that essential payload remains protected and directly accessible without intermediary interference.[2] This separation promotes modularity, as headers handle protocol-level concerns while the body focuses on semantic content.[5]
For handling binary or large attachments that could inefficiently inflate XML messages through base64 encoding, SOAP employs specialized mechanisms to optimize transmission.[4] The SOAP with Attachments (SwA) profile utilizes MIME Multipart/Related structures to encapsulate a primary SOAP message alongside zero or more secondary parts, such as images or files, referenced via URIs within the envelope.[4] This approach keeps attachments distinct from the XML envelope, reducing overhead while maintaining SOAP processing rules for the primary part.[4] Complementing SwA, the Message Transmission Optimization Mechanism (MTOM) transmits binary content as raw octets in separate MIME parts using XML-binary Optimized Packaging (XOP), while the content remains logically part of the XML infoset; this avoids base64 expansion on the wire and improves performance in web services.[4]
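The cost of base64 that these mechanisms avoid follows from simple arithmetic: every 3 bytes of binary data become 4 ASCII characters, so an embedded attachment grows by about one third. The sketch below checks this with the standard `base64` module; the helper name `base64_size` and the 3000-byte stand-in payload are illustrative.

```python
import base64


def base64_size(raw_len):
    """Length of the base64 text for raw_len bytes: 4 characters per 3-byte group, padded."""
    return 4 * ((raw_len + 2) // 3)


payload = bytes(3000)  # stand-in for an image or file attachment
encoded = base64.b64encode(payload)

# Fractional growth caused by embedding the payload as base64 text.
overhead = len(encoded) / len(payload) - 1  # about one third
```

With SwA or MTOM the 3000 bytes travel as a separate MIME part at their original size, plus a small amount of MIME framing.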
SOAP messages adhere to specific encoding rules to represent data structures, balancing flexibility with schema compliance.[5] In the literal style, the body carries application-defined XML exactly as described by its XML Schema, with no additional serialization layer, which promotes interoperability through schema validation.[5] In contrast, SOAP encoding, a graph-based scheme that uses accessors and multi-references for complex data such as RPC parameters, is signalled via the encodingStyle attribute; support for it is optional in SOAP 1.2, and the literal style is generally recommended to simplify implementation and reduce ambiguity.[5]
Technical Specification
Building Blocks
The SOAP message is structured around four primary building blocks that ensure its portability and extensibility across diverse environments. At the core is the Envelope, which serves as the root element encapsulating the entire message and defining its overall framework. The Envelope declares the namespace http://www.w3.org/2003/05/soap-envelope, enabling unambiguous identification of SOAP-specific elements and attributes within XML documents.[2]
The Header functions as an optional container for auxiliary information, such as processing directives that guide how intermediate nodes handle the message. It allows for the inclusion of metadata like security tokens or transaction identifiers, targeted via attributes such as role (which specifies the intended recipient node) and mustUnderstand (which mandates comprehension by the recipient). This design supports the protocol's extensibility model without interfering with the core payload.[2]
Central to the message is the Body, which holds the actual application data or invocation details, accommodating both RPC-style calls (where parameters are passed for remote procedure execution) and document-oriented content (such as XML fragments for business documents). The Body can contain multiple child elements, each representing distinct data units, thereby facilitating complex exchanges while maintaining separation from header metadata.[2]
For error handling, the Fault element provides a standardized mechanism for reporting issues, always nested within the Body to indicate that the message's primary intent has failed. It comprises sub-elements including Code (categorizing the fault type, e.g., VersionMismatch or DataEncodingUnknown), Reason (offering a textual explanation in one or more languages), Node (identifying the URI of the originating SOAP node), and Role (specifying the role the node assumed when generating the fault). This structure promotes consistent diagnostics across implementations.[2]
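As an illustration of how a receiver might read the Fault structure described above, the following Python sketch uses the standard library's ElementTree. The fault message, its reason text, and the helper name parse_fault are assumptions made up for this example; only the element names and the envelope namespace come from SOAP 1.2.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://www.w3.org/2003/05/soap-envelope"
NS = {"env": SOAP_ENV}

# Hypothetical fault message, written for this example only.
FAULT_XML = """\
<env:Envelope xmlns:env="http://www.w3.org/2003/05/soap-envelope">
  <env:Body>
    <env:Fault>
      <env:Code><env:Value>env:Sender</env:Value></env:Code>
      <env:Reason>
        <env:Text xml:lang="en">Unknown stock symbol</env:Text>
      </env:Reason>
    </env:Fault>
  </env:Body>
</env:Envelope>"""


def parse_fault(xml_text):
    """Return (code, reason) if the Body holds a Fault, else None."""
    root = ET.fromstring(xml_text)
    fault = root.find("env:Body/env:Fault", NS)
    if fault is None:
        return None
    code = fault.findtext("env:Code/env:Value", namespaces=NS)
    reason = fault.findtext("env:Reason/env:Text", namespaces=NS)
    return code, reason


fault_info = parse_fault(FAULT_XML)
print(fault_info)
```

The sketch reads only the first Reason text and ignores the optional Node, Role, and Detail children; a fuller client would inspect those as well.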
Message Format Details
The SOAP message format is fundamentally an XML document structured around the Envelope element, which serves as the root and encapsulates the entire message. The Envelope, defined in the XML schema (soap-envelope.xsd) for the http://www.w3.org/2003/05/soap-envelope namespace from the W3C SOAP 1.2 specification, includes an optional Header element for metadata and processing instructions, and a mandatory Body element for the primary payload or fault information.[2] The schema specifies that the Envelope must be the document's only child element, with its local name fixed as "Envelope" in the namespace http://www.w3.org/2003/05/soap-envelope. Key attributes include soapenv:mustUnderstand (of type xs:boolean), which mandates that targeted header blocks be processed or result in a fault if ignored; soapenv:role (of type xs:anyURI), designating the intended recipient node (e.g., "next" for intermediaries or "ultimateReceiver" for the final destination); and soapenv:relay (of type xs:boolean), indicating whether unprocessed header blocks should be forwarded along the message path.[24] These attributes enable extensible processing while ensuring interoperability.
Namespace declarations are integral to the format, with the core namespace http://www.w3.org/2003/05/soap-envelope qualifying all fundamental elements like Envelope, Header, and Body to avoid conflicts in mixed XML vocabularies. An optional encoding namespace, http://www.w3.org/2003/05/soap-encoding, may be used for serialized data structures within the Body, though its application is not required for basic literal messages. The encodingStyle attribute (type xs:anyURI) specifies serialization rules when present, allowing for graph-like data representations with references. In SOAP 1.2 it may appear on header blocks, on child elements of the Body other than Fault, on child elements of a fault's Detail element, and on their descendants, but not on the Envelope, Header, or Body elements themselves; if it is omitted, no encoding style is claimed.[2] This namespace separation supports modularity, as SOAP messages can embed content from other XML schemas without ambiguity.
For remote procedure call (RPC) representations, the SOAP Body contains a single access element named after the method (e.g., <m:GetStockPrice>), acting as a struct with child elements for input parameters labeled by their names (e.g., <symbol>IBM</symbol> for an input string).[25] Responses follow a similar convention, using a Body element named after the method with a "Response" suffix (e.g., <m:GetStockPriceResponse>), where output parameters and return values appear as child elements; the return value is specifically accessed via an edge named "result" in the namespace http://www.w3.org/2003/05/soap-rpc unless the method is void.[19] Parameters must adhere to XML naming rules, with conventions for mapping non-XML names (e.g., prefixing with underscores). The literal style, which directly uses schema-defined XML without additional encoding attributes, is preferred for its simplicity and direct interoperability, contrasting with the optional encoded style that permits complex, referenced data structures but increases parsing overhead.[26]
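The RPC convention above (one Body child named after the method, with one child element per named parameter) can be sketched in Python with the standard library. The service namespace http://example.org/stock, the prefixes, and the helper name build_rpc_request are assumptions for illustration; the sketch also places parameters in the service namespace, which is one possible choice rather than a requirement.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://www.w3.org/2003/05/soap-envelope"
STOCK_NS = "http://example.org/stock"  # hypothetical service namespace


def build_rpc_request(method, params, service_ns=STOCK_NS):
    """Build a SOAP 1.2 envelope whose Body holds one method element
    with one child element per named parameter."""
    # Prefixes are cosmetic; they only affect the serialized form.
    ET.register_namespace("soap", SOAP_ENV)
    ET.register_namespace("m", service_ns)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{service_ns}}}{method}")
    for name, value in params.items():
        # Assumption: parameters are qualified with the service namespace.
        ET.SubElement(call, f"{{{service_ns}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


request_xml = build_rpc_request("GetStockPrice", {"symbol": "IBM"})
print(request_xml)
```

Because the envelope is assembled as a tree rather than by string concatenation, parameter values are escaped automatically when serialized.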
Validation of SOAP messages requires conformance to the W3C XML Schema for the Envelope, ensuring structural integrity without mandating full schema processing for the payload. Messages must serialize to valid XML 1.0, using the XML Infoset abstract model, and avoid DTDs in favor of schema-based definitions for portability.[2] Non-conformant messages, such as those violating namespace rules or attribute types, trigger a "VersionMismatch" or "MustUnderstand" fault as defined in the schema. Implementations are encouraged to validate against the normative schema at http://www.w3.org/2003/05/soap-envelope to verify Envelope integrity, though payload validation depends on application-specific schemas.[13] This rigorous syntax promotes reliable exchange across diverse systems.
Implementation Aspects
Transport Protocols
SOAP messages are primarily transported using the HTTP protocol, which serves as the default binding in the SOAP specification.[26] The HTTP binding employs the POST method to encapsulate SOAP envelopes within the HTTP entity body, enabling synchronous request-response interactions over the web.[27] In early versions of SOAP, such as SOAP 1.1, the SOAPAction HTTP header was required to indicate the intent of the SOAP request, typically specifying the operation to be invoked on the receiving endpoint.[28] This header facilitated intermediary processing and dispatching, though in SOAP 1.2, it was deprecated in favor of an action parameter within the Content-Type header for the "application/soap+xml" media type.[29] Beyond HTTP, SOAP supports bindings to other protocols for specialized use cases. The SMTP binding, defined as an illustrative example in the SOAP 1.2 framework, leverages email for asynchronous messaging, where requests and responses are exchanged via separate messages correlated through headers like Message-ID and In-Reply-To.[30] This approach suits scenarios requiring store-and-forward delivery, such as decoupled systems or environments with intermittent connectivity. 
Additionally, SOAP can bind directly to TCP for low-level, stream-based connections, allowing messages to be transmitted over raw TCP sockets without higher-level protocol overhead, though no standardized TCP binding exists in the core specification; examples include custom implementations reusing TCP port infrastructure.[31]
To enable transport-agnostic addressing and routing, SOAP integrates with WS-Addressing, a W3C recommendation that introduces standardized XML elements for endpoint references and message information headers.[32] WS-Addressing allows SOAP messages to specify destinations (e.g., via wsa:To), replies (wsa:ReplyTo), and faults independently of the underlying transport, decoupling endpoint identification from protocol-specific details like HTTP URLs or SMTP addresses.[33] This facilitates advanced routing through intermediaries and supports multi-hop scenarios across heterogeneous networks.
A key consideration in selecting transport protocols for SOAP is firewall traversal. HTTP's ubiquity as a web protocol allows SOAP messages to pass through most corporate firewalls and proxies without configuration changes, as they appear as standard HTTP traffic on port 80 or 443, unlike proprietary protocols that often require exceptions or dedicated ports.[34] This advantage has contributed to HTTP's dominance in SOAP deployments, promoting interoperability in distributed environments while minimizing security infrastructure modifications.
Binding and Interoperability
SOAP bindings define how abstract service operations described in WSDL are mapped to specific transport protocols and message formats, enabling the concrete realization of web services. The Web Services Description Language (WSDL) provides a framework for these mappings through its binding elements, which specify details such as the SOAP version, encoding style, and transport protocol for each operation.[35][36] For instance, a WSDL binding can indicate that an operation uses SOAP 1.2 over HTTP, including how input and output messages are serialized.[37]
To promote interoperability across diverse implementations, the Web Services Interoperability (WS-I) organization developed profiles that constrain SOAP and related specifications. The WS-I Basic Profile 1.1, published in 2006, refines SOAP 1.1, WSDL 1.1, and XML Schema to ensure compatible web services by mandating document/literal encoding and prohibiting certain features like SOAP encoding.[38] Similarly, the Basic Profile 2.0, released in 2010, extends these guidelines to SOAP 1.2 (still described with WSDL 1.1), emphasizing clarifications for reliable message exchange and service descriptions.[39] The WS-I organization concluded operations in 2017, but its profiles continue to guide interoperability practices.
Compliance with these profiles is verified using tools like the WS-I Analyzer, which examines WSDL documents and message logs against profile assertions to identify potential interoperability issues.[40][41]
Interoperability challenges in SOAP often arise from version mismatches between SOAP 1.1 and 1.2, which differ in envelope namespaces and fault handling, leading to parsing errors in cross-version interactions.[42] Encoding style discrepancies, such as the use of RPC/encoded versus document/literal, further complicate compatibility by introducing serialization rules that may not align across platforms.[26] Solutions typically involve strict adherence to literal encoding as prescribed by WS-I profiles, which ensures messages conform directly to XML schemas without additional SOAP-specific serialization, thereby reducing ambiguity and enhancing cross-system reliability.[39]
Tooling support facilitates the generation and management of SOAP bindings in practice. The Apache Axis library, a Java-based SOAP engine, automates binding creation from WSDL files using tools like WSDL2Java, supporting both client and server implementations with configurable transport options.[43][44] In the Microsoft .NET ecosystem, the System.Web.Services.Description namespace provides classes like SoapBinding for defining protocol bindings in WSDL, integrated into ASP.NET Web Services for seamless deployment and invocation.[45] These libraries ensure that developers can produce standards-compliant bindings while addressing interoperability requirements.
Practical Examples
Basic Message Exchange
In the basic SOAP message exchange pattern, a client initiates communication by sending a request message to a server, encapsulating a method invocation within the SOAP Envelope. The Envelope serves as the fundamental container for all SOAP messages, comprising an optional Header for processing instructions and a mandatory Body for the primary payload. The Body of the request message contains the method call, such as GetStockPrice, along with its parameters expressed as XML elements. For instance, a request to retrieve the stock price for a specific company might include a parameter like the stock symbol.[2]
The following XML illustrates a simplified request message for the GetStockPrice method:
<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope">
  <soap:Body>
    <m:GetStockPrice xmlns:m="http://example.org/stock">
      <m:StockSymbol>IBM</m:StockSymbol>
    </m:GetStockPrice>
  </soap:Body>
</soap:Envelope>
The response to the GetStockPrice method would embed the return value, such as the price, as a child element of the corresponding response element.[2]
An example response message might appear as:
<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope">
  <soap:Body>
    <m:GetStockPriceResponse xmlns:m="http://example.org/stock">
      <m:Price>150.25</m:Price>
    </m:GetStockPriceResponse>
  </soap:Body>
</soap:Envelope>
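On the client side, the response shown above could be consumed roughly as follows. This is a minimal Python sketch using the standard library; the helper name extract_price is an assumption, and the message is the illustrative GetStockPriceResponse from this section rather than output from a real service.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

NS = {
    "soap": "http://www.w3.org/2003/05/soap-envelope",
    "m": "http://example.org/stock",  # example namespace used in this section
}

RESPONSE_XML = """\
<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope">
  <soap:Body>
    <m:GetStockPriceResponse xmlns:m="http://example.org/stock">
      <m:Price>150.25</m:Price>
    </m:GetStockPriceResponse>
  </soap:Body>
</soap:Envelope>"""


def extract_price(xml_text):
    """Return the Price value from a GetStockPriceResponse message."""
    root = ET.fromstring(xml_text)
    price = root.findtext(
        "soap:Body/m:GetStockPriceResponse/m:Price", namespaces=NS
    )
    if price is None:
        raise ValueError("response does not contain a Price element")
    return Decimal(price)  # Decimal avoids binary rounding of prices


price = extract_price(RESPONSE_XML)
print(price)
```

Matching on namespace URIs rather than on the soap: and m: prefixes keeps the lookup valid even if the server chooses different prefixes.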
Header blocks use the role attribute to direct processing, such as soap:role="http://www.w3.org/2003/05/soap-envelope/role/next" for the next intermediary, and mustUnderstand="true" to mandate comprehension, triggering a fault if the targeted node cannot process the block. For example, in a chain of intermediaries for routing a stock price request, a Header block might specify transaction IDs or security tokens, enabling the first intermediary to route based on the ID while the next validates the token, all without altering the Body. Unprocessed headers can be relayed forward if relay="true" is set, ensuring propagation through the chain. This model supports modular processing, such as in enterprise service buses, while maintaining message integrity.[2]
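The decision an intermediary makes for each header block can be sketched as a small routing function. The Python below is a simplified model under stated assumptions, not an implementation of a SOAP node: header blocks are represented as plain records instead of XML, and the names HeaderBlock, route_headers, and MustUnderstandFault, as well as the sample header names, are hypothetical.

```python
from dataclasses import dataclass

SOAP_ENV = "http://www.w3.org/2003/05/soap-envelope"
ROLE_NEXT = SOAP_ENV + "/role/next"
ROLE_ULTIMATE = SOAP_ENV + "/role/ultimateReceiver"


@dataclass
class HeaderBlock:
    name: str
    role: str = ROLE_ULTIMATE      # an omitted role means ultimateReceiver
    must_understand: bool = False
    relay: bool = False


class MustUnderstandFault(Exception):
    """Raised when a targeted, mandatory header block is not understood."""


def route_headers(blocks, understood, roles=(ROLE_NEXT,)):
    """Return (processed, forwarded) header names for an intermediary."""
    processed, forwarded = [], []
    for block in blocks:
        if block.role not in roles:
            forwarded.append(block.name)   # not targeted here: pass along
        elif block.name in understood:
            processed.append(block.name)   # targeted and understood: consume
        elif block.must_understand:
            raise MustUnderstandFault(block.name)
        elif block.relay:
            forwarded.append(block.name)   # ignored but marked relayable
        # otherwise the ignored block is dropped before forwarding
    return processed, forwarded


blocks = [
    HeaderBlock("TransactionId", role=ROLE_NEXT, must_understand=True),
    HeaderBlock("SecurityToken"),                      # for the final receiver
    HeaderBlock("Trace", role=ROLE_NEXT, relay=True),
    HeaderBlock("Debug", role=ROLE_NEXT),
]
processed, forwarded = route_headers(blocks, understood={"TransactionId"})
print(processed, forwarded)
```

In this run the intermediary consumes TransactionId, forwards SecurityToken untouched because it targets the ultimate receiver, relays Trace, and drops Debug, which it neither understands nor is asked to relay.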
Encapsulated HTTP Example
A SOAP message encapsulated over HTTP typically uses the POST method to send the request to the service endpoint, with specific headers to indicate the SOAP format and action. The Content-Type header is set to application/soap+xml for SOAP 1.2, and the optional action parameter in the Content-Type or the SOAPAction header (for SOAP 1.1 compatibility) specifies the intended action. This binding follows the HTTP rules defined in the SOAP 1.2 specification.[26]
Consider a sample request to a weather service endpoint at http://example.com/weather to query the forecast for a specific city. The HTTP request might appear as follows:
POST /weather HTTP/1.1
Host: example.com
Content-Type: application/soap+xml; charset=utf-8; action="http://example.com/GetWeather"
Content-Length: 456

<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope"
               xmlns:wea="http://example.com/weather">
  <soap:Header/>
  <soap:Body>
    <wea:GetWeather>
      <wea:CityName>London</wea:CityName>
      <wea:CountryName>United Kingdom</wea:CountryName>
    </wea:GetWeather>
  </soap:Body>
</soap:Envelope>
The Body carries the operation element (GetWeather) with parameters for the city and country, enabling the service to process the query.
Upon successful processing, the server responds with HTTP status 200 OK, including a similar Content-Type header and the response envelope in the body. For the weather query, this might return forecast data:
HTTP/1.1 200 OK
Content-Type: application/soap+xml; charset=utf-8
Content-Length: 567

<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope"
               xmlns:wea="http://example.com/weather">
  <soap:Header/>
  <soap:Body>
    <wea:GetWeatherResponse>
      <wea:WeatherData>Partly cloudy, 15°C</wea:WeatherData>
    </wea:GetWeatherResponse>
  </soap:Body>
</soap:Envelope>
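The request side of this exchange can be prepared in Python with the standard library's urllib. The sketch below only builds the request object and does not send it, since example.com does not host such a service; the helper name build_request is an assumption, and the SOAP 1.1 branch (text/xml with a SOAPAction header) is included only to contrast the two header conventions.

```python
import urllib.request

ENVELOPE = """<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope"
               xmlns:wea="http://example.com/weather">
  <soap:Body>
    <wea:GetWeather>
      <wea:CityName>London</wea:CityName>
      <wea:CountryName>United Kingdom</wea:CountryName>
    </wea:GetWeather>
  </soap:Body>
</soap:Envelope>"""


def build_request(url, envelope, action, soap_version="1.2"):
    """Prepare (but do not send) an HTTP POST carrying a SOAP envelope."""
    data = envelope.encode("utf-8")
    if soap_version == "1.2":
        # SOAP 1.2: the action travels as a Content-Type parameter.
        headers = {
            "Content-Type":
                f'application/soap+xml; charset=utf-8; action="{action}"',
        }
    else:
        # SOAP 1.1: text/xml plus a separate, quoted SOAPAction header.
        # (A real 1.1 message would also use the 1.1 envelope namespace.)
        headers = {
            "Content-Type": "text/xml; charset=utf-8",
            "SOAPAction": f'"{action}"',
        }
    return urllib.request.Request(url, data=data, headers=headers,
                                  method="POST")


request = build_request("http://example.com/weather", ENVELOPE,
                        "http://example.com/GetWeather")
print(request.get_method(), request.full_url)
print(request.get_header("Content-type"))
```

Passing the prepared object to urllib.request.urlopen would transmit it, and urllib adds the Content-Length header itself from the encoded body.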
For binary attachments sent with MTOM, the Content-Type header becomes multipart/related; type="application/xop+xml"; start-info="application/soap+xml"; boundary="uuid:...", followed by the SOAP envelope referencing the attachment via an Include element (e.g., xop:Include href="cid:uuid-123") and a separate MIME part for the binary data. This avoids base64 encoding overhead for large payloads.
Security and Extensions
Security Mechanisms
SOAP employs WS-Security (WSS), an OASIS standard that extends SOAP messages with security features embedded in the <wsse:Security> header to ensure end-to-end protection regardless of transport. This specification supports message-level integrity via XML signatures, confidentiality through XML encryption, and authentication using various security tokens, all leveraging standards like XML Digital Signature and XML Encryption.[46][47]
XML signatures in WS-Security provide integrity by digitally signing selected parts of the SOAP envelope, such as the body or custom headers, using algorithms like RSA-SHA256. The signature process involves referencing elements via XPath or ID attributes and applying canonicalization to normalize the XML before hashing and signing, ensuring that any alteration invalidates the signature. WS-Security recommends the use of Exclusive XML Canonicalization (http://www.w3.org/2001/10/xml-exc-c14n#) to handle namespace issues, preventing canonicalization attacks where attackers exploit prefix redefinitions or external entity inclusions to alter signed content without detection.[46]
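The role of canonicalization before hashing can be illustrated with Python's standard library. One loud caveat: xml.etree.ElementTree.canonicalize implements Canonical XML 2.0, not the Exclusive XML Canonicalization that WS-Security uses, so this sketch only demonstrates the general idea and would not produce interoperable signature digests. The Price element and the helper name canonical_digest are assumptions for illustration.

```python
import hashlib
import xml.etree.ElementTree as ET


def canonical_digest(xml_text: str) -> str:
    """Canonicalize the XML (C14N 2.0) and return a hex SHA-256 digest."""
    canonical = ET.canonicalize(xml_text)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Two serializations of the same hypothetical element: attribute order,
# quoting style, and whitespace inside the start tag all differ.
variant_a = ('<m:Price xmlns:m="http://example.org/stock" '
             'currency="USD" unit="share">150.25</m:Price>')
variant_b = ("<m:Price unit='share'   currency='USD' "
             "xmlns:m='http://example.org/stock'>150.25</m:Price>")

# The same element with its content altered.
tampered = variant_a.replace("150.25", "15.25")

same = canonical_digest(variant_a) == canonical_digest(variant_b)
changed = canonical_digest(variant_a) != canonical_digest(tampered)
print(same, changed)
```

Syntactic differences that carry no meaning disappear after canonicalization, while a change to the signed content yields a different digest, which is what lets a verifier detect tampering.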
For confidentiality, WS-Security integrates XML Encryption to protect sensitive data within the message. Symmetric keys (e.g., AES-128) encrypt elements such as the contents of the SOAP body, which are replaced in place by <xenc:EncryptedData> elements containing <xenc:CipherData>, while the key itself may be encrypted asymmetrically with the recipient's public key and carried in an <xenc:EncryptedKey> element in the Security header. This allows multiple recipients to decrypt portions independently, supporting scenarios like secure multi-party communications.[46]
Timestamps enhance security by embedding freshness indicators in the <wsu:Timestamp> element within the Security header, specifying creation time (<wsu:Created>) and optional expiration (<wsu:Expires>) in UTC format (e.g., 2005-10-13T08:42:00Z). When signed, these prevent replay attacks by allowing recipients to verify the message age against a tolerance window, typically 5 minutes.[46]
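A recipient's freshness check on the Created and Expires values might look like the following Python sketch. It assumes timestamps without fractional seconds in the format shown above, uses the 5-minute window mentioned in the text as a default, and makes no allowance for clock skew; the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def parse_utc(value: str) -> datetime:
    """Parse a UTC timestamp such as 2005-10-13T08:42:00Z."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def is_fresh(created, expires=None, now=None, max_age=timedelta(minutes=5)):
    """Return True if the timestamp is not too old, expired, or in the future."""
    now = now or datetime.now(timezone.utc)
    created_at = parse_utc(created)
    if created_at > now:
        return False                     # a real check would allow some skew
    if expires is not None and now > parse_utc(expires):
        return False                     # sender-declared lifetime has passed
    return now - created_at <= max_age   # receiver-side tolerance window


received_at = parse_utc("2005-10-13T08:44:00Z")
print(is_fresh("2005-10-13T08:42:00Z", now=received_at))
```

The check is only meaningful when the Timestamp element is covered by the message signature, since an unsigned timestamp could simply be rewritten by an attacker.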
The UsernameToken profile enables straightforward authentication by including a <wsse:UsernameToken> in the Security header, containing a <wsse:Username> and optional <wsse:Password>. Passwords are transmitted either as plain text (requiring secure transport like TLS) or as a digest: Base64(SHA-1(nonce + created + password)), where the nonce is a random value unique to the request that is sent Base64-encoded and decoded before hashing. This digest method, combined with the <wsu:Created> timestamp, resists offline dictionary attacks and replay attempts, as recipients cache nonces for the timestamp's validity period and reject duplicates or expired tokens.[48][49]
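The digest formula and the nonce cache can be sketched in Python with the standard library. This is an illustration under stated assumptions rather than a complete UsernameToken implementation: the token is modeled as a dictionary instead of XML, the timestamp validity check is omitted, and the function names, username, and password are made up for the example.

```python
import base64
import hashlib
import hmac
import os
from datetime import datetime, timezone


def password_digest(nonce: bytes, created: str, password: str) -> str:
    """Base64(SHA-1(nonce + created + password)) over the raw nonce bytes."""
    raw = nonce + created.encode("utf-8") + password.encode("utf-8")
    return base64.b64encode(hashlib.sha1(raw).digest()).decode("ascii")


def make_token_fields(username, password):
    """Client side: produce the values carried in a UsernameToken."""
    nonce = os.urandom(16)
    created = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "Username": username,
        "Nonce": base64.b64encode(nonce).decode("ascii"),
        "Created": created,
        "Password": password_digest(nonce, created, password),
    }


def verify_token(fields, stored_password, seen_nonces):
    """Server side: reject replayed nonces, then compare digests."""
    if fields["Nonce"] in seen_nonces:
        return False                      # same nonce seen before: replay
    seen_nonces.add(fields["Nonce"])
    expected = password_digest(base64.b64decode(fields["Nonce"]),
                               fields["Created"], stored_password)
    return hmac.compare_digest(expected, fields["Password"])


seen = set()
token = make_token_fields("alice", "example-password")
first = verify_token(token, "example-password", seen)
replay = verify_token(token, "example-password", seen)
print(first, replay)
```

The first presentation of the token verifies and the identical second presentation is rejected by the nonce cache. A production service would also bound the cache by the Created timestamp's validity period, as described above.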
SAML integration via the SAML Token Profile allows WS-Security to incorporate Security Assertion Markup Language (SAML) assertions for federated identity. SAML v1.1 or v2.0 assertions, issued by identity providers, embed as <wsse:BinarySecurityToken> (Base64-encoded) or directly as SAML elements in the Security header, carrying subject confirmations, attributes, and authorization decisions. These tokens enable single sign-on across domains in SOAP exchanges, with signatures ensuring assertion integrity and binding to the message via WS-Security references.[50]
WS-Security mitigates key vulnerabilities inherent to XML-based messaging, including XML canonicalization attacks through exclusive canonicalization, which isolates the signed XML from external influences like document subsets or entity expansions that could enable signature bypassing. Replay protection relies on nonces in tokens and signed timestamps, ensuring one-time use; for instance, UsernameToken nonces prevent token reuse, while overall message nonces in signatures block duplicated requests within defined time bounds.[46][49]
