The Straight Stuff on WS-ReliableMessaging

Gilbert Pilz's Blog | May 18, 2006 11:57 AM


Eagerly awaited yet widely misunderstood, Web Services Reliable Messaging (WS-ReliableMessaging) “describes a protocol that allows messages to be delivered reliably between distributed applications in the presence of software component, system, or network failures”. Version 1.0 of the specification [1] was published by Microsoft, IBM, BEA and TIBCO. At the time of this writing, version 1.1 is under development by the OASIS Web Services Reliable Exchange (WS-RX) Technical Committee. Although the basic purpose of this specification is clear enough, there are a number of misconceptions about its fundamental nature.

Where's the Boeuf?

Like most of the WS-* specifications, the surprising thing about WS-RM (to further abbreviate WS-ReliableMessaging) is how little it, well, specifies. This sounds negative, but it is actually one of the strengths of the WS-* specifications (see Secure, Reliable, Transacted Web Services [2] for an explanation of why discrete, minimal specifications are a good thing). WS-RM specifies a SOAP protocol for sending a message and getting an acknowledgment when that message is received. WS-RM also specifies the resending of messages that have not been acknowledged. That's it.

To understand why such a simple concept has value, we need to think about the problem that WS-RM is designed to address. Suppose we have a piece of software that communicates with another piece of software to achieve some business function, such as sending an invoice. This interaction is inherently asynchronous. That is, we don't expect to get an immediate response to the invoice because we know that there is a certain amount of processing (perhaps involving people) that needs to occur before the invoice is accepted. Furthermore, we know that these two systems won't communicate with one another directly. Instead, the messages they exchange will be routed through a series of intermediaries such as service hubs, security gateways, etc. As the sender, one of our immediate and primary concerns is whether the invoice made it through all those hops and reached its intended recipient. Previous solutions to this problem, such as the RosettaNet Implementation Framework, have tended to bake the solution into the application protocol. WS-RM solves this problem in a generic manner that allows the solution to be applied to multiple protocols.

A Little More Detail

To really understand the limits and capabilities of WS-RM it's necessary to know a little bit more about how it works. Figure 1 below is lifted directly from the WS-ReliableMessaging specification [3]. It illustrates two entities, an Application Source (AS) and an Application Destination (AD) communicating via WS-RM. The AS sends a message to the RM Source (RMS). The RMS then transmits the message to the RM Destination (RMD) using the WS-RM protocol. Once it has received the message, the RMD delivers the message to the Application Destination (AD).

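[Figure 1: The WS-ReliableMessaging model. The Application Source (AS) sends a message to the RM Source (RMS), which transmits it to the RM Destination (RMD), which delivers it to the Application Destination (AD).]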

As part of the protocol the RMD sends acknowledgments of the messages it has received back to the RMS. For its part the RMS is responsible for holding onto messages and retransmitting them until it receives an acknowledgment from the RMD. The interactions between an RMS and RMD occur within the context of protocol-level sessions termed “Sequences”. Any message sent from the RMS to the RMD can be uniquely identified by the ID of the Sequence in which it was sent and the number of the message within that Sequence.
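To make the hold-and-retransmit behavior concrete, here is a minimal Python sketch. It is not taken from the specification or from any product: the class names RMSource and RMDestination, the accept and transmit_pending methods, and the lossy_send function are all hypothetical, and the SOAP wire format is left out entirely. The only things it borrows from the description above are that the RMS keeps a message until it is acknowledged and that a message is identified by its Sequence ID plus its message number.

```python
from dataclasses import dataclass, field


@dataclass
class RMDestination:
    """Hypothetical RM Destination: records what it has received, per Sequence."""
    received: dict = field(default_factory=dict)  # sequence_id -> {number: payload}

    def receive(self, sequence_id, message_number, payload):
        self.received.setdefault(sequence_id, {})[message_number] = payload
        # Assumption: acknowledge every message number received so far.
        return set(self.received[sequence_id])


@dataclass
class RMSource:
    """Hypothetical RM Source: holds messages until they are acknowledged."""
    sequence_id: str
    next_number: int = 1
    unacked: dict = field(default_factory=dict)  # message_number -> payload

    def accept(self, payload):
        """Called by the Application Source; assigns the next message number."""
        number = self.next_number
        self.next_number += 1
        self.unacked[number] = payload
        return number

    def transmit_pending(self, send):
        """Send (or resend) every message that is still unacknowledged.

        `send` returns the set of acknowledged message numbers, or None if
        the message or its acknowledgment was lost on the way.
        """
        for number, payload in sorted(self.unacked.items()):
            if number not in self.unacked:
                continue  # acknowledged earlier in this pass
            acked = send(self.sequence_id, number, payload)
            for n in acked or ():
                self.unacked.pop(n, None)


rmd = RMDestination()
rms = RMSource(sequence_id="seq-1")
dropped = {2}  # simulate losing the first transmission of message 2


def lossy_send(sequence_id, number, payload):
    if number in dropped:
        dropped.discard(number)
        return None
    return rmd.receive(sequence_id, number, payload)


for invoice in ("invoice-A", "invoice-B", "invoice-C"):
    rms.accept(invoice)

rms.transmit_pending(lossy_send)          # message 2 is lost in transit
pending_after_first_pass = sorted(rms.unacked)
rms.transmit_pending(lossy_send)          # message 2 is resent and acknowledged
```

After the first pass only message 2 is still held by the RMS; the second pass retransmits it and the RMS can finally let go of everything in the Sequence.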

To relate this model to our invoice delivery example: our invoicing system is the Application Source and our customer's purchasing system is the Application Destination. The RMS and RMD nodes are components of our respective WS-RM implementations.

The Asynchronous Sweet Spot

Returning to our invoice delivery example, you will remember that we described an asynchronous interaction. Although WS-RM adds some value to synchronous interactions, it provides the most value for asynchronous ones. It's not difficult to see why this is so. In general, synchronous operations specify fairly short timeouts (minutes rather than hours). Any failure between the requesting system and the providing system will be discovered in short order. On the other hand, suppose the agreement between us and our customer specifies a maximum turnaround time for invoices of 72 hours. If our invoice fails to reach our customer's system, it could take three days for us to discover this. The value of receiving timely acknowledgments (or an exception if the message is never acknowledged) is obvious.

What's In a Name?

Some of the confusion around WS-RM stems from the word “reliable”. WS-RM doesn't do anything to increase the intrinsic reliability of your software or its underlying infrastructure. WS-RM smooths over temporary network and service outages but if, for example, you lose your network for 24 hours there is nothing WS-RM can magically do to get your messages to their destination. This is enormously important because it means that using WS-RM does not free you from having to write the exception handling logic to deal with the cases where your messages do not reach their destination. WS-RM allows you to simplify this logic since you can be certain that, prior to triggering an exception, WS-RM has already made every attempt to send the message.
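The division of labor described here can be sketched in a few lines of Python. Everything in this example is an assumption made for illustration: the DeliveryFailed exception, the send_with_retries and flaky_network functions, and the fixed retry budget are hypothetical, and WS-RM itself does not prescribe a retry count or an exception type. The point is only that retransmission absorbs a short outage while a long one still reaches application code that has to deal with it.

```python
class DeliveryFailed(Exception):
    """Hypothetical error raised once the retry budget is spent."""


def send_with_retries(send, message, max_attempts=5):
    """Call `send(message)` until it reports an acknowledgment.

    `send` returns True when the message was acknowledged. A real
    implementation would wait between attempts; that is omitted here.
    Returns the number of attempts that were needed.
    """
    for attempt in range(1, max_attempts + 1):
        if send(message):
            return attempt
    raise DeliveryFailed(f"no acknowledgment after {max_attempts} attempts")


def flaky_network(failures):
    """Build a send function that fails `failures` times, then succeeds."""
    remaining = [failures]

    def send(message):
        if remaining[0] > 0:
            remaining[0] -= 1
            return False
        return True

    return send


# A short outage is smoothed over by retransmission.
attempts_used = send_with_retries(flaky_network(2), "invoice-42")

# A long outage still surfaces, and the application must handle it.
try:
    send_with_retries(flaky_network(100), "invoice-43")
    outcome = "acknowledged"
except DeliveryFailed:
    outcome = "escalate to a person"
```

The exception handler is simpler than it would be without a reliable messaging layer, because it does not need its own retry logic: by the time it runs, the retries have already been spent.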

In addition to this, WS-RM specifies that messages are acknowledged when they are received by the RMD, not when they are delivered to the AD. At the point when an RMS receives an acknowledgment for a message you cannot be absolutely certain that the AD got that message. However, this line of thinking quickly gets digressive. Even if you did know that the message had been delivered to the AD, you couldn't be sure that the AD didn't subsequently crash before it could save that message, etc. One party of a distributed interaction can never be absolutely certain of what is happening to the other party in that interaction unless it receives some form of signal from that party. WS-RM could have defined an entire set of acknowledgments (e.g. message received, message delivered, message validated, message persisted), but it's not clear that the need for this kind of “acknowledgment framework” is great enough to justify the added complexity. What WS-RM does is solve the most pressing issue, namely “did this message make it through to the 'other side'”? It does this in the simplest way possible by acknowledging the message when it is received by the RMD. Barring events such as an OS crash, disk failure, etc., applications can reasonably expect that the RMD will eventually deliver the message to the AD.
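The gap between "received by the RMD" and "delivered to the AD" is easy to see in a toy model. The Python sketch below is an illustration only: the AckOnReceiptRMD class and its receive and deliver_pending methods are hypothetical names, and a real RMD would typically persist its inbox rather than keep it in memory.

```python
from collections import deque


class AckOnReceiptRMD:
    """Hypothetical RMD that acknowledges on receipt and delivers later."""

    def __init__(self):
        self.inbox = deque()   # received and acknowledged, not yet delivered
        self.delivered = []    # what the Application Destination has seen

    def receive(self, message_number, payload):
        self.inbox.append((message_number, payload))
        # The acknowledgment goes back to the RMS now, before delivery.
        return message_number

    def deliver_pending(self):
        """Hand every held message to the Application Destination."""
        while self.inbox:
            self.delivered.append(self.inbox.popleft())


rmd = AckOnReceiptRMD()
ack = rmd.receive(1, "invoice-A")

# At this instant the RMS holds an acknowledgment, yet the AD has seen nothing.
acked_but_undelivered = (ack == 1 and rmd.delivered == [])

rmd.deliver_pending()
```

If the RMD in this sketch were to fail between receive and deliver_pending, the sender would hold an acknowledgment for a message the application never saw, which is exactly the window the paragraph above describes.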

Sessions

A common misconception about WS-RM is that it is “TCP at the SOAP level” [4]. Although there are similarities between WS-RM and TCP, they are more different than they are alike. Foremost amongst these differences is the level of session support provided by the two technologies. One of TCP's main purposes is to provide a “communications session” between two applications. This is not one of the goals of WS-RM. While it is true that a WS-RM Sequence is a kind of session, the purpose of a Sequence is to provide a scope for the messages and acknowledgments exchanged between the RMS and RMD. The WS-RM specification says nothing about exposing Sequences to the AS or AD, nor are there any guarantees about the behavior an AS or AD may expect from a Sequence. There isn't even a guarantee of a one-to-one relationship between applications (AS-AD pairs) and Sequences. Some vendors' WS-RM architectures implement the RMS and the RMD as independent gateways that multiplex several “application sessions” over a single Sequence. The bottom line is that, if your WS-RM implementation exposes the Sequence and you use that Sequence as an application-level session, don't be surprised if your code won't interoperate with code that uses a different WS-RM implementation.

Assurances (and the Lack Thereof)

One of the more difficult aspects of the WS-RM specification is its treatment of “delivery assurances”. Roughly speaking, a delivery assurance is a contract to deliver a message to the AD only when certain conditions have been met. For example, “exactly once” refers to an assurance wherein the RMD will deliver a particular message (defined by its Sequence ID and message number) to the AD once and only once. Duplicate messages (the result of retry attempts, network hiccups, lost acknowledgments, etc.) will be dropped by the RMD. “In order” is an assurance wherein the RMD delivers messages to the AD in the same order that the AS sent them to the RMS.
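As a rough illustration of how an RMD might combine “exactly once” with “in order”, consider the Python sketch below. It is one possible approach, not the behavior of any particular implementation: the OrderedExactlyOnceRMD class, its receive method, and the deliver callback are hypothetical, and the sketch assumes message numbers within a Sequence start at 1 and increase by 1.

```python
class OrderedExactlyOnceRMD:
    """Hypothetical RMD that drops duplicates and delivers in message order."""

    def __init__(self, deliver):
        self.deliver = deliver    # callback into the Application Destination
        self.next_expected = {}   # sequence_id -> next number to deliver
        self.held = {}            # sequence_id -> {number: payload}

    def receive(self, sequence_id, message_number, payload):
        expected = self.next_expected.setdefault(sequence_id, 1)
        held = self.held.setdefault(sequence_id, {})
        if message_number < expected or message_number in held:
            return  # duplicate of something delivered or already held: drop it
        held[message_number] = payload
        # Deliver every message that is now contiguous with what came before.
        while expected in held:
            self.deliver(sequence_id, expected, held.pop(expected))
            expected += 1
        self.next_expected[sequence_id] = expected


delivered = []
rmd = OrderedExactlyOnceRMD(
    lambda sequence_id, number, payload: delivered.append((number, payload))
)

# Messages arrive out of order and with duplicates caused by retransmission.
arrivals = [(2, "b"), (1, "a"), (1, "a"), (3, "c"), (2, "b")]
for number, payload in arrivals:
    rmd.receive("seq-1", number, payload)
```

Message 2 is held until message 1 arrives, and the repeated copies of messages 1 and 2 are discarded, so the application sees a, b, c exactly once each.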

By its nature the WS-RM protocol is well suited for supporting “at least once”, “exactly once”, as well as ordered and non-ordered versions of both of these assurances. However, for reasons we don't have the time to go into, WS-RM treats any designation of these assurances as a local contract between the AD and the RMD. Within the limits of the WS-RM specification, neither the AS nor the RMS is capable of discovering the details of this contract either at runtime or via the service description.

Appendix: References

[1] http://specs.xmlsoap.org/ws/2005/02/rm/ws-reliablemessaging.pdf

[2] http://www-128.ibm.com/developerworks/webservices/library/ws-securtrans/index.html

[3] http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=ws-rx#technical

[4] http://blogs.msdn.com/shycohen/archive/2006/02/20/535717.aspx


Comments

  • I think the subject is very important and this is a quite good presentation on it. "Even if you did know that the message had been delivered to the AD, you couldn't be sure that the AD didn't subsequently crash before it could save that message, etc. One party of a distributed interaction can never be absolutely certain of what is happening to the other party in that interaction unless it receives some form of signal from that party" is the core for me. Sometimes, it is important for the Sender to know if the message has been actually received by the final Receiver. However, I prefer to have an infrastructure, which I can rely on with regard to GUARANTEED delivery and which frees the Sender from knowing about the fact of delivery. WS-RM architecture with durable subscription addresses my task at the application level; now I need only support from the hardware for this matter. I was able to successfully use WS-RM on the WebLogic platform for asynchronous delivery of audit messages in a security system. You can read about this example in "Assured Delivery of Audit Data", WLDJ, Nov-Dec 2005, Vol. 4, Issue 6. - Michael Pouln

    Posted by: m3poulin on May 29, 2006 at 3:52 AM


