The Straight Stuff on WS-ReliableMessaging
Gilbert Pilz's Blog | May 18, 2006 11:57 AM
Eagerly awaited yet widely misunderstood, Web Services Reliable
Messaging (WS-ReliableMessaging) “describes a protocol that
allows messages to be delivered reliably between distributed
applications in the presence of software component, system, or
network failures”. Version 1.0 of the specification [1]
was published by Microsoft, IBM, BEA and TIBCO. At the time of this
writing, version 1.1 is under development by the OASIS
Web Services Reliable Exchange (WS-RX) Technical Committee.
Although the basic purpose of this specification is clear enough,
there are a number of misconceptions about its fundamental nature.
Where's the Boeuf?
Like most of the WS-* specifications, the surprising thing about
WS-RM (to further abbreviate WS-ReliableMessaging) is how little it,
well, specifies. This sounds negative, but it is actually one
of the strengths of the WS-* specifications (see Secure,
Reliable, Transacted Web Services [2] for
an explanation of why discrete, minimal specifications are a good
thing). WS-RM specifies a SOAP protocol for sending a message and
getting an acknowledgment when that message is received. WS-RM also
specifies the resending of messages that have not been acknowledged.
That's it.
To understand why such a simple concept has value we need to think
about the problem that WS-RM is designed to address. Suppose we have
a piece of software that communicates with another piece of software
to achieve some business function: sending an invoice, for example.
This interaction is inherently asynchronous. That is, we don't expect
to get an immediate response to the invoice because we know that
there is a certain amount of processing (perhaps involving people)
that needs to occur before the invoice is accepted. Furthermore, we
know that these two systems won't communicate with one another
directly. Instead the messages they exchange will be routed through a
series of intermediaries such as service hubs, security gateways,
etc. As the sender, one of our immediate and primary concerns is
whether the invoice made it through all those hops and reached its
intended recipient. Previous solutions to this problem, such as the
RosettaNet Implementation Framework, have tended to bake the solution into
the application protocol. WS-RM solves this problem in a generic
manner that allows the solution to be applied to multiple protocols.
A Little More Detail
To really understand the limits and
capabilities of WS-RM it's necessary to know a little bit more about
how it works. Figure 1 below is lifted directly from the
WS-ReliableMessaging specification [3]. It
illustrates two entities, an Application Source (AS) and an
Application Destination (AD) communicating via WS-RM. The AS sends a
message to the RM Source (RMS). The RMS then transmits the message to
the RM Destination (RMD) using the WS-RM protocol. Once it has
received the message, the RMD delivers it to the AD.

As part of the protocol the RMD sends
acknowledgments of the messages it has received back to the RMS. For
its part the RMS is responsible for holding onto messages and
retransmitting them until it receives an acknowledgment from the RMD.
The interactions between an RMS and RMD occur within the context of
protocol-level sessions termed “Sequences”. Any message
sent from the RMS to the RMD can be uniquely identified by the ID of
the Sequence in which it was sent and the number of the message
within that Sequence.
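The bookkeeping on the RMS side can be sketched in a few lines of Python. This is only an illustration of the idea described above, not the wire protocol: the class name `RMSource`, its methods, and the Sequence ID `seq-1` are all hypothetical, and the transport that actually carries messages and acknowledgments is left out. Message numbers are assumed to start at 1 and increase by one.

```python
from dataclasses import dataclass, field


@dataclass
class RMSource:
    """Toy RM Source: holds messages until the RM Destination acknowledges them."""

    sequence_id: str
    next_number: int = 1
    unacked: dict = field(default_factory=dict)  # message number -> payload

    def accept(self, payload):
        """Take a message from the Application Source and assign it a number."""
        number = self.next_number
        self.next_number += 1
        self.unacked[number] = payload
        # The (Sequence ID, message number) pair uniquely identifies the message.
        return (self.sequence_id, number)

    def pending(self):
        """Messages to (re)transmit: everything not yet acknowledged."""
        return [(self.sequence_id, n, p) for n, p in sorted(self.unacked.items())]

    def on_ack(self, acked_numbers):
        """Stop holding the messages the RM Destination reports as received."""
        for n in acked_numbers:
            self.unacked.pop(n, None)


rms = RMSource("seq-1")
for invoice in ("invoice A", "invoice B", "invoice C"):
    rms.accept(invoice)

rms.on_ack([1, 3])    # the RMD acknowledged messages 1 and 3 only
print(rms.pending())  # [('seq-1', 2, 'invoice B')]
```

Only the unacknowledged message remains a candidate for retransmission; everything else has been released by the RMS.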
To relate this model to our invoice delivery example: our
invoicing system is the Application Source and our customer's
purchasing system is the Application Destination. The RMS and RMD
nodes are components of our respective WS-RM implementations.
The Asynchronous Sweet Spot
Returning to our invoice delivery example, you will remember that
we described an asynchronous interaction. Although WS-RM adds some
value to synchronous interactions, it provides the most value in the
case of asynchronous interactions. It's not difficult to see why
this is so. In general, synchronous operations specify fairly short
timeouts (minutes rather than hours). Any failure between the
requesting system and the providing system will be discovered in
short order. On the other hand, suppose the agreement between us and
our customer specifies a maximum turnaround time for invoices of 72
hours. If our invoice fails to reach our customer's system it's
possible that it will take three days to discover this. The value of
receiving timely acknowledgments (or an exception if the message is
never acknowledged) is obvious.
What's In a Name?
Some of the confusion around WS-RM stems from the word “reliable”.
WS-RM doesn't do anything to increase the intrinsic reliability of
your software or its underlying infrastructure. WS-RM smooths over
temporary network and service outages but if, for example, you lose
your network for 24 hours there is nothing WS-RM can magically do to
get your messages to their destination. This is enormously important
because it means that using WS-RM does not free you from having to
write the exception handling logic to deal with the cases where your
messages do not reach their destination. WS-RM allows you to simplify
this logic since you can be certain that, prior to triggering an
exception, WS-RM has already made every attempt to send the message.
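A rough Python sketch of this division of labor follows. It is an assumption-laden illustration rather than how any WS-RM stack is actually written: `send_reliably`, `DeliveryFailed`, `flaky_link`, and the fixed attempt limit are all hypothetical names and choices, and `transmit` stands in for one complete send-and-wait-for-acknowledgment round trip.

```python
class DeliveryFailed(Exception):
    """Raised when no acknowledgment arrived within the allowed attempts."""


def send_reliably(transmit, message, max_attempts=3):
    """Call transmit(message) until it returns True (message acknowledged).

    Returns the number of attempts used, or raises DeliveryFailed.
    """
    for attempt in range(1, max_attempts + 1):
        if transmit(message):
            return attempt
    raise DeliveryFailed(f"no acknowledgment after {max_attempts} attempts")


def flaky_link(fail_times):
    """Build a fake transport that drops the first fail_times transmissions."""
    state = {"calls": 0}

    def transmit(message):
        state["calls"] += 1
        return state["calls"] > fail_times

    return transmit


# A short outage is smoothed over by retransmission.
print(send_reliably(flaky_link(2), "invoice"))  # 3

# A long outage still surfaces as an exception the application must handle.
try:
    send_reliably(flaky_link(10), "invoice")
    outcome = "delivered"
except DeliveryFailed as exc:
    outcome = f"escalate: {exc}"
print(outcome)  # escalate: no acknowledgment after 3 attempts
```

The application still needs the `except` branch, but that branch can be simple: by the time it runs, the retries have already been exhausted.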
In addition to this, WS-RM specifies that messages are
acknowledged when they are received by the RMD, not when they
are delivered to the AD. At the point when an RMS receives an
acknowledgment for a message you cannot be absolutely certain that
the AD got that message. However, this line of thinking quickly gets
digressive. Even if you did know that the message had been delivered
to the AD, you couldn't be sure that the AD didn't subsequently crash
before it could save that message, etc. One party of a distributed
interaction can never be absolutely certain of what is happening to
the other party in that interaction unless it receives some form of
signal from that party. WS-RM could
have defined an entire set of acknowledgments (e.g. message
received, message delivered, message validated, message persisted),
but it's not clear that the need for this kind of “acknowledgment
framework” is great enough to justify the added complexity.
What WS-RM does is solve the most pressing issue, namely “did
this message make it through to the 'other side'”? It does this
in the simplest way possible by acknowledging the message when it is
received by the RMD. Barring events such as an OS crash, disk failure,
etc., applications can reasonably expect that the RMD will eventually
deliver the message to the AD.
Sessions
A common misconception about WS-RM is that it is “TCP at the
SOAP level” [4]. Although there are
similarities between WS-RM and TCP, they are more different than they
are alike. Foremost amongst these differences is the level of session
support provided by the two technologies. One of TCP's main purposes
is to provide a “communications session” between two
applications. This is not one of the goals of WS-RM. While it
is true that a WS-RM Sequence is a kind of session, the purpose of a
Sequence is to provide a scope for the messages and acknowledgments
exchanged between the RMS and RMD. The WS-RM specification says
nothing about exposing Sequences to the AS or AD, nor are
there any guarantees about the behavior an AS or AD may expect from a
Sequence. There isn't even a guarantee of a one-to-one relationship
between applications (AS-AD pairs) and Sequences. Some vendors' WS-RM
architectures implement the RMS and the RMD as independent gateways
that multiplex several “application sessions” over a
single Sequence. The bottom line is that, if your WS-RM
implementation exposes the Sequence and you use that Sequence as an
application-level session, don't be surprised if your code won't
interoperate with code that uses a different WS-RM implementation.
Assurances (and the Lack Thereof)
One of the more difficult aspects of the WS-RM specification is
its treatment of “delivery assurances”. Roughly speaking,
a delivery assurance is a contract to deliver a message to the AD
only when certain conditions have been met. For example, “exactly
once” refers to an assurance wherein the RMD will deliver a
particular message (defined by its Sequence ID and message number)
to the AD once and only once. Duplicate messages (the result of retry
attempts, network hiccups, lost acknowledgments, etc.) will be
dropped by the RMD. “In order” is an assurance wherein
the RMD delivers messages to the AD in the same order that the AS
sent them to the RMS.
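To make the combination of these two assurances concrete, here is a minimal Python sketch of an RMD that delivers exactly once and in order within a single Sequence. The class `RMDestination`, its `deliver` callback, and the choice to return the full list of received numbers as the acknowledgment are assumptions made for illustration; the specification does not prescribe this structure.

```python
class RMDestination:
    """Toy RM Destination for one Sequence: exactly-once, in-order delivery."""

    def __init__(self, deliver):
        self.deliver = deliver   # callback into the Application Destination
        self.next_expected = 1   # lowest message number not yet delivered
        self.held = {}           # out-of-order messages awaiting delivery
        self.received = set()    # every message number seen so far

    def on_message(self, number, payload):
        """Handle one arrival and return the sorted numbers to acknowledge."""
        if number not in self.received:
            self.received.add(number)
            self.held[number] = payload
            # Release the longest run of consecutive messages now available.
            while self.next_expected in self.held:
                self.deliver(self.held.pop(self.next_expected))
                self.next_expected += 1
        # Duplicates fall through: acknowledged again, never redelivered.
        return sorted(self.received)


delivered = []
rmd = RMDestination(delivered.append)
rmd.on_message(2, "B")         # arrives early, held back
rmd.on_message(1, "A")         # releases A, then B
rmd.on_message(2, "B")         # retransmitted duplicate, dropped
acks = rmd.on_message(3, "C")
print(delivered)  # ['A', 'B', 'C']
print(acks)       # [1, 2, 3]
```

Note that message 2 is acknowledged as soon as it is received, even though it is not delivered to the AD until message 1 arrives.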
By its nature the WS-RM protocol is well suited for supporting “at
least once”, “exactly once”, as well as ordered and
non-ordered versions of both of these assurances. However, for
reasons we don't have the time to go into, WS-RM treats any
designation of these assurances as a local contract between the AD
and the RMD. Within the limits of the WS-RM specification, neither
the AS nor the RMS is capable of discovering the details of this
contract either at runtime or via the service description.
Appendix: References
[1] http://specs.xmlsoap.org/ws/2005/02/rm/ws-reliablemessaging.pdf
[2] http://www-128.ibm.com/developerworks/webservices/library/ws-securtrans/index.html
[3] http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=ws-rx#technical
[4] http://blogs.msdn.com/shycohen/archive/2006/02/20/535717.aspx
Comments
-
I think the subject is very important and this is a quite good presentation on it.
"Even if you did know that the message had been delivered to the AD, you couldn't be sure that the AD didn't subsequently crash before it could save that message, etc. One party of a distributed interaction can never be absolutely certain of what is happening to the other party in that interaction unless it receives some form of signal from that party" is the core for me.
Sometimes, it is important for the Sender to know if the message has been actually received by the final Receiver. However, I prefer to have an infrastructure, which I can rely on with regard to GUARANTEED delivery and which frees the Sender from knowing about the fact of delivery.
WS-RM architecture with durable subscription addresses my task at the application level; now I need only a support from the hardware for this matter.
I was able to successfully use WS-RM on the WebLogic platform for asynchronous delivery of audit messages in a security system. You can read about this example in "Assured Delivery of Audit Data", WLDJ, Nov-Dec 2005, Vol. 4, Issue 6.
- Michael Pouln
Posted by: m3poulin on May 29, 2006 at 3:52 AM