
Introducing SDO

by John Beatty
05/24/2004

Service Data Objects (SDO) is a specification recently published jointly by BEA and IBM, and is being standardized through the Java Community Process by the JSR-235 Expert Group. SDO is a data programming architecture and API for the Java platform that unifies data programming across data source types, provides robust support for common application patterns, and enables applications, tools, and frameworks to more easily query, read, update, and introspect data. For a high-level overview of the SDO architecture, see the white paper titled Next-Generation Data Programming: Service Data Objects (pdf).

This article discusses the design and motivation behind the SDO APIs. Note that the SDO API discussed in this article is the version published jointly by BEA and IBM in November 2003. The output of the JSR-235 Expert Group will likely be different from this API.

Overview

The SDO APIs include a dynamic data API, a data type introspection API, and a data change tracking API. If you don't know what all that means, don't despair: I'll start with the simplest concepts and work up.

DataObject: When Java Beans Aren't an Option

SDO is based on the concept of a "data object", which is simply an object instance containing some data. Normally people use plain-old Java objects (POJOs, or Java beans) or plain-old Java interfaces (POJIs) for representing data in a persistence-mechanism-neutral manner (more on relational and XML data shortly). For example, people commonly construct "data transfer objects" for this purpose using POJOs.

We call Java bean-style APIs "static" because pre-defined data types with a discrete set of properties (or getter/setter methods) already exist. Static data APIs are not always feasible, however, because sometimes the Java classes don't even exist. For example, in many dynamic queries, the shape of the returned data is not known a priori, and thus we can't stuff the data into pre-existing Java classes. Another example is data structures that are extensible; for example, with XML data, you often don't know the precise shape of it before you parse it (assuming the schema is extensible).

This is where the SDO DataObject interface is handy: it provides a "dynamic" data API. Having a dynamic data API is especially useful when you need to create a generic framework that can support scenarios involving dynamic queries, unknown data shapes, and extensible schemas.

The basic operations on DataObject are set([property name], [property value]) and get([property name]). There are a lot more methods on DataObject, but we'll get into those in a moment. Let's look at some code. Say that I have a Person interface like this:


        public interface Person {
                String getName();
                void setName(String name);
        }


Then mock-up client code would look like the following (assume a PersonImpl Java bean exists that implements the Person interface):


        Person p = new PersonImpl();
        p.setName("John");
        System.out.println(p.getName());


This is what the Java programmer is used to for many programming tasks. But what if the person interface doesn't exist at runtime when I need to process person data? Then we can use DataObject. Presuming there is an implementation of DataObject called DataObjectImpl with a default constructor (note that the core SDO spec only defines the interfaces), I could write the following code, which accomplishes the same thing as above:


 
        DataObject o = new DataObjectImpl();
        o.set("name", "John");
        System.out.println(o.get("name"));


The observant reader will notice that there's an important piece of data lost in the above: the DataObject instance doesn't know that it's a person, and thus the client can't do any runtime type checking. SDO handles this, and we'll discuss it momentarily.
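To make the dynamic style concrete, here is a minimal sketch of how such an untyped data object might be backed by a map. MapBackedObject and its isSet() helper are hypothetical names invented for this illustration; they are not part of the SDO API, and a real SDO implementation will certainly do more.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a dynamic data object; NOT an SDO class.
class MapBackedObject {
    // Property values keyed by property name, kept in insertion order.
    private final Map<String, Object> values = new LinkedHashMap<String, Object>();

    // Store a value under the given property name.
    void set(String propertyName, Object value) {
        values.put(propertyName, value);
    }

    // Return the stored value, or null if the property was never set.
    Object get(String propertyName) {
        return values.get(propertyName);
    }

    // Report whether a value has been stored for the property.
    boolean isSet(String propertyName) {
        return values.containsKey(propertyName);
    }
}

class MapBackedObjectDemo {
    public static void main(String[] args) {
        MapBackedObject o = new MapBackedObject();
        o.set("name", "John");
        System.out.println(o.get("name"));   // prints John
        System.out.println(o.isSet("age"));  // prints false
    }
}
```

Like the DataObject snippet above, nothing in this sketch records that the object represents a person; the property names are just map keys.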

Taking a step back, it's instructive to note that the need for a DataObject API in Java arises because Java is a statically typed language that does not enable additional fields and methods to be added to an object instance at runtime. Not all languages are like this. Notably, Python is a dynamically typed language, and it allows attributes (which are equivalent to Java fields) to be added to an object instance at runtime. For example, the following Python code is roughly equivalent to the Java code using DataObject above:


  
        o = dataobject()
        setattr(o, "name", "John")
        print(o.name)


The above code assumes there is some Python class defined named "dataobject", which could simply be defined as:


  
        class dataobject:
                pass


I don't show the Python code to make value judgments between Python and Java; rather, it shows more explicitly why Java needs a class like DataObject if you need a dynamic data API.

Getting Explicit

DataObject offers more than the set() and get() methods that simply take and return java.lang.Object. In normal POJOs or POJIs, the accessors are typed more specifically than that. That is, they deal with String, int, Calendar, and so on, rather than Object. The DataObject interface allows its properties to be typed as well, and provides additional getters and setters for that purpose. This is similar to the variety of getters and setters offered by java.lang.reflect.Field. For example, in the DataObject example, I could have done the following:


        DataObject o = new DataObjectImpl();
        o.setString("name", "John");
        System.out.println(o.getString("name"));


Of course, changing the last line of code above wasn't strictly necessary in this case, but if you needed to operate on the returned object as a String, the explicit getString() (or a cast) is required.
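The following sketch shows one way typed accessors could be layered on top of generic storage. TypedBag is a hypothetical class, and the conversion rules it applies (toString() for strings, parsing for ints) are assumptions made for the example; the SDO specification defines its own rules.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical class illustrating typed accessors; NOT an SDO class.
class TypedBag {
    private final Map<String, Object> values = new HashMap<String, Object>();

    // Generic accessors: take and return Object.
    void set(String name, Object value) { values.put(name, value); }
    Object get(String name) { return values.get(name); }

    // Typed setters simply store the value.
    void setString(String name, String value) { values.put(name, value); }
    void setInt(String name, int value) { values.put(name, Integer.valueOf(value)); }

    // Returns the value as a String, or null if the property is unset.
    String getString(String name) {
        Object value = values.get(name);
        return value == null ? null : value.toString();
    }

    // Returns the value as an int, converting from Number or String.
    // Throws ClassCastException for unset or non-convertible values
    // (an assumption of this sketch).
    int getInt(String name) {
        Object value = values.get(name);
        if (value instanceof Number) return ((Number) value).intValue();
        if (value instanceof String) return Integer.parseInt((String) value);
        throw new ClassCastException("Property '" + name + "' is not convertible to int");
    }
}

class TypedBagDemo {
    public static void main(String[] args) {
        TypedBag o = new TypedBag();
        o.setString("name", "John");
        // No cast needed: the result is already a String.
        System.out.println(o.getString("name").toUpperCase()); // prints JOHN
        o.set("age", "42");
        System.out.println(o.getInt("age") + 1);               // prints 43
    }
}
```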

Java's Other Dynamic Data APIs

Java had other dynamic data APIs before DataObject came around. Notably, the JDBC ResultSet and RowSet APIs are dynamic data APIs for relational data, and the DOM API (specifically, node and element) is a dynamic data API for XML data. Equivalent code using these APIs would roughly be as follows (for brevity, I'll show the gets and ignore the sets):

Using the RowSet API:


        RowSet rs = ...;
        System.out.println(rs.getString("name"));


And using the DOM API:


        Node n = ...; // find the text node under the name element
        System.out.println(n.getNodeValue());


The salient point here is that SDO's DataObject interface is a generic dynamic data API, meaning that it can be used independent of any particular persistence mechanism or serialization format. It's designed to work well with object data, relational data, tabular data, and XML. It allows higher-level frameworks to work with data generically across heterogeneous data sources.
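As a rough illustration of that idea, the sketch below puts one small read interface in front of two different backing stores: a map standing in for a relational row, and a DOM element. DynamicRecord, RowRecord, and XmlRecord are hypothetical names made up for this example and are not SDO interfaces; only the DOM and collection calls are standard Java.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical source-neutral read API; NOT an SDO interface.
interface DynamicRecord {
    String getString(String name);
}

// Backed by a column-name-to-value map, standing in for a relational row.
class RowRecord implements DynamicRecord {
    private final Map<String, Object> columns;
    RowRecord(Map<String, Object> columns) { this.columns = columns; }
    public String getString(String name) {
        Object value = columns.get(name);
        return value == null ? null : value.toString();
    }
}

// Backed by a DOM element; a property is the text of the first
// descendant element with the given tag name.
class XmlRecord implements DynamicRecord {
    private final Element element;
    XmlRecord(Element element) { this.element = element; }
    public String getString(String name) {
        NodeList matches = element.getElementsByTagName(name);
        return matches.getLength() == 0 ? null : matches.item(0).getTextContent();
    }
}

class DynamicRecordDemo {
    // Parse an XML string and return its root element.
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // "Framework" code: works against the neutral interface only.
    static String describe(DynamicRecord record) {
        return "name=" + record.getString("name");
    }

    public static void main(String[] args) {
        Map<String, Object> row = new HashMap<String, Object>();
        row.put("name", "John");
        System.out.println(describe(new RowRecord(row)));  // prints name=John
        System.out.println(describe(
                new XmlRecord(parse("<person><name>John</name></person>")))); // prints name=John
    }
}
```

The describe() method never learns which kind of store it is reading from, which is the property a generic framework needs.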

Types and Properties

Statically typed data APIs like POJOs and POJIs have all the type information hard-coded into them. The interface or class defines the type, and this type has properties accessed with statically typed getters and setters. And the types can be introspected with java.lang.Class and the java.lang.reflect APIs. These can be used for a variety of things, ranging from simple runtime type tests to having generic frameworks operate on your Java objects. In the case where Java classes and interfaces don't exist, obviously this approach doesn't work. We need an equivalent to java.lang.Class and java.lang.reflect.Field in SDO. This is the role that the Type and Property interfaces play. Type is roughly equivalent to Class, and Property is roughly equivalent to Field.

Let's look at some code. In the case of a person interface, I can play around with type information:


        Object o = ...; // set o to some Person object
        if (o instanceof Person) {
          Person p = (Person) o;
          System.out.println(p.getName());
        }


Using the SDO APIs, I would write the code as follows:


        DataObject o = ...; // set o to some DataObject
        Type t = o.getType();
        if ("Person".equals(t.getName())) {
          System.out.println(o.getString("name"));
        }


Note that for the above to work properly, the type information had to come from somewhere. You'll notice that in the code above I hypothesized a simple DataObjectImpl object that was constructed without any parameters--this would need to be changed if you wanted to create a DataObject with a type. You may be asking the question: Why doesn't SDO define concrete classes to build up SDOs from scratch? This may be done eventually. We wanted to focus first on providing a client API that could consume SDOs, with the assumption that SDO-enabled products will construct SDOs in a proprietary fashion for now. This is subject to change over time.
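For illustration only, here is a sketch of what carrying type information alongside a dynamic object could look like. SimpleType and TypedObject are hypothetical classes invented for this example; they are not how any SDO product constructs typed DataObjects, which (as noted above) is left to implementations.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical type descriptor: a name plus the allowed property names.
class SimpleType {
    private final String name;
    private final Set<String> propertyNames;

    SimpleType(String name, String... propertyNames) {
        this.name = name;
        this.propertyNames = new LinkedHashSet<String>(Arrays.asList(propertyNames));
    }

    String getName() { return name; }
    boolean hasProperty(String propertyName) { return propertyNames.contains(propertyName); }
}

// Hypothetical dynamic object that knows its type and enforces it on set().
class TypedObject {
    private final SimpleType type;
    private final Map<String, Object> values = new HashMap<String, Object>();

    TypedObject(SimpleType type) { this.type = type; }

    SimpleType getType() { return type; }

    void set(String propertyName, Object value) {
        // Reject properties the type does not declare.
        if (!type.hasProperty(propertyName)) {
            throw new IllegalArgumentException(
                    type.getName() + " has no property '" + propertyName + "'");
        }
        values.put(propertyName, value);
    }

    Object get(String propertyName) { return values.get(propertyName); }
}

class TypedObjectDemo {
    public static void main(String[] args) {
        SimpleType personType = new SimpleType("Person", "name");
        TypedObject o = new TypedObject(personType);
        o.set("name", "John");
        // Runtime type test, analogous to instanceof for static types.
        if ("Person".equals(o.getType().getName())) {
            System.out.println(o.get("name")); // prints John
        }
    }
}
```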

Besides just asking a DataObject for its type, you can also ask it for its set of Property objects. This enables tools to traverse the graph of DataObjects. The code below is such a tool; it pretty-prints a DataObject and its properties:


        public void printDataObject(DataObject dataObject, int indent) {
          Type type = dataObject.getType();
          List properties = type.getProperties();
          for (int p=0, size=properties.size(); p < size; p++) {
            if  (dataObject.isSet(p)) {
              Property property = (Property) properties.get(p);
              // For many-valued properties, process a list of values 
              if (property.isMany()) {
                List values = dataObject.getList(p);
                for (int v=0, count=values.size(); v < count; v++) {
                  printValue(values.get(v), property, indent);
                }
              } else { // For single-valued properties, print out the value
                printValue(dataObject.get(p), property, indent);
              }
            }
          }
        }

        private void printValue(Object value, Property property, int indent) {
          // Get the name of the property
          String propertyName = property.getName();
          // Construct a string for the proper indentation
          String margin = "";
          for (int i = 0; i < indent; i++) margin += "\t";
          if (!property.isContainment()) {
            // For non-containment properties, just print the value
            System.out.println(margin + propertyName + ": " + value);
          } else {
            // For containment properties, display the value with printDataObject
            String typeName = property.getType().getName();
            System.out.println(margin + propertyName + " (" +typeName+ "):");
            printDataObject((DataObject) value, indent + 1);
          }
        }


You'll notice from the above code that there are some ins-and-outs with the SDO APIs that we haven't covered. Check out the SDO specification for more details.

Marrying Java Beans and SDO

Pre-defined POJOs and POJIs representing your data are easier to work with than the DataObject interface. As I've pointed out earlier, this isn't always possible. If I need to write a framework that can deal with the least common denominator, that framework should work with the DataObject interface. But we don't want to leave our Java beans users out in the cold. So what can we do? There are a few possible strategies here.

Let's take JAXB as an example. A JAXB-compliant tool can generate POJIs from an XML schema definition (XSD), and this tool will also generate "Impl" classes that implement those interfaces. One could envision a JAXB schema compiler being SDO-enabled whereby the Impl classes also implement the DataObject interface. Thus, frameworks could take a JAXB-generated object and successfully cast it to DataObject.
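A minimal sketch of this strategy follows. It is hand-written rather than generated, and DynamicAccess is a hypothetical two-method stand-in for the DataObject interface (which has many more methods), so treat it as an outline of the idea rather than what a JAXB tool would emit.

```java
// The static, bean-style view of the data.
interface Person {
    String getName();
    void setName(String name);
}

// Hypothetical stand-in for the dynamic DataObject interface.
interface DynamicAccess {
    Object get(String propertyName);
    void set(String propertyName, Object value);
}

// An "Impl" class that exposes both views over the same field.
class PersonImpl implements Person, DynamicAccess {
    private String name;

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }

    // Dynamic accessors dispatch on the property name.
    public Object get(String propertyName) {
        if ("name".equals(propertyName)) return name;
        throw new IllegalArgumentException("Unknown property: " + propertyName);
    }

    public void set(String propertyName, Object value) {
        if ("name".equals(propertyName)) {
            name = (String) value;
            return;
        }
        throw new IllegalArgumentException("Unknown property: " + propertyName);
    }
}

class BeanBridgeDemo {
    // Generic framework code: knows only the dynamic interface.
    static String dump(Object bean, String propertyName) {
        DynamicAccess dynamic = (DynamicAccess) bean;
        return propertyName + ": " + dynamic.get(propertyName);
    }

    public static void main(String[] args) {
        Person p = new PersonImpl();   // application code uses the static view
        p.setName("John");
        System.out.println(dump(p, "name")); // prints name: John
    }
}
```

Application code keeps the convenience of typed getters and setters, while the framework sees the same object through the dynamic interface.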

Change Tracking

SDOs have a good memory: they can remember what happened in the past and tell you exactly what changed. Why is this useful? A very common access pattern is for a client (e.g., a Web application) to receive some information from a data source, change some of that information, and then have the data source commit those changes. In the SDO architecture, we call the service that is responsible for answering queries and committing changes a "mediator". The mediator front-ends a data source to accomplish this. Thus, you could have SQL/relational mediators, XML query mediators hitting XML data stores, and all sorts of combinations thereof. For the mediator to accomplish its task of committing changes to a DataObject, it needs to be able to ask the DataObject what has changed.

The DataGraph interface provides a convenient mechanism to pass around a set of data. The DataGraph interface provides the getChangeSummary() method, which hands back a ChangeSummary object. From this object, you can get a list of data objects that have changed and what the new values are. There is an additional method getOldValues() that can tell you what the old value for a DataObject was.
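To show the general idea of remembering old values, here is a small sketch of a single object that records the first pre-change value of each property once logging is switched on. TrackedObject and its method signatures are hypothetical and deliberately simplified; the actual ChangeSummary works across a whole DataGraph and its API is defined in the SDO specification.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical change-tracking object; NOT the SDO ChangeSummary API.
class TrackedObject {
    private final Map<String, Object> values = new HashMap<String, Object>();
    // Original value of each property modified since logging began.
    private final Map<String, Object> oldValues = new LinkedHashMap<String, Object>();
    private boolean logging;

    // Start recording changes from the current state.
    void beginLogging() {
        oldValues.clear();
        logging = true;
    }

    void set(String name, Object value) {
        // Remember only the FIRST old value, so repeated edits still
        // report the value the data source originally supplied.
        if (logging && !oldValues.containsKey(name)) {
            oldValues.put(name, values.get(name)); // null if previously unset
        }
        values.put(name, value);
    }

    Object get(String name) { return values.get(name); }

    boolean isChanged() { return !oldValues.isEmpty(); }

    // Read-only view of property name -> value before the change.
    Map<String, Object> getOldValues() {
        return Collections.unmodifiableMap(oldValues);
    }
}

class TrackedObjectDemo {
    public static void main(String[] args) {
        TrackedObject person = new TrackedObject();
        person.set("name", "John");      // initial load, not logged
        person.beginLogging();

        person.set("name", "Jon");       // client edits
        person.set("name", "Jonathan");

        System.out.println(person.isChanged());     // prints true
        System.out.println(person.getOldValues());  // prints {name=John}
        System.out.println(person.get("name"));     // prints Jonathan
    }
}
```

With the old and new values in hand, a mediator-like component has what it needs to build an update against the data source, for example by comparing the old value with what is currently stored before writing the new one.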

Summary

In this article we covered the basics of the SDO APIs. There's quite a bit we didn't cover, including property access using XPath expressions, support for XML-style mixed content, XML serialization forms for DataGraphs, and a few other features. The SDO specification (.pdf) provides complete coverage of these topics. And keep your eyes open for upcoming SDO-enabled products from both BEA and IBM, as well as on JSR-235, which will be standardizing the SDO APIs within the JCP.
