Why a unified view of data is important?

Gabriel Bechara's Blog | October 11, 2007   1:48 PM | Comments (3)


Mergers and acquisitions have become business as usual, and unifying data and processes is one of the main challenges that IT departments in large organizations have to deal with. In this context, a unified view of the data helps when implementing processes that span all the silos of an organization. Those silos, historically associated with the different lines of business within one organization, tend to multiply as mergers and acquisitions go on. Silos might even appear within a single line of business: a global organization might choose to maintain and strengthen national specificities in order to stay close to its clients.

Top-Down approach and the domain data model

In the article An Introduction to Enterprise Architecture, I frequently referred to the "canonical data models" or the "domain UML models" that should be understood by the entire organization and used in the reusable enterprise services exposed through the service infrastructure. The business processes, being enterprise-level processes modeled on the enterprise business process plan, should use this unified data model. As a reminder, the plans are shown in the figure below:

image002.jpg

Designing enterprise-level processes is one way to identify reusable services. These processes span different silos within a global organization and need to consume common services in their activities. To be usable and reusable globally, those services could share a common data model. The data representing the business entities (entities of the UML domain model, or business objects) might have different representations in the different silos of an organization. It is therefore important to unify the data representation at the enterprise level, since enterprise-wide processes need this data.

There are different use cases for data unification. One consists of having the same business entity stored in different back-ends. Take the client of a bank, for example: data about this client might be stored in an ERP, in the accounting system, in a CRM, etc. Another use case is the same type of data with multiple representations, one per country for example. That is the case of a bank that has expanded abroad into multiple countries by acquiring several smaller banks.
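To make the first use case concrete, here is a minimal sketch, assuming three in-memory dictionaries that stand in for the ERP, the accounting system and the CRM. The field names, the sample values and the `unified_client` helper are all hypothetical; the point is only that one client is assembled from fragments held by several back-ends.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical fragments of the same bank client, one per back-end.
# Field names and values are invented for illustration only.
ERP = {"C-100": {"legal_name": "Acme SARL"}}
ACCOUNTING = {"C-100": {"account_ref": "ACC-EXAMPLE-0001"}}
CRM = {"C-100": {"email": "contact@acme.example"}}


@dataclass
class UnifiedClient:
    """One view of a client, whatever back-end holds each attribute."""
    client_id: str
    legal_name: Optional[str] = None
    account_ref: Optional[str] = None
    email: Optional[str] = None


def unified_client(client_id: str) -> UnifiedClient:
    # Each attribute is looked up in the back-end that owns it;
    # a missing fragment simply leaves the attribute as None.
    return UnifiedClient(
        client_id=client_id,
        legal_name=ERP.get(client_id, {}).get("legal_name"),
        account_ref=ACCOUNTING.get(client_id, {}).get("account_ref"),
        email=CRM.get(client_id, {}).get("email"),
    )
```

A process that calls `unified_client("C-100")` never needs to know which system stores the email or the account reference.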

image003.jpg

As the previous figure shows, a process designed to span the whole organization needs one common view of the data, because each local bank has a different representation of the same business entities.

Even if all the banks share one single software package that holds all their data, which is rarely the case, especially when a bank has branches in different countries, each with its own regulatory and accounting rules, you might still have to deal with different versions of that software, or with modules that apply different business rules depending on their localization (potentially yielding different data models).

An example: international payments within this global bank

Let's take for example the use case of an international payment as designed using AquaLogic BPM:

image004.jpg

This process consists of:

  • Entering the payment information for the local client; this is done locally by a clerk in one of the banks belonging to the group
  • Applying approval and validation rules, which are defined globally for all banks belonging to the group
  • Handling international payments in a unit common to all the banks of the group
  • Accounting on two levels, globally and locally: the first to evaluate financial results, the second to comply with local accounting rules

Note: This case can be more complicated in a real-world scenario. The process and the data model presented later are intentionally simplified.
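The four steps above can be sketched in a few lines of code. This is not how AquaLogic BPM models or runs the process; it is a plain illustration, and the `Payment` class, the function names and the `APPROVAL_LIMIT` rule are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List

APPROVAL_LIMIT = 10_000  # hypothetical group-wide approval rule


@dataclass
class Payment:
    bank: str          # local bank where the payment was entered
    client_id: str
    amount: float
    currency: str
    log: List[str] = field(default_factory=list)


def enter_payment(bank, client_id, amount, currency):
    # Step 1: a local clerk enters the payment in one bank of the group.
    payment = Payment(bank, client_id, amount, currency)
    payment.log.append(f"entered at {bank}")
    return payment


def approve(payment):
    # Step 2: the same rule applies to every bank of the group.
    approved = payment.amount <= APPROVAL_LIMIT
    payment.log.append("approved (global rule)" if approved else "rejected (global rule)")
    return approved


def execute_centrally(payment):
    # Step 3: one common unit handles all international payments.
    payment.log.append("executed by central payment unit")


def account(payment):
    # Step 4: accounting on two levels, global and local.
    payment.log.append("booked in group accounts")
    payment.log.append(f"booked under local rules of {payment.bank}")


def run_international_payment(bank, client_id, amount, currency):
    payment = enter_payment(bank, client_id, amount, currency)
    if approve(payment):
        execute_centrally(payment)
        account(payment)
    return payment
```

Only the first and the last log entries depend on the local bank; everything in between works on the same `Payment` structure, which is the role the unified data model plays in the process.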

Designing the canonical model

Whatever bank the client belongs to, the process deals with a unified representation of the data it needs. The data is processed centrally for approval and authorization controls, centrally for the payment, and partly centrally for the accounting, but the data needed when entering the payment information comes from different data sources depending on the location of the bank.

The simplified UML representation of the entities involved in this type of operation (the real case is much more complex) consists of:

image005.jpg

This model can be considered canonical if it can contain the elements needed to unify the representation of the entities used globally throughout the organization.

Note: External keys, not represented in the previous diagram, may need to be added to link the canonical model to the physical model; another option is using reference tables for key mapping.
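As an illustration of the second option, a reference table for key mapping can be as simple as a lookup from (source system, local key) to canonical key. The sketch below assumes an in-memory table; the source names `bank_fr` and `bank_uk`, the keys, and both helper functions are hypothetical.

```python
# Hypothetical reference table: (source system, local key) -> canonical client id.
KEY_MAP = {
    ("bank_fr", "0042"): "CAN-1",
    ("bank_uk", "UK-77-A"): "CAN-1",   # same client, known under another key
    ("bank_uk", "UK-78-B"): "CAN-2",
}


def to_canonical(source, local_key):
    """Translate a local key into the canonical id (KeyError if unmapped)."""
    return KEY_MAP[(source, local_key)]


def to_local(source, canonical_id):
    """Find the key a given source uses for a canonical id, or None."""
    for (src, local_key), can_id in KEY_MAP.items():
        if src == source and can_id == canonical_id:
            return local_key
    return None
```

With this table, `to_canonical("bank_fr", "0042")` and `to_canonical("bank_uk", "UK-77-A")` resolve to the same canonical client, without adding any column to the physical models.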

At the application level (within one application), using object/relational mappers such as Kodo, Hibernate or EJB 3 (OpenJPA), the latter being the preferred choice today, will continue to be a common pattern. But in the context described above, O/R tools may reach their limits: we need to unify the data representation and serve the persistent data representing the client, address, account, etc., knowing that it is stored in different databases with different formats. We also need to expose this data through Web services using an XML representation, since it will be consumed by Web service clients. We might also consider using SDO when data consumption becomes complex; it handles aspects such as disconnected data graphs and an optimistic concurrency control model.
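To give an idea of what an XML representation of a unified entity might look like, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The element names (`Client`, `ClientId`, `Name`, `Address`, `City`) are assumptions for the example and do not come from any real schema.

```python
import xml.etree.ElementTree as ET


def client_to_xml(client_id, name, city):
    """Serialize a client into a small XML document (hypothetical element names)."""
    root = ET.Element("Client")
    ET.SubElement(root, "ClientId").text = client_id
    ET.SubElement(root, "Name").text = name
    address = ET.SubElement(root, "Address")
    ET.SubElement(address, "City").text = city
    # encoding="unicode" returns a str rather than bytes.
    return ET.tostring(root, encoding="unicode")


def city_of(client_xml):
    """What a consumer would do: read one field back from the XML."""
    return ET.fromstring(client_xml).find("Address/City").text
```

Whatever database format the client comes from, a Web service consumer only ever sees this one XML shape.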

AquaLogic Data Services Platform

With AquaLogic Data Services Platform (ALDSP), the canonical XML model, provided as an XSD and serving as the basis for what ALDSP calls a logical data model, can be mapped to physical data sources using a graphical designer. Applying the ALDSP modeling tool to the blue part of the UML representation above gives:

image006.jpg

PaymentCanonical and ClientCanonical are the logical models, represented in XML. The logical model ClientCanonical draws its client data from two physical models, ClientSource1 and ClientSource2.
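In ALDSP this mapping is expressed with queries that are mostly generated by the graphical designer, not with hand-written code. Purely to illustrate the idea of one logical model fed by two physical models, here is a sketch in Python. The column names of ClientSource1 and ClientSource2 are not given in this article, so the ones below, like the sample rows and the key prefixes, are invented.

```python
# Invented sample rows; the real structure of the two sources is not shown here.
CLIENT_SOURCE_1 = [
    {"CUST_NO": "0042", "FULL_NAME": "Acme SARL", "CTRY": "FR"},
]
CLIENT_SOURCE_2 = [
    {"id": "UK-7", "first": "Jane", "last": "Doe", "country_code": "GB"},
]


def from_source_1(row):
    # Source 1 already stores the full name in one column.
    return {
        "clientId": "S1-" + row["CUST_NO"],
        "name": row["FULL_NAME"],
        "country": row["CTRY"],
    }


def from_source_2(row):
    # Source 2 splits the name, so the mapping has to join the parts.
    return {
        "clientId": "S2-" + row["id"],
        "name": f'{row["first"]} {row["last"]}',
        "country": row["country_code"],
    }


def client_canonical():
    """Union of both sources, each row reshaped to the same logical structure."""
    return ([from_source_1(r) for r in CLIENT_SOURCE_1]
            + [from_source_2(r) for r in CLIENT_SOURCE_2])
```

Every row returned by `client_canonical()` has the same three fields, regardless of the source it came from; adding a third source means adding one mapping function, not changing the consumers.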

The logical data model can then be exposed through Web services, or consumed using SDO or JDBC clients. There is no need to worry up front about caching data, defining time to live for caches, or securing access to data: this can be configured later in the ALDSP administration console.

In our use case it makes sense to expose the data models as CRUD Web services, consumed through a mediation layer as in the figure below:

image007.jpg

Advantages of a data access layer using AquaLogic Data Services Platform

Some of the main advantages of using AquaLogic Data Services Platform as a data access layer are:

  • Data access is configured using queries, mainly generated graphically, rather than by coding.
  • Distributed queries are optimized: ALDSP includes a robust distributed query optimizer. Optimization is based on data source capabilities; the optimizer delegates as much work to the underlying sources as it can and employs efficient join techniques.
  • Data is transformed from the underlying sources into the desired schemas via powerful XQuery-based data manipulation and transformation tools.
  • Policy-based caching permits optimization of system performance by managing the trade-off between response time and freshness of information through caching of frequently used data services.
  • Data architects and developers can create unified views and queries rapidly via a graphical drag-and-drop tool. ALDSP discovers metadata from physical sources in an automated fashion. This reduces overall development time and costs.
  • Role-based security and data redaction are configurable; this makes it possible to associate a security policy with a data element within a data service's responses.
  • Web service interfaces can be created easily, eliminating the need to develop them by hand.
  • Data can be consumed using Web Services, SDO, JDBC, ADO.NET.
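The caching trade-off mentioned in the list (response time versus freshness) comes down to a time-to-live policy. The sketch below is a generic illustration of that idea and says nothing about how ALDSP implements or configures its cache; the `TtlCache` class and its parameters are hypothetical.

```python
import time


class TtlCache:
    """Cache results of a slow loader and reload them once they are too old."""

    def __init__(self, loader, ttl_seconds, clock=time.monotonic):
        self._loader = loader      # function that queries the real data source
        self._ttl = ttl_seconds    # the policy: how stale an answer may be
        self._clock = clock        # injectable so the policy can be tested
        self._entries = {}         # key -> (expiry time, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]        # fresh enough: fast answer, no source access
        value = self._loader(key)  # missing or expired: go back to the source
        self._entries[key] = (now + self._ttl, value)
        return value
```

A long time to live favors response time, a short one favors freshness. Keeping that value outside the consuming code is what allows it to be tuned later without touching the services.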

Conclusion

Unification of data is an important subject for processes that span multiple silos in an organization. It should be governed centrally and automated using adequate tooling. With AquaLogic Data Services Platform, the data services layer is faster to implement and easier to maintain, reuse, and extend, without requiring expensive integration resources or extensive coding. New data silos are easier to add to the global canonical model, which adds fluidity to organization-wide processes and allows data models to be extended dynamically.

Comments

Comments are listed in date ascending order (oldest first)

  • Hi,
    Your article raises a significant point and key enabler of SOA strategy. A unified, enterprise level, canonical domain 'model' is important, critical and immensely beneficial for SOA strategy. However, in my choice of the word 'model', I slightly differ from the view presented in the article.

    For example: "Those services to be usable/reusable globally could be sharing a common data model". I believe that business services will continue using their 'silo' models for legacy or other practical reasons. But it is the SOA infrastructure which will provide 'data translation services' that will allow Service A and B using their respective and different domain models dA and dB to communicate/exchange. Such data translation services will use the canonical enterprise domain model to translate 'silo models' dA to/from dB.

    In that sense the canonical model is a 'model' where individual service will use a particular view of this 'canonical' model.

    Thank you again for a well-written article on an important topic.

    Posted by: pinaki.poddar on October 21, 2007 at 8:42 PM

  • Hello,

    This is a very interesting point you are raising.

    We might use the words "canonical model" when doing Application2Application integration as described in the Canonical Data Model pattern (http://www.enterpriseintegrationpatterns.com/CanonicalDataModel.html). Within this scope I do agree with your point of view "Such data translation services will use the canonical enterprise domain model to translate silo models dA to/from dB".

    But I was trying (to show how) to take this approach a step further than A2A integration toward unifying the data model to be used directly by global enterprise processes. I should perhaps not call this "canonical model": what about "canonical domain model"?

    Unifying data on the enterprise level will require from the business analysts/experts the production of a "complete" UML domain model (related to their business and used in the global processes); this model should then be reviewed/completed by data architects that have the necessary technical knowledge...

    Using some business standards, such as SWIFT in the banking business, might help. For example: from the business point of view, SWIFT payment messages such as MT103 and MT202 contain all the necessary data for the payment operation described above. Other data, such as the internal account to be debited, or the outstanding off-balance accounts, will be deduced by the accounting system: no need to transport this in the canonical model; on the other hand, data such as the client id, which is internal to the bank and therefore not in the SWIFT message, should be present in the canonical model...

    I would say:

    "In that sense the unified canonical domain model is a model where individual services provided by existing silos will use a particular view of this canonical model"

    This approach has its limitations and needs to be adapted to all the specific cases that might occur. But, in my opinion, it's better to set a higher objective than the one that may be attainable: the cleaner the model is, the longer it will last... until those global processes themselves become silos if the acquiring bank gets acquired :-).

    Posted by: BECHARAG on October 22, 2007 at 1:16 AM

  • More on the canonical domain data model used by enterprise services can be found at http://www.soapatterns.org/canonical_data_model.asp

    Posted by: BECHARAG on March 2, 2008 at 10:53 AM


