Why is a unified view of data important?
Gabriel Bechara's Blog |
October 11, 2007 1:48 PM
While mergers and acquisitions have become
business as usual, unifying data and processes is one of the main challenges
IT departments in large organizations have to deal with. In this context, being
able to have a unified view of the data will certainly help when implementing
processes that span all the silos of an organization. These
silos, historically associated with the different lines of business within
one organization, tend to multiply as mergers and acquisitions go on. Silos
might even appear within one line of business: a global organization might
choose to maintain and strengthen specificities related to national
localizations in order to stay close to its clients.
Top-Down approach and the domain data model
In the article An Introduction to Enterprise Architecture,
I frequently referred to "the canonical
data models" or the "domain UML models", which should be understood by the
entire organization and used in the reusable enterprise services exposed through
the service infrastructure. The business processes, being the enterprise level
processes modeled on the enterprise business process plan, should use this
unified data model. As a reminder, the plans are shown in the figure below:
Designing enterprise
level processes is one way to identify reusable services. The enterprise level
processes will span different silos within a global organization and
will need to consume common services in their activities. Those services to be usable/reusable
globally could be sharing a common data model. The data representing the
business entities (entities of the UML domain model, or business objects)
might have different representations in the different silos of an
organization. It is thus important to unify the data representation at the
enterprise level, since this data is needed by enterprise-wide processes.
There are different use cases for data
unification. One case consists of having the same business entity stored in
different back-ends. Take the client of a bank, for example: data about
this client might be stored in an ERP, in the accounting system, in a CRM, etc.
Another use case is dealing with the same type of data in
multiple representations, one per country for example. That is the case of
a bank that has expanded abroad into multiple countries by acquiring
several smaller banks.
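The first use case can be illustrated with a minimal sketch. The three dictionaries below are hypothetical in-memory stand-ins for the ERP, the accounting system and the CRM, and the field names are invented for the illustration. The sketch also assumes that all back-ends identify the client with the same key, which is rarely true in practice.

```python
# Hypothetical stand-ins for three back-ends, each holding part of one client.
ERP = {"C-001": {"name": "Jane Doe", "country": "FR"}}
ACCOUNTING = {"C-001": {"balance": 1250.0}}
CRM = {"C-001": {"email": "jane.doe@example.com"}}


def unified_client_view(client_id):
    """Assemble one view of a client from every back-end that knows it."""
    view = {"client_id": client_id}
    for backend in (ERP, ACCOUNTING, CRM):
        # A back-end that does not know the client contributes nothing.
        view.update(backend.get(client_id, {}))
    return view


client = unified_client_view("C-001")
```

A consumer of `unified_client_view` no longer needs to know which system owns which attribute, which is the point of the unified view.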
As shown in the previous figure, designing
a process that spans the whole organization, with each local bank having
a different representation of the same business entities, requires one
common view of the data.
Even if all the banks
shared one single software package holding all the data, which is rarely
the case, especially when a bank has to deal with branches that might
be distributed in different countries, each with its own regulatory and accounting
rules, you might still have to deal with different versions of that same
software, or with modules that do not apply the same business rules depending
on their localization (potentially yielding different data models).
An example: international payments within this global bank
Let's take for example the use case of an
international payment as designed using AquaLogic BPM:
This process consists of:
- Entering the payment information for the local client;
this is done locally, in one of the banks belonging to the group, by a local
clerk
- Approvals and validation rules are defined and
applied globally for all banks belonging to the group
- International payments are dealt with in a
common unit for all the banks of the group
- Accounting is done on two levels: globally and
locally. The first is for evaluating financial results, the latter for
compliance with local accounting rules
Note: This case can be more complicated in
a real world scenario. The process and the data model presented later are
intentionally simplified.
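The four steps of the process can be sketched in a few lines of Python. This is not AquaLogic BPM code: the function names, the `Payment` fields and the approval limit are all assumptions made for the illustration.

```python
from dataclasses import dataclass, field

APPROVAL_LIMIT = 10_000  # hypothetical group-wide validation rule


@dataclass
class Payment:
    client_id: str
    amount: float
    currency: str
    origin_bank: str
    trail: list = field(default_factory=list)  # records each step taken


def enter_payment(client_id, amount, currency, origin_bank):
    """Step 1: a local clerk enters the payment in the local bank."""
    payment = Payment(client_id, amount, currency, origin_bank)
    payment.trail.append(f"entered at {origin_bank}")
    return payment


def approve(payment):
    """Step 2: the same rules apply to every bank of the group."""
    if not 0 < payment.amount <= APPROVAL_LIMIT:
        raise ValueError("payment rejected by group rules")
    payment.trail.append("approved by group rules")


def execute(payment):
    """Step 3: one common unit handles all international payments."""
    payment.trail.append("executed by common payments unit")


def account(payment):
    """Step 4: accounting is done globally and locally."""
    payment.trail.append("booked in group accounts")
    payment.trail.append(f"booked in {payment.origin_bank} accounts")


def international_payment(client_id, amount, currency, origin_bank):
    payment = enter_payment(client_id, amount, currency, origin_bank)
    approve(payment)
    execute(payment)
    account(payment)
    return payment
```

Only the first and the last step depend on the originating bank; everything in between works on a single representation of the payment, whatever its origin.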
Designing the canonical Model
Whatever bank the client belongs to, the
process will be dealing with a unified representation of the data it
needs. The data will be processed centrally for approvals/authorization
controls, centrally for the payment and partly in a centralized manner for
the accounting, but the data needed when entering the payment information will
come from different data sources depending on the localization of the bank.
The simplified UML representation of the
entities involved in this type of operation (the real case is much more
complex) consists of:
This model can be considered canonical
if it is able to contain the elements needed to unify the data
representation of the entities used globally throughout the organization.
Note: External keys, not represented in the
previous diagram, may need to be added to link the canonical model to the
physical model; another option is using reference tables for key mapping.
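The reference table option mentioned in the note can be sketched as follows. The source names (`bank_fr`, `bank_uk`) and all key values are hypothetical; in a real system the table would live in a database rather than in a dictionary.

```python
# Hypothetical reference table: (source system, local key) -> canonical key.
KEY_MAP = {
    ("bank_fr", "1001"): "CAN-1",
    ("bank_uk", "UK-77"): "CAN-1",  # same client, known under another key
    ("bank_uk", "UK-78"): "CAN-2",
}


def to_canonical_key(source, local_key):
    """Translate a key used by one physical source into the canonical key."""
    return KEY_MAP[(source, local_key)]


def to_local_keys(canonical_key):
    """List every (source, local key) pair behind one canonical key."""
    return sorted(
        pair for pair, canonical in KEY_MAP.items() if canonical == canonical_key
    )
```

With such a table the physical models stay untouched, at the price of one extra lookup each time the canonical model is linked to a source.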
At the application level (within one
application), using object/relational mappers such as Kodo,
Hibernate or EJB 3 (OpenJPA), the latter being the preferred choice today, will
continue to be a common pattern. But in the context described above, the
O/R tools might have reached their limits: we need to unify the data representation
and serve the persistent data representing the client, address, account,
etc., knowing that it is stored in different databases with different formats; we also
need to expose this data through Web Services using an XML representation, since we
need to consume it using Web Services clients. We might also consider using SDO
when data consumption becomes complex: it handles aspects such as
disconnected data graphs and an optimistic concurrency control model.
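The idea behind disconnected data and optimistic concurrency control can be shown with a small sketch. This is not the SDO API: `Store`, `StaleDataError` and the version counter are assumptions used only to illustrate the principle that a client works on a detached copy and that its update is refused if the data changed in the meantime.

```python
import copy


class StaleDataError(Exception):
    """Raised when a snapshot is written back after someone else's update."""


class Store:
    def __init__(self):
        self._rows = {}

    def insert(self, key, data):
        self._rows[key] = {"version": 1, "data": dict(data)}

    def read(self, key):
        # The caller gets a detached copy it can modify while disconnected.
        return copy.deepcopy(self._rows[key])

    def write(self, key, snapshot):
        # No lock was held: the write succeeds only if nobody else updated
        # the row since the snapshot was taken.
        if self._rows[key]["version"] != snapshot["version"]:
            raise StaleDataError(key)
        self._rows[key] = {
            "version": snapshot["version"] + 1,
            "data": copy.deepcopy(snapshot["data"]),
        }
```

If two clients read the same row and both try to write it back, the second write raises `StaleDataError` and that client has to read the data again before retrying.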
AquaLogic Data Services Platform
Using AquaLogic Data Services Platform (ALDSP),
the canonical XML model, provided as an XSD and used as the basis of a logical
data model (in ALDSP terms), can be mapped to physical data sources using
a graphical designer. Applying the ALDSP modeling tool to the blue part of
the UML representation above gives:
PaymentCanonical and ClientCanonical are the logical models represented in XML;
for the client, the sources of the logical model ClientCanonical are the two physical models ClientSource1 and ClientSource2.
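To make the relation between the logical and the physical models concrete, here is a sketch in plain Python rather than in the XQuery that ALDSP generates. Only the names ClientCanonical, ClientSource1 and ClientSource2 come from the model; every field name and both source layouts are assumptions.

```python
def from_source1(row):
    """Map a ClientSource1 record (assumed flat, column-style layout)."""
    return {
        "clientId": row["CUST_ID"],
        "name": f'{row["FIRST"]} {row["LAST"]}',
        "country": row["CTRY"],
    }


def from_source2(row):
    """Map a ClientSource2 record (assumed nested layout)."""
    return {
        "clientId": row["id"],
        "name": row["fullName"],
        "country": row["address"]["countryCode"],
    }


def client_canonical(source1_rows, source2_rows):
    """Expose both physical sources as one list of canonical clients."""
    return [from_source1(r) for r in source1_rows] + [
        from_source2(r) for r in source2_rows
    ]
```

A consumer of `client_canonical` sees a single shape for every client and never has to know which physical source a record came from.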
The logical data model can then be exposed through web
services, or consumed using SDO or JDBC clients. There is no need to worry up front about caching
data and defining time to live for caches, or securing access to data: this can be
configured later in the administration console of ALDSP.
In our use case it makes sense to
expose the data models using CRUD Web services, those services being consumed
through a mediation layer as in the figure below:
Advantages of a data access layer using AquaLogic Data Services Platform
Some of the main advantages of using AquaLogic Data Services Platform as a data access layer are:
- Configuration of data access is done using
queries, mainly graphically generated, not coding.
- Distributed queries are optimized: ALDSP
includes a robust distributed query optimizer. Optimization is based on
data source capabilities: it delegates as much work to the underlying sources
as it can and employs efficient join techniques.
- Data transformation is mapped from underlying
sources into desired schemas via powerful XQuery-based data manipulation and
transformation tools.
- Policy-based caching permits optimization of
system performance by managing the trade-off between response time and
freshness of information through caching of frequently-used data services.
- Data architects and developers can create
unified views and queries rapidly via a graphical drag and drop tool. ALDSP
discovers metadata from physical sources in an automated fashion. This will
reduce overall development time and costs.
- Configurable role-based security and data redaction;
this gives the ability to associate a security policy with a data element
within a Data Service's responses.
- Web service interface creation can easily be
done, eliminating the need for developing Web service interfaces.
- Data can be consumed using Web Services, SDO,
JDBC, ADO.NET.
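The trade-off between response time and freshness mentioned for policy-based caching can be illustrated with a small time-to-live cache. In ALDSP this is configured in the administration console, not coded; the `TtlCache` class below is a hypothetical sketch of the principle only.

```python
import time


class TtlCache:
    """Serve cached results until they are older than ttl_seconds."""

    def __init__(self, loader, ttl_seconds, clock=time.monotonic):
        self._loader = loader    # function that fetches from the real source
        self._ttl = ttl_seconds  # the policy: how stale an answer may be
        self._clock = clock      # injectable, which makes expiry testable
        self._entries = {}       # key -> (time loaded, value)

    def get(self, key):
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]  # fresh enough: no round trip to the source
        value = self._loader(key)
        self._entries[key] = (now, value)
        return value
```

A longer time to live means fewer calls to the underlying sources but potentially staler data; a shorter one gives the opposite.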
Conclusion
Unification of data is an important subject
for processes that span multiple silos in an organization. It should be
governed centrally and automated using adequate tooling. With AquaLogic Data
Services Platform, the Data Services Layer will be
faster to implement and easier to maintain, reuse, and extend, without requiring
expensive integration resources or extensive coding.
New data silos will be easier to add to the global canonical model, adding
fluidity to organization-wide processes and providing dynamic extensions of data
models.
Comments
Comments are listed in date ascending order (oldest first)
-
Hi,
Your article raises a significant point and key enabler of SOA strategy. A unified, enterprise level, canonical domain 'model' is important, critical and immensely beneficial for SOA strategy. However, in my choice of the word 'model', I slightly differ from the view presented in the article.
For example: "Those services to be usable/reusable globally could be sharing a common data model". I believe that business services will continue using their 'silo' models for legacy or other practical reasons. But it is the SOA infrastructure which will provide 'data translation services' that will allow Service A and B using their respective and different domain models dA and dB to communicate/exchange. Such data translation services will use the canonical enterprise domain model to translate 'silo models' dA to/from dB.
In that sense the canonical model is a 'model' where individual service will use a particular view of this 'canonical' model.
Thank you again for a well-written article on an important topic.
Posted by: pinaki.poddar on October 21, 2007 at 8:42 PM
-
Hello,
This is a very interesting point you are raising.
We might use the words "canonical model" when doing
Application2Application integration as described in the Canonical Data Model pattern (http://www.enterpriseintegrationpatterns.com/CanonicalDataModel.html). Within this scope I do agree with your point of view "Such
data translation services will use the canonical enterprise domain model to
translate silo models dA to/from dB".
But I was trying (to show how) to take this approach a
step further than A2A integration toward unifying the data model to be used
directly by global enterprise processes. I should perhaps not call this "canonical model": what about "canonical domain model"?
Unifying data on the enterprise level will require from the business analysts/experts the production of a "complete" UML domain model (related to their business and used in the global processes); this model should then be reviewed/completed by data architects that have the necessary technical knowledge...
Using some business standards, such as SWIFT in
the banking business, might help. For example: from the business point of view,
SWIFT payment messages such as MT103 and MT202 contain all the necessary data
for the payment operation described above. Other data, such as the internal
account to be debited, or the outstanding off-balance accounts, will be deduced
by the accounting system: no need to transport this in the canonical model. On
the other hand, data such as the client id, which is internal to the bank and therefore
not in the SWIFT message, should be present in the canonical model...
I would say:
"In that sense the unified canonical domain model is a
model where individual services provided by existing silos will use a
particular view of this canonical model"
This is an approach that has its
limitations and needs to be adapted to all the specific cases that might occur.
But, in my opinion, it's better to set a higher objective than the one that may
be attainable: the cleaner the model is, the longer it will last... until those global
processes themselves become silos if the acquiring bank gets acquired :-).
Posted by: BECHARAG on October 22, 2007 at 1:16 AM
-
More on the canonical domain data model used by enterprise services can be found at http://www.soapatterns.org/canonical_data_model.asp
Posted by: BECHARAG on March 2, 2008 at 10:53 AM