Slice: OpenJPA for Distributed Databases: Part II
Pinaki Poddar's Blog |
January 8, 2008 10:28 PM
In my last post, I talked about Slice, a plug-in extension of OpenJPA for distributed database environments. Slice provides key features that help developers build distributed database applications.

As the adjoining schematic suggests, Slice plugs in at the southbound interface of OpenJPA, where it interacts with the database. Slice abstracts a set of distributed databases as a single virtual database, so that the rest of the OpenJPA kernel can operate in exactly the same way.

The key features that will help you build distributed transactional applications are:

Non-intrusive: No change in the application code or persistent domain model. Absolutely.
Customized Data Distribution Policy: Implement a single method to decide which database slice will persist a new instance.
Per-slice Configuration: Each database slice can be configured with its own database driver or any other properties.
Automatic Tracking: Slice remembers the original database slice for any instance that is loaded from the database as a result of a query or find() operation. Slice also traverses the relationships that are annotated with CascadeType.PERSIST. So once the user-defined data distribution policy decides on a database slice for a root instance, all the instances related to the root instance are automatically assigned the same database slice.
Parallel Query: All major database operations (query and flush) are executed in parallel across the database slices.
Distributed Transaction: If each slice is XA-compliant, then even if the persistence unit is configured for RESOURCE_LOCAL transactions, Slice will employ a two-phase commit protocol.

Figure: Salient Features and Overview of Slice

The detailed instructions and downloads are available at http://people.apache.org/~ppoddar/slice/site/index.html.
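To make the "single method" of a customized data distribution policy more concrete, here is a minimal sketch in plain Java. Everything in it is an assumption made for illustration: the DistributionPolicy interface is declared locally as a stand-in and is not copied from the Slice distribution, and the Customer class, the RegionPolicy class and the slice names "East" and "West" are hypothetical. The real interface and its exact signature should be taken from the Slice documentation linked above.

```java
import java.util.Arrays;
import java.util.List;

public class SlicePolicySketch {

    // Hypothetical stand-in for the policy contract: given a new instance and
    // the configured slice names, return the name of the slice to persist it in.
    interface DistributionPolicy {
        String distribute(Object instance, List<String> slices);
    }

    // Hypothetical persistent class; a real one would carry JPA annotations.
    static class Customer {
        final String region;

        Customer(String region) {
            this.region = region;
        }
    }

    // Routes a Customer to the slice named after its region, if there is one.
    // Any other instance is spread over the slices by hash code.
    static class RegionPolicy implements DistributionPolicy {
        @Override
        public String distribute(Object instance, List<String> slices) {
            if (instance instanceof Customer) {
                String region = ((Customer) instance).region;
                if (slices.contains(region)) {
                    return region;
                }
            }
            // floorMod keeps the index non-negative for negative hash codes.
            int index = Math.floorMod(instance.hashCode(), slices.size());
            return slices.get(index);
        }
    }

    public static void main(String[] args) {
        List<String> slices = Arrays.asList("East", "West");
        DistributionPolicy policy = new RegionPolicy();

        // A customer whose region matches a slice name goes to that slice.
        System.out.println(policy.distribute(new Customer("West"), slices)); // prints "West"
    }
}
```

The policy is only consulted for a new root instance. As described under Automatic Tracking, instances reachable from that root through CascadeType.PERSIST relationships would follow it to the same slice, so the policy does not need to handle them separately.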
Comments
-
First of all, Happy 2k+8 to all the dev2dev community.
I am very interested in your OpenJPA articles, and some questions have come up. I think OpenJPA and BEA ALDSP have some similarities, though one is a framework and the other is a complete suite. Do you agree? What are, from your point of view, the most important similarities and differences between them?
Thanks in advance.
Posted by: jguerra@bea.com on January 9, 2008 at 10:01 AM
-
The core differences between them are
a) how they represent data and metadata
b) how important transactional integrity is
In the JPA-centric world, data is described (i.e. its metadata) as Plain Old Java Objects -- strictly typed, rich in metaphors, capable of expressing complex business domains. JPA also adheres strictly to transactional integrity, albeit with a single RDBMS instance.
In the world of ALDSP (of which my understanding is limited, and hence I am only shooting my mouth off based on prior experience in a similar problem domain), data is more flexible and dynamic, and hence not strictly typed.
Transaction is not critical but access is -- so an ALDSP-style service may be able to compose DataObjects sourced from multiple persistent datastores, not necessarily RDBMS. However, maintaining a transaction across these heterogeneous sources is a non-trivial proposition and may not be its focus/priority.
So, JPA and ALDSP are focused on different aspects of data-aware applications and make different implicit assumptions about their operating environments.
Applications that require a rich business domain model and strong transactional consistency, and that have a well-defined periphery, will benefit from JPA. On the other hand, ALDSP will be more suitable for applications that are read-intensive, use heterogeneous data sources, and exchange data with other systems/services/components.
Posted by: pinaki.poddar on January 9, 2008 at 10:31 AM
-
Certainly, OpenJPA and ALDSP have similarities as they're both data access technologies. Having said that, ALDSP focuses on data access *and* integration. JPA is, IMO, less focused on integration. Certainly, Slice adds integration behavior albeit focused on a particular pattern: horizontal database partitioning.
ALDSP is focused on real-time, in-place access and *integration* of diverse data sources on behalf of diverse consumers. That is, ALDSP supports more than just relational data as data sources (web services, XML files, flat files, Java wrapped, API-based sources) and diverse consumers (SOAP web services, Java clients via the Service Data Objects standard, and, yes, JDBC).
ALDSP is used to construct logical, business-friendly data services which abstract the details of diverse, multi-source access and integration. As such, ALDSP emphasizes modeling and not coding. It is metadata driven and makes use of a query planner and optimizer to determine the most efficient multi-source access and integration plan. ALDSP is also about read *and* update in that data services can be updateable, and ALDSP does all the right things for multi-source transactional update using XA.
By way of slight correction, ALDSP does support rich 'domain models', including complex hierarchical models (shapes, as they're called in DSP) and complex transactional update scenarios.
If your world is only relational and Java, then consider JPA. If, however, your world is more than relational and includes SOA-based consumers (an ESB like ALSB or BPM), you should take a look at ALDSP at:
http://bea.com/framework.jsp?CNT=index.htm&FP=/content/products/aqualogic/data_services/
Brad
ALDSP Product Manager
Posted by: bwright on January 10, 2008 at 12:42 PM
-
I surely have shot my mouth off without knowing much about what a fantastic thing ALDSP is. ALDSP does seem to solve the most complex problems that confront data-centric operations in a heterogeneous, distributed environment.
Browsing through the ALDSP documentation, however, I could not locate answers to a few of my simple questions:
a) Can ALDSP establish relationships across persistent stores? That is, if PERSON records are in database A and ADDRESS records are in database B, can one run a query such as "Find me the PERSONs whose ADDRESS is such and such" and expect the result to be available as a Person SDO that is related to an Address SDO? In my view, this will require the capability to join relations in memory and, in its full glory, will match up to an RDBMS SQL engine.
b) The client's view of data (in the form of SDO or POJO) and the persistent store's view of data (RDBMS records or XML files) often differ. JPA addresses one section (POJO vs RDBMS) of this 'impedance mismatch' and calls it O-R mapping -- which requires user specification, because any automatic means to reconcile a client view of data with a persistent schema is bound to be insufficient. So, what is the ALDSP way to describe this 'O-R mapping'?
c) Distributed transactions require certain capabilities from the participating resources. For example, in an RDBMS, the driver for the data source has to be XA-compliant. Many of the diverse resources that ALDSP can operate upon may not have such capabilities at all. What does ALDSP do when the data sources lack the capability for two-phase commit or even rollback?
Slice is, of course, limited by its modest goals -- its client model is only POJO, it works only with RDBMS, it supports two-phase commit only if the underlying data sources are XA-compliant, and it enforces a collocation constraint (i.e. no cross-database relationships).
In fact, it would be sacrilege to even compare Slice/OpenJPA with ALDSP.
Posted by: pinaki.poddar on January 10, 2008 at 2:04 PM