Arch2Arch Tab BEA.com
Syndicate this blog (XML)

Oracle nightmare? I think not yet.

Bookmark Blog Post

del.icio.us del.icio.us
Digg Digg
DZone DZone
Furl Furl
Reddit Reddit

Dmitri Maximovich's Blog | September 22, 2005  12:22 PM | Comments (5)


Bill Roth posted his opinion on state of open source database systems claiming that

it could be the beginning of the end of Oracle's business model. If you really think about it, open source databases are getting very good. Better than good enough.

So is it really that open source databases ready to replace Oracle in as Enteprise database and we all soon should stop buying that expensive Oracle releases?

Let's see. How do you like this issue in MySql for example (note status 'Not a bug'):

By default, ResultSets are completely retrieved and stored in memory. In most cases this is the most efficient way to operate, and due to the design of the MySQL network protocol is easier to implement. If you are working with ResultSets that have a large number of rows or large values, and can not allocate heap space in your JVM for the memory required, you can tell the driver to 'stream' the results back one row at-a-time.

Now I'd like to see how's "US Democratic National Committee is using MySQL with a 1.5 billion row database for its voting analytics" - selecting all rows in memory or fetching one by one? I'm not saying that it's impossible, sure there is workarounds for this but this is not the only one problem.

To name a few more - how about read locking? Oracle employ multi-versioning for read consistency so your read requests are never blocked. Most (all?) other databases use shared read locks (which will lead to deadlocks and blocking) - good luck with dealing with these problems in shared enterprise environment.

Now how about robust distributed transactions support? It's just planned for the first time for next version of PostgreSql (not sure about MySql) and it will take them a while to iron it out. Enterprice application without distributed transactions? Sure...again, probably possible in most cases with workarounds and so on but the list can go on and on.

Now don't get me wrong, I'm using PosgreSql in many projects and this is excellent database as I'm sure MySQL and some others and there is definitely progress in this area so eventually it has a potential to be compared to Oracle, but I don't think this time has come just yet.

It's very simple to create a connection pool in WebLogic as long as database has a JDBC driver, but I wouldn't rush to conclude based on this fact that the database is ready for a prime time in the Enterprice as Oracle replacement.


Comments

Comments are listed in date ascending order (oldest first) | Post Comment

  • By default, ResultSets are completely retrieved and stored in memory. In most cases this is the most efficient way to operate, and due to the design of the MySQL network protocol is easier to implement. If you are working with ResultSets that have a large number of rows or large values, and can not allocate heap space in your JVM for the memory required, you can tell the driver to 'stream' the results back one row at-a-time.
    Now I'd like to see how's "US Democratic National Committee is using MySQL with a 1.5 billion row database for its voting analytics" - selecting all rows in memory or fetching one by one? I'm not saying that it's impossible, sure there is workarounds for this but this is not the only one problem.
    Sounds like a non-problem to me, at least for 99% of users. There's for most purposes two kinds of database-based applications out there OLTP and OLAP. The thing that folks want to optimize with OLTP is response time, which usually relates to reducing concurrency in the bottle-necked component (usually the database). In addition result sets in OLTP-like applications are usually small (10's of rows, a few hundred, very rarely thousands, and almost never millions).

    If you've done a little bit of cross-vendor JDBC development, you'll know that other databases do the same unless you ask for explicit cursors (SQL Server, Sybase come to mind). In fact, even with Oracle, these types of applications rarely end up calling fetch more than once to grab the results, because they fit within the default fetch size that the driver is using.

    Now, if I see an application, even it's OLAP-like, bringing back millions of millions of rows to the client to do analytics, I have to ask myself "Why?". One of the strengths of the relational model is the ability for it to act on sets, and it is very adept at summarizing, aggregating, generating statistics, etc. Couple this with the usually rich set of functions that today's databases have, there's rarely (notice I didn't say never) a reason to pull millions of rows to the client to do analytics.

    MySQL-5.x and beyond have row fetch/pre-fetch, very similar to Oracle. I think you need to understand that the other model that MySQL supports isn't "fetch one row at a time". Prior to MySQL-5.x, result set data is always returned in "firehose" mode (similar to SQL server). There's no round-trip for each row returned. When rows are processed one at a time, it's more like you're turning off the "valve". Through the magic of the network stack, the server only returns rows as fast as you can read them.

    To name a few more - how about read locking? Oracle employ multi-versioning for read consistency so your read requests are never blocked. Most (all?) other databases use shared read locks (which will lead to deadlocks and blocking) - good luck with dealing with these problems in shared enterprise environment.

    You should've done your homework there. MySQL (when using transactions) and PostgreSQL both use MVCC just like Oracle.

    I think you'd also see some arguments from IBM and Microsoft and Sybase that their databases aren't "Enterpriseworthy" because they don't have MVCC (okay, SQL2005 has it as an optional transaction mode) and use read locking.

    Now how about robust distributed transactions support? It's just planned for the first time for next version of PostgreSql (not sure about MySql) and it will take them a while to iron it out. Enterprice application without distributed transactions? Sure...again, probably possible in most cases with workarounds and so on but the list can go on and on.

    Well, it's in MySQL-5.0, which is just about to come out of beta. Yes, we know it will take some time for XA to be ironed out in MySQL, it didn't just magically happen on day one for any other vendor either. Look at the amount of workarounds, limitations and bugs listed in BEA's own documentation concerning various XA implementations, which means everyone is still trying to get it solid. It's a hard nut to crack. Distributed transactions are straightforward in theory, but a pain in practice.

    Honestly, I'm getting tired of the overuse of the "enterprise" label. What does an enterprise application look like? Is it based on scale? scope? complexity? Is it based on how much you payed for the software and how much you pay the folks that keep it running (often that seems the case!).

    I think all enterprises have applications of varying scales, scopes and complexities.

    There are plenty of applications listed at http://www.mysql.com/customers/ that are all "enterprise" in scale, scope or complexity where MySQL is the database.

    Today, in September 2005 are there more complex "enterprise" applications where Oracle might be a better fit? Sure. Are there a lot of applications at every enterprise where Oracle or some other large complex commercial product is expensive overkill both in acquisition and ongoing costs? Sure. That's the sweet spot of today's open source databases, in my opinion. That picture will change to be more in favor for the open source databases as time goes on, but I think their will always be a commodity/complex software split in the enterprise. Maybe Bill Roth's timeline is just a bit too optimistic for when that split shifts more to the commodity side?

    Posted by: mark-mysql on September 22, 2005 at 1:39 PM

  • Mark,

    Thanks for the response. I'm glad to learn that MySQl and PosgreSql supports non blocking read operations. I should have synchronize myself with latest state of the features in these databases. And by no means I was trying to imply that open source databases has no space in any company no matter how large it is.

    I was probably overusing term 'Enterprise' in my post without describing what I mean by that term. By Enterprise I mean really large databases which is sharead among different applications/processes and the biggest part of the projects is usually integration between various resources (such as databases, MOM, etc). I work a lot with WebLogic and Oracle and I know first hand that there are problems and bugs, especially in the area of XA transactions. Now knowing that Oracle has support for XA for very log time and WebLogic supports Oracle since I don't know what version and there are still problems doesn't make me very optimistic about database which is about co come out of beta. I'm sorry, I'm optimist but when it comes to XA support I learnt my lesson - it's just not 'magically happens', it will take time (years I would say) before it would work relatively trouble free for new product.

    Example with ResultSet is really a minor one, just to highlight the problem. I know, it's bad to return more than 10 rows from select, I know there are workarounds for the most cases but do everybody have time to spend on so many workarounds here and there or they just have barely enough time to implement business functionality?

    Can it be done - sure, I bet you can build large system even using flat files for data storage if you have access to enough resources and time, it is the right way to do it - not always. ;-)

    Posted by: maximdim on September 22, 2005 at 2:11 PM

  • Current problems with TSS which uses PostgreSql as backend also reveal some interesting pecularities of PostgreSql database (even without XA). See for example this post

    Don't forget that there are (still) much more resources (books, articles, etc) available for Oracle and more developers/admins familiar with this database as well, which also could be significant factor if you're developing something big.

    Posted by: maximdim on September 23, 2005 at 11:35 AM

  • More about XA support in MySQL is here

    Posted by: maximdim on September 27, 2005 at 11:38 AM

  • The TSS thread you link to seems to show that PostgreSQL has is great :)

    Posted by: steven_shaw on March 7, 2006 at 9:50 PM



Only logged in users may post comments. Login Here.

Powered by
Movable Type 3.31