
Recovering from O'RAC'LE outages


Hussein Badakhchani's Blog | March 12, 2007   4:05 AM | Comments (4)


It's a quiet day and all seems well with the world, and then suddenly a node in the Oracle RAC dies without warning. Phones ring, text messages fly and the DBAs assemble. Pretty quickly they manage to restart the damn thing, but all the connections to the RAC have failed over to the other node. While there is no immediate issue, you will need to rebalance the connections at some point during the day, so what are your options?

Here's what you can do from the WLS 8.1 management console, ordered from most to least invasive to service delivery:

  1. Restart the WLS server. Yes, this will rebalance your connection pool in a known period of time. No, it's not very nice to your customers, and you can stop touting yourself as a 24/7 service provider. This operation is also the slowest way to reset your connection pools.
  2. Untarget and retarget your connection pool. This has the effect of closing and reopening all your connections, which will rebalance them across all nodes. All users of the pool on a particular target (server or cluster) will lose service. It is a faster operation than restarting your server, but if you're not careful you could run into JNDI errors if you retarget the pool too quickly.
  3. Force suspend the connection pool and then resume it. Upon resumption the pool recreates all the connections that were closed, and in effect connections are rebalanced across all defined nodes. This operation can be done on a per-server basis and limits service degradation to users of the connection pool. It is also the fastest way of rebalancing your connections out of the options I am putting forward.
  4. Reset the connection pools. WLS will wait for a connection to become free before closing and reopening it. This can take some time if the site is very active, and in fact if a connection does not become free within a given timeout period it seems that WLS skips it, so the outcome of resetting connections cannot be well predicted if the site is busy. However, this is the least invasive approach in terms of service delivery.
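
To make the difference between options 3 and 4 concrete, here is a toy model in Java. It is not WLS code and calls no WebLogic API: the class and method names are made up, and it assumes new connections land on the nodes round-robin, which a real RAC listener does not guarantee. It only shows why recreating every connection gives an even spread, while a reset that skips busy connections leaves most of them where they were. It is written in current Java syntax, so it would need adjusting for the older JVM that ships with WLS 8.1.

```java
import java.util.Arrays;

final class PoolRebalanceModel {

    /** Models force suspend + resume: every connection in the pool is recreated. */
    static int[] recreateAll(int poolSize, int nodeCount) {
        int[] nodeOf = new int[poolSize];
        for (int i = 0; i < poolSize; i++) {
            nodeOf[i] = i % nodeCount; // ASSUMPTION: round-robin placement across nodes
        }
        return nodeOf;
    }

    /** Models a reset on a busy site: connections that never become free are skipped. */
    static int[] recreateIdleOnly(int[] nodeOf, boolean[] busy, int nodeCount) {
        int[] result = nodeOf.clone();
        int recreated = 0;
        for (int i = 0; i < result.length; i++) {
            if (!busy[i]) {
                result[i] = recreated % nodeCount; // same round-robin assumption
                recreated++;
            }
        }
        return result;
    }

    /** Counts how many connections are attached to each node. */
    static int[] countPerNode(int[] nodeOf, int nodeCount) {
        int[] counts = new int[nodeCount];
        for (int node : nodeOf) {
            counts[node]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int nodes = 2;
        int[] afterFailover = new int[10];
        Arrays.fill(afterFailover, 1);   // all ten connections failed over to node 1
        boolean[] busy = new boolean[10];
        Arrays.fill(busy, 0, 6, true);   // six connections never become free

        System.out.println("suspend/resume: "
                + Arrays.toString(countPerNode(recreateAll(10, nodes), nodes)));
        System.out.println("reset, 6 busy:  "
                + Arrays.toString(countPerNode(recreateIdleOnly(afterFailover, busy, nodes), nodes)));
    }
}
```

With ten connections and six of them permanently busy, the model leaves eight connections on the surviving node after a reset, against an even five and five after a suspend and resume.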

I think the order in which you define your TNS listeners in the JDBC connection URL will also have an impact on the balancing of connections, especially in the case of resetting the connection pools. Rather than defining all the listeners on one node and then those on the next node, it may be better to alternate nodes. I don't have any evidence that this will improve the rebalancing, but it is something I plan to test soon.
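
As a sketch of what alternating nodes could look like, the Java below interleaves per-node listener lists and builds an Oracle thin driver URL from the result. The host names, ports and service name are hypothetical placeholders, the helper names are made up, and any LOAD_BALANCE or FAILOVER parameters a real URL carries are left out because I haven't shown mine here. Whether this ordering actually helps is exactly the untested part.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class RacUrlSketch {

    /** Interleaves per-node listener lists: node1 first listener, node2 first listener, and so on. */
    static List<String> alternate(List<List<String>> listenersPerNode) {
        int longest = 0;
        for (List<String> node : listenersPerNode) {
            longest = Math.max(longest, node.size());
        }
        List<String> ordered = new ArrayList<>();
        for (int i = 0; i < longest; i++) {
            for (List<String> node : listenersPerNode) {
                if (i < node.size()) {
                    ordered.add(node.get(i));
                }
            }
        }
        return ordered;
    }

    /** Builds a thin driver URL with one ADDRESS entry per "host:port" string, in the given order. */
    static String thinUrl(List<String> hostPorts, String serviceName) {
        StringBuilder url = new StringBuilder("jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS_LIST=");
        for (String hostPort : hostPorts) {
            String[] parts = hostPort.split(":");
            url.append("(ADDRESS=(PROTOCOL=TCP)(HOST=").append(parts[0])
               .append(")(PORT=").append(parts[1]).append("))");
        }
        url.append(")(CONNECT_DATA=(SERVICE_NAME=").append(serviceName).append(")))");
        return url.toString();
    }

    public static void main(String[] args) {
        // Hypothetical two-node RAC with two listeners per node.
        List<String> node1 = Arrays.asList("rac1:1521", "rac1:1522");
        List<String> node2 = Arrays.asList("rac2:1521", "rac2:1522");
        System.out.println(thinUrl(alternate(Arrays.asList(node1, node2)), "orcl"));
    }
}
```

The generated string would go in the URL field of the connection pool definition in place of a node-by-node listing.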

If you know of better options for rebalancing those pesky connection pools, let me know. If it wasn't for that meddling o'rac'le database I could have gotten away with it.


Comments


  • What about multipools Hoos? I remember reading something about them a while back in connection with RAC. (I love the "or rac le" - ;-) )

    Posted by: jonmountjoy on March 18, 2007 at 2:27 PM

  • Jon, thanks for the pointer, you may be onto something. Are you suggesting that I create, say, two pools, one for each node, and then add them to a multipool with load balancing enabled?

    If one of the nodes goes down and is then restarted how would the connections load balance back out onto the restarted node?

    I should add that this configuration is using the Oracle 10g thin driver. Apparently the thick driver can use transparent application failover (TAF) to achieve this, but this is no use to me :(.

    Posted by: hoos on March 19, 2007 at 2:59 PM

  • I don't think you want the loadbalancing option but the failover option (you can loadbalance between pools in the multipool too if you want!) Multipools have an option for re-enablement of pools that failed, which answers your second question.
    Oh, I see there's an entire page on RAC. Hope that helps!

    Posted by: jonmountjoy on March 20, 2007 at 4:39 PM

  • Nice one, I'll have a read, test it out and report back.

    Posted by: hoos on March 20, 2007 at 7:06 PM


