Artificial Intelligence Filtering Of WebLogic Logs

Simon Vans-Colina's Blog | November 14, 2006   3:05 AM | Comments (3)


Last week I wrote an article about my experiment in using a spam filter to classify candidate CVs.

The article got quite a lot of attention, rising as high as number 3 on programming.reddit.com, probably thanks to Jon bumping it onto the front page. (Thanks Jon)

This week I want to show how you can use CRM114 to solve one of the most common administrator nightmares.

A couple of years ago I had the great fortune to work on a project whose release date had been mandated by the government... In legislation...

This particular project was a code fork of an in-development application, with responsibility for parts being owned by two major consulting firms (you know who you are) as well as a team of in-house developers.
With the government deadline bearing down on us, and near-constant political in-fighting, the application went into UAT.

The first thing I noticed was that within hours the 100 GB disk partition we had allocated to log files was full.
The application was slow. It was outputting log data faster than the disks could write it, and when something went wrong, finding out what had happened was impossible.

Developers would ask us to "send them the log files" and we would fall out of our chairs laughing.

But delaying the release date for something as trivial as log file purity was unthinkable. Even to mention such a thing would result in dirty looks.

So the application went live, spitting out stack traces faster than Oracle spits out press releases. With, of course, the promise that they would revisit it some time in the future. (After all the developers had moved on. Yeah, right.)

What we really needed was some kind of Artificial Intelligence filtering that could cut down the volume of traffic. Something that could learn the difference between the "interesting errors" and the "oh-that's-normal-don't-worry-about-that" errors. And what the hell is a normal stack trace anyway?

I didn't have a solution then, but I do now: CRM114.

So what should an intelligent log file parser look like? Well, for starters you need to train it, so we'll need a script called learn.

$ ./learn --in=LogFileName.log --ks=weblogiclog

View Source: learn.crm

This script will walk through a WebLogic log asking you about each entry. An entry is defined as starting with "####" and continuing over as many lines as necessary until the start of the next entry. This works well for WebLogic logs, and it's easy enough to change the regular expression to work with different types of logs.
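As a rough illustration of that entry rule, here is a minimal Python sketch. It is not the learn.crm script; the function name `split_entries` and the `ENTRY_START` pattern are made up for this example, and it assumes every entry begins with "####" at the start of a line.

```python
import re
from typing import Iterable, Iterator, List

# Assumption: an entry starts with "####" at the beginning of a line.
# Change this pattern to handle other log formats.
ENTRY_START = re.compile(r"^####")


def split_entries(lines: Iterable[str]) -> Iterator[str]:
    """Group raw log lines into multi-line entries.

    Lines that do not start a new entry (stack trace lines, for
    example) are attached to the entry before them. Any lines that
    appear before the first "####" are yielded as one leading entry.
    """
    current: List[str] = []
    for line in lines:
        if ENTRY_START.match(line) and current:
            yield "".join(current)
            current = []
        current.append(line)
    if current:
        yield "".join(current)
```

Typical use would be something like `with open("LogFileName.log") as f: entries = list(split_entries(f))`, after which each entry, stack trace included, can be shown to a human or passed to a classifier as a single unit.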

Once you've done a bit of training, hit CTRL-C and you'll notice that your learning has created two 12 MB files, in this case weblogiclog.ksi and weblogiclog.ksd (Knowledge Store Interesting and Knowledge Store Dull).

weblogiclog.ksi contains all the information about Interesting log entries, and weblogiclog.ksd contains the information about Dull log entries.

The next script is the one that will walk through a WebLogic log showing only the entries that more closely match the Interesting category.

$ ./filter --in=LogFileName.log --ks=weblogiclog

View Source: filter.crm

Running this script on a sample WebLogic log I found on the Internet did indeed find all sorts of interesting things. Stack traces are the easy ones: once it's been told a particular "stack" is Interesting (or Dull), it will correctly classify it from then on.
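To make the learn-then-filter idea concrete, here is a toy Python sketch. It is emphatically not how CRM114 works internally and not what filter.crm does: a plain naive Bayes token counter is swapped in as a stand-in, and the class name `TwoClassFilter`, the labels, and the tokenizer are all assumptions made for illustration. The two counters play the role of the Interesting and Dull knowledge stores.

```python
import math
import re
from collections import Counter

# Assumption: identifiers and dotted class names make useful tokens.
TOKEN = re.compile(r"[A-Za-z_][A-Za-z0-9_.$]*")


class TwoClassFilter:
    """Toy stand-in for the two knowledge stores (Interesting and Dull)."""

    def __init__(self):
        self.counts = {"interesting": Counter(), "dull": Counter()}
        self.trained = {"interesting": 0, "dull": 0}

    def learn(self, entry, label):
        """Record the tokens of one log entry under the given label."""
        self.counts[label].update(TOKEN.findall(entry))
        self.trained[label] += 1

    def score(self, entry, label):
        """Log-probability of the entry under one label, add-one smoothed."""
        counts = self.counts[label]
        total = sum(counts.values())
        vocab = len(set(self.counts["interesting"]) | set(self.counts["dull"]))
        logp = math.log(
            (self.trained[label] + 1) / (sum(self.trained.values()) + 2)
        )
        for tok in TOKEN.findall(entry):
            # The extra +1 reserves probability mass for unseen tokens
            # and keeps the denominator non-zero before any training.
            logp += math.log((counts[tok] + 1) / (total + vocab + 1))
        return logp

    def is_interesting(self, entry):
        """True when the entry matches Interesting more closely than Dull."""
        return self.score(entry, "interesting") > self.score(entry, "dull")
```

After a few calls such as `f.learn(entry, "interesting")` and `f.learn(entry, "dull")`, a filter pass is just `[e for e in entries if f.is_interesting(e)]`. A stack trace that shares most of its frames with one already marked Interesting will tend to score the same way, which mirrors the behaviour described above.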

Intelligent log filtering shouldn't be an alternative to enforced log file purity.

It might be possible to insist that developers train up a knowledge store on their application during UAT, and deliver that into production with their app.

CRM114 could then be integrated with monitoring solutions like BMC Patrol so that Patrol can raise an alert when an Interesting log entry occurs.

In this way Admins could spend less time wading through the logs trying to work out what's a real error, and more time fixing things.


Comments

  • If you want to play with this at home:

    1/ Download CRM114 from: http://crm114.sourceforge.net/crm114-20061103-BlameDalkey_libc-2.2.5-static-s_bin.tar.bz2

    2/ Get yourself a WebLogic log. (Try searching Google for JiraServer.log for the one I used.)

    3/ Install CRM114 on your Linux box, and copy the two scripts from above.

    Posted by: simonvc on November 14, 2006 at 3:12 AM

  • Very nice work there. The language looks ... unique ... but I can see real value in AI sorting the wheat from the chaff in log files. After all, that's what we do in our own limited manner, and we parse log files much slower than any program would.

    Posted by: t_g_nielsen on November 14, 2006 at 8:44 AM

  • This has some potential! Check out my entry Death from a thousand cuts; it details some of the issues around log file messages that I believe to be important. Jon has also blogged on this subject and highlighted the use of WLDF. This could be really useful as a code share project. If you could continue to add to the training carried out by others by sharing the files, then others could benefit from the work and add to it. Is this possible with CRM114? Also, you could create training files to spot different types of problems. This could be very useful for auditing. I'm going to download it and try it out now. Cheers.

    Posted by: hoos on November 14, 2006 at 11:20 AM


