
Henrik Stahl's Blog

Henrik Ståhl works as a product manager in the JRockit team, and has been with BEA since 2004. His experience with JRockit goes back to the first public release, for which he was an early beta tester. He has previously worked as a developer, systems architect, with performance testing and tuning, and with IT security.

SPECjvm2008 released!

Posted by hstahl on May 7, 2008 at 11:57 AM | Permalink | Comments (2)

SPEC just announced the release of SPECjvm2008. According to the press release, it is "a benchmark suite for measuring the performance of a Java Runtime Environment (JRE), containing several real life applications and benchmarks focusing on core java functionality".

This release is the result of hard work by employees of several companies, including AMD, BEA, HP, IBM, Intel and Sun. It incorporates workloads developed by these SPEC members as well as several independent contributors acknowledged on the SPEC web site. We are especially proud of the fact that the effort was driven by Stefan Särne from the JRockit performance team. (Great job, Stefan!)

Some highlights:
- mixture of workloads like the preceding version SPECjvm98 or SPECCPU2006
- base (no JVM tuning) and peak results
- free download from SPEC

The combination of base (out of the box) and peak (tuned) results will make this workload challenging for JVM implementers. Base is especially challenging since we need to ensure that our JVM runs well out of the box regardless of whether it's a tiny laptop or a really massive server. As it stands today, a benchmark run will sometimes not even complete in "base" configuration on some JVM/OS/hardware combinations, because the default heap size is too small or because of some similar limitation. And base performance can be truly abysmal sometimes. (JRockit is guilty on both counts here, but it's in good company!) The good news is that the mere existence of this workload will force all JVM vendors to improve in new dimensions, for the good of the Java community.

Looking into the crystal ball, I expect that SPECjvm2008 will complement SPECjbb2005 as a Java workload for hardware product launches, and will be used by JVM vendors like BEA, IBM and Sun to improve our products and make more or less substantiated claims about performance leadership :-)



Real-time Java at LiquidityHub

Posted by hstahl on March 18, 2008 at 2:54 AM | Permalink | Comments (0)

WebLogic Real Time (WLRT) is a BEA product including a JRockit version with our innovative Deterministic GC. This GC algorithm is as simple as it is powerful. To use it, you start your JVM with -Xgcprio:deterministic -Xpausetarget=10ms, where the latter is the service level agreement (SLA) for GC pause times. Specify 10 milliseconds, and WLRT will limit the GC pause times to that length. There are some caveats in the form of application size etc, but it works well even for pretty large applications. WLRT has improved over time so that for smaller applications on fast hardware, WLRT can now give you low single digit millisecond worst-case pause times.

To get an idea on how WLRT performs for real applications, check out Tony Harrop's and Jeremy Vickers' QCon presentation on the LiquidityHub solution (more here). There is a quote on slide 14 that I'm particularly fond of :-)

Check out the WebLogic Real Time white paper for more information, and if you're interested in extreme transaction processing in Java, take a look at WebLogic Event Server. Evaluation versions of these products can be found on the BEA download site.

NB. This is not RTSJ, it is standard Java. Yes, it is possible to get millisecond worst-case latencies in standard Java...

JRockit + Eclipse = TRUE

Posted by hstahl on March 14, 2008 at 12:46 AM | Permalink | Comments (1)

The BEA JRockit team has been working on increasing its support for Eclipse users over the past couple of years. Some deliveries made along the way were the port of JRockit Mission Control 2.0 to the Eclipse RCP platform, which we did as part of the JRockit R27.1 release in December 2006, and the addition of JRockit as an Eclipse reference platform earlier the same year. Important milestones, but so far no real integration between Mission Control and the Eclipse or BEA Workshop IDEs. Until now, that is.

Earlier this week, we quietly released the JRockit Mission Control 3.0.2 tools suite in the form of Eclipse IDE Plug-Ins, and made it available for download on the BEA Update Site. In addition to the standard JRockit Mission Control features, it also adds basic integration with the IDE environment, such as the capability to jump-to-source from our profiling and diagnostics tools. We also went live with a JRockit Mission Control section in the BEA Bugzilla where you can report bugs and request feature enhancements. We will of course continue to monitor our online user forums as well.

But to paraphrase Churchill, this is only the end of the beginning. Moving forward, we are going to do more frequent releases on the Update Site, both in the form of official JRockit Mission Control updates and some more experimental stuff. We are also going to enable updating the standalone Mission Control application from the Update Site to speed up patch delivery and feature enhancements.

We are going to reveal more specifics on what we have up our sleeves at EclipseCon next week, so for those attending that conference I recommend that you attend Marcus Hirt's presentation, or grab one of the BEA reps attending the conference. If you are unable to attend, add Marcus' blog to your feeds instead - that's where you get the news first!



Tips and tricks for dealing with a fragmented Java heap

Posted by hstahl on December 18, 2007 at 6:07 AM | Permalink | Comments (0)

Heap fragmentation occurs when a Java application allocates a mix of small and large objects that have different lifetimes. The main negative effect of fragmentation is long GC pause times, caused by the JVM being forced to compact the Java heap. These long pause times are typically triggered when your Java program attempts to allocate a large object, such as an array.

As described in my previous blog entry on this topic, you can use JRockit Mission Control to find out how fragmented the heap is. But a fragmented heap is only a problem if it leads to long pause times (or an OutOfMemoryError). To find out the impact on pause times, you can run with the -Xverbose:gcpause command line flag, which will give you something like:

[INFO ][gcpause] old collection phase 1-0 pause time: 73.214054 ms, (start time: 15.807 s)
[INFO ][gcpause] (pause includes compaction: 1.029 ms (external), update ref: 1.532 ms)
[INFO ][gcpause] Threads waited for memory 66.612 ms starting at 17.568 s
[INFO ][gcpause] old collection phase 1-0 pause time: 66.449507 ms, (start time: 17.569 s)
[INFO ][gcpause] (pause includes compaction: 1.236 ms (internal), update ref: 1.488 ms)

The pauses in this example are clearly not a problem, but they can sometimes be much longer than this.
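If you collect a lot of this output, it can be handy to pull the pause times out programmatically. Here is a minimal sketch that assumes the log lines look exactly like the sample above; the class and method names (GcPauseLog, pauseTimesMs, maxPauseMs) are made up for illustration and are not part of JRockit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GcPauseLog {
    // Matches the "pause time: <number> ms" part of lines such as:
    // [INFO ][gcpause] old collection phase 1-0 pause time: 73.214054 ms, (start time: 15.807 s)
    // Assumption: the format is the one shown in the sample output above.
    private static final Pattern PAUSE =
            Pattern.compile("pause time: ([0-9]+(?:\\.[0-9]+)?) ms");

    /** Returns every "pause time" value found in the given lines, in milliseconds. */
    static List<Double> pauseTimesMs(List<String> lines) {
        List<Double> result = new ArrayList<>();
        for (String line : lines) {
            Matcher m = PAUSE.matcher(line);
            if (m.find()) {
                result.add(Double.parseDouble(m.group(1)));
            }
        }
        return result;
    }

    /** Returns the longest pause in milliseconds, or 0.0 if no pause line was found. */
    static double maxPauseMs(List<String> lines) {
        double max = 0.0;
        for (double pause : pauseTimesMs(lines)) {
            max = Math.max(max, pause);
        }
        return max;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
            "[INFO ][gcpause] old collection phase 1-0 pause time: 73.214054 ms, (start time: 15.807 s)",
            "[INFO ][gcpause] (pause includes compaction: 1.029 ms (external), update ref: 1.532 ms)",
            "[INFO ][gcpause] Threads waited for memory 66.612 ms starting at 17.568 s",
            "[INFO ][gcpause] old collection phase 1-0 pause time: 66.449507 ms, (start time: 17.569 s)");
        System.out.println("pauses found: " + pauseTimesMs(sample).size());
        System.out.println("longest pause (ms): " + maxPauseMs(sample));
    }
}
```

Only the lines containing "pause time:" are counted; the compaction and "Threads waited" lines are deliberately ignored in this sketch.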

If you don't want to restart your JVM, you can enable this during runtime by running jrcmd <pid> verbosity set=gcpause=info. After you have the data you need, disable informational logs with jrcmd <pid> verbosity set=gcpause=error.

Or you can do a JRA recording (see the previous blog entry) and look in the GC details tab, where the time spent in compaction is clearly visible:

jra-compaction.PNG

Before we look into the possible strategies for dealing with fragmentation, it is crucial to understand what causes it. The first key observation is that fragmentation is caused by GC. When the JVM performs GC it will clear out dead objects. It's the act of removing these dead objects that creates the holes in the heap. Memory allocation only has an indirect impact, in that it can create a pattern in the heap that later leads to the GC fragmenting the heap.

A second key observation is that fragmentation is only a problem if you can't use the holes in the heap. As long as you only allocate small objects, it doesn't matter how fragmented the heap is.

With these two observations in mind, here are some tips:

1. Increase the heap size

Increasing the heap size will decrease the frequency of GCs. One benefit of this is that objects are more likely to be dead than if GCs are very frequent, and if more objects are dead then there will be fewer live objects around that can contribute to creating holes. In other words, the heap holes will on average be larger, which implies less fragmentation. Also, if GCs are less frequent, you can possibly afford longer pause times since the impact of GC on throughput will be lower. Be aware that increasing the heap size can cause a slight increase in pause times.

A special case is to run with an infinitely large heap and never GC, which will of course avoid fragmentation completely.
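When experimenting with heap sizes, it helps to confirm what the JVM actually ended up with. A minimal sketch using the standard java.lang.Runtime API follows; the class name HeapReport is made up for illustration, and the reported numbers can differ somewhat from the values given on the command line.

```java
final class HeapReport {
    /** Converts a byte count to whole mebibytes. */
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory: the most heap the JVM will attempt to use (roughly -Xmx)
        System.out.println("max heap (MiB):       " + toMiB(rt.maxMemory()));
        // totalMemory: heap currently reserved by the JVM (starts near -Xms)
        System.out.println("committed heap (MiB): " + toMiB(rt.totalMemory()));
        // freeMemory: part of the committed heap not currently holding objects
        System.out.println("free in heap (MiB):   " + toMiB(rt.freeMemory()));
    }
}
```

Run it with different settings, for example java -Xms1500m -Xmx1500m HeapReport, and compare the output.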

2. Use a generational GC

Running JRockit with -Xgc:gencon or -Xgc:genpar will enable the use of a nursery or young space. The nursery will store recently allocated objects and when it is full a nursery GC will be performed, in which objects that are still alive will be moved to the old space. Since all objects that survive are (eventually) moved to the old space, the nursery will never be fragmented. And fragmentation of the old space will happen much more slowly since objects moved there will on average survive for a long time. Also, since the old space will fill up less rapidly, fragmentation-causing old space GCs will be less frequent.

A common strategy for avoiding the cost of old space GCs (often called "full GCs") is to configure your nursery size so that almost all objects die before they reach the old space. If you do this carefully, you can postpone old space GCs for a very long time. I've seen some installations where the app has been configured to avoid old space GCs for a full day, after which it is restarted to force creation of a "clean" heap. This is more frequent among non-JRockit users, since our compaction is fairly efficient and the pause times tend to be acceptable. One word of warning here: This strategy is not guaranteed to avoid full GCs, since that depends on the load on your application, exact heap layout etc, so don't rely on it too much and configure it with a large safety margin.
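To check whether old space GCs really are being postponed, you can poll the collectors from inside the application. The sketch below uses the standard java.lang.management API; the class name GcCounters is made up for illustration, and the collector names it prints are JVM-specific (with a generational GC you would typically see one entry for the nursery and one for the old space, but that is an assumption about your JVM).

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;

final class GcCounters {
    /** Snapshot of collection counts keyed by collector name (-1 means "not reported"). */
    static Map<String, Long> collectionCounts() {
        Map<String, Long> counts = new LinkedHashMap<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            counts.put(gc.getName(), gc.getCollectionCount());
        }
        return counts;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // getCollectionTime is the approximate accumulated collection time in milliseconds
            System.out.printf("%s: %d collections, %d ms total%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}
```

Logging this snapshot periodically shows how often each collector runs, which is the number you want to keep low for the old space.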

3. Tune compaction parameters

The default behavior of JRockit is to analyze the fragmentation of the heap and do a little bit of compaction every old space GC cycle. The proportion of the heap that it decides to defragment is called the compaction ratio which is typically stated as a percentage of the heap size, where a common figure would be perhaps 5%. If your application causes a lot of fragmentation, you can configure this ratio manually which gives you the ability to create a balance of power between the GC and your memory-hungry Java program. You can try with -XXcompactratio=10 or so to start with. A high number will lead to longer pause times, but also means that the JVM will be able to cope with higher fragmentation.

If you want to do advanced tuning, look in the JRockit reference guide for parameters that impact compaction. Two examples are -XXinternalcompactratio and -XXexternalcompaction.

4. Don't allocate memory

Ok, so it's time to look at what you can do in your Java code. The most obvious tip is to avoid memory allocation. This will have a direct impact on the frequency of GCs and an indirect impact on the GC pause time, since fewer objects will be alive at the time of a GC. You can use the Memory Leak Detector to analyze your application's allocation pattern and trace excessive allocation down to where in your source code it occurs. See Marcus Hirt's blog entry on this subject for tips.

5. Avoid allocating large objects

Arrays and other large objects are always the biggest culprit when it comes to fragmentation. They cause the heap to fill up quickly, leading to frequent GCs, they create irregular patterns in the heap and they request big contiguous chunks of memory on the heap at allocation time, which can be impossible for the JVM to fulfill without first doing compaction. To avoid excessive allocation of large objects, think twice before copying arrays etc. All code involving string processing (char arrays), XML, I/O (byte arrays) etc is a target for optimization in this area. Again, the Memory Leak Detector is a very powerful tool for analyzing this.
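As one illustration of avoiding repeated large allocations in I/O code, here is a minimal sketch of a stream copier that allocates its byte array once and reuses it for every copy. The class name StreamCopier and the buffer size are made up for illustration; this is one possible pattern, not a JRockit-specific recommendation.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class StreamCopier {
    // Allocated once per copier and reused for every copy() call.
    // Not thread-safe: use one instance per thread.
    private final byte[] buffer;

    StreamCopier(int bufferSize) {
        this.buffer = new byte[bufferSize];
    }

    /** Copies everything from in to out and returns the number of bytes copied. */
    long copy(InputStream in, OutputStream out) throws IOException {
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        StreamCopier copier = new StreamCopier(8192);
        byte[] data = new byte[100000];
        // Two copies, but only one 8 KB buffer is ever allocated by the copier.
        for (int i = 0; i < 2; i++) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            long copied = copier.copy(new ByteArrayInputStream(data), out);
            System.out.println("copied bytes: " + copied);
        }
    }
}
```

The trade-off is that a long-lived buffer stays in the heap for as long as the copier does, so this pays off mainly when the copy is performed frequently.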

6. Allocate objects with similar lifespan in chunks

This is my last tip, and it is a bit esoteric so please bear with me... The idea is that any larger operation can decrease its impact on fragmentation by allocating the memory it needs in a chunk and then keeping it alive until the operation is complete. Consider a J2EE transaction such as a servlet request. When you start this request, you can have one metaobject created that allocates most or all of the objects that you need to process that particular request. Since these allocations will all be performed by a single thread and very closely spaced in time, they will typically end up stored as a contiguous block in the Java heap. If you keep all these objects alive until the transaction is done, they will all become subject to GC at the same time, and be cleared as dead objects during the same GC cycle. Ergo, less fragmentation. So nulling out objects prematurely to decrease live data may not be the best in the long run.
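A minimal sketch of the metaobject idea, assuming a trivial "request" that just echoes its query parameters. RequestContext and RequestHandler are hypothetical names, not part of any servlet API, and whether the objects really end up next to each other in the heap depends on the JVM.

```java
import java.util.ArrayList;
import java.util.List;

/** Holds the working objects for one request; all are allocated together up front. */
final class RequestContext {
    final StringBuilder response;
    final List<String> parameters;

    RequestContext() {
        // Allocated back to back by the thread that handles the request.
        response = new StringBuilder(1024);
        parameters = new ArrayList<>(16);
    }
}

final class RequestHandler {
    /** Processes a query string such as "a=1&b=2" and returns "[a=1][b=2]". */
    static String handle(String query) {
        RequestContext ctx = new RequestContext();
        for (String parameter : query.split("&")) {
            ctx.parameters.add(parameter);
        }
        for (String parameter : ctx.parameters) {
            ctx.response.append('[').append(parameter).append(']');
        }
        // No field is nulled out early: ctx and everything it references
        // become unreachable together when this method returns.
        return ctx.response.toString();
    }

    public static void main(String[] args) {
        System.out.println(handle("a=1&b=2"));
    }
}
```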

Final words

That's it for this time... I hope you found this useful! Don't hesitate to ask if you found something unclear. Keep up the good coding!



How fragmented is my Java heap?

Posted by hstahl on December 11, 2007 at 2:06 AM | Permalink | Comments (7)

One major cause of long GC pause times is heap fragmentation. How problematic this is for an application depends on its allocation pattern. The worst possible case is an application that allocates a mix of objects with very different sizes and lifetimes. After the application has been running for a while, the Java heap will be fragmented by lots of long-lived Java objects spread out across the heap. There may be plenty of free space available, but no large block of contiguous free memory. When the application then attempts to allocate a large object such as an array, it is unable to find room to store it. The result will be a long GC pause while the heap is compacted.
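The "plenty of free space, but no contiguous block" situation can be shown with a toy model. The sketch below treats the heap as an array of equally sized slots, which is a deliberate simplification and not how JRockit lays out memory; ToyHeap and its methods are made-up names.

```java
final class ToyHeap {
    /** Total number of free slots (used[i] == false means slot i is free). */
    static int freeSlots(boolean[] used) {
        int free = 0;
        for (boolean slotUsed : used) {
            if (!slotUsed) free++;
        }
        return free;
    }

    /** Length of the longest run of consecutive free slots. */
    static int largestFreeBlock(boolean[] used) {
        int best = 0;
        int run = 0;
        for (boolean slotUsed : used) {
            run = slotUsed ? 0 : run + 1;
            best = Math.max(best, run);
        }
        return best;
    }

    /** An object of the given size fits only if a free run at least that long exists. */
    static boolean fitsWithoutCompaction(boolean[] used, int size) {
        return largestFreeBlock(used) >= size;
    }

    public static void main(String[] args) {
        // Long-lived objects in every other slot: half the heap is free, yet fragmented.
        boolean[] heap = new boolean[16];
        for (int i = 0; i < heap.length; i += 2) {
            heap[i] = true;
        }
        System.out.println("free slots: " + freeSlots(heap));
        System.out.println("largest free block: " + largestFreeBlock(heap));
        System.out.println("4-slot array fits: " + fitsWithoutCompaction(heap, 4));
    }
}
```

In this model half of the heap is free, but an allocation of four slots still cannot be satisfied until the live objects are moved together, which is what compaction does.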

If you suspect you have this issue with your application, the first step is to try to find out exactly how fragmented the heap is. One simple way of doing this is by using the JRockit Runtime Analyzer. Here's an example:

1. Start a Java application on your workstation
2. Start up JRockit Mission Control using the Start menu icon in Windows or the $JR_HOME/bin/jrmc executable
3. Locate the process you want to analyze, right-click and select to start a JRA recording.

start-jra.PNG

4. Select recording time (here 2 minutes) and start the recording

jra-wizard.PNG

5. After the recording has finished, it will be opened in the JRMC GUI. Select the Heap tab. Heap fragmentation is displayed in black in the Heap Contents pie chart.

jra-heap.PNG

The application in this example is well behaved and shows only 11% fragmentation - well within acceptable limits. I would start getting concerned if it went above 30%, and more if it continued to increase. Another warning sign is if the free memory distribution (pie chart on the right) contains a very large proportion of smaller free blocks.

If you found this useful, let me know and I'll write an entry on how to deal with heap fragmentation.



Configuring JRockit Mission Control to work through a firewall

Posted by hstahl on November 16, 2007 at 3:00 AM | Permalink | Comments (0)

Highly recommended read: Blog entry by Paco Gomez on Monitoring JRockit Through A Firewall.

BEA videos on YouTube

Posted by hstahl on November 15, 2007 at 12:12 AM | Permalink | Comments (0)

There are some short clips covering BEA technologies on YouTube. My favorite is the Predictable Java video. I wish my coffee machine was that well-behaved!

Insights from the JVM factory

Posted by hstahl on October 3, 2007 at 12:38 PM | Permalink | Comments (0)

My colleague Noora Peura has posted her first blog entry discussing how to properly size your heap. I have been told that we can expect a regular series of blogs with useful tidbits for the Java developer. So go ahead and add it to your blog/rss feeds so you don't miss out on anything!

SPECjbb2005 results from the Intel Tigerton launch

Posted by hstahl on September 6, 2007 at 12:41 AM | Permalink | Comments (2)

Sept 27: added link to the Tigerton result on www.spec.org, updated disclaimer.

Intel repainted the x86 landscape last year when they introduced their new Core microarchitecture. However, these chips have only been available for single and dual processor systems, while users that needed four (or more) processors have had to settle for systems based on the older Netburst architecture. This is changing with the imminent launch of Tigerton, which is a quad-core Xeon MP chip based on the Core architecture. We have been working with Intel (press release) and Fujitsu Siemens (press release) on tuning and benchmarking for this system with the following results.

Summary:
1) 2 - 2.5 times faster than previous generation x86 systems from Intel & AMD
2) compares favorably with (presumably more expensive) big iron systems based on IBM Power, Sun SPARC and Itanium

tigerton-x86.PNG

The results shown are the total system throughput, measured in SPECjbb2005 bops, and throughput per socket and per core. The latter is a good indication of single-thread performance.

As we can see, the new RX600 S4 system based on Tigerton is almost exactly 2x faster than the 3.4 GHz Xeon 7140M (codename Tulsa). This boost comes from doubling the number of cores per chip, as well as a per-core performance that is slightly better than Tulsa, which is a nice feat given that the frequency is ~15% lower.

Comparing to AMD, it is around 2.5x faster than current generation Opterons. It remains to be seen if their upcoming Barcelona can compete on this benchmark.

tigerton-bigiron.PNG

This comparison shows that Tigerton is a match for much bigger systems based on SPARC and Itanium, as well as similarly sized systems based on the new IBM Power6 chip. One caveat here is that big iron systems can scale much higher than x86, especially Itanium which is available in configurations as big as 512 CPUs (1024 cores) in a single system image(!)

Q & A

What's your interest in this result?
We have been working with Intel and Fujitsu-Siemens on producing this benchmark score and are proud of the good result.

What difference does the JVM make? Isn't the comparison unfair if you don't use the same JVM?
If you're a frequent reader of my blog you may remember that JRockit is approximately 20-30% better on x86 on this benchmark than the latest Sun JVM, see this blog entry for details. However, that doesn't affect this comparison to any great extent since all x86 results were based on JRockit (but slightly different versions). Of the big iron results, the Itanium publication was with JRockit while the Sun and IBM scores are based on their own respective JVMs, presumably optimized for their own platforms.

What does this result mean?
If we assume that SPECjbb2005 is a reasonable indicator of Java performance, this means that the new Intel Tigerton is the most powerful x86 server for Java today.

What would you use this hardware for?
Using it for a single large Java application is certainly possible, but the application will have to be well-written to not have scalability issues with 16 cores. It's probably more reasonable to use it to run several small Java applications. Intel suggests that it is a good platform for virtualization (server consolidation). I agree, and suggest that you take a look at WLS Virtual Edition if you're interested in this.

Can you show us the full benchmark details?
Sure, here is a summary; follow the links for full information:
Fujitsu-Siemens PRIMERGY RX600 S4 based on 4-chip/16-core Intel Xeon X7350 437,412 bops @ 54,677 bops/JVM
4-chip/8-core Intel Xeon 7140M 217,344 bops @ 54,334 bops/JVM
4-chip/8-core AMD Opteron 8222SE 176,909 bops @ 44,227 bops/JVM
16-chip/32-core Sun SPARC64 VI (2.4 GHz) 440,207 bops @ 27,513 bops/JVM
16-chip/32-core Intel Itanium 2 (1.6 GHz) 471,030 bops @ 58,879 bops/JVM
4-chip/8-core IBM Power6 (4.7 GHz) 346,742 bops @ 86,686 bops/JVM
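The "2x" and "2.5x" figures, and the per-core comparison with Tulsa, follow directly from the scores listed above. A small worked calculation; the class name BopsRatios is made up, and the numbers are copied from the list.

```java
final class BopsRatios {
    /** How many times larger score a is than score b. */
    static double ratio(double a, double b) {
        return a / b;
    }

    /** Total throughput divided evenly over the cores. */
    static double perCore(double bops, int cores) {
        return bops / cores;
    }

    public static void main(String[] args) {
        double tigerton = 437412; // Xeon X7350, 16 cores
        double tulsa = 217344;    // Xeon 7140M, 8 cores
        double opteron = 176909;  // Opteron 8222SE, 8 cores

        System.out.printf("Tigerton vs Tulsa:   %.2fx%n", ratio(tigerton, tulsa));
        System.out.printf("Tigerton vs Opteron: %.2fx%n", ratio(tigerton, opteron));
        System.out.printf("bops per core, Tigerton: %.0f%n", perCore(tigerton, 16));
        System.out.printf("bops per core, Tulsa:    %.0f%n", perCore(tulsa, 8));
    }
}
```

The ratios come out at roughly 2.01 and 2.47, and the per-core throughput is about 27,338 bops for Tigerton against 27,168 for Tulsa.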

Fine print: Competitive scores quoted above reflect results published on http://www.spec.org as of September 27, 2007. All scores are in SPECjbb2005 bops. For the latest SPECjbb2005 benchmark results, visit http://www.spec.org/osg/jbb2005. SPEC and the benchmark name SPECjbb2005 are trademarks of the Standard Performance Evaluation Corporation.



Network Computing on the Liquid VM

Posted by hstahl on August 17, 2007 at 12:53 PM | Permalink | Comments (0)

Network Computing did a review on WLS Virtual Edition a few days back. A fairly good summary of the current benefits and limitations of this solution.

From a technical perspective, Liquid VM consists of a package of JRockit and a small OS shim we call Bare Metal. This combination yields a very light-weight container capable of running typical server-side Java apps like WLS with a significant decrease in memory footprint over other virtualized containers.

The current version runs as a software appliance on VMWare, so you need that installed first. VMWare has - not surprisingly - taken a fancy to this solution and it figures prominently in their list of virtual appliances, or you can get it directly from the BEA download site.



A second look at Java 6 performance

Posted by hstahl on August 7, 2007 at 7:33 AM | Permalink | Comments (5)

When we released JRockit for Java SE6, I promised to present some benchmark numbers. Here they are - only 4 months late...

The results shown here are based on the SPECjbb2005 and XMLMark workloads. We have run the benchmarks in a realistic "base" configuration as well as in a fully tuned configuration and have tried to make as fair apples-to-apples comparisons as possible. See the Q&A section below for more details.

SPECjbb2005

This is the only modern industry-standard JVM benchmark available and the main arena for Java performance comparisons by JVM and hardware vendors.

intel-jbb.PNG

amd-jbb.PNG

Or in table format:

jbb-table.PNG

As you can see, JRockit is much faster than the Sun JVM on both Intel and AMD hardware, and in both the base and tuned configurations. The JRockit performance advantage is between 5 and 65%. The best known config with JRockit is 27% faster on Intel and 19% faster on AMD in our setup.

Note: While we have run the benchmarks on both Intel and AMD hardware, using these results to compare the performance of the hardware is misleading since we used a multi-JVM configuration for the Intel result and a single-JVM configuration for the AMD result. The reason for this is that we had too little memory in the AMD machine to run a multi-JVM config.

XMLMark 1.1

This benchmark was originally used by Sun and Microsoft to compare Java to .Net performance. We have not used it to drive performance enhancements in JRockit, so it can be used as one data point to help validate that our JVM optimizations are generic. I used the XMLMark code provided by Microsoft, download here.

The XMLMark benchmark consists of 3 different components for SAX and 6 components for DOM. The results below were calculated as the geometric mean of these components.
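For reference, the geometric mean of n component scores is the n-th root of their product. A minimal sketch that computes it via logarithms to avoid overflow on large products; the class name GeoMean is made up, and the scores in main are invented numbers, not XMLMark results.

```java
final class GeoMean {
    /** Geometric mean of strictly positive scores, computed as exp(mean(log(score))). */
    static double geometricMean(double[] scores) {
        if (scores.length == 0) {
            throw new IllegalArgumentException("at least one score is required");
        }
        double logSum = 0.0;
        for (double score : scores) {
            if (score <= 0.0) {
                throw new IllegalArgumentException("scores must be positive: " + score);
            }
            logSum += Math.log(score);
        }
        return Math.exp(logSum / scores.length);
    }

    public static void main(String[] args) {
        // Three made-up component scores standing in for, say, the SAX components.
        double[] components = {1.0, 10.0, 100.0};
        System.out.println("geometric mean: " + geometricMean(components));
    }
}
```

Unlike the arithmetic mean, the geometric mean is not dominated by a single component with a very large score, which is why it is commonly used to combine benchmark components.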

intel-xmlmark.PNG

amd-xmlmark.PNG

The result is clear: JRockit is faster in every single combination. On SAX, JRockit's lead is between 32 and 66%, on DOM between 3 and 17%.

Configuration

Hardware and operating system

2-way AMD Opteron 2220 SE, 8 GB RAM, Windows X64 Edition

2-way Intel X5355, 12 GB RAM, Windows x64 Edition

Both systems had large pages enabled in the OS to enable the use of large pages for both JVMs.

Intel system had hardware prefetching disabled in the BIOS. This benefits the software prefetchers in JRockit and the Sun JVM (for Sun, this is AFAIK new in 1.6.0_02).

Sun JVM

Sun JDK 1.6.0_02, 32 and 64-bit

For SPECjbb2005:
-Xms1500m -Xmx1500m -server -Xss128k (32-bit JVM, base)
-Xms3500m -Xmx3500m -server -Xss128k (64-bit JVM, base)
-Xms1500m -Xmx1500m -Xmn800m -server -XX:+UseBiasedLocking -XX:+AggressiveOpts -XX:+UseParallelOldGC -Xss128k -XX:+UseLargePages -Xbatch (32-bit JVM, tuned)
-Xms3650m -Xmx3650m -Xmn2000m -server -XX:+UseBiasedLocking -XX:+AggressiveOpts -XX:+UseParallelOldGC -Xss128k -XX:+UseLargePages -Xbatch (64-bit JVM, tuned)

For XMLMark (32-bit JVM only):
-Xms1500m -Xmx1500m -server (base)
-Xms1500m -Xmx1500m -server -XX:+AggressiveOpts -XX:+UseLargePages (tuned)

JRockit

JRockit 1.6.0_01 R27.3.0, 32 and 64-bit

For SPECjbb2005:
-Xms3500m -Xmx3500m (32-bit JVM, base)
-Xms3500m -Xmx3500m (64-bit JVM, base)
-Xms3650m -Xmx3650m -Xns3000m -XXaggressive -XXlazyunlocking -Xlargepages -Xgc:genpar -XXtlasize:min=4k,preferred=1024k -XXcallprofiling (32-bit JVM, tuned)
-Xms3650m -Xmx3650m -Xns3000m -XXaggressive -XXlazyunlocking -Xlargepages -Xgc:genpar -XXtlasize:min=4k,preferred=1024k -XXcallprofiling (64-bit JVM, tuned)

For XMLMark (32-bit JVM only):
-Xms1500m -Xmx1500m -Xgc:parallel (base)
-Xms1500m -Xmx1500m -Xgc:parallel -XXlazyunlocking -XXcallprofiling -Xlargepages (tuned)

Q & A

How did you decide on the tuning parameters?
For our base tuning we only allowed the most commonly used JVM parameters, and those needed to work around inherent weaknesses in each JVM.

For Sun, this varies with platform but on Windows you typically have to specify -server, max heap size (-Xmx) and often the stack size (-Xss). The Sun JVM is less sensitive to the initial heap size (-Xms). You also often have to configure a larger perm space, though that didn't affect the benchmarks used for these results. To increase performance, we used the same parameters used by Sun in benchmark publications.

For JRockit, the most important thing is to set the minimum heap size (-Xms), and sometimes select the static throughput-optimized GC (-Xgc:parallel). There is typically less need to set the max heap size (-Xmx). To increase performance, publication options were used.

For both JVMs, we only used the most common tuning parameters for XMLMark even for the "tuned" scenario.

Wouldn't it be more fair/realistic to run with an out-of-the-box configuration?
Not really. To start with, most server-side Java installations do use basic tuning options and products like WebLogic Server come with precreated start scripts configured that way. Also, out-of-the-box configuration of the JVM varies with platform so it could yield really poor performance in some configurations. For instance, our tests were run on Windows, and Sun uses the Client VM by default on that platform which performs much worse for typical server-side benchmarks. Also, not configuring things like stack size would mean that the benchmark would terminate during the run; hardly fair to Sun!

Aren't tuned configurations pointless? I mean; people don't take the time to tune anyway!
The configurations used for benchmark publications do tend to be a bit extreme, yes. It is unlikely that many users have the skill or inclination to tune most products to that extent. However, that is less true for Java benchmarks like SPECjbb2005 than for more complex benchmarks since there are so few tuning parameters involved.

Also, while most users may not engage in heavy tuning, it is probably the case that the users that really care about performance do. So peak performance is still interesting.

How did you configure the XMLMark benchmark?
We ran every component benchmark for 10 minutes, of which the last five were used to calculate the score. This seemed to be a reasonable compromise giving us a run long enough to allow the GC to impact the score while keeping the total runtime reasonably low. The benchmark was configured to use one thread per CPU core - 8 on the Intel Xeon server, 4 on the AMD server - in order to fully load the system.
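The "run for a while, score only the last part" approach can be sketched as follows. This is not the XMLMark harness, just an illustration of the idea; WindowedBenchmark and opsPerSecond are made-up names, and the demo in main uses a much shorter run than the 10 minutes described above.

```java
import java.util.concurrent.atomic.AtomicLong;

final class WindowedBenchmark {
    /**
     * Runs op repeatedly on the given number of threads for totalMillis and
     * returns operations per second counted only during the final measuredMillis.
     */
    static double opsPerSecond(final Runnable op, int threads,
                               long totalMillis, long measuredMillis)
            throws InterruptedException {
        if (threads < 1 || measuredMillis <= 0 || measuredMillis > totalMillis) {
            throw new IllegalArgumentException("invalid benchmark settings");
        }
        final long totalNanos = totalMillis * 1_000_000L;
        final long warmupNanos = (totalMillis - measuredMillis) * 1_000_000L;
        final AtomicLong counted = new AtomicLong();
        final long start = System.nanoTime();

        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                long local = 0;
                while (true) {
                    op.run();
                    long elapsed = System.nanoTime() - start;
                    if (elapsed >= totalNanos) break;      // run is over
                    if (elapsed >= warmupNanos) local++;   // inside the scored window
                }
                counted.addAndGet(local);
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        return counted.get() * 1000.0 / measuredMillis;
    }

    public static void main(String[] args) throws InterruptedException {
        int cores = Runtime.getRuntime().availableProcessors(); // one thread per core
        // Short demo: 2 s in total, of which the last 1 s is scored.
        double score = opsPerSecond(() -> new StringBuilder("x").append(42).toString(),
                cores, 2000, 1000);
        System.out.println("ops/s: " + score);
    }
}
```

With the settings described above, the call would be opsPerSecond(op, cores, 600000, 300000), that is, 10 minutes in total with the last five scored.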

Can you generalize these results to other platforms? What about other workloads?
Running similar benchmarks on other x86 hardware will probably produce similar results. You can find some comparisons on http://www.spec.org/jbb2005/results/jbb2005.html to back this up if you look hard, or search the web for independent (non-Sun, non-BEA) results.

You can definitely not generalize to non-x86 platforms like SPARC or Itanium.

Generalizing across workloads is very difficult. However, in our own testing we often see that other memory-intensive applications show a similar advantage to JRockit. Examples of such applications include XML processing including web services and JSP/Servlet heavy apps like WLP.

You get really impressive performance with JRockit... What type of optimizations have you made to get these results?
Things that affect SPECjbb2005 performance include: compressed references, lock optimizations, lazy unlocking, software prefetching, memory locality optimizations in the GC, generic compiler enhancements and optimizations to a few core Java APIs like HashMap and BigDecimal. Most of these have been done by the JRockit team, some are inherited from Sun (the Java API changes). Interestingly, almost all of the optimizations that affect SPECjbb2005 have very broad applicability.

Why do you use a larger heap with JRockit than with Sun?
The 32-bit Sun JVM cannot allocate more than approximately 1.5 GB on Windows since it requires a contiguous memory space for the heap. JRockit does not have this limitation, nor does it affect 64-bit JVMs.

We did not use a larger heap for the XMLMark results shown here. If we had increased the heap size from 1.5 GB to 3.5 GB it would have increased the performance another 1-2%.

Why is the 64-bit JRockit so much faster than Sun?
When you run the 64-bit JRockit with a small heap, it compresses the references to 32-bit which decreases the pressure on the memory bus. This yields a 10-20% performance advantage, but does not apply if you run with a larger heap. The current implementation is limited to 4 GB, though this can in theory be extended to a 32 GB heap with a slight loss in performance.
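The 4 GB and 32 GB figures are simple arithmetic. A 32-bit reference can distinguish 2^32 addresses, which covers 4 GB if it is used as a plain byte offset. The 32 GB figure is consistent with the common technique of aligning objects on 8-byte boundaries and shifting the reference left by 3 bits when it is used; note that this mechanism is an assumption on my part for the sake of the example, not a description of how JRockit does it. The class name CompressedRefs is made up.

```java
final class CompressedRefs {
    /**
     * Bytes addressable with a reference of the given width, when each
     * reference value is shifted left by alignmentShift bits before use.
     */
    static long addressableBytes(int referenceBits, int alignmentShift) {
        return 1L << (referenceBits + alignmentShift);
    }

    public static void main(String[] args) {
        long gib = 1L << 30;
        // Plain 32-bit byte offsets: 2^32 bytes
        System.out.println("no shift:    " + addressableBytes(32, 0) / gib + " GB");
        // Assumed 8-byte object alignment (shift by 3): 2^35 bytes
        System.out.println("shift by 3:  " + addressableBytes(32, 3) / gib + " GB");
    }
}
```

The extra shift on every reference access would be one plausible source of the "slight loss in performance" mentioned above.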

This technique can obviously not be used if you need a larger heap than the limit. However, it is quite common to use a 64-bit JVM even with a small heap; for instance if you want to be able to expand the heap later or if you are integrating with any 64-bit native libraries such as a floating point library.

Sun used a partially different set of benchmarks and a different configuration for their Java 6 launch. What's your response to that?
I have already provided our view on the tricky "out of the box" question above. The other benchmarks Sun used were scimark2 and volanomark. Both of these are very old and we feel they do not reflect the way our customers use JRockit so we don't work on them. Sun obviously comes to a different conclusion since they keep referring to them every so often.

Also, Sun used a 6-month old version of the 32-bit JRockit 5.0 and a slightly newer 64-bit version in their comparison - disregarding the fact that the class libraries differ, which doesn't make for an apples-to-apples JVM comparison. The comparison above is intentionally made with brand new releases from both Sun and BEA. JRockit is even at a slight disadvantage since the Sun JVM is based on the 1.6.0_02 class libraries but the JRockit version is based on 1.6.0_01.

These results are worthless since JRockit is optimized only for SPECjbb2005!
That's a really, really funny statement made by one of our competitors, though we forgive him since we like our competition and the statement was obviously made in frustration. Make love, not war!

Is the Sun JVM better since it has been used for more SPECjAppServer2004 publications?
Another really funny statement, which was implied in a Sun JavaOne presentation this year. But it's quite obvious that quantity doesn't count when it comes to benchmarks, only results. And since there are no apples-to-apples comparisons available on this benchmark we cannot draw any conclusions about JVM impact on it.

So what JVM is the fastest?
JRockit of course :-)

Seriously, the answer is the much-hated "it depends". For SPECjbb2005 (and XMLMark) on x86 right now JRockit seems to be in the lead, so I would expect it to be used for the upcoming Intel and AMD product launches. If you're running a BEA product you should be using JRockit since that is where we focus our efforts. But at the end of the day, you have to do your own benchmarking if you really want to know.

Disclaimer: The SPECjbb2005 results quoted above were produced by BEA and have been submitted to SPEC for review. SPEC and the benchmark name SPECjbb2005 are trademarks of the Standard Performance Evaluation Corporation. The XMLMark version we used is the property of Microsoft.



JRockit R27.3 and JRockit Mission Control 3.0 are GA

Posted by hstahl on July 2, 2007 at 11:42 AM | Permalink | Comments (0)

JRockit R27.3 is now available for download! This version introduces JRockit Mission Control 3.0 with a cool new tool that helps you identify and analyze latency issues (pauses) in your application. My colleague Marcus Hirt has promised to do a writeup about this in his blog, so I'll leave that to him and will go on to the core JRockit news instead.

As always, we have included a set of performance enhancements that are expected to yield a small (up to 10%) performance boost, especially in out-of-the-box configurations, for many "normal" Java applications, including typical WLS and WLP deployments. We have also fixed some specific performance issues, one example being long old space GC pauses for applications that allocate large numbers of very large arrays. Developers who are using single-CPU machines will note a significant boost when single-stepping in a debugger (noticeably faster than the latest Sun JVM on my laptop).

The R27.3 version of JRockit has been updated to use the latest class libraries from Sun: J2SE 1.4.2_14, J2SE 5.0 Update 11 and Java SE 6 Update 1. This means that it includes all class library fixes, security fixes, performance enhancements, timezone updates, etc. found in the corresponding Sun release.

For everyone who has been fighting with the torrent of time zone data updates this spring, one simple but very usable tool has been added that displays the current tzdata patch level. To run this, simply execute $JROCKIT_HOME/bin/tzinfo.

Our Asian users will find both the documentation and the Mission Control GUI localized to Simplified Chinese as well as Japanese. The localized docs are available from the respective international e-docs web sites, and to get a localized Mission Control GUI just start the tool as a user whose locale specifies either of those languages.

Before signing off, I'd like to thank the Java community for the support, encouragement and not least bug reports we get. We have fixed a number of such issues, including a problem with long obfuscated class names and a couple of JVMTI issues. Keep 'em coming!

Useful links:
JRockit release notes
Mission Control release notes
JRockit docs
Mission Control docs
Download site
Online store for Mission Control licenses

For questions, comments or concerns, please visit our user forums.



Another happy JRockit user

Posted by hstahl on June 11, 2007 at 12:22 AM | Permalink | Comments (2)

It's always fun to get good feedback on your products. This user has discovered the benefit of using JRockit with Weka: a larger heap and better performance.

http://zen-turkey.com/blog/default.aspx?id=16&t=Weka-on-JRockit-is-Weka-on-steroids

I couldn't have said it better myself :-)



Using JRockit for Development

Posted by hstahl on May 14, 2007 at 12:22 AM | Permalink | Comments (1)

Peter Laird in the Portal team has posted a blog entry about Optimized Development for WebLogic Portal Apps. He has done some research into how to make iterative development more effective, and draws some interesting conclusions. Highly recommended read!

Some comments on his recommendations:

Like many Java apps, Portal is susceptible to PermSpace issues when using the Sun JVM. This is a very common problem that has plagued the Sun JVM for years, and as far as I know there is no solution in sight. JRockit does not have the concept of a PermSpace, so one easy way of working around this issue is to switch to JRockit!

Peter recommends that you specify "-Xgc:parallel" on the command line to override the default GC heuristics. I agree with this recommendation for Portal if you are using an older version of JRockit. However, a better solution is to upgrade to JRockit R27.2 (or later) which has much improved GC heuristics, most likely making the use of the explicit Parallel GC redundant.
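
If you want to check what a given command line actually gives you, one option is to ask the running JVM which collectors it has registered. The sketch below uses only the standard java.lang.management API (Java 5 and later); the class name GcReport is made up for illustration, and the collector names printed are whatever the JVM chooses to report, so no particular JRockit output is assumed here.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists the garbage collectors the current JVM exposes
// through the standard management API. Names are JVM-specific.
class GcReport {
    static List<String> collectorNames() {
        List<String> names = new ArrayList<String>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Collection count and accumulated time since JVM start.
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
    }
}
```

Running this once with "-Xgc:parallel" and once without should give a rough idea of whether the explicit setting still changes anything for your application.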



Are you using JRockit for development? Upgrade!

Posted by hstahl on April 18, 2007 at 2:02 AM | Permalink | Comments (7)

Assuming you are a Java developer considering an upgrade of your workstation JVM, here are some reasons to upgrade to the latest JRockit R27.2:

JRockit Mission Control 2.0
This gives you access to tools such as our Runtime Analyzer profiler and our Memory Leak Detector, which can help you improve the quality of your code and resolve any performance issues. While we charge for this feature for production use, it is free for development. All you need to do is download a license file and install it in the JRE subdirectory of your JRockit installation.

Connect on-demand from developer tools
JRockit instances running locally (on the same machine, by the same user) as your developer tools are available for on-demand connection using JRockit Mission Control (JRockit 1.4.2 and later) or any JVMTI-based tools that support the Attach API (JRockit 5.0 and 6 only). This is easily demonstrated by having the Mission Control GUI running and then firing up any Java process on the same machine - after a few seconds it will automatically be detected in the GUI and made available for a debugging or profiling session.
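
For the curious, this is roughly what discovery through the Attach API looks like from a tool's point of view. The sketch uses the com.sun.tools.attach package as shipped with Sun's JDK 6 (in tools.jar); that a given JRockit release exposes the same classes, and the class name LocalJvms, are assumptions made for illustration.

```java
import com.sun.tools.attach.VirtualMachine;
import com.sun.tools.attach.VirtualMachineDescriptor;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists the Java processes that the Attach API can see
// on this machine for the current user. It only discovers, it does not attach.
class LocalJvms {
    static List<String> describeLocalJvms() {
        List<String> lines = new ArrayList<String>();
        for (VirtualMachineDescriptor vmd : VirtualMachine.list()) {
            // id() is usually the process id; displayName() is usually the
            // main class or jar plus arguments.
            lines.add(vmd.id() + "\t" + vmd.displayName());
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describeLocalJvms()) {
            System.out.println(line);
        }
    }
}
```

A tool would typically follow this up with VirtualMachine.attach(id) and load an agent into the target process; that step is left out here since it depends on the agent being used.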

Bug fixes
We have prioritized fixing bugs that affect developers in our last few releases. A few examples:
- Various Mission Control bug fixes (R27.2)
- Fixed all issues found running Eclipse tests on JRockit (R27.1)
- A couple of issues with breakpoints "disappearing" during debugging sessions have been fixed (R27.1)
- Various -Xdebug fixes, for those of you still on old 1.4.2 versions of JRockit (R26.0)

Moving forward
You will see more improvements in this area in future JRockit releases. One example in the next release - R27.3 - is that it contains performance enhancements for debugging scenarios. If you have any comments or suggestions on improvements - let us know through our user forums!


