Monday, 18 June 2012

GCViewer: New Garbage Collection Analysis Plugin for VisualVM

A new garbage collection monitoring plugin for VisualVM lets you investigate crucial GC metrics in greater detail and at higher resolution. It is recommended especially for monitoring latency-constrained Java applications.

Update: Oracle has acknowledged that there is a bug in HotSpot and has published my bug report. You can learn more from the note at the end of the article.
When you are involved in software development and your application is deployed to production environments, it is inevitable that every now and then you will have to profile, or at least measure, the performance of the deployed code. One of the features that makes Java a great and widely adopted platform is its garbage collection mechanism. It enormously simplifies implementation and relieves the developer of tricky allocation/deallocation operations. Unfortunately, Java garbage collection comes at a cost. The process of garbage collection is not entirely controllable: we can only set boundaries that the GC algorithm will try to obey.
This may pose a serious problem when application performance is tightly constrained. With high-throughput requirements, the objective is to process as many concurrent transactions as possible in a given time. With latency requirements, the application must process each single transaction within a given time. For the latter, GC pauses become a first-class issue, as they are always on the critical path of processing.
The first and most fundamental element of working on performance is measurement. Oracle HotSpot's garbage collector implementations offer a number of ways to monitor the performance of GC operations. The most comprehensive data can be obtained with jstat and by examining the output of the -verbose:gc JVM option.
Unfortunately, both produce text output that requires further parsing and post-processing.
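To give an idea of the post-processing involved, here is a minimal sketch that parses a single -verbose:gc line. It assumes the classic HotSpot line shape, e.g. "[GC 325407K->83000K(776768K), 0.2300771 secs]"; the exact format varies between JVM versions and flags, and the class name VerboseGcLine is purely illustrative.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** One parsed -verbose:gc line (illustrative helper, not part of any library). */
final class VerboseGcLine {
    // Assumed line shape: "[GC 325407K->83000K(776768K), 0.2300771 secs]",
    // optionally with a cause in parentheses, e.g. "[Full GC (System) ...]".
    private static final Pattern LINE = Pattern.compile(
        "\\[(Full GC|GC)(?: \\([^)]*\\))? (\\d+)K->(\\d+)K\\((\\d+)K\\), (\\d+\\.\\d+) secs\\]");

    final boolean fullGc;
    final long usedBeforeKb;
    final long usedAfterKb;
    final long capacityKb;
    final double pauseSeconds;

    private VerboseGcLine(boolean fullGc, long before, long after, long capacity, double pause) {
        this.fullGc = fullGc;
        this.usedBeforeKb = before;
        this.usedAfterKb = after;
        this.capacityKb = capacity;
        this.pauseSeconds = pause;
    }

    /** Returns the parsed line, or an empty Optional if the text does not match. */
    static Optional<VerboseGcLine> parse(String text) {
        Matcher m = LINE.matcher(text);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(new VerboseGcLine(
            m.group(1).equals("Full GC"),
            Long.parseLong(m.group(2)),
            Long.parseLong(m.group(3)),
            Long.parseLong(m.group(4)),
            Double.parseDouble(m.group(5))));
    }

    /** Pause rounded to whole microseconds. */
    long pauseMicros() {
        return Math.round(pauseSeconds * 1_000_000.0);
    }

    public static void main(String[] args) {
        parse("[GC 325407K->83000K(776768K), 0.2300771 secs]").ifPresent(line ->
            System.out.println("freed " + (line.usedBeforeKb - line.usedAfterKb)
                + "K in " + line.pauseMicros() + " us"));
    }
}
```

Even this small parser shows the drawback: every change in the log format means another change to the regular expression.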
There is also a dedicated JMX MBean that lets you examine basic GC metrics at runtime, but unfortunately it only reports the collection count and the accumulated collection time, and the latter is measured in milliseconds. Obviously, this renders the MBean unusable if your requirements are in the sub-millisecond range.
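For reference, this is roughly all the standard MBean gives you. The sketch below reads the platform GarbageCollectorMXBean instances of the current JVM via java.lang.management; the class name GcMxBeanDump and the output format are illustrative only.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

/** Prints what the standard GC MXBeans expose (illustrative helper). */
final class GcMxBeanDump {

    /** One line per collector: name, collection count, accumulated time in ms. */
    static List<String> snapshot() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(String.format("%s: count=%d, time=%d ms",
                gc.getName(),
                gc.getCollectionCount(),  // total collections so far, -1 if undefined
                gc.getCollectionTime())); // accumulated time, whole milliseconds only
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : snapshot()) {
            System.out.println(line);
        }
    }
}
```

Note that the time is an accumulated total per collector, so an individual pause can only be estimated from the difference between two readings, and anything shorter than a millisecond is lost.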
You can resort to a great VisualVM plugin called Visual GC. It is an extremely useful tool for working on GC performance, but it lacks some metrics that are important, especially for a low-latency developer. These are: individual GC pause times (at a resolution finer than a millisecond), the amount of promoted and survived bytes, and the GC cost.
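To make the first and last of these metrics concrete, here is a small arithmetic sketch. It assumes that GC time is available as a cumulative counter in high-resolution timer ticks together with a ticks-per-second frequency, and it reads "GC cost" as the percentage of elapsed time spent in garbage collection. Both are assumptions made for illustration, not a description of how any particular tool computes them, and the class name GcTickMath is hypothetical.

```java
/** Tick arithmetic for GC metrics (illustrative helper, assumptions as noted). */
final class GcTickMath {

    /** Converts a tick delta to microseconds, given the timer frequency in ticks per second. */
    static double ticksToMicros(long ticks, long ticksPerSecond) {
        return ticks * 1_000_000.0 / ticksPerSecond;
    }

    /** GC cost, read here as the percentage of elapsed ticks that were spent in GC. */
    static double gcCostPercent(long gcTicks, long elapsedTicks) {
        return 100.0 * gcTicks / elapsedTicks;
    }

    public static void main(String[] args) {
        long frequency = 1_000_000_000L;  // assumed: a nanosecond-resolution timer
        long gcTicksBefore = 4_000_000L;  // cumulative GC ticks at the previous sample
        long gcTicksAfter = 4_350_000L;   // cumulative GC ticks at the current sample

        // The pause between two samples is the delta of the cumulative counter.
        double pauseMicros = ticksToMicros(gcTicksAfter - gcTicksBefore, frequency);
        System.out.println("last pause: " + pauseMicros + " us");
        System.out.println("GC cost: " + gcCostPercent(gcTicksAfter, 200_000_000L) + " %");
    }
}
```

With a tick-based counter, a 350-microsecond pause remains visible, whereas a millisecond-granularity counter would report it as zero.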
For that very reason I decided to write a new GC analysis plugin for VisualVM. It supplements the Visual GC plugin by providing the missing metrics as 4 additional charts. The plugin is still in an early alpha stage, but it has turned out to be useful to me on a number of occasions, so I decided to share it with the community. Here are some screenshots of the plugin in action:

GCViewer plugin with all charts enabled

GC pause times

Bytes promoted vs survived

The project is hosted on GitHub and you are free to do whatever you want with the code. There is also an NBM package for the lazy (or for those who, like me, are not NetBeans fans). If you think this is a useful plugin that could be fixed or extended in any way, please post your suggestions or reach me through GitHub.
Important note: The plugin only works with OpenJDK and HotSpot, because it uses the JVMStat API to access built-in HotSpot counters. For some reason the new parallel GC (-XX:+UseParNewGC) does not expose all the counters that the parallel scavenging collector (-XX:+UseParallelGC) does, so for the former only the GC Pauses and Promoted vs Survived charts will work. I reported this issue to Oracle a while ago, but the report was probably ignored, as the bug never appeared in their database.