Tuesday, January 04, 2011

Flat and Graph Profiles for JRuby 1.6

Sometimes it's the little things that make all the difference in the world.

For a long time, we've extolled the virtues of the amazing JVM tool ecosystem. There's literally dozens of profiling, debugging, and monitoring tools, making JRuby perhaps the best Ruby tool ecosystem you can get. But there's a surprising lack of tools for command-line use, and that's an area many Rubyists take for granted.

To help improve the situation, we recently got the ruby-debug maintainers to ship our JRuby version, so JRuby has easy-to-use command-line Ruby debugging support. You can simply "gem install ruby-debug" now, so we'll stop shipping it in JRuby 1.6.

We also shipped a basic "flat" instrumented profiler for JRuby 1.5.6. It's almost shocking how few command-line profiling tools there are available for JVM users; most require you to boot up a GUI and click a bunch of buttons to get any information at all. Even when there are tools for profiling, they're often bad at reporting results for non-Java languages like JRuby. So we decided to whip out a --profile flag that gives you a basic, flat, instrumented profile of your code.

To continue this trend, we enlisted the help of Dan Lucraft of the RedCar project to expand our profiler to include "graph" profiling results. Dan previously implemented JRuby support for the "ruby-prof" project, a native extension to C Ruby, in the form of "jruby-prof" (which you can install and use today on any recent JRuby release). He was a natural to work on the built-in profiling support.

For the uninitiated, "flat" profiles just show how much time each method body takes, possibly with downstream aggregate times and total aggregate times. This is what you usually get from built-in command-line profilers like the "hprof" profiler that ships with Hotspot/OpenJDK. Here's a "flat" profile for a simple piece of code.

~/projects/jruby ➔ jruby --profile.flat -e "def foo; 100000.times { (2 ** 200).to_s }; end; foo"
Total time: 0.99

total self children calls method
----------------------------------------------------------------
0.99 0.00 0.99 1 Object#foo
0.99 0.08 0.90 1 Fixnum#times
0.70 0.70 0.00 100000 Bignum#to_s
0.21 0.21 0.00 100000 Fixnum#**
0.00 0.00 0.00 145 Class#inherited
0.00 0.00 0.00 1 Module#method_added

A "graph" profile shows the top N call stacks from your application's run, breaking them down by how much time is spent in each method. It gives you a more complete picture of where time is being spent while running your application. Here's a "graph" profile (abbreviated) for the same code.

~/projects/jruby ➔ jruby --profile.graph -e "def foo; 100000.times { (2 ** 200).to_s }; end; foo"
%total %self total self children calls name
---------------------------------------------------------------------------------------------------------
100% 0% 1.00 0.00 1.00 0 (top)
1.00 0.00 1.00 1/1 Object#foo
0.00 0.00 0.00 145/145 Class#inherited
0.00 0.00 0.00 1/1 Module#method_added
---------------------------------------------------------------------------------------------------------
1.00 0.00 1.00 1/1 (top)
99% 0% 1.00 0.00 1.00 1 Object#foo
1.00 0.09 0.91 1/1 Fixnum#times
---------------------------------------------------------------------------------------------------------
1.00 0.09 0.91 1/1 Object#foo
99% 8% 1.00 0.09 0.91 1 Fixnum#times
0.70 0.70 0.00 100000/100000 Bignum#to_s
0.21 0.21 0.00 100000/100000 Fixnum#**
---------------------------------------------------------------------------------------------------------
0.70 0.70 0.00 100000/100000 Fixnum#times
69% 69% 0.70 0.70 0.00 100000 Bignum#to_s
---------------------------------------------------------------------------------------------------------
0.21 0.21 0.00 100000/100000 Fixnum#times
21% 21% 0.21 0.21 0.00 100000 Fixnum#**
---------------------------------------------------------------------------------------------------------
0.00 0.00 0.00 145/145 (top)
0% 0% 0.00 0.00 0.00 145 Class#inherited
---------------------------------------------------------------------------------------------------------
0.00 0.00 0.00 1/1 (top)
0% 0% 0.00 0.00 0.00 1 Module#method_added

As you can see, you get a much better picture of why certain methods are taking up time and what component calls are contributing most to that time.

We haven't settled on the final command-line flags, but look for the new graph profiling (and the cleaned-up flat profile) to ship with JRuby 1.6 (real soon now!)