Saturday, December 22, 2007

Project Idea: Native JSON Gem for JRuby

json-lib 2.2 has been released by Andres Almiray, and he boldly claims that "it now has become the most complete JSON parsing library". It supports Java, Groovy, and JRuby quite well, and Andres has an extensive set of json-lib examples to get you started.

But this post is about a project idea for anyone interested in tackling it: make a JRuby-compatible version of the fast JSON library from Florian Frank using Andres's json-lib (or another library of your choosing, if it's as "complete" as json-lib).

JSON is being used more and more for remoting, not just for AJAX but for RESTful APIs as well. Rails 2.0 supports REST APIs using either XML or JSON (or YAML, I believe), and many shops are settling on JSON.

So there's a possibility this could be a bottleneck for JRuby unless we have a fast native JSON library. There's json-pure, also from Florian Frank, which is a pure Ruby version of the library...but that will never compete with a fast version in C or Java.

Anyone up to the challenge? JRUBY-1767: JRuby needs a fast JSON library

Update: Marcin tells me that Florian's JSON library uses Ragel, which may be an easier path to getting it up and running on JRuby. Hpricot and Mongrel also use Ragel, and both already have JRuby versions.

Thursday, December 20, 2007

A Few Easy (?) JRuby 1.1 Bugs

When I posted on a few easy JRuby 1.0.2 bugs a couple months ago, I got a great response. So since I'm doing a bug tracker review today for JRuby 1.1, here's a new list for intrepid developers.

DST bug in second form of Time.local - This doesn't seem like it should be hard to correct, provided we can figure out what the correct behavior is. We've recently switch JRuby to Joda Time from Java's Calendar, so this one may have improved or gotten easier to resolve.

Installing beta RubyGems fails - This actually applies to the current release of RubyGems, 1.0.0...I tried to do the update and got this nasty error, different from the one originally reported. Someone more knowledgable about RubyGems internals could probably make quick work of this.

weakref.rb could (should?) be implemented in Java - So I already went ahead and did this, because it was a trivial bit of work (and perhaps a good piece of code to look at it you want to see how to easily write extensions to JRuby). But what's still missing are any sort of weakref tests or specs. My preference would be for you to add specs to Rubinius's suite, which will someday soon graduate to a top-level standard test suite of its own. But at any rate, a nice set of tests/specs for weakref would do good for everyone.

AnnotationFormatError when running trunk jruby-complete --command irb - This is actually a JarJar Links problem that we're using a patched version for right now. Problem is...even the current jarjar-1.0rc6 still breaks when I incorporate it into the build. These sorts of bugs can drive a person mad, so if anyone wants to shag out what's wrong with jarjar here, we'd really appreciate it.

Failure in ruby_test Numeric#truncate test - The first of a few trivial numeric failures from Daniel Berger's "ruby_test" suite. Pretty easy to jump into, I would expect.

Failure in ruby_test Numeric#to_int test - And another, perhaps the same sort of issue.

IO.select does not work properly with timeout - This one would involve digging into JRuby IO code a bit, but for someone who knows IO reasonably well it may not be difficult. The primary issue is that while almost all other types of IO in JRuby use NIO channels, stdio does not. So basically, you can't do typical non-blocking IO operations against stdin, stdout, or stderr. Think you can tackle it?

Iconv character set option //translit is not supported - The Iconv library, used by MRI for character encoding/decoding/transcoding, supports more transliteration options than we've been able to emulate with Java's CharSet classes. Do you know of a way to support Iconv-like transliteration in Java?

jirb exits on ESC followed by any arrow key followed by return - this is actually probably a Jline bug, but maybe one that's easily fixed?

bfts test_file_test failures - The remaining failures here seem to mostly be because we don't have UNIXServer implemented (since UNIX domain sockets aren't supported on the JVM). However, since JRuby ships with JNA, it could be possible for someone familiar with UNIX socket C APIs to wire up a nice implementation for us. And for that matter, it would be useful for just about anyone who wants to use UNIX sockets from Java.

Process module does not implement some methods
- Again, here's a case where it probably would be easy to use JNA to add some missing POSIX/libc functions to JRuby.

Implement win32ole library using one of the available Java-COM bridges - This one already has a start! Some fella named Rui Lopes started implementing the Win32OLE Ruby library using Jacob, a Java/COM bridge. I'd love to see this get completed, since Ruby's DSL capabilities map really nicely to COM/ActiveX objects. (It's also complicated by the fact that most of the JRuby core team develops on Macs.)

allow "/" as absolute path in Windows - this is the second oldest JRuby 1.1-flagged bug, and from reviewing the comments we still aren't sure the correct way to make it work. It can't be this hard, can it?

IO CRLF compatibility with cruby on Windows - Another Windows-related bug that really should be brought to a close. Koichiro Ohba started looking into it some time ago, but we haven't heard from him recently. This is the oldest JRuby 1.1 bug, numbered JRUBY-61, and is actually the second-oldest open bug overall. Can't someone help figure this bloody thing out?

Saturday, December 08, 2007

Upcoming Events: Dec 2007, Jan/Feb 2008

JavaPolis 2007 - Antwerp, Belgium - December 10-14 - Sounds like a great event this year, with claims of over 3200 registrations so far. I'll be sharing the JRuby/NetBeans tutorial with Brian Leonard on the 10th and the JRuby/Rails talk with Ola Bini on the 12th. Outside of that, I'll probably be hacking in the main area. Come say hi.

Microsoft Lang.NET Symposium - Redmond, Washington - January 28-30 - I'll be there to get ideas about building a language platform, sharing war stories with fellow language implementers, and probably contributing a bit to John Rose's talk on the Multi-Language VM project. Oughta be a fun time...though it feels a bit weird making my first trip to Microsoft.

acts_as_conference - Orlando, Florida - February 8-9 - Robert Dempsey of Rails For All invited me to come talk about JRuby and Rails...though I'll be doing things a bit differently this time (not showing how to build a Rails app, but showing purely how JRuby improves the Rails ecosystem). Who could pass up a trip to Florida from Minnesota at this time of year?

FOSDEM 2008 - Brussels, Belgium - February 23-24 - FOSDEM invited me to present on the OSS languages track. I've got some great ideas for how to tackle this one. Given that it's an OSS conference, I think it's finally time to show how JRuby has evolved in the past three years from a slow, partial interpreter and runtime to the fastest Ruby 1.8-compatible implementation around. It's been a hell of a ride, and it's gotta qualify as an OSS success story.

Outside these four events, I've had invitations for plenty others (I could probably just do conferences...but how would I ever get anything done?) so I'm sure there will be more to come. You can also count on JavaOne in San Francisco this spring, Ruby Kaigi in Tokyo this summer, RubyConf Europe in Prague some time between April and July, and maybe RailsConf 2008 in Portland (though there's a good chance I won't be presenting).

Friday, December 07, 2007

Groovy 1.5 Released!

The Groovy team has kicked out their second major production release, Groovy 1.5...and skipped straight from 1.0 to 1.5. Why? Perhaps because they added generics, enums, static imports, annotations, fully dynamic metaclasses, improved performance, ... and much more. I think the move to 1.5 was certainly warranted, and we've been considering making the next JRuby release 1.5 for the same reasons.

Congratulations to the Groovy team! I'm looking forward to seeing 1.6 and 2.0 in the future!

OpenJDK Migration to Mercurial is Complete!

I'm really excited about this one. Kelly O'Hair reports that OpenJDK source has been fully migrated to Mercurial! This means that daily development on OpenJDK (eventually to produce Java 7 and other great things) will happen on the same repository that you, dear reader, can access from home. And it's using Mercurial, one of the two big Distributed SCM apps, so you can pull off an entire repo and maintain your own OpenJDK workshop at home. Excellent news...I now finally have an excuse to learn Hg, and I can finally put in the effort to get OpenJDK building with the knowledge that I'll safely be able to "pull" changes as they happen. Thank you to the OpenJDK migration team!

See also Kelly O'Hair's sun.com blog for articles on the OpenJDK Mercurial layout and how to work with it.

Wednesday, December 05, 2007

Groovy in Ruby: Implement Interface with a Map

Some of you may know I participate in the Groovy community as well. I'm hoping to start contributing some development time to the Groovy codebase, but for now I've mostly been monitoring their progress. One thing the Groovy team has more experience with is integrating with Java.

Now if you ask the Groovy team, they'll make some claim like "it's all Java objects" or "Groovy integrates seamlessly with Java" but neither of those are entirely true. Groovy does integrate extremely well with Java, but it's because of a number of features they've added over time to make it so...many of them not directly part of the Groovy language but features of their core libraries and portions of their runtime.

Since Ruby and Groovy seem to be the two most popular (or noisiest) non-Java JVM languages these days, I thought I'd start a series of posts showing how to add Groovy features missing from Ruby to JRuby. But there's a catch: I'll use only Ruby code to do this, and what I show will work on any unmodified JRuby release. That's the beauty of Ruby: the language is so flexible and fluid, you can implement many features from other languages without ever modifying the implementation.

First up, Groovy's ability to implement an interface from a Map.

1. impl = [
2. i: 10,
3. hasNext: { impl.i > 0 },
4. next: { impl.i-- },
5. ]
6. iter = impl as Iterator
7. while ( iter.hasNext() )
8. println iter.next()
Ok, this is Groovy code. The brackety thing assigned to 'impl' shows Groovy's literal Map syntax (a Hash to you Rubyists). Instead of providing literal strings for the keys, Groovy automatically turns whatever token is in the key position into a Java String. So 'i' becomes a String key referencing 10, 'hasNext' becomes a String key referencing a block of code that checks if impl.i is greater than zero, and so on.

The magic comes on line 6, where the newly-constructed Map is coerced into a java.util.Iterator implementation. The resulting object can then be passed to other code that expects Iterator, such as the while loop on lines 7 and 8, and the values from the Map will be used as the code for the implemented methods.

To be honest, I find this feature a bit weird. In JRuby, you can implement a given interface on any class, add methods to that class at will, and get most of this functionality without ever touching a Hash object. But it's pretty simple to implement this in JRuby:
1. module InvokableHash
2. def as(java_ifc)
3. java_ifc.impl {|name, *args| self[name].call(*args)}
4. end
5. end
Here we have one of Ruby's wonderful modules, which I appreciate more each day. This InvokableHash module provides only a single method 'as' which accepts a Java interface type and produces an implementation of that type that uses the contents of hash keys to implement the methods. That's really all there is to it. So by reopening the Hash class, we gain this functionality:
1. class Hash
2. include InvokableHash
3. end
And we're done! Let's see the fruits of our labor in action:
1. impl = {
2. :i => 10,
3. :hasNext => proc { impl[:i] > 0 },
4. :next => proc { impl[:i] -= 1 }
5. }
6. iter = impl.as java.util.Iterator
7. while (iter.hasNext)
8. puts iter.next
9. end
Our final Ruby code looks roughly like the Groovy code. On lines 1 through 5 we construct a literal Hash. Notice that instead of automatically turning identifier tokens into Strings, Ruby uses the exact object you specify for the key, and so here we use Ruby Symbols as our hash keys (they're roughly like interned Strings, and highly recommended for hash keys). On line 6, we coerce our Hash into an Iterator instance (and we could have imported Iterator above to avoid the long name). And then lines 7 through 9 use the new Iterator impl in exactly the same way as the Groovy code.

You've gotta love a language this flexible, especially with JRuby's magic Java integration features to back it up.

Thursday, November 29, 2007

Coming Soon

$ jruby -J-Djruby.compat.version=ruby1_9 -v
ruby 1.9.1 (2007-11-29 rev 4842) [i386-jruby1.1b1]

Wednesday, November 28, 2007

Tab Sweep November 28

WiiHear - Streaming audio for the Wii. Very cool. It was a little spotty for me, but your results may be better. And in related news: Super Mario Galaxy is the best Mario ever. I'll post a review later.

Easy(?) JRuby bugs from Rubinius specs - Vladimir Sizikov has been contributing new/fixed specs to Rubinius while running all the specs against JRuby. He's reported several new bugs as a result, which could potentially be easy API edge cases. Give them a shot. I'll try to post an "easy bug round-up" soon too.

File IO in Different Languages
- A quick comparison of (very) basic file I/O in TCL, Python, PHP, Ruby, Groovy, Scala, and JavaScript. The poster then goes on to compare startup times for the Java implementations of these languages, showing an area JRuby has some issues (slowest startup time, at around 1.6s). We know about the problem, and hopefully NailGun plus other improvements in the future will help. Largely, it's a JVM issue; classes are slow to load and verify, and we create literally hundreds of tiny classes.

Minneapolis - I live in Richfield, an "urban suburb" of Minneapolis (often dubbed Minneapolis Junior), but I generally just say I'm from Minneapolis, since most people recognize it. The Wikipedia article on Minneapolis is excellent (as is the one on the Twin Cities area). A few choice facts:

  • Minneapolis has more per-capita theater seats than any US city other than New York
  • The Twin Cities is one of the warmest places in Minnesota, with an average annual temperature of 45.4 ºF. Monthly average daily high temperatures range from 21.9 °F (-5.6 °C) in January to 83.3 °F (28.5 °C) in July; the average daily minimum temperatures for the two months are 4.3 °F (-15.4 °C) and 63.0 °F (17 °C) respectively. So it gets cold in the winter and hot in the summer, not too hot nor too cold. It's nice to have four seasons.
  • Every home in Minneapolis is no more than six blocks away from a wooded park. In the summer, Minneapolis from the air looks like a bunch of skyscrapers surrounded by a bed of trees.
It's a nice place to live. I'd be hard pressed to find somewhere better.

Oniguruma Regular Expression Syntax version 5.6 - This is the library Marcin ported. I believe all of this is supported in Joni. Of course we wrap it so normal Ruby regular expressions work exactly as they would under Ruby 1.8, but Marcin will release Joni as a standalone library too.

Dr. Nic Installs Mingle with Capistrano - A nice walkthrough.

JavaPolis 2007 - I will be attending and co-presenting a JRuby/NetBeans tutorial with Brian Leonard and a JRuby/Rails session with Ola Bini. Only two weeks and I'll be in the land of beer, chocolate, and diamonds.

acts_as_conference 2007 - I will be attending and presenting something JRubyish and Railsy. Perhaps my last Rails presentation before handing such things off to people who actually know Rails? This conference also wins my vote for "most cumbersome title".

RailsConf 2008 CFP - I haven't decided if I plan to present anything this year. There are many, many other folks doing real-world JRuby on Rails work that could probably do a better job.

FOSDEM 2008 - I have been invited and will be attending. Back to the land of beer, chocolate and diamonds.

Tuesday, November 27, 2007

REXML Numbers With Joni

As Ola reported earlier today, we've merged Joni, Marcin Mielczynski's port of Oniguruma, to JRuby trunk. Here's the description from the Oniguruma home page:

Oniguruma is a regular expressions library.
The characteristics of this library is that different character encoding
for every regular expression object can be specified.
The benefit for us is avoiding the encode/decode we previously had to do for every regular expression match, since Ruby uses byte[]-based strings and all Java regular expression engines work with char[]. You can imagine the overhead all that array churn introduced.

After running through a series of basic optimizations, most of the key expressions we worried about were performing as well as or much better than JRegex, so Ola went through with the conversion over the past couple days. Marcin is continuing to work on various optimizations, but both Ola and I have been playing with the new code. And it's looking great.

You may remember I reported recently about how the regexp bottleneck impacted XML parsing with REXML. Here's the numbers run against JRuby immediately before merging Joni:
read content from stream, no DOM
3.362000 0.000000 3.362000 ( 3.362000)
1.232000 0.000000 1.232000 ( 1.232000)
0.887000 0.000000 0.887000 ( 0.887000)
1.009000 0.000000 1.009000 ( 1.010000)
0.801000 0.000000 0.801000 ( 0.801000)
read content once, no DOM
9.869000 0.000000 9.869000 ( 9.869000)
9.779000 0.000000 9.779000 ( 9.779000)
9.786000 0.000000 9.786000 ( 9.786000)
9.655000 0.000000 9.655000 ( 9.655000)
9.601000 0.000000 9.601000 ( 9.601000)
read content from stream, build DOM
1.368000 0.000000 1.368000 ( 1.368000)
1.297000 0.000000 1.297000 ( 1.297000)
1.192000 0.000000 1.192000 ( 1.192000)
1.131000 0.000000 1.131000 ( 1.131000)
0.812000 0.000000 0.812000 ( 0.812000)
read content once, build DOM
10.595000 0.000000 10.595000 ( 10.595000)
9.489000 0.000000 9.489000 ( 9.488000)
9.947000 0.000000 9.947000 ( 9.947000)
9.821000 0.000000 9.821000 ( 9.821000)
9.414000 0.000000 9.414000 ( 9.415000)
And here's the performance numbers today, with Joni:
read content from stream, no DOM
2.309000 0.000000 2.309000 ( 2.308000)
1.217000 0.000000 1.217000 ( 1.217000)
0.776000 0.000000 0.776000 ( 0.776000)
0.825000 0.000000 0.825000 ( 0.825000)
0.637000 0.000000 0.637000 ( 0.637000)
read content once, no DOM
0.370000 0.000000 0.370000 ( 0.369000)
0.415000 0.000000 0.415000 ( 0.415000)
0.288000 0.000000 0.288000 ( 0.288000)
0.260000 0.000000 0.260000 ( 0.260000)
0.254000 0.000000 0.254000 ( 0.254000)
read content from stream, build DOM
1.455000 0.000000 1.455000 ( 1.455000)
0.916000 0.000000 0.916000 ( 0.916000)
0.887000 0.000000 0.887000 ( 0.888000)
0.827000 0.000000 0.827000 ( 0.827000)
0.607000 0.000000 0.607000 ( 0.607000)
read content once, build DOM
0.630000 0.000000 0.630000 ( 0.630000)
0.664000 0.000000 0.664000 ( 0.664000)
0.680000 0.000000 0.680000 ( 0.680000)
0.553000 0.000000 0.553000 ( 0.553000)
0.650000 0.000000 0.650000 ( 0.650000)
Marcin's being modest about the work, but we're all absolutely amazed by it.

So finally the last really gigantic performance bottleneck in JRuby is gone, and it appears that JRuby's slow regexp era has come to a close. Next targets: the remaining issues with IO and Java integration performance.

Java 6 Port for OS X (Tiger and Leopard)

I just stumbled across this little gem today:

Landon Fuller's JDK 6 Port for OS X

Who Landon Fuller is I don't know. But I find it incredibly impressive that he's managed to get the base JDK 6 ported to OS X and working. Talk about showing the value of an open-source JDK...Landon Fuller FTW. Apple, are you hiring? Perhaps this guy can kick the Apple JDK process in the ass.

So naturally when I'm confronted with a final JDK 6 release for OS X one thing immediately springs to mind: performance.

We'd always suspected that the early preview version of JDK 6 on OS X was not showing us the true awesome performance we could expect from a final version.

We were right.

So I'll give you two sets of numbers, one that's specious and unreliable and the other that's a more real-world test.

fib numbers

Yes, good old fib. A constant in benchmarking. It shows practically nothing, and yet people use it to demonstrate perf. And in the case of Ruby 1.9, they've specifically optimized for integer-math-heavy benchmarks like this.

Ruby 1.9:

  0.400000   0.000000   0.400000 (  0.413737)
0.420000 0.010000 0.430000 ( 0.421622)
0.400000 0.000000 0.400000 ( 0.411591)
0.410000 0.000000 0.410000 ( 0.411593)
0.400000 0.000000 0.400000 ( 0.410080)
0.410000 0.000000 0.410000 ( 0.408836)
0.400000 0.000000 0.400000 ( 0.408572)
0.410000 0.000000 0.410000 ( 0.408114)
0.400000 0.000000 0.400000 ( 0.410374)
0.400000 0.000000 0.400000 ( 0.413096)

Very nice numbers, especially considering Ruby 1.8 benchmarks at about 1.7s on my system. Ruby 1.9 contains several optimizations for integer math, including the use of "tagged integers" for Fixnum values (saving object costs) and fast math opcodes in the 1.9 bytecode specification (avoiding method dispatch). JRuby does neither of these, representing Fixnums as a normal Java object containing a wrapped Long and dispatching as normal for all numeric operations.

JRuby trunk:
  0.783000   0.000000   0.783000 (  0.783000)
0.510000 0.000000 0.510000 ( 0.510000)
0.510000 0.000000 0.510000 ( 0.510000)
0.506000 0.000000 0.506000 ( 0.506000)
0.505000 0.000000 0.505000 ( 0.504000)
0.507000 0.000000 0.507000 ( 0.507000)
0.510000 0.000000 0.510000 ( 0.510000)
0.507000 0.000000 0.507000 ( 0.507000)
0.508000 0.000000 0.508000 ( 0.508000)
0.510000 0.000000 0.510000 ( 0.510000)

This is improved from numbers in the 0.68s range under the Apple JDK 6 preview. Pretty damn hot, if you ask me. I love being able to sit back and do nothing while performance numbers improve. It's a nice change from 16 hour days.

Anyway, back to performance. JRuby also supports an experimental frameless execution mode that omits allocating and initializing per-call frame information. In Ruby, frames are used for such things as holding the current method visibility, the current "self", the arguments and block passed to a method, and so on. But in many cases, it's safe to omit it entirely. I haven't got it running 100% safe in JRuby yet, and probably won't before 1.1 final comes out...but it's on the horizon. So then...numbers.

JRuby trunk, frameless execution:
  0.627000   0.000000   0.627000 (  0.627000)
0.409000 0.000000 0.409000 ( 0.409000)
0.401000 0.000000 0.401000 ( 0.401000)
0.402000 0.000000 0.402000 ( 0.402000)
0.403000 0.000000 0.403000 ( 0.403000)
0.403000 0.000000 0.403000 ( 0.403000)
0.404000 0.000000 0.404000 ( 0.405000)
0.401000 0.000000 0.401000 ( 0.401000)
0.403000 0.000000 0.403000 ( 0.403000)
0.405000 0.000000 0.405000 ( 0.405000)

Hello hello? What do we have here? JRuby actually executing fib faster than an optimized Ruby 1.9? Can it truly be?

Pardon my snarkiness, but we never thought we'd be able to match Ruby 1.9's integer math performance without seriously stripping down Fixnum and introducing fast math operations into the compiler. I guess we were wrong.

M. Ed Borasky's MatrixBenchmark

I like Borasky's matrix benchmark because it's a non-trivial piece of code, and pulls in a Ruby standard library (matrix.rb) as well. It basically inverts a matrix of a particular size and multiplies the original by the inverse. I show here numbers for a 64x64 matrix, since it's long enough to show the true benefit of JRuby but short enough I don't get bored waiting.

Ruby 1.9:
Hilbert matrix of dimension 64 times its inverse = identity? true
21.630000 0.110000 21.740000 ( 21.879126)
JRuby trunk:
Hilbert matrix of dimension 64 times its inverse = identity? true
14.780000 0.000000 14.780000 ( 14.780000)

This is down from 16-17s under the Apple JDK 6 preview and a clean 25% faster than Ruby 1.9.

So what have we learned today?
  • Sun's JDK 6 provides frigging awesome performance
  • Apple users are crippled without a JDK 6 port. Apple, I hope you're paying attention.
  • Landon Fuller is my hero of the week. I know Landon will just point at the excellent work to port JDK 6 to FreeBSD and OpenBSD...but give yourself some credit, you did what none of the other Leopard whiners did.
  • JRuby rocks
Note: You have to be a Java Research License licensee to legally download the binary or source versions of Landon's port. That or complain to Apple about some dude making a working port before they did. Landon mentions on his blog that he plans to contribute this work to OpenJDK soon...which would quickly result in a buildable GPLed JDK for OS X. Awesome.

Friday, November 23, 2007

Oracle Mix Proving JRuby is "The Best Way"?

I have a tendency to post intentionally inflammatory posts that usually end up evenly divided between "yea" and "nay". But this time, I've got Rich Manalang's post from the JRuby on Rails front lines to back me up.

Rich is one of the primaries (Rich: THE primary?) behind Oracle's new Mix site, the first highly-visible public site based on JRuby on Rails. And after the experience he's convinced that JRuby is "the best way" to deploy Rails.

Read the whole article, but I think Rich's final paragraph sums it up pretty well:

This was an amazing project to be a part of. And one thing I’ll say is that for anyone working in a Java EE environment where you have to use the stack that’s there, the future is bright and it’s all because of jRuby, Rails, and the speed and agility at which you can build applications on that framework. I’m convinced that jRuby is [the] best way to deploy a Rails app if you need performance and flexibility. My prediction: next year will be the year for jRuby’s rise into the mainstream.

Thursday, November 22, 2007

JRuby on ME Devices?

Roy Hayun presented his "JRuby on ME" talk this past JavaOne and got a pretty solid response. He ported a pre-1.0 JRuby to CDC by incrementally stripping out libraries and functionality that couldn't be supported. He succeeded, and yesterday delivered to us a buildable version of his JRubME 0.1 (that name was a typo on his original JavaOne submission...but the missing 'y' produced a comically good name).


In the JRuby download "research" area, you can find the results of his work plus some docs and a short presentation on JRubyME. As near as I can tell he based it on JRuby 0.9.8. The same work could probably be done to produce a stripped version of current JRuby, and with a bit more work we could probably tweak and reconfigure the source to support ME execution as a normal build target. Both are exercises for you all until there's more time (or Ruby on ME becomes a primary goal rather than performance and compatibility on SE :) )

There you go! You wanted JRuby on ME, now you have a damn good start! Run with it!

Wednesday, November 21, 2007

GlassFish Gem Build Instructions

Hopefully by now you've heard about the GlassFish Gem. It's a roughly 3MB gem that includes only the pieces of GlassFish necessary to launch a production JRuby on Rails server. Instead of using WAR deployment, you just run "glassfish_rails" and point at your Rails dir. The result is a multi-request production-ready server.

But there are things that could be improved. It deploys apps under a subcontext, rather than at the root context. It doesn't appear to route static content correctly. It doesn't provide options for configuring the deployed app, like for connection pools, number of JRuby runtime to spin up, and so on. Basically, it needs people to try it out and provide improvement suggestions.

Now you can build the GlassFish Gem yourself Arun has provided step-by-step instructions on how to get all the necessary files and generate your own GlassFish Gem. I believe this could be the best way to deploy JRuby on Rails apps for both production and development use, so I really hope you'll give it a try and offer suggestions and patches back to the GlassFish team (or funnel them through me if you like).

Tab Sweep

Inspired by Tim Bray's periodic tab sweeps (Tim, do you collect unread tabs into the dozens like I do?) here's the first of hopefully many tab sweeps from me.

Kindle by Amazon - I've been waiting for an ePaper-based reader, but I don't think this is going to be the one. No cables or syncing? That says to me everything I read has to go through Amazon, and my existing PDFs will be worthless. Black and white. Ugly as sin ("Hello 1996? I found something that belongs to you"). Flop.

Neal Ford on JRuby and Ruby versus others (podcast) - Neal does a great job explaining JRuby to the masses while understanding the deeper reasons why it's one of the better languages for the JVM. One note to Neal: JRuby is far from being a simple port of the MRI C code; it's grown far, far beyond that now and is considerably more advanced in design and implementation.

Tim Bray's two-question Ruby tools survey - Tim's pretty good at keeping his finger on the pulse of the dev community. Guess that's why he's a director and distinguished engineer.

John Rose's report on his and my meeting with the PyPy team - John summed it up pretty well, but he's interested for different reasons than I. He's interested in finding ways to evolve the JVM to support the sorts of optimizations the PyPy JIT is performing (or will perform, where it's not complete yet). I'm interested in the possibility of a generic language toolchain that allows you to build your language of choice in a subset of your language of choice; effectively a way to quickly bootstrap a language's most rabid users directly into the implementation process, rather than forcing them to use a new language like Java or C#. If it can be done, my money's on that approach beating all "Language Runtimes" that saddle implementers with exactly the sorts of languages they don't want to use.

Glimmer - Despite sharing its name with the ill-fated Whitney Houston movie, Glimmer appears to be an interesting yet-another-JRuby-GUI-framework based on SWT and data binding. Is it obvious yet that GUI development in the C Ruby world is sorely lacking?

Greg Haygood mixes traditional JSP webapp and Rails in the same WAR - Perfectly valid. Now to start blurring the lines between JRuby on Rails and the various Java web frameworks and technologies.

Oracle Mix, the first big-time public JRuby on Rails site - A top-level oracle.com site running JRuby on Rails. That's hardcore. And I don't think they're even running JRuby 1.1, with all its performance glory. And I know about upcoming sites you don't. Future is bright.

David Bock with another OS X Java rant - But he's right about one thing...until Java 6 comes out on OS X, JRuby users will have to be content running the slower Java 5 or the somewhat-flaky Java 6 developer preview (no longer available for download). Write your congressman.

Tor Norbye's NetBeans Ruby features for last week - The guy's a machine. Awesome stuff, and more to come.

JRuby CafePress Store - Any proceeds will go...somewhere, I dunno. I just put it up because I got tired of people whining that they wanted JRuby t-shirts. I didn't bump up any prices, so who knows if there will even be any profit from them. If there is, I'll leave it be until there's something to use it for. If not, so be it. BTW: Don't get the logo on a black shirt; it won't look right.

Experiments with frameless execution - (site is down at the moment; try later) JRuby, like many other language implementations on general-purpose VMs, suffers from the overhead of heap-allocated "frame objects" that hold information about the current method call. They're needed because various languages often need per-call information on an easily accessible stack, or need to be able to tuck call frames away for future use (in closures, for example). They're overhead because the JVM already is allocating (and frequently optimizing away) call frames for the underlying Java code, and there's no way to get at those frames or overload them with new duties. IronPython is a notable example of a language impl that has opted to avoid frame objects, at the cost of some Python features. JRuby, in an effort to support all Ruby features, still has frame objects; but it may be possible to optimize them away in certain cases. The link has microbenchmark numbers for framed and frameless execution. Both are faster than Ruby 1.9; frameless by several times.

Paul Brannan's Ludicrous JIT compiler for Ruby - Promising work, and it gets some decent gains over Ruby 1.9.

Tuesday, November 20, 2007

Bytecode Tools in Ruby: A Low-level DSL

I've been toying with the idea of rewriting the JRuby compiler in Ruby, or at least writing the appropriate plumbing that would allow someone to do something similar. Migrating the JRuby compiler may or may not be worth it, since the existing Java compiler is basically done and working well, and a conversion would be sure to introduce bugs here and there. But it would certainly be a show of faith to give it a try.

As part of this effort, I've built up some basic utility code and a simple JVM bytecode builder that could act as the lowest level of such a compiler. I'm looking for input on the syntax at this point, while I take a break from it to explore JRuby Java integration improvements I think should be done before 1.1.

So here's the Ruby source of the builder, as contained within a test case:

require 'test/unit'
require 'compiler/builder'
require 'compiler/signature'

class TestBuilder < Test::Unit::TestCase
import java.lang.String
import java.util.ArrayList
import java.lang.Void
import java.lang.Object
import java.lang.Boolean

include Compiler::Signature

def test_class_builder
cb = Compiler::ClassBuilder.build("MyClass", "MyClass.java") do
field :list, ArrayList

constructor(String, ArrayList) do
aload 0
invokespecial Object, "<init>", Void::TYPE
aload 0
aload 1
aload 2
invokevirtual this, :bar, [ArrayList, String, ArrayList]
aload 0
swap
putfield this, :list, ArrayList
returnvoid
end

static_method(:foo, this, String) do
new this
dup
aload 0
new ArrayList
dup
invokespecial ArrayList, "<init>", Void::TYPE
invokespecial this, "<init>", [Void::TYPE, String, ArrayList]
areturn
end

method(:bar, ArrayList, String, ArrayList) do
aload 1
invokevirtual(String, :toLowerCase, String)
aload 2
swap
invokevirtual(ArrayList, :add, [Boolean::TYPE, Object])
aload 2
areturn
end

method(:getList, ArrayList) do
aload 0
getfield this, :list, ArrayList
areturn
end

static_method(:main, Void::TYPE, String[]) do
aload 0
ldc_int 0
aaload
invokestatic this, :foo, [this, String]
invokevirtual this, :getList, ArrayList
aprintln
returnvoid
end
end

cb.write("MyClass.class")
end
end

For those of you who don't speak bytecode, here's roughly the Java code that this would produce:
import java.util.ArrayList;

public class MyClass {
public ArrayList list;

public MyClass(String a, ArrayList b) {
list = bar(a, b);
}

public static MyClass foo(String a) {
return new MyClass(a, new ArrayList());
}

public ArrayList bar(String a, ArrayList b) {
b.add(a.toLowerCase());
return b;
}

public ArrayList getList() {
return list;
}

public static void main(String[] args) {
System.out.println(foo(args[0]).getList());
}
}

The general idea is that fairly clean-looking Ruby code can be used to generate real Java classes, providing a readable base for code generation tools like compilers.

There's a couple things to notice here:
  • Everything is public. I have not wired in visibility and other modifiers mainly because it starts to look cluttered no matter how I try. Suggestions are welcome.
  • The bytecode, while clean looking, is pretty raw. This interface also doesn't save you from yourself; if you're not ordering your bytecodes right, you'll end up with an unverifiable class file.
  • It's not apparent just from looking at the code which types specified are return values and which are argument values. Something more explicit could be useful here.
I'd like to continue this work. The above code, run against JRuby trunk and the lib/ruby/site_ruby/1.8/compiler library I'm working on, will produce a working MyClass class file:
~/NetBeansProjects/jruby $ jruby test/compiler/test_builder.rb
Loaded suite test/compiler/test_builder
Started
.
Finished in 0.096 seconds.

1 tests, 0 assertions, 0 failures, 0 errors
~/NetBeansProjects/jruby $ java -cp . MyClass foo
[foo]

So it's actually emitting the appropriate bytecode for this class.

Comments? Thoughts for improvement?

Monday, November 19, 2007

Have You Written RubySpec Today?

You know Ruby. You've been coding up Ruby apps for a while now. You've seen the power and the magic that Ruby offers developers, and you've seen some of the weirder, wilder, and perhaps uglier sides of Ruby. You think you're pretty well-versed in the core classes, and you can hold your own when metaprogramming.

You're a Ruby Programmer. Now Prove it.

RubySpec is a wiki-based Ruby specification, aimed at forming an English-language, community-driven spec for Ruby 1.8 (and in the future, Ruby 1.9). There's a lot of content there already, and a lot of stubbed articles and missing details. And it's your turn to contribute.

You'll be in good company. Ruby oldschoolers like why the lucky stiff and Matz himself have contributed updates and fixes. Ryan Davis contributed the content of his Ruby Quickref. Those of us on the JRuby team try to update it when we find pecularities in the Ruby language, classes, or runtime. And many folks use it as a convenient reference.

RubySpec is connected with the Ruby Documentation Project, which also hosts Ruby 1.8 and 1.9 RDoc-generated documentation directly from the C source. The spec is designed to fill in the gaps, explaining more details about the language, the runtime, and the implementations that would be useful to folks interested in a deeper look at Ruby.

So...have you written RubySpec today?

The First Ruby Mailing List Translator

In response to my post about the Ruby community needing an autotranslator for the key mailing lists, due to the language barrier between English-speaking and Japanese-speaking folks, I received quite a bit of interest, and a few people are looking into automatic solutions, manual solutions, and various solutions in between. But I believe we have a first attempt to meet the challenge.

Jason Toy has set up a mailing list translator site for the Ruby community. It provides autotranslated text of many Ruby mailing lists (both directions, and far more than I expected anyone to tackle), and even better it provides the original text and invites bilingual Rubyists to submit better translations for individual posts.

Jason commented on the previous post, saying his site still needs some work, but it's definitely on the right track. The obvious missing feature is a way to subscribe to the translated lists, either via feeds or mailing lists. Anyone feel like lending a hand can contact Jason at (I think) admin@translator.rubynow.com. Anyone feeling like doing this one better, maybe by setting up an army of human translators to proxy information across the divide, don't let this stop you :)

Saturday, November 17, 2007

RejectConf 4 Calls to Action: Ruby specs and JRuby failures

For those of you that don't know, RejectConf started last year at RubyConf 2006 in Denver because Ryan Davis (zenspider; author of the many parsetree-based tools like flog and heckle, part of the vlad team, and so on) had a rejected presentation and still wanted to show it off. It turned out to be one of the most entertaining parts of the conference, with dozens of folks taking 5-15 minutes to demo something they thought would be cool. Some were met with applause, some were booed down Apollo-style, and everyone had a good time.

RejectConf 4 was at RubyConf 2007 earlier this month. I took the opportunity to toss out two calls to action:

  1. Help contribute to Rubinius by adding specs, and if you're especially lazy just copy any of the many tests in JRuby (that are not specs, but which we'd just as soon see migrate to a single suite).
  2. Report failures in JRuby running the specs, and possibly fix them if you feel like helping out even more.
Confreaks have all the RejectConf 4 videos up now, and my call to action is there with the others. Check it and the others out, and consider helping out either the spec project or JRuby.

Monday, November 05, 2007

Ruby Community Seeks Autotranslator

As many of you know, Ruby was created in Japan by Yukihiro Matsumoto, and most of the core development team is still Japanese to this day. This has posed a serious problem for the Ruby community, since the language barrier between the Japanese core team and community and the English-speaking community is extremely high. Only a few members of the core team can speak English comfortably, so discussions about the future of Ruby, bug fixes, and new features happens almost entirely on the Japanese ruby-dev mailing list. That leaves those of us English speakers on the ruby-core mailing list out in the cold.

We need a two-way autotranslator.

Yes, we all know that automated translation technology is not perfect, and that for East Asian languages it's often barely readable. But even having partial, confusing translations of the Japanese emails would be better than having nothing at all, since we'd know that certain topics are being discussed. And English to JP translators do a bit better than the reverse direction, so core team members interested in ruby-core emails would get the same benefit.

I imagine this is also part of the reason Rails has not taken off as quickly in Japan as it has in the English-speaking world: the Rails core team is peopled primarily by English speakers, and the main Rails lists are all in English. Presumably, an autotranslating gateway would be useful for many such communities.

But here's the problem: I know of no such service.

There are multiple translation services, for free and for pay, that can handle Japanese to some level. Google Translate and Babelfish are the two I use regularly. But these only support translating a block of text or a URL entered into a web form. There also does not appear to be a Google API for Translate, so screen-scraping would be the only option at present.

The odd thing about this is that autotranslators are good enough now that there could easily be a generic translation service for dozens of languages. Enter in source and target languages, source and target mailing lists, and it would busily chew through mail. For closely-related European languages, autotranslators do an extremely good job. And just last night I translated a Chinese blog post using Google Translate that ended up reading as almost perfect English. The time is ripe for such a service, and making it freely available could knock down some huge barriers between international communities.

So, who's going to set it up first and grab the brass ring (or is there a service I've overlooked)?

Updated Alioth Numbers for JRuby 1.1b1

Oh yes, you all know you love the Alioth Shootout.

Isaac Gouy has updated the JRuby numbers, and modified the default comparison to be with Ruby 1.8.6 rather than with Groovy as it was before. And true to form, JRuby is faster than Ruby on 14 out of 18 benchmarks.

There are reasons for all four benchmarks that are slower:

  • pidigits is simply too short for JRuby to hit its full stride. Alioth runs it with n = 2500, which on my system doing a simple "time" results in JRuby taking 11 seconds and Ruby taking 5. If I bump that up to 5000, JRuby takes 27 seconds to Ruby's 31.
  • regex-dna and recursive-complement are both hitting the Regexp performance problem we have in JRuby 1.0.x and in the 1.1 beta. We expect to have that resolved for 1.1 final, and Ola Bini and Marcin Mielczynski are each developing separate Regexp engines to that end.
  • startup, beyond being a touch unfair for a JVM-based language right now, is actually about half our fault and half the JVM's. The JVM half is the unpleasantly high cost of classloading, and specifically the cost of generating many small temporary classes into their own classloaders, as we have to do in JRuby to avoid leaking classes as we JIT and bind methods at runtime. The JRuby half is the fact that we're loading and generating so many classes, most of them too far ahead of time or that will never be used. So there's blame to go around, but we'll never have Ruby's time for this.
Standard disclaimer applies about reading too much into these kinds of benchmarks, but it's certainly a nice set of numbers to wake up to.

Closures Prototype Applied to JRuby Compiler

A bit ago, I was catching up on my feeds and noticed that Neal Gafter had announced the first prototype of Java closures. I've been a fan of the BGGR proposal, so I thought I'd catch up on the current status and try applying it to a pain point in the JRuby source: the compiler.

The current compiler is made up of two halves: the AST walker and the bytecode emitter. The AST walker recursively walks the AST, calling appropriate methods on a set of interfaces into the bytecode emitter. The bytecode emitter, in turn, spits out appropriate bytecodes and calls back to the AST walker. Back and forth, the AST is traversed and all nested structures are assembled appropriately into a functional Java method.

This back and forth is key to the structure and relative simplicity of the compiler. Take for example the following method in ASTCompiler.java, which compiles a Ruby "next" keyword (similar to Java's "continue"):

public static void compileNext(Node node, MethodCompiler context) {
context.lineNumber(node.getPosition());

final NextNode nextNode = (NextNode) node;

ClosureCallback valueCallback = new ClosureCallback() {
public void compile(MethodCompiler context) {
if (nextNode.getValueNode() != null) {
ASTCompiler.compile(nextNode.getValueNode(), context);
} else {
context.loadNil();
}
}
};

context.pollThreadEvents();
context.issueNextEvent(valueCallback);
}

First, the "lineNumber" operation is called on the MethodCompiler, my interface for primary bytecode emitter. This emits bytecode for line number information based on the parsed position in the Ruby AST.

Then we get a reference to the NextNode passed in.

Now here's where it gets a little tricky. The "next" operation can be compiled in one of two ways. If it occurs within a normal loop, and the compiler has an appropriate jump location, it will compile as a normal Java GOTO operation. If, on the other hand, the "next" occurs within a closure (and not within an immediately-enclosing loop), we must initiate a non-local branch operation. In short, we must throw a NextJump.

In Ruby, unlike in Java, "next" can take an optional value. In the simple case, where "next" is within a normal loop, this value is ignored. When a "next" occurs within a closure, the given value becomes the local return from that invocation of the closure. The idea is that you might write code like this, where you want to do an explicit local return from a closure rather than let the return value "fall off the end":
def foo
puts "still going" while yield
end

a = 0
foo {next false if a > 4; a += 4; true}

...which simply prints "still going" four times.

The straightforward way to compile this non-local "next" would be to evaluate the argument, construct a NextJump object, swap the two so we can call the NextJump(IRubyObject value) constructor with the given value, and then raise the exception. But that requires us to juggle values around all the time. This simple case doesn't seem like such a problem, but imagine the hundreds or thousands of nodes the compiler will handle for a given method, all spending at least part of their time juggling stack values around. It would be a miserable waste.

So the compiler constructs a poor-man's closure: an anonymous inner class. The inner class implements our "ClosureCallback" interface which has a single method "compile" accepting a single MethodCompiler parameter "context". This allows the non-local "next" bytecode emitter to first construct the NextJump, then ask the AST compiler to continue processing AST nodes. The compiler walks the "value" node for the "next" operation, again causing appropriate bytecode emitter calls to be made, and finally we have our value on the stack, exactly where we want it. We continue constructing the NextJump and happily toss it into the ether.

The final line of the compileNext method initiates this process.

So what would this look like with the closure specification in play? We'll simplify it with a function object.
public static void compileNext(Node node, MethodCompiler context) {
context.lineNumber(node.getPosition());

final NextNode nextNode = (NextNode) node;

ClosureCallback valueCallback = { MethodCompiler => context
if (nextNode.getValueNode() != null) {
ASTCompiler.compile(nextNode.getValueNode(), context);
} else {
context.loadNil();
}
};

context.pollThreadEvents();
context.issueNextEvent(valueCallback);
}

That's starting to look a little cleaner. Gone is the explicit "new"ing of a ClosureCallback anonymous class, along with the superfluous "compiler" method declaration. We're also seeing a bit of magic outside the function type: closure conversion. Our little closure that accepts a MethodCompiler parameter is being coerced into the appropriate interface type for the "valueCallback" variable.

How about a more complicated example? Here's a much longer method from JRuby that handles "operator assignment", or any code that looks like a += b:
public static void compileOpAsgn(Node node, MethodCompiler context) {
context.lineNumber(node.getPosition());

// FIXME: This is a little more complicated than it needs to be;
// do we see now why closures would be nice in Java?

final OpAsgnNode opAsgnNode = (OpAsgnNode) node;

final ClosureCallback receiverCallback = new ClosureCallback() {
public void compile(MethodCompiler context) {
ASTCompiler.compile(opAsgnNode.getReceiverNode(), context); // [recv]
context.duplicateCurrentValue(); // [recv, recv]
}
};

BranchCallback doneBranch = new BranchCallback() {
public void branch(MethodCompiler context) {
// get rid of extra receiver, leave the variable result present
context.swapValues();
context.consumeCurrentValue();
}
};

// Just evaluate the value and stuff it in an argument array
final ArrayCallback justEvalValue = new ArrayCallback() {
public void nextValue(MethodCompiler context, Object sourceArray,
int index) {
compile(((Node[]) sourceArray)[index], context);
}
};

BranchCallback assignBranch = new BranchCallback() {
public void branch(MethodCompiler context) {
// eliminate extra value, eval new one and assign
context.consumeCurrentValue();
context.createObjectArray(new Node[]{opAsgnNode.getValueNode()}, justEvalValue);
context.getInvocationCompiler().invokeAttrAssign(opAsgnNode.getVariableNameAsgn());
}
};

ClosureCallback receiver2Callback = new ClosureCallback() {
public void compile(MethodCompiler context) {
context.getInvocationCompiler().invokeDynamic(
opAsgnNode.getVariableName(), receiverCallback, null,
CallType.FUNCTIONAL, null, false);
}
};

if (opAsgnNode.getOperatorName() == "||") {
// if lhs is true, don't eval rhs and assign
receiver2Callback.compile(context);
context.duplicateCurrentValue();
context.performBooleanBranch(doneBranch, assignBranch);
} else if (opAsgnNode.getOperatorName() == "&&") {
// if lhs is true, eval rhs and assign
receiver2Callback.compile(context);
context.duplicateCurrentValue();
context.performBooleanBranch(assignBranch, doneBranch);
} else {
// eval new value, call operator on old value, and assign
ClosureCallback argsCallback = new ClosureCallback() {
public void compile(MethodCompiler context) {
context.createObjectArray(new Node[]{opAsgnNode.getValueNode()}, justEvalValue);
}
};
context.getInvocationCompiler().invokeDynamic(
opAsgnNode.getOperatorName(), receiver2Callback, argsCallback,
CallType.FUNCTIONAL, null, false);
context.createObjectArray(1);
context.getInvocationCompiler().invokeAttrAssign(opAsgnNode.getVariableNameAsgn());
}

context.pollThreadEvents();
}

Gods, what a monster. And notice my snarky comment at the top about how nice closures would be (it's really there in the source, see for yourself). This method obviously needs to be refactored, but there's a key goal here that isn't addressed easily by currently-available Java syntax: the caller and the callee must cooperate to produce the final result. And in this case that means numerous closures.

I will spare you the walkthrough on this, and I will also spare you the one or two other methods in the ASTCompiler class that are even worse. Instead, we'll jump to the endgame:
public static void compileOpAsgn(Node node, MethodCompiler context) {
context.lineNumber(node.getPosition());

// FIXME: This is a little more complicated than it needs to be;
// do we see now why closures would be nice in Java?

final OpAsgnNode opAsgnNode = (OpAsgnNode) node;

ClosureCallback receiverCallback = { MethodCompiler context =>
ASTCompiler.compile(opAsgnNode.getReceiverNode(), context); // [recv]
context.duplicateCurrentValue(); // [recv, recv]
};

BranchCallback doneBranch = { MethodCompiler context =>
// get rid of extra receiver, leave the variable result present
context.swapValues();
context.consumeCurrentValue();
};

// Just evaluate the value and stuff it in an argument array
ArrayCallback justEvalValue = { MethodCompiler context, Object sourceArray, int index =>
compile(((Node[]) sourceArray)[index], context);
};

BranchCallback assignBranch = { MethodCompiler context =>
// eliminate extra value, eval new one and assign
context.consumeCurrentValue();
context.createObjectArray(new Node[]{opAsgnNode.getValueNode()}, justEvalValue);
context.getInvocationCompiler().invokeAttrAssign(opAsgnNode.getVariableNameAsgn());
};

ClosureCallback receiver2Callback = { MethodCompiler context =>
context.getInvocationCompiler().invokeDynamic(
opAsgnNode.getVariableName(), receiverCallback, null,
CallType.FUNCTIONAL, null, false);
};

// eval new value, call operator on old value, and assign
ClosureCallback argsCallback = { MethodCompiler context =>
context.createObjectArray(new Node[]{opAsgnNode.getValueNode()}, justEvalValue);
};

if (opAsgnNode.getOperatorName() == "||") {
// if lhs is true, don't eval rhs and assign
receiver2Callback.compile(context);
context.duplicateCurrentValue();
context.performBooleanBranch(doneBranch, assignBranch);
} else if (opAsgnNode.getOperatorName() == "&&") {
// if lhs is true, eval rhs and assign
receiver2Callback.compile(context);
context.duplicateCurrentValue();
context.performBooleanBranch(assignBranch, doneBranch);
} else {
context.getInvocationCompiler().invokeDynamic(
opAsgnNode.getOperatorName(), receiver2Callback, argsCallback,
CallType.FUNCTIONAL, null, false);
context.createObjectArray(1);
context.getInvocationCompiler().invokeAttrAssign(
opAsgnNode.getVariableNameAsgn());
}

context.pollThreadEvents();
}

There's two things I'd like you to notice here. First, it's a bit shorter as a result of the literal function objects and closure conversion. It's also a bit DRYer, which naturally plays into code reduction. Second, there's far less noise to contend with. Rather than having a minimum of five verbose lines to define a one-line closure (for example), we now have three terse ones. We've managed to tighten the focus to the lines of code we're actually interested in: the bodies of the closures.

Of course this quick tour doesn't get into the much wider range of features that the closures proposal contains, such as non-local returns. It also doesn't show closures being invoked because with closure conversion many existing interfaces can be represented as function objects automatically.

I'll be looking at the closure proposal a bit more closely, and time permitting I'll try to get a simple JRuby prototype compiler wired up using the techniques above. I'd recommend you give it a try too, and offer Neal your feedback.

Sunday, November 04, 2007

Ruby Continues to Climb on TIOBE

I've posted about TIOBE here before.

The TIOBE Programming Community index gives an indication of the popularity of programming languages. The index is updated once a month. The ratings are based on the world-wide availability of skilled engineers, courses and third party vendors. The popular search engines Google, MSN, Yahoo!, and YouTube are used to calculate the ratings. Observe that the TIOBE index is not about the best programming language or the language in which most lines of code have been written.

I noticed this month that Ruby has moved up to #9 in the list, passing JavaScript. Also noted in the November Newsflash is that Ruby is currently the front runner to win "programming language of the year" for the second year in a row, closely followed by D and C#.

TIOBE Programming Community Index

Is Werewolf Killing the Conference Hackfest?

The latest craze at conferences, especially those associated with O'Reilly or Ruby, is the game Werewolf (historically known as Mafia, but Werewolf has become more popular). The premise of the game is simple: (from Wikipedia's Mafia article)

Mafia (also known under the variant Werewolf or Vampire) is a party game modeling a battle between an informed minority and an uninformed majority. Mafia is usually played in groups with at least five players. During a basic game, players are divided into two teams: 'Mafia members', who know each other; and 'honest people', who generally know the number of Mafia amongst them. The goal of both teams is to eliminate each other; in more complicated games with multiple factions, this generally becomes "last side standing".
Substitute "villagers" for "honest people" and "werewolves" for "mafia members" and you get the general idea. They're the same game.

Again from Wikipedia:
Mafia was created by Dimma Davidoff at the Psychological Department of Moscow State University, in spring of 1986, and the first players were playing in classrooms, dorms, and summer camps of Moscow University. [citation needed] The game became popular in other Soviet colleges and schools and in the 1990s it began to be played in Europe (Hungary, Poland, England, Norway) and then the United States. Mafia is considered to be one of "the 50 most historically and culturally significant games published since 1800" by about.com.
So you get the idea.

It's a fun game. I first played it at Foo Camp 2007, and I've played at each Ruby-related conference since then. It's a great way to get to know people, and an interesting study in social dynamics. I have my own opinions about strategy and in particular a fatal flaw in the game, but I'll leave those for another day. Instead, I have a separate concern:

Is Werewolf killing the conference hackfest?

Last year at RubyConf, Nick Sieger and I sat up until 4AM with Eric Hodel, Evan Phoenix, Zed Shaw and others hacking on stuff. Eric showed a little Squeak demo for those of us who hadn't used it, Zed and I talked about getting a JRuby-compatible Mongrel release out (which finally happened a year later) and I think we all enjoyed the time to hack with a few others who code for the love of coding.

As another example, RubyGems, the definitive and now official packaging system for Ruby apps and libraries, was written during a late-night hackfest at an early RubyConf (2002?). I'm not remembering my history so well, but I believe Rake had a similar start, as well as several other projects including the big Rails 1.0 release's final hours.

This year, and for the past several conferences, there's been a significant drop in such hackfests. And it's because of Werewolf.

Immediately after the Matz's keynote last night, many of the major Ruby players sequestered themselves in isolated groups to play Werewolf for hours (and yes, I know many did not). They did not write the next RubyGems. They did not plant the seeds of the next great Ruby web framework. They did not advance the Ruby community. They played a game.

Don't get me wrong...I have been tempted to join every Werewolf game I can. I enjoy the game, and I feel like I'm at least competent at it. And I can appreciate wanting to blow off steam and play a game after a long conference day. I often feel the same way.

But I'm worried about the general trend. Not only does Werewolf seem to be getting more popular, it seems to draw in many of the best and brightest conference attendees, attendees who might otherwise be just as happily hacking on the "hard problems" we still face in the software world.

So what do you think? Is this a passing trend, or is it growing and spreading? Does it mean the doom of the late-night, post-conference hackfest and the inspirational products it frequently produces? Or is it just a harmless pastime for a few good folks who need an occasional break?

Friday, November 02, 2007

Top Five Questions I Get Asked

I'm at RubyConf this weekend, the A-list gathering of Ruby dignitaries. Perhaps one day I'll be considered among them.

At any rate, here's the top five questions I get asked. Perhaps this will help people move directly on to more interesting subjects :)

#1: Are you Charles Nutter

Yes, I am. I look like that picture up in the corner of my blog.

#2: How's life been at Sun/How's Sun been treating you?

Coming to Sun has been the best professional experience of my life. When they took us in over a year ago, many worried that they'd somehow ruin the project or steer it in the wrong direction (and I had my occasional concerns as well). The truth is that they've remained entirely hands-off with regards to JRuby while simultaneously putting a ton of effort into NetBeans Ruby support and committing mission-critical internal projects to running on JRuby. Beyond that, I've become more and more involved in the growing effort to support many languages on the JVM, through changes to Java and the JVM specification, public outreach, and cultivating and assisting language-related projects. It's been amazing.

Along with this question I often get another: "What's it like to work at Sun?" Unfortunately, this one is a little hard for me to answer. Tom Enebo and I work from our homes in Minneapolis, so "working at Sun" basically means hacking on stuff in my basement for 16 hours a day. In that regard, it's like...great.

#3 What's next for JRuby?

Usually this comes from people who haven't really been following JRuby much, and that's certainly understandable. We had our 1.0 release just five months ago, and we've had two maintenance releases since then plus a major release coming out in beta this weekend. So there's a lot to keep track of.

JRuby 1.0 was released in RC form at JavaOne 2007 and in final form at Ruby Kaigi 2007 in Tokyo. It was focused largely on compatibility, but had early bits of the Ruby-to-bytecode compiler (perhaps 25% complete). Over the summer, we simultaneously worked on improving compatibility (adding missing features, fixing bugs) and preparing for another major release (with me spending several months working on the compiler and various performance optimizations). That leads us to JRuby 1.1, coming out in beta form this weekend. JRuby 1.1 represents a huge milestone for JRuby and for Ruby in general. It's now safe to say it's the fastest way to run Ruby code in a 1.8-compatible environment. It includes the world's first Ruby compiler for a general-purpose VM, and the first complete and working 1.8-compatible compiler of any kind. And it has finally reached our largest Rails milestone to date: it runs Rails faster than Ruby 1.8.x does.

What comes after JRuby 1.1? Well, we need to finally clean up our association/integration with the other half of JRuby: Java support. Java integration works great right now--probably good enough for 90% of users. But JRuby still doesn't quite fit into the Java ecosystem as well as we'd like. Along with continuing performance and compatibility work, JRuby 2.0 should finally run the last mile of Java integration. Then, maybe we'll be "done"?

#4 What do you think of [Ruby implementation X]?

I usually hear this as "what bad things do you want to say about the other implementations", even though people probably don't intend it that way. So, for the record, here's my current position on the big five alternative Ruby implementations.

Rubinius: Evan and I talk quite a bit about implementation design, challenges of implementing Ruby, and so on. JRuby uses Rubinius's specs for testing, and Rubinius borrows/ports many of our tests for their specs. I've even made code contributions to Rubinius and Sun sponsored the Ruby Hackfest in Denver this past September. We're quite cozy.

Technical opinion? It's a great idea to implement Ruby in Ruby...I'd rather use Ruby than Java in JRuby whenever possible. However the challenge of making it perform well is a huge one. When all your core classes are implemented in Ruby, you need your normal Ruby execution to be *way* faster than Ruby 1.8 to have comparable application performance...like orders of magnitude faster. Think about it this way: if each core class method makes ten calls to more core class methods, and each of those methods calls ten native methods (finally getting to the VM kernel), you've got 100x more method invocations than an implementation with a "thicker" kernel that's immediately a native call. Now the situation probably isn't that bad in Rubinius, and I'm sure Evan and company will manage to overcome performance challenges, but it's a difficult problem to solve writing a VM from scratch.

We all like difficult problems, don't we?

XRuby: XRuby is the "other" implementation of Ruby for the JVM. Xue Yong Zhi took the approach of writing a compile-only implementation, but that's probably the only major difference from the JRuby approach. XRuby has similar core class structure to JRuby, and when I checked a few months ago, it could run some benchmarks faster than JRuby. Although it's still very incomplete, it's a good implementation that's probably not getting enough attention.

Xue and I talk occasionally on IM about implementation challenges. I believe we've both been able to help each other.

Ruby.NET: Ruby.NET is the first implementation of Ruby for the CLR, and by far the most complete. Wayne Kelly and John Gough from the University of Queensland ran this Microsoft-funded project when it was called "Gardens Point Ruby.NET Compiler", and Wayne has continued leading it in its "truly open source" project form on Google Code. I exchanged emails many times with Wayne about strategies for opening up the project, and I involved myself in many of the early discussions on their mailing list. I'm also, oddly enough, listed as a project owner on the project page. I'd like to see them succeed...more Ruby options means Ruby gains more validation on all platforms, and that means JRuby will be more successful. Ironic, perhaps, that long-term Ruby and JRuby success depends on all implementations doing well...but I believe that to be true.

IronRuby: IronRuby is the current in-progress implementation of Ruby for the CLR, primarily being developed by Microsoft's John Lam and his team. IronRuby is still very early in their implementation, but they've already achieved one big milestone: they have their Ruby core classes available as a public, open-source project on RubyForge, and you, dear reader, can actually contribute. In my book, it qualifies as the first "real" open source project out of Microsoft since it finally recognizes that OSS is a two-way street.

Technically, IronRuby is also an interesting project because it's stretching Microsoft's DLR in some pretty weird directions. DLR was originally based on IronPython, and it sounds like there's been growing pains supporting Ruby. Ruby is a much more complicated language to implement than Python, and from what I've heard the IronRuby team has had to diverge from DLR in a few areas (perhaps temporarily). But it also sounds like IronRuby is helping to expand DLR's reach, which is certainly good for MS and for folks interested in the DLR.

John and I talk at conferences, and I monitor and participate in the IronRuby mailing list and IRC channel. We Rubyists are a close-knit bunch.

Ruby 1.9.1/YARV: Koichi Sasada has done the impossible. He's implemented a bytecode VM and compiler for Ruby without altering the memory model, garbage collector, or core class implementations. And he's managed to deliver targeted performance improvements for Ruby that range from two times to ten times faster. It's an amazing achievement.

Ruby 1.9.1 will finally come out this Christmas with Koichi's YARV as the underlying execution engine. I talk with Koichi on IRC frequently, and I've been studying his optimizations in YARV for ideas on how to improve JRuby performance in the future. I've also offered suggestions on Koichi's Fiber implementation, resulting in him specifying a core Fiber API that can safely be implemented in JRuby as well. Officially, Ruby 1.9.1 is the standard upgrade path for Rubyists interested in moving on from Ruby 1.8 (and not concerned about losing a few features). JRuby will support Ruby 1.9 execution in the near future, and we already have a partial YARV VM implementation. JRuby and YARV will continue to evolve together, and I expect we'll be cooperating closely in the future.

#5 How's JRuby's Performance?

This is the first question that finally gets into interesting details about JRuby. JRuby's performance is great, and improving every day. Straight-line execution is now typically 2-4x better than Ruby 1.8.6. Rails whole-app performance ranges from just slightly slower to slightly better, though there have been reports of specific apps running many times faster. There's more work to do across the board, but I believe we've finally gotten to a point where we're comfortable saying that JRuby is the fastest complete Ruby 1.8 implementation available.

See also my previous posts and posts by Ola Bini and Nick Sieger for more information on performance. There's more detail there.

Saturday, October 27, 2007

Easy JRuby 1.0.2 Bugs

For those of you looking to start helping JRuby along, here's a few open 1.0.2 bugs that would be pretty easy for a newb to look at. They'd also help you get your first taste of JRuby internals. Good eatin!

This should help get you started:

FYI: Most of these are also 1.1 bugs, so you'll be helping two releases at once (provided you fix them and provide patches for both branches, of course!)

This is fairly simple, isolated core class code. Have a look at src/org/jruby/RubyArray.java

There's a patch here that looks pretty clean, but needs updating to apply cleanly to trunk and 1.0 branch.

We have File.sysopen defined, but for some reason we don't have IO.sysopen. If you can't fix it, at least do some research as to *why*.

This one throws a NullPointerException, which are always really ugly, really broken, and usually really easy to figure out and fix. The spec in question is our (oldish) copy in test/externals/rubinius/spec/core.

This one just needs proper error handling, and there's a preliminary patch provided.

I'm not sure if we can support these, but if we can we should. If we can't, we should raise an error. Either way it's probably not hard to resolve.

There's a patch here that just needs a little more work.

I don't even understand what's happening in this one, because I keep falling asleep reading the description. I think it's probably easy.

Again, there's a patch here that appears to just need a bit more work.

This one has all the background work done on the JRuby side. We're simply not using the right algorithm for Range#step over character ranges. A quick peek at MRI source or a little insight on how to "just fix it" is all that's needed here, and the code is pretty simple. (Update: Fixed, thanks to Mathias Biilmann Christensen.)

This has a patch that's been applied to trunk, but it didn't apply cleanly to the 1.0 branch. Tidy it up, and it's done.

...but Kernel#trap does. Huh? (Update: Fixed! Ola had already repaired it, but not closed the bug.)

This is pretty straightforward...we just don't have Q/q implemented. (Update: Fixed, thanks to Riley Lynch.)

Thursday, October 25, 2007

JRuby 1.1 on Rails: More Performance Previews

Nick Sieger, a Sun coworker and fellow JRuby core team member, has posted the results of benchmarking his team's large Rails-based project on MRI and the codebase soon to be released as JRuby 1.1 (trunk). Read it here:


Now there's a couple things I get out of this post:
  1. JRuby on Rails is starting, at least for this app, to surpass MRI + Mongrel for simple serial-request benchmarks. And this a week before the 1.1 beta release and all the continuing performance enhancements we know are still out there. I think we're gonna make it, folks. If this continues, there's no doubt in my mind JRuby 1.1 will run Rails fastest.
  2. JRuby on Rails will perform at least as well as MRI + Mongrel for the app Nick and his teammates are building. Several months ago they committed to eventually deploying on JRuby, and if you knew what I know about this app, you'd know why that scared the dickens out of me. But I'm glad the team had faith in JRuby and the JRuby community, and I'm glad we've been able to deliver.
I hope we're approaching a time when people believe that JRuby is the real deal--an excellent choice for running Rails now and potentially the best choice in the near future. There's always more work to do, but it's good to see the effort paying off.

Give it a try today, why not?

Tuesday, October 23, 2007

Help Us Complete JRuby 1.1 and 1.0.2 Releases

We're looking to do releases of a 1.1 beta and 1.0.2 over the next couple weeks, and we're hoping to pull in help from the community to turn over as many bugs as possible. A lot of the bugs we'd like to fix for each release wouldn't be very difficult for folks to get into, even if you only have a bit of Ruby experience and a good Java background. Currently we're looking at the following counts:

  • JRuby 1.0.2: 54 scheduled - These are almost exclusively compatibility issues, though there's a few minor performance items and several stability fixes.
  • JRuby 1.1: 147 scheduled - There's almost all the 1.0.2 bugs in here plus many additional performance enhancements and bug fixes that can't be cleanly/easily backported to 1.0.2.
  • Post 1.1: 7 JRuby 1.x, 71 unscheduled - These bugs are in danger of getting punted to a future release, so if you have a bug in these lists or you see something you think you'd like to/want to fix, now's the time.
The 1.0.2 release is basically a maintenance release to the 1.0 branch, and as such includes primarily compatibility and stability fixes. There's a couple minor perf issues that will be resolved, generally only where they were considered so slow as to be "broken".

The big story is going to be 1.1. It will include a massive number of performance enhancements and bug fixes. It will include a complete Ruby-to-Java bytecode compiler, which is responsible for much of the performance improvements. And it should also include a completely reworked regular expression subsystem that eliminates one of the last really major bottlenecks running Rails code. We may not have that done for the 1.1 beta release at RubyConf, but it will be there for 1.1 final around a month later.

So if you've been looking for a chance to get involved in JRuby, hop on the mailing lists or start browing the bug lists above, and feel free to email me directly (firstname.lastname@sun.com) if you have questions about any specific bug. JRuby needs your support!

Wednesday, October 17, 2007

Another Performance Discovery: REXML

I've discovered a really awful bottleneck in REXML processing.

Look at these results for parsing our build.xml:

read content from stream, no DOM
2.592000 0.000000 2.592000 ( 2.592000)
1.326000 0.000000 1.326000 ( 1.326000)
0.853000 0.000000 0.853000 ( 0.853000)
0.620000 0.000000 0.620000 ( 0.620000)
0.471000 0.000000 0.471000 ( 0.471000)
read content once, no DOM
5.323000 0.000000 5.323000 ( 5.323000)
5.328000 0.000000 5.328000 ( 5.328000)
5.209000 0.000000 5.209000 ( 5.209000)
5.173000 0.000000 5.173000 ( 5.173000)
5.138000 0.000000 5.138000 ( 5.138000)

When reading from a stream, the content is read in in chunks, with each chunk being matched in turn. Because our current regexp engine uses char[] instead of byte[], each chunk must be decoded into UTF-16 characters, matched, and encoded back into UTF-8 bytes.

For small chunks, like those read off a stream, this decode/encode cycle is fairly quick. Here, the streamed numbers are pretty close to MRI. However, when an XML string is parsed from memory, the process goes like this:
  1. set buffer to entire string
  2. match against the buffer
  3. set buffer to post match (remainder of the string)
Now this is obviously a little inefficient, since it creates a lot of extra strings, but a copy-on-write String implementation helps a lot. However in our case it also means that we decode/encode the entire remaining XML string for every element match. For any nontrivial file, this is *terrible* overhead.

So what's the fix? Here's the same second benchmark using a StringIO object passed to the parser instead, with a simple change to rexml/source.rb:

Diff:
Index: lib/ruby/1.8/rexml/source.rb
===================================================================
--- lib/ruby/1.8/rexml/source.rb (revision 4596)
+++ lib/ruby/1.8/rexml/source.rb (working copy)
@@ -1,4 +1,5 @@
require 'rexml/encoding'
+require 'stringio'

module REXML
# Generates Source-s. USE THIS CLASS.
@@ -8,7 +9,7 @@
# @return a Source, or nil if a bad argument was given
def SourceFactory::create_from arg#, slurp=true
if arg.kind_of? String
- Source.new(arg)
+ IOSource.new(StringIO.new(arg))
elsif arg.respond_to? :read and
arg.respond_to? :readline and
arg.respond_to? :nil? and

New numbers:
read content once, no DOM
0.640000 0.000000 0.640000 ( 0.640000)
0.693000 0.000000 0.693000 ( 0.693000)
0.542000 0.000000 0.542000 ( 0.542000)
0.349000 0.000000 0.349000 ( 0.349000)
0.336000 0.000000 0.336000 ( 0.336000)

This is a perfect indication why JRuby's Rails performance is currently nowhere near what it will be. We continue to find these little gems...and there's no telling how many more are out there. With recent execution performance numbers looking extremely solid and recent Rails performance getting closer and closer, the upcoming 1.1 release ought to be amazing.

Friday, October 12, 2007

Performance Update

As some of you may know, I've been busily migrating all method binding to use Java annotations. The main reasons for this are to simplify binding and to provide end-to-end metadata that can be used for optimizing methods. It has enabled using a single binding generator for 90% of methods in the system (and increasing). And today that has enabled making some impressive perf improvements.

The method binding ends up looking like this:

@JRubyMethod(name = "[]", name2 = "slice",
required = 1, optional = 1)
public IRubyObject aref(IRubyObject[] args) {
...

This binds the aref Java method to the two Ruby method names [] and slice and enforces a minimum of one argument and a maximum of two. And it does this all automatically; no manual arity checking or method binding is necessary. Neat. But that's not the coolest result of the migration.

The first big step I took today was migrating all annotation-based binding to directly generate unique DynamicMethod subclasses rather than unique Callback subclasses that would then be wrapped in a generic DynamicMethod implementation. This moves generated code closer to the actual calls.

The second step was to completely disable STI dispatch. STI, we shall miss you.

So, benchmarks. Of course fibonacci numbers are indicative of only a very narrow range of performance, but I think they're a good indicator of where general performance will go in the future, as we're able to expand these optimizations to a wider range of methods.

JRuby before the changes:
$ jruby -J-server -O bench_fib_recursive.rb
1.039000 0.000000 1.039000 ( 1.039000)
1.182000 0.000000 1.182000 ( 1.182000)
1.201000 0.000000 1.201000 ( 1.201000)
1.197000 0.000000 1.197000 ( 1.197000)
1.208000 0.000000 1.208000 ( 1.208000)
1.202000 0.000000 1.202000 ( 1.202000)
1.187000 0.000000 1.187000 ( 1.187000)
1.188000 0.000000 1.188000 ( 1.188000)

JRuby after:
$ jruby -J-server -O bench_fib_recursive.rb
0.864000 0.000000 0.864000 ( 0.863000)
0.640000 0.000000 0.640000 ( 0.640000)
0.637000 0.000000 0.637000 ( 0.637000)
0.637000 0.000000 0.637000 ( 0.637000)
0.642000 0.000000 0.642000 ( 0.642000)
0.643000 0.000000 0.643000 ( 0.643000)
0.652000 0.000000 0.652000 ( 0.652000)
0.637000 0.000000 0.637000 ( 0.637000)

This is probably the largest performance boost since the early days of the compiler, and it's by far the fastest fib has ever run. Here's MRI (Ruby 1.8) and YARV (Ruby 1.9) numbers for comparison:

MRI:
$ ruby bench_fib_recursive.rb
1.760000 0.010000 1.770000 ( 1.813867)
1.750000 0.010000 1.760000 ( 1.827066)
1.760000 0.000000 1.760000 ( 1.796172)
1.760000 0.010000 1.770000 ( 1.822739)
1.740000 0.000000 1.740000 ( 1.800645)
1.750000 0.010000 1.760000 ( 1.751270)
1.750000 0.000000 1.750000 ( 1.778388)
1.740000 0.000000 1.740000 ( 1.755024)

And YARV:
$ ./ruby -I lib bench_fib_recursive.rb
0.390000 0.000000 0.390000 ( 0.398399)
0.390000 0.000000 0.390000 ( 0.412120)
0.400000 0.010000 0.410000 ( 0.424013)
0.400000 0.000000 0.400000 ( 0.415217)
0.400000 0.000000 0.400000 ( 0.409039)
0.390000 0.000000 0.390000 ( 0.415853)
0.400000 0.000000 0.400000 ( 0.415201)
0.400000 0.000000 0.400000 ( 0.504051)

What I think is really awesome is that I'm comfortable showing YARV's numbers, since we're getting so close--and YARV has a bunch of additional integer math optimizations we don't currently support and thought we'd never be able to compete with. Well, I guess we can.

However a more reasonable benchmark is the "pentomino" benchmark in the YARV suite. We've always been slower than MRI...much slower some time ago when nothing compiled. But times they are a-changin'. Here's JRuby before the changes:
$ time jruby -J-server -O sbench/bm_app_pentomino.rb

real 1m50.463s
user 1m49.990s
sys 0m1.131s

And after:
$ time jruby -J-server -O bench/bm_app_pentomino.rb

real 1m25.906s
user 1m26.393s
sys 0m0.946s

MRI:
$ time ruby test/bench/yarv/bm_app_pentomino.rb

real 1m47.635s
user 1m47.287s
sys 0m0.138s

And YARV:
$ time ./ruby -I lib bench/bm_app_pentomino.rb

real 0m49.733s
user 0m49.543s
sys 0m0.104s

Again, keep in mind that YARV is optimized around these benchmarks, so it's not surprising it would still be faster. But with these recent changes--general-purpose changes that are not targeted at any specific benchmark--we're now less than 2x slower.

My confidence has been wholly restored.

Thursday, September 27, 2007

The Compiler Is Complete

It is a glorious day in JRuby-land, for the compiler is now complete.

Tom and I have been traveling in Europe the past two weeks, first for RailsConf EU in Berlin and currently in Århus, Denmark for JAOO (which was an excellent conference, I highly recommend it). And usually, that would mean getting very little work done...but time is short, and I've been putting in full days for almost the entire trip.

Let's recap my compiler-related commits made on the road:

  • r4330, Sep 16: ClassVarDeclNode and RescueNode compiling; all tests pass...we're in the home stretch now!
  • r4339, Sep 17: Fix for return in rescue not bubbling all the way out of the synthetic method generated to wrap it.
  • r4341, Sep 17: Adding super compilation, disabled until the whole findImplementers thing is tidied up in the generated code.
  • r4342, Sep 17: Enable super compilation! I had to disable an assertion to do it, but it doesn't seem to hurt things and I have a big fixme on it.
  • r4355, Sep 19: zsuper is getting closer, but still not enabled.
  • r4362, Sep 20: Enabled a number of additional flow-control syntax within ensure bodies, and def within blocks.
  • r4363, Sep 20: Re-disabling break in ensure; it caused problems for Rails and needs to be investigated in more depth.
  • r4367, Sep 21: Removing the overhead of constructing ISourcePosition objects for every line in the compiler; I moved construction to be a one-time cost and perf numbers went back to where they were before.
  • r4368, Sep 21: Small optz for literal string perf and memory use: cache a single bytelist per literal in the compiled class and share it across all literal strings constructed.
  • r4370, Sep 22: Enable compilation of multiple assignment with args (rest args)
  • r4375, Sep 24: Total refactoring of zsuper argument processing, and zsuper is now enabled in the compiler. We still need more/better tests and specs for zsuper, unfortunately.
  • r4377, Sep 24: Compile the remaining part of case/when, for when *whatever (appears as nested when nodes in the ast...why?)
  • r4388, Sep 25: Add compilation of global and constant assignment in masgn/block args
  • r4392, Sep 25: Compilation of break within ensurified sections; basically just do a normal breakjump instead of Java jumps
  • r4400, Sep 25: Fix for JRUBY-1388, plus an additional fix where it wasn't scoping constants in the right module.
  • r4401, Sep 25: Retry compilation!
  • r4402, Sep 26: Multiple additional cleanups, fixes, to the compiler; expand stack-based methods to include those with opt/rest/block args, fix a problem with redo/next in ensure/rescue; fix an issue in the ASTInspector not inspecting opt arg values; shrink the generated bytecode by offloading to CompilerHelpers in a few places. Ruby stdlib now compiles completely. Yay!
  • r4404, Sep 26: Add ASM "CheckClass" adapter to command-line (class file dumping) part of compiler.
  • r4405, Sep 26: A few additional fixes for rescue method names and reduced size for the pre-allocated calladapters, strings, and positions.
  • r4410, Sep 27: A number of additional fixes for the compiler to remedy inconsistent stack issues, and a whole slew of work to make apps run correctly with AOT-compiled stdlib. Very close to "complete" in my eyes.
  • r4412, Sep 27: Fixes to top-level scoping for AOT-compiled methods, loading sequence, and some minor compiler tweaks to make rubygems start up and run correctly with AOT-compiled stdlib.
  • r4413, Sep 27: Fixed the last known bug in the compiler. It is now complete.
  • r4414, Sep 27: Ok, now the compiler is REALLY complete. I forgot about BEGIN and END nodes. The only remaining node that doesn't compile is OptN, whichwe won't put in the compiled output (we'll just wrap execution of scripts with the appropriate logic). It's a good day to be alive!
I think I've done a decent job proving you can get some serious work done on the road, even while preparing two talks and hob-nobbing with fellow geeks. But of course this is an enormous milestone for JRuby in general.

For the first time ever, there is a complete, fully-functional Ruby 1.8 compiler. There have been other compilers announced that were able to handle all Ruby syntax, and perhaps even compile the entire standard library. But they have never gotten to what in my eyes is really "complete": being able to dump the stdlib .rb files and continue running nontrivial applications like IRB or RubyGems. I think I'm allowed to be a little proud of that accomplishment. JRuby has the first complete and functional 1.8-semantics compiler. That's pretty cool.

What's even more cool is that this has all been accomplished while keeping a fully-functional interpreter working in concert. We've even made great strides in speeding up interpreted mode to almost as fast as the C implementation of Ruby 1.8, and we still have more ideas. So for the first time, there's a mixed-mode Ruby runtime that can run interpreted, compiled, or both at the same time. Doubly cool. This also means that we don't have to pay a massive compilation cost for 'eval' and friends, and that we can be deployed in a security-restricted environment where runtime code-generation is forbidden.

I will try to prepare a document soon about the design of the compiler, the decisions made, and what the future holds. But for now, I have at least one teaser for you to chew on: there is a second compiler in the works, this time for creating real Java classes you can construct and invoke directly from Java-land. Yes, you heard me.

Compiler #2

Compiler #2 will basically take a Ruby class in a given file (or multiple Ruby classes, if you so choose) and generate a normal Java type. This type will look and feel like any other Java class:
  • You can instantiate it with a normal new MyClass(arg1, arg2) from Java code
  • You can invoke all its methods with normal Java invocations
  • You can extend it with your own Java classes
The basic idea behind this compiler is to take all the visible signatures in a Ruby class definition, as seen during a quick walk through the code, and turn them into Java signatures on a normal class. Behind the scenes, those signatures will just dynamically invoke the named method, passing arguments through as normal. So for example, a piece of Ruby code like this:
class MyClass
def initialize(arg1, arg2); end
def method1(arg1); end
def method2(arg1, arg2 = 'foo', *arg3); end
end
Might produce a Java class equivalent to this:
public class MyClass extends RubyObject {
public MyClass(Object arg1, Object arg2) {
callMethod("initialize", arg1, arg2);
}

public Object method1(Object arg1) {
return callMethod("method1", arg1);
}

public Object method2(Object arg1, Object... optAndRest) {
return callMethod("method2", arg1, optAndRest);
}
}
It's a pretty trivial amount of code to generate, but it completes that "last mile" of Java integration, being directly callable from Java and directly integrated into Java type hierarchies. Triply cool?

Of course the use of Object everywhere is somewhat less than ideal, so I've been thinking through implementation-independent ways to specify signatures for Ruby methods. The requirement in my mind is that the same code can run in JRuby and any other Ruby without modification, but in JRuby it will gain additional static type signatures for calls from Java. The syntax I'm kicking around right now looks something like this:
class MyClass
...
{String => [Integer, Array]}
def mymethod(num, ary); end
end
If you're unfamiliar with it, this is basically just a literal hash syntax. The return type, String, is associated with the types of the two method arguments, Integer and Array. In any normal Ruby implementation, this line would be executed, a hash constructed, and execution would proceed with the hash likely getting quickly garbage collected. However Compiler #2 would encounter these lines in the class body and use them to create method signatures like this:
    public String mymethod(int num, List ary) {
...
}

The final syntax is of course open for debate, but I can assure you this compiler will be far easier to write than the general compiler. It may not be complete before JRuby 1.1 in November, but it won't take long.

So there you have it, friends. Our work on JRuby has shown that it is possible to fully compile Ruby code for a general-purpose VM, and even that Ruby can be made to integrate as a first-class citizen on the Java platform, fitting in wherever Java code may be used today.

Are you as excited as I am?