Tuesday, March 10, 2009

Compiling Ruby to Java Types

"Compiler #2" as it has been affectionately called is a compiler to turn normal Ruby classes into Java classes, so they can be constructed and called by normal Java code. When I asked for 1.3 priorities, this came out way at the top. Tom thought perhaps I asked for trouble putting it on the list, and he's probably right (like asking "prioritize these: sandwich, pencil, shiny gold ring with 5kt diamond, banana"), but I know this has been a pain point for people.

I have just landed an early prototype of the compiler on trunk. I made a few decisions about it today:

  • It will use my bytecode DSL "BiteScript", just like Duby does
  • It will use the *runtime* definition of a class to generate the Java version
The second point is an important one. Instead of having an offline compiler that inspects a file and generates code from it, the compiler will actually used the runtime class to create a Java version. This means you'll be able to use all the usual metaprogramming facilities, and at whatever point the compiler picks up your class it will see all those methods.

Here's an example:

# myruby.rb
require 'rbconfig'

class MyRubyClass
def helloWorld
puts "Hello from Ruby"
end
if Config::CONFIG['host_os'] =~ /mswin32/
def goodbyeWorld(a)
puts a
end
else
def nevermore(*a)
puts a
end
end
end

Here we have a class that defines two methods. The first, always defined, is helloWorld. The second is conditionally either goodbyeWorld or nevermore, based on whether we're on Windows. Yes, it's a contrived example...bear with me.

The compiler2 prototype can be invoked as follows (assuming bitescript is checked out into ../bitescript):

jruby -I ../bitescript/lib/ tool/compiler2.rb MyObject MyRubyClass myruby

A breakdown of these arguments is as follows:
  • -I ../bitescript/lib includes bitescript
  • tool/compiler2.rb is the compiler itself
  • MyObject is the name we'd like the Java class to have
  • MyRubyClass is the name of the Ruby class we want it to front
  • myruby is the library we want it to require to load that class
Running this on OS X and dumping the resulting Java class gives us:

Compiled from "MyObject.java.rb"
public class MyObject extends org.jruby.RubyObject{
static {};
public MyObject();
public org.jruby.runtime.builtin.IRubyObject helloWorld();
public org.jruby.runtime.builtin.IRubyObject nevermore(org.jruby.runtime.builtin.IRubyObject[]);
}

The first thing to notice is that the compiler has generated a method for nevermore, since I'm not on Windows. I believe this will be unique among dynamic languages on the JVM: we will make the *runtime* set of methods available through the Java type, not just the static set present at compile time.

Because there are no type signatures specified for MyRubyClass, all types have defaulted to IRubyObject. Type signature logic will come along shortly. And notice also this extends RubyObject; a limitation of the current setup is that you won't be able to use compiler2 to create subclasses. That will come later.

Once you've run this, you've got a MyObject that can be instantiated and used directly. Behind the scenes, it uses a global JRuby instance, so JRuby's still there and you still need it in classpath, but you won't have to instantiate a runtime, pass it around, and so on. It should make integrating JRuby into Java frameworks that want a real class much easier.

So, thoughts? Questions? Have a look at the code under tool/compiler2.rb in JRuby's repository. The entire compiler is so far only 78 lines of Ruby code.