Cell ➞ Java compiler released

Version 0.1 of the Cell to Java compiler/code generator is now available. It is provided as a jar file, and should run on any platform that supports Java 8.

Compared to the Cell ➞ C# compiler released two months ago, the Java code generator makes passing data back and forth between the two languages (Cell and Java) a lot easier. The C++ and C# code generators try to map Cell data types either to native data types of the host language (like bool, long or double) or to types defined in its standard library, like std::string for C++ or string for C#. Failing that, they resort to a generic representation that is not especially convenient to use. The Java code generator, on the other hand, tries to generate a Java class for any Cell type that cannot be mapped to a native Java type. That allows it to handle records and union types easily, unlike the older C++ and C# code generators. It is also possible to replace the generated classes with your own, as long as their interfaces are compatible. Everything is described in detail in the Interfacing with Java chapter. All the improvements will soon be backported to the C++ and C# code generators.
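To give a rough idea of what "a Java class for a Cell type" could look like from the host side, here is a minimal sketch. It is not actual compiler output: the names Point, Shape, Circle, Rectangle and ShapeDemo are made up for illustration, and the real shape of the generated classes is the one documented in the Interfacing with Java chapter. The sketch only assumes that a record becomes a class with one field per record field, and that a union becomes a base class with one subclass per alternative.

```java
// Hypothetical illustration only: these classes are NOT actual compiler output.

// A record with two integer fields might map to a plain class with public fields.
// Integers are shown as long, following the native mapping mentioned above.
class Point {
    public long x;
    public long y;
}

// A union of two alternatives might map to an abstract base class
// with one concrete subclass per alternative.
abstract class Shape {
}

class Circle extends Shape {
    public double radius;
}

class Rectangle extends Shape {
    public double width;
    public double height;
}

class ShapeDemo {
    // Host-side code can then dispatch on the concrete subclass.
    static double area(Shape shape) {
        if (shape instanceof Circle) {
            Circle c = (Circle) shape;
            return Math.PI * c.radius * c.radius;
        }
        Rectangle r = (Rectangle) shape;
        return r.width * r.height;
    }

    public static void main(String[] args) {
        Rectangle r = new Rectangle();
        r.width = 2.0;
        r.height = 3.0;
        System.out.println(area(r));
    }
}
```

Under the same assumption, replacing a generated class with your own would mean supplying a class that exposes the same fields or methods as the generated one.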

Java is not a friendly language for code generators though, and there's one problem in particular that is still unresolved. The bytecode of a single Java method cannot exceed 64 KB, and that limit can easily be exceeded when using large constants in Cell. That problem has popped up repeatedly during the development of the Java code generator, and the compiler code had to be refactored in a number of places to work around it. The problem can (and will) be fixed easily for simple constants, and with more effort for more complex code. For now, just be aware that the Java compiler may reject the generated code if it contains such large constants or methods, even though everything works just fine when using C++ and C#.
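For readers who haven't run into this limit before: javac compiles a large array literal into element-by-element stores inside a single initializer method, so a big enough constant fails with a "code too large" error. One common workaround, sketched below, is to fill the array from several smaller helper methods, each of which stays under the limit on its own. This is only an illustration of the general technique, not necessarily the fix that will be adopted in the Cell compiler, and the names LargeConstant, TABLE, build and fillChunkN are made up for the example.

```java
// Hypothetical sketch of splitting a large constant across several methods.
class LargeConstant {
    // Tiny here for readability; the problem only appears with many thousands of elements.
    static final int SIZE = 8;

    // The constant is built once, when the class is initialized.
    static final long[] TABLE = build();

    static long[] build() {
        long[] table = new long[SIZE];
        // Each helper fills one slice, so no single method grows past the 64 KB limit.
        fillChunk0(table);
        fillChunk1(table);
        return table;
    }

    private static void fillChunk0(long[] t) {
        t[0] = 10; t[1] = 20; t[2] = 30; t[3] = 40;
    }

    private static void fillChunk1(long[] t) {
        t[4] = 50; t[5] = 60; t[6] = 70; t[7] = 80;
    }

    public static void main(String[] args) {
        System.out.println(TABLE[5]);
    }
}
```

A code generator using this approach would pick the chunk size so that each helper stays comfortably below the limit, at the cost of a few extra method calls during class initialization.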

As far as performance is concerned, the generated Java code is about 5 times slower than the C++ version when running on a single core, and 3 to 4 times slower on a dual-core processor, when measured with the only real benchmark I have: the compiler compiling itself. Somehow the JVM manages to make use of the extra core even though the compiler itself is not multithreaded, and I have no idea how it does that. That's actually pretty impressive of the JVM, considering how much effort went into optimizing the C++ version, and the fact that I haven't even tried to optimize the generated Java code. In any case the current benchmark results are pretty meaningless, since the picture is going to change completely once type-based optimization is implemented. That will be the focus of version 0.2 of the compiler, whose development is now under way.