Interfacing with Java
When you compile a Cell program, the compiler does not generate a binary executable. Instead, it generates a number of text files containing code in the chosen output language. We'll examine the Java code generator here. If you define a Main(..)
procedure in you Cell code, the generated code can then be handed over to a Java compiler to generate a jar file (or just a set of class files). That's how for instance the Cell compiler itself is built, and also the simplest way to build a program that tests your automata. But if you don't define a Main(..)
procedure, the compiler will just generate a set of classes, one for each type of automaton in your Cell code, that can be used to instantiate and manipulate the corresponding automata from your Java code. The compiler will also generate a text file named interfaces.txt
which documents the interfaces of the generated classes in pseudo-Java code. In this chapter we'll go through the interface of the generated classes and explain how to use them, and what each of their methods is for.
It's often a good idea not to use the classes that are generated for each Cell automaton directly, but to derive your own classes from them, and add new methods (and/or member or class variables, if needed) there. Another slightly more laborious alternative is to write a wrapper class for them. Among other things, if a method of the generated classes requires some sort of manual data conversion, you really don't want to repeatedly perform those conversions all over your codebase: it's much better to have all of them in just one place. The best thing to do is probably to define an overloaded version of the same method, or a similar one, in the derived class, and have it take care of all data conversions before and/or after invoking the generated one.
Data conversion
The biggest issue one encounters when interfacing two languages as different as Cell and Java is converting data back and forth between the two native representations. Fortunately, the compiler does most of the heavy lifting for you. For starters, there's a number of simple Cell data types that are mapped directly to a corresponding Java type. They are shown in the following table:
Cell | Java |
---|---|
Int | long |
Float | double |
Bool | boolean |
String | String |
Date | java.time.LocalDate |
Time | java.time.LocalDateTime |
T* | T[] |
[T] | T[] |
[K -> V] | java.util.HashMap<K, V> |
any_tag(T) | T |
Maybe[T] | T / null |
The first six entries in the above table are self-explanatory. The mapping for Cell sequences and sets is also straightforward, as both are mapped to Java arrays. For example, the Cell type Int*
is mapped to long[]
in Java, and [Date]
is mapped to java.time.Date[]
. The last three entries require a bit of an explanation.
- Cell maps become instances of
HashMap
in Java, and the mapping is applied recursively to its parameters, just like with sequences or sets. For instance,[String -> Float*]
is mapped toHashMap<String, double[]>
. You need to be careful here though asHashMap
works as intended only if the type of the key implements a meaningfulhashCode()
method. The Cell type[Int* -> String]
, for example, will be mapped toHashMap<int[], String>
, which is problematic since in Java two different arrays that happen to have the same content will generally have different hashcodes. Lookups will not work, and you might inadvertently insert duplicate keys in aHashMap
instance. - Tagged types can be mapped directly to a Java type only if the tag is a known symbol. In this case, the tag is simply ignored and the mapping of the untagged value is used. A type like
<user_id(Int)>
, for example, will be mapped to along
in Java, and the generated code will take care of adding or removing theuser_id
tag as needed. - The
Maybe[T]
type is mapped directly to the type of its parameter, and the value:nothing
is mapped tonull
.Maybe[String]
, for example, is mapped toString
, and the value:just("Hello!")
is returned as the Java string"Hello!"
. Since primitive types are not nullable, they are replaced with the corresponding class types:Maybe[Int]
for example, is mapped toInteger
, andMaybe[Float]
toDouble
.
For data types that are not in the above table, the compiler tries to generate a Java class for them. It does so for symbols, tuples, records and union types. All the generated types are documented in the interface.txt
file. As an example, given the following Cell code:
this is what the compiler will generate, in the pseudo-Java code used in interfaces.txt
:
The first Cell type, Point
, is mapped to a Java class by the same name. The member variables item1
and item2
correspond to the first and second field of the tuple respectively. For a union type like Shape
an empty interface is created, and each of the alternatives in the union is mapped to its own type, which implements the interface. The mapping for records and tagged records like Square
or Circle
is obvious. Symbols like empty_list
are mapped to their own singleton class, EmptyList
in this case. Such classes have a private constructor, and their only instance can be accessed through the singleton
class variable. Generic types like List[T]
are never mapped directly to Java types, only their instances are. In this specific case List[Shape]
is mapped to List_Shape
. All generated classes and their member or class variables are public, and they're part of the net.cell_lang
package. They also overload the toString()
method so that it returns the standard textual representation for the corresponding Cell value. Point.toString()
, for example, will returns strings like "point(12, -5)"
.
The exact rules that are used to map a Cell type to a corresponding Java type are rather complex, and not yet final, so they won't be described here, but the final result is generally just what you would expect. There's only a couple things that need explaining. If you use "inline" tuple or record types the compiler will get a bit creative with their names. For example, for an inline tuple type like (Int, Point, String)
the compiler will generate a Java class named Long_Point_String
, and an inline record type like (x: Float, y: Float, z: Float)
will be mapped to a Java class named X_Y_Z
. Union types can be mapped to a generated native type only if they only contain symbols or tagged values. The compiler for example will not be able to generated a Java equivalent for the following type:
Finally, when any type of naming conflict arises, the compiler fixes it by adding underscores and/or a unique number at the end of the type name, so don't be surprised if you find generated types with names like MyType_
or MyType_2
.
Every generated type is declared in its own file, and of course there's nothing stopping you from replacing the generated classes with your own, as long as they have the same names, are part of the same net.cell_lang
package, and you don't remove or change the type of its member variables. You can of course add all the member variables and methods you need, which will be ignored by the generated code. Just be careful because the Cell compiler will regenerate those classes every time, and will overwrite your own if they are in the wrong directory. The simplest thing to do is probably to delete the generated classes you want to replace right after running the Cell compiler, as part of the build process, and keep their replacements somewhere else in the CLASSPATH
.
When everything else fails, values are passed back and forth as strings containing their standard textual representation. That's neither elegant nor particularly efficient, but when passing data from Java to Cell it's at least simple and straightforward, since generating those strings is usually trivial. In the other direction things are a bit more complicated, as you may have to parse those strings, and parsing is a lot more complex than printing. A Java library that will assist with that task will be provided soon, but remember that exchanging data in textual form is just a last resort, one that is very easy to avoid in practice.
Relational automata
Let's now take a look at the classes that are generated when a relational automaton is compiled. We'll make use of a very simple one we've seen before, Counter
:
This is the interface of the corresponding Java class generated by the Cell compiler:
As you can see, the generated Java class has the same name of the Cell automaton it derives from, and it belongs to the net.cell_lang
package. The first three methods, load(..)
, save(..)
and execute(..)
, and the two fields onSuccess
and onFailure
are the same for all relational automata. All other methods are different for each automaton, and are used to send specific types of messages to it or to invoke its (read-only) methods.
The save(..)
method is the equivalent of the Save(..)
procedure in Cell, in that it takes a snapshot of the state of the automaton, which is written to the provided java.io.Writer
object. The state is saved in the standard text format used for all Cell values.
load(..)
is used to set the state of an automaton instance, which is read from a java.io.Reader
object, and is the equivalent of the Load(..)
procedure in Cell. It can be used at any time in the life of the automaton instance, any number of times. The new state has to be provided in the standard text format. Here's an example:
If the provided state is not a valid one, load(..)
will throw an exception. In that case, the automaton instance will just retain the state it had before, and will still be perfectly functional.
execute(..)
is used to send the automaton a message, which has to be passed in text form. A few examples:
Errors handling works in the same way as with load(..)
. If an error occurs an exception will be thrown, but the automaton will remain fully operational, and its state will be left untouched.
The next four methods, incr()
, decr()
, reset()
and reset(long)
provides another way to send messages to the automaton:
They are a lot faster than execute(..)
, and usually they're more convenient too. There are cases though when the ability to generically send a message of any type to an automaton is crucial, so that's why the compiler provides two ways of doing the same thing.
When updating an automaton instance keep in mind that Cell does not (yet) provide a way to incrementally persist its state: every time you call the save(..)
method the entire state of the automaton is saved. That's an expensive operation so typically you'll be performing it only once in a while. That means that you would have unsaved data in memory most of the time, which is of course at risk of being lost in the event of a crash. One simple and efficient way to avoid that is to store the list of messages that were received since the last save. If the application crashes, all you need to do when you restart it is to load the last saved state and re-send all the messages it received after that. That will recreate the exact same state you lost in the crash.
The onSuccess
field is meant to help with that. If it's not null
(which is the initial value) the lambda object it points to is called every time a message is processed successfully, and the message itself is passed to it in textual form. The field onFailure
on the other hand is invoked whenever a message handler fails. Saving those failed messages is not necessary for persistence, but it's typically useful for debugging.
The last three methods, value()
, updates()
and isGreaterThan(..)
, are just wrappers for the corresponding (read-only) methods of Counter
.
Reactive automata
We'll use Switch
as our first example. This is the interface of the corresponding generated class:
The first thing to note here is the two enumerations Input
and Output
, whose elements are the uppercase version of the names of the inputs and outputs of Switch
. These are used in conjunction with the methods setInput()
and readOutput()
as shown here:
As an alternative to setInput(..)
and readOutput(..)
, which can operate on any input or output and use the textual representation of a value as an exchange format, the generated class also provides another set of methods each of which can manipulate a single input or output, but that are more convenient to use in most cases. The above code snippet can be rewritten as follow:
The readState()
and setState(..)
methods work in the same way as with relational automata, but with the limitations we've already discussed for time-aware automata. The method changedOutputs()
returns a list of outputs that have changed (or have been activated, in the case of discrete outputs) as a result of the last call to apply()
:
The last thing we need to see is how to deal with time-aware automata. We'll use WaterSensor
:
The only differences here, apart from the input setters and output getters which are obviously specific to each automaton type and the generated types for Maybe[Bool]
and WaterSensorState
, are the two extra methods setElapsedSecs(..)
and setElapsedMillisecs(..)
and the fact that apply()
now returns a boolean value. The former are the equivalent of the elapsed
instruction in Cell, and the value now returned by apply()
has the same meaning as the one returned by the apply
instruction in a Cell procedure. Here's an example of how to update an instance of WaterSensor
: