The Cell Programming Language

Methods

Methods of relational automaton are just normal functions which are given read-only access to the current state of an automaton instance. Since the states of automaton instances are not ordinary values that can be passed around as parameters (remember, automaton variables cannot be aliased) methods have to be declared inside a using block, as shown here:

using Supply {
  [SupplierId] lowest_price_suppliers(PartId pid) {
    prices = [s, p : s, p <- unit_price(?, pid, ?)];
    return [] if prices == [];
    min_price = only(min_by([p : s, p <- prices], untag));
    return [p : p, s <- prices, s == min_price];
  }

  [PartId, SupplierId] lowest_price_suppliers = [
    p, s : p <- part, s <- lowest_price_suppliers(p)
  ];
}

A using block can contain any number of methods, and an automaton can have any number of using blocks. Method are invoked with the same syntax used in object-oriented languages: if a is an automaton variable and m is a method, it can be invoked by writing a.m(..) if m has any arguments, or a.m otherwise. The same is true of all automaton variables, including nested automata. Methods of a given automaton can of course invoke each other, with no need (and no way) to explicitly specify the target, and a reference to the automaton instance is passed along implicitly, just like the this/self argument in object-oriented languages.

Methods can freely manipulate ordinary member variables of the automaton they both the automaton they're associated with and all of its dependees, but they need to adhere to the no-aliasing rule for mutable relation variables. We've already seen which operations can be performed on mutable relation variables and which cannot, but if you're unsure whether something is permitted or not, just try to do it: the compiler will stop you anyway if you're doing something wrong.

Accessing stored and computed data in a uniform way

Mutable relation variables benefit from the same syntactic sugar that is available for ordinary relation values. Lookup operations of the form:

address(sid, !)
unit_price(sid, pid, !)

can be rewritten more concisely like so:

address(sid)
unit_price(sid, pid)

which makes them look like function/method calls, therefore providing a uniform syntax to access both stored attributes and computed ones. Here's an example:

schema People {
  person(PersonId)
    first_name : String,
    last_name  : String;
}

using People {
  String full_name(PersonId id) = first_name(id) & " " & last_name(id);
}

Now first_name, last_name and full_name for a given person_id can be accessed in exactly the same way:

first_name(person_id)
last_name(person_id)
full_name(person_id)

Nondeterminism

Function in Cell are, like in all pure functional languages, referentially transparent, which is just another way to say that they're deterministic: calling a function with the same inputs must always produce the same result. Unfortunately, that's not entirely true for methods. While methods cannot have any side effects, they're not entirely deterministic because they manipulate data structures whose elements are intrinsically unordered. Whenever you iterate through the elements of a mutable relation variable, the language does not guarantee a deterministic iteration order so if you're not careful you may end up writing nondeterministic code. Here's a couple of examples:

schema Measurements {
  measurement(MsrtId)
    value : Float;
}

using Measurements {
  Float avg_value {
    count = 0;
    sum = 0.0;
    for id, v <- measurements {
      count = count + 1;
      sum = sum + v;
    }
    return sum / count;
  }

  Float* ord_values {
    values = ();
    for id, v <- values
      values = (values | v);
    return values;
  }
}

The avg_value method iterates through all measurement values in an unspecified order: calling it on two different automaton instances can (and generally will) result in different iteration orders, even when the two instances have exactly the same logical state. Since floating point additions (unlike integer ones) are not associative and therefore sensitive to the order they're performed in, calling avg_value on two automaton instances with the same state can produce different results. ord_values shows a more blatant example of nondeterminism: all measurement values are appended to the result sequence in the order they're iterated on, so in this case we would be unlikely to get the same result from two distinct but logically identical automaton instances.

The same problem of course arises for ordinary relation variables, but in that case the language does guarantee a deterministic iteration order, even though such an order is implementation defined and may change from one version of the language to the next. But mutable relation variables are implemented differently under the hood, and they forgo determinism in order to provide optimal performance. I'm not aware of a way to provide deterministic iteration that doesn't impact performance or memory consumption to some extent, so this problem is probably not going away completely, but future version of the compiler will provide an option to make the behavior of methods entirely deterministic that will come with performance trade-offs.