Unpacking Options Values: A Case Study in Language Design

Language design happens at many levels of granularity. This post looks at a fine-grained example.

I mentioned KernelF before. It is a functional language whose purpose is to be embedded into DSLs, according to the third pattern described in the previous post. KernelF has option types. This post is about the design of the syntax used to unpack options.

Option Types

Option types are used to handle null values in a type-safe way. The constant maybe in the code below can either be an actual number value, or nothing, represented by none, depending on the value of aBool. The if expression then produces either none or 42. This is why the constant is typed as an option<number> instead of just number.

val maybe : option<number> = if aBool then 42 else none

Most operators, as well as many dot operations, are overloaded to also work with option<T> if they are defined for T. If one of the arguments is none, then the whole expression evaluates to none. In this sense, a none value “bubbles” up. Note that the type system represents this: the results of the + operator and the length call in the example below are also typed as options!

val nothing   : option<number> = none
val something : option<number> = 10
val noText    : option<string> = none

assert nothing + 10   equals none
assert something + 10 equals 20
assert noText.length  equals none
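
The bubbling behaviour can be imitated outside KernelF. The following is a minimal Python sketch, not KernelF's implementation: it models option<T> as Optional[T] with None standing in for none, and the helper names lift2 and opt_length are made up for this illustration.

```python
from typing import Callable, Optional, TypeVar

A = TypeVar("A")
B = TypeVar("B")
R = TypeVar("R")


def lift2(op: Callable[[A, B], R]) -> Callable[[Optional[A], Optional[B]], Optional[R]]:
    """Lift a binary operation so that a None argument yields None."""
    def lifted(a: Optional[A], b: Optional[B]) -> Optional[R]:
        if a is None or b is None:
            return None  # the none value "bubbles" up
        return op(a, b)
    return lifted


def opt_length(s: Optional[str]) -> Optional[int]:
    """Hypothetical stand-in for the overloaded length dot operation."""
    return None if s is None else len(s)


# The lifted + operator: its result type is Optional[int], not int.
add = lift2(lambda a, b: a + b)

nothing: Optional[int] = None
something: Optional[int] = 10
no_text: Optional[str] = None
```

Calling add(nothing, 10) gives None, add(something, 10) gives 20, and opt_length(no_text) gives None, mirroring the three assertions above.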

The language design issue we address in this post is: how do you extract a value of type T from an option<T> after testing that it actually contains a T instead of a none?

The Starting Point

We started with a first-class concept with some, plus an expression val that would provide access to the optioned value if it is not none. Having a first-class concept makes analyses simple to build: a check for some is easy to recognise because the language concept directly expresses it.

fun f(x: option<number>) = with some x => val none 10

The example above returns the value inside the option, or 10 if the option is none. We also experimented with dot expressions to access the optioned value:

fun f(x: option<number>) = with some x => x.val none 10

This second version would not work for complex expressions such as function calls, since repeating the complex expression before the dot is syntactically ugly and leads to errors if the called function has side effects. We decided on the first alternative.


However, this alternative results in a problem if several with some expressions are nested, because val would be ambiguous. The name of the expression used to refer to the value must be changeable. One solution would be to define a value explicitly:

fun f(x: number, y: number) = {
  val xval = with some maybe(x) => val none 10
  with some maybe(y) => val + xval none 20
}

However, this is too verbose. We came up with two versions of an abbreviation to define names for the tested value:

fun f(x: number) = with some v = maybe(x)   => v none 10
   -- or --
fun f(x: number) = with some maybe(x) as v  => v none 10

We preferred <expr> as <name> over <name> = <expr> because it cannot be confused with an assignment (which we do not support in KernelF, but people’s mental parsers still recognise it). It is also easier from the perspective of the user, because the name can be added after the expression the user wants to test (both syntactically and in terms of typing sequence). Finally, KernelF already has a facility for optionally naming things with an as suffix. The above can then be written as:

fun f(x: number, y: number) = {
  with some maybe(x) as xval
     => with some maybe(y) as yval => xval + yval
                                      none 0
     none 0
}

To avoid the annoying nesting, we allowed comma-separated tests:

fun f(x: number, y: number) =
   with some maybe(x) as xval, maybe(y) as yval
      => xval + yval none 0
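
To make the semantics of the comma-separated form concrete, here is a rough Python sketch. It is only an analogy: with_some is a made-up helper, not a KernelF API, and maybe is a hypothetical stand-in (the post does not define it) that returns None for negative inputs.

```python
from typing import Callable, Optional, TypeVar

R = TypeVar("R")


def with_some(*options: Optional[int], then: Callable[..., R], none: R) -> R:
    """Call `then` with the unpacked values if all are present, else return `none`."""
    if any(o is None for o in options):
        return none
    return then(*options)


def maybe(x: int) -> Optional[int]:
    """Hypothetical stand-in for maybe(): negative inputs give None."""
    return x if x >= 0 else None


def f(x: int, y: int) -> int:
    # Corresponds to: with some maybe(x) as xval, maybe(y) as yval => xval + yval none 0
    return with_some(maybe(x), maybe(y), then=lambda xval, yval: xval + yval, none=0)
```

A single none branch covers all tested values, which is what removes the nesting.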

Using if Expressions

The first-class concept with some turned out to be disliked by users: it introduces new keywords for something where users intuitively wanted to use the existing if. So we allowed the if expression to be used, again with the same variations:

fun f(x: option<number>) = if some(x)     then val   else 10
fun f(x: option<number>) = if some(x)     then x.val else 10
fun f(x: number) = if some(maybe(x))      then val   else 10
fun f(x: number) = if some(maybe(x) as v) then v     else 10

A problem with using the existing if expression is that users can construct arbitrarily complex expressions, such as the following:

fun f(x: option<number>) =
  if some(x) || g(x) then val else 10

In this case it cannot (easily) be statically checked that inside the then branch, x always has a value. To enforce this, we ensure that the some expression is the topmost expression in the if; it cannot be combined with others. This is trivial to check structurally and avoids the need for advanced semantic analysis of complex expressions.
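
Such a structural check could look roughly like the following Python sketch. The tiny AST (Some, Call, Or) and the function names are invented for illustration and are not how KernelF or MPS represent expressions; the point is only that the rule is a shallow look at the tree rather than a semantic analysis.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Some:
    """A some(<expr>) test."""
    tested: str


@dataclass
class Call:
    """Any other Boolean expression, e.g. g(x)."""
    name: str


@dataclass
class Or:
    """A disjunction of two conditions."""
    left: "Expr"
    right: "Expr"


Expr = Union[Some, Call, Or]


def contains_some(e: Expr) -> bool:
    """True if a some test occurs anywhere in the expression."""
    if isinstance(e, Some):
        return True
    if isinstance(e, Or):
        return contains_some(e.left) or contains_some(e.right)
    return False


def check_condition(cond: Expr) -> bool:
    """Accept a some test only if it is the topmost expression of the condition."""
    if isinstance(cond, Some):
        return True
    return not contains_some(cond)
```

Under these assumptions, check_condition(Some("x")) is accepted, while check_condition(Or(Some("x"), Call("g"))) is rejected, matching the problematic example above.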

Options as Booleans

We had the idea of interpreting an option type as Boolean to avoid the need to write some:

fun f(x: option<number>) = if x then val else 10

However, we discarded this option because, for our target audience, we think that too much type magic is too complicated. Another idea was to use the name of the tested variable (if it is a simple expression) in the then part, and type it to the content of the option. This would allow the following syntax:

fun f(x: option<number>) = if some(x) then x else 10

This is harder to implement because the type of x now differs depending on its location in the source, which is not easily possible with MPS’ type system. Alternatively, the second x could be made a different language concept (which comes with a different type), but then one has to prevent the use of the original x in the then part. This would require all reference concepts to be aware of the mechanism; every scoping function would have to call a filter method. While this makes language extension a little bit harder (users have to call the filtering function), we decided that this is worth it: since one cannot do anything else inside the then part, providing the “unpacked” value there makes sense.

Final Design

We settled on the following syntax. The if conforms to users’ expectations, the as avoids confusion with assignments, and we provided the magic of “automatic unpacking” inside the then part:

fun f(x: option<number>) = if some(x)     then x else 10
fun f(x: number) = if some(maybe(x)) as v then v else 10

For multiple tested values we now use && instead of the comma, because the && is used in logical expressions already as a conjunction; note that other logical operators are not supported on some tests.

fun f(x: number, y: option<number>) =
   if some(maybe(x)) as xval && some(y)
      then xval + y else 0
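
For readers who know mainstream languages, the final design is close to the narrowing that a Python type checker such as mypy performs after an `is not None` test. The sketch below is an analogy under that assumption, not KernelF code; maybe is again a hypothetical stand-in that returns None for negative inputs.

```python
from typing import Optional


def maybe(x: int) -> Optional[int]:
    """Hypothetical stand-in for maybe(): negative inputs give None."""
    return x if x >= 0 else None


def f1(x: Optional[int]) -> int:
    # Like `if some(x) then x else 10`: x is treated as an int inside the branch.
    if x is not None:
        return x
    return 10


def f2(x: int) -> int:
    # Like `if some(maybe(x)) as v then v else 10`: the tested value gets a name.
    if (v := maybe(x)) is not None:
        return v
    return 10


def f3(x: int, y: Optional[int]) -> int:
    # Like `some(maybe(x)) as xval && some(y)`: both tests must succeed.
    if (xval := maybe(x)) is not None and y is not None:
        return xval + y
    return 0
```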

For the common case where one “just” wants to get the value in the option, or an alternative otherwise, the ^: operator supports this in a very concise way:

val aNumber = maybe(x) ^: 0
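
The behaviour of such a default operator fits in one small function. This Python sketch uses the made-up name or_else and models none as None; it is an illustration of the idea, not the KernelF operator itself.

```python
from typing import Optional, TypeVar

T = TypeVar("T")


def or_else(opt: Optional[T], default: T) -> T:
    """Return the value inside the option, or `default` if there is none."""
    return default if opt is None else opt
```

The explicit None test matters here: a shortcut like `opt or default` would wrongly replace a present value of 0 with the default.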

Why not the familiar Matching?

Many functional languages use case matching to deal with options. The reason for this is that option types are often implemented as algebraic data types (ADTs), and case matching is a natural way to process them. However, KernelF does not have ADTs (because their purpose is to build custom abstractions, which is not what KernelF is intended to do), so users do not know about pattern matching. Also, there is no general pattern matching syntax that could be reused.

Wrap Up

I wanted to publish this one because it shows that language design can extend to very fine-grained language features, and is not just about “domain abstractions”. It also shows how user expectations (from some to if) have to be balanced with implementation effort (treating options as Booleans). Oh, and here’s the unrelated picture. Almost forgot it :-)