Introducing Numbat
Part I: The type system
Numbat is a programming language that aims to provide a convenient way
to perform computations with physical units.
One of the main goals of the language is to help users write correct programs.
It has a static type system where physical dimensions act as types.
The expression 3 months
, for example, has a type of Time
.
Similarly, the expression 2 years
also has a type of Time
.
The compound expression 3 months + 2 years
is therefore well-typed:
On the other hand, 3 months + 2 lightyears
is ill-typed, because the right-hand side is of type Length
.
You can change ‘years’ to ‘lightyears’ in the editor above to see the resulting compiler error.
Before we describe the type system in more detail, let's look at two simple examples. The first program
computes the oscillation period of a (mathematical) pendulum using the gravitational acceleration
g0
.
Note that you can change the unit of the pendulum length to, say, inch
without
having to change the type annotations. This is in contrast to a lot of unit libraries in other
programming languages that typically lift units — not the physical dimensions — to the type level.
Also, note that you can completely remove the explicit : Type
annotations and let the compiler infer the types:
The second example demonstrates how explicit type annotations
can help avoid mistakes. We want to compute the ratio of gravitational forces exerted on Earth by both the Sun and the Moon.
However, the computation contains a mistake. Try to annotate the definitions of force_sun
and force_moon
with the intended Force dimension
and see if the compiler error can help you fix the equations.
The correct result should be closer to 180
.
A type in Numbat can be one of three things. It can be 1
— which is a special type for dimensionless quantities, it can be a base dimension, or it can be an algebraic combination of other types.
We use the dimension
keyword to introduce new types. For example, this
is how basic physical dimensions from classical mechanics are defined in Numbat's prelude:
Note how the three base dimensions length, time and mass can be combined using multiplication, division and exponentiation. Now there is not a lot we can do with those types until we introduce corresponding units to build quantities that inhabit those types:
Now take a look at the last line. When the type checker encounters an expression like this, it infers
the overall type (Force
) by recursively walking the
syntax tree. Multiplication and division nodes are easy. We simply combine the types
in the same way that we combine the expressions:
type(a * b) = type(a) * type(b) type(a / b) = type(a) / type(b)
However, addition and subtraction are different. If the types on both hand sides are the same, we return that type. If they are not, we raise a type check error:
type(a + b) = if type(a) == type(b) { type(a) } else { error("incompatible dimensions in addition …") }
Comparison operators like ==
or >
also work in a
similar way — can't compare apples and oranges — except that they return a boolean type. You can try this by replacing addition with a comparison operator in the
example above.
Finally, there is exponentiation: a^b
. And it turns out to be quite a bit more complicated.
We first note that the exponent b
needs to be dimensionless. An expression like (2 m)^(3 kg)
is not
meaningful. But even then there is a problem. Imagine something like (2 m)^(1 + 2)
. It should have a type of Length^3
. But this means that the type of the expression depends on the value of the exponent. It turns out we need to evaluate the expression
b
at compile time to determine the type of a^b
:
type(a ^ b) = if type(b) != 1 { error("exponent needs to be dimensionless") } else { if type(a) == 1 { 1 } else { type(a) ^ consteval(b) } }
Numbat therefore has a special const-evaluation mode that can compute a subset of expressions as part of the type-checking phase.
Note that the restriction to a subset of operations is not a limitation in practice. Real world computations often have complicated expressions in the exponent.
But in those cases, the left-hand side (a
) is always dimensionless and the overall type is simply 1
. And when a
is not dimensionless, the exponent expression is typically just a (rational) number.
One interesting application of the exponentiation operator is the sqrt
function that you
saw in the very first example. It can take dimensionful quantities as an argument. And it can be defined in this way:
Here, D
is a generic type parameter. The function takes an argument of type
D^2
and returns a D
. In the example here, it turns
Time^2
into Time
.
In general, type inference for functions is an interesting topic for another day.
Our final example demonstrates how you can make use of custom dimensions and units. This is often useful when you want to count things (people, atoms, pixels, or in this case: drops). Say we want to compute the required IV drip rate for an infusion of 1200 mL of saline solution over 6 hours. Even if it might look like overkill in this simple example, note how the introduction of a new unit not just helps with correctness, but also with readability: