Skip to content

Type System Reference

Chelis keeps tensor shape and numeric precision explicit in types. There is no implicit broadcasting and no implicit precision promotion. Named dimensions are nominal: two dimensions match only when their names match, so the type checker catches transposition and shape bugs before any code runs. This page is the reference for the type surface. The authoritative source is spec/04-type-system.md.

The numeric primitives are:

TypeMeaning
f3232-bit float
f6464-bit float
bf16bfloat16
f16IEEE 754 binary16
int88-bit signed integer
int1616-bit signed integer
int3232-bit signed integer
int6464-bit signed integer
boolboolean

There are no unsigned integer types. string exists as a type for parsing and host work, and () is the unit type.

A tensor type is written tensor[dims..., elemtype]. The element type is the last entry; everything before it is a dimension. A tensor with no dimensions is a scalar on the device, written tensor[f32].

tensor[f32] -- scalar
tensor[n, f32] -- one named dimension
tensor[batch, seq, f32] -- two named dimensions
tensor[64, 64, f32] -- two literal dimensions

The canonical Deep form spells out each dimension and the precision:

(t-tensor {} (d-name {} batch) (d-name {} seq) (t-prim {} f32))

Dimensions come in three forms: a named dimension (d-name {} batch), a variable (d-var {} a) introduced by a function's [...] clause, and a literal (d-lit {} 512).

Dimension matching is strict and element-wise. A named dimension unifies only with the same name; batch does not unify with seq. A literal unifies only with the same literal. A dimension variable unifies with anything and binds.

A function generic over shape uses dimension variables. There are no explicit dimension arguments; call sites instantiate the variables by unification.

def transpose[a, b](x: tensor[a, b, f32]) -> tensor[b, a, f32] = permute(x, 1, 0)

In a Deep signature this reads:

(defsig {}
transpose
(t-fn {}
(t-tensor {} (d-var {} a) (d-var {} b) (t-prim {} f32))
(t-tensor {} (d-var {} b) (d-var {} a) (t-prim {} f32))))

Declared dimension parameters are rigid inside the function body: the body must type-check for every instantiation, so it cannot couple a result dimension to an unrelated input dimension.

A wildcard dimension * marks a size that is not statically known, for example the axis produced by a concat whose length depends on runtime data. It unifies with anything but is never generalized. Add an explicit annotation to restore named checking.

A rank variable ..r is a name-preserving spread over a run of dimensions, so one definition covers tensors of every rank. It is introduced contextually; a spread name may not repeat in a single shape.

def relu_any_rank(x: &tensor[..r, f32]) -> tensor[..r, f32] = relu(x)

Spreads can interleave with named anchors, which lets a definition reduce or insert one named axis while preserving the rest. Reducing over a named axis:

def reduce_seq(x: &tensor[..pre, seq, ..post, f32]) -> tensor[..pre, ..post, f32] =
sum(x, seq)

The body of a rank-polymorphic definition is restricted to operations whose effect on the shape can be tracked by name: elementwise and shape-identity operations, named-axis reductions (sum, mean, max_reduce, min_reduce, prod_reduce), and named-axis expand. Positional rewriters such as permute, reshape, and matmul are rejected inside a ..r body, which is what preserves the named-dimension safety guarantee.

Chelis does not broadcast. Operands of an elementwise operation must have identical dimension lists. Use expand to add a dimension explicitly before combining tensors of different rank.

-- tensor[batch, hidden, f32] + tensor[hidden, f32] is a type error.
biased = add(linear, expand(b, 0, batch))

expand, reshape, and permute are the explicit tools for changing rank and shape.

Precision never changes implicitly. Both operands of an arithmetic operation must have the same precision; the compiler never inserts a cast for you.

-- add(tensor[d, f32], tensor[d, bf16]) is a type error.
sum = add(x, cast(y, f32))

cast(e, p) is the explicit precision change; it preserves dimensions and changes only the element type. Integer literals default to int32 and float literals default to f32. Three things override a default: a literal suffix, a surrounding known element type, or an explicit cast.

a = cast(x, bf16) -- precision change
b = 1.0f64 -- suffix binds f64
c = cast(3000000000, int64) -- escape hatch for out-of-int32-range literals

The compatibility rules: arithmetic (add, mul, sub, div) needs equal numeric precision; comparison (cmplt, eq) takes equal numeric precision and yields bool; logical operations (and, or, not) take bool; transcendental operations (exp, log, sin, cos, tan, atan, sqrt) take float types only.

matmul and sum accept an optional accumulator precision, the one place mixed precision appears. The default accumulator widens bf16 and f16 inputs to f32 for numerical stability and widens int8 and int16 to int32 for overflow safety; a requested accumulator must be at least as wide as the operands and the default.

A function type is the arrow chain A -> B -> C, where the last entry is the return and the rest are arguments. The arrow is right associative; parenthesize a function-typed argument.

tensor[n, f32] -> tensor[n, f32] -> tensor[f32] -- two args, scalar result
(tensor[n, f32] -> tensor[f32]) -> tensor[f32] -- a function-typed argument

In Deep this is a flat t-fn whose last child is the return:

(t-fn {}
(t-tensor {} (d-name {} n) (t-prim {} f32))
(t-tensor {} (d-name {} n) (t-prim {} f32))
(t-tensor {} (t-prim {} f32)))
  • Tuples: (f32, f32) as a type, (a, b) as a value, projected with .0, .1.
  • Algebraic data types: type Option[a] = | None | Some { value: a }, with positional or record-field payloads. Recursive types refer to themselves by name.
  • Records: constructed with Foo { x: e1, y: e2 } (punning allowed), read with e.field.
  • Type aliases: a type without variants is transparent and expanded at desugaring.
  • Lists: List[T] holds rank-uniform elements; a list of tensors fixes one rank for every element. The list builtins (len, index, append, concat, range, zip, enumerate, map, filter, fold, and others) operate on these.

A function type can carry an effect set in eff metadata. In Surf the effect set is the ! { ... } suffix on a signature or def.

sig predict: tensor[n, f32] -> tensor[n, f32] ! { Random }
(t-fn {eff: (effects {} random)}
(t-tensor {} (d-name {} n) (t-prim {} f32))
(t-tensor {} (d-name {} n) (t-prim {} f32)))

Effect inference runs after type inference. A function's effect set is the union of the effects of the operations in its body. Random comes from dropout and uniform_like and the initializers that call them; it is discharged by with seed(...). IO is inferred from host operations such as print and file reads. Resource("device") marks a region validated against the build target through with device(...). See Effects and Handlers.

Tensor values are owned by default, with lightweight uniqueness rather than full ownership and lifetimes. A consuming use makes the binding dead; the compiler inserts copies where a value fans out and drops where an owned value goes out of scope, so you usually do not write either by hand.

A read-only borrow is a reference type, written &T in a signature and &x at a use site. A borrow leaves the owned binding live and is the idiomatic way to pass a tensor to a read-only operation.

def relu_forward(x: &tensor[..r, f32]) -> tensor[..r, f32] = relu(x)
(t-ref {} (t-tensor {} (d-name {} batch) (t-prim {} f32)))

Passing an owned value where a borrow is expected auto-borrows. Passing a borrow where an owned value is expected is an error unless you write copy(x), which produces a fresh owned value from an owned or borrowed input. Borrows cannot be stored in aggregates, returned, or captured by closures. The borrow target must be a tensor or a tensor-carrying value, where a type is tensor-carrying when one of its fields contains a tensor, transitively through tuples, ADTs, and references.

Type checking is Hindley-Milner inference with tensor and dimension extensions. Annotations are optional; the checker infers literal types, dimension variables at call sites, and effect sets, and verifies any annotation you do supply. What you must still state: the precision argument to cast, the dimension and precision parameters in a [...] clause, an annotation to pin a wildcard dimension back to a name, and a suffix or cast for a literal outside the default precision range.

The pipeline order is parse, desugar, type inference and checking, effect inference and checking, linearity checking, then lowering.