Losses

This chapter is a reference for the losses defined under src/loss/, which live under the School.Loss prefix. Each loss maps a prediction and a target, plus any hyperparameters, to a scalar; several classification losses instead return a per-example or per-element tensor that the caller reduces. The signatures below use n for a flat element count, a for a batch axis, and b (or classes) for a class axis.

Classification

School.Loss.CrossEntropy.loss (src/loss/crossentropy.ch): softmax cross entropy from logits against a one-hot target, returning the per-example loss.
```
loss: tensor[a, b, f32] -> tensor[a, b, f32] -> tensor[a, f32]
```
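
For orientation, the computation this signature describes can be sketched in plain Python. This is not School code: the function name softmax_cross_entropy and the nested-list representation of tensor[a, b, f32] are assumptions made for illustration, and the max-subtraction trick is a common stabilisation rather than a confirmed detail of crossentropy.ch.

```python
import math

def softmax_cross_entropy(logits, target):
    """Per-example softmax cross entropy (hypothetical reference, not School's API).

    logits and target are [a][b] nested lists; target rows are one-hot.
    Returns a list of a losses, one per row.
    """
    losses = []
    for row, onehot in zip(logits, target):
        m = max(row)  # subtract the row max so exp() cannot overflow
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        # log-softmax is x - log_z; the loss is -sum(target * log_softmax)
        losses.append(-sum(t * (x - log_z) for x, t in zip(row, onehot)))
    return losses

per_example = softmax_cross_entropy(
    [[0.0, 0.0], [2.0, 0.0]],
    [[1.0, 0.0], [1.0, 0.0]],
)
```

Because the result is one value per row, a caller that wants a scalar still has to reduce it, for example with a mean over the batch axis.
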
School.Loss.Nll.nll_loss (src/loss/nll.ch): negative log-likelihood from log-probabilities against integer class labels.
```
nll_loss: tensor[a, b, f32] -> tensor[a, int64] -> f32
```
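
A rough Python sketch of the same idea follows. The signature returns a scalar but the text does not say how rows are reduced, so the mean used here is an assumption, as are the name nll and the list-based inputs.

```python
import math

def nll(log_probs, labels):
    """Negative log-likelihood over a batch (hypothetical reference, not School's API).

    log_probs is an [a][b] nested list of log-probabilities,
    labels is a list of a integer class indices.
    """
    picked = [row[y] for row, y in zip(log_probs, labels)]
    # ASSUMPTION: mean reduction; the source only says the result is a scalar.
    return -sum(picked) / len(picked)

log_probs = [
    [math.log(0.5), math.log(0.5)],
    [math.log(0.25), math.log(0.75)],
]
value = nll(log_probs, [0, 1])
```
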
School.Loss.LabelSmoothCe.loss (src/loss/labelsmoothce.ch): label-smoothed cross entropy. Takes logits, a one-hot target, and a smoothing eps, returning the per-example loss.
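
Label smoothing has more than one convention, and the text above does not say which one labelsmoothce.ch uses. The sketch below assumes the common form in which the one-hot target is mixed with a uniform distribution, (1 - eps) * onehot + eps / b; the function name and the list-based inputs are likewise illustrative only.

```python
import math

def label_smoothed_ce(logits, target, eps):
    """Per-example label-smoothed cross entropy (hypothetical reference).

    ASSUMPTION: smoothed target = (1 - eps) * onehot + eps / b.
    """
    b = len(logits[0])
    losses = []
    for row, onehot in zip(logits, target):
        m = max(row)  # stabilise the log-sum-exp
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        smoothed = [(1.0 - eps) * t + eps / b for t in onehot]
        losses.append(-sum(s * (x - log_z) for x, s in zip(row, smoothed)))
    return losses

smoothed_loss = label_smoothed_ce([[2.0, 0.0]], [[1.0, 0.0]], 0.2)
plain_loss = label_smoothed_ce([[2.0, 0.0]], [[1.0, 0.0]], 0.0)
```

With eps set to 0 the smoothed target equals the one-hot target, so the result reduces to ordinary softmax cross entropy.
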
School.Loss.Bce.bce_with_logits (src/loss/bce.ch): binary cross entropy computed from logits (numerically stable), returning the per-element loss.
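
One widely used way to make this computation numerically stable is the identity max(x, 0) - x*z + log(1 + exp(-|x|)), which never exponentiates a large positive number. Whether bce.ch uses exactly this form is not stated above, so treat the following Python as an illustration of the technique rather than the library's implementation; the list-based inputs are an assumption.

```python
import math

def bce_with_logits(logits, targets):
    """Per-element binary cross entropy from logits (hypothetical reference).

    Uses max(x, 0) - x*z + log1p(exp(-|x|)), which is algebraically equal to
    -[z*log(sigmoid(x)) + (1-z)*log(1-sigmoid(x))] but cannot overflow.
    """
    return [
        max(x, 0.0) - x * z + math.log1p(math.exp(-abs(x)))
        for x, z in zip(logits, targets)
    ]

# Extreme logits stay finite: a confident correct and a confident wrong prediction.
stable = bce_with_logits([0.0, 1000.0, -1000.0], [1.0, 1.0, 1.0])
```
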
School.Loss.Focal.with_logits (src/loss/focal.ch): focal loss from logits, with a class-balance weight alpha and a focusing exponent gamma.
```
with_logits: &tensor[n, f32] -> &tensor[n, f32] -> f32 -> f32 -> tensor[n, f32]
```
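
As a hedged illustration, the binary focal loss is commonly written as alpha_t * (1 - p_t)^gamma * CE, where p_t is the probability assigned to the true class. The sketch below assumes that form, assumes hard targets of 0 or 1, and assumes the two f32 arguments arrive in the order alpha, gamma; none of these details is confirmed by the signature alone.

```python
import math

def focal_with_logits(logits, targets, alpha, gamma):
    """Per-element focal loss from logits (hypothetical reference).

    ASSUMPTIONS: targets are exactly 0.0 or 1.0; argument order is alpha, gamma.
    """
    out = []
    for x, z in zip(logits, targets):
        # Stable binary cross entropy from the logit.
        ce = max(x, 0.0) - x * z + math.log1p(math.exp(-abs(x)))
        # For hard targets ce = -log(p_t), so p_t = exp(-ce) without overflow.
        p_t = math.exp(-ce)
        alpha_t = alpha * z + (1.0 - alpha) * (1.0 - z)
        out.append(alpha_t * (1.0 - p_t) ** gamma * ce)
    return out

focal = focal_with_logits([0.0, 4.0], [1.0, 1.0], 0.25, 2.0)
```

The (1 - p_t)^gamma factor shrinks the loss of well-classified elements, so the second entry above is down-weighted far more than the first.
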
School.Loss.KlDiv.kl_divergence (src/loss/kldiv.ch): Kullback-Leibler divergence between two distributions, returning a scalar.
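
The entry above does not give a signature, so the argument order and whether the inputs are probabilities or log-probabilities are unknown. As a sketch under the assumption that both inputs are probability vectors p and q and the result is KL(p || q) = sum(p * log(p / q)):

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions (hypothetical reference).

    ASSUMPTION: p and q are probabilities, not log-probabilities.
    Terms with p_i == 0 contribute 0, following the usual 0*log(0) = 0 convention.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

same = kl_divergence([0.5, 0.5], [0.5, 0.5])
peaked = kl_divergence([1.0, 0.0], [0.5, 0.5])
```

The divergence is zero when the distributions match and is not symmetric in its arguments, which is why the argument order matters when calling the real function.
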

Regression

School.Loss.Mse.mse_loss (src/loss/mse.ch): mean squared error.
School.Loss.Mae.mae_loss (src/loss/mae.ch): mean absolute error.
School.Loss.Huber.huber_loss (src/loss/huber.ch): Huber loss, with a threshold delta that switches between the squared and absolute regimes.
```
mse_loss: tensor[n, f32] -> tensor[n, f32] -> f32
huber_loss: tensor[n, f32] -> tensor[n, f32] -> f32 -> f32
```
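
To make the role of delta concrete, here is a plain-Python sketch of the three regression losses. The function names, list inputs, and the mean reduction for Huber are assumptions, as is the usual piecewise form 0.5 * r^2 for |r| <= delta and delta * (|r| - 0.5 * delta) otherwise.

```python
def mse(pred, target):
    """Mean squared error over n elements."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def mae(pred, target):
    """Mean absolute error over n elements."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)

def huber(pred, target, delta):
    """Huber loss (hypothetical reference; mean reduction is an ASSUMPTION)."""
    total = 0.0
    for p, t in zip(pred, target):
        r = abs(p - t)
        if r <= delta:
            total += 0.5 * r * r                # squared regime near zero
        else:
            total += delta * (r - 0.5 * delta)  # absolute regime for outliers
    return total / len(pred)

pred, target = [1.0, 2.0, 3.0], [1.0, 2.0, 5.0]
results = (mse(pred, target), mae(pred, target), huber(pred, target, 1.0))
```

In this example the single residual of 2 exceeds delta = 1, so Huber treats it linearly and lands between the mean absolute and mean squared error.
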

A call to mse_loss, taken from examples/p1/mse_smoke.ch:

import School.Loss.Mse (mse_loss)
...
v = mse_loss(pred, target)

Embedding and ranking

School.Loss.CosEmbed.cosine_embedding (src/loss/cosembed.ch): cosine embedding loss between two vectors, with a target of +1 or -1 and a margin.
```
cosine_embedding: tensor[n, f32] -> tensor[n, f32] -> f32 -> f32 -> f32
```
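
A minimal Python sketch of the usual cosine embedding formulation follows: 1 - cos for a target of +1, and max(0, cos - margin) for a target of -1. That formulation, the function name, and the argument order (first vector, second vector, target, margin) are assumptions read off the description above, not confirmed details of cosembed.ch.

```python
import math

def cosine_embedding(x1, x2, target, margin):
    """Cosine embedding loss between two vectors (hypothetical reference).

    ASSUMPTION: target is +1.0 or -1.0 and both vectors are non-zero.
    """
    dot = sum(a * b for a, b in zip(x1, x2))
    norm1 = math.sqrt(sum(a * a for a in x1))
    norm2 = math.sqrt(sum(b * b for b in x2))
    cos = dot / (norm1 * norm2)
    if target > 0:
        return 1.0 - cos             # pull similar pairs together
    return max(0.0, cos - margin)    # push dissimilar pairs below the margin

similar = cosine_embedding([1.0, 0.0], [0.0, 1.0], 1.0, 0.5)
dissimilar = cosine_embedding([1.0, 0.0], [1.0, 0.0], -1.0, 0.5)
```
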
School.Loss.Triplet.margin (src/loss/triplet.ch): triplet margin loss over an anchor, a positive, and a negative, with a margin m and a norm exponent p.
```
margin: tensor[n, f32] -> tensor[n, f32] -> tensor[n, f32] -> f32 -> f32 -> f32
```
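
The triplet margin loss is usually defined as max(0, d(anchor, positive) - d(anchor, negative) + m), with d the p-norm distance. The Python below sketches that definition; the function name, the list inputs, and the assumption that the two trailing f32 arguments are m followed by p are illustrative.

```python
def triplet_margin(anchor, positive, negative, m, p):
    """Triplet margin loss with a p-norm distance (hypothetical reference).

    ASSUMPTION: trailing arguments are the margin m, then the norm exponent p.
    """
    def dist(u, v):
        # p-norm of the element-wise difference
        return sum(abs(a - b) ** p for a, b in zip(u, v)) ** (1.0 / p)

    return max(0.0, dist(anchor, positive) - dist(anchor, negative) + m)

anchor = [0.0, 0.0]
near = [3.0, 4.0]    # Euclidean distance 5 from the anchor
far = [6.0, 8.0]     # Euclidean distance 10 from the anchor
satisfied = triplet_margin(anchor, near, far, 1.0, 2.0)
violated = triplet_margin(anchor, far, near, 1.0, 2.0)
```

The loss is zero once the negative is at least m farther from the anchor than the positive, so only violating triplets produce a gradient.
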

Metrics

School.Loss.Metrics (src/loss/metrics.ch) provides two reporting metrics:

accuracy(logits, labels): the fraction of rows whose argmax matches the integer label.
perplexity(loss): the exponential of a cross-entropy loss.

These are for reporting only and are not differentiated. The training metrics in "Training loop and data" cover precision, recall, F1, and the multiclass variants.
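
The two metrics described above are simple enough to sketch in a few lines of Python. The function bodies, list inputs, and tie-breaking rule (first maximum wins) are assumptions for illustration and not the contents of metrics.ch.

```python
import math

def accuracy(logits, labels):
    """Fraction of rows whose argmax equals the integer label (hypothetical reference)."""
    hits = 0
    for row, y in zip(logits, labels):
        # argmax over the class axis; ties resolve to the first maximum (ASSUMPTION)
        predicted = max(range(len(row)), key=lambda j: row[j])
        if predicted == y:
            hits += 1
    return hits / len(labels)

def perplexity(loss):
    """Exponential of a cross-entropy loss measured in nats."""
    return math.exp(loss)

acc = accuracy([[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]], [1, 0, 0])
ppl = perplexity(math.log(4.0))
```

A perplexity of 4 corresponds to a model that is, on average, as uncertain as a uniform choice among four classes.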