Properties and verification
A loaded ONNX model under Hydronnx is an ordinary Chelis function, so it takes
ordinary Chelis @property blocks. chelis prove verifies them by sampling
inputs from the property's forall quantifier and checking the condition on
each, turning "the model probably behaves like this" into a checked invariant.
This document walks through the worked property from the example
(probability_property.ch.snippet), explains the single-file constraint on
chelis prove, and shows what a failing property looks like.
A positive property: the example case
Section titled “A positive property: the example case”The worked classifier at examples/tabular_classifier/iris_classifier.onnx ends
in Softmax, so its three output scores are a genuine probability distribution
and must sum to 1.0 for any input. That is the property to verify. The snippet:
@property output_is_a_probability_distribution forall(features: tensor[1, 4, f32]): { probs = forward(features) total = tensor_to_scalar(sum(sum(probs, 1), 0)) _ = drop(features) ((total >= 0.999) && (total <= 1.001)) } with samples = 64 with seed = 0The shape is conventional Chelis: the property names a universally quantified
input (features), names the function under verification (forward), and
reduces to a Boolean. The with samples = 64 with seed = 0 clauses pin the
sampler's budget and seed, so a clean run is deterministically reproducible.
The single-file constraint
Section titled “The single-file constraint”chelis prove operates on the single file you hand it. It does not resolve
cross-module imports. A @property that calls forward must live in the
same .ch file as forward. Every property probe in the chelis spike repo is
self-contained for the same reason.
The example does this by appending the property to the emitted module in a fresh build context:
cd examples/tabular_classifiercat probability_property.ch.snippet >> src/irisclassifier.chchelis prove src/irisclassifier.ch --samples 64 --seed 0Expected output:
property: output_is_a_probability_distribution -- 64/64 passed1 passed, 0 failed, 0 unsupported, 0 errors64 of 64 samples satisfied the invariant: the property is verified for this sampler budget.
To restore the example to its pristine starting state, re-run
chelis-hydronnx-emit (which overwrites the appended snippet) or git restore src/irisclassifier.ch. The example's acceptance test does the append in a
tempdir, so the committed file stays clean.
Why the property is genuinely useful
Section titled “Why the property is genuinely useful”The worked classifier's three outputs come from a Softmax, so the sum-to-1
invariant should hold by construction. chelis prove checks that the emitted
host-runtime evaluation actually produces a sum within the tolerance band the
property declares, for 64 random inputs. If an emitter bug ever lost numerical
fidelity (or the final operator was misemitted), the property would flag it as a
counterexample. The property turns the emitted function into something whose
behavior you can formally verify, not just informally trust.
A failing property: what chelis prove reports
Section titled “A failing property: what chelis prove reports”To see what chelis prove does when a property does not hold, append a
property that asserts the same total is >= 1.999 (a false statement for a
softmax output, which sums to 1):
@property output_sums_to_two forall(features: tensor[1, 4, f32]): { probs = forward(features) total = tensor_to_scalar(sum(sum(probs, 1), 0)) _ = drop(features) (total >= 1.999) } with samples = 64 with seed = 0chelis prove on the file containing both properties prints (verbatim, all on
stdout):
property: output_is_a_probability_distribution -- 64/64 passedproperty failure: output_sums_to_two --> src/irisclassifier.ch1 passed, 1 failed, 0 unsupported, 0 errorsand exits 1. The --> <path> line points at the file containing the failing
property: chelis prove's diagnostic for "this is where the failure was
raised". chelis does not currently print the counterexample tuple itself; the
failed-property line plus the path is the evidence. The acceptance test
exercises exactly this output_sums_to_two failure to confirm the verification
machinery is real, not hollow.
Why a failing property is useful
Section titled “Why a failing property is useful”In practice you write the true property and chelis prove either verifies it
(no counterexample within the sample budget) or reports a property failure: <name> for an invariant the model does not satisfy. A failure during
development tells you the model's behavior deviates from your stated invariant: a
real defect, not an artifact. A clean run gives you a statistical guarantee
bounded by the sampler budget; widen --samples for a stronger result.
Property surface for an emitted forward
Section titled “Property surface for an emitted forward”The property machinery is chelis's, not Hydronnx's. Hydronnx adds nothing to the
@property surface beyond making forward available as a typed function the
property can quantify over. In particular:
- Input quantifier types.
forall(<name>: <type>)admits any chelis type. For an emittedforward, the natural choice is the function's declared input type, the example'stensor[1, 4, f32].chelis provesamples random tensors of that type. - Symbolic tensor dims are not sampleable. chelis's sampler rejects a
symbolic tensor dim. If your emitted
forwardhas a symbolic batch dim, you cannot quantify directly overtensor[batch, ...]. Bindbatchto a concrete value first, or use a fully-static fixture. The worked example uses a concrete[1, 4], sidestepping the issue. - Property body. Any chelis-expressible Boolean. The example's body uses
sum,tensor_to_scalar, and comparison builtins, all standard chelis surface. The_ = drop(features)line discards the unused input (chelis tensors are affine). with samplesandwith seed. Standard chelis property clauses.samplesis the budget;seedmakes a run reproducible.
Composability
Section titled “Composability”A property on forward quantifies over forward's input type. A property on a
composed function (classify(raw) in the example) would quantify over
tensor[1, 4, f32] (the raw input) and assert something about the int64 output.
The composability comes for free because the composed function is itself an
ordinary Chelis function.
The single-file constraint applies to whatever the property calls into. If
classify is in a different module from forward, both definitions need to land
in the same file passed to chelis prove (for example by concatenation in a
build script). The example sidesteps this by attaching the property directly to
the emitted module.
What property verification does not do
Section titled “What property verification does not do”- Hydronnx does not emit properties. Properties are user-written. The emit
CLI never produces a
@propertyblock; it produces the typedforwardthat the property quantifies over. - No cross-module
chelis prove. This is a chelis-side constraint; the workaround is the single-file append shown above. - No symbolic-tensor-dim quantification (see above). Concrete shapes only.
- Property verification over AD needs a matching AD gate. Hydronnx gates
grad(forward)for the weight-free fixture and the weighted tabular Gemm fixture. Properties on broader weighted models should be paired with a model-specific AD or finite-difference gate first. See Limitations and scope. Properties on the rawforwardare fine.