Migration from ONNX Runtime

A working guide for users who already run their .onnx under ONNX Runtime (ORT) and are considering Hydronnx. The aim is genuine usefulness: what maps cleanly, what is structurally different, what is a strict gain, and what is a strict loss.

If ONNX Runtime is the right tool for your model, this guide says so directly.

Move to Hydronnx if:

  • Your model is a small to moderate tabular, encoder-only, or elementwise graph (the emittable categories).
  • You want compile-time shape checking at every call site.
  • You want property verification: chelis prove checks invariants such as "softmax outputs sum to 1" on a sampled input domain.
  • You want to compose the model into hand-written Chelis with end-to-end type-checked pipelines.
  • You want AD on a gated model surface. Hydronnx gates the weight-free fixture and weighted tabular Gemm; broader surfaces need their own gradient gate.

Stay on ONNX Runtime if:

  • Your model is a broad vision model (ResNet, MobileNet, YOLO, and so on). Hydronnx emits a static 2D Conv subset, but larger vision and detection models need more operator-surface and weight-format work.
  • Your model is multi-megabyte. The inline-emit path emits weights as source literals, which does not scale.
  • Your inference loop is throughput-bound. chelis does not fuse operators, and ORT has years of head start on raw speed.
  • You need GPU or mobile NPU deployment. Hydronnx's runtime is chelis's host runtime; no kernel offload.
  • You need AD on a weighted model surface Hydronnx has not gated. Weighted tabular Gemm AD is covered; broader surfaces need their own gradient gate.
  • You need control-flow ops (If, Loop, Scan). Out of scope.

See Limitations and scope for the full surface; the rest of this guide fills in the migration mechanics for the cases where the move is the right call.

The high-level shape (take an .onnx, get a callable function) is the same in both tools. A side-by-side concept map:

| ONNX Runtime concept | Hydronnx equivalent |
| --- | --- |
| onnxruntime.InferenceSession(path) | chelis-hydronnx-emit <path> (build-time), then import Hydronnx.<Name> (forward) in your Chelis code |
| session.run(None, {"input": x}) | forward(x): an ordinary typed Chelis function call |
| Input tensor (numpy / torch) | A chelis tensor literal at the call site, type tensor[..., dtype] |
| Output tensor | The return value of forward, typed by the model's ONNX output |
| session.get_inputs() / get_outputs() | The emitted module's sig forward: <in> -> <out> |
| Model inspection (onnx.checker.check_model) | chelis-hydronnx-inspect <path> |
| Per-op fusion / optimization | chelis's pipeline (no fusion) |

If you have used ORT, the verbs translate. The nouns translate too, just shifted from runtime to build-time.

What is structurally different: build-time emit versus runtime session

This is the single biggest mental shift. ORT loads a model at runtime into a Python or C++ process and dispatches per-call. Hydronnx generates a Chelis function module ahead of time and your Chelis program compiles it in.

ORT (runtime):              Hydronnx (build-time emit):

build my_app                chelis-hydronnx-emit model.onnx -o forward.ch
     |                           |   (once, before build)
run my_app                  chelis build my_app   (forward.ch is part of source)
     |                           |
ORT loads model.onnx        forward is a typed function in your binary
     |                           |
session.run(x) -> y         forward(x) -> y   (no model file at runtime)

The consequences:

  • No .onnx file at runtime. The weights are baked into the Chelis source. You ship one compiled binary, not a binary plus .onnx.
  • No session object, no model load cost. Inference is a function call. Cold-start is one cold function call.
  • The .onnx is a build-time input. Versioning the model is versioning your source: regenerate the emitted .ch and rebuild.
  • No dynamic model swap. Changing the model means a re-emit and a re-build. ORT lets you swap a session at runtime; Hydronnx does not.

Model loading is a build-time concern. It may be sketched as a runtime function, but because chelis is ahead-of-time compiled, it is implemented as chelis-hydronnx-emit (the build-time CLI). The user-visible surface is import Hydronnx.<Name> (forward), not let m = load_model("...").

Things ORT does not, or structurally cannot, give you.

The emitted forward's sig is a typed Chelis function signature. Every call site is type-checked. Pass a tensor[1, 5, f32] to a function declared tensor[1, 4, f32] -> tensor[1, 3, f32] and chelis check rejects the line at compile time, naming the dimension mismatch. ORT raises a runtime exception for the same case, on the call that delivers the wrong-shape input. Hydronnx catches it at the build.
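As an illustration of the kind of comparison involved, here is a minimal Python sketch that checks a call-site shape against a declared input shape and names the first mismatching dimension. It is not how chelis check is implemented, and the function name and message format are hypothetical; the shapes are the ones from the paragraph above, with the dtype left out.

```python
def shape_mismatch(declared, actual):
    """Return a message naming the first mismatching dimension, or None if the shapes agree."""
    if len(declared) != len(actual):
        return f"rank mismatch: expected {len(declared)}, got {len(actual)}"
    for axis, (want, got) in enumerate(zip(declared, actual)):
        if want != got:
            return f"dimension {axis}: expected {want}, got {got}"
    return None


# Declared input shape of the example signature tensor[1, 4, f32] -> tensor[1, 3, f32].
DECLARED_INPUT = (1, 4)

print(shape_mismatch(DECLARED_INPUT, (1, 5)))  # wrong-shape call site
print(shape_mismatch(DECLARED_INPUT, (1, 4)))  # matching call site
```

The difference between the two tools is when this comparison happens: under ORT it runs when the call executes, under Hydronnx it runs during the build.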

You can attach @property blocks to forward and run chelis prove. The worked example does this for a softmax-outputs-sum-to-1 invariant:

@property output_is_a_probability_distribution forall(features: tensor[1, 4, f32]):
...

chelis prove samples 64 inputs and checks the invariant on each. A counterexample fails the build. There is no ORT equivalent for "verify the model satisfies this property"; at best you run a unit test on finitely many inputs. See Properties and verification.
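The sampled check described above can be pictured with a small Python sketch. It is an illustration only, not chelis prove itself: the softmax stand-in model, the helper names, the input range of [-10, 10], and the 1e-6 tolerance are all assumptions made for the example. Only the sample count of 64 and the sum-to-1 invariant come from the text.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a flat list of floats (stand-in for the model)."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def find_counterexample(prop, sample_input, samples=64, seed=0):
    """Draw `samples` inputs and return the first one that violates `prop`, else None."""
    rng = random.Random(seed)
    for _ in range(samples):
        candidate = sample_input(rng)
        if not prop(candidate):
            return candidate
    return None


def sample_features(rng):
    """Hypothetical input domain: four features, each drawn from [-10, 10]."""
    return [rng.uniform(-10.0, 10.0) for _ in range(4)]


def sums_to_one(features):
    """The invariant: the outputs form a probability distribution."""
    return math.isclose(sum(softmax(features)), 1.0, abs_tol=1e-6)


counterexample = find_counterexample(sums_to_one, sample_features)
print("counterexample:", counterexample)
```

A non-None result plays the role of the counterexample that fails the build.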

3. Composition with hand-written code, type-checked end to end

ORT inference is a black box from a typed language's perspective: session.run returns an untyped list of output arrays. Hydronnx's forward is an ordinary function: you pre-process, post-process, and compose pipelines with hand-written Chelis, and chelis check verifies the shapes at every step. See Composition with Chelis.

chelis grad differentiates an emitted forward when the emitted operator surface is in chelis AD's supported set. Hydronnx gates both a weight-free elementwise fixture and the weighted tabular Gemm fixture against finite differences. ORT has no built-in AD; you would use PyTorch or JAX for this, which means leaving the ONNX surface. The honest caveat: this is not a blanket AD claim for every weighted ONNX model. Add a model-specific AD gate before relying on gradients for broader operator surfaces. See Limitations and scope.
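A gradient gate of the kind mentioned above compares a computed gradient against central finite differences. The Python sketch below shows the idea on a single-row Gemm (y = xW + b) with a squared-output loss. It is a hedged illustration: the hand-derived analytic gradient stands in for whatever chelis grad produces, and the function names, loss, step size, tolerance, and sample values are all assumptions.

```python
def gemm(x, w, bias):
    """y[j] = sum_i x[i] * w[i][j] + bias[j] for a single input row."""
    return [sum(x[i] * w[i][j] for i in range(len(x))) + bias[j]
            for j in range(len(bias))]


def loss(x, w, bias):
    """Scalar loss used for the check: half the sum of squared outputs."""
    return 0.5 * sum(y * y for y in gemm(x, w, bias))


def analytic_grad_w(x, w, bias):
    """Hand-derived dL/dw[i][j] = x[i] * y[j]; stands in for an AD result."""
    y = gemm(x, w, bias)
    return [[x[i] * y[j] for j in range(len(bias))] for i in range(len(x))]


def finite_diff_grad_w(x, w, bias, eps=1e-4):
    """Central finite differences over every weight entry."""
    w = [row[:] for row in w]  # work on a copy so the caller's weights are untouched
    grad = []
    for i in range(len(w)):
        row = []
        for j in range(len(w[i])):
            original = w[i][j]
            w[i][j] = original + eps
            up = loss(x, w, bias)
            w[i][j] = original - eps
            down = loss(x, w, bias)
            w[i][j] = original
            row.append((up - down) / (2.0 * eps))
        grad.append(row)
    return grad


def max_abs_diff(lhs, rhs):
    """Largest absolute entrywise difference between two same-shaped matrices."""
    return max(abs(a - c) for lrow, rrow in zip(lhs, rhs) for a, c in zip(lrow, rrow))


def gradient_gate(x, w, bias, tol=1e-6):
    """True when the analytic gradient agrees with finite differences within tol."""
    return max_abs_diff(analytic_grad_w(x, w, bias), finite_diff_grad_w(x, w, bias)) <= tol


# Hypothetical 4-feature, 3-output values chosen only for the example.
X = [0.5, -1.0, 2.0, 0.25]
W = [[0.1, -0.2, 0.3], [0.4, 0.5, -0.6], [-0.7, 0.8, 0.9], [1.0, -1.1, 1.2]]
BIAS = [0.01, -0.02, 0.03]

print("gate passed:", gradient_gate(X, W, BIAS))
```

The same structure carries over to a broader operator surface: swap in the model's forward and its computed gradient, then keep the gate as a test before relying on gradients.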

The emitted .ch is human-readable source. The sig, the operator sequence, and the baked weights are all visible. ORT loads an opaque graph; Hydronnx hands you the code.

ORT is years more mature on raw inference throughput. chelis's pipeline does not fuse operators. Hydronnx is not the right choice if your inference loop is the bottleneck.

Static 2D Conv, MaxPool, AveragePool, Cos, Tan, Floor, and Ceil emit; ConvTranspose emits for equal strides, including strides greater than 1 via source dilation, and covers static asymmetric pads plus valid spatial output_padding and output_shape cases. Larger vision models such as ResNet, MobileNet, and YOLO need more operator-surface work and sidecar-backed emit. A small generated image-classifier example ships; object detection and broad vision models are outside the current surface.

The inline path emits weights as source literals. A multi-megabyte model becomes a multi-hundred-megabyte .ch, past chelis's practical ceiling. The placeholder-emit mode and the runtime path work around this; see Weights and large models. For real numbers as typed source, the numeric-source path awaits an upstream Chelis feature.

Detailed under Limitations and scope. Hydronnx gates weighted tabular Gemm AD; broader weighted models need model-specific AD coverage. The small Conv image-classifier surface is not AD-ready in chelis. A clean non-differentiable-operator report is a useful first screen, not a guarantee that grad is covered.

ORT runs on CPU, GPU, mobile NPUs, and web targets. Hydronnx runs wherever the chelis host runtime runs, currently CPU only. No GPU dispatch, no NPU dispatch.

ORT lets you load a different .onnx at runtime. Hydronnx does not; a new model means re-emit plus re-build. ORT supports custom operator registration; Hydronnx rejects custom ops at parse time.

Hydronnx is one-way: .onnx to .ch. There is no .ch to .onnx export. If you want to share the resulting model with other tools, you keep the original .onnx.

For a model that fits the emittable scope:

  1. Inspect it. chelis-hydronnx-inspect model.onnx: confirm the opset (11 through 23), dtypes (f32 / f64 / i32 / i64), and operator set are in scope. The output ends with a non-differentiable-operator report if you care about AD.
  2. Decide where the pre/post-process lives. ORT's typical pattern bakes pre/post into the model. Under Hydronnx you might pull them out into hand-written Chelis (see Composition with Chelis for the trade-off heuristic). Pull-out gives you better type clarity and sidesteps any non-emittable pre-process op; bake-in is closer to the ORT mental model. Both work.
  3. Emit. chelis-hydronnx-emit model.onnx -o src/myforward.ch (or let the CLI pick the path). The emitted module exports one function: forward.
  4. Wire it up. In your Chelis program, import Hydronnx.<Name> (forward) and call it. If you wrote pre/post-process by hand, compose the three.
  5. Verify. chelis check (compile-time types), chelis test (your own unit tests on representative inputs), and chelis prove (any property you want to hold). The worked example (examples/tabular_classifier/) demonstrates all three.
  6. Stand up a parity check. Until you trust the emit, run the model under ORT on a representative input and compare the result against the forward(x) output, confirming the two agree within 1e-3 (the per-op parity bound). The worked example does exactly this in tests/example.ch::test_forward_matches_onnx_runtime.

If any step exits non-zero, see Error handling; if the cause is a deferred operator or a structural mismatch with the emittable scope, see Limitations and scope.
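The comparison in step 6 can be sketched in a few lines of Python. This is an illustration under assumptions: it treats the 1e-3 bound as a maximum absolute difference, the helper names are hypothetical, and the two output vectors are placeholder values rather than real model outputs. In practice the reference would come from session.run and the candidate from the emitted forward.

```python
def max_abs_error(reference, candidate):
    """Largest absolute elementwise difference between two flat output vectors."""
    if len(reference) != len(candidate):
        raise ValueError(
            f"output length mismatch: {len(reference)} vs {len(candidate)}"
        )
    return max(abs(r - c) for r, c in zip(reference, candidate))


def within_parity(reference, candidate, bound=1e-3):
    """True when every element agrees within `bound` (assumed to be an absolute bound)."""
    return max_abs_error(reference, candidate) <= bound


# Placeholder values standing in for the ORT output and the forward(x) output.
ort_output = [0.7012, 0.2001, 0.0987]
hydronnx_output = [0.7010, 0.2003, 0.0987]

print("max abs error:", max_abs_error(ort_output, hydronnx_output))
print("within parity bound:", within_parity(ort_output, hydronnx_output))
```

Reporting the maximum error alongside the pass/fail result makes it easier to see how close a failing model is to the bound.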

You can keep ORT in your stack and add Hydronnx for the verification layer. Compile-time-check and property-verify a model under Hydronnx while shipping the runtime path through ORT. The two tools answer different questions: Hydronnx tells you "this model satisfies this invariant", ORT tells you "this model runs fast on this hardware". Use whichever answers the question you have.