Supported ONNX surface

This chapter describes the ONNX surface Hydronnx emits: which opsets and dtypes it accepts, which operators emit and which are deferred, and how shapes map into the Chelis type system. The honest account of everything outside this surface is in Limitations and scope.

  • Opset versions: 11 through 23 inclusive. Outside that range, the parser fails with UnsupportedOpset. Older opsets are rejected because their operator semantics are too divergent; newer opsets are rejected because they have not been tested.
  • Dtypes: f32, f64, i32, i64, bool. Outside that set, the parser fails with UnsupportedDtype. uint8, int8, bf16, fp16, complex, and string dtypes all reject at parse time. Cast to a supported dtype when exporting the model.
  • Static shapes map literally: an ONNX dim value becomes a literal Chelis dim.
  • Symbolic ONNX dim_params map to Chelis dim variables in the emitted sig. The symbolic_batch.onnx fixture is the canonical exercise: a [batch, 5] input emits as tensor[batch, 5, f32].
  • Dynamic dims with no dim_param name are rare in practice; they render as ? in inspect. The emit path rejects them with InternalInconsistency because Chelis's type system cannot express an anonymous dynamic dim. Re-export with the dim named (dim_param = "batch") or pinned to a static value.
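
The acceptance rules above can be sketched as a small gate. This is a minimal Python illustration, not Hydronnx's implementation: the function names `check_surface` and `render_tensor_type` are hypothetical, the exception classes only borrow the error names used in this chapter, and `None` is an assumed stand-in for an anonymous dynamic dim.

```python
from typing import Optional, Sequence, Union

# Ranges and sets taken from the list above.
SUPPORTED_OPSETS = range(11, 24)  # 11 through 23 inclusive
SUPPORTED_DTYPES = {"f32", "f64", "i32", "i64", "bool"}


class UnsupportedOpset(Exception):
    pass


class UnsupportedDtype(Exception):
    pass


class InternalInconsistency(Exception):
    pass


# A dim is a static value (int), a dim_param name (str),
# or None for an anonymous dynamic dim (assumed encoding).
Dim = Union[int, str, None]


def check_surface(opset: int, dtype: str) -> None:
    """Reject opsets and dtypes outside the documented surface."""
    if opset not in SUPPORTED_OPSETS:
        raise UnsupportedOpset(f"opset {opset} is outside 11..23")
    if dtype not in SUPPORTED_DTYPES:
        raise UnsupportedDtype(f"dtype {dtype} is not supported")


def render_tensor_type(dims: Sequence[Dim], dtype: str) -> str:
    """Render static dims literally and named dims as dim variables."""
    parts = []
    for d in dims:
        if d is None:
            # No way to express an anonymous dynamic dim in the sig.
            raise InternalInconsistency("anonymous dynamic dim")
        parts.append(str(d))
    parts.append(dtype)
    return "tensor[" + ", ".join(parts) + "]"
```

With these assumptions, `render_tensor_type(["batch", 5], "f32")` reproduces the `tensor[batch, 5, f32]` sig quoted for the symbolic_batch.onnx fixture.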

Rank-2 MatMul and default Gemm can preserve a symbolic graph-input batch dim, including the common rank-1 Gemm bias broadcast. Reshape 0 dims copied from symbolic graph inputs render via cast(shape(x, cast(axis, int32)), int64). Reshape -1 inference rejects when symbolic dims are involved, because the missing product cannot be resolved statically.
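
The Reshape rules can be illustrated with a shape-only resolver. This is a sketch under assumptions: `resolve_reshape` and `Rejected` are hypothetical names, symbolic dims are modelled as strings, and the real emitter renders a copied symbolic dim as a shape expression rather than returning its name.

```python
from math import prod
from typing import List, Sequence, Union

Dim = Union[int, str]  # static value or symbolic dim name


class Rejected(Exception):
    pass


def resolve_reshape(in_shape: Sequence[Dim], target: Sequence[int]) -> List[Dim]:
    """Resolve an ONNX Reshape target against an input shape.

    0 copies the input dim at the same axis (symbolic dims allowed);
    -1 is inferred only when every input dim is static.
    """
    out: List[Dim] = []
    for axis, t in enumerate(target):
        if t == 0:
            if axis >= len(in_shape):
                raise Rejected("0 dim has no matching input axis")
            out.append(in_shape[axis])  # may be a symbolic name
        else:
            out.append(t)

    if -1 in out:
        if out.count(-1) > 1:
            raise Rejected("at most one -1 is allowed")
        if any(isinstance(d, str) for d in in_shape):
            # The missing product is not known statically.
            raise Rejected("-1 inference with symbolic dims")
        known = prod(d for d in out if d != -1)
        total = prod(in_shape)
        if known == 0 or total % known != 0:
            raise Rejected("element counts do not divide evenly")
        out[out.index(-1)] = total // known
    return out
```

For example, a `["batch", 4, 8]` input with target `[0, 32]` resolves to `["batch", 32]`, while the same input with target `[0, -1]` rejects.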

The emit set covers tabular, encoder-only, elementwise, windowed pooling, and single-Conv image-classifier graphs. The per-op gate (tests/emit_per_op.rs) asserts both the emittable set and the deferred set on every build.

Conv, ConvTranspose, MaxPool, AveragePool, GlobalAveragePool, and GlobalMaxPool emit.

  • Conv supports static rank-4 NCHW 2D input, rank-4 weights, group=1, dilations=[1,1], equal H/W strides, and nonnegative static pads=[top,left,bottom,right], including asymmetric or per-axis padding lowered through a pre-pad. Grouped Conv, dilated Conv, non-equal strides, and runtime-shaped Conv reject with a documented error.
  • ConvTranspose supports static rank-4 NCHW input, f32 initializer weights in ONNX [C_in, C_out, kH, kW] layout, group=1, dilations=[1,1], positive equal strides [s,s], nonnegative static pads, valid spatial output_padding values smaller than the stride, and spatial-only output_shape=[H,W] cases that derive nonnegative pads. For strides greater than 1, the emitter source-dilates the input before the flipped-kernel conv2d lowering; asymmetric pads and output padding are handled by choosing an over-padded conv2d call and applying static shrink cleanup. Non-equal strides, grouped or dilated ConvTranspose, non-initializer kernels, runtime-shaped ConvTranspose, malformed output_shape, and invalid output_padding reject with documented errors.
  • Windowed pooling supports rank-4 2D pooling, positive length-2 kernel_shape and strides, dilations=[1,1], ceil_mode=0, storage_order=0 for MaxPool, and count_include_pad=0 for AveragePool. MaxPool with nonzero padding emits: the spatial border is padded with the most-negative finite f32, which stands in for ONNX's negative-infinity pad cell because chelis has no inf literal. Numeric parity holds for any finite input. AveragePool with nonzero padding is deferred (it needs a per-output count-include-pad divisor mask). Dilated pooling, ceil-mode pooling, MaxPool indices output, and 1D/3D pooling reject with a documented error.
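
The spatial arithmetic behind the supported Conv and ConvTranspose cases follows the standard ONNX output-size formulas with dilation fixed at 1. The sketch below works on one spatial axis; the function names are hypothetical and the code does not reflect how the emitter is organized internally.

```python
def conv_out_size(size: int, kernel: int, stride: int,
                  pad_begin: int, pad_end: int) -> int:
    """Output length of one spatial axis for Conv (dilation 1, floor mode)."""
    if stride <= 0 or pad_begin < 0 or pad_end < 0:
        raise ValueError("stride must be positive and pads nonnegative")
    return (size + pad_begin + pad_end - kernel) // stride + 1


def conv_transpose_out_size(size: int, kernel: int, stride: int,
                            pad_begin: int, pad_end: int,
                            output_padding: int = 0) -> int:
    """Output length of one spatial axis for ConvTranspose (dilation 1)."""
    if not 0 <= output_padding < stride:
        raise ValueError("output_padding must be smaller than the stride")
    return stride * (size - 1) + output_padding + kernel - pad_begin - pad_end


def source_dilated_size(size: int, stride: int) -> int:
    """Length after inserting stride - 1 zeros between input elements."""
    return (size - 1) * stride + 1
```

As a consistency check on the source-dilation lowering: a length-4 input with kernel 3, stride 2, pads (1, 1), and output_padding 1 gives a ConvTranspose output of 8. The dilated input has length 7, and a stride-1 convolution over it with pads (kernel - 1 - pad_begin, kernel - 1 - pad_end + output_padding) = (1, 2) also yields 8.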

Flatten emits (it lowers to a reshape onto the collapsed [outer, inner] matrix shape, the standard CNN-head op before the classifier Gemm). Cos, Tan, Floor, and Ceil emit. Floor and Ceil remain in the non-differentiable-operator warning because they are mathematically piecewise-constant and chelis AD can silently produce zero gradients for them.
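
The collapsed [outer, inner] shape that Flatten reshapes onto can be computed as follows. This is a shape-only sketch using the ONNX Flatten axis convention (default axis 1, negative axes counted from the end); `flatten_shape` is a hypothetical helper name.

```python
from math import prod
from typing import List, Sequence


def flatten_shape(shape: Sequence[int], axis: int = 1) -> List[int]:
    """Return the 2D [outer, inner] shape ONNX Flatten produces."""
    rank = len(shape)
    if not -rank <= axis <= rank:
        raise ValueError("axis out of range")
    if axis < 0:
        axis += rank
    # prod of an empty slice is 1, which covers axis == 0 and axis == rank.
    return [prod(shape[:axis]), prod(shape[axis:])]
```

A `[1, 64, 7, 7]` feature map therefore flattens to `[1, 3136]` before the classifier Gemm.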

Operators that emit, but with a documented domain limit:

  • Pow: parity-green only for the a > 0 domain. The exp(b*log(a)) decomposition produces NaN for a <= 0 (log of zero or negative). The per-op parity harness masks a <= 0 cells.
  • Slice: single-axis only. chelis gather (the decomposition target) indexes exactly one axis. A multi-axis Slice rejects Malformed. A negative step also rejects (no reverse-stride primitive).
  • Gelu: only approximate="tanh" emits. The exact-erf form 0.5 * x * (1 + erf(x / sqrt(2))) needs a chelis erf primitive that does not exist. approximate="none" (and the absent-attribute default) rejects with UnsupportedAttribute.
  • MatMul: rank-2 by rank-2 only. An operand of rank 3 or higher rejects Malformed, because chelis defers batched matmul and that deferral propagates to Hydronnx.
  • Attention: single-head, batch=1 only.
  • Integer Gemm / MatMul: rejected. chelis BlasMatmul is float-only. Cast operands to a float dtype.
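
Two of the domain limits above are easy to see numerically. The sketch below compares the tanh Gelu approximation with the exact-erf form and shows the exp(b*log(a)) Pow decomposition. It is plain Python for illustration, not chelis code. Note one assumption: Python's `math.log` raises on non-positive input instead of returning NaN, so the sketch returns NaN explicitly for a <= 0 to mirror the cells the parity harness masks.

```python
import math


def gelu_tanh(x: float) -> float:
    """Gelu with approximate="tanh" (the form that emits)."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


def gelu_erf(x: float) -> float:
    """Exact Gelu; needs erf, which the text says chelis lacks."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def pow_via_exp_log(a: float, b: float) -> float:
    """Pow decomposed as exp(b * log(a)); only meaningful for a > 0."""
    if a <= 0.0:
        # Assumed stand-in for the out-of-domain result.
        return math.nan
    return math.exp(b * math.log(a))
```

The two Gelu forms agree closely but not exactly, which is why a model exported with the exact form cannot simply be emitted through the tanh path without changing its numerics.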

These operators do not emit. Each rejects with UnsupportedOperator from chelis-hydronnx-emit.

| Operator | Reason |
|---|---|
| Round | chelis has no rounding primitive; round-half-to-even has no clean Floor / Ceil / CmpLt lowering. Needs a new chelis primitive. |
| ScatterElements | chelis RiscOp::Scatter is hyperplane, not element-wise. Needs a new chelis primitive. |
| ScatterND | Same: chelis scatter has hyperplane semantics. Needs a new chelis primitive. |
| MultiHeadAttention | Not a standard ai.onnx operator at any opset (it lives in the ORT com.microsoft contrib domain). The standard Attention is emittable and covers multi-head. |

When a deferred operator is the obstacle, the resolution paths in order of cost: re-export the model without the operator, move that step out of ONNX into hand-written Chelis after the emit (see Composition with Chelis), or use ONNX Runtime for that model (see Migration from ONNX Runtime).

A model is convertible when every node emits and every weight fits the inline-emit ceiling. The two conditions are independent: a graph can be fully operator-convertible and still exceed the weight ceiling.
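
The two independent conditions can be written as a single check. This is a hypothetical sketch: `check_convertible`, the emittable set, and the ceiling value passed in are all illustrative, since this chapter does not state the actual inline-emit ceiling.

```python
from typing import Iterable, List, NamedTuple, Set


class Convertibility(NamedTuple):
    convertible: bool
    blocking_ops: List[str]  # operators that do not emit
    weights_fit: bool        # total weights within the inline ceiling


def check_convertible(node_ops: Iterable[str], emittable: Set[str],
                      weight_bytes: int,
                      inline_ceiling_bytes: int) -> Convertibility:
    """Combine the operator condition and the weight condition."""
    blocking = sorted({op for op in node_ops if op not in emittable})
    weights_fit = weight_bytes <= inline_ceiling_bytes
    return Convertibility(not blocking and weights_fit, blocking, weights_fit)
```

Reporting both results separately makes the independence visible: a graph can come back with no blocking operators and still fail on `weights_fit`.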

Tabular models, small MLPs, small transformer encoders, and small Conv-bearing image classifiers convert and run on the inline-emit path. For real vision-scale and NLP-scale models, the convertibility audit (scripts/audit_models.py, scripts/simulate_convertibility.py) measures the operator surface qualitatively:

  • Classic CNNs (ResNet, MobileNet, ConvNeXt) are close to full operator-convertibility. The recurring operator unlocks are grouped or depthwise Conv (group != 1) and a padded pooling step.
  • Transformers (DeiT, RoBERTa) block earlier on the dynamic-shape family (Shape, Expand, ConstantOfShape), Erf for exact Gelu, and batched MatMul (rank 3 or higher).
  • LLM decoders (Qwen, DeepSeek) need the same dynamic-shape plumbing plus batched MatMul; ORT-fused exports additionally carry com.microsoft contrib operators (MultiHeadAttention, RotaryEmbedding, SkipSimplifiedLayerNormalization), which Hydronnx rejects rather than mis-emit.

Across the corpus, the highest-leverage operator unlocks are grouped/depthwise Conv, batched MatMul, Erf (exact Gelu), and the dynamic-shape family. Separately, every real model here exceeds the inline-weight ceiling, so the inline-source path is for small models regardless of operator coverage. The runtime path (chelis-hydronnx-run) and the placeholder-weight emit mode lift that ceiling for the runtime and type-discipline use cases respectively; see Weights and large models.