Local inference via ONNX Runtime (ort, loaded dynamically). Encoder models
only — the BERT / DistilBERT / MiniLM family: classification/sentiment yield
logits (the adapter argmaxes), embedding yields a mean-pooled vector.
Generative tasks are rejected. A model is loaded once per source and cached;
the forward pass runs on spawn_blocking, off the Ring 1 task, under a
deadline so a pathological model can never stall the worker (and the
watermark behind it). ONNX Runtime is loaded at runtime, so the shared library
(onnxruntime.dll or libonnxruntime.so, ORT >= 1.24) must be on the search path
or named by ORT_DYLIB_PATH.
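
The two post-processing steps named above (argmax over classifier logits, mean pooling for embeddings) can be sketched in plain Rust. This is a minimal illustration, not the adapter's actual code: the function names `argmax` and `mean_pool` are hypothetical, and weighting the mean by an attention mask is an assumption, since the description only says the vector is mean-pooled.

```rust
/// Index of the largest logit, or `None` if the slice is empty or all NaN.
/// Hypothetical helper; ties resolve to the first maximum.
fn argmax(logits: &[f32]) -> Option<usize> {
    let mut best: Option<(usize, f32)> = None;
    for (i, &v) in logits.iter().enumerate() {
        if v.is_nan() {
            continue; // never pick NaN as the winning class
        }
        let better = match best {
            Some((_, b)) => v > b,
            None => true,
        };
        if better {
            best = Some((i, v));
        }
    }
    best.map(|(i, _)| i)
}

/// Mean of the per-token hidden vectors whose mask entry is non-zero.
/// Assumption: padding tokens (mask == 0) are excluded from the mean.
fn mean_pool(hidden: &[Vec<f32>], mask: &[u8]) -> Vec<f32> {
    let dim = hidden.first().map_or(0, |row| row.len());
    let mut sum = vec![0.0f32; dim];
    let mut count = 0usize;
    for (row, &m) in hidden.iter().zip(mask) {
        if m == 0 {
            continue;
        }
        for (s, &x) in sum.iter_mut().zip(row) {
            *s += x;
        }
        count += 1;
    }
    if count > 0 {
        for s in &mut sum {
            *s /= count as f32;
        }
    }
    sum
}
```

With logits `[0.1, 2.5, -1.0]` the sketch selects class index 1; the matching label would then be looked up in the model's `id2label` list.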
A source resolves to a directory laid out like a Hugging Face export —
onnx/model.onnx + tokenizer.json (+ optional config.json for labels):
hf:org/repo → <cache_dir>/org/repo, file://<path> or a bare path used
as-is. A missing hf: repo is downloaded from the Hugging Face CDN on first
use (public repos only). Classifier labels come from the model’s own
config.json id2label; LocalProvider::intrinsic_labels resolves them on
demand, so a model that downloads lazily scores correctly once it is cached —
no restart, no externally supplied label list.
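
The source-to-directory mapping described above can be sketched as follows. This is an illustration under stated assumptions, not the crate's `model_dir` implementation: the names `resolve_model_dir` and `looks_like_hf_export` and their signatures are hypothetical, and the download step for a missing `hf:` repo is left out.

```rust
use std::path::{Path, PathBuf};

/// Map a model source string to an on-disk directory (hypothetical helper):
///   hf:org/repo    -> <cache_dir>/org/repo
///   file://<path>  -> <path>
///   bare path      -> used as-is
fn resolve_model_dir(cache_dir: &Path, source: &str) -> PathBuf {
    if let Some(repo) = source.strip_prefix("hf:") {
        cache_dir.join(repo)
    } else if let Some(path) = source.strip_prefix("file://") {
        PathBuf::from(path)
    } else {
        PathBuf::from(source)
    }
}

/// Check for the required files of the Hugging Face export layout:
/// onnx/model.onnx and tokenizer.json (config.json is optional).
fn looks_like_hf_export(dir: &Path) -> bool {
    dir.join("onnx").join("model.onnx").is_file() && dir.join("tokenizer.json").is_file()
}
```

A caller could treat a `false` result from the layout check on an `hf:` source as the signal that the repo has not been downloaded yet.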
Structs
- LocalProvider - Local ONNX provider, backed by a model cache directory.
Functions
- load_labels - Classifier labels from a model’s config.json id2label, ordered by index. Empty if the file is absent or has no id2label; the registry uses this to auto-derive a local classifier’s labels.
- model_dir - On-disk directory for a model source: hf:org/repo → <cache_dir>/org/repo, file://<path> or a bare path used as-is.
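
"Ordered by index" matters because `id2label` keys in a Hugging Face `config.json` are strings, so sorting them as text would put "10" before "2". The sketch below shows one way to order them numerically. It assumes the JSON has already been parsed into a map, and `labels_by_index` is a hypothetical name, not the signature of `load_labels`.

```rust
use std::collections::HashMap;

/// Order labels by their numeric index (hypothetical helper).
/// Keys that do not parse as an index are skipped; an empty map
/// yields an empty list, mirroring the "empty if absent" behaviour.
fn labels_by_index(id2label: &HashMap<String, String>) -> Vec<String> {
    let mut pairs: Vec<(usize, &String)> = id2label
        .iter()
        .filter_map(|(k, v)| k.parse::<usize>().ok().map(|i| (i, v)))
        .collect();
    // Sort on the parsed integer, not the string key.
    pairs.sort_by_key(|&(i, _)| i);
    pairs.into_iter().map(|(_, v)| v.clone()).collect()
}
```

The position of each label in the returned list then lines up with the class index produced by the argmax over the logits, provided the indices are contiguous from 0.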