Local inference via ONNX Runtime (ort, loaded dynamically): encoder models
only (BERT/DistilBERT/MiniLM) — classify/sentiment return logits, embed
returns a mean-pooled vector; generative tasks are rejected. A model is loaded
once per source and cached; the forward pass runs on spawn_blocking under
a deadline so a wedged model can’t stall the worker. ORT loads at
runtime — onnxruntime.{dll,so} (>= 1.24) must be on the search path or named
by ORT_DYLIB_PATH. A source is hf:org/repo (downloaded on first use),
file://<path>, or a bare path; labels come from the model’s config.json
id2label.
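As an illustration of the mean pooling mentioned above, here is a minimal sketch in plain Rust. The function name `mean_pool` and its signature are hypothetical and not part of this module, and skipping padded positions through an attention mask is an assumption; the text only states that the embedding is mean-pooled.

```rust
/// Hypothetical helper (not this crate's API): mean-pool per-token embeddings
/// into a single vector.
///
/// `hidden` holds one row per token; `mask[i]` is 1 for a real token and 0 for
/// padding. Masking out padding is an assumption made for this sketch.
/// Returns `None` if the shapes disagree or no token is unmasked.
fn mean_pool(hidden: &[Vec<f32>], mask: &[u8]) -> Option<Vec<f32>> {
    if hidden.len() != mask.len() {
        return None;
    }
    // Embedding width is taken from the first row; empty input yields None.
    let dim = hidden.first()?.len();
    let mut sum = vec![0.0f32; dim];
    let mut count = 0usize;

    for (row, &m) in hidden.iter().zip(mask) {
        if m == 0 {
            continue; // padded position: excluded from the average
        }
        if row.len() != dim {
            return None; // ragged input
        }
        for (acc, &x) in sum.iter_mut().zip(row) {
            *acc += x;
        }
        count += 1;
    }

    if count == 0 {
        return None;
    }
    let n = count as f32;
    Some(sum.into_iter().map(|s| s / n).collect())
}
```

With two real tokens `[1.0, 2.0]` and `[3.0, 4.0]` plus one padded row, the sketch averages only the first two rows and returns `[2.0, 3.0]`.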
Structs
- LocalProvider - Local ONNX provider, backed by a model cache directory.
Functions
- load_labels - Classifier labels from a model’s config.json id2label, ordered by index. Empty if the file is absent or has no id2label; the registry uses this to auto-derive a local classifier’s labels.
- model_dir - On-disk directory for a model source: hf:org/repo → <cache_dir>/org/repo; file://<path> or a bare path is used as-is.
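The two behaviours described above can be sketched with the standard library alone. The names `resolve_model_dir` and `ordered_labels` and their signatures are hypothetical stand-ins, not the real `model_dir` and `load_labels`. The sketch assumes the id2label entries have already been parsed out of config.json into key/label pairs (keys there are strings such as "0" and "1"), and that entries with non-numeric keys are simply skipped.

```rust
use std::collections::BTreeMap;
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for `model_dir`: map a source string to a directory.
/// This sketch does no validation (for example, it does not reject `..`).
fn resolve_model_dir(cache_dir: &Path, source: &str) -> PathBuf {
    if let Some(repo) = source.strip_prefix("hf:") {
        // hf:org/repo -> <cache_dir>/org/repo
        repo.split('/')
            .fold(cache_dir.to_path_buf(), |dir, part| dir.join(part))
    } else if let Some(path) = source.strip_prefix("file://") {
        // file://<path> -> <path>, used as-is
        PathBuf::from(path)
    } else {
        // bare path, used as-is
        PathBuf::from(source)
    }
}

/// Hypothetical stand-in for the ordering step of `load_labels`.
/// Sorts by the numeric value of each key, so "10" comes after "2".
/// Entries whose key is not a non-negative integer are skipped (assumption).
fn ordered_labels(id2label: &[(&str, &str)]) -> Vec<String> {
    let by_index: BTreeMap<usize, &str> = id2label
        .iter()
        .filter_map(|(k, v)| k.parse::<usize>().ok().map(|i| (i, *v)))
        .collect();
    // BTreeMap iterates in ascending key order.
    by_index.into_values().map(str::to_owned).collect()
}
```

For example, `resolve_model_dir(Path::new("/cache"), "hf:org/repo")` gives `/cache/org/repo`, and `ordered_labels(&[("1", "POSITIVE"), ("0", "NEGATIVE")])` gives `["NEGATIVE", "POSITIVE"]`. An empty input yields an empty vector, matching the documented "empty if absent" behaviour.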