Module backends

Concrete inference backends, feature-gated so the default build carries no HTTP or ML dependencies. This module handles transport only: task framing lives in the adapter, not here.

Re-exports

pub use anthropic::AnthropicProvider;
pub use local::LocalProvider;
pub use openai::OpenAiProvider;
pub use rate_limited::RateLimitedProvider;

Modules

anthropic: Anthropic Messages API provider. Issues one call per input row, with bounded concurrency. ai_embed is unsupported; use the OpenAI-compatible provider for embeddings.
local: Local inference via ONNX Runtime (loaded dynamically). Encoder models only: classify/sentiment return logits, embed returns a mean-pooled vector. Sources: hf:org/repo (downloaded on first use), file://<path>, or a bare path. onnxruntime.{dll,so} >= 1.24 must be on the library search path or pointed to by ORT_DYLIB_PATH.
openai: OpenAI-compatible remote provider (OpenAI, Azure, vLLM, local servers). Chat completions for discriminative/generative tasks; the embeddings endpoint for ai_embed (the only provider that supports it).
rate_limited: Client-side token-bucket rate limiting for remote providers. One cell is acquired per input row before dispatch; the wait happens on Ring 1 only.
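
The token-bucket behaviour described for rate_limited can be illustrated with a minimal, standard-library-only sketch. The `TokenBucket` type, its fields, and `try_acquire` are hypothetical names chosen for this example; they are not the crate's `RateLimitedProvider` API, and the real implementation may differ (for instance, it may wait asynchronously rather than return the delay).

```rust
use std::time::{Duration, Instant};

/// Hypothetical token bucket, for illustration only.
struct TokenBucket {
    capacity: f64,       // maximum burst size, in tokens
    tokens: f64,         // tokens currently available
    refill_per_sec: f64, // sustained rate, in tokens per second
    last_refill: Instant,
}

impl TokenBucket {
    /// Start with a full bucket at time `now`.
    fn new(capacity: u32, refill_per_sec: f64, now: Instant) -> Self {
        Self {
            capacity: capacity as f64,
            tokens: capacity as f64,
            refill_per_sec,
            last_refill: now,
        }
    }

    /// Credit tokens for the time elapsed since the last refill, capped at capacity.
    fn refill(&mut self, now: Instant) {
        let elapsed = now.saturating_duration_since(self.last_refill).as_secs_f64();
        self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity);
        self.last_refill = now;
    }

    /// Take one token for one input row.
    /// Returns Ok(()) if the row may be dispatched now, or Err(wait) with the
    /// time the caller would need to wait before a token becomes available.
    fn try_acquire(&mut self, now: Instant) -> Result<(), Duration> {
        self.refill(now);
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            Ok(())
        } else {
            let missing = 1.0 - self.tokens;
            Err(Duration::from_secs_f64(missing / self.refill_per_sec))
        }
    }
}
```

Passing `now` explicitly keeps the sketch deterministic: a caller would invoke `try_acquire(Instant::now())` once per row before dispatch and sleep for the returned duration on `Err`.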

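The three source forms accepted by the local module (hf:org/repo, file://<path>, or a bare path) could be told apart along the following lines. This is a sketch under assumptions: `ModelSource` and `parse_model_source` are hypothetical names, and the crate's own parsing and validation rules are not shown in this page.

```rust
use std::path::PathBuf;

/// Hypothetical representation of a model source string.
#[derive(Debug, PartialEq)]
enum ModelSource {
    /// `hf:org/repo`, to be downloaded on first use.
    HuggingFace { org: String, repo: String },
    /// `file://<path>` or a bare path on the local filesystem.
    File(PathBuf),
}

/// Classify a source string by its prefix (assumed rules, for illustration).
fn parse_model_source(s: &str) -> Result<ModelSource, String> {
    if let Some(rest) = s.strip_prefix("hf:") {
        // Split at the first '/', so everything after it is treated as the repo.
        match rest.split_once('/') {
            Some((org, repo)) if !org.is_empty() && !repo.is_empty() => {
                Ok(ModelSource::HuggingFace {
                    org: org.to_string(),
                    repo: repo.to_string(),
                })
            }
            _ => Err(format!("expected hf:org/repo, got {:?}", s)),
        }
    } else if let Some(path) = s.strip_prefix("file://") {
        Ok(ModelSource::File(PathBuf::from(path)))
    } else if s.is_empty() {
        Err("empty model source".to_string())
    } else {
        // Anything without a recognised prefix is treated as a bare path.
        Ok(ModelSource::File(PathBuf::from(s)))
    }
}
```

Under these assumed rules, `file:///models/m.onnx` and `/models/m.onnx` resolve to the same local path, while `hf:` strings without an `org/repo` pair are rejected.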