Module provider

The single transport abstraction over a model backend.

An InferenceProvider does I/O only: a homogeneous batch of inputs in, a homogeneous batch of outputs plus usage out. It knows nothing about SQL tasks or output columns — framing a request and turning the response into a task’s output column is the adapter’s job. Implementors: Anthropic, an OpenAI-compatible provider (OpenAI / Azure / vLLM via base_url), and a local ONNX Runtime provider.
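The split between adapter and provider can be illustrated with a minimal sketch. Everything named here (`frame_classify`, `parse_label`, `classify_column`, the prompt wording) is hypothetical and not taken from the crate. The transport is modeled as a plain closure, so the example only shows where framing and parsing live, not how the real adapter is written.

```rust
/// Hypothetical adapter step 1: wrap one row's text in a classification prompt.
pub fn frame_classify(labels: &[&str], text: &str) -> String {
    format!(
        "Classify the text as one of [{}].\nText: {}\nLabel:",
        labels.join(", "),
        text
    )
}

/// Hypothetical adapter step 2: map raw model text back to a known label.
/// Returns `None` when the model answered with something outside the label set.
pub fn parse_label(raw: &str, labels: &[&str]) -> Option<String> {
    let cleaned = raw.trim().to_lowercase();
    labels
        .iter()
        .find(|label| label.to_lowercase() == cleaned)
        .map(|label| label.to_string())
}

/// The transport only sees strings in and strings out. It never learns that
/// the batch is a classification task or what the output column looks like.
pub fn classify_column<F>(labels: &[&str], rows: &[&str], transport: F) -> Vec<Option<String>>
where
    F: Fn(Vec<String>) -> Vec<String>,
{
    let prompts: Vec<String> = rows.iter().map(|row| frame_classify(labels, row)).collect();
    transport(prompts)
        .iter()
        .map(|raw| parse_label(raw, labels))
        .collect()
}
```

Under this arrangement, swapping one backend for another changes only the value passed as `transport`; the framing and parsing code stays untouched.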

Structs

InferenceParams: Knobs that shape a request and contribute to the cache’s params_version, so the same text under different parameters never collides. Generation knobs (max_tokens, temperature, …) are added here as backends consume them.
InferenceRequest: One batch of inputs to run through a model. A request is homogeneous: a single task, a single model, and one input string per row in order.
InferenceResponse: The result of a batch inference call: outputs aligned 1:1 with the request’s inputs, plus usage.
Usage: Token and cost accounting for a single batch call. Local backends report Usage::ZERO; remote backends report what the provider charged.
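As a rough sketch of how these structs could fit together, the block below folds the request parameters into a cache key and accumulates usage across calls. Only the type names, `params_version`, and `Usage::ZERO` come from the descriptions above. The fields, the `cache_key` helper, the `plus` method, and the use of `DefaultHasher` are assumptions made for illustration.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical field set; the real struct may carry different knobs.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct InferenceParams {
    pub max_tokens: Option<u32>,
    /// Temperature in thousandths, so the struct can derive `Eq` and `Hash`.
    pub temperature_milli: Option<u32>,
}

impl InferenceParams {
    /// Hash every knob into one value that becomes part of the cache key.
    /// `DefaultHasher` is not guaranteed stable across Rust releases, so a
    /// persistent cache would need a fixed hash function instead.
    pub fn params_version(&self) -> u64 {
        let mut hasher = DefaultHasher::new();
        self.hash(&mut hasher);
        hasher.finish()
    }
}

/// One homogeneous batch: a single task, a single model, one input per row.
#[derive(Debug, Clone)]
pub struct InferenceRequest {
    pub task: String,
    pub model: String,
    pub inputs: Vec<String>,
    pub params: InferenceParams,
}

/// Hypothetical cache key for one row: the same text under different
/// parameters produces a different key.
pub fn cache_key(request: &InferenceRequest, row: usize) -> (String, u64, String) {
    (
        request.model.clone(),
        request.params.params_version(),
        request.inputs[row].clone(),
    )
}

/// Hypothetical accounting fields.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Usage {
    pub input_tokens: u64,
    pub output_tokens: u64,
    pub cost_usd: f64,
}

impl Usage {
    /// What a local backend would report: nothing consumed, nothing charged.
    pub const ZERO: Usage = Usage { input_tokens: 0, output_tokens: 0, cost_usd: 0.0 };

    /// Sum two usage records, for example to total several batch calls.
    pub fn plus(self, other: Usage) -> Usage {
        Usage {
            input_tokens: self.input_tokens + other.input_tokens,
            output_tokens: self.output_tokens + other.output_tokens,
            cost_usd: self.cost_usd + other.cost_usd,
        }
    }
}
```

Starting an accumulator at `Usage::ZERO` lets local and remote calls be totaled with the same code path.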

Enums

InferenceOutputs: Per-row outputs of a batch. Homogeneous for a given request: a classify or generate batch yields text; an embed batch — or a local classifier’s raw logits awaiting softmax in the adapter — yields numeric vectors; a sentiment batch yields one scalar score per row (the adapter’s output, never a raw provider shape).
ProviderError: Errors a provider can return for a batch call.
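The shapes described above might look something like the following sketch. The variant names (`Text`, `Vectors`, `Scores`, `RateLimited`, `Transport`, `ShapeMismatch`) and the `check_aligned` helper are guesses for illustration, not the crate's actual definitions. The `softmax` function shows the kind of post-processing an adapter could apply to a local classifier's raw logits.

```rust
use std::fmt;

/// Hypothetical variants mirroring the three output shapes.
#[derive(Debug, Clone, PartialEq)]
pub enum InferenceOutputs {
    /// Classify or generate: one string per row.
    Text(Vec<String>),
    /// Embed, or raw classifier logits: one numeric vector per row.
    Vectors(Vec<Vec<f32>>),
    /// Sentiment: one scalar per row.
    Scores(Vec<f32>),
}

impl InferenceOutputs {
    /// Number of rows, whatever the shape.
    pub fn len(&self) -> usize {
        match self {
            InferenceOutputs::Text(rows) => rows.len(),
            InferenceOutputs::Vectors(rows) => rows.len(),
            InferenceOutputs::Scores(rows) => rows.len(),
        }
    }
}

/// Hypothetical error cases; the real enum may differ.
#[derive(Debug, PartialEq)]
pub enum ProviderError {
    RateLimited { retry_after_secs: u64 },
    Transport(String),
    ShapeMismatch { expected: usize, got: usize },
}

impl fmt::Display for ProviderError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ProviderError::RateLimited { retry_after_secs } => {
                write!(f, "rate limited, retry after {retry_after_secs}s")
            }
            ProviderError::Transport(msg) => write!(f, "transport error: {msg}"),
            ProviderError::ShapeMismatch { expected, got } => {
                write!(f, "expected {expected} output rows, got {got}")
            }
        }
    }
}

impl std::error::Error for ProviderError {}

/// Enforce the 1:1 alignment between request inputs and response outputs.
pub fn check_aligned(inputs: usize, outputs: &InferenceOutputs) -> Result<(), ProviderError> {
    if outputs.len() == inputs {
        Ok(())
    } else {
        Err(ProviderError::ShapeMismatch { expected: inputs, got: outputs.len() })
    }
}

/// Numerically stable softmax over one row of logits (adapter-side step).
pub fn softmax(logits: &[f32]) -> Vec<f32> {
    let max = logits.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = logits.iter().map(|x| (x - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|e| e / sum).collect()
}
```

Subtracting the row maximum before exponentiating keeps large logits from overflowing without changing the resulting probabilities.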

Traits

InferenceProvider: Transport over a model backend. Implementors perform I/O only — no task framing, no result parsing. Shared as Arc<dyn InferenceProvider> and driven from the Ring 1 inference worker, never from Ring 0.
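A simplified sketch of the trait and its sharing pattern follows. The method name `infer`, the synchronous signature, the pared-down request and response types, and `EchoProvider` are all assumptions; the real trait may well be async and use the full structs listed above. A plain thread stands in for the Ring 1 inference worker.

```rust
use std::sync::Arc;
use std::thread;

/// Pared-down stand-ins for the real request and response types.
pub struct InferenceRequest {
    pub model: String,
    pub inputs: Vec<String>,
}

pub struct InferenceResponse {
    pub outputs: Vec<String>,
    pub input_tokens: u64,
}

#[derive(Debug)]
pub struct ProviderError(pub String);

/// Assumed shape of the trait: one batch in, one batch out, nothing else.
/// `Send + Sync` lets an `Arc<dyn InferenceProvider>` cross thread boundaries.
pub trait InferenceProvider: Send + Sync {
    fn infer(&self, request: &InferenceRequest) -> Result<InferenceResponse, ProviderError>;
}

/// Hypothetical local backend: uppercases each input and reports zero usage.
pub struct EchoProvider;

impl InferenceProvider for EchoProvider {
    fn infer(&self, request: &InferenceRequest) -> Result<InferenceResponse, ProviderError> {
        Ok(InferenceResponse {
            // Outputs stay aligned 1:1 with the inputs, in order.
            outputs: request.inputs.iter().map(|s| s.to_uppercase()).collect(),
            input_tokens: 0,
        })
    }
}

/// Run one batch on a separate thread that stands in for the inference worker.
pub fn run_on_worker(
    provider: Arc<dyn InferenceProvider>,
    request: InferenceRequest,
) -> Result<InferenceResponse, ProviderError> {
    thread::spawn(move || provider.infer(&request))
        .join()
        .expect("worker thread panicked")
}
```

Because callers hold only an `Arc<dyn InferenceProvider>`, the worker does not need to know which backend it is driving.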
