Result cache for AI inference, keyed by (content_hash, model_id, params_version).
The key includes both the model and its parameters, so a local model and a remote model, or the same model under different parameters, never collide on the same input text. Local results are deterministic, so they can be cached permanently without affecting correctness; remote results are cached to save cost. The cache itself does not distinguish the two; that policy lives in the caller.
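As a rough illustration of why the three-part key avoids collisions, here is a minimal sketch. `SketchKey`, its integer field widths, and the `HashMap` standing in for the real cache are all assumptions made for this example; they are not the crate's actual definitions.

```rust
use std::collections::HashMap;

/// Hypothetical key shape. The field names follow the module description;
/// the integer widths are assumptions made for this sketch.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct SketchKey {
    pub content_hash: u128,
    pub model_id: u32,
    pub params_version: u64,
}

/// Inserts results for one input under three (model, params) combinations
/// and returns how many distinct entries the map ends up holding.
pub fn distinct_entries() -> usize {
    // A std HashMap stands in for the real cache here.
    let mut cache: HashMap<SketchKey, &'static str> = HashMap::new();

    let local = SketchKey { content_hash: 0xfeed, model_id: 1, params_version: 7 };
    // Same input text, different model.
    let remote = SketchKey { model_id: 2, ..local };
    // Same input text and model, different parameters.
    let relabeled = SketchKey { params_version: 8, ..local };

    cache.insert(local, "positive");
    cache.insert(remote, "neutral");
    cache.insert(relabeled, "spam");
    cache.len()
}
```

Because every field is `Copy`, the key can be built and passed by value, which matches the no-allocation property described for the cache key.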
The cache is an in-memory foyer::Cache with S3-FIFO eviction, the same
crate the lookup and schema-registry caches use. A lookup is a memory
operation: cheap enough that the operator can use it to gate the inference
worker without doing the model call inline.
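The gating pattern might look roughly like the sketch below: check the cache for every row, answer hits immediately, and hand only the misses to the worker. The `Key` alias, the `split_batch` function, and the `HashMap` used in place of foyer::Cache are hypothetical stand-ins, not the crate's API.

```rust
use std::collections::HashMap;

/// Stand-in for the real key: (content_hash, model_id, params_version).
pub type Key = (u128, u32, u64);

/// Splits a batch into rows the cache can already answer and rows that
/// still need a model call. Returns (hits, misses), where each hit carries
/// the row index and its cached output, and each miss is a row index.
pub fn split_batch(
    cache: &HashMap<Key, String>,
    keys: &[Key],
) -> (Vec<(usize, String)>, Vec<usize>) {
    let mut hits = Vec::new();
    let mut misses = Vec::new();
    for (row, key) in keys.iter().enumerate() {
        match cache.get(key) {
            // Cached: no inference needed for this row.
            Some(output) => hits.push((row, output.clone())),
            // Not cached: this row goes to the inference worker.
            None => misses.push(row),
        }
    }
    (hits, misses)
}
```

Only the `misses` list would be forwarded to the worker; the caller never blocks on a model call while doing the lookup.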
Structs
- AiCacheKey - Cache key. All fields are Copy, so lookups need no allocation and no borrowed-key indirection.
- AiResultCache - foyer-backed in-memory cache of per-row inference results.
- AiResultCacheConfig - Configuration for AiResultCache.
Enums
- CachedOutput - One row’s cached inference output. Mirrors the per-row shape of crate::provider::InferenceOutputs but singular, since the cache is keyed per row of input.
Functions
- content_hash - xxh3-128 of the input content. Not cryptographic; a fast, collision-negligible key for a result cache.
- params_version - A stable hash of the parameters that change a model’s output for the same input (currently the candidate label set). Computed once per batch (all rows in a batch share parameters), not per row.
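To illustrate the "once per batch" idea, the sketch below hashes a label set a single time and reuses the result for every row's key. The source does not say which hash params_version uses, so this example swaps in a hand-written 64-bit FNV-1a purely because it is stable across runs and needs no dependencies. The names `params_version_sketch` and `batch_keys` are hypothetical, and the sketch is order-sensitive; whether the real function normalizes label order is not stated.

```rust
/// 64-bit FNV-1a constants (an assumed stand-in for the crate's real hash).
const FNV_OFFSET: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// One FNV-1a step: xor the byte in, then multiply by the prime.
fn fnv_step(h: u64, byte: u8) -> u64 {
    (h ^ u64::from(byte)).wrapping_mul(FNV_PRIME)
}

/// Hashes the candidate labels in the order given. A 0xff byte, which never
/// occurs in valid UTF-8, terminates each label so that ["ab"] and
/// ["a", "b"] feed different byte streams into the hash.
pub fn params_version_sketch(labels: &[&str]) -> u64 {
    let mut h = FNV_OFFSET;
    for label in labels {
        for &b in label.as_bytes() {
            h = fnv_step(h, b);
        }
        h = fnv_step(h, 0xff);
    }
    h
}

/// Builds (content_hash, model_id, params_version) keys for a batch,
/// hashing the label set once rather than once per row.
pub fn batch_keys(
    content_hashes: &[u128],
    model_id: u32,
    labels: &[&str],
) -> Vec<(u128, u32, u64)> {
    let version = params_version_sketch(labels); // computed once per batch
    content_hashes
        .iter()
        .map(|&content| (content, model_id, version))
        .collect()
}
```

Changing any label changes the version, so results cached under the old label set are simply never looked up again rather than being explicitly invalidated.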