Module cache


Result cache for AI inference, keyed by (content_hash, model_id, params_version).

The key includes both the model and a version of its parameters, so a local model and a remote model, or the same model under different parameters, never collide on the same input text. Local results are deterministic, so they can be cached permanently without affecting correctness; remote results are cached to save cost. The cache itself does not distinguish the two; that policy lives in the caller.
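The no-collision property can be illustrated with a small sketch. SketchKey below is a hypothetical stand-in for AiCacheKey: the field names come from the key tuple above, but the field types (u128, u32, u64) are assumptions, and a std HashMap stands in for the real cache.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for `AiCacheKey`. Field types are assumed,
/// chosen only so that every field is `Copy`.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct SketchKey {
    pub content_hash: u128,
    pub model_id: u32,
    pub params_version: u64,
}

/// Inserts three results for the *same* input content and returns the map.
/// Each (model, parameters) combination gets its own entry.
pub fn demo_distinct_entries() -> HashMap<SketchKey, &'static str> {
    let content_hash = 0xDEAD_BEEF_u128; // same input text in all three keys
    let mut map = HashMap::new();
    map.insert(
        SketchKey { content_hash, model_id: 1, params_version: 7 },
        "local model, labels v7",
    );
    map.insert(
        SketchKey { content_hash, model_id: 2, params_version: 7 },
        "remote model, labels v7",
    );
    map.insert(
        SketchKey { content_hash, model_id: 1, params_version: 8 },
        "local model, labels v8",
    );
    map
}
```

Because all three components take part in equality and hashing, changing any one of them yields a different entry even when the input text is identical.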

The cache is an in-memory foyer::Cache with S3-FIFO eviction, from the same crate the lookup and schema-registry caches use. A lookup is a memory operation, cheap enough for the operator to use it as a gate in front of the inference worker without running the model call inline.
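One way to picture the gating step is a function that splits a batch into rows already answered by the cache and rows that still need the worker. This is a sketch under stated assumptions, not the module's code: a std HashMap stands in for foyer::Cache, String stands in for CachedOutput, and SketchKey and gate_batch are hypothetical names.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for `AiCacheKey`; field types are assumed.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct SketchKey {
    pub content_hash: u128,
    pub model_id: u32,
    pub params_version: u64,
}

/// Splits a batch into cache hits and misses by row index.
/// Hits carry the cached output; misses are the rows that would be
/// forwarded to the inference worker. No model call happens here.
pub fn gate_batch(
    cache: &HashMap<SketchKey, String>,
    keys: &[SketchKey],
) -> (Vec<(usize, String)>, Vec<usize>) {
    let mut hits = Vec::new();
    let mut misses = Vec::new();
    for (row, key) in keys.iter().enumerate() {
        match cache.get(key) {
            Some(output) => hits.push((row, output.clone())),
            None => misses.push(row),
        }
    }
    (hits, misses)
}
```

Only the rows in the miss list cost a model call; once their results arrive, the caller would insert them under the same keys.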

Structs

AiCacheKey
Cache key. All fields are Copy, so lookups need no allocation and no borrowed-key indirection.
AiResultCache
foyer-backed in-memory cache of per-row inference results.
AiResultCacheConfig
Configuration for AiResultCache.

Enums

CachedOutput
One row’s cached inference output. Mirrors the per-row shape of crate::provider::InferenceOutputs but singular, since the cache is keyed per row of input.

Functions

content_hash
xxh3-128 of the input content. Not cryptographic; a fast, collision-negligible key for a result cache.
params_version
A stable hash of the parameters that change a model’s output for the same input — currently the candidate label set. Computed once per batch (all rows in a batch share parameters), not per row.
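The "once per batch" pattern for params_version might look like the sketch below. Everything here is an assumption made for illustration: FNV-1a is a stand-in for whatever stable hash the module really uses, the label set is treated as unordered (labels are sorted before hashing), and params_version_sketch and keys_for_batch are hypothetical names. The row hashes are taken as given, since xxh3-128 is not available in the standard library.

```rust
const FNV_OFFSET: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// FNV-1a (64-bit), continued from `state`. Stand-in stable hash:
/// its output depends only on the bytes, not on the Rust release.
fn fnv1a64(state: u64, bytes: &[u8]) -> u64 {
    let mut h = state;
    for &b in bytes {
        h ^= b as u64;
        h = h.wrapping_mul(FNV_PRIME);
    }
    h
}

/// Hypothetical parameter-version hash over the candidate label set.
/// Assumption: label order does not affect model output, so labels are sorted.
/// Each label is length-prefixed so ["ab", "c"] and ["a", "bc"] hash differently.
pub fn params_version_sketch(labels: &[&str]) -> u64 {
    let mut sorted: Vec<&str> = labels.to_vec();
    sorted.sort_unstable();
    let mut h = FNV_OFFSET;
    for label in sorted {
        h = fnv1a64(h, &(label.len() as u64).to_le_bytes());
        h = fnv1a64(h, label.as_bytes());
    }
    h
}

/// Builds (content_hash, model_id, params_version) tuples for a batch.
/// The parameter hash is computed once and copied into every row's key.
pub fn keys_for_batch(
    row_hashes: &[u128],
    model_id: u32,
    labels: &[&str],
) -> Vec<(u128, u32, u64)> {
    let version = params_version_sketch(labels); // once per batch, not per row
    row_hashes.iter().map(|&h| (h, model_id, version)).collect()
}
```

A hand-rolled hash is used instead of std's DefaultHasher because the standard library does not guarantee that hasher's algorithm across releases, which would make the version unstable.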