Description
wrapAISDKModel(model, { caching: false }) does not actually disable caching. When caching: false is set, the wrapper still reads from and writes to the evalite cache server — the flag is silently ignored as long as tracing is enabled (which it is by default).
This is misleading because the API suggests opt-out is supported, but the only way to truly bypass the cache today is to also set tracing: false (which makes the wrapper return the unwrapped model) or to delete node_modules/.evalite/ between runs.
The impact is that users who want fresh, uncached results (e.g. for cross-model baselines, or to measure sampling variability across trialCount > 1) silently get cached responses. Aggregate cost / latency / token-usage stats are also wrong: cached trials report tokens.input = 0 and latency < 100ms, dragging the aggregates down without the user knowing.
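A rough illustration of the skew, using made-up numbers. The `Trial` shape and the values are hypothetical and not evalite's data model; the point is only that two cached trials out of three cut the mean latency and mean input-token count to a third.

```typescript
// Hypothetical trial records; field names and values are illustrative only.
interface Trial {
  inputTokens: number;
  latencyMs: number;
}

function mean(values: number[]): number {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Three trials that all hit the API.
const fresh: Trial[] = [
  { inputTokens: 12, latencyMs: 800 },
  { inputTokens: 12, latencyMs: 900 },
  { inputTokens: 12, latencyMs: 1000 },
];

// Same run, but trials 2 and 3 were served from the cache.
const withCacheHits: Trial[] = [
  { inputTokens: 12, latencyMs: 800 },
  { inputTokens: 0, latencyMs: 50 },
  { inputTokens: 0, latencyMs: 50 },
];

const freshLatency = mean(fresh.map((t) => t.latencyMs)); // 900
const skewedLatency = mean(withCacheHits.map((t) => t.latencyMs)); // 300
const freshTokens = mean(fresh.map((t) => t.inputTokens)); // 12
const skewedTokens = mean(withCacheHits.map((t) => t.inputTokens)); // 4
```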
Version
evalite@1.0.0-beta.16
- Bug also present at the tip of branch v1 (verified by reading packages/evalite/src/ai-sdk.ts, same code path)
Reproduction
Minimal eval file:
// repro.eval.ts
import { evalite } from "evalite";
import { wrapAISDKModel } from "evalite/ai-sdk";
import { openai } from "@ai-sdk/openai";
import { generateText } from "ai";

evalite("caching-false-repro", {
  data: () => [{ input: "say hi", expected: "hi" }],
  task: async (input) => {
    const model = wrapAISDKModel(openai("gpt-4o-mini"), { caching: false });
    const { text, usage } = await generateText({ model, prompt: input });
    console.log("usage:", usage);
    return text;
  },
  scorers: [() => ({ score: 1, name: "noop" })],
});
Run twice in a row:
npx evalite run repro.eval.ts # first run — real API call
npx evalite run repro.eval.ts # second run — should be a real call with caching:false
Expected vs. Actual
Expected: both runs hit the API. usage.inputTokens > 0 on both, similar latency.
Actual: second run returns the cached response. usage.inputTokens === 0, latency under 100ms, no API call made.
Root cause
In packages/evalite/src/ai-sdk.ts, wrapAISDKModel reads the enableCaching flag at the top:
const enableCaching = options?.caching ?? true;
…but inside wrapGenerate and wrapStream, the cache fetch and store are gated only on the existence of cacheContext, not on enableCaching:
// wrapGenerate
const cacheContext = getCacheContext();
if (cacheContext) { // ← enableCaching not checked
  // fetch from cache server
}
// ...
if (!result) {
  result = await opts.doGenerate();
  if (cacheContext) { // ← enableCaching not checked
    // store in cache server
  }
}
Same pattern in wrapStream. The flag is parsed but only used by the early-return if (!enableCaching && !enableTracing) return model; — meaning caching: false is only respected when tracing is also disabled.
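To spell out which option combinations actually bypass the cache today, here is a simplified model of the logic described above. It is not the evalite source: `cacheIsUsedToday` and `WrapOptions` are hypothetical names, and the model assumes a cache context is active, as it is during an evalite run.

```typescript
// Simplified model of the option handling described above; not the actual evalite source.
interface WrapOptions {
  caching?: boolean;
  tracing?: boolean;
}

// Returns whether the cache would be consulted under the current logic,
// assuming a cache context is active.
function cacheIsUsedToday(options?: WrapOptions): boolean {
  const enableCaching = options?.caching ?? true;
  const enableTracing = options?.tracing ?? true;

  // Early return: the model is handed back unwrapped, so there is no cache access.
  if (!enableCaching && !enableTracing) return false;

  // Otherwise the middleware runs and only checks for a cache context,
  // so enableCaching has no further effect.
  return true;
}

const cases: WrapOptions[] = [
  {},
  { caching: false },
  { tracing: false },
  { caching: false, tracing: false },
];

// Only the last combination avoids the cache.
const results = cases.map((options) => cacheIsUsedToday(options));
// [true, true, true, false]
```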
Suggested fix
Gate the four cache fetch/store blocks on enableCaching in addition to cacheContext:
-if (cacheContext) {
+if (cacheContext && enableCaching) {
Four sites total (2 in wrapGenerate, 2 in wrapStream). Tracing path is unaffected. Diff is ~4 lines. I have a working patch validated locally — happy to send a PR if you'd like.
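For illustration, a self-contained sketch of the combined guard. This is a synchronous stand-in I wrote for this issue, not the actual middleware: `makeGenerate`, `fakeProvider`, and the `Map`-backed `CacheContext` are hypothetical, and only `enableCaching` and `cacheContext` mirror names from ai-sdk.ts.

```typescript
// Simplified, synchronous stand-in for the middleware. All names besides
// enableCaching and cacheContext are hypothetical.
interface CacheContext {
  store: Map<string, string>;
}

function makeGenerate(
  enableCaching: boolean,
  cacheContext: CacheContext | undefined,
  doGenerate: (prompt: string) => string,
): (prompt: string) => string {
  return (prompt) => {
    let result: string | undefined;

    // Fetch: requires both a cache context and the opt-in flag.
    if (cacheContext && enableCaching) {
      result = cacheContext.store.get(prompt);
    }

    if (result === undefined) {
      result = doGenerate(prompt);
      // Store: same combined guard.
      if (cacheContext && enableCaching) {
        cacheContext.store.set(prompt, result);
      }
    }
    return result;
  };
}

// Count how often the fake provider is actually called.
let apiCalls = 0;
const fakeProvider = (prompt: string): string => {
  apiCalls += 1;
  return `echo: ${prompt}`;
};

const ctx: CacheContext = { store: new Map() };

// caching: false => every call reaches the provider and nothing is stored.
const uncached = makeGenerate(false, ctx, fakeProvider);
uncached("say hi");
uncached("say hi");
const callsWithCachingOff = apiCalls; // 2
const entriesWithCachingOff = ctx.store.size; // 0

// caching: true => the second call is served from the cache.
apiCalls = 0;
const cached = makeGenerate(true, ctx, fakeProvider);
cached("say hi");
cached("say hi");
const callsWithCachingOn = apiCalls; // 1
```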
Additional note (potential follow-up, not part of this issue)
The CLI flag --no-cache and the cache: false config option set cacheEnabled = false in run-evalite.ts, but that flag only short-circuits the reportCacheHit callback in evalite.ts:240. The cacheContext object itself is still posted into cacheContextLocalStorage, so wrapGenerate still fetches/stores. Fixing the local caching: false flag (this issue) doesn't fix --no-cache — that would need either propagating cacheEnabled into the context, or skipping the cacheContextLocalStorage.enterWith() call entirely when caching is disabled. Happy to file a separate issue if you'd prefer to track that work distinctly.
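A minimal sketch of the second option (not entering the context when caching is disabled), assuming Node's `AsyncLocalStorage`. The `CacheContext` shape, `runWithOptionalCache`, and `seesContext` are hypothetical, and the sketch uses `run()` rather than the `enterWith()` call mentioned above so that it stays self-contained.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// Hypothetical context shape; the real cacheContext fields are not shown in this issue.
interface CacheContext {
  serverUrl: string;
}

const cacheContextLocalStorage = new AsyncLocalStorage<CacheContext>();

// Only enter the cache context when caching is enabled, so downstream
// code that checks for a context sees none under --no-cache.
function runWithOptionalCache<T>(
  cacheEnabled: boolean,
  context: CacheContext,
  fn: () => T,
): T {
  if (!cacheEnabled) return fn();
  return cacheContextLocalStorage.run(context, fn);
}

// What a wrapper would observe when it looks up the context.
const seesContext = (): boolean =>
  cacheContextLocalStorage.getStore() !== undefined;

const exampleContext: CacheContext = { serverUrl: "http://localhost:0" };
const withCache = runWithOptionalCache(true, exampleContext, seesContext); // true
const withoutCache = runWithOptionalCache(false, exampleContext, seesContext); // false
```

With this shape the existing `if (cacheContext)` checks would skip the cache on their own, at the cost of hiding the context from any code that needs it for something other than caching.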
Related
cacheEnabled Setting #354: similar pattern (cache directory created regardless of cacheEnabled), fixed by guarding the relevant code path on the resolved flag