
Bug: wrapAISDKModel({ caching: false }) is silently ignored when tracing is enabled #395

@Sraleik

Description

wrapAISDKModel(model, { caching: false }) does not actually disable caching. When caching: false is set, the wrapper still reads from and writes to the evalite cache server — the flag is silently ignored as long as tracing is enabled (which it is by default).

This is misleading because the API suggests opt-out is supported, but the only way to truly bypass the cache today is to also set tracing: false (which makes the wrapper return the unwrapped model) or to delete node_modules/.evalite/ between runs.

The impact is that users who want fresh, uncached results (e.g. for cross-model baselines, or to measure sampling variability across trialCount > 1) silently get cached responses. Aggregate cost / latency / token-usage stats are also wrong: cached trials report tokens.input = 0 and latency < 100ms, dragging the aggregates down without the user knowing.

Version

  • evalite@1.0.0-beta.16
  • Bug also present at the tip of branch v1 (verified by reading packages/evalite/src/ai-sdk.ts, same code path)

Reproduction

Minimal eval file:

// repro.eval.ts
import { evalite } from "evalite";
import { wrapAISDKModel } from "evalite/ai-sdk";
import { openai } from "@ai-sdk/openai";
import { generateText } from "ai";

evalite("caching-false-repro", {
  data: () => [{ input: "say hi", expected: "hi" }],
  task: async (input) => {
    const model = wrapAISDKModel(openai("gpt-4o-mini"), { caching: false });
    const { text, usage } = await generateText({ model, prompt: input });
    console.log("usage:", usage);
    return text;
  },
  scorers: [() => ({ score: 1, name: "noop" })],
});

Run twice in a row:

npx evalite run repro.eval.ts   # first run — real API call
npx evalite run repro.eval.ts   # second run — should be a real call with caching:false

Expected vs. Actual

Expected: both runs hit the API. usage.inputTokens > 0 on both, similar latency.

Actual: second run returns the cached response. usage.inputTokens === 0, latency under 100ms, no API call made.

Root cause

In packages/evalite/src/ai-sdk.ts, wrapAISDKModel derives enableCaching from the caching option at the top:

const enableCaching = options?.caching ?? true;

…but inside wrapGenerate and wrapStream, the cache fetch and store are gated only on the existence of cacheContext, not on enableCaching:

// wrapGenerate
const cacheContext = getCacheContext();
if (cacheContext) {                          // ← enableCaching not checked
  // fetch from cache server
}
// ...
if (!result) {
  result = await opts.doGenerate();
  if (cacheContext) {                        // ← enableCaching not checked
    // store in cache server
  }
}

Same pattern in wrapStream. The flag is parsed but only used by the early-return if (!enableCaching && !enableTracing) return model; — meaning caching: false is only respected when tracing is also disabled.
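
To make the flag interaction concrete, here is a minimal model of the decision logic described above. It is a sketch of the behavior as reported in this issue, not evalite's source; cacheConsulted and hasCacheContext are hypothetical names introduced for illustration.

```typescript
// Models the decision logic described in this issue; not evalite's actual code.
// `cacheConsulted` and `hasCacheContext` are hypothetical names.
function cacheConsulted(
  enableCaching: boolean,
  enableTracing: boolean,
  hasCacheContext: boolean,
): boolean {
  // Early return: the unwrapped model comes back only when BOTH flags are off,
  // so no middleware (and no cache access) exists in that case.
  if (!enableCaching && !enableTracing) return false;
  // Otherwise the middleware runs, and its cache blocks test cacheContext alone.
  return hasCacheContext;
}

// Print the four flag combinations, assuming a cache context is present.
for (const caching of [true, false]) {
  for (const tracing of [true, false]) {
    const used = cacheConsulted(caching, tracing, true);
    console.log(`caching=${caching} tracing=${tracing} -> cache consulted: ${used}`);
  }
}
```

The caching=false, tracing=true row is the buggy case: the cache is still consulted even though the caller opted out.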

Suggested fix

Gate the four cache fetch/store blocks on enableCaching in addition to cacheContext:

-if (cacheContext) {
+if (cacheContext && enableCaching) {

Four sites total (2 in wrapGenerate, 2 in wrapStream). Tracing path is unaffected. Diff is ~4 lines. I have a working patch validated locally — happy to send a PR if you'd like.
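
For illustration, here is a self-contained sketch of the gated fetch/store pattern. It is not the actual patch: CacheContext, GenerateResult, and makeGenerate are hypothetical stand-ins, the cache is an in-memory Map rather than the evalite cache server, and the calls are synchronous for brevity where the real doGenerate is async.

```typescript
// Hypothetical stand-ins for illustration; not evalite's real types or helpers.
type CacheContext = { store: Map<string, string> };
type GenerateResult = { text: string; fromCache: boolean };

function makeGenerate(
  doGenerate: (prompt: string) => string,
  getCacheContext: () => CacheContext | undefined,
  options?: { caching?: boolean },
): (prompt: string) => GenerateResult {
  const enableCaching = options?.caching ?? true;

  return (prompt) => {
    const cacheContext = getCacheContext();

    // Fetch site: require the per-model flag as well as the context.
    if (cacheContext && enableCaching) {
      const hit = cacheContext.store.get(prompt);
      if (hit !== undefined) return { text: hit, fromCache: true };
    }

    const text = doGenerate(prompt);

    // Store site: same gate, so caching: false never writes to the cache either.
    if (cacheContext && enableCaching) {
      cacheContext.store.set(prompt, text);
    }
    return { text, fromCache: false };
  };
}

// Usage: with caching: false, two identical prompts cause two model calls
// and nothing is written to the cache.
let calls = 0;
const context: CacheContext = { store: new Map() };
const uncached = makeGenerate(
  () => {
    calls += 1;
    return "hi";
  },
  () => context,
  { caching: false },
);
uncached("say hi");
uncached("say hi");
console.log(calls, context.store.size); // prints "2 0"
```

Gating the store site as well as the fetch site matters: otherwise a caching: false run would still populate the cache for later runs that leave caching on.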

Additional note (potential follow-up, not part of this issue)

The CLI flag --no-cache and the cache: false config option set cacheEnabled = false in run-evalite.ts, but that flag only short-circuits the reportCacheHit callback in evalite.ts:240. The cacheContext object itself is still placed into cacheContextLocalStorage, so wrapGenerate still fetches and stores. Fixing the local caching: false flag (this issue) therefore doesn't fix --no-cache. That would need either propagating cacheEnabled into the context, or skipping the cacheContextLocalStorage.enterWith() call entirely when caching is disabled. Happy to file a separate issue if you'd prefer to track that work separately.
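
The first option, propagating cacheEnabled into the context, might look roughly like the sketch below. The CacheContext shape, its cacheEnabled field, and shouldUseCache are assumptions for illustration; the real context object in evalite may carry different fields.

```typescript
// Hypothetical shape: the real cacheContext in evalite may look different.
type CacheContext = {
  cacheEnabled: boolean; // assumed field, set from --no-cache / cache: false
};

// The cache is used only when a context exists, the run-level switch is on,
// and the per-model `caching` option has not opted out.
function shouldUseCache(
  cacheContext: CacheContext | undefined,
  enableCaching: boolean,
): boolean {
  return cacheContext !== undefined && cacheContext.cacheEnabled && enableCaching;
}

console.log(shouldUseCache({ cacheEnabled: false }, true)); // false: run-level opt-out
console.log(shouldUseCache({ cacheEnabled: true }, false)); // false: per-model opt-out
console.log(shouldUseCache({ cacheEnabled: true }, true)); // true
console.log(shouldUseCache(undefined, true)); // false: no context at all
```

With a single predicate like this, either opt-out is enough to bypass the cache, and the four fetch/store sites would only need to call it instead of testing cacheContext directly.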
