Bug: `wrapAISDKModel({ caching: false })` is silently ignored when tracing is enabled #395

## Description

`wrapAISDKModel(model, { caching: false })` does not disable caching. With `caching: false` set, the wrapper still reads from and writes to the evalite cache server. The flag is silently ignored whenever tracing is enabled, which is the default.

This is misleading because the API suggests opt-out is supported, but the only way to truly bypass the cache today is to also set `tracing: false` (which makes the wrapper return the unwrapped model) or to delete `node_modules/.evalite/` between runs.

The impact is that users who want fresh, uncached results (e.g. for cross-model baselines, or to measure sampling variability across `trialCount > 1`) silently get cached responses. Aggregate cost / latency / token-usage stats are also wrong: cached trials report `tokens.input = 0` and `latency < 100ms`, dragging the aggregates down without the user knowing.

## Version

- `evalite@1.0.0-beta.16`
- Bug also present at the tip of branch `v1` (verified by reading `packages/evalite/src/ai-sdk.ts`, same code path)

## Reproduction

Minimal eval file:

```ts
// repro.eval.ts
import { evalite } from "evalite";
import { wrapAISDKModel } from "evalite/ai-sdk";
import { openai } from "@ai-sdk/openai";
import { generateText } from "ai";

evalite("caching-false-repro", {
  data: () => [{ input: "say hi", expected: "hi" }],
  task: async (input) => {
    const model = wrapAISDKModel(openai("gpt-4o-mini"), { caching: false });
    const { text, usage } = await generateText({ model, prompt: input });
    console.log("usage:", usage);
    return text;
  },
  scorers: [() => ({ score: 1, name: "noop" })],
});
```

Run twice in a row:

```bash
npx evalite run repro.eval.ts   # first run — real API call
npx evalite run repro.eval.ts   # second run — should be a real call with caching:false
```

## Expected vs. Actual

**Expected:** both runs hit the API. `usage.inputTokens > 0` on both, similar latency.

**Actual:** second run returns the cached response. `usage.inputTokens === 0`, latency under 100ms, no API call made.

## Root cause

In `packages/evalite/src/ai-sdk.ts`, `wrapAISDKModel` resolves the `caching` option into an `enableCaching` flag at the top:

```ts
const enableCaching = options?.caching ?? true;
```

…but inside `wrapGenerate` and `wrapStream`, the cache fetch and store are gated **only** on the existence of `cacheContext`, not on `enableCaching`:

```ts
// wrapGenerate
const cacheContext = getCacheContext();
if (cacheContext) {                          // ← enableCaching not checked
  // fetch from cache server
}
// ...
if (!result) {
  result = await opts.doGenerate();
  if (cacheContext) {                        // ← enableCaching not checked
    // store in cache server
  }
}
```

The same pattern appears in `wrapStream`. The flag is parsed, but its only use is the early return `if (!enableCaching && !enableTracing) return model;`, so `caching: false` is respected only when tracing is also disabled.
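To make the control flow concrete, here is a minimal, self-contained model of it. This is a sketch of the logic described above, not the evalite source: `cacheIsConsulted` and `hasCacheContext` are hypothetical names introduced for illustration, and the `tracing` default of `true` follows the behaviour noted in the description.

```ts
// Simplified model of the control flow described above; NOT the evalite source.
// `cacheIsConsulted` and `hasCacheContext` are hypothetical names.
type Options = { caching?: boolean; tracing?: boolean };

// Returns true when the wrapper would fetch from / store to the cache
// for the given options and cache-context presence.
function cacheIsConsulted(
  options: Options | undefined,
  hasCacheContext: boolean,
): boolean {
  const enableCaching = options?.caching ?? true;
  const enableTracing = options?.tracing ?? true;

  // Early return: the unwrapped model is handed back, so no middleware runs.
  if (!enableCaching && !enableTracing) return false;

  // Inside wrapGenerate / wrapStream only the cache context is checked;
  // enableCaching plays no part in the decision.
  return hasCacheContext;
}

console.log(cacheIsConsulted({ caching: false }, true)); // true
console.log(cacheIsConsulted({ caching: false, tracing: false }, true)); // false
```

The first call shows the bug: with tracing left at its default, `caching: false` does not stop the cache from being used.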

## Suggested fix

Gate the four cache fetch/store blocks on `enableCaching` in addition to `cacheContext`:

```diff
-if (cacheContext) {
+if (cacheContext && enableCaching) {
```

Four sites total (2 in `wrapGenerate`, 2 in `wrapStream`). Tracing path is unaffected. Diff is ~4 lines. I have a working patch validated locally — happy to send a PR if you'd like.
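For reference, the intended behaviour after the fix can be sketched with an in-memory stand-in for the cache server. Everything here except the `cacheContext && enableCaching` condition is an assumption made for illustration: `CacheContext`, `makeGenerate`, and `callModel` are hypothetical, and the real middleware is asynchronous, whereas this model is synchronous to keep it short.

```ts
// Hypothetical stand-ins; only the `cacheContext && enableCaching` gate
// mirrors the proposed diff.
type CacheContext = { store: Map<string, string> };

function makeGenerate(
  enableCaching: boolean,
  cacheContext: CacheContext | undefined,
  callModel: (prompt: string) => string,
) {
  return (prompt: string): string => {
    // Fetch site: gated on both the context and the flag.
    if (cacheContext && enableCaching) {
      const hit = cacheContext.store.get(prompt);
      if (hit !== undefined) return hit;
    }
    const result = callModel(prompt);
    // Store site: gated the same way.
    if (cacheContext && enableCaching) {
      cacheContext.store.set(prompt, result);
    }
    return result;
  };
}

let calls = 0;
const fakeModel = (p: string) => `reply #${++calls} to ${p}`;
const ctx: CacheContext = { store: new Map() };

const uncached = makeGenerate(false, ctx, fakeModel);
uncached("say hi");
uncached("say hi");
console.log(calls); // 2

const cached = makeGenerate(true, ctx, fakeModel);
cached("say hi");
cached("say hi");
console.log(calls); // 3
```

With `enableCaching` false, both calls reach the model and nothing is written to the store. With it true, the second call is served from the store. A regression test for the real fix could follow the same shape.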

## Additional note (potential follow-up, not part of this issue)

The CLI flag `--no-cache` and the `cache: false` config option set `cacheEnabled = false` in `run-evalite.ts`, but that flag only short-circuits the `reportCacheHit` callback in `evalite.ts:240`. The `cacheContext` object itself is still entered into `cacheContextLocalStorage`, so `wrapGenerate` still fetches and stores. Fixing the local `caching: false` flag (this issue) does not fix `--no-cache`. That would need either propagating `cacheEnabled` into the context, or skipping the `cacheContextLocalStorage.enterWith()` call entirely when caching is disabled. Happy to file a separate issue if you'd prefer to track that work distinctly.
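The two follow-up options could look roughly like this. This is a sketch built on Node's `AsyncLocalStorage`, not the evalite code: the `CacheContext` shape, `getActiveCacheContext`, and `runTask` are hypothetical, and it uses `run()` rather than `enterWith()` so that each example stays scoped to its callback.

```ts
import { AsyncLocalStorage } from "node:async_hooks";

// Hypothetical shape; the real cacheContext in evalite may carry other fields.
type CacheContext = { cacheEnabled: boolean };

const cacheContextLocalStorage = new AsyncLocalStorage<CacheContext>();

// Option A: propagate the resolved flag and have the reader honour it.
function getActiveCacheContext(): CacheContext | undefined {
  const ctx = cacheContextLocalStorage.getStore();
  return ctx?.cacheEnabled ? ctx : undefined;
}

// Option B: never enter the context when caching is disabled.
function runTask<T>(cacheEnabled: boolean, task: () => T): T {
  if (!cacheEnabled) return task();
  return cacheContextLocalStorage.run({ cacheEnabled }, task);
}

// Option A: context is present but flagged off, so the reader sees nothing.
const sawA = cacheContextLocalStorage.run({ cacheEnabled: false }, () =>
  getActiveCacheContext(),
);
console.log(sawA); // undefined

// Option B: context was never entered.
const sawB = runTask(false, () => cacheContextLocalStorage.getStore());
console.log(sawB); // undefined

// Caching enabled: the context is visible as usual.
const sawOn = runTask(true, () => getActiveCacheContext());
console.log(sawOn); // { cacheEnabled: true }
```

Option B keeps the middleware unchanged, since an absent context already disables the fetch and store blocks. Option A keeps the context available for any other consumers that may rely on it.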

## Related

- #354 — similar pattern (cache directory created regardless of `cacheEnabled`), fixed by guarding the relevant code path on the resolved flag
- #317 — original PR introducing the AI SDK caching layer

