standalone L1 extraction can fail with DeepSeek due to thinking/tool interference #58

## Summary

In standalone/gateway mode, L1 extraction can fail against DeepSeek-compatible chat APIs even when the model itself is capable of returning valid JSON.

I reproduced this with DeepSeek chat-style endpoints and saw L1 extraction produce one of the following:

- empty output (`output=0 chars`)
- non-JSON assistant text
- an API error: `The reasoning_content in the thinking mode must be passed back to the API.`

This leaves L0 recorded successfully while L1 stays empty for the same session.

## Reproduction

Environment:

- mode: standalone gateway
- LLM: DeepSeek-compatible OpenAI API
- model: `deepseek-chat`
- path: TDAI L1 extraction (`StandaloneLLMRunner` with `enableTools=false`)

Observed behavior:

1. `POST /capture` records L0 successfully.
2. L1 runner starts for the same session.
3. L1 returns empty or non-JSON output, or fails with the `reasoning_content` error.
4. No L1 records are stored.

Representative logs:

```text
[memory-tdai][l1-extractor] No JSON array found in extraction response
[memory-tdai][l1-extractor] [l1-debug] NO_JSON taskId=l1-extraction, rawLen=0, cleanedLen=0, rawFull=""
[memory-tdai] [standalone-runner] run() failed after 13911ms: The `reasoning_content` in the thinking mode must be passed back to the API.
```

## What seems to be happening

Two details in the standalone runner make L1 extraction brittle for DeepSeek:

1. When `enableTools=false`, the runner still exposes a read-only tool (`read_file`).
   For L1 extraction this is undesirable because the prompt expects pure JSON text output, but the model may still opportunistically attempt tool use.

2. There is no way on `origin/main` to configure standalone gateway requests with `disableThinking`, so DeepSeek-compatible backends may enter reasoning/thinking mode and produce output that is not compatible with the current L1 JSON parser.
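To illustrate why both cases end in `No JSON array found`, here is a minimal sketch of a bracket-based array extractor. `extractJsonArray` is a hypothetical stand-in, not the actual l1-extractor code; it only assumes the parser looks for a JSON array somewhere in the assistant text.

```typescript
// Hypothetical stand-in for the L1 parser: find the outermost [...] span
// in the assistant text and try to parse it as a JSON array.
function extractJsonArray(raw: string): unknown[] | null {
  const start = raw.indexOf("[");
  const end = raw.lastIndexOf("]");
  if (start === -1 || end <= start) {
    return null; // empty output or plain prose: nothing to parse
  }
  try {
    const parsed: unknown = JSON.parse(raw.slice(start, end + 1));
    return Array.isArray(parsed) ? parsed : null;
  } catch {
    return null; // brackets present, but the span is not valid JSON
  }
}

// Empty content (e.g. the model attempted a tool call or only produced reasoning)
const emptyResult = extractJsonArray("");
// Prose instead of JSON
const proseResult = extractJsonArray("Let me read the file first.");
// Valid payload wrapped in a little text
const okResult = extractJsonArray('Result: [{"type":"fact","text":"x"}]');
```

Under these assumptions, the first two calls return `null`, which matches the `rawLen=0` log line above, while the third returns a one-element array.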

## Suggested fix

I tested a minimal fix locally and it resolved the L1 write path for my DeepSeek setup:

1. In `src/adapters/standalone/llm-runner.ts`
   - when `enableTools=false`, do **not** pass tools at all
   - add optional `disableThinking?: boolean` to `StandaloneLLMConfig`
   - when `disableThinking=true`, inject `thinking: { type: "disabled" }` into the OpenAI-compatible request body

2. In `src/gateway/config.ts`
   - parse `llm.disableThinking` from config/env and pass it into `StandaloneLLMConfig`
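The runner-side change can be sketched roughly as follows. The interfaces and the `buildChatRequestBody` helper are assumptions made for illustration; the real `StandaloneLLMConfig` and request-building code in `llm-runner.ts` may be shaped differently. Only `disableThinking`, `enableTools`, and the `thinking: { type: "disabled" }` field come from the fix described above.

```typescript
// Hypothetical shapes; the real types in llm-runner.ts may differ.
interface StandaloneLLMConfig {
  model: string;
  disableThinking?: boolean; // proposed new optional field
}

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ToolDefinition {
  type: "function";
  function: { name: string; description?: string; parameters: Record<string, unknown> };
}

interface ChatRequestBody {
  model: string;
  messages: ChatMessage[];
  tools?: ToolDefinition[];
  thinking?: { type: "disabled" };
}

// Build an OpenAI-compatible request body (hypothetical helper).
function buildChatRequestBody(
  config: StandaloneLLMConfig,
  messages: ChatMessage[],
  enableTools: boolean,
  tools: ToolDefinition[],
): ChatRequestBody {
  const body: ChatRequestBody = { model: config.model, messages };
  // Omit the `tools` key entirely for pure text tasks such as L1 extraction.
  if (enableTools && tools.length > 0) {
    body.tools = tools;
  }
  // Only add the field when explicitly requested, so other backends
  // never see a parameter they might reject.
  if (config.disableThinking === true) {
    body.thinking = { type: "disabled" };
  }
  return body;
}
```

Leaving `tools` out of the body, rather than sending an empty array, avoids relying on how a given backend treats an empty tool list.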

## Why I think this is a runner/config issue rather than a model issue

I verified separately that:

- direct calls to the same DeepSeek API can return valid JSON
- the same L1 extraction prompt can return valid JSON outside the live gateway pipeline

So the failure appears to be in the standalone runner/config behavior, not in `deepseek-chat` itself.

## Expected behavior

For L1 extraction in standalone mode:

- pure text tasks should not expose tools
- DeepSeek-compatible backends should be able to run with reasoning disabled via config
- successful `POST /capture` should allow L1 records to be generated for the same session when the extracted output is valid
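On the config side, the flag could be resolved along these lines. This is a sketch under stated assumptions: the environment variable name `TDAI_LLM_DISABLE_THINKING`, the `RawGatewayConfig` shape, and both helper functions are hypothetical and not taken from `src/gateway/config.ts`.

```typescript
// Hypothetical shapes; the real gateway config types may differ.
type EnvLike = Record<string, string | undefined>;

interface RawGatewayConfig {
  llm?: { disableThinking?: unknown };
}

// Accept real booleans plus common string spellings from env or YAML/JSON.
function parseBooleanFlag(value: unknown): boolean | undefined {
  if (typeof value === "boolean") return value;
  if (typeof value === "string") {
    const v = value.trim().toLowerCase();
    if (v === "1" || v === "true" || v === "yes" || v === "on") return true;
    if (v === "0" || v === "false" || v === "no" || v === "off") return false;
  }
  return undefined; // unset or unrecognized
}

// Env overrides the config file; the default stays false so existing
// deployments behave exactly as before.
function resolveDisableThinking(raw: RawGatewayConfig, env: EnvLike): boolean {
  // Assumed variable name, for illustration only.
  const fromEnv = parseBooleanFlag(env["TDAI_LLM_DISABLE_THINKING"]);
  if (fromEnv !== undefined) return fromEnv;
  return parseBooleanFlag(raw.llm?.disableThinking) ?? false;
}
```

The resolved value would then be passed into `StandaloneLLMConfig` as `disableThinking`.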

