standalone L1 extraction can fail with DeepSeek due to thinking/tool interference #58

@sirenexcelsior

Summary

In standalone/gateway mode, L1 extraction can fail against DeepSeek-compatible chat APIs even when the model itself is capable of returning valid JSON.

I reproduced this with DeepSeek chat-style endpoints and saw L1 extraction produce one of the following:

  • empty output (output=0 chars)
  • non-JSON assistant text
  • an API error: The `reasoning_content` in the thinking mode must be passed back to the API.

This leaves L0 recorded successfully while L1 stays empty for the same session.

Reproduction

Environment:

  • mode: standalone gateway
  • LLM: DeepSeek-compatible OpenAI API
  • model: deepseek-chat
  • TDAI L1 extraction path (StandaloneLLMRunner with enableTools=false)

Observed behavior:

  1. POST /capture records L0 successfully.
  2. L1 runner starts for the same session.
  3. L1 returns empty or non-JSON output, or fails with a reasoning-related error.
  4. No L1 records are stored.

Representative logs:

[memory-tdai][l1-extractor] No JSON array found in extraction response
[memory-tdai][l1-extractor] [l1-debug] NO_JSON taskId=l1-extraction, rawLen=0, cleanedLen=0, rawFull=""
[memory-tdai] [standalone-runner] run() failed after 13911ms: The `reasoning_content` in the thinking mode must be passed back to the API.
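
The NO_JSON lines above are what you would expect from a parser that searches the assistant text for a JSON array and gives up when it finds none. Here is a minimal sketch of that kind of parser, assuming nothing about the project's real l1-extractor; `extractJsonArray` is a hypothetical name used only for illustration:

```typescript
// Hypothetical helper, NOT the project's actual l1-extractor code.
// Returns the first-to-last bracketed JSON array in the text, or null.
function extractJsonArray(raw: string): unknown[] | null {
  // Drop optional markdown fence markers (three backticks, optional "json").
  const cleaned = raw.replace(/`{3}(?:json)?/gi, "").trim();

  const start = cleaned.indexOf("[");
  const end = cleaned.lastIndexOf("]");
  if (start === -1 || end <= start) {
    // Empty output (rawLen=0) and plain prose both end up here.
    return null;
  }

  try {
    const parsed: unknown = JSON.parse(cleaned.slice(start, end + 1));
    return Array.isArray(parsed) ? parsed : null;
  } catch {
    // Bracketed text that is not valid JSON.
    return null;
  }
}
```

With a parser of this shape, an empty completion and a prose-only completion are indistinguishable downstream: both yield no array, so no L1 records are written.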

What seems to be happening

Two details in the standalone runner make L1 extraction brittle for DeepSeek:

  1. When enableTools=false, the runner still exposes a read-only tool (read_file).
    For L1 extraction this is undesirable because the prompt expects pure JSON text output, but the model may still opportunistically attempt tool use.

  2. origin/main has no way to configure standalone gateway requests with disableThinking, so DeepSeek-compatible backends may enter reasoning/thinking mode and produce output that the current L1 JSON parser cannot handle.

Suggested fix

I tested a minimal fix locally and it resolved the L1 write path for my DeepSeek setup:

  1. In src/adapters/standalone/llm-runner.ts:

    • when enableTools=false, do not pass tools at all
    • add optional disableThinking?: boolean to StandaloneLLMConfig
    • when disableThinking=true, inject thinking: { type: "disabled" } into the OpenAI-compatible request body

  2. In src/gateway/config.ts:

    • parse llm.disableThinking from config/env and pass it into StandaloneLLMConfig
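
The runner-side change could look roughly like the sketch below. The interface shapes and the `buildRequestBody` helper are assumptions made for illustration; the real StandaloneLLMConfig and request-building code in llm-runner.ts may be organized differently. Only the `disableThinking` option and the `thinking: { type: "disabled" }` field come from the proposal above.

```typescript
// Hypothetical shapes; the real types in llm-runner.ts may differ.
interface StandaloneLLMConfig {
  model: string;
  disableThinking?: boolean; // proposed new option
}

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ToolDefinition {
  type: "function";
  function: { name: string; description?: string; parameters?: object };
}

// Hypothetical helper that assembles an OpenAI-compatible request body.
function buildRequestBody(
  config: StandaloneLLMConfig,
  messages: ChatMessage[],
  enableTools: boolean,
  tools: ToolDefinition[],
): Record<string, unknown> {
  const body: Record<string, unknown> = { model: config.model, messages };

  // Omit the "tools" key entirely for pure-text tasks such as L1 extraction,
  // so the model has nothing to call.
  if (enableTools && tools.length > 0) {
    body.tools = tools;
  }

  // Ask the backend to skip reasoning/thinking mode when configured.
  if (config.disableThinking === true) {
    body.thinking = { type: "disabled" };
  }

  return body;
}
```

Leaving the `tools` key out altogether, rather than sending an empty array or a reduced read-only set, means a pure-text request offers the model nothing to call.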

Why I think this is a runner/config issue rather than a model issue

I verified separately that:

  • direct calls to the same DeepSeek API can return valid JSON
  • the same L1 extraction prompt can return valid JSON outside the live gateway pipeline

So the failure appears to be in the standalone runner/config behavior, not in deepseek-chat itself.

Expected behavior

For L1 extraction in standalone mode:

  • pure text tasks should not expose tools
  • DeepSeek-compatible backends should be able to run with reasoning disabled via config
  • successful POST /capture should allow L1 records to be generated for the same session when the extracted output is valid
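
On the config side, the flag could be resolved along these lines. This is only a sketch: the environment variable name `TDAI_LLM_DISABLE_THINKING`, the helper names, and the rule that env overrides the config file are all assumptions, not existing behavior of src/gateway/config.ts.

```typescript
// Hypothetical helpers; names and precedence rules are assumptions.
function parseBooleanFlag(value: unknown): boolean | undefined {
  if (typeof value === "boolean") return value;
  if (typeof value === "string") {
    const v = value.trim().toLowerCase();
    if (["1", "true", "yes", "on"].includes(v)) return true;
    if (["0", "false", "no", "off"].includes(v)) return false;
  }
  // Missing or unrecognized values are treated as "not set".
  return undefined;
}

function resolveDisableThinking(
  fileConfig: { llm?: { disableThinking?: unknown } },
  env: Record<string, string | undefined>,
): boolean {
  // Assumed precedence: environment variable first, then config file.
  const fromEnv = parseBooleanFlag(env["TDAI_LLM_DISABLE_THINKING"]);
  if (fromEnv !== undefined) return fromEnv;

  // Default to false so existing setups keep their current behavior.
  return parseBooleanFlag(fileConfig.llm?.disableThinking) ?? false;
}
```

Defaulting to false keeps the change opt-in, so backends that rely on thinking mode are unaffected unless the flag is set.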
