Windows: SDK should fall back to --system-prompt-file when cmdline overflows (follow-up to #501) #883

@buddhaholic420

Summary

The Python SDK still hits Windows' 32,767-char CreateProcess limit for system_prompt, even though the CLI now exposes --system-prompt-file <path> as a workaround. The SDK should fall back to a temp file (same pattern it already uses for --agents) when the resolved command line would exceed the platform limit.

Filing as a follow-up because:

Reproduction

Confirmed today against:

  • claude_agent_sdk 0.1.x (latest pip install, traceback from subprocess_cli.py:485)
  • claude.exe 2.1.119
  • Python 3.14.3 on Windows 11 (Windows workstation)

The triggering call enriches ClaudeAgentOptions.system_prompt with RAG-injected wiki chunks and memory hits, totaling ~48,755 chars. Result:

File "C:\Users\Jaime\AppData\Local\Programs\Python\Python314\Lib\subprocess.py", line 1552, in _execute_child
    hp, ht, pid, tid = _winapi.CreateProcess(executable, args, ...)
FileNotFoundError: [WinError 206] The filename or extension is too long

# Then the SDK relabels it:
claude_agent_sdk._internal.transport.subprocess_cli.py:485 in connect
    raise error from e   # error = CLINotFoundError(f"Claude Code not found at: {self._cli_path}")

Two problems compounded here:

  1. The actual cause is unrelated to the binary. The CLI binary exists, runs fine standalone (claude.exe --version returns 2.1.119), and the _find_cli path resolution worked; the spawn fails only because --system-prompt <48K> overflows the command line. Python's subprocess raises FileNotFoundError for WinError 206, which the SDK's except FileNotFoundError block at subprocess_cli.py:484 mistakes for a missing binary.

  2. No SDK fallback exists. The SDK already has the --agents temp-file pattern at subprocess_cli.py:336-362. It does not extend it to --system-prompt, so callers have no out short of pre-truncating on their side.
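
A quick way to confirm the overflow before spawning is to measure the string Python would actually pass to CreateProcess. The sketch below is illustrative and not SDK code: the function names are hypothetical, and it relies on subprocess.list2cmdline, the helper CPython's subprocess module uses to quote an argv list on Windows. The 32,767 figure is the documented CreateProcess limit, which counts the terminating null character.

```python
import subprocess

# Documented upper bound for CreateProcess's command line,
# including the terminating null character.
WINDOWS_CMDLINE_LIMIT = 32_767


def windows_cmdline_length(cmd: list[str]) -> int:
    """Length of the quoted command line built from this argv list."""
    # list2cmdline applies the MS C runtime quoting rules, so arguments that
    # contain spaces or quotes grow; " ".join(cmd) would undercount them.
    return len(subprocess.list2cmdline(cmd))


def would_overflow(cmd: list[str], limit: int = WINDOWS_CMDLINE_LIMIT) -> bool:
    # >= because the limit also has to cover the trailing null terminator.
    return windows_cmdline_length(cmd) >= limit


# A prompt the same size as the one in this report (~48,755 chars).
prompt = "x" * 48_755
cmd = ["claude.exe", "--system-prompt", prompt]
print(windows_cmdline_length(cmd), would_overflow(cmd))
```

Measuring the quoted string matters for RAG-style prompts in particular, since embedded quotes and backslashes are escaped and can push a prompt that looks safe over the limit.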

Suggested fix

Two targeted changes:

1. Mirror the --agents temp-file fallback for --system-prompt

When the resolved cmd string exceeds _CMD_LENGTH_LIMIT, write --system-prompt's value to a tempfile.NamedTemporaryFile, replace it with --system-prompt-file <path>, and clean up after the subprocess exits. Same shape as the existing --agents branch. The CLI flag is now real, so the value will land correctly without @filepath games.

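# subprocess_cli.py (pseudocode, parallel to the existing --agents block)
# Assumes `import tempfile` at module level.
cmd_str = " ".join(cmd)
if len(cmd_str) > _CMD_LENGTH_LIMIT:
    if "--system-prompt" in cmd:
        idx = cmd.index("--system-prompt")
        # delete=False so the file outlives the with-block and the CLI can read it.
        with tempfile.NamedTemporaryFile(mode="w", suffix=".txt",
                                         delete=False, encoding="utf-8") as tf:
            tf.write(cmd[idx + 1])
        # Swap the inline flag and value for the file-based flag and path.
        cmd[idx] = "--system-prompt-file"
        cmd[idx + 1] = tf.name
        # Removed after the subprocess exits, like the --agents temp file.
        self._tempfiles_to_cleanup.append(tf.name)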

Same logic for --append-system-prompt → --append-system-prompt-file.
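
For anyone who wants to try the idea outside the SDK, here is a minimal standalone sketch of the same fallback covering both flags. It is not the SDK's implementation: spill_prompts_to_files, cleanup, and the CMD_LENGTH_LIMIT value are hypothetical stand-ins (the SDK's real _CMD_LENGTH_LIMIT may differ), and the file-based flag names are the ones given in this issue.

```python
import os
import tempfile

# Hypothetical threshold; stands in for the SDK's _CMD_LENGTH_LIMIT.
CMD_LENGTH_LIMIT = 8000

# Inline flag -> file-based flag, as named in this issue.
FILE_FLAGS = {
    "--system-prompt": "--system-prompt-file",
    "--append-system-prompt": "--append-system-prompt-file",
}


def spill_prompts_to_files(cmd, limit=CMD_LENGTH_LIMIT):
    """Return (new_cmd, temp_paths). Prompts move to files only if cmd is too long."""
    new_cmd = list(cmd)
    temp_paths = []
    if len(" ".join(new_cmd)) <= limit:
        return new_cmd, temp_paths  # short enough, leave untouched

    i = 0
    while i < len(new_cmd) - 1:
        file_flag = FILE_FLAGS.get(new_cmd[i])
        if file_flag is None:
            i += 1
            continue
        # newline="" avoids \n -> \r\n translation on Windows;
        # delete=False keeps the file around for the child process.
        with tempfile.NamedTemporaryFile(
            mode="w", suffix=".txt", delete=False, encoding="utf-8", newline=""
        ) as tf:
            tf.write(new_cmd[i + 1])
        new_cmd[i] = file_flag
        new_cmd[i + 1] = tf.name
        temp_paths.append(tf.name)
        i += 2  # skip the value slot that now holds the path
    return new_cmd, temp_paths


def cleanup(temp_paths):
    """Remove temp files once the subprocess has exited."""
    for path in temp_paths:
        try:
            os.unlink(path)
        except FileNotFoundError:
            pass  # already gone
```

A caller would run spill_prompts_to_files(cmd) just before spawning, pass new_cmd to the subprocess, and call cleanup(temp_paths) in a finally block after the process exits.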

2. Distinguish "binary missing" from "spawn failed for another reason"

The current except FileNotFoundError block treats any FileNotFoundError from _winapi.CreateProcess as "CLI not found at path". On Windows, that is a lie for at least three error codes that Python surfaces as FileNotFoundError: ERROR_FILE_NOT_FOUND (2), ERROR_PATH_NOT_FOUND (3), and ERROR_FILENAME_EXCED_RANGE (206). The error message we got steered me into a 90-minute investigation of a missing binary that wasn't missing.

Two options:

  • Cheap: before raising CLINotFoundError, double-check os.path.isfile(self._cli_path). If True, raise CLIConnectionError with a more informative message (include cmdline length + platform limit).
  • Better: catch OSError instead of just FileNotFoundError, inspect e.winerror on Windows, and branch on 206 → "Command line exceeded Windows limit ({len} chars vs 32,767)" before falling through to the file-not-found branch.

The "cheap" option alone would have saved my afternoon.
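
The two options can be combined in one small classifier. The sketch below is only an illustration: describe_spawn_failure and its message wording are hypothetical, and it returns a string instead of raising CLINotFoundError or CLIConnectionError so that it stays self-contained. It reads winerror via getattr because that attribute only exists on Windows.

```python
import os

WINDOWS_CMDLINE_LIMIT = 32_767
# Spelled "EXCED" in the Windows headers.
ERROR_FILENAME_EXCED_RANGE = 206


def describe_spawn_failure(exc: OSError, cli_path: str, cmd_length: int) -> str:
    """Turn a spawn-time OSError into a message that names the real cause."""
    # "Better" option: branch on the Windows error code when there is one.
    winerror = getattr(exc, "winerror", None)
    if winerror == ERROR_FILENAME_EXCED_RANGE:
        return (
            f"Command line exceeded Windows limit "
            f"({cmd_length} chars vs {WINDOWS_CMDLINE_LIMIT})"
        )
    # "Cheap" option: only blame the binary if it really is missing.
    if isinstance(exc, FileNotFoundError) and not os.path.isfile(cli_path):
        return f"Claude Code not found at: {cli_path}"
    # Binary exists (or a different OSError): report the spawn failure itself.
    return (
        f"Failed to start Claude Code at {cli_path}: {exc} "
        f"(command line length: {cmd_length} chars)"
    )
```

In the SDK, the first and third branches would presumably raise CLIConnectionError and the second CLINotFoundError, with the except clause widened to OSError as suggested above.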

Why this matters in practice

The SDK is the recommended Python integration path. Anyone building a NetOps/observability/RAG-grounded bot on Windows that injects retrieved context into system_prompt (which is the natural place for it) will hit this on the first non-trivial query. The failure then presents as a binary-installation problem, with nothing in the error pointing at command-line length.

The fix is mechanical (pattern already in the codebase for --agents) and the user-facing payoff is large.

Environment

  • OS: Windows 11 Home 26200
  • Python: 3.14.3 (per-user install)
  • claude_agent_sdk: latest pypi
  • claude.exe: 2.1.119
  • Run mode: subprocess CLI transport (default)
  • Caller: NemoBot service (NSSM-managed Python process running as local user)

Related
