Skip to content

fix(mcp-service): sanitize MCP call arguments to avoid 'unhashable type: dict' from MCP servers#14886

Open
404-Page-Found wants to merge 1 commit into
CherryHQ:v2from
404-Page-Found:nvidia-kimi-k2.6-model
Open

fix(mcp-service): sanitize MCP call arguments to avoid 'unhashable type: dict' from MCP servers#14886
404-Page-Found wants to merge 1 commit into
CherryHQ:v2from
404-Page-Found:nvidia-kimi-k2.6-model

Conversation

@404-Page-Found
Copy link
Copy Markdown
Contributor

What this PR does

Before this PR:

  • MCP tool call arguments could contain non-JSON-friendly JavaScript types (Map, Set, BigInt, nested non-plain objects). Some Python-based MCP servers (e.g., Nvidia Kimi K2.6) would receive these values and raise errors such as "unhashable type: 'dict'" when attempting to use them as hashable keys.

After this PR:

  • Arguments are sanitized via sanitizeForMcp(...) in McpService before being forwarded to MCP clients. Maps and Sets are converted to plain objects/arrays, BigInt to string, and non-plain values are serialized to JSON-friendly structures.

Fixes #14868

Why we need it and why it was done in this way

  • Python MCP implementations expect plain JSON-compatible input. Converting complex JS runtime types to plain structures is the minimal, surgical change that prevents server-side exceptions without changing MCP protocol semantics.

The following tradeoffs were made:

  • Pros: Minimal change, low risk, resolves reported server crash.
  • Cons: Small runtime cost for deep sanitization of arguments.

The following alternatives were considered:

  • Enforcing callers to only pass plain JSON (would require broader refactors).
  • Serializing using a custom transport wrapper (more invasive).

Links to discussion: #14868

Breaking changes

None.

Special notes for your reviewer

  • This change is intentionally small and scoped to McpService.callTool argument sanitization.
  • Unit tests can be added to cover sanitizeForMcp behavior if desired.

Checklist

  • PR: The PR description is expressive enough and will help future contributors
  • Code: Code is simple and readable
  • Self-review: I have reviewed my own code before requesting review from others

Release note

fix(mcp-service): sanitize MCP call arguments to prevent 'unhashable type: dict' errors with Python MCP servers (e.g., Nvidia Kimi K2.6)

… causing Python server errors

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: 404-Page-Found <Lucas20220605@gmail.com>
@404-Page-Found 404-Page-Found requested a review from a team May 7, 2026 02:11
@DeJeune
Copy link
Copy Markdown
Collaborator

DeJeune commented May 7, 2026

add a unit test plz

@DeJeune
Copy link
Copy Markdown
Collaborator

DeJeune commented May 7, 2026

Note

This issue/comment/review was translated by Claude.

Thanks for the fix! However, I have a few points to discuss with you during review of this change. It may be worth confirming the root cause and implementation approach again before merging.

1. About the Root Cause

unhashable type: 'dict' is a Python-side error—some server-side tool used the received dict as a hashable value (e.g., putting it in a set() or using it as a dict key). But along this code path of ours:

  • In McpService.callTool, args have already gone through JSON.parse (L940-951)
  • The MCP SDK then JSON.stringifys the request and sends it to the server via stdio/SSE/HTTP

This means the args reaching the Python server must be pure JSON—Map/Set become {} during JSON.stringify, and BigInt throws directly. Map/Set/BigInt don't actually appear on this path, so this sanitize most likely never really triggers.

A more likely cause is: the model gave a nested object as arg (e.g., schema expects string, but model gave {"a":1}), and the server uses it as a dict key causing this error. After sanitization, the nested object is still a nested object, the bug won't disappear.

If convenient, could you reproduce once with the log.txt from the issue, and print out the actual value of args before handing to the SDK to confirm. If it's a schema mismatch, a more thorough fix is to use tool.inputSchema for zod validation before callTool, and on failure return { isError: true, content: [...] } to let the model self-correct.

2. If Keeping Sanitize, Types Can Be Safer

The current signature is (val: any) => any, and after calling args = sanitizeForMcp(args), the type collapses to any, polluting downstream. In the SDK, arguments is already Record<string, unknown> (z.ZodRecord<z.ZodString, z.ZodUnknown> from CallToolRequestSchema), we can align:

type JsonPrimitive = string | number | boolean | null
type JsonValue = JsonPrimitive | JsonValue[] | { [k: string]: JsonValue }

function sanitizeForMcp(val: unknown, seen: WeakSet<object> = new WeakSet()): JsonValue {
  ...
}

unknown + explicit narrowing is much stronger than any.

3. Several Edge Cases That Will Error

The current implementation uses typeof val === 'object' and walks Object.entries, which behaves incorrectly for the following inputs:

Input Current Result Expected
Date {} date.toISOString()
Buffer / Uint8Array { "0": 72, "1": 105, ... } base64
Error {} { name, message }
URL / RegExp {} String(val)
Object with toJSON() Drops toJSON and walks prototype Call toJSON() first
Circular reference Stack overflow WeakSet detection

Reference implementation:

function sanitizeForMcp(val: unknown, seen: WeakSet<object> = new WeakSet()): JsonValue {
  if (val === null || val === undefined) return null
  switch (typeof val) {
    case 'string':
    case 'boolean': return val
    case 'number':  return Number.isFinite(val) ? val : null
    case 'bigint':  return val.toString()
    case 'function':
    case 'symbol':  return null
  }
  if (seen.has(val as object)) return null
  seen.add(val as object)

  if (Array.isArray(val)) return val.map((v) => sanitizeForMcp(v, seen))
  if (val instanceof Date) return val.toISOString()
  if (val instanceof Map) {
    const out: Record<string, JsonValue> = {}
    for (const [k, v] of val) out[String(k)] = sanitizeForMcp(v, seen)
    return out
  }
  if (val instanceof Set) return Array.from(val, (v) => sanitizeForMcp(v, seen))
  if (val instanceof Error) return { name: val.name, message: val.message }
  if (val instanceof URL || val instanceof RegExp) return String(val)
  if (ArrayBuffer.isView(val) || val instanceof ArrayBuffer) {
    return Buffer.from(val as ArrayBufferLike).toString('base64')
  }
  if (typeof (val as { toJSON?: unknown }).toJSON === 'function') {
    return sanitizeForMcp((val as { toJSON: () => unknown }).toJSON(), seen)
  }
  const out: Record<string, JsonValue> = {}
  for (const [k, v] of Object.entries(val as object)) {
    out[k] = sanitizeForMcp(v, seen)
  }
  return out
}

At the call site, use as Record<string, unknown> to align with SDK:

const sanitized = sanitizeForMcp(args) as Record<string, unknown>
const result = await client.callTool({ name, arguments: sanitized }, ...)

4. The Outer try/catch Can Be Removed

try {
  args = sanitizeForMcp(args)
} catch (e) {
  ...warn('Failed to sanitize args, sending raw', e)
}

The function above won't throw for pure JSON input—BigInt/cycles are both handled internally. Keeping the catch actually misleads reviewers into thinking there's unknown risk here.


To summarize my suggested priority:

  1. First use the issue's log to confirm the actual shape of args, see if it's a schema mismatch;
  2. If it's a schema mismatch → add tool.inputSchema validation before callTool (fix the root);
  3. If you still want to keep sanitize as a fallback → use unknown + JsonValue + WeakSet + Date/Buffer/Error/toJSON explicit branches, remove the outer try/catch.

If you've already confirmed the root cause is Map/Set/BigInt (e.g., some upstream transform really inserted these), please post the source—I just scanned the callTool call path and didn't see this situation, might have missed it.


Original Content

感谢修复!不过在 review 这个改动时有几点想和你讨论一下,可能值得在合并前再确认下根因和实现方式。

1. 关于根因

unhashable type: 'dict' 是 Python 端的报错——某个 server 端 tool 把收到的 dict 当作 hashable 值用了(比如塞进 set() 或当 dict key)。但是在我们这条链路上:

  • McpService.callTool 里 args 已经走过 JSON.parse(L940-951)
  • MCP SDK 接着会把 request JSON.stringify 通过 stdio/SSE/HTTP 发到 server

也就是说到达 Python server 的 args 必然是纯 JSON——Map/Set 在 JSON.stringify 阶段会变成 {},BigInt 会直接抛错。这条路径上 Map/Set/BigInt 实际不会出现,所以这个 sanitize 大概率没真的触发。

更可能的成因是:模型给出的 arg 是嵌套对象(比如 schema 期望 string,模型给了 {"a":1}),server 拿到后用作 dict key 就报这个错。Sanitize 之后嵌套对象仍然是嵌套对象,bug 不会消失。

如果方便的话,可以拿 issue 里的 log.txt 复现一次,把交给 SDK 之前的 args 实际值打出来确认下。如果是 schema mismatch,更彻底的修法是在 callTool 之前用 tool.inputSchema 做 zod 校验,失败返回 { isError: true, content: [...] } 让模型自纠。

2. 如果保留 sanitize,类型可以更安全

现在签名是 (val: any) => any,调用处 args = sanitizeForMcp(args) 之后类型塌成 any,污染下游。SDK 里 arguments 已经是 Record<string, unknown>CallToolRequestSchemaz.ZodRecord<z.ZodString, z.ZodUnknown>),可以对齐:

type JsonPrimitive = string | number | boolean | null
type JsonValue = JsonPrimitive | JsonValue[] | { [k: string]: JsonValue }

function sanitizeForMcp(val: unknown, seen: WeakSet<object> = new WeakSet()): JsonValue {
  ...
}

unknown + 显式 narrow 比 any 强很多。

3. 几个会出错的边界情况

当前实现 typeof val === 'object'Object.entries,对下列输入行为都不对:

输入 当前结果 期望
Date {} date.toISOString()
Buffer / Uint8Array { "0": 72, "1": 105, ... } base64
Error {} { name, message }
URL / RegExp {} String(val)
toJSON() 的对象 toJSON 走原型枚举 先调 toJSON()
循环引用 栈溢出 WeakSet 检测

参考实现:

function sanitizeForMcp(val: unknown, seen: WeakSet<object> = new WeakSet()): JsonValue {
  if (val === null || val === undefined) return null
  switch (typeof val) {
    case 'string':
    case 'boolean': return val
    case 'number':  return Number.isFinite(val) ? val : null
    case 'bigint':  return val.toString()
    case 'function':
    case 'symbol':  return null
  }
  if (seen.has(val as object)) return null
  seen.add(val as object)

  if (Array.isArray(val)) return val.map((v) => sanitizeForMcp(v, seen))
  if (val instanceof Date) return val.toISOString()
  if (val instanceof Map) {
    const out: Record<string, JsonValue> = {}
    for (const [k, v] of val) out[String(k)] = sanitizeForMcp(v, seen)
    return out
  }
  if (val instanceof Set) return Array.from(val, (v) => sanitizeForMcp(v, seen))
  if (val instanceof Error) return { name: val.name, message: val.message }
  if (val instanceof URL || val instanceof RegExp) return String(val)
  if (ArrayBuffer.isView(val) || val instanceof ArrayBuffer) {
    return Buffer.from(val as ArrayBufferLike).toString('base64')
  }
  if (typeof (val as { toJSON?: unknown }).toJSON === 'function') {
    return sanitizeForMcp((val as { toJSON: () => unknown }).toJSON(), seen)
  }
  const out: Record<string, JsonValue> = {}
  for (const [k, v] of Object.entries(val as object)) {
    out[k] = sanitizeForMcp(v, seen)
  }
  return out
}

调用处用 as Record<string, unknown> 对齐 SDK:

const sanitized = sanitizeForMcp(args) as Record<string, unknown>
const result = await client.callTool({ name, arguments: sanitized }, ...)

4. 外层的 try/catch 可以删掉

try {
  args = sanitizeForMcp(args)
} catch (e) {
  ...warn('Failed to sanitize args, sending raw', e)
}

上面这个函数对纯 JSON 输入不会抛——BigInt/cycle 都在内部 handle 了。保留 catch 反而误导 reviewer 觉得这里有未知风险。


总结一下我的建议优先级:

  1. 先用 issue 的 log 确认 args 实际 shape,看是不是 schema mismatch;
  2. 如果是 schema mismatch → 在 callTool 前加 tool.inputSchema 校验(治根);
  3. 如果还是要保留 sanitize 作兜底 → 用 unknown + JsonValue + WeakSet + Date/Buffer/Error/toJSON 显式分支,删掉外层 try/catch

如果你已经确认过根因是 Map/Set/BigInt(比如某个上游 transform 真的塞了这些进来),麻烦贴一下来源——我刚才扫了一下 callTool 的调用路径没看到这种情况,可能漏了。

@DeJeune
Copy link
Copy Markdown
Collaborator

DeJeune commented May 7, 2026

Note

This comment was translated by Claude.

Looks like a server-side issue.


Original Content

看起来是服务端的问题

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants