NameError: tree_reduce is not defined in mlx_lm/models/cache.py _BaseCache.nbytes #1164

@siiea-ai

Summary

mlx_lm/models/cache.py imports tree_flatten, tree_map, tree_unflatten from mlx.utils but the _BaseCache.nbytes property (line 322) calls tree_reduce, which is never imported. Any code path that reads .nbytes on a _BaseCache instance (or a subclass that inherits it) raises:

NameError: name 'tree_reduce' is not defined
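To illustrate why the module imports cleanly and only fails when .nbytes is read, here is a minimal sketch in plain Python. BaseCacheSketch and tree_reduce_missing are hypothetical stand-ins, not mlx-lm names; the point is only that Python resolves a global name when the function body runs, not when the class is defined.

```python
class BaseCacheSketch:
    """Hypothetical stand-in for _BaseCache; not the mlx-lm class."""

    def __init__(self, keys, values):
        self.keys = keys
        self.values = values

    @property
    def nbytes(self):
        # tree_reduce_missing is deliberately never defined or imported,
        # mirroring the missing import described above.
        return tree_reduce_missing(
            lambda a, x: a + x, (self.keys, self.values), 0
        )


# Defining the class and constructing an instance both succeed.
cache = BaseCacheSketch(keys=[1, 2], values=[3, 4])

# The failure only appears once the property body actually executes.
try:
    cache.nbytes
    error_message = None
except NameError as exc:
    error_message = str(exc)

print(error_message)
```

This is why the problem can go unnoticed until some code path actually reads the property.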

Repro

Running qwen3.5-397b-a17b-mlx (a Qwen3.5 MoE, model_type: qwen3_5_moe) in LM Studio 0.4.12+1 with its bundled mlx-lm==0.31.3:

curl -sS -X POST http://127.0.0.1:8888/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"qwen3.5-397b-a17b-mlx","stream":false,"max_tokens":20,
       "messages":[{"role":"user","content":"Say hello."}]}'

returns:

{"error":"Error in iterating prediction stream: NameError: name 'tree_reduce' is not defined"}

Both streaming and non-streaming paths fail, because _BaseCache.nbytes is read before the first token is produced.

Why MoE models surface this

Dense architectures use cache subclasses that override nbytes with their own implementation and never hit _BaseCache.nbytes. MoE architectures (Qwen3.5 MoE, etc.) inherit the base implementation, which is why MoE users see it and dense-model users don't.
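A rough way to check which cache classes are exposed is to test whether a class resolves nbytes to the base-class property or to its own override. The sketch below uses hypothetical stand-in classes (BaseCache, OverridingCache, InheritingCache) rather than the real mlx-lm classes, and the helper inherits_base_nbytes is an illustration, not part of any library.

```python
class BaseCache:
    """Hypothetical base class whose nbytes is broken."""

    @property
    def nbytes(self):
        # Simulates the failure in the base implementation.
        raise NameError("name 'tree_reduce' is not defined")


class OverridingCache(BaseCache):
    """Stand-in for a subclass that supplies its own nbytes."""

    @property
    def nbytes(self):
        return 0


class InheritingCache(BaseCache):
    """Stand-in for a subclass that relies on the base nbytes."""


def inherits_base_nbytes(cls, base):
    """Return True if cls resolves nbytes to the property defined on base."""
    # Accessing a property on the class (not an instance) returns the
    # property object itself, so identity comparison is enough.
    return getattr(cls, "nbytes") is vars(base)["nbytes"]


affected = [
    cls.__name__
    for cls in (OverridingCache, InheritingCache)
    if inherits_base_nbytes(cls, BaseCache)
]
print(affected)
```

Only the class that inherits the base property is reported; the overriding class never reaches the broken code.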

Root cause

mlx_lm/models/cache.py line 10:

from mlx.utils import tree_flatten, tree_map, tree_unflatten

line 322:

@property
def nbytes(self):
    return tree_reduce(lambda a, x: a + x.nbytes, (self.keys, self.values), 0)

tree_reduce is used but never imported. The fix is a one-line change to the import on line 10; PR to follow.
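For readers unfamiliar with the call, the following sketch shows what the nbytes expression is meant to compute: a running sum of x.nbytes over every leaf in (self.keys, self.values). It uses a pure-Python stand-in, tree_reduce_sketch, and a FakeArray class so it runs without mlx; both are hypothetical and only approximate the behavior of mlx.utils.tree_reduce on lists, tuples, and dicts. The presumed fix itself, shown as a comment, is simply adding tree_reduce to the existing import.

```python
from dataclasses import dataclass

# Presumed one-line fix in mlx_lm/models/cache.py (assumption, not the PR):
# from mlx.utils import tree_flatten, tree_map, tree_reduce, tree_unflatten


@dataclass
class FakeArray:
    """Hypothetical stand-in for an array that reports its size in bytes."""

    nbytes: int


def tree_reduce_sketch(fn, tree, initializer):
    """Fold fn over every leaf of a nested list/tuple/dict structure."""
    acc = initializer
    if isinstance(tree, (list, tuple)):
        for item in tree:
            acc = tree_reduce_sketch(fn, item, acc)
        return acc
    if isinstance(tree, dict):
        for item in tree.values():
            acc = tree_reduce_sketch(fn, item, acc)
        return acc
    # Anything else is treated as a leaf.
    return fn(acc, tree)


keys = FakeArray(nbytes=1024)
values = [FakeArray(nbytes=512), FakeArray(nbytes=512)]

# Same shape of call as the nbytes property quoted above.
total = tree_reduce_sketch(lambda a, x: a + x.nbytes, (keys, values), 0)
print(total)  # 2048
```

Passing 0 as the initializer matters: it makes the accumulator an integer from the start, so the lambda can add x.nbytes to it on the first leaf.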

Environment

  • macOS 15.x, Apple Silicon (M3 Ultra)
  • mlx==0.31.1, mlx-lm==0.31.3
  • Confirmed on main at 62f38ae
  • Also reproduces in LM Studio bundled runtimes 1.4.0, 1.5.0, 1.6.0
