Fix config utils by blueswhen · Pull Request #1357 · ModelTC/LightLLM

blueswhen · 2026-06-16T01:53:18Z

No description provided.

gemini-code-assist

Code Review

This pull request optimizes recurrent processing in Qwen3next by reading q/k/v/a/b column views directly via per-token strides to avoid contiguous copies, removes profiling-related arguments and endpoints, updates Qwen3.5 EOS token ID retrieval logic, and adds corresponding unit tests. Feedback on the changes highlights three critical issues: removing the cleanup of req.linear_att_len_to_big_page_id in infer_batch.py can cause resource leaks and router crashes; calling int() on tokenizer.eos_token_id in config_utils.py may raise a TypeError if it is a list; and removing the root node check in linear_att_radix_cache.py could trigger assertion failures during eviction when the tree is empty.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

gemini-code-assist · 2026-06-16T01:55:23Z

        if shared_kv_len <= req.cur_kv_len:
            free_token_index.append(self.req_manager.req_to_token_indexs[req.req_idx][shared_kv_len : req.cur_kv_len])
-            # 该分支不会把 prefill 阶段累积的 big page id 插入 radix cache（典型为 pause/abort
-            # 在 prefill 跨过 big page 边界后、到达末尾前触发），需在此显式释放，避免泄漏。
-
-            # 释放本请求 prefill 阶段在 big page 边界上申请、但尚未插入 radix cache 的 big page
-            # state buffer。仅当请求未走 insert 分支(小页/大页插入)就被释放时才会有残留，典型场景：
-            # big page 模式下请求在 prefill 跨过 big page 边界后、到达末尾前被 pause / abort。
-            # 若不释放，会泄漏 big page state slot，并触发 free_a_req_mem 中 dict 为空的断言。
-            if req.linear_att_len_to_big_page_id:
-                self.radix_cache.linear_att_big_page_buffers.free_state_cache(
-                    list(req.linear_att_len_to_big_page_id.values())
-                )
-                req.linear_att_len_to_big_page_id.clear()
-
            req.cur_kv_len = shared_kv_len


Removing the code that clears req.linear_att_len_to_big_page_id in _linear_att_free_req will cause a resource leak of big page state slots if a request is aborted or paused during prefill. Furthermore, it will cause the assertion assert len(req.linear_att_len_to_big_page_id) == 0 at line 132 to fail, crashing the router process. We should restore the cleanup logic to free the big page state buffers.

Suggested change

if shared_kv_len <= req.cur_kv_len:

free_token_index.append(self.req_manager.req_to_token_indexs[req.req_idx][shared_kv_len : req.cur_kv_len])

# 该分支不会把 prefill 阶段累积的 big page id 插入 radix cache（典型为 pause/abort

# 在 prefill 跨过 big page 边界后、到达末尾前触发），需在此显式释放，避免泄漏。

# 释放本请求 prefill 阶段在 big page 边界上申请、但尚未插入 radix cache 的 big page

# state buffer。仅当请求未走 insert 分支(小页/大页插入)就被释放时才会有残留，典型场景：

# big page 模式下请求在 prefill 跨过 big page 边界后、到达末尾前被 pause / abort。

# 若不释放，会泄漏 big page state slot，并触发 free_a_req_mem 中 dict 为空的断言。

if req.linear_att_len_to_big_page_id:

self.radix_cache.linear_att_big_page_buffers.free_state_cache(

list(req.linear_att_len_to_big_page_id.values())

)

req.linear_att_len_to_big_page_id.clear()

req.cur_kv_len = shared_kv_len

if shared_kv_len <= req.cur_kv_len:

free_token_index.append(self.req_manager.req_to_token_indexs[req.req_idx][shared_kv_len : req.cur_kv_len])

if req.linear_att_len_to_big_page_id:

self.radix_cache.linear_att_big_page_buffers.free_state_cache(

list(req.linear_att_len_to_big_page_id.values())

)

req.linear_att_len_to_big_page_id.clear()

req.cur_kv_len = shared_kv_len

gemini-code-assist · 2026-06-16T01:55:23Z

            if tokenizer.eos_token_id is not None:
-                return [int(tokenizer.eos_token_id)]
+                eos_token_ids.append(int(tokenizer.eos_token_id))


If tokenizer.eos_token_id is a list (which is common for tokenizers with multiple end-of-sequence tokens), calling int(tokenizer.eos_token_id) will raise a TypeError. Since this block is wrapped in a try...except Exception block, the error will be silently caught, and the function will fall back to the default behavior, bypassing the Qwen3.5-specific logic entirely. We should handle both int and list/iterable types for tokenizer.eos_token_id.

Suggested change

if tokenizer.eos_token_id is not None:

return [int(tokenizer.eos_token_id)]

eos_token_ids.append(int(tokenizer.eos_token_id))

if tokenizer.eos_token_id is not None:

if isinstance(tokenizer.eos_token_id, list):

eos_token_ids.extend(int(eid) for eid in tokenizer.eos_token_id)

else:

eos_token_ids.append(int(tokenizer.eos_token_id))

gemini-code-assist · 2026-06-16T01:55:23Z

+        if node.is_leaf():
            self._evict_tree_set.add(node)


Removing the and node is not self.root_node check allows the root_node to be added to _evict_tree_set when the tree is empty (as the root node then satisfies is_leaf()). This can contradict assertions in the eviction logic (e.g., assert node is not self.root_node in _evict). Unless there is a specific reason to allow the root node in the eviction set, it is safer to keep this guard to prevent potential assertion failures.

Suggested change

if node.is_leaf():

self._evict_tree_set.add(node)

if node.is_leaf() and node is not self.root_node:

self._evict_tree_set.add(node)

fix: fix a eos id bug

5ef948b

gemini-code-assist Bot reviewed Jun 16, 2026

View reviewed changes

blueswhen force-pushed the fix_config_utils branch from 5a42f1f to 5ef948b Compare June 16, 2026 01:55

fix

c2fcf85

hiworldwzj merged commit 9e4f552 into main Jun 16, 2026
1 check passed

hiworldwzj deleted the fix_config_utils branch June 16, 2026 02:17

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix config utils#1357

Fix config utils#1357
hiworldwzj merged 2 commits into
mainfrom
fix_config_utils

blueswhen commented Jun 16, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot Jun 16, 2026

Uh oh!

gemini-code-assist Bot Jun 16, 2026

Uh oh!

gemini-code-assist Bot Jun 16, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

blueswhen commented Jun 16, 2026

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Jun 16, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 16, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 16, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants