Skip to content

feat: support in-batch prefix cache.#1240

Open
Clement-Wang26 wants to merge 1 commit intojd-opensource:mainfrom
Clement-Wang26:prefix_cache_final
Open

feat: support in-batch prefix cache.#1240
Clement-Wang26 wants to merge 1 commit intojd-opensource:mainfrom
Clement-Wang26:prefix_cache_final

Conversation

@Clement-Wang26
Copy link
Copy Markdown
Collaborator

No description provided.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a new configuration option, enable_in_batch_prefix_cache, to control whether admitted prefill full blocks are cached into the prefix cache. This involves adding the flag definition, declaration, and integration into the Options struct across various core components and schedulers. A new overloaded cache method is added to BlockManagerPool and KVCacheManager to support this functionality, along with new tests for BlockManagerPoolTest and ContinuousSchedulerTest. Review comments indicate that the use of FLAGS_ global variables for this new setting, as well as enable_prefix_cache, deviates from the style guide which prefers configuration through the Options struct. Additionally, the new flag is registered in two different help categories, which could lead to confusion, and a DCHECK in block_manager_pool.cpp might need to be a CHECK for release build robustness.

Comment thread xllm/core/common/global_flags.cpp
Comment thread xllm/core/common/global_flags.h
Comment thread xllm/core/common/help_formatter.h
Comment thread xllm/core/common/help_formatter.h
}

void BlockManagerPool::cache(Sequence* sequence, size_t num_tokens) {
DCHECK(sequence != nullptr);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Using DCHECK for sequence != nullptr might be insufficient if sequence being nullptr would lead to a critical error or crash in a release build. DCHECKs are typically compiled out in release builds. If this condition is vital for correctness in production, consider changing it to a CHECK to ensure it's always active.

Comment thread xllm/core/scheduler/continuous_scheduler.cpp
Comment thread xllm/core/scheduler/continuous_scheduler_test.cpp
liutongxuan added a commit to liutongxuan/xllm that referenced this pull request Apr 13, 2026
…ource#1240)

Add the ILU ProcessGroup constructor overload that accepts local_rank and
explicit group_ranks, so the DiT process-group creation path matches the
expected signature and builds successfully.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant