feat: Add prefix cache benchmark by yuyanpeng-google · Pull Request #268 · AI-Hypercomputer/JetStream

yuyanpeng-google · 2025-05-07T09:05:21Z

This commit introduces a new benchmark to test the performance of prefix caching in JetStream.

The benchmark (`benchmark_prefix_cache.sh`) allows testing with various prompt lengths and common prefix lengths. It utilizes a new mock dataset generated by `load_mock_prefix_cache_test_input_requests` in `benchmark_serving.py`, which creates prompts sharing common prefixes of varying lengths based on a normal distribution.
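To make the dataset idea concrete, here is a minimal sketch of how prompts with normally distributed shared-prefix lengths could be generated. It is not the code from this PR: the function name `make_shared_prefix_prompts`, its parameters, and the default standard deviation are all hypothetical, and the real `load_mock_prefix_cache_test_input_requests` may work differently.

```python
import random
import string


def make_shared_prefix_prompts(num_prompts, prompt_len, common_len_mean,
                               common_len_std=None, seed=0):
    """Hypothetical helper, not the PR's implementation.

    Returns (prompts, shared_lens). Every prompt has `prompt_len` characters
    and copies its first shared_lens[i] characters from one base string, so
    any two prompts share at least min(shared_lens[i], shared_lens[j])
    leading characters.
    """
    rng = random.Random(seed)
    if common_len_std is None:
        # Assumption: spread the prefix lengths over a quarter of the mean.
        common_len_std = common_len_mean / 4

    alphabet = string.ascii_letters + string.digits
    base = "".join(rng.choice(alphabet) for _ in range(prompt_len))

    prompts, shared_lens = [], []
    for _ in range(num_prompts):
        # Draw the shared-prefix length from a normal distribution and
        # clamp it to the valid range [0, prompt_len].
        shared = int(round(rng.gauss(common_len_mean, common_len_std)))
        shared = max(0, min(prompt_len, shared))
        # Fill the rest of the prompt with fresh random characters.
        suffix = "".join(rng.choice(alphabet)
                         for _ in range(prompt_len - shared))
        prompts.append(base[:shared] + suffix)
        shared_lens.append(shared)
    return prompts, shared_lens


if __name__ == "__main__":
    prompts, shared_lens = make_shared_prefix_prompts(
        num_prompts=4, prompt_len=32, common_len_mean=16)
    for p, s in zip(prompts, shared_lens):
        print(s, p)
```

The random suffix can occasionally match the base string for a character or two, so the recorded length is a lower bound on the true common prefix rather than an exact value.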

Key changes include:

- New script `benchmarks/benchmark_prefix_cache.sh` to orchestrate prefix cache benchmark runs.
- Added `PrefixCacheTestTokenizer` for simple character-to-ordinal tokenization, suitable for controlled prefix testing.
- Implemented `load_mock_prefix_cache_test_input_requests` in `benchmark_serving.py` to generate test data with shared prefixes.
- Added `prefix_cache_test` as a dataset option and `--prefix-cache-test-common-len` argument to `benchmark_serving.py`.
- Updated `benchmarks/README.md` with instructions on how to run the new prefix cache benchmark.
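For readers unfamiliar with character-to-ordinal tokenization, the sketch below shows why it suits controlled prefix testing: each character maps to exactly one token, so a shared prefix of N characters is a shared prefix of N tokens. The class `CharOrdinalTokenizer` and the helper `common_prefix_len` are hypothetical stand-ins; the actual interface of `PrefixCacheTestTokenizer` is not shown in this PR description.

```python
class CharOrdinalTokenizer:
    """Hypothetical stand-in: one token per character, id = code point."""

    def encode(self, text):
        # Map each character to its Unicode code point.
        return [ord(ch) for ch in text]

    def decode(self, token_ids):
        # Inverse mapping: code points back to characters.
        return "".join(chr(t) for t in token_ids)


def common_prefix_len(a, b):
    """Count how many leading elements two sequences have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


if __name__ == "__main__":
    tok = CharOrdinalTokenizer()
    a = tok.encode("hello world")
    b = tok.encode("hello there")
    # The shared text "hello " is 6 characters, hence 6 shared tokens.
    print(common_prefix_len(a, b))
```

A subword tokenizer would not give this guarantee, since the token boundary near the end of a shared prefix can depend on the characters that follow it.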


vipannalla

Looks good

yuyanpeng-google requested a review from mailvijayasingh May 7, 2025 09:05

yuyanpeng-google requested a review from vipannalla as a code owner May 7, 2025 09:05

yuyanpeng-google force-pushed the yuyan-prefix-cache-benchmark branch from 3fe08db to bbfb5bd Compare May 7, 2025 09:56

vipannalla approved these changes May 7, 2025


github-actions Bot added the pull ready label (needed if we want the copybara service to auto sync it to g3) May 7, 2025

copybara-service Bot merged commit 4aafd76 into main May 7, 2025
6 checks passed

copybara-service Bot deleted the yuyan-prefix-cache-benchmark branch May 7, 2025 16:49
