[test]Evaluate model performance and accuracy with UCM by ayaka836 · Pull Request #642 · ModelEngine-Group/unified-cache-management

ayaka836 · 2026-01-13T02:14:05Z

Purpose

This PR introduces a comprehensive model validation test suite to evaluate both performance (latency metrics) and accuracy (F1-score) under the following three key UCM caching scenarios:

Naive: No cache hits (hit rate = 0%) — serves as the baseline with full recomputation.
Sparse: Evaluated at hit rate = 0% to assess how sparsity-aware mechanisms affect performance and accuracy.
Prefix Caching (PC): Evaluated across multiple hit rates [0%, 30%, 50%, 80%, 100%] to demonstrate the impact of prefix reuse on inference latency.
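
As a rough illustration of the scenario matrix above, the following sketch enumerates the three scenarios and splits an input into a reusable prefix and a fresh suffix for a target hit rate. The names `SCENARIOS`, `split_for_hit_rate`, and `expand_scenarios` are hypothetical and are not taken from `test_model_validate.py`. The token-level split is also an assumption, since a real cache may match at block granularity.

```python
# Hypothetical sketch of the scenario matrix; not the PR's actual test code.
SCENARIOS = {
    "naive": [0],                     # baseline, full recomputation
    "sparse": [0],                    # sparsity-aware path, no cache hits
    "pc": [0, 30, 50, 80, 100],       # prefix caching across hit rates
}


def split_for_hit_rate(num_input_tokens: int, hit_rate_pct: int) -> tuple[int, int]:
    """Return (cached_prefix_tokens, fresh_suffix_tokens) for a target hit rate.

    Assumes the hit rate is the share of input tokens served from the cache.
    """
    if not 0 <= hit_rate_pct <= 100:
        raise ValueError(f"hit rate must be in [0, 100], got {hit_rate_pct}")
    cached = num_input_tokens * hit_rate_pct // 100
    return cached, num_input_tokens - cached


def expand_scenarios(num_input_tokens: int = 8000) -> list[tuple[str, int, int, int]]:
    """Flatten SCENARIOS into (name, hit_rate_pct, cached_tokens, fresh_tokens) cases."""
    return [
        (name, rate, *split_for_hit_rate(num_input_tokens, rate))
        for name, rates in SCENARIOS.items()
        for rate in rates
    ]


if __name__ == "__main__":
    for case in expand_scenarios():
        print(case)
```

A list like the one returned by `expand_scenarios()` could be fed to a parametrized test so that each scenario and hit rate runs as its own case.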

Modifications

Added test cases to verify whether the model is compatible with PC and sparsification.

Test

==============================================================================================================
Hit Rate (%) Input Tokens    Output Tokens   Concurrency  TTFT_mean [s]        TPOT_mean [s]        E2E_mean [s]        
--------------------------------------------------------------------------------------------------------------
0            8000            200             8            2.9202               0.0300               8.9285              
30           8000            200             8            2.8951               0.0297               8.8281              
50           8000            200             8            2.8731               0.0300               8.8691              
80           8000            200             8            2.9001               0.0299               8.8861              
100          8000            200             8            2.8534               0.0305               8.9540              
==============================================================================================================
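
The three latency columns can be sanity-checked against each other. The sketch below assumes the common decomposition E2E ≈ TTFT + TPOT × (output tokens − 1); the PR does not state how the benchmark computes these means, so this is a consistency check under that assumption and not a description of the tool. It also reports the TTFT change relative to the 0% row.

```python
# Rows copied from the table above: (hit_rate_pct, ttft_s, tpot_s, e2e_s).
ROWS = [
    (0, 2.9202, 0.0300, 8.9285),
    (30, 2.8951, 0.0297, 8.8281),
    (50, 2.8731, 0.0300, 8.8691),
    (80, 2.9001, 0.0299, 8.8861),
    (100, 2.8534, 0.0305, 8.9540),
]
OUTPUT_TOKENS = 200


def estimate_e2e(ttft_s: float, tpot_s: float, output_tokens: int = OUTPUT_TOKENS) -> float:
    """Assumed decomposition: time to first token, then (n - 1) decode steps."""
    return ttft_s + tpot_s * (output_tokens - 1)


def ttft_change_pct(ttft_s: float, baseline_ttft_s: float) -> float:
    """Relative TTFT change versus the baseline row, in percent."""
    return 100.0 * (ttft_s - baseline_ttft_s) / baseline_ttft_s


def summarize(rows=ROWS) -> list[dict]:
    """Compare reported E2E with the estimate and TTFT with the 0% row."""
    baseline_ttft = rows[0][1]
    return [
        {
            "hit_rate": rate,
            "e2e_gap_s": e2e - estimate_e2e(ttft, tpot),
            "ttft_change_pct": ttft_change_pct(ttft, baseline_ttft),
        }
        for rate, ttft, tpot, e2e in rows
    ]


if __name__ == "__main__":
    for row in summarize():
        print(row)
```

On these numbers the reported E2E stays within 0.05 s of the estimate for every row, and TTFT at a 100% hit rate is roughly 2% lower than at 0%.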

========================================
Test            f1-score
----------------------------------------
PC              0.0229      
========================================
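
The PR does not say which F1 variant or dataset produced this score. For reference, a minimal sketch of the token-overlap F1 often used for question-answering evaluation is shown below. The function name `token_f1` and the whitespace tokenization are assumptions made for illustration.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a predicted answer and a reference answer.

    Assumes lowercase, whitespace-split tokens; real evaluators often also
    strip punctuation and articles before comparing.
    """
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == ref_tokens)

    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(token_f1("the cat", "the cat sat"))
```

Averaging such per-sample scores over a dataset gives a single figure comparable in form to the one in the table.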

Wwwzff

LGTM

ayaka836 requested review from Wwwzff, mag1c-h and ygwpz as code owners January 13, 2026 02:14

ayaka836 force-pushed the test_model_validate branch 2 times, most recently from 1f9b201 to 75d09dd Compare January 19, 2026 01:52

Wwwzff force-pushed the test_model_validate branch from 75d09dd to f9de3fd Compare January 19, 2026 02:08

ayaka836 force-pushed the test_model_validate branch from f9de3fd to 7472627 Compare January 19, 2026 02:25

Wwwzff reviewed Jan 19, 2026

Comment thread test/suites/E2E/test_model_validate.py

ayaka836 force-pushed the test_model_validate branch 2 times, most recently from 103abf5 to c8de951 Compare January 19, 2026 08:23

Wwwzff previously approved these changes Jan 19, 2026

Wwwzff force-pushed the test_model_validate branch from c8de951 to 37af8ee Compare January 19, 2026 08:24

ayaka836 dismissed Wwwzff’s stale review via 0969138 January 23, 2026 06:40

ayaka836 force-pushed the test_model_validate branch from 37af8ee to 0969138 Compare January 23, 2026 06:40

ayaka836 requested a review from Wwwzff January 23, 2026 06:48

yuanzhg078 force-pushed the test_model_validate branch from 0969138 to d4073f8 Compare January 23, 2026 08:21

[test]Evaluate model performance and accuracy with UCM

0c45a1a

ayaka836 force-pushed the test_model_validate branch from d4073f8 to 0c45a1a Compare January 23, 2026 08:40

ayaka836 requested review from FangRun2 and Tarrei as code owners January 23, 2026 08:40

Wwwzff approved these changes Jan 23, 2026

yuanzhg078 and others added 2 commits January 23, 2026 17:04

Merge branch 'develop' into test_model_validate

a26a92b

Merge branch 'develop' into test_model_validate

ace27d3

mag1c-h approved these changes Jan 23, 2026

mag1c-h merged commit 720b122 into ModelEngine-Group:develop Jan 23, 2026
11 of 12 checks passed

mag1c-h merged 3 commits into ModelEngine-Group:develop from ayaka836:test_model_validate
