Commit 720b122
[test]Evaluate model performance and accuracy with UCM (#642)
# Purpose
This PR introduces a comprehensive model validation test suite to
evaluate both performance (latency metrics) and accuracy (F1-score)
under the following three key UCM caching scenarios:
- Naive: No cache hits (hit rate = 0%); serves as the baseline with full
  recomputation.
- Sparse: Evaluated at hit rate = 0% to assess the performance and
  accuracy gains enabled by sparsity-aware mechanisms.
- Prefix Caching (PC): Evaluated across multiple hit rates [0%, 30%, 50%,
  80%, 100%] to demonstrate the impact of prefix reuse on inference
  latency.
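
As an illustration of what a prefix-cache hit rate means for a single request, the sketch below builds a prompt that shares a leading fraction of its tokens with a previously sent ("warm") prompt. This is a hypothetical helper, not the suite's actual implementation: the name `build_prompt_tokens`, the `vocab_size` default, and the token-level granularity are all assumptions. Real caches usually match in fixed-size blocks, so the effective hit rate would likely be rounded to block boundaries.

```python
import random


def build_prompt_tokens(warm_tokens, hit_rate_pct, vocab_size=32000, seed=0):
    """Return a prompt whose leading hit_rate_pct% of tokens match warm_tokens.

    Hypothetical helper for illustration only; the remaining tokens are
    random so that they should not be served from a prefix cache.
    """
    if not 0 <= hit_rate_pct <= 100:
        raise ValueError("hit_rate_pct must be in [0, 100]")
    total = len(warm_tokens)
    shared = total * hit_rate_pct // 100
    rng = random.Random(seed)
    fresh = [rng.randrange(vocab_size) for _ in range(total - shared)]
    # Make sure the shared prefix ends exactly where intended: the first
    # fresh token must differ from the warm prompt at the same position.
    if fresh and fresh[0] == warm_tokens[shared]:
        fresh[0] = (fresh[0] + 1) % vocab_size
    return list(warm_tokens[:shared]) + fresh
```

For the 8000-token inputs reported in the Test section, a 30% hit rate under this scheme would correspond to 2400 shared leading tokens.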
# Modifications
Added test cases to verify whether the model is compatible with PC and
sparsification.
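
The accuracy check relies on an F1-score. The PR does not say how it is computed; a common choice for generated answers is a token-overlap F1 (as used in SQuAD-style evaluation), sketched below under that assumption. The function name `token_f1` and the whitespace tokenization are illustrative, not taken from the suite.

```python
from collections import Counter


def token_f1(prediction, reference):
    """Token-overlap F1 between a predicted and a reference answer.

    Assumes lowercase whitespace tokenization; the actual suite may
    normalize text differently.
    """
    pred = prediction.lower().split()
    ref = reference.lower().split()
    if not pred or not ref:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred == ref)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)
```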
# Test
```
==============================================================================================================
Hit Rate (%)   Input Tokens   Output Tokens   Concurrency   TTFT_mean [s]   TPOT_mean [s]   E2E_mean [s]
--------------------------------------------------------------------------------------------------------------
0              8000           200             8             2.9202          0.0300          8.9285
30             8000           200             8             2.8951          0.0297          8.8281
50             8000           200             8             2.8731          0.0300          8.8691
80             8000           200             8             2.9001          0.0299          8.8861
100            8000           200             8             2.8534          0.0305          8.9540
==============================================================================================================
========================================
Test      f1-score
----------------------------------------
PC        0.0229
========================================
```
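
For readers unfamiliar with the column names: TTFT is time to first token, TPOT is time per output token, and E2E is end-to-end latency. The sketch below shows one conventional way to derive them from per-token arrival timestamps. It is an assumption about the definitions, not the suite's code, and the names `latency_metrics` and `summarize` are hypothetical.

```python
from statistics import mean


def latency_metrics(send_time, token_times):
    """Return (ttft, tpot, e2e) in seconds for one request.

    send_time: when the request was sent.
    token_times: arrival time of each output token, in order.
    """
    if not token_times:
        raise ValueError("need at least one output token timestamp")
    ttft = token_times[0] - send_time
    e2e = token_times[-1] - send_time
    n = len(token_times)
    # TPOT averages the decode phase only, i.e. the tokens after the first.
    tpot = (e2e - ttft) / (n - 1) if n > 1 else 0.0
    return ttft, tpot, e2e


def summarize(requests):
    """Average the per-request metrics over (send_time, token_times) pairs."""
    ttfts, tpots, e2es = zip(*(latency_metrics(s, t) for s, t in requests))
    return {
        "TTFT_mean": mean(ttfts),
        "TPOT_mean": mean(tpots),
        "E2E_mean": mean(e2es),
    }
```

Under these definitions, each request satisfies E2E = TTFT + TPOT × (output tokens − 1), which is roughly consistent with the table above (for example, 2.9202 + 0.0300 × 199 ≈ 8.89 s against a reported 8.9285 s at 0% hit rate).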
Co-authored-by: yuanzhg078 <939526371@qq.com>
Co-authored-by: Mag1c.H <hemajun815@163.com>
1 parent 1fa0614 commit 720b122
9 files changed
Lines changed: 2602 additions & 19 deletions
File tree
- test
- common
- uc_eval/utils
- suites/E2E