Commit 720b122

ayaka836, yuanzhg078, and mag1c-h authored
[test]Evaluate model performance and accuracy with UCM (#642)
# Purpose

This PR introduces a comprehensive model validation test suite to evaluate both performance (latency metrics) and accuracy (F1-score) under the following three key UCM caching scenarios:

- Naive: No cache hits (hit rate = 0%); serves as the baseline with full recomputation.
- Sparse: Evaluated at hit rate = 0% to assess the performance and accuracy gains enabled by sparsity-aware mechanisms.
- Prefix Caching (PC): Evaluated across multiple hit rates [0%, 30%, 50%, 80%, 100%] to demonstrate the impact of prefix reuse on inference latency.

# Modifications

Added test cases to verify whether the model is compatible with PC and sparsification.

# Test

```
========================================================================================================
Hit Rate (%)   Input Tokens   Output Tokens   Concurrency   TTFT_mean [s]   TPOT_mean [s]   E2E_mean [s]
--------------------------------------------------------------------------------------------------------
0              8000           200             8             2.9202          0.0300          8.9285
30             8000           200             8             2.8951          0.0297          8.8281
50             8000           200             8             2.8731          0.0300          8.8691
80             8000           200             8             2.9001          0.0299          8.8861
100            8000           200             8             2.8534          0.0305          8.9540
========================================================================================================

========================================
Test           f1-score
----------------------------------------
PC             0.0229
========================================
```

Co-authored-by: yuanzhg078 <939526371@qq.com>
Co-authored-by: Mag1c.H <hemajun815@163.com>
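The commit message reports accuracy as an F1-score but does not show how it is computed. As a rough illustration only, the sketch below implements the token-overlap F1 commonly used for question-answering benchmarks. The function name `token_f1` and the character-level tokenization are assumptions made here, not something taken from the test suite.

```python
from collections import Counter
from typing import Sequence


def token_f1(prediction: Sequence[str], reference: Sequence[str]) -> float:
    """Token-overlap F1 between a predicted and a reference answer.

    Hypothetical helper: the suite's real scorer may tokenize and
    normalize answers differently.
    """
    # Multiset intersection counts each shared token at most as often
    # as it appears in both sequences.
    overlap = sum((Counter(prediction) & Counter(reference)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(prediction)
    recall = overlap / len(reference)
    return 2 * precision * recall / (precision + recall)


# Character-level tokens are an assumption that avoids needing a
# word segmenter for Chinese text.
score = token_f1(list("abcd"), list("abxy"))
print(f"{score:.2f}")
```

Under this definition a score near zero means the generated answers share almost no tokens with the references.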
1 parent 1fa0614 commit 720b122

9 files changed

Lines changed: 2602 additions & 19 deletions

.pre-commit-config.yaml

Lines changed: 2 additions & 0 deletions
@@ -3,6 +3,7 @@ repos:
     rev: v2.4.1
     hooks:
       - id: codespell
+        exclude: \.(jsonl|txt)$
         args: [
           '--skip', 'ucm/csrc/**,./ucm.egg-info/**,.github/**',
           '-L', 'CANN,cann,NNAL,nnal,ASCEND,ascend,EnQue,CopyIn'
@@ -22,6 +23,7 @@ repos:
     rev: v1.7.7
     hooks:
       - id: actionlint
+        exclude: \.(jsonl|txt)$
 default_stages:
   - pre-commit
   - manual
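The `exclude` pattern added to both hooks can be checked quickly. This is a minimal sketch, assuming pre-commit treats `exclude` as a Python regular expression searched against each file's repository-relative path. The helper `is_excluded` and the two `docs/` paths are illustrative and do not appear in the commit.

```python
import re

# The pattern added to the codespell and actionlint hooks in this commit.
EXCLUDE = re.compile(r"\.(jsonl|txt)$")


def is_excluded(path: str) -> bool:
    """Return True if a hook using EXCLUDE would skip this path (hypothetical helper)."""
    return EXCLUDE.search(path) is not None


paths = [
    "test/common/uc_eval/utils/multifieldqa_zh.jsonl",  # dataset added by this commit
    "test/common/db_utils.py",
    "docs/notes.txt",      # illustrative path
    "docs/notes.txt.bak",  # illustrative path; the `$` anchor keeps it included
]
for p in paths:
    print(p, is_excluded(p))
```

The `$` anchor matters: without it, any path merely containing `.txt` or `.jsonl` would be skipped as well.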

test/common/db_utils.py

Lines changed: 53 additions & 1 deletion
@@ -3,7 +3,10 @@
 import threading
 from datetime import datetime
 from pathlib import Path
-from typing import Any, Dict, Optional
+from typing import Any, Dict, List, Optional
+
+from peewee import AutoField, DateTimeField, Model, TextField
+from playhouse.reflection import Introspector
 
 # Lazy imports for database components
 peewee = None
@@ -204,6 +207,55 @@ def write_to_db(table_name: str, data: Dict[str, Any]) -> bool:
         return False
 
 
+def read_from_db(
+    table_name: str, filters: Optional[Dict[str, Any]] = None, limit: int = 1
+) -> List[Dict[str, Any]]:
+    db_config = _get_db_config()
+    if not db_config.get("enabled", False):
+        logger.warning("Database disabled. Skipping read.")
+        return []
+
+    db = _get_db()
+    if db is None:
+        logger.error("Failed to connect to database.")
+        return []
+
+    _ensure_peewee_imported()
+
+    try:
+        introspector = Introspector.from_database(db)
+        DynamicModel = introspector.generate_models(table_names=[table_name]).get(
+            table_name
+        )
+
+        if DynamicModel is None:
+            logger.warning(f"Table '{table_name}' not found in database.")
+            return []
+
+        query = DynamicModel.select()
+
+        if filters:
+            for key, value in filters.items():
+                if hasattr(DynamicModel, key):
+                    field = getattr(DynamicModel, key)
+                    query = query.where(field == value)
+                else:
+                    logger.warning(
+                        f"Filter key '{key}' does not exist in table '{table_name}'. Skipped."
+                    )
+
+        query = query.order_by(DynamicModel.created_at.desc()).limit(limit)
+
+        results = []
+        for row in query:
+            results.append(row.__data__)
+        return results
+
+    except Exception as e:
+        logger.error(f"Error reading from table '{table_name}': {e}")
+        return []
+
+
 def database_connection(build_id: str) -> None:
     logger.info(f"Setting test build ID: {build_id}")
     _set_test_build_id(build_id)
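The query that `read_from_db` builds (equality filters joined with AND, unknown filter keys skipped, newest `created_at` first, capped at `limit`) can be illustrated without peewee. The sketch below uses the standard-library `sqlite3` module in place of the commit's peewee `Introspector`. The function `read_rows`, the `results` table, and every column and value except `created_at` are hypothetical.

```python
import sqlite3
from typing import Any, Dict, List, Optional


def read_rows(
    conn: sqlite3.Connection,
    table_name: str,  # assumed to be a trusted identifier, not user input
    filters: Optional[Dict[str, Any]] = None,
    limit: int = 1,
) -> List[Dict[str, Any]]:
    """Approximate read_from_db's query semantics with plain SQL (illustrative)."""
    # Column names cannot be bound as SQL parameters, so check them against
    # the schema first; this plays the role of the hasattr() test.
    columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{table_name}")')]
    if not columns:
        return []  # table not found

    clauses, params = [], []
    for key, value in (filters or {}).items():
        if key in columns:
            clauses.append(f'"{key}" = ?')
            params.append(value)
        # Unknown keys are skipped, as in the original function.

    sql = f'SELECT * FROM "{table_name}"'
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    sql += " ORDER BY created_at DESC LIMIT ?"
    params.append(limit)

    cursor = conn.execute(sql, params)
    names = [d[0] for d in cursor.description]
    return [dict(zip(names, row)) for row in cursor]


# Hypothetical table and rows, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE results "
    "(id INTEGER PRIMARY KEY, test_build_id TEXT, f1 REAL, created_at TEXT)"
)
conn.executemany(
    "INSERT INTO results (test_build_id, f1, created_at) VALUES (?, ?, ?)",
    [
        ("build-1", 0.10, "2025-01-01 10:00:00"),
        ("build-1", 0.20, "2025-01-02 10:00:00"),
        ("build-2", 0.30, "2025-01-03 10:00:00"),
    ],
)
latest = read_rows(conn, "results", {"test_build_id": "build-1", "no_such_column": 1})
print(latest)
```

With the default `limit=1`, a caller gets only the most recent matching row, which suits a "fetch the latest result for this build" lookup. Note that the ordering assumes the table has a `created_at` column.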

test/common/uc_eval/utils/multifieldqa_zh.jsonl

Lines changed: 200 additions & 0 deletions
Large diffs are not rendered by default.
