Commit 4af002f
test(llm_eval): widen tiny qwen3 max_position_embeddings for fp8 e2e test
The default tiny qwen3 max_position_embeddings is 32, shorter than typical
MMLU prompts (39-46 tokens). lm-eval's HFLM path tolerates this by truncating,
but the TRT-LLM serve path used in test_qwen3_eval_fp8 rejects oversized
prompts with `default_max_tokens (-14) must be greater than 0`. Bump to 2048
to give TRT-LLM headroom for MMLU/hellaswag/gsm8k/humaneval prompts.
Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>

1 parent 8d83ac3 commit 4af002f
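
The `-14` in the error message matches a simple budget calculation: a 32-token context window minus a 46-token prompt. The sketch below illustrates that arithmetic only. It assumes the serve path derives its default generation budget as "maximum sequence length minus prompt length"; the helper names `remaining_token_budget` and `check_prompt_fits` are hypothetical and are not TRT-LLM APIs.

```python
# Illustrative sketch only: these helpers are hypothetical, not TRT-LLM code.
# Assumption: the default generation budget is max_seq_len - prompt_len,
# which is consistent with the reported value of -14 (32 - 46).


def remaining_token_budget(max_seq_len: int, prompt_len: int) -> int:
    """Return the tokens left for generation after the prompt is placed."""
    return max_seq_len - prompt_len


def check_prompt_fits(max_seq_len: int, prompt_len: int) -> int:
    """Return the generation budget, or raise if the prompt leaves no room."""
    budget = remaining_token_budget(max_seq_len, prompt_len)
    if budget <= 0:
        raise ValueError(
            f"default_max_tokens ({budget}) must be greater than 0"
        )
    return budget


if __name__ == "__main__":
    # Prompt lengths quoted in the commit message: 39 to 46 tokens.
    for max_seq_len in (32, 2048):
        for prompt_len in (39, 46):
            budget = remaining_token_budget(max_seq_len, prompt_len)
            print(f"max_seq_len={max_seq_len} prompt_len={prompt_len} budget={budget}")
```

Under this assumption, every MMLU prompt in the quoted range overflows a 32-token window, while a 2048-token window leaves about two thousand tokens for generation.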
1 file changed
Lines changed: 3 additions & 1 deletion
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 41 | 41 | |
| 42 | 42 | |
| 43 | 43 | |
| 44 | | - |
| | 44 | + |
| | 45 | + |
| | 46 | + |
| 45 | 47 | |
| 46 | 48 | |
| 47 | 49 | |