Commit d6e3b3e
committed
feat: auto-calculate max_new_tokens to align with vLLM behavior
When max_new_tokens is not specified (None or -1), automatically
calculate it as max_req_total_len - prompt_tokens. This aligns
with vLLM's behavior where max_tokens defaults to the remaining
context length.
Changes:
- sampling_params.py: default max_new_tokens changed from 16384 to -1
- py_sampling_params.py: default max_new_tokens changed from 16384 to None
- manager.py: add auto-calculation logic in _check_and_repair_length1 parent 391d2ea commit d6e3b3e
File tree
3 files changed
+24
-6
lines changed- lightllm/server
- core/objs
- httpserver
3 files changed
+24
-6
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
38 | 38 | | |
39 | 39 | | |
40 | 40 | | |
41 | | - | |
| 41 | + | |
42 | 42 | | |
43 | 43 | | |
44 | 44 | | |
| |||
141 | 141 | | |
142 | 142 | | |
143 | 143 | | |
144 | | - | |
| 144 | + | |
145 | 145 | | |
146 | 146 | | |
147 | 147 | | |
148 | | - | |
| 148 | + | |
149 | 149 | | |
150 | 150 | | |
151 | 151 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
345 | 345 | | |
346 | 346 | | |
347 | 347 | | |
348 | | - | |
| 348 | + | |
349 | 349 | | |
350 | 350 | | |
351 | 351 | | |
| |||
439 | 439 | | |
440 | 440 | | |
441 | 441 | | |
442 | | - | |
| 442 | + | |
443 | 443 | | |
444 | 444 | | |
445 | 445 | | |
446 | | - | |
| 446 | + | |
447 | 447 | | |
448 | 448 | | |
449 | 449 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
477 | 477 | | |
478 | 478 | | |
479 | 479 | | |
| 480 | + | |
| 481 | + | |
| 482 | + | |
| 483 | + | |
| 484 | + | |
| 485 | + | |
| 486 | + | |
| 487 | + | |
| 488 | + | |
| 489 | + | |
| 490 | + | |
| 491 | + | |
| 492 | + | |
| 493 | + | |
| 494 | + | |
| 495 | + | |
| 496 | + | |
| 497 | + | |
480 | 498 | | |
481 | 499 | | |
482 | 500 | | |
| |||
0 commit comments