Commit 316c490
committed
Adding following metrics into JetStream server:
1) kv_cache_utilization: This refers to percentage of memory in the allocated kv-cache on TPU HBM, that is actually used during decode. It is based on the percentage of slots used.
2) num_requests_waiting: Total number of requests which are waiting to be decoded.
3) lora_requests_info: List of LoRA adapters that are loaded into the TPU HBM for serving the requests.1 parent 3c6fcbd commit 316c490
4 files changed
Lines changed: 377 additions & 0 deletions
File tree
- jetstream
- core
- metrics
- tools
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
139 | 139 | | |
140 | 140 | | |
141 | 141 | | |
| 142 | + | |
| 143 | + | |
| 144 | + | |
| 145 | + | |
| 146 | + | |
| 147 | + | |
| 148 | + | |
| 149 | + | |
| 150 | + | |
| 151 | + | |
| 152 | + | |
142 | 153 | | |
143 | 154 | | |
144 | 155 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
214 | 214 | | |
215 | 215 | | |
216 | 216 | | |
| 217 | + | |
| 218 | + | |
| 219 | + | |
| 220 | + | |
| 221 | + | |
| 222 | + | |
| 223 | + | |
| 224 | + | |
| 225 | + | |
| 226 | + | |
| 227 | + | |
| 228 | + | |
| 229 | + | |
| 230 | + | |
| 231 | + | |
| 232 | + | |
| 233 | + | |
| 234 | + | |
| 235 | + | |
| 236 | + | |
| 237 | + | |
| 238 | + | |
| 239 | + | |
| 240 | + | |
| 241 | + | |
217 | 242 | | |
218 | 243 | | |
219 | 244 | | |
| |||
255 | 280 | | |
256 | 281 | | |
257 | 282 | | |
| 283 | + | |
| 284 | + | |
| 285 | + | |
| 286 | + | |
| 287 | + | |
| 288 | + | |
| 289 | + | |
| 290 | + | |
| 291 | + | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
314 | 314 | | |
315 | 315 | | |
316 | 316 | | |
| 317 | + | |
317 | 318 | | |
318 | 319 | | |
319 | 320 | | |
| |||
433 | 434 | | |
434 | 435 | | |
435 | 436 | | |
| 437 | + | |
| 438 | + | |
| 439 | + | |
| 440 | + | |
| 441 | + | |
| 442 | + | |
436 | 443 | | |
437 | 444 | | |
438 | 445 | | |
| |||
481 | 488 | | |
482 | 489 | | |
483 | 490 | | |
| 491 | + | |
| 492 | + | |
| 493 | + | |
| 494 | + | |
| 495 | + | |
| 496 | + | |
| 497 | + | |
| 498 | + | |
| 499 | + | |
| 500 | + | |
| 501 | + | |
| 502 | + | |
| 503 | + | |
| 504 | + | |
| 505 | + | |
| 506 | + | |
| 507 | + | |
| 508 | + | |
| 509 | + | |
| 510 | + | |
| 511 | + | |
| 512 | + | |
484 | 513 | | |
485 | 514 | | |
486 | 515 | | |
| |||
819 | 848 | | |
820 | 849 | | |
821 | 850 | | |
| 851 | + | |
| 852 | + | |
| 853 | + | |
| 854 | + | |
| 855 | + | |
| 856 | + | |
| 857 | + | |
| 858 | + | |
822 | 859 | | |
823 | 860 | | |
824 | 861 | | |
| |||
0 commit comments