[Blog] Model inference with Prefill-Decode disaggregation #6480

Job Run time
5s
1m 12s
1m 30s
15s
36s
9s
57s
43s
21s
21s
24s
5m 5s
2m 43s
3m 12s
2m 20s
2m 56s
5m 54s
2m 55s
2m 10s
6m 0s
4m 21s
3m 22s
3m 58s
6m 26s
2m 10s
5m 41s
19s
18s
1h 6m 23s