This issue tracks progress on running Bamba on vLLM. Success for this issue implies the following:

- [ ] Running the model successfully from the HF checkpoint in vLLM (https://github.com/vllm-project/vllm/pull/10909) (a smoke-test sketch follows this list)
- [ ] Ensuring chunked prefill and tensor parallelism (TP) work in vLLM
- [ ] Closing the performance gap in vLLM with respect to Llama models of similar size
- [ ] Reporting the performance results in a blog post
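For reference, a minimal smoke-test sketch of the first two items, assuming the PR above is merged. The checkpoint id `ibm-ai-platform/Bamba-9B` is a placeholder assumption; `tensor_parallel_size` and `enable_chunked_prefill` are standard vLLM engine arguments used here to exercise TP and chunked prefill.

```python
# Minimal smoke test: load the HF checkpoint in vLLM and generate a completion.
# Assumes the Bamba support PR is merged; the checkpoint id below is a placeholder.
from vllm import LLM, SamplingParams

llm = LLM(
    model="ibm-ai-platform/Bamba-9B",  # placeholder HF checkpoint id
    tensor_parallel_size=2,            # exercises TP (second checklist item)
    enable_chunked_prefill=True,       # exercises chunked prefill
)

params = SamplingParams(temperature=0.0, max_tokens=64)
outputs = llm.generate(["Explain what a state-space model is."], params)
print(outputs[0].outputs[0].text)
```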
cc @raghukiran1224 @fabianlim @AdnanHoque