feat: add compare_with_vllm.py example-03 #38
Conversation
MLCommons CLA bot: All contributors have signed the MLCommons CLA ✍️ ✅
Summary of Changes: Hello @viraatc, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request adds a new example to facilitate direct performance comparisons between the inference-endpoint and vLLM benchmarking tools.
Pull request overview
This PR adds a benchmark comparison example that allows users to compare performance metrics between inference-endpoint and vLLM's benchmarking tools using identical prompts.
Key Changes:
- Adds a new example script that benchmarks both vLLM and inference-endpoint on the same dataset
- Includes pre-generated JSONL datasets in formats compatible with each tool
- Provides comprehensive documentation with usage examples
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| examples/03_BenchmarkComparison/compare_with_vllm.py | Main comparison script that runs both benchmarks and displays comparative metrics |
| examples/03_BenchmarkComparison/dummy_prompts_vllm.jsonl | Dataset file with 96 prompts in vLLM format ({"prompt": "..."}) |
| examples/03_BenchmarkComparison/dummy_prompts_ie.jsonl | Dataset file with 96 prompts in inference-endpoint format ({"text_input": "..."}) |
| examples/03_BenchmarkComparison/README.md | Documentation explaining prerequisites, usage, and expected output |
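The two dataset files carry the same 96 prompts and differ only in the JSON key each line uses (`prompt` vs. `text_input`). As a quick illustration, one format can be derived from the other as sketched below; this helper is not part of the PR, and its name is made up, but the file names and keys match the descriptions in the table above:

```python
import json

def convert_vllm_to_ie(src_path: str, dst_path: str) -> None:
    """Hypothetical helper: rewrite {"prompt": ...} lines as {"text_input": ...}."""
    with open(src_path, encoding="utf-8") as src, \
         open(dst_path, "w", encoding="utf-8") as dst:
        for raw in src:
            raw = raw.strip()
            if not raw:  # skip blank lines
                continue
            record = json.loads(raw)
            dst.write(json.dumps({"text_input": record["prompt"]}) + "\n")

# e.g. convert_vllm_to_ie("dummy_prompts_vllm.jsonl", "dummy_prompts_ie.jsonl")
```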
Code Review
This pull request introduces a new example script for comparing the performance of inference-endpoint with vllm. The script is well-structured and includes helpful features like a dry-run mode and server warmup. The accompanying README provides clear instructions. I've made a few suggestions to improve code compatibility, robustness, and maintainability. Additionally, I noticed a duplicate prompt in the dummy datasets.
@anandhu-eng can you try out this example to ensure that we aren't missing anything?
Force-pushed from a44df83 to 895b918
Pull request overview
Copilot reviewed 21 out of 21 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (1)
examples/03_BenchmarkComparison/compare_with_vllm.py:1
- This appears to be duplicated logic from dataloader.py line 328. The walrus operator pattern `if line := line.strip():` is used in dataloader.py, but here on line 328 there is just `line.strip()` without assignment, which suggests this may be unintended or the diff display is incorrect. If this is actually in compare_with_vllm.py, this line has no effect.
#!/usr/bin/env python3
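To make the reviewer's point concrete, here is a minimal sketch (not taken from the PR; the helper name is illustrative) contrasting the walrus-operator pattern the comment quotes from dataloader.py with a bare, no-op call:

```python
import json

def load_jsonl(path: str) -> list[dict]:
    """Sketch of the pattern the reviewer refers to, as used in dataloader.py."""
    records = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            if line := line.strip():  # walrus: bind the stripped value, skip blank lines
                records.append(json.loads(line))
    return records

# By contrast, a bare `line.strip()` on its own line discards its return value,
# leaves `line` unchanged, and therefore has no effect -- the reviewer's concern.
```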
rebased onto #32
Force-pushed from 4476f1c to 34ff0c8
Pull request overview
Copilot reviewed 15 out of 15 changed files in this pull request and generated 3 comments.
Force-pushed from a729cad to 1e23313
Force-pushed from 1e23313 to 8f7d5d4
Pull request overview
Copilot reviewed 19 out of 19 changed files in this pull request and generated 2 comments.
Force-pushed from 8f7d5d4 to 246bca2
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
Force-pushed from 1d5a568 to cd7c959
Pull request overview
Copilot reviewed 29 out of 29 changed files in this pull request and generated 3 comments.
Force-pushed from bc5420a to e3dce1b
Force-pushed from e3dce1b to c3eec49
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated no new comments.
Pull request overview
Copilot reviewed 7 out of 8 changed files in this pull request and generated 1 comment.
What does this PR do?
Adds an example script that runs both vLLM and inference-endpoints benchmarks against a given endpoint-url and compares the resulting metrics.
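For readers of this thread, a rough sketch of the "compare metrics" step is below. It is not the PR's code; the function name, metric names, and numbers are made-up placeholders used only to show the side-by-side idea:

```python
def print_comparison(ie_metrics: dict, vllm_metrics: dict) -> None:
    """Illustrative only: print one metrics dict from each tool side by side."""
    keys = sorted(set(ie_metrics) | set(vllm_metrics))
    print(f"{'metric':<24}{'inference-endpoint':>20}{'vllm':>12}")
    for key in keys:
        print(f"{key:<24}{ie_metrics.get(key, float('nan')):>20.2f}"
              f"{vllm_metrics.get(key, float('nan')):>12.2f}")

# Placeholder values, not measured results:
print_comparison(
    {"mean_ttft_ms": 40.0, "throughput_req_s": 12.0},
    {"mean_ttft_ms": 40.0, "throughput_req_s": 12.0},
)
```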
Type of change
Related issues
Testing
Checklist