[fix] Pre-parse data + decouple args#31
Conversation
Signed-off-by: Rashid Kaleem <230885705+arekay-nv@users.noreply.github.com>
|
MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅ |
Signed-off-by: Rashid Kaleem <230885705+arekay-nv@users.noreply.github.com>
Signed-off-by: Rashid Kaleem <230885705+arekay-nv@users.noreply.github.com>
Signed-off-by: Rashid Kaleem <230885705+arekay-nv@users.noreply.github.com>
Signed-off-by: Rashid Kaleem <230885705+arekay-nv@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
This PR fixes the dataloader to pre-parse dataset samples and decouples command-line arguments from benchmark execution logic. The pre-parsing optimization moves data parsing out of the hot loop to improve request throughput, while the argument decoupling enables easier config serialization/deserialization for reproducible benchmark runs.
Key Changes:
- Pre-parse all dataset samples during load instead of on-demand parsing
- Store command-line arguments (report_dir, timeout, output, verbose) directly in BenchmarkConfig
- Remove max_concurrency setting from client configuration across all templates and code
Reviewed changes
Copilot reviewed 17 out of 17 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| src/inference_endpoint/dataset_manager/dataloader.py | Implements pre-parsing by iterating through data during load() and storing parsed results |
| src/inference_endpoint/commands/benchmark.py | Refactors to use config fields instead of args namespace, removes args parameter from _run_benchmark |
| src/inference_endpoint/config/schema.py | Adds output, report_dir, timeout, verbose fields to BenchmarkConfig; removes max_concurrency validation |
| src/inference_endpoint/load_generator/session.py | Renames report_path parameter to report_dir for consistency |
| src/inference_endpoint/cli.py | Renames --report-path CLI argument to --report-dir |
| tests/unit/commands/test_benchmark.py | Removes max_concurrency assertions and adds new config fields to mock args |
| tests/unit/config/test_yaml_loader.py | Skips concurrency validation tests pending refactor |
| tests/integration/*.py | Updates parser functions to access dictionary keys instead of object attributes |
| tests/conftest.py | Simplifies parser to return row directly instead of reconstructing dictionary |
| Config templates (*.yaml) | Removes max_concurrency setting from all template files |
| Example configs (*.yaml) | Removes max_concurrency setting from example configuration files |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Signed-off-by: Rashid Kaleem <230885705+arekay-nv@users.noreply.github.com>
Signed-off-by: Rashid Kaleem <230885705+arekay-nv@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 17 out of 17 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Signed-off-by: Rashid Kaleem <230885705+arekay-nv@users.noreply.github.com>
Fixes the dataloader to pre-parse and decouples the args from benchmark execution.
Type of change
Related issues
Testing
Checklist