[LoRA] update docs for LoRA #3798
Open
likholat wants to merge 3 commits into
Pull request overview
This PR adds runtime validation and API documentation notes to clarify which LoRA adapter modes are supported when using ContinuousBatchingPipeline::add_request() + step(), and adds a Python test to validate the new restriction.
Changes:
- Added a runtime assertion in the Continuous Batching add_request() path to reject unsupported LoRA adapter modes.
- Updated C++ API documentation comments to describe LoRA mode limitations for add_request()+step().
- Added a Python test ensuring unsupported LoRA mode(s) are rejected by add_request().
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| tests/python_tests/test_continuous_batching.py | Adds a regression test for rejecting unsupported LoRA mode in CB add_request(). |
| src/cpp/src/continuous_batching/pipeline_impl.cpp | Adds runtime validation for LoRA adapter modes in CB add_request(). |
| src/cpp/src/continuous_batching/pipeline_base.hpp | Documents LoRA adapter mode limitations for CB add_request() overloads. |
| src/cpp/include/openvino/genai/continuous_batching_pipeline.hpp | Documents LoRA adapter mode limitations on the public CB API. |
Comment on lines +271 to +276

    if (sampling_params.adapters.has_value()) {
        const auto mode = sampling_params.adapters->get_mode();
        OPENVINO_ASSERT(mode != AdapterConfig::MODE_DYNAMIC && mode != AdapterConfig::MODE_AUTO && mode != AdapterConfig::MODE_STATIC_RANK,
                        "MODE_DYNAMIC, MODE_AUTO, and MODE_STATIC_RANK LoRA adapters are not supported in the add_request() + step() flow. "
                        "Use MODE_STATIC or MODE_FUSE instead.");
    }
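
The check above can be modelled outside the library to make the accept/reject split explicit. The following is a minimal sketch, not the OpenVINO GenAI API: `Mode`, `UNSUPPORTED` and `check_adapter_mode` are hypothetical stand-ins that only reuse the mode names and the error text from the diff, and `None` stands for a request without adapters.

```python
from enum import Enum, auto


class Mode(Enum):
    """Hypothetical stand-in for the AdapterConfig mode constants named in the diff."""
    MODE_AUTO = auto()
    MODE_DYNAMIC = auto()
    MODE_STATIC_RANK = auto()
    MODE_STATIC = auto()
    MODE_FUSE = auto()


# Modes the diff rejects in the add_request() + step() flow.
UNSUPPORTED = frozenset({Mode.MODE_DYNAMIC, Mode.MODE_AUTO, Mode.MODE_STATIC_RANK})


def check_adapter_mode(mode):
    """Raise RuntimeError for a rejected mode; None means no adapters were set."""
    if mode is None:
        return  # mirrors the has_value() guard: nothing to validate
    if mode in UNSUPPORTED:
        raise RuntimeError(
            "MODE_DYNAMIC, MODE_AUTO, and MODE_STATIC_RANK LoRA adapters are not "
            "supported in the add_request() + step() flow. "
            "Use MODE_STATIC or MODE_FUSE instead."
        )
```

Under these assumptions, `check_adapter_mode(Mode.MODE_FUSE)` returns silently, while `check_adapter_mode(Mode.MODE_AUTO)` raises `RuntimeError`, which is the exception type the Python test in this PR expects.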
Comment on lines +839 to +840

    with pytest.raises(RuntimeError):
        pipe.add_request(0, "test prompt", generation_config=config)
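
A test of this shape could be extended to cover every rejected mode and to confirm that the supported modes still pass. The sketch below is an illustration only: `FakePipeline` and its `adapter_mode` keyword are hypothetical and do not reflect the real `add_request()` signature, and it uses the standard-library `unittest` module to stay dependency-free. The same structure would map onto `pytest.raises` with a parametrized mode.

```python
import unittest

UNSUPPORTED_MODES = ("MODE_DYNAMIC", "MODE_AUTO", "MODE_STATIC_RANK")
SUPPORTED_MODES = ("MODE_STATIC", "MODE_FUSE")


class FakePipeline:
    """Hypothetical stand-in; it only imitates the rejection described in this PR."""

    def add_request(self, request_id, prompt, adapter_mode=None):
        if adapter_mode in UNSUPPORTED_MODES:
            raise RuntimeError(
                f"{adapter_mode} is not supported in the add_request() + step() flow"
            )
        return request_id


class AddRequestLoraModeTest(unittest.TestCase):
    def test_unsupported_modes_are_rejected(self):
        pipe = FakePipeline()
        for mode in UNSUPPORTED_MODES:
            # subTest reports each mode separately if one of them fails.
            with self.subTest(mode=mode):
                with self.assertRaisesRegex(RuntimeError, "not supported"):
                    pipe.add_request(0, "test prompt", adapter_mode=mode)

    def test_supported_modes_are_accepted(self):
        pipe = FakePipeline()
        for mode in SUPPORTED_MODES:
            with self.subTest(mode=mode):
                self.assertEqual(
                    pipe.add_request(0, "test prompt", adapter_mode=mode), 0
                )
```

Matching on part of the message, rather than on the exception type alone, helps ensure the test fails for the intended reason and not because of an unrelated `RuntimeError`.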
Comment on lines +199 to +200

    /// @note LoRA adapters are only supported in MODE_STATIC or MODE_FUSE modes.
    /// MODE_DYNAMIC, MODE_AUTO and MODE_STATIC_RANK are not supported in the add_request() + step() flow.
Comment on lines 81 to 85

      /**
    -  * Adds requests to awaiting queue using encoded inputs
    +  * Adds requests to awaiting queue using encoded inputs.
    +  * @note LoRA adapters are only supported in MODE_STATIC or MODE_FUSE modes.
    +  * MODE_DYNAMIC, MODE_AUTO and MODE_STATIC_RANK are not supported in the add_request() + step() flow.
       */
Description
Add documentation and runtime validation for LoRA adapter mode limitations in the ContinuousBatchingPipeline add_request() + step() flow.

Related to #3677
Checklist: