RHIDP-14062: update provider types for synthesizer by Jdubrick · Pull Request #2007 · lightspeed-core/lightspeed-stack

Jdubrick · 2026-06-26T19:42:24Z

Description

Adds a generic vllm provider type so users do not feel locked into a specific vLLM provider such as rhaiis.
Adds ollama as a provider.
Fixes API key handling for remote vLLM provider types, which need api_token.

Type of change

Tools used to create PR

Identify any AI code assistants used in this PR (for transparency and review context)

Assisted-by: (e.g., Claude, CodeRabbit, Ollama, etc., N/A if not used)
Generated by: (e.g., tool name and version; N/A if not used)

Related Tickets & Documents

Related Issue https://redhat.atlassian.net/browse/RHIDP-14062
Closes #

Checklist before requesting a review

I have performed a self-review of my code.
PR has passed all pre-merge test jobs.
If it is a core feature, I have added thorough tests.

Testing

Please provide detailed steps to perform tests related to this code change.
How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

New Features
- Added support for additional unified inference provider types, including ollama and vllm.
- Improved handling of provider-specific authentication fields so credentials are placed in the correct config key automatically.
Bug Fixes
- Fixed provider configuration output for certain backends so environment-based API credentials are now generated in the expected format.
- Ensured extra provider settings are preserved when building inference configuration.

Signed-off-by: Jordan Dubrick <jdubrick@redhat.com>

coderabbitai · 2026-06-26T19:43:26Z

Warning

Review limit reached

@Jdubrick, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 49 minutes and 14 seconds. Learn how PR review limits work.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: e699376e-8291-49f8-be56-2fd31a54f04a

📥 Commits

Reviewing files that changed from the base of the PR and between 6f58b23 and 46e463b.

📒 Files selected for processing (3)

docs/openapi.json
src/llama_stack_configuration.py
tests/unit/test_llama_stack_synthesize.py

Walkthrough

Configuration synthesis now accepts ollama and vllm unified inference providers, maps ollama to remote::ollama, and writes api_key_env values into provider-specific auth fields.

Changes

Unified inference provider support

| Layer / File(s) | Summary |
| --- | --- |
| Provider identifiers: `src/models/config.py`, `src/llama_stack_configuration.py` | `UnifiedInferenceProvider.type` accepts `ollama` and `vllm`, and the provider map adds `ollama -> remote::ollama`. |
| Auth token field selection: `src/llama_stack_configuration.py` | `api_key_env` now writes into a provider-specific config field via `API_KEY_FIELD_MAP`, using `api_token` for `remote::vllm` and `api_key` otherwise. |
| Inference synthesis tests: `tests/unit/test_llama_stack_synthesize.py` | Tests cover `vllm`, `vllm_rhaiis`, and `ollama` outputs for `provider_id`, `provider_type`, auth token placement, and `extra` merging. |
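The auth field selection summarized in the table can be sketched roughly as follows. This is a minimal illustration rather than the repository's code: `API_KEY_FIELD_MAP`, the `api_token`/`api_key` split, and the `ollama -> remote::ollama` mapping come from the summary above, while `PROVIDER_TYPE_MAP`, `build_auth_config`, the `vllm -> remote::vllm` entry, and the `OLLAMA_API_KEY` variable name are assumptions made for the example.

```python
# Sketch only: names other than API_KEY_FIELD_MAP are hypothetical.

# Per the summary, remote::vllm expects `api_token`; other providers use `api_key`.
API_KEY_FIELD_MAP: dict[str, str] = {"remote::vllm": "api_token"}

# Hypothetical map from unified provider type to Llama Stack provider type.
# Only `ollama -> remote::ollama` is stated in the summary; the vllm entry is assumed.
PROVIDER_TYPE_MAP: dict[str, str] = {
    "ollama": "remote::ollama",
    "vllm": "remote::vllm",
}


def build_auth_config(provider_type: str, api_key_env: str) -> dict[str, str]:
    """Return the auth portion of a provider config as an env reference."""
    ls_provider_type = PROVIDER_TYPE_MAP[provider_type]
    # Fall back to `api_key` when the provider has no special auth field.
    key_field = API_KEY_FIELD_MAP.get(ls_provider_type, "api_key")
    return {key_field: "${env." + api_key_env + "}"}


vllm_auth = build_auth_config("vllm", "VLLM_API_KEY")
ollama_auth = build_auth_config("ollama", "OLLAMA_API_KEY")
```

The lookup with a default keeps the map small: only providers whose auth field differs from `api_key` need an entry.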

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title is concise and accurately reflects the main change to synthesizer provider type handling. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


Signed-off-by: Jordan Dubrick <jdubrick@redhat.com>

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/llama_stack_configuration.py (1)

752-758: 🔒 Security & Privacy | 🟠 Major | ⚡ Quick win

Prevent `extra` from overriding the synthesized auth field.

`provider_config.update(provider["extra"])` runs after the `${env...}` credential is written, so a config like `{"type": "vllm", "api_key_env": "VLLM_API_KEY", "extra": {"api_token": "hardcoded"}}` silently replaces the env reference. That defeats the new vLLM fix and can reintroduce literal secrets into the generated config.

Proposed fix

         provider_config: dict[str, Any] = {}
+        if provider.get("extra"):
+            provider_config.update(provider["extra"])
         if provider.get("api_key_env"):
             key_field = API_KEY_FIELD_MAP.get(ls_provider_type, "api_key")
             provider_config[key_field] = "${env." + provider["api_key_env"] + "}"
         if provider.get("allowed_models"):
             provider_config["allowed_models"] = provider["allowed_models"]
-        if provider.get("extra"):
-            provider_config.update(provider["extra"])

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/llama_stack_configuration.py` around lines 752 - 758, The provider config
assembly is letting provider["extra"] overwrite the synthesized auth field
written from api_key_env in the configuration builder. Update the logic in
llama_stack_configuration’s provider_config construction so extra is merged
first or explicitly excludes the auth key chosen via API_KEY_FIELD_MAP,
preserving the "${env...}" reference for the credential field while still
applying allowed_models and the remaining extra settings.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/unit/test_llama_stack_synthesize.py`:
- Around line 167-220: Add a model-level parse/validation test for the new
UnifiedInferenceProvider.type literals so config loading actually exercises the
Literal values for "ollama" and "vllm". The current checks in
apply_high_level_inference only pass raw dicts and can miss schema rejection, so
add a small validation test alongside
test_apply_high_level_inference_maps_ollama and
test_apply_high_level_inference_maps_vllm that parses the config through the
model layer in src/models/config.py and confirms those provider types are
accepted.
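The kind of model-level check requested here might look like the sketch below. It is a stdlib-only stand-in, assuming the real `UnifiedInferenceProvider` in `src/models/config.py` is a Pydantic model with a `Literal`-typed `type` field; `ProviderType`, `UnifiedInferenceProviderSketch`, and `is_accepted` are hypothetical names, and the literal list is a guessed subset rather than the project's actual one.

```python
from dataclasses import dataclass
from typing import Literal, get_args

# Hypothetical subset of accepted literals; the real list lives in src/models/config.py.
ProviderType = Literal["ollama", "vllm", "vllm_rhaiis"]


@dataclass
class UnifiedInferenceProviderSketch:
    """Stand-in for the real model; validates `type` against the Literal values."""

    type: str

    def __post_init__(self) -> None:
        # Reject anything outside the declared literals, as schema validation would.
        if self.type not in get_args(ProviderType):
            raise ValueError(f"unsupported provider type: {self.type!r}")


def is_accepted(provider_type: str) -> bool:
    """Return True when the model layer accepts the provider type."""
    try:
        UnifiedInferenceProviderSketch(type=provider_type)
    except ValueError:
        return False
    return True
```

In the real test suite this would presumably be a pytest test that parses a config through the actual model, so that a missing literal fails at the schema layer instead of slipping past a raw-dict test.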



ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: bbb57910-ff63-4826-87f9-e5966598d4d8

📥 Commits

Reviewing files that changed from the base of the PR and between fe8dd40 and 6f58b23.

📒 Files selected for processing (3)

src/llama_stack_configuration.py
src/models/config.py
tests/unit/test_llama_stack_synthesize.py

📜 Review details

⏰ Context from checks skipped due to timeout. (15)

GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-0-6-on-pull-request
GitHub Check: E2E: server mode / ci / group 1
GitHub Check: E2E: library mode / ci / group 3
GitHub Check: E2E: server mode / ci / group 2
GitHub Check: E2E: library mode / ci / group 2
GitHub Check: E2E: library mode / ci / group 1
GitHub Check: E2E: server mode / ci / group 3
GitHub Check: E2E Tests for Lightspeed Evaluation job
GitHub Check: Pylinter
GitHub Check: integration_tests (3.13)
GitHub Check: integration_tests (3.12)
GitHub Check: unit_tests (3.13)
GitHub Check: unit_tests (3.12)
GitHub Check: build-pr

⚠️ CI failures not shown inline (2)

GitHub Actions: OpenAPI (Spectral) / 0_spectral.txt: RHIDP-14062: update provider types for synthesizer

Conclusion: failure

##[group]Run set -euo pipefail
set -euo pipefail
uv run python scripts/generate_openapi_schema.py /tmp/openapi-generated.json
if ! diff -u docs/openapi.json /tmp/openapi-generated.json; then
  echo "::error::docs/openapi.json is out of date. Regenerate with: uv run scripts/generate_openapi_schema.py docs/openapi.json"

GitHub Actions: OpenAPI (Spectral) / spectral: RHIDP-14062: update provider types for synthesizer

Conclusion: failure

##[group]Run set -euo pipefail
set -euo pipefail
uv run python scripts/generate_openapi_schema.py /tmp/openapi-generated.json
if ! diff -u docs/openapi.json /tmp/openapi-generated.json; then
  echo "::error::docs/openapi.json is out of date. Regenerate with: uv run scripts/generate_openapi_schema.py docs/openapi.json"
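The staleness check in these CI steps compares the committed schema with a freshly generated one. A rough Python equivalent of that comparison, using only `difflib`, is sketched below; `schema_diff` and the two schema fragments are hypothetical stand-ins for the real files and are not part of the project's scripts.

```python
import difflib
import json


def schema_diff(committed: str, generated: str) -> list[str]:
    """Return unified-diff lines between the committed and regenerated schema text."""
    return list(
        difflib.unified_diff(
            committed.splitlines(),
            generated.splitlines(),
            fromfile="docs/openapi.json",
            tofile="/tmp/openapi-generated.json",
            lineterm="",
        )
    )


# Hypothetical schema fragments standing in for the real files.
committed = json.dumps({"enum": ["vllm_rhaiis"]}, indent=2)
generated = json.dumps({"enum": ["vllm_rhaiis", "ollama", "vllm"]}, indent=2)

stale = schema_diff(committed, generated)  # non-empty: committed file is out of date
fresh = schema_diff(generated, generated)  # empty: nothing to regenerate
```

A non-empty diff corresponds to the failing case in the log, where the fix is to regenerate `docs/openapi.json` with the command printed in the error message.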

🧰 Additional context used

📓 Path-based instructions (3)

src/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/**/*.py: Use absolute imports for internal modules: from authentication import get_auth_dependency
Llama Stack imports: Use from llama_stack_client import AsyncLlamaStackClient
Check constants.py for shared constants before defining new ones
All modules must start with descriptive docstrings explaining purpose
Use logger = get_logger(__name__) from log.py for module logging
All functions must have complete type annotations for parameters and return types, use modern syntax (str | int), and include descriptive docstrings
Use snake_case with descriptive, action-oriented names for functions (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead of modifying function parameters
Use async def for I/O operations and external API calls
Use standard log levels with clear purposes: debug() for diagnostic info, info() for program execution, warning() for unexpected events, error() for serious problems
All classes must have descriptive docstrings explaining purpose and use PascalCase with standard suffixes: Configuration, Error/Exception, Resolver, Interface
Abstract classes must use ABC with @abstractmethod decorators
Follow Google Python docstring conventions with required sections: Parameters, Returns, Raises, and Attributes for classes

Files:

src/models/config.py
src/llama_stack_configuration.py

src/models/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Pydantic models must use @model_validator and @field_validator for validation and complete type annotations for all attributes, avoiding Any type

Files:

src/models/config.py

tests/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

tests/**/*.py: Use pytest for all unit and integration tests; do not use unittest
Use pytest.mark.asyncio marker for async tests

Files:

tests/unit/test_llama_stack_synthesize.py

🧠 Learnings (3)

📚 Learning: 2026-01-12T10:58:40.230Z

Learnt from: blublinsky
Repo: lightspeed-core/lightspeed-stack PR: 972
File: src/models/config.py:459-513
Timestamp: 2026-01-12T10:58:40.230Z
Learning: In lightspeed-core/lightspeed-stack, for Python files under src/models, when a user claims a fix is done but the issue persists, verify the current code state before accepting the fix. Steps: review the diff, fetch the latest changes, run relevant tests, reproduce the issue, search the codebase for lingering references to the original problem, confirm the fix is applied and not undone by subsequent commits, and validate with local checks to ensure the issue is resolved.

Applied to files:

src/models/config.py

📚 Learning: 2026-02-25T07:46:33.545Z

Learnt from: asimurka
Repo: lightspeed-core/lightspeed-stack PR: 1211
File: src/models/responses.py:8-16
Timestamp: 2026-02-25T07:46:33.545Z
Learning: In the Python codebase, requests.py should use OpenAIResponseInputTool as Tool while responses.py uses OpenAIResponseTool as Tool. This difference is intentional due to differing schemas for input vs output tools in llama-stack-api. Apply this distinction consistently to other models under src/models (e.g., ensure request-related tools use the InputTool variant and response-related tools use the ResponseTool variant). If adding new tools, choose the corresponding InputTool or Tool class based on whether the tool represents input or output, and document the rationale in code comments.

Applied to files:

src/models/config.py

📚 Learning: 2026-06-24T13:45:37.249Z

Learnt from: Jdubrick
Repo: lightspeed-core/lightspeed-stack PR: 1971
File: src/utils/markdown_repair.py:31-36
Timestamp: 2026-06-24T13:45:37.249Z
Learning: In the lightspeed-stack repository, docstrings must use the section header name "Parameters:" (not "Args:") for function arguments, even if the project references Google Python docstring conventions. Ensure docstrings follow the project’s established "Parameters:" header format for any documented function parameters.

Applied to files:

src/models/config.py
tests/unit/test_llama_stack_synthesize.py
src/llama_stack_configuration.py
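As an illustration of the docstring convention in the last learning, the sketch below uses the "Parameters:" header instead of "Args:". The function `mask_secret` is hypothetical and does not exist in the repository; it only serves to show the docstring layout alongside full type annotations.

```python
def mask_secret(value: str, visible: int = 4) -> str:
    """Mask all but the trailing characters of a secret for safe logging.

    Parameters:
        value: The secret string to mask.
        visible: Number of trailing characters to leave unmasked.

    Returns:
        The masked string, the same length as the input.
    """
    # Mask everything when nothing should be shown or the value is too short.
    if visible <= 0 or len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]
```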

🔇 Additional comments (2)

src/models/config.py (1)

679-689: LGTM!

src/llama_stack_configuration.py (1)

39-55: LGTM!

Signed-off-by: Jordan Dubrick <jdubrick@redhat.com>

tisnik

LGTM

update provider types for synthesizer

6f58b23

Signed-off-by: Jordan Dubrick <jdubrick@redhat.com>

update openapi schema

e77065a

Signed-off-by: Jordan Dubrick <jdubrick@redhat.com>

coderabbitai Bot reviewed Jun 26, 2026


Comment thread tests/unit/test_llama_stack_synthesize.py

add new tests, reorder config to avoid secret leaks

46e463b

Signed-off-by: Jordan Dubrick <jdubrick@redhat.com>

tisnik approved these changes Jun 26, 2026


tisnik merged commit 3e060c2 into lightspeed-core:main Jun 26, 2026
33 of 34 checks passed

tisnik merged 3 commits into lightspeed-core:main from Jdubrick:extend-inference-synthesizer
