Skip to content

LCORE-1872: Launch Llama Stack container via Makefile orchestration#1760

Merged
tisnik merged 1 commit into
lightspeed-core:mainfrom
anik120:launch-lls-container
May 20, 2026
Merged

LCORE-1872: Launch Llama Stack container via Makefile orchestration#1760
tisnik merged 1 commit into
lightspeed-core:mainfrom
anik120:launch-lls-container

Conversation

@anik120

@anik120 anik120 commented May 18, 2026

Copy link
Copy Markdown
Contributor

Description

Enables LCORE to automatically launch Llama Stack as a containerized service through Makefile orchestration. Running make run now handles all infrastructure setup:
i) building the container image
ii) stopping any existing instance
iii) launching a fresh llama-stack container
iv) waiting for started container's health check
v) and finally starting the lightspeed-stack service.

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change
  • Unit tests improvement
  • Integration tests improvement
  • End to end tests improvement
  • Benchmarks improvement

Tools used to create PR

Identify any AI code assistants used in this PR (for transparency and review context)

  • Assisted-by: Claude

Related Tickets & Documents

  • Related Issue #
  • Closes #

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • Please provide detailed steps to perform tests related to this code change.
  • How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

  • New Features

    • Containerized llama-stack service with automatic Docker or Podman runtime detection and health monitoring.
    • Simplified local development setup via single make run command with automatic container lifecycle orchestration.
  • Documentation

    • Updated run instructions to reflect containerized workflow and local endpoint configuration.
    • Added container runtime requirements documentation for Docker or Podman support.

Review Change Stack

@coderabbitai

coderabbitai Bot commented May 18, 2026

Copy link
Copy Markdown
Contributor

Warning

Rate limit exceeded

@anik120 has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 21 minutes and 58 seconds before requesting another review.

You’ve run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: a29a476c-5f86-4fc6-aee7-3f704679008b

📥 Commits

Reviewing files that changed from the base of the PR and between 9a1adb3 and 3cfb68c.

📒 Files selected for processing (3)
  • Makefile
  • README.md
  • lightspeed-stack.yaml

Walkthrough

This PR transitions the service from local uv run execution to containerized llama-stack service delivery. It adds Makefile-based container orchestration with auto-detected runtime support, updates configuration to reference localhost, and simplifies documentation of the new workflow.

Changes

Container-backed Llama Stack execution

Layer / File(s) Summary
Makefile container lifecycle management
Makefile
Adds container configuration variables (name, image, port), auto-detects podman/docker availability, and introduces six new targets (build-llama-stack-image, remove-llama-stack-container, start-llama-stack-container, wait-for-llama-stack-health, ensure-llama-stack-container, clean-llama-stack) that orchestrate image building, container startup with port mapping and health checks, container removal, and cleanup. The run target now depends on ensure-llama-stack-container orchestration.
Configuration defaults and documentation updates
lightspeed-stack.yaml, README.md
Updates lightspeed-stack.yaml llama_stack configuration to use localhost:8321 endpoint with api_key commented out, adds documentation noting that make run launches a container at this address. README simplifies local startup to use make run and adds a new "Container Runtime Requirements" section documenting Podman/Docker prerequisites and Makefile auto-detection.

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The pull request title accurately summarizes the main change: introducing Makefile orchestration to automatically launch Llama Stack as a containerized service when running make run.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
✨ Simplify code
  • Create PR with simplified code

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@Makefile`:
- Around line 50-51: The Makefile hardcodes run.yaml and lightspeed-stack.yaml
in the container volume mounts, so overrides passed as CONFIG or
LLAMA_STACK_CONFIG aren’t propagated; update the mount lines that currently
reference run.yaml and lightspeed-stack.yaml to use the Makefile variables
(e.g., $(CONFIG) and $(LLAMA_STACK_CONFIG)) instead (with sensible defaults if
needed) so the target that invokes the container mounts the configured files
from the host into the container; change the two -v mount entries in the recipe
to reference $(PWD)/$(CONFIG):/opt/app-root/run.yaml:ro,z and
$(PWD)/$(LLAMA_STACK_CONFIG):/opt/app-root/lightspeed-stack.yaml:ro,z (or their
default variable values) so overrides are honored.
- Line 24: Add a .PHONY declaration listing all orchestration targets so Make
won’t treat same-named files as up-to-date; explicitly mark run,
ensure-llama-stack-container, build-llama-stack-image,
stop-llama-stack-container, start-llama-stack-container,
wait-for-llama-stack-health, and clean-llama-stack as phony by adding a .PHONY:
line that includes those target names (referencing the target symbols
ensure-llama-stack-container, build-llama-stack-image,
stop-llama-stack-container, start-llama-stack-container,
wait-for-llama-stack-health, clean-llama-stack, and run).
- Around line 35-37: The current conditional uses substring matching with `ps
--filter "name="` and `grep` which can falsely match similarly named containers;
replace that check with an exact existence test using the container runtime's
inspect command (referencing CONTAINER_RUNTIME and LLAMA_STACK_CONTAINER_NAME) —
call $(CONTAINER_RUNTIME) inspect $(LLAMA_STACK_CONTAINER_NAME) and test its
exit status (redirecting output to /dev/null) to decide whether to echo
"Stopping existing llama-stack container..." and run $(CONTAINER_RUNTIME) rm -f
$(LLAMA_STACK_CONTAINER_NAME) so only the exact container name triggers removal.
- Around line 54-77: The docker run env flags currently interpolate secrets on
the command line (entries like -e OPENAI_API_KEY=$${OPENAI_API_KEY}, -e
CLIENT_SECRET=$${CLIENT_SECRET:-}, -e WATSONX_API_KEY=$${WATSONX_API_KEY:-}, -e
AWS_BEARER_TOKEN_BEDROCK=$${AWS_BEARER_TOKEN_BEDROCK:-}, etc. in the Makefile)
which exposes them; change each of these to pass-through form (-e KEY) instead
of embedding values so the container inherits the secret from the parent
environment (replace patterns like -e VAR=$${VAR:-...} with -e VAR for all
secret vars in the docker run invocation).

In `@README.md`:
- Around line 181-184: Add a blank line before and after the fenced code block
under the list item "start LCS server" in README.md to satisfy markdownlint rule
MD031; locate the list item containing the ```bash make run ``` block and ensure
there is an empty line above the opening ``` and an empty line below the closing
``` so the fenced block is separated from surrounding list text.

In `@src/models/config.py`:
- Around line 604-605: The AnyHttpUrl field default is currently a raw string in
the declaration "url: AnyHttpUrl = Field(default=\"http://localhost:8321\",
...)" which causes type-check failures; update the default to be an AnyHttpUrl
instance by wrapping the string with the AnyHttpUrl constructor (use
AnyHttpUrl("http://localhost:8321") as the default for the url Field) so
Pydantic and static type checkers accept the value.
- Around line 1901-1902: The Field declaration for llama_stack uses a class
reference where default_factory expects a zero-arg callable; change the
default_factory to a zero-argument factory (e.g. a lambda or small helper) that
returns a new LlamaStackConfiguration instance so static checks pass—update the
llama_stack: LlamaStackConfiguration = Field(default_factory=...) to use a
callable like lambda: LlamaStackConfiguration() consistent with the other nested
config defaults in this module.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: c2672555-59a0-48a8-845d-4970a95ee638

📥 Commits

Reviewing files that changed from the base of the PR and between fd4acfa and ad72ddd.

📒 Files selected for processing (6)
  • Makefile
  • README.md
  • lightspeed-stack.yaml
  • src/client.py
  • src/models/config.py
  • tests/unit/test_client.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: build-pr
  • GitHub Check: Pylinter
  • GitHub Check: E2E: library mode / ci / group 1
  • GitHub Check: E2E: server mode / ci / group 3
  • GitHub Check: E2E: library mode / ci / group 3
  • GitHub Check: E2E: library mode / ci / group 2
  • GitHub Check: E2E: server mode / ci / group 1
  • GitHub Check: E2E: server mode / ci / group 2
  • GitHub Check: E2E Tests for Lightspeed Evaluation job
  • GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
🧰 Additional context used
📓 Path-based instructions (3)
src/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/**/*.py: Use absolute imports for internal modules: from authentication import get_auth_dependency
Llama Stack imports: Use from llama_stack_client import AsyncLlamaStackClient
Check constants.py for shared constants before defining new ones
All modules must start with descriptive docstrings explaining purpose
Use logger = get_logger(__name__) from log.py for module logging
All functions must have complete type annotations for parameters and return types, use modern syntax (str | int), and include descriptive docstrings
Use snake_case with descriptive, action-oriented names for functions (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead of modifying function parameters
Use async def for I/O operations and external API calls
Use standard log levels with clear purposes: debug() for diagnostic info, info() for program execution, warning() for unexpected events, error() for serious problems
All classes must have descriptive docstrings explaining purpose and use PascalCase with standard suffixes: Configuration, Error/Exception, Resolver, Interface
Abstract classes must use ABC with @abstractmethod decorators
Follow Google Python docstring conventions with required sections: Parameters, Returns, Raises, and Attributes for classes

Files:

  • src/client.py
  • src/models/config.py
src/models/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Pydantic models must use @model_validator and @field_validator for validation and complete type annotations for all attributes, avoiding Any type

Files:

  • src/models/config.py
tests/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

tests/**/*.py: Use pytest for all unit and integration tests; do not use unittest
Use pytest.mark.asyncio marker for async tests

Files:

  • tests/unit/test_client.py
🧠 Learnings (2)
📚 Learning: 2026-01-12T10:58:40.230Z
Learnt from: blublinsky
Repo: lightspeed-core/lightspeed-stack PR: 972
File: src/models/config.py:459-513
Timestamp: 2026-01-12T10:58:40.230Z
Learning: In lightspeed-core/lightspeed-stack, for Python files under src/models, when a user claims a fix is done but the issue persists, verify the current code state before accepting the fix. Steps: review the diff, fetch the latest changes, run relevant tests, reproduce the issue, search the codebase for lingering references to the original problem, confirm the fix is applied and not undone by subsequent commits, and validate with local checks to ensure the issue is resolved.

Applied to files:

  • src/models/config.py
📚 Learning: 2026-02-25T07:46:33.545Z
Learnt from: asimurka
Repo: lightspeed-core/lightspeed-stack PR: 1211
File: src/models/responses.py:8-16
Timestamp: 2026-02-25T07:46:33.545Z
Learning: In the Python codebase, requests.py should use OpenAIResponseInputTool as Tool while responses.py uses OpenAIResponseTool as Tool. This difference is intentional due to differing schemas for input vs output tools in llama-stack-api. Apply this distinction consistently to other models under src/models (e.g., ensure request-related tools use the InputTool variant and response-related tools use the ResponseTool variant). If adding new tools, choose the corresponding InputTool or Tool class based on whether the tool represents input or output, and document the rationale in code comments.

Applied to files:

  • src/models/config.py
🪛 checkmake (0.3.2)
Makefile

[warning] 26-26: Target body for "build-llama-stack-image" exceeds allowed length of 5 lines (6).

(maxbodylength)


[warning] 40-40: Target body for "start-llama-stack-container" exceeds allowed length of 5 lines (41).

(maxbodylength)


[warning] 83-83: Target body for "wait-for-llama-stack-health" exceeds allowed length of 5 lines (14).

(maxbodylength)


[warning] 24-24: Target "ensure-llama-stack-container" should be declared PHONY.

(phonydeclared)

🪛 GitHub Actions: Pyright / 0_Pyright.txt
src/models/config.py

[error] 604-604: pyright error (reportAssignmentType): Type "str" is not assignable to declared type "AnyHttpUrl". "str" is not assignable to "AnyHttpUrl".


[error] 1902-1902: pyright error (reportArgumentType): Argument of type "type[LlamaStackConfiguration]" cannot be assigned to parameter "default_factory" of type "(() -> _T@Field) | ((dict[str, Any]) -> _T@Field)" in function "Field"; function signature mismatch (expected 0 but received 1).

🪛 GitHub Actions: Pyright / Pyright
src/models/config.py

[error] 604-604: pyright (reportAssignmentType): Type "str" is not assignable to declared type "AnyHttpUrl". "str" is not assignable to "AnyHttpUrl".


[error] 1902-1902: pyright (reportArgumentType): Argument of type "type[LlamaStackConfiguration]" cannot be assigned to parameter "default_factory" of type "(() -> _T@Field) | ((dict[str, Any]) -> _T@Field)" in function "Field"; function accepts too many positional parameters and URL "Literal['http://localhost:8321']" is not assignable to "AnyHttpUrl".

🪛 GitHub Actions: Type checks / 0_mypy.txt
src/models/config.py

[error] 604-604: mypy: Incompatible types in assignment (expression has type "str", variable has type "AnyHttpUrl") [assignment]


[error] 1902-1902: mypy: Argument "default_factory" to "Field" has incompatible type "type[LlamaStackConfiguration]"; expected "Callable[[], Never] | Callable[[dict[str, Any]], Never]" [arg-type]

🪛 GitHub Actions: Type checks / mypy
src/models/config.py

[error] 604-604: mypy: Incompatible types in assignment (expression has type "str", variable has type "AnyHttpUrl") [assignment]


[error] 1902-1902: mypy: Argument "default_factory" to "Field" has incompatible type "type[LlamaStackConfiguration]"; expected "Callable[[], Never] | Callable[[dict[str, Any]], Never]" [arg-type]

🪛 markdownlint-cli2 (0.22.1)
README.md

[warning] 182-182: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)

🔇 Additional comments (3)
tests/unit/test_client.py (1)

40-47: LGTM!

Also applies to: 49-54, 57-58, 157-163

README.md (1)

188-196: LGTM!

src/client.py (1)

21-47: LGTM!

Also applies to: 66-97

Comment thread Makefile Outdated
Comment thread Makefile Outdated
Comment thread Makefile Outdated
Comment thread Makefile Outdated
Comment thread README.md
Comment thread src/models/config.py Outdated
Comment thread src/models/config.py Outdated
@anik120 anik120 force-pushed the launch-lls-container branch 4 times, most recently from d8be544 to 546efbf Compare May 18, 2026 20:15
@tisnik tisnik requested a review from radofuchs May 19, 2026 11:35

@tisnik tisnik left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Comment thread Makefile Outdated
fi
$(CONTAINER_RUNTIME) build -f deploy/llama-stack/test.containerfile -t $(LLAMA_STACK_IMAGE) .

stop-llama-stack-container: ## Stop and remove existing llama-stack container

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the stop name is misleading, the content of this is actually removing the container, not just stopping it

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

renamed to remove-llama-stack-container

Comment thread Makefile Outdated
done; \
echo "✗ ERROR: Llama-stack did not become healthy within 60 seconds"; \
echo "Container logs:"; \
$(CONTAINER_RUNTIME) logs --tail 50 $(LLAMA_STACK_CONTAINER_NAME); \

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is no reason to limit the logs when you are starting the container

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's fair, removed the limitation 👍🏽

@anik120 anik120 force-pushed the launch-lls-container branch from 546efbf to 9a1adb3 Compare May 19, 2026 17:20

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@lightspeed-stack.yaml`:
- Around line 10-12: Update the misleading comment about llama_stack: change the
note to state that when using `make run` a container is launched at
http://localhost:8321 but the `llama_stack` configuration is still applied to
lightspeed-core because `make run` starts lightspeed-core with the `-c
$(CONFIG)` flag; explicitly state that this block controls service-side
connection settings so it is not ignored. Reference `llama_stack`, `make run`,
`lightspeed-core`, and the `-c $(CONFIG)` flag in the revised comment to avoid
confusion.

In `@Makefile`:
- Around line 26-27: The current parallelizable prerequisite list for
ensure-llama-stack-container causes race conditions; enforce a strict sequence
by chaining the targets: make build-llama-stack-image depend on
remove-llama-stack-container, make start-llama-stack-container depend on
build-llama-stack-image, and change ensure-llama-stack-container to depend only
on start-llama-stack-container (alternatively use GNU make's .WAIT between the
existing prerequisites) so removal → build → start always run in order.

In `@README.md`:
- Around line 181-197: The README still documents the removed Make target
run-llama-stack; update the README targets table and any duplicate mentions
(including the other occurrence noted) to remove or replace the run-llama-stack
entry so it matches the current Makefile workflow—locate the targets table and
any lines referencing run-llama-stack and either delete that row or replace it
with the correct command (e.g., make run or the new workflow), ensuring the
“Llama Stack as separate server” / container runtime sections remain consistent
with the Makefile changes.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 27a7cb0b-ea0c-4197-a5fd-b1f8b5831374

📥 Commits

Reviewing files that changed from the base of the PR and between ad72ddd and 9a1adb3.

📒 Files selected for processing (3)
  • Makefile
  • README.md
  • lightspeed-stack.yaml
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
  • GitHub Check: build-pr
  • GitHub Check: Pylinter
  • GitHub Check: unit_tests (3.12)
  • GitHub Check: unit_tests (3.13)
  • GitHub Check: E2E: server mode / ci / group 3
  • GitHub Check: E2E: server mode / ci / group 1
  • GitHub Check: E2E: library mode / ci / group 2
  • GitHub Check: E2E: server mode / ci / group 2
  • GitHub Check: E2E: library mode / ci / group 3
  • GitHub Check: E2E: library mode / ci / group 1
  • GitHub Check: E2E Tests for Lightspeed Evaluation job
🧰 Additional context used
🪛 checkmake (0.3.2)
Makefile

[warning] 28-28: Target body for "build-llama-stack-image" exceeds allowed length of 5 lines (6).

(maxbodylength)


[warning] 42-42: Target body for "start-llama-stack-container" exceeds allowed length of 5 lines (41).

(maxbodylength)


[warning] 85-85: Target body for "wait-for-llama-stack-health" exceeds allowed length of 5 lines (14).

(maxbodylength)


[warning] 21-21: Required target "all" is missing from the Makefile.

(minphony)


[warning] 21-21: Required target "clean" is missing from the Makefile.

(minphony)


[warning] 21-21: Required target "test" is missing from the Makefile.

(minphony)

🪛 markdownlint-cli2 (0.22.1)
README.md

[warning] 182-182: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


[warning] 184-184: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)

Comment thread lightspeed-stack.yaml Outdated
Comment thread Makefile Outdated
Comment thread README.md
Enables LCORE to automatically launch Llama Stack as a containerized service
through Makefile orchestration. Running make run now handles all infrastructure setup:
i) building the container image
ii) stopping any existing instance
iii) launching a fresh llama-stack container
iv) waiting for started container's health check
v) and finally starting the lightspeed-stack service.

Signed-off-by: Anik Bhattacharjee <anbhatta@redhat.com>
@anik120 anik120 force-pushed the launch-lls-container branch from 9a1adb3 to 3cfb68c Compare May 19, 2026 17:58

@radofuchs radofuchs left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants