LCORE-1599: Add retry with backoff to check_llama_stack_version by savitojs · Pull Request #1409 · lightspeed-core/lightspeed-stack

savitojs · 2026-03-26T16:43:45Z

Description

check_llama_stack_version() now retries up to 5 times with a 2-second delay when it gets an APIConnectionError from Llama Stack. Previously it made a single call and crashed immediately if Llama Stack wasn't ready. This handles the sidecar startup race condition where both containers start concurrently but Llama Stack needs a few extra seconds to initialize.

After all retries are exhausted, the original error is still raised so misconfiguration isn't silently ignored.

Type of change

Tools used to create PR

Identify any AI code assistants used in this PR (for transparency and review context)

Assisted-by: (e.g., Claude, CodeRabbit, Ollama, etc., N/A if not used)
Generated by: (e.g., tool name and version; N/A if not used)

Related Tickets & Documents

Checklist before requesting a review

I have performed a self-review of my code.
PR has passed all pre-merge test jobs.
If it is a core feature, I have added thorough tests.

Testing

Please provide detailed steps to perform tests related to this code change.
How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

Release Notes

Bug Fixes
- Version checking now automatically retries on connection errors instead of failing immediately, improving resilience.
- Added configurable retry parameters to customize retry attempts and delay with sensible defaults.
- Enhanced logging during retry attempts provides better troubleshooting and monitoring visibility.
- Added validation for retry parameters to catch invalid values early.

coderabbitai · 2026-03-26T16:44:00Z

Warning

Rate limit exceeded

@savitojs has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 10 minutes and 25 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 851b7b38-b743-4de0-8aad-76b1422112a9

📥 Commits

Reviewing files that changed from the base of the PR and between 5cf466e and 5b25619.

📒 Files selected for processing (2)

src/utils/llama_stack_version.py
tests/unit/utils/test_llama_stack_version.py

Walkthrough

check_llama_stack_version now accepts max_retries and retry_delay, retries client.inspect.version() on APIConnectionError up to max_retries with asyncio.sleep between attempts, logs warnings on intermediate failures, and re-raises the error if all attempts fail. Unit tests added to validate retry-success and retry-exhaustion paths.

Changes

Cohort / File(s)	Summary
Version Check Retry Implementation `src/utils/llama_stack_version.py`	Added `max_retries` and `retry_delay` parameters (with defaults). Implements retry loop around `await client.inspect.version()` catching `APIConnectionError`, logging warnings on failures, sleeping between attempts with `asyncio.sleep`, and re-raising after the final failed attempt. Input validation added for `max_retries`. Import path adjusted to `AsyncLlamaStackClient` from `llama_stack_client`.
Retry Behavior Tests `tests/unit/utils/test_llama_stack_version.py`	Added two async tests: one simulates two `APIConnectionError` failures then success and asserts call/sleep counts; the other simulates persistent `APIConnectionError` and asserts the exception is raised after exhausting retries. Both patch `asyncio.sleep` to avoid real delays.

Sequence Diagram(s)

sequenceDiagram
    participant App as App (startup)
    participant Checker as check_llama_stack_version
    participant Client as AsyncLlamaStackClient
    participant Verifier as compare_versions
    participant Async as asyncio.sleep

    App->>Checker: call check_llama_stack_version(...)
    loop until success or attempts exhausted
        Checker->>Client: await inspect.version()
        alt version returned
            Client-->>Checker: VersionInfo
            Checker->>Verifier: compare_versions(version, min, max)
            Verifier-->>Checker: ok / raises
            Checker-->>App: return
        else APIConnectionError
            Client-->>Checker: raises APIConnectionError
            Checker->>App: log warning (attempt N)
            Checker->>Async: await sleep(retry_delay)
            Async-->>Checker: resume
        end
    end
    alt exhausted
        Checker-->>App: re-raise APIConnectionError
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and specifically describes the main change: adding retry-with-backoff functionality to check_llama_stack_version.
Linked Issues check	✅ Passed	The implementation fully addresses issue `#1408` requirements: retry logic with configurable max_retries and retry_delay, error re-raising on exhaustion, and appropriate logging of failures.
Out of Scope Changes check	✅ Passed	All changes are directly related to adding retry-with-backoff functionality; no unrelated modifications to unrelated files or features were introduced.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (1)

tests/unit/utils/test_llama_stack_version.py (1)

121-139: Add the exhausted-retries test path.

This test covers eventual success, but not the required behavior that APIConnectionError is re-raised after all retries are exhausted.

Suggested companion test

+@pytest.mark.asyncio
+async def test_check_llama_stack_version_raises_after_max_retries(
+    mocker: MockerFixture,
+) -> None:
+    mock_client = mocker.AsyncMock()
+    mock_sleep = mocker.patch("utils.llama_stack_version.asyncio.sleep")
+    mock_client.inspect.version.side_effect = APIConnectionError(request=mocker.MagicMock())
+
+    with pytest.raises(APIConnectionError):
+        await check_llama_stack_version(mock_client, max_retries=3, retry_delay=1)
+
+    assert mock_client.inspect.version.call_count == 3
+    assert mock_sleep.call_count == 2

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/unit/utils/test_llama_stack_version.py` around lines 121 - 139, Add a
new unit test that verifies check_llama_stack_version re-raises
APIConnectionError after all retries are exhausted: create an async test
(similar to test_check_llama_stack_version_retries_on_connection_error) that
sets mock_client.inspect.version.side_effect to raise APIConnectionError for
every call, call await check_llama_stack_version(mock_client, max_retries=N,
retry_delay=1) inside pytest.raises(APIConnectionError) and assert
mock_client.inspect.version.call_count == N and mock_sleep.call_count == N-1 to
confirm it retried the expected number of times before re-raising; reference the
existing test name and the check_llama_stack_version function, as well as
mock_client.inspect.version and APIConnectionError to locate and mirror
behavior.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/utils/llama_stack_version.py`:
- Around line 32-34: The retry parameters allow max_retries to be zero or
negative which causes the retry loop to run zero times and skip the version/API
check; in the function that declares max_retries and retry_delay, validate
max_retries at the start (e.g., if max_retries < 1) and either raise a
ValueError with a clear message or coerce it to a minimum of 1 before entering
the retry loop so the version check runs at least once; reference the parameters
max_retries and retry_delay to locate the function and update the validation
accordingly.
- Around line 6-7: The import uses a private module: replace the internal import
"from llama_stack_client._client import AsyncLlamaStackClient" with the public
export from the package (import AsyncLlamaStackClient directly from
llama_stack_client), i.e. change the import to use the top-level export of
AsyncLlamaStackClient so code uses the public API surface instead of the private
_client module.

---

Nitpick comments:
In `@tests/unit/utils/test_llama_stack_version.py`:
- Around line 121-139: Add a new unit test that verifies
check_llama_stack_version re-raises APIConnectionError after all retries are
exhausted: create an async test (similar to
test_check_llama_stack_version_retries_on_connection_error) that sets
mock_client.inspect.version.side_effect to raise APIConnectionError for every
call, call await check_llama_stack_version(mock_client, max_retries=N,
retry_delay=1) inside pytest.raises(APIConnectionError) and assert
mock_client.inspect.version.call_count == N and mock_sleep.call_count == N-1 to
confirm it retried the expected number of times before re-raising; reference the
existing test name and the check_llama_stack_version function, as well as
mock_client.inspect.version and APIConnectionError to locate and mirror
behavior.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: bc949baa-c4b2-49dc-bdb7-8c3c309760b3

📥 Commits

Reviewing files that changed from the base of the PR and between 1ee31fc and 96616f1.

📒 Files selected for processing (2)

src/utils/llama_stack_version.py
tests/unit/utils/test_llama_stack_version.py

Signed-off-by: Savitoj Singh <savsingh@redhat.com>

asimurka

LGTM in overall, please correct linter errors

tisnik

LGTM

asimurka

LGTM

coderabbitai Bot reviewed Mar 26, 2026

View reviewed changes

Comment thread src/utils/llama_stack_version.py Outdated

Comment thread src/utils/llama_stack_version.py

savitojs force-pushed the savitojs/LCORE-1599-retry-llama-stack-version-check branch from 96616f1 to 5cf466e Compare March 26, 2026 16:54

LCORE-1599: Add retry with backoff to check_llama_stack_version

5b25619

Signed-off-by: Savitoj Singh <savsingh@redhat.com>

savitojs force-pushed the savitojs/LCORE-1599-retry-llama-stack-version-check branch from 5cf466e to 5b25619 Compare March 26, 2026 17:03

tisnik requested a review from asimurka March 26, 2026 18:26

asimurka requested changes Mar 27, 2026

View reviewed changes

tisnik approved these changes Mar 27, 2026

View reviewed changes

asimurka approved these changes Mar 27, 2026

View reviewed changes

tisnik merged commit 409cdc2 into lightspeed-core:main Mar 27, 2026
20 of 22 checks passed

savitojs deleted the savitojs/LCORE-1599-retry-llama-stack-version-check branch March 27, 2026 13:46

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

LCORE-1599: Add retry with backoff to check_llama_stack_version#1409

LCORE-1599: Add retry with backoff to check_llama_stack_version#1409
tisnik merged 1 commit into
lightspeed-core:mainfrom
savitojs:savitojs/LCORE-1599-retry-llama-stack-version-check

savitojs commented Mar 26, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Mar 26, 2026 •

edited

Loading

Rate limit exceeded

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

asimurka left a comment

Uh oh!

tisnik left a comment

Uh oh!

asimurka left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Conversation

savitojs commented Mar 26, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Type of change

Tools used to create PR

Related Tickets & Documents

Checklist before requesting a review

Testing

Summary by CodeRabbit

Release Notes

Uh oh!

coderabbitai Bot commented Mar 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Rate limit exceeded

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

asimurka left a comment

Choose a reason for hiding this comment

Uh oh!

tisnik left a comment

Choose a reason for hiding this comment

Uh oh!

asimurka left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

savitojs commented Mar 26, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Mar 26, 2026 •

edited

Loading