LCORE-1860: Add E2E tests for degraded mode startup scenarios by anik120 · Pull Request #1992 · lightspeed-core/lightspeed-stack

anik120 · 2026-06-24T19:46:37Z

Description

Add end-to-end tests to verify LCORE behavior when starting without llama-stack and allow_degraded_mode is enabled.

Type of change

Tools used to create PR

Identify any AI code assistants used in this PR (for transparency and review context)

Assisted-by: (e.g., Claude, CodeRabbit, Ollama, etc., N/A if not used)
Generated by: (e.g., tool name and version; N/A if not used)

Related Tickets & Documents

Related Issue #
Closes #

Checklist before requesting a review

I have performed a self-review of my code.
PR has passed all pre-merge test jobs.
If it is a core feature, I have added thorough tests.

Testing

Please provide detailed steps to perform tests related to this code change.
How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

New Features
- Added end-to-end coverage for starting the service in degraded mode when a downstream dependency is unavailable.
- Added checks for degraded-mode metrics and readiness responses during startup.
Bug Fixes
- Improved service startup validation so degraded mode behavior is verified through HTTP endpoints and response content.

coderabbitai · 2026-06-24T19:46:53Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: ff1f45fb-502f-47ec-bff1-1865962fc33f

📥 Commits

Reviewing files that changed from the base of the PR and between 606cf00 and 8150a5f.

📒 Files selected for processing (3)

tests/e2e/configuration/server-mode/lightspeed-stack-degraded-mode.yaml
tests/e2e/features/degraded_mode_startup.feature
tests/e2e/features/steps/common_http.py

📜 Recent review details

⏰ Context from checks skipped due to timeout. (17)

GitHub Check: spectral
GitHub Check: mypy
GitHub Check: unit_tests (3.13)
GitHub Check: unit_tests (3.12)
GitHub Check: Pylinter
GitHub Check: integration_tests (3.13)
GitHub Check: integration_tests (3.12)
GitHub Check: build-pr
GitHub Check: E2E: server mode / ci / group 3
GitHub Check: E2E: server mode / ci / group 1
GitHub Check: E2E: server mode / ci / group 2
GitHub Check: E2E: library mode / ci / group 1
GitHub Check: E2E: library mode / ci / group 3
GitHub Check: E2E: library mode / ci / group 2
GitHub Check: E2E Tests for Lightspeed Evaluation job
GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-0-6-on-pull-request
GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request

🧰 Additional context used

📓 Path-based instructions (2)

tests/e2e/**/*.{py,feature}

📄 CodeRabbit inference engine (AGENTS.md)

Use behave (BDD) framework for end-to-end testing with Gherkin feature files

Files:

tests/e2e/features/degraded_mode_startup.feature
tests/e2e/features/steps/common_http.py

tests/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

tests/**/*.py: Use pytest for all unit and integration tests; do not use unittest
Use pytest.mark.asyncio marker for async tests

Files:

tests/e2e/features/steps/common_http.py

🧠 Learnings (4)

📚 Learning: 2026-05-20T08:09:30.641Z

Learnt from: max-svistunov
Repo: lightspeed-core/lightspeed-stack PR: 1580
File: docs/design/llama-stack-config-merge/poc-results/library-mode/synthesized-run.yaml:107-110
Timestamp: 2026-05-20T08:09:30.641Z
Learning: In Llama-stack config YAMLs, when defining a Llama Guard safety shield entry, set `provider_shield_id` to the *guard model identifier* (e.g., `meta-llama/Llama-Guard-3-8B`). Do not use a chat/generative model id (e.g., `openai/gpt-4o-mini`): a chat-model id (or `native_override`) indicates only an override landed and does **not** mean the safety shield is actually gating queries. Ensure any E2E coverage for the related implementation (JIRA/E2E tests) exercises a real Llama Guard model to verify that the shield is effective.

Applied to files:

tests/e2e/configuration/server-mode/lightspeed-stack-degraded-mode.yaml

📚 Learning: 2026-04-07T09:20:26.590Z

Learnt from: radofuchs
Repo: lightspeed-core/lightspeed-stack PR: 1467
File: tests/e2e/features/steps/common.py:36-49
Timestamp: 2026-04-07T09:20:26.590Z
Learning: For Behave-based Python tests, rely on Behave’s Context layered stack for attribute lifecycle: Behave pushes a new Context layer when entering feature scope (before_feature) and again for scenario scope (before_scenario). Attributes assigned inside given/when/then steps live on the current scenario layer and are automatically removed when the scenario ends. As a result, step-set attributes should not be expected to persist across scenarios or features, and manual cleanup in after_scenario/after_feature is generally unnecessary for attributes set in step functions. Only perform manual cleanup for attributes that you set explicitly in before_feature/before_scenario, since those live on the respective feature/scenario layers.

Applied to files:

tests/e2e/features/steps/common_http.py

📚 Learning: 2026-04-13T13:39:54.963Z

Learnt from: radofuchs
Repo: lightspeed-core/lightspeed-stack PR: 1490
File: tests/e2e/features/environment.py:206-211
Timestamp: 2026-04-13T13:39:54.963Z
Learning: In lightspeed-stack E2E tests under tests/e2e/features, it is intentional to set context.feature_config inside Background/step functions (scenario-scoped Behave layer). The environment.py after_scenario restore logic should only restore configuration when context.scenario_lightspeed_override_active is True; this flag is set by configure_service only when a real config switch occurs (so restore does not run for scenarios without a switch). Additionally, steps/common.py’s module-level _active_lightspeed_stack_config_basename is used to prevent re-applying the same config across subsequent scenarios, ensuring scenario_lightspeed_override_active stays False after the first apply. Therefore, reviewers should not “fix” this flow as if feature_config were incorrectly scoped or if after_scenario restoration is missing—config switching and restoration are meant to happen exactly once per actual switch, not redundantly per scenario.

Applied to files:

tests/e2e/features/steps/common_http.py

📚 Learning: 2026-06-24T13:45:37.249Z

Learnt from: Jdubrick
Repo: lightspeed-core/lightspeed-stack PR: 1971
File: src/utils/markdown_repair.py:31-36
Timestamp: 2026-06-24T13:45:37.249Z
Learning: In the lightspeed-stack repository, docstrings must use the section header name "Parameters:" (not "Args:") for function arguments, even if the project references Google Python docstring conventions. Ensure docstrings follow the project’s established "Parameters:" header format for any documented function parameters.

Applied to files:

tests/e2e/features/steps/common_http.py

🔇 Additional comments (3)

tests/e2e/configuration/server-mode/lightspeed-stack-degraded-mode.yaml (1)

1-25: LGTM!

tests/e2e/features/steps/common_http.py (1)

115-125: LGTM!

tests/e2e/features/degraded_mode_startup.feature (1)

1-40: LGTM!

Walkthrough

Adds a degraded-mode e2e service configuration, a new step that checks response-body substrings, and feature scenarios that verify metrics and readiness when llama-stack is available or disrupted at startup.

Changes

Degraded mode startup E2E

Layer / File(s)	Summary
Degraded-mode fixture and HTTP assertion `tests/e2e/configuration/server-mode/lightspeed-stack-degraded-mode.yaml`, `tests/e2e/features/steps/common_http.py`	Adds a server-mode degraded-mode config and a quoted-substring response-body assertion step.
Degraded-mode startup scenarios `tests/e2e/features/degraded_mode_startup.feature`	Adds a shared background plus metrics and readiness scenarios for degraded startup with and without llama-stack.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

lightspeed-core/lightspeed-stack#1777: Adds the allow_degraded_mode config field used by the new degraded-mode YAML fixture.
lightspeed-core/lightspeed-stack#1985: Adds the degraded-mode startup metric asserted by the new metrics scenarios.

Suggested reviewers

tisnik

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly summarizes the main change: new E2E tests for degraded mode startup scenarios.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

✨ Simplify code

Create PR with simplified code

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands.}

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/e2e/features/degraded_mode_startup.feature`:
- Around line 58-62: Remove the `@skip` tag from the degraded mode startup
scenario so it is not excluded by the e2e behave runs; keep the scenario gated
with a CI-specific tag expression and retain `@manual` only if it still needs to
be excluded from automation. Update the tags on the Scenario in
degraded_mode_startup.feature so the Llama Stack startup check can actually
execute in CI.

In `@tests/e2e/utils/utils.py`:
- Around line 224-225: The liveness check fallback in the utility that waits for
a container response currently prints a message and returns, which can mask a
restart failure. Update the helper in utils.py where the retry loop ends so it
fails immediately by raising an assertion or error instead of returning
silently, keeping the existing container_name context in the failure message.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: b1335973-46d9-423a-90d9-c2a2b542ee79

📥 Commits

Reviewing files that changed from the base of the PR and between ab86328 and 606cf00.

📒 Files selected for processing (5)

tests/e2e/configuration/server-mode/lightspeed-stack-degraded-mode.yaml
tests/e2e/features/degraded_mode_startup.feature
tests/e2e/features/steps/common.py
tests/e2e/features/steps/common_http.py
tests/e2e/utils/utils.py

📜 Review details

⏰ Context from checks skipped due to timeout. (18)

GitHub Check: bandit
GitHub Check: Pyright
GitHub Check: build-pr
GitHub Check: integration_tests (3.12)
GitHub Check: spectral
GitHub Check: Pylinter
GitHub Check: integration_tests (3.13)
GitHub Check: unit_tests (3.12)
GitHub Check: unit_tests (3.13)
GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-0-6-on-pull-request
GitHub Check: E2E: library mode / ci / group 2
GitHub Check: E2E: server mode / ci / group 3
GitHub Check: E2E: server mode / ci / group 2
GitHub Check: E2E: server mode / ci / group 1
GitHub Check: E2E: library mode / ci / group 3
GitHub Check: E2E: library mode / ci / group 1
GitHub Check: E2E Tests for Lightspeed Evaluation job

🧰 Additional context used

📓 Path-based instructions (2)

tests/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

tests/**/*.py: Use pytest for all unit and integration tests; do not use unittest
Use pytest.mark.asyncio marker for async tests

Files:

tests/e2e/features/steps/common.py
tests/e2e/features/steps/common_http.py
tests/e2e/utils/utils.py

tests/e2e/**/*.{py,feature}

📄 CodeRabbit inference engine (AGENTS.md)

Use behave (BDD) framework for end-to-end testing with Gherkin feature files

Files:

tests/e2e/features/steps/common.py
tests/e2e/features/degraded_mode_startup.feature
tests/e2e/features/steps/common_http.py
tests/e2e/utils/utils.py

🧠 Learnings (4)

📚 Learning: 2026-05-20T08:09:30.641Z

Learnt from: max-svistunov
Repo: lightspeed-core/lightspeed-stack PR: 1580
File: docs/design/llama-stack-config-merge/poc-results/library-mode/synthesized-run.yaml:107-110
Timestamp: 2026-05-20T08:09:30.641Z
Learning: In Llama-stack config YAMLs, when defining a Llama Guard safety shield entry, set `provider_shield_id` to the *guard model identifier* (e.g., `meta-llama/Llama-Guard-3-8B`). Do not use a chat/generative model id (e.g., `openai/gpt-4o-mini`): a chat-model id (or `native_override`) indicates only an override landed and does **not** mean the safety shield is actually gating queries. Ensure any E2E coverage for the related implementation (JIRA/E2E tests) exercises a real Llama Guard model to verify that the shield is effective.

Applied to files:

tests/e2e/configuration/server-mode/lightspeed-stack-degraded-mode.yaml

📚 Learning: 2026-04-07T09:20:26.590Z

Learnt from: radofuchs
Repo: lightspeed-core/lightspeed-stack PR: 1467
File: tests/e2e/features/steps/common.py:36-49
Timestamp: 2026-04-07T09:20:26.590Z
Learning: For Behave-based Python tests, rely on Behave’s Context layered stack for attribute lifecycle: Behave pushes a new Context layer when entering feature scope (before_feature) and again for scenario scope (before_scenario). Attributes assigned inside given/when/then steps live on the current scenario layer and are automatically removed when the scenario ends. As a result, step-set attributes should not be expected to persist across scenarios or features, and manual cleanup in after_scenario/after_feature is generally unnecessary for attributes set in step functions. Only perform manual cleanup for attributes that you set explicitly in before_feature/before_scenario, since those live on the respective feature/scenario layers.

Applied to files:

tests/e2e/features/steps/common.py
tests/e2e/features/steps/common_http.py

📚 Learning: 2026-04-13T13:39:54.963Z

Learnt from: radofuchs
Repo: lightspeed-core/lightspeed-stack PR: 1490
File: tests/e2e/features/environment.py:206-211
Timestamp: 2026-04-13T13:39:54.963Z
Learning: In lightspeed-stack E2E tests under tests/e2e/features, it is intentional to set context.feature_config inside Background/step functions (scenario-scoped Behave layer). The environment.py after_scenario restore logic should only restore configuration when context.scenario_lightspeed_override_active is True; this flag is set by configure_service only when a real config switch occurs (so restore does not run for scenarios without a switch). Additionally, steps/common.py’s module-level _active_lightspeed_stack_config_basename is used to prevent re-applying the same config across subsequent scenarios, ensuring scenario_lightspeed_override_active stays False after the first apply. Therefore, reviewers should not “fix” this flow as if feature_config were incorrectly scoped or if after_scenario restoration is missing—config switching and restoration are meant to happen exactly once per actual switch, not redundantly per scenario.

Applied to files:

tests/e2e/features/steps/common.py
tests/e2e/features/steps/common_http.py

📚 Learning: 2026-06-24T13:45:37.249Z

Learnt from: Jdubrick
Repo: lightspeed-core/lightspeed-stack PR: 1971
File: src/utils/markdown_repair.py:31-36
Timestamp: 2026-06-24T13:45:37.249Z
Learning: In the lightspeed-stack repository, docstrings must use the section header name "Parameters:" (not "Args:") for function arguments, even if the project references Google Python docstring conventions. Ensure docstrings follow the project’s established "Parameters:" header format for any documented function parameters.

Applied to files:

tests/e2e/features/steps/common.py
tests/e2e/features/steps/common_http.py
tests/e2e/utils/utils.py

🪛 GitHub Actions: Black / 0_black.txt

tests/e2e/utils/utils.py

[error] 1-1: Black --check failed: formatting mismatch. Output indicates this file 'would be reformatted' (1 file would be reformatted), causing exit code 1. Run 'black src tests' (or 'black --write tests/e2e/utils/utils.py') to apply formatting.

🪛 GitHub Actions: Black / black

tests/e2e/utils/utils.py

[error] 1-1: Black --check would reformat files. 1 file would be reformatted (tests/e2e/utils/utils.py). Run 'uv tool run black' (without --check) or 'black --write' to fix formatting.

🔇 Additional comments (5)

tests/e2e/configuration/server-mode/lightspeed-stack-degraded-mode.yaml (1)

1-25: LGTM!

tests/e2e/features/degraded_mode_startup.feature (1)

1-57: LGTM!

Also applies to: 63-67

tests/e2e/features/steps/common_http.py (1)

115-126: LGTM!

tests/e2e/features/steps/common.py (1)

193-207: LGTM!

tests/e2e/utils/utils.py (1)

184-223: LGTM!

Also applies to: 461-467, 500-500

coderabbitai · 2026-06-24T19:50:35Z

+        print(f"{container_name} did not respond after 30 attempts (degraded mode)")
+        return


🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

Fail fast when liveness never becomes 200.

This path currently logs and returns, which can hide a failed restart and cause later, less-actionable test failures. Raise an assertion/error here.

Proposed fix

- print(f"{container_name} did not respond after 30 attempts (degraded mode)") - return + raise AssertionError( + f"{container_name} did not respond on /liveness after 30 attempts (degraded mode)" + )

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

print(f"{container_name} did not respond after 30 attempts (degraded mode)")

return

raise AssertionError(

f"{container_name} did not respond on /liveness after 30 attempts (degraded mode)"

)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/e2e/utils/utils.py` around lines 224 - 225, The liveness check fallback in the utility that waits for a container response currently prints a message and returns, which can mask a restart failure. Update the helper in utils.py where the retry loop ends so it fails immediately by raising an assertion or error instead of returning silently, keeping the existing container_name context in the failure message.

tisnik

LGTM

radofuchs · 2026-06-25T08:22:33Z

        max_attempts (int): Maximum number of health check attempts (default 6).
+        skip_health_check (bool): If True, skip health check (for degraded mode tests).
    """
+    if skip_health_check:


we already use the liveness endpoint to determine if the lightspeed-stack container is healthy, so why do we actually need this?

radofuchs · 2026-06-25T08:26:50Z

+      And REST API service prefix is /v1
+      And the Lightspeed stack configuration directory is "tests/e2e/configuration"
+
+  @skip-health-check


hiding functionality behind tags is what we are actively trying to avoid now, please find different way to do this if actually necessary

radofuchs · 2026-06-25T08:28:40Z

+      And the Lightspeed stack configuration directory is "tests/e2e/configuration"
+
+  @skip-health-check
+  Scenario: Service starts in degraded mode when llama-stack is not running


this test can essentially never fail, the liveness check is part of the helthcheck for the container. So the assertion and basically the whole test is redundant

radofuchs · 2026-06-25T08:29:46Z

+    """
+
+  @skip-health-check
+  Scenario: Metrics endpoint works in degraded mode


this test is redundant, the two that follow do the same thing

radofuchs · 2026-06-25T08:31:42Z

+    And The response body contains "ls_started_in_degraded_mode 1.0"
+
+  Scenario: Degraded mode metric is set to 0.0 when started with llama-stack
+    Given Llama Stack is restarted


you can avoid this unnecessary extra restart of llama-stack service if this test is run as first in the feature file

Add end-to-end tests to verify LCORE behavior when starting without llama-stack and allow_degraded_mode is enabled.

radofuchs

LGTM

coderabbitai Bot reviewed Jun 24, 2026

View reviewed changes

anik120 force-pushed the degraded-mode-tests branch 2 times, most recently from 580cef9 to 24d5db9 Compare June 24, 2026 19:52

tisnik approved these changes Jun 25, 2026

View reviewed changes

radofuchs requested changes Jun 25, 2026

View reviewed changes

LCORE-1860: Add E2E tests for degraded mode startup scenarios

8150a5f

Add end-to-end tests to verify LCORE behavior when starting without llama-stack and allow_degraded_mode is enabled.

anik120 force-pushed the degraded-mode-tests branch from 24d5db9 to 8150a5f Compare June 25, 2026 13:14

radofuchs approved these changes Jun 25, 2026

View reviewed changes

tisnik merged commit 62251cf into lightspeed-core:main Jun 25, 2026
38 of 39 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

LCORE-1860: Add E2E tests for degraded mode startup scenarios#1992

LCORE-1860: Add E2E tests for degraded mode startup scenarios#1992
tisnik merged 1 commit into
lightspeed-core:mainfrom
anik120:degraded-mode-tests

anik120 commented Jun 24, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Jun 24, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

coderabbitai Bot Jun 24, 2026

Uh oh!

tisnik left a comment

Uh oh!

radofuchs Jun 25, 2026

Uh oh!

radofuchs Jun 25, 2026

Uh oh!

radofuchs Jun 25, 2026

Uh oh!

radofuchs Jun 25, 2026

Uh oh!

radofuchs Jun 25, 2026

Uh oh!

radofuchs left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

		print(f"{container_name} did not respond after 30 attempts (degraded mode)")
		return

-        print(f"{container_name} did not respond after 30 attempts (degraded mode)")
-        return
+        raise AssertionError(
+            f"{container_name} did not respond on /liveness after 30 attempts (degraded mode)"
+        )

Uh oh!

Conversation

anik120 commented Jun 24, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Type of change

Tools used to create PR

Related Tickets & Documents

Checklist before requesting a review

Testing

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested reviewers

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot Jun 24, 2026

Choose a reason for hiding this comment

Uh oh!

tisnik left a comment

Choose a reason for hiding this comment

Uh oh!

radofuchs Jun 25, 2026

Choose a reason for hiding this comment

Uh oh!

radofuchs Jun 25, 2026

Choose a reason for hiding this comment

Uh oh!

radofuchs Jun 25, 2026

Choose a reason for hiding this comment

Uh oh!

radofuchs Jun 25, 2026

Choose a reason for hiding this comment

Uh oh!

radofuchs Jun 25, 2026

Choose a reason for hiding this comment

Uh oh!

radofuchs left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

anik120 commented Jun 24, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 24, 2026 •

edited

Loading