
Commit 01b9433

(lcore-1251) added tls e2e tests
(lcore-1251) fixed tls tests & removed other e2e tests for quicker test running

(lcore-1251) restored test_list.txt

(lcore-1251) use `trustme` for certs

(lcore-1251) quick tls server fix

(lcore-1251) removed tags in place of steps

(fix) removed unused code

fix tls config

verified correct llm response

clean

LCORE-1253: Add e2e proxy and TLS networking tests

Add comprehensive end-to-end tests verifying that Llama Stack's NetworkConfig (proxy, TLS) works correctly through the Lightspeed Stack pipeline.

Test infrastructure:
- TunnelProxy: Async HTTP CONNECT tunnel proxy that creates TCP tunnels for HTTPS traffic. Tracks CONNECT count and target hosts.
- InterceptionProxy: Async TLS-intercepting (MITM) proxy using trustme CA to generate per-target server certificates. Simulates corporate SSL inspection proxies.

Behave scenarios (tests/e2e/features/proxy.feature):
- Tunnel proxy: Configures run.yaml with NetworkConfig proxy pointing to a local tunnel proxy. Verifies CONNECT to api.openai.com:443 is observed and the LLM query succeeds through the proxy.
- Interception proxy: Configures run.yaml with proxy and custom CA cert (trustme). Verifies TLS interception of api.openai.com traffic and successful LLM query through the MITM proxy.
- TLS version: Configures run.yaml with min_version TLSv1.2 and verifies the LLM query succeeds with the TLS constraint.

Each scenario dynamically generates a modified run-ci.yaml with the appropriate NetworkConfig, restarts Llama Stack with the new config, restarts the Lightspeed Stack, and sends a query to verify the full pipeline.

Added trustme>=1.2.1 to dev dependencies.

LCORE-1253: Add negative tests, TLS/cipher scenarios, and cleanup hooks

Expand proxy e2e test coverage to fully address all acceptance criteria:

AC1 (tunnel proxy):
- Add negative test: LLM query fails gracefully when proxy is unreachable

AC2 (interception proxy with CA):
- Add negative test: LLM query fails when interception proxy CA is not provided (verifies "only successful when correct CA is provided")

AC3 (TLS version and ciphers):
- Add TLSv1.3 minimum version scenario
- Add custom cipher suite configuration scenario (ECDHE+AESGCM:DHE+AESGCM)

Test infrastructure:
- Add after_scenario cleanup hook in environment.py that restores original Llama Stack and Lightspeed Stack configs after @Proxy scenarios. Prevents config leaks between scenarios.
- Use different ports for each interception proxy instance to avoid address-already-in-use errors in sequential scenarios.

Documentation:
- Update docs/e2e_scenarios.md with all 7 proxy test scenarios.
- Update docs/e2e_testing.md with proxy-related Behave tags (@Proxy, @TunnelProxy, @InterceptionProxy, @TLSVersion, @TLSCipher).

LCORE-1253: Address review feedback

Changes requested by reviewer (tisnik) and CodeRabbit:
- Detect Docker mode once in before_all and store as context.is_docker_mode. All proxy step functions now use the context attribute instead of calling _is_docker_mode() repeatedly.
- Log exception in _restore_original_services instead of silently swallowing it.
- Only clear context.services_modified on successful restoration, not when restoration fails (prevents leaking modified state).
- Add 10-second timeout to tunnel proxy open_connection to prevent stalls on unreachable targets.
- Handle malformed CONNECT port with ValueError catch and 400 response.

LCORE-1253: Replace tag-based cleanup with Background restore step

Move config restoration from @Proxy after_scenario hook to an explicit Background Given step. This follows the team convention that tags are used only for test selection (filtering), not for triggering behavior.

The Background step "The original Llama Stack config is restored if modified" runs before every scenario. If a previous scenario left a modified run.yaml (detected by backup file existence), it restores the original and restarts services. This handles cleanup even when the previous scenario failed mid-way.

Removed:
- @Proxy tag from feature file (was triggering after_scenario hook)
- after_scenario hook for @Proxy in environment.py
- _restore_original_services function (replaced by Background step)
- context.services_modified tracking (no hook reads it)

Updated docs/e2e_testing.md: tags documented as selection-only, not behavior-triggering.

LCORE-1253: Address radofuchs review feedback

Rewrite proxy e2e tests to follow project conventions:
- Reuse existing step definitions: use "I use query to ask question" from llm_query_response.py and "The status code of the response is" from common_http.py instead of custom query/response steps.
- Split service restart into two explicit Given steps: "Llama Stack is restarted" and "Lightspeed Stack is restarted" so restart ordering is visible in the feature file.
- Remove local (non-Docker) mode code path. Proxy tests use restart_container() exclusively, consistent with the rest of the e2e test suite.
- Check specific status code 500 for error scenarios instead of the broad >= 400 range.
- Remove custom send_query, verify_llm_response, and verify_error_response steps that duplicated existing functionality.

Net reduction: -183 lines from step definitions.

LCORE-1253: Clean up proxy servers between scenarios

Stop proxy servers and their event loops explicitly in the Background restore step. Previously, proxy daemon threads were left running after each scenario, causing asyncio "Task was destroyed but it is pending" warnings at process exit.

The _stop_proxy helper schedules an async stop on the proxy's event loop, waits for it to complete, then stops the loop. Context references are cleared so the next scenario starts clean.

LCORE-1253: Stop proxy servers after last scenario in after_feature

Add proxy cleanup in after_feature to stop proxy servers left running from the last scenario. The Background restore step handles cleanup between scenarios, but the last scenario's proxies persist until process exit, causing asyncio "Task was destroyed" warnings. The cleanup checks for proxy objects on context (no tag check needed) and calls _stop_proxy to gracefully shut down the event loops.
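
Editor's note: the commit message describes a TunnelProxy that handles HTTP CONNECT, counts requests, records targets, applies a 10-second connect timeout, and answers a malformed port with 400. The proxy source is not part of the diff shown here, so the following is only a minimal sketch of that behaviour using `asyncio` streams. The class name `TunnelProxySketch`, its attributes, and the 405/502 responses are assumptions made for illustration, not the commit's implementation.

```python
import asyncio


class TunnelProxySketch:
    """Illustrative HTTP CONNECT tunnel proxy (hypothetical, not the commit's code)."""

    def __init__(self, host="127.0.0.1", port=0):
        self.host = host
        self.port = port          # 0 lets the OS pick a free port
        self.connect_count = 0    # number of well-formed CONNECT requests seen
        self.targets = []         # "host:port" strings requested by clients
        self._server = None

    async def start(self):
        self._server = await asyncio.start_server(self._handle, self.host, self.port)
        self.port = self._server.sockets[0].getsockname()[1]

    async def stop(self):
        if self._server is not None:
            self._server.close()
            await self._server.wait_closed()

    @staticmethod
    async def _reply(writer, status_line):
        writer.write(status_line + b"\r\n\r\n")
        await writer.drain()

    @staticmethod
    async def _pipe(reader, writer):
        # Copy bytes one way until EOF, then close the receiving side.
        try:
            while data := await reader.read(65536):
                writer.write(data)
                await writer.drain()
        except OSError:
            pass
        finally:
            writer.close()

    async def _handle(self, reader, writer):
        try:
            parts = (await reader.readline()).decode("latin-1").split()
            # Discard the remaining request headers up to the blank line.
            while (await reader.readline()) not in (b"\r\n", b"\n", b""):
                pass
            if len(parts) < 2 or parts[0].upper() != "CONNECT":
                await self._reply(writer, b"HTTP/1.1 405 Method Not Allowed")
                return
            host, _, port_text = parts[1].rpartition(":")
            try:
                port = int(port_text)
            except ValueError:
                # Malformed port: answer 400 instead of crashing the handler.
                await self._reply(writer, b"HTTP/1.1 400 Bad Request")
                return
            self.connect_count += 1
            self.targets.append(parts[1])
            try:
                # The timeout keeps an unreachable target from stalling the proxy.
                up_reader, up_writer = await asyncio.wait_for(
                    asyncio.open_connection(host, port), timeout=10
                )
            except (OSError, asyncio.TimeoutError):
                await self._reply(writer, b"HTTP/1.1 502 Bad Gateway")
                return
            await self._reply(writer, b"HTTP/1.1 200 Connection Established")
            # Relay bytes in both directions until either side closes.
            await asyncio.gather(
                self._pipe(reader, up_writer), self._pipe(up_reader, writer)
            )
        finally:
            writer.close()
```

A test step could then assert on `connect_count` and `targets` after the query, which is roughly what the "handled at least 1 CONNECT request" step in the feature file checks.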
1 parent e184541 commit 01b9433

19 files changed

Lines changed: 1563 additions & 1 deletion

docker-compose-library.yaml

Lines changed: 27 additions & 1 deletion
@@ -30,6 +30,8 @@ services:
         condition: service_healthy
       mock-mcp:
         condition: service_healthy
+      mock-tls-inference:
+        condition: service_healthy
     networks:
       - lightspeednet
     volumes:
@@ -40,6 +42,7 @@ services:
       - ./tests/e2e/rag:/opt/app-root/src/.llama/storage/rag:Z
       - ./tests/e2e/secrets/mcp-token:/tmp/mcp-token:ro
       - ./tests/e2e/secrets/invalid-mcp-token:/tmp/invalid-mcp-token:ro
+      - mock-tls-certs:/certs:ro
     environment:
       # LLM Provider API Keys
       - BRAVE_SEARCH_API_KEY=${BRAVE_SEARCH_API_KEY:-}
@@ -113,7 +116,30 @@ services:
       retries: 3
       start_period: 2s

+  # Mock TLS inference server for TLS E2E tests
+  mock-tls-inference:
+    build:
+      context: ./tests/e2e/mock_tls_inference_server
+      dockerfile: Dockerfile
+    container_name: mock-tls-inference
+    ports:
+      - "8443:8443"
+      - "8444:8444"
+    networks:
+      - lightspeednet
+    volumes:
+      - mock-tls-certs:/certs
+    healthcheck:
+      test: ["CMD", "python", "-c", "import urllib.request,ssl;c=ssl.create_default_context();c.check_hostname=False;c.verify_mode=ssl.CERT_NONE;urllib.request.urlopen('https://localhost:8443/health',context=c)"]
+      interval: 5s
+      timeout: 3s
+      retries: 3
+      start_period: 5s
+

 networks:
   lightspeednet:
-    driver: bridge
+    driver: bridge
+
+volumes:
+  mock-tls-certs:
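
Editor's note: the `healthcheck` one-liner above is dense. A readable equivalent is sketched below using only the standard library. The function names `build_insecure_context` and `is_healthy` are hypothetical. Unlike the one-liner, which signals failure by raising and exiting non-zero, this sketch returns a boolean.

```python
import http.client
import ssl
import urllib.request


def build_insecure_context() -> ssl.SSLContext:
    """Client context that skips certificate checks (health probes only)."""
    ctx = ssl.create_default_context()
    # check_hostname must be disabled first; setting CERT_NONE while it is
    # still enabled raises ValueError.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    return ctx


def is_healthy(url: str = "https://localhost:8443/health", timeout: float = 3.0) -> bool:
    """Return True when the endpoint answers with a 2xx status."""
    try:
        with urllib.request.urlopen(
            url, context=build_insecure_context(), timeout=timeout
        ) as response:
            return 200 <= response.status < 300
    except (OSError, http.client.HTTPException):
        # URLError, HTTPError, TLS failures and timeouts are all OSError subclasses.
        return False
```

Verification is disabled here because the probe only needs to know that the TLS listener is up; the test traffic itself is what exercises certificate validation.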

docker-compose.yaml

Lines changed: 25 additions & 0 deletions
@@ -25,12 +25,16 @@ services:
     container_name: llama-stack
     ports:
       - "8321:8321" # Expose llama-stack on 8321 (adjust if needed)
+    depends_on:
+      mock-tls-inference:
+        condition: service_healthy
     volumes:
       - ./run.yaml:/opt/app-root/run.yaml:z
       - ${GCP_KEYS_PATH:-./tmp/.gcp-keys-dummy}:/opt/app-root/.gcp-keys:ro
       - ./lightspeed-stack.yaml:/opt/app-root/lightspeed-stack.yaml:ro
       - llama-storage:/opt/app-root/src/.llama/storage
       - ./tests/e2e/rag:/opt/app-root/src/.llama/storage/rag:z
+      - mock-tls-certs:/certs:ro
     environment:
       - BRAVE_SEARCH_API_KEY=${BRAVE_SEARCH_API_KEY:-}
       - TAVILY_SEARCH_API_KEY=${TAVILY_SEARCH_API_KEY:-}
@@ -140,9 +144,30 @@ services:
       retries: 3
       start_period: 2s

+  # Mock TLS inference server for TLS E2E tests
+  mock-tls-inference:
+    build:
+      context: ./tests/e2e/mock_tls_inference_server
+      dockerfile: Dockerfile
+    container_name: mock-tls-inference
+    ports:
+      - "8443:8443"
+      - "8444:8444"
+    networks:
+      - lightspeednet
+    volumes:
+      - mock-tls-certs:/certs
+    healthcheck:
+      test: ["CMD", "python", "-c", "import urllib.request,ssl;c=ssl.create_default_context();c.check_hostname=False;c.verify_mode=ssl.CERT_NONE;urllib.request.urlopen('https://localhost:8443/health',context=c)"]
+      interval: 5s
+      timeout: 3s
+      retries: 3
+      start_period: 5s
+

 volumes:
   llama-storage:
+  mock-tls-certs:

 networks:
   lightspeednet:

docs/e2e_scenarios.md

Lines changed: 10 additions & 0 deletions
@@ -116,6 +116,16 @@
 * Check if models can be filtered
 * Check if filtering can return empty list of models

+## [`proxy.feature`](https://github.com/lightspeed-core/lightspeed-stack/blob/main/tests/e2e/features/proxy.feature)
+
+* LLM traffic is routed through a configured tunnel proxy
+* LLM query fails gracefully when proxy is unreachable
+* LLM traffic works through interception proxy with correct CA
+* LLM query fails when interception proxy CA is not provided
+* TLS minimum version TLSv1.2 is respected
+* TLS minimum version TLSv1.3 is respected
+* Custom cipher suite configuration is respected
+
 ## [`query.feature`](https://github.com/lightspeed-core/lightspeed-stack/blob/main/tests/e2e/features/query.feature)

 * Check if LLM responds properly to restrictive system prompt to sent question with different system prompt

docs/e2e_testing.md

Lines changed: 4 additions & 0 deletions
@@ -192,6 +192,10 @@ All tag behaviour is implemented in **`features/environment.py`**: the hooks (`b
 | `@RHIdentity` | Feature-level: use RH identity config; restore in after_feature. |
 | `@Feedback` | Feature-level: set feedback conversation list; after_feature deletes those conversations. |
 | `@MCP` | Feature-level: use MCP config; restore in after_feature. |
+| `@TunnelProxy` | Selection: tunnel proxy (HTTP CONNECT) scenarios. |
+| `@InterceptionProxy` | Selection: TLS-intercepting proxy with trustme CA scenarios. |
+| `@TLSVersion` | Selection: TLS version configuration scenarios. |
+| `@TLSCipher` | Selection: cipher suite configuration scenarios. |


 ### Multiple Tags and Skip Comment

pyproject.toml

Lines changed: 1 addition & 0 deletions
@@ -126,6 +126,7 @@ dev = [
     "ruff>=0.11.13",
     "aiosqlite",
     "behave>=1.3.0",
+    "trustme>=1.2.1",
     "types-cachetools>=6.1.0.20250717",
     "build>=1.2.2.post1",
     "twine>=6.1.0",
Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+name: Lightspeed Core Service (LCS)
+service:
+  host: 0.0.0.0
+  port: 8080
+  auth_enabled: false
+  workers: 1
+  color_log: true
+  access_log: true
+llama_stack:
+  use_as_library_client: true
+  library_client_config_path: run.yaml
+user_data_collection:
+  feedback_enabled: true
+  feedback_storage: "/tmp/data/feedback"
+  transcripts_enabled: true
+  transcripts_storage: "/tmp/data/transcripts"
+authentication:
+  module: "noop"
+inference:
+  default_provider: tls-openai
+  default_model: mock-tls-model
Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
+name: Lightspeed Core Service (LCS)
+service:
+  host: 0.0.0.0
+  port: 8080
+  auth_enabled: false
+  workers: 1
+  color_log: true
+  access_log: true
+llama_stack:
+  use_as_library_client: false
+  url: http://llama-stack:8321
+  api_key: xyzzy
+user_data_collection:
+  feedback_enabled: true
+  feedback_storage: "/tmp/data/feedback"
+  transcripts_enabled: true
+  transcripts_storage: "/tmp/data/transcripts"
+authentication:
+  module: "noop"
+inference:
+  default_provider: tls-openai
+  default_model: mock-tls-model

tests/e2e/features/environment.py

Lines changed: 27 additions & 0 deletions
@@ -157,6 +157,11 @@ def before_all(context: Context) -> None:
     context.deployment_mode = os.getenv("E2E_DEPLOYMENT_MODE", "server").lower()
     context.is_library_mode = context.deployment_mode == "library"

+    # Detect Docker mode once for proxy tests
+    from tests.e2e.features.steps.proxy import _is_docker_mode
+
+    context.is_docker_mode = _is_docker_mode()
+
     # Get first LLM model from running service
     print(f"Running tests in {context.deployment_mode} mode")

@@ -499,6 +504,14 @@ def before_feature(context: Context, feature: Feature) -> None:
         switch_config(context.feature_config)
         restart_container("lightspeed-stack")

+    if "TLS" in feature.tags:
+        mode_dir = "library-mode" if context.is_library_mode else "server-mode"
+        context.feature_config = (
+            f"tests/e2e/configuration/{mode_dir}/lightspeed-stack-tls.yaml"
+        )
+        context.default_config_backup = create_config_backup("lightspeed-stack.yaml")
+        switch_config(context.feature_config)
+


 def after_feature(context: Context, feature: Feature) -> None:
     """Run after each feature file is exercised.
@@ -546,3 +559,17 @@ def after_feature(context: Context, feature: Feature) -> None:
         switch_config(context.default_config_backup)
         restart_container("lightspeed-stack")
         remove_config_backup(context.default_config_backup)
+
+    if "TLS" in feature.tags:
+        switch_config(context.default_config_backup)
+        remove_config_backup(context.default_config_backup)
+        if not context.is_library_mode:
+            restart_container("llama-stack")
+        restart_container("lightspeed-stack")
+
+    # Clean up any proxy servers left from the last scenario
+    if hasattr(context, "tunnel_proxy") or hasattr(context, "interception_proxy"):
+        from tests.e2e.features.steps.proxy import _stop_proxy
+
+        _stop_proxy(context, "tunnel_proxy", "proxy_loop")
+        _stop_proxy(context, "interception_proxy", "interception_proxy_loop")

tests/e2e/features/proxy.feature

Lines changed: 106 additions & 0 deletions
@@ -0,0 +1,106 @@
+@skip-in-library-mode
+Feature: Proxy and TLS networking tests for Llama Stack providers
+
+  Verify that the Lightspeed Stack works correctly when Llama Stack's
+  remote inference providers are configured with proxy and TLS settings
+  via the run.yaml NetworkConfig.
+
+  Background:
+    Given The service is started locally
+    And REST API service prefix is /v1
+    And The original Llama Stack config is restored if modified
+
+
+  # --- AC1: Tunnel proxy routing ---
+
+  @TunnelProxy
+  Scenario: LLM traffic is routed through a configured tunnel proxy
+    Given A tunnel proxy is running on port 8888
+    And Llama Stack is configured to route inference through the tunnel proxy
+    And Llama Stack is restarted
+    And Lightspeed Stack is restarted
+    When I use "query" to ask question
+      """
+      {"query": "What is 2+2?", "model": "{MODEL}", "provider": "{PROVIDER}"}
+      """
+    Then The status code of the response is 200
+    And The tunnel proxy handled at least 1 CONNECT request to the LLM provider
+
+  # NOTE: no_proxy is defined on Llama Stack's ProxyConfig model but not
+  # implemented in _build_proxy_mounts (http_client.py). The field is ignored.
+  # When Llama Stack implements no_proxy support, add a test here.
+
+  @TunnelProxy
+  Scenario: LLM query fails gracefully when proxy is unreachable
+    Given Llama Stack is configured to route inference through proxy "http://127.0.0.1:19999"
+    And Llama Stack is restarted
+    And Lightspeed Stack is restarted
+    When I use "query" to ask question
+      """
+      {"query": "What is 2+2?", "model": "{MODEL}", "provider": "{PROVIDER}"}
+      """
+    Then The status code of the response is 500
+
+
+  # --- AC2: Interception proxy with CA certificate ---
+
+  @InterceptionProxy
+  Scenario: LLM traffic works through interception proxy with correct CA
+    Given An interception proxy with trustme CA is running on port 8889
+    And Llama Stack is configured to route inference through the interception proxy with CA cert
+    And Llama Stack is restarted
+    And Lightspeed Stack is restarted
+    When I use "query" to ask question
+      """
+      {"query": "What is 2+2?", "model": "{MODEL}", "provider": "{PROVIDER}"}
+      """
+    Then The status code of the response is 200
+    And The interception proxy intercepted at least 1 connection
+
+  @InterceptionProxy
+  Scenario: LLM query fails when interception proxy CA is not provided
+    Given An interception proxy with trustme CA is running on port 8890
+    And Llama Stack is configured to route inference through the interception proxy without CA cert
+    And Llama Stack is restarted
+    And Lightspeed Stack is restarted
+    When I use "query" to ask question
+      """
+      {"query": "What is 2+2?", "model": "{MODEL}", "provider": "{PROVIDER}"}
+      """
+    Then The status code of the response is 500
+
+
+  # --- AC3: TLS version and cipher configuration ---
+
+  @TLSVersion
+  Scenario: TLS minimum version TLSv1.2 is respected
+    Given Llama Stack is configured with minimum TLS version "TLSv1.2"
+    And Llama Stack is restarted
+    And Lightspeed Stack is restarted
+    When I use "query" to ask question
+      """
+      {"query": "What is 2+2?", "model": "{MODEL}", "provider": "{PROVIDER}"}
+      """
+    Then The status code of the response is 200
+
+  @TLSVersion
+  Scenario: TLS minimum version TLSv1.3 is respected
+    Given Llama Stack is configured with minimum TLS version "TLSv1.3"
+    And Llama Stack is restarted
+    And Lightspeed Stack is restarted
+    When I use "query" to ask question
+      """
+      {"query": "What is 2+2?", "model": "{MODEL}", "provider": "{PROVIDER}"}
+      """
+    Then The status code of the response is 200
+
+  @TLSCipher
+  Scenario: Custom cipher suite configuration is respected
+    Given Llama Stack is configured with ciphers "ECDHE+AESGCM:DHE+AESGCM"
+    And Llama Stack is restarted
+    And Lightspeed Stack is restarted
+    When I use "query" to ask question
+      """
+      {"query": "What is 2+2?", "model": "{MODEL}", "provider": "{PROVIDER}"}
+      """
+    Then The status code of the response is 200
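
Editor's note: the AC3 scenarios set a minimum TLS version and a cipher string. How Llama Stack maps its NetworkConfig onto an SSL context is not shown in this diff, so the sketch below only illustrates what those two settings mean for Python's standard `ssl` module. The function `build_client_context` and its parameters are hypothetical.

```python
import ssl

# Map the version strings used in the scenarios to ssl.TLSVersion members.
_TLS_VERSIONS = {
    "TLSv1.2": ssl.TLSVersion.TLSv1_2,
    "TLSv1.3": ssl.TLSVersion.TLSv1_3,
}


def build_client_context(min_version=None, ciphers=None, ca_file=None):
    """Build a verifying client context with optional TLS constraints."""
    # With ca_file set, only that CA bundle is trusted (for example, a test
    # CA exported to disk); without it the system trust store is loaded.
    ctx = ssl.create_default_context(cafile=ca_file)
    if min_version is not None:
        ctx.minimum_version = _TLS_VERSIONS[min_version]
    if ciphers is not None:
        # OpenSSL cipher-list syntax. This restricts TLS 1.2 and older suites;
        # TLS 1.3 suites are configured separately by OpenSSL.
        ctx.set_ciphers(ciphers)
    return ctx


# Usage mirroring the scenario parameters above.
tls13_only = build_client_context(min_version="TLSv1.3")
gcm_only = build_client_context(ciphers="ECDHE+AESGCM:DHE+AESGCM")
```

Both positive scenarios expect a 200, so they confirm that a constrained client still negotiates successfully with the provider rather than proving that weaker settings are rejected.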
