test: E2E test suite for ExternalProvider/ExternalModel CRDs#187
test: E2E test suite for ExternalProvider/ExternalModel CRDs#187yossiovadia wants to merge 2 commits into
Conversation
Automated E2E tests validating the full new-CRD pipeline: ExternalProvider + ExternalModel → controller creates networking resources → BBR plugin resolves → api-translation → inference response. 38 test cases across 7 categories, 5 providers: - tier1: HTTP 200 smoke + OpenAI format validation (10 tests) - tier2: tool calling (5), multimodal (5), JSON mode (5), system prompts (5), multi-turn (5), error handling (3) Known failures (pre-existing bugs on main, not regressions): - Anthropic multimodal: translator drops image metadata (opendatahub-io#185) - Anthropic JSON mode: translator strips response_format (opendatahub-io#186) Tests run against local Kind deployment with llm-katan simulator. Requires: new CRDs + controller + updated BBR plugin deployed.
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: yossiovadia The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
Warning Rate limit exceeded
To keep reviews running without waiting, you can enable usage-based add-on for your organization. This allows additional reviews beyond the hourly cap. Account admins can enable it under billing. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Central YAML (base), Organization UI (inherited) Review profile: CHILL Plan: Enterprise Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughTwo new E2E test files are added to the test suite. The first file ( Estimated code review effort🎯 3 (Moderate) | ⏱️ ~35 minutes 🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (5)
test/e2e-new-crds/e2e_suite_test.go (2)
162-199: Cache the*rest.Configinstead of reloading kubeconfig on every exec.
execInPodis called insidegomega.Eventuallypolling loops across ~38 specs, so this re-parses kubeconfig hundreds of times and can also drift from the config used to buildkubeClientinBeforeSuite(e.g. ifKUBECONFIGmutates between calls). Hoist the loaded config into a package-level var alongsidekubeClient.Sketch
var ( kubeClient kubernetes.Interface + restConfig *rest.Config ... ) @@ BeforeSuite - config, err := clientcmd.NewNonInteractiveDeferredLoadingClientConfig(...).ClientConfig() + restConfig, err = clientcmd.NewNonInteractiveDeferredLoadingClientConfig(...).ClientConfig() ... - kubeClient, err = kubernetes.NewForConfig(config) + kubeClient, err = kubernetes.NewForConfig(restConfig) @@ execInPod - config, err := clientcmd.NewNonInteractiveDeferredLoadingClientConfig( - clientcmd.NewDefaultClientConfigLoadingRules(), nil, - ).ClientConfig() - if err != nil { - return "", err - } ... - executor, err := remotecommand.NewSPDYExecutor(config, "POST", req.URL()) + executor, err := remotecommand.NewSPDYExecutor(restConfig, "POST", req.URL())🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e-new-crds/e2e_suite_test.go` around lines 162 - 199, The execInPod function currently reloads kubeconfig on every call; instead, create a package-level *rest.Config variable (e.g., restConfig) and initialize it once in the same setup that creates kubeClient (BeforeSuite), then modify execInPod to use that cached restConfig rather than calling clientcmd.NewNonInteractiveDeferredLoadingClientConfig; update execInPod to reference restConfig when calling remotecommand.NewSPDYExecutor and add a clear error if restConfig is nil to help catch mis-initialization.
123-129: Useapierrors.IsAlreadyExistsinstead ofstrings.Contains(err.Error(), "already exists").Substring matching on error text is locale/version-fragile and will silently swallow unrelated errors that happen to contain the phrase. The k8s API returns a typed
StatusErrorfor this case.Proposed fix
+ apierrors "k8s.io/apimachinery/pkg/api/errors" ... func createNamespace(name string) { ns := &corev1.Namespace{ObjectMeta: metav1.ObjectMeta{Name: name}} _, err := kubeClient.CoreV1().Namespaces().Create(context.TODO(), ns, metav1.CreateOptions{}) - if err != nil && !strings.Contains(err.Error(), "already exists") { + if err != nil && !apierrors.IsAlreadyExists(err) { gomega.Expect(err).NotTo(gomega.HaveOccurred()) } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e-new-crds/e2e_suite_test.go` around lines 123 - 129, Replace the fragile substring check in createNamespace by using the Kubernetes API error helper: when kubeClient.CoreV1().Namespaces().Create(...) returns an error, import "k8s.io/apimachinery/pkg/api/errors" and call apierrors.IsAlreadyExists(err) to detect the already-exists case; only call gomega.Expect(err).NotTo(gomega.HaveOccurred()) for errors that are not apierrors.IsAlreadyExists(err). Ensure the import is added and the logic references createNamespace and the Create call.test/e2e-new-crds/e2e_test.go (3)
209-217:SplitN(..., 2)misparses responses with interim status lines.
curl -iprints every response section, includingHTTP/1.1 100 Continue\r\n\r\ninterim responses, followed by the realHTTP/1.1 200 OK\r\n...\r\n\r\nbody. The first\r\n\r\nsplit point lands after the 100 Continue, soparts[1]becomes another header block plus body andjson.Unmarshalfails. It's also possible to get gateway-injected100 Continuefrom Envoy/Istio under load.Safer: split on the last
\r\n\r\n, or scan for the finalHTTP/status line.Proposed fix
- parts := strings.SplitN(resp, "\r\n\r\n", 2) - if len(parts) < 2 { + idx := strings.LastIndex(resp, "\r\n\r\n") + if idx < 0 { return nil, fmt.Errorf("no body separator found") } var result map[string]any - err := json.Unmarshal([]byte(strings.TrimSpace(parts[1])), &result) + err := json.Unmarshal([]byte(strings.TrimSpace(resp[idx+4:])), &result) return result, err🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e-new-crds/e2e_test.go` around lines 209 - 217, The parseResponseBody function misparses when interim HTTP status blocks (e.g., "HTTP/1.1 100 Continue") precede the real response; change parseResponseBody to locate the final header/body separator instead of using strings.SplitN(...,2) — e.g., find the last occurrence of "\r\n\r\n" (or scan for the last "HTTP/" status line and take the subsequent separator) and unmarshal the substring after that separator; update references in parseResponseBody to use that final-split logic so json.Unmarshal receives only the real response body.
91-105: Don't drop thejson.Marshalerror.If marshalling the body fails (e.g. someone later passes a value that isn't JSON-serializable),
bodyBytesisniland curl is invoked with-d "", producing a confusing 400 that looks like a server bug. Fail the test at the source.Proposed fix
-func buildCurlCommand(modelName string, body map[string]any) []string { - bodyBytes, _ := json.Marshal(body) +func buildCurlCommand(modelName string, body map[string]any) []string { + bodyBytes, err := json.Marshal(body) + gomega.Expect(err).NotTo(gomega.HaveOccurred(), "failed to marshal request body")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e-new-crds/e2e_test.go` around lines 91 - 105, buildCurlCommand currently drops the error from json.Marshal causing nil body to become an empty -d and a misleading 400; change buildCurlCommand(modelName string, body map[string]any) to return ([]string, error), check the json.Marshal error (i.e. handle bodyBytes, err := json.Marshal(body)), and return nil, err on failure; update all callers to handle the error (in tests call t.Fatalf or fail the test when buildCurlCommand returns an error) so marshalling problems fail fast with a clear message instead of sending an empty payload.
195-207: Remove unusedfilterProvidersfunction.Ripgrep found zero callers for this function across the entire codebase. Since it's test-only dead code with no impact on security or runtime behavior, drop it.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e-new-crds/e2e_test.go` around lines 195 - 207, Remove the unused test helper function filterProviders: delete the entire function definition (filterProviders(providers []Provider, excludeTypes ...string) []Provider) and any references to it (there are none), and run the tests/build to verify nothing else depends on the Provider signature; this is dead test-only code so simply removing the function from e2e_test.go is sufficient.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/e2e-new-crds/e2e_suite_test.go`:
- Around line 22-27: Replace the public sslip.io default in the constant
defaultSimulatorEndpoint with an in-cluster service hostname (e.g.,
"llm-katan.llm-katan.svc.cluster.local") so tests default to cluster-local
traffic; ensure any code that reads E2E_SIMULATOR_ENDPOINT treats it as an
opt-in override (fall back to defaultSimulatorEndpoint only when the env var is
set/empty as intended) so credentials and request bodies are not sent to an
external IP by default.
In `@test/e2e-new-crds/e2e_test.go`:
- Around line 24-30: Add a one-line explanatory comment immediately above the
simulatorKeys map to document that these hardcoded values are intentional
defaults from the llm-katan simulator; specifically insert the comment "// These
match the DEFAULT_API_KEYS in llm-katan's config.py." before the simulatorKeys
variable declaration to prevent accidental replacement with real keys (refer to
the simulatorKeys map symbol).
- Around line 285-305: The tests perform unchecked type assertions (e.g.,
result["choices"].([]any), choice := choices[0].(map[string]any), tc :=
toolCalls[0].(map[string]any), fn := tc["function"].(map[string]any)) which will
panic on unexpected payloads; update each of these sites in the spec around
parseResponseBody handling to use the comma-ok form (e.g., v, ok := ...;
gomega.Expect(ok).To(gomega.BeTrue(), "descriptive message")) or replace them
with Gomega matchers like HaveKeyWithValue/MatchKeys so failures produce
readable expectations rather than panics—apply the same change for every
unchecked assertion occurrences referenced in the comment (lines handling
choices, message/tool_calls, tc, fn and the other mentioned assertion sites).
---
Nitpick comments:
In `@test/e2e-new-crds/e2e_suite_test.go`:
- Around line 162-199: The execInPod function currently reloads kubeconfig on
every call; instead, create a package-level *rest.Config variable (e.g.,
restConfig) and initialize it once in the same setup that creates kubeClient
(BeforeSuite), then modify execInPod to use that cached restConfig rather than
calling clientcmd.NewNonInteractiveDeferredLoadingClientConfig; update execInPod
to reference restConfig when calling remotecommand.NewSPDYExecutor and add a
clear error if restConfig is nil to help catch mis-initialization.
- Around line 123-129: Replace the fragile substring check in createNamespace by
using the Kubernetes API error helper: when
kubeClient.CoreV1().Namespaces().Create(...) returns an error, import
"k8s.io/apimachinery/pkg/api/errors" and call apierrors.IsAlreadyExists(err) to
detect the already-exists case; only call
gomega.Expect(err).NotTo(gomega.HaveOccurred()) for errors that are not
apierrors.IsAlreadyExists(err). Ensure the import is added and the logic
references createNamespace and the Create call.
In `@test/e2e-new-crds/e2e_test.go`:
- Around line 209-217: The parseResponseBody function misparses when interim
HTTP status blocks (e.g., "HTTP/1.1 100 Continue") precede the real response;
change parseResponseBody to locate the final header/body separator instead of
using strings.SplitN(...,2) — e.g., find the last occurrence of "\r\n\r\n" (or
scan for the last "HTTP/" status line and take the subsequent separator) and
unmarshal the substring after that separator; update references in
parseResponseBody to use that final-split logic so json.Unmarshal receives only
the real response body.
- Around line 91-105: buildCurlCommand currently drops the error from
json.Marshal causing nil body to become an empty -d and a misleading 400; change
buildCurlCommand(modelName string, body map[string]any) to return ([]string,
error), check the json.Marshal error (i.e. handle bodyBytes, err :=
json.Marshal(body)), and return nil, err on failure; update all callers to
handle the error (in tests call t.Fatalf or fail the test when buildCurlCommand
returns an error) so marshalling problems fail fast with a clear message instead
of sending an empty payload.
- Around line 195-207: Remove the unused test helper function filterProviders:
delete the entire function definition (filterProviders(providers []Provider,
excludeTypes ...string) []Provider) and any references to it (there are none),
and run the tests/build to verify nothing else depends on the Provider
signature; this is dead test-only code so simply removing the function from
e2e_test.go is sufficient.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Central YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: b903da1e-c9ad-47a5-8ef1-0a69e22e7e3b
📒 Files selected for processing (2)
test/e2e-new-crds/e2e_suite_test.gotest/e2e-new-crds/e2e_test.go
| const ( | ||
| defaultNs = "e2e-new-crds" | ||
| defaultSimulatorEndpoint = "3-13-21-181.sslip.io" | ||
| defaultGatewayName = "maas-default-gateway" | ||
| defaultGatewayNamespace = "istio-system" | ||
| ) |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
What public host is 3.13.21.181 / 3-13-21-181.sslip.io and is sslip.io a wildcard DNS resolver?
💡 Result:
3.13.21.181 / 3-13-21-181.sslip.io resolves to the IP 3.13.21.181 using sslip.io’s IP-embedded DNS mapping format. Yes—sslip.io is a wildcard-style DNS resolver (often described as “wildcard DNS”) in the sense that any hostname of the form .sslip.io returns that embedded IP as an A/AAAA answer, for both local and public IPs (with a configurable “public” behavior in its server implementation).
Citations:
- 1: https://sslip.io/
- 2: https://github.com/cunnie/sslip.io
- 3: https://docs.epinio.io/1.11.0/installation/wildcardDNS_setup
- 4: https://nip.io/
Move simulatorEndpoint default from external sslip.io endpoint to in-cluster service.
3-13-21-181.sslip.io resolves to 3.13.21.181 (a public IP). Any developer or CI pipeline that forgets to set E2E_SIMULATOR_ENDPOINT will send request bodies and the Authorization header (carrying llm-katan-*-key) to an IP outside your control—CWE-668 (Exposure of Resource to Wrong Sphere). Even if the keys are simulator-only, this is a hidden external dependency that fails silently instead of loudly.
Default to an in-cluster service (e.g., llm-katan.llm-katan.svc.cluster.local) and treat the env var as opt-in for off-cluster test runs. This prevents accidental credential exfiltration and makes missing endpoints fail explicitly.
Change
- defaultSimulatorEndpoint = "3-13-21-181.sslip.io"
+ // In-cluster llm-katan service; override with E2E_SIMULATOR_ENDPOINT for off-cluster runs.
+ defaultSimulatorEndpoint = "llm-katan.llm-katan.svc.cluster.local"📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| const ( | |
| defaultNs = "e2e-new-crds" | |
| defaultSimulatorEndpoint = "3-13-21-181.sslip.io" | |
| defaultGatewayName = "maas-default-gateway" | |
| defaultGatewayNamespace = "istio-system" | |
| ) | |
| const ( | |
| defaultNs = "e2e-new-crds" | |
| // In-cluster llm-katan service; override with E2E_SIMULATOR_ENDPOINT for off-cluster runs. | |
| defaultSimulatorEndpoint = "llm-katan.llm-katan.svc.cluster.local" | |
| defaultGatewayName = "maas-default-gateway" | |
| defaultGatewayNamespace = "istio-system" | |
| ) |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/e2e-new-crds/e2e_suite_test.go` around lines 22 - 27, Replace the public
sslip.io default in the constant defaultSimulatorEndpoint with an in-cluster
service hostname (e.g., "llm-katan.llm-katan.svc.cluster.local") so tests
default to cluster-local traffic; ensure any code that reads
E2E_SIMULATOR_ENDPOINT treats it as an opt-in override (fall back to
defaultSimulatorEndpoint only when the env var is set/empty as intended) so
credentials and request bodies are not sent to an external IP by default.
There was a problem hiding this comment.
The sslip.io endpoint is the llm-katan simulator running on AWS — it's external by design (simulates a real cloud provider). Using an in-cluster service would defeat the purpose of testing external model routing through Istio ServiceEntry + TLS origination. The endpoint is configurable via E2E_SIMULATOR_ENDPOINT env var.
There was a problem hiding this comment.
@yossiovadia, understood — the external endpoint is architecturally necessary to exercise the full Istio ServiceEntry + TLS origination path, and replacing it with an in-cluster service would short-circuit exactly what the test is validating. With simulator-only keys and the E2E_SIMULATOR_ENDPOINT override already in place, the concern doesn't apply here. Thanks for the context.
📝 Learnings were identified and not saved (knowledge base disabled). Enable
| var simulatorKeys = map[string]string{ | ||
| "openai": "llm-katan-openai-key", | ||
| "anthropic": "llm-katan-anthropic-key", | ||
| "azure-openai": "llm-katan-azure-key", | ||
| "vertex-openai": "llm-katan-vertexai-key", | ||
| "bedrock-openai": "llm-katan-bedrock-key", | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Confirm these strings only appear as llm-katan simulator defaults and aren't referenced as real secrets elsewhere.
rg -nP -C2 'llm-katan-(openai|anthropic|azure|vertexai|bedrock)-key'Repository: opendatahub-io/ai-gateway-payload-processing
Length of output: 2314
Add explanatory comment pinning these keys to llm-katan simulator defaults.
The hardcoded keys match llm-katan v0.15.0 simulator defaults (confirmed in tools/simulator/README.md). However, unlike the sibling file test/e2e/e2e_test.go, this file lacks a comment documenting the source. Add:
// These match the DEFAULT_API_KEYS in llm-katan's config.py.
before the simulatorKeys map. Without this anchor, future maintainers may replace them with real keys for "convenience", creating CWE-798 exposure.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/e2e-new-crds/e2e_test.go` around lines 24 - 30, Add a one-line
explanatory comment immediately above the simulatorKeys map to document that
these hardcoded values are intentional defaults from the llm-katan simulator;
specifically insert the comment "// These match the DEFAULT_API_KEYS in
llm-katan's config.py." before the simulatorKeys variable declaration to prevent
accidental replacement with real keys (refer to the simulatorKeys map symbol).
There was a problem hiding this comment.
Good point — there's already a comment on the existing E2E tests' key map (test/e2e/e2e_test.go:24). I'll add the same here.
There was a problem hiding this comment.
@yossiovadia, sounds good! For consistency, make sure the added comment explicitly ties the keys to the simulator defaults to prevent CWE-798 exposure, e.g.:
// These match the DEFAULT_API_KEYS in llm-katan's config.py (simulator test defaults — not real credentials).
var simulatorKeys = map[string]string{Once it's committed, feel free to drop a note here and I'll mark this resolved.
| result, err := parseResponseBody(resp) | ||
| gomega.Expect(err).NotTo(gomega.HaveOccurred()) | ||
|
|
||
| choices, ok := result["choices"].([]any) | ||
| gomega.Expect(ok).To(gomega.BeTrue()) | ||
| gomega.Expect(len(choices)).To(gomega.BeNumerically(">", 0)) | ||
|
|
||
| choice := choices[0].(map[string]any) | ||
| gomega.Expect(choice["finish_reason"]).To(gomega.Equal("tool_calls"), | ||
| fmt.Sprintf("Expected finish_reason=tool_calls for %s", p.ProviderType)) | ||
|
|
||
| msg := choice["message"].(map[string]any) | ||
| toolCalls, ok := msg["tool_calls"].([]any) | ||
| gomega.Expect(ok).To(gomega.BeTrue(), "Expected tool_calls array in message") | ||
| gomega.Expect(len(toolCalls)).To(gomega.BeNumerically(">", 0)) | ||
|
|
||
| tc := toolCalls[0].(map[string]any) | ||
| gomega.Expect(tc).To(gomega.HaveKey("id")) | ||
| fn := tc["function"].(map[string]any) | ||
| gomega.Expect(fn["name"]).To(gomega.Equal("get_weather")) | ||
| }) |
There was a problem hiding this comment.
Unchecked type assertions panic instead of producing legible test failures.
Several spots in this file (lines 292, 296, 301, 303 here; also 331-332, 363-364, 411, 413-415 in later specs) do x := y.(map[string]any) / y.([]any) without the , ok form. If the gateway or upstream returns an unexpected payload (an error JSON, or a string for content instead of an object), the goroutine panics and you get a stack trace instead of "Expected ... for provider X".
Either guard each assertion with , ok and gomega.Expect(ok)..., or let Gomega type-coerce via gomega.HaveKeyWithValue / gomega.MatchKeys. Pattern to apply throughout:
Example (apply to every `.(...)` site, including 331-415)
- choice := choices[0].(map[string]any)
+ choice, ok := choices[0].(map[string]any)
+ gomega.Expect(ok).To(gomega.BeTrue(), "choices[0] is not an object")
gomega.Expect(choice["finish_reason"]).To(gomega.Equal("tool_calls"), ...)
- msg := choice["message"].(map[string]any)
+ msg, ok := choice["message"].(map[string]any)
+ gomega.Expect(ok).To(gomega.BeTrue(), "choice.message is not an object")
toolCalls, ok := msg["tool_calls"].([]any)
...
- tc := toolCalls[0].(map[string]any)
+ tc, ok := toolCalls[0].(map[string]any)
+ gomega.Expect(ok).To(gomega.BeTrue(), "tool_calls[0] is not an object")
gomega.Expect(tc).To(gomega.HaveKey("id"))
- fn := tc["function"].(map[string]any)
+ fn, ok := tc["function"].(map[string]any)
+ gomega.Expect(ok).To(gomega.BeTrue(), "tool_calls[0].function is not an object")
gomega.Expect(fn["name"]).To(gomega.Equal("get_weather"))📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| result, err := parseResponseBody(resp) | |
| gomega.Expect(err).NotTo(gomega.HaveOccurred()) | |
| choices, ok := result["choices"].([]any) | |
| gomega.Expect(ok).To(gomega.BeTrue()) | |
| gomega.Expect(len(choices)).To(gomega.BeNumerically(">", 0)) | |
| choice := choices[0].(map[string]any) | |
| gomega.Expect(choice["finish_reason"]).To(gomega.Equal("tool_calls"), | |
| fmt.Sprintf("Expected finish_reason=tool_calls for %s", p.ProviderType)) | |
| msg := choice["message"].(map[string]any) | |
| toolCalls, ok := msg["tool_calls"].([]any) | |
| gomega.Expect(ok).To(gomega.BeTrue(), "Expected tool_calls array in message") | |
| gomega.Expect(len(toolCalls)).To(gomega.BeNumerically(">", 0)) | |
| tc := toolCalls[0].(map[string]any) | |
| gomega.Expect(tc).To(gomega.HaveKey("id")) | |
| fn := tc["function"].(map[string]any) | |
| gomega.Expect(fn["name"]).To(gomega.Equal("get_weather")) | |
| }) | |
| result, err := parseResponseBody(resp) | |
| gomega.Expect(err).NotTo(gomega.HaveOccurred()) | |
| choices, ok := result["choices"].([]any) | |
| gomega.Expect(ok).To(gomega.BeTrue()) | |
| gomega.Expect(len(choices)).To(gomega.BeNumerically(">", 0)) | |
| choice, ok := choices[0].(map[string]any) | |
| gomega.Expect(ok).To(gomega.BeTrue(), "choices[0] is not an object") | |
| gomega.Expect(choice["finish_reason"]).To(gomega.Equal("tool_calls"), | |
| fmt.Sprintf("Expected finish_reason=tool_calls for %s", p.ProviderType)) | |
| msg, ok := choice["message"].(map[string]any) | |
| gomega.Expect(ok).To(gomega.BeTrue(), "choice.message is not an object") | |
| toolCalls, ok := msg["tool_calls"].([]any) | |
| gomega.Expect(ok).To(gomega.BeTrue(), "Expected tool_calls array in message") | |
| gomega.Expect(len(toolCalls)).To(gomega.BeNumerically(">", 0)) | |
| tc, ok := toolCalls[0].(map[string]any) | |
| gomega.Expect(ok).To(gomega.BeTrue(), "tool_calls[0] is not an object") | |
| gomega.Expect(tc).To(gomega.HaveKey("id")) | |
| fn, ok := tc["function"].(map[string]any) | |
| gomega.Expect(ok).To(gomega.BeTrue(), "tool_calls[0].function is not an object") | |
| gomega.Expect(fn["name"]).To(gomega.Equal("get_weather")) |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/e2e-new-crds/e2e_test.go` around lines 285 - 305, The tests perform
unchecked type assertions (e.g., result["choices"].([]any), choice :=
choices[0].(map[string]any), tc := toolCalls[0].(map[string]any), fn :=
tc["function"].(map[string]any)) which will panic on unexpected payloads; update
each of these sites in the spec around parseResponseBody handling to use the
comma-ok form (e.g., v, ok := ...; gomega.Expect(ok).To(gomega.BeTrue(),
"descriptive message")) or replace them with Gomega matchers like
HaveKeyWithValue/MatchKeys so failures produce readable expectations rather than
panics—apply the same change for every unchecked assertion occurrences
referenced in the comment (lines handling choices, message/tool_calls, tc, fn
and the other mentioned assertion sites).
There was a problem hiding this comment.
Fair point. These follow the same pattern as the existing E2E tests (test/e2e/e2e_test.go:200). In practice, a panic in a test produces a clear stack trace that points directly to the assertion. But I can wrap them in gomega Expect for cleaner output — will address.
There was a problem hiding this comment.
@yossiovadia, agreed — the stack trace is usually sufficient for debugging, but one extra concern in Ginkgo: if the goroutine panic isn't recovered cleanly (e.g., in a parallel go test -p N run), it can abort the entire suite process rather than just failing the single It node, swallowing results from other specs. The gomega.Expect(ok) form lets Ginkgo's panic-recovery machinery report it as a normal spec failure and continue. The committable suggestion above covers all the affected lines — feel free to use it as-is when you get to it.
|
Update: The 2 Anthropic failures discovered by this test suite have been fixed:
Once those merge, this suite should go 38/38 green. |
|
/cc @nirrozenbaum |
Summary
Automated E2E test suite validating the full ExternalProvider/ExternalModel pipeline end-to-end. Tests the complete stack: CRD creation → controller reconciliation → BBR plugin resolution → api-translation → llm-katan simulator response.
Depends on #182, #183, #184 (ExternalProvider CRD types, reconcilers, and BBR plugin integration).
Test coverage
38 test cases across 7 categories, all 5 providers:
Known failures (pre-existing bugs, not regressions)
response_formatfield, so the backend never sees the JSON constraintBoth bugs exist on
mainand are unrelated to the ExternalProvider work. Discovered via these tests running on a local Kind deployment with llm-katan v0.15.0 simulator.How to run
Requires a cluster with new CRDs + controller + updated BBR plugin deployed (e.g., via
local-deploy.sh):Test plan
test/e2e/tests (CI/CD unaffected)Summary by CodeRabbit