Commit 4259578
feat: add automated model discovery with feature-flag-gated availability (#772)
## Summary
Replaces hardcoded model arrays across 5 components with a single-source
model manifest (`models.json`), feature-flag-gated availability via
Unleash, and automated Vertex AI endpoint discovery.
**Before:** Adding a new model required changes in 5+ files across 3
components, a CI build, and a release.
**After:** Add a model ID to the manifest → discovery script probes
Vertex AI → Unleash flag auto-created on deploy → admin enables via UI.
[RHOAIENG-50567](https://issues.redhat.com/browse/RHOAIENG-50567)
## What changed
### Model Manifest (single source of truth)
- **`components/manifests/base/models.json`** — Declarative list of all
known models with Vertex AI mappings, provider, and availability.
Deployed as a ConfigMap (`ambient-models`) mounted into backend and
operator pods.
- **`components/manifests/base/kustomization.yaml`** —
`configMapGenerator` with `disableNameSuffixHash: true` for stable
ConfigMap name.
### Backend (Go)
- **`handlers/models.go`** — New `GET /api/projects/:projectName/models`
endpoint. Reads manifest from ConfigMap volume, checks workspace-scoped
feature flag overrides (ConfigMap), falls back to Unleash global state.
Uses `atomic.Pointer` for thread-safe manifest caching.
- **`handlers/sessions.go`** — `isModelAvailable()` validation added to
`CreateSession` — rejects unknown or disabled models with 400.
- **`handlers/featureflags_admin.go`** — Refactored: extracted
`setFlagOverride()` shared helper (3 duplicate handlers → one-liners).
Added `sanitizeParam()` for log injection prevention on all URL params.
Added `errors.IsAlreadyExists` handling for ConfigMap creation race.
Switched from `context.Background()` to `c.Request.Context()`.
- **`cmd/sync_model_flags.go`** — New `sync-model-flags` subcommand.
Ensures every manifest model has a corresponding Unleash flag (disabled,
`scope:workspace` tagged). Runs at server startup with 3 retries +
exponential backoff. All HTTP calls use `context.Context` for
cancellation.
- **`featureflags/featureflags.go`** — `initialized` changed to
`atomic.Bool` (race fix). Added `IsModelEnabled()` with fail-open
semantics (models available when Unleash is down).
- **`types/models.go`** — New types: `Model`, `ModelEntry`,
`ModelManifest`, `ListModelsResponse`.
### Operator (Go)
- **`internal/models/models.go`** — `LoadManifest()` and
`ResolveVertexID()` for manifest-driven Vertex AI model ID resolution.
Parses `provider` field for future multi-provider support.
- **`internal/handlers/sessions.go`** — Injects `LLM_MODEL_VERTEX_ID`
env var into runner pods from manifest resolution (replaces reliance on
runner's static `VERTEX_MODEL_MAP`).
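
The manifest-driven resolution described in this section can be sketched as follows. This is a minimal Python illustration of the logic only (the operator itself is written in Go). The manifest field names `models`, `id`, `vertexId`, `provider`, and `available`, the sample values, and the function names are assumptions for illustration, not the actual schema of `models.json`.

```python
import json
from typing import Optional

# Hypothetical manifest content; the real models.json schema may differ.
SAMPLE_MANIFEST = """
{
  "models": [
    {"id": "model-a", "vertexId": "model-a@20250101",
     "provider": "example-provider", "available": true},
    {"id": "model-b", "vertexId": "",
     "provider": "example-provider", "available": false}
  ]
}
"""


def load_manifest(raw: str) -> dict:
    """Parse manifest JSON, raising ValueError on malformed input."""
    try:
        manifest = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"malformed manifest: {exc}") from exc
    if not isinstance(manifest, dict) or not isinstance(manifest.get("models"), list):
        raise ValueError("manifest must contain a 'models' list")
    return manifest


def resolve_vertex_id(manifest: dict, model_id: str) -> Optional[str]:
    """Return the Vertex AI ID mapped to model_id, or None if unknown or unmapped."""
    for entry in manifest["models"]:
        if entry.get("id") == model_id:
            # An empty mapping is treated the same as a missing one.
            return entry.get("vertexId") or None
    return None
```

Returning `None` for an unknown or unmapped model leaves the caller free to skip injecting `LLM_MODEL_VERTEX_ID`, so the runner's own fallback still applies.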
### Frontend (TypeScript)
- **`services/api/models.ts`** + **`services/queries/use-models.ts`** —
React Query hook (`useModels`) with `staleTime: 60s` and `enabled`
guard.
- **`components/create-session-dialog.tsx`** — Model dropdown fetches
from API (lazy-loaded when dialog opens). Falls back to hardcoded list
if API unavailable.
- **`components/workspace-sections/feature-flags-section.tsx`** —
Replaced toggle switch with three-state segmented control (Default | On
| Off). Invalidates `["models", projectName]` query after saving
overrides.
- **`types/api/models.ts`** — `LLMModel` and `ListModelsResponse` types.
### Runner (Python)
- **`bridges/claude/auth.py`** — `setup_sdk_authentication()` now
prefers `LLM_MODEL_VERTEX_ID` (operator-resolved from manifest) over
static `VERTEX_MODEL_MAP`. Fallback chain: manifest → static map →
default.
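
The fallback chain above (manifest, then static map, then default) can be sketched roughly like this. Only `LLM_MODEL_VERTEX_ID` and `VERTEX_MODEL_MAP` come from this change; the map contents, the `DEFAULT_VERTEX_MODEL` name, and the `resolve_vertex_model` helper are hypothetical stand-ins, not the actual body of `setup_sdk_authentication()`.

```python
import os
from typing import Mapping, Optional

# Hypothetical values for illustration; the runner's real map and default differ.
VERTEX_MODEL_MAP = {"model-a": "model-a@static"}
DEFAULT_VERTEX_MODEL = "default-model@static"


def resolve_vertex_model(model: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Pick the Vertex AI model ID: manifest-resolved env var, static map, default."""
    env = os.environ if env is None else env
    # 1. Operator-resolved ID from the manifest, if it was injected into the pod.
    manifest_id = env.get("LLM_MODEL_VERTEX_ID", "").strip()
    if manifest_id:
        return manifest_id
    # 2. Static map, 3. default when the model is not in the map.
    return VERTEX_MODEL_MAP.get(model, DEFAULT_VERTEX_MODEL)
```

Passing `env` explicitly keeps the function easy to test without mutating the process environment.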
### CI/CD
- **`.github/workflows/model-discovery.yml`** — Daily GHA workflow
probes Vertex AI endpoints via Workload Identity Federation, updates
`models.json`, and opens a PR if changes detected.
- **`scripts/model-discovery.py`** — Discovery script with retry +
exponential backoff on both version resolution and availability probing.
Resilient to missing/malformed manifest. Never removes models.
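
As a rough sketch of the two properties named above (retry with exponential backoff, and never removing models), the following shows one way such a script could be structured. The helper names, the `id` and `available` fields, and the delay values are assumptions for illustration, not the contents of `scripts/model-discovery.py`.

```python
import time
from typing import Callable, Dict, List, TypeVar

T = TypeVar("T")


def with_retry(fn: Callable[[], T], attempts: int = 3, base_delay: float = 1.0,
               sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn, retrying on any exception with delays of base_delay * 2**n."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * (2 ** attempt))
    raise ValueError("attempts must be >= 1")


def update_availability(existing: List[dict], probed: Dict[str, bool]) -> List[dict]:
    """Apply probe results to known models; entries are never dropped."""
    updated = [dict(entry) for entry in existing]  # copy, leave input untouched
    for entry in updated:
        if entry["id"] in probed:
            entry["available"] = probed[entry["id"]]
    return updated
```

A model whose probe did not run, or failed, keeps its previous entry, so a transient Vertex AI outage cannot shrink the manifest.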
## Model flow (end-to-end)
```
User selects model in UI
→ Frontend fetches GET /api/projects/:name/models
→ Backend reads manifest (ConfigMap volume)
→ Backend checks workspace override (ConfigMap) then Unleash
→ Returns filtered model list
User creates session with model
→ Backend validates model via isModelAvailable()
→ Backend creates AgenticSession CR
→ Operator resolves vertexId from manifest
→ Operator injects LLM_MODEL_VERTEX_ID into runner pod
→ Runner prefers manifest-resolved ID over static map
```
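
The availability decision in this flow (workspace override first, then Unleash, fail-open when Unleash cannot answer) can be sketched as below. This is a Python illustration of the decision order only; the backend is Go, and the flag naming scheme, the function signatures, and the shape of the overrides mapping are assumptions.

```python
from typing import Callable, Collection, Mapping, Optional


def flag_name(model_id: str) -> str:
    # Hypothetical naming scheme; the real flag names are not given in this message.
    return f"model.{model_id}.enabled"


def is_model_available(
    model_id: str,
    known_models: Collection[str],
    overrides: Mapping[str, bool],
    unleash_enabled: Optional[Callable[[str], bool]] = None,
) -> bool:
    """Decide availability: manifest membership, workspace override, then Unleash."""
    if model_id not in known_models:
        return False  # unknown models are rejected
    flag = flag_name(model_id)
    if flag in overrides:
        return overrides[flag]  # explicit On/Off wins over global state
    if unleash_enabled is None:
        return True  # Unleash not configured: fail open
    try:
        return bool(unleash_enabled(flag))
    except Exception:
        return True  # Unleash unreachable: fail open
```

An absent key in `overrides` corresponds to the "Default" position of the three-state control, which defers to the global Unleash state.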
## Security
- All user-facing reads use `reqK8s` (user-scoped K8s client) for RBAC
enforcement
- `sanitizeParam()` strips `\n`/`\r` from all URL params before logging
- `url.PathEscape()` on all Unleash Admin API URL path segments
- Generic error messages to users; detailed context in server logs only
- ConfigMap creation race handled with `errors.IsAlreadyExists` fallback
- GCP access token wrapped in `RuntimeError` to prevent leak in
tracebacks
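
Three of the measures above can be illustrated with a short Python sketch. The function names are hypothetical, `urllib.parse.quote` stands in for Go's `url.PathEscape`, and the token handling shows one plausible reading of "wrapped in `RuntimeError`": replace the original exception with a generic one and suppress the chained context so nothing sensitive is printed.

```python
import urllib.parse
from typing import Callable


def sanitize_param(value: str) -> str:
    """Strip CR and LF so a URL parameter cannot forge extra log lines."""
    return value.replace("\n", "").replace("\r", "")


def escape_path_segment(segment: str) -> str:
    """Percent-encode a single URL path segment, including any '/'."""
    return urllib.parse.quote(segment, safe="")


def fetch_access_token(get_token: Callable[[], str]) -> str:
    """Obtain a token; on failure raise a generic error without the original cause."""
    try:
        return get_token().strip()
    except Exception:
        # 'from None' suppresses the chained exception, whose message or
        # arguments might contain the token or the command that produced it.
        raise RuntimeError("failed to obtain GCP access token") from None
```

For example, `sanitize_param("proj\r\nINFO forged entry")` yields a single-line value, and `escape_path_segment("a/b")` yields `a%2Fb`.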
## Tests
| Component | Tests | Coverage |
|-----------|-------|----------|
| Backend handlers | 14 cases (`models_test.go`) | Auth, overrides, caching, fallback, unavailable models |
| Backend cmd | 8 cases (`sync_model_flags_test.go`) | Flag creation, conflict, skip, error, CLI args |
| Backend featureflags | 5 cases (`featureflags_test.go`) | Unconfigured state, fail-open for models |
| Operator models | 7 cases (`models_test.go`) | Load, resolve, missing, malformed, unavailable |
| Runner auth | 9 cases (`test_claude_auth.py`) | Vertex ID priority, static map, API key mode |
## Test plan
- [ ] `cd components/backend && go vet ./... && go test -tags test ./...
&& golangci-lint run`
- [ ] `cd components/operator && go vet ./... && go test ./...`
- [ ] Deploy to cluster — verify `GET /api/projects/:name/models`
returns filtered list
- [ ] Verify model dropdown in create session dialog populates from API
- [ ] Toggle a model flag in Feature Flags admin → verify model list
updates
- [ ] Create session with a disabled model → verify 400 rejection
- [ ] Verify `sync-model-flags` logs show flags created/skipped on
startup
🤖 Generated with [Claude Code](https://claude.com/claude-code)
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

1 parent 56a196e commit 4259578
30 files changed
Lines changed: 2559 additions & 436 deletions
File tree
- .github/workflows
- components
  - backend
    - cmd
    - featureflags
    - handlers
    - types
  - frontend/src
    - app/api/projects/[name]/models
    - components
      - workspace-sections
    - services
      - api
      - queries
    - types/api
  - manifests/base
  - operator/internal
    - handlers
    - models
  - runners/claude-code-runner
    - ambient_runner/bridges/claude
    - tests
- scripts