Skip to content

feat: typed ExternalModel reconciler with cross-watch and legacy support#286

Open
yossiovadia wants to merge 1 commit into
opendatahub-io:mainfrom
yossiovadia:feat/model-store-crosswatch
Open

feat: typed ExternalModel reconciler with cross-watch and legacy support#286
yossiovadia wants to merge 1 commit into
opendatahub-io:mainfrom
yossiovadia:feat/model-store-crosswatch

Conversation

@yossiovadia
Copy link
Copy Markdown
Contributor

@yossiovadia yossiovadia commented May 18, 2026

Summary

Depends on #285. Updates the model-provider-resolver plugin to use typed clients for inference.opendatahub.io ExternalModels with provider cross-watch, plus legacy backward compatibility for maas.opendatahub.io.

4 files changed, 121 insertions, 46 deletions.

What this does

  • Typed ExternalModel reconciler cross-references provider store for credential/config resolution
  • Cross-watch: ExternalProvider changes trigger re-reconciliation of dependent models
  • Legacy maas.opendatahub.io ExternalModel reconciler (unstructured, backward compat)
  • ModelConfigKey for config propagation via CycleState
  • Scheme registration in factory for typed client support
  • Registers all three reconcilers (provider, model, legacy) in the plugin

Test plan

  • Unit tests pass, lint clean (0 issues)

Risk analysis

  • Risk rating: 2
  • Why: Modifies existing plugin.go (factory + reconciler registration). The legacy reconciler ensures backward compatibility with existing E2E tests that use maas.opendatahub.io CRDs.

Depends on: #285
cc @nirrozenbaum

Summary by CodeRabbit

  • New Features

    • Added support for storing per-model configuration derived from providers and model overrides.
  • Improvements

    • Reconciliations now use typed APIs for external providers and models, improving reliability and credential propagation.
    • Request validation tightened: only chat completions accepted and model names strictly matched.
  • Refactor

    • Rewired reconciler/plugin initialization and extended in-memory store to include model configs; legacy CRD handling retained for compatibility.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 18, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: b6083401-ca58-4577-9234-30ea571406e1

📥 Commits

Reviewing files that changed from the base of the PR and between 6e742da and aef4953.

📒 Files selected for processing (4)
  • pkg/plugins/common/state/state-keys.go
  • pkg/plugins/model-provider-resolver/external_model_reconciler.go
  • pkg/plugins/model-provider-resolver/plugin.go
  • pkg/plugins/model-provider-resolver/store.go
🚧 Files skipped from review as they are similar to previous changes (4)
  • pkg/plugins/common/state/state-keys.go
  • pkg/plugins/model-provider-resolver/store.go
  • pkg/plugins/model-provider-resolver/external_model_reconciler.go
  • pkg/plugins/model-provider-resolver/plugin.go

📝 Walkthrough

Walkthrough

Adds an exported ModelConfigKey and a config map field to externalModelInfo. Implements a typed inferencev1alpha1 ExternalModel reconciler that looks up providers, merges provider+model config via mergeConfig, applies auth secret overrides, and writes externalModelInfo into a shared infoStore; retains a legacy unstructured reconciler for the MaaS GVK. The plugin registers the inference scheme, wires typed reconcilers and a provider→model watch, and ProcessRequest enforces /chat/completions, validates model name, returns nil on missing store entries, and writes ModelConfigKey only when config is non-empty.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Security and Logic Notes

  • range_8a2a695c4b5f (mergeConfig, secret override):

    • Confirm mergeConfig treats provider config as lower precedence than model overrides and does not expose secrets through config keys. Check for CWE-200 (Information Exposure) where sensitive keys might be copied into non-secret config maps. Ensure keys carrying credentials are never merged into plain map[string]string (CWE-200).
    • Secret namespace/name override moves credential reference into model namespace; verify this does not permit namespace escape or unauthorized access (CWE-269: Improper Privilege Management). Ensure RBAC and secret access checks remain enforced for the model namespace.
    • On provider lookup requeue: verify exponential backoff or bounded retries to avoid busy-loop requeue (CWE-400: Uncontrolled Resource Consumption).
  • range_4849556e79ec (reconciler wiring, shared infoStore):

    • Shared infoStore concurrent writes from typed and legacy reconcilers must be synchronized; check for race conditions (CWE-362: Concurrent Execution with Shared Resource). Ensure thread-safe map access or locking.
    • Enqueue mapping from ExternalProvider→ExternalModel must comprehensively compute affected models; missing models could cause stale cache use (logic correctness).
  • range_83e4d1d0f013 (ProcessRequest validation, CycleState writes):

    • Strict route filtering to /chat/completions should not allow other routes; confirm this validation occurs before any downstream side effects. Mistaken acceptance could lead to improper routing (CWE-602: Client-Side Enforcement of Server-Side Security).
    • Writing config to CycleState only when non-empty: ensure consumers correctly handle absent key vs empty map; avoid NULL vs empty-different semantics that could result in Authorization or behavior bypasses.
  • range_03ef9c2b8811 (logger removal):

    • Removal of request-scoped logger may impair forensic/debugging; ensure sufficient context is logged elsewhere for failures, especially around model extraction and auth failures (operational security).

Action items (concrete):

  • Add unit tests covering mergeConfig precedence, and ensure secret keys are never stored in externalModelInfo.config.
  • Add concurrency tests or use sync-safe types for infoStore, or document locking.
  • Implement/verify backoff strategy on requeue for missing providers.
  • Add tests for ProcessRequest route validation and for behavior when ModelConfigKey is absent vs empty.
🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'feat: typed ExternalModel reconciler with cross-watch and legacy support' directly and comprehensively describes the main change: introducing typed ExternalModel reconciliation with cross-watch capability and backward compatibility.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci openshift-ci Bot requested review from nerdalert and noyitz May 18, 2026 14:53
@openshift-ci
Copy link
Copy Markdown

openshift-ci Bot commented May 18, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: yossiovadia

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 18, 2026
@github-actions github-actions Bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label May 18, 2026
@yossiovadia yossiovadia force-pushed the feat/model-store-crosswatch branch 2 times, most recently from 3b2f788 to 6e742da Compare May 19, 2026 14:11
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
pkg/plugins/model-provider-resolver/external_model_reconciler.go (1)

76-80: ⚡ Quick win

Always copy provider config before caching model info.

When ref.Config is empty, config aliases providerInfo.config. That exposes one shared map instance through every model using that provider, so any downstream mutation of ModelConfigKey can leak across models and requests.

Suggested fix
-	// Model config overrides provider config
-	config := providerInfo.config
-	if len(ref.Config) > 0 {
-		config = mergeConfig(providerInfo.config, ref.Config)
-	}
+	// Always copy provider config; model config overrides provider values.
+	config := mergeConfig(providerInfo.config, ref.Config)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/plugins/model-provider-resolver/external_model_reconciler.go` around
lines 76 - 80, When building the cached model config, avoid aliasing
providerInfo.config into config when ref.Config is empty — always copy
providerInfo.config first so downstream mutations (e.g., to ModelConfigKey)
don't leak across models; update the logic around providerInfo.config/ref.Config
to create a new map (or call mergeConfig with a copy) instead of assigning the
original map, ensuring the cached model info holds an independent map instance.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@pkg/plugins/model-provider-resolver/external_model_reconciler.go`:
- Around line 70-74: When provider lookup via r.store.getProvider(providerKey)
returns not found, invalidate the cached externalModelInfo in the reconciler's
infoStore so ProcessRequest cannot use stale provider/secret/config data;
specifically, in the branch where providerInfo not found (the block that logs
"ExternalProvider not yet available, requeuing"), remove the corresponding
externalModelInfo entry from r.store/infoStore (e.g., call the store method that
deletes the model/provider mapping using the same providerKey or ref.Ref.Name)
before returning the requeue result.

In `@pkg/plugins/model-provider-resolver/plugin.go`:
- Around line 102-107: The shared store currently keys ExternalModel entries
only by "namespace/name", causing typed and legacy ExternalModel instances to
clobber each other; update the storage and lookup so the cache key includes the
object's GVK (or alternatively create two distinct stores e.g., typedStore and
legacyStore) and adjust the reconciler wiring (legacyExternalModelReconciler,
the typed ExternalModel reconciler created via reconcilerBuilder()) to write to
the GVK-aware store; then update ProcessRequest to perform lookups by
GVK+namespace/name (or check legacyStore then typedStore with a defined
precedence) so coexistence during migration is safe and deletions only remove
the correct GVK-scoped entry.

---

Nitpick comments:
In `@pkg/plugins/model-provider-resolver/external_model_reconciler.go`:
- Around line 76-80: When building the cached model config, avoid aliasing
providerInfo.config into config when ref.Config is empty — always copy
providerInfo.config first so downstream mutations (e.g., to ModelConfigKey)
don't leak across models; update the logic around providerInfo.config/ref.Config
to create a new map (or call mergeConfig with a copy) instead of assigning the
original map, ensuring the cached model info holds an independent map instance.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 5e2040df-6d1c-4e25-9928-1a0702ca9c03

📥 Commits

Reviewing files that changed from the base of the PR and between 9917fb4 and 6e742da.

📒 Files selected for processing (4)
  • pkg/plugins/common/state/state-keys.go
  • pkg/plugins/model-provider-resolver/external_model_reconciler.go
  • pkg/plugins/model-provider-resolver/plugin.go
  • pkg/plugins/model-provider-resolver/store.go

Comment on lines +70 to +74
providerInfo, found := r.store.getProvider(providerKey)
if !found {
logger.Info("ExternalProvider not yet available, requeuing", "provider", ref.Ref.Name)
return ctrl.Result{RequeueAfter: 5 * time.Second}, nil
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Invalidate the cached model entry when the provider lookup misses.

If a model was previously resolved and its provider is later deleted or renamed, this branch leaves the old externalModelInfo in infoStore, so ProcessRequest can keep routing with stale provider/secret/config data until another reconcile fixes it.

Suggested fix
 	providerKey := types.NamespacedName{Namespace: req.Namespace, Name: ref.Ref.Name}
 	providerInfo, found := r.store.getProvider(providerKey)
 	if !found {
+		r.store.deleteModel(req.NamespacedName)
 		logger.Info("ExternalProvider not yet available, requeuing", "provider", ref.Ref.Name)
 		return ctrl.Result{RequeueAfter: 5 * time.Second}, nil
 	}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/plugins/model-provider-resolver/external_model_reconciler.go` around
lines 70 - 74, When provider lookup via r.store.getProvider(providerKey) returns
not found, invalidate the cached externalModelInfo in the reconciler's infoStore
so ProcessRequest cannot use stale provider/secret/config data; specifically, in
the branch where providerInfo not found (the block that logs "ExternalProvider
not yet available, requeuing"), remove the corresponding externalModelInfo entry
from r.store/infoStore (e.g., call the store method that deletes the
model/provider mapping using the same providerKey or ref.Ref.Name) before
returning the requeue result.

Comment on lines +102 to +107
// Legacy: watch maas.opendatahub.io ExternalModels for backward compatibility
legacyObj := &unstructured.Unstructured{}
legacyObj.SetGroupVersionKind(legacyExternalModelGVK)
legacyReconciler := &legacyExternalModelReconciler{Reader: k8sClient, store: store}
if err := reconcilerBuilder().For(legacyObj).Named("legacy-externalmodel").Complete(legacyReconciler); err != nil {
return nil, fmt.Errorf("failed to register legacy ExternalModel reconciler for plugin '%s' - %w", ModelProviderResolverPluginType, err)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Typed and legacy ExternalModel objects will clobber each other in the shared store.

This PR wires both reconcilers into one infoStore, but lookup is still just namespace/name. If both CRDs coexist with the same path during migration, whichever reconcile runs last wins, and deleting either CRD removes the shared cache entry for both. Key the cache by GVK as well, or keep separate typed/legacy model stores and define explicit precedence in ProcessRequest.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/plugins/model-provider-resolver/plugin.go` around lines 102 - 107, The
shared store currently keys ExternalModel entries only by "namespace/name",
causing typed and legacy ExternalModel instances to clobber each other; update
the storage and lookup so the cache key includes the object's GVK (or
alternatively create two distinct stores e.g., typedStore and legacyStore) and
adjust the reconciler wiring (legacyExternalModelReconciler, the typed
ExternalModel reconciler created via reconcilerBuilder()) to write to the
GVK-aware store; then update ProcessRequest to perform lookups by
GVK+namespace/name (or check legacyStore then typedStore with a defined
precedence) so coexistence during migration is safe and deletions only remove
the correct GVK-scoped entry.

…erride

- Typed ExternalModel reconciler for inference.opendatahub.io with
  provider store cross-reference
- Model config overrides provider config (mergeConfig)
- Model auth overrides provider auth (optional Auth on ExternalProviderRef)
- Cross-watch: ExternalProvider changes trigger model re-reconciliation
- Legacy maas.opendatahub.io reconciler for backward compatibility
- ModelConfigKey for config propagation via CycleState
- Scheme registration in factory for typed client support
@yossiovadia yossiovadia force-pushed the feat/model-store-crosswatch branch from 6e742da to aef4953 Compare May 19, 2026 15:31
@yossiovadia yossiovadia requested a review from nirrozenbaum May 19, 2026 16:20
@rhods-ci-bot
Copy link
Copy Markdown

@yossiovadia: The following test has Succeeded:

OCI Artifact Browser URL

View in Artifact Browser

Inspecting Test Artifacts Manually

To inspect your test artifacts manually, follow these steps:

  1. Install ORAS (see the ORAS installation guide).
  2. Download artifacts with the following commands:
mkdir -p oras-artifacts
cd oras-artifacts
oras pull quay.io/opendatahub/odh-ci-artifacts:ai-gateway-group-test-fh9nn

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants