feat: typed ExternalModel reconciler with cross-watch and legacy support#286
feat: typed ExternalModel reconciler with cross-watch and legacy support#286yossiovadia wants to merge 1 commit into
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Central YAML (base), Organization UI (inherited) Review profile: CHILL Plan: Enterprise Run ID: 📒 Files selected for processing (4)
🚧 Files skipped from review as they are similar to previous changes (4)
📝 WalkthroughWalkthroughAdds an exported ModelConfigKey and a config map field to externalModelInfo. Implements a typed inferencev1alpha1 ExternalModel reconciler that looks up providers, merges provider+model config via mergeConfig, applies auth secret overrides, and writes externalModelInfo into a shared infoStore; retains a legacy unstructured reconciler for the MaaS GVK. The plugin registers the inference scheme, wires typed reconcilers and a provider→model watch, and ProcessRequest enforces /chat/completions, validates model name, returns nil on missing store entries, and writes ModelConfigKey only when config is non-empty. Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Security and Logic Notes
Action items (concrete):
🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: yossiovadia The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
3b2f788 to
6e742da
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
pkg/plugins/model-provider-resolver/external_model_reconciler.go (1)
76-80: ⚡ Quick winAlways copy provider config before caching model info.
When
ref.Configis empty,configaliasesproviderInfo.config. That exposes one shared map instance through every model using that provider, so any downstream mutation ofModelConfigKeycan leak across models and requests.Suggested fix
- // Model config overrides provider config - config := providerInfo.config - if len(ref.Config) > 0 { - config = mergeConfig(providerInfo.config, ref.Config) - } + // Always copy provider config; model config overrides provider values. + config := mergeConfig(providerInfo.config, ref.Config)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/plugins/model-provider-resolver/external_model_reconciler.go` around lines 76 - 80, When building the cached model config, avoid aliasing providerInfo.config into config when ref.Config is empty — always copy providerInfo.config first so downstream mutations (e.g., to ModelConfigKey) don't leak across models; update the logic around providerInfo.config/ref.Config to create a new map (or call mergeConfig with a copy) instead of assigning the original map, ensuring the cached model info holds an independent map instance.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/plugins/model-provider-resolver/external_model_reconciler.go`:
- Around line 70-74: When provider lookup via r.store.getProvider(providerKey)
returns not found, invalidate the cached externalModelInfo in the reconciler's
infoStore so ProcessRequest cannot use stale provider/secret/config data;
specifically, in the branch where providerInfo not found (the block that logs
"ExternalProvider not yet available, requeuing"), remove the corresponding
externalModelInfo entry from r.store/infoStore (e.g., call the store method that
deletes the model/provider mapping using the same providerKey or ref.Ref.Name)
before returning the requeue result.
In `@pkg/plugins/model-provider-resolver/plugin.go`:
- Around line 102-107: The shared store currently keys ExternalModel entries
only by "namespace/name", causing typed and legacy ExternalModel instances to
clobber each other; update the storage and lookup so the cache key includes the
object's GVK (or alternatively create two distinct stores e.g., typedStore and
legacyStore) and adjust the reconciler wiring (legacyExternalModelReconciler,
the typed ExternalModel reconciler created via reconcilerBuilder()) to write to
the GVK-aware store; then update ProcessRequest to perform lookups by
GVK+namespace/name (or check legacyStore then typedStore with a defined
precedence) so coexistence during migration is safe and deletions only remove
the correct GVK-scoped entry.
---
Nitpick comments:
In `@pkg/plugins/model-provider-resolver/external_model_reconciler.go`:
- Around line 76-80: When building the cached model config, avoid aliasing
providerInfo.config into config when ref.Config is empty — always copy
providerInfo.config first so downstream mutations (e.g., to ModelConfigKey)
don't leak across models; update the logic around providerInfo.config/ref.Config
to create a new map (or call mergeConfig with a copy) instead of assigning the
original map, ensuring the cached model info holds an independent map instance.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Central YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 5e2040df-6d1c-4e25-9928-1a0702ca9c03
📒 Files selected for processing (4)
pkg/plugins/common/state/state-keys.gopkg/plugins/model-provider-resolver/external_model_reconciler.gopkg/plugins/model-provider-resolver/plugin.gopkg/plugins/model-provider-resolver/store.go
| providerInfo, found := r.store.getProvider(providerKey) | ||
| if !found { | ||
| logger.Info("ExternalProvider not yet available, requeuing", "provider", ref.Ref.Name) | ||
| return ctrl.Result{RequeueAfter: 5 * time.Second}, nil | ||
| } |
There was a problem hiding this comment.
Invalidate the cached model entry when the provider lookup misses.
If a model was previously resolved and its provider is later deleted or renamed, this branch leaves the old externalModelInfo in infoStore, so ProcessRequest can keep routing with stale provider/secret/config data until another reconcile fixes it.
Suggested fix
providerKey := types.NamespacedName{Namespace: req.Namespace, Name: ref.Ref.Name}
providerInfo, found := r.store.getProvider(providerKey)
if !found {
+ r.store.deleteModel(req.NamespacedName)
logger.Info("ExternalProvider not yet available, requeuing", "provider", ref.Ref.Name)
return ctrl.Result{RequeueAfter: 5 * time.Second}, nil
}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@pkg/plugins/model-provider-resolver/external_model_reconciler.go` around
lines 70 - 74, When provider lookup via r.store.getProvider(providerKey) returns
not found, invalidate the cached externalModelInfo in the reconciler's infoStore
so ProcessRequest cannot use stale provider/secret/config data; specifically, in
the branch where providerInfo not found (the block that logs "ExternalProvider
not yet available, requeuing"), remove the corresponding externalModelInfo entry
from r.store/infoStore (e.g., call the store method that deletes the
model/provider mapping using the same providerKey or ref.Ref.Name) before
returning the requeue result.
| // Legacy: watch maas.opendatahub.io ExternalModels for backward compatibility | ||
| legacyObj := &unstructured.Unstructured{} | ||
| legacyObj.SetGroupVersionKind(legacyExternalModelGVK) | ||
| legacyReconciler := &legacyExternalModelReconciler{Reader: k8sClient, store: store} | ||
| if err := reconcilerBuilder().For(legacyObj).Named("legacy-externalmodel").Complete(legacyReconciler); err != nil { | ||
| return nil, fmt.Errorf("failed to register legacy ExternalModel reconciler for plugin '%s' - %w", ModelProviderResolverPluginType, err) |
There was a problem hiding this comment.
Typed and legacy ExternalModel objects will clobber each other in the shared store.
This PR wires both reconcilers into one infoStore, but lookup is still just namespace/name. If both CRDs coexist with the same path during migration, whichever reconcile runs last wins, and deleting either CRD removes the shared cache entry for both. Key the cache by GVK as well, or keep separate typed/legacy model stores and define explicit precedence in ProcessRequest.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@pkg/plugins/model-provider-resolver/plugin.go` around lines 102 - 107, The
shared store currently keys ExternalModel entries only by "namespace/name",
causing typed and legacy ExternalModel instances to clobber each other; update
the storage and lookup so the cache key includes the object's GVK (or
alternatively create two distinct stores e.g., typedStore and legacyStore) and
adjust the reconciler wiring (legacyExternalModelReconciler, the typed
ExternalModel reconciler created via reconcilerBuilder()) to write to the
GVK-aware store; then update ProcessRequest to perform lookups by
GVK+namespace/name (or check legacyStore then typedStore with a defined
precedence) so coexistence during migration is safe and deletions only remove
the correct GVK-scoped entry.
…erride - Typed ExternalModel reconciler for inference.opendatahub.io with provider store cross-reference - Model config overrides provider config (mergeConfig) - Model auth overrides provider auth (optional Auth on ExternalProviderRef) - Cross-watch: ExternalProvider changes trigger model re-reconciliation - Legacy maas.opendatahub.io reconciler for backward compatibility - ModelConfigKey for config propagation via CycleState - Scheme registration in factory for typed client support
6e742da to
aef4953
Compare
|
@yossiovadia: The following test has Succeeded: OCI Artifact Browser URLInspecting Test Artifacts ManuallyTo inspect your test artifacts manually, follow these steps:
mkdir -p oras-artifacts
cd oras-artifacts
oras pull quay.io/opendatahub/odh-ci-artifacts:ai-gateway-group-test-fh9nn |
Summary
Depends on #285. Updates the model-provider-resolver plugin to use typed clients for
inference.opendatahub.ioExternalModels with provider cross-watch, plus legacy backward compatibility formaas.opendatahub.io.4 files changed, 121 insertions, 46 deletions.
What this does
maas.opendatahub.ioExternalModel reconciler (unstructured, backward compat)ModelConfigKeyfor config propagation via CycleStateTest plan
Risk analysis
Depends on: #285
cc @nirrozenbaum
Summary by CodeRabbit
New Features
Improvements
Refactor