
feat: add supported/experimental tier classification to model-selection algorithms (#1514)#1693

Merged
rootfs merged 12 commits into vllm-project:main from szedan-rh:issues_1514
Apr 13, 2026

@szedan-rh
Collaborator

  • Add supported / experimental tier classification to all 12 model-selection algorithms via the Selector interface
  • Emit prominent startup warnings when operators configure experimental algorithms in decisions
  • Health-check external service dependencies (AutoMix verifier, Router-R1 server) at startup with clear UNREACHABLE logs
  • Add tier label to Prometheus metrics (model_selection_total, model_selection_duration_seconds, model_selection_confidence) for dashboard filtering
  • Add model_selection_dependency_health gauge for external dependency monitoring
  • Replace flat algorithm type list with structured catalog in routing_surface_catalog.go

@netlify

netlify Bot commented Mar 31, 2026

Deploy Preview for vllm-semantic-router ready!

Name Link
🔨 Latest commit ba434d0
🔍 Latest deploy log https://app.netlify.com/projects/vllm-semantic-router/deploys/69dcac6cd70f7e0009d0c9f1
😎 Deploy Preview https://deploy-preview-1693--vllm-semantic-router.netlify.app

@github-actions
Contributor

github-actions Bot commented Mar 31, 2026

👥 vLLM Semantic Team Notification

The following members have been identified for the changed files in this PR and have been automatically assigned:

📁 config

Owners: @rootfs, @Xunzhuo
Files changed:

  • config/config.yaml

📁 ml-binding

Owners: @rootfs, @Xunzhuo, @szedan-rh, @abdallahsamabd, @asaadbalum, @noalimoy
Files changed:

  • ml-binding/Cargo.toml

📁 src/semantic-router

Owners: @rootfs, @Xunzhuo, @szedan-rh, @yehuditkerido, @abdallahsamabd, @asaadbalum, @liavweiss, @noalimoy
Files changed:

  • src/semantic-router/pkg/config/fragment_catalog_test.go
  • src/semantic-router/pkg/config/routing_surface_catalog.go
  • src/semantic-router/pkg/config/routing_surface_catalog_test.go
  • src/semantic-router/pkg/extproc/req_filter_classification.go
  • src/semantic-router/pkg/extproc/router_selection.go
  • src/semantic-router/pkg/selection/factory.go
  • src/semantic-router/pkg/selection/metrics.go
  • src/semantic-router/pkg/selection/metrics_test.go
  • src/semantic-router/pkg/selection/rl_driven.go
  • src/semantic-router/pkg/selection/selector.go
  • src/semantic-router/pkg/selection/tier_declarations.go
  • src/semantic-router/pkg/selection/tier_test.go

📁 website

Owners: @Xunzhuo, @samzong, @yuluo-yx
Files changed:

  • website/docs/tutorials/algorithm/selection/mlp.md


🎉 Thanks for your contributions!

This comment was automatically generated based on the OWNER files in the repository.

@github-actions
Contributor

github-actions Bot commented Mar 31, 2026

✅ Supply Chain Security Report — All Clear

Scanner Status Findings
AST Codebase Scan (Py, Go, JS/TS, Rust) 27 finding(s) — MEDIUM: 21 · LOW: 6
AST PR Diff Scan No issues detected
Regex Fallback Scan No issues detected

Scanned at 2026-04-13T08:46:39.918Z · View full workflow logs

@szedan-rh
Collaborator Author

@rootfs / @Xunzhuo - Could you please take a look?

Contributor

Copilot AI left a comment


Pull request overview

Adds tier classification and dependency introspection to model-selection algorithms, with associated startup logging/warnings, health checks, and Prometheus labeling/metrics to improve operational clarity around “supported” vs “experimental” selectors.

Changes:

  • Extend Selector interface with Tier() and ExternalDependencies() and plumb tier into SelectionResult.
  • Add tier label to selection metrics and introduce a dependency health gauge.
  • Add startup logging for registered algorithms, plus warnings and dependency health checks for any experimental algorithms that are configured; update the config catalog to include tiers.

Reviewed changes

Copilot reviewed 22 out of 22 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
website/docs/tutorials/algorithm/selection/mlp.md Documents MLP as experimental.
src/semantic-router/pkg/selection/static.go Implements Tier()/ExternalDependencies() for Static selector.
src/semantic-router/pkg/selection/selector_test.go Adds tests covering tier/dependency constants and selector tier behavior.
src/semantic-router/pkg/selection/selector.go Introduces tier/dependency types; extends Selector; populates SelectionResult.Tier.
src/semantic-router/pkg/selection/router_dc.go Implements Tier()/ExternalDependencies() for RouterDC.
src/semantic-router/pkg/selection/rl_driven.go Implements tier/deps; updates RL metrics recording to include tier.
src/semantic-router/pkg/selection/ml_adapter.go Marks ML adapters experimental and declares pretrained-model dependency.
src/semantic-router/pkg/selection/metrics_test.go Updates metrics tests for tier labels and new recording APIs.
src/semantic-router/pkg/selection/metrics.go Adds tier labels; adds dependency health gauge; introduces tier-aware recorders.
src/semantic-router/pkg/selection/latency_aware.go Implements Tier()/ExternalDependencies() for LatencyAware.
src/semantic-router/pkg/selection/hybrid.go Implements Tier()/ExternalDependencies() for Hybrid.
src/semantic-router/pkg/selection/gmtrouter.go Implements tier/deps for GMTRouter.
src/semantic-router/pkg/selection/factory.go Adds algorithm registry logging, experimental warnings, and dependency health checks.
src/semantic-router/pkg/selection/elo.go Implements Tier()/ExternalDependencies() for Elo.
src/semantic-router/pkg/selection/automix.go Implements tier/deps for AutoMix (verifier service dependency).
src/semantic-router/pkg/extproc/router_selection.go Wires startup warnings and dependency checks based on configured decisions.
src/semantic-router/pkg/extproc/req_filter_classification.go Records selection metrics with tier.
src/semantic-router/pkg/config/routing_surface_catalog_test.go Adds tests for catalog tiers and backward compatibility.
src/semantic-router/pkg/config/routing_surface_catalog.go Replaces flat algorithm list with structured catalog including tier.
src/semantic-router/pkg/config/fragment_catalog_test.go Adds selection fragment mapping for mlp.
docs/superpowers/specs/2026-03-29-algorithm-tier-classification-design.md Adds design spec for tier classification and guardrails.
config/config.yaml Adds example decision using experimental mlp algorithm.

Comment thread src/semantic-router/pkg/selection/metrics.go
Comment thread src/semantic-router/pkg/selection/factory.go
Comment thread src/semantic-router/pkg/selection/metrics.go
Comment thread src/semantic-router/pkg/selection/factory.go
Comment thread src/semantic-router/pkg/selection/latency_aware.go Outdated
@Whatsonyourmind

The tier classification (supported vs experimental) for model-selection algorithms is the right governance pattern. A mathematical framework for the tier criteria:

Promotion criteria from experimental to supported:

  1. Regret bound: The algorithm should have a proven or empirical sublinear regret bound. For K models over T requests, cumulative regret should be O(sqrt(KT)) or better. Algorithms without this guarantee (pure heuristic routing) stay experimental.

  2. Convergence speed: Measure requests-to-convergence on a synthetic benchmark (e.g., 5 models with known quality distributions). Supported algorithms should converge to within 5% of the oracle policy within 500 requests.

  3. Robustness to non-stationarity: When model quality shifts (new version, degradation), supported algorithms should detect and adapt within 2x the convergence time. Test by introducing a step-change in one model's quality mid-benchmark.

For the 12 algorithms specifically:

Algorithms with theoretical guarantees (strong supported candidates):

  • Thompson Sampling: O(sqrt(KT log K)) regret, adapts to non-stationarity with windowed posteriors
  • UCB1: O(sqrt(KT log T)) worst-case regret, deterministic, reproducible
  • LinUCB: O(d * sqrt(T * log T)) regret with d context dimensions
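
To make the bandit framing concrete, here is a small UCB1 sketch in which each arm stands for one candidate model. It is illustrative only: none of the names below exist in this PR, and the fixed per-arm rewards are made-up stand-ins for observed response quality.

```go
package main

import (
	"fmt"
	"math"
)

// ucb1 keeps per-arm pull counts and reward sums.
type ucb1 struct {
	counts []int
	sums   []float64
	total  int
}

func newUCB1(arms int) *ucb1 {
	return &ucb1{counts: make([]int, arms), sums: make([]float64, arms)}
}

// pick returns an untried arm if one exists; otherwise the arm that
// maximizes mean reward plus the exploration bonus sqrt(2 ln t / n).
func (u *ucb1) pick() int {
	for i, n := range u.counts {
		if n == 0 {
			return i
		}
	}
	best, bestScore := 0, math.Inf(-1)
	for i, n := range u.counts {
		mean := u.sums[i] / float64(n)
		bonus := math.Sqrt(2 * math.Log(float64(u.total)) / float64(n))
		if score := mean + bonus; score > bestScore {
			best, bestScore = i, score
		}
	}
	return best
}

// update records the observed reward for the chosen arm.
func (u *ucb1) update(arm int, reward float64) {
	u.counts[arm]++
	u.sums[arm] += reward
	u.total++
}

func main() {
	// Hypothetical fixed quality scores for three models.
	rewards := []float64{0.2, 0.8, 0.5}
	u := newUCB1(len(rewards))
	for i := 0; i < 1000; i++ {
		arm := u.pick()
		u.update(arm, rewards[arm])
	}
	fmt.Println("pulls per model:", u.counts)
}
```

With deterministic rewards the run is fully reproducible, which is the property the comment highlights for UCB1; real quality signals are noisy, so pull counts would vary between runs.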

Algorithms without guarantees (experimental until empirically validated):

  • Rule-based routing (no learning, no adaptation)
  • Cascade routing (order-dependent, no exploration)
  • External service dependencies (AutoMix, Router-R1 — availability risk)

Health check pattern for external dependencies: Beyond startup UNREACHABLE logs, consider a circuit breaker: if the external service fails 3 consecutive times within 30 seconds, fall back to the best local algorithm (Thompson Sampling) until the external service recovers. This prevents external service flakiness from degrading production routing.

szedan-rh and others added 6 commits April 8, 2026 23:22
…#1514)

Design for adding supported/experimental tier classification to
model-selection algorithms so operators can distinguish production-ready
algorithms from research experiments at config and runtime level.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Senan Zedan <szedan@redhat.com>
…-project#1514)

9-task TDD implementation plan covering interface changes, tier methods
on all 12 algorithms, startup warnings, metrics labels, structured
config catalog, and integration testing.

Signed-off-by: Senan Zedan <szedan@redhat.com>
…nterface (vllm-project#1514)

This commit adds foundational types for algorithm tier classification:

- AlgorithmTier enum with TierSupported and TierExperimental values
- DependencyType enum for external_service, pretrained_model, and embedding_function
- Dependency struct to describe external dependencies (name, type, description, health URL, required flag)
- Tier field added to SelectionResult to track production readiness
- Tier() and ExternalDependencies() methods added to Selector interface

Tests verify the new types and constants compile correctly.

Note: Existing algorithm structs (EloSelector, AutoMixSelector, etc.) do not yet
implement the new interface methods - this is intentional and will be addressed
in subsequent commits. The package will not build until Tasks 2 and 3 add the
implementations to all 12 algorithm structs.

Signed-off-by: Senan Zedan <szedan@redhat.com>
…ect#1514)

Add Tier() and ExternalDependencies() methods to the 5 supported
algorithm selectors (static, elo, router_dc, latency_aware, hybrid).
All return TierSupported and empty dependency lists.

Add TestSupportedAlgorithms_Tier to verify correct implementation.

Signed-off-by: Senan Zedan <szedan@redhat.com>
…roject#1514)

Add Tier() and ExternalDependencies() methods to experimental algorithm selectors:
- automix.go: AutoMixSelector now declares TierExperimental and optional verifier service dependency
- rl_driven.go: RLDrivenSelector now declares TierExperimental and optional Router-R1 service dependency
- gmtrouter.go: GMTRouterSelector now declares TierExperimental and optional pretrained graph model dependency
- ml_adapter.go: MLSelectorAdapter now declares TierExperimental and required pretrained model dependency

Tests added:
- TestExperimentalAlgorithms_Tier: Verifies tier and dependencies for automix, rl_driven, gmtrouter
- TestMLAdapterAlgorithms_Tier: Verifies tier and pretrained model dependencies for KNN, KMeans, SVM, MLP
- TestAutoMixSelector_DependenciesWithVerifier: Verifies conditional external service dependency declaration

All 12 algorithms now implement the full Selector interface with tier classification.
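
For readers skimming the commit history, the types and methods named in these commits might look roughly like the sketch below. The tier and dependency-type string values are taken from the commit messages; the Go constant names, the `Dependency` field names, the two toy selectors, and the `/health` path are assumptions, and the real `Selector` interface also carries selection methods that are omitted here.

```go
package main

import "fmt"

// AlgorithmTier marks how production-ready a selection algorithm is.
type AlgorithmTier string

const (
	TierSupported    AlgorithmTier = "supported"
	TierExperimental AlgorithmTier = "experimental"
)

// DependencyType classifies what a selector needs beyond its own code.
type DependencyType string

const (
	DependencyExternalService   DependencyType = "external_service"
	DependencyPretrainedModel   DependencyType = "pretrained_model"
	DependencyEmbeddingFunction DependencyType = "embedding_function"
)

// Dependency describes one external requirement (field names assumed).
type Dependency struct {
	Name        string
	Type        DependencyType
	Description string
	HealthURL   string
	Required    bool
}

// Selector shows only the two methods added for tier classification.
type Selector interface {
	Tier() AlgorithmTier
	ExternalDependencies() []Dependency
}

// staticSelector is a toy supported selector with no dependencies.
type staticSelector struct{}

func (staticSelector) Tier() AlgorithmTier                { return TierSupported }
func (staticSelector) ExternalDependencies() []Dependency { return nil }

// autoMixSelector is a toy experimental selector that declares its
// verifier service only when a verifier URL is configured.
type autoMixSelector struct{ verifierURL string }

func (autoMixSelector) Tier() AlgorithmTier { return TierExperimental }

func (s autoMixSelector) ExternalDependencies() []Dependency {
	if s.verifierURL == "" {
		return nil
	}
	return []Dependency{{
		Name:        "automix-verifier",
		Type:        DependencyExternalService,
		Description: "optional answer verifier",
		HealthURL:   s.verifierURL + "/health", // hypothetical health path
		Required:    false,
	}}
}

func main() {
	selectors := []Selector{
		staticSelector{},
		autoMixSelector{verifierURL: "http://verifier.example:8080"},
	}
	for _, s := range selectors {
		fmt.Printf("%T tier=%s dependencies=%d\n", s, s.Tier(), len(s.ExternalDependencies()))
	}
}
```

Declaring the dependency conditionally means startup checks only probe services the operator actually configured.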

Signed-off-by: Senan Zedan <szedan@redhat.com>
…health gauge (vllm-project#1514)

This commit adds tier-awareness to the Prometheus metrics system and introduces
a new dependency health gauge for monitoring external service availability.

Changes:
- Add "tier" label to ModelSelectionTotal, ModelSelectionDuration, and ModelSelectionConfidence
- Add new ModelSelectionDependencyHealth gauge for tracking dependency reachability
- Update all metric call sites to pass tier labels (empty string for legacy calls)
- Add RecordSelectionWithTier() for tier-aware recording
- Add RecordDependencyHealth() for dependency monitoring
- Update preInitializeMetrics() to pre-populate tier-labeled metrics
- Add tests for new tier-aware functions

The tier label enables operators to filter Prometheus dashboards by algorithm
stability (tier="supported" vs tier="experimental"), providing better visibility
into production-ready vs research-grade algorithm usage.
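
The labeling scheme can be modeled without the Prometheus client library, as in the stdlib-only sketch below. It is a stand-in for a labeled counter vector, not the PR's metrics code: the `algorithm` and `model` labels, the method signatures, and `totalByTier` are assumptions; only the `tier` label, its values, and the empty string for legacy call sites come from the commit message.

```go
package main

import (
	"fmt"
	"sync"
)

// labelSet mirrors the labels a tier-aware selection counter might carry.
type labelSet struct {
	Algorithm string
	Model     string
	Tier      string // "supported", "experimental", or "" for legacy calls
}

// selectionCounter is a minimal stand-in for a labeled counter vector.
type selectionCounter struct {
	mu     sync.Mutex
	counts map[labelSet]float64
}

func newSelectionCounter() *selectionCounter {
	return &selectionCounter{counts: make(map[labelSet]float64)}
}

// RecordSelectionWithTier increments the series for the full label set.
func (c *selectionCounter) RecordSelectionWithTier(algorithm, model, tier string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.counts[labelSet{Algorithm: algorithm, Model: model, Tier: tier}]++
}

// RecordSelection keeps a legacy-style signature and records an empty tier.
func (c *selectionCounter) RecordSelection(algorithm, model string) {
	c.RecordSelectionWithTier(algorithm, model, "")
}

// totalByTier sums every series with the given tier, similar in spirit
// to aggregating a dashboard query by the tier label.
func (c *selectionCounter) totalByTier(tier string) float64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	var total float64
	for labels, n := range c.counts {
		if labels.Tier == tier {
			total += n
		}
	}
	return total
}

func main() {
	c := newSelectionCounter()
	c.RecordSelectionWithTier("elo", "model-a", "supported")
	c.RecordSelectionWithTier("mlp", "model-b", "experimental")
	c.RecordSelection("static", "model-a") // legacy call site, empty tier

	for _, tier := range []string{"supported", "experimental", ""} {
		fmt.Printf("tier=%q selections=%v\n", tier, c.totalByTier(tier))
	}
}
```

One consequence worth noting: legacy call sites produce series with an empty tier value, so a dashboard that filters on the two named tiers will not count them.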

Signed-off-by: Senan Zedan <szedan@redhat.com>
vllm-project#1514)

Replace the flat `supportedDecisionAlgorithmTypes` string list with a
structured `decisionAlgorithmCatalog` that includes tier information
("supported" or "experimental") for each algorithm type.

Key changes:
- Add `AlgorithmCatalogEntry` struct with Type and Tier fields
- Replace flat list with structured catalog in routing_surface_catalog.go
- Add `DecisionAlgorithmCatalog()` and `GetAlgorithmTier()` public functions
- Add "mlp" algorithm as experimental with supporting config fragments
- Maintain backwards compatibility via derived `supportedDecisionAlgorithmTypes`
- Add comprehensive tests for catalog and tier lookups

All existing callers of `IsSupportedDecisionAlgorithmType()` and
`SupportedDecisionAlgorithmTypes()` continue to work unchanged.
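
A minimal sketch of how such a catalog and its derived flat list could fit together. The identifiers follow the commit message, but the function signatures, the string-typed `Tier` field, and the four sample entries are assumptions; the sketch also assumes the derived list keeps every catalog entry regardless of tier, so that "supported" in the legacy names means "recognized type".

```go
package main

import "fmt"

// AlgorithmCatalogEntry pairs an algorithm type with its tier.
type AlgorithmCatalogEntry struct {
	Type string
	Tier string
}

// decisionAlgorithmCatalog holds a small sample of entries, not the full list.
var decisionAlgorithmCatalog = []AlgorithmCatalogEntry{
	{Type: "static", Tier: "supported"},
	{Type: "elo", Tier: "supported"},
	{Type: "automix", Tier: "experimental"},
	{Type: "mlp", Tier: "experimental"},
}

// supportedDecisionAlgorithmTypes is derived from the catalog so callers
// that only need the flat list of type names keep working.
var supportedDecisionAlgorithmTypes = func() []string {
	types := make([]string, 0, len(decisionAlgorithmCatalog))
	for _, entry := range decisionAlgorithmCatalog {
		types = append(types, entry.Type)
	}
	return types
}()

// DecisionAlgorithmCatalog returns a copy so callers cannot mutate the catalog.
func DecisionAlgorithmCatalog() []AlgorithmCatalogEntry {
	return append([]AlgorithmCatalogEntry(nil), decisionAlgorithmCatalog...)
}

// GetAlgorithmTier looks up the tier for a type; ok is false if unknown.
func GetAlgorithmTier(algorithmType string) (tier string, ok bool) {
	for _, entry := range decisionAlgorithmCatalog {
		if entry.Type == algorithmType {
			return entry.Tier, true
		}
	}
	return "", false
}

// IsSupportedDecisionAlgorithmType reports whether the type is recognized.
func IsSupportedDecisionAlgorithmType(algorithmType string) bool {
	for _, t := range supportedDecisionAlgorithmTypes {
		if t == algorithmType {
			return true
		}
	}
	return false
}

func main() {
	for _, name := range []string{"elo", "mlp", "unknown"} {
		tier, ok := GetAlgorithmTier(name)
		fmt.Printf("type=%s recognized=%v tier=%q\n", name, ok, tier)
	}
}
```

Deriving the flat list from the catalog keeps a single source of truth, so adding an algorithm means adding one catalog entry.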

Signed-off-by: Senan Zedan <szedan@redhat.com>
Signed-off-by: Senan Zedan <szedan@redhat.com>
@szedan-rh szedan-rh force-pushed the issues_1514 branch 7 times, most recently from 785554f to 25a211f Compare April 12, 2026 14:04
@szedan-rh
Collaborator Author

@rootfs / @Xunzhuo - Could you please review?

Signed-off-by: Senan Zedan <szedan@redhat.com>
@rootfs rootfs merged commit f7b5efa into vllm-project:main Apr 13, 2026
32 checks passed