Skip to content

fix(BA-5967): pre-validate slot types in sokovan validator chain#11515

Open
fregataa wants to merge 2 commits intomainfrom
fix/BA-5967-sokovan-validate-slot-types
Open

fix(BA-5967): pre-validate slot types in sokovan validator chain#11515
fregataa wants to merge 2 commits intomainfrom
fix/BA-5967-sokovan-validate-slot-types

Conversation

@fregataa
Copy link
Copy Markdown
Member

@fregataa fregataa commented May 8, 2026

Summary

  • Adds ImageSlotTypeRule and RequestedSlotTypeRule to the Sokovan SessionSpec validator chain so slot keys absent from the cluster's known_slot_types are caught with a clear 4xx before reaching ResourceLimitRule.
  • Without this, ResourceLimitRule calls ResourceSlot.to_humanized() to format an error message, and a missing slot key turns the original 4xx into a 500 (the bug reported in BA-5967).
  • Both rules no-op when known_slot_types is empty, matching the existing raw-fallback path that ResourceLimitRule._humanize_slots uses on freshly bootstrapped clusters.

Test plan

  • pants test tests/unit/manager/sokovan/scheduling_controller/validators/test_session_spec_rules.py — new TestImageSlotTypeRule and TestRequestedSlotTypeRule cover known-slot pass, mismatched-slot rejection, and empty-known-slot-types skip.
  • pants fmt/fix/lint/check on changed files.

🤖 Generated with Claude Code

Copilot AI review requested due to automatic review settings May 8, 2026 04:12
@github-actions github-actions Bot added size:L 100~500 LoC comp:manager Related to Manager component labels May 8, 2026
fregataa added a commit that referenced this pull request May 8, 2026
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens the Sokovan SessionSpec validator chain by pre-validating slot keys against the cluster’s configured/registered known_slot_types, preventing a 4xx validation failure from being converted into a 500 when downstream rules attempt to humanize unknown slot types.

Changes:

  • Added ImageSlotTypeRule and RequestedSlotTypeRule to reject unknown slot keys early (or no-op when known_slot_types is empty).
  • Wired the new rules into the scheduling controller’s validator chain ahead of ResourceLimitRule.
  • Added unit tests covering pass/reject/skip behavior for both rules.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
src/ai/backend/manager/sokovan/scheduling_controller/validators/image_slot_type_rule.py New rule validating image-declared slot keys against known_slot_types.
src/ai/backend/manager/sokovan/scheduling_controller/validators/requested_slot_type_rule.py New rule validating user-requested slot keys against known_slot_types.
src/ai/backend/manager/sokovan/scheduling_controller/validators/__init__.py Re-exported new validator rules.
src/ai/backend/manager/sokovan/scheduling_controller/scheduling_controller.py Inserted the new rules into the validator chain before ResourceLimitRule.
tests/unit/manager/sokovan/scheduling_controller/validators/test_session_spec_rules.py Added test coverage for the two new validator rules.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@fregataa fregataa marked this pull request as draft May 8, 2026 04:44
@fregataa fregataa force-pushed the fix/BA-5967-sokovan-validate-slot-types branch from c14d730 to 78122d4 Compare May 8, 2026 05:59
@fregataa fregataa marked this pull request as ready for review May 8, 2026 06:00
@fregataa fregataa requested a review from a team May 8, 2026 06:00
@fregataa fregataa force-pushed the fix/BA-5967-sokovan-validate-slot-types branch from 78122d4 to 0ae6197 Compare May 8, 2026 06:08
Comment thread src/ai/backend/manager/repositories/scheduler/db_source/db_source.py Outdated
@fregataa fregataa force-pushed the fix/BA-5967-sokovan-validate-slot-types branch 2 times, most recently from 410ebf8 to 4d766c6 Compare May 8, 2026 14:58
Add two SessionSpec validator rules so slot keys absent from the
target resource group's agent inventory are caught with a clear 4xx
before they reach ResourceLimitRule, where the downstream humanizer
would otherwise escalate the missing key into a 500.

ImageSlotTypeRule rejects images that declare a resource_spec slot
the target resource group has no agent for. RequestedSlotTypeRule
rejects caller-supplied resource entries with the same property. Both
rules also reject when the RG has no non-terminated agents at all.

The per-RG slot inventory is sourced from the same ScalingGroupRow
fetch via selectinload over agents -> agent_resource_rows, with
TERMINATED agents filtered out in Python. The new bundle dataclass
_ScalingGroupWithAgentResources keeps the SG row and its filtered
agent list together so the inventory snapshot stays consistent with
the same readonly transaction. The fetch helper raises
ScalingGroupNotFound on missing RG, so the caller no longer needs a
separate None check.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@fregataa fregataa force-pushed the fix/BA-5967-sokovan-validate-slot-types branch from 4d766c6 to 9810ac4 Compare May 8, 2026 15:23
@fregataa fregataa force-pushed the fix/BA-5967-sokovan-validate-slot-types branch from b8aee32 to 9810ac4 Compare May 8, 2026 17:00
… test

Removing the mock was scoped under BA-5984 (the broader migration off
the etcd-backed resource_slots query), so this PR should leave it in
place as it was before BA-5967 touched the file.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@fregataa fregataa requested a review from a team May 8, 2026 17:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp:manager Related to Manager component size:L 100~500 LoC

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants