feat(BA-6581): wire IdleCheckerRepository into idle-check reconciler source#12398
feat(BA-6581): wire IdleCheckerRepository into idle-check reconciler source#12398seedspirit wants to merge 9 commits into
Conversation
There was a problem hiding this comment.
Pull request overview
This PR replaces the placeholder idle-check reconciler source with a real data path. It introduces an IdleCheckerRepository (backed by a new IdleCheckerDBSource) that fetches a normalized IdleCheckSnapshot — sessions, their applicable scopes (resource group / project / domain), scope→checker bindings, and typed checker specs — and wires this repository through the orchestration DI and the reconciler stage factory so IdleCheckSource.fetch_reconcile_info returns the snapshot instead of an empty placeholder. It also makes IdleCheckerSpecABC.load dispatch by checker_type into concrete (currently field-less) spec classes that land their fields in follow-up stories. The change is read-only; the handler/applier remain placeholders.
Changes:
- New
idle_checkerrepository layer (repository,repositories,db_source, resulttypes) producingIdleCheckSnapshot, plus newScopeType/ScopeRef/IdleCheckSessionViewdata types. - DI wiring of
idle_checker_repositorythroughRepositories, both orchestration composers, sokovan input, and the reconciler stage factory, withIdleCheckReconcileInfoswitched fromsession_idstosnapshot. IdleCheckerSpecABC.loadnow dispatches onchecker_typetoSessionLifetimeSpec/NetworkTimeoutSpec/UtilizationSpec.
Reviewed changes
Copilot reviewed 19 out of 21 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
repositories/idle_checker/db_source/db_source.py |
New DB reads assembling the normalized snapshot (joins + correlated enabled-binding filter). |
repositories/idle_checker/repository.py |
Thin repository delegating to the db_source. |
repositories/idle_checker/repositories.py |
IdleCheckerRepositories container following the per-entity wrapper convention. |
repositories/idle_checker/types.py |
IdleCheckerRef, IdleCheckerBindingRef, IdleCheckSnapshot result types. |
repositories/repositories.py |
Registers idle_checker in the aggregate Repositories. |
data/idle_checker/types.py |
Adds ScopeType, ScopeRef, IdleCheckSessionView. |
models/idle_checker/spec.py |
load discriminated dispatch + concrete spec subclasses. |
sokovan/idle_check/source.py |
Injects repository; fetches real snapshot. |
sokovan/idle_check/types.py |
IdleCheckReconcileInfo carries snapshot; entity_ids derives from it. |
sokovan/stages/idle_check.py |
build_idle_check_stage now takes the repository. |
sokovan/stages/factory.py |
Threads idle_checker_repository into stage assembly. |
dependencies/orchestration/composer.py, dependencies/orchestration/sokovan.py, dependencies/composer.py |
DI plumbing of the new repository. |
tests/.../idle_check/test_source.py, test_stage.py, dependencies/orchestration/test_sokovan.py, test_composer.py |
Update/add tests for the new repository argument and source behavior. |
changes/12398.feature.md |
Changelog entry. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| async def fetch_idle_check_snapshot( | ||
| self, session_statuses: Collection[SessionStatus] | ||
| ) -> IdleCheckSnapshot: | ||
| """Fetch sessions and scope-bound idle checkers without expanding to session-checker rows.""" |
| sa.select( | ||
| SessionRow.id.label("session_id"), | ||
| SessionRow.created_at.label("session_created_at"), | ||
| SessionRow.starts_at.label("session_starts_at"), | ||
| ScalingGroupRow.id.label("resource_group_id"), | ||
| SessionRow.group_id.label("project_id"), | ||
| DomainRow.id.label("domain_id"), | ||
| ) | ||
| .select_from(SessionRow) | ||
| .join(DomainRow, SessionRow.domain_name == DomainRow.name) | ||
| .join(ScalingGroupRow, SessionRow.scaling_group_name == ScalingGroupRow.name) |
There was a problem hiding this comment.
Is there a reason to use a JOIN like this?
There was a problem hiding this comment.
This is to look up the ID in Scope. Since the current session row only contains the domain name and resource group name, a join is required to determine their IDs.
There was a problem hiding this comment.
If possible, I don't want to comment on naming, but wasn't it the convention to use the Data suffix for these data pattern types?
Is there a reason for omitting the Data suffix here?
I think it would be clearer if the type's purpose could be identified from its name alone.
1869788 to
9a884c5
Compare
| # today that is RUNNING only. Kept as a parameter so the policy stays in the | ||
| # stage's target_statuses rather than baked into this read. | ||
| SessionRow.status.in_(session_statuses), | ||
| SessionRow.session_type != SessionTypes.INFERENCE, |
There was a problem hiding this comment.
This condition (session_type) also can be a reconcile target
…source Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…tainer Construct it in an IdleCheckerRepositories container and pass it through OrchestrationInput like the other repositories, instead of building it inline in the orchestration composer. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…k read - Read resource_group_id/domain_id from SessionRow columns (BA-6644) instead of joining DomainRow/ScalingGroupRow in the batch fetch, correlating the enabled-binding EXISTS subquery on those columns too - Move the session_type filter into the stage's target_statuses as session_types, matching how session_statuses is owned by the stage - Populate the new NOT NULL scope-id columns in the repository fixtures Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
| # Inference sessions follow the deployment/routing lifecycle, not idle termination. | ||
| session_types=frozenset({ | ||
| SessionTypes.INTERACTIVE, | ||
| SessionTypes.BATCH, | ||
| SessionTypes.SYSTEM, | ||
| }), |
There was a problem hiding this comment.
Why is the system session targeted?
| def _enabled_binding_exists_query(self) -> sa.sql.elements.ColumnElement[bool]: | ||
| return sa.exists().where( | ||
| sa.and_( | ||
| IdleCheckerBindingRow.enabled == sa.true(), | ||
| sa.or_( | ||
| sa.and_( | ||
| IdleCheckerBindingRow.scope_type == ScopeType.RESOURCE_GROUP.value, | ||
| IdleCheckerBindingRow.scope_id == SessionRow.resource_group_id, | ||
| ), | ||
| sa.and_( | ||
| IdleCheckerBindingRow.scope_type == ScopeType.PROJECT.value, | ||
| IdleCheckerBindingRow.scope_id == SessionRow.group_id, | ||
| ), | ||
| sa.and_( | ||
| IdleCheckerBindingRow.scope_type == ScopeType.DOMAIN.value, | ||
| IdleCheckerBindingRow.scope_id == SessionRow.domain_id, | ||
| ), | ||
| ), | ||
| ) | ||
| ) |
There was a problem hiding this comment.
I'm a little disappointed with the query binding part—wouldn't it make more sense to query based on IdleChecker or the session?
Summary
IdleCheckerRepository(with its DB source and result types) that fetches a normalizedIdleCheckSnapshotof sessions, their applicable scopes, scope→checker bindings, and typed checker specs.IdleCheckSourcefetches real data instead of returning a placeholder.IdleCheckerSpec(atypediscriminator plus optional per-kind sub-specssession_lifetime/network/utilization), stored in theidle_checkers.specJSONB column via the existingPydanticColumn. Amodel_validatorenforces that exactly the matching sub-spec is present for the chosentype. This replaces the previousABCColumn+ hand-writtenserialize/loaddispatch. Concrete spec fields land with the checker-logic stories.CheckerTypediscriminator out of the ORM layer tocommon/data/idle_checker/types.py(pure Pydantic, ORM-agnostic). Manager-internal projections (ScopeType,ScopeRef,IdleCheckSessionView) stay inmanager/data.No DB migration: both
ABCColumnandPydanticColumnmap to JSONB and theidle_checkerstable is unreleased.Incidental test-infra fix
test: eagerly callensure_all_tables_registered()in the sharedtests/unit/manager/repositories/conftest.py.DomainRowvia query scopes). SQLAlchemy's globalconfigure_mappers()then fails to resolve string relationships across the rest of the graph (DomainRow → "ScalingGroupForDomainRow" → "AgentRow" → …). Onmainthis passed only by lucky Pants shard composition; adding a new test target in this PR reshuffled the shards and exposed it. Eager full-model registration makes mapper configuration independent of test sharding. (Verified: the affectedrepositories/base|rbac|opstests fail on puremainin isolation too.)Test plan
pants fmt / fix / lint / check / teston changed targetsrepositories/base|rbac|opstests +idle_checkertests pass locally--changed-sincesuite passes via the pre-push hookResolves BA-6581
🤖 Generated with Claude Code