fix(BA-5996): propagate session network_type/network_id to scheduler launcher #11561

Merged
HyeockJinKim merged 1 commit into 25.15 from backport/11543-to-25.15
May 13, 2026

Conversation

@jopemachine
Member

This is a manual backport PR of #11543 to the 25.15 release.

The automated cherry-pick failed with CONFLICT (directory rename split) because:

  • On main, the PR touches three ScheduleDBSource query paths that build SessionDataForStart (_get_sessions_for_start, _fetch_sessions_for_start_by_ids, search_sessions_with_kernels_and_user).
  • On 25.15, only _get_sessions_for_start exists; the other two methods were introduced after 25.15.
  • The regression test added on main targets search_sessions_with_kernels_and_user, which doesn't exist here, so it is not backported.

Changes

  • Add SessionRow.network_type / SessionRow.network_id to the _get_sessions_for_start select and forward them into the SessionDataForStart constructor.
  • Add the changes/11543.fix.md news fragment.
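
The change to the query path can be pictured with a minimal, database-free sketch. Everything below is illustrative: `SessionDataForStartSketch`, `select_columns`, and `build_session_data` are hypothetical stand-ins rather than the manager's actual code, the column lists are abbreviated, and the sketch assumes the two network fields default to `None` when the query does not supply them.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional, Sequence


@dataclass(frozen=True)
class SessionDataForStartSketch:
    """Hypothetical stand-in for SessionDataForStart (the real class has more fields)."""

    session_id: str
    cluster_mode: str
    # Assumption: these fall back to None when the query does not provide them.
    network_type: Optional[str] = None
    network_id: Optional[str] = None


# Abbreviated column lists; only the two network columns are taken from the PR.
COLUMNS_BEFORE: Sequence[str] = ("session_id", "cluster_mode")
COLUMNS_AFTER: Sequence[str] = (*COLUMNS_BEFORE, "network_type", "network_id")


def select_columns(row: Mapping[str, Any], columns: Sequence[str]) -> dict:
    """Mimic a SELECT that returns only the listed columns of a stored row."""
    return {name: row[name] for name in columns}


def build_session_data(selected: Mapping[str, Any]) -> SessionDataForStartSketch:
    """Forward whatever the query returned into the data object."""
    return SessionDataForStartSketch(
        session_id=selected["session_id"],
        cluster_mode=selected["cluster_mode"],
        network_type=selected.get("network_type"),
        network_id=selected.get("network_id"),
    )


# A stored row that already carries a pre-created network.
stored_row = {
    "session_id": "s1",
    "cluster_mode": "single-node",
    "network_type": "PERSISTENT",
    "network_id": "net-1",
}

before_fix = build_session_data(select_columns(stored_row, COLUMNS_BEFORE))
after_fix = build_session_data(select_columns(stored_row, COLUMNS_AFTER))
```

In this sketch `before_fix` loses the network fields even though the stored row has them, while `after_fix` carries them through, which is the behavior the backport restores for `_get_sessions_for_start`.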

Original PR

#11543

Test plan

  • pants check on the touched file
  • pants lint on the touched file
  • Manual end-to-end: create a session with network_type=PERSISTENT and a pre-created network_id, observe the launcher takes the persistent branch instead of calling create_network
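
The manual check above can be approximated in isolation with a small sketch of the branching it is meant to observe. This is not the launcher's real code: `FakeNetworkPlugin`, `resolve_network`, and the `create_network(session_id)` signature are hypothetical (the actual `create_network` arguments are not shown in this PR), and `network_type` is modeled as a plain string rather than the manager's enum.

```python
from typing import List, Optional


class FakeNetworkPlugin:
    """Hypothetical plugin that records every network it is asked to create."""

    def __init__(self) -> None:
        self.created: List[str] = []

    def create_network(self, session_id: str) -> str:
        # Assumed signature, for illustration only.
        name = f"auto-{session_id}"
        self.created.append(name)
        return name


def resolve_network(
    session_id: str,
    network_type: Optional[str],
    network_id: Optional[str],
    plugin: FakeNetworkPlugin,
) -> str:
    """Reuse a pre-created network for PERSISTENT sessions; otherwise create one."""
    if network_type == "PERSISTENT" and network_id is not None:
        return network_id  # persistent branch: no create_network call
    return plugin.create_network(session_id)  # fallback branch


plugin = FakeNetworkPlugin()

# With the fields propagated, the pre-created network is reused.
reused = resolve_network("s1", "PERSISTENT", "net-1", plugin)
calls_after_reuse = list(plugin.created)

# With the fields missing (the pre-fix situation), a new network is created.
fallback = resolve_network("s1", None, None, plugin)
```

The recording list on the fake plugin plays the role of "observe the launcher" in the test plan: an empty list after the first call indicates the persistent branch was taken.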

…launcher (#11543)

Propagate `SessionRow.network_type` / `SessionRow.network_id` through
`ScheduleDBSource._get_sessions_for_start` into `SessionDataForStart`,
so the launcher correctly reuses pre-created networks for `PERSISTENT`
sessions instead of falling back to `network_plugin.create_network(...)`.

Note: On the 25.15 branch only `_get_sessions_for_start` produces
`SessionDataForStart`. The other two call sites that were patched on
main (`_fetch_sessions_for_start_by_ids`, `search_sessions_with_kernels_and_user`)
do not exist on 25.15, so this backport only adapts the single
existing query path. The regression test from main targets
`search_sessions_with_kernels_and_user` and is therefore not
applicable to 25.15.

Backported-from: main (26.4)
Backported-to: 25.15
Backport-of: 11543
Copilot AI review requested due to automatic review settings May 12, 2026 09:06
@jopemachine jopemachine added this to the 25.15 milestone May 12, 2026
@jopemachine jopemachine added comp:manager Related to Manager component size:L 100~500 LoC backport labels May 12, 2026
@github-actions github-actions Bot added the size:XS ~10 LoC label May 12, 2026
Contributor

Copilot AI left a comment

Pull request overview

Backports #11543 to the 25.15 release branch by ensuring session inter-container network configuration (network_type / network_id) is fetched from the DB and propagated into SessionDataForStart, so the scheduler/launcher can correctly reuse pre-created persistent networks.

Changes:

  • Extend _get_sessions_for_start to select SessionRow.network_type and SessionRow.network_id.
  • Forward network_type / network_id into SessionDataForStart.
  • Add a towncrier news fragment documenting the fix.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| src/ai/backend/manager/repositories/scheduler/db_source/db_source.py | Selects and propagates network_type/network_id into SessionDataForStart for the scheduler start path. |
| changes/11543.fix.md | Adds a release-note fragment describing the propagated network fields fix. |

Comment on lines 2940 to +2944
SessionRow.environ,
SessionRow.cluster_mode,
SessionRow.user_uuid,
SessionRow.network_type,
SessionRow.network_id,
@jopemachine jopemachine enabled auto-merge (squash) May 12, 2026 09:19
@HyeockJinKim HyeockJinKim disabled auto-merge May 13, 2026 02:07
@HyeockJinKim HyeockJinKim merged commit ff8dd79 into 25.15 May 13, 2026
35 checks passed
@HyeockJinKim HyeockJinKim deleted the backport/11543-to-25.15 branch May 13, 2026 02:07
Labels

backport comp:manager Related to Manager component size:L 100~500 LoC size:XS ~10 LoC

3 participants