feat(api): accept model overrides on session creation and add runtime model switching endpoints by dgageot · Pull Request #2791 · docker/docker-agent

dgageot · 2026-05-13T14:01:51Z

What changes

`POST /api/sessions` honours model selection (#2780)

The handler binds the request body to a session.Session so the existing
agent_model_overrides and custom_models_used JSON fields are now picked
up and copied onto the new session in SessionManager.CreateSession. The
runtime applies them the first time it is built for the session (see
runtimeForSession), so a freshly created session can be pinned to a
specific model without an extra round-trip.

Runtime model switching over HTTP (#2781)

Two new endpoints:

GET /api/sessions/:id/models → list of runtime.ModelChoice for the
session's current agent, with IsCurrent set on the active selection
(override or default), and any inline provider/model ref from the
session's history (or override) appended as a synthetic choice.
PATCH /api/sessions/:id/model → applies a per-agent override and
persists it. Empty model clears the override and reverts to the
agent's configured default.

POST /api/sessions/:id/model is also accepted because the existing
pkg/runtime Client.SetAgentModel (used by RemoteRuntime) was
historically a POST — so no client upgrade is required.

Status code mapping:

outcome	code
ok	200
invalid JSON body	400
no runtime attached for session	404
runtime doesn't support model switching	422
invalid model ref	400

Implementation notes

Reuses runtime.ModelChoice directly as the wire type (added JSON tags).
No duplicate pkg/api.ModelChoice.
The TUI App.AvailableModels now delegates to a shared
runtime.DecorateModelChoices helper, which is also used by
SessionManager.AvailableSessionModels. HTTP and TUI clients see the
same picker state.
SessionManager.SetSessionAgentModel snapshots state before mutation
and rolls back both the in-memory session and the runtime override
if sessionStore.UpdateSession fails.
New sentinel ErrSessionNotRunning distinguishes "session not found
or not running" from generic errors so the HTTP layer can map it to
404 deterministically.
runtimeForSession re-applies persisted overrides via a small
applyStoredOverrides helper. Failures are logged at WARN — a stored
override that no longer resolves must not prevent session resumption.

Tests

pkg/server/session_models_test.go (new) — end-to-end coverage for
GET/PATCH/POST, IsCurrent decoration, custom models appended, empty
body clears override, store-write rollback, runtime-failure rollback,
404 when no runtime attached, 422 when runtime doesn't support
switching, POST verb compat.
pkg/runtime/model_switcher_test.go — table test for
DecorateModelChoices corner cases.
pkg/server/session_manager_test.go — SupportsModelSwitching on
fakeRuntime.

Validation

task lint — clean.
go test -race -count=3 ./pkg/server/... ./pkg/runtime/... ./pkg/api/... — green.
Full repo go test — green except pre-existing environmental failures
in pkg/teamloader (require Docker Model Runner running locally,
unrelated to this PR).

Commits

feat: support model overrides when creating sessions
feat: add GET/PATCH/POST endpoints for runtime model switching
fix: add backward compat note for POST /sessions/:id/model
fix: thread-safe model reads and persistence rollback
test: enhance model selection test coverage and fix goroutine leaks
refactor(runtime): extract DecorateModelChoices and remove duplication
fix: distinguish "session not running" from other errors

Allow callers to specify agent_model_overrides and custom_models_used when creating sessions via POST /api/sessions, and wire the runtime with model switcher configuration to apply those overrides when the session is first run.

Add GET /api/sessions/:id/models to list available models and the current override (if any), and PATCH/POST /api/sessions/:id/model to apply a model override for the session's current agent. Extend ModelChoice with JSON tags for wire format and add SessionModelsResponse type. Both endpoints return 422 if the runtime doesn't support model switching.

PATCH is the canonical verb for updating the agent's model. POST is also accepted because pkg/runtime Client.SetAgentModel (used by RemoteRuntime) was historically a POST; keep both so a client upgrade is not required.

Fix a data race in AvailableSessionModels where rs.session is read without holding sm.mux (the lock protecting reads/writes of AgentModelOverrides and CustomModelsUsed by SetSessionAgentModel). In SetSessionAgentModel, snapshot the in-memory override state before mutating the runtime and session. If sessionStore.UpdateSession fails after the runtime mutation, roll back both the in-memory session state and the runtime override so callers don't observe a runtime/store mismatch on the next request. Fixes issues identified in review and validated with go test -race.

Extract a startAttachedServer helper that wires up a SessionManager + HTTP server with t.Cleanup(ln.Close) so the Serve goroutine exits cleanly when the test finishes. Replaces boilerplate in all model-switching tests and prevents goroutine leaks. Add tests for key picker scenarios: - TestAttachedServer_GetSessionModels_DefaultMarkedCurrent: default marked current when no override is active - TestAttachedServer_GetSessionModels_AppendsCustomModels: session's custom model history appears in the picker - TestSessionManager_SetSessionAgentModel_RollsBackOnStoreFailure: store failure rolls back both in-memory session and runtime state - TestDecorateModelChoices (table): corner cases for the decorate helper Update TestAttachedServer_GetSessionModels to verify the IsCurrent decoration flow end-to-end and to remove the manual IsCurrent: true from the fixture (now correctly set by the manager).

Map ErrSessionNotRunning to 404 so callers can tell apart a stale session ID from a server-side problem. Introduce the sentinel error and replace four ad-hoc errors.New() calls with it: SteerSession, FollowUpSession, AvailableSessionModels, SetSessionAgentModel. Update the HTTP handlers to check for this sentinel explicitly and return 404 accordingly. Also add TestAttachedServer_ModelEndpoints_404WhenNotRunning to verify both GET /sessions/:id/models and PATCH /sessions/:id/model return 404 when no runtime is attached.

docker-agent

Assessment: 🟡 NEEDS ATTENTION

This PR introduces runtime model switching (GET/PATCH/POST endpoints) and session-creation model overrides. The implementation is well-structured with a rollback path and a new error sentinel. Three findings were identified in the changed code.

docker-agent · 2026-05-13T14:05:34Z

+		customRefs = rs.session.CustomModelsUsed
+	}
+
+	choices := runtime.DecorateModelChoices(rs.runtime.AvailableModels(ctx), current, customRefs)


[HIGH] Lock held across blocking runtime I/O in AvailableSessionModels

sm.mux.Lock() / defer sm.mux.Unlock() wraps the entire function body, including the call to rs.runtime.AvailableModels(ctx) — a context-accepting call that can perform network I/O (e.g. models.dev catalog lookups, provider discovery). Any concurrent caller of RunSession, DeleteSession, SetSessionAgentModel, UpdateSessionTitle, or any other method that acquires sm.mux will block for the full duration of that network call.

The same pattern occurs in SetSessionAgentModel, which calls rs.runtime.SetAgentModel(ctx, agentName, modelRef) under sm.mux.

Fix: snapshot the rs pointer (and any session fields needed) under the lock, release the lock, then invoke the runtime method without holding sm.mux. The runtime is already concurrency-safe via its own internal locking.

docker-agent · 2026-05-13T14:05:34Z

+		case errors.Is(err, ErrSessionNotRunning):
+			return echo.NewHTTPError(http.StatusNotFound, err.Error())
+		default:
+			return echo.NewHTTPError(http.StatusBadRequest, err.Error())


[MEDIUM] Unknown errors from SetSessionAgentModel mapped to HTTP 400 instead of 500

The default branch here returns http.StatusBadRequest (400) for any error that is not ErrModelSwitchingNotSupported or ErrSessionNotRunning. HTTP 400 means the client sent an invalid request — but unexpected errors from the session store or from the runtime (e.g. a network failure, an internal validation error wrapped in fmt.Errorf) are server-side faults, not client mistakes.

By contrast, the companion getSessionModels handler (a few lines above) correctly uses http.StatusInternalServerError (500) in its default branch.

Fix: Change the default case to return http.StatusInternalServerError.

default: return echo.NewHTTPError(http.StatusInternalServerError, err.Error())

docker-agent · 2026-05-13T14:05:34Z

+	for i := range models {
+		if currentRef != "" {
+			if models[i].Ref == currentRef {
+				models[i].IsCurrent = true


[MEDIUM] DecorateModelChoices mutates the caller-supplied models slice in-place

models[i].IsCurrent = true and models[i].IsCurrent = models[i].IsDefault modify elements of the slice that was passed in by the caller. In Go, a slice header is copied at the call boundary but the underlying array is shared. If AvailableModels() returns a slice backed by an internal cache (rather than a freshly allocated copy), these index-assignment mutations will silently update the cached structs.

A subsequent call to AvailableModels() could then return entries already marked IsCurrent = true (or false) from a previous decoration call, producing incorrect picker state for a different session or agent.

Fix: Make a defensive copy of the input slice at the top of DecorateModelChoices:

result := make([]ModelChoice, len(models)) copy(result, models) // ... operate on result, not models

and return result instead of the mutated input.

dgageot · 2026-05-13T14:19:17Z

Thanks for the review! Addressed all three findings in 3 separate commits:

[MEDIUM] `DecorateModelChoices` mutates caller-supplied slice — fixed in `96ea8bb`

The function now allocates a result slice (sized len(models)+len(customRefs)+1 to fit the synthetic-override case), copies the input, and operates on the copy. Doc-comment updated to spell out the non-mutation guarantee. Added TestDecorateModelChoices/does_not_mutate_the_input_slice that snapshots the input before calling and asserts deep-equality after.

[MEDIUM] Unknown errors mapped to 400 — fixed in `ad1366a`

The handler's default branch now returns 500 (mirroring the companion getSessionModels handler). 400 is reserved for the upstream Bind failure on the request body. Added TestAttachedServer_SetSessionModel_StoreFailureReturns500 driving a synthetic UpdateSession failure through the HTTP layer and asserting 500.

[HIGH] Lock held across blocking runtime I/O — fixed in `be9e8ae`

Added a per-session modelSwitch sync.Mutex on activeRuntimes to serialise concurrent SetSessionAgentModel calls on the same session without blocking other sessions.
AvailableSessionModels now snapshots current and customRefs under sm.mux briefly, releases the lock, then calls runtime.AvailableModels(ctx).
SetSessionAgentModel is restructured so sm.mux is taken in three short critical sections (snapshot prev → mutate session → rollback in-memory) but released around the slow runtime calls (SetAgentModel apply + rollback) and the store write. Atomicity wrt other model-switches on the same session is preserved by the new modelSwitch lock; the rollback contract is unchanged.
Added two synchronisation-style tests that park a deliberately-slow runtime call open and verify that an unrelated SetSessionStarred (sm.mux-acquiring) completes within 2s while the runtime call is still in flight:
- TestSessionManager_AvailableSessionModels_DoesNotHoldMuxDuringRuntimeIO
- TestSessionManager_SetSessionAgentModel_DoesNotHoldMuxDuringRuntimeIO

Validation

task lint — clean.
go test -race -count=5 ./pkg/server/... — green.
go test -race -count=3 ./pkg/server/... ./pkg/runtime/... — green.

aheritier · 2026-05-17T17:19:42Z

+	group.GET("/sessions/:id/models", s.getSessionModels)
+	// PATCH is the canonical verb for updating the agent's model. POST is
+	// also accepted because pkg/runtime Client.SetAgentModel (used by
+	// RemoteRuntime) was historically a POST; keep both so a client
+	// upgrade is not required.
+	group.PATCH("/sessions/:id/model", s.setSessionModel)


Blocking — docs: The three new endpoints (GET /api/sessions/:id/models, PATCH /api/sessions/:id/model, POST /api/sessions/:id/model) aren't reflected in docs/features/api-server/index.md's endpoint table (around line 44). Worth adding rows so the API server doc stays in sync with the routes registered here. Also worth a short note that POST is accepted for backward compat with RemoteRuntime.

(Could be a follow-up doc PR if you'd rather — but since CHANGELOG is the only doc updated in this branch and that's auto-generated, the API page is the only place a user can actually discover these.)

aheritier · 2026-05-17T17:19:45Z

+		switch {
+		case errors.Is(err, ErrModelSwitchingNotSupported):
+			return echo.NewHTTPError(http.StatusUnprocessableEntity, err.Error())


Question: Mapping ErrModelSwitchingNotSupported to 422 is a defensible choice (the URL/method is fine, the runtime just can't honour it), but 501 Not Implemented or 409 Conflict are both more conventional for "this server doesn't support this capability" / "the resource is in a state that disallows this action". 422 traditionally signals "the body parsed but is semantically wrong", which doesn't really match here — there's no body for GET. Curious whether you considered 501 and rejected it. Not blocking, just worth a sentence in the doc-comment justifying the choice.

aheritier · 2026-05-17T17:20:32Z

 	return c.JSON(http.StatusOK, map[string]string{"status": "updated"})
 }

+// getSessionModels lists the models the user can pick from for the


Question (about the API surface, not this handler specifically): the existing createSession binds straight into session.Session and SessionManager.CreateSession cherry-picks the fields it honours (now extended to AgentModelOverrides / CustomModelsUsed). A client that POSTs {"input_tokens": 999, "starred": true, "id": "..."} will silently have those ignored — correct behaviour, but the contract is implicit.

Worth either (a) introducing an explicit api.CreateSessionRequest DTO so the wire surface is type-level visible, or (b) at minimum a doc-comment on CreateSession listing exactly which template fields are honoured. Suggestion only; not a blocker for this PR.

aheritier · 2026-05-17T17:20:34Z

+// session and store never observe interleaved updates. The manager-wide
+// sm.mux is only held briefly while reading or mutating session fields,
+// never while calling into the runtime or the store.
+func (sm *SessionManager) SetSessionAgentModel(ctx context.Context, sessionID, modelRef string) (string, string, error) {


Suggestion: SetSessionAgentModel returns (string, string, error) — readable inline but easy to misuse at call sites (which arg is agent, which is model?). The handler in server.go already destructures into agentName, modelRef. Consider returning a small named struct (e.g. type ModelSwitchResult struct { Agent, Model string }) to make the API self-documenting, or rename to (agentName, modelRef, err) in a doc-comment line. Same applies to AvailableSessionModels which returns four positional values. Not blocking.

aheritier · 2026-05-17T17:20:35Z

+		sm.mux.Unlock()
+
+		rollback := prevOverride
+		if !hadOverride {
+			rollback = ""
+		}
+		if rbErr := rs.runtime.SetAgentModel(ctx, agentName, rollback); rbErr != nil {


Suggestion: the rollback block uses two passes (in-memory under sm.mux, then runtime without it) which is fine, but the rollback ref calculation is duplicated:

rollback := prevOverride if !hadOverride { rollback = "" }

This same (prevOverride, hadOverride) → ref projection is what the in-memory rollback computes too. Worth extracting a tiny helper, e.g. prevRef := prevOverrideRef(hadOverride, prevOverride) and using it in both rollback branches and as the SetAgentModel arg, so a future reader doesn't have to re-derive that "missing key means clear" rule twice. Non-blocking.

aheritier · 2026-05-17T17:20:43Z

+// GET /api/sessions/:id/models; renaming a tag is a breaking change.
 type ModelChoice struct {
 	// Name is the display name (config key)
-	Name string
+	Name string `json:"name"`
 	// Ref is the model reference used internally (e.g., "my_model" or "openai/gpt-4o")
-	Ref string
+	Ref string `json:"ref"`
 	// Provider is the provider name (e.g., "openai", "anthropic")
-	Provider string
+	Provider string `json:"provider,omitempty"`
 	// Model is the specific model name (e.g., "gpt-4o", "claude-sonnet-4-0")
-	Model string
+	Model string `json:"model,omitempty"`
 	// IsDefault indicates this is the agent's configured default model
-	IsDefault bool
+	IsDefault bool `json:"is_default,omitempty"`
 	// IsCurrent indicates this is the currently active model for the agent
-	IsCurrent bool
+	IsCurrent bool `json:"is_current,omitempty"`
 	// IsCustom indicates this is a custom model from the session history (not from config)
-	IsCustom bool
+	IsCustom bool `json:"is_custom,omitempty"`
 	// IsCatalog indicates this is a model from the models.dev catalog
-	IsCatalog bool
+	IsCatalog bool `json:"is_catalog,omitempty"`

 	// The fields below are populated (best-effort) from the models.dev
 	// catalog. They are optional and may all be zero/empty when no
 	// catalog entry is found for the model.

 	// Family is the model family (e.g., "claude", "gpt").
-	Family string
+	Family string `json:"family,omitempty"`
 	// InputCost is the price (in USD) per 1M input tokens.
-	InputCost float64
+	InputCost float64 `json:"input_cost,omitempty"`
 	// OutputCost is the price (in USD) per 1M output tokens.
-	OutputCost float64
+	OutputCost float64 `json:"output_cost,omitempty"`
 	// CacheReadCost is the price (in USD) per 1M cached input tokens.
-	CacheReadCost float64
+	CacheReadCost float64 `json:"cache_read_cost,omitempty"`
 	// CacheWriteCost is the price (in USD) per 1M cache-write tokens.
-	CacheWriteCost float64
+	CacheWriteCost float64 `json:"cache_write_cost,omitempty"`
 	// ContextLimit is the maximum context window size in tokens.
-	ContextLimit int
+	ContextLimit int `json:"context_limit,omitempty"`
 	// OutputLimit is the maximum number of tokens the model can produce
 	// in a single response.
-	OutputLimit int64
+	OutputLimit int64 `json:"output_limit,omitempty"`
 	// InputModalities lists the input modalities supported by the model
 	// (e.g., "text", "image", "audio").
-	InputModalities []string
+	InputModalities []string `json:"input_modalities,omitempty"`
 	// OutputModalities lists the output modalities the model can produce.
-	OutputModalities []string
+	OutputModalities []string `json:"output_modalities,omitempty"`
+}
+


Suggestion: ModelChoice is now part of the wire format (per the new doc-comment) — every public field has a JSON tag. The catalog-only fields below (Family, InputCost, OutputCost, CacheReadCost, CacheWriteCost, ContextLimit, OutputLimit, InputModalities, OutputModalities) are now serialised on every response; that's a lot of optional surface to commit to. If any of those fields might rename or restructure (e.g. consolidating costs into a sub-object), now is the time to consider it before clients depend on them.

Also: Ref and Name carry no omitempty while everything else does — intentional? They'll always be present, so the asymmetry is fine, just calling it out.

aheritier · 2026-05-17T17:20:45Z

+	// If the override points at an inline provider/model not in the
+	// runtime's list nor in the session's history, fabricate a synthetic
+	// choice so the picker can still highlight the active selection.
+	if !currentFound && strings.Contains(currentRef, "/") {
+		prov, name, _ := strings.Cut(currentRef, "/")
+		result = append(result, ModelChoice{
+			Name:      currentRef,
+			Ref:       currentRef,
+			Provider:  prov,
+			Model:     name,
+			IsCurrent: true,
+			IsCustom:  true,


Question: the synthesized choice for an inline override only fires when the ref contains /. If the override is a config key (my_model) that has been removed from the agent config since the session was created, the picker silently won't highlight anything and IsCurrent is false everywhere. The unit test "non-provider override does not synthesize a fabricated choice" enshrines this as intentional — but is that really what callers want? An entry like {Name: currentRef, Ref: currentRef, IsCurrent: true, IsCustom: true} (with empty Provider/Model) would at least let the UI surface "you're on a model that no longer exists in the config". Non-blocking design call.

aheritier · 2026-05-17T17:20:46Z

+// SessionModelsResponse is the response returned by
+// GET /api/sessions/:id/models. CurrentModelRef is the active override for
+// the named agent (empty when the agent is using its configured default).
+type SessionModelsResponse struct {
+	Agent           string        `json:"agent"`
+	CurrentModelRef string        `json:"current_model_ref,omitempty"`
+	Models          []ModelChoice `json:"models"`


Question: the response embeds ref for the agent default but no other notion of "default" beyond the per-choice IsDefault flag. Should SessionModelsResponse include a DefaultModelRef string too, so a client showing "Reset to default" doesn't have to scan Models[].IsDefault? Minor; ignore if you'd rather keep it lean.

aheritier · 2026-05-17T17:20:48Z

+// SetSessionModelRequest is the body of PATCH /api/sessions/:id/model.
+// An empty Model clears the override and reverts to the agent's default.
+type SetSessionModelRequest struct {
+	Model string `json:"model"`


Nit: missing the Assisted-By: docker-agent trailer on a couple of commits in this branch (feat: support model overrides when creating sessions uses Assisted-By: Docker Agent — capitalised — and a couple of the later commits don't have a body at all). Project convention per git-commit-conventions is the lowercase docker-agent form. Cosmetic, but worth fixing if you squash-merge with a clean trailer; harmless if you merge as-is.

aheritier · 2026-05-17T17:20:49Z

-	models := a.runtime.AvailableModels(ctx)

-	// Determine the currently active model for this agent
 	agentName := a.runtime.CurrentAgentName()


Praise: clean refactor — 66 lines deleted, 5 added, identical behaviour, and the extracted helper is now exhaustively unit-tested. Good catch on factoring this out.

dgageot added 7 commits May 13, 2026 14:37

feat: support model overrides when creating sessions

e905fdd

Allow callers to specify agent_model_overrides and custom_models_used when creating sessions via POST /api/sessions, and wire the runtime with model switcher configuration to apply those overrides when the session is first run.

fix: add backward compat note for POST /sessions/:id/model

397d72b

PATCH is the canonical verb for updating the agent's model. POST is also accepted because pkg/runtime Client.SetAgentModel (used by RemoteRuntime) was historically a POST; keep both so a client upgrade is not required.

refactor(runtime): extract DecorateModelChoices and remove duplication

1291d89

dgageot requested a review from a team as a code owner May 13, 2026 14:01

trungutt previously approved these changes May 13, 2026

View reviewed changes

docker-agent reviewed May 13, 2026

View reviewed changes

dgageot added 3 commits May 13, 2026 16:18

fix(runtime): make DecorateModelChoices defensively copy input slice

96ea8bb

fix(server): map unknown errors from setSessionModel to 500, not 400

ad1366a

fix(server): release sm.mux during runtime I/O on model endpoints

be9e8ae

dgageot dismissed trungutt’s stale review via be9e8ae May 13, 2026 14:19

aheritier reviewed May 17, 2026

View reviewed changes

Conversation

dgageot commented May 13, 2026

What changes

POST /api/sessions honours model selection (#2780)

Runtime model switching over HTTP (#2781)

Implementation notes

Tests

Validation

Commits

Uh oh!

docker-agent left a comment

Choose a reason for hiding this comment

Assessment: 🟡 NEEDS ATTENTION

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

dgageot commented May 13, 2026

[MEDIUM] DecorateModelChoices mutates caller-supplied slice — fixed in 96ea8bb

[MEDIUM] Unknown errors mapped to 400 — fixed in ad1366a

[HIGH] Lock held across blocking runtime I/O — fixed in be9e8ae

Validation

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

`POST /api/sessions` honours model selection (#2780)

[MEDIUM] `DecorateModelChoices` mutates caller-supplied slice — fixed in `96ea8bb`

[MEDIUM] Unknown errors mapped to 400 — fixed in `ad1366a`

[HIGH] Lock held across blocking runtime I/O — fixed in `be9e8ae`