compute: sync controller config before create-time setup (#37294)
### Motivation
A replica creates its `ComputeState` when it handles `CreateInstance`,
seeding `worker_config` with dyncfg defaults. `handle_create_instance`
then immediately runs `apply_worker_config`, but the controller's synced
configuration only arrives one command later, in the first
`UpdateConfiguration`. Both the live command stream (`Instance::run`
sends `Hello` then `CreateInstance` before the first
`update_configuration`) and the reconciliation order
(`ComputeCommandHistory::reduce` places the single `UpdateConfiguration`
after `CreateInstance`) put configuration after creation. Reordering the
commands is not viable: before `CreateInstance` there is no
`ComputeState`/`worker_config` to apply configuration to.
As a result, every create-time read that depends on configuration
silently used defaults: lgalloc, the pager, the column-paged batcher,
the memory limiter, `ore_overflowing_behavior`, and the introspection
dataflows rendered during `CreateInstance`. This is also why
create-time-frozen scoped flags from #37158 had to be worked around.
### Description
Carry the configuration with `CreateInstance`. `InstanceConfig` gains an
`initial_config` snapshot of the controller's dyncfg with the replica's
scoped overrides applied on top. The controller builds it in
`Instance::specialize_command_for_replica`, the same layer that already
injects scoped overrides into `UpdateConfiguration`, where the live
`dyncfg` and override map are available. It is evaluated fresh on every
send and on every `add_replica` replay, so the snapshot always reflects
current
values. The replica applies it to `worker_config` before
`apply_worker_config`.
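
A minimal sketch of the merge order described above, not the actual implementation. `ConfigValues`, `build_initial_config`, and `seed_worker_config` are hypothetical names, and a plain string map stands in for the real dyncfg types:

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a set of dyncfg values (name -> value).
type ConfigValues = BTreeMap<String, String>;

/// Controller side (assumed shape): start from the controller's current
/// values, then let the replica's scoped overrides win on conflict.
fn build_initial_config(controller: &ConfigValues, overrides: &ConfigValues) -> ConfigValues {
    let mut snapshot = controller.clone();
    snapshot.extend(overrides.iter().map(|(k, v)| (k.clone(), v.clone())));
    snapshot
}

/// Replica side (assumed shape): seed from defaults, then apply the
/// snapshot before any create-time setup reads the configuration.
/// An empty snapshot leaves the defaults untouched.
fn seed_worker_config(defaults: &ConfigValues, initial_config: &ConfigValues) -> ConfigValues {
    let mut worker_config = defaults.clone();
    worker_config.extend(initial_config.iter().map(|(k, v)| (k.clone(), v.clone())));
    worker_config
}
```

In this sketch an override replaces the controller's value for the same key, and keys absent from the snapshot keep their default, which matches the empty-snapshot fallback described in this commit.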
The snapshot is excluded from `InstanceConfig::compatible_with`, like
dictionary compression, so reconnecting to a running replica does not
halt on configuration that has changed since creation. An empty snapshot
leaves the worker at its defaults until the first `UpdateConfiguration`,
preserving prior behavior.
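
The exclusion can be pictured with a simplified sketch. `InstanceConfigSketch` and its `logging_enabled` field are hypothetical; the real `InstanceConfig` has different fields, and only the idea of ignoring `initial_config` in the comparison is taken from this commit:

```rust
use std::collections::BTreeMap;

/// Hypothetical, reduced stand-in for `InstanceConfig`.
#[derive(Clone, Debug)]
struct InstanceConfigSketch {
    /// Placeholder for settings that must match on reconnect.
    logging_enabled: bool,
    /// Create-time snapshot; deliberately ignored by the compatibility check.
    initial_config: BTreeMap<String, String>,
}

impl InstanceConfigSketch {
    /// Returns true if a running replica created with `other` can be reused.
    fn compatible_with(&self, other: &Self) -> bool {
        // Destructure exhaustively so that adding a field forces an
        // explicit decision about whether it takes part in the check.
        let Self { logging_enabled: self_logging, initial_config: _ } = self;
        let Self { logging_enabled: other_logging, initial_config: _ } = other;
        self_logging == other_logging
    }
}
```

Under this assumption, two configs that differ only in their snapshot still compare as compatible, so a reconnect after a dyncfg change does not halt the replica.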
Soundness:
* Fresh startup or replica restart: reconcile applies `CreateInstance`,
seeding `worker_config` from the snapshot before create-time setup.
* `environmentd` reconnect to a running replica: reconcile only
compatibility-checks `CreateInstance`; the existing already-synced
`worker_config` is retained and the replayed `UpdateConfiguration` keeps
it current.
* `UpdateConfiguration` still applies globally and remains hoistable;
the snapshot is a redundant create-time seed of the same values, not a
new ordering dependency.
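
The cases above can be modeled with a toy state machine. This is a sketch under stated assumptions, not the reconcile code: `ReplicaSketch` and its methods are hypothetical, worker defaults are omitted, and the compatibility check on reconnect is reduced to "state already exists":

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a set of dyncfg values (name -> value).
type ConfigValues = BTreeMap<String, String>;

/// Toy replica: `worker_config` is `None` until `CreateInstance` is applied.
#[derive(Default)]
struct ReplicaSketch {
    worker_config: Option<ConfigValues>,
}

impl ReplicaSketch {
    /// Handles a (possibly replayed) `CreateInstance` carrying a snapshot.
    /// Returns true if create-time setup ran with the snapshot applied.
    fn handle_create_instance(&mut self, initial_config: &ConfigValues) -> bool {
        if self.worker_config.is_some() {
            // Reconnect to a running replica: keep the already-synced
            // configuration and do not re-run create-time setup.
            return false;
        }
        // Fresh startup or restart: seed from the snapshot first.
        self.worker_config = Some(initial_config.clone());
        true
    }

    /// `UpdateConfiguration` keeps the existing configuration current.
    /// Assumption: it is a no-op if no instance exists yet.
    fn handle_update_configuration(&mut self, update: &ConfigValues) {
        if let Some(cfg) = self.worker_config.as_mut() {
            cfg.extend(update.iter().map(|(k, v)| (k.clone(), v.clone())));
        }
    }
}
```

In this model the snapshot only matters on the first `CreateInstance`; later values arrive through `UpdateConfiguration`, so the snapshot adds no ordering requirement between the two commands.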
### Tests
Adds unit tests in the compute controller covering the create-time
snapshot and the scoped-override merge, and confirming the existing
`UpdateConfiguration` override merge is unchanged. Adds a
`clusterd-test-driver` scenario and a parse test exercising the
create-time snapshot plumbing end to end.
### Checklist
* [ ] This PR has adequate test coverage / or no new functionality
requires testing.
---------
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>