Skip to content

direct: store serialized_dashboard in state as a content hash#5609

Draft
shreyas-goenka wants to merge 14 commits into
databricks:mainfrom
shreyas-goenka:shreyas-goenka/dashboards-sha-state
Draft

direct: store serialized_dashboard in state as a content hash#5609
shreyas-goenka wants to merge 14 commits into
databricks:mainfrom
shreyas-goenka:shreyas-goenka/dashboards-sha-state

Conversation

@shreyas-goenka

@shreyas-goenka shreyas-goenka commented Jun 15, 2026

Copy link
Copy Markdown
Contributor

Changes

The direct deploy engine persists the full planned config per resource in resources.json. For dashboards that includes the inlined serialized_dashboard contents, which routinely run from hundreds of KB to several MB (and roughly double once JSON-escaped into state). That content is never read back from state:

  • a deploy always sends the contents to the API from the plan's new_state, never from saved state;
  • nothing resolves references out of the field;
  • the only thing the saved value is used for is a local "did the content change since last deploy?" equality check — which a content hash serves exactly.

This PR adds a declarative hashed_in_state list to resources.yml (a plain list of field paths). A generic helper CompactState(cfg, state) replaces each such field with a sha256:<hex> placeholder. The framework applies it both before persisting state and to every value entering the diff (saved state, the local config copy, and the remapped remote), so stored and compared values share one canonical form: unchanged content still yields an equal-hash skip, changed content yields a different hash, exactly as before.

Only dashboards.serialized_dashboard opts in. The plan's new_state (sent to the API on apply) and the raw top-level remote_state snapshot keep their full content.

hashed_in_state is orthogonal to ignore_remote_changes

Because the remapped remote is hashed too, remote drift on a hashed field is detected as hash != hash — so a field can be hashed_in_state without being ignore_remote_changes. The two are declared independently.

serialized_dashboard happens to be both, for unrelated reasons: it is hashed_in_state because the blob is large, and it is separately ignore_remote_changes because the server normalizes it (adds pageType, reorders keys) so its remote hash never equals the config hash — drift is detected via etag instead.

libs/hash

The "JSON-encode then SHA-256" primitive is extracted into a new libs/hash package (hash.JSON). libs/cache's fingerprintToHash, which was doing the same thing independently, now routes through it too (behavior-identical).

Compatibility

No state version bump. Legacy full-content state is hashed on read for the comparison and rewritten compactly on the next save (lazy migration). An older CLI reading new state sees a hash, plans one redundant update, and rewrites full content — safe. Bumping the version would instead hard-fail older CLIs, a worse failure mode for mixed-version CI/teams.

User-visible effect

bundle plan reports serialized_dashboard as sha256:... in the changes section rather than embedding the (potentially multi-MB) serialized blob. resources.json shrinks correspondingly, as does the per-deploy state upload.

Scope

Dashboards only. genie_spaces.serialized_space is a candidate for the same treatment and will be a follow-up.

Tests

  • Unit: hashStateValue (determinism, idempotence, nil/empty), CompactState (no-declared-fields no-op, mutation-safety, legacy full-content migration invariant), the dual-declaration doc test for serialized_dashboard, and libs/hash.JSON.
  • Acceptance: a new direct-only resources/dashboard-state-sha test covering plan-keeps-content / state-stores-hash / API-gets-full-content across the --plan (READPLAN) matrix for create, no-op re-plan, and update; plus regenerated affected outputs (dashboards simple/detect-change/unpublish-out-of-band, bind, migrate).

This pull request and its description were written by Isaac.

…nt hashes

The direct deploy engine persists the full planned config per resource in
resources.json. For dashboards and genie spaces, that includes the inlined
serialized_dashboard / serialized_space contents, which routinely run into the
hundreds of KB to several MB. These fields are never read back from state: drift
is detected via etag (they are ignore_remote_changes), and a deploy always sends
the contents from the plan's new_state, never from saved state.

Add an optional CompactState adapter method that replaces such equality-only
fields with a "sha256:<hex>" content hash. The framework applies it both before
persisting state and to every value entering the diff, so stored and compared
values share one form and unchanged content still produces an equal-hash skip.
The plan's new_state (sent to the API) and the raw remote_state snapshot keep
full content.

No state version bump: legacy full-content state is hashed on read for the
comparison and rewritten compactly on the next save.

Co-authored-by: Isaac
…together

Extend the genie_spaces/serialized_space test so a single deploy asserts both
sides of the SHA-only-state contract on the same state:
  - the create requests carry the full serialized_space (existing capture, plus
    an explicit "no sha256 in the request" guard), and
  - the persisted state stores only the sha256 content hash, never the content.

Co-authored-by: Isaac
The two error paths added for state compaction in Bind (adapter lookup and
CompactState) returned without removing the temp state file, unlike every other
error path in the function. Add os.Remove(tmpStatePath) to match.

Co-authored-by: Isaac
…s one token

The generic test replacements split the persisted sha256 digest into pieces:
[0-9a-f]{32} turned it into "[DASHBOARD_ID][DASHBOARD_ID]" in the bind test, and
[0-9]{3,} fragmented it as "sha256:...[NUMID]..." in several plan outputs. Add a
single low-Order replacement in acceptance/bundle/test.toml that collapses the
64-char digest to "sha256:[ALPHANUMID]" before those rules run, so every state and
plan output renders the hash consistently. Also add the missing newline after the
state-assertion title in the genie serialized_space test.

Co-authored-by: Isaac
serialized_space gets the same treatment in a follow-up; keeping this PR to
dashboards only.

Co-authored-by: Isaac
…aration

The per-resource CompactState(state any) (any, error) adapter method (plus its
IResource entry and calladapt plumbing) is removed. Instead, fields persisted as
a content hash are declared under hashed_in_state in resources.yml, and a generic
dresources.CompactState(cfg, state) hashes them via structaccess.

Behaviour is unchanged: new_state still carries the full content (so deploy and
deploy --plan send the real bytes), and only resources.json + the diff use the
hash. dashboards.serialized_dashboard is the sole declared field; acceptance
outputs are identical.

Co-authored-by: Isaac
Spell out that hashing the saved state on read is what lets state written before
this change (or before a field was added to hashed_in_state) still diff correctly
against the now-hashed local config, and that it is read-only normalization (no
state_version bump). Link the PR.

Co-authored-by: Isaac
…nvariant test

- Drop the CompactState call on the remapped remote state in CalculatePlan. Hashing
  only ever helps the local diff (saved vs config); a hashed_in_state field is always
  ignore_remote_changes, so its remote diff is discarded regardless. The plan now shows
  the real remote value instead of a meaningless remote hash.
- Mask the state hash in acceptance output as "sha256:[HASH]" instead of the generic
  "[ALPHANUMID]", so it clearly reads as a content hash.
- Add TestHashedInStateImpliesIgnoreRemoteChanges enforcing, for every resource, that
  any hashed_in_state field is also ignore_remote_changes (a hashed field can only
  detect local changes; its remote diff must be ignored or it would drift forever).

Co-authored-by: Isaac
Comment thread bundle/direct/dresources/resources.yml Outdated
# ignore_remote_changes (etag_based) and re-sent from config on every deploy, so it
# is never read back from state; persist only its content hash to keep state small.
- field: serialized_dashboard
reason: large_content

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

omit reason here. It does not show up in plan.

Hash the remapped remote on hashed_in_state fields too, so a hashed field is a hash
on all three sides of the diff (saved, config, remote). Once the saved value is a
hash, every comparison must be hash-vs-hash — including remote drift (Remote vs New)
— or it is hash-vs-content nonsense. With the remote hashed, remote drift is detected
as hash != hash, so a field can be hashed_in_state without being ignore_remote_changes.

The two are now declared independently. serialized_dashboard needs both, but for
unrelated reasons: hashed because the blob is large, ignore_remote_changes because the
server normalizes it (so its remote hash never matches the config hash). Dropped the
test that wrongly required hashed_in_state to imply ignore_remote_changes.

Co-authored-by: Isaac
…ndtrip)

Add dashboard-state-sha: a direct-only test (terraform does not hash state) that runs
both in-memory (READPLAN="") and saved-plan (READPLAN=1, deploy --plan) deploys and
asserts:
  - the create request carries the FULL serialized_dashboard (out.create.serialized.json,
    identical across READPLAN, so the saved plan applies the real content, not the hash);
  - the persisted state holds only the content hash (out.state.direct.txt);
  - the plan keeps full content in new_state but reports the diff as a hash;
  - a re-plan with no local change is a no-op (remote normalization ignored, etag_based).

Local diff detection (edit -> update) is covered by change-serialized-dashboard.

Co-authored-by: Isaac
Extract the "json.Marshal + sha256 -> hex" primitive into a new libs/hash
package and route both consumers through it: the dashboard state-compaction
hash and libs/cache's fingerprintToHash, which were each doing the same
thing independently.

hashed_in_state in resources.yml is now a plain list of field names
([]string) rather than {field, reason} rules: it only ever needs the path,
so the reason field was dead weight.

Co-authored-by: Isaac
Add an edit-and-redeploy cycle to dashboard-state-sha: re-render the
dashboard, plan, and deploy (both in-memory and via a saved --plan file).
This proves the local diff detects the change as a hash, the persisted
state keeps only the new hash, and the update API call still carries the
full new serialized_dashboard - identical across READPLAN variants, so the
saved plan applies real content rather than the hash.

Co-authored-by: Isaac
@shreyas-goenka shreyas-goenka changed the title direct: store serialized_dashboard/serialized_space in state as content hashes direct: store serialized_dashboard in state as a content hash Jun 24, 2026
Assert that a pre-existing full-content state and a matching config compact
to the same hash, so re-planning against hashed-on-read legacy state shows
no spurious diff.

Co-authored-by: Isaac
…llow copy

The shallow copy keeps the caller's value (reused for the deploy API call)
untouched only for top-level scalar hashed fields; a nested field would need
a deep copy. Note the constraint so it isn't silently broken later.

Co-authored-by: Isaac
@github-actions

Copy link
Copy Markdown
Contributor

An authorized user can trigger integration tests manually by following the instructions below:

Trigger:
go/deco-tests-run/cli

Inputs:

  • PR number: 5609
  • Commit SHA: a3c64243036dd7aefd34faef5a5f60c08016dc88

Checks will be approved automatically on success.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant