tests/update_eve_image: eveimage-remove leftover BaseOsConfig at end #1172

Merged
eriknordmark merged 2 commits into lf-edge:master from eriknordmark:update-eve-image-cleanup on May 18, 2026
eriknordmark (Contributor) commented May 11, 2026

Summary

The three tests under `tests/update_eve_image/` push a BaseOsConfig via `eden controller edge-node eveimage-update`, but at the end they only call `eden eve reset` (which clears the device config), not `eveimage-remove`. The pushed BaseOsConfig therefore lingers in adam after the test and leaks state into subsequent tests and suites that do baseos work. This is most visible in coverage-instrumented G3 sequences, where the next baseos test does its own update and adam's residual state changes the expected behaviour.

Add a final `eveimage-remove` of the pushed version before the `eden eve reset` in:

  • `update_eve_image_http.txt` — removes `file://...` rootfs
  • `update_eve_image_oci.txt` — removes `oci://...` image
  • `revert_eve_image_update.txt` — removes the original-version rootfs it pushed back

Test plan

  • Test still passes (it only adds cleanup at the end)
  • After the test, `eden controller edge-node baseos` (or equivalent) shows the BaseOsConfig list empty for this version

🤖 Generated with Claude Code

@eriknordmark eriknordmark requested a review from uncleDecart as a code owner May 11, 2026 22:04
@eriknordmark eriknordmark requested a review from milan-zededa May 11, 2026 22:05
eriknordmark added a commit to eriknordmark/eden that referenced this pull request May 13, 2026
Two follow-ups for the nodeagent suite:

1. Raise outer -test.timeout values in eden.nodeagent.tests.txt so the
   test-framework wrapper doesn't kill a healthy run before its inner
   lim.test waits resolve. On the coverage-instrumented EVE build,
   post-reboot Info republish takes ~12 min, so the prior 15-minute
   timeouts on reset_on_disconnect_link_down / _blackhole could not
   accommodate even one of the three sequential lim.test waits each
   test uses (the test-script -timewait stays at 30m so a healthy
   device still resolves quickly; the change only relaxes the outer
   cap). New values: 45m for the disconnect / fallback tests, 20m for
   restart_counter_monotonic, 40m for maintenance_no_disk_space.

2. Add a get-config assertion after every eveimage-remove call so the
   test fails loudly if the controller config still references the
   removed image. The current eden CLI EdgeNodeEVEImageRemove only
   removes the legacy baseosconfig list entry and leaves the modern
   single-block baseos field + contentInfo[] populated; this assertion
   surfaces that bug end-to-end. (The corresponding eden CLI fix lands
   in PR lf-edge#1172 alongside the update_eve_image cleanup tests.)

Signed-off-by: eriknordmark <erik@zededa.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
eriknordmark added a commit to eriknordmark/eden that referenced this pull request May 13, 2026
Two follow-ups for the baseosmgr suite:

1. Raise outer -test.timeout values in eden.baseosmgr.tests.txt to
   match realistic suite runtime on the coverage-instrumented EVE
   build. retry_update issues four sequential lim.test waits at
   -timewait 30m each, plus 6 minutes of exec sleep — the 60-minute
   outer cap killed the suite mid-revert. Bumped to 90m for
   retry_update and 45m for force_fallback. The test-script
   -timewait values are unchanged; only the wrapper cap relaxes.

2. Add a get-config assertion after every eveimage-remove call so the
   test fails loudly if the controller config still references the
   removed image. The current eden CLI EdgeNodeEVEImageRemove only
   removes the legacy baseosconfig list entry and leaves the modern
   single-block baseos field + contentInfo[] populated; this
   assertion surfaces that bug end-to-end. (The corresponding eden
   CLI fix lands in PR lf-edge#1172.)

Signed-off-by: eriknordmark <erik@zededa.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
eriknordmark and others added 2 commits May 14, 2026 14:37
update_eve_image_http, update_eve_image_oci, and revert_eve_image_update
each push a BaseOsConfig via eden controller edge-node eveimage-update
but only call `eden eve reset` (clears device config) at the end, not
`eveimage-remove`. The pushed BaseOsConfig therefore lingers in adam
after the test, which leaks state into subsequent tests / suites that
do baseos work.

Add a final eveimage-remove of the pushed version (file:// for the
http/revert paths, oci:// for the oci path) before the eden eve reset.

Signed-off-by: eriknordmark <erik@zededa.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ests

EdgeNodeEVEImageRemove previously only removed the legacy
baseosconfig list entry on the device's controller config. The
modern single-block baseos field (Activate / ContentTreeUUID /
BaseOsVersion / RetryUpdateCounter) and the corresponding
contentInfo[] entry were left in place, so adam continued to ship the
"installed" baseos to the device even after the user explicitly
asked for it to go away. Subsequent eveimage-update calls then
behaved as no-ops because the device already had the same
ContentTreeUUID in its baseos block.

Make the remove path symmetric with EdgeNodeEVEImageUpdate: when the
removed version matches the modern baseos block, clear all four of
its fields and drop the corresponding ContentTree from the cloud's
in-memory list. The on-the-wire EdgeDevConfig now contains
baseos: {} and contentInfo: [] for that image.
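
A minimal sketch of the symmetric remove described above, assuming simplified stand-in types. `BaseOS`, `ContentTree`, `DeviceConfig`, and `removeBaseOS` are hypothetical names chosen for illustration and are not eden's actual types or functions; only the four baseos field names come from this commit message.

```go
package main

import "fmt"

// BaseOS is a hypothetical stand-in for the modern single-block baseos field.
// The four field names are the ones listed in the commit message.
type BaseOS struct {
	Activate           bool
	ContentTreeUUID    string
	BaseOsVersion      string
	RetryUpdateCounter uint32
}

// ContentTree is a hypothetical stand-in for one contentInfo[] entry.
type ContentTree struct {
	UUID string
	URL  string
}

// DeviceConfig is a hypothetical, simplified controller-side device config.
type DeviceConfig struct {
	BaseOS       BaseOS
	ContentTrees []ContentTree
}

// removeBaseOS clears the baseos block and drops the content tree it
// references when the block's version matches. It reports whether the
// config was changed.
func removeBaseOS(cfg *DeviceConfig, version string) bool {
	if version == "" || cfg.BaseOS.BaseOsVersion != version {
		return false
	}
	uuid := cfg.BaseOS.ContentTreeUUID
	cfg.BaseOS = BaseOS{} // zero all four fields at once

	// Filter in place, keeping every content tree except the removed one.
	kept := cfg.ContentTrees[:0]
	for _, ct := range cfg.ContentTrees {
		if ct.UUID != uuid {
			kept = append(kept, ct)
		}
	}
	cfg.ContentTrees = kept
	return true
}

func main() {
	cfg := &DeviceConfig{
		BaseOS: BaseOS{
			Activate:        true,
			ContentTreeUUID: "tree-1", // placeholder identifier
			BaseOsVersion:   "1.2.3",  // placeholder version
		},
		ContentTrees: []ContentTree{{UUID: "tree-1", URL: "file://placeholder"}},
	}
	fmt.Println(removeBaseOS(cfg, "1.2.3"), cfg.BaseOS, len(cfg.ContentTrees))
}
```

Matching on the version before touching anything keeps the call harmless when the requested version is not the one in the baseos block.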

Surface the bug at the test level: add `eden controller edge-node
get-config` + `! stdout '<version>'` after every eveimage-remove call
in tests/update_eve_image/testdata/. Without this fix the assertion
fails because the version still appears in the get-config output.
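
The check that assertion performs can be sketched in Go as follows. `assertImageRemoved` is a hypothetical helper, not part of eden or its test framework, and it uses a plain substring match as a simplifying assumption where the test script matches against stdout.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// assertImageRemoved is a hypothetical helper: it returns an error if the
// captured get-config output still mentions the removed version.
func assertImageRemoved(configOutput, version string) error {
	if version == "" {
		return errors.New("version must not be empty")
	}
	if strings.Contains(configOutput, version) {
		return fmt.Errorf("config still references removed image version %q", version)
	}
	return nil
}

func main() {
	// Placeholder outputs, not real get-config output.
	before := `{"baseos": {"baseOsVersion": "1.2.3"}}`
	after := `{"baseos": {}, "contentInfo": []}`

	fmt.Println(assertImageRemoved(before, "1.2.3")) // reports the leftover reference
	fmt.Println(assertImageRemoved(after, "1.2.3"))  // <nil>
}
```

Checking the whole output rather than one field means a leftover reference in either the baseos block or contentInfo[] trips the check.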

Signed-off-by: eriknordmark <erik@zededa.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@eriknordmark eriknordmark force-pushed the update-eve-image-cleanup branch from beee963 to 24a6757 Compare May 14, 2026 12:37
@uncleDecart uncleDecart removed their request for review May 14, 2026 13:02
eriknordmark added a commit to eriknordmark/eden that referenced this pull request May 15, 2026
eriknordmark added a commit to eriknordmark/eden that referenced this pull request May 15, 2026
@eriknordmark eriknordmark merged commit df8752b into lf-edge:master May 18, 2026
38 of 42 checks passed