
install: route through pod-hosted TFTP when power=rack #93

Merged
widgetii merged 1 commit into master from install-via-pod-tftp
May 12, 2026

Summary

Adds --tftp-via=auto|pod|host (default: auto) and wires _install_async to pick the TFTP backend without forking the U-Boot driving logic.

| Mode | What happens |
| --- | --- |
| pod | Stage U-Boot + kernel + rootfs in the rack pod's PSRAM via RackController.tftp_put; run setenv serverip 192.168.1.1 and let the camera fetch directly over the local LAN. The pod TFTP server is already on the camera's gateway IP, so there is no host TFTP, no NIC IP plumbing, no sudo, and no port-69 conflict. Files are cleared at the end via tftp_clear. |
| host | Existing path. temporary_ip adds a NIC IP alias and start_tftp_server binds UDP/69 on the host. |
| auto | pod when power=rack, host otherwise; preserves the old default for non-rack setups. |

--tftp-via pod without DEFIB_POWER_TYPE=rack errors out cleanly.
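
The mode resolution described above can be sketched roughly as follows. This is an illustration only: `resolve_tftp_via` and the error wording are hypothetical names, not taken from `src/defib/cli/app.py`; only the flag values and the `DEFIB_POWER_TYPE=rack` condition come from this PR.

```python
from typing import Optional


def resolve_tftp_via(requested: str, power_type: Optional[str]) -> str:
    """Map a --tftp-via value to a concrete backend: "pod" or "host".

    Hypothetical helper; the real logic lives inside _install_async.
    """
    if requested not in ("auto", "pod", "host"):
        raise ValueError(f"unknown --tftp-via value: {requested!r}")

    is_rack = power_type == "rack"
    if requested == "auto":
        # auto keeps the old host default unless the power backend is a rack pod
        return "pod" if is_rack else "host"
    if requested == "pod" and not is_rack:
        # the pod backend needs a rack pod to stage files on
        raise ValueError("--tftp-via pod requires DEFIB_POWER_TYPE=rack")
    return requested
```

Rejecting `pod` up front, before any serial or network work starts, is what lets the command fail cleanly instead of midway through an install.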

Implementation note

A small AsyncExitStack swap. Both branches end up exposing the same two locals to the rest of the install code: serverip (used in setenv serverip) and replace_in_tftp(name, data) (the async hook the UBI rootfs path needs to swap files mid-flow). The 200+ lines of U-Boot driving / tftp + sf write / ethaddr-rescue / saveenv stay untouched.
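
A minimal sketch of that shape, under stated assumptions: `FakePod`, `fake_host_tftp`, `open_tftp_backend`, and the host address `192.0.2.1` are placeholders invented for illustration. Only `tftp_put`, `tftp_clear`, `serverip`, `replace_in_tftp`, and the pod address `192.168.1.1` are names from this PR.

```python
import asyncio
from contextlib import AsyncExitStack, asynccontextmanager


class FakePod:
    """Stand-in for RackController; stores staged files in a dict."""

    def __init__(self):
        self.files = {}

    async def tftp_put(self, name, data):
        self.files[name] = data

    async def tftp_clear(self):
        self.files.clear()


@asynccontextmanager
async def fake_host_tftp(files):
    """Stand-in for the temporary_ip + start_tftp_server pair."""
    served = dict(files)
    try:
        yield served
    finally:
        served.clear()


async def open_tftp_backend(stack, mode, files, pod):
    """Return (serverip, replace_in_tftp) for either backend."""
    if mode == "pod":
        # register cleanup first, then stage every file in pod memory
        stack.push_async_callback(pod.tftp_clear)
        for name, data in files.items():
            await pod.tftp_put(name, data)
        return "192.168.1.1", pod.tftp_put

    served = await stack.enter_async_context(fake_host_tftp(files))

    async def replace_in_tftp(name, data):
        served[name] = data

    return "192.0.2.1", replace_in_tftp  # placeholder host address


async def demo(mode):
    pod = FakePod()
    async with AsyncExitStack() as stack:
        serverip, replace_in_tftp = await open_tftp_backend(
            stack, mode, {"uboot.bin": b"\x00"}, pod
        )
        # the rest of the install code only sees these two locals
        await replace_in_tftp("rootfs.ubi", b"\x01")
        staged_during = sorted(pod.files)
    return serverip, staged_during, sorted(pod.files)
```

Because both branches hand back the same pair, the code after the branch does not need to know which backend is active, and leaving the `async with` block tears down whichever one was set up.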

End-to-end verification

$ DEFIB_POWER_TYPE=rack DEFIB_RACK_HOST=10.216.128.69 \
  defib install -c hi3516ev300 \
                --firmware openipc.hi3516ev300-nor-neo.tgz \
                -p rack://10.216.128.69 \
                --power-cycle \
                --nor-size 16

Phase 1: Burning U-Boot to RAM
  Pod-side fastboot in progress…
  U-Boot loaded in 25940ms
Phase 2: Flash via TFTP
  Staging 6535 KB in pod PSRAM via POST /tftp/<name>...
  Pod TFTP ready on 192.168.1.1:69
  Flashing U-Boot → 0x0 (262144 bytes)
    TFTP CRC verified: FA8B2667
    Flash verified:   FA8B2667     U-Boot OK
  Flashing kernel → 0x50000 (2055676 bytes)
    TFTP CRC verified: BF160C6A
    Flash verified:   BF160C6A     kernel OK
  Flashing rootfs → 0x350000 (4374528 bytes)
    TFTP CRC verified: 7D598D15
    Flash verified:   7D598D15     rootfs OK
  ethaddr preserved
  Environment saved
  Resetting device...
Install complete!

The camera then reaches the openipc-hi3516ev300 login: prompt on its own. The CRC is verified at both the TFTP-into-RAM and flash-readback points, for every partition.
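
The two-point check can be illustrated as below. This is a sketch under assumptions: the PR does not name the checksum, so standard CRC-32 (the variant `zlib.crc32` implements) is assumed from the 8-hex-digit values in the log, and `crc32_hex` and `verify_partition` are hypothetical helpers, not defib functions.

```python
import zlib


def crc32_hex(data: bytes) -> str:
    """Format a CRC-32 as 8 uppercase hex digits, like the log lines above."""
    return f"{zlib.crc32(data) & 0xFFFFFFFF:08X}"


def verify_partition(name: str, image: bytes, ram_copy: bytes, flash_copy: bytes) -> str:
    """Compare the source image against the RAM copy and the flash readback.

    ram_copy and flash_copy stand in for data the device would checksum itself.
    """
    expected = crc32_hex(image)
    if crc32_hex(ram_copy) != expected:
        raise ValueError(f"{name}: TFTP CRC mismatch")
    if crc32_hex(flash_copy) != expected:
        raise ValueError(f"{name}: flash readback CRC mismatch")
    return expected
```

Checking twice separates a transfer problem (bad data in RAM) from a flash problem (good data in RAM, bad readback), which narrows down where a failed install went wrong.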

The CLI now drives a complete OpenIPC install from one command, zero host-side TFTP setup — the only network plane involved is the rack-pod's own (HTTP for staging + local-LAN TFTP between pod and camera).

Test plan

  • uv run pytest tests/ -x -v --ignore=tests/fuzz — 486 passed / 2 skipped (unchanged — _install_async has no direct unit tests; covered by integration)
  • uv run ruff check src/defib/cli/app.py — clean
  • uv run mypy src/defib/cli/app.py --ignore-missing-imports — clean
  • Regression: defib install --tftp-via host … still works on the existing host-TFTP setups (unchanged code path). Verified by inspection — the host branch is byte-identical to what it was before this PR; the only change is being inside an AsyncExitStack.
  • --tftp-via pod without rack power → clean error message.

🤖 Generated with Claude Code

Adds `--tftp-via=auto|pod|host` flag (default: auto) and wires
`_install_async` to pick the TFTP backend without forking the U-Boot
driving logic:

  - `pod`:  stage U-Boot + kernel + rootfs in the rack pod's PSRAM via
            `RackController.tftp_put`; set `setenv serverip 192.168.1.1`
            and let the camera fetch directly over the local LAN. The
            pod TFTP server is already on the camera's gateway IP — no
            host TFTP, no NIC IP plumbing, no sudo, no port-69 conflict.
            Files cleared at the end via `tftp_clear`.
  - `host`: existing path. `temporary_ip` adds a NIC IP alias and
            `start_tftp_server` binds UDP/69 on the host.
  - `auto`: pod when power=rack, host otherwise (preserves the old
            default for non-rack setups).

Implementation: a small `AsyncExitStack` swap. Both branches end up
exposing the same two locals to the rest of the install code:
`serverip` (used in `setenv serverip`) and `replace_in_tftp(name,
data)` (the async hook that the UBI rootfs path needs to swap files
mid-flow). The 200+ lines of U-Boot driving / tftp+sf-write /
ethaddr-rescue / saveenv stay untouched.

End-to-end verification on rack pod 10.216.128.69 + hi3516ev300
(nor-neo install):

  $ DEFIB_POWER_TYPE=rack DEFIB_RACK_HOST=10.216.128.69 \
    defib install -c hi3516ev300 \
                  --firmware openipc.hi3516ev300-nor-neo.tgz \
                  -p rack://10.216.128.69 \
                  --power-cycle \
                  --nor-size 16

  ...
  Phase 1: Burning U-Boot to RAM
    Pod-side fastboot in progress…
    U-Boot loaded in 25940ms
  Phase 2: Flash via TFTP
    Staging 6535 KB in pod PSRAM via POST /tftp/<name>...
    Pod TFTP ready on 192.168.1.1:69
    Flashing U-Boot → 0x0 (262144 bytes)
      TFTP CRC verified: FA8B2667
      Flash verified: FA8B2667    U-Boot OK
    Flashing kernel → 0x50000 (2055676 bytes)
      TFTP CRC verified: BF160C6A
      Flash verified: BF160C6A    kernel OK
    Flashing rootfs → 0x350000 (4374528 bytes)
      TFTP CRC verified: 7D598D15
      Flash verified: 7D598D15    rootfs OK
    ethaddr preserved
    Environment saved
    Resetting device...
  Install complete!

  $ # camera reaches `openipc-hi3516ev300 login:` cleanly

The CLI now drives a complete OpenIPC install from a single command,
zero host-side TFTP setup — the whole UART + Eth path through the
rack pod is the only network plane involved.

Suite: 486 passed / 2 skipped; ruff + mypy clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@widgetii widgetii merged commit ed4bc98 into master May 12, 2026
13 checks passed
@widgetii widgetii deleted the install-via-pod-tftp branch May 12, 2026 04:40
widgetii added a commit that referenced this pull request May 12, 2026
## Summary

Brings `defib restore` to parity with `defib install` (#88 + #93) for
rack-controlled cameras. Three pieces:

### Phase 1 — fastboot when `power=rack`

The previous host-side frame-blast race (power-off → open serial → start
session → power-on) is RouterOS-only. Rack pods don't expose independent
`power_off`/`power_on` and don't need to — the pod's `/fastboot`
endpoint does the whole sequence locally with microsecond ACK latency.
Drop the hard-coded *"restore needs RouterOSController only"* reject —
`RackController` is now an accepted alternative. Vectis stays rejected.

### Phase 5 — `--tftp-via=auto|pod|host` (default auto)

Same flag as `install`. Auto → pod when `power=rack`, host otherwise.
Pod path stages every partition via `RackController.tftp_put`, sets
`serverip=192.168.1.1` (the pod), and unifies the UBI rootfs file-swap
through `_replace_in_tftp(name, data)`.

Two robustness improvements:

- **`tftp_clear` BEFORE staging.** A prior aborted run leaves PSRAM
occupied, so the next run's 4 MB rootfs upload can fail with OOM when
the largest free block is only 256 KB. Wipe first.
- **`try/finally` around Phase 5 + 6.** A mid-loop write failure skipped
`__aexit__` and leaked ~7 MB of pod PSRAM until the next install. The
`try/finally` (with the cleanup hooks pre-registered on the
`AsyncExitStack`) makes cleanup unconditional.
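
A rough sketch of the two fixes together, assuming a toy pod model: `FakePod`, its `capacity` limit, and `restore_sketch` are hypothetical and only mimic the behaviour described above; `tftp_put` and `tftp_clear` are the method names from this PR.

```python
import asyncio
from contextlib import AsyncExitStack


class FakePod:
    """Toy stand-in for RackController with a fixed staging capacity."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.files = {}

    async def tftp_put(self, name, data):
        used = sum(len(d) for n, d in self.files.items() if n != name)
        if used + len(data) > self.capacity:
            raise MemoryError(f"no room to stage {name}")
        self.files[name] = data

    async def tftp_clear(self):
        self.files.clear()


async def restore_sketch(pod, partitions, fail_on=None):
    stack = AsyncExitStack()
    try:
        await pod.tftp_clear()                     # wipe leftovers from an aborted run
        stack.push_async_callback(pod.tftp_clear)  # cleanup registered before staging
        for name, data in partitions.items():
            await pod.tftp_put(name, data)
        for name in partitions:
            if name == fail_on:                    # simulate a mid-loop write failure
                raise RuntimeError(f"write failed on {name}")
    finally:
        await stack.aclose()                       # runs on success and on failure
```

With the cleanup callback registered before the first upload, a failure at any later point still empties the pod's staging area.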

### Live verification on rack pod `10.216.128.69` (hi3516ev300)

Synthetic dump dir at `/tmp/cam_dump/` (mtd0..3 sized to match the 16 MB
NOR layout):

```
$ DEFIB_POWER_TYPE=rack DEFIB_RACK_HOST=10.216.128.69 \
  defib restore -c hi3516ev300 -i /tmp/cam_dump/ \
                -p rack://10.216.128.69 --power-cycle --flash-type nor

  Power: rack pod HTTP API
Phase 1: Loading U-Boot to RAM
  Pod-side fastboot in progress…
Phase 4: Network setup — Network OK (attempt 1)
Phase 5: Writing flash
  Staging 7664 KB in pod PSRAM via POST /tftp/<name>...
  Pod TFTP ready on 192.168.1.1:69
  mtd1: 64KB    → 0x40000     Written (7.5 s)
  mtd2: 3072KB  → 0x50000     Written (11.7 s)
  mtd3: 4272KB  → 0x350000    Written (15.7 s)
  mtd0: 256KB   → 0x0         Written (8.3 s)
Restore complete!
```

Camera reaches `openipc-hi3516ev300 login:` cleanly. `exit=0`.
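
As a sanity check on the log above, the partition sizes and offsets can be reconciled with the staged total. The sketch below is illustrative arithmetic only: `check_layout` is a hypothetical helper, and the table is copied from the log lines with the 16 MB NOR size taken from the description.

```python
KB = 1024

# (name, flash offset, size in KB), copied from the restore log above
PARTITIONS = [
    ("mtd0", 0x0, 256),
    ("mtd1", 0x40000, 64),
    ("mtd2", 0x50000, 3072),
    ("mtd3", 0x350000, 4272),
]
NOR_SIZE = 16 * 1024 * KB  # 16 MB NOR


def check_layout(parts, flash_size):
    """Return (total_kb, end_offset); raise if partitions overlap or overflow."""
    end = 0
    for name, offset, size_kb in sorted(parts, key=lambda p: p[1]):
        if offset < end:
            raise ValueError(f"{name} overlaps the previous partition")
        end = offset + size_kb * KB
    if end > flash_size:
        raise ValueError("layout exceeds flash size")
    return sum(size_kb for _, _, size_kb in parts), end


total_kb, end_offset = check_layout(PARTITIONS, NOR_SIZE)
print(total_kb, hex(end_offset))  # 7664 0x77c000
```

The 7664 KB total matches the "Staging 7664 KB" line, and each partition starts exactly where the previous one ends.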

### Companion rack-firmware change (local-only)

`UART_IDLE_TIMEOUT_S` **60 → 600**. The 60-second idle timer was killing
the bridge socket mid-staging — ~50 s of HTTP `/tftp` uploads counts as
"idle" to the bridge (no host→pod UART traffic during that window). 600
s comfortably covers full installs and restores.

## Test plan

- [ ] `uv run pytest tests/ -x -v --ignore=tests/fuzz` — 486 passed / 2
skipped (no new unit tests; `_restore_async` is integration-only)
- [ ] `uv run ruff check src/defib/cli/app.py` — clean
- [ ] `uv run mypy src/defib/cli/app.py --ignore-missing-imports` —
clean
- [ ] Regression: `defib restore --tftp-via host …` still works on
existing RouterOS+host-TFTP setups — host branch is byte-identical
except for being inside the shared `AsyncExitStack`.
- [ ] `--tftp-via pod` without `DEFIB_POWER_TYPE=rack` → clean error
message.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Dmitry Ilyin <widgetii@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>