agent: bump client.crc32 timeout for V1-era SoCs #98
Merged
The old timeout formula `max(10, 5 + size/MiB)` assumed a fast DMA read path through the agent's flash_crc32. That holds on V3+/V4+/V5/V6 SoCs (FMC100, multi-MB/s), but on V1-era HiSilicon parts (hi3520dv200, HISFC350 controller) the agent uses an AHB STD READ via the memory-mapped window, which tops out at ~150 KB/s when the agent walks each byte through the controller.

At that rate a 4 MiB CRC32 takes ~27 s. The old formula bailed at 10 s and surfaced misleading "agent stopped responding" / "No packet received within 10s" errors even though the device was still computing the CRC and the response was on its way.

Bumping to `max(15, 5 + size/100KB)` budgets for an effective throughput of only 100 KB/s, which is well under the worst-case V1 path's actual ~150 KB/s, plus a 15 s floor to cover round-trip latency. V3+/V4+ chips at multi-MB/s still complete in well under the new ceiling.

Verified against hi3520dv200 (MX25L25635E 32 MiB NOR on CS1): a full 3.74 MiB rootfs CRC32 now completes in ~25 s and matches every time.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
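For reviewers who want to check the arithmetic, here is a minimal sketch of the two formulas next to the estimated read time on the slow path. The function names (`old_timeout_s`, `new_timeout_s`, `estimated_read_s`) are illustrative and not taken from defib's source. The sketch assumes `size` is in bytes, that "100KB" means `100 * 1024` bytes as in the PR summary, and that the ~150 KB/s figure is in KiB/s.

```python
MIB = 1024 * 1024


def old_timeout_s(size: int) -> float:
    """Previous formula: 10 s floor, 5 s base plus 1 s per MiB."""
    return max(10.0, 5.0 + size / MIB)


def new_timeout_s(size: int) -> float:
    """New formula: 15 s floor, 5 s base plus 1 s per 100 KiB."""
    return max(15.0, 5.0 + size / (100 * 1024))


def estimated_read_s(size: int, rate_kib_s: float = 150.0) -> float:
    """Rough CRC duration at a given throughput (assumed ~150 KiB/s on V1)."""
    return size / (rate_kib_s * 1024)


if __name__ == "__main__":
    # Print old timeout, new timeout and estimated V1 duration per size.
    for mib in (1, 4, 16):
        size = mib * MIB
        print(
            f"{mib:>2} MiB: old={old_timeout_s(size):.2f}s "
            f"new={new_timeout_s(size):.2f}s "
            f"est_v1={estimated_read_s(size):.2f}s"
        )
```

For a 4 MiB region the old formula yields 10 s, the estimate at 150 KiB/s is about 27.3 s, and the new formula yields about 46 s. The timeout therefore sits above the expected duration on the V1 path instead of below it.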
Summary
`client.crc32()`'s timeout formula `max(10, 5 + size_MiB)` is too aggressive for V1-era HiSilicon SoCs (hi3520dv200 / HISFC350 controller), where the agent walks flash via the AHB STD READ window at ~150 KB/s.
A 4 MiB CRC32 needed ~27 s on hi3520dv200, but the host bailed at 10 s and reported "No packet received within 10s". That was misleading: the device kept computing and the response was already in flight.
New formula: `max(15, 5 + size / (100 * 1024))`. It assumes 100 KB/s of effective throughput (safe under the slow V1 path) with a 15 s baseline. V3+/V4+ DMA reads at multi-MB/s still complete well under the ceiling.
Why this matters
Caught after a real install on hi3520dv200: a relocation script crashed at the 10 s CRC timeout, forcing me to split the operation into two scripts. The second script's "erase old location" then wiped 2.74 MiB of the freshly-written-and-verified rootfs because it didn't subtract the new write's footprint — see OpenIPC/firmware#2089 (closed) for the full incident write-up.
The bug was operator-level (not in defib's blessed `install` flow), but the too-tight CRC timeout was the contributing factor that drove me to the staged-script approach where the overlap mistake became easy to make. Fixing this is the minimum-viable defib change that would have averted the chain.
Test plan
uv run pytest tests/ -x -q --ignore=tests/fuzz)
🤖 Generated with Claude Code