Conversation
…ners Claude agent, Claude code review, and release workflows are API-bound, not compute-bound. Using free GitHub-hosted runners instead of paid Blacksmith 4-vCPU runners for these lightweight jobs. Compute-heavy workflows (tests, Docker builds, deploys) remain on Blacksmith for caching and performance benefits.
DNS resolution failures during ping reconnection are transient and self-heal on the next 5-second retry. Logging at Error level sends unnecessary noise to Sentry (EIGENLAYER-AVS-19).
Wallets are keyed by their derived address (w:<owner>:<address>), so when a factory's account implementation is upgraded and the same (owner, factory, salt) triple starts deriving to a new address, the old row stays in BadgerDB and ListWallets surfaces both as separate entries with the same salt -- the three salt-0 collision observed on Base for 0xc60e71bd0f2e6d8832Fea1a2d56091C48493C788. Add a wsalt:<owner>:<factory>:<salt> -> address secondary index that StoreWallet writes atomically alongside the primary key. The fresh derivation paths in GetWallet / ListWallets consult it before inserting and, on mismatch, flip the previous wallet to StaleDerivation=true so ListWallets hard-filters it from the response. Stale rows are preserved on disk because the on-chain wallet may still hold assets. Ship /ava backfill-wallet-salt-index for one-off cleanup of existing zombie rows. Walks every wallet, re-derives via the live RPC, writes the index for canonical rows, flags mismatches stale. Operational notes for the production Docker rollout are in --help. Tests cover schema atomicity, the engine prevention path, the ListWallets filter, and the migration core with a fake deriver -- including a regression test that replays the exact 3-salt-0 collision from the Base report.
Clears two of the open dependabot advisories on staging: - HIGH GHSA-2gjw-fg97-vg3r: go-ethereum DoS via malicious p2p message (also clears MEDIUM GHSA tracking the same fix). Single patch version bump within the 1.16.x line. - The labstack/echo bump removes the transitive github.com/golang-jwt/jwt v3.2.2 dependency that came in via echo/middleware. We only use middleware.Logger() and middleware.Recover(), not the JWT middleware, so removing the v3 jwt is invisible to our code. v3 is no longer maintained and the only fix path is to drop it. Note: jwt v3 still appears in go.sum as a transitive dep of github.com/Layr-Labs/eigensdk-go (used by its fireblocks signing client, which we do not invoke). Even the latest eigensdk-go v1.0.0-rc.1 still pulls jwt v3, so HIGH GHSA-mh63-6h87-95cp cannot be cleared from this repo without an upstream change. The labstack/echo bump pulled in incidental upgrades to golang.org/x/{crypto,net,sync,sys,text,time}, mattn/go-colorable, and stretchr/testify. All are minor patch bumps surfaced by go mod tidy.
|
Review the following changes in direct dependencies. Learn more about Socket for GitHub.
|
There was a problem hiding this comment.
Pull request overview
This PR bundles production-ready fixes and chores: it adds a (owner, factory, salt) → address secondary index to detect/mark stale smart-wallet rows after factory implementation upgrades (plus a backfill command + tests), bumps several Go dependencies to address CVEs, reduces gRPC reconnect log severity, and switches select CI workflows back to GitHub-hosted runners.
Changes:
- Add wallet salt secondary index + stale-row handling, plus backfill tooling and unit tests.
- Bump go-ethereum / echo (and related transitive deps) to remediate reported advisories.
- Reduce operational noise (gRPC reconnect log level) and update CI runners.
Reviewed changes
Copilot reviewed 12 out of 13 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
core/taskengine/schema.go |
Adds wsalt: key builder, canonical lookup helper, and extends StoreWallet/MarkWalletStale to maintain the new secondary index. |
core/taskengine/engine.go |
Detects factory upgrades on write paths and hard-filters StaleDerivation wallets from ListWallets. |
core/taskengine/migrate_wallet_salt_index.go |
Introduces the backfill core that scans wallet rows, writes wsalt: entries, and marks stale derivations. |
core/taskengine/wallet_salt_index_test.go |
Adds a comprehensive unit test suite covering index writes, stale marking, list filtering, and backfill behaviors. |
cmd/backfillWalletSaltIndex.go |
Adds /ava backfill-wallet-salt-index cobra subcommand wrapper around the backfill core. |
scripts/migration/backfill_wallet_salt_index/main.go |
Adds a standalone go run wrapper for local iteration of the backfill. |
model/user.go |
Adds StaleDerivation field (and docs) to SmartWallet model. |
operator/worker_loop.go |
Downgrades gRPC ping reconnect failure log from Error to Warn. |
go.mod |
Bumps go-ethereum, echo, testify, and several x/* libraries. |
go.sum |
Updates dependency checksums consistent with go.mod bumps. |
.github/workflows/release-on-pr-close.yml |
Switches workflow runner from Blacksmith to ubuntu-latest. |
.github/workflows/claude.yml |
Switches workflow runner from Blacksmith to ubuntu-latest. |
.github/workflows/claude-code-review.yml |
Switches workflow runner from Blacksmith to ubuntu-latest. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| wallet.Address.Hex(), owner.Hex(), factory.Hex(), wallet.Salt.String()) | ||
| if !opts.DryRun { | ||
| if setErr := db.Set(indexKey, []byte(strings.ToLower(wallet.Address.Hex()))); setErr != nil { | ||
| logf(" [err] write secondary index: %v", setErr) |
There was a problem hiding this comment.
BackfillWalletSaltIndex currently logs write failures for secondary-index updates but still returns a nil error at the end. That can make an operator think the migration succeeded even though the index write failed (partial/incorrect backfill). Consider returning an error immediately on db.Set failure (or collecting failures and returning a summarized error) so the CLI exits non-zero and the run can be retried safely.
| logf(" [err] write secondary index: %v", setErr) | |
| logf(" [err] write secondary index: %v", setErr) | |
| return stats, fmt.Errorf("write secondary index for owner=%s factory=%s salt=%s address=%s: %w", | |
| owner.Hex(), factory.Hex(), wallet.Salt.String(), wallet.Address.Hex(), setErr) |
| stats.NewlyMarkedStale++ | ||
| logf(" [stale] %s ≠ live derivation %s (owner=%s factory=%s salt=%s)", | ||
| wallet.Address.Hex(), derived.Hex(), owner.Hex(), factory.Hex(), wallet.Salt.String()) | ||
| if !opts.DryRun { | ||
| if markErr := MarkWalletStale(db, owner, wallet.Address.Hex()); markErr != nil { | ||
| logf(" [err] mark stale: %v", markErr) | ||
| } | ||
| } |
There was a problem hiding this comment.
BackfillWalletSaltIndex also swallows failures when marking a wallet stale (it only logs). If MarkWalletStale fails, the stats will report NewlyMarkedStale but the DB will remain unchanged. This should return an error (or at least track and return a final error) so the migration run can be treated as failed.
| stats.NewlyMarkedStale++ | |
| logf(" [stale] %s ≠ live derivation %s (owner=%s factory=%s salt=%s)", | |
| wallet.Address.Hex(), derived.Hex(), owner.Hex(), factory.Hex(), wallet.Salt.String()) | |
| if !opts.DryRun { | |
| if markErr := MarkWalletStale(db, owner, wallet.Address.Hex()); markErr != nil { | |
| logf(" [err] mark stale: %v", markErr) | |
| } | |
| } | |
| logf(" [stale] %s ≠ live derivation %s (owner=%s factory=%s salt=%s)", | |
| wallet.Address.Hex(), derived.Hex(), owner.Hex(), factory.Hex(), wallet.Salt.String()) | |
| if opts.DryRun { | |
| stats.NewlyMarkedStale++ | |
| continue | |
| } | |
| if markErr := MarkWalletStale(db, owner, wallet.Address.Hex()); markErr != nil { | |
| logf(" [err] mark stale: %v", markErr) | |
| return stats, fmt.Errorf("mark wallet stale for owner=%s wallet=%s: %w", owner.Hex(), wallet.Address.Hex(), markErr) | |
| } | |
| stats.NewlyMarkedStale++ |
| items, err := db.GetByPrefix([]byte("w:")) | ||
| if err != nil { | ||
| return nil, fmt.Errorf("scan wallet records: %w", err) | ||
| } |
There was a problem hiding this comment.
db.GetByPrefix loads all wallet rows (keys + full values) into memory before processing. On large BadgerDBs this can cause high memory usage or OOM during the migration. Consider iterating with a Badger iterator (or extending storage.Storage with a streaming iterator) so records are processed incrementally.
| fmt.Printf("Config: %s\n", configPath) | ||
| fmt.Printf("DB path: %s\n", nodeConfig.DbPath) | ||
| fmt.Printf("RPC URL: %s\n", nodeConfig.SmartWallet.EthRpcUrl) | ||
| fmt.Printf("Mode: %s\n", backfillModeLabel(backfillWalletSaltIndexDryRun)) |
There was a problem hiding this comment.
This command prints the full smart-wallet RPC URL to stdout. If the URL contains credentials/API keys (common for managed RPC providers), this can leak secrets into logs/CI artifacts/terminal scrollback. Consider redacting userinfo/query params (or only printing host) by default, and optionally printing the full URL only behind an explicit --verbose/--print-secrets flag.
| fmt.Printf("Config: %s\n", *configPath) | ||
| fmt.Printf("DB path: %s\n", cfg.DbPath) | ||
| fmt.Printf("RPC URL: %s\n", cfg.SmartWallet.EthRpcUrl) | ||
| fmt.Printf("Mode: %s\n", modeLabel(*dryRun)) |
There was a problem hiding this comment.
The standalone migration wrapper prints the full RPC URL to stdout, which can leak API keys if the URL embeds credentials or query params. Consider redacting sensitive parts by default (print host only) and require an explicit flag to print the full URL.
Three changes from the PR #520 review: 1. BackfillWalletSaltIndex now fails fast (returns a non-nil error) when db.Set on the secondary index or MarkWalletStale on a zombie row fails. The previous behavior logged the failure and returned nil, so the CLI exited 0 and operators thought the migration succeeded on a partial backfill. Both write paths are idempotent, so the operator can fix the underlying storage issue and re-run. 2. Add taskengine.RedactRPCURL and use it in both the /ava backfill-wallet-salt-index subcommand and the standalone script. Tenderly, Alchemy, and Infura embed API keys directly in the URL path or query string, so the previous "RPC URL: <full-url>" status line dumped provider secrets into operator terminal scrollback and any captured CI logs. The redacted form preserves scheme + host for diagnostics while replacing the path with <redacted>. 3. Add test coverage: - TestRedactRPCURL with 11 cases (Tenderly/Infura/Alchemy paths, query-string credentials, userinfo credentials, localhost tunnels, bare hosts, and unparseable inputs). - TestBackfillWalletSaltIndex_FailsFastOnSetError and TestBackfillWalletSaltIndex_FailsFastOnMarkStaleError use a closingDB wrapper around the real BadgerStorage to inject storage failures and confirm the migration surfaces them. Skipped Copilot suggestion #3 (db.GetByPrefix loads everything into memory). It's an existing pattern across the taskengine package, the migration runs once during a maintenance window with the aggregator stopped, and realistic Base wallet counts (thousands) fit comfortably. A streaming iterator should be a storage.Storage interface change addressing all consumers, not piecemeal here.
Summary
Bundles four changes ready to ship from staging to main:
wsalt:<owner>:<factory>:<salt>secondary index soStoreWallet,GetWallet, andListWalletscan detect when a factory's account implementation upgrade has produced a new derivation address for the same triple. The previously canonical row is flaggedStaleDerivation=trueand hard-filtered out ofListWalletsresponses (preserved on disk because the on-chain wallet may still hold assets). Ships/ava backfill-wallet-salt-indexfor one-off cleanup of existing zombie rows. This fixes the three-salt-0 collision observed on Base for0xc60e71bd….golang-jwt/jwt v3dep; we only usemiddleware.Logger/middleware.Recover, not the JWT middleware, so the removal is invisible to our code.Storage migration
The wallet-salt-index fix introduces a new BadgerDB key prefix (
wsalt:<owner>:<factory>:<salt> → address) but does NOT modify or migrate any existingw:<owner>:<address>rows on its own. The new code path starts populating the index for new wallet creations and detects future implementation upgrades immediately upon deploy. Existing zombie rows from past Base factory upgrades remain in BadgerDB and stay visible inListWalletsresponses until the operator runs/ava backfill-wallet-salt-indexduring a maintenance window.Rollout sequence:
docker stop aggregator-base docker run --rm --volumes-from aggregator-base avaprotocol/ap-avs:latest \ backfill-wallet-salt-index --config /app/config/aggregator-base.yaml --dry-run # inspect summary, then re-run without --dry-run docker start aggregator-basegetWalletsSDK call against the previously affected EOA.The migration was dry-run validated against the staging Sepolia DB (220 wallets, 0 stale — Sepolia's factory hasn't been upgraded). Base will be the meaningful run.
Test plan
go build ./...cleango test ./core/taskengine/ -run 'TestStoreWallet_|TestLookupCanonicalWalletAddress|TestMarkWalletStale|TestMarkPreviousCanonicalStaleIfAny|TestListWallets_HardFiltersStaleRows|TestBackfillWalletSaltIndex'— 13 new unit tests pass, including a regression test that replays the exact 3-salt-0 collision from the Base reportTestExecuteTask_SequentialContractWrites_Sepoliaend-to-end against the live staging bundler via SSH tunnel — passes (USDC approval + USDC→WETH swap executed on-chain through the bundler)/ava backfill-wallet-salt-index --config config/aggregator-sepolia.yaml --dry-runcobra subcommand registers and runs end-to-end (errored out at the BadgerDB lock check, which is the expected operational guard)make testshows the same 5 pre-existing live-network failures unrelated to this PR (Base bundler tests;TestExecuteTask_SequentialContractWrites_Sepolianow passes with the staging tunnel up)aggregator-baseandaggregator-ethereumstart cleanlybackfill-wallet-salt-indexon Base + Ethereum