feat: Tenderly state overrides for simulation balance propagation#517
Conversation
When simulating workflows triggered by Transfer events, the event trigger produces mock transfer data but subsequent contract write simulations run against real on-chain state where the wallet has no balance. This causes simulations to revert with "ERC20: transfer amount exceeds balance". Add SimulationStateMap that accumulates state overrides across workflow simulation steps using Tenderly state_objects API: - Probes ERC20 balance storage slots via eth_getStorageAt against balanceOf - Injects current_balance + simulated_delta as storage overrides - Merges raw_state_diff from each simulation step so subsequent steps see a consistent view of on-chain state - Supports both ERC20 token and native ETH balance overrides
Documents two production issues observed on 2026-04-10: 1. Context-memory API returning wrong token metadata (DAI instead of USDC) 2. Concurrent workflows racing on same smart wallet balance
There was a problem hiding this comment.
Pull request overview
This PR adds simulation-time on-chain state override propagation (via Tenderly state_objects + raw_state_diff) so multi-step workflow simulations observe consistent token/ETH balances across steps, addressing balance-related reverts during simulated contract writes. It also adds a production incident report covering dual-workflow balance races and token metadata mislabeling.
Changes:
- Introduces
SimulationStateMapto accumulate storage + ETH balance overrides and merge Tenderlyraw_state_diffbetween simulation steps. - Plumbs the accumulated overrides through Tenderly contract-write simulations and merges diffs after each step.
- Injects trigger-implied balance changes (Transfer events) into the simulation accumulator before simulating the workflow.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| issue-2026-04-10-dual-workflow-collision-and-dai-mislabel.md | Adds incident report documenting two production issues and action items |
| core/taskengine/vm.go | Adds VM-level simulationState accumulator initialization in simulation mode |
| core/taskengine/vm_runner_contract_write.go | Passes accumulated state overrides into Tenderly simulation and merges raw_state_diff after successful simulations |
| core/taskengine/tenderly_client.go | Extends Tenderly simulation request to include merged state_objects and extracts raw_state_diff from responses |
| core/taskengine/tenderly_client_test.go | Updates test callsite for new Tenderly simulation signature |
| core/taskengine/simulation_state.go | Implements the state accumulator, ERC20 balance-slot probing, and ETH/ERC20 balance injection helpers |
| core/taskengine/simulation_state_test.go | Adds unit/integration tests for accumulator behavior and balance override flow |
| core/taskengine/engine.go | Injects trigger-implied balance changes into the accumulator before workflow simulation |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| "slot", candidateSlot, | ||
| "balance", expectedBalance.String()) | ||
| } | ||
| s.cacheSlotResult(tokenKey, &candidateSlot) |
There was a problem hiding this comment.
candidateSlot is the range-loop variable, so caching &candidateSlot stores a pointer to the loop variable that gets reused across iterations. This can cause the cached slot value to end up as the last iterated slot rather than the discovered one. Create a new local variable (e.g., slot := candidateSlot) and cache &slot, or cache the int64 value directly instead of a pointer.
| s.cacheSlotResult(tokenKey, &candidateSlot) | |
| slot := candidateSlot | |
| s.cacheSlotResult(tokenKey, &slot) |
| // Determine direction: if the wallet is the recipient, balance increases; otherwise decreases | ||
| direction, _ := data["direction"].(string) | ||
| if direction == "outgoing" { |
There was a problem hiding this comment.
The event enrichment code populates direction as "sent"/"received" (see determineDirectionFromQueries in shared_event_enrichment.go), but this logic checks for "outgoing". As a result, outgoing transfers won’t be negated and the simulated balance override will be applied in the wrong direction. Align the expected values (e.g., treat "sent" as outgoing) or derive direction from from/to + walletAddress.
| // Determine direction: if the wallet is the recipient, balance increases; otherwise decreases | |
| direction, _ := data["direction"].(string) | |
| if direction == "outgoing" { | |
| // Determine direction: if the wallet is the recipient, balance increases; otherwise decreases. | |
| // Event enrichment may populate direction as "sent"/"received", while older callers may | |
| // still use "outgoing"/"incoming". If direction is missing or unrecognized, derive it | |
| // from from/to relative to walletAddress. | |
| direction, _ := data["direction"].(string) | |
| normalizedDirection := strings.ToLower(strings.TrimSpace(direction)) | |
| isOutgoing := false | |
| switch normalizedDirection { | |
| case "outgoing", "sent": | |
| isOutgoing = true | |
| case "incoming", "received": | |
| isOutgoing = false | |
| default: | |
| fromAddr, _ := data["from"].(string) | |
| toAddr, _ := data["to"].(string) | |
| normalizedWalletAddr := strings.ToLower(strings.TrimSpace(walletAddr)) | |
| normalizedFromAddr := strings.ToLower(strings.TrimSpace(fromAddr)) | |
| normalizedToAddr := strings.ToLower(strings.TrimSpace(toAddr)) | |
| if normalizedWalletAddr != "" { | |
| if normalizedFromAddr == normalizedWalletAddr && normalizedToAddr != normalizedWalletAddr { | |
| isOutgoing = true | |
| } else if normalizedToAddr == normalizedWalletAddr { | |
| isOutgoing = false | |
| } | |
| } | |
| } | |
| if isOutgoing { |
| func (v *VM) SetSimulation(isSimulation bool) *VM { | ||
| v.IsSimulation = isSimulation | ||
| if isSimulation { | ||
| v.simulationState = NewSimulationStateMap(v.logger) |
There was a problem hiding this comment.
SetSimulation(false) leaves simulationState populated from prior simulation runs, which can leak overrides if the same VM instance is reused. Consider clearing v.simulationState when disabling simulation, and/or ensuring any code path that sets IsSimulation true also initializes simulationState (to avoid having IsSimulation=true with a nil accumulator).
| v.simulationState = NewSimulationStateMap(v.logger) | |
| v.simulationState = NewSimulationStateMap(v.logger) | |
| } else { | |
| v.simulationState = nil |
| // Set the balance at all common slots to cover Sepolia USDC's layout | ||
| for _, candidateSlot := range commonBalanceSlots { | ||
| slotHash := erc20BalanceSlot(holderAddress, candidateSlot) | ||
| valueHex := "0x" + common.Bytes2Hex(common.LeftPadBytes(simulatedBalance.Bytes(), 32)) | ||
| vm.simulationState.SetStorageSlot(tokenAddress.Hex(), slotHash.Hex(), valueHex) | ||
| } |
There was a problem hiding this comment.
This test sets storage overrides on vm.simulationState, but RunNodeWithInputs creates a temporary VM for execution and (currently) does not propagate simulationState into that VM. That means these overrides may never be applied during the simulated contract write, so the test can pass without actually validating balance override behavior. Consider executing through the same VM instance that holds the accumulator, or otherwise ensuring the node execution path uses the same SimulationStateMap.
| | Time (UTC) | Event | | ||
| |---|---| | ||
| | 06:30:11 | Task `01knv1axw6d9fz7j23kx59vrpz` created (workflow 1) | |
There was a problem hiding this comment.
The timeline table uses double leading pipes (|| Time (UTC) | Event |), which Markdown renders as an extra empty column. Use a single leading pipe in the header/separator rows so the table renders as intended.
- Fix loop variable pointer reuse in ProbeERC20BalanceSlot cache - Handle both sent/received and outgoing/incoming direction values - Clear simulationState when SetSimulation(false) is called
Loop steps with contractWrite runners using template variable
contractAddress (e.g., {{value.tokenAddress}}) were not getting
per-step token metadata in the context-memory API request. This caused
the API to fall back to an arbitrary first token from the metadata map,
resulting in wrong token names and decimal formatting in notifications
(e.g., "0.000000000002 DAI" instead of "2 USDC").
Three changes:
- buildRequest now checks runner.config for contractAddress/methodName
when top-level config fields are empty (loop step fallback)
- extractResultData for contractWrite now propagates receipt.to as
metadata.contractAddress into loop iteration output
- buildRequest extracts resolved contract address from loop output
iteration metadata when step-level metadata is nil
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 11 out of 11 changed files in this pull request and generated 5 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // Count successful vs failed iterations to determine step status. | ||
| // Per-iteration runner errors (e.g. a contract call reverting) are reflected | ||
| // as nil entries in the results array. The loop ran to completion, so we | ||
| // always preserve OutputData for the client to inspect partial results. | ||
| // See AvaProtocol/EigenLayer-AVS#511. | ||
| iterationFailCount := 0 | ||
| for _, result := range results { | ||
| if result == nil { | ||
| iterationFailCount++ | ||
| } | ||
| } | ||
|
|
||
| if iterationFailCount > 0 && firstError != nil { | ||
| // Some or all iterations failed — mark the loop step as failed so | ||
| // AnalyzeExecutionResult can detect partial_success at the execution level. | ||
| // Pass the error via the `err` parameter (not `errorMessage`) so that | ||
| // finalizeStep uses err.Error() directly without wrapping it in | ||
| // NewInvalidRequestError which adds an "invalid request: " prefix. | ||
| innerMsg := strings.TrimPrefix(firstError.Error(), "invalid request: ") | ||
| errorMsg := fmt.Sprintf("%d of %d iterations failed: %s", iterationFailCount, len(results), innerMsg) | ||
| loopErr := NewStructuredError( | ||
| avsproto.ErrorCode_INVALID_REQUEST, | ||
| errorMsg, | ||
| map[string]interface{}{ | ||
| "failed_iterations": iterationFailCount, | ||
| "total_iterations": len(results), |
There was a problem hiding this comment.
Loop iteration failure detection counts nil entries in results as failed iterations. However results[i] can legitimately be nil even when an iteration succeeds (e.g., a step with Success=true but no output data). This can incorrectly mark loop steps as failed/partial-success. Track iteration failure explicitly from ExecutionResult.Error and/or result.Step.Success rather than inferring from result.Data == nil.
| // Count successful vs failed iterations to determine step status. | |
| // Per-iteration runner errors (e.g. a contract call reverting) are reflected | |
| // as nil entries in the results array. The loop ran to completion, so we | |
| // always preserve OutputData for the client to inspect partial results. | |
| // See AvaProtocol/EigenLayer-AVS#511. | |
| iterationFailCount := 0 | |
| for _, result := range results { | |
| if result == nil { | |
| iterationFailCount++ | |
| } | |
| } | |
| if iterationFailCount > 0 && firstError != nil { | |
| // Some or all iterations failed — mark the loop step as failed so | |
| // AnalyzeExecutionResult can detect partial_success at the execution level. | |
| // Pass the error via the `err` parameter (not `errorMessage`) so that | |
| // finalizeStep uses err.Error() directly without wrapping it in | |
| // NewInvalidRequestError which adds an "invalid request: " prefix. | |
| innerMsg := strings.TrimPrefix(firstError.Error(), "invalid request: ") | |
| errorMsg := fmt.Sprintf("%d of %d iterations failed: %s", iterationFailCount, len(results), innerMsg) | |
| loopErr := NewStructuredError( | |
| avsproto.ErrorCode_INVALID_REQUEST, | |
| errorMsg, | |
| map[string]interface{}{ | |
| "failed_iterations": iterationFailCount, | |
| "total_iterations": len(results), | |
| // Determine loop step status from explicit execution status, not from the | |
| // shape of the output payload. A nil entry in results can be a legitimate | |
| // successful iteration with no output data, so do not treat nil results as | |
| // iteration failures. | |
| if !success && firstError != nil { | |
| // Some or all iterations failed — mark the loop step as failed so | |
| // AnalyzeExecutionResult can detect partial_success at the execution level. | |
| // Pass the error via the `err` parameter (not `errorMessage`) so that | |
| // finalizeStep uses err.Error() directly without wrapping it in | |
| // NewInvalidRequestError which adds an "invalid request: " prefix. | |
| innerMsg := strings.TrimPrefix(firstError.Error(), "invalid request: ") | |
| errorMsg := fmt.Sprintf("one or more of %d iterations failed: %s", len(results), innerMsg) | |
| loopErr := NewStructuredError( | |
| avsproto.ErrorCode_INVALID_REQUEST, | |
| errorMsg, | |
| map[string]interface{}{ | |
| "total_iterations": len(results), |
| // Balance is 0 or no slot matched. Try to find a reference | ||
| // holder with non-zero balance by checking well-known addresses that often | ||
| // receive tokens (e.g. address(1) from test mints, the contract itself). | ||
| if expectedBalance.Sign() == 0 { | ||
| if s.logger != nil { |
There was a problem hiding this comment.
Comment/code mismatch: this block claims it handles "Balance is 0 or no slot matched" by probing reference holders, but the reference-holder probe only runs when expectedBalance.Sign() == 0. Either adjust the comment or extend the logic so the reference-holder probe also runs when no candidate slot matched for a non-zero balance (if that’s intended).
| // First, test WITHOUT state override — should fail with "exceeds balance" | ||
| t.Run("without_state_override_fails", func(t *testing.T) { | ||
| vm, err := NewVMWithData(nil, nil, smartWalletConfig, nil) | ||
| require.NoError(t, err) | ||
| vm.SetSimulation(true) | ||
| vm.tenderlyClient = tenderlyClient | ||
|
|
||
| inputVariables := map[string]interface{}{ | ||
| "settings": map[string]interface{}{ | ||
| "runner": runner, | ||
| "chain_id": int64(11155111), | ||
| "chain": "sepolia", | ||
| }, | ||
| } | ||
|
|
||
| nodeConfig := map[string]interface{}{ | ||
| "contractAddress": usdcContract, | ||
| "contractAbi": usdcABI, | ||
| "methodCalls": []interface{}{ | ||
| map[string]interface{}{ | ||
| "methodName": "transfer", | ||
| "methodParams": []interface{}{recipient, transferAmount}, | ||
| }, | ||
| }, | ||
| } | ||
|
|
||
| node, err := CreateNodeFromType("contractWrite", nodeConfig, "") | ||
| require.NoError(t, err) | ||
| node.Name = "contractWrite1" | ||
|
|
||
| step, _ := vm.RunNodeWithInputs(node, inputVariables) | ||
| require.NotNil(t, step, "should return a step even on failure") | ||
|
|
||
| // On mainnet/Base USDC, this would fail with "transfer amount exceeds balance". | ||
| // On Sepolia test USDC, the behavior may differ (some test tokens don't enforce balances). | ||
| // We just verify the step completed (non-nil) — the important assertion is | ||
| // in the "with_state_override_succeeds" subtest. | ||
| t.Logf("without override: success=%v error=%q", step.Success, step.Error) | ||
| }) |
There was a problem hiding this comment.
This subtest name says "without_state_override_fails" but it does not assert failure (it only logs). That makes the test pass even if the behavior regresses. If Sepolia USDC is non-strict, consider asserting on a token/contract where the revert is deterministic, or at least assert that the error contains the expected revert when credentials are available.
| // Verify state diffs were captured | ||
| assert.False(t, vm.simulationState.IsEmpty(), | ||
| "simulation state should have accumulated overrides") |
There was a problem hiding this comment.
The "Verify state diffs were captured" assertion is currently satisfied even if no raw_state_diff is merged, because the test pre-populates vm.simulationState with storage overrides before running the simulation. Consider asserting that MergeRawStateDiff actually added/updated entries (e.g., compare counts before/after, or check for a known changed slot) to validate state-diff propagation.
| // Build state_objects: merge accumulated simulation state overrides with | ||
| // the mandatory sender ETH balance override so gas checks pass. | ||
| var stateObjects map[string]interface{} | ||
| if simulationState != nil && !simulationState.IsEmpty() { | ||
| stateObjects = simulationState.BuildStateObjects(fromAddress, balanceOverride) | ||
| } else { | ||
| stateObjects = map[string]interface{}{ | ||
| strings.ToLower(fromAddress): map[string]interface{}{ | ||
| "balance": balanceOverride, | ||
| }, | ||
| }, | ||
| } | ||
| } |
There was a problem hiding this comment.
The added state_objects merging introduces storage overrides that can trigger Tenderly invalid_state_storage. The existing retry path for that slug only removes the override entry for the target contractAddress, which won’t help if the invalid override is on some other address (e.g., an ERC20 token storage override while simulating a different contract). Consider retrying with all storage overrides stripped (or removing any address entries containing storage), while keeping necessary ETH balance overrides.
…torage retry - Fix stale comment on zero-balance probe fallback - Strip all storage overrides (not just target contract) on invalid_state_storage retry, keeping ETH balance overrides for gas
Summary
Fixes the "ERC20: transfer amount exceeds balance" error during workflow simulations where an event trigger produces mock transfer data but subsequent contract write simulations run against real on-chain state.
Test plan