Skip to content

fix: context-memory status contract, warn badge, trigger/UserOp fixes#525

Merged
chrisli30 merged 8 commits into
mainfrom
staging
Apr 14, 2026
Merged

fix: context-memory status contract, warn badge, trigger/UserOp fixes#525
chrisli30 merged 8 commits into
mainfrom
staging

Conversation

@chrisli30
Copy link
Copy Markdown
Member

  • fix: raise UserOp receipt wait from 60s to 120s to handle Sepolia latency

Sepolia's bundler submission + mining often exceeds 60s under current
network conditions. The 60s cap caused the aggregator to return
'No transaction receipt received' to the SDK while the UserOp was
still in flight -- it would confirm ~30s later, but too late for the
client response. 120s aligns with the SDK's TimeoutPresets.SLOW.

  • docs: finalize client interpretation section after Studio SDK upgrade to 2.17.0

Rewrite the "How the Client Interprets Execution Status" section to
reflect the shipped SDK model. Replaces the pre-migration workaround
notes with the final Success/Failed/Error/Pending table, the current
Studio branch-skip detection (steps.length < executionNodeCount on
Success), and per-component mapping of where status logic lives.

  • docs: move API_PROTOCOL_COMPARISON to avs-infra

Cross-repo content (covers AVS aggregator, Studio proxy, and partner
API infra) belongs in avs-infra, not in the AVS repo.

  • feat: send status and executionError to context-memory /api/summarize

Context-memory now strictly validates that every /api/summarize request
carries the aggregator's execution verdict, rejecting calls that omit
status or executionError with HTTP 400. Without these fields the
downstream summarizer fell through to its "error" branch and emitted
"could not run" subjects even when every step succeeded, contradicting
the green success badge rendered on the same email.

Compute the verdict via vm.AnalyzeExecutionResult() in Summarize() before
buildRequest takes vm.mu (sync.Mutex is not reentrant), map the enum to
the lowercase strings "success" | "failed" | "error" expected by the API,
and include them on contextMemorySummarizeRequest without omitempty so the
keys are always present (executionError is "" on success).

  • test: remove Uniswap Base tests that depend on a funded mainnet wallet

TestRunNodeImmediately_UniswapSwap_Base and its variants, plus
TestExecuteTask_SequentialContractWrites_Base_{SelfFunded,Paymaster}, require
at least 0.01 USDC on a smart wallet on Base mainnet to run — the setup fails
with "wallet ... needs at least 0.01 USDC (10000 raw)" on any machine where
the wallet isn't pre-funded, producing unreliable local and CI runs.

Dropping the two Base test files (1840 lines combined) and pulling the four
Sepolia constants still referenced by the Sepolia approve+swap tests into a
small shared constants file.

  • feat: consume context-memory skippedNote/step counts and render yellow warn badge

Context-memory now sends four new body fields on every /api/summarize response
(skippedNote, executedSteps, totalSteps, skippedSteps) and no longer suffixes the
skip note onto body.summary. The aggregator was rendering a green "All steps
completed successfully" badge for success-with-skipped runs because it only
looked at status, and it was concatenating the skip note into the email title.

Changes:

  • Plumb the four new fields through contextMemorySummarizeBody into Summary,
    and populate them in both the context-memory path and the deterministic
    fallback (computeSkippedStepCount mirrors the branch-skip logic so the
    rendering layer always has the counts).

  • Drop the legacy "partial_success" / "failure" status strings. The status
    model is now the same "success" | "failed" | "error" triad context-memory
    emits, with the yellow "warn" variant derived from (Status=="success" &&
    SkippedSteps>0). Removed statusIconPartialSuccess and the unused
    StatusHtmlFailedTemplate / StatusHtmlSuccessTemplate constants.

  • Rework the status helpers (getStatusColor, getStatusDisplayText,
    getStatusEmoji, buildStatusHtml) to take a Summary so they can derive the
    warn variant. The email badge renders yellow with the skip note as its text;
    the Telegram formatter prepends ⚠️ and surfaces the skip note on its own
    italic line under the subject.

  • Update the vm_runner_rest.go deterministic override path to match: no
    in-title concatenation, yellow background when branch-skipped, skipped_note
    emitted as its own template variable.

Verified against two local simulations: the branch-skipped run renders yellow
with status_text="1 node was skipped by Branch condition." and a clean title;
the normal run renders green with "All steps completed successfully" and a
populated executions[] array.

  • fix: widen cron/fixed-time trigger slices to []interface{} for structpb

TaskTriggerToConfig stored cronTrigger.Config.Schedules ([]string) and
fixedTimeTrigger.Config.Epochs ([]int64) directly into the returned
map[string]interface{}. engine.go then passes that map to structpb.NewValue,
which only accepts []interface{} for list values and errors on any concrete
typed slice — so every Cron/FixedTime simulation logged

WARN engine.go:2956 Failed to convert trigger input to protobuf
    {"error": "proto: invalid type: []string"}

before falling back to a json.Marshal/Unmarshal roundtrip that happens to
coerce the values. Convert both arrays to []interface{} at the source so the
direct structpb path succeeds and the fallback becomes dead (still kept as a
safety net).

…ency

Sepolia's bundler submission + mining often exceeds 60s under current
network conditions. The 60s cap caused the aggregator to return
'No transaction receipt received' to the SDK while the UserOp was
still in flight -- it would confirm ~30s later, but too late for the
client response. 120s aligns with the SDK's TimeoutPresets.SLOW.
… to 2.17.0

Rewrite the "How the Client Interprets Execution Status" section to
reflect the shipped SDK model. Replaces the pre-migration workaround
notes with the final Success/Failed/Error/Pending table, the current
Studio branch-skip detection (steps.length < executionNodeCount on
Success), and per-component mapping of where status logic lives.
Cross-repo content (covers AVS aggregator, Studio proxy, and partner
API infra) belongs in avs-infra, not in the AVS repo.
Context-memory now strictly validates that every /api/summarize request
carries the aggregator's execution verdict, rejecting calls that omit
status or executionError with HTTP 400. Without these fields the
downstream summarizer fell through to its "error" branch and emitted
"could not run" subjects even when every step succeeded, contradicting
the green success badge rendered on the same email.

Compute the verdict via vm.AnalyzeExecutionResult() in Summarize() before
buildRequest takes vm.mu (sync.Mutex is not reentrant), map the enum to
the lowercase strings "success" | "failed" | "error" expected by the API,
and include them on contextMemorySummarizeRequest without omitempty so the
keys are always present (executionError is "" on success).
TestRunNodeImmediately_UniswapSwap_Base and its variants, plus
TestExecuteTask_SequentialContractWrites_Base_{SelfFunded,Paymaster}, require
at least 0.01 USDC on a smart wallet on Base mainnet to run — the setup fails
with "wallet ... needs at least 0.01 USDC (10000 raw)" on any machine where
the wallet isn't pre-funded, producing unreliable local and CI runs.

Dropping the two Base test files (1840 lines combined) and pulling the four
Sepolia constants still referenced by the Sepolia approve+swap tests into a
small shared constants file.
…w warn badge

Context-memory now sends four new body fields on every /api/summarize response
(skippedNote, executedSteps, totalSteps, skippedSteps) and no longer suffixes the
skip note onto body.summary. The aggregator was rendering a green "All steps
completed successfully" badge for success-with-skipped runs because it only
looked at status, and it was concatenating the skip note into the email title.

Changes:

- Plumb the four new fields through contextMemorySummarizeBody into Summary,
  and populate them in both the context-memory path and the deterministic
  fallback (computeSkippedStepCount mirrors the branch-skip logic so the
  rendering layer always has the counts).

- Drop the legacy "partial_success" / "failure" status strings. The status
  model is now the same "success" | "failed" | "error" triad context-memory
  emits, with the yellow "warn" variant derived from (Status=="success" &&
  SkippedSteps>0). Removed statusIconPartialSuccess and the unused
  StatusHtmlFailedTemplate / StatusHtmlSuccessTemplate constants.

- Rework the status helpers (getStatusColor, getStatusDisplayText,
  getStatusEmoji, buildStatusHtml) to take a Summary so they can derive the
  warn variant. The email badge renders yellow with the skip note as its text;
  the Telegram formatter prepends ⚠️ and surfaces the skip note on its own
  italic line under the subject.

- Update the vm_runner_rest.go deterministic override path to match: no
  in-title concatenation, yellow background when branch-skipped, skipped_note
  emitted as its own template variable.

Verified against two local simulations: the branch-skipped run renders yellow
with status_text="1 node was skipped by Branch condition." and a clean title;
the normal run renders green with "All steps completed successfully" and a
populated executions[] array.
TaskTriggerToConfig stored cronTrigger.Config.Schedules ([]string) and
fixedTimeTrigger.Config.Epochs ([]int64) directly into the returned
map[string]interface{}. engine.go then passes that map to structpb.NewValue,
which only accepts []interface{} for list values and errors on any concrete
typed slice — so every Cron/FixedTime simulation logged

    WARN engine.go:2956 Failed to convert trigger input to protobuf
        {"error": "proto: invalid type: []string"}

before falling back to a json.Marshal/Unmarshal roundtrip that happens to
coerce the values. Convert both arrays to []interface{} at the source so the
direct structpb path succeeds and the fallback becomes dead (still kept as a
safety net).
@claude
Copy link
Copy Markdown

claude Bot commented Apr 14, 2026

Code Review — PR #525

Overview

This PR bundles six distinct changes: a UserOp receipt timeout increase (60 s → 120 s), a structpb compatibility fix for typed slices, a status-model migration from partial_successwarn (derived from SkippedSteps > 0), plumbing of status/executionError to the context-memory API, a yellow warn-badge for branch-skip runs, and removal of ~1840 lines of Base-mainnet tests that require a funded wallet.

The motivation for each change is well-documented and the mutex sequencing insight (calling AnalyzeExecutionResult before buildRequest takes vm.mu) is correctly applied.


Issues

1. computeExecutionStatus docstring says "triad" but the function only returns two values

core/taskengine/summarizer_deterministic.go

// Returns "success" or "failed" — the same triad the aggregator emits, with
// the yellow "warn" state derived separately from Summary.SkippedSteps.
func computeExecutionStatus(vm *VM) string {

A triad has three members, but this function only ever returns "success" or "failed" — it cannot produce "error". This means that if the aggregator path emits "error" (e.g., VM crash, compilation failure), the deterministic fallback silently maps it to "failed" instead. Callers see different display text ("Execution failed" vs "System error") depending on which path ran. The comment should be corrected, and if the deterministic path is supposed to mirror the triad, computeExecutionStatus should have an "error" arm.


2. Duplicated skip-note generation logic

The same noun/verb pluralisation and sentence format appears in two places:

core/taskengine/summarizer_deterministic.go (ComposeSummary):

noun, verb := "node", "was"
if skippedStepCount != 1 {
    noun, verb = "nodes", "were"
}
skippedNote = fmt.Sprintf("%d %s %s skipped by Branch condition.", skippedStepCount, noun, verb)

core/taskengine/vm_runner_rest.go (deterministic override path):

noun := "node"
verb := "was"
if skippedCount != 1 {
    noun = "nodes"
    verb = "were"
}
skippedNote = fmt.Sprintf("%d %s %s skipped by Branch condition.", skippedCount, noun, verb)

These are identical. If the wording changes (e.g. "by a Branch condition" → "by a branch node"), both sites need updating and one will be missed. Extract a shared helper:

func buildSkippedNote(count int) string {
    noun, verb := "node", "was"
    if count != 1 {
        noun, verb = "nodes", "were"
    }
    return fmt.Sprintf("%d %s %s skipped by Branch condition.", count, noun, verb)
}

3. Shallow copy of Summary in the fallback path

core/taskengine/vm_runner_rest.go:

fallback := *summaryForClient
if failed {
    fallback.Status = "failed"
} else {
    fallback.Status = "success"
}
dynamicData["statusHtml"] = buildStatusHtml(fallback)

fallback := *summaryForClient is a shallow copy — slice fields (Executions, Errors, Transfers, Balances) share backing arrays with summaryForClient. buildStatusHtml currently only reads Status, SkippedSteps, and SkippedNote, so there's no bug today. But if buildStatusHtml is ever extended to iterate over the slice fields, modifying one copy will silently affect the other. Worth documenting with a comment or using a dedicated statusOnlyFallback struct.


4. computeSkippedStepCount reads vm.ExecutionLogs outside vm.mu

func computeSkippedStepCount(vm *VM) int {
    executed := make(map[string]struct{})
    for _, st := range vm.ExecutionLogs {  // ← no lock held
        name := st.GetName()
        executed[name] = struct{}{}
    }
    vm.mu.Lock()
    defer vm.mu.Unlock()
    // reads vm.TaskNodes under lock ...
}

vm.ExecutionLogs is read without holding vm.mu, which is also the pattern used by the pre-existing computeExecutionStatus. This is pre-existing behaviour and not introduced here, but it is a latent data race if writers hold vm.mu when appending to ExecutionLogs. Flagging for awareness — if this is intentional (e.g., ExecutionLogs is append-only before these functions are called), a comment explaining the invariant would prevent future mistakes.


5. mapExecutionStatusToAPIString default silently maps unknowns to "error"

core/taskengine/vm.go:

func mapExecutionStatusToAPIString(resultStatus ExecutionResultStatus) string {
    switch resultStatus {
    case ExecutionSuccess:
        return "success"
    case ExecutionFailed:
        return "failed"
    case ExecutionError:
        return "error"
    default:
        return "error"  // ← silent fallthrough
    }
}

If a new ExecutionResultStatus value is added and the mapping is forgotten, it silently becomes "error". A logger.Warn or at minimum a code comment would help catch this during development.


Minor / Nits

  • StatusHtmlFailedTemplate / StatusHtmlSuccessTemplate removal: These were exported constants (capitalized), so removing them is a breaking change for any external consumer. The PR says they were unused, but a brief grep in downstream repos (avs-infra, ava-sdk-js) is worth doing before merge.

  • node_types.go slice widening: The []string → []interface{} and []int64 → []interface{} conversions are correct and the fix is well-motivated. With Go 1.22 generics already in use elsewhere, a shared toInterfaceSlice[T any] helper would eliminate the visual duplication, but the current inline loops are perfectly readable for two call sites.

  • UserOp timeout (pkg/erc4337/preset/builder.go): Raising from 60 s to 120 s is a reasonable fix. Consider making this configurable via the aggregator config (or at least logging a warning when the wait exceeds the old threshold) so future latency regressions are visible in logs before they become user-facing errors.

  • Integration test updates (summarizer_context_memory_integration_test.go): The test now validates "failed" instead of "failure" — correctly aligned with the new contract. The SummarizeRequest additions of Status and ExecutionError in all test cases are complete.

  • Base test deletion: Removing ~1840 lines of tests that need a funded wallet is the right call. The Sepolia constant extraction into uniswap_sepolia_constants_test.go is clean.


Summary

Area Assessment
Mutex sequencing (AnalyzeExecutionResult before buildRequest) Correct — well-documented
Status model migration (partial_successwarn derived from counts) Correct, clean
structpb slice widening Correct fix
Duplicated skip-note logic Should extract helper
Deterministic "error" state gap Worth addressing
Shallow copy in fallback Low risk now, worth documenting
ExecutionLogs lock hygiene Pre-existing; document the invariant
Test coverage Good — unit tests cover warn/success/failed/error paths

No blocking issues, but items 1–2 above are worth a follow-up before this ships to production.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR aligns the aggregator’s execution-status model and notification rendering with the updated context-memory + Studio SDK contract, while also addressing Sepolia UserOp latency and fixing trigger config protobuf conversion.

Changes:

  • Increase ERC-4337 UserOp receipt wait timeout from 60s to 120s to reduce false “no receipt” failures on Sepolia.
  • Update context-memory summarization integration to send/consume the new status + skipped-step fields and render a warn (yellow) badge for success-with-skips.
  • Fix cron/fixed-time trigger config conversion for structpb.NewValue, and remove unreliable Base mainnet integration tests (extract shared Sepolia constants).

Reviewed changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
pkg/erc4337/preset/builder.go Raises max UserOp confirmation wait to tolerate Sepolia bundler/mining delays.
docs/changes/0001-execution-status-redesign.md Updates docs to the final shipped client status interpretation model (SDK 2.17.0).
docs/API_PROTOCOL_COMPARISON.md Removes this doc from the repo (moved to avs-infra per PR description).
core/taskengine/vm_runner_rest.go Updates email status badge rendering to support warn-on-skips and new status strings.
core/taskengine/vm.go Adds mapping helper from execution verdict enum to context-memory status strings; updates comments.
core/taskengine/uniswap_sepolia_constants_test.go Adds shared Sepolia Uniswap constants for remaining tests.
core/taskengine/summarizer_format_test.go Updates formatter tests for new status contract and warn derivation from skipped steps.
core/taskengine/summarizer_format_telegram.go Derives Telegram emoji from Summary (warn on skipped steps) and renders skippedNote.
core/taskengine/summarizer_format_integration_test.go Updates integration expectations to the new status triad (warn derived via skipped steps).
core/taskengine/summarizer_format_email.go Updates SendGrid dynamic data helpers to use Summary-derived warn + step counts.
core/taskengine/summarizer_deterministic.go Adds skipped-step counting + note into deterministic Summary struct fields.
core/taskengine/summarizer_context_memory_test.go Updates request builder tests for new required request fields.
core/taskengine/summarizer_context_memory_integration_test.go Updates integration request/response models and VM→request building for new contract.
core/taskengine/summarizer_context_memory.go Sends status + executionError to context-memory; plumbs skipped fields from response into Summary.
core/taskengine/run_node_uniswap_swap_test.go Removes Base-dependent funded-wallet tests (unreliable locally/CI).
core/taskengine/node_types.go Fixes cron/fixed-time trigger config list typing for structpb conversion.
core/taskengine/manual_trigger_loop_test.go Updates comment to reflect new “failed” execution-level status wording.
core/taskengine/execute_sequential_contract_writes_base_test.go Removes Base-dependent funded-wallet tests (unreliable locally/CI).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +160 to +168
// Compute execution verdict BEFORE buildRequest acquires vm.mu — AnalyzeExecutionResult
// takes the same lock and sync.Mutex is not reentrant.
executionError, _, resultStatus := vm.AnalyzeExecutionResult()
status := mapExecutionStatusToAPIString(resultStatus)
Comment on lines +165 to +171
// Compute execution verdict BEFORE buildRequest acquires vm.mu — AnalyzeExecutionResult
// takes the same lock and sync.Mutex is not reentrant.
executionError, _, resultStatus := vm.AnalyzeExecutionResult()
status := mapExecutionStatusToAPIString(resultStatus)

// Build request from VM
req, err := c.buildRequest(vm, currentStepName)
req, err := c.buildRequest(vm, currentStepName, status, executionError)
Comment on lines +1401 to +1427
skippedStepCount := computeSkippedStepCount(vm)
executedStepCount := len(vm.ExecutionLogs)
skippedNote := ""
if status == "success" && skippedStepCount > 0 {
noun, verb := "node", "was"
if skippedStepCount != 1 {
noun, verb = "nodes", "were"
}
skippedNote = fmt.Sprintf("%d %s %s skipped by Branch condition.", skippedStepCount, noun, verb)
}

return Summary{
Subject: subject,
Body: body,
SummaryLine: summaryLine,
Status: status,
Trigger: trigger,
TriggeredAt: triggeredAt,
Executions: executions,
Errors: errors,
SmartWallet: smartWallet,
Network: resolveChainName(vm),
Annotation: annotation,
// AnalysisHtml is now generated by buildAnalysisHtmlFromStructured() in SendGridDynamicData()
Subject: subject,
Body: body,
SummaryLine: summaryLine,
Status: status,
Trigger: trigger,
TriggeredAt: triggeredAt,
Executions: executions,
Errors: errors,
SmartWallet: smartWallet,
Network: resolveChainName(vm),
Annotation: annotation,
SkippedNote: skippedNote,
ExecutedSteps: executedStepCount,
TotalSteps: executedStepCount + skippedStepCount,
SkippedSteps: skippedStepCount,
…ncy, DRY skip note

Four findings from Copilot and Claude on #525:

- ContextMemorySummarizer.Summarize used AnalyzeExecutionResult() even when
  vm.ExecutionLogs was empty. That case, which occurs during single-node
  RunNodeImmediately runs where the notification node is the only step and is
  still executing, returned ExecutionFailed / "no execution steps found", so
  the aggregator was sending status=failed + a bogus executionError to
  context-memory. Mirror the deterministic path and treat empty logs as
  success so the verdict matches (nothing has failed yet).

- ComposeSummary populated Summary.ExecutedSteps/TotalSteps from
  len(vm.ExecutionLogs) and a separate computeSkippedStepCount helper. That
  could overcount versus SummaryLine/subject, which exclude notification
  nodes via getTotalWorkflowSteps. Use the already-computed executedSteps /
  skippedCount / totalWorkflowSteps locals so the step fields match the
  rendered text, and delete the now-unused helper.

- The skip-sentence pluralisation ("1 node was ... / N nodes were ...") lived
  in both summarizer_deterministic.go and vm_runner_rest.go. Extract
  buildSkippedNote(count) once so the wording can't drift.

- computeExecutionStatus's docstring said "the same triad the aggregator
  emits" while only returning two values. Explain that "error" is reserved
  for VM-level failures that prevent a Summary from being built, so the
  deterministic fallback never emits it.
@chrisli30 chrisli30 changed the title feat: context-memory status contract, warn badge, trigger/UserOp fixes fix: context-memory status contract, warn badge, trigger/UserOp fixes Apr 14, 2026
@chrisli30 chrisli30 merged commit d8a8cc2 into main Apr 14, 2026
18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants