refactor(agent-service): redesign sync-execution result and error model#6009
refactor(agent-service): redesign sync-execution result and error model#6009bobbai00 wants to merge 15 commits into
Conversation
Automated Reviewer SuggestionsBased on the
|
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #6009 +/- ##
============================================
+ Coverage 59.11% 60.61% +1.50%
- Complexity 3201 3232 +31
============================================
Files 1132 1133 +1
Lines 43681 43494 -187
Branches 4734 4717 -17
============================================
+ Hits 25821 26365 +544
+ Misses 16430 15682 -748
- Partials 1430 1447 +17
*This pull request uses carry forward flags. Click here to find out more. ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
|
| config | throughput | MB/s | latency | max Δ latest / 7d | |
|---|---|---|---|---|---|
| 🔴 | bs=10 sw=10 sl=64 | 414 | 0.253 | 23,639/33,453/33,453 us | 🔴 +14.9% / 🔴 +120.9% |
| 🔴 | bs=100 sw=10 sl=64 | 923 | 0.563 | 107,304/138,820/138,820 us | 🔴 +18.5% / 🔴 +28.6% |
| ⚪ | bs=1000 sw=10 sl=64 | 1,117 | 0.682 | 892,104/941,056/941,056 us | ⚪ within ±5% / 🟢 -11.2% |
Baseline details
Latest main 6de8a48 from same runner
| config | metric | PR | latest main | 7d avg | Δ latest | Δ 7d |
|---|---|---|---|---|---|---|
| bs=10 sw=10 sl=64 | throughput | 414 tuples/sec | 470 tuples/sec | 776.36 tuples/sec | -11.9% | -46.7% |
| bs=10 sw=10 sl=64 | MB/s | 0.253 MB/s | 0.287 MB/s | 0.474 MB/s | -11.8% | -46.6% |
| bs=10 sw=10 sl=64 | p50 | 23,639 us | 21,484 us | 12,636 us | +10.0% | +87.1% |
| bs=10 sw=10 sl=64 | p95 | 33,453 us | 29,105 us | 15,143 us | +14.9% | +120.9% |
| bs=10 sw=10 sl=64 | p99 | 33,453 us | 29,105 us | 18,954 us | +14.9% | +76.5% |
| bs=100 sw=10 sl=64 | throughput | 923 tuples/sec | 967 tuples/sec | 985.33 tuples/sec | -4.6% | -6.3% |
| bs=100 sw=10 sl=64 | MB/s | 0.563 MB/s | 0.59 MB/s | 0.601 MB/s | -4.6% | -6.4% |
| bs=100 sw=10 sl=64 | p50 | 107,304 us | 103,010 us | 101,671 us | +4.2% | +5.5% |
| bs=100 sw=10 sl=64 | p95 | 138,820 us | 117,171 us | 107,939 us | +18.5% | +28.6% |
| bs=100 sw=10 sl=64 | p99 | 138,820 us | 117,171 us | 113,798 us | +18.5% | +22.0% |
| bs=1000 sw=10 sl=64 | throughput | 1,117 tuples/sec | 1,100 tuples/sec | 1,016 tuples/sec | +1.5% | +9.9% |
| bs=1000 sw=10 sl=64 | MB/s | 0.682 MB/s | 0.671 MB/s | 0.62 MB/s | +1.6% | +9.9% |
| bs=1000 sw=10 sl=64 | p50 | 892,104 us | 916,304 us | 989,693 us | -2.6% | -9.9% |
| bs=1000 sw=10 sl=64 | p95 | 941,056 us | 980,009 us | 1,028,327 us | -4.0% | -8.5% |
| bs=1000 sw=10 sl=64 | p99 | 941,056 us | 980,009 us | 1,059,969 us | -4.0% | -11.2% |
Raw CSV
config_idx,batch_size,schema_width,string_len,num_batches,total_ms,total_tuples,total_bytes,tuples_per_sec,mb_per_sec,lat_p50_us,lat_p95_us,lat_p99_us
0,10,10,64,20,482.81,200,128000,414,0.253,23639.13,33453.15,33453.15
1,100,10,64,20,2167.76,2000,1280000,923,0.563,107303.81,138820.05,138820.05
2,1000,10,64,20,17904.21,20000,12800000,1117,0.682,892104.14,941056.45,941056.4589eb9e9 to
84022c9
Compare
43166b9 to
66d9785
Compare
### What changes were proposed in this PR? Restructures the per-operator summary the sync-execution backend returns and the agent-service / frontend consume, for a leaner, consistent wire contract. This is a focused re-do of apache#5927 cut directly from `main` (no foundation stack): it changes only the execution result/error model and its consumers. - Replace the flat `OperatorInfo` with `OperatorExecutionSummary` (orthogonal sub-summaries: `state`, `errorMessages`, `resultSummary?`, `consoleLogsSummary?`); rename `SyncExecutionResult` → `WorkflowExecutionSummary`. - `resultSummary.sampleTuples` is now `SampleRow[]` (`{ rowIndex, tuple }`) instead of JSON rows with an embedded `__row_index__`; drop the table-shape types (the agent derives input-port shapes from the DAG). - Move `WorkflowFatalError` into `types/execution.ts` and reuse it for per-operator errors — the same type the workflow-compiling service returns for compilation errors, so compile and execution errors share one wire shape; `api/compile-api.ts` re-exports it so its existing importers are unchanged. - `errorMessages` / `errors` are non-optional (empty = none); drop `compilationErrors`; collapse the console-message types and derive warnings from `WARNING:`-titled messages. - Operator results are still pulled on demand via `GET /agents/:id/operator-results` (transport unchanged); that REST payload now carries the canonical `OperatorExecutionSummary`, and the frontend maps it to its flat display type (re-flattening `sampleTuples` so the display components are unchanged). Touches the Scala producer (`SyncExecutionResource`), the agent-service consumers (`result-formatting`, `workflow-execution-tools`, `workflow-result-state`, `server`), and the frontend mapping. Representation/type-level; behavior preserved (input-port shape lines are now derived rather than explicitly rendered). ### Any related issues, documentation, discussions? Closes apache#5750 Part of apache#5747. Supersedes apache#5927. ### How was this PR tested? - agent-service: `tsc --noEmit` clean, `bun test` 110/110 pass, `prettier --check` clean. - The Scala producer (`SyncExecutionResource`) is unchanged from apache#5927, which verified it via `sbt WorkflowExecutionService/compile` and a full-stack end-to-end run (a Claude Haiku 4.5 agent built and executed a CSV workflow; `/operator-results` returned the new shape — `resultSummary.sampleTuples: [{ rowIndex, tuple }]`, `errorMessages: []`). ### Was this PR authored or co-authored using generative AI tooling? Generated-by: Claude Opus 4.8 (1M context)
Add unit tests for the redesigned sync-execution result/error model to bring patch coverage to 100%: - workflow-execution-tools: drive executeOperatorAndFormat and createExecuteOperatorTool across pre-flight guards, successful runs (shape/warnings/gaps/cell types/truncation), execution failures (FAILED/KILLED/ERROR, per-operator and general errors), abort propagation, and callback failures. - texera-agent: exercise getFormattedResultsForDAG over the visible result branch. - workflow-result-state: cover getOperatorInfo. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Wrap long tuple/list expressions to the 100-column limit and drop a stray blank line so scalafmtCheckAll passes. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- agent-interaction: unit-test the visualization html caching, column and row derivation (ellipsis on index gaps) via direct construction with stubbed services. - result-table-frame: cover setupResultTable populating currentResult, columns, and totalNumTuples for non-empty data. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add SyncExecutionResourceSpec covering the unit-testable changed lines: the new result/error summary case classes and handleExecutionError (all compilation-error branches plus the unknown-error fallback), via PrivateMethodTester so no production visibility changes are needed. The remaining changed lines (executeWorkflowSync orchestration, collectOperatorInfos, and the collectOperatorResult truncation loop) drive a live Pekko execution + DB/Iceberg-backed result documents and are exercised by the integration suite, which is out of unit-test scope. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…rs for testing Extract two behavior-preserving pure helpers so the result sampling and per-operator summary logic can be unit-tested without a live engine: - sampleAndTruncateTuples(tupleIterator, totalCount, ...) — the symmetric result truncation / sampling previously inlined in collectOperatorResult. - buildOperatorExecutionSummary(...) — the per-operator summary + console error extraction previously inlined in collectOperatorInfos. Both are pure code moves (identical expressions); call sites delegate to them. Expand SyncExecutionResourceSpec to cover every branch of both (empty/visualization/front-only/oversized/sliding-window truncation, and result/console-error/no-result summaries). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ting Factor the final WorkflowExecutionSummary assembly (fatal-error formatting, operator-console-error detection, state string, and success determination) out of executeWorkflowSync into a pure assembleExecutionSummary helper. Behavior-preserving code move; the live observable-wait / termination handling stays inline. Add unit tests covering the state/success derivation across terminal, console-error, target-results-override, operator-error, and fatal-error cases (SyncExecutionResourceSpec now 29 tests). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… partials Add two unit cases to close branch gaps in the extracted helpers: - sampleAndTruncateTuples: empty iterator with a positive count (the `!tupleIterator.hasNext` half of the guard). - buildOperatorExecutionSummary: a console ERROR whose message is non-empty but shorter than the title (the length-comparison false branch), so the title is kept. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Shrink the sampling helpers' return to the three consumed values, extract the duplicated back-window loop into collectBackWindow, and drop the constructor-echo tests in favor of the behavioral suites. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
getOperatorErrorText pins the one derivation of "operator failed" the new errorMessages type forces (a fatal error carrying message text), replacing the divergent .length and joined-truthiness checks. In the failure branch, surface per-operator errors for any terminal state (success=false can come with state "Completed"), label CompilationFailed results distinctly, and preserve the Killed state on timeout instead of collapsing to Failed. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…tion in tests Import the existing WorkflowFatalError from workflow-websocket.interface instead of redeclaring it in agent.service.ts, and rewrite the agent-interaction spec onto a TestBed fixture with detectChanges() so the changed template is actually rendered and covered. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
8e8a9eb to
392f4de
Compare
What changes were proposed in this PR?
Restructures the per-operator summary the sync-execution backend returns and the agent-service / frontend consume, for a leaner, consistent wire contract. This is a focused re-do of #5927 cut directly from
main(no foundation stack). Scope: only the type definitions of the execution result/error model — the wire contract across the Scala producer, the agent-service, and the frontend — plus the minimal consumer changes the new types force; no transport or formatting-logic changes.OperatorInfowithOperatorExecutionSummary(orthogonal sub-summaries:state,errorMessages,resultSummary?,consoleLogsSummary?); renameSyncExecutionResult→WorkflowExecutionSummary.resultSummary.sampleTuplesis nowSampleRow[]({ rowIndex, tuple }) instead of JSON rows with an embedded__row_index__; drop the table-shape types (the agent derives input-port shapes from the DAG).WorkflowFatalErrorintotypes/execution.tsand reuse it for per-operator errors — the same type the workflow-compiling service returns for compilation errors, so compile and execution errors share one wire shape;api/compile-api.tsre-exports it so its existing importers are unchanged, and the frontend imports its existing declaration instead of adding a copy.errorMessages/errorsare non-optional (empty = none); dropcompilationErrors; collapse the console-message types and derive warnings fromWARNING:-titled messages.GET /agents/:id/operator-results(transport unchanged); that REST payload now carries the canonicalOperatorExecutionSummary, and the frontend maps it to its flat display type (re-flatteningsampleTuplesso the display components are unchanged).Touches the Scala producer (
SyncExecutionResource, split into pure helpers with a new unit spec), the agent-service consumers (result-formatting,workflow-execution-tools,workflow-result-state,server), and the frontend mapping. Representation/type-level; the only behavior deltas are in the failure paths the new error type forces: one shared "operator failed" derivation (getOperatorErrorText), per-operator errors surfaced for any terminal state, a distinct "Compilation error:" label, andKilledpreserved on timeout. Input-port shape lines are now derived rather than explicitly rendered.Any related issues, documentation, discussions?
Closes #5750
Part of #5747.
Supersedes #5927.
How was this PR tested?
tsc --noEmitclean,bun test134/134 pass,prettier --checkclean.sbt WorkflowExecutionService/compileclean; the newSyncExecutionResourceSpecunit suite (no DB) passes 24/24, scalafmt applied. refactor(agent-service): redesign sync-execution result and error model #5927 verified the same wire shape via a full-stack end-to-end run (a Claude Haiku 4.5 agent built and executed a CSV workflow;/operator-resultsreturned the new shape —resultSummary.sampleTuples: [{ rowIndex, tuple }],errorMessages: []).tsc --noEmitclean; targeted karma specs pass 22/22 (the new TestBed fixture renders the changed template); eslint clean.Was this PR authored or co-authored using generative AI tooling?
Generated-by: Claude Opus 4.8 (1M context), Claude Fable 5