refactor(agent-service): redesign sync-execution result and error model by bobbai00 · Pull Request #6009 · apache/texera

bobbai00 · 2026-06-29T00:41:29Z

What changes were proposed in this PR?

Restructures the per-operator summary the sync-execution backend returns and the agent-service / frontend consume, for a leaner, consistent wire contract. This is a focused re-do of #5927 cut directly from main (no foundation stack). Scope: only the type definitions of the execution result/error model — the wire contract across the Scala producer, the agent-service, and the frontend — plus the minimal consumer changes the new types force; no transport or formatting-logic changes.

Replace the flat OperatorInfo with OperatorExecutionSummary (orthogonal sub-summaries: state, errorMessages, resultSummary?, consoleLogsSummary?); rename SyncExecutionResult → WorkflowExecutionSummary.
resultSummary.sampleTuples is now SampleRow[] ({ rowIndex, tuple }) instead of JSON rows with an embedded __row_index__; drop the table-shape types (the agent derives input-port shapes from the DAG).
Move WorkflowFatalError into types/execution.ts and reuse it for per-operator errors — the same type the workflow-compiling service returns for compilation errors, so compile and execution errors share one wire shape; api/compile-api.ts re-exports it so its existing importers are unchanged, and the frontend imports its existing declaration instead of adding a copy.
errorMessages / errors are non-optional (empty = none); drop compilationErrors; collapse the console-message types and derive warnings from WARNING:-titled messages.
Operator results are still pulled on demand via GET /agents/:id/operator-results (transport unchanged); that REST payload now carries the canonical OperatorExecutionSummary, and the frontend maps it to its flat display type (re-flattening sampleTuples so the display components are unchanged).

Touches the Scala producer (SyncExecutionResource, split into pure helpers with a new unit spec), the agent-service consumers (result-formatting, workflow-execution-tools, workflow-result-state, server), and the frontend mapping. Representation/type-level; the only behavior deltas are in the failure paths the new error type forces: one shared "operator failed" derivation (getOperatorErrorText), per-operator errors surfaced for any terminal state, a distinct "Compilation error:" label, and Killed preserved on timeout. Input-port shape lines are now derived rather than explicitly rendered.

Any related issues, documentation, discussions?

Closes #5750
Part of #5747.
Supersedes #5927.

How was this PR tested?

agent-service: tsc --noEmit clean, bun test 134/134 pass, prettier --check clean.
Scala: sbt WorkflowExecutionService/compile clean; the new SyncExecutionResourceSpec unit suite (no DB) passes 24/24, scalafmt applied. refactor(agent-service): redesign sync-execution result and error model #5927 verified the same wire shape via a full-stack end-to-end run (a Claude Haiku 4.5 agent built and executed a CSV workflow; /operator-results returned the new shape — resultSummary.sampleTuples: [{ rowIndex, tuple }], errorMessages: []).
frontend: tsc --noEmit clean; targeted karma specs pass 22/22 (the new TestBed fixture renders the changed template); eslint clean.

Was this PR authored or co-authored using generative AI tooling?

Generated-by: Claude Opus 4.8 (1M context), Claude Fable 5

github-actions · 2026-06-29T00:41:46Z

Automated Reviewer Suggestions

Based on the git blame history of the changed files, we recommend the following reviewers:

Contributors with relevant context: @Ma77Ball, @Yicong-Huang
You can notify them by mentioning @Ma77Ball, @Yicong-Huang in a comment.

codecov-commenter · 2026-06-29T00:42:50Z

Codecov Report

❌ Patch coverage is 89.54248% with 32 lines in your changes missing coverage. Please review.
✅ Project coverage is 60.61%. Comparing base (5c4a963) to head (392f4de).
⚠️ Report is 5 commits behind head on main.

Files with missing lines	Patch %	Lines
...he/texera/web/resource/SyncExecutionResource.scala	76.74%	27 Missing and 3 partials ⚠️
...agent-interaction/agent-interaction.component.html	66.66%	0 Missing and 2 partials ⚠️

Additional details and impacted files

@@             Coverage Diff              @@
##               main    #6009      +/-   ##
============================================
+ Coverage     59.11%   60.61%   +1.50%     
- Complexity     3201     3232      +31     
============================================
  Files          1132     1133       +1     
  Lines         43681    43494     -187     
  Branches       4734     4717      -17     
============================================
+ Hits          25821    26365     +544     
+ Misses        16430    15682     -748     
- Partials       1430     1447      +17

Flag	Coverage Δ		*Carryforward flag
access-control-service	`70.00% <ø> (ø)`		Carriedforward from 5c4a963
agent-service	`57.44% <100.00%> (+12.84%)`	⬆️
amber	`64.01% <76.74%> (+0.89%)`	⬆️
computing-unit-managing-service	`0.00% <ø> (ø)`		Carriedforward from 5c4a963
config-service	`52.30% <ø> (ø)`		Carriedforward from 5c4a963
file-service	`62.81% <ø> (ø)`		Carriedforward from 5c4a963
frontend	`51.78% <96.15%> (+0.46%)`	⬆️
notebook-migration-service	`78.57% <ø> (ø)`		Carriedforward from 5c4a963
pyamber	`91.19% <ø> (ø)`		Carriedforward from 5c4a963
workflow-compiling-service	`55.14% <ø> (ø)`		Carriedforward from 5c4a963

*This pull request uses carry forward flags. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

github-actions · 2026-06-29T00:47:27Z

⚠️ Benchmark changes need a look

🟢 0 better · 🔴 7 worse · ⚪ 8 noise (<±5%) · 0 without baseline

Compared against main 6de8a48 benchmarked on this same runner, so the delta is largely free of cross-runner hardware noise. The "7d avg" column still reflects the gh-pages dashboard. Treat <±5% as noise unless repeated.

Dashboard · Run

	config	throughput	MB/s	latency	max Δ latest / 7d
🔴	bs=10 sw=10 sl=64	414	0.253	23,639/33,453/33,453 us	🔴 +14.9% / 🔴 +120.9%
🔴	bs=100 sw=10 sl=64	923	0.563	107,304/138,820/138,820 us	🔴 +18.5% / 🔴 +28.6%
⚪	bs=1000 sw=10 sl=64	1,117	0.682	892,104/941,056/941,056 us	⚪ within ±5% / 🟢 -11.2%

Baseline details

Latest main 6de8a48 from same runner

config	metric	PR	latest main	7d avg	Δ latest	Δ 7d
bs=10 sw=10 sl=64	throughput	414 tuples/sec	470 tuples/sec	776.36 tuples/sec	-11.9%	-46.7%
bs=10 sw=10 sl=64	MB/s	0.253 MB/s	0.287 MB/s	0.474 MB/s	-11.8%	-46.6%
bs=10 sw=10 sl=64	p50	23,639 us	21,484 us	12,636 us	+10.0%	+87.1%
bs=10 sw=10 sl=64	p95	33,453 us	29,105 us	15,143 us	+14.9%	+120.9%
bs=10 sw=10 sl=64	p99	33,453 us	29,105 us	18,954 us	+14.9%	+76.5%
bs=100 sw=10 sl=64	throughput	923 tuples/sec	967 tuples/sec	985.33 tuples/sec	-4.6%	-6.3%
bs=100 sw=10 sl=64	MB/s	0.563 MB/s	0.59 MB/s	0.601 MB/s	-4.6%	-6.4%
bs=100 sw=10 sl=64	p50	107,304 us	103,010 us	101,671 us	+4.2%	+5.5%
bs=100 sw=10 sl=64	p95	138,820 us	117,171 us	107,939 us	+18.5%	+28.6%
bs=100 sw=10 sl=64	p99	138,820 us	117,171 us	113,798 us	+18.5%	+22.0%
bs=1000 sw=10 sl=64	throughput	1,117 tuples/sec	1,100 tuples/sec	1,016 tuples/sec	+1.5%	+9.9%
bs=1000 sw=10 sl=64	MB/s	0.682 MB/s	0.671 MB/s	0.62 MB/s	+1.6%	+9.9%
bs=1000 sw=10 sl=64	p50	892,104 us	916,304 us	989,693 us	-2.6%	-9.9%
bs=1000 sw=10 sl=64	p95	941,056 us	980,009 us	1,028,327 us	-4.0%	-8.5%
bs=1000 sw=10 sl=64	p99	941,056 us	980,009 us	1,059,969 us	-4.0%	-11.2%

Raw CSV

config_idx,batch_size,schema_width,string_len,num_batches,total_ms,total_tuples,total_bytes,tuples_per_sec,mb_per_sec,lat_p50_us,lat_p95_us,lat_p99_us
0,10,10,64,20,482.81,200,128000,414,0.253,23639.13,33453.15,33453.15
1,100,10,64,20,2167.76,2000,1280000,923,0.563,107303.81,138820.05,138820.05
2,1000,10,64,20,17904.21,20000,12800000,1117,0.682,892104.14,941056.45,941056.45

### What changes were proposed in this PR? Restructures the per-operator summary the sync-execution backend returns and the agent-service / frontend consume, for a leaner, consistent wire contract. This is a focused re-do of apache#5927 cut directly from `main` (no foundation stack): it changes only the execution result/error model and its consumers. - Replace the flat `OperatorInfo` with `OperatorExecutionSummary` (orthogonal sub-summaries: `state`, `errorMessages`, `resultSummary?`, `consoleLogsSummary?`); rename `SyncExecutionResult` → `WorkflowExecutionSummary`. - `resultSummary.sampleTuples` is now `SampleRow[]` (`{ rowIndex, tuple }`) instead of JSON rows with an embedded `__row_index__`; drop the table-shape types (the agent derives input-port shapes from the DAG). - Move `WorkflowFatalError` into `types/execution.ts` and reuse it for per-operator errors — the same type the workflow-compiling service returns for compilation errors, so compile and execution errors share one wire shape; `api/compile-api.ts` re-exports it so its existing importers are unchanged. - `errorMessages` / `errors` are non-optional (empty = none); drop `compilationErrors`; collapse the console-message types and derive warnings from `WARNING:`-titled messages. - Operator results are still pulled on demand via `GET /agents/:id/operator-results` (transport unchanged); that REST payload now carries the canonical `OperatorExecutionSummary`, and the frontend maps it to its flat display type (re-flattening `sampleTuples` so the display components are unchanged). Touches the Scala producer (`SyncExecutionResource`), the agent-service consumers (`result-formatting`, `workflow-execution-tools`, `workflow-result-state`, `server`), and the frontend mapping. Representation/type-level; behavior preserved (input-port shape lines are now derived rather than explicitly rendered). ### Any related issues, documentation, discussions? Closes apache#5750 Part of apache#5747. Supersedes apache#5927. ### How was this PR tested? - agent-service: `tsc --noEmit` clean, `bun test` 110/110 pass, `prettier --check` clean. - The Scala producer (`SyncExecutionResource`) is unchanged from apache#5927, which verified it via `sbt WorkflowExecutionService/compile` and a full-stack end-to-end run (a Claude Haiku 4.5 agent built and executed a CSV workflow; `/operator-results` returned the new shape — `resultSummary.sampleTuples: [{ rowIndex, tuple }]`, `errorMessages: []`). ### Was this PR authored or co-authored using generative AI tooling? Generated-by: Claude Opus 4.8 (1M context)

Add unit tests for the redesigned sync-execution result/error model to bring patch coverage to 100%: - workflow-execution-tools: drive executeOperatorAndFormat and createExecuteOperatorTool across pre-flight guards, successful runs (shape/warnings/gaps/cell types/truncation), execution failures (FAILED/KILLED/ERROR, per-operator and general errors), abort propagation, and callback failures. - texera-agent: exercise getFormattedResultsForDAG over the visible result branch. - workflow-result-state: cover getOperatorInfo. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Wrap long tuple/list expressions to the 100-column limit and drop a stray blank line so scalafmtCheckAll passes. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

- agent-interaction: unit-test the visualization html caching, column and row derivation (ellipsis on index gaps) via direct construction with stubbed services. - result-table-frame: cover setupResultTable populating currentResult, columns, and totalNumTuples for non-empty data. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Add SyncExecutionResourceSpec covering the unit-testable changed lines: the new result/error summary case classes and handleExecutionError (all compilation-error branches plus the unknown-error fallback), via PrivateMethodTester so no production visibility changes are needed. The remaining changed lines (executeWorkflowSync orchestration, collectOperatorInfos, and the collectOperatorResult truncation loop) drive a live Pekko execution + DB/Iceberg-backed result documents and are exercised by the integration suite, which is out of unit-test scope. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…rs for testing Extract two behavior-preserving pure helpers so the result sampling and per-operator summary logic can be unit-tested without a live engine: - sampleAndTruncateTuples(tupleIterator, totalCount, ...) — the symmetric result truncation / sampling previously inlined in collectOperatorResult. - buildOperatorExecutionSummary(...) — the per-operator summary + console error extraction previously inlined in collectOperatorInfos. Both are pure code moves (identical expressions); call sites delegate to them. Expand SyncExecutionResourceSpec to cover every branch of both (empty/visualization/front-only/oversized/sliding-window truncation, and result/console-error/no-result summaries). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…ting Factor the final WorkflowExecutionSummary assembly (fatal-error formatting, operator-console-error detection, state string, and success determination) out of executeWorkflowSync into a pure assembleExecutionSummary helper. Behavior-preserving code move; the live observable-wait / termination handling stays inline. Add unit tests covering the state/success derivation across terminal, console-error, target-results-override, operator-error, and fatal-error cases (SyncExecutionResourceSpec now 29 tests). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

… partials Add two unit cases to close branch gaps in the extracted helpers: - sampleAndTruncateTuples: empty iterator with a positive count (the `!tupleIterator.hasNext` half of the guard). - buildOperatorExecutionSummary: a console ERROR whose message is non-empty but shorter than the title (the length-comparison false branch), so the title is kept. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Shrink the sampling helpers' return to the three consumed values, extract the duplicated back-window loop into collectBackWindow, and drop the constructor-echo tests in favor of the behavioral suites. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

getOperatorErrorText pins the one derivation of "operator failed" the new errorMessages type forces (a fatal error carrying message text), replacing the divergent .length and joined-truthiness checks. In the failure branch, surface per-operator errors for any terminal state (success=false can come with state "Completed"), label CompilationFailed results distinctly, and preserve the Killed state on timeout instead of collapsing to Failed. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…tion in tests Import the existing WorkflowFatalError from workflow-websocket.interface instead of redeclaring it in agent.service.ts, and rewrite the agent-interaction spec onto a TestBed fixture with detectChanges() so the changed template is actually rendered and covered. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

github-actions Bot assigned bobbai00 Jun 29, 2026

github-actions Bot added engine refactor Refactor the code frontend Changes related to the frontend GUI agent-service labels Jun 29, 2026

bobbai00 mentioned this pull request Jun 29, 2026

refactor(agent-service): redesign sync-execution result and error model #5927

Closed

bobbai00 marked this pull request as draft June 29, 2026 01:00

bobbai00 force-pushed the refactor/sync-execution-result-model branch from 89eb9e9 to 84022c9 Compare June 29, 2026 04:56

bobbai00 marked this pull request as ready for review June 29, 2026 08:25

bobbai00 force-pushed the refactor/sync-execution-result-model branch from 43166b9 to 66d9785 Compare July 2, 2026 09:18

bobbai00 requested a review from Yicong-Huang July 2, 2026 09:25

bobbai00 and others added 15 commits July 4, 2026 15:41

refactor(frontend): consume execution summaries directly

316f456

refactor(agent-service): keep operator result summary name

cf39fa3

refactor: align execution result summary contract

782432c

refactor: remove legacy result marker fields

c19569f

style(execution-service): apply scalafmt to SyncExecutionResource

639152b

Wrap long tuple/list expressions to the 100-column limit and drop a stray blank line so scalafmtCheckAll passes. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

bobbai00 force-pushed the refactor/sync-execution-result-model branch from 8e8a9eb to 392f4de Compare July 5, 2026 00:22

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

refactor(agent-service): redesign sync-execution result and error model#6009

refactor(agent-service): redesign sync-execution result and error model#6009
bobbai00 wants to merge 15 commits into
apache:mainfrom
bobbai00:refactor/sync-execution-result-model

bobbai00 commented Jun 29, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented Jun 29, 2026 •

edited

Loading

Uh oh!

codecov-commenter commented Jun 29, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented Jun 29, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

bobbai00 commented Jun 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What changes were proposed in this PR?

Any related issues, documentation, discussions?

How was this PR tested?

Was this PR authored or co-authored using generative AI tooling?

Uh oh!

github-actions Bot commented Jun 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Automated Reviewer Suggestions

Uh oh!

codecov-commenter commented Jun 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

github-actions Bot commented Jun 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

⚠️ Benchmark changes need a look

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

bobbai00 commented Jun 29, 2026 •

edited

Loading

github-actions Bot commented Jun 29, 2026 •

edited

Loading

codecov-commenter commented Jun 29, 2026 •

edited

Loading

github-actions Bot commented Jun 29, 2026 •

edited

Loading