Skip to content

fix(workflow-compiling-service): LogicalLink OperatorIdentity round-trip#5729

Open
wwong0 wants to merge 5 commits into
apache:mainfrom
wwong0:fix/5042-logicallink-operatoridentity-round-trip
Open

fix(workflow-compiling-service): LogicalLink OperatorIdentity round-trip#5729
wwong0 wants to merge 5 commits into
apache:mainfrom
wwong0:fix/5042-logicallink-operatoridentity-round-trip

Conversation

@wwong0

@wwong0 wwong0 commented Jun 16, 2026

Copy link
Copy Markdown

What changes were proposed in this PR?

Bug fix: LogicalLink @JsonCreator constructor (amber and workflow-compiling-service)

@JsonCreator was previously placed on the String convenience constructor of LogicalLink in both modules. OperatorIdentity is a case class that Jackson serializes as an object ({"id":"op-A"}), not a plain string. When reading back a serialized LogicalLink, Jackson dispatched to the @JsonCreator String constructor but could not coerce the {"id":"op-A"} object node to a String, causing any writeValueAsStringreadValue round-trip to fail with MismatchedInputException.

The fix introduces a private readOperatorIdentity(node: JsonNode, fieldName: String) helper in the companion object and moves @JsonCreator to a new JsonNode constructor that delegates to it. The helper accepts both the plain-string shape (front-end input) and the object shape (serialized form), maps null/absent ids to OperatorIdentity(null), and rejects malformed nodes. The String convenience constructor is retained but no longer carries @JsonCreator.

Any related issues, documentation, discussions?

Closes #5042
This PR continues and adopts work from #5175

How was this PR tested?

The new workflow-compiling-service LogicalLinkSpec adds unit coverage of LogicalLink and readOperatorIdentity model-level logic that existing tests do not cover and pins the leniency contract (no require guards in the compiler-service variant).

The existing amber LogicalLinkSpec was updated to match the renamed constructor section, drop the now-invalid MismatchedInputException expectation, and add a passing round-trip test.

Run with:

sbt "WorkflowExecutionService/testOnly *LogicalLinkSpec"

Result: 18/18 tests pass

sbt "WorkflowCompilingService/testOnly *LogicalLinkSpec"

Result: 15/15 tests pass

Was this PR authored or co-authored using generative AI tooling?

Generated-by: Claude Sonnet 4.6

…ound-trip

Prior to this change, `@JsonCreator` was placed on the `String`
convenience constructor in both `amber` and `workflow-compiling-service`
`LogicalLink`. Jackson serialises `OperatorIdentity` as an object
(`{"id":"op-A"}`), so any round-trip through `writeValueAsString` /
`readValue` failed with `MismatchedInputException` because the String
constructor cannot accept an object node.

Fix: add a private `readOperatorIdentity(JsonNode, String)` helper to
the companion object and move `@JsonCreator` to a new `JsonNode`
constructor that delegates to it. The helper handles the string shape
(front-end input), the object shape (serialised form), null/absent
fields, and rejects non-text / non-object nodes with
`IllegalArgumentException`.

The new `workflow-compiling-service` `LogicalLinkSpec` adds direct unit
coverage of the `readOperatorIdentity` branches that the existing
HTTP-layer-only `WorkflowCompilationResourceSpec` did not reach. It also
pins the leniency contract (no `require` guards) and the Jackson
annotation contract (`@JsonProperty` key pinning, round-trip fidelity).
The `amber` `LogicalLinkSpec` is updated to match the renamed constructor
section, drop the now-invalid MismatchedInputException expectation, and add
the round-trip test.

Closes apache#5042
@github-actions

Copy link
Copy Markdown
Contributor

👋 Thanks for your first contribution to Texera, @wwong0!

If you're looking for a good place to start, browse issues labeled starter-task; they're scoped to be approachable for newcomers.

You can drive common housekeeping yourself by commenting one of these commands on its own line:

  • Issues. Comment /take to assign an open issue to yourself, or /untake to release it. You can find unclaimed work with the search filter is:issue is:open no:assignee.
  • Sub-issues. To link issues into a parent/child hierarchy, comment /sub-issue #5166 #5222 on the parent to attach those children (or /unsub-issue #5166 #5222 to detach them). From a child issue, comment /parent-issue #5166 to set its parent, or /unparent-issue to clear it (the current parent is detected automatically). References may be written as #5166 or as a bare 5166; cross-repository references are not supported.
  • Pull requests (author only). Comment /request-review @user to request a review from someone, or /unrequest-review @user to withdraw that request.

Each command must match exactly: /take this will not work, only /take does. For the full contribution flow, see CONTRIBUTING.md.

@github-actions github-actions Bot added engine fix platform Non-amber Scala service paths labels Jun 16, 2026
@Yicong-Huang Yicong-Huang added the release/v1.2 back porting to release/v1.2 label Jun 17, 2026
@Yicong-Huang Yicong-Huang requested a review from Copilot June 17, 2026 05:25

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes Jackson (de)serialization for LogicalLink so OperatorIdentity fields can round-trip through objectMapper.writeValueAsStringreadValue in both the Amber engine and workflow-compiling-service, while preserving each module’s strict vs. lenient validation behavior.

Changes:

  • Move @JsonCreator off the String convenience constructor and introduce a JsonNode-based creator that can read both string-shaped and object-shaped operator ids.
  • Add a readOperatorIdentity(node, fieldName) helper to parse fromOpId/toOpId from either JSON shape and centralize error handling.
  • Add/adjust unit tests to cover the round-trip behavior and module-specific strictness/leniency contracts.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

File Description
workflow-compiling-service/src/main/scala/org/apache/texera/amber/compiler/model/LogicalLink.scala Adds JsonNode @JsonCreator + helper to support both string/object OperatorIdentity JSON shapes.
workflow-compiling-service/src/test/scala/org/apache/texera/amber/compiler/model/LogicalLinkSpec.scala New spec covering leniency contract and Jackson round-trip behavior for compiler-service LogicalLink.
amber/src/main/scala/org/apache/texera/workflow/LogicalLink.scala Mirrors the JsonNode @JsonCreator + helper in the Amber engine LogicalLink (while keeping require guards).
amber/src/test/scala/org/apache/texera/workflow/LogicalLinkSpec.scala Updates characterization tests to assert successful round-trip after the fix.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread amber/src/main/scala/org/apache/texera/workflow/LogicalLink.scala
@codecov-commenter

codecov-commenter commented Jun 17, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 58.51%. Comparing base (6de8a48) to head (d26f64e).

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #5729      +/-   ##
============================================
- Coverage     59.14%   58.51%   -0.63%     
- Complexity     3201     3212      +11     
============================================
  Files          1132     1145      +13     
  Lines         43681    44435     +754     
  Branches       4734     4817      +83     
============================================
+ Hits          25834    26001     +167     
- Misses        16416    16999     +583     
- Partials       1431     1435       +4     
Flag Coverage Δ *Carryforward flag
access-control-service 70.66% <ø> (+0.66%) ⬆️
agent-service 44.59% <ø> (ø) Carriedforward from 908a61f
amber 61.76% <100.00%> (-1.35%) ⬇️
computing-unit-managing-service 1.25% <ø> (+1.25%) ⬆️
config-service 52.42% <ø> (+0.11%) ⬆️
file-service 60.86% <ø> (-1.96%) ⬇️
frontend 51.37% <ø> (ø) Carriedforward from 908a61f
notebook-migration-service 78.57% <ø> (ø)
pyamber 91.19% <ø> (ø) Carriedforward from 908a61f
workflow-compiling-service 55.49% <100.00%> (+0.34%) ⬆️

*This pull request uses carry forward flags. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@github-actions

github-actions Bot commented Jun 17, 2026

Copy link
Copy Markdown
Contributor

⚠️ Benchmark changes need a look

🟢 0 better · 🔴 3 worse · ⚪ 12 noise (<±5%) · 0 without baseline

Compared against main 6de8a48 benchmarked on this same runner, so the delta is largely free of cross-runner hardware noise. The "7d avg" column still reflects the gh-pages dashboard. Treat <±5% as noise unless repeated.

Dashboard · Run

config throughput MB/s latency max Δ latest / 7d
🔴 bs=10 sw=10 sl=64 380 0.232 26,753/32,258/32,258 us 🔴 +9.7% / 🔴 +113.0%
bs=100 sw=10 sl=64 807 0.492 123,218/149,147/149,147 us ⚪ within ±5% / 🔴 +38.2%
bs=1000 sw=10 sl=64 929 0.567 1,080,028/1,113,562/1,113,562 us ⚪ within ±5% / 🔴 +9.1%
Baseline details

Latest main 6de8a48 from same runner

config metric PR latest main 7d avg Δ latest Δ 7d
bs=10 sw=10 sl=64 throughput 380 tuples/sec 407 tuples/sec 776.36 tuples/sec -6.6% -51.1%
bs=10 sw=10 sl=64 MB/s 0.232 MB/s 0.248 MB/s 0.474 MB/s -6.5% -51.0%
bs=10 sw=10 sl=64 p50 26,753 us 24,382 us 12,636 us +9.7% +111.7%
bs=10 sw=10 sl=64 p95 32,258 us 32,891 us 15,143 us -1.9% +113.0%
bs=10 sw=10 sl=64 p99 32,258 us 32,891 us 18,954 us -1.9% +70.2%
bs=100 sw=10 sl=64 throughput 807 tuples/sec 833 tuples/sec 985.33 tuples/sec -3.1% -18.1%
bs=100 sw=10 sl=64 MB/s 0.492 MB/s 0.509 MB/s 0.601 MB/s -3.3% -18.2%
bs=100 sw=10 sl=64 p50 123,218 us 118,590 us 101,671 us +3.9% +21.2%
bs=100 sw=10 sl=64 p95 149,147 us 143,337 us 107,939 us +4.1% +38.2%
bs=100 sw=10 sl=64 p99 149,147 us 143,337 us 113,798 us +4.1% +31.1%
bs=1000 sw=10 sl=64 throughput 929 tuples/sec 943 tuples/sec 1,016 tuples/sec -1.5% -8.6%
bs=1000 sw=10 sl=64 MB/s 0.567 MB/s 0.576 MB/s 0.62 MB/s -1.6% -8.6%
bs=1000 sw=10 sl=64 p50 1,080,028 us 1,053,138 us 989,693 us +2.6% +9.1%
bs=1000 sw=10 sl=64 p95 1,113,562 us 1,150,346 us 1,028,327 us -3.2% +8.3%
bs=1000 sw=10 sl=64 p99 1,113,562 us 1,150,346 us 1,059,969 us -3.2% +5.1%
Raw CSV
config_idx,batch_size,schema_width,string_len,num_batches,total_ms,total_tuples,total_bytes,tuples_per_sec,mb_per_sec,lat_p50_us,lat_p95_us,lat_p99_us
0,10,10,64,20,526.01,200,128000,380,0.232,26752.92,32257.87,32257.87
1,100,10,64,20,2479.22,2000,1280000,807,0.492,123217.81,149147.35,149147.35
2,1000,10,64,20,21527.33,20000,12800000,929,0.567,1080028.40,1113562.02,1113562.02

wwong0 added 2 commits June 17, 2026 05:53
…d in LogicalLink

Address review feedback on the LogicalLink `@JsonCreator` fix: the object-shape branch of `readOperatorIdentity` previously coerced a non-textual `id` (e.g. `{"id": 123}`) via `asText()`, silently turning it into `"123"`—inconsistent with the helper's contract of rejecting malformed nodes, and with the existing rejection of a non-text/non-object top-level node. Require `id` to be textual (or null/absent); otherwise throw `IllegalArgumentException`. Applied to both the amber engine and the workflow-compiling-service `LogicalLink` (identical helpers).

Extend both `LogicalLinkSpec` suites with branch-coverage tests for `readOperatorIdentity`: object-shape `id` missing, explicit null, numeric/non-textual `id`, and a top-level explicit-null op id. These pin the new rejection behavior.
@Yicong-Huang

Copy link
Copy Markdown
Contributor

Hi @wwong0, thanks for the PR. overall it looks good.

can you please make coverage higher, before I review details? you can find coverage report here #5729 (comment). thanks.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

Comment thread amber/src/main/scala/org/apache/texera/workflow/LogicalLink.scala
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Signed-off-by: William Wong <120128836+wwong0@users.noreply.github.com>
@wwong0

wwong0 commented Jun 24, 2026

Copy link
Copy Markdown
Author

Hi @Yicong-Huang the CodeCov report updated after your comment, it now shows higher coverage.

@github-actions

Copy link
Copy Markdown
Contributor

Automated Reviewer Suggestions

Based on the git blame history of the changed files, we recommend the following reviewers:

  • Contributors with relevant context: @aglinxinyuan, @Yicong-Huang
    You can notify them by mentioning @aglinxinyuan, @Yicong-Huang in a comment.

@Yicong-Huang Yicong-Huang removed the release/v1.2 back porting to release/v1.2 label Jun 29, 2026

@Yicong-Huang Yicong-Huang left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, LGTM

@Yicong-Huang Yicong-Huang enabled auto-merge June 29, 2026 00:35
@Yicong-Huang Yicong-Huang added the release/v1.2 back porting to release/v1.2 label Jul 5, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

engine fix platform Non-amber Scala service paths release/v1.2 back porting to release/v1.2

Projects

None yet

Development

Successfully merging this pull request may close these issues.

LogicalLink: writeValueAsString / readValue is not round-trippable for OperatorIdentity fields

4 participants