Skip to content

feat(c++/node): add close_outputs, NodeFailed and Reload event types (rescue of #1410)#1847

Merged
heyong4725 merged 3 commits into
mainfrom
rescue/cpp-close-outputs-event-types
May 17, 2026
Merged

feat(c++/node): add close_outputs, NodeFailed and Reload event types (rescue of #1410)#1847
heyong4725 merged 3 commits into
mainfrom
rescue/cpp-close-outputs-event-types

Conversation

@heyong4725
Copy link
Copy Markdown
Collaborator

Summary

Rescue of #1410 (@PavelGuzenfeld). Second PR in the 4-PR rescue stack for C++ API parity.

Stacked on top of #1846 (event receive variants). Once #1846 merges I'll rebase this onto main.

Closes #1410.

New FFI surface

Item Behavior
DoraEventType::NodeFailed Upstream node has failed; downstream inputs that depended on it will stop receiving data.
DoraEventType::Reload Hot-reload notification for an operator. C++ nodes can now react (e.g. flush caches) instead of treating it as Unknown.
DoraNodeFailed { affected_input_ids, error, source_node_id } Payload struct surfaced via event_as_node_failed.
event_as_node_failed(event) Extracts the failure payload from a NodeFailed event.
close_outputs(sender, output_ids) Selectively close one or more of this node's outputs without shutting the whole node down. Downstream sees InputClosed.

Implementation notes

  • event_as_node_failed destructures via EventOrReason::Event(Event::NodeFailed { … }) to match the new tuple-struct shape that feat(c++/node): add event receive variants (rescue of #1409) #1846's rescue introduced (the original PR used event.event which assumed the never-adopted named-field shape).
  • close_outputs calls through to dora_node_api::DoraNode::close_outputs(Vec<DataId>) which has been on main since the 1.0 consolidation. Errors are returned as DoraResult { error } for ergonomic consistency with send_output.
  • event_type gets two new match arms (Event::NodeFailed { .. }NodeFailed, Event::Reload { .. }Reload). Existing variants unchanged.

Test plan

  • cargo build -p dora-node-api-cxx — clean compile
  • cargo clippy -p dora-node-api-cxx -- -D warnings — clean
  • Example (examples/c++-dataflow/node-rust-api/main.cc) extended to handle NodeFailed and Reload event types
  • CircleCI cli-rust + examples-linux will exercise via the example's main loop

🤖 Generated with Claude Code

@trunk-io
Copy link
Copy Markdown
Contributor

trunk-io Bot commented May 17, 2026

Merging to main in this repository is managed by Trunk.

  • To merge this pull request, check the box to the left or comment /trunk merge below.

After your PR is submitted to the merge queue, this comment will be automatically updated with its status. If the PR fails, failure details will also be posted here

Rescue of #1410 (PavelGuzenfeld). Second in the 4-PR rescue stack
for the C++ API parity work; stacked on top of #1846 (event
receive variants).

New FFI surface
===============

* Two new `DoraEventType` variants:
  - `NodeFailed` -- an upstream node has failed. Inputs that
    depended on the failed node will stop receiving data; the
    payload (via `event_as_node_failed`) names the failed node,
    the error message, and the affected input ids.
  - `Reload` -- hot-reload notification for an operator. C++
    nodes can now react (e.g. flush caches) instead of treating
    it as `Unknown`.

* `event_as_node_failed(event) -> DoraNodeFailed` -- extracts
  `{ affected_input_ids, error, source_node_id }` from a
  NodeFailed event.

* `close_outputs(sender, output_ids)` -- selectively close one or
  more of this node's outputs without shutting the whole node
  down. Downstream subscribers see the corresponding
  `InputClosed` event.

Implementation notes
====================

* `event_as_node_failed` destructures via `EventOrReason::Event(...)`
  rather than the original PR's `event.event` (which assumed the
  named-field DoraEvent shape from #1409 that we never adopted).
* `close_outputs` returns `DoraResult` like the existing
  `send_output` family for ergonomic consistency on the C++ side.

Closes #1410.

Co-Authored-By: Pavel Guzenfeld <PavelGuzenfeld@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@heyong4725 heyong4725 force-pushed the rescue/cpp-close-outputs-event-types branch from b02b2f1 to 5b39f72 Compare May 17, 2026 20:46
heyong4725 added a commit that referenced this pull request May 17, 2026
Rescue of #1413 (PavelGuzenfeld). Third in the 4-PR rescue stack
for C++ API parity; stacked on top of #1847 (close_outputs +
NodeFailed/Reload).

New FFI surface
===============

* `node_config_json(output_sender) -> Result<String>` -- returns
  this node's `NodeRunConfig` (inputs / outputs / metadata from
  the dataflow descriptor) serialized as JSON. Lets C++ nodes
  introspect their own declared interface at runtime without
  re-parsing the yaml.

* `dataflow_descriptor_json(output_sender) -> Result<String>` --
  returns the full parsed `Descriptor` (the dataflow yaml) as
  JSON. Useful for introspecting peer nodes, listing all topics,
  etc. May fail if the daemon hasn't delivered the descriptor
  yet (e.g. very early in startup).

Both functions wrap the existing `DoraNode::node_config()` and
`DoraNode::dataflow_descriptor()` methods from
`apis/rust/node/src/node/mod.rs:1116,1317`, serializing through
`serde_json::to_string`.

Closes #1413.

Co-Authored-By: Pavel Guzenfeld <PavelGuzenfeld@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Review finding on #1847 head 5b39f72: `close_outputs` converted
caller-supplied `Vec<String>` to `Vec<DataId>` via `Into::into`,
which calls `From<String> for DataId` -- documented as panicking
on invalid characters at `libraries/message/src/id.rs:186` with
an explicit `# Panics` notice. A typo in a C++ string literal
would have aborted the whole node instead of returning
`DoraResult.error` like the rest of the C++ API surface does.

Same panic-across-cxx::bridge category as the `Instant::now() +
Duration::from_millis(u64::MAX)` overflow on #1846 -- a public
C++ API accepting unconstrained input must be panic-safe for
the entire input domain.

Fix: parse each id via `id.parse::<DataId>()` (`impl FromStr`,
which is the safe path documented at `libraries/message/src/id.rs:167`)
and on the first parse error return a `DoraResult` whose `error`
names the offending id and the underlying validation error.
The well-formed-id case is unchanged; only the previously-
panicking error path now surfaces cleanly.

Pattern bank update for me (second in this PR series): "public
API accepts type T via FFI" + "T has a `From` impl that
panics on invalid input" = "use `FromStr` / `TryFrom`,
never the panicking `From`". The standard library's
`String::from` is total, so it doesn't share this gotcha, but
dora's id types deliberately panic in `From` to keep the happy
path concise -- which is the right call for trusted Rust call
sites and the wrong call for FFI boundaries.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@heyong4725
Copy link
Copy Markdown
Collaborator Author

Real bug, same panic-across-FFI category as #1846's Instant overflow. Fix in 41d9b43a.

The bug

From<String> for DataId is documented as panicking on invalid characters (libraries/message/src/id.rs:186, with explicit # Panics notice). Calling output_ids.into_iter().map(Into::into).collect() on caller-supplied C++ input means a typo in a C++ string literal aborts the whole node instead of returning the DoraResult.error like every other function in the surface.

The fix

Replace Into::into with parse::<DataId>() (the safe FromStr path at libraries/message/src/id.rs:167); first invalid id short-circuits with a structured error:

let mut ids = Vec::with_capacity(output_ids.len());
for id in output_ids {
    match id.parse::<dora_node_api::dora_core::config::DataId>() {
        Ok(parsed) => ids.push(parsed),
        Err(e) => {
            return ffi::DoraResult {
                error: format!("invalid output id '{id}': {e}"),
            };
        }
    }
}
match output_sender.0.close_outputs(ids) {
    Ok(()) => ffi::DoraResult { error: String::new() },
    Err(err) => ffi::DoraResult { error: format!("{err:?}") },
}

Well-formed-id case is unchanged; only the previously-panicking error path now surfaces a DoraResult with a human-readable message naming the offending id.

Pattern bank update (second time in this PR series)

The unifying rule: when an FFI accepts type T and there's a From<X> for T that panics on invalid input, never use that — use FromStr / TryFrom instead. dora's DataId / NodeId deliberately panic in From to keep trusted Rust call sites concise — that's the right call inside the workspace and the wrong call at FFI boundaries.

Lessons from #1846 + #1847 combined:

  1. (round-2 feat(c++/node): add event receive variants (rescue of #1409) #1846) Instant::now() + Duration::from_millis(u64) can panic on overflow — use checked_add.
  2. (this PR) DataId::from(String) panics on invalid id — use parse::<DataId>().

Going forward, when reviewing FFI surfaces I check: "what's the unconstrained input type, what total-feeling Rust conversions does it touch, and are any of them partial-with-panic?"

On the residual docs note

You're right that guide/src/languages/cpp.md (642 lines) documents the pre-existing surface but not the new Empty/Timeout/NodeFailed/Reload event types, try_next_event/next_event_timeout/events_is_empty/drain_events*/close_outputs/event_as_node_failed, or (later in this stack) node_config_json/dataflow_descriptor_json/on_input_closed/on_stop. The cxx::bridge inline doc comments do propagate to generated C++ headers as Doxygen-style annotations, so the API isn't strictly undocumented, but the long-form guide is stale.

That's a doc gap spanning 4 PRs of new C++ API surface, and it'd more than double the size of any individual PR in the stack. Suggestion: I'll file a follow-up issue tracking the doc work for the whole #1846#1849 surface and link it from each PR description. That way the docs land as one coherent update against the post-merge state instead of being smeared across 4 stacked rebases. Sound right, or want me to fold the docs into this stack now?

PR head: 5b39f72f41d9b43a. Ready for re-review.

…de_failed

Review residual: docs/api-cxx.md was stale for the new C++ node
API surface this PR adds (close_outputs, NodeFailed, Reload,
DoraNodeFailed, event_as_node_failed). Also catches up on the
#1846 additions that merged earlier in the stack (Empty,
Timeout, try_next_event, next_event_timeout, events_is_empty,
drain_events, DrainedEvents) since the DoraEventType enum and
the downcast-helper section can't be accurate without them.

Additions
=========

* New "Non-blocking and timed receive" subsection under Events
  documenting try_next_event, next_event_timeout, events_is_empty,
  and the drain_events / DrainedEvents family.

* DoraEvent downcast helpers gain event_as_node_failed.

* DoraEventType enum now lists all 10 variants currently on
  main (was 6) with one-line descriptions, plus a paragraph
  explaining the Empty vs Timeout distinction.

* New DoraNodeFailed struct section after DoraInput, with a
  short usage example.

* New OutputSender::close_outputs subsection, noting that ids
  are validated and return DoraResult.error on invalid input
  (matching the FromStr parse fix landed earlier on this PR).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@heyong4725
Copy link
Copy Markdown
Collaborator Author

Done in 532a17a8. Updated docs/api-cxx.md:

  • New Non-blocking and timed receive subsection under Events — documents try_next_event, next_event_timeout, events_is_empty, and the drain_events / DrainedEvents family. These are feat(c++/node): add event receive variants (rescue of #1409) #1846's surface but they need to be in the doc for the enum below to make sense.
  • DoraEvent downcast section — added event_as_node_failed.
  • DoraEventType enum — expanded from 6 to 10 variants (matching what's actually on main now) with one-line descriptions, plus a paragraph explaining the Empty vs Timeout semantic distinction so readers don't conflate them.
  • New DoraNodeFailed struct section after DoraInput, with a short usage example.
  • New OutputSender::close_outputs subsection in the OutputSender part, calling out that ids are validated and return DoraResult.error on invalid input (matching the FromStr parse fix from the previous review round).

Scope note: I included the #1846 catch-up (Empty/Timeout/try_next_event family) here because the enum/downcast tables can't be accurate without them. Strictly speaking that was #1846's miss, but bundling here keeps the post-merge doc state consistent with main rather than leaving an intermediate broken state for whoever picks up the next stack PR.

PR head: 41d9b43a532a17a8. The 4-PR series docs are now partially caught up (this PR covers #1846 + #1847's surface); #1848 and #1849 should each add their own surface to the same file as they land.

@heyong4725 heyong4725 merged commit a165ca6 into main May 17, 2026
11 of 36 checks passed
@heyong4725 heyong4725 deleted the rescue/cpp-close-outputs-event-types branch May 17, 2026 21:10
heyong4725 added a commit that referenced this pull request May 17, 2026
Rescue of #1413 (PavelGuzenfeld). Third in the 4-PR rescue stack
for C++ API parity; stacked on top of #1847 (close_outputs +
NodeFailed/Reload).

New FFI surface
===============

* `node_config_json(output_sender) -> Result<String>` -- returns
  this node's `NodeRunConfig` (inputs / outputs / metadata from
  the dataflow descriptor) serialized as JSON. Lets C++ nodes
  introspect their own declared interface at runtime without
  re-parsing the yaml.

* `dataflow_descriptor_json(output_sender) -> Result<String>` --
  returns the full parsed `Descriptor` (the dataflow yaml) as
  JSON. Useful for introspecting peer nodes, listing all topics,
  etc. May fail if the daemon hasn't delivered the descriptor
  yet (e.g. very early in startup).

Both functions wrap the existing `DoraNode::node_config()` and
`DoraNode::dataflow_descriptor()` methods from
`apis/rust/node/src/node/mod.rs:1116,1317`, serializing through
`serde_json::to_string`.

Closes #1413.

Co-Authored-By: Pavel Guzenfeld <PavelGuzenfeld@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
heyong4725 added a commit that referenced this pull request May 17, 2026
  + env-secrets caveat on dataflow_descriptor_json

Self-review of #1848:

1) docs/api-cxx.md was not updated to reflect this PR's new C++ API
   surface. Same gap the reviewer caught on #1847, applied to
   #1848's additions. Added subsections under OutputSender for
   `node_config_json` and `dataflow_descriptor_json`.

2) `dataflow_descriptor_json` serializes the full Descriptor, which
   includes per-node `env: { ... }` blocks. Per
   `libraries/message/src/descriptor.rs:1128`, the
   `with_expand_envs` deserializer expands `${HOST_VAR}` references
   at descriptor parse time -- so by the time the node receives the
   descriptor, env values are inline strings, not references. A
   user with secrets defined inline (`API_KEY: "sk-..."`) or via
   host substitution (`API_KEY: "${OPENAI_KEY}"`) will see those
   values in the returned JSON.

   The function's rustdoc had no caveat. Added one to both the
   cxx::bridge declaration in apis/c++/node/src/lib.rs and the
   prose subsection in docs/api-cxx.md, calling out the inline-
   literal and host-substitution paths explicitly and noting that
   callers shouldn't pipe the output to shared logs without
   sanitization.

   Consistent with dora's broader posture (`dora list --json` and
   other paths already expose descriptors) but this is a new
   in-node introspection path that may surface in different
   contexts (e.g. logging from a C++ node that runs in a less-
   privileged process than the daemon).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
heyong4725 added a commit that referenced this pull request May 17, 2026
… of #1413) (#1848)

* feat(c++/node): add node_config_json and dataflow_descriptor_json

Rescue of #1413 (PavelGuzenfeld). Third in the 4-PR rescue stack
for C++ API parity; stacked on top of #1847 (close_outputs +
NodeFailed/Reload).

New FFI surface
===============

* `node_config_json(output_sender) -> Result<String>` -- returns
  this node's `NodeRunConfig` (inputs / outputs / metadata from
  the dataflow descriptor) serialized as JSON. Lets C++ nodes
  introspect their own declared interface at runtime without
  re-parsing the yaml.

* `dataflow_descriptor_json(output_sender) -> Result<String>` --
  returns the full parsed `Descriptor` (the dataflow yaml) as
  JSON. Useful for introspecting peer nodes, listing all topics,
  etc. May fail if the daemon hasn't delivered the descriptor
  yet (e.g. very early in startup).

Both functions wrap the existing `DoraNode::node_config()` and
`DoraNode::dataflow_descriptor()` methods from
`apis/rust/node/src/node/mod.rs:1116,1317`, serializing through
`serde_json::to_string`.

Closes #1413.

Co-Authored-By: Pavel Guzenfeld <PavelGuzenfeld@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(api-cxx): document node_config_json + dataflow_descriptor_json
  + env-secrets caveat on dataflow_descriptor_json

Self-review of #1848:

1) docs/api-cxx.md was not updated to reflect this PR's new C++ API
   surface. Same gap the reviewer caught on #1847, applied to
   #1848's additions. Added subsections under OutputSender for
   `node_config_json` and `dataflow_descriptor_json`.

2) `dataflow_descriptor_json` serializes the full Descriptor, which
   includes per-node `env: { ... }` blocks. Per
   `libraries/message/src/descriptor.rs:1128`, the
   `with_expand_envs` deserializer expands `${HOST_VAR}` references
   at descriptor parse time -- so by the time the node receives the
   descriptor, env values are inline strings, not references. A
   user with secrets defined inline (`API_KEY: "sk-..."`) or via
   host substitution (`API_KEY: "${OPENAI_KEY}"`) will see those
   values in the returned JSON.

   The function's rustdoc had no caveat. Added one to both the
   cxx::bridge declaration in apis/c++/node/src/lib.rs and the
   prose subsection in docs/api-cxx.md, calling out the inline-
   literal and host-substitution paths explicitly and noting that
   callers shouldn't pipe the output to shared logs without
   sanitization.

   Consistent with dora's broader posture (`dora list --json` and
   other paths already expose descriptors) but this is a new
   in-node introspection path that may surface in different
   contexts (e.g. logging from a C++ node that runs in a less-
   privileged process than the daemon).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Pavel Guzenfeld <PavelGuzenfeld@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant