feat(c++/node): add close_outputs, NodeFailed and Reload event types (rescue of #1410)#1847
Conversation
|
Merging to
After your PR is submitted to the merge queue, this comment will be automatically updated with its status. If the PR fails, failure details will also be posted here |
Rescue of #1410 (PavelGuzenfeld). Second in the 4-PR rescue stack for the C++ API parity work; stacked on top of #1846 (event receive variants). New FFI surface =============== * Two new `DoraEventType` variants: - `NodeFailed` -- an upstream node has failed. Inputs that depended on the failed node will stop receiving data; the payload (via `event_as_node_failed`) names the failed node, the error message, and the affected input ids. - `Reload` -- hot-reload notification for an operator. C++ nodes can now react (e.g. flush caches) instead of treating it as `Unknown`. * `event_as_node_failed(event) -> DoraNodeFailed` -- extracts `{ affected_input_ids, error, source_node_id }` from a NodeFailed event. * `close_outputs(sender, output_ids)` -- selectively close one or more of this node's outputs without shutting the whole node down. Downstream subscribers see the corresponding `InputClosed` event. Implementation notes ==================== * `event_as_node_failed` destructures via `EventOrReason::Event(...)` rather than the original PR's `event.event` (which assumed the named-field DoraEvent shape from #1409 that we never adopted). * `close_outputs` returns `DoraResult` like the existing `send_output` family for ergonomic consistency on the C++ side. Closes #1410. Co-Authored-By: Pavel Guzenfeld <PavelGuzenfeld@users.noreply.github.com> Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
b02b2f1 to
5b39f72
Compare
Rescue of #1413 (PavelGuzenfeld). Third in the 4-PR rescue stack for C++ API parity; stacked on top of #1847 (close_outputs + NodeFailed/Reload). New FFI surface =============== * `node_config_json(output_sender) -> Result<String>` -- returns this node's `NodeRunConfig` (inputs / outputs / metadata from the dataflow descriptor) serialized as JSON. Lets C++ nodes introspect their own declared interface at runtime without re-parsing the yaml. * `dataflow_descriptor_json(output_sender) -> Result<String>` -- returns the full parsed `Descriptor` (the dataflow yaml) as JSON. Useful for introspecting peer nodes, listing all topics, etc. May fail if the daemon hasn't delivered the descriptor yet (e.g. very early in startup). Both functions wrap the existing `DoraNode::node_config()` and `DoraNode::dataflow_descriptor()` methods from `apis/rust/node/src/node/mod.rs:1116,1317`, serializing through `serde_json::to_string`. Closes #1413. Co-Authored-By: Pavel Guzenfeld <PavelGuzenfeld@users.noreply.github.com> Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Review finding on #1847 head 5b39f72: `close_outputs` converted caller-supplied `Vec<String>` to `Vec<DataId>` via `Into::into`, which calls `From<String> for DataId` -- documented as panicking on invalid characters at `libraries/message/src/id.rs:186` with an explicit `# Panics` notice. A typo in a C++ string literal would have aborted the whole node instead of returning `DoraResult.error` like the rest of the C++ API surface does. Same panic-across-cxx::bridge category as the `Instant::now() + Duration::from_millis(u64::MAX)` overflow on #1846 -- a public C++ API accepting unconstrained input must be panic-safe for the entire input domain. Fix: parse each id via `id.parse::<DataId>()` (`impl FromStr`, which is the safe path documented at `libraries/message/src/id.rs:167`) and on the first parse error return a `DoraResult` whose `error` names the offending id and the underlying validation error. The well-formed-id case is unchanged; only the previously- panicking error path now surfaces cleanly. Pattern bank update for me (second in this PR series): "public API accepts type T via FFI" + "T has a `From` impl that panics on invalid input" = "use `FromStr` / `TryFrom`, never the panicking `From`". The standard library's `String::from` is total, so it doesn't share this gotcha, but dora's id types deliberately panic in `From` to keep the happy path concise -- which is the right call for trusted Rust call sites and the wrong call for FFI boundaries. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Real bug, same panic-across-FFI category as #1846's The bug
The fixReplace let mut ids = Vec::with_capacity(output_ids.len());
for id in output_ids {
match id.parse::<dora_node_api::dora_core::config::DataId>() {
Ok(parsed) => ids.push(parsed),
Err(e) => {
return ffi::DoraResult {
error: format!("invalid output id '{id}': {e}"),
};
}
}
}
match output_sender.0.close_outputs(ids) {
Ok(()) => ffi::DoraResult { error: String::new() },
Err(err) => ffi::DoraResult { error: format!("{err:?}") },
}Well-formed-id case is unchanged; only the previously-panicking error path now surfaces a Pattern bank update (second time in this PR series)The unifying rule: when an FFI accepts type Lessons from #1846 + #1847 combined:
Going forward, when reviewing FFI surfaces I check: "what's the unconstrained input type, what total-feeling Rust conversions does it touch, and are any of them partial-with-panic?" On the residual docs noteYou're right that That's a doc gap spanning 4 PRs of new C++ API surface, and it'd more than double the size of any individual PR in the stack. Suggestion: I'll file a follow-up issue tracking the doc work for the whole #1846–#1849 surface and link it from each PR description. That way the docs land as one coherent update against the post-merge state instead of being smeared across 4 stacked rebases. Sound right, or want me to fold the docs into this stack now? PR head: |
…de_failed Review residual: docs/api-cxx.md was stale for the new C++ node API surface this PR adds (close_outputs, NodeFailed, Reload, DoraNodeFailed, event_as_node_failed). Also catches up on the #1846 additions that merged earlier in the stack (Empty, Timeout, try_next_event, next_event_timeout, events_is_empty, drain_events, DrainedEvents) since the DoraEventType enum and the downcast-helper section can't be accurate without them. Additions ========= * New "Non-blocking and timed receive" subsection under Events documenting try_next_event, next_event_timeout, events_is_empty, and the drain_events / DrainedEvents family. * DoraEvent downcast helpers gain event_as_node_failed. * DoraEventType enum now lists all 10 variants currently on main (was 6) with one-line descriptions, plus a paragraph explaining the Empty vs Timeout distinction. * New DoraNodeFailed struct section after DoraInput, with a short usage example. * New OutputSender::close_outputs subsection, noting that ids are validated and return DoraResult.error on invalid input (matching the FromStr parse fix landed earlier on this PR). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Done in
Scope note: I included the #1846 catch-up ( PR head: |
Rescue of #1413 (PavelGuzenfeld). Third in the 4-PR rescue stack for C++ API parity; stacked on top of #1847 (close_outputs + NodeFailed/Reload). New FFI surface =============== * `node_config_json(output_sender) -> Result<String>` -- returns this node's `NodeRunConfig` (inputs / outputs / metadata from the dataflow descriptor) serialized as JSON. Lets C++ nodes introspect their own declared interface at runtime without re-parsing the yaml. * `dataflow_descriptor_json(output_sender) -> Result<String>` -- returns the full parsed `Descriptor` (the dataflow yaml) as JSON. Useful for introspecting peer nodes, listing all topics, etc. May fail if the daemon hasn't delivered the descriptor yet (e.g. very early in startup). Both functions wrap the existing `DoraNode::node_config()` and `DoraNode::dataflow_descriptor()` methods from `apis/rust/node/src/node/mod.rs:1116,1317`, serializing through `serde_json::to_string`. Closes #1413. Co-Authored-By: Pavel Guzenfeld <PavelGuzenfeld@users.noreply.github.com> Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
+ env-secrets caveat on dataflow_descriptor_json Self-review of #1848: 1) docs/api-cxx.md was not updated to reflect this PR's new C++ API surface. Same gap the reviewer caught on #1847, applied to #1848's additions. Added subsections under OutputSender for `node_config_json` and `dataflow_descriptor_json`. 2) `dataflow_descriptor_json` serializes the full Descriptor, which includes per-node `env: { ... }` blocks. Per `libraries/message/src/descriptor.rs:1128`, the `with_expand_envs` deserializer expands `${HOST_VAR}` references at descriptor parse time -- so by the time the node receives the descriptor, env values are inline strings, not references. A user with secrets defined inline (`API_KEY: "sk-..."`) or via host substitution (`API_KEY: "${OPENAI_KEY}"`) will see those values in the returned JSON. The function's rustdoc had no caveat. Added one to both the cxx::bridge declaration in apis/c++/node/src/lib.rs and the prose subsection in docs/api-cxx.md, calling out the inline- literal and host-substitution paths explicitly and noting that callers shouldn't pipe the output to shared logs without sanitization. Consistent with dora's broader posture (`dora list --json` and other paths already expose descriptors) but this is a new in-node introspection path that may surface in different contexts (e.g. logging from a C++ node that runs in a less- privileged process than the daemon). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… of #1413) (#1848) * feat(c++/node): add node_config_json and dataflow_descriptor_json Rescue of #1413 (PavelGuzenfeld). Third in the 4-PR rescue stack for C++ API parity; stacked on top of #1847 (close_outputs + NodeFailed/Reload). New FFI surface =============== * `node_config_json(output_sender) -> Result<String>` -- returns this node's `NodeRunConfig` (inputs / outputs / metadata from the dataflow descriptor) serialized as JSON. Lets C++ nodes introspect their own declared interface at runtime without re-parsing the yaml. * `dataflow_descriptor_json(output_sender) -> Result<String>` -- returns the full parsed `Descriptor` (the dataflow yaml) as JSON. Useful for introspecting peer nodes, listing all topics, etc. May fail if the daemon hasn't delivered the descriptor yet (e.g. very early in startup). Both functions wrap the existing `DoraNode::node_config()` and `DoraNode::dataflow_descriptor()` methods from `apis/rust/node/src/node/mod.rs:1116,1317`, serializing through `serde_json::to_string`. Closes #1413. Co-Authored-By: Pavel Guzenfeld <PavelGuzenfeld@users.noreply.github.com> Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(api-cxx): document node_config_json + dataflow_descriptor_json + env-secrets caveat on dataflow_descriptor_json Self-review of #1848: 1) docs/api-cxx.md was not updated to reflect this PR's new C++ API surface. Same gap the reviewer caught on #1847, applied to #1848's additions. Added subsections under OutputSender for `node_config_json` and `dataflow_descriptor_json`. 2) `dataflow_descriptor_json` serializes the full Descriptor, which includes per-node `env: { ... }` blocks. Per `libraries/message/src/descriptor.rs:1128`, the `with_expand_envs` deserializer expands `${HOST_VAR}` references at descriptor parse time -- so by the time the node receives the descriptor, env values are inline strings, not references. A user with secrets defined inline (`API_KEY: "sk-..."`) or via host substitution (`API_KEY: "${OPENAI_KEY}"`) will see those values in the returned JSON. The function's rustdoc had no caveat. Added one to both the cxx::bridge declaration in apis/c++/node/src/lib.rs and the prose subsection in docs/api-cxx.md, calling out the inline- literal and host-substitution paths explicitly and noting that callers shouldn't pipe the output to shared logs without sanitization. Consistent with dora's broader posture (`dora list --json` and other paths already expose descriptors) but this is a new in-node introspection path that may surface in different contexts (e.g. logging from a C++ node that runs in a less- privileged process than the daemon). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Pavel Guzenfeld <PavelGuzenfeld@users.noreply.github.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Rescue of #1410 (@PavelGuzenfeld). Second PR in the 4-PR rescue stack for C++ API parity.
Stacked on top of #1846 (event receive variants). Once #1846 merges I'll rebase this onto main.
Closes #1410.
New FFI surface
DoraEventType::NodeFailedDoraEventType::ReloadUnknown.DoraNodeFailed { affected_input_ids, error, source_node_id }event_as_node_failed.event_as_node_failed(event)NodeFailedevent.close_outputs(sender, output_ids)InputClosed.Implementation notes
event_as_node_faileddestructures viaEventOrReason::Event(Event::NodeFailed { … })to match the new tuple-struct shape that feat(c++/node): add event receive variants (rescue of #1409) #1846's rescue introduced (the original PR usedevent.eventwhich assumed the never-adopted named-field shape).close_outputscalls through todora_node_api::DoraNode::close_outputs(Vec<DataId>)which has been on main since the 1.0 consolidation. Errors are returned asDoraResult { error }for ergonomic consistency withsend_output.event_typegets two new match arms (Event::NodeFailed { .. }→NodeFailed,Event::Reload { .. }→Reload). Existing variants unchanged.Test plan
cargo build -p dora-node-api-cxx— clean compilecargo clippy -p dora-node-api-cxx -- -D warnings— cleanexamples/c++-dataflow/node-rust-api/main.cc) extended to handleNodeFailedandReloadevent types🤖 Generated with Claude Code