[upstream-sync] refactor(engine,state_store): parallel-write shape, borrow-only reconcile (upstream #2022) #178

@github-actions

Upstream Change Summary

Type: architectural-change
Difficulty: Hard
Recommendation: Adapt

A major structural refactor of the engine ↔ state-store boundary with three coordinated parts:

1. eager_existence_upsert at Build start: Component::execute_once now writes the component's own existence bit (and ancestor chain) in its own write transaction before the user processor runs. The Build branch is removed from pre_commit; only the Delete-mode node-type check remains. Simplifies pre_commit and makes the existence-chain invariant explicit at build entry.

2. Borrow-only TargetHandler::reconcile: desired_target_state changes from Option<Prof::TargetStateValue> (owned) to Option<&Prof::TargetStateValue> (borrowed). The engine itself never clones the value; host-specific implementations decide whether and how to clone (the Rust SDK calls .clone() when constructing the sink action). Provider-generation updates are buffered into PreCommitOutput.deferred_provider_generations and applied after the outer run_txn commits, which prevents the OnceLock::set "already set" guard from tripping on retries.

3. 40001 retry plumbing (40001 is the Postgres SQLSTATE for serialization_failure): Storage::run_txn_with_retry + Pg40001Backoff (full-jitter exponential, 10ms → 1000ms) + is_pg_serialization_failure stub (always returns false in OSS). For LMDB these are no-ops, but the infrastructure makes layering a Postgres-backed state-store mechanical. Simple idempotent call sites use run_txn_with_retry directly.
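
As a rough illustration of part 3, the sketch below shows one way a full-jitter exponential backoff (10ms base, 1000ms cap) and a retry wrapper could fit together. It is not the upstream code: FullJitterBackoff, run_with_retry, the max_attempts limit, the injected sleep callback, and the small xorshift generator (used only to avoid an external crate, so the jitter is approximately uniform) are all assumptions made for this example.

```rust
use std::time::Duration;

/// Hypothetical stand-in for a backoff policy; not the upstream Pg40001Backoff.
struct FullJitterBackoff {
    base_ms: u64,
    cap_ms: u64,
    attempt: u32,
    rng_state: u64, // tiny xorshift PRNG so the sketch needs no external crate
}

impl FullJitterBackoff {
    fn new(seed: u64) -> Self {
        // xorshift must not start at zero, hence max(1).
        Self { base_ms: 10, cap_ms: 1000, attempt: 0, rng_state: seed.max(1) }
    }

    fn next_u64(&mut self) -> u64 {
        let mut x = self.rng_state;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.rng_state = x;
        x
    }

    /// Upper bound for the current attempt: min(cap, base * 2^attempt).
    fn ceiling_ms(&self) -> u64 {
        let factor = 1u64.checked_shl(self.attempt).unwrap_or(u64::MAX);
        self.base_ms.saturating_mul(factor).min(self.cap_ms)
    }

    /// Full jitter: pick a delay in [0, ceiling], then advance the attempt counter.
    fn next_delay(&mut self) -> Duration {
        let ceiling = self.ceiling_ms();
        let ms = self.next_u64() % (ceiling + 1);
        self.attempt = self.attempt.saturating_add(1);
        Duration::from_millis(ms)
    }
}

/// Re-runs `op` while `is_retryable` accepts the error and attempts remain.
/// `sleep` is injected so callers (and tests) control how the delay is spent.
fn run_with_retry<T, E>(
    max_attempts: u32,
    backoff: &mut FullJitterBackoff,
    is_retryable: impl Fn(&E) -> bool,
    mut sleep: impl FnMut(Duration),
    mut op: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let mut attempts = 0;
    loop {
        attempts += 1;
        match op() {
            Ok(value) => return Ok(value),
            Err(e) if attempts < max_attempts && is_retryable(&e) => {
                sleep(backoff.next_delay())
            }
            Err(e) => return Err(e),
        }
    }
}
```

With a predicate that always returns false, which is how the issue describes the OSS is_pg_serialization_failure stub, the wrapper runs the operation once and passes any error straight through, so an LMDB-style backend would see no behavioral change.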

Upstream References

Relevant Upstream Files / Areas

  • rust/cocoindex/src/engine/: Component::execute_once, pre_commit, PreCommitOutput, Committer, build entry
  • rust/cocoindex/src/state_store/: Storage::run_txn_with_retry, Pg40001Backoff
  • rust/cocoindex/src/ops/interface.rs: TargetHandler::reconcile signature change (owned → borrow)

Recoco Considerations

  • Affected files: crates/recoco-core/src/engine/, crates/recoco-core/src/state_store/, crates/recoco-core/src/ops/interface.rs
  • PUBLIC API BREAKING CHANGE: TargetHandler::reconcile signature changes from Option<Prof::TargetStateValue> to Option<&Prof::TargetStateValue>. Any downstream crate implementing a custom target handler will need to update. Assess recoco's semver bump requirements before landing.
  • 40001 retry: Keep the infrastructure but clearly document the stub. No behavioral change for LMDB; the plumbing enables a future Postgres state-store.
  • deferred_provider_generations: The OnceLock::set retry-safety fix is an important correctness property — ensure this is retained.
  • Exclude the Python-specific clone_ref(py) used in the Python reconcile implementation; the Rust SDK uses .clone() only.
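
To make the deferred_provider_generations point concrete, here is a minimal sketch of why buffering matters when a transaction body can be retried. std::sync::OnceLock::set returns Err on a second call, so the sketch records updates during each attempt and applies them once after the simulated commit. Provider, PreCommitSketch, pre_commit, run, the retry limit, and the simulated failure are all hypothetical names and behavior for this example, not the upstream types.

```rust
use std::sync::OnceLock;

/// Hypothetical provider whose generation may be set only once.
struct Provider {
    generation: OnceLock<u64>,
}

/// Hypothetical stand-in for the pre-commit output: updates are recorded, not applied.
#[derive(Default)]
struct PreCommitSketch {
    deferred_generations: Vec<(usize, u64)>, // (provider index, generation)
}

/// Body of one transaction attempt. It may run several times if the
/// transaction is retried, so it must not touch the OnceLock directly.
fn pre_commit(attempt: u32) -> Result<PreCommitSketch, &'static str> {
    let mut out = PreCommitSketch::default();
    out.deferred_generations.push((0, 7));
    if attempt == 0 {
        // Simulated failure: the buffered updates are simply dropped.
        return Err("simulated serialization failure");
    }
    Ok(out)
}

/// Retries the attempt, then applies the buffered updates once after "commit".
fn run(providers: &[Provider]) -> Result<(), &'static str> {
    let mut attempt = 0;
    let out = loop {
        match pre_commit(attempt) {
            Ok(out) => break out,
            Err(_) if attempt < 3 => attempt += 1,
            Err(e) => return Err(e),
        }
    };
    for (idx, generation) in out.deferred_generations {
        providers[idx]
            .generation
            .set(generation)
            .map_err(|_| "generation already set")?;
    }
    Ok(())
}
```

Had pre_commit called set directly, the first (failed) attempt would have filled the OnceLock and the retry would have hit the "already set" error even though nothing was committed.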

Integration Notes

Tackle in three phases:

  1. eager_existence_upsert (self-contained, simplifies pre_commit — lowest risk)
  2. Borrow-only TargetHandler::reconcile (breaks API — coordinate with semver policy, update all built-in targets)
  3. Retry infrastructure (additive, no behavior change for LMDB — lowest priority)
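
For phase 2, the following sketch shows roughly what the owned-to-borrowed migration looks like from a downstream implementer's side. The trait is heavily simplified and hypothetical: ReconcileSketch, SinkAction, and FileTarget are invented for illustration, the real TargetHandler::reconcile takes more parameters, and mapping None to a delete action is an assumption of this example.

```rust
/// Hypothetical sink action built by a target handler.
#[derive(Debug, PartialEq)]
enum SinkAction<V> {
    Upsert(V),
    Delete,
}

/// Hypothetical, heavily simplified handler trait; the real
/// TargetHandler::reconcile takes more parameters than shown here.
trait ReconcileSketch {
    type Value: Clone;

    // Before the change (owned):
    //   fn reconcile(&self, desired: Option<Self::Value>) -> SinkAction<Self::Value>;
    // After the change (borrowed):
    fn reconcile(&self, desired: Option<&Self::Value>) -> SinkAction<Self::Value>;
}

struct FileTarget;

impl ReconcileSketch for FileTarget {
    type Value = String;

    fn reconcile(&self, desired: Option<&String>) -> SinkAction<String> {
        match desired {
            // The handler, not the caller, decides to clone when it
            // needs an owned value for the sink action.
            Some(value) => SinkAction::Upsert(value.clone()),
            // Assumption for this sketch: no desired state means delete.
            None => SinkAction::Delete,
        }
    }
}
```

The caller keeps ownership of the desired state and passes Some(&value), so it can still use the value afterwards; a handler that only inspects the value never pays for a clone.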

This is the most significant change in this batch. Review recoco issues #174 (concurrent preempt race) and #176 (ReadTxn removal) first, as they interact with the same engine/state-store boundary.

Labels: claude, upstream-sync
