Upstream Change Summary
Type: architectural-change
Difficulty: Hard
Recommendation: Adapt
A major structural refactor of the engine ↔ state-store boundary with three coordinated parts:
1. eager_existence_upsert at Build start: Component::execute_once now writes the component's own existence bit (and ancestor chain) in its own write transaction before the user processor runs. The Build branch is removed from pre_commit; only the Delete-mode node-type check remains. Simplifies pre_commit and makes the existence-chain invariant explicit at build entry.
2. Borrow-only TargetHandler::reconcile: desired_target_state changes from Option<Prof::TargetStateValue> (owned) to Option<&Prof::TargetStateValue> (borrow). Host-specific implementations decide whether/how to clone (Rust SDK: .clone() when constructing the sink action). Provider-generation updates are buffered into PreCommitOutput.deferred_provider_generations and applied after the outer run_txn commits — this prevents the OnceLock::set "already set" guard from tripping on retries. The engine itself never clones the value.
3. 40001 retry plumbing: Storage::run_txn_with_retry + Pg40001Backoff (full-jitter exponential, 10ms → 1000ms) + is_pg_serialization_failure stub (always returns false in OSS). For LMDB these are no-ops, but the infrastructure makes layering a Postgres-backed state-store mechanical. Simple idempotent call sites use run_txn_with_retry directly.
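The retry-safety argument in part 2 can be illustrated with a small sketch. It is not the upstream code: `Provider`, `run_txn`, `pre_commit`, and `commit` here are simplified, hypothetical stand-ins, and only `OnceLock::set` and the field name `deferred_provider_generations` come from the summary above. The point is that the transaction body may run more than once, so it only records the generation it wants to publish; the `OnceLock` is touched exactly once, after the transaction has committed.

```rust
use std::sync::OnceLock;

/// Hypothetical provider slot whose generation may be set only once.
struct Provider {
    generation: OnceLock<u64>,
}

/// Simplified stand-in for the pre-commit result; only the field name is
/// taken from the summary, the element type is an assumption.
#[derive(Default)]
struct PreCommitOutput {
    deferred_provider_generations: Vec<(usize, u64)>,
}

/// Transaction body. It can be re-run on retry, so it must not call
/// `OnceLock::set`; it only buffers the intended update.
fn pre_commit(attempt: u32) -> Result<PreCommitOutput, &'static str> {
    if attempt == 0 {
        // Simulate a serialization failure on the first attempt.
        return Err("serialization failure (simulated)");
    }
    Ok(PreCommitOutput {
        deferred_provider_generations: vec![(0, 7)],
    })
}

/// Hypothetical retrying transaction runner: re-invokes the body until it
/// succeeds (a real runner would bound the number of attempts).
fn run_txn<T>(mut body: impl FnMut(u32) -> Result<T, &'static str>) -> T {
    let mut attempt = 0;
    loop {
        match body(attempt) {
            Ok(value) => return value,
            Err(_) => attempt += 1,
        }
    }
}

/// Apply buffered generations only after the transaction has committed.
fn commit(providers: &[Provider]) {
    let output = run_txn(pre_commit);
    for (index, generation) in output.deferred_provider_generations {
        providers[index]
            .generation
            .set(generation)
            .expect("generation is set exactly once, after commit");
    }
}
```

Had `pre_commit` called `set` directly, the second attempt would have hit the "already set" error, because a second `OnceLock::set` on the same cell returns `Err`.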
Upstream References
Relevant Upstream Files / Areas
rust/cocoindex/src/engine/ — Component::execute_once, pre_commit, PreCommitOutput, Committer, build entry
rust/cocoindex/src/state_store/ — Storage::run_txn_with_retry, Pg40001Backoff
rust/cocoindex/src/ops/interface.rs — TargetHandler::reconcile signature change (owned → borrow)
Recoco Considerations
- Affected files: crates/recoco-core/src/engine/, crates/recoco-core/src/state_store/, crates/recoco-core/src/ops/interface.rs
- PUBLIC API BREAKING CHANGE: TargetHandler::reconcile signature changes from Option<Prof::TargetStateValue> to Option<&Prof::TargetStateValue>. Any downstream crate implementing a custom target handler will need to update. Assess recoco's semver version bump requirements before landing.
- 40001 retry: Keep the infrastructure but clearly document the stub. No behavioral change for LMDB; the plumbing enables a future Postgres state-store.
- deferred_provider_generations: The OnceLock::set retry-safety fix is an important correctness property; ensure it is retained.
- Python-specific: clone_ref(py) in the Python reconcile implementation should be excluded; the Rust SDK uses .clone() only.
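For downstream implementers, the signature change noted in the list above might look roughly like the following. This is a deliberately simplified, hypothetical trait: the real `TargetHandler` is generic over a profile (`Prof`) and takes more parameters, and `KvTarget` and `SinkAction` are invented for illustration. What it shows is the ownership shift: the caller keeps the value and lends it, and the handler clones only where it actually needs an owned copy.

```rust
/// Hypothetical, simplified stand-in for the upstream trait.
trait TargetHandler {
    type TargetStateValue: Clone;
    type Action;

    // Before the change: `desired: Option<Self::TargetStateValue>` (owned).
    // After the change:  `desired: Option<&Self::TargetStateValue>` (borrowed).
    fn reconcile(
        &self,
        key: &str,
        desired: Option<&Self::TargetStateValue>,
    ) -> Option<Self::Action>;
}

/// Invented sink action type for this sketch.
#[derive(Debug, PartialEq)]
enum SinkAction {
    Upsert { key: String, value: String },
    Delete { key: String },
}

/// Invented target implementation for this sketch.
struct KvTarget;

impl TargetHandler for KvTarget {
    type TargetStateValue = String;
    type Action = SinkAction;

    fn reconcile(&self, key: &str, desired: Option<&String>) -> Option<SinkAction> {
        Some(match desired {
            // The handler, not the caller, decides to clone when it builds
            // the sink action.
            Some(value) => SinkAction::Upsert {
                key: key.to_owned(),
                value: value.clone(),
            },
            None => SinkAction::Delete { key: key.to_owned() },
        })
    }
}
```

Because the caller never gives up ownership, it can pass the same value to `reconcile` again on a retry without cloning it first.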
Integration Notes
Tackle in three phases:
- eager_existence_upsert (self-contained, simplifies pre_commit; lowest risk)
- Borrow-only TargetHandler::reconcile (breaks API; coordinate with semver policy, update all built-in targets)
- Retry infrastructure (additive, no behavior change for LMDB; lowest priority)
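As a rough picture of what the retry phase involves, here is a minimal sketch of a full-jitter exponential backoff wrapped around a retry loop. The names `FullJitterBackoff`, `TxnError`, `run_with_retry`, and `is_serialization_failure_stub` are hypothetical and are not the upstream `Pg40001Backoff` or `Storage::run_txn_with_retry` APIs; only the 10ms to 1000ms bounds, the full-jitter shape, and the always-false stub come from the summary. A small xorshift generator stands in for a real random source so the sketch needs no external crate.

```rust
use std::time::Duration;

/// Hypothetical backoff: delay is uniform in [0, min(cap, base * 2^attempt)].
struct FullJitterBackoff {
    attempt: u32,
    base: Duration,
    cap: Duration,
    rng_state: u64, // xorshift64 state; an assumption, not the upstream RNG
}

impl FullJitterBackoff {
    fn new(seed: u64) -> Self {
        FullJitterBackoff {
            attempt: 0,
            base: Duration::from_millis(10),
            cap: Duration::from_millis(1000),
            rng_state: seed.max(1), // xorshift must not start at zero
        }
    }

    /// Upper bound for the current attempt, clamped to the cap.
    fn ceiling(&self) -> Duration {
        let factor = 1u32.checked_shl(self.attempt).unwrap_or(u32::MAX);
        self.base.saturating_mul(factor).min(self.cap)
    }

    /// Pick a pseudo-random delay in [0, ceiling] and advance the attempt.
    fn next_delay(&mut self) -> Duration {
        let ceiling_ms = self.ceiling().as_millis() as u64;
        self.rng_state ^= self.rng_state << 13;
        self.rng_state ^= self.rng_state >> 7;
        self.rng_state ^= self.rng_state << 17;
        self.attempt = self.attempt.saturating_add(1);
        Duration::from_millis(self.rng_state % (ceiling_ms + 1))
    }
}

#[derive(Debug, PartialEq)]
enum TxnError {
    Serialization, // would correspond to SQLSTATE 40001
    Other(String),
}

/// Stub classifier: always false, so nothing is ever retried.
fn is_serialization_failure_stub(_err: &TxnError) -> bool {
    false
}

/// Re-run `body` while the classifier says the error is retryable.
/// `sleep` is injected so callers (and tests) control how waiting happens.
fn run_with_retry<T>(
    max_attempts: u32,
    is_retryable: impl Fn(&TxnError) -> bool,
    mut sleep: impl FnMut(Duration),
    mut body: impl FnMut() -> Result<T, TxnError>,
) -> Result<T, TxnError> {
    let mut backoff = FullJitterBackoff::new(0x9E37_79B9_7F4A_7C15);
    let mut attempt = 1;
    loop {
        match body() {
            Ok(value) => return Ok(value),
            Err(err) if attempt < max_attempts && is_retryable(&err) => {
                sleep(backoff.next_delay());
                attempt += 1;
            }
            Err(err) => return Err(err),
        }
    }
}
```

With the stub classifier the loop runs the body exactly once, which matches the "no behavior change for LMDB" note; passing `std::thread::sleep` as `sleep` and a real classifier would turn on the retries. The body must be idempotent, since it can run several times.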
This is the most significant change in this batch. Review recoco issues #174 (concurrent preempt race) and #176 (ReadTxn removal) first, as they interact with the same engine/state-store boundary.