Remove chain actors (#5502, #5687) #5790
Merged
d4740e1 to
8bf5f36
, linera-io#5687) Backport of linera-io#5502 (Remove chain actors; handle read-only calls concurrently) and linera-io#5687 (Fix race conditions with getting and dropping chain workers) to testnet_conway. Replaces the channel-based ChainWorkerActor with a direct Arc<RwLock<ChainWorkerState>> approach, enabling concurrent read-only operations. Uses a lock-free papaya::HashMap with Shared<oneshot::Receiver<Weak<_>>> for race-free worker creation, and Arc::try_unwrap in the keep-alive task for safe worker cleanup. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
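As a rough illustration of the locking approach this commit describes, here is a minimal sketch using only the standard library. `ChainHandle`, its methods, and the single-field `ChainWorkerState` are hypothetical stand-ins, and `std::sync::RwLock` with OS threads is swapped in for whatever async primitives the real code uses. The `papaya::HashMap` and `Shared<oneshot::Receiver<Weak<_>>>` parts are not modeled here.

```rust
use std::sync::{Arc, RwLock};
use std::thread;

/// Hypothetical stand-in for the real `ChainWorkerState`.
#[derive(Debug, Default)]
pub struct ChainWorkerState {
    pub next_block_height: u64,
}

/// Hypothetical handle: shared ownership plus a reader-writer lock.
#[derive(Clone, Default)]
pub struct ChainHandle {
    state: Arc<RwLock<ChainWorkerState>>,
}

impl ChainHandle {
    /// Read-only query: takes the shared lock, so many readers can run at once.
    pub fn next_block_height(&self) -> u64 {
        self.state.read().expect("lock poisoned").next_block_height
    }

    /// Mutation: takes the exclusive lock, so it runs alone.
    pub fn advance(&self) -> u64 {
        let mut state = self.state.write().expect("lock poisoned");
        state.next_block_height += 1;
        state.next_block_height
    }
}

/// Spawns `readers` threads that all query the same worker state.
pub fn concurrent_reads(handle: &ChainHandle, readers: usize) -> Vec<u64> {
    let threads: Vec<_> = (0..readers)
        .map(|_| {
            let handle = handle.clone();
            thread::spawn(move || handle.next_block_height())
        })
        .collect();
    threads
        .into_iter()
        .map(|t| t.join().expect("reader panicked"))
        .collect()
}

/// Cleanup in the spirit of the keep-alive task: the state is reclaimed only
/// if no other handle still holds a strong reference to it.
pub fn try_unload(handle: ChainHandle) -> Result<ChainWorkerState, ChainHandle> {
    match Arc::try_unwrap(handle.state) {
        Ok(lock) => Ok(lock.into_inner().expect("lock poisoned")),
        Err(state) => Err(ChainHandle { state }),
    }
}
```

In this sketch, `try_unload` hands the handle back unchanged while another clone is alive, which is the property that makes `Arc::try_unwrap` a safe way to decide when a worker can be dropped.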
8bf5f36 to
2f3d401
There should only be one stage_block_execution, taking a policy argument. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
bart-linera
approved these changes
Mar 26, 2026
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Port all new testnet_conway functionality into the no-chain-actor architecture: RevertConfirm cross-chain requests, inbox gap detection, outbox revert, message bundle chunking, reset-on-incorrect-outcome, poisoned worker handling, and next_expected_events support. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Minimizes the diff by keeping methods in the same order as in the testnet_conway branch. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When a journal resolution failure poisons the chain worker, the view's in-memory state is inconsistent. Rolling back would give a false sense of consistency, so the RollbackGuard now skips rollback for poisoned workers. Both chain_read and chain_write evict poisoned workers from the cache so the next request reloads from storage. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
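The guard behavior described in this commit can be sketched with a plain `Drop` implementation. This is only an illustration under stated assumptions: the `View` type, its `saved`/`staged`/`poisoned` fields, and the `commit` method are hypothetical, and the real `RollbackGuard` may be structured quite differently.

```rust
/// Hypothetical in-memory view with staged (unsaved) changes.
#[derive(Debug, Default)]
pub struct View {
    pub saved: u64,
    pub staged: u64,
    pub poisoned: bool,
}

impl View {
    /// Discards staged changes by resetting to the last saved value.
    pub fn rollback(&mut self) {
        self.staged = self.saved;
    }

    /// Persists staged changes.
    pub fn save(&mut self) {
        self.saved = self.staged;
    }
}

/// Rolls the view back on drop, unless it was committed or is poisoned.
pub struct RollbackGuard<'a> {
    view: &'a mut View,
    committed: bool,
}

impl<'a> RollbackGuard<'a> {
    pub fn new(view: &'a mut View) -> Self {
        Self { view, committed: false }
    }

    /// Gives mutable access to the guarded view.
    pub fn view(&mut self) -> &mut View {
        &mut *self.view
    }

    /// Saves the staged changes and disarms the rollback.
    pub fn commit(mut self) {
        self.view.save();
        self.committed = true;
    }
}

impl Drop for RollbackGuard<'_> {
    fn drop(&mut self) {
        // A poisoned view is inconsistent, so rolling it back would only
        // look consistent. Leave it alone and let the caller evict it.
        if self.committed || self.view.poisoned {
            return;
        }
        self.view.rollback();
    }
}
```

A caller that sees `poisoned == true` after the guard is dropped would then evict the worker from its cache, so that the next request reloads the state from storage.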
Port rename of missing_height to retransmit_from in RevertConfirm, block value cache improvements, and other recent testnet_conway changes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
afck
added a commit
that referenced
this pull request
Apr 14, 2026
… `actor.rs`. (#6000)

## Motivation

In #5790 I must have forgotten to remove (or brought back in a merge attempt) `actor.rs`.

The port of #5991 to testnet_conway (#5992) modified `linera-core/src/chain_worker/actor.rs`, but `actor.rs` is an orphan file with no `mod actor` declaration in `chain_worker/mod.rs`, so it is not part of the build and the TTL inversion was never actually fixed on this branch.

## Proposal

Apply the same swap to `handle.rs::create_chain_worker`, which is the code path that is actually compiled, and delete the stale `actor.rs` file so this cannot happen again.

## Test Plan

CI

## Release Plan

- Release SDK.
- Validator hotfix.

## Links

- Original TTL fix attempt: #5992
- Removing chain actors but not `actor.rs`: #5790
- [reviewer checklist](https://github.com/linera-io/linera-protocol/blob/main/CONTRIBUTING.md#reviewer-checklist)

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ma2bd
added a commit
to ma2bd/linera-protocol
that referenced
this pull request
Apr 17, 2026
The previous implementation wrote to the DB first, then updated the LRU cache. If the future was cancelled between the two (e.g. by the RollbackGuard introduced in linera-io#5790), the DB would have the new data but the cache would retain stale entries. Subsequent reads would hit the stale cache, and the next save would overwrite the DB with old data, causing silent data loss. Fix: invalidate cache entries BEFORE writing to the DB, then repopulate after success. If cancelled at any point after invalidation, subsequent reads go directly to the DB and see the correct state. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
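The invalidate-first ordering described in this commit can be sketched as follows. This is a simplified, synchronous model, not the actual `LruCachingStore`: the `CachingStore` type, its `HashMap`-backed `db` and `cache` fields, and the `interrupt_after_db_write` flag (which stands in for the future being cancelled mid-write) are all assumptions made for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical store: a backing "DB" with a read-through cache in front.
#[derive(Default)]
pub struct CachingStore {
    pub db: HashMap<String, String>,
    pub cache: HashMap<String, String>,
}

impl CachingStore {
    /// Invalidate-first write. `interrupt_after_db_write` simulates the
    /// caller being cancelled before the cache is repopulated.
    pub fn write_batch(&mut self, batch: &[(&str, &str)], interrupt_after_db_write: bool) {
        // 1. Drop every cached entry the batch is about to change.
        for (key, _) in batch {
            self.cache.remove(*key);
        }
        // 2. Write the new values to the DB.
        for (key, value) in batch {
            self.db.insert(key.to_string(), value.to_string());
        }
        if interrupt_after_db_write {
            return;
        }
        // 3. Repopulate the cache only after the DB write succeeded.
        for (key, value) in batch {
            self.cache.insert(key.to_string(), value.to_string());
        }
    }

    /// Read-through lookup: serve from the cache, else fall back to the DB.
    pub fn read(&mut self, key: &str) -> Option<String> {
        if let Some(value) = self.cache.get(key) {
            return Some(value.clone());
        }
        let value = self.db.get(key).cloned()?;
        self.cache.insert(key.to_string(), value.clone());
        Some(value)
    }
}
```

With this ordering, an interruption after step 1 or step 2 leaves the cache empty for the affected keys rather than stale, so the next read falls through to the DB.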
This was referenced Apr 17, 2026
ma2bd
added a commit
that referenced
this pull request
Apr 19, 2026
…ing task::spawn) (#6056)

## Motivation

The `LruCachingStore::write_batch` method was not cancellation-safe. It wrote to the DB first, then updated the LRU cache. If the caller's future was cancelled between the two steps (e.g. by a gRPC timeout or runtime shutdown), the DB would have the new data but the cache would retain stale entries. The subsequent `RollbackGuard` would then reset the in-memory view state, and the next `save()` would overwrite the DB with old data -- causing silent data loss.

This was identified as the likely root cause of the missing outbox bucket data on validator 4 (`The front bucket is always loaded` panic).

## Proposal

Wrap the DB write and cache update in a `tokio::task::spawn` so they run to completion even if the caller's future is cancelled. On web targets, the task runs inline since cancellation safety is not a concern there (no `RollbackGuard` / concurrent chain workers).

Adds `Clone + Send + 'static` bounds on the inner store type parameter, which are already satisfied by all stores in the chain (ScyllaDB, RocksDB, journaling, value-splitting).

## Test Plan

CI

## Links

- Alternative to #6051 (invalidate-first approach)
- Related: #5790 (Remove chain actors -- introduced `RollbackGuard`)
- Related: #6015 (Fix `BucketQueueView::delete_front` load failure handling)
- Related: #6046 (Fix storage cache reads dropping Arc before use)
- [reviewer checklist](https://github.com/linera-io/linera-protocol/blob/main/CONTRIBUTING.md#reviewer-checklist)

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
afck
pushed a commit
to afck/linera-protocol
that referenced
this pull request
Apr 20, 2026
Backport of #5502 and #5687.
Motivation
The chain actors are complicated and unnecessary, and they force even read-only requests to run sequentially.
Proposal
Remove the chain actors and use an `RwLock` instead.
Test Plan
CI should catch regressions. We should do benchmarks to see if this improves performance.
Release Plan
Links