Skip to content

fix(common): Dropping CrossProcessLock must not release the lock if a CrossProcessGuard is still alive#6674

Merged
Hywan merged 5 commits into
matrix-org:mainfrom
Hywan:refactor-common-cross-process-lock-num_holders
Jun 25, 2026
Merged

fix(common): Dropping CrossProcessLock must not release the lock if a CrossProcessGuard is still alive#6674
Hywan merged 5 commits into
matrix-org:mainfrom
Hywan:refactor-common-cross-process-lock-num_holders

Conversation

@Hywan

@Hywan Hywan commented Jun 18, 2026

Copy link
Copy Markdown
Member

This patch revisits the way the number of holders is calculated for a cross-process lock. The postulate is the following: Prior to this patch, we used an Arc<AtomicU32>. However, Arc already does atomic counting. So we do a double atomic counting to calculate the number of holders. It seems a bit a waste of work. This patch replaces Arc<AtomicU32> by Arc<()>. This patch abstracts this with a new type: Holders, with the following method:

  • clone to act as Arc, and
  • count to replace the AtomicU32::load, it instead calls Arc::strong_count, which is almost the same. The only difference is that the memory order isn't the same: it was Ordering::SeqCst before, now it is Ordering::Relaxed. I've reviewed the code and it does not seem to make any relevant difference.

A nice side-effect is that we can implement CrossProcessLockGuard::lock_has_been_dropped, because, now, we can know when the lock has been dropped and the guard is “detached”, so it won't have any effect at all. It can be convenient!

All tests have been updated to test the count() in more details, making test_number_of_holders_follows_the_number_of_guards useless: this patch removes it.


Oh wait, now I've realised that we have a big bug :-]. CrossProcessLockGuard::lock_has_been_dropped is wrroooooong! As long as the guard is alive, the lock must be kept alive, of course. The last patch revisits the design of the CrossProcessLock to fix this problem.

This last patch fixes an important problem with the CrossProcessLock. I noticed that when the CrossProcessLock is dropped, all CrossProcessLockGuards report 0 holder, and thus, the lock is released, but it's wrong. As long as guards are alive, the lock MUST NOT be released. Dropping all the guards releases the lock, but dropping the CrossProcessLock —which is not the lock in itself— must not drop the acquired lock.

Previously, the number of holders was calculated based on Arc::strong_count. The new strategy is to use Arc combined with Weak because we can count both separately (!):

  • CrossProcessLock holds an Arc (the inner field).
  • The renew_task holds an Arc (a clone of the inner field).
  • CrossProcessLockGuard holds a Weak.
  • Counting holders = counting the number of Weak pointers.
  • It is safe to upgrade the Weak pointer to an Arc (to get information about dirtiness) in a guard because the renew_task holds a clone of the Arc and will not exit as long as all guards have been dropped.
  • It is always possible to create a Weak pointer (i) either from CrossProcessLock by using Arc::downgrade, (ii) or from CrossProcessLockGuard by cloning it.

The Holders type is removed as it becomes useless. The new CrossProcessLockInner type holds all data useful for the lock to be alive. Counting the number of holders boils down to counting the weak pointers of Weak<CrossProcessLockInner>.


  • I've documented the public API changes in the appropriate changelog files
    (see Writing changelog entries).
  • This PR was made with the help of AI.

Signed-off-by:

…er of cross-process lock holders.

This patch revisits the way the number of holders is calculated for
a cross-process lock. The postulate is the following: Prior to this
patch, we used an `Arc<AtomicU32>`. However, `Arc` already does
atomic counting. So we do a double atomic counting to calculate the
number of holders. It seems a bit a waste of work. This patch replaces
`Arc<AtomicU32>` by `Arc<()>`. This patch abstracts this with a new
type: `Holders`, with the following method:

- `clone` to act as `Arc`, and
- `count` to replace the `AtomicU32::load`, it instead calls
  `Arc::strong_count`, which is the same. Note: the atomic ordering
  isn't the same: it was `SeqCst` before, now it is `Relaxed`.
  I've reviewed the code and it does not seem to make any relevant
  difference.

A nice side-effect is that we can implement
`CrossProcessLockGuard::lock_has_been_dropped`, because, now, we can
know when the lock has been dropped and the guard is “detached”, so it
won't any effect at all. It can be convenient!

All tests have been updated to test the `count()` in more details,
making `test_number_of_holders_follows_the_number_of_guards` useless:
this patch removes it.
@Hywan Hywan marked this pull request as ready for review June 18, 2026 14:22
@Hywan Hywan requested a review from a team as a code owner June 18, 2026 14:22
@Hywan Hywan requested review from stefanceriu and removed request for a team June 18, 2026 14:22
@codecov

codecov Bot commented Jun 18, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 93.00000% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 89.89%. Comparing base (3862625) to head (89d2a79).
⚠️ Report is 12 commits behind head on main.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
crates/matrix-sdk-common/src/cross_process_lock.rs 93.00% 6 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6674      +/-   ##
==========================================
+ Coverage   89.88%   89.89%   +0.01%     
==========================================
  Files         396      396              
  Lines      110236   110256      +20     
  Branches   110236   110256      +20     
==========================================
+ Hits        99083    99114      +31     
+ Misses       7385     7380       -5     
+ Partials     3768     3762       -6     

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

@codspeed-hq

codspeed-hq Bot commented Jun 18, 2026

Copy link
Copy Markdown

Merging this PR will not alter performance

✅ 50 untouched benchmarks


Comparing Hywan:refactor-common-cross-process-lock-num_holders (89d2a79) with main (4bd5e81)

Open in CodSpeed

Comment thread crates/matrix-sdk-common/src/cross_process_lock.rs Outdated
…w_task`.

This patch replaces a manual management of a `JoinHandle` by
`AbortOnDrop`. At least it makes things explicit and we re-use existing
mechanisms.
@Hywan Hywan force-pushed the refactor-common-cross-process-lock-num_holders branch 2 times, most recently from 972cadd to b354f74 Compare June 19, 2026 14:28
Hywan added 2 commits June 19, 2026 16:31
… a `CrossProcessGuard` is still alive.

This patch fixes an important problem with the `CrossProcessLock`.
I noticed that when the `CrossProcessLock` is dropped, all
`CrossProcessLockGuard`s report 0 holder, and thus, the lock is
released, but it's wrong. As long as guards are alive, the lock **MUST
NOT** be released. Dropping all the guards releases the lock, but
dropping the `CrossProcessLock` —which is not the lock in itself— must
not drop the acquired lock.

Previously, the number of holders was calculated based on
`Arc::strong_count`. The new strategy is to use `Arc` combined with
`Weak` because we can count both separately (!):

- `CrossProcessLock` holds an `Arc` (the `inner` field).
- The `renew_task` holds an `Arc` (a clone of the `inner` field).
- `CrossProcessLockGuard` holds a `Weak`.
- Counting holders = counting the number of `Weak` pointers.
- It is safe to upgrade the `Weak` pointer to an `Arc` (to get
  information about dirtiness) in a guard because the `renew_task` holds
  a clone of the `Arc` and will not exit as long as all guards have
  been dropped.
- It is always possible to create a `Weak` pointer (i) either
  from `CrossProcessLock` by using `Arc::downgrade`, (ii) or from
  `CrossProcessLockGuard` by cloning it.

The `Holders` type is removed as it becomes useless. The new
`CrossProcessLockInner` type holds all data useful for the lock to be
alive. Counting the number of holders boils down to count the weak
pointers of `Weak<CrossProcessLockInner>`.
@Hywan Hywan force-pushed the refactor-common-cross-process-lock-num_holders branch from b354f74 to fe9d65d Compare June 19, 2026 14:32
@Hywan Hywan changed the title refactor(common): Remove the use of AtomicU32 to calculate the number of cross-process lock holders fix(common): Dropping CrossProcessLock must not release the lock if a CrossProcessGuard is still alive Jun 19, 2026

@stefanceriu stefanceriu left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense to me in as much as I understand it but I'm sure somebody else should review it.

Comment thread crates/matrix-sdk-common/src/cross_process_lock.rs
@Hywan Hywan requested a review from andybalaam June 23, 2026 12:02
@Hywan

Hywan commented Jun 23, 2026

Copy link
Copy Markdown
Member Author

Let's invite @andybalaam to the party!

@andybalaam andybalaam left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few questions in comments, but I think it looks good. I don't understand this as well as I'd like - maybe you could talk me through it @Hywan ?

Comment thread crates/matrix-sdk-common/src/cross_process_lock.rs Outdated
Comment thread crates/matrix-sdk-common/src/cross_process_lock.rs Outdated
Comment thread crates/matrix-sdk-common/src/cross_process_lock.rs
Comment thread crates/matrix-sdk-common/src/cross_process_lock.rs Outdated
Comment thread crates/matrix-sdk-common/src/cross_process_lock.rs Outdated
Comment thread crates/matrix-sdk-common/src/cross_process_lock.rs Outdated
Comment thread crates/matrix-sdk-common/src/cross_process_lock.rs Outdated
@Hywan Hywan enabled auto-merge (rebase) June 25, 2026 13:17
@Hywan Hywan merged commit d23eca2 into matrix-org:main Jun 25, 2026
119 of 137 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants