Commit 05a2ff1
committed
MDEV-34358 Encryption threads consume CPU and deadlock with DROP TABLE/purge
Problem:
========
1. Encryption threads busy-wait when no work is available:
When reaching fil_system.space_list.end(), fil_crypt_return_iops() is called
with wake=true, causing pthread_cond_broadcast() to wake all threads
unnecessarily, leading to CPU waste.
2. Tablespaces with CLOSING/STOPPING flags skipped during iteration:
Since DDL completion doesn't wake encryption threads, these spaces may never
be encrypted if threads sleep indefinitely.
3. For default_encrypt_list iteration, when spaces exist but none are
acquirable, threads need to wake others for cooperative retry, but this
case was not distinguished from fil_system.space_list.end().
4. IOPS are allocated before searching for tablespaces, wasting resources
during iteration when no I/O occurs.
5. Encryption threads use fil_crypt_threads_cond for two different purposes:
waiting for encryption work and waiting for IOPS allocation. When
fil_crypt_return_iops() or fil_crypt_realloc_iops() broadcasts after
releasing IOPS, it wakes ALL threads including those correctly waiting
for work, causing spurious wakeups and CPU waste.
6. When innodb_encrypt_tables or innodb_encryption_rotate_key_age is changed
during encryption thread iteration, threads continue with stale configuration
values, potentially missing tablespaces that should be encrypted or rotated
under the new settings.
7. The InnoDB encryption thread could deadlock with DROP TABLE and purge
operations in a three-way deadlock scenario:
- DROP TABLE thread holds lock_sys.latch and waits for the tablespace
pending reference count to reach zero before dropping the space.
- Encryption thread holds a tablespace reference and waits to acquire
the tablespace allocation latch in exclusive mode to call
fseg_page_is_allocated() for checking if a page is allocated before
encrypting it.
- Purge coordinator thread holds the tablespace allocation latch
in exclusive mode and waits to acquire lock_sys.latch
in shared mode for record lock operations.
This creates a circular dependency and leads to deadlock.
Solution:
=========
1. Implement timed wait with exponential backoff:
When space == fil_system.space_list.end() (applies to both default_encrypt_list
and space_list iteration when no acquirable spaces are found):
- First timeout: 5 seconds
- Subsequent timeouts: (timed_wait_count + 1) * 5 seconds (10s, 20s, 40s, 60s)
- After 5 consecutive timeouts (~135 seconds total), continue with 60-second
timed waits to ensure threads periodically recheck for state changes
- Timeout counter resets to 0 when woken by signal or when work is found
2. Move IOPS allocation from before tablespace search to after finding a space
that needs rotation. If allocation fails, set recheck=true to skip waiting
and immediately try next space.
Encryption threads would hold space references while waiting in
fil_crypt_alloc_iops(), blocking DROP TABLE. To prevent this,
encyption should do release-wait-reacquire pattern.
fil_crypt_alloc_iops(): Added nowait parameter (default false). When
nowait=true, returns immediately if IOPS not available instead of waiting.
fil_crypt_thread(): Try non-blocking IOPS allocation first with nowait=true.
Only if IOPS not immediately available:
- Save space ID and release space reference
- Wait for IOPS with nowait=false
- Reacquire space using fil_space_get_by_id() and space->acquire()
- If space dropped or stopping, release IOPS and skip
This ensures encryption threads never hold space references while waiting
for IOPS, allowing DROP TABLE operations to proceed without deadlock.
3. Introduce separate condition variable fil_crypt_iops_cond specifically for
IOPS allocation synchronization to prevent spurious wakeups:
- fil_crypt_threads_cond: Used in wait_for_work() for waiting when no
tablespaces need encryption. Signaled when settings change, new
tablespaces are created, or thread count changes.
- fil_crypt_iops_cond: Used in fil_crypt_alloc_iops() for waiting
when IOPS limit is reached. Signaled when IOPS are returned via
fil_crypt_return_iops(), released via fil_crypt_realloc_iops(), or
when srv_n_fil_crypt_iops is increased.
4. Added atomic version counter fil_crypt_settings_version that is incremented
whenever innodb_encrypt_tables or innodb_encryption_rotate_key_age changes.
Encryption threads capture the version at iteration start and check for
changes during iteration. If config changed, threads immediately restart
iteration from the beginning to ensure complete coverage with new settings.
New fields:
- fil_crypt_settings_version: Atomic counter to track configuration changes
- rotate_thread_t::timed_wait_count (uint8_t): Counts consecutive timeouts
for exponential backoff
- rotate_thread_t::wait_for_work(): Implements timed/indefinite wait strategy
- rotate_thread_t::settings_version: Compares with fil_crypt_settings_version
to restart encryption from the beginning
5. Fix three-way deadlock with trylock mechanism:
fseg_page_is_allocated(): Added optional caller_mtr parameter. Encryption
thread passes an mtr, allocation bitmap page latch is now correctly held
in that mtr until the caller commits it.
fil_crypt_get_page_throttle(): Use fil_space_t::x_lock_try() to acquire
the tablespace allocation latch non-blockingly. If the trylock fails,
back off and retry the same page to avoid deadlock. The bitmap page
S-latch is added to the mtr and held throughout the page read and
encryption operations. Release the tablespace exclusive latch immediately
after the allocation check to minimize contention, while the bitmap page
S-latch remains held in the mtr.
fil_crypt_rotate_page(): After 10 retries on page latch acquisition,
release IOPS, sleep 10ms, then try to reacquire IOPS with nowait=true.
If IOPS not immediately available, exit gracefully by setting first=true.
This periodic IOPS release allows other encryption threads working on
tablespaces being dropped to make progress and release their space
references, helping to break the deadlock scenario.1 parent b2050fd commit 05a2ff1
4 files changed
Lines changed: 399 additions & 104 deletions
0 commit comments