
Make DiskManager max_temp_directory_size dynamically adjustable #22246

Open
Bukhtawar wants to merge 1 commit into apache:main from Bukhtawar:dynamic-disk-limit

Conversation

@Bukhtawar

Change max_temp_directory_size from u64 to AtomicU64 so the spill disk limit can be adjusted at runtime without requiring exclusive (&mut) access to the DiskManager.

Before this change, set_max_temp_directory_size required &mut self, which is unavailable once the DiskManager has been shared via Arc (as it always is in production through RuntimeEnv). The only workaround was set_arc_max_temp_directory_size, which required Arc::get_mut and therefore always failed when multiple sessions held references.

After this change, set_max_temp_directory_size takes &self and uses an atomic store. Any thread can adjust the limit, and subsequent spill writes immediately see the new value.

Use cases:

  • Adaptive spill limits based on available disk space
  • Runtime cluster setting changes without restart
  • Graceful degradation under disk pressure

The set_arc_max_temp_directory_size method is deprecated but kept for backward compatibility — it now delegates to the &self method.

Performance: AtomicU64::load(Acquire) adds ~1ns per spill write check. Negligible since spill writes take milliseconds.
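The core of the change is the standard interior-mutability pattern: store the limit in an AtomicU64 so it can be updated through a shared reference. A minimal, self-contained sketch of that pattern (the struct and method names here are illustrative, not the actual DataFusion API):

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

// Hypothetical mirror of the DiskManager field.
struct SpillLimit {
    max_temp_directory_size: AtomicU64,
}

impl SpillLimit {
    fn new(limit: u64) -> Self {
        Self {
            max_temp_directory_size: AtomicU64::new(limit),
        }
    }

    // Takes &self, so it is callable through an Arc shared by many sessions.
    fn update_max_temp_directory_size(&self, new_limit: u64) {
        self.max_temp_directory_size.store(new_limit, Ordering::Release);
    }

    // Each spill write re-reads the current limit.
    fn current_limit(&self) -> u64 {
        self.max_temp_directory_size.load(Ordering::Acquire)
    }
}

fn main() {
    let limit = Arc::new(SpillLimit::new(1024));
    // A second "session" holding a reference would make Arc::get_mut fail,
    // but the atomic store needs no exclusive access at all.
    let shared = Arc::clone(&limit);
    shared.update_max_temp_directory_size(2048);
    assert_eq!(limit.current_limit(), 2048);
}
```

The Release/Acquire pairing guarantees that once a spill-write check loads the new value, it also sees everything written before the store.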

Which issue does this PR close?

  • Closes #.

Rationale for this change

Ensures the limit can be updated in place, either adaptively based on available disk space or after an in-place disk size increase.

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

@github-actions github-actions bot added the `execution` (Related to the execution crate) label May 16, 2026
assert_eq!(dm.max_temp_directory_size(), 2048);

// Can also decrease
dm.set_max_temp_directory_size(512)?;
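Decreasing the limit below current usage is the interesting case. A hedged sketch of the intended semantics (an illustrative admission check, not the real DiskManager code): existing spill files remain, new spills are rejected until usage drains below the lowered limit.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Illustrative sketch: a spill is admitted only if current usage plus the
// write stays within the (atomically readable) limit.
struct Disk {
    limit: AtomicU64,
    used: AtomicU64,
}

impl Disk {
    fn try_spill(&self, bytes: u64) -> Result<(), String> {
        let limit = self.limit.load(Ordering::Acquire);
        let used = self.used.load(Ordering::Acquire);
        if used + bytes > limit {
            return Err(format!("spill of {bytes} B exceeds limit of {limit} B"));
        }
        self.used.fetch_add(bytes, Ordering::AcqRel);
        Ok(())
    }
}

fn main() {
    let disk = Disk {
        limit: AtomicU64::new(1024),
        used: AtomicU64::new(0),
    };
    disk.try_spill(800).unwrap(); // fits under the 1024 limit
    disk.limit.store(512, Ordering::Release); // lower the limit below usage
    assert!(disk.try_spill(100).is_err()); // new spills now fail immediately
    disk.used.fetch_sub(800, Ordering::AcqRel); // an old query completes
    assert!(disk.try_spill(100).is_ok()); // spills succeed again
}
```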
Contributor


What happens if queries are holding disk above the new limit? When we change the configuration from 1024 MB to 512 MB, what's the behavior?

Author


Added tests, thanks for pointing this out

@Bukhtawar Bukhtawar force-pushed the dynamic-disk-limit branch from faad8d4 to f22c753 Compare May 16, 2026 14:57
@github-actions github-actions bot added the `auto detected api change` (Auto detected API change) label May 16, 2026
@Bukhtawar Bukhtawar force-pushed the dynamic-disk-limit branch from 3386b55 to 9478f44 Compare May 17, 2026 00:53
@github-actions github-actions bot removed the `auto detected api change` (Auto detected API change) label May 17, 2026
Comment on lines 268 to 278
 pub fn set_arc_max_temp_directory_size(
     this: &mut Arc<Self>,
     max_temp_directory_size: u64,
 ) -> Result<()> {
     if let Some(inner) = Arc::get_mut(this) {
-        inner.set_max_temp_directory_size(max_temp_directory_size)?;
-        Ok(())
+        inner.set_max_temp_directory_size(max_temp_directory_size)
     } else {
-        config_err!("DiskManager should be a single instance")
+        // No exclusive access; use the atomic path
+        this.update_max_temp_directory_size(max_temp_directory_size)
     }
 }
Member


Let's collapse set_arc_max_temp_directory_size. Both branches now do the same thing, so replace the body with:

this.update_max_temp_directory_size(max_temp_directory_size)

and add #[deprecated(note = "use update_max_temp_directory_size")]. The Arc::get_mut dance no longer serves any purpose.

      #[deprecated(note = "use `update_max_temp_directory_size` instead")]
      pub fn set_arc_max_temp_directory_size(
          this: &Arc<Self>,
          max_temp_directory_size: u64,
      ) -> Result<()> {
          this.update_max_temp_directory_size(max_temp_directory_size)
      }

///
/// For runtime adjustment without `&mut self`, use
/// [`Self::update_max_temp_directory_size`] instead.
pub fn set_max_temp_directory_size(
Member


set_max_temp_directory_size(&mut self, ...) is now redundant: it does exactly the same thing as update_max_temp_directory_size but requires &mut. Keeping both is mildly confusing.

@Bukhtawar Bukhtawar force-pushed the dynamic-disk-limit branch from 41c5be9 to ecaa3d8 Compare May 17, 2026 06:10
Change `max_temp_directory_size` from `u64` to `AtomicU64` and add a new
`update_max_temp_directory_size(&self)` method that allows adjusting the
spill disk limit at runtime without requiring exclusive access.

The existing `set_max_temp_directory_size(&mut self)` is preserved
unchanged for backward compatibility. The new method enables:
- Adaptive spill limits based on available disk space
- Runtime cluster setting changes without restart
- Graceful degradation under disk pressure

The `set_arc_max_temp_directory_size` now falls through to the atomic
path when `Arc::get_mut` fails (multiple references exist), instead of
returning an error.

When the limit is decreased below current usage:
- Existing spill files remain (not deleted — would break running queries)
- New spill writes fail immediately (used > new limit)
- As old queries complete, usage naturally drops below the new limit
- New spills succeed once usage < limit

Performance: `AtomicU64::load(Acquire)` adds ~1ns per spill write check.
Spill writes take milliseconds — negligible impact.

Signed-off-by: Bukhtawar Khan <bukhtawa@amazon.com>
@Bukhtawar Bukhtawar force-pushed the dynamic-disk-limit branch from cfcb50c to dd8ed5c Compare May 17, 2026 06:16
