Commit 3b1c8b7

mmastrac and claude committed
feat: add file_clone option for reflink-based disk cache
Cherry-picked from upstream PR mozilla#2640.

Adds a file_clone option for the disk cache that stores cache entries as uncompressed files and restores them using filesystem reflinks (clonefile() on APFS, FICLONE on Linux). When supported, restored artifacts share underlying storage blocks with the cache entry.

Configure with SCCACHE_FILE_CLONE=true or file_clone = true in [cache.disk] config.

Upstream: mozilla#2640

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent b45d4e0 commit 3b1c8b7

15 files changed

Lines changed: 1071 additions & 82 deletions

Cargo.lock

Lines changed: 101 additions & 2 deletions

Cargo.toml

Lines changed: 1 addition & 0 deletions
@@ -75,6 +75,7 @@ opendal = { version = "0.55.0", optional = true, default-features = false, featu
 ] }
 openssl = { version = "0.10.75", optional = true }
 rand = "0.8.4"
+reflink-copy = "0.1"
 regex = "1.10.3"
 reqsign = { version = "0.18.0", optional = true }
 reqwest = { version = "0.12", features = [

docs/FileClone.md

Lines changed: 52 additions & 0 deletions
@@ -0,0 +1,52 @@
+# FileClone Storage
+
+## Overview
+
+The `file_clone` option enables uncompressed cache storage with Copy-on-Write (CoW) filesystem support for faster cache hits.
+
+## Configuration
+
+Add to your sccache config file (e.g., `~/.config/sccache/config`):
+
+```toml
+[cache.disk]
+file_clone = true
+```
+
+Or set via environment variable:
+
+```bash
+export SCCACHE_FILE_CLONE=true
+```
+
+## How it Works
+
+When `file_clone` is enabled:
+
+1. **Detection**: sccache checks if the cache directory is on a CoW filesystem (APFS on macOS, Btrfs/XFS on Linux)
+2. **Uncompressed Storage**: Cache entries are stored as directories with raw files instead of ZIP+zstd
+3. **Reflink Extraction**: On cache hit, files are copied using reflink (near-instant on CoW filesystems)
+4. **Fallback**: If CoW is not supported, automatically falls back to traditional compressed storage
+
+## Performance Benefits
+
+On CoW filesystems:
+- Near-zero copy time for cached files (reflink uses filesystem-level COW)
+- Reduced CPU usage (no decompression step)
+- Trade-off: Slightly higher disk usage (uncompressed files)
+
+## Compatibility
+
+Works on:
+- macOS with APFS
+- Linux with Btrfs
+- Linux with XFS
+- Other filesystems with reflink support
+
+If the filesystem doesn't support reflink, sccache automatically uses compressed storage and logs a warning.
+
+## Implementation Details
+
+- Cache entries stored as directories under `cache/a/b/{hash}/`
+- Each directory contains: `{object_name}`, `stdout`, `stderr`
+- Original ZIP+zstd format still supported for backwards compatibility
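
Review note: the directory layout described under "Implementation Details" can be sketched with the standard library alone. This is an illustration, not code from the commit. The helper names `entry_dir` and `entry_files` are hypothetical, and the sketch assumes that `a` and `b` in `cache/a/b/{hash}/` are the first two characters of the hash, which the doc does not state explicitly.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: map a cache key to `<root>/a/b/<key>/`.
/// Assumption: `a` and `b` are the first two characters of the key.
/// Returns `None` when the key is shorter than two characters.
fn entry_dir(root: &Path, key: &str) -> Option<PathBuf> {
    let mut chars = key.chars();
    let a = chars.next()?;
    let b = chars.next()?;
    Some(root.join(a.to_string()).join(b.to_string()).join(key))
}

/// Hypothetical helper: the files the doc lists for an entry that
/// holds a single object: `{object_name}`, `stdout`, `stderr`.
fn entry_files(dir: &Path, object_name: &str) -> Vec<PathBuf> {
    [object_name, "stdout", "stderr"]
        .iter()
        .map(|name| dir.join(name))
        .collect()
}
```

With a root of `cache` and a key of `abcdef`, `entry_dir` yields `cache/a/b/abcdef`, and `entry_files` lists the object file next to `stdout` and `stderr` inside it.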

src/cache/cache.rs

Lines changed: 1 addition & 0 deletions
@@ -629,6 +629,7 @@ pub fn storage_from_config(
         preprocessor_cache_mode_config,
         rw_mode,
         config.basedirs.clone(),
+        config.fallback_cache.file_clone,
     )))
 }
 

src/cache/cache_io.rs

Lines changed: 81 additions & 1 deletion
@@ -48,8 +48,10 @@ pub struct FileObjectSource {
 
 /// Result of a cache lookup.
 pub enum Cache {
-    /// Result was found in cache.
+    /// Result was found in cache (compressed ZIP format).
     Hit(CacheRead),
+    /// Result was found in cache (uncompressed directory format).
+    UncompressedHit(UncompressedCacheEntry),
     /// Result was not found in cache.
     Miss,
     /// Do not cache the results of the compilation.
@@ -62,6 +64,7 @@ impl fmt::Debug for Cache {
     fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
         match *self {
             Cache::Hit(_) => write!(f, "Cache::Hit(...)"),
+            Cache::UncompressedHit(_) => write!(f, "Cache::UncompressedHit(...)"),
             Cache::Miss => write!(f, "Cache::Miss"),
             Cache::None => write!(f, "Cache::None"),
             Cache::Recache => write!(f, "Cache::Recache"),
@@ -283,3 +286,80 @@ impl Default for CacheWrite {
         Self::new()
     }
 }
+
+/// An uncompressed cache entry stored as a directory.
+#[derive(Debug)]
+pub struct UncompressedCacheEntry {
+    pub(crate) dir: PathBuf,
+}
+
+impl UncompressedCacheEntry {
+    pub fn new(dir: PathBuf) -> Self {
+        Self { dir }
+    }
+
+    pub async fn extract_objects<T>(self, objects: T, pool: &tokio::runtime::Handle) -> Result<()>
+    where
+        T: IntoIterator<Item = FileObjectSource> + Send + Sync + 'static,
+    {
+        pool.spawn_blocking(move || {
+            for FileObjectSource {
+                key,
+                path,
+                optional,
+            } in objects
+            {
+                let src = self.dir.join(&key);
+
+                if !src.exists() {
+                    if optional {
+                        continue;
+                    }
+                    bail!("Required object '{}' not found in cache", key);
+                }
+
+                let dir = path
+                    .parent()
+                    .context("Output file without a parent directory!")?;
+                fs::create_dir_all(dir)?;
+
+                // Read permissions from the cached source file directly
+                let mode = get_file_mode(&fs::File::open(&src)?);
+
+                // Write to a tempfile and then atomically rename to the final path,
+                // so parallel builds don't see partially-written files.
+                let tmp_path = NamedTempFile::new_in(dir)?.into_temp_path();
+                // Remove the empty temp file so reflink can create the destination
+                let _ = std::fs::remove_file(&tmp_path);
+
+                if let Err(e) = crate::reflink::reflink_or_copy(&src, &tmp_path) {
+                    if !optional {
+                        bail!("Failed to copy object '{}' to {:?}: {}", key, path, e);
+                    }
+                    continue;
+                }
+
+                tmp_path.persist(&path).map_err(|e| {
+                    anyhow::anyhow!("Failed to persist {:?} to {:?}: {}", e.path, path, e.error)
+                })?;
+
+                if let Ok(Some(mode)) = mode {
+                    set_file_mode(&path, mode)?;
+                }
+            }
+
+            Ok(())
+        })
+        .await?
+    }
+
+    pub fn get_stdout(&self) -> Vec<u8> {
+        let path = self.dir.join("stdout");
+        fs::read(&path).unwrap_or_default()
+    }
+
+    pub fn get_stderr(&self) -> Vec<u8> {
+        let path = self.dir.join("stderr");
+        fs::read(&path).unwrap_or_default()
+    }
+}
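
Review note: `extract_objects` materializes each object at a temporary sibling path and then renames it into place, so a parallel build never observes a half-written output. The pattern can be sketched with the standard library only. This is a simplified illustration under stated assumptions, not the commit's implementation. The names `clone_or_copy` and `restore_atomically` and the fixed `.tmp-` prefix are hypothetical, and the clone step is passed in as a closure because `std` has no reflink call. The commit itself uses `NamedTempFile` for unique temp names and `crate::reflink::reflink_or_copy` for the clone.

```rust
use std::ffi::OsString;
use std::fs;
use std::io;
use std::path::Path;

/// Try the injected `clone` operation first and fall back to a plain
/// byte copy if it fails. Returns `true` when the clone succeeded.
fn clone_or_copy<F>(src: &Path, dst: &Path, clone: F) -> io::Result<bool>
where
    F: Fn(&Path, &Path) -> io::Result<()>,
{
    match clone(src, dst) {
        Ok(()) => Ok(true),
        Err(_) => {
            fs::copy(src, dst)?;
            Ok(false)
        }
    }
}

/// Materialize `src` at `dst` by writing to a sibling temp path and
/// renaming it into place. Simplification: the temp name is a fixed
/// `.tmp-<file name>`, which is not safe against concurrent writers.
fn restore_atomically<F>(src: &Path, dst: &Path, clone: F) -> io::Result<bool>
where
    F: Fn(&Path, &Path) -> io::Result<()>,
{
    let invalid = |msg: &str| io::Error::new(io::ErrorKind::InvalidInput, msg.to_string());
    let parent = dst
        .parent()
        .ok_or_else(|| invalid("destination has no parent directory"))?;
    let file_name = dst
        .file_name()
        .ok_or_else(|| invalid("destination has no file name"))?;
    fs::create_dir_all(parent)?;

    let mut tmp_name = OsString::from(".tmp-");
    tmp_name.push(file_name);
    let tmp = parent.join(tmp_name);
    // The clone target must not exist yet, so drop any stale leftover.
    let _ = fs::remove_file(&tmp);

    let cloned = match clone_or_copy(src, &tmp, clone) {
        Ok(cloned) => cloned,
        Err(e) => {
            // Do not leave a partial temp file behind on failure.
            let _ = fs::remove_file(&tmp);
            return Err(e);
        }
    };
    // A rename within one directory replaces `dst` in a single step.
    fs::rename(&tmp, dst)?;
    Ok(cloned)
}
```

Passing a closure that always returns an error exercises the copy fallback, which is what happens on a filesystem without reflink support. The temp file and the destination share a parent directory, so the rename stays on one filesystem.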
