
Commit fa6d594

fengmk2 and claude committed
feat(task): add cache fingerprint ignore patterns
Allow tasks to exclude specific files/directories from cache fingerprint calculation using glob patterns with gitignore-style negation support.

This enables selective caching for tasks like package installation where only dependency manifests (package.json) matter for cache validation, not implementation files. Cache hits still occur when only ignored files change.

Key features:
- Optional fingerprintIgnores field accepts glob patterns
- Negation patterns (!) to include files within ignored directories
- Leverages existing vite_glob crate for pattern matching
- Fully backward compatible (defaults to None)

Example:

    {
      "fingerprintIgnores": [
        "node_modules/**/*",
        "!node_modules/**/package.json"
      ]
    }

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
1 parent 096aca2 commit fa6d594

10 files changed

Lines changed: 453 additions & 2 deletions


Lines changed: 348 additions & 0 deletions
@@ -0,0 +1,348 @@
1+
# RFC: Vite+ Cache Fingerprint Ignore Patterns
2+
3+
## Summary
4+
5+
Add support for glob-based ignore patterns to the cache fingerprint calculation, allowing tasks to exclude specific files/directories from triggering cache invalidation while still including important files within ignored directories.
6+
7+
## Motivation
8+
9+
Current cache fingerprint behavior tracks all files accessed during task execution. This causes unnecessary cache invalidation in scenarios like:
10+
11+
1. **Package installation tasks**: The `node_modules` directory changes frequently, but only `package.json` files within it are relevant for cache validation
12+
2. **Build output directories**: Generated files in `dist/` or `.next/` that should not invalidate the cache
13+
3. **Large dependency directories**: When only specific files within large directories matter for reproducibility
14+
15+
### Example Use Case
16+
17+
For an `install` task that runs `pnpm install`:
18+
19+
- Changes to `node_modules/**/*/index.js` should NOT invalidate the cache
20+
- Changes to `node_modules/**/*/package.json` SHOULD invalidate the cache
21+
- This allows cache hits when dependencies remain the same, even if their internal implementation files have different timestamps or minor variations
22+
23+
## Proposed Solution
24+
25+
### Configuration Schema
26+
27+
Extend `TaskConfig` in `vite-task.json` to support a new optional field `fingerprintIgnores`:
28+
29+
```json
{
  "tasks": {
    "my-task": {
      "command": "echo bar",
      "cacheable": true,
      "fingerprintIgnores": [
        "node_modules/**/*",
        "!node_modules/**/*/package.json"
      ]
    }
  }
}
```
43+
44+
### Ignore Pattern Syntax
45+
46+
The ignore patterns follow standard glob syntax with gitignore-style semantics:
47+
48+
1. **Basic patterns**:
49+
- `node_modules/**/*` - ignore all files under node_modules
50+
- `dist/` - ignore the dist directory
51+
- `*.log` - ignore all log files
52+
53+
2. **Negation patterns** (prefixed with `!`):
54+
- `!node_modules/**/*/package.json` - include package.json files even though node_modules is ignored
55+
- `!important.log` - include `important.log` even though `*.log` is ignored
56+
57+
3. **Pattern evaluation order**:
58+
- Patterns are evaluated in order
59+
- Later patterns override earlier ones
60+
- Negation patterns can "un-ignore" files matched by earlier patterns
61+
- Last match wins semantics
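
The evaluation order above can be sketched in a few lines of Rust. This is only an illustration under stated assumptions: `glob_match` is a toy matcher written for this example (it handles `**` as whole segments and at most one `*` per segment) and is not the `vite_glob` API, whose exact semantics may differ. The names `is_ignored`, `match_segments`, and `match_segment` are likewise hypothetical.

```rust
/// Toy glob matcher (NOT `vite_glob`): `**` spans zero or more path segments,
/// and a single `*` matches any run of characters inside one segment.
fn glob_match(pattern: &str, path: &str) -> bool {
    let pattern_segments: Vec<&str> = pattern.split('/').collect();
    let path_segments: Vec<&str> = path.split('/').collect();
    match_segments(&pattern_segments, &path_segments)
}

fn match_segments(pattern: &[&str], path: &[&str]) -> bool {
    match pattern.split_first() {
        // Pattern exhausted: it matches only if the path is exhausted too.
        None => path.is_empty(),
        Some((first, rest)) => {
            if *first == "**" {
                // Try letting `**` consume 0, 1, 2, ... leading segments.
                return (0..=path.len()).any(|skip| match_segments(rest, &path[skip..]));
            }
            match path.split_first() {
                Some((segment, remaining)) => {
                    match_segment(first, segment) && match_segments(rest, remaining)
                }
                None => false,
            }
        }
    }
}

fn match_segment(pattern: &str, segment: &str) -> bool {
    match pattern.split_once('*') {
        // No wildcard: the segment must be identical.
        None => pattern == segment,
        // One wildcard: check the literal prefix and suffix around it.
        Some((prefix, suffix)) => {
            segment.len() >= prefix.len() + suffix.len()
                && segment.starts_with(prefix)
                && segment.ends_with(suffix)
        }
    }
}

/// Last match wins: walk the patterns in order and let every matching
/// pattern overwrite the decision made by earlier ones.
fn is_ignored(patterns: &[&str], path: &str) -> bool {
    let mut ignored = false;
    for pattern in patterns {
        let (negated, glob) = match pattern.strip_prefix('!') {
            Some(rest) => (true, rest),
            None => (false, *pattern),
        };
        if glob_match(glob, path) {
            ignored = !negated;
        }
    }
    ignored
}
```

With the patterns `node_modules/**/*` followed by `!node_modules/**/*/package.json`, this sketch ignores `node_modules/pkg-a/index.js` but keeps `node_modules/pkg-a/package.json`. Swapping the two patterns makes the broader ignore win, which is why order matters.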
62+
63+
### Implementation Details
64+
65+
#### 1. Configuration Schema Changes
66+
67+
**File**: `crates/vite_task/src/config/mod.rs`
68+
69+
```rust
pub struct TaskConfig {
    // ...

    // New field
    #[serde(default)]
    pub(crate) fingerprint_ignores: Option<Vec<Str>>,
}
```
78+
79+
#### 2. Fingerprint Validation Changes
80+
81+
**File**: `crates/vite_task/src/fingerprint.rs`
82+
83+
Modify `PostRunFingerprint::create()` to filter paths based on ignore patterns:
84+
85+
```rust
impl PostRunFingerprint {
    pub fn create(
        executed_task: &ExecutedTask,
        fs: &impl FileSystem,
        base_dir: &AbsolutePath,
        fingerprint_ignores: Option<&[Str]>, // New parameter
    ) -> Result<Self, Error> {
        let ignore_matcher = fingerprint_ignores
            .filter(|patterns| !patterns.is_empty())
            .map(GlobPatternSet::new)
            .transpose()?;

        let inputs = executed_task
            .path_reads
            .par_iter()
            .filter(|(path, _)| {
                if let Some(ref matcher) = ignore_matcher {
                    !matcher.is_match(path)
                } else {
                    true
                }
            })
            .flat_map(|(path, path_read)| {
                Some((|| {
                    let path_fingerprint =
                        fs.fingerprint_path(&base_dir.join(path).into(), *path_read)?;
                    Ok((path.clone(), path_fingerprint))
                })())
            })
            .collect::<Result<HashMap<RelativePathBuf, PathFingerprint>, Error>>()?;
        Ok(Self { inputs })
    }
}
```
120+
121+
#### 3. Cache Update Integration
122+
123+
**File**: `crates/vite_task/src/cache.rs`
124+
125+
Update `CommandCacheValue::create()` to pass ignore patterns:
126+
127+
```rust
impl CommandCacheValue {
    pub fn create(
        executed_task: ExecutedTask,
        fs: &impl FileSystem,
        base_dir: &AbsolutePath,
        fingerprint_ignores: Option<&[Str]>, // New parameter
    ) -> Result<Self, Error> {
        let post_run_fingerprint = PostRunFingerprint::create(
            &executed_task,
            fs,
            base_dir,
            fingerprint_ignores,
        )?;
        Ok(Self {
            post_run_fingerprint,
            std_outputs: executed_task.std_outputs,
            duration: executed_task.duration,
        })
    }
}
```
149+
150+
### Performance Considerations
151+
152+
1. **Pattern compilation**: Glob patterns are compiled once per task rather than once per path (in the code above, at the start of `PostRunFingerprint::create()`, before the parallel iteration)
153+
2. **Filtering overhead**: Path filtering happens during fingerprint creation (only when caching)
154+
3. **Memory impact**: Minimal - only stores compiled glob patterns per task
155+
4. **Parallel processing**: Existing parallel iteration over paths is preserved
156+
157+
### Edge Cases
158+
159+
1. **Empty ignore list**: No filtering applied (backward compatible)
160+
2. **Conflicting patterns**: Later patterns take precedence
161+
3. **Invalid glob syntax**: Returns an error when the patterns are compiled (in the code above, `GlobPatternSet::new` fails inside `PostRunFingerprint::create()`)
162+
4. **Absolute paths in patterns**: Treated as relative to package directory
163+
5. **Directory vs file patterns**: Both supported via glob syntax
164+
165+
## Alternative Designs Considered
166+
167+
### Alternative 1: `inputs` field extension
168+
169+
Extend the existing `inputs` field to support ignore patterns:
170+
171+
```json
{
  "inputs": {
    "include": ["src/**/*"],
    "exclude": ["src/**/*.test.js"]
  }
}
```
179+
180+
**Rejected because**:
181+
182+
- The `inputs` field currently uses a different mechanism (pre-execution declaration)
183+
- This feature is about post-execution fingerprint filtering
184+
- Mixing the two concepts would be confusing
185+
186+
### Alternative 2: Separate `fingerprintExcludes` field
187+
188+
Only support exclude patterns (no negation):
189+
190+
```json
{
  "fingerprintExcludes": ["node_modules/**/*"]
}
```
195+
196+
**Rejected because**:
197+
198+
- Cannot express "ignore everything except X" patterns
199+
- Less flexible for complex scenarios
200+
- Gitignore-style syntax is more familiar to developers
201+
202+
### Alternative 3: Include/Exclude separate fields
203+
204+
```json
{
  "fingerprintExcludes": ["node_modules/**/*"],
  "fingerprintIncludes": ["node_modules/**/*/package.json"]
}
```
210+
211+
**Rejected because**:
212+
213+
- More verbose
214+
- Less clear precedence rules
215+
- Gitignore-style is a proven pattern
216+
217+
## Migration Path
218+
219+
### Backward Compatibility
220+
221+
This feature is fully backward compatible:
222+
223+
- Existing task configurations work unchanged
224+
- Default value for `fingerprintIgnores` is `None` (when omitted)
225+
- No behavior changes when field is absent or `null`
226+
- Empty array `[]` is treated the same as `None` (no filtering)
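
The equivalence between an omitted field, `null`, and an empty array can be expressed as a single normalization step. The helper below is a hypothetical sketch (the name `effective_ignores` is not from the codebase, and it uses `String` in place of the crate's `Str` type); it mirrors the `.filter(|patterns| !patterns.is_empty())` call in `PostRunFingerprint::create()`.

```rust
/// Collapse "no patterns configured" and "empty pattern list" into `None`,
/// so callers only need to handle one "no filtering" case.
fn effective_ignores(configured: Option<&[String]>) -> Option<&[String]> {
    configured.filter(|patterns| !patterns.is_empty())
}
```

Only a non-empty list survives this step, so a matcher is built (and filtering applied) only when there is at least one pattern.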
227+
228+
## Testing Strategy
229+
230+
### Unit Tests
231+
232+
1. **Pattern matching**:
233+
- Test glob pattern compilation
234+
- Test negation pattern precedence
235+
- Test edge cases (empty patterns, invalid syntax)
236+
237+
2. **Fingerprint filtering**:
238+
- Test path filtering with various patterns
239+
- Test no filtering when patterns are empty
240+
- Test complex pattern combinations
241+
242+
3. **Cache behavior**:
243+
- Test cache hit when ignored files change
244+
- Test cache miss when non-ignored files change
245+
- Test negation patterns work correctly
246+
247+
### Integration Tests
248+
249+
Create fixtures with realistic scenarios:
250+
251+
```
fixtures/fingerprint-ignore-test/
  package.json
  vite-task.json # with fingerprintIgnores config
  node_modules/
    pkg-a/
      package.json
      index.js
    pkg-b/
      package.json
      index.js
```
263+
264+
Test cases:
265+
266+
1. Cache hits when `index.js` files change
267+
2. Cache misses when `package.json` files change
268+
3. Negation patterns correctly include files
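
The first two test cases can be illustrated with a self-contained sketch. Everything here is an assumption made for illustration: `is_ignored` hard-codes the effect of the `node_modules/**/*` plus `!node_modules/**/*/package.json` patterns instead of calling the real matcher, and `fingerprint` hashes an in-memory file map rather than using `PostRunFingerprint`.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for the configured ignore patterns:
/// ignore everything under node_modules except package.json files.
fn is_ignored(path: &str) -> bool {
    path.starts_with("node_modules/") && !path.ends_with("/package.json")
}

/// Hash the path and contents of every file that is not ignored.
/// A BTreeMap keeps iteration order stable, so the hash is reproducible.
fn fingerprint(files: &BTreeMap<&str, &str>) -> u64 {
    let mut hasher = DefaultHasher::new();
    for (path, contents) in files {
        if is_ignored(path) {
            continue;
        }
        path.hash(&mut hasher);
        contents.hash(&mut hasher);
    }
    hasher.finish()
}
```

In this model, editing `node_modules/pkg-a/index.js` leaves the fingerprint unchanged (a cache hit), while editing `node_modules/pkg-a/package.json` changes it (a cache miss).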
269+
270+
## Documentation Requirements
271+
272+
### User Documentation
273+
274+
Add to task configuration docs:
275+
276+
````markdown
### fingerprintIgnores

Type: `string[]`
Default: unset (`None`); an empty array `[]` behaves the same way

Glob patterns to exclude files from cache fingerprint calculation.
Patterns starting with `!` are negation patterns that override earlier excludes.

Example:

```json
{
  "tasks": {
    "install": {
      "command": "pnpm install",
      "cacheable": true,
      "fingerprintIgnores": [
        "node_modules/**/*",
        "!node_modules/**/*/package.json"
      ]
    }
  }
}
```

This configuration ignores all files in `node_modules` except `package.json`
files, which are still tracked for cache validation.
````

### Examples Documentation

Add common patterns:

1. **Package installation**:
   ```json
   "fingerprintIgnores": [
     "node_modules/**/*",
     "!node_modules/**/*/package.json",
     "!node_modules/.pnpm/lock.yaml"
   ]
   ```
319+
320+
2. **Build outputs**:
321+
   ```json
   "fingerprintIgnores": [
     "dist/**/*",
     ".next/**/*",
     "build/**/*"
   ]
   ```
328+
329+
3. **Temporary files**:
330+
   ```json
   "fingerprintIgnores": [
     "**/*.log",
     "**/.DS_Store",
     "**/tmp/**"
   ]
   ```
337+
338+
## Conclusion
339+
340+
This RFC proposes adding glob-based ignore patterns to cache fingerprint calculation. The feature:
341+
342+
- Solves real caching problems (especially for install tasks)
343+
- Uses familiar gitignore-style syntax
344+
- Is fully backward compatible
345+
- Has minimal performance impact
346+
- Provides clear migration and documentation path
347+
348+
The implementation is straightforward, leveraging the proven `vite_glob` crate, and integrates cleanly with existing fingerprint and cache systems.
