fix: handle exclude patterns for nested directories #3
Directory exclude patterns like `.git/` and `node_modules/` now correctly match at any depth in the path, not just as a prefix. For example, `vendor/lib/.git/HEAD` is now properly excluded by the `.git/` pattern.

The fix checks whether a simple (single-segment) directory pattern matches any segment of the path, not just the beginning. Multi-segment patterns like `a/b/` continue to work via prefix matching as before.

Also fixes a path segment boundary bug in `GetRelativePath`, where `/tmp` could incorrectly match `/tmp2` due to a missing separator check.

Both fixes include test cases.

Made-with: Cursor
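The matching rule described above can be sketched as follows. This is a minimal illustration, not the PR's code: `matchesDirPattern` is a hypothetical helper, and it assumes slash-separated paths relative to the repository root. For files it inspects only the parent directories, so a file that merely shares a name with the pattern is not treated as a directory.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// matchesDirPattern reports whether relPath lies inside (or is) a directory
// named by pattern, e.g. ".git/". Hypothetical helper for illustration only.
func matchesDirPattern(relPath string, isDir bool, pattern string) bool {
	dir := strings.TrimSuffix(pattern, "/")
	if dir == "" {
		return false
	}
	// For files, only the parent directories can match a directory pattern.
	dirPath := relPath
	if !isDir {
		dirPath = path.Dir(relPath) // "." when the file sits at the root
	}
	// Multi-segment patterns such as "a/b/" keep prefix semantics.
	if strings.Contains(dir, "/") {
		return dirPath == dir || strings.HasPrefix(dirPath, dir+"/")
	}
	// Single-segment patterns match any whole segment of the directory path.
	for _, seg := range strings.Split(dirPath, "/") {
		if seg == dir {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(matchesDirPattern("vendor/lib/.git/HEAD", false, ".git/")) // true
	fmt.Println(matchesDirPattern("src/.gitignore", false, ".git/"))       // false
	fmt.Println(matchesDirPattern("x/a/b/c.txt", false, "a/b/"))           // false
}
```

Comparing whole segments, rather than substrings, keeps names such as `my_node_modules` from matching `node_modules/`.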
Pull request overview
Fixes path handling so directory exclude patterns (e.g., .git/, node_modules/) match at any depth, and tightens GetRelativePath to avoid false prefix matches (e.g., /tmp incorrectly matching /tmp2).
Changes:
- Update exclude-pattern matching to detect single-segment directory patterns anywhere within a path.
- Add path-segment boundary check in `GetRelativePath` to prevent prefix false-positives.
- Add/expand tests for nested directory exclusions and `GetRelativePath` edge cases.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| internal/digest/ingest.go | Extends directory-pattern matching to match single-segment dir patterns at any path depth. |
| internal/digest/ingest_test.go | Adds test cases for nested directory exclusion behavior. |
| internal/fsutil/fs.go | Fixes GetRelativePath by requiring a path-separator boundary for prefix checks. |
| internal/fsutil/fs_test.go | Adds tests covering GetRelativePath nested/same-dir/similar-prefix scenarios. |
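
The separator-boundary requirement listed for `internal/fsutil/fs.go` can be illustrated with a short sketch. The helpers `hasPathPrefix` and `relativeTo` are hypothetical names, not the PR's `GetRelativePath`. The sketch assumes clean, slash-separated paths, and returning `"."` for the same directory is an assumption borrowed from `filepath.Rel`, not confirmed behavior of the real function.

```go
package main

import (
	"fmt"
	"strings"
)

// hasPathPrefix reports whether p equals base or lies beneath it, comparing
// on whole path segments. Hypothetical helper; assumes clean paths.
func hasPathPrefix(p, base string) bool {
	if p == base {
		return true
	}
	if !strings.HasSuffix(base, "/") {
		base += "/" // force the match to end on a separator
	}
	return strings.HasPrefix(p, base)
}

// relativeTo returns p relative to base, and false when p is not under base.
func relativeTo(p, base string) (string, bool) {
	if !hasPathPrefix(p, base) {
		return "", false
	}
	if p == base {
		return ".", true // assumption: same directory maps to "."
	}
	return strings.TrimPrefix(p, strings.TrimSuffix(base, "/")+"/"), true
}

func main() {
	fmt.Println(strings.HasPrefix("/tmp2/file", "/tmp")) // true: the false positive
	fmt.Println(hasPathPrefix("/tmp2/file", "/tmp"))     // false
	rel, ok := relativeTo("/tmp/a/b.txt", "/tmp")
	fmt.Println(rel, ok) // a/b.txt true
}
```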

    	pathToCheckPrefix = parentDir + "/"
    }

    // Check if path starts with the pattern (e.g. ".git/HEAD" matches ".git/")

This comment is slightly misleading: for files, the code checks the parent directory prefix (pathToCheckPrefix), not the full file path. Consider rewording the example to reflect what is actually being compared (e.g., that a file under ".git/" has parentDir ".git/" which matches the pattern).
Suggested change:

    - // Check if path starts with the pattern (e.g. ".git/HEAD" matches ".git/")
    + // Check whether the directory prefix being examined matches the pattern.
    + // For directories this is the directory path itself; for files it is the
    + // parent directory (e.g. ".git/HEAD" has parentDir ".git/", which matches ".git/").

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Summary
- Directory exclude patterns like `.git/` and `node_modules/` now correctly match at any depth in the path, not just as a prefix. For example, `vendor/lib/.git/HEAD` is now properly excluded by the `.git/` pattern.
- `GetRelativePath` segment boundary: paths like `/tmp` no longer incorrectly match `/tmp2`. The false match came from a `strings.HasPrefix` call with no path separator check.

Problem
When a project contains nested `.git` directories (e.g., submodules, vendored repos), the exclude pattern `.git/` only matched paths that started with `.git/`. Paths like `vendor/lib/.git/objects/pack` were not excluded because the `isPathMatchWithInfo` function only checked for prefix matches.

Before:
After:
Changes
- `internal/digest/ingest.go`: For simple (single-segment) directory patterns, also check if the pattern matches any segment of the path using `strings.Contains`
- `internal/fsutil/fs.go`: Add path separator boundary check in `GetRelativePath`
- `internal/digest/ingest_test.go`: Add 8 test cases for nested directory exclusion
- `internal/fsutil/fs_test.go`: New test file for `GetRelativePath` edge cases

Test plan
- `.git/`, `node_modules/`, `.next/`, `build/` directories
- `GetRelativePath` segment boundary
- `go test ./...`