fix(indexer): treat escaped "\!" gitignore lines as literal, not negation #193
jichaowang02-lang wants to merge 2 commits into
…tion

`_normalize_gitignore_lines` unescaped a leading "\#"/"\!" and only *then* checked for negation. For "\!name" the unescape produced "!name", which the negation check misread as a re-include rule, emitting "!**/name". A literal ignore such as `\!important` therefore cancelled an unrelated `important` rule instead of ignoring the file named "!important".

Detect the escape before the negation check and skip negation handling for escaped lines, so "\!important" normalizes to "**/!important" (the '!' is no longer pattern-leading, so it is literal). Ordinary negation and escaped "\#" behavior are unchanged.

Add tests/test_indexer_gitignore.py covering plain/negated/escaped patterns and an end-to-end GitIgnoreSpec check that the escaped line no longer re-includes unrelated matches.
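The ordering problem described in this commit message can be sketched in a few lines. This is a simplified, hypothetical model rather than the project's actual `_normalize_gitignore_lines`: the names `normalize_old`, `normalize_fixed`, and `_prefix` are invented for illustration, only the repository-root case is handled, and the prefix rule (add `**/` when the pattern has no `/`) is an assumption inferred from the examples in this PR.

```python
def _prefix(pattern: str) -> str:
    # Assumed rule: a pattern without "/" may match at any depth, so it gets
    # a "**/" prefix; a pattern containing "/" is anchored and left alone.
    return pattern if "/" in pattern else "**/" + pattern


def normalize_old(line: str) -> str:
    """Pre-fix order: unescape first, then check for negation."""
    if line.startswith("\\#") or line.startswith("\\!"):
        line = line[1:]  # "\!name" becomes "!name"
    negated = line.startswith("!")  # the unescaped "!" now looks like a negation
    if negated:
        line = line[1:]
    return ("!" if negated else "") + _prefix(line)


def normalize_fixed(line: str) -> str:
    """Order used by this commit: detect the escape before the negation check."""
    escaped = line.startswith("\\#") or line.startswith("\\!")
    if escaped:
        line = line[1:]
        negated = False  # an escaped line is never a negation
    else:
        negated = line.startswith("!")
        if negated:
            line = line[1:]
    return ("!" if negated else "") + _prefix(line)


print(normalize_old("\\!important"))      # !**/important  (misread as negation)
print(normalize_fixed("\\!important"))    # **/!important  (literal "!")
print(normalize_fixed("!build/keep.txt")) # !build/keep.txt (real negation kept)
```

Under these assumptions the only behavioral difference between the two functions is for lines that start with `\!`; plain, negated, and `\#` lines normalize identically.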
Pull request overview
Fixes .gitignore normalization so escaped leading \! patterns are treated as literal ! entries (not negations), preventing unintended re-includes during indexing.
Changes:
- Adjust `_normalize_gitignore_lines` to detect escaped `\!`/`\#` before applying negation logic.
- Add new unit tests covering plain patterns, negation, escaped `#`, escaped `!`, subdirectory prefixing, and an end-to-end `GitIgnoreSpec` regression.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `src/cocoindex_code/indexer.py` | Updates .gitignore line normalization to avoid misclassifying escaped bang as a negation. |
| `tests/test_indexer_gitignore.py` | Adds regression/unit coverage for .gitignore normalization (including the escaped `\!` scenario). |
    escaped = line.startswith("\\#") or line.startswith("\\!")
    if escaped:
        line = line[1:]
        negated = False
    else:
        negated = line.startswith("!")
        if negated:
            line = line[1:]
    def test_escaped_hash_is_literal_not_comment() -> None:
        # "\#notacomment" -> a file literally named "#notacomment".
        assert _normalize_gitignore_lines(["\\#notacomment"], ROOT) == ["**/#notacomment"]

    def test_escaped_bang_is_literal_not_negation() -> None:
        # Regression: "\!important" means "ignore a file literally named '!important'",
        # NOT a negation, so it must not become a "!"-prefixed (negation) pattern.
        assert _normalize_gitignore_lines(["\\!important"], ROOT) == ["**/!important"]

Addressing review feedback: the previous fix stripped the backslash from an
escaped "\!"/"\#" line and relied on the "**/" prefix to keep the leading
"!"/"#" from being misread. But a pattern that contains a "/" is anchored and
gets no "**/" prefix, so "\!dir/file" normalized to "!dir/file" — which
GitIgnoreSpec reads back as a negation (and "\#dir/file" as a comment),
dropping the rule.
Keep the backslash in the emitted pattern so pathspec parses the "!"/"#"
literally in both the prefixed ("**/\!foo") and anchored ("\!dir/file") cases.
Adds an end-to-end test for path-bearing escaped patterns; updates the
exact-form assertions to the now-escaped output.
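
The anchored case described in this second commit can be illustrated with a small stand-alone sketch. Both functions below are hypothetical: `leading_role` only models how a gitignore parser classifies a pattern by its first character (`#` comment, `!` negation, anything else literal, including a backslash escape), and `normalize_keep_escape` is a simplified root-directory stand-in for the revised normalization, not the project's code. Neither calls pathspec.

```python
def leading_role(pattern: str) -> str:
    """Classify a pattern by its first character, as a gitignore parser would."""
    if pattern.startswith("#"):
        return "comment"
    if pattern.startswith("!"):
        return "negation"
    return "literal"  # includes "\#..." and "\!...", where the backslash escapes


def normalize_keep_escape(line: str) -> str:
    """Simplified sketch: keep the backslash of an escaped line in the output."""
    escaped = line.startswith("\\#") or line.startswith("\\!")
    negated = (not escaped) and line.startswith("!")
    if negated:
        line = line[1:]
    # Assumed prefix rule: only patterns without "/" receive "**/".
    prefixed = line if "/" in line else "**/" + line
    return ("!" if negated else "") + prefixed


# Stripping the backslash from an anchored pattern exposes the "!" again:
print(leading_role("!dir/file"))            # negation
# Keeping the backslash leaves the pattern literal in both forms:
print(normalize_keep_escape("\\!dir/file")) # \!dir/file
print(normalize_keep_escape("\\!foo"))      # **/\!foo
```

Keeping the escape in the emitted pattern means correctness no longer depends on whether a `**/` prefix happens to push the `!` or `#` out of the leading position.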
Good catch, fixed in ec2ce38. Path-bearing escaped patterns (…
Summary

`_normalize_gitignore_lines` mishandles a `.gitignore` line that escapes a leading `!` (`\!name`). Per gitignore semantics, `\!name` means "ignore a file literally named `!name`"; it is not a negation. The function unescapes the backslash and then runs its negation check, so `\!name` is unescaped to `!name` and misread as a re-include rule.

Impact

A literal entry like `\!important` is normalized to `!**/important` (a negation), which cancels an unrelated `important` ignore rule, so files the user meant to ignore start getting indexed.

important
!important

Fix

Detect the `\#`/`\!` escape before the negation check and skip negation handling for escaped lines. `\!important` now normalizes to `**/!important`: since `!` is no longer the leading character, it is treated literally. Ordinary negation (`!build/keep.txt`) and escaped-`#` behavior are unchanged.

Tests

Adds `tests/test_indexer_gitignore.py` (the function had no tests): plain, negated, escaped-`#`, escaped-`!`, and subdirectory-prefix cases, plus an end-to-end `GitIgnoreSpec` assertion that the escaped line no longer re-includes unrelated matches. Verified against `pathspec==1.1.1`.