feat: add ccc grep structural code search #200
Merged
Add a `ccc grep "PATTERN" [DIR/FILE]` subcommand backed by cocoindex's structural `code_match` API. It compiles the pattern once per language, walks the project (honoring configured include/exclude globs and nested `.gitignore` files, or the enclosing git repo when run outside a cocoindex project), and matches files in parallel on a thread pool, streaming each file's results as soon as it completes. Supports `--lang` and `--path` filters like `ccc search`, and renders matches with colorized line numbers and paths under a TTY.

Extract the shared source-file walking logic (the include/exclude + nested-gitignore matcher and the `os.walk`-based file iteration) into a new `file_walk` module, now the single source of truth used by the indexer, the daemon's doctor file-walk, and grep.

Bump cocoindex to >=1.0.13 for the locked symmetric pattern syntax used by `code_match`.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
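The compile-once, match-in-parallel, stream-per-file flow described above can be sketched roughly as follows. This is an illustration only, not the PR's code: `compile_for_language`, `match_file`, `grep`, and `LANG_BY_SUFFIX` are hypothetical names, and a plain `re` regex stands in for cocoindex's `code_match`, whose API is not shown in this PR text.

```python
import re
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path
from typing import Iterable, Iterator

# Hypothetical suffix-to-language table; the real tool's mapping is not shown here.
LANG_BY_SUFFIX = {".py": "python", ".rs": "rust"}


def compile_for_language(pattern: str, lang: str) -> "re.Pattern[str]":
    # Stand-in for structural pattern compilation: a plain regex that ignores `lang`.
    return re.compile(pattern)


def match_file(path: Path, compiled: "re.Pattern[str]") -> "tuple[Path, list[tuple[int, str]]]":
    # Return (path, [(1-based line number, line text), ...]) for matching lines.
    text = path.read_text(encoding="utf-8", errors="replace")
    hits = [
        (lineno, line)
        for lineno, line in enumerate(text.splitlines(), start=1)
        if compiled.search(line)
    ]
    return path, hits


def grep(
    paths: Iterable[Path], pattern: str, max_workers: int = 4
) -> "Iterator[tuple[Path, list[tuple[int, str]]]]":
    compiled = {}  # language -> compiled pattern, filled lazily, once per language
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = []
        for path in paths:
            lang = LANG_BY_SUFFIX.get(path.suffix)
            if lang is None:
                continue  # skip files whose language is unknown
            if lang not in compiled:
                compiled[lang] = compile_for_language(pattern, lang)
            futures.append(pool.submit(match_file, path, compiled[lang]))
        # as_completed yields futures in completion order, so each file's
        # results are streamed as soon as that file is done.
        for future in as_completed(futures):
            file_path, hits = future.result()
            if hits:
                yield file_path, hits
```

Because results arrive in completion order, output order across files is not deterministic in this sketch; a caller that needs stable ordering would have to collect and sort before printing.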
On Windows the directory walk rendered paths with backslashes (e.g. `sub\b.py`) and CRLF files left a trailing `\r` on every rendered code line, which broke two tests.

Normalize all display paths via `Path.as_posix()`, matching the indexer and `ccc search`, and strip the trailing `\r` when splitting source into lines. Add a CRLF rendering regression test that runs on all platforms.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
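A minimal sketch of the two normalizations this commit describes, assuming helper names (`display_path`, `split_source_lines`) that are hypothetical and not taken from the PR. `PureWindowsPath` is used so the backslash behavior can be reproduced on any platform.

```python
from pathlib import PurePath, PureWindowsPath


def display_path(path: PurePath) -> str:
    # as_posix() renders separators as forward slashes on every platform.
    return path.as_posix()


def split_source_lines(source: str) -> "list[str]":
    # Split on "\n" only, then drop a single trailing "\r" left by CRLF endings.
    return [line[:-1] if line.endswith("\r") else line for line in source.split("\n")]


# A Windows-style relative path renders with backslashes via str(),
# but with forward slashes via as_posix().
win_path = PureWindowsPath("sub") / "b.py"
print(str(win_path))           # sub\b.py
print(display_path(win_path))  # sub/b.py

# CRLF source no longer leaves "\r" on the rendered lines.
print(split_source_lines("a = 1\r\nb = 2"))  # ['a = 1', 'b = 2']
```

Splitting on `"\n"` and trimming `"\r"` keeps line numbering tied to newline characters only, whereas `str.splitlines()` also breaks on characters such as form feed; which approach the PR actually uses is not stated in the commit message.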
Completes the Windows path fix: `test_grep_path_glob` compared `fm.path` (now normalized to posix) against a `str()`-built path, which renders with backslashes on Windows. Use `as_posix()` so the assertion is platform independent, matching the other display-path assertions.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Summary
- Add a `ccc grep "PATTERN" [DIR/FILE]` subcommand backed by cocoindex's structural `code_match` API: per-language pattern compilation, project-aware file walking (configured include/exclude globs + nested `.gitignore`, falling back to the enclosing git repo when run outside a cocoindex project), parallel matching on a thread pool, and results streamed per file as they complete. Supports `--lang`/`--path` filters and TTY-colorized output.
- Extract the shared source-file walking logic (the include/exclude + nested-gitignore matcher and the `os.walk`-based iteration) into a new `file_walk` module, now the single source of truth used by the indexer, the daemon's doctor file-walk, and grep.
- Bump cocoindex to `>=1.0.13` for the locked symmetric pattern syntax (`\(NAME*\)`) used by `code_match`.

Test plan
- `tests/test_grep.py` covers the engine and CLI end-to-end (pattern compilation, project/gitignore awareness, git-root anchoring, rendering).
- `uv run mypy src/ tests/`, `uv run ruff check`, and the full `uv run pytest tests/` suite (230 passed) all green against the released cocoindex 1.0.13.