feat: add ccc grep structural code search #200

Merged
georgeh0 merged 3 commits into main from g/ccc-grep on Jun 22, 2026

@georgeh0

Member

Summary

  • Add a ccc grep "PATTERN" [DIR/FILE] subcommand backed by cocoindex's structural code_match API: per-language pattern compilation, project-aware file walking (configured include/exclude globs + nested .gitignore, falling back to the enclosing git repo when run outside a cocoindex project), parallel matching on a thread pool, and results streamed per file as they complete. Supports --lang/--path filters and TTY-colorized output.
  • Extract the shared source-file walking logic (include/exclude + nested-gitignore matcher and os.walk-based iteration) into a new file_walk module — now the single source of truth used by the indexer, the daemon's doctor file-walk, and grep.
  • Bump cocoindex to >=1.0.13 for the locked symmetric pattern syntax (\(NAME*\)) used by code_match.
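
For illustration, the "parallel matching on a thread pool, results streamed per file" flow from the first bullet could look roughly like the sketch below. This is not the PR's implementation: a plain `re` line search stands in for cocoindex's `code_match` (whose signature is not shown on this page), and `match_file`, `grep_files`, and the `workers` parameter are hypothetical names.

```python
from __future__ import annotations

import re
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path
from typing import Iterable, Iterator

# One file's result: (path, [(line_number, line_text), ...])
FileHits = tuple[Path, list[tuple[int, str]]]


def match_file(path: Path, pattern: re.Pattern[str]) -> FileHits:
    """Stand-in matcher: a regex line search, NOT cocoindex's structural code_match."""
    text = path.read_text(encoding="utf-8", errors="replace")
    hits = [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if pattern.search(line)
    ]
    return path, hits


def grep_files(
    paths: Iterable[Path], pattern: re.Pattern[str], workers: int = 8
) -> Iterator[FileHits]:
    """Match files in parallel and yield each file's hits as soon as it finishes."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(match_file, p, pattern) for p in paths]
        # as_completed yields futures in completion order, not submission order,
        # so a slow file does not hold back output for files that are already done.
        for future in as_completed(futures):
            path, hits = future.result()
            if hits:
                yield path, hits
```

Because results arrive in completion order, output order can vary between runs; a caller that needs stable ordering would have to collect and sort before printing.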

Test plan

  • New tests/test_grep.py covers the engine and CLI end-to-end (pattern compilation, project/gitignore awareness, git-root anchoring, rendering).
  • uv run mypy src/ tests/, uv run ruff check, and the full uv run pytest tests/ suite (230 passed) all green against the released cocoindex 1.0.13.

georgeh0 and others added 3 commits June 22, 2026 10:10
Add a `ccc grep "PATTERN" [DIR/FILE]` subcommand backed by cocoindex's
structural `code_match` API. It compiles the pattern once per language,
walks the project (honoring configured include/exclude globs and nested
.gitignore, or the enclosing git repo when run outside a cocoindex
project), and matches files in parallel on a thread pool, streaming each
file's results as soon as it completes. Supports `--lang` and `--path`
filters like `ccc search`, and renders matches with colorized line
numbers and paths under a TTY.

Extract the shared source-file walking logic (the include/exclude +
nested-gitignore matcher and the os.walk-based file iteration) into a new
`file_walk` module, now the single source of truth used by the indexer,
the daemon's doctor file-walk, and grep.

Bump cocoindex to >=1.0.13 for the locked symmetric pattern syntax used
by `code_match`.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
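
As a rough picture of what an `os.walk`-based walker with include/exclude globs might look like, here is a minimal sketch. It is an assumption-laden illustration, not the contents of the `file_walk` module: `walk_source_files` and `_matches` are hypothetical names, `fnmatch` globbing is a simplification of whatever matcher the project uses, and nested `.gitignore` handling is left out entirely.

```python
from __future__ import annotations

import fnmatch
import os
from pathlib import Path
from typing import Iterator, Sequence


def _matches(rel_path: str, patterns: Sequence[str]) -> bool:
    """True if the POSIX-style relative path matches any glob (fnmatch semantics)."""
    return any(fnmatch.fnmatchcase(rel_path, p) for p in patterns)


def walk_source_files(
    root: Path, include: Sequence[str], exclude: Sequence[str]
) -> Iterator[str]:
    """Yield POSIX-style paths under root that match include and do not match exclude."""
    for dirpath, dirnames, filenames in os.walk(root):
        rel_dir = Path(dirpath).relative_to(root)
        # Prune excluded directories in place so os.walk never descends into them.
        dirnames[:] = sorted(
            d for d in dirnames if not _matches((rel_dir / d).as_posix(), exclude)
        )
        for name in sorted(filenames):
            rel = (rel_dir / name).as_posix()
            if _matches(rel, include) and not _matches(rel, exclude):
                yield rel
```

For example, `walk_source_files(Path("."), include=["*.py"], exclude=["build", "build/*"])` would skip a top-level `build` directory without walking it. Note that `fnmatch`'s `*` also matches `/`, which differs from gitignore-style globs.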
On Windows the directory walk rendered paths with backslashes (e.g.
`sub\b.py`) and CRLF files left a trailing `\r` on every rendered code
line, which broke two tests. Normalize all display paths via
`Path.as_posix()` — matching the indexer and `ccc search` — and strip the
trailing `\r` when splitting source into lines. Add a CRLF rendering
regression test that runs on all platforms.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
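
The two normalizations described in the commit above can be sketched as follows. The helper names `display_path` and `split_source_lines` are hypothetical and the PR's actual code may differ; `PureWindowsPath` is used only so the backslash case can be reproduced on any platform.

```python
from __future__ import annotations

from pathlib import PurePath, PureWindowsPath


def display_path(path: PurePath) -> str:
    """Render a path with forward slashes regardless of the host platform."""
    return path.as_posix()


def split_source_lines(source: str) -> list[str]:
    """Split on LF and drop one trailing CR per line, so CRLF files render cleanly."""
    return [
        line[:-1] if line.endswith("\r") else line
        for line in source.split("\n")
    ]


# A Windows-style relative path renders with forward slashes.
print(display_path(PureWindowsPath("sub\\b.py")))
# A CRLF source no longer leaves "\r" at the end of each rendered line.
print(split_source_lines("a = 1\r\nb = 2"))
```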
Completes the Windows path fix: test_grep_path_glob compared fm.path
(now normalized to posix) against a str()-built path, which renders with
backslashes on Windows. Use as_posix() so the assertion is platform
independent, matching the other display-path assertions.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@georgeh0 merged commit 2e96b45 into main on Jun 22, 2026
4 checks passed
@georgeh0 deleted the g/ccc-grep branch on June 22, 2026 17:57