| 1 | +# CLAUDE.md |
| 2 | + |
| 3 | +This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository. |
| 4 | + |
| 5 | +## Pre-Commit Checklist |
| 6 | + |
| 7 | +**ALWAYS run the following before committing:** |
| 8 | +```bash |
| 9 | +npm run build # TypeScript compilation + asset copy |
| 10 | +npm run typecheck # Strict type checking (tsc --noEmit) |
| 11 | +npm run lint # ESLint on src/ |
| 12 | +npm test # Full test suite |
| 13 | +``` |
| 14 | + |
| 15 | +**Quick single-test command:** |
| 16 | +```bash |
| 17 | +npm test -- <path-to-test-file> # Run specific test file |
| 18 | +``` |
| 19 | + |
| 20 | +## Development Commands |
| 21 | + |
| 22 | +| Command | Purpose | |
| 23 | +|---------|---------| |
| 24 | +| `npm run build` | Compile TypeScript and copy assets to `dist/` | |
| 25 | +| `npm run dev` | Watch mode compilation (tsc --watch) | |
| 26 | +| `npm run lint` | Run ESLint on TypeScript files | |
| 27 | +| `npm run lint:fix` | Auto-fix ESLint issues | |
| 28 | +| `npm run typecheck` | Type check without emitting files (`tsc --noEmit`) | |
| 29 | +| `npm test` | Run all tests (Jest with ts-jest preset) | |
| 30 | +| `npm run test:integration` | Run integration tests only | |
| 31 | + |
| 32 | +## Architecture Overview |
| 33 | + |
| 34 | +**Code Search MCP** is a Model Context Protocol (MCP) server that provides intelligent code search capabilities across 12+ programming languages. It integrates universal-ctags, ripgrep, and ast-grep to deliver: |
| 35 | + |
| 36 | +- **Symbol Search** - Indexed lookup of classes, functions, methods, variables |
| 37 | +- **AST Search** - Structural code pattern matching with metavariables and relational rules |
| 38 | +- **Text Search** - Fast regex-based code search via ripgrep |
| 39 | +- **File Search** - Glob-based file navigation |
| 40 | +- **Stack Detection** - Technology stack identification |
| 41 | +- **Dependency Analysis** - Multi-ecosystem package analysis |
| 42 | +- **Index Caching** - Persistent symbol indices (80%+ faster startup) |
| 43 | + |
| 44 | +## Core Components |
| 45 | + |
| 46 | +### MCP Server (`src/mcp/server.ts`) |
| 47 | +- Entry point for all MCP tool calls |
| 48 | +- Routes requests to appropriate services |
| 49 | +- Enforces workspace security via `allowedWorkspaces` configuration |
| 50 | +- Exposes 11 tools: `detect_stacks`, `search_symbols`, `search_text`, `search_files`, `refresh_index`, `cache_stats`, `clear_cache`, `analyze_dependencies`, `search_ast_pattern`, `search_ast_rule`, `check_ast_grep` |
| 51 | + |
| 52 | +### Symbol Search (`src/symbol-search/`) |
| 53 | +- **SymbolIndexer** - Manages ctags integration and cache orchestration |
| 54 | +- **SymbolSearchService** - Handles symbol search with match modes (exact, prefix, substring, regex) |
| 55 | +- **TextSearchService** - Ripgrep wrapper for text/code pattern search |
| 56 | +- **ctags-integration.ts** - Spawns universal-ctags in temp directory (security hardening) |
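
The four match modes listed above could be sketched roughly as follows. This is a minimal illustration, not the code in `src/symbol-search/`: `MatchMode`, `SymbolEntry`, `buildMatcher`, and `searchSymbols` are hypothetical names, and the real service likely adds filtering, limits, and regex validation.

```typescript
// Hypothetical sketch: these names are illustrative, not the repository's API.
type MatchMode = "exact" | "prefix" | "substring" | "regex";

interface SymbolEntry {
  name: string;
  kind: string;
  file: string;
  line: number;
}

// Turn a query and a match mode into a predicate over symbol names.
function buildMatcher(query: string, mode: MatchMode): (name: string) => boolean {
  switch (mode) {
    case "exact":
      return (name) => name === query;
    case "prefix":
      return (name) => name.startsWith(query);
    case "substring":
      return (name) => name.includes(query);
    case "regex": {
      // No global flag, so `test` carries no state between calls.
      const re = new RegExp(query);
      return (name) => re.test(name);
    }
  }
}

// Filter an in-memory index with the selected mode.
function searchSymbols(index: SymbolEntry[], query: string, mode: MatchMode): SymbolEntry[] {
  const matches = buildMatcher(query, mode);
  return index.filter((symbol) => matches(symbol.name));
}
```

Building the predicate once per query keeps the regex from being recompiled for each symbol.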
| 57 | + |
| 58 | +### AST Search (`src/ast-search/`) |
| 59 | +- **ASTSearchService** - Bundled ast-grep NAPI for structural code search |
| 60 | +- Supports 15 languages with dynamic language registration |
| 61 | +- Provides pattern matching with metavariables (`$VAR`, `$$VAR`, `$$$VAR`) |
| 62 | +- Supports complex rules with relational operators (`inside`, `has`, `precedes`, `follows`) |
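
As an illustration of the pattern and rule styles above, here is a sketch of what the inputs might look like. The exact argument schema of `search_ast_pattern` and `search_ast_rule` is not given in this file, so the `AstRule` type is an assumption modeled on ast-grep's rule objects, and it covers only a small subset of them.

```typescript
// Illustrative only: the real tool input schema may differ. `AstRule` is a
// hypothetical, simplified type modeled on ast-grep rule objects.
interface AstRule {
  pattern?: string;
  kind?: string;
  inside?: AstRule & { stopBy?: "neighbor" | "end" };
  has?: AstRule & { stopBy?: "neighbor" | "end" };
}

// Pattern search: `$$$ARGS` captures zero or more nodes (all call arguments),
// while a single `$NAME` captures exactly one named node.
const consoleLogPattern = "console.log($$$ARGS)";

// Relational rule: `await` expressions that appear anywhere inside a
// classic `for` statement. `stopBy: "end"` searches all ancestors, not
// just the direct parent.
const awaitInLoop: AstRule = {
  pattern: "await $EXPR",
  inside: { kind: "for_statement", stopBy: "end" },
};
```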
| 63 | + |
| 64 | +### Cache System (`src/cache/`) |
| 65 | +- **CacheManager** - Persistent symbol index caching in `~/.code-search-mcp-cache/` |
| 66 | +- Automatic invalidation based on file modification times |
| 67 | +- Workspace isolation via hash-based IDs |
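
A minimal sketch of the two ideas above, hash-based workspace IDs and mtime-based invalidation, might look like this. The hash algorithm, ID length, and metadata shape are assumptions for illustration; `workspaceId`, `CachedIndexMeta`, and `isCacheStale` are hypothetical names, not the `CacheManager` API.

```typescript
import { createHash } from "node:crypto";
import * as path from "node:path";

// Hypothetical helper: derive a stable ID from the resolved workspace path so
// that each workspace gets its own cache entry. SHA-256 truncated to 16 hex
// characters is an assumption, not the repository's scheme.
function workspaceId(workspaceRoot: string): string {
  const normalized = path.resolve(workspaceRoot);
  return createHash("sha256").update(normalized).digest("hex").slice(0, 16);
}

interface CachedIndexMeta {
  // File path -> modification time (ms) recorded when the index was built.
  fileMtimes: Record<string, number>;
}

// The cache is stale if a file was added, removed, or has a different mtime.
// The caller is expected to collect `currentMtimes` (for example via fs.stat).
function isCacheStale(meta: CachedIndexMeta, currentMtimes: Record<string, number>): boolean {
  const cachedFiles = Object.keys(meta.fileMtimes);
  const currentFiles = Object.keys(currentMtimes);
  if (cachedFiles.length !== currentFiles.length) return true;
  return currentFiles.some((file) => meta.fileMtimes[file] !== currentMtimes[file]);
}
```

Keeping the staleness check free of file-system calls makes it easy to unit test.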
| 68 | + |
| 69 | +### Security (`src/utils/security.ts`) |
| 70 | +- **ReDoS prevention** - Regex complexity validation to prevent catastrophic backtracking |
| 71 | +- **Path validation** - UNC extended-length path blocking, traversal prevention |
| 72 | +- **Resource limits** - Max file sizes (100MB), recursion depths (100), timeouts (30s) |
| 73 | +- Created during security audit - always validate user inputs |
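
To make the ReDoS bullet concrete, here is a heuristic sketch of a regex complexity check. It is not the logic in `src/utils/security.ts`, which this file does not show. `MAX_PATTERN_LENGTH`, `NESTED_QUANTIFIER`, and `validateRegexPattern` are hypothetical, and the check is deliberately conservative: it may reject some safe patterns.

```typescript
// Assumed limit for illustration; the repository's real limit is not stated here.
const MAX_PATTERN_LENGTH = 1000;

// Matches a group that contains a quantifier and is itself quantified,
// e.g. (a+)+ or (\w*)*, a common source of catastrophic backtracking.
const NESTED_QUANTIFIER = /\([^()]*[+*][^()]*\)[+*{]/;

function validateRegexPattern(pattern: string): { ok: boolean; reason?: string } {
  if (pattern.length > MAX_PATTERN_LENGTH) {
    return { ok: false, reason: "pattern too long" };
  }
  if (NESTED_QUANTIFIER.test(pattern)) {
    return { ok: false, reason: "nested quantifier" };
  }
  try {
    // Reject patterns the regex engine cannot compile at all.
    new RegExp(pattern);
  } catch {
    return { ok: false, reason: "invalid regex" };
  }
  return { ok: true };
}
```

A static check like this complements a timeout on the search itself and does not replace it.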
| 74 | + |
| 75 | +## Project Structure |
| 76 | + |
| 77 | +``` |
| 78 | +src/ |
| 79 | +├── mcp/ # MCP server implementation, tool handlers |
| 80 | +├── symbol-search/ # Symbol indexing, text search |
| 81 | +├── ast-search/ # AST pattern matching (ast-grep NAPI) |
| 82 | +├── file-search/ # File system navigation (fast-glob) |
| 83 | +├── stack-detection/ # Technology stack detection engine |
| 84 | +├── dependency-analysis/ # Multi-ecosystem dependency parsing |
| 85 | +├── cache/ # Persistent index caching |
| 86 | +├── types/ # TypeScript type definitions |
| 87 | +├── utils/ # Utilities: security, workspace validation |
| 88 | +└── stacks.json # Stack definitions for detection |
| 89 | +``` |
| 90 | + |
| 91 | +## Important Conventions |
| 92 | + |
| 93 | +### TypeScript Configuration |
| 94 | +- **Strict mode enabled** - plus `noUnusedLocals`, `noUnusedParameters`, `noImplicitReturns` (not implied by `strict`) |
| 95 | +- **ESM modules** - `module: "NodeNext"`, `moduleResolution: "NodeNext"` |
| 96 | +- **Compilation target** - `ES2022`, Node.js 18+ required |
| 97 | + |
| 98 | +### Import Convention |
| 99 | +- Always use `.js` extensions in imports (ESM requirement): `import { foo } from './foo.js'` |
| 100 | +- Type-only imports: `import type { Foo } from './types/foo.js'` |
| 101 | + |
| 102 | +### Security Model |
| 103 | +- **Workspace validation** - All paths validated against `allowedWorkspaces` list |
| 104 | +- **Path traversal blocking** - Checks for `..` and absolute paths before resolution |
| 105 | +- **UNC path blocking** - Windows `\\?\` and `\\.\` paths rejected explicitly |
| 106 | +- **Symlink hardening** - Temp files use system temp directory, symlink checks before writes |
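
The checks above could be combined roughly as in the sketch below. `isDevicePath` and `resolveInWorkspace` are hypothetical helpers and the error messages are illustrative. Requiring the workspace root to match an `allowedWorkspaces` entry exactly is an assumption, and symlink checks are left out for brevity.

```typescript
import * as path from "node:path";

// Windows extended-length (\\?\) and device (\\.\) prefixes.
function isDevicePath(p: string): boolean {
  return p.startsWith("\\\\?\\") || p.startsWith("\\\\.\\");
}

// Hypothetical helper: validate a user-supplied relative path and return its
// absolute location inside the workspace, or throw.
function resolveInWorkspace(
  workspaceRoot: string,
  userPath: string,
  allowedWorkspaces: string[],
): string {
  const root = path.resolve(workspaceRoot);
  if (!allowedWorkspaces.some((allowed) => path.resolve(allowed) === root)) {
    throw new Error("Workspace is not in allowedWorkspaces");
  }
  if (isDevicePath(userPath)) {
    throw new Error("UNC extended-length and device paths are rejected");
  }
  // Reject obvious traversal attempts before any resolution happens.
  if (path.isAbsolute(userPath) || userPath.split(/[\\/]/).includes("..")) {
    throw new Error("Absolute paths and '..' segments are rejected");
  }
  // Defense in depth: confirm the resolved path still sits under the root.
  const resolved = path.resolve(root, userPath);
  const rel = path.relative(root, resolved);
  if (rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel)) {
    throw new Error("Path escapes the workspace");
  }
  return resolved;
}
```

Checking both before and after resolution is redundant by design: the first check gives a clear error, and the second covers anything the string check misses.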
| 107 | + |
| 108 | +### Testing |
| 109 | +- Unit tests in `tests/unit/`, integration tests in `tests/integration/` |
| 110 | +- Uses `ts-jest` with its ESM preset; Jest runs under Node's `--experimental-vm-modules` flag |
| 111 | +- Tests include 61 security-specific tests across 3 test files |
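
For orientation, a `ts-jest` ESM setup typically looks something like the sketch below. The repository's actual Jest config is not shown in this file, so treat every value as an assumption, in particular the `testMatch` glob.

```typescript
// Sketch of a ts-jest ESM configuration. In a jest.config.ts file this
// object would be the default export. Values are assumptions for illustration.
const jestConfig = {
  // ts-jest preset that compiles TypeScript as native ES modules.
  preset: "ts-jest/presets/default-esm",
  testEnvironment: "node",
  extensionsToTreatAsEsm: [".ts"],
  // Source files import './foo.js' (ESM convention); map those specifiers
  // back to extensionless paths so Jest can locate the .ts sources.
  moduleNameMapper: { "^(\\.{1,2}/.*)\\.js$": "$1" },
  // Assumed test file naming; adjust to the repository's layout.
  testMatch: ["**/tests/**/*.test.ts"],
};
```

With a config along these lines, Jest is usually started as `node --experimental-vm-modules node_modules/jest/bin/jest.js`, which is presumably what `npm test` wraps.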
| 112 | + |
| 113 | +## MCP Tools Reference |
| 114 | + |
| 115 | +| Tool | Purpose | |
| 116 | +|------|---------| |
| 117 | +| `detect_stacks` | Auto-detect technology stacks in a directory | |
| 118 | +| `search_symbols` | Find classes, functions, methods by name/pattern | |
| 119 | +| `search_text` | Regex code search using ripgrep | |
| 120 | +| `search_files` | Find files by name, extension, or glob pattern | |
| 121 | +| `refresh_index` | Rebuild symbol index for a workspace | |
| 122 | +| `cache_stats` | Get cache statistics for workspaces | |
| 123 | +| `clear_cache` | Clear cached indices | |
| 124 | +| `analyze_dependencies` | Analyze project dependencies | |
| 125 | +| `search_ast_pattern` | AST pattern matching with metavariables | |
| 126 | +| `search_ast_rule` | Complex AST rules with relational/composite operators | |
| 127 | +| `check_ast_grep` | Verify ast-grep availability | |
| 128 | + |
| 129 | +## External Dependencies |
| 130 | + |
| 131 | +- **@ast-grep/napi** - Bundled native binaries (no installation required) |
| 132 | +- **@vscode/ripgrep** - Bundled ripgrep binary |
| 133 | +- **@LLMTooling/universal-ctags-node** - Bundled universal-ctags binary |
| 134 | +- **fast-glob** - Fast glob pattern matching |
| 135 | +- **@modelcontextprotocol/sdk** - MCP SDK for server implementation |
| 136 | + |
| 137 | +All external binaries are bundled - no external ctags/ripgrep installation needed by end users. |