| 1 | +# AGENTS.md |
| 2 | + |
| 3 | +Instructions for AI coding agents working on the RuleProbe codebase. |
| 4 | + |
| 5 | +RuleProbe verifies whether agents follow instruction files. This file is the instruction file for agents working on RuleProbe itself. It is parsed by RuleProbe in the self-check workflow, so every rule below is written to be machine-verifiable. |
| 6 | + |
| 7 | +## Project |
| 8 | + |
| 9 | +- Repository: https://github.com/moonrunnerkc/ruleprobe |
| 10 | +- Package: https://www.npmjs.com/package/ruleprobe |
| 11 | +- Language: TypeScript (strict) |
| 12 | +- Runtime: Node.js >= 18 |
| 13 | +- License: MIT |
| 14 | + |
| 15 | +## Build and Test |
| 16 | + |
| 17 | +- Use `npm` as the package manager. Do not switch to pnpm, yarn, or bun. |
| 18 | +- Use `vitest` as the test runner. Do not introduce jest or mocha. |
| 19 | +- Run `npm test` before declaring any change complete. |
| 20 | +- Run `npm run build` to verify the TypeScript compile is clean. |
| 21 | +- A `package-lock.json` must exist at the repo root. |
| 22 | +- Pinned dependency versions are required in `package.json` (no `^` or `~` ranges). |
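
The pinned-version rule above could be checked with something like the following minimal sketch. `findUnpinned`, `PackageManifest`, and the package names are hypothetical illustrations, not part of RuleProbe's API, and the version strings are placeholders.

```typescript
/** Dependency sections of a parsed package.json that carry version specifiers. */
interface PackageManifest {
  dependencies?: Record<string, string>;
  devDependencies?: Record<string, string>;
}

/** Matches only exact versions such as "1.2.3" or "1.2.3-beta.1". */
const EXACT_VERSION = /^\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?$/;

/**
 * Returns the names of dependencies whose specifier is not an exact version.
 * Hypothetical helper for illustration only.
 */
function findUnpinned(manifest: PackageManifest): string[] {
  const sections = [manifest.dependencies ?? {}, manifest.devDependencies ?? {}];
  return sections.flatMap((section) =>
    Object.entries(section)
      .filter(([, spec]) => !EXACT_VERSION.test(spec))
      .map(([name]) => name),
  );
}

// Placeholder package names and versions, chosen only to exercise the check.
const unpinned = findUnpinned({
  dependencies: { "pinned-lib": "1.2.3", "caret-lib": "^1.2.3" },
  devDependencies: { "tilde-lib": "~1.2.3", "exact-lib": "4.5.6" },
});
// unpinned is ["caret-lib", "tilde-lib"]
```

Specifiers such as `latest` or `*` are also reported, since they are not exact versions either.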
| 23 | + |
| 24 | +## Code Style |
| 25 | + |
| 26 | +- Use TypeScript strict mode. Never disable strict checks. |
| 27 | +- Never use `any`. Use `unknown` and narrow, or define a precise type. |
| 28 | +- Always use named exports. Never use default exports. |
| 29 | +- Use camelCase for variables and functions. |
| 30 | +- Use PascalCase for types, interfaces, and classes. |
| 31 | +- Use kebab-case for filenames. |
| 32 | +- Prefer `const` over `let`. |
| 33 | +- Prefer `interface` over `type` for object shapes. |
| 34 | +- Prefer `async/await` over `.then()` chains. |
| 35 | +- Never use `console.log` in production code. Use the structured logger. |
| 36 | +- Never use `eval`. |
| 37 | +- No magic numbers without a named constant or inline comment justifying the value. |
| 38 | +- No em dashes anywhere in source, comments, docs, or commit messages. Use commas, colons, semicolons, parentheses, or separate sentences. |
| 39 | +- Files must stay under 300 lines. If a file approaches the limit, decompose it. |
| 40 | +- Add full JSDoc to every exported function, class, and type. |
| 41 | +- Avoid nested ternaries. Use early returns to flatten control flow. |
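
Several of the style rules above can be seen together in one small function. This is a sketch only: `classifyLength`, `LengthReport`, and the 270-line "near limit" threshold are assumptions made for illustration, while the 300-line limit comes from the rule itself.

```typescript
/** Upper bound on file length, taken from the 300-line rule. */
const MAX_FILE_LINES = 300;

/** Line count at which a file counts as approaching the limit (an assumed threshold). */
const NEAR_LIMIT_LINES = 270;

/** Outcome of a file-length check. */
interface LengthReport {
  lineCount: number;
  status: "ok" | "near-limit" | "over-limit";
}

/**
 * Classifies file contents against the 300-line limit.
 * The input is typed `unknown` and narrowed instead of using `any`;
 * early returns replace what could otherwise be a nested ternary.
 */
export function classifyLength(source: unknown): LengthReport {
  if (typeof source !== "string") {
    throw new TypeError("classifyLength expected a string; pass the file contents as text.");
  }
  const lineCount = source.split("\n").length;
  if (lineCount >= MAX_FILE_LINES) {
    return { lineCount, status: "over-limit" };
  }
  if (lineCount >= NEAR_LIMIT_LINES) {
    return { lineCount, status: "near-limit" };
  }
  return { lineCount, status: "ok" };
}
```

The sketch uses a named export, `const` bindings, an `interface` for the object shape, named constants instead of magic numbers, and JSDoc on the exported function.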
| 42 | + |
| 43 | +## Architecture Boundaries |
| 44 | + |
| 45 | +- Parser code lives under `src/parser/`. Do not call verifier code from the parser. |
| 46 | +- Verifier engines live under `src/verifiers/`. Each engine exports a single entrypoint that returns `VerificationResult[]`. |
| 47 | +- The semantic tier lives under `src/semantic/`. Source code must never leave the user's machine. Only numeric AST vectors, opaque sub-tree hashes, boolean flags, and rule text may be sent to an LLM. |
| 48 | +- The CLI lives under `src/cli/`. Commands compose pipeline functions; they do not contain pipeline logic. |
| 49 | +- Shared types live under `src/types/`. Do not duplicate type definitions across modules. |
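
One way to keep the semantic-tier boundary above hard to violate is to build the outgoing payload field by field. The sketch below assumes hypothetical `LocalAnalysis` and `SemanticPayload` shapes; the real types may differ.

```typescript
/** Local analysis result (hypothetical shape); `sourceText` must stay on the user's machine. */
interface LocalAnalysis {
  sourceText: string;
  astVector: number[];
  subtreeHashes: string[];
  flags: Record<string, boolean>;
}

/** The only kinds of data the boundary rule allows to be sent to an LLM. */
interface SemanticPayload {
  ruleText: string;
  astVector: number[];
  subtreeHashes: string[];
  flags: Record<string, boolean>;
}

/**
 * Builds the outgoing payload by copying allowed fields explicitly.
 * Spreading `analysis` is avoided so that new local fields cannot leak by accident.
 */
export function toSemanticPayload(ruleText: string, analysis: LocalAnalysis): SemanticPayload {
  return {
    ruleText,
    astVector: [...analysis.astVector],
    subtreeHashes: [...analysis.subtreeHashes],
    flags: { ...analysis.flags },
  };
}

// Illustrative values only.
const payload = toSemanticPayload("Never use `eval`.", {
  sourceText: "eval('1 + 1')",
  astVector: [0.1, 0.9],
  subtreeHashes: ["a1b2"],
  flags: { usesEval: true },
});
```

Because the return type has no source field, adding one would be a visible change to a shared type rather than a silent leak.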
| 50 | + |
| 51 | +## Verifier Engines |
| 52 | + |
| 53 | +There are eight verifier engines: `ast`, `filesystem`, `regex`, `treesitter`, `preference`, `tooling`, `config-file`, `git-history`. When adding a check, use an existing engine. Adding a new engine requires a written justification in the PR description. |
| 54 | + |
| 55 | +- AST checks must use `ts-morph`. Do not parse TypeScript with regex. |
| 56 | +- Tree-sitter WASM loading must fail gracefully. If a grammar fails to load, log a warning and skip the check. Never block other verifiers. |
| 57 | +- Type-aware AST checks (implicit any, unused exports, unresolved imports) require a `tsconfig.json` and the `--project` flag. Skip cleanly when absent. |
| 58 | +- Semantic tier failures must never prevent deterministic results from returning. |
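
The failure-isolation rules above amount to wrapping each engine call so that one failure becomes a warning instead of an aborted run. A minimal sketch follows; the `VerificationResult` fields, the `Engine` interface, and `runEngines` are hypothetical, and warnings are collected in an array here where the project would use its structured logger.

```typescript
/** Hypothetical result shape; the project's real `VerificationResult` may differ. */
interface VerificationResult {
  ruleId: string;
  passed: boolean;
}

/** A verifier engine reduced to its single entrypoint. */
interface Engine {
  name: string;
  run: () => VerificationResult[];
}

/** Collected results plus one warning per skipped engine. */
interface PipelineOutput {
  results: VerificationResult[];
  warnings: string[];
}

/** Runs every engine; a failing engine is skipped and never blocks the others. */
export function runEngines(engines: Engine[]): PipelineOutput {
  const results: VerificationResult[] = [];
  const warnings: string[] = [];
  for (const engine of engines) {
    try {
      results.push(...engine.run());
    } catch (error: unknown) {
      const reason = error instanceof Error ? error.message : String(error);
      // Record the failure and continue with the next engine.
      warnings.push(`Engine "${engine.name}" was skipped: ${reason}`);
    }
  }
  return { results, warnings };
}

// Illustrative engines: the middle one simulates a grammar that fails to load.
const output = runEngines([
  { name: "regex", run: () => [{ ruleId: "no-eval", passed: true }] },
  { name: "treesitter", run: () => { throw new Error("grammar failed to load"); } },
  { name: "filesystem", run: () => [{ ruleId: "lockfile-exists", passed: true }] },
]);
// output.results has 2 entries and output.warnings has 1
```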
| 59 | + |
| 60 | +## Parser Rules |
| 61 | + |
| 62 | +- The parser supports seven instruction file formats: `CLAUDE.md`, `AGENTS.md`, `.cursorrules`, `copilot-instructions.md`, `GEMINI.md`, `.windsurfrules`, `.rules`. Parser changes must not break extraction for any of them. |
| 63 | +- Lines that cannot be mapped to a deterministic check go into the `unparseable` array. Do not invent rules to inflate the parse rate. |
| 64 | +- LLM-extracted rules must be tagged `extractionMethod: 'llm'` with `confidence: 'medium'`. |
| 65 | +- Rubric-decomposed rules must be tagged `confidence: 'low'`. |
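
The rules above describe two behaviors: unmatched lines are collected rather than forced into a rule, and LLM-extracted rules carry fixed tags. A rough sketch under assumed names: only the `unparseable` array, `extractionMethod: 'llm'`, `confidence: 'medium'`, and `confidence: 'low'` come from this file; `ParsedRule`, `Matcher`, `parseLines`, and the `'deterministic'` and `'high'` values are hypothetical.

```typescript
type ExtractionMethod = "deterministic" | "llm"; // "deterministic" is an assumed value
type Confidence = "high" | "medium" | "low"; // "high" is an assumed value

/** Hypothetical shape of an extracted rule. */
interface ParsedRule {
  text: string;
  extractionMethod: ExtractionMethod;
  confidence: Confidence;
}

/** Parser output: extracted rules plus lines that map to no deterministic check. */
interface ParseOutput {
  rules: ParsedRule[];
  unparseable: string[];
}

/** Maps a line to a rule, or returns undefined when no deterministic check fits. */
type Matcher = (line: string) => ParsedRule | undefined;

/** Sorts lines into rules and `unparseable` without inventing rules for unmatched lines. */
export function parseLines(lines: string[], matcher: Matcher): ParseOutput {
  const output: ParseOutput = { rules: [], unparseable: [] };
  for (const line of lines) {
    const rule = matcher(line);
    if (rule === undefined) {
      output.unparseable.push(line);
      continue;
    }
    output.rules.push(rule);
  }
  return output;
}

/** Tags an LLM-extracted rule as the parser rules require. */
export function tagLlmRule(text: string): ParsedRule {
  return { text, extractionMethod: "llm", confidence: "medium" };
}

// Toy matcher that recognizes exactly one rule, for illustration.
const knownRule = "Never use `eval`.";
const toyMatcher: Matcher = (line) =>
  line === knownRule
    ? { text: line, extractionMethod: "deterministic", confidence: "high" }
    : undefined;
const parsed = parseLines([knownRule, "Write elegant code."], toyMatcher);
// parsed.rules has 1 entry; parsed.unparseable is ["Write elegant code."]
```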
| 66 | + |
| 67 | +## Testing |
| 68 | + |
| 69 | +- Every new function requires at least one test. |
| 70 | +- Test files live under `tests/` and mirror the `src/` directory structure. |
| 71 | +- Test names describe behavior, not implementation. |
| 72 | +- Tests must validate real behavior, not wiring. Reading the implementation should not be required to understand what a test verifies. |
| 73 | +- No mocks except at external boundaries: the Anthropic API, the OpenAI API, the GitHub API, and the filesystem when testing error paths. |
| 74 | +- Use `describe` and `it` blocks. Do not use `test()` directly. |
| 75 | +- Never use `console.log` in tests. |
| 76 | +- New matchers require: the matcher implementation, a test file with real-world instruction examples, and an entry in `docs/matchers.md`. |
| 77 | + |
| 78 | +## Security |
| 79 | + |
| 80 | +- Never execute scanned code. |
| 81 | +- Never modify files in the scanned directory. |
| 82 | +- All user-supplied paths must be resolved and bounded to the working directory. |
| 83 | +- Symlinks resolving outside the project root must be skipped unless `--allow-symlinks` is passed. |
| 84 | +- Never write API keys to disk or include them in reports. |
| 85 | +- Network calls are allowed only when the user opts in: `--llm-extract`, `--rubric-decompose`, `--semantic`, or `ruleprobe run`. |
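
The path-bounding rule above can be sketched with Node's built-in `path` module. `resolveWithin` is a hypothetical helper name, and the sketch covers lexical containment only: it does not resolve symlinks, which the symlink rule would handle separately (for example with a real-path lookup).

```typescript
import { isAbsolute, relative, resolve, sep } from "node:path";

/**
 * Resolves a user-supplied path against `root` and rejects anything that lands outside it.
 * Lexical check only; symlinks are not followed here.
 */
export function resolveWithin(root: string, userPath: string): string {
  const absoluteRoot = resolve(root);
  const candidate = resolve(absoluteRoot, userPath);
  const offset = relative(absoluteRoot, candidate);
  // An offset of ".." or one starting with "../" climbs out of the root;
  // an absolute offset can occur on Windows when the drive differs.
  const escapesRoot = offset === ".." || offset.startsWith(`..${sep}`) || isAbsolute(offset);
  if (escapesRoot) {
    throw new Error(
      `Path "${userPath}" resolves outside "${absoluteRoot}". Pass a path inside the working directory.`,
    );
  }
  return candidate;
}
```

Comparing the relative offset, rather than testing whether the candidate string starts with the root string, avoids treating a sibling such as `/repo-other` as being inside `/repo`.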
| 86 | + |
| 87 | +## Imports |
| 88 | + |
| 89 | +- No path aliases. Use relative imports. |
| 90 | +- No barrel imports from deep internal modules. Import directly from the file that defines the symbol. |
| 91 | +- No wildcard imports. |
| 92 | +- Do not import lodash. Use native JavaScript methods. |
| 93 | + |
| 94 | +## Error Handling |
| 95 | + |
| 96 | +- Never use empty catch blocks. |
| 97 | +- Never swallow errors silently. Log or rethrow. |
| 98 | +- Catch clauses must declare the caught value as `unknown` and narrow it before use. |
| 99 | +- Error messages must include what failed and what to do about it. |
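
Taken together, the error-handling rules above might look like the following sketch. `parseConfig` is a hypothetical function used only to illustrate the pattern.

```typescript
/**
 * Parses a JSON object from text.
 * The catch clause types the caught value as `unknown`, narrows it, and rethrows
 * with a message that says what failed and what to do about it.
 */
export function parseConfig(raw: string): Record<string, unknown> {
  try {
    const parsed: unknown = JSON.parse(raw);
    if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
      throw new TypeError("top-level value is not a JSON object");
    }
    return parsed as Record<string, unknown>;
  } catch (error: unknown) {
    // Narrow before reading `.message`; thrown values are not guaranteed to be Error instances.
    const reason = error instanceof Error ? error.message : String(error);
    throw new Error(
      `Could not parse config: ${reason}. Check that the file contains a valid JSON object.`,
    );
  }
}
```

The error is rethrown rather than swallowed, so callers decide whether to log it or stop.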
| 100 | + |
| 101 | +## Git Workflow |
| 102 | + |
| 103 | +- Use conventional commit messages: `feat:`, `fix:`, `docs:`, `refactor:`, `test:`, `chore:`. |
| 104 | +- Branch names use kebab-case: `feat/new-matcher`, `fix/parser-bug`. |
| 105 | +- Pull requests must pass the self-check workflow before merge. |
| 106 | +- Do not commit `.env` files or any file containing secrets. |
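
The commit and branch naming rules above can be approximated with two patterns. This is a sketch under assumptions: `isConventionalSubject` and `isKebabBranch` are hypothetical helpers, only the six listed prefixes are accepted, and scoped forms such as `feat(parser):` are not handled.

```typescript
/** Commit prefixes listed in the Git Workflow rules. */
const COMMIT_TYPES = ["feat", "fix", "docs", "refactor", "test", "chore"] as const;

/** A listed prefix, a colon, a space, then a non-empty description on one line. */
const COMMIT_PATTERN = new RegExp(`^(?:${COMMIT_TYPES.join("|")}): \\S.*$`);

/** Two kebab-case segments joined by a slash, as in `feat/new-matcher`. */
const BRANCH_PATTERN = /^[a-z0-9]+(?:-[a-z0-9]+)*\/[a-z0-9]+(?:-[a-z0-9]+)*$/;

/** Checks the subject line (first line) of a commit message. */
export function isConventionalSubject(subject: string): boolean {
  return COMMIT_PATTERN.test(subject);
}

/** Checks a branch name against the kebab-case convention. */
export function isKebabBranch(branch: string): boolean {
  return BRANCH_PATTERN.test(branch);
}
```

Pass only the first line of a commit message to `isConventionalSubject`; multi-line input does not match.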
| 107 | + |
| 108 | +## Configuration Files |
| 109 | + |
| 110 | +- ESLint config lives at `.eslintrc.json` or `eslint.config.js`. |
| 111 | +- Prettier config lives at `.prettierrc` or `.prettierrc.json`. |
| 112 | +- TypeScript config lives at `tsconfig.json`. |
| 113 | +- Vitest config lives at `vitest.config.ts`. |
| 114 | +- Do not add competing tools (for example, Biome or Rome alongside ESLint and Prettier). |
| 115 | + |
| 116 | +## Documentation |
| 117 | + |
| 118 | +- Update `docs/matchers.md` when adding or modifying a matcher. |
| 119 | +- Update `docs/cli-reference.md` when adding or changing a CLI command or flag. |
| 120 | +- Update `docs/api-reference.md` when changing the public API surface. |
| 121 | +- Update the relevant release notes file under `docs/` for any user-facing change. |
| 122 | + |
| 123 | +## What Not To Do |
| 124 | + |
| 125 | +- Do not introduce a new agent SDK adapter without a corresponding integration test. |
| 126 | +- Do not weaken the security boundary to make a check easier to implement. |
| 127 | +- Do not push deterministic logic into the semantic tier because it is easier to write. |
| 128 | +- Do not add features that require API keys for the default deterministic path. |
| 129 | +- Do not add dependencies without a clear justification. Each dependency is a maintenance cost. |