Commit 0926237

chore: restore AGENTS.md — project instructions for AI agents
1 parent 4d48358 commit 0926237

1 file changed

Lines changed: 110 additions & 0 deletions

AGENTS.md

@@ -0,0 +1,110 @@
# AGENTS.md — Inspect

Website security auditing and crawling library for Go. Crawls sites concurrently, runs checks and declarative rules, generates findings with severity and CWE references.

## Design Principles

- **Library + CLI** — importable library with optional `inspect-ci` binary
- **No LLM dependency** — pure static analysis on crawled pages
- **Extensible** — custom checks (Go code) + declarative rules (no code required)

## Build & Test

```bash
go test ./...                      # Run all tests
go test -race ./...                # Race detector
go test -coverprofile=c.out ./...  # Coverage
go vet ./...                       # Static analysis
gofumpt -w .                       # Format
```

## Architecture

- `crawler.go` — Concurrent website crawler with depth control
- `check.go` — Check interface and built-in security checks
- `rule.go` — Declarative rule engine (YAML-based)
- `finding.go` — Findings with severity, CWE, and evidence
- `report.go` — Report generation (JSON, SARIF, HTML)
- `cmd/inspect-ci/` — Optional CI binary for pipeline integration

## Conventions

- Go 1.26+, pure Go, no CGO
- Table-driven tests
- Conventional Commits: `feat:`, `fix:`, `docs:`, `refactor:`, `test:`
- No `Co-authored-by:` trailers (auto-stripped by githook)
- `gofumpt` formatting enforced in CI
- CWE references required for all security findings

## Common Pitfalls

- Crawler tests need HTTP test servers — use `httptest.NewServer`
- Rule YAML must be validated before execution
- Session cookie matching uses substring, not exact match
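
To illustrate why substring matching is listed as a pitfall, here is a minimal sketch. The helper `isSessionCookie`, the marker list, and the choice to match on the lowercased cookie name are all assumptions for illustration; the library's actual matching code is not shown in this file.

```go
package main

import (
	"fmt"
	"strings"
)

// isSessionCookie is a hypothetical helper: it reports whether the
// lowercased cookie name contains any of the given markers.
// Substring matching is broader than exact matching, so unrelated
// names can match by accident.
func isSessionCookie(name string, markers []string) bool {
	lower := strings.ToLower(name)
	for _, m := range markers {
		if strings.Contains(lower, m) {
			return true
		}
	}
	return false
}

func main() {
	markers := []string{"session", "sid"} // assumed marker list
	for _, name := range []string{"PHPSESSID", "my_session_id", "theme", "considered"} {
		fmt.Printf("%-14s %t\n", name, isSessionCookie(name, markers))
	}
}
```

In this sketch `considered` matches because it contains `sid`, which is the kind of false positive to keep in mind when writing assertions about session cookies.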

## Naming Conventions

- **Types are domain nouns**: `Finding`, `Report`, `Stats`, `Page`, `PageLink`, `Checker`, `RuleCheck`
- **Option functions use `With` prefix**: `WithChecks()`, `WithDepth()`, `WithConcurrency()`, `WithAllowPrivateIPs()`
- **Preset options are bare vars**: `Quick`, `Standard`, `Deep`, `SecurityOnly`, `CI` — exported `var Option` values
- **Severity is a type alias**: `type Severity = types.Severity` from `hawk/shared/types` — shared across hawk-eco
- **Internal adapters use `Adapter` suffix**: `ruleCheckAdapter`, `customCheckAdapter` — bridge public to internal interfaces
- **Check names are lowercase strings**: `"security"`, `"links"`, `"forms"`, `"a11y"`, `"performance"` — used in `WithChecks()`
- **Error handling**: `Scan()` returns `(*Report, error)` — a validation error for an empty URL, a nil error on success
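
The `With` prefix and bare-var preset conventions can be sketched as follows. This is an illustration under assumptions, not the library's code: the `config` field names, the default values, and what the `Quick` preset sets are invented here; only the names `Option`, `optFunc`, `buildConfig`, `WithDepth`, `WithConcurrency`, `WithChecks`, and `Quick` come from this file.

```go
package main

import "fmt"

// config holds scan settings. Field names and defaults are assumptions.
type config struct {
	depth       int
	concurrency int
	checks      []string
}

// Option mutates a config; optFunc adapts a plain function to it.
type Option interface{ apply(*config) }

type optFunc func(*config)

func (f optFunc) apply(c *config) { f(c) }

// Option functions use the With prefix.
func WithDepth(n int) Option {
	return optFunc(func(c *config) { c.depth = n })
}

func WithConcurrency(n int) Option {
	return optFunc(func(c *config) { c.concurrency = n })
}

func WithChecks(names ...string) Option {
	return optFunc(func(c *config) { c.checks = names })
}

// Quick is a preset: an exported var holding an Option value.
// The values it sets are assumed for this sketch.
var Quick Option = optFunc(func(c *config) {
	c.depth = 1
	c.concurrency = 2
})

// buildConfig starts from defaults and applies options in order,
// so later options override earlier ones.
func buildConfig(opts ...Option) config {
	c := config{depth: 3, concurrency: 4}
	for _, o := range opts {
		if o != nil {
			o.apply(&c)
		}
	}
	return c
}

func main() {
	c := buildConfig(Quick, WithDepth(2), WithChecks("security", "links"))
	fmt.Println(c.depth, c.concurrency, c.checks)
}
```

Because options apply in order, a preset can be passed first and then adjusted with individual `With*` calls.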

## API Patterns

- **Functional options pattern**: same as sight — `Option` interface with `optFunc` adapter, `buildConfig()` merge
- **One-shot + reusable**: `Scan(ctx, target, opts...)` creates a `Scanner` internally; `NewScanner(opts...)` for reuse
- **Checker interface for extensibility**: `Name() string` + `Run(ctx, pages) []Finding` — register via `RegisterCheck()`
- **RuleCheck for declarative rules**: `HeaderMatch`, `HeaderMissing`, `BodyMatch`, `BodyMissing`, `URLMatch` patterns
- **Global + per-scanner custom checks**: `RegisterCheck()`/`RegisterRule()` for global; pass slices to `Scanner` for scoped
- **Report.Failed()**: checks if any finding meets `FailOn` severity threshold — same pattern as sight
- **ReDoS protection**: all user-supplied regex patterns go through `compileWithTimeout()` and `matchWithTimeout()`, with 1s and 100ms limits respectively
- **Regex complexity check**: `checkRegexComplexity()` rejects nested quantifiers and deep group nesting before compilation
65+
66+
## Testing Patterns
67+
68+
- **External test package**: `package inspect_test` — tests import `inspect` as a consumer would
69+
- **httptest.NewServer for all tests**: each test spins up a mock HTTP server with specific HTML/headers/responses
70+
- **Test patterns by concern**: `TestScan_BasicSite` (links), `TestScan_SecurityHeaders`, `TestScan_FormCSRF`, `TestScan_Accessibility`
71+
- **Always pass `WithAllowPrivateIPs()`**: tests run against `127.0.0.1` — without this flag, localhost is blocked
72+
- **Always pass `WithDepth(1)`**: keeps tests fast by limiting crawl depth
73+
- **Finding assertions**: iterate `report.Findings` and check specific `Check`, `Severity`, `Message` fields
74+
- **Preset smoke test**: `TestScan_Presets` runs all presets against a simple server — catches config panics
75+
- **ClearCustomChecks() in tests**: call before registering test-specific checks to avoid global state leaks
76+
- **Report method tests**: `TestReport_Failed`, `TestReport_MaxSeverity` — test on struct literals, no HTTP needed
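
The mock-server and table-driven patterns above can be sketched in a self-contained form. Because the package's import path is not given in this file, the sketch uses a hypothetical stand-in, `missingHeaders`, where a real test would call `inspect.Scan` with `WithAllowPrivateIPs()` and `WithDepth(1)` inside a `TestScan_*` function; `newServer` is also an invented helper.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// missingHeaders is a hypothetical stand-in for a scan: it fetches url
// and returns the required response headers that are absent.
func missingHeaders(url string, required []string) ([]string, error) {
	resp, err := http.Get(url)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	var missing []string
	for _, h := range required {
		if resp.Header.Get(h) == "" {
			missing = append(missing, h)
		}
	}
	return missing, nil
}

// newServer starts a mock site that sets the given response headers.
func newServer(headers map[string]string) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		for k, v := range headers {
			w.Header().Set(k, v)
		}
		fmt.Fprint(w, "<html><body>ok</body></html>")
	}))
}

func main() {
	required := []string{"Content-Security-Policy", "X-Frame-Options"}
	cases := []struct {
		name    string
		headers map[string]string
		want    int // number of missing headers expected
	}{
		{"no security headers", nil, 2},
		{"frame options only", map[string]string{"X-Frame-Options": "DENY"}, 1},
	}
	for _, tc := range cases {
		srv := newServer(tc.headers) // one mock server per case
		missing, err := missingHeaders(srv.URL, required)
		srv.Close()
		pass := err == nil && len(missing) == tc.want
		fmt.Printf("%s: missing=%v pass=%t\n", tc.name, missing, pass)
	}
}
```

In a real test file the loop body would sit inside `t.Run(tc.name, ...)` and the comparison would use `t.Errorf` instead of printing.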

## Refactoring Guidelines

- **Safe to refactor**: `checkRegexComplexity()`, `compileWithTimeout()`, `matchWithTimeout()` — internal helpers
- **Safe to refactor**: `truncateEvidence()`, `intIn()` — pure utility functions
- **Safe to refactor**: `parseInspectTOML()`, `parseInspectKeyValue()`, `applyFileConfig()` — config parsing internals
- **Do not touch**: `Checker` interface (`Name()`, `Run()`) — breaking change for all custom check implementations
- **Do not touch**: `RuleCheck` struct field names — used by consumers to define declarative rules
- **Do not touch**: `Finding`, `Report`, `Stats` struct field names/tags — JSON serialization contract
- **Safe to extend**: add new `Option` functions, new presets, new built-in checks in `checks/` package
- **When adding checks**: create a new file in `checks/`, implement `Checker` interface, register in `init()`

## Key File Locations

| What | Where |
|---|---|
| Public API entry point | `inspect.go` (types, `Scan()`, `Finding`, `Report`, `Stats`) |
| Check interface & adapters | `check.go` (`Checker`, `RuleCheck`, `RegisterCheck()`, `RegisterRule()`, ReDoS protection) |
| Scanner implementation | `scanner.go` (crawler orchestration, check execution) |
| Configuration & presets | `options.go` (`config` struct, `With*` functions, presets) |
| Config file loading | `config.go` (`.inspect.toml` parsing, `LoadConfig()`) |
| Severity type alias | `severity.go` (re-exports from `hawk/shared/types`) |
| SARIF output | `sarif.go` |
| CI output formatting | `ci_output.go` |
| Built-in checks | `checks/` directory |
| Internal crawler | `internal/crawler/` |
| Internal check runner | `internal/check/` |
| Browser-based crawling | `browser.go`, `browser/` |
| LLM scanner integration | `llm_scanner.go` |
| API security checks | `api_security.go` |
| Dependency checking | `dependency_check.go` |
| SBOM generation | `sbom.go` |
| Main test file | `inspect_test.go` (httptest servers, per-concern scenarios) |
| Linter config | `.golangci.yml` (errcheck, govet, staticcheck, gocritic, bodyclose, noctx) |