Tier 1 convergence: docs, security, emitter fixes, and LLM resilience #2
Merged
added 11 commits
May 22, 2026 21:55
The README and api reference still claimed APIs that were removed in v4.5.0 and described type unions that no longer match src/types.ts. The README's opening also led with prose instead of the artifact, and the cover image referenced in the assets path was missing.

Specific changes:
- Add assets/cover.svg (was missing, referenced by README header)
- Rewrite README header to show cover + concrete 2-sentence description
- Drop the "What's New in v4.5.0" section (duplicates CHANGELOG.md)
- Trim 110 lines of inline CLI deep-dive (lives in docs/cli-reference.md)
- Add the tree-sitter engine row to How It Works (re-added in v4.5.0)
- docs/api-reference.md: remove the Agent Invocation section (buildAgentConfig, invokeAgent, isAgentSdkAvailable, hasAgentOutput, watchForCompletion, countCodeFiles were all removed in v4.5.0 and no longer exist in src/)
- docs/api-reference.md: correct RuleCategory from 15 stale members to the 8 in src/types.ts; correct VerifierType to 4 members
- docs/api-reference.md: replace the ruleprobe/semantic subpath import example with a note (package.json has no exports field)

Closes: docs-cover-missing, docs-stale-agent-invocation, docs-rulecategory-mismatch, docs-verifiertype-mismatch, docs-semantic-subpath
Co-found: docs-reviewer
Two stale claims in SECURITY.md:

The list of network opt-in flags still referenced ruleprobe run, a command removed in v4.5.0. Trimmed it back to the three flags that actually trigger network calls: --llm-extract, --rubric-decompose, and --semantic.

The audit status block still claimed "as of v0.1.0, 5 moderate advisories in esbuild." Current npm audit reports 7 moderate advisories across the vitest chain (vite, esbuild, postcss, brace-expansion). All dev-tooling; none reach the published package. Updated text and removed the unsupported version pin.

Closes: docs-security-run-stale, docs-security-audit-stale
Co-found: docs-reviewer, security-reviewer
The .gitignore was missing a handful of defense-in-depth patterns that are baseline for a Node project, and it had a duplicate entry for .ruleprobe-semantic/. No sensitive files are currently tracked as a result of the gaps, but the gaps create accident surface for future contributors.

Added patterns: .env.* glob (with .env.example/.env.template exceptions), credential extensions (*.pem, *.key, *.p12, *.pfx, *.crt), editor backup patterns (*~, *.bak, *.orig, *.swp, *.swo), npm and yarn debug logs, Windows noise (Thumbs.db, desktop.ini), and the local pubprep output directory. Removed the duplicate .ruleprobe-semantic/ line.

Closes: L-02-gitignore-gaps
Co-found: security-reviewer
Both the self-check workflow and the published composite action referenced actions/checkout and actions/setup-node by mutable v4 tag. A tag redirect by a compromised maintainer account would run in CI with contents:read and pull-requests:write, exposing the workflow's GITHUB_TOKEN.

Pinned actions/checkout to 11bd71901bbe5b1630ceea73d27597364c9af683 (v4.2.2) and actions/setup-node to 49933ea5288caeca8642d1e84afbd3f7d6820020 (v4.4.0). Version tags retained as adjacent comments for human review and Dependabot tag-comment updates.

Closes: M-01-a03-ci-sha-pinning
Co-found: security-reviewer
emitLegacyConfig appended a // Unmappable rules comment block after the JSON body whenever any rule from the instruction file did not map to an ESLint rule. The full output was not valid JSON, breaking the JSDoc contract on emitEslintConfig that promises "valid JSON suitable for .eslintrc.json files." The existing test acknowledged this by stripping the comment block before calling JSON.parse, which just tested around the invariant violation.

emitLegacyConfig now returns strict JSON only. Unmappable rules are surfaced through a new exported helper formatUnmappableSummary(), which lint-config writes to stderr after the JSON has been emitted. The flat config path is unchanged because flat config is JavaScript and the comments are syntactically valid there.

The existing test is rewritten to assert two contracts: the legacy output parses cleanly and contains no // markers, and the summary helper surfaces unmappable entries by id, reason, and source text.

Closes: a3f1c8d2
Co-found: tech-debt-reviewer
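The split described above (strict JSON on one channel, the unmappable summary on another) might look roughly like the sketch below. The `UnmappableRule` and `LegacyConfigInput` shapes and both `...Sketch` function names are assumptions made for illustration; the real emitLegacyConfig and formatUnmappableSummary signatures are not shown in the commit and may differ.

```typescript
// Hypothetical shapes: the project's real config and unmappable-rule types may differ.
interface UnmappableRule {
  id: string;
  reason: string;
  sourceText: string;
}

interface LegacyConfigInput {
  rules: Record<string, unknown>;
  unmappable: UnmappableRule[];
}

// Returns strict JSON only, so the result always survives JSON.parse.
function emitLegacyConfigSketch(input: LegacyConfigInput): string {
  return JSON.stringify({ rules: input.rules }, null, 2);
}

// Formats unmappable rules for a side channel such as stderr.
function formatUnmappableSummarySketch(unmappable: UnmappableRule[]): string {
  if (unmappable.length === 0) return "";
  const lines = unmappable.map(
    (rule) => `- ${rule.id}: ${rule.reason} (source: "${rule.sourceText}")`,
  );
  return ["Unmappable rules:", ...lines].join("\n");
}

// Illustrative input; the rule ids and text here are invented for the example.
const sampleInput: LegacyConfigInput = {
  rules: { "no-console": "error" },
  unmappable: [
    {
      id: "max-file-length",
      reason: "no equivalent ESLint rule",
      sourceText: "Keep files under 300 lines",
    },
  ],
};

const legacyJson = emitLegacyConfigSketch(sampleInput);
const unmappableSummary = formatUnmappableSummarySketch(sampleInput.unmappable);
```

A caller in this sketch would write `legacyJson` to stdout and `unmappableSummary` to stderr, so that redirecting stdout into .eslintrc.json yields a file that parses without any stripping step.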
emitLegacyConfig unconditionally added eslint:recommended plus plugin:<name>/recommended to the extends field whenever any plugin was referenced. The flat emitter does not do this. The same EslintConfig produced a minimal targeted output in flat format and a full recommended-set output (often hundreds of extra rules) in legacy format, breaking symmetry and surprising users who asked for a targeted config.

The legacy emitter now omits extends entirely. Users who want the recommended sets can add `extends` in their own root config; the generated file no longer makes that choice for them.

Updated the corresponding test to assert extends is absent.

Closes: e5a3c197
Co-found: tech-debt-reviewer
The bidirectional mapping table in src/mappings/index.ts had two entries that did not match the canonical names used by the per-rule mappers in src/mapper/mappings/. The extractor reads this table to reverse-map ESLint rule names back to instruction prose, so the mismatch meant `ruleprobe extract` silently failed to recognize the rules the mapper now correctly emits.

- no-enum: was mapped to @typescript-eslint/no-enum (a rule that does not exist in typescript-eslint). Now no-restricted-syntax, matching src/mapper/mappings/type-safety.ts which uses no-restricted-syntax with a TSEnumDeclaration selector.
- no-unused-exports: was mapped to no-unused-vars (different semantics, scoped to local variable usage). Now import/no-unused-modules, matching the canonical mapper.

Closes: b7e2a941
Co-found: tech-debt-reviewer
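To see why a stale forward entry breaks the reverse lookup, here is a minimal sketch. The `ForwardTable` type and `buildReverseLookup` helper are hypothetical; only the four rule-name pairs come from the commit message, and the real table in src/mappings/index.ts is likely structured differently.

```typescript
// Hypothetical table shape: instruction rule id -> ESLint rule name.
type ForwardTable = Record<string, string>;

// Entries as described in the commit message, after the fix.
const fixedTable: ForwardTable = {
  "no-enum": "no-restricted-syntax",
  "no-unused-exports": "import/no-unused-modules",
};

// The same entries before the fix, kept here only for comparison.
const staleTable: ForwardTable = {
  "no-enum": "@typescript-eslint/no-enum",
  "no-unused-exports": "no-unused-vars",
};

// Inverts the table into ESLint rule name -> instruction rule ids.
function buildReverseLookup(table: ForwardTable): Map<string, string[]> {
  const reverse = new Map<string, string[]>();
  for (const [instructionRule, eslintRule] of Object.entries(table)) {
    const existing = reverse.get(eslintRule) ?? [];
    existing.push(instructionRule);
    reverse.set(eslintRule, existing);
  }
  return reverse;
}

// Names the canonical mapper emits, per the commit message.
const emittedByMapper = ["no-restricted-syntax", "import/no-unused-modules"];

const fixedLookup = buildReverseLookup(fixedTable);
const staleLookup = buildReverseLookup(staleTable);

// How many emitted names each table can map back to an instruction rule.
const recognizedAfterFix = emittedByMapper.filter((name) => fixedLookup.has(name));
const recognizedBeforeFix = emittedByMapper.filter((name) => staleLookup.has(name));
```

With the stale table, neither emitted name resolves, which matches the silent failure described above. A real reverse mapper for no-restricted-syntax would presumably also need to inspect the selector, since that ESLint rule is shared by many unrelated restrictions.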
KNOWN_PATTERN_TYPES was defined twice as a module-level const in src/llm/pipeline.ts and src/llm/rubric-pipeline.ts, each with a "keep in sync" comment and no test to enforce equality. The two lists had already diverged: the rubric version included type-aware and tree-sitter entries that the extraction version did not, so an unparseable line involving those checks would degrade the extraction prompt without warning.

Moved the canonical list to src/llm/known-patterns.ts and grouped the entries by verifier (AST, regex, filesystem, type-aware, tree-sitter) so the next addition lands in one place. Both pipelines now import the same constant.

Closes: c9d4b073
Co-found: tech-debt-reviewer
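A single grouped source of truth could be sketched as follows. The pattern names are placeholders, because the commit message does not list the real entries, and the two prompt builders are hypothetical stand-ins for the extraction and rubric pipelines (shown in one file here, where the real code imports the constant from a shared module).

```typescript
// Placeholder entries: the real pattern type names are not listed in the commit message.
const PATTERNS_BY_VERIFIER: Record<string, readonly string[]> = {
  ast: ["example-ast-pattern"],
  regex: ["example-regex-pattern"],
  filesystem: ["example-filesystem-pattern"],
  "type-aware": ["example-type-aware-pattern"],
  "tree-sitter": ["example-tree-sitter-pattern"],
};

// One flattened list, standing in for the shared KNOWN_PATTERN_TYPES constant.
const KNOWN_PATTERN_TYPES_SKETCH: readonly string[] = Object.values(
  PATTERNS_BY_VERIFIER,
).reduce<string[]>((all, group) => all.concat(group), []);

// Hypothetical stand-in for the extraction pipeline's prompt builder.
function extractionPromptSketch(line: string): string {
  return `Known pattern types: ${KNOWN_PATTERN_TYPES_SKETCH.join(", ")}\nLine: ${line}`;
}

// Hypothetical stand-in for the rubric pipeline's prompt builder.
function rubricPromptSketch(line: string): string {
  return `Known pattern types: ${KNOWN_PATTERN_TYPES_SKETCH.join(", ")}\nRubric for: ${line}`;
}
```

Because both builders read the same constant, adding an entry under any verifier group reaches both prompts at once, and there is no second list left to drift.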
The OpenAI provider made a single fetch and surfaced any non-ok response as an error. A transient rate-limit (429) or upstream outage (503) silently dropped the batch of unparseable lines for that call, even though the failure mode was retriable. Static extraction always covers the same lines on the next run, so the failure went unnoticed.

The provider now retries 429 and 503 responses up to maxAttempts (default 3), with exponential backoff starting at 1000ms or the Retry-After header value when present. Non-transient statuses (400, 401, 500, etc.) still surface immediately because retrying will not fix a misconfigured request.

The provider config grew four injectable knobs (maxAttempts, retryBaseDelayMs, sleep, fetchImpl) so the retry path is testable without a network. New tests cover 429/503 retry, non-retryable errors, exhaustion, and Retry-After honoring.

Closes: a8c5d620
Co-found: tech-debt-reviewer
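The retry policy described above can be sketched as a small wrapper. This is not the provider's actual code: `fetchWithRetrySketch`, `MinimalResponse`, and `RetryOptions` are hypothetical names. The sketch also assumes that Retry-After carries a number of seconds (the HTTP-date form is ignored) and that backoff doubles on each attempt; the commit message does not spell out either detail.

```typescript
// Minimal response shape so the sketch does not depend on global fetch types.
interface MinimalResponse {
  ok: boolean;
  status: number;
  headers: { get(name: string): string | null };
}

// Hypothetical option names mirroring the knobs listed in the commit message.
interface RetryOptions {
  maxAttempts?: number;
  retryBaseDelayMs?: number;
  sleep?: (ms: number) => Promise<void>;
}

// Only these statuses are treated as transient.
const RETRYABLE_STATUSES = new Set([429, 503]);

async function fetchWithRetrySketch(
  fetchImpl: () => Promise<MinimalResponse>,
  options: RetryOptions = {},
): Promise<MinimalResponse> {
  const maxAttempts = options.maxAttempts ?? 3;
  const baseDelayMs = options.retryBaseDelayMs ?? 1000;
  const sleep =
    options.sleep ??
    ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));

  for (let attempt = 1; ; attempt++) {
    const response = await fetchImpl();
    if (response.ok) return response;

    // Non-transient statuses and exhausted budgets surface immediately.
    if (!RETRYABLE_STATUSES.has(response.status) || attempt >= maxAttempts) {
      throw new Error(
        `Request failed with status ${response.status} after ${attempt} attempt(s)`,
      );
    }

    // Prefer Retry-After (seconds form only); otherwise back off exponentially.
    const retryAfterSeconds = Number(response.headers.get("retry-after"));
    const delayMs =
      Number.isFinite(retryAfterSeconds) && retryAfterSeconds > 0
        ? retryAfterSeconds * 1000
        : baseDelayMs * 2 ** (attempt - 1);
    await sleep(delayMs);
  }
}
```

Injecting `fetchImpl` and `sleep` is what lets a test drive the 429, 503, and exhaustion paths with canned responses and no real waiting, which is the testability point the commit makes.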
src/types.ts sat at 293 lines, 7 short of the self-check 300-line limit. The next substantive type addition would have tripped the self-check workflow on the PR that introduced it; src/types.ts is imported by nearly every feature module, so any new type ends up here.

Split into two files:
- src/types-core.ts: instruction-file and rule primitives (Rule, RuleSet, VerificationPattern, RuleCategory, VerifierType, QualifierType, InstructionFileType, MarkdownSection, RuleMatcher, INSTRUCTION_FILE_NAMES)
- src/types-results.ts: verification and report types (Evidence, RuleResult, ReportSummary, AdherenceReport, AgentRun, TaskTemplate, ReportFormat, CrossFileConflict, CrossFileRedundancy, FileAnalysis, ProjectAnalysis, DEFAULT_COMPLIANCE_THRESHOLD)

src/types.ts is now a 39-line barrel that re-exports both. Every existing `from '../types.js'` import keeps working with no call-site changes.

Closes: d2e6f318
Co-found: tech-debt-reviewer
The security review flagged LLM10:2025 unbounded consumption for the opt-in LLM paths. Each path already has an effective per-invocation budget in code, but the budget model was not documented and the review treated it as absent.

- --llm-extract: one OpenAI call per invocation (batch of 50 lines), 3 retry attempts on 429/503 only.
- --rubric-decompose: one OpenAI call per invocation (batch of 20).
- --semantic: bounded by --max-llm-calls (default 20), already enforced by the semantic engine and tested.

Added a SECURITY.md section spelling this out so operators know their worst-case cost ceiling without reading the source.

Closes: L-03-llm10-unbounded-calls
Co-found: security-reviewer
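The per-invocation ceiling implied by the budgets above can be worked out with a few lines of arithmetic. The `PathBudget` shape and `worstCaseRequests` helper are hypothetical, and the sketch makes two assumptions the commit does not confirm: "3 retry attempts" is read as at most 3 HTTP requests in total for the one --llm-extract call, and --rubric-decompose is counted as a single request because no retry budget is stated for it.

```typescript
// Hypothetical budget record built from the numbers in the commit message.
interface PathBudget {
  flag: string;
  callsPerInvocation: number;
  // Assumption: upper bound on HTTP requests issued per logical call.
  maxAttemptsPerCall: number;
}

const llmBudgets: PathBudget[] = [
  { flag: "--llm-extract", callsPerInvocation: 1, maxAttemptsPerCall: 3 },
  { flag: "--rubric-decompose", callsPerInvocation: 1, maxAttemptsPerCall: 1 },
  // Default --max-llm-calls of 20; an operator can set a different bound.
  { flag: "--semantic", callsPerInvocation: 20, maxAttemptsPerCall: 1 },
];

// Sums the worst-case number of HTTP requests for the flags that are enabled.
function worstCaseRequests(enabledFlags: string[]): number {
  return llmBudgets
    .filter((budget) => enabledFlags.includes(budget.flag))
    .reduce(
      (total, budget) =>
        total + budget.callsPerInvocation * budget.maxAttemptsPerCall,
      0,
    );
}

// Under the stated assumptions: 1 * 3 + 1 * 1 + 20 * 1 = 24.
const allFlagsCeiling = worstCaseRequests([
  "--llm-extract",
  "--rubric-decompose",
  "--semantic",
]);
```

The ceiling counts requests, not tokens or cost; batch sizes (50 and 20 lines) bound the input per request but are not modeled here.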
RuleProbe: Instruction Adherence Report (full report: 247 lines)
Summary
Commits
Test plan