Update picomatch to address audit findings #197
Open
dragonnite1221-lgtm wants to merge 5 commits into
…tion (colbymchenry#191)

* feat(mcp): steer agents to explore-first; fix Kotlin/Swift test detection

Two changes from diagnosing why Claude Code's Explore agent wasn't using codegraph_explore on a benchmark run (37 calls / ~90k tokens via search+Read+grep, vs a general-purpose agent that led with explore: 13 calls / ~55k tokens for the same question).

1. Tool guidance reframed across server-instructions.ts, instructions-template.ts, and .cursor/rules/codegraph.mdc (+ the explore/search tool descriptions): codegraph_explore is the workhorse for understanding/architecture/"how does X work" questions. Seed it with the key symbol names (a quick search/context first if the question names nothing concrete), read its output, and fill gaps with node/Read, instead of searching then Reading each file. The old "search first to find names, then explore" wording was short-circuiting: agents searched, got file:line locations, and Read them, never reaching explore.

2. isTestFile now recognizes Kotlin (*Test.kt, jvmTest/commonTest/androidTest source sets), Swift (*Tests.swift), and other camelCase test conventions, so test code is deprioritized in explore/context ranking. Previously only Java/JS/Python were known, letting tests dominate Kotlin/Swift exploration (OkHttp "trace a request" went from 8/9 test files to surfacing Call.kt/OkHttpClient.kt/Request.kt/Response.kt). Capital-led matching keeps latest.kt/manifest.kt unflagged.

An IDF common-term down-weighting was prototyped for the cold-query case but dropped: it was a measured no-op (the "common" terms weren't actually common in the test indexes); the test-detection gap was the real cause.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(agent-eval): add agent-behavior eval harness for codegraph MCP usage

Tooling to measure how a Claude Code agent actually uses the codegraph MCP tools on a real repo (does it lead with codegraph_explore, how many Read/Grep follow-ups, token cost), for validating tool-guidance changes (server-instructions, tool descriptions) against real agent behavior.

- itrun.sh drives the real interactive TUI via tmux (the faithful Explore path). Hardened for unattended runs: type-and-verify prompt delivery (the ❯ glyph is drawn ~6s before the input accepts keys), auto-accepts the "trust this folder" dialog, busy-detection keys on the universal "(Ns · …)" spinner so the pre-stream thinking phase counts as busy, and fails loudly instead of capturing an empty pane.
- parse-session.mjs reports the tool breakdown + token accounting (gen / fresh-in / cached-in / billable) from the session and subagent logs, consistent across main-thread and subagent runs; counts main-thread Bash in the grep verdict.
- run-agent.sh / parse-run.mjs are the headless stream-json complement (exact per-tool tokens/cost via claude -p).
- run-interactive-test.md documents how to run it and how completion is detected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
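The Kotlin/Swift test detection described in the feat(mcp) commit message can be illustrated roughly as follows. This is a minimal sketch, not the actual isTestFile from the repository: the function name `looksLikeTestFile` and the three regular expressions are hypothetical, and it covers only the Kotlin and Swift conventions the message names (not the existing Java/JS/Python rules or the "other camelCase test conventions").

```typescript
// Hypothetical sketch of capital-led test-file matching; not the project's real isTestFile.

// A directory segment named after a Kotlin test source set, e.g. "src/jvmTest/kotlin/...".
const TEST_SOURCE_SET = /(^|[\\/])(jvmTest|commonTest|androidTest)[\\/]/;

// Case-sensitive suffixes: the capital "T" is what keeps "latest.kt" and
// "manifest.kt" from being flagged, since they end in lowercase "test.kt"/"fest.kt".
const KOTLIN_TEST_NAME = /Test\.kt$/;
const SWIFT_TEST_NAME = /Tests\.swift$/;

function looksLikeTestFile(filePath: string): boolean {
  return (
    TEST_SOURCE_SET.test(filePath) ||
    KOTLIN_TEST_NAME.test(filePath) ||
    SWIFT_TEST_NAME.test(filePath)
  );
}
```

Under these assumptions, a path such as `okhttp/src/main/kotlin/okhttp3/Call.kt` is not flagged, while `CallTest.kt`, `RequestTests.swift`, and anything under a `commonTest` directory are.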
Folds all changes since 0.7.10 into 0.7.12 (0.7.11 was unpublished from npm): size-adaptive codegraph_explore output budget (colbymchenry#185/colbymchenry#187), line numbers in explore source sections (colbymchenry#188), explore-first tool guidance (colbymchenry#191), language-neutral source-omission markers, and Kotlin/Swift test-file detection (colbymchenry#191).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Rationale
npm audit reports production advisories for picomatch versions below 4.0.4. This keeps the dependency within the existing major version while clearing the production audit finding.
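As a rough illustration of the constraint in the rationale (stay on the existing 4.x major, at or above 4.0.4), a resolved version string could be checked like this. This is a sketch under stated assumptions: `parseVersion` and `clearsAudit` are hypothetical helpers rather than part of this PR, the check covers only the 4.x line named above, and it ignores pre-release and build suffixes.

```typescript
// Hypothetical helper: reads the leading "major.minor.patch" and ignores any suffix.
function parseVersion(version: string): [number, number, number] | null {
  const match = /^(\d+)\.(\d+)\.(\d+)/.exec(version);
  if (match === null) return null;
  return [Number(match[1]), Number(match[2]), Number(match[3])];
}

// True when the version is on major 4 and at or above 4.0.4,
// the threshold stated in the rationale. Other majors are rejected
// because this sketch assumes the dependency stays on 4.x.
function clearsAudit(version: string): boolean {
  const parsed = parseVersion(version);
  if (parsed === null) return false;
  const [major, minor, patch] = parsed;
  if (major !== 4) return false;
  return minor > 0 || patch >= 4;
}
```

The version string itself would come from the lockfile or from `npm ls picomatch`; re-running `npm audit` remains the authoritative check.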
Tests