Commit 2c1a314

colbymchenry and claude authored
feat(mcp): line numbers in explore output + per-file cluster fixes (#188)
* feat(mcp): line numbers in explore output + per-file cluster fixes

  Follow-up to #185. Three changes to codegraph_explore:

  1. Source sections now carry cat -n style line-number prefixes
     (<num>\t<code>), so the agent can cite file:line straight from the
     payload instead of re-Reading the file just to recover a line number.
     Isolated A/B: the no-line-numbers arm spent 2 Reads + a grep to find a
     line number the line-numbered arm cited with zero follow-up calls.
     Payload cost ~3-5%. Toggle off with CODEGRAPH_EXPLORE_LINENUMS=0.

  2. Per-file cluster selection now ranks clusters containing a query entry
     point ahead of dense declaration blocks. Density-only ranking buried the
     relevant methods (perform/didCreateURLRequest/task in Alamofire's
     Session.swift) under the top-of-file class header + property list.

  3. Whole-file "envelope" nodes (a class/struct/etc. spanning >50% of the
     file) are excluded from clustering. The Session class spans ~1,400
     lines; keeping it collapsed every method into one giant cluster that
     tail-trimmed down to just the class header, hiding the methods.

  Net vs the 0.7.10 baseline, line numbers on: Alamofire -60%,
  Excalidraw -32%, VS Code -12% per explore call.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(mcp): language-neutral omission markers in explore output

  The gap separator and the two tail-trim markers used C-style `//`
  comments, which aren't comments in Python, Ruby, etc. Switch to plain
  `... (gap) ...` / `... (trimmed) ...` so they read correctly inside any
  language's fenced source block. With line numbers on, the line-number
  jump already corroborates a gap.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(mcp): language-neutral truncation marker in codegraph_context

  Sibling to the explore marker fix: codegraph_context's code-block
  truncation used a C-style `// ... truncated ...`. Switch to
  `... (truncated) ...` so it reads correctly in any language's fenced
  source block.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(release): bump version to 0.7.11

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 93e53e7 commit 2c1a314

6 files changed

Lines changed: 150 additions & 27 deletions


CHANGELOG.md

Lines changed: 29 additions & 6 deletions
@@ -9,6 +9,18 @@ and adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
 ## [Unreleased]
 
+### Added
+- **MCP / explore**: `codegraph_explore` source sections now carry line
+  numbers (cat -n style `<num>\t<code>`, matching the Read tool). This lets
+  the agent cite `file:line` straight from the explore payload instead of
+  re-opening the file just to find a line number — the dominant residual
+  cost on precise-tracing questions. In an isolated A/B (answer a
+  "which exact line" question with the relevant code already in the
+  payload), the no-line-numbers arm spent 2 file Reads + a grep recovering
+  the line number while the line-numbered arm answered with zero follow-up
+  tool calls. Payload cost is small (~3-5%). Set
+  `CODEGRAPH_EXPLORE_LINENUMS=0` to disable.
+
 ### Changed
 - **MCP / explore**: `codegraph_explore` output is now adaptive to project
   size. The tool used to apply a fixed 35KB cap regardless of how large the
@@ -22,12 +34,23 @@ and adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
   (<5,000) caps at ~28KB; large (<15,000) keeps the historical ~35KB; very
   large goes up to ~38KB. A new per-file char cap also prevents a single
   file with many adjacent symbols from collapsing into one whole-file dump
-  (the Alamofire `Session.swift` case from #185). Measured against the
-  same repos used in the README benchmark: Alamofire ~62% smaller per call,
-  Excalidraw ~35%, VS Code ~14%. Agent-trust floor still holds — the
-  Relationships section, scored cluster selection, and structured-source
-  output are all retained. Thanks to
-  [@essopsp](https://github.com/essopsp) for the repro.
+  (the Alamofire `Session.swift` case from #185). Per-file cluster
+  selection ranks clusters that contain a query entry point ahead of dense
+  declaration blocks, and whole-file "envelope" nodes (a class/struct that
+  spans most of the file) are excluded from clustering so the methods the
+  query asked about aren't buried under the container's opening lines.
+  Measured against the same repos used in the README benchmark, end state
+  with line numbers on: Alamofire ~60% smaller per call, Excalidraw ~32%,
+  VS Code ~12%. Agent-trust floor still holds — the Relationships section,
+  scored cluster selection, and structured-source output are all retained.
+  Thanks to [@essopsp](https://github.com/essopsp) for the repro.
+
+### Fixed
+- **MCP**: source-omission markers in `codegraph_explore` and
+  `codegraph_context` output are now language-neutral (`... (gap) ...`,
+  `... (trimmed) ...`, `... (truncated) ...`) instead of C-style `//`
+  comments, which were misleading inside Python, Ruby, and other non-C
+  fenced source blocks.
 
 ## [0.7.10] - 2026-05-19
 
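The "Added" entry says the agent can cite `file:line` straight from the payload. As a rough illustration of what that looks like on the consuming side, here is a minimal sketch that scans a numbered payload for a snippet and returns a citation. The helper `citeFromPayload`, the `Citation` type, and the sample Swift lines are all hypothetical and are not part of codegraph; only the `<num>\t<code>` line shape comes from the changelog.

```typescript
// Hypothetical consumer-side helper; not part of codegraph.
interface Citation {
  file: string;
  line: number;
}

// Find the first numbered source line (`<num>\t<code>`) whose code part
// contains `needle`, and return a file:line citation for it.
function citeFromPayload(file: string, payload: string, needle: string): Citation | undefined {
  for (const raw of payload.split('\n')) {
    const match = /^(\d+)\t(.*)$/.exec(raw);
    if (match && match[2]!.includes(needle)) {
      return { file, line: Number(match[1]) };
    }
  }
  return undefined;
}

// Invented sample payload in the cat -n style described above.
const samplePayload = [
  '41\tfunc perform(_ request: Request) {',
  '42\t    task(for: request)',
  '43\t}',
].join('\n');

const citation = citeFromPayload('Session.swift', samplePayload, 'task(for:');
console.log(citation ? `${citation.file}:${citation.line}` : 'not found'); // Session.swift:42
```

Marker lines such as `... (gap) ...` carry no numeric prefix, so a parser of this shape skips them without special handling.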

__tests__/explore-output-budget.test.ts

Lines changed: 43 additions & 0 deletions
@@ -188,4 +188,47 @@ describe('codegraph_explore output respects the adaptive budget', () => {
     const sourceFollowsHeader = text.indexOf('### Source Code') > 0;
     expect(hasRelationships || sourceFollowsHeader).toBe(true);
   });
+
+  it('prefixes source lines with line numbers by default (cat -n style)', async () => {
+    delete process.env.CODEGRAPH_EXPLORE_LINENUMS;
+    const result = await handler.execute('codegraph_explore', { query: 'Session method helper' });
+    const text = result.content?.[0]?.text ?? '';
+    // At least one fenced source line should look like `<digits>\t<code>`.
+    expect(/\n\d+\t/.test(text)).toBe(true);
+  });
+
+  it('omits line numbers when CODEGRAPH_EXPLORE_LINENUMS=0', async () => {
+    process.env.CODEGRAPH_EXPLORE_LINENUMS = '0';
+    try {
+      const result = await handler.execute('codegraph_explore', { query: 'Session method helper' });
+      const text = result.content?.[0]?.text ?? '';
+      // The synthetic source has no tab-prefixed numeric lines of its own,
+      // so none should appear when the toggle is off.
+      expect(/\n\d+\t(?:export| )/.test(text)).toBe(false);
+    } finally {
+      delete process.env.CODEGRAPH_EXPLORE_LINENUMS;
+    }
+  });
+
+  it('uses language-neutral omission markers (no C-style // in the output)', async () => {
+    // The gap/trimmed separators must not assume `//` is a comment — that's
+    // wrong in Python, Ruby, etc. They render inside fenced source blocks.
+    const result = await handler.execute('codegraph_explore', { query: 'Session method helper' });
+    const text = result.content?.[0]?.text ?? '';
+    expect(text).not.toContain('// ... (gap)');
+    expect(text).not.toContain('// ... trimmed');
+  });
+
+  it('does not collapse a whole-file class into just its header (envelope filter)', async () => {
+    // The synthetic `Session` class spans the entire file. Without the
+    // envelope filter it would form one giant cluster that tail-trims to
+    // the class declaration, hiding the methods. Confirm real method bodies
+    // make it into the output. Regression guard for the #185 follow-up.
+    const result = await handler.execute('codegraph_explore', { query: 'Session method helper' });
+    const text = result.content?.[0]?.text ?? '';
+    // A method body line (`methodN(arg: string)`) should appear, not just
+    // the `export class Session {` opener.
+    const hasMethodBody = /method\d+\(arg: string\)/.test(text);
+    expect(hasMethodBody).toBe(true);
+  });
 });

package-lock.json

Lines changed: 2 additions & 2 deletions
Some generated files are not rendered by default.

package.json

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 {
   "name": "@colbymchenry/codegraph",
-  "version": "0.7.10",
+  "version": "0.7.11",
   "description": "Supercharge Claude Code with semantic code intelligence. 94% fewer tool calls • 77% faster exploration • 100% local.",
   "main": "dist/index.js",
   "types": "dist/index.d.ts",

src/context/index.ts

Lines changed: 4 additions & 2 deletions
@@ -1006,9 +1006,11 @@ export class ContextBuilder {
 
 const code = await this.extractNodeCode(node);
 if (code) {
-  // Truncate if too long
+  // Truncate if too long. Language-neutral marker (no `//` — not a
+  // comment in Python, Ruby, etc.); this renders inside a fenced
+  // source block whose language varies.
   const truncated = code.length > maxBlockSize
-    ? code.slice(0, maxBlockSize) + '\n// ... truncated ...'
+    ? code.slice(0, maxBlockSize) + '\n... (truncated) ...'
     : code;
 
   blocks.push({
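To make the effect of the marker change concrete, the following is a small standalone sketch of the same truncate-and-append pattern applied to a Python snippet. The function name `truncateBlock` and the sample input are assumptions for illustration; only the marker string and the `slice(0, maxBlockSize)` shape come from the hunk above.

```typescript
// Marker text taken from the hunk above; the helper name is hypothetical.
const TRUNCATION_MARKER = '\n... (truncated) ...';

// Cut `code` to at most `maxBlockSize` characters and append the marker
// only when something was actually cut.
function truncateBlock(code: string, maxBlockSize: number): string {
  return code.length > maxBlockSize
    ? code.slice(0, maxBlockSize) + TRUNCATION_MARKER
    : code;
}

// Invented Python sample: a `//` marker would read as floor division here.
const pythonSource = 'def greet(name):\n    return f"hello {name}"\n';
console.log(truncateBlock(pythonSource, 16));
// def greet(name):
// ... (truncated) ...
```

Note that slicing by character count can cut mid-line; the sketch keeps that behaviour because the original code does the same.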

src/mcp/tools.ts

Lines changed: 71 additions & 16 deletions
@@ -142,6 +142,38 @@ export function getExploreOutputBudget(fileCount: number): ExploreOutputBudget {
   };
 }
 
+/**
+ * Whether `codegraph_explore` should prefix source lines with their line
+ * numbers (cat -n style: `<num>\t<code>`).
+ *
+ * Line numbers let the agent cite `file:line` straight from the explore
+ * payload instead of re-Reading the file just to find a line number — the
+ * dominant residual cost on precise-tracing questions (#185 follow-up).
+ *
+ * Defaults ON. Set `CODEGRAPH_EXPLORE_LINENUMS=0` to disable (used by the
+ * A/B harness to measure the payload-cost vs. read-savings tradeoff).
+ */
+function exploreLineNumbersEnabled(): boolean {
+  return process.env.CODEGRAPH_EXPLORE_LINENUMS !== '0';
+}
+
+/**
+ * Prefix each line of a source slice with its 1-based line number, matching
+ * the Read tool's `cat -n` convention (number + tab) so the agent treats it
+ * the same way it treats Read output.
+ *
+ * @param slice contiguous source text (already extracted from the file)
+ * @param firstLineNumber the 1-based line number of the slice's first line
+ */
+function numberSourceLines(slice: string, firstLineNumber: number): string {
+  const out: string[] = [];
+  const split = slice.split('\n');
+  for (let i = 0; i < split.length; i++) {
+    out.push(`${firstLineNumber + i}\t${split[i]}`);
+  }
+  return out.join('\n');
+}
+
 /**
  * Mark a Claude session as having consulted MCP tools.
  * This enables Grep/Glob/Bash commands that would otherwise be blocked.
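The off-by-one between 0-based slice indices and 1-based line numbers is easy to get wrong, so here is a small worked example of `numberSourceLines` combined with the padding arithmetic used later in `buildSection`. The function body is restated in a compact form equivalent to the one in the hunk; the sample file contents, the cluster, and the padding of 1 are invented for illustration (the diff uses a padding of 3).

```typescript
// Same behaviour as the function in the hunk above, restated compactly.
function numberSourceLines(slice: string, firstLineNumber: number): string {
  return slice
    .split('\n')
    .map((line, i) => `${firstLineNumber + i}\t${line}`)
    .join('\n');
}

// Invented five-line file and a cluster covering line 4 (1-based, inclusive).
const fileLines = ['import x', '', 'export class A {', '  run() {}', '}'];
const contextPadding = 1;
const cluster = { start: 4, end: 4 };

// Same index arithmetic as buildSection: startIdx is 0-based.
const startIdx = Math.max(0, cluster.start - 1 - contextPadding); // 2
const endIdx = Math.min(fileLines.length, cluster.end + contextPadding); // 5
const slice = fileLines.slice(startIdx, endIdx).join('\n');

const numbered = numberSourceLines(slice, startIdx + 1);
console.log(numbered);
// 3	export class A {
// 4	  run() {}
// 5	}
```

Because `startIdx` is 0-based, the first emitted number is `startIdx + 1`, which is why the padded slice above starts at line 3 rather than line 2.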
@@ -940,10 +972,19 @@ export class ToolHandler {
 // are worth 10, directly-connected nodes 3, peripheral nodes 1, and
 // bare edge-source lines 2 (less than a connected node but more than
 // a peripheral one — they hint at a reference but aren't a definition).
+// Container kinds whose body can span most/all of a file. When such a
+// node covers most of the file we drop it from the ranges: keeping it
+// would merge every method inside it into one giant cluster spanning
+// the whole file, which then tail-trims down to just the container's
+// opening lines (its header/declarations) and buries the methods the
+// query actually asked about (#185 follow-up — Session.swift in
+// Alamofire is the canonical case: the `Session` class spans ~1,400
+// lines). We want the granular symbols inside, not the envelope.
+const ENVELOPE_KINDS = new Set(['file', 'module', 'class', 'struct', 'interface', 'enum', 'namespace', 'protocol', 'trait', 'component']);
 const ranges: Array<{ start: number; end: number; name: string; kind: string; importance: number }> = group.nodes
   .filter(n => n.startLine > 0 && n.endLine > 0)
-  // Skip file/component nodes that span the entire file — they'd create one giant cluster
-  .filter(n => !(n.kind === 'component' && n.startLine === 1 && n.endLine >= fileLines.length - 1))
+  // Drop whole-file envelope nodes (containers covering >50% of the file).
+  .filter(n => !(ENVELOPE_KINDS.has(n.kind) && (n.endLine - n.startLine + 1) > fileLines.length * 0.5))
   .map(n => {
     let importance = 1;
     if (entryNodeIds.has(n.id)) importance = 10;
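The envelope filter can be tried in isolation. The sketch below applies the same predicate to a handful of invented nodes; `RangeNode`, `dropEnvelopes`, and the line numbers are assumptions for illustration (only the kind list, the `> 50%` threshold, and the roughly 1,400-line `Session` class come from the diff).

```typescript
// Minimal node shape for the sketch; the real node type has more fields.
interface RangeNode {
  name: string;
  kind: string;
  startLine: number;
  endLine: number;
}

// Kind list copied from the hunk above.
const ENVELOPE_KINDS = new Set([
  'file', 'module', 'class', 'struct', 'interface',
  'enum', 'namespace', 'protocol', 'trait', 'component',
]);

// Drop container nodes whose span covers more than half of the file.
function dropEnvelopes(nodes: RangeNode[], fileLineCount: number): RangeNode[] {
  return nodes.filter(n => {
    const span = n.endLine - n.startLine + 1;
    return !(ENVELOPE_KINDS.has(n.kind) && span > fileLineCount * 0.5);
  });
}

// Invented example: a 1,500-line file dominated by one class.
const sampleNodes: RangeNode[] = [
  { name: 'Session', kind: 'class', startLine: 10, endLine: 1400 },
  { name: 'perform', kind: 'method', startLine: 900, endLine: 940 },
  { name: 'Options', kind: 'struct', startLine: 1410, endLine: 1430 },
];

console.log(dropEnvelopes(sampleNodes, 1500).map(n => n.name).join(', ')); // perform, Options
```

A small container such as `Options` survives because the test is on span, not on kind alone, and a long function is never dropped because its kind is not in the set.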
@@ -975,12 +1016,13 @@ export class ToolHandler {
 if (ranges.length === 0) continue;
 
 const gapThreshold = budget.gapThreshold;
-const clusters: Array<{ start: number; end: number; symbols: string[]; score: number }> = [];
+const clusters: Array<{ start: number; end: number; symbols: string[]; score: number; maxImportance: number }> = [];
 let current = {
   start: ranges[0]!.start,
   end: ranges[0]!.end,
   symbols: [`${ranges[0]!.name}(${ranges[0]!.kind})`],
   score: ranges[0]!.importance,
+  maxImportance: ranges[0]!.importance,
 };
 
 for (let i = 1; i < ranges.length; i++) {
@@ -989,13 +1031,15 @@ export class ToolHandler {
     current.end = Math.max(current.end, r.end);
     current.symbols.push(`${r.name}(${r.kind})`);
     current.score += r.importance;
+    current.maxImportance = Math.max(current.maxImportance, r.importance);
   } else {
     clusters.push(current);
     current = {
       start: r.start,
       end: r.end,
       symbols: [`${r.name}(${r.kind})`],
       score: r.importance,
+      maxImportance: r.importance,
     };
   }
 }
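The two hunks above show only fragments of the clustering loop, so here is a self-contained sketch of how `score` and `maxImportance` might accumulate per cluster. The merge condition is not visible in the diff: the rule used below (merge when the gap to the current cluster is at most `gapThreshold`) and the up-front sort by start line are assumptions, as are the type names and sample ranges.

```typescript
interface SymbolRange {
  start: number;
  end: number;
  name: string;
  kind: string;
  importance: number;
}

interface Cluster {
  start: number;
  end: number;
  symbols: string[];
  score: number;         // sum of member importances
  maxImportance: number; // highest single member importance
}

function buildClusters(ranges: SymbolRange[], gapThreshold: number): Cluster[] {
  // ASSUMPTION: ranges are processed in ascending start order.
  const sorted = [...ranges].sort((a, b) => a.start - b.start);
  const clusters: Cluster[] = [];
  let current: Cluster | undefined = undefined;
  for (const r of sorted) {
    // ASSUMPTION: the real merge condition is elided in the diff.
    if (current && r.start - current.end <= gapThreshold) {
      current.end = Math.max(current.end, r.end);
      current.symbols.push(`${r.name}(${r.kind})`);
      current.score += r.importance;
      current.maxImportance = Math.max(current.maxImportance, r.importance);
    } else {
      if (current) clusters.push(current);
      current = {
        start: r.start,
        end: r.end,
        symbols: [`${r.name}(${r.kind})`],
        score: r.importance,
        maxImportance: r.importance,
      };
    }
  }
  if (current) clusters.push(current);
  return clusters;
}

// Invented ranges: two adjacent peripheral symbols and one distant entry point.
const demoClusters = buildClusters(
  [
    { start: 1, end: 5, name: 'a', kind: 'property', importance: 1 },
    { start: 7, end: 12, name: 'b', kind: 'property', importance: 1 },
    { start: 100, end: 120, name: 'perform', kind: 'method', importance: 10 },
  ],
  10,
);
console.log(demoClusters.map(c => `${c.start}-${c.end} score=${c.score} max=${c.maxImportance}`));
```

Tracking `maxImportance` separately from `score` matters because a cluster of many importance-1 symbols can reach a higher total score than a cluster holding a single entry point.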
@@ -1005,25 +1049,36 @@ export class ToolHandler {
 // The pathological case (#185): a file like Session.swift where every
 // method is adjacent collapses into one cluster spanning the whole
 // file, and dumping that into the agent's context is most of the
-// token cost on small projects. We pick clusters in score order
-// (importance per line, so we don't prefer one giant low-density
-// cluster over several focused ones) until the per-file char cap is
-// hit. Truly enormous single clusters get tail-trimmed with a marker.
+// token cost on small projects. We pick clusters in priority order
+// until the per-file char cap is hit. Truly enormous single clusters
+// get tail-trimmed with a marker.
 const contextPadding = 3;
+const withLineNumbers = exploreLineNumbersEnabled();
 const buildSection = (c: { start: number; end: number }): string => {
   const startIdx = Math.max(0, c.start - 1 - contextPadding);
   const endIdx = Math.min(fileLines.length, c.end + contextPadding);
-  return fileLines.slice(startIdx, endIdx).join('\n');
+  const slice = fileLines.slice(startIdx, endIdx).join('\n');
+  // startIdx is 0-based, so the slice's first line is line startIdx + 1.
+  return withLineNumbers ? numberSourceLines(slice, startIdx + 1) : slice;
 };
-const GAP_MARKER = '\n\n// ... (gap) ...\n\n';
-
-// Score clusters by score-per-line (density) so a 30-line cluster
-// with two entry symbols outranks a 400-line cluster with two
-// peripheral symbols. Stable tiebreak by score, then by smaller
-// span (cheaper to include).
+// Language-neutral separator (no `//` — not a comment in Python, Ruby,
+// etc.). With line numbers on, the line-number jump also signals the gap.
+const GAP_MARKER = '\n\n... (gap) ...\n\n';
+
+// Rank clusters for inclusion under the per-file cap. Entry-point
+// clusters come first: a cluster containing a query entry point
+// (importance 10) must outrank a dense block of mere declarations,
+// otherwise on a large file like Session.swift the top-of-file class
+// header + property list (many adjacent low-importance nodes, high
+// density) wins the budget and buries the actual methods the query
+// asked about (perform/didCreateURLRequest/task live deep in the
+// file). Within the same importance tier, prefer density (score per
+// line) so we still favor focused clusters over sprawling ones, then
+// smaller span as a cheap-to-include tiebreak.
 const rankedClusters = clusters
   .map((c, i) => ({ idx: i, span: c.end - c.start + 1, c }))
   .sort((a, b) => {
+    if (b.c.maxImportance !== a.c.maxImportance) return b.c.maxImportance - a.c.maxImportance;
     const densityA = a.c.score / a.span;
     const densityB = b.c.score / b.span;
     if (densityB !== densityA) return densityB - densityA;
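To see why the importance tier has to come before density, the sketch below ranks two invented clusters both ways. The comparator mirrors the visible part of the new sort; the final smaller-span tiebreak follows the code comment, since that line is cut off by the hunk, and the density-only comparator is a simplified stand-in for the previous ranking. All names and numbers are illustrative assumptions.

```typescript
interface RankableCluster {
  label: string; // hypothetical, for printing only
  start: number;
  end: number;
  score: number;
  maxImportance: number;
}

const spanOf = (c: RankableCluster): number => c.end - c.start + 1;
const densityOf = (c: RankableCluster): number => c.score / spanOf(c);

// Simplified stand-in for the previous ranking: density only.
function byDensityOnly(a: RankableCluster, b: RankableCluster): number {
  return densityOf(b) - densityOf(a);
}

// New ranking: importance tier first, then density, then smaller span.
function byEntryPointFirst(a: RankableCluster, b: RankableCluster): number {
  if (b.maxImportance !== a.maxImportance) return b.maxImportance - a.maxImportance;
  const densityA = densityOf(a);
  const densityB = densityOf(b);
  if (densityB !== densityA) return densityB - densityA;
  return spanOf(a) - spanOf(b); // ASSUMPTION: tiebreak taken from the comment
}

// Invented clusters: a dense property list and a sparser block of methods
// that contains one entry point (importance 10) plus one connected node (3).
const header: RankableCluster = { label: 'header', start: 1, end: 20, score: 12, maxImportance: 1 };
const methods: RankableCluster = { label: 'methods', start: 900, end: 960, score: 13, maxImportance: 10 };

const oldOrder = [header, methods].sort(byDensityOnly).map(c => c.label).join(', ');
const newOrder = [header, methods].sort(byEntryPointFirst).map(c => c.label).join(', ');
console.log(oldOrder); // header, methods
console.log(newOrder); // methods, header
```

The header cluster has density 12/20 = 0.6 against roughly 0.21 for the methods cluster, so a density-first sort spends the per-file budget on declarations before it reaches the entry point.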
@@ -1064,7 +1119,7 @@ export class ToolHandler {
 // If a single chosen cluster is still oversize (long monolithic
 // function), tail-trim it. Better one trimmed view than nothing.
 if (fileSection.length > budget.maxCharsPerFile) {
-  fileSection = fileSection.slice(0, budget.maxCharsPerFile) + '\n// ... trimmed ...';
+  fileSection = fileSection.slice(0, budget.maxCharsPerFile) + '\n... (trimmed) ...';
   fileTrimmed = true;
 }
 if (chosenIndices.size < clusters.length || fileTrimmed) {
@@ -1094,7 +1149,7 @@ export class ToolHandler {
 if (totalChars + fileSection.length + 200 > budget.maxOutputChars) {
   const remaining = budget.maxOutputChars - totalChars - 200;
   if (remaining < 500) break;
-  const trimmed = fileSection.slice(0, remaining) + '\n// ... trimmed ...';
+  const trimmed = fileSection.slice(0, remaining) + '\n... (trimmed) ...';
 
   lines.push(fileHeader);
   lines.push('');
