fix: resolve all dogfood v2.2.0 bugs#72

Merged: carlos-alm merged 1 commit into main from fix/dogfood-v2.2.0-bugs on Feb 24, 2026

@carlos-alm

Contributor

Summary

Fixes all 4 bugs found in the dogfood report for v2.2.0, plus a search quality improvement:

  • structure . returns empty results — treat . as no filter in structureData() (src/structure.js)
  • Stale embeddings after rebuild — invalidate embeddings when nodes are deleted during build (full + incremental), warn about orphaned embeddings (src/builder.js)
  • embed default model requires HF auth — change default from nomic-v1.5 (gated) to minilm (public), catch auth errors with clear guidance (src/embedder.js, src/cli.js)
  • Cross-language false positive in export — add --min-confidence filter (default 0.5) to DOT/Mermaid/JSON exports (src/export.js, src/cli.js)
  • Search quality — split camelCase/snake_case identifiers into words in embedding text so "build graph" matches buildGraph (src/embedder.js)

Test plan

  • All 476 tests pass
  • Lint clean (biome)
  • Verify codegraph structure . returns full directory tree
  • Verify codegraph embed . works without HF_TOKEN
  • Verify codegraph export excludes low-confidence edges by default
  • Rebuild + re-embed and verify search "build graph" ranks buildGraph higher

- structure: treat `.` as no filter in structureData() (#1)
- builder: invalidate embeddings when nodes are deleted during build,
  warn about orphaned embeddings after rebuild (#2)
- embedder: change default model to minilm (public, no auth required),
  catch auth/download errors with clear guidance (#3)
- embedder: split camelCase/snake_case identifiers in embedding text
  for better search relevance (search quality note)
- export: add --min-confidence filter (default 0.5) to DOT/Mermaid/JSON
  exports, filtering spurious low-confidence edges (#4)
- dogfood report: annotate all bugs as fixed

Impact: 8 functions changed, 8 affected
@claude

claude Bot commented Feb 24, 2026

Claude finished @carlos-alm's task


PR Review: Fix dogfood v2.2.0 bugs

Review Checklist:

  • Read dogfood report to understand original issues
  • Phase 1: Root cause analysis
  • Phase 2: Configuration change detection
  • Phase 3: Test integrity analysis
  • Phase 4: Code quality review
  • Provide final rating and recommendation

Phase 1: Root Cause Analysis

Problem Being Solved: Fixes 4 legitimate bugs found during dogfooding v2.2.0, plus a search quality enhancement. These are real functional issues, not symptoms:

  1. structure . empty results - Logic bug in filter normalization that treated current directory as an invalid filter
  2. Stale embeddings after rebuild - Data consistency issue where node ID changes invalidated embeddings without cleanup
  3. Default embed model auth failure - UX issue where default model required HF_TOKEN, causing crashes for new users
  4. Cross-language false positive in exports - Low-confidence edge (0.3) showing spurious connections in visualizations
  5. Search quality - Enhancement to improve identifier matching by splitting camelCase/snake_case terms

Why This Approach: Each fix addresses root causes appropriately:

  • Structure bug: Proper filter normalization logic
  • Embeddings: Proactive invalidation during build process + user warnings
  • Auth issue: Changed default to public model + graceful error handling
  • False positives: Configurable confidence threshold with sensible default
  • Search: Improved embedding text generation for better semantic matching

Risk Assessment: LOW - All fixes are surgical, well-scoped, and maintain backward compatibility

Phase 2: Configuration Change Detection ✅

NO HIGH-RISK CONFIGURATION CHANGES DETECTED

  • No linting rules disabled or relaxed
  • No test timeouts increased or coverage thresholds lowered
  • No build configuration modifications
  • Default model change (nomic-v1.5 → minilm) is a legitimate fix for auth issues
  • New --min-confidence flag with 0.5 default is appropriate for filtering noise

Phase 3: Test Integrity ✅

NO TEST INTEGRITY ISSUES

  • All 476 tests still pass according to PR description
  • No test files modified in this PR
  • No weakened assertions or removed tests
  • No eslint-disable or test-related workarounds
  • Changes are in production code with proper error handling

Phase 4: Code Quality Review

Structure Fix (src/structure.js:235)

const filterDir = rawDir && normalizePath(rawDir) !== '.' ? rawDir : null;

Excellent: Clean logic that treats . as "no filter". Fixes the exact root cause identified in the dogfood report.
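
To make the behavior of that one-liner concrete, here is a minimal sketch. The `normalizePath` below is a hypothetical stand-in (the project's real helper is not shown on this page), and `resolveFilterDir` is an illustrative wrapper name, not a function from src/structure.js.

```javascript
// Hypothetical stand-in for the project's normalizePath helper: converts
// backslashes to forward slashes, drops leading "./" and trailing slashes.
function normalizePath(p) {
  const cleaned = p
    .replace(/\\/g, '/')     // Windows separators -> POSIX
    .replace(/^(\.\/)+/, '') // strip leading "./"
    .replace(/\/+$/, '');    // strip trailing "/"
  return cleaned === '' ? '.' : cleaned;
}

// Mirrors the fixed line: "." (or a missing value) means "no filter".
function resolveFilterDir(rawDir) {
  return rawDir && normalizePath(rawDir) !== '.' ? rawDir : null;
}

console.log(resolveFilterDir('.'));       // null
console.log(resolveFilterDir('./'));      // null
console.log(resolveFilterDir('src'));     // src
console.log(resolveFilterDir(undefined)); // null
```

Returning `null` rather than `.` lets downstream code skip the directory filter entirely instead of comparing every path against a literal dot.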

Embedding Invalidation (src/builder.js)

Well-designed: invalidation on both build paths, plus a post-build check:

  1. Full builds: Clear entire embeddings table (DELETE FROM embeddings)
  2. Incremental builds: Delete embeddings for affected files before deleting nodes (correct ordering)
  3. Post-build warning: Check for orphaned embeddings and guide user to re-embed

Order matters here - embeddings must be deleted before nodes to maintain referential integrity. Implementation is correct.
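
The ordering described above can be sketched with an in-memory stand-in for the two tables. This is only an illustration under assumptions: the real code in src/builder.js works against SQL tables, and every name below (`createStore`, `purgeFiles`, `countOrphanedEmbeddings`) is hypothetical.

```javascript
// In-memory stand-in for the graph database; the real code uses SQL tables.
function createStore() {
  return {
    nodes: new Map(),      // nodeId -> { file }
    embeddings: new Map(), // nodeId -> vector
  };
}

// Incremental path: drop embeddings for the changed files first, then the nodes.
function purgeFiles(store, changedFiles) {
  const files = new Set(changedFiles);
  const staleIds = [...store.nodes]
    .filter(([, node]) => files.has(node.file))
    .map(([id]) => id);
  for (const id of staleIds) store.embeddings.delete(id); // step 1: embeddings
  for (const id of staleIds) store.nodes.delete(id);      // step 2: nodes
  return staleIds.length;
}

// Post-build check: count embeddings whose node no longer exists.
function countOrphanedEmbeddings(store) {
  let orphans = 0;
  for (const id of store.embeddings.keys()) {
    if (!store.nodes.has(id)) orphans += 1;
  }
  return orphans;
}

const store = createStore();
store.nodes.set(1, { file: 'src/a.js' });
store.nodes.set(2, { file: 'src/b.js' });
store.embeddings.set(1, [0.1, 0.2]);
store.embeddings.set(2, [0.3, 0.4]);

purgeFiles(store, ['src/a.js']);
console.log(store.embeddings.has(1));        // false
console.log(countOrphanedEmbeddings(store)); // 0
```

In a SQL version, the embeddings delete would typically select its targets through the nodes table, so running it after the nodes are gone would find nothing to delete and leave orphans behind.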

Auth Error Handling (src/embedder.js:113-125)

Robust:

  • Changed default from gated nomic-v1.5 to public minilm model
  • Graceful error handling with clear user guidance (HF_TOKEN setup or model switch)
  • Maintains support for gated models for advanced users
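
A rough sketch of the error-handling pattern described above, under stated assumptions: `loadModel` stands for whatever async function fetches the model, `isAuthError` and `loadModelWithGuidance` are hypothetical names, and the substrings used to detect an auth failure are guesses rather than the checks in src/embedder.js.

```javascript
// Assumption: an auth failure surfaces as an error whose message mentions
// a 401/403 status or the words "unauthorized"/"forbidden".
function isAuthError(err) {
  const msg = String(err && err.message ? err.message : err);
  return /\b(401|403)\b|unauthorized|forbidden/i.test(msg);
}

async function loadModelWithGuidance(loadModel, modelName) {
  try {
    return await loadModel(modelName);
  } catch (err) {
    if (isAuthError(err)) {
      throw new Error(
        `Model "${modelName}" appears to require Hugging Face authentication. ` +
          'Set HF_TOKEN, or switch to a public model such as minilm.',
        { cause: err },
      );
    }
    throw err; // unrelated failures propagate unchanged
  }
}

// Usage with a fake loader that simulates a gated model.
const fakeLoader = async (name) => {
  if (name === 'nomic-v1.5') throw new Error('HTTP 401 Unauthorized');
  return { name };
};

loadModelWithGuidance(fakeLoader, 'nomic-v1.5').catch((e) => console.error(e.message));
```

Keeping the original error as `cause` preserves the underlying stack trace for debugging while the user sees only the actionable message.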

Export Confidence Filter (src/export.js:4,12,29,31)

Clean implementation:

  • Sensible default of 0.5 confidence threshold
  • Applied consistently across DOT, Mermaid, JSON formats
  • SQL-level filtering for performance
  • Configurable via --min-confidence flag
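
The effect of the threshold can be illustrated with a small in-memory sketch. The edge shape `{ source, target, confidence }`, the function names, and the inclusive `>=` comparison are all assumptions; the review states that the real filtering happens in SQL.

```javascript
// Keep only edges at or above the threshold (inclusive comparison is an assumption).
function filterEdges(edges, minConfidence = 0.5) {
  return edges.filter((e) => e.confidence >= minConfidence);
}

// Render the surviving edges as a Graphviz DOT digraph.
function toDot(edges, minConfidence = 0.5) {
  const lines = filterEdges(edges, minConfidence).map(
    (e) => `  "${e.source}" -> "${e.target}";`,
  );
  return ['digraph codegraph {', ...lines, '}'].join('\n');
}

const edges = [
  { source: 'src/cli.js', target: 'src/export.js', confidence: 0.9 },
  { source: 'src/app.py', target: 'src/util.js', confidence: 0.3 }, // cross-language guess
];

console.log(toDot(edges));    // only the 0.9 edge passes the default 0.5 cutoff
console.log(toDot(edges, 0)); // both edges are emitted
```

Passing a threshold of 0 reproduces the pre-fix behavior, which is useful when the low-confidence edges themselves are what is being inspected.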

Search Quality Enhancement (src/embedder.js:11-17)

Smart improvement:

function splitIdentifier(name) {
  return name
    .replace(/([a-z])([A-Z])/g, '$1 $2')  // camelCase: buildGraph -> build Graph
    .replace(/([A-Z]+)([A-Z][a-z])/g, '$1 $2')  // acronym boundary: HTTPServer -> HTTP Server
    .replace(/[_-]+/g, ' ')  // snake_case, kebab-case
    .trim();
}

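Handles camelCase, acronym boundaries (HTTPServer), snake_case, and kebab-case; letter/digit boundaries such as sha256Hash are left unsplit. Should improve "build graph" → buildGraph matching.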
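
A few sample inputs show what the quoted function produces. The function is reproduced here only so the snippet runs on its own; the sample identifiers are illustrative and not taken from the project's tests.

```javascript
function splitIdentifier(name) {
  return name
    .replace(/([a-z])([A-Z])/g, '$1 $2')       // lower-to-upper boundary
    .replace(/([A-Z]+)([A-Z][a-z])/g, '$1 $2') // end of an acronym run
    .replace(/[_-]+/g, ' ')                    // underscores and hyphens
    .trim();
}

console.log(splitIdentifier('buildGraph'));        // build Graph
console.log(splitIdentifier('parseHTMLDocument')); // parse HTML Document
console.log(splitIdentifier('snake_case_name'));   // snake case name
console.log(splitIdentifier('kebab-case'));        // kebab case
console.log(splitIdentifier('_private_name'));     // private name
console.log(splitIdentifier('sha256Hash'));        // sha256Hash (digit boundary is not split)
```

Case is preserved in the output, so matching "build graph" against "build Graph" still depends on the embedding model treating the two as close.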

Minor Notes

  • CLI help text: The embed command help mentions "minilm (default)" correctly
  • Model list: Properly shows minilm as default in models command output
  • Error messages: Clear and actionable for auth failures

Security Assessment ✅

  • No command injection risks
  • No SQL injection (uses parameterized queries)
  • No credential exposure
  • Auth token handling is secure (env var only)

Performance Impact ✅

  • Embedding invalidation adds minimal overhead during builds
  • SQL confidence filtering is efficient (indexed)
  • Search quality change doesn't affect performance significantly

Final Recommendation

This is a high-quality bug fix PR that addresses real issues found during dogfooding. Each fix is:

  • ✅ Well-researched and documented in the dogfood report
  • ✅ Surgically targeted at root causes
  • ✅ Backward compatible
  • ✅ Properly tested (all 476 tests pass)
  • ✅ Well-implemented with proper error handling

The changes demonstrate excellent software engineering practices - using the tool on itself to find issues, documenting them thoroughly, and implementing targeted fixes.

Rating: ⭐⭐⭐⭐☆ (4/5)
Action: APPROVE
Reasoning: Solid bug fixes with no concerning patterns. The -1 star is only because this is maintenance rather than new functionality. All fixes address legitimate issues and are well-implemented.


@greptile-apps

greptile-apps Bot commented Feb 24, 2026

Contributor

Greptile Summary

Fixes 4 bugs from dogfood testing and improves search quality by addressing . filter handling, stale embeddings after rebuild, auth-required default model, cross-language false positives in exports, and identifier splitting for better semantic search.

  • Fixed structure . returning empty results by treating . as no filter in structureData()
  • Added embedding invalidation during both full and incremental builds, with orphaned embedding warnings
  • Changed default embedding model from nomic-v1.5 (gated) to minilm (public) and added clear auth error handling
  • Added --min-confidence filter (default 0.5) to DOT/Mermaid/JSON exports to exclude low-confidence false positives
  • Improved search quality by splitting camelCase/snake_case identifiers into words in embedding text

Confidence Score: 5/5

  • Safe to merge - all fixes are well-tested, targeted, and address real bugs
  • All 476 tests pass, changes are surgical fixes for documented bugs with clear before/after behavior, no breaking changes, and the implementation follows established patterns in the codebase
  • No files require special attention

Important Files Changed

| Filename | Overview |
|---|---|
| src/structure.js | Treats . as no filter to fix empty results bug |
| src/builder.js | Invalidates embeddings when nodes are deleted, warns about orphaned embeddings |
| src/embedder.js | Changes default model to public minilm, adds auth error handling, splits identifiers for better search |
| src/export.js | Adds --min-confidence filter (default 0.5) to all export formats |
| src/cli.js | Updates CLI to reflect new defaults and add min-confidence option |

Last reviewed commit: 08057f0

@greptile-apps greptile-apps Bot left a comment

6 files reviewed, no comments

@carlos-alm carlos-alm merged commit 2b3bb05 into main Feb 24, 2026
18 checks passed
@carlos-alm carlos-alm deleted the fix/dogfood-v2.2.0-bugs branch February 24, 2026 02:27