
Fixes for extension .lock database contention and tool improvements to avoid LLM use of grep#119

Merged
data-douser merged 14 commits into main from dd/no-grep-or-bust
Mar 11, 2026

Conversation

@data-douser
Collaborator

@data-douser data-douser commented Mar 10, 2026

Resolves #117, closes #120

Summary

Eliminate the need for LLMs to use grep, find, or direct CLI access when working with the ql-mcp server. Adds new search/discovery tools, rewrites the evaluator log profiler for bounded responses, and fixes database resolution for vscode-codeql storage paths.

New MCP Tools

search_ql_code

Text/regex search across .ql/.qll files with structured JSON results (file paths, line numbers, optional context lines). Replaces grep -rn for QL code search.
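A minimal sketch of the kind of structured result described above, operating on file content that has already been read. The names `QlMatch`, `SearchOptions`, and `searchQlText`, the field names, and the case-insensitive default are assumptions for illustration; the actual tool's schema may differ.

```typescript
// Hypothetical result shape: one entry per matching line.
interface QlMatch {
  filePath: string;
  lineNumber: number; // 1-based
  lineContent: string;
  contextBefore: string[];
  contextAfter: string[];
}

// Hypothetical options; defaults below are assumptions, not the tool's schema.
interface SearchOptions {
  pattern: string;
  isRegex?: boolean;
  caseSensitive?: boolean;
  contextLines?: number;
}

// Escape regex metacharacters so a plain-text pattern is matched literally.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function searchQlText(filePath: string, content: string, options: SearchOptions): QlMatch[] {
  const { pattern, isRegex = false, caseSensitive = false, contextLines = 0 } = options;
  const source = isRegex ? pattern : escapeRegExp(pattern);
  // No "g" flag: test() on a non-global RegExp keeps no state between calls.
  const regex = new RegExp(source, caseSensitive ? "" : "i");
  // Normalize Windows line endings before splitting into lines.
  const lines = content.replace(/\r\n/g, "\n").split("\n");
  const matches: QlMatch[] = [];
  lines.forEach((line, index) => {
    if (!regex.test(line)) return;
    matches.push({
      filePath,
      lineNumber: index + 1,
      lineContent: line,
      contextBefore: lines.slice(Math.max(0, index - contextLines), index),
      contextAfter: lines.slice(index + 1, index + 1 + contextLines),
    });
  });
  return matches;
}
```

Returning line numbers alongside the path is what lets a caller follow up with a targeted file read instead of a second search.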

codeql_resolve_files

Find files by extension and glob patterns in directory trees, wrapping codeql resolve files. Replaces find + grep for file discovery in library packs.

Improved Tools

profile_codeql_query_from_logs — two-tier response design

  • Tier 1 (inline): compact JSON — name, duration, resultSize, evalOrder, strategy, dependency count per predicate. Always small.
  • Tier 2 (detail file): line-indexed text file with full RA operations, pipeline-stage tuple progressions, and dependency lists. Each inline predicate includes {startLine, endLine} for targeted read_file access.
  • Parser enhanced to capture RA steps from PREDICATE_STARTED events and per-pipeline timing/tuple counts from PIPELINE_STARTED/PIPELINE_COMPLETED events.
  • Properly handles single-query (codeql query run) and multi-query (codeql database analyze) evaluator logs.
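The two-tier idea can be sketched roughly as follows: detail lines are appended to one text buffer, and each inline entry records the 1-based line range it occupies. `PredicateProfile`, `InlinePredicate`, and `buildTwoTierResponse` are hypothetical names, the detail layout is invented for illustration, and strategy and pipeline-stage data are omitted. The `detailLines: { start, end }` field follows the naming a later commit in this PR settles on.

```typescript
// Hypothetical parsed-predicate shape (a subset of what the parser captures).
interface PredicateProfile {
  name: string;
  durationMs: number;
  resultSize: number;
  evalOrder: number;
  raSteps: string[];
  dependencies: string[];
}

// Tier 1 entry: small, fixed-size fields plus a pointer into the detail file.
interface InlinePredicate {
  name: string;
  durationMs: number;
  resultSize: number;
  evalOrder: number;
  dependencyCount: number;
  detailLines: { start: number; end: number }; // 1-based, inclusive
}

function buildTwoTierResponse(predicates: PredicateProfile[]): {
  inline: InlinePredicate[];
  detailText: string;
} {
  const detail: string[] = [];
  const inline: InlinePredicate[] = [];
  for (const p of predicates) {
    const start = detail.length + 1; // first line this predicate will occupy
    detail.push(`== ${p.name} ==`);
    detail.push(`dependencies: ${p.dependencies.join(", ") || "(none)"}`);
    for (const step of p.raSteps) detail.push(`  ${step}`);
    const end = detail.length; // last line this predicate occupies
    inline.push({
      name: p.name,
      durationMs: p.durationMs,
      resultSize: p.resultSize,
      evalOrder: p.evalOrder,
      dependencyCount: p.dependencies.length,
      detailLines: { start, end },
    });
  }
  return { inline, detailText: detail.join("\n") };
}
```

The inline tier stays bounded because it never embeds RA text; its size grows only with the number of predicates.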

codeql_resolve_database — parent directory probing

When given a directory that isn't itself a database (e.g. a vscode-codeql storage path like .../advanced-security-codeql-sap-js), probes immediate children for codeql-database.yml and resolves to the actual database subdirectory (e.g. javascript/).
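A rough sketch of the probing step, assuming a hypothetical `resolveDatabasePath` helper. Returning the input unchanged when no child matches is an assumption; throwing when several children match reflects the ambiguity handling added by a later commit in this PR.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

const DB_MARKER = "codeql-database.yml";

// Hypothetical helper: resolve a directory to the CodeQL database it contains.
function resolveDatabasePath(dir: string): string {
  // The directory is already a database root.
  if (fs.existsSync(path.join(dir, DB_MARKER))) return dir;

  // Otherwise probe immediate child directories only (no deep recursion).
  const candidates = fs
    .readdirSync(dir, { withFileTypes: true })
    .filter((entry) => entry.isDirectory())
    .map((entry) => path.join(dir, entry.name))
    .filter((child) => fs.existsSync(path.join(child, DB_MARKER)));

  const [first, ...rest] = candidates;
  // Assumption: with no match, hand the path back and let the CLI report the error.
  if (first === undefined) return dir;
  if (rest.length > 0) {
    throw new Error(`Ambiguous database path ${dir}: found ${candidates.join(", ")}`);
  }
  return first;
}
```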

Extension: Database Lock Avoidance

  • DatabaseCopier class copies databases from vscode-codeql storage to a managed directory, removing .lock files to prevent contention with the query server.
  • EnvironmentBuilder uses the copier when copyDatabases is enabled; platform-native path delimiters for env vars.
  • New getManagedDatabaseStoragePath in StoragePaths.
  • New copyDatabases extension setting (default: true).
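The copy-then-strip approach might look roughly like the sketch below. `copyDatabaseWithoutLocks` and `removeLockFiles` are hypothetical stand-ins rather than the actual `DatabaseCopier` API, there is no incremental sync, and matching any file name ending in `.lock` is an assumption about how lock files are named.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Recursively delete lock files under dir; returns how many were removed.
function removeLockFiles(dir: string): number {
  let removed = 0;
  for (const name of fs.readdirSync(dir)) {
    const full = path.join(dir, name);
    const stat = fs.lstatSync(full); // lstat: never follow symlinks out of the copy
    if (stat.isSymbolicLink()) continue;
    if (stat.isDirectory()) {
      removed += removeLockFiles(full);
    } else if (name.endsWith(".lock")) {
      fs.rmSync(full);
      removed += 1;
    }
  }
  return removed;
}

// Copy a database into a managed location, then strip lock files from the copy.
// The source database is only read, so the query server's locks stay in place.
function copyDatabaseWithoutLocks(source: string, destination: string): number {
  fs.rmSync(destination, { recursive: true, force: true });
  fs.mkdirSync(path.dirname(destination), { recursive: true });
  fs.cpSync(source, destination, { recursive: true });
  return removeLockFiles(destination);
}
```

Working on a copy means the CLI never competes with the query server for the original database's cache directory, at the cost of extra disk space.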

Prompt & Resource Updates

  • All grep and CLI command references removed from prompts
  • codeql_generate_log-summary de-emphasized; profile_codeql_query_from_logs is the primary evaluator log analysis tool
  • New "Discover and Search QL Code" workflow in server-tools.md
  • search_ql_code and codeql_resolve_files added to tool reference tables in all relevant prompts

Cross-Platform

  • \r\n → \n normalization in evaluator log parser and search tool
  • path.delimiter for environment variable list parsing
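For illustration, both cross-platform fixes reduce to small helpers like these. The function names are hypothetical; `path.delimiter` is Node's platform list separator (`;` on Windows, `:` on POSIX).

```typescript
import * as path from "node:path";

// Convert Windows CRLF line endings to LF so line-based parsing behaves the same everywhere.
function normalizeLineEndings(text: string): string {
  return text.replace(/\r\n/g, "\n");
}

// Split an environment-variable path list using the platform delimiter.
// The delimiter parameter exists only so both platforms can be exercised in tests.
function parsePathList(value: string | undefined, delimiter: string = path.delimiter): string[] {
  if (!value) return [];
  return value
    .split(delimiter)
    .map((entry) => entry.trim())
    .filter((entry) => entry.length > 0);
}

// Join directories back into a single environment-variable value.
function joinPathList(dirs: readonly string[], delimiter: string = path.delimiter): string {
  return dirs.join(delimiter);
}
```

Splitting on a hard-coded `:` would break Windows paths at the drive letter (`C:\dbs`), which is why the platform delimiter matters here.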

Tests

  • Parser: 42 tests (up from 29) — new RA steps, pipeline stages, Windows line endings, real multi-query fixture
  • Profile tool: 12 tests — two-tier response, multi-query (database analyze pattern), detail file, eval order
  • search_ql_code: 18 tests — text/regex search, case sensitivity, context lines, truncation, file extensions, Windows line endings
  • resolve_files: 8 tests — tool definition, schema, result processor
  • codeql-tools: registration assertions for new tools
  • Resources: new tool documentation assertions
  • Extension: e2e tool list assertions, database copier tests, copydb integration test
  • Client integration test fixtures for both new tools

Resolves #117

Fixes a known compatibility issue for databases added, and therefore
locked, via the GitHub.vscode-codeql extension.

The vscode-codeql query server creates .lock files in the cache
directory of every registered CodeQL database, preventing the ql-mcp
server from running CLI commands (codeql_query_run,
codeql_database_analyze) against those same databases.

Add a DatabaseCopier that syncs databases from vscode-codeql storage
into a managed directory under the `vscode-codeql-development-mcp-server`
extension's globalStorage, stripping .lock files from the copy. The
EnvironmentBuilder now sets CODEQL_DATABASES_BASE_DIRS to this managed
directory by default (configurable via codeql-mcp.copyDatabases).

- New DatabaseCopier class with incremental sync (skips unchanged databases)
- StoragePaths.getManagedDatabaseStoragePath() for the managed databases/ dir
- EnvironmentBuilder accepts injectable DatabaseCopierFactory for testability
- codeql-mcp.copyDatabases setting (default: true)
- 11 unit tests for DatabaseCopier (real filesystem operations)
- 15 unit tests for EnvironmentBuilder (updated for copy mode + fallback)
- 3 bridge integration tests (managed dir structure, no .lock files)
- 4 E2E integration tests: inject .lock → copy → codeql_query_run +
  codeql_database_analyze succeed against the lock-free copy

Add search_ql_code and codeql_resolve_files tools to eliminate
grep/CLI dependencies.

- New tools: search_ql_code (QL text/regex search) and codeql_resolve_files
  (file discovery by extension/glob) so LLMs never need shell access
- Rewrite profile_codeql_query_from_logs with two-tier design: compact
  inline JSON + line-indexed detail file for targeted read_file access;
  parser now captures RA operations and pipeline-stage tuple progressions
- Fix codeql_resolve_database to probe child directories for databases
- Remove all grep/CLI references from prompts and resources
- Cross-platform: normalize \r\n line endings in parser and search tool
@data-douser data-douser requested review from a team and enyil as code owners March 10, 2026 14:24
Copilot AI review requested due to automatic review settings March 10, 2026 14:24
@github-actions
Contributor

github-actions bot commented Mar 10, 2026

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Scanned Files

None

Comment thread server/src/tools/codeql/search-ql-code.ts Fixed
Contributor

Copilot AI left a comment


Pull request overview

This PR expands the CodeQL MCP server’s developer ergonomics by adding QL source discovery/search tools, improving evaluator-log profiling output for LLM consumption, and updating discovery/environment handling (including a VS Code “copy databases” workflow to avoid .lock contention).

Changes:

  • Add search_ql_code (in-process QL grep) and codeql_resolve_files (CLI wrapper for codeql resolve files) and register/document them.
  • Refactor profile_codeql_query_from_logs to return compact structured JSON and write a line-indexed detail file (RA steps, pipeline stages, deps).
  • Switch CODEQL_*_DIRS parsing/joining to path.delimiter, and add VS Code copyDatabases support + tests (managed lock-free database copies).

Reviewed changes

Copilot reviewed 39 out of 41 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| server/test/src/tools/codeql/search-ql-code.test.ts | Unit tests for searchQlCode() behavior (regex, context, truncation, CRLF). |
| server/test/src/tools/codeql/resolve-files.test.ts | Tests the codeql_resolve_files CLI tool definition and result processor. |
| server/test/src/tools/codeql/profile-codeql-query-from-logs.test.ts | Updates assertions for new JSON response + adds multi-query fixture coverage. |
| server/test/src/tools/codeql-tools.test.ts | Ensures new tools are registered and updates expected tool count. |
| server/test/src/lib/resources.test.ts | Verifies server-tools resource documents the new tools. |
| server/test/src/lib/evaluator-log-parser.test.ts | Adds coverage for RA steps/pipeline stages parsing and CRLF handling. |
| server/src/tools/codeql/search-ql-code.ts | Implements + registers search_ql_code tool. |
| server/src/tools/codeql/resolve-files.ts | Adds codeql_resolve_files CLI tool definition. |
| server/src/tools/codeql/profile-codeql-query-from-logs.ts | Two-tier profiler output: compact JSON + line-indexed detail file. |
| server/src/tools/codeql/index.ts | Exports new CodeQL tool registrations/definitions. |
| server/src/tools/codeql-tools.ts | Registers new tools with the MCP server. |
| server/src/resources/server-tools.md | Documents search_ql_code, codeql_resolve_files, and updated profiler behavior. |
| server/src/resources/performance-patterns.md | Updates performance workflow guidance for the new profiler output model. |
| server/src/prompts/tools-query-workflow.prompt.md | Adds the new tools to the “query workflow” prompt tool table. |
| server/src/prompts/ql-tdd-basic.prompt.md | Updates perf step to use profile_codeql_query_from_logs. |
| server/src/prompts/ql-tdd-advanced.prompt.md | Adds guidance for search_ql_code and updates profiling workflow text. |
| server/src/prompts/ql-lsp-iterative-development.prompt.md | Replaces “grep” guidance with search_ql_code. |
| server/src/prompts/explain-codeql-query.prompt.md | Replaces CLI-grep guidance with MCP tool-based analysis guidance. |
| server/src/lib/evaluator-log-parser.ts | Adds RA steps + pipeline stage tuple progression parsing; CRLF normalization. |
| server/src/lib/discovery-config.ts | Uses path.delimiter for CODEQL_*_DIRS parsing and updates docs. |
| server/src/lib/cli-tool-registry.ts | Adds codeql_resolve_files handling and improves codeql_resolve_database path probing. |
| extensions/vscode/test/suite/workspace-scenario.integration.test.ts | Updates expectations for managed database directories and delimiter splitting. |
| extensions/vscode/test/suite/mcp-tool-e2e.integration.test.ts | Asserts the new tools are available from the server. |
| extensions/vscode/test/suite/copydb-e2e.integration.test.ts | New E2E suite exercising copyDatabases against a real DB + CLI tools. |
| extensions/vscode/test/suite/bridge.integration.test.ts | Updates env var delimiter splitting and managed DB expectations. |
| extensions/vscode/test/bridge/storage-paths.test.ts | Tests getManagedDatabaseStoragePath(). |
| extensions/vscode/test/bridge/environment-builder.test.ts | Tests copyDatabases default/disabled paths and copier integration. |
| extensions/vscode/test/bridge/database-copier.test.ts | New unit tests for managed DB copying + .lock removal behavior. |
| extensions/vscode/src/bridge/storage-paths.ts | Adds managed database storage path helper. |
| extensions/vscode/src/bridge/environment-builder.ts | Implements copyDatabases flow + uses platform delimiter when building env vars. |
| extensions/vscode/src/bridge/database-copier.ts | Adds database copy/sync logic with .lock removal. |
| extensions/vscode/package.json | Adds codeql-mcp.copyDatabases setting and reorganizes configuration entries. |
| extensions/vscode/esbuild.config.js | Ensures new copydb E2E suite is bundled for test runs. |
| client/integration-tests/primitives/tools/search_ql_code/search_predicate_name/before/monitoring-state.json | Adds integration test fixture (currently not in standard monitoring-state shape). |
| client/integration-tests/primitives/tools/search_ql_code/search_predicate_name/after/monitoring-state.json | Adds integration test fixture (currently mixes tool output into monitoring-state). |
| client/integration-tests/primitives/tools/search_ql_code/search_predicate_name/README.md | Documents the new integration test scenario. |
| client/integration-tests/primitives/tools/codeql_resolve_files/find_qll_files/before/monitoring-state.json | Adds integration test fixture (currently not in standard monitoring-state shape). |
| client/integration-tests/primitives/tools/codeql_resolve_files/find_qll_files/after/monitoring-state.json | Adds integration test fixture (currently not in standard monitoring-state shape). |
| client/integration-tests/primitives/tools/codeql_resolve_files/find_qll_files/README.md | Documents the new integration test scenario (name/extension mismatch). |

Comment thread server/src/tools/codeql/profile-codeql-query-from-logs.ts Outdated
Comment thread server/src/tools/codeql/search-ql-code.ts
@data-douser data-douser marked this pull request as draft March 10, 2026 14:41
@data-douser data-douser self-assigned this Mar 10, 2026
@data-douser data-douser added documentation Improvements or additions to documentation enhancement New feature or request javascript Pull requests that update javascript code server labels Mar 10, 2026
@data-douser data-douser changed the title Add search_ql_code and codeql_resolve_files tools Add search_ql_code and codeql_resolve_files tools and improve profile_codeql_query_from_logs tool Mar 10, 2026
@data-douser data-douser changed the title Add search_ql_code and codeql_resolve_files tools and improve profile_codeql_query_from_logs tool Fixes for extension .lock database contention and tool improvements to avoid LLM use of grep Mar 10, 2026
- Eliminate filesystem race condition in search-ql-code.ts (read-then-check
  instead of stat-then-read)
- Add symlink cycle detection using lstatSync and visited-path tracking
- Fix tool description field names in profile-codeql-query-from-logs.ts
  ({startLine,endLine} → detailLines: {start,end})
- Fix monitoring-state.json fixtures to use standard sessions format
- Rename find_qll_files → find_ql_files to match actual .ql extension
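One way to combine `lstatSync` with visited-path tracking is sketched below. `collectQlFiles` is a hypothetical name, and keying the visited set on `realpathSync` output is an assumption about how the cycle check could work, not a description of the merged code.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Walk a directory tree collecting .ql/.qll files without looping on symlink cycles.
function collectQlFiles(dir: string, visited: Set<string> = new Set<string>()): string[] {
  // Track canonical paths so a directory reached through a symlink is entered only once.
  const realDir = fs.realpathSync(dir);
  if (visited.has(realDir)) return [];
  visited.add(realDir);

  const found: string[] = [];
  for (const name of fs.readdirSync(dir).sort()) {
    const full = path.join(dir, name);
    // lstat reports the link itself, so symlinks can be detected before following them.
    const linkStat = fs.lstatSync(full);
    let stat: fs.Stats = linkStat;
    if (linkStat.isSymbolicLink()) {
      try {
        stat = fs.statSync(full); // follow the link to see what it points at
      } catch {
        continue; // broken symlink: skip it
      }
    }
    if (stat.isDirectory()) {
      found.push(...collectQlFiles(full, visited));
    } else if (stat.isFile() && /\.(ql|qll)$/.test(name)) {
      found.push(full);
    }
  }
  return found;
}
```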
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 43 out of 45 changed files in this pull request and generated 4 comments.

Comment thread server/src/tools/codeql/search-ql-code.ts Outdated
Comment thread server/src/tools/codeql/search-ql-code.ts
Comment thread extensions/vscode/src/bridge/database-copier.ts
Comment thread extensions/vscode/src/bridge/environment-builder.ts
- addresses latest review feedback for PR #119
- search-ql-code: check file size via lstatSync before reading; stream
  large files (>5 MB) line-by-line instead of skipping them
- evaluator-log-parser: replace readFileSync with streaming async
  generator (createReadStream + readline) for brace-depth JSON parsing;
  parseEvaluatorLog now reads the file once instead of twice
- profile-codeql-query: convert local parser to streaming with Map-based
  lookups instead of O(n) events.find()
- database-copier: use lstat in removeLockFiles to skip symlinks; throw
  on fatal mkdir failures for proper fallback in EnvironmentBuilder
- Validate contextLines/maxResults with schema bounds and clamping
- Add environment-builder test for syncAll-throws fallback
- search_ql_code: add missing await in tool handler; skip .codeql,
  node_modules, and .git directories to avoid duplicate results from
  compiled pack caches
- cli-tool-registry: extract resolveDatabasePath helper for multi-language
  DB root auto-resolution; apply to codeql_query_run, codeql_database_analyze,
  and codeql_resolve_database
- environment-builder: route CODEQL_MCP_TMP_DIR to workspace-local
  .codeql/ql-mcp scratch directory (configurable via scratchDir setting);
  add CODEQL_MCP_WORKSPACE_FOLDERS env var
- query-file-finder: add contextual hints array for missing tests,
  documentation, and expected results
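The brace-depth idea behind the streaming parser can be sketched as a generator over lines. This version is synchronous and takes any `Iterable<string>` so it is easy to test; the PR describes an async generator fed by `createReadStream` + `readline`. `splitJsonObjects` is a hypothetical name, and the sketch assumes each top-level JSON object starts on its own line.

```typescript
// Group lines into complete top-level JSON objects by tracking brace depth.
// Braces inside string literals are ignored so RA text containing "{" or "}" is safe.
function* splitJsonObjects(lines: Iterable<string>): Generator<string> {
  let depth = 0;
  let buffer: string[] = [];
  for (const rawLine of lines) {
    const line = rawLine.replace(/\r$/, ""); // tolerate CRLF input
    if (depth === 0 && line.trim() === "") continue; // blank separator between objects
    buffer.push(line);

    let inString = false; // JSON strings cannot span lines, so reset per line
    let escaped = false;
    for (let i = 0; i < line.length; i++) {
      const ch = line.charAt(i);
      if (escaped) {
        escaped = false; // this character was escaped by a preceding backslash
      } else if (inString) {
        if (ch === "\\") escaped = true;
        else if (ch === '"') inString = false;
      } else if (ch === '"') {
        inString = true;
      } else if (ch === "{") {
        depth += 1;
      } else if (ch === "}") {
        depth -= 1;
      }
    }

    if (depth === 0) {
      yield buffer.join("\n"); // one complete object, ready for JSON.parse
      buffer = [];
    }
  }
}
```

Because only the lines of the object currently being assembled are buffered, memory use is bounded by the largest single event rather than by the size of the log.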
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 46 out of 48 changed files in this pull request and generated 2 comments.

Comment thread server/src/tools/codeql/profile-codeql-query-from-logs.ts
Comment thread server/src/tools/codeql/search-ql-code.ts
Copilot AI and others added 2 commits March 10, 2026 16:30
…h via exponential backoff retry (#121)

* Initial plan

* fix: add retry logic with exponential backoff to install-packs.sh

The GitHub Actions integration test was failing on windows-latest with
HTTP 503 "Egress is over the account limit" when downloading CodeQL
packs from GHCR.io.

Add a run_with_retry() helper function that retries a command up to 3
times with exponential backoff (10s, 20s, 40s). Both codeql pack
install calls in install_packs() now use run_with_retry to handle
transient network errors gracefully.

Co-authored-by: data-douser <70299490+data-douser@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: data-douser <70299490+data-douser@users.noreply.github.com>
- addresses latest feedback for PR #119
- profile-codeql-query-from-logs: remove non-deterministic `Generated:`
  timestamp from detail file header to ensure reproducible output for
  integration test fixtures
- search-ql-code: early-exit file processing once maxResults matches are
  collected; subsequent files are scanned cheaply for totalMatches count
  only, avoiding large array allocations and context extraction
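A simplified sketch of the truncation behavior described above: results stop accumulating at the limit while the match count keeps running. `searchWithLimit`, the in-memory `SourceFile` input, and the upper bound of 1000 are assumptions; the real tool reads from disk and its schema bounds are not stated here.

```typescript
interface SourceFile {
  filePath: string;
  content: string;
}

interface Hit {
  filePath: string;
  lineNumber: number; // 1-based
  lineContent: string;
}

interface SearchSummary {
  results: Hit[];
  totalMatches: number;
  truncated: boolean;
}

const MAX_RESULTS_UPPER_BOUND = 1000; // hypothetical schema bound

// Clamp a caller-supplied number into [min, max]; non-finite input falls back to min.
function clamp(value: number, min: number, max: number): number {
  if (!Number.isFinite(value)) return min;
  return Math.min(max, Math.max(min, Math.trunc(value)));
}

function searchWithLimit(files: readonly SourceFile[], pattern: RegExp, maxResults: number): SearchSummary {
  const limit = clamp(maxResults, 1, MAX_RESULTS_UPPER_BOUND);
  // Drop g/y flags so test() does not carry lastIndex state between lines.
  const matcher = new RegExp(pattern.source, pattern.flags.replace(/[gy]/g, ""));
  const results: Hit[] = [];
  let totalMatches = 0;
  for (const file of files) {
    file.content.split(/\r?\n/).forEach((line, index) => {
      if (!matcher.test(line)) return;
      totalMatches += 1;
      // Past the limit, only the counter advances: no result objects are allocated.
      if (results.length < limit) {
        results.push({ filePath: file.filePath, lineNumber: index + 1, lineContent: line });
      }
    });
  }
  return { results, totalMatches, truncated: totalMatches > results.length };
}
```

Reporting `totalMatches` alongside a truncated result list tells the caller whether narrowing the pattern is worthwhile.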
Copilot AI review requested due to automatic review settings March 10, 2026 22:32
Comment thread server/src/tools/codeql/search-ql-code.ts Fixed
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 47 out of 49 changed files in this pull request and generated 2 comments.

Comment thread server/src/tools/codeql/search-ql-code.ts Outdated
Comment thread server/src/lib/cli-tool-registry.ts
- search-ql-code: use streaming (readline) for totalMatches counting on
  large files in the early-exit path; eliminates TOCTOU race from prior
  lstatSync check
- cli-tool-registry: resolveDatabasePath now collects all candidate
  children and throws on ambiguity instead of silently picking the first
- Add tests for cross-file totalMatches accuracy under truncation, single-
  child DB auto-resolve, and multi-child DB ambiguity error
Copilot AI review requested due to automatic review settings March 10, 2026 23:14
Comment thread server/src/tools/codeql/search-ql-code.ts Fixed
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 48 out of 50 changed files in this pull request and generated 2 comments.

Comment thread server/src/tools/codeql/search-ql-code.ts
Comment thread extensions/vscode/test/suite/copydb-e2e.integration.test.ts
@data-douser data-douser marked this pull request as ready for review March 11, 2026 00:13
Copilot AI review requested due to automatic review settings March 11, 2026 00:13
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 49 out of 51 changed files in this pull request and generated 2 comments.

Comment thread extensions/vscode/test/suite/copydb-e2e.integration.test.ts
Comment thread server/src/tools/codeql/search-ql-code.ts Outdated
@data-douser data-douser merged commit 3bd0471 into main Mar 11, 2026
26 checks passed
@data-douser data-douser deleted the dd/no-grep-or-bust branch March 11, 2026 11:48

4 participants