
Bugs and design improvements found during ql-mcp evaluation session (v2.25.1-next.2) #198

@data-douser

Summary

During a thorough evaluation of the ql-mcp server primitives against 3 CodeQL databases and 4 query packs (UI5, CAP, XSJS, standard JS), which generated 11 SARIF files with 1,101 total findings, several bugs and design improvement opportunities were identified. This issue tracks resolution of all items for the v2.25.1-next.2 release.

Evaluation context: JavaScript language, advanced-security/codeql-sap-js custom queries, CodeQL CLI v2.25.1, macOS.


Bugs

Bug 1: bqrs_interpret is unusable through the MCP interface (Critical)

The codeql_bqrs_interpret tool has multiple schema/parameter issues that make it effectively non-functional:

  1. file parameter type mismatch: When passed as a string, the tool rejects with "must be array". When passed as an array ["/path/to/file.bqrs"], the literal bracket characters are included in the file path, causing: No results file present at /Users/.../["/path/to/file.bqrs"]
  2. -t parameter undocumented format: The underlying CLI requires KEY=VALUE format (e.g. -t id=js/query-id) but the tool schema provides no guidance on this. Passing just a string like "problem" fails with: Value for option '-t' should be in KEY=VALUE format but was problem
  3. Missing required database parameter: The tool needs a database path for source archive context during SARIF interpretation, but this isn't exposed as a required parameter in the schema.

Impact: bqrs_interpret is the only BQRS-to-SARIF primitive available standalone (outside of database_analyze/query_run). Its broken state forces workarounds.

Suggested fix: Fix the file parameter to accept a single string path. Document the -t parameter format. Add database as an optional parameter for source archive resolution.
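The suggested fix can be sketched as corrected argument construction. This is a minimal illustration, not the actual ql-mcp implementation: the parameter names, the metadata-to--t mapping, and the use of --source-archive with the database's src.zip are assumptions about how the pieces would fit together.

```typescript
// Hypothetical corrected parameter shape for bqrs_interpret (names assumed):
interface BqrsInterpretParams {
  file: string;                      // single .bqrs path, not an array
  metadata: Record<string, string>;  // rendered as repeated -t KEY=VALUE flags
  database?: string;                 // optional: source archive context for SARIF
}

function buildInterpretArgs(p: BqrsInterpretParams): string[] {
  const args = ["bqrs", "interpret", p.file];
  for (const [key, value] of Object.entries(p.metadata)) {
    // The CLI requires KEY=VALUE format for -t
    args.push("-t", `${key}=${value}`);
  }
  if (p.database) {
    // Assumed mapping from database path to its source archive
    args.push(`--source-archive=${p.database}/src.zip`);
  }
  return args;
}
```

A call like buildInterpretArgs({ file: "/tmp/r.bqrs", metadata: { id: "js/q", kind: "problem" } }) would then yield a well-formed CLI invocation instead of a bracketed path.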


Bug 2: query_results_cache_compare reports incorrect totalResultCount

Reproduction:

  1. Run query_run for UI5Clickjacking.ql on a database → cached with a key
  2. query_results_cache_retrieve with that cache key → returns totalResults: 82
  3. query_results_cache_compare with queryName: "UI5Clickjacking" → reports totalResultCount: 0 for both databases ❌

The compare tool is not properly reading the result count from the cached SARIF data.

Suggested fix: Ensure cache_compare reads totalResults from the stored SARIF subset the same way cache_retrieve does.
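The intended behavior can be sketched as follows, assuming cached entries carry the same totalResults field that cache_retrieve reads (entry shape and matching-by-substring are assumptions):

```typescript
// Hypothetical cached-entry shape mirroring what cache_retrieve returns
interface CachedEntry {
  queryName: string;
  totalResults: number;
}

function compareCounts(a: CachedEntry[], b: CachedEntry[], queryName: string) {
  const count = (entries: CachedEntry[]) =>
    entries
      .filter((e) => e.queryName.includes(queryName)) // match on query name
      .reduce((sum, e) => sum + e.totalResults, 0);   // same field retrieve uses
  return { dbA: count(a), dbB: count(b) };
}
```

With the reproduction above, the UI5Clickjacking entry cached at totalResults: 82 would be counted rather than reported as 0.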


Bug 3: annotation_search does not search by category field

Reproduction:

  1. Create annotation with category: "performance", entityKey: "UI5Xss.ql", content about performance data
  2. annotation_search with search: "performance" → returns [] (empty) ❌
  3. annotation_search with search: "UI5Xss" → returns the annotation ✅

The search only matches against content and entityKey text, not the category field.

Suggested fix: Include the category field in the full-text search, or add a separate category filter parameter.
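The first option (folding category into the full-text match) is a one-line change, sketched here with an assumed annotation shape:

```typescript
// Hypothetical annotation record (field names taken from the repro above)
interface Annotation {
  category: string;
  entityKey: string;
  content: string;
}

// Match against content, entityKey, AND category (case-insensitive)
function matchesSearch(a: Annotation, term: string): boolean {
  const q = term.toLowerCase();
  return [a.content, a.entityKey, a.category].some((f) =>
    f.toLowerCase().includes(q),
  );
}
```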


Bug 4: audit_add_notes ignores findingId parameter

Reproduction:

  1. Store a finding via audit_store_findings → returns finding with id: 10
  2. Call audit_add_notes with findingId: 10 + notes → rejected as "must have required property 'sourceLocation'"
  3. Must provide sourceLocation + line + owner + repo instead of just findingId

The findingId parameter is accepted by the schema but effectively ignored — the tool identifies findings by sourceLocation + line composite key.

Suggested fix: Either make findingId the primary lookup mechanism (simpler API), or document clearly that sourceLocation + line is required.
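The first option (findingId as primary lookup, composite key as fallback) could look like this sketch, with assumed record and parameter shapes:

```typescript
// Hypothetical stored-finding shape (id, sourceLocation, line from the repro)
interface Finding {
  id: number;
  sourceLocation: string;
  line: number;
  notes?: string;
}

// Prefer findingId when supplied; fall back to sourceLocation + line
function addNotes(
  findings: Finding[],
  notes: string,
  key: { findingId?: number; sourceLocation?: string; line?: number },
): Finding | undefined {
  const found =
    key.findingId !== undefined
      ? findings.find((f) => f.id === key.findingId)
      : findings.find(
          (f) => f.sourceLocation === key.sourceLocation && f.line === key.line,
        );
  if (found) found.notes = notes;
  return found;
}
```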


Bug 5: bqrs_info files parameter schema mismatch

The files parameter is typed as an array, but codeql bqrs info only accepts a single file argument. Passing multiple files in the array causes: Unmatched arguments from index 1: '/path/to/second.bqrs'

Suggested fix: Change files to a single file string parameter, or loop internally over the array calling bqrs info once per file.
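The internal-loop option can be sketched as below; runBqrsInfo is a stand-in for the server's existing single-file CLI invocation, not a real ql-mcp function:

```typescript
// Keep the array-typed `files` parameter but invoke the CLI once per file,
// since `codeql bqrs info` accepts only a single file argument.
async function bqrsInfoMany(
  files: string[],
  runBqrsInfo: (file: string) => Promise<string>,
): Promise<Record<string, string>> {
  const out: Record<string, string> = {};
  for (const file of files) {
    out[file] = await runBqrsInfo(file); // one invocation per file
  }
  return out;
}
```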


Design Improvements

Design 1: database_analyze should populate query_results_cache

Currently, only query_run auto-caches results into the query_results_cache. Running database_analyze (the primary bulk analysis tool) does not populate the cache. This means:

  • Users cannot use cache_compare or cache_retrieve after a database_analyze run
  • To get cached results, users must re-run each query individually via query_run

Suggested approach: After database_analyze completes, iterate over the per-query BQRS results and cache them the same way query_run does.
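That iteration step could be as simple as the sketch below; cacheResult is a stand-in for whatever cache-write path query_run already uses, and the per-query BQRS map shape is an assumption:

```typescript
// After database_analyze completes, push each per-query BQRS result through
// the same cache-write path that query_run uses (cacheResult is hypothetical).
async function cacheAnalyzeResults(
  bqrsByQuery: Map<string, string>, // query name -> .bqrs path
  cacheResult: (queryName: string, bqrsPath: string) => Promise<void>,
): Promise<number> {
  let cached = 0;
  for (const [queryName, bqrsPath] of bqrsByQuery) {
    await cacheResult(queryName, bqrsPath);
    cached++;
  }
  return cached;
}
```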


Design 2: Guard against parallel database_analyze on the same database

When the MCP framework dispatches two database_analyze calls in parallel targeting the same database, the second fails with:

the cache directory is already locked by another running process

The CodeQL CLI uses exclusive file locks on the database cache directory and cannot handle concurrent access.

Suggested approach: Implement a per-database mutex/queue in the MCP server so that concurrent requests to the same database are serialized rather than failing. Alternatively, return a user-friendly error indicating the database is in use.
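One lightweight way to serialize per-database access is a promise chain keyed on the database path, sketched below as an assumption about how the server could wrap its analyze handler:

```typescript
// Per-database serialization: chain each request onto the previous promise
// for that database path, so the CodeQL CLI never sees concurrent access.
const queues = new Map<string, Promise<unknown>>();

function withDatabaseLock<T>(dbPath: string, task: () => Promise<T>): Promise<T> {
  const prev = queues.get(dbPath) ?? Promise.resolve();
  const next = prev.then(task, task); // run after predecessor, even if it failed
  queues.set(dbPath, next.catch(() => undefined)); // keep the chain alive on errors
  return next;
}
```

Requests against different databases still run in parallel; only same-database calls queue.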


Design 3: query_results_cache_lookup should support lookup by cache key or query name

Currently query_results_cache_lookup only accepts language as a parameter. It cannot look up whether a specific cache key exists or whether results for a specific query name are cached. The only way to check is to call cache_retrieve and see if it returns data.

Suggested approach: Add optional cacheKey and/or queryName parameters to cache_lookup so users can check cache status without retrieving the full result set.
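The extended lookup might filter an index of cache entries like this (entry shape and parameter names are hypothetical):

```typescript
// Hypothetical cache-index entry and a lookup that accepts any combination
// of language, cacheKey, and queryName filters.
interface CacheIndexEntry {
  cacheKey: string;
  queryName: string;
  language: string;
}

function cacheLookup(
  index: CacheIndexEntry[],
  filter: { language?: string; cacheKey?: string; queryName?: string },
): CacheIndexEntry[] {
  return index.filter(
    (e) =>
      (filter.language === undefined || e.language === filter.language) &&
      (filter.cacheKey === undefined || e.cacheKey === filter.cacheKey) &&
      (filter.queryName === undefined || e.queryName.includes(filter.queryName)),
  );
}
```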


Design 4: MCP prompts for persisting profile_codeql_query_from_logs results

The profile_codeql_query_from_logs tool returns raw JSON with profiling data (slowest predicates, durations, result sizes, cache hit ratios). There is currently no streamlined way to persist these results as annotations or audit findings.

Suggested approach: Provide MCP prompt templates (e.g. store_profiling_results or annotate_query_performance) that:

  1. Accept profiling output from profile_codeql_query_from_logs
  2. Automatically extract key metrics (top N predicates, total duration, predicate counts, cache hit ratio)
  3. Create structured annotations or audit findings linked to the query file
  4. Enable iterative profile → annotate → optimize → re-profile → compare workflows
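The metric-extraction step (item 2 above) could look like this sketch; the profiling-output field names are assumptions, since the actual profile_codeql_query_from_logs JSON shape is not pinned down here:

```typescript
// Hypothetical shape of one profiled predicate from the raw profiling JSON
interface PredicateProfile {
  name: string;
  durationMs: number;
  resultSize: number;
}

// Extract the key metrics a prompt template would fold into an annotation
function summarizeProfile(profile: PredicateProfile[], topN: number) {
  const sorted = [...profile].sort((a, b) => b.durationMs - a.durationMs);
  return {
    topPredicates: sorted.slice(0, topN).map((p) => p.name),
    totalDurationMs: profile.reduce((s, p) => s + p.durationMs, 0),
    predicateCount: profile.length,
  };
}
```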

Design 5: Auto-enable annotation, audit, and cache tools by default in the VSIX extension

Currently, to use the annotation, audit, and query results caching tools, users must manually configure two environment variables in VS Code settings:

  • ENABLE_ANNOTATION_TOOLS = true
  • MONITORING_STORAGE_LOCATION = .codeql/ql-mcp

This is a poor out-of-box experience and means most users never discover these capabilities.

Suggested approach for v2.25.1-next.2:

  1. Set ENABLE_ANNOTATION_TOOLS=true by default in the extension's server startup logic, so annotation/audit/cache tools are registered automatically when the MCP server starts.
  2. Default MONITORING_STORAGE_LOCATION to a sensible workspace-relative path (e.g. .codeql/ql-mcp) when not explicitly configured. This aligns with the existing convention of storing ql-mcp artifacts under .codeql/.
  3. Add a VS Code extension setting (e.g. codeql-mcp.enableAnnotationTools with default true) that allows users to disable these tools at server startup if desired. When set to false, the extension should omit ENABLE_ANNOTATION_TOOLS from the env or set it to false, preventing registration of the annotation/audit/cache tools.
  4. Remove the need for manual Additional Env configuration — the two env vars should be managed internally by the extension based on the setting above, not require users to hand-add them in the Codeql-mcp: Additional Env table.

This ensures:

  • New installs get the full tool suite immediately
  • Existing users who already have the env vars configured are unaffected
  • Users who experience issues can disable via a single toggle in settings
  • The Additional Env mechanism remains available as an override for advanced configuration
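The extension-side env construction described in items 1-3 could be sketched as follows. The env variable names come from this issue; the settings object and its field names are stand-ins for however the extension surfaces the codeql-mcp.enableAnnotationTools setting:

```typescript
// Build the MCP server env from extension settings instead of requiring
// users to hand-edit the Additional Env table.
function buildServerEnv(settings: {
  enableAnnotationTools: boolean;   // maps to codeql-mcp.enableAnnotationTools
  storageLocation?: string;         // optional user override
}): Record<string, string> {
  const env: Record<string, string> = {};
  if (settings.enableAnnotationTools) {
    env.ENABLE_ANNOTATION_TOOLS = "true";
    // Default to the workspace-relative .codeql/ convention when unset
    env.MONITORING_STORAGE_LOCATION = settings.storageLocation ?? ".codeql/ql-mcp";
  } else {
    // Explicitly disable so the tools are not registered at startup
    env.ENABLE_ANNOTATION_TOOLS = "false";
  }
  return env;
}
```

User-provided Additional Env entries would be merged over this result, preserving the override path for advanced configuration.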

Checklist for v2.25.1-next.2

  • Bug 1: Fix bqrs_interpret parameter handling (file type, -t format docs, database param)
  • Bug 2: Fix query_results_cache_compare result counting
  • Bug 3: Include category in annotation_search matching
  • Bug 4: Fix audit_add_notes to use findingId as primary lookup
  • Bug 5: Fix bqrs_info files parameter to handle single file correctly
  • Design 1: Populate query_results_cache from database_analyze results
  • Design 2: Serialize concurrent database_analyze calls to the same database
  • Design 3: Add cacheKey/queryName params to query_results_cache_lookup
  • Design 4: Add MCP prompts for persisting profiling results as annotations
  • Design 5: Auto-enable annotation/audit/cache tools by default; add codeql-mcp.enableAnnotationTools setting
  • Release Prep 1: Use the maintain-changelog agent skill to guide an update of CHANGELOG.md, replacing the v2.25.1-next.1 section with an entry covering the v2.25.1 to v2.25.1-next.2 delta

Testing Notes

All bugs were discovered during a single evaluation session using:

  • Databases: 3 JavaScript databases of varying sizes (5.8 MB and 11.5 MB stringpools)
  • Query packs: advanced-security/javascript-sap-ui5-queries, advanced-security/javascript-sap-cap-queries, advanced-security/javascript-sap-xsjs-queries, codeql/javascript-queries
  • Tools exercised: register_database, resolve_database, database_analyze, query_run, bqrs_info, bqrs_decode, bqrs_interpret, query_results_cache_lookup/retrieve/compare, annotation_create/get/search/update/list, audit_store_findings/list_findings/add_notes, profile_codeql_query_from_logs
  • 11 SARIF files generated with 1,101 total findings across all query packs
  • 15 annotations and 5 audit findings persisted in the ql-mcp annotation/audit stores

Labels: bug (Something isn't working), enhancement (New feature or request)
