Commit 5e19d95

Copilot and data-douser committed
Rewrite static MCP resources with actionable LLM-oriented content, rename URIs, add new resources
- Rewrite getting-started.md as MCP server orientation guide (codeql://server/overview)
- Rewrite query-basics.md as practical query writing reference (codeql://server/queries)
- Rewrite security-templates.md with multi-language templates and TDD workflow
- Rewrite performance-patterns.md with profiling tool focus
- Create server-prompts.md (codeql://server/prompts) with complete prompt reference
- Create server-tools.md (codeql://server/tools) with complete default tool reference
- Rewrite ql-test-driven-development.md as TDD theory overview with cross-links
- Register ql-test-driven-development.md as MCP resource (codeql://learning/test-driven-development)
- Update resources.ts with new imports and getters
- Update codeql-resources.ts with new URIs and 7 resource registrations
- Update resources.test.ts with tests for new resources
- Update docs/ql-mcp/resources.md and server/README.md

Co-authored-by: data-douser <70299490+data-douser@users.noreply.github.com>
1 parent 91a5a4a commit 5e19d95

14 files changed: 978 additions and 415 deletions

docs/ql-mcp/resources.md

Lines changed: 10 additions & 7 deletions
Original file line number | Diff line number | Diff line change
@@ -4,16 +4,19 @@
44
55
## Overview
66

7-
The server exposes **4 static learning resources** and a set of **dynamic per-language resources** that supply AI assistants with CodeQL reference material. Resources are read-only and backed by `*.prompt.md` files bundled with the server.
7+
The server exposes **7 static resources** and a set of **dynamic per-language resources** that supply AI assistants with CodeQL reference material. Resources are read-only and backed by `.md` files bundled with the server.
88

99
## Static Resources
1010

11-
| Resource | URI | Description |
12-
| --------------------------- | ----------------------------------- | --------------------------------------------------- |
13-
| CodeQL Getting Started | `codeql://learning/getting-started` | Comprehensive introduction to CodeQL for beginners |
14-
| CodeQL Query Basics | `codeql://learning/query-basics` | Learn the fundamentals of writing CodeQL queries |
15-
| CodeQL Security Templates | `codeql://templates/security` | Ready-to-use security query templates |
16-
| CodeQL Performance Patterns | `codeql://patterns/performance` | Best practices for writing efficient CodeQL queries |
11+
| Resource | URI | Description |
12+
| ------------------------------ | ------------------------------------------- | ------------------------------------------------------------------------- |
13+
| CodeQL Server Overview | `codeql://server/overview` | MCP server orientation guide: tools, prompts, resources, and workflows |
14+
| CodeQL Server Prompts | `codeql://server/prompts` | Complete reference of MCP prompts for CodeQL development workflows |
15+
| CodeQL Query Writing Guide | `codeql://server/queries` | Practical reference for writing and validating CodeQL queries |
16+
| CodeQL Server Tools | `codeql://server/tools` | Complete reference of default MCP tools for CodeQL development |
17+
| CodeQL Test-Driven Development | `codeql://learning/test-driven-development` | TDD theory and workflow for developing CodeQL queries |
18+
| CodeQL Security Templates | `codeql://templates/security` | Security query templates for multiple languages and vulnerability classes |
19+
| CodeQL Performance Patterns | `codeql://patterns/performance` | Performance profiling and optimization for CodeQL queries |
1720

1821
## Language-Specific Resources
1922

server/README.md

Lines changed: 5 additions & 3 deletions
@@ -85,10 +85,12 @@ Full reference: [Prompts](https://github.com/advanced-security/codeql-developmen
8585

8686
### Resources
8787

88-
Static learning materials and per-language references served to AI assistants:
88+
Static reference materials and per-language references served to AI assistants:
8989

90-
- **CodeQL Getting Started** / **Query Basics** — Introductory guides
91-
- **Security Templates** / **Performance Patterns** — Ready-to-use templates and best practices
90+
- **Server Overview** / **Query Writing Guide** — MCP server orientation and query development reference
91+
- **Server Tools** / **Server Prompts** — Complete tool and prompt references
92+
- **Test-Driven Development** — TDD theory and workflow for CodeQL queries
93+
- **Security Templates** / **Performance Patterns** — Multi-language security templates and profiling guidance
9294
- **Language AST References** — For actions, cpp, csharp, go, java, javascript, python, ql, ruby
9395
- **Language Security Patterns** — For cpp, csharp, go, javascript, python
9496

server/dist/codeql-development-mcp-server.js

Lines changed: 91 additions & 16 deletions
Large diffs are not rendered by default.

server/dist/codeql-development-mcp-server.js.map

Lines changed: 3 additions & 3 deletions
Some generated files are not rendered by default.

server/src/lib/resources.ts

Lines changed: 27 additions & 3 deletions
@@ -12,6 +12,9 @@ import gettingStartedContent from '../resources/getting-started.md';
1212
import performancePatternsContent from '../resources/performance-patterns.md';
1313
import queryBasicsContent from '../resources/query-basics.md';
1414
import securityTemplatesContent from '../resources/security-templates.md';
15+
import serverPromptsContent from '../resources/server-prompts.md';
16+
import serverToolsContent from '../resources/server-tools.md';
17+
import testDrivenDevelopmentContent from '../resources/ql-test-driven-development.md';
1518

1619
/**
1720
* Get the getting started guide content
@@ -20,6 +23,13 @@ export function getGettingStartedGuide(): string {
2023
return gettingStartedContent;
2124
}
2225

26+
/**
27+
* Get the performance patterns content
28+
*/
29+
export function getPerformancePatterns(): string {
30+
return performancePatternsContent;
31+
}
32+
2333
/**
2434
* Get the query basics guide content
2535
*/
@@ -35,8 +45,22 @@ export function getSecurityTemplates(): string {
3545
}
3646

3747
/**
38-
* Get the performance patterns content
48+
* Get the server prompts overview content
3949
*/
40-
export function getPerformancePatterns(): string {
41-
return performancePatternsContent;
50+
export function getServerPrompts(): string {
51+
return serverPromptsContent;
52+
}
53+
54+
/**
55+
* Get the server tools overview content
56+
*/
57+
export function getServerTools(): string {
58+
return serverToolsContent;
59+
}
60+
61+
/**
62+
* Get the test-driven development guide content
63+
*/
64+
export function getTestDrivenDevelopment(): string {
65+
return testDrivenDevelopmentContent;
4266
}
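
The getters above only return bundled Markdown strings; something else has to map them to the new URIs. The sketch below shows one way such a lookup could work. It is an illustration under assumptions, not the registration code in `codeql-resources.ts` (which this page does not show): the content constants are stand-ins for the real `.md` imports, and `staticResourceGetters` and `readStaticResource` are hypothetical names.

```typescript
// Hypothetical stand-ins for the markdown imports in resources.ts; the real
// module imports these strings from bundled .md files.
const serverPromptsContent = "# Server Prompts";
const serverToolsContent = "# Server Tools";
const testDrivenDevelopmentContent = "# Test-Driven Development";

type ResourceGetter = () => string;

// Assumed lookup table. The URIs are the ones listed in docs/ql-mcp/resources.md;
// the table itself is illustrative and not taken from the commit.
const staticResourceGetters = new Map<string, ResourceGetter>([
  ["codeql://server/prompts", () => serverPromptsContent],
  ["codeql://server/tools", () => serverToolsContent],
  ["codeql://learning/test-driven-development", () => testDrivenDevelopmentContent],
]);

// Resolve a URI to its content, or undefined when the URI is not registered.
function readStaticResource(uri: string): string | undefined {
  const getter = staticResourceGetters.get(uri);
  return getter === undefined ? undefined : getter();
}
```

Keeping the table keyed by URI means a renamed URI (as in this commit) is a one-line change, while the getter functions stay untouched.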
server/src/resources/getting-started.md

Lines changed: 82 additions & 19 deletions
@@ -1,30 +1,93 @@
1-
# CodeQL Getting Started Guide
1+
# CodeQL Development MCP Server — Getting Started
22

3-
## What is CodeQL?
3+
This resource is the primary onboarding guide for LLM clients connecting to the CodeQL Development MCP Server. It explains what the server provides, which tools and prompts are available, and how to orchestrate common workflows.
44

5-
CodeQL is a semantic code analysis engine that allows you to write queries to find problems in source code.
5+
## What This Server Does
66

7-
## Installation
7+
The CodeQL Development MCP Server wraps the CodeQL CLI and supporting utilities behind the Model Context Protocol (MCP). It exposes **tools** (executable actions), **prompts** (reusable workflow templates), and **resources** (reference material) that enable an LLM to develop, test, and analyze CodeQL queries without direct shell access.
88

9-
1. Download CodeQL CLI from GitHub releases
10-
2. Add to PATH
11-
3. Verify: `codeql version`
9+
## Available Resources
1210

13-
## First Steps
11+
Read these resources via `resources/read` to deepen your understanding:
1412

15-
### 1. Create a Database
13+
| URI | Purpose |
14+
| ------------------------------------------- | ----------------------------------------- |
15+
| `codeql://server/overview` | This guide — MCP server orientation |
16+
| `codeql://server/queries` | Writing and validating CodeQL queries |
17+
| `codeql://server/tools` | Complete default tool reference |
18+
| `codeql://server/prompts` | Complete prompt reference |
19+
| `codeql://templates/security` | Security query templates (multi-language) |
20+
| `codeql://patterns/performance` | Performance profiling and optimization |
21+
| `codeql://learning/test-driven-development` | TDD theory and workflow for CodeQL |
22+
| `codeql://languages/{language}/ast` | Language-specific AST class reference |
23+
| `codeql://languages/{language}/security` | Language-specific security patterns |
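
The last two rows of the table are URI templates rather than fixed URIs. As a rough sketch of how a client might expand and recognize them, assuming the `{language}` placeholder takes lowercase identifiers such as `java` or `python` (the helper names are hypothetical and not part of the server):

```typescript
type LanguageResourceKind = "ast" | "security";

// Expand the codeql://languages/{language}/{kind} template from the table.
function languageResourceUri(language: string, kind: LanguageResourceKind): string {
  return `codeql://languages/${language}/${kind}`;
}

// Parse a language-specific URI back into its parts; returns null for any
// URI that does not match the template (for example the static server/* URIs).
function parseLanguageResourceUri(
  uri: string
): { language: string; kind: LanguageResourceKind } | null {
  const match = /^codeql:\/\/languages\/([a-z]+)\/(ast|security)$/.exec(uri);
  if (match === null) return null;
  const language = match[1];
  const kind = match[2];
  if (language === undefined || kind === undefined) return null;
  return { language, kind: kind as LanguageResourceKind };
}
```

A client would pass the expanded URI to `resources/read` in the same way as the static URIs.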
1624

17-
```bash
18-
codeql database create my-db --language=java --source-root=./src
19-
```
25+
## Quick-Start Workflows
2026

21-
### 2. Run Analysis
27+
### 1. Create a New Query (TDD Approach)
2228

23-
```bash
24-
codeql database analyze my-db --format=sarif --output=results.sarif
25-
```
29+
Use the `test_driven_development` prompt (or `ql_tdd_basic` / `ql_tdd_advanced`):
2630

27-
## Resources
31+
1. `create_codeql_query` — scaffold query, test files, and `.qlref`
32+
2. `codeql_pack_install` — install pack dependencies
33+
3. Write test code with positive and negative cases
34+
4. `codeql_test_run` — run tests (expect failure initially)
35+
5. Implement query logic
36+
6. `codeql_query_compile` — validate syntax
37+
7. `codeql_test_run` — iterate until tests pass
38+
8. `codeql_test_accept` — accept correct results as baseline
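
The eight steps above mix tool calls with manual authoring steps. One way to make that distinction explicit is to model the workflow as data, as in the following sketch. The `WorkflowStep` shape and `toolCallOrder` helper are illustrative assumptions; only the tool names and step descriptions come from the list.

```typescript
interface WorkflowStep {
  description: string;
  tool?: string; // MCP tool name; absent for manual authoring steps
}

// The TDD quick-start workflow, in the order given in the guide.
const tddWorkflow: WorkflowStep[] = [
  { tool: "create_codeql_query", description: "scaffold query, test files, and .qlref" },
  { tool: "codeql_pack_install", description: "install pack dependencies" },
  { description: "write test code with positive and negative cases" },
  { tool: "codeql_test_run", description: "run tests (expect failure initially)" },
  { description: "implement query logic" },
  { tool: "codeql_query_compile", description: "validate syntax" },
  { tool: "codeql_test_run", description: "iterate until tests pass" },
  { tool: "codeql_test_accept", description: "accept correct results as baseline" },
];

// Extract only the steps that correspond to tool invocations, preserving order.
function toolCallOrder(steps: WorkflowStep[]): string[] {
  const tools: string[] = [];
  for (const step of steps) {
    if (step.tool !== undefined) tools.push(step.tool);
  }
  return tools;
}
```

Note that `codeql_test_run` appears twice: once to confirm the tests fail before the query exists, and again while iterating toward a passing result.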
2839

29-
- [CodeQL Documentation](https://codeql.github.com/)
30-
- [GitHub Security Lab](https://securitylab.github.com/)
40+
### 2. Understand Code Structure
41+
42+
Use the `tools_query_workflow` prompt:
43+
44+
1. `codeql_query_run` with `queryName="PrintAST"` — visualize the AST
45+
2. `codeql_query_run` with `queryName="PrintCFG"` — visualize control flow
46+
3. `codeql_query_run` with `queryName="CallGraphFrom"` / `"CallGraphTo"` — trace call relationships
47+
48+
### 3. Analyze Query Quality
49+
50+
1. `codeql_database_analyze` — run queries against a database
51+
2. `profile_codeql_query` or `profile_codeql_query_from_logs` — analyze performance
52+
3. `run_query_and_summarize_false_positives` prompt — assess precision
53+
4. `sarif_rank_false_positives` / `sarif_rank_true_positives` prompts — rank results
54+
55+
### 4. Iterative Development with LSP
56+
57+
Use the `ql_lsp_iterative_development` prompt:
58+
59+
1. `codeql_lsp_completion` — get code completions while writing QL
60+
2. `codeql_lsp_definition` — navigate to symbol definitions
61+
3. `codeql_lsp_references` — find all references to a symbol
62+
4. `codeql_lsp_diagnostics` — real-time syntax and semantic validation
63+
64+
## Tool Categories
65+
66+
The server provides **38 default tools** across these categories (see `codeql://server/tools` for the full reference):
67+
68+
- **CodeQL CLI tools** — Database creation, query compilation, execution, result decoding, pack management
69+
- **LSP tools** — Code completion, go-to-definition, find references, diagnostics
70+
- **Query development tools** — Scaffolding, validation, profiling, quick evaluation, database registration
71+
72+
## Prompt Categories
73+
74+
The server provides **11 prompts** (see `codeql://server/prompts` for the full reference):
75+
76+
- **Test-driven development**: `test_driven_development`, `ql_tdd_basic`, `ql_tdd_advanced`
77+
- **Code understanding**: `tools_query_workflow`, `explain_codeql_query`
78+
- **Iterative development**: `ql_lsp_iterative_development`
79+
- **Documentation and quality**: `document_codeql_query`, `run_query_and_summarize_false_positives`, `sarif_rank_false_positives`, `sarif_rank_true_positives`
80+
- **Workshop creation**: `workshop_creation_workflow`
81+
82+
## Key Concepts
83+
84+
- **CodeQL database**: A relational representation of source code created by `codeql_database_create`. All queries execute against a database.
85+
- **QL pack**: A directory containing `codeql-pack.yml` with query or library code. Use `codeql_pack_install` to resolve dependencies.
86+
- **`.qlref` file**: A test reference that points from a test directory to the query being tested.
87+
- **`.expected` file**: The expected output of a query test. Use `codeql_test_accept` to update it when results are correct.
88+
- **BQRS**: Binary Query Result Sets — the native output format of `codeql_query_run`. Decode with `codeql_bqrs_decode` or interpret with `codeql_bqrs_interpret`.
89+
- **SARIF**: Static Analysis Results Interchange Format — the standard output format for `codeql_database_analyze`.
90+
91+
## Supported Languages
92+
93+
The server supports CodeQL queries for: `actions`, `cpp`, `csharp`, `go`, `java`, `javascript`, `python`, `ruby`, `swift`.
server/src/resources/performance-patterns.md

Lines changed: 98 additions & 17 deletions
@@ -1,19 +1,100 @@
11
# Performance Optimization Patterns
22

3-
## Efficient Joins
4-
5-
```ql
6-
// Efficient - Proper join condition
7-
from Method m, MethodAccess ma
8-
where ma.getMethod() = m
9-
select m, ma
10-
```
11-
12-
## Early Filtering
13-
14-
```ql
15-
// Filter early for better performance
16-
from Expr e
17-
where e.getEnclosingCallable().getDeclaringType().hasName("Controller")
18-
and e.getType().hasName("String")
19-
```
3+
This resource describes how to evaluate and improve the performance of CodeQL queries using the MCP server's profiling tools. Rather than prescribing generic optimization rules, it focuses on using the `profile_codeql_query_from_logs` tool and the `explain_codeql_query` prompt to make evidence-based performance improvements.
4+
5+
## Primary Performance Tool: `profile_codeql_query_from_logs`
6+
7+
The `profile_codeql_query_from_logs` tool is the primary means of evaluating the actual performance of a CodeQL query. It parses existing CodeQL evaluator logs into a structured performance profile without re-running the query.
8+
9+
### Workflow
10+
11+
1. **Run the query**: Use `codeql_query_run` with `evaluationOutput` set to a directory path. This generates evaluator log files.
12+
2. **Generate a log summary**: Use `codeql_generate_log_summary` to create a human-readable summary of the evaluator log.
13+
3. **Profile**: Use `profile_codeql_query_from_logs` to parse the evaluator log into a structured performance profile identifying expensive predicates, pipeline stages, and tuple counts.
14+
4. **Identify bottlenecks**: Review the profile output for predicates with high evaluation times or unexpectedly large result sets.
15+
5. **Refine**: Modify the query to address identified bottlenecks, then re-run and re-profile to verify improvements.
16+
17+
Alternatively, use `profile_codeql_query` to profile a query by running it against a specific database and analyzing the resulting evaluator log in a single step.
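
Step 4 of the workflow (identifying bottlenecks) amounts to ranking predicates by cost. The sketch below illustrates that ranking on a made-up profile. The `PredicateProfile` shape is an assumption for illustration only; the actual output format of `profile_codeql_query_from_logs` is not shown in this resource and may differ.

```typescript
// Hypothetical profile entry; field names are assumptions, not the tool's schema.
interface PredicateProfile {
  name: string;
  evaluationMs: number;
  tupleCount: number;
}

// Return the `limit` most expensive predicates by evaluation time,
// without mutating the input array.
function topBottlenecks(profile: PredicateProfile[], limit: number): PredicateProfile[] {
  return [...profile]
    .sort((a, b) => b.evaluationMs - a.evaluationMs)
    .slice(0, limit);
}

// Example data (invented for illustration).
const sampleProfile: PredicateProfile[] = [
  { name: "isSource", evaluationMs: 40, tupleCount: 1200 },
  { name: "flowsTo", evaluationMs: 5200, tupleCount: 980000 },
  { name: "isSink", evaluationMs: 310, tupleCount: 4500 },
];

const worst = topBottlenecks(sampleProfile, 2);
console.log(worst.map((p) => p.name).join(", "));
```

After refining the query, re-running the same ranking against the new profile gives a direct before/after comparison.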
18+
19+
### What the Profile Shows
20+
21+
- **Predicate evaluation times** — which predicates are the most expensive
22+
- **Tuple counts** — how many intermediate results each predicate produces
23+
- **Pipeline stages** — the internal evaluation plan chosen by the CodeQL engine
24+
- **RA (relational algebra) operations** — join orders, aggregation steps, and recursive evaluations
25+
26+
## Using `explain_codeql_query` for Performance Understanding
27+
28+
The `explain_codeql_query` prompt generates a detailed explanation of a query, including Mermaid evaluation diagrams that visualize the data flow and evaluation order. This is useful for understanding _why_ a query may be slow before profiling.
29+
30+
## Key Performance Concepts
31+
32+
The following concepts are relevant when interpreting profiling output. Verify these against actual profiling data rather than applying them blindly.
33+
34+
### Large Intermediate Result Sets
35+
36+
When a predicate produces significantly more tuples than expected, it may indicate:
37+
38+
- Missing or insufficiently restrictive filter conditions in the `where` clause
39+
- A cross-product between two large relations that should be joined more tightly
40+
41+
**How to detect**: Look for predicates in the profile output with high tuple counts relative to their expected output size.
42+
43+
### Recursive Predicate Costs
44+
45+
Recursive predicates (e.g., transitive closures via `+` or `*`) can be expensive when the underlying relation is large. The profiler shows iteration counts and per-iteration tuple growth.
46+
47+
**How to detect**: Look for recursive predicates with many iterations or high per-iteration costs in the profile output.
48+
49+
### Join Order Sensitivity
50+
51+
The CodeQL evaluator chooses a join order for predicates in the `where` clause. In some cases, the chosen order may not be optimal.
52+
53+
**How to detect**: Look for pipeline stages where a large intermediate result is produced before being filtered down. The profiler shows tuple counts at each stage.
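
As a rough illustration of that check, the following sketch flags stages whose tuple count is far larger than the final stage's output. It assumes the per-stage tuple counts have already been extracted from the profile into a plain array; `findBlowupStages` and the default factor of 10 are hypothetical choices, not part of the server.

```typescript
// Flag pipeline stages whose tuple count exceeds `factor` times the size of
// the final stage's output. Returns the zero-based indices of flagged stages.
function findBlowupStages(stageTupleCounts: number[], factor: number = 10): number[] {
  if (stageTupleCounts.length === 0) return [];
  const last = stageTupleCounts[stageTupleCounts.length - 1] ?? 0;
  // Guard against a zero-sized final result so the threshold stays positive.
  const finalCount = Math.max(1, last);
  const flagged: number[] = [];
  stageTupleCounts.forEach((count, index) => {
    if (count > factor * finalCount) flagged.push(index);
  });
  return flagged;
}
```

For example, with counts `[100, 50000, 200]` only the middle stage is flagged: it produces 50,000 tuples that are later filtered down to 200, which suggests the filter could be applied earlier.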
54+
55+
### Improving "Performance" — Two Dimensions
56+
57+
The word "performance" for CodeQL queries has two meanings:
58+
59+
1. **Runtime efficiency** — how fast the query evaluates. Addressed by reducing tuple counts, improving join orders, and simplifying recursive predicates.
60+
2. **Result quality** — how accurate the query's output is (precision and recall). Addressed by refining source/sink/sanitizer definitions, adding or removing filter conditions, and testing against diverse codebases.
61+
62+
The `profile_codeql_query_from_logs` tool addresses runtime efficiency. For result quality, use the `run_query_and_summarize_false_positives` prompt and the `sarif_rank_false_positives` / `sarif_rank_true_positives` prompts.
63+
64+
## Performance Review for GitHub Actions CodeQL Scans
65+
66+
When reviewing CodeQL performance in the context of GitHub Actions CI/CD scans, key areas to examine include:
67+
68+
### Code Exclusion
69+
70+
Excluding non-essential files from analysis (vendored dependencies, generated code, test files) is one of the most impactful performance improvements. Any interpreted language or compiled language using `build-mode: none` can use a `paths-ignore` array in the CodeQL configuration file to exclude paths.
71+
72+
### Hardware Sizing
73+
74+
Recommended runner sizes based on lines of code:
75+
76+
- Small (< 100K lines): 8 GB RAM, 2 cores
77+
- Medium (100K–1M lines): 16 GB RAM, 4–8 cores
78+
- Large (> 1M lines): 64 GB RAM, 8 cores
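
The thresholds above translate directly into a lookup. The sketch below encodes them as written; the function name and return shape are hypothetical, and a codebase of exactly 1M lines is treated as medium because the list defines large as strictly greater than 1M.

```typescript
interface RunnerRecommendation {
  size: "small" | "medium" | "large";
  ramGb: number;
  cores: string; // a range for medium runners, so kept as text
}

// Map a lines-of-code estimate to the runner sizes listed in the guide.
function recommendRunner(linesOfCode: number): RunnerRecommendation {
  if (linesOfCode < 100000) {
    return { size: "small", ramGb: 8, cores: "2" };
  }
  if (linesOfCode <= 1000000) {
    return { size: "medium", ramGb: 16, cores: "4-8" };
  }
  return { size: "large", ramGb: 64, cores: "8" };
}
```

These figures are starting points from the guide rather than guarantees; actual needs depend on the language and query suite.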
79+
80+
### Monorepo Splitting
81+
82+
For monorepos with multiple independent applications separated by process/network boundaries, consider splitting CodeQL scans by application. This reduces database size and enables parallel scanning via Actions matrix strategies.
83+
84+
## Related Tools and Prompts
85+
86+
| Tool / Prompt | Purpose |
87+
| ------------------------------------------------ | -------------------------------------------------------------------- |
88+
| `profile_codeql_query` | Profile a query run against a database (runs the query and profiles) |
89+
| `profile_codeql_query_from_logs` | Profile from existing evaluator logs (no re-run needed) |
90+
| `codeql_generate_log_summary` | Generate a human-readable evaluator log summary |
91+
| `codeql_query_run` | Execute a query (set `evaluationOutput` to capture logs) |
92+
| `explain_codeql_query` prompt | Understand query evaluation flow with Mermaid diagrams |
93+
| `run_query_and_summarize_false_positives` prompt | Assess result quality (precision) |
94+
95+
## Related Resources
96+
97+
- `codeql://server/overview` — MCP server orientation guide
98+
- `codeql://server/queries` — Query structure and compilation tools
99+
- `codeql://server/tools` — Complete tool reference
100+
- `codeql://learning/test-driven-development` — TDD workflow for iterative query improvement
