| name | create-codeql-query-development-workshop |
|---|---|
| description | Create custom CodeQL query development workshops from production-grade queries. Use this skill to generate guided learning materials with exercises, solutions, and tests that teach developers how to build CodeQL queries incrementally. |
This skill guides you through creating custom CodeQL query development workshops from existing, production-grade CodeQL queries. The workshop format uses a test-driven, incremental learning approach where developers progress through stages from simple to complex.
- Creating training materials for CodeQL query development
- Teaching developers to build custom security or code quality queries
- Generating guided learning paths from existing query implementations
- Building workshops customized to specific business needs or code patterns
Custom workshops are more effective than generic tutorials because:
- Developers learn by building queries that actually matter to their work
- Real-world query patterns are more motivating than toy examples
- Teams can train developers on their specific security or quality concerns
- Workshops scale knowledge transfer from CodeQL experts to their teams
Before creating a workshop, ensure you have:
- An existing CodeQL query (`.ql` file) that is production-ready
- Passing unit tests for that query (`.expected` results that match actual results)
- Understanding of the query's purpose and complexity
- Access to CodeQL Development MCP Server tools
This repository uses codeql-pack.yml for new CodeQL pack configuration files and recommends it over qlpack.yml. While both codeql-pack.yml and qlpack.yml are equally supported by CodeQL, codeql-pack.yml is preferred as it aligns with the codeql-pack.lock.yml naming convention used by codeql pack install. If you encounter references to qlpack.yml in this workshop or related materials, treat them as equivalent to codeql-pack.yml, with codeql-pack.yml as the recommended name for new packs.
When invoking this skill, you must provide:
- Source Query Path: Full path to the production query `.ql` file
- Source Query Tests Path: Full path to the directory containing unit tests for the query
- Base Directory: Path where the workshop directory will be created (e.g., `/tmp/workshops` or `<your-repo>/workshops`)
- Workshop Name: Name for the workshop directory (e.g., `dataflow-analysis-cpp`)
The skill creates a complete workshop under <base_dir>/<workshop_name>/:
<base_dir>/<workshop_name>/
├── README.md # Workshop overview and setup instructions
├── codeql-workspace.yml # CodeQL workspace configuration
├── build-databases.sh # Script to create test databases
├── exercises/ # Student exercise queries (incomplete)
│ ├── codeql-pack.yml # Query pack config
│ ├── Exercise1.ql
│ ├── Exercise2.ql
│ └── ...
├── exercises-tests/ # Unit tests for exercises
│ ├── codeql-pack.yml # Test pack config (with extractor + dependency on exercises)
│ ├── Exercise1/
│ │ ├── Exercise1.qlref
│ │ ├── Exercise1.expected
│ │ └── test.{ext}
│ └── ...
├── solutions/ # Complete solution queries
│ ├── codeql-pack.yml # Query pack config
│ ├── Exercise1.ql
│ ├── Exercise2.ql
│ └── ...
├── solutions-tests/ # Unit tests for solutions
│ ├── codeql-pack.yml # Test pack config (with extractor + dependency on solutions)
│ ├── Exercise1/
│ │ ├── Exercise1.qlref
│ │ ├── Exercise1.expected
│ │ └── test.{ext}
│ └── ...
├── graphs/ # AST/CFG visualizations
│ ├── Exercise1-ast.txt
│ ├── Exercise1-cfg.txt
│ └── ...
└── tests-common/ # Shared test code and databases
    ├── test.{ext}
    └── codeql-pack.yml
See workshop-structure-reference.md for detailed structure documentation.
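The directory layout above can be scaffolded before any queries are written. The following is a minimal sketch, not part of the skill or the MCP server tools: the function name `scaffold_workshop` and its three arguments are hypothetical, and it only creates empty directories (pack files, queries, and tests still have to be generated in the later phases).

```bash
# Hypothetical helper: create the empty directory skeleton for a workshop.
# Usage: scaffold_workshop <base_dir> <workshop_name> <stage_count>
scaffold_workshop() {
  ws_root="$1/$2"

  # Top-level directories that do not depend on the number of stages.
  mkdir -p "${ws_root}/exercises" "${ws_root}/solutions" \
           "${ws_root}/graphs" "${ws_root}/tests-common"

  # One test directory per stage, for both exercises and solutions.
  stage=1
  while [ "$stage" -le "$3" ]; do
    mkdir -p "${ws_root}/exercises-tests/Exercise${stage}" \
             "${ws_root}/solutions-tests/Exercise${stage}"
    stage=$((stage + 1))
  done

  # Print the workshop root so callers can capture it.
  echo "${ws_root}"
}
```

For example, `scaffold_workshop /tmp/workshops dataflow-analysis-cpp 4` would create the skeleton for a four-stage workshop and print its root path.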
The workshop creation process follows these phases:
- Analyze Source Query using `find_codeql_query_files` and `explain_codeql_query`
- Identify Complexity to determine number of stages
- Extract Test Cases from existing unit tests
- Plan Stages breaking query from simple to complex
Working backwards from the complete query:
- Identify Decomposition Points (predicates, logic blocks, complexity layers)
- Define Stage Goals (what each exercise teaches)
- Create Stage Order (simple to complex progression)
For each stage (starting with final/complete stage):
- Generate Solution Query for this stage
- Create Solution Tests that validate the solution
- Run Tests using `codeql_test_run` to ensure they pass
- Generate Exercise Query by removing implementation details
- Create Exercise Tests (may match solution tests or be subset)
- Generate Graph Outputs (AST/CFG) for each stage using `codeql_bqrs_interpret`
- Create `build-databases.sh` script for test database creation
- Write README.md with workshop overview, setup, and instructions
- Create codeql-workspace.yml to configure CodeQL workspace
- Test All Solutions: run `codeql_test_run` on `solutions-tests/`
- Verify Test Pass Rate: ensure a 100% pass rate for solutions
- Check File Structure: validate that all required files exist
- Review Exercise Gaps: ensure exercises have appropriate scaffolding
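The file-structure check lends itself to a small script. The sketch below assumes the layout shown in the directory tree earlier in this document; the function name `check_workshop_structure` is hypothetical and the list of required paths is an illustrative subset, not an official specification. It prints each missing path and returns non-zero if anything is missing.

```bash
# Hypothetical helper: report required workshop paths that are missing.
# Usage: check_workshop_structure <workshop_root>
check_workshop_structure() {
  missing=0
  for rel in \
    README.md \
    codeql-workspace.yml \
    build-databases.sh \
    exercises/codeql-pack.yml \
    exercises-tests/codeql-pack.yml \
    solutions/codeql-pack.yml \
    solutions-tests/codeql-pack.yml \
    tests-common/codeql-pack.yml
  do
    if [ ! -e "$1/${rel}" ]; then
      echo "missing: ${rel}"
      missing=1
    fi
  done
  # Exit status 0 only when every required path exists.
  return "$missing"
}
```

Running `check_workshop_structure <base_dir>/<workshop_name>` after generation gives a quick pass/fail signal before the slower test run.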
- `find_codeql_query_files` - Locate query files and dependencies
- `explain_codeql_query` - Understand query purpose and logic
- `codeql_resolve_metadata` - Extract query metadata
- `codeql_test_extract` - Create test databases from test code
- `codeql_test_run` - Execute tests and validate results
- `codeql_test_accept` - Update expected results when needed
- `codeql_query_run` - Run queries (including PrintAST, PrintCFG)
- `codeql_query_compile` - Validate query syntax
- `codeql_bqrs_interpret` - Generate graph outputs from results
- `codeql_database_create` - Create CodeQL databases from source
- `codeql_resolve_database` - Validate database structure
See mcp-tools-reference.md for detailed tool usage patterns.
When decomposing a complex query into stages, consider these patterns:
- Stage 1: Find syntactic elements (e.g., `ArrayExpr`)
- Stage 2: Add type constraints (e.g., specific array types)
- Stage 3: Add semantic analysis (e.g., control flow)
- Stage 4: Add data flow analysis (e.g., track values)
- Stage 1: Local pattern matching
- Stage 2: Add local control flow
- Stage 3: Add local data flow
- Stage 4: Add global data flow
- Stage 1: Find all candidates (high recall, low precision)
- Stage 2: Add basic filtering
- Stage 3: Add context-aware filtering
- Stage 4: Eliminate false positives
- Stage 1: Define helper predicates
- Stage 2: Combine helpers into sources
- Stage 3: Define sinks
- Stage 4: Connect sources to sinks with data flow
When creating exercises from solutions:
- Implementation bodies: Leave predicate signatures with `none()` body
- Complex logic: Replace with `// TODO: Implement` comments
- Data flow configs: Provide signature, remove implementation
- Filter predicates: Keep structure, remove conditions
- Import statements: All imports should be present
- Type signatures: Full type information for predicates
- Comments: Helpful hints about what to implement
- Test scaffolding: Basic structure to guide implementation
Add inline comments to guide students:
/**
 * Find all array expressions that access a specific type.
 *
 * Hint: Use `.getArrayBase().getType()` to get the base type.
 */
predicate isTargetArrayAccess(ArrayExpr array) {
  // TODO: Implement type checking
  none()
}

Create test code (`test.{ext}`) that includes:
- Positive cases: Code patterns the query should detect
- Negative cases: Similar code that should NOT be detected
- Edge cases: Boundary conditions
- Comments: Explain what each test case validates
Example for C++:
// POSITIVE CASE: Null pointer dereference
void unsafeFunction() {
    int* ptr = nullptr;
    *ptr = 42; // Should be detected
}

// NEGATIVE CASE: Checked before use
void safeFunction() {
    int* ptr = nullptr;
    if (ptr != nullptr) {
        *ptr = 42; // Should NOT be detected
    }
}

// EDGE CASE: Pointer in complex expression
void edgeCase() {
    int* ptr = nullptr;
    int result = ptr ? *ptr : 0; // Should be detected
}

The `.expected` file uses CodeQL test format:
| file | line | col | endLine | endCol | message |
| test.cpp | 3 | 5 | 3 | 8 | Null pointer dereference |
| test.cpp | 18 | 17 | 18 | 20 | Null pointer dereference |
- Early stages: Fewer expected results (simpler queries)
- Later stages: More expected results (more comprehensive)
- Final stage: Should match production query expected results
Generate visual aids for understanding code structure:
Show Abstract Syntax Tree structure:
{
  "queryName": "PrintAST",
  "queryLanguage": "cpp",
  "database": "tests-common/test.testproj",
  "outputFormat": "graphtext"
}

Use `codeql_bqrs_interpret` to create `graphs/Exercise1-ast.txt`.
Show Control Flow Graph:
{
  "queryName": "PrintCFG",
  "queryLanguage": "cpp",
  "database": "tests-common/test.testproj",
  "outputFormat": "graphtext"
}

Use `codeql_bqrs_interpret` to create `graphs/Exercise1-cfg.txt`.
#!/bin/bash
set -e

WORKSHOP_ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
TEST_SOURCE="${WORKSHOP_ROOT}/tests-common"

echo "Building test databases..."

# For each test database needed
for db_name in test1 test2; do
  DB_PATH="${WORKSHOP_ROOT}/tests-common/${db_name}.testproj"
  echo "Creating database: ${db_name}"
  # Remove any stale database so the build starts clean
  rm -rf "${DB_PATH}"
  # Replace {language} with the target language; --command applies to compiled languages
  codeql database create \
    --language={language} \
    --source-root="${TEST_SOURCE}" \
    "${DB_PATH}" \
    --command="clang -fsyntax-only ${TEST_SOURCE}/${db_name}.c"
done

echo "Database creation complete!"

The `codeql-workspace.yml` file contains:

provide:
  - '*/codeql-pack.yml'

This makes all `codeql-pack.yml` files available in the workspace.
The generated README.md should include:
- Title and Overview: What the workshop teaches
- Prerequisites: Required knowledge and tools
- Setup Instructions: How to clone, install dependencies, build databases
- Workshop Structure: Overview of exercise progression
- How to Use: Instructions for working through exercises
- Validation: How to test exercise solutions
- Solutions: Where to find reference solutions
- Additional Resources: Links to CodeQL documentation
See example workshop READMEs for templates.
- C/C++: `.c`, `.cpp`, `.h`, `.hpp`
- C#: `.cs`
- Go: `.go`
- Java: `.java`
- JavaScript/TypeScript: `.js`, `.ts`
- Python: `.py`
- Ruby: `.rb`
Language-specific database creation varies:
- C/C++: Requires build command (e.g., `clang -fsyntax-only`)
- Java: Requires build tool (e.g., `mvn clean install`)
- JavaScript: Usually no build command needed
- Python: Usually no build command needed
Adjust build-databases.sh accordingly.
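One way to make that adjustment is to look up the build command per language inside `build-databases.sh`. The sketch below is an assumption about how such a lookup might be written, not something the skill prescribes: `build_command_for` is a hypothetical name, the commands are the examples from the list above, and the language identifiers are the ones passed to `codeql database create --language`.

```bash
# Hypothetical helper: print the build command to pass via --command,
# or print nothing when the language usually needs no build step.
# Usage: build_command_for <language>
build_command_for() {
  case "$1" in
    # The caller appends the source file path for C/C++.
    cpp)               echo "clang -fsyntax-only" ;;
    java)              echo "mvn clean install" ;;
    # Usually no build command needed.
    javascript|python) ;;
    *)                 echo "unknown language: $1" >&2
                       return 1 ;;
  esac
}
```

A script can then add `--command` only when the lookup prints something, for example by testing `[ -n "$(build_command_for "$lang")" ]` before building the argument list.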
Include appropriate CodeQL libraries in codeql-pack.yml:
- C/C++: `codeql/cpp-all`
- C#: `codeql/csharp-all`
- Go: `codeql/go-all`
- Java: `codeql/java-all`
- JavaScript/TypeScript: `codeql/javascript-all`
- Python: `codeql/python-all`
- Ruby: `codeql/ruby-all`
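To show where these library names end up, here is a minimal sketch that prints a `codeql-pack.yml` for a query pack. The function name `print_query_pack`, the example pack name, the `0.0.1` version, and the `"*"` version range are all illustrative assumptions; real workshops may pin versions or add further fields.

```bash
# Hypothetical helper: print a minimal codeql-pack.yml for a query pack.
# Usage: print_query_pack <pack_name> <language>
print_query_pack() {
  # Accept only the language identifiers listed above
  # (TypeScript is covered by "javascript").
  case "$2" in
    cpp|csharp|go|java|javascript|python|ruby) ;;
    *) echo "unsupported language: $2" >&2
       return 1 ;;
  esac
  printf 'name: %s\n' "$1"
  printf 'version: 0.0.1\n'
  printf 'dependencies:\n'
  printf '  codeql/%s-all: "*"\n' "$2"
}
```

For example, `print_query_pack my-org/workshop-solutions cpp > solutions/codeql-pack.yml` would write a pack file that depends on `codeql/cpp-all`.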
When writing Java queries, note these API patterns:
- Primitive Types: No `ByteType` class exists. Use `PrimitiveType` with `.getName() = "byte"`, e.g.: `ace.getType().(Array).getElementType().(PrimitiveType).getName() = "byte"`
- Array Initializers: No `hasInit()` method. Use `exists(ace.getInit())` to check for initializers
- Method Calls: No `MethodAccess` class. Use `MethodCall` for method invocations
- Deduplication: When matching both `ArrayCreationExpr` and `ArrayInit`, exclude `ArrayInit` that are part of `ArrayCreationExpr` to avoid duplicate results: `not exists(ArrayCreationExpr ace | ace.getInit() = ai)`
Before considering the workshop complete:
- All solution queries compile without errors
- All solution tests pass at 100%
- Exercise queries have appropriate scaffolding (not empty, not complete)
- Expected results progress logically from stage to stage
- Test code covers positive, negative, and edge cases
- Graph outputs exist for stages where helpful
- build-databases.sh successfully creates all needed databases
- README.md provides clear setup and usage instructions
- codeql-workspace.yml correctly references all codeql-pack.yml files
- Too many stages: Keep to 4-8 stages max; too many fragments the learning
- Too few stages: 1-2 stages don't provide enough incremental learning
- Uneven difficulty: Each stage should add similar complexity increments
- Missing test cases: Every query behavior should have test coverage
- Incomplete exercises: Exercises should have enough scaffolding to guide students
- Overly complete exercises: Don't give away the solution in exercise code
- Inconsistent test results: Solution tests must pass reliably
This skill can be used with CodeQL queries from any repository. To see example workshops created with this skill, refer to workshop repositories that demonstrate the standard format and structure.
For detailed guidance:
- Workshop Structure Reference - Complete structure specification
- MCP Tools Reference - Tool usage patterns for workshop creation
- Stage Decomposition Examples - Patterns for breaking down queries
- Example C++ Simple - Basic C++ null pointer dereference workshop structure
Some workshops may have optional advanced branches:
├── exercises/
│ ├── Exercise1.ql
│ ├── Exercise2.ql
│ ├── Exercise3.ql
│ ├── Exercise4-basic.ql
│ └── Exercise4-advanced.ql
Consider creating workshops with different focuses from the same source query:
- Path A: Focus on syntactic analysis
- Path B: Focus on data flow
- Path C: Focus on false positive elimination
Add difficulty metadata to exercises:
/**
 * @name Find Array Access
 * @description Identify array expressions
 * @kind problem
 * @difficulty beginner
 * @exercise 1
 */

If solution tests don't pass:
- Run `codeql_test_run` with verbose output
- Compare actual vs expected results
- Verify test database was created correctly
- Check query logic matches intended behavior
- Use `codeql_test_accept` to update `.expected` if needed
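The comparison of actual and expected results can be done with a plain `diff`. This sketch assumes that a failed test leaves an `.actual` file next to its `.expected` file, as the CodeQL CLI test runner does; the function name `show_test_diffs` is hypothetical.

```bash
# Hypothetical helper: for every .actual file under a test directory,
# print a unified diff against the matching .expected file.
# Usage: show_test_diffs <tests_dir>
show_test_diffs() {
  find "$1" -name '*.actual' | sort | while IFS= read -r actual; do
    expected="${actual%.actual}.expected"
    # Header shows the path relative to the tests directory.
    echo "== ${actual#"$1"/}"
    # diff exits 1 when the files differ, which is the expected case here.
    diff -u "$expected" "$actual" || true
  done
}
```

Lines prefixed with `+` are results the query produced but the test did not expect; lines prefixed with `-` are expected results the query missed.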
If students can't complete an exercise:
- Add more scaffolding in the exercise query
- Add more detailed hints in comments
- Consider splitting into two stages
- Provide more example patterns in test code
If generated queries have compilation errors:
- Run `codeql_query_compile` to see specific errors
- Check import statements are correct
- Verify qlpack dependencies are installed
- Ensure predicate signatures are valid
- Start simple: First workshop should be straightforward
- Test frequently: Run tests after creating each stage
- Iterate on stages: Refine stage boundaries based on testing
- Get feedback: Have someone unfamiliar try the workshop
- Document well: Clear instructions reduce support burden
- Version control: Track workshop iterations in git
- Reuse test code: Same test code across all stages when possible
- create-codeql-query-tdd-generic - TDD approach to query development
- create-codeql-query-unit-test-cpp - Creating C++ query tests
- create-codeql-query-unit-test-java - Creating Java query tests
- create-codeql-query-unit-test-javascript - Creating JavaScript query tests
- create-codeql-query-unit-test-python - Creating Python query tests
A successful workshop:
- Completable: Students can finish with provided guidance
- Educational: Each stage teaches a new concept
- Validated: All tests pass reliably
- Practical: Query addresses real-world concerns
- Scalable: Can be delivered to multiple teams