Skip to content

Commit 04ac140

Browse files
authored
docs: improve Copilot code review custom instructions (#36614)
- Expand .github/instructions/cpp.instructions.md with detailed review guidance: memory ownership semantics (Owned<>/Linked<>/queryFoo/ getFoo/setFoo conventions), thread safety (CriticalBlock scope, TOCTOU, deadlock, atomics), security, performance, correctness, API compatibility, HPCC pattern alignment, and component-specific concerns table (Dali, Thor, Roxie, ESP, ECL). - Add .github/instructions/code-review.instructions.md (applyTo: **, excludeAgent: cloud-agent) with universal PR review guidance: review philosophy, PR-level checks (commit title, issue linkage, scope, over-engineering detection, PR size with split criteria, test coverage, documentation), HPCC comment tag taxonomy (blocking and non-blocking), and output format distinguishing inline comments from review summary body. Signed-off-by: Jake Smith <jake.smith@lexisnexisrisk.com> Reviewed-by: Gavin Halliday <gavin.halliday@lexisnexisrisk.com> Merged-by: Gavin Halliday <gavin.halliday@lexisnexisrisk.com>
1 parent a4c2dc8 commit 04ac140

2 files changed

Lines changed: 207 additions & 4 deletions

File tree

Lines changed: 162 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,162 @@
1+
---
2+
applyTo: "**"
3+
excludeAgent: "cloud-agent"
4+
---
5+
6+
# Code Review Instructions
7+
8+
These instructions apply when Copilot is reviewing a pull request. They govern
9+
how to communicate findings. Language-specific checks are in the relevant
10+
`*.instructions.md` files (e.g. `cpp.instructions.md`).
11+
12+
## Review Philosophy
13+
- Post inline comments anchored to the specific line(s) where the issue exists.
14+
- Where the fix is clear and self-contained, include an inline suggested change.
15+
- Be concise and actionable — 1–2 sentences per comment maximum.
16+
- Context matters: distinguish hot paths (correctness and performance critical)
17+
from cold paths (readability may take priority).
18+
- Strictness should match the sensitivity of the code: core libraries and
19+
security-sensitive paths demand more rigour than developer-only tooling.
20+
- If the concurrency model, performance requirements, or architectural intent are
21+
unclear, raise it as `discuss:` rather than assuming and flagging a false issue.
22+
- Do not extend the scope of the original change. If an improvement is valid but
23+
out of scope, tag it `future:` rather than blocking the PR.
24+
- The goal is to catch bugs, design problems, and security issues — not to rewrite
25+
code in a preferred style.
26+
27+
## PR-Level Checks (before line review)
28+
29+
### Commit Title & Target Branch
30+
- Is the commit title understandable as a standalone changelog entry?
31+
- Is the target branch correct for the type of change?
32+
- Does the PR description explain *what* was changed and *why*?
33+
34+
### Issue Linkage & Scope
35+
- If the PR links to an issue, verify the changes actually solve the described problem.
36+
- Does the PR stay within the scope of the linked issue? Use `scope:` if it includes
37+
unrelated features, fixes, or refactoring.
38+
- Is the implementation proportional to the problem? Flag over-engineering as `design:`
39+
if custom infrastructure was built where existing HPCC or stdlib primitives would
40+
suffice. Ask: could this be done in a fraction of the lines?
41+
- Watch for wheel-reinvention of existing HPCC primitives — e.g. custom barrier or
42+
thread-pool implementations where `Semaphore`, `IThreadPool`, or `CThreaded` already exist.
43+
- If the PR is over-engineered, post that `design:` verdict as the **first and most
44+
prominent comment**, before any line-level findings. Line-level issues on code that
45+
should be rewritten are noise until the design question is resolved.
46+
47+
### PR Size
48+
- **< 200 lines**: Generally fine.
49+
- **200–500 lines**: Acceptable if cohesive; watch for unrelated changes.
50+
- **500–1000 lines**: Requires justification; consider recommending a split.
51+
- **> 1000 lines**: Almost always should be split. Acceptable exceptions:
52+
- Mechanical refactoring (symbol rename across codebase)
53+
- Auto-generated code updates (protobuf, grammar files)
54+
- Dependency version bumps with cascade changes
55+
- A large new feature that genuinely cannot be decomposed (rare; usually itself a `design:` concern)
56+
- Test code is not exempt — a test requiring hundreds of lines of custom infrastructure
57+
is itself a `design:` problem.
58+
59+
Beyond line count, evaluate whether the PR should be split based on:
60+
- **File spread**: Are changes scattered across multiple unrelated components
61+
(e.g. `dali/`, `thorlcr/`, and `esp/` together may indicate 3 separate PRs)?
62+
- **Change type mixing**: Does the PR bundle a bug fix, a new feature, and
63+
refactoring together? Are cosmetic/style changes mixed with functional ones?
64+
- **Component boundaries**: Could each component's changes stand alone and be
65+
merged independently without breaking functionality?
66+
- **Dependency analysis**: Are all changed files truly dependent on each other,
67+
or could some be sequenced as PR1 → PR2 → PR3?
68+
- **Review complexity**: Would a reviewer need different expertise for different
69+
parts (e.g. security vs performance vs UI)? If so, splitting improves review
70+
quality and safety.
71+
72+
Use `scope:` to recommend a split, naming the suggested sub-PRs and their files. When
73+
in doubt — if splitting is possible and would improve reviewability — recommend it.
74+
75+
### Test Coverage
76+
- **New features**: should include unit or ECL tests demonstrating the functionality.
77+
- **Bug fixes**: should include regression tests proving the bug is fixed.
78+
- **Refactoring**: should maintain or improve existing test coverage.
79+
- **Performance improvements**: should include benchmarks or profiling evidence where practical.
80+
- **API changes**: should include tests for new interfaces and edge cases.
81+
82+
Test types to consider: unit tests (C++, for library/algorithm code), ECL tests (for language
83+
features and runtime behaviour), integration tests (cross-component, e.g. Dali + Thor),
84+
regression tests (for bug fixes). Look for changes in `testing/`, `**/test/`, or files
85+
matching `*test*.cpp` / `*test*.ecl`.
86+
87+
If testable changes exist but no test files are touched, use `discuss:` or `future:`.
88+
89+
Omitting tests is acceptable for: trivial changes (typos, comments, whitespace), pure
90+
refactoring with no observable behaviour change, changes already covered by comprehensive
91+
existing tests, and documentation-only changes.
92+
93+
### Documentation
94+
Consider whether the change affects documentation in any of these areas:
95+
96+
**User-facing behaviour** — use `documentation:` to flag when a follow-up doc PR is needed:
97+
- New ECL language features or functions → user documentation required
98+
- New command-line options or configuration keys → documentation update required
99+
- New or changed APIs → API documentation required
100+
- Behaviour changes in existing features → release notes and user docs
101+
102+
**Developer/operator concerns:**
103+
- New deployment options → installation/deployment docs
104+
- Configuration changes → admin guides
105+
- Performance characteristic changes → performance tuning docs
106+
107+
**Scope of documentation in this PR:**
108+
- In-scope: code comments, inline documentation explaining *why*
109+
- Separate PR recommended: user guides, admin manuals, website content (`devdoc/` or external repos)
110+
111+
## Comment Tags
112+
113+
Prefix every review comment with one of the following tags.
114+
115+
### Blocking tags (must be addressed before merge)
116+
117+
| Tag | Meaning |
118+
|-----------------|----------------------------------------------------------------------|
119+
| `design:` | Architectural or design problem affecting functionality or extensibility |
120+
| `scope:` | PR scope does not match the associated issue; consider splitting |
121+
| `function:` | Incorrect or unexpected functionality |
122+
| `security:` | Introduces a security vulnerability |
123+
| `bug:` | Coding issue causing incorrect behaviour, crash, or data corruption |
124+
125+
### Non-blocking tags
126+
127+
| Tag | Meaning |
128+
|-------------------|--------------------------------------------------------------------|
129+
| `efficiency:` | Works correctly but has scaling or performance concerns |
130+
| `discuss:` | Potential issue requiring clarification before action |
131+
| `style:` | Non-conforming code style |
132+
| `indent:` | Indentation issue |
133+
| `format:` | Unusual formatting |
134+
| `typo:` | Minor typing error |
135+
| `minor:` | Minor improvement, unlikely to block merge |
136+
| `picky:` | Very minor, barely worth noting; developer discretion |
137+
| `future:` | Valid improvement but out of scope for this PR |
138+
| `question:` | Reviewer needs clarification before deciding severity |
139+
| `note:` | Information the contributor may not be aware of |
140+
| `documentation:` | Change may affect user-facing documentation; follow-up PR needed |
141+
| `personal:` | Suggestion based on reviewer preference, not required |
142+
143+
## Output Format
144+
145+
There are two distinct places where findings appear:
146+
147+
**Inline comments** — anchor to the specific diff line where the issue exists. Use for
148+
all findings that relate to a specific line or block of code. Format: one concise line
149+
`tag: Problem. Suggested fix.` Include a suggested change block when the fix is clear
150+
and self-contained.
151+
152+
**Review summary body** — use for PR-level findings that are not tied to a specific line:
153+
commit title issues, missing tests, PR size/scope concerns, over-engineering verdict,
154+
and documentation gaps. Open the summary with any blocking `design:` or `scope:` verdict
155+
so the author sees it first. Group the summary by severity:
156+
157+
**🔴 Critical** — correctness, data races, security vulnerabilities, crashes
158+
**🟡 Important** — performance, contention, complexity, API compatibility
159+
**🟢 Suggestions** — readability, style, minor optimisations
160+
161+
Close with a 1–2 sentence summary of overall code health and the primary area to
162+
focus on.

.github/instructions/cpp.instructions.md

Lines changed: 45 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -11,20 +11,31 @@ applyTo:
1111

1212
## Code Style (Essential Rules)
1313
- Use Allman brace style, but allow single-line blocks to have no braces, unless nested.
14-
- Use camelCase for variable names.
14+
- Indent with 4 spaces; no tabs.
15+
- `CamelCase` for class names; `camelCase` for variables, functions, and methods.
1516
- Use `constexpr` over macros.
1617
- Use `#pragma once` for header guards.
1718
- Use `Owned<>` vs `Linked<>` for object ownership.
1819
- Avoid default parameters (use method overloading instead).
1920
- Use `%u` for unsigned integers, `%d` for signed integers.
21+
- Mark all virtual function declarations with `virtual`; mark all derived overrides with `override`.
2022
- Complete style guide: `devdoc/StyleGuide.md`.
2123

2224
## Memory Management and Pointers
23-
- Verify proper `Owned<>`/`Linked<>` usage.
24-
- Check for memory/resource leaks in general.
25+
- `Owned<X>` takes ownership of a new/returned pointer; `Linked<X>` shares ownership. Never mix them up.
26+
- **`queryFoo()`** returns are NOT linked — the caller must call `Link()` explicitly to retain the pointer beyond its guaranteed lifetime.
27+
- **`getFoo()`** returns ARE linked — assign to `Owned<>` or return directly; do not call `Link()` again.
28+
- Are `CInterface`-derived objects properly release-counted? Check `beforeDispose()` usage.
29+
- Verify resources are released on all code paths, including exception and early-return paths.
2530

2631
## Thread Safety
27-
- Look for race conditions, proper locking, and synchronization issues, especially in server components.
32+
- Shared mutable state must be protected by a `CriticalSection`, `ReadWriteLock`, or an atomic.
33+
- `CriticalBlock`, `ReadLockBlock`, and `WriteLockBlock` must be correctly scoped — the lock must be released before any I/O, long operations, or calls that could re-acquire the same lock.
34+
- Minimize critical section scope: avoid holding locks across blocking calls, I/O, or memory allocations.
35+
- Check for TOCTOU (time-of-check-time-of-use) patterns where state can change between a check and its use.
36+
- Verify lock ordering is consistent across all code paths to prevent deadlocks.
37+
- Avoid misuse of `volatile`; prefer `std::atomic` with explicit memory ordering.
38+
- Look for race conditions and unnecessary contention on shared resources.
2839

2940
## HPCC Interface Architecture (Java-Style Macros)
3041
- Use standard HPCC macros for inheritance to denote intent:
@@ -40,3 +51,33 @@ applyTo:
4051

4152
## Exception Handling
4253
- Throw and catch HPCC native exceptions utilizing `IException`. Use `ThrowStringException(...)` or `MakeStringException(...)` formats instead of throwing raw `std::runtime_error` or `std::exception`.
54+
- Exceptions must be caught at the correct level and must not leak resources.
55+
56+
## Security
57+
- Validate all user-supplied inputs; bound all buffer and string lengths.
58+
- Authorization must fail-safe — deny by default; explicitly grant access.
59+
- Do not log, expose in URLs, or include in error messages any secrets, credentials, or sensitive data.
60+
- Guard against SQL/command injection in any code that constructs queries or shell commands from input.
61+
- Check for buffer overflow potential and denial-of-service vectors (unbounded loops, allocations proportional to input size).
62+
63+
## Performance
64+
- Prefer `StringBuffer` over repeated `std::string` concatenation in hot paths.
65+
- Avoid unnecessary copies of large objects; use references, `std::move`, or `Owned<>`/`Linked<>` transfer.
66+
- Choose containers appropriate to the access pattern (e.g., avoid linear scan of large collections).
67+
- Flag algorithmic complexity issues (e.g., O(n²) where O(n log n) is achievable).
68+
- Distinguish hot paths (latency/throughput critical, worth optimising) from cold paths (readability may take priority).
69+
70+
## Correctness & Edge Cases
71+
- Check for null pointer dereferences, empty input handling, integer overflow, and off-by-one errors.
72+
- Error reporting must be consistent; return codes and error states must be checked.
73+
- Complex or counter-intuitive code must have a comment explaining *why*, not just *what*.
74+
75+
## API & Backward Compatibility
76+
- Changes must not break existing callers or ECL code that depends on current behaviour.
77+
- Data formats and serialized structures must be compatible across platform versions and in mixed-version environments.
78+
- Minimize the surface area of public interface changes; document any that are unavoidable.
79+
80+
## HPCC Pattern Alignment
81+
- Prefer `jlib`/`system` helpers over custom reimplementations of the same functionality.
82+
- Use `OwnedRoxieString`, `StringBuffer`, `CriticalSection`, `Owned<>`, `Linked<>` idiomatically.
83+
- Flag new code that duplicates functionality already available in the HPCC codebase.

0 commit comments

Comments
 (0)