Skip to content

Commit e2ff864

Browse files
committed
Streamline ppl-bugfix harness and CLAUDE.md after eval
Harness: 369→151 lines. Move fix paths, test templates, symptom table, and case index to ppl-bugfix-reference.md (loaded on demand). Add completion gate with 8 mandatory deliverables. Add guardrails for runaway agents. Merge overlapping checklists. CLAUDE.md: trim build commands, remove v2 engine references, simplify ppl-bugfix section. Command: add bypassPermissions allow-list note. Signed-off-by: Heng Qian <qianheng@amazon.com>
1 parent 42caa29 commit e2ff864

4 files changed

Lines changed: 224 additions & 284 deletions

File tree

.claude/commands/ppl-bugfix.md

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -18,6 +18,8 @@ Optional mode flag (append to any of the above):
1818
- `--safe``acceptEdits` mode. Auto-approve file edits only, Bash commands require manual approval. (Most conservative)
1919
- `--yolo``bypassPermissions` mode. Fully trusted, no prompts. Subagent runs in an isolated worktree so this is safe. (Default)
2020

21+
> **Note**: `bypassPermissions` skips the interactive prompt but still respects the allow-list in `~/.claude/settings.json`. Ensure git/gh write commands are in the global allow-list.
22+
2123
Examples:
2224
- `/ppl-bugfix #1234` — single issue, defaults to yolo
2325
- `/ppl-bugfix #1234 #5678 --yolo` — two issues in parallel
Lines changed: 60 additions & 240 deletions
Original file line numberDiff line numberDiff line change
@@ -1,253 +1,95 @@
11
# PPL Bugfix Harness
22

3-
---
4-
5-
## Phase 0: Triage & Classification
3+
## Phase 0: Triage
64

7-
### 0.0 Load Issue Context
5+
### 0.1 Load & Reproduce
86

97
```bash
108
gh issue view <issue_number> --repo opensearch-project/sql
119
```
1210

13-
Read the issue title, description, and any reproduction steps before proceeding.
14-
15-
### 0.1 Reproduce the Bug
16-
17-
```bash
18-
# Write a failing integration test, or run an existing one
19-
./gradlew :integ-test:integTest -Dtests.class="*<TestClass>" -Dtests.method="<testMethod>"
20-
```
11+
Write a failing test or run an existing one to reproduce the bug on `main`.
2112

22-
**If the bug does not reproduce on latest `main`** — meaning the test scenario runs but produces correct results instead of the reported error — the bug is already fixed. This is distinct from "can't set up the scenario" (which is an infra issue).
13+
If the bug **does not reproduce** (correct results, not infra failure):
2314

2415
| Finding | Action |
2516
|---------|--------|
26-
| Already fixed by another commit | `gh issue comment` with fixing commit/PR, then `gh issue close` |
27-
| Only on older version | `gh issue comment` with version where it's fixed, then `gh issue close` |
28-
| Intermittent / environment-specific | Label `flaky` or `needs-info`, do NOT close |
29-
| Insufficient info to reproduce | Comment asking for repro steps, label `needs-info` |
17+
| Already fixed | `gh issue comment` + `gh issue close` |
18+
| Older version only | `gh issue comment` + `gh issue close` |
19+
| Intermittent | Label `flaky` or `needs-info`, do NOT close |
20+
| Can't reproduce | Comment asking for repro steps, label `needs-info` |
3021

31-
**HARD STOP**: Do NOT proceed to Phase 1, 2, or 3. Do NOT write tests. Do NOT create a PR. Report back with the finding and the action taken (comment/close). Your job is done.
22+
**HARD STOP** — do not proceed. Report back.
3223

33-
### 0.2 Classify and Route
24+
### 0.2 Classify
3425

35-
Identify the bug layer, then use the routing table to determine the fix path and required tests:
26+
Identify the bug layer (Grammar, AST/Functions, Type System, Optimizer, Execution, DI/Resource) and record it. Consult `.claude/harness/ppl-bugfix-reference.md` for fix-path-specific guidance if needed.
3627

37-
| Layer | Typical Symptoms | Fix Path | Required Tests |
38-
|-------|-----------------|----------|---------------|
39-
| **Grammar/Parser** | `SyntaxCheckException`, unrecognized syntax | Path A | AstBuilderTest |
40-
| **AST/Functions** | Parses OK but AST wrong, function output wrong | Path B | CalcitePPL\*Test + IT + YAML |
41-
| **Semantic Analysis** | `SemanticCheckException`, type mismatch | Path C | Dedicated UT + CalcitePPLIT + YAML |
42-
| **Type System** | Field type lost, implicit conversion errors | Path C | Dedicated UT + CalcitePPLIT + YAML |
43-
| **Optimizer** | Plan bloat, predicate pushdown failure | Path D | CalcitePPL\*Test + CalcitePPLIT + NoPushdownIT + YAML |
44-
| **Execution** | Wrong results, OOM, memory leaks | Path E | IT + YAML |
45-
| **DI / Resource** | NPE, extensions not loaded, long-running OOM | Path E | IT + YAML |
28+
### 0.3 Guardrails
4629

47-
Record before proceeding:
30+
Stop and report back if:
31+
- Root cause unclear after reading 15+ source files
32+
- Fix breaks 5+ unrelated tests
33+
- Same build error 3 times in a row
4834

49-
```
50-
Bug Layer: <from table above>
51-
Fix Path: <A / B / C / D / E>
52-
Issue: #XXXX
53-
```
54-
55-
### 0.3 Execution Flow
56-
57-
Phases interleave TDD-style — write a failing test first, then fix, then complete test coverage:
35+
### 0.4 Execution Flow
5836

5937
```
60-
Phase 0: Triage → Classify → Route
61-
→ Phase 2 (partial): Write FAILING test reproducing the bug
62-
→ Phase 1: Implement fix via routed Path
63-
→ Phase 2 (remaining): Happy path, edge cases, YAML REST test
64-
→ Phase 3: Verify → Commit → PR → Decision Log
38+
Triage → Write FAILING test → Fix → Remaining tests → Verify → Commit → PR → Decision Log → Completion Gate
6539
```
6640

6741
---
6842

69-
## Phase 1: Fix Implementation
70-
71-
### Path A — Grammar / Parser
72-
73-
1. **Update grammar files** (must stay in sync):
74-
- `language-grammar/src/main/antlr4/OpenSearchPPLParser.g4` (primary)
75-
- `ppl/src/main/antlr/OpenSearchPPLParser.g4`
76-
- `async-query-core/src/main/antlr/OpenSearchPPLParser.g4` (if applicable)
77-
2. **Regenerate**: `./gradlew generateGrammarSource`
78-
3. **Update AstBuilder**: `ppl/.../parser/AstBuilder.java` — modify `visit*()` to match new rule
79-
4. **Test**: `AstBuilderTest` using `assertEqual(pplQuery, expectedAstNode)` pattern
80-
81-
> Ref: `734394d` — fixed `renameClasue``renameClause` across 3 grammars + AstBuilder
82-
83-
### Path B — AST / Function Implementation
84-
85-
1. **Locate**: AST nodes in `core/.../ast/tree/`, functions in `core/.../expression/function/` or `PPLBuiltinOperators`
86-
2. **Fix**: Watch Visitor pattern — changes may need syncing to `AbstractNodeVisitor`, `Analyzer`, `CalciteRelNodeVisitor`, `PPLQueryDataAnonymizer`
87-
3. **Test**: `verifyLogical()`, `verifyPPLToSparkSQL()`, `verifyResult()`
88-
89-
> Ref: `26674f9` — rex nested capture group fix, ordinal index → named group extraction
90-
91-
### Path C — Type System / Semantic Analysis
92-
93-
1. **Locate**: `OpenSearchTypeFactory.java` (Calcite type factory), `Analyzer.java` / `ExpressionAnalyzer.java`
94-
2. **Fix**: Preserve nullable semantics when overriding Calcite methods; protect UDT from `leastRestrictive()` downgrade
95-
3. **Test (critical)**: Cover type preservation, nullable propagation, fallback to parent, mixed types — every edge case
96-
97-
> Ref: `ada2e34` — UNION lost timestamp UDT, fixed `leastRestrictive()`, added 8 UTs + 4 ITs
98-
99-
### Path D — Optimizer / Predicate Pushdown
100-
101-
1. **Locate**: `PredicateAnalyzer.java`, `LogicalPlanOptimizer`, `QueryService.java`
102-
2. **Fix**: Watch `nullAs` semantics (TRUE/FALSE/UNKNOWN); for plan bloat consider Calcite rules like `FilterMergeRule`
103-
3. **Verify**: `EXPLAIN` output comparison + integration test result correctness
43+
## Phase 1: Fix
10444

105-
> Ref: `b4df010``isnotnull()` not pushed down with multiple `!=`; `e045d15` — multi-filter OOM, inserted `FilterMergeRule`
106-
107-
### Path E — Execution / Resource Management
108-
109-
1. **Locate**: DQE operators in `dqe/`, `OpenSearchExecutionEngine.java`, `SQLPlugin.java`, `OpenSearchPluginModule.java`
110-
2. **Common patterns**:
111-
112-
| Problem | Fix | Example |
113-
|---------|-----|---------|
114-
| Cache key collision | `IndexReader.CacheHelper.getKey()` | `97d5d26` |
115-
| Memory leak (no eviction) | Close listener + upper bound | `97d5d26` |
116-
| Unbounded growth | `MAX_CAPACITY` check, throw with user guidance | `f024b4f` |
117-
| Non-singleton repeated registration | `@Singleton`; `put` instead of `compute/append` | `90393bf` |
118-
| DI not injected | Holder class → Guice → constructor injection | `f6be830` |
119-
120-
> Ref: `f024b4f``LongOpenHashSet` capacity 1024→8 (8KB→64B per group), 8M cap on 5 HashMap variants
45+
Find and fix the root cause. Consult `.claude/harness/ppl-bugfix-reference.md` for path-specific patterns and examples.
12146

12247
---
12348

124-
## Phase 2: Writing Tests
125-
126-
### 2.1 Test Templates
127-
128-
**Unit test** (extend `CalcitePPLAbstractTest`):
129-
```java
130-
public class CalcitePPLYourFixTest extends CalcitePPLAbstractTest {
131-
public CalcitePPLYourFixTest() {
132-
super(CalciteAssert.SchemaSpec.SCOTT_WITH_TEMPORAL);
133-
}
134-
135-
@Before
136-
public void init() {
137-
doReturn(true).when(settings)
138-
.getSettingValue(Settings.Key.CALCITE_ENGINE_ENABLED);
139-
}
140-
141-
@Test
142-
public void testBugScenario() {
143-
verifyLogical("source=EMP | where SAL > 1000",
144-
"LogicalFilter(condition=[>($5, 1000)])\n"
145-
+ " LogicalTableScan(table=[[scott, EMP]])\n");
146-
}
147-
}
148-
```
149-
150-
**Integration test** (extend `CalcitePPLIT`):
151-
```java
152-
public class CalcitePPLYourFixIT extends CalcitePPLIT {
153-
@Override
154-
public void init() throws IOException {
155-
super.init();
156-
enableCalcite();
157-
}
158-
159-
@Test
160-
public void testBugFixEndToEnd() throws IOException {
161-
JSONObject result = executeQuery("source=<index> | <your PPL>");
162-
verifySchema(result, schema("field", "alias", "type"));
163-
verifyDataRows(result, rows("expected_value_1"), rows("expected_value_2"));
164-
}
165-
}
166-
```
167-
168-
**YAML REST test** — place at `integ-test/src/yamlRestTest/resources/rest-api-spec/test/issues/<ISSUE>.yml`, auto-discovered by `RestHandlerClientYamlTestSuiteIT`:
169-
```yaml
170-
setup:
171-
- do:
172-
indices.create:
173-
index: test_issue_<ISSUE>
174-
body:
175-
settings: { number_of_shards: 1, number_of_replicas: 0 }
176-
mappings: { properties: { <field>: { type: <type> } } }
177-
- do:
178-
query.settings:
179-
body: { transient: { plugins.calcite.enabled: true } }
180-
---
181-
teardown:
182-
- do:
183-
query.settings:
184-
body: { transient: { plugins.calcite.enabled: false } }
185-
---
186-
"<Bug description for issue #ISSUE>":
187-
- skip: { features: [headers, allowed_warnings] }
188-
- do:
189-
bulk: { index: test_issue_<ISSUE>, refresh: true, body: ['{"index": {}}', '{"<field>": "<value>"}'] }
190-
- do:
191-
headers: { Content-Type: 'application/json' }
192-
ppl: { body: { query: "source=test_issue_<ISSUE> | <your PPL>" } }
193-
- match: { total: <expected> }
194-
- length: { datarows: <expected> }
195-
```
49+
## Phase 2: Tests
19650

197-
### 2.2 Test Checklist
51+
Consult `.claude/harness/ppl-bugfix-reference.md` for test templates.
19852

199-
- [ ] Failing test reproducing the original bug (write FIRST, before fixing)
200-
- [ ] Happy path after fix
201-
- [ ] Edge cases (null, empty, extreme volumes)
202-
- [ ] YAML REST test (named by issue number)
203-
- [ ] Optimizer bugs: both pushdown enabled and disabled
204-
- [ ] Type system bugs: nullable / non-nullable combinations
205-
- [ ] New AST nodes: update `PPLQueryDataAnonymizerTest`
53+
Required deliverables:
54+
- Failing test reproducing the bug (written BEFORE the fix)
55+
- Unit tests covering happy path and edge cases
56+
- Integration test (`*IT.java` extending `CalcitePPLIT`)
57+
- YAML REST test at `integ-test/src/yamlRestTest/resources/rest-api-spec/test/issues/<ISSUE>.yml`
20658

20759
---
20860

209-
## Phase 3: Verification & Submission
61+
## Phase 3: Verify & Submit
21062

211-
### 3.1 Local Verification
63+
### 3.1 Verify
21264

21365
```bash
214-
./gradlew spotlessApply # 1. Format
215-
./gradlew :<module>:test --tests "<TestClass>" # 2. Affected module UT
216-
./gradlew test # 3. Full UT regression
217-
./gradlew :integ-test:integTest -Dtests.class="*<YourIT>" # 4. New IT
218-
./gradlew :integ-test:integTest # 5. Full IT regression
219-
./gradlew :integ-test:yamlRestTest # 6. YAML REST tests
220-
# If grammar modified:
221-
./gradlew generateGrammarSource && ./gradlew :ppl:test # 7. Parser tests
66+
./gradlew spotlessApply
67+
./gradlew :<module>:test --tests "<TestClass>"
68+
./gradlew test
69+
./gradlew :integ-test:integTest -Dtests.class="*<YourIT>"
22270
```
22371

224-
### 3.2 Commit & Push
72+
Run `./gradlew :integ-test:yamlRestTest` if YAML tests were added. Run `./gradlew generateGrammarSource && ./gradlew :ppl:test` if grammar was modified.
73+
74+
### 3.2 Commit & PR
22575

22676
```bash
227-
# Commit with DCO sign-off — do NOT add Co-Authored-By lines
22877
git add <changed_files>
22978
git commit -s -m "[BugFix] Fix <description> (#<issue_number>)"
230-
231-
# Sync with main (merge, not rebase — PRs use squash-merge)
23279
git fetch origin && git merge origin/main
233-
# Always re-run tests after merge — even a trivial merge can introduce regressions
23480
./gradlew test && ./gradlew :integ-test:integTest -Dtests.class="*<YourIT>"
23581

236-
# Resolve fork remote — find or add it
237-
# 1. Check if a fork remote already exists (not opensearch-project/sql)
238-
git remote -v
239-
# 2. If only origin (upstream) exists, add the fork:
82+
# Resolve fork remote (check git remote -v; add if missing)
24083
git remote add fork https://github.com/<fork_owner>/sql.git
241-
# Use the git user name to infer the fork owner, or fall back to "qianheng-aws"
242-
243-
# Push
24484
git push -u fork <branch_name>
24585
```
24686

247-
### 3.3 Create Draft PR
87+
Do NOT add Co-Authored-By lines. Use the git user name to infer the fork owner, or fall back to "qianheng-aws".
24888

24989
```bash
250-
gh pr create --draft --repo opensearch-project/sql --title "[BugFix] Fix <description> (#<issue_number>)" --body "$(cat <<'EOF'
90+
gh pr create --draft --repo opensearch-project/sql \
91+
--title "[BugFix] Fix <description> (#<issue_number>)" \
92+
--body "$(cat <<'EOF'
25193
### Description
25294
<Brief description of fix and root cause>
25395
@@ -256,19 +98,17 @@ Resolves #<issue_number>
25698
25799
### Check List
258100
- [x] New functionality includes testing
259-
- [x] Javadoc added for new public API
260101
- [x] Commits signed per DCO (`-s`)
261102
- [x] `spotlessCheck` passed
262103
- [x] Unit tests passed
263104
- [x] Integration tests passed
264-
- [ ] Grammar changes: all three `.g4` files synced (if applicable)
265105
EOF
266106
)"
267107
```
268108

269-
### 3.4 Persist Decision Context
109+
### 3.3 Decision Log
270110

271-
Before the subagent completes, persist the *why* behind key decisions — the PR diff only shows the *what*. The **PR comment is the single source of truth** — it survives across sessions, GA runs, and is visible to both human reviewers and follow-up agents.
111+
Post as a PR comment:
272112

273113
```bash
274114
gh pr comment <pr_number> --body "$(cat <<'EOF'
@@ -284,48 +124,28 @@ EOF
284124

285125
---
286126

287-
## Phase 4: Retrospective
127+
## Completion Gate
288128

289-
After each bugfix, check if the harness itself needs updating:
129+
Do NOT report "done" until every item below is checked. List each in your final report:
290130

291-
- **Template wrong** (API name, assertion field)? → Fix in Phase 2 templates
292-
- **New bug pattern** not covered? → Add fix path in Phase 1 or symptom in Quick Reference
293-
- **Verification gap** caused rework? → Add step to 3.1 or 2.2 checklist
294-
- **Representative fix**? → Add row to Case Index below
131+
- [ ] **Unit tests**: New test class or methods
132+
- [ ] **Integration test**: New `*IT.java` test
133+
- [ ] **YAML REST test**: `issues/<ISSUE>.yml`
134+
- [ ] **spotlessApply**: Ran successfully
135+
- [ ] **Tests pass**: Affected modules
136+
- [ ] **Commit**: DCO sign-off, `[BugFix]` prefix, no Co-Authored-By
137+
- [ ] **Draft PR**: `--draft`, body contains `Resolves #<issue>`
138+
- [ ] **Decision Log**: PR comment posted
295139

296-
Include harness improvements in the same PR as the bugfix — they share the same review context.
140+
If any item is blocked, report which and why.
297141

298142
---
299143

300-
## Quick Reference: Symptom → Fix Path
301-
302-
```
303-
SyntaxCheckException / unrecognized syntax → Path A: Grammar/Parser
304-
SemanticCheckException / type mismatch → Path C: Type System / Analysis
305-
Field type wrong (timestamp→string) → Path C: check leastRestrictive / coercion
306-
EXPLAIN shows predicate not pushed down → Path D: Optimizer / Pushdown
307-
Multi-condition query: missing/extra rows → Path D: PredicateAnalyzer nullAs handling
308-
OOM / memory growth over time → Path E: singletons, cache eviction, bounds
309-
NPE in Transport layer → Path E: DI / Guice injection chain
310-
"node must be boolean/number, found XXX" → Path B: OpenSearchJsonContent parse*Value
311-
Regex/function extraction offset → Path B: ordinal vs named references
312-
```
144+
## Phase 4: Retrospective
313145

314-
---
146+
- [ ] Symptom in Quick Reference? Add if missing.
147+
- [ ] Classification correct? Fix routing if misleading.
148+
- [ ] Test template worked as-is? Fix if broken.
149+
- [ ] New pattern? Add to Case Index.
315150

316-
## Appendix: Case Index
317-
318-
| Commit | Bug | Layer | Tests |
319-
|--------|-----|-------|-------|
320-
| `ada2e34` | UNION loses UDT type | Type System | 8 UT + 4 IT |
321-
| `26674f9` | rex capture group index shift | AST/Functions | Multiple UTs |
322-
| `b4df010` | isnotnull not pushed down with != | Optimizer | 2 UT + IT |
323-
| `e045d15` | Multiple filters OOM | Optimizer | 26 output updates |
324-
| `f024b4f` | High-cardinality GROUP BY OOM | Execution | Benchmark |
325-
| `97d5d26` | OrdinalMap cache collision + leak | Execution ||
326-
| `90393bf` | Non-singleton ExecutionEngine leak | Resource ||
327-
| `f6be830` | Transport extensions not injected | DI ||
328-
| `734394d` | Grammar rule typo | Grammar ||
329-
| `246ed0d` | Float precision flaky test | Test Infra ||
330-
| `d56b8fa` | Wildcard index type conflict | Value Parsing | 3 UT + 1 IT + 1 YAML |
331-
| `5a78b78` | Boolean coercion from numeric in wildcard queries | Value Parsing | 3 UT + 1 IT + 1 YAML |
151+
Include harness improvements in the same PR.

0 commit comments

Comments
 (0)