Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
37 changes: 37 additions & 0 deletions .claude/commands/ppl-bugfix.md
Original file line number Diff line number Diff line change
Expand Up @@ -83,6 +83,33 @@ Agent(

### 2B: Follow-up

Before dispatching, check if an existing worktree already has the PR branch checked out:

```bash
# List worktrees and find one on the PR branch
for wt in .claude/worktrees/agent-*/; do
branch=$(git -C "$wt" branch --show-current 2>/dev/null)
if [ "$branch" = "<pr_branch>" ]; then
echo "REUSE: $wt (branch: $branch)"
fi
done
```

**If existing worktree found**: Do NOT use `isolation: "worktree"`. Pass the worktree path in the prompt so the subagent works there directly.

```
Agent(
mode: "<resolved_mode>",
name: "bugfix-<issue_number>",
description: "PPL bugfix #<issue_number> followup",
prompt: "cd <worktree_path> first, then read .claude/harness/ppl-bugfix-followup.md and follow it.
PR: <pr_number> (<pr_url>), Issue: #<issue_number>
Working directory: <worktree_path>"
)
```

**If no existing worktree**: Create a new one.

```
Agent(
mode: "<resolved_mode>",
Expand All @@ -100,6 +127,16 @@ After all subagents complete, report a summary for each:
- Classification, fix summary, PR URL, worktree path and branch, items needing human attention (2A)
- What was addressed, current PR state, whether another round is needed (2B)

**Always include the worktree→PR mapping** from the subagent's output, e.g.:

```
Worktree: /path/to/.claude/worktrees/agent-xxxx
Branch: bugfix-1234
PR: #5678
```

**Important**: After reporting, the main agent must remember this mapping. When the user later asks to make changes to the PR (e.g., "commit this to PR #5678"), operate in the worktree directory — not the main session directory.

## Subagent Lifecycle

Subagents are task-scoped. They complete and release context — they cannot poll for events.
Expand Down
24 changes: 23 additions & 1 deletion .claude/harness/ppl-bugfix-followup.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,9 +6,18 @@

---

## Report Working Directory

```bash
echo "Worktree: $(pwd)"
echo "Branch: $(git branch --show-current)"
```

Include this in your output so the caller knows where changes are happening.

## Reconstruct Context

The follow-up agent runs in a fresh worktree. First checkout the PR branch, then load state:
First checkout the PR branch, then load state:

```bash
# Checkout the PR branch in this worktree
Expand Down Expand Up @@ -101,3 +110,16 @@ For each comment addressed (bot or human):
- **Did the follow-up workflow itself miss this signal?** → Update this file

If any improvement is needed, make the edit and include it in the same commit.

## Completion Gate

Before reporting "done":

1. Run `git status --porcelain` — if any uncommitted changes remain, commit and push them. This includes harness edits from Retrospective.
2. Report in your final output:

```
Worktree: <absolute path>
Branch: <branch name>
PR: <pr_number>
```
20 changes: 19 additions & 1 deletion .claude/harness/ppl-bugfix-harness.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,15 @@

## Phase 0: Triage

### 0.0 Report Working Directory

```bash
echo "Worktree: $(pwd)"
echo "Branch: $(git branch --show-current)"
```

Include this in your output so the caller knows where changes are happening.

### 0.1 Load & Reproduce

```bash
Expand Down Expand Up @@ -53,7 +62,7 @@ Consult `.claude/harness/ppl-bugfix-reference.md` for test templates.
Required deliverables:
- Failing test reproducing the bug (written BEFORE the fix)
- Unit tests covering happy path and edge cases
- Integration test (`*IT.java` extending `CalcitePPLIT`)
- Integration test — add to an existing `*IT.java` when possible; if creating a new one, add it to `CalciteNoPushdownIT`
- YAML REST test at `integ-test/src/yamlRestTest/resources/rest-api-spec/test/issues/<ISSUE>.yml`

---
Expand Down Expand Up @@ -126,6 +135,8 @@ EOF

## Completion Gate

Run `git status --porcelain` — if any uncommitted changes remain, commit and push them before proceeding.

Do NOT report "done" until every item below is checked. List each in your final report:

- [ ] **Unit tests**: New test class or methods
Expand All @@ -149,3 +160,10 @@ If any item is blocked, report which and why.
- [ ] New pattern? Add to Case Index.

Include harness improvements in the same PR.

Report in your final output:
```
Worktree: <absolute path>
Branch: <branch name>
PR: <pr_number>
```
6 changes: 5 additions & 1 deletion .claude/harness/ppl-bugfix-reference.md
Original file line number Diff line number Diff line change
Expand Up @@ -109,9 +109,13 @@ teardown:
headers: { Content-Type: 'application/json' }
ppl: { body: { query: "source=test_issue_<ISSUE> | <your PPL>" } }
- match: { total: <expected> }
- length: { datarows: <expected> }
- match: { datarows: [ [ <row1_val1>, <row1_val2> ], [ <row2_val1>, <row2_val2> ] ] }

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is from skill self-improving after receiving the feedback of PR_REVEW workflow comments.

```

> **Always include `datarows` assertions** — verifying only `total` and `schema` will miss
> wrong values. Count the expected output groups carefully (e.g., for `chart ... by <col>`,
> count distinct (row_split, col_split) groups after null filtering, not the number of input rows).

---

## Symptom → Fix Path
Expand Down
22 changes: 0 additions & 22 deletions CLAUDE_GUIDE.md
Original file line number Diff line number Diff line change
Expand Up @@ -30,25 +30,3 @@ Fix a PPL bug end-to-end or follow up on an existing PR.

**Related files:** [`.claude/harness/ppl-bugfix-harness.md`](.claude/harness/ppl-bugfix-harness.md)

---

## `/dedupe`

Find duplicate GitHub issues for a given issue.

Comment on lines -33 to -38

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this removing on purpose?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, #5319 has migrate to reuse the workflow in opensearch-build. The original dedupe workflow has been migrated to that repo for reusing across openseach projects.

**Usage:**

```
/dedupe 1234
```

**What it does:**

1. Reads the target issue
2. Runs 3+ parallel search strategies to find potential duplicates (only older issues)
3. Verifies candidates by reading each one
4. Posts a structured comment on the issue listing 1-3 confirmed duplicates (if any)

**Skips:** closed issues, broad feedback issues, issues already checked

**Related files:** [`scripts/comment-on-duplicates.sh`](scripts/comment-on-duplicates.sh)
Original file line number Diff line number Diff line change
Expand Up @@ -3205,7 +3205,7 @@ public RelNode visitChart(Chart node, CalcitePlanContext context) {
|| node.getColumnSplit() == null
|| Objects.equals(config.limit, 0)) {
// The output of chart is expected to be ordered by row split names
relBuilder.sort(relBuilder.field(0));
relBuilder.sort(relBuilder.nullsLast(relBuilder.field(0)));
return relBuilder.peek();
}

Expand Down Expand Up @@ -3275,7 +3275,8 @@ public RelNode visitChart(Chart node, CalcitePlanContext context) {
relBuilder.field(2))
.as(aggFieldName));
// The output of chart is expected to be ordered by row and column split names
relBuilder.sort(relBuilder.field(0), relBuilder.field(1));
relBuilder.sort(
relBuilder.nullsLast(relBuilder.field(0)), relBuilder.nullsLast(relBuilder.field(1)));
return relBuilder.peek();
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,7 @@
import org.apache.calcite.rex.RexCall;
import org.apache.calcite.sql.type.ReturnTypes;
import org.apache.calcite.sql.type.SqlReturnTypeInference;
import org.apache.calcite.sql.type.SqlTypeTransforms;
import org.opensearch.sql.calcite.utils.PPLOperandTypes;
import org.opensearch.sql.expression.function.ImplementorUDF;
import org.opensearch.sql.expression.function.UDFOperandMetadata;
Expand Down Expand Up @@ -43,7 +44,7 @@ public MinspanBucketFunction() {

@Override
public SqlReturnTypeInference getReturnTypeInference() {
return ReturnTypes.VARCHAR_2000;
return ReturnTypes.VARCHAR_2000.andThen(SqlTypeTransforms.FORCE_NULLABLE);
}

@Override
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,7 @@
import org.apache.calcite.rex.RexCall;
import org.apache.calcite.sql.type.ReturnTypes;
import org.apache.calcite.sql.type.SqlReturnTypeInference;
import org.apache.calcite.sql.type.SqlTypeTransforms;
import org.opensearch.sql.calcite.utils.PPLOperandTypes;
import org.opensearch.sql.expression.function.ImplementorUDF;
import org.opensearch.sql.expression.function.UDFOperandMetadata;
Expand Down Expand Up @@ -47,7 +48,7 @@ public RangeBucketFunction() {

@Override
public SqlReturnTypeInference getReturnTypeInference() {
return ReturnTypes.VARCHAR_2000;
return ReturnTypes.VARCHAR_2000.andThen(SqlTypeTransforms.FORCE_NULLABLE);
}

@Override
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,7 @@
import org.apache.calcite.rex.RexCall;
import org.apache.calcite.sql.type.ReturnTypes;
import org.apache.calcite.sql.type.SqlReturnTypeInference;
import org.apache.calcite.sql.type.SqlTypeTransforms;
import org.opensearch.sql.calcite.utils.PPLOperandTypes;
import org.opensearch.sql.expression.function.ImplementorUDF;
import org.opensearch.sql.expression.function.UDFOperandMetadata;
Expand Down Expand Up @@ -41,7 +42,7 @@ public SpanBucketFunction() {

@Override
public SqlReturnTypeInference getReturnTypeInference() {
return ReturnTypes.VARCHAR_2000;
return ReturnTypes.VARCHAR_2000.andThen(SqlTypeTransforms.FORCE_NULLABLE);
}

@Override
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,7 @@
CalciteAddColTotalsCommandIT.class,
CalciteConvertCommandIT.class,
CalciteArrayFunctionIT.class,
CalciteBinChartNullIT.class,
CalciteBinCommandIT.class,
CalciteConvertTZFunctionIT.class,
CalciteCsvFormatIT.class,
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,63 @@
/*
* Copyright OpenSearch Contributors
* SPDX-License-Identifier: Apache-2.0
*/

package org.opensearch.sql.calcite.remote;

import static org.opensearch.sql.legacy.TestsConstants.TEST_INDEX_BANK_WITH_NULL_VALUES;
import static org.opensearch.sql.util.MatcherUtils.rows;
import static org.opensearch.sql.util.MatcherUtils.schema;
import static org.opensearch.sql.util.MatcherUtils.verifyDataRows;
import static org.opensearch.sql.util.MatcherUtils.verifySchema;

import java.io.IOException;
import org.json.JSONObject;
import org.junit.jupiter.api.Test;
import org.opensearch.sql.ppl.PPLIntegTestCase;

/** Integration test for GitHub issue #5174: bin/chart NPE with null values. */
public class CalciteBinChartNullIT extends PPLIntegTestCase {
@Override
public void init() throws Exception {
super.init();
enableCalcite();
loadIndex(Index.BANK_WITH_NULL_VALUES);
}

@Test
public void testBinThenChartWithNullValuesShouldNotCauseNPE() throws IOException {
// bin balance span=10000 produces null for documents without a balance field.
// chart count() over bal_bin by gender should handle these null bin values safely.
JSONObject result =
executeQuery(
String.format(
"source=%s | bin balance span=10000 as bal_bin"
+ " | chart count() over bal_bin by gender",
TEST_INDEX_BANK_WITH_NULL_VALUES));
verifySchema(
result,
schema("bal_bin", "string"),
schema("gender", "string"),
schema("count()", "bigint"));
// Should only contain rows for non-null balance values (4 records with balance)
verifyDataRows(
result,
rows("0-10000", "M", 1),
rows("30000-40000", "F", 1),
rows("30000-40000", "M", 1),
rows("40000-50000", "F", 1));
}

@Test
public void testBinThenChartSingleGroupWithNullValues() throws IOException {
// chart with only row split (no column split): the simpler sort path
JSONObject result =
executeQuery(
String.format(
"source=%s | bin balance span=10000 as bal_bin | chart count() over bal_bin",
TEST_INDEX_BANK_WITH_NULL_VALUES));
verifySchema(result, schema("bal_bin", "string"), schema("count()", "bigint"));
verifyDataRows(result, rows("0-10000", 1), rows("30000-40000", 2), rows("40000-50000", 1));
}
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,83 @@
setup:
- do:
query.settings:
body:
transient:
plugins.calcite.enabled: true

- do:
indices.create:
index: issue5174
body:
settings:
number_of_shards: 1
number_of_replicas: 0
mappings:
properties:
count:
type: integer
category:
type: keyword
subcategory:
type: keyword
value:
type: double
ts:
type: date

- do:
bulk:
refresh: true
body:
- '{"index": {"_index": "issue5174", "_id": "1"}}'
- '{"count": 1, "category": "A", "subcategory": "X", "value": 10.5, "ts": "2024-01-01"}'
- '{"index": {"_index": "issue5174", "_id": "2"}}'
- '{"count": 2, "category": "A", "subcategory": "Y", "value": 20.3, "ts": "2024-01-02"}'
- '{"index": {"_index": "issue5174", "_id": "3"}}'
- '{"count": 10, "category": "B", "subcategory": "X", "value": 100.0, "ts": "2024-01-03"}'
- '{"index": {"_index": "issue5174", "_id": "4"}}'
- '{"count": null, "category": "B", "subcategory": "Y", "value": null, "ts": "2024-01-04"}'

---
teardown:
- do:
indices.delete:
index: issue5174
ignore_unavailable: true
- do:
query.settings:
body:
transient:
plugins.calcite.enabled: false

---
"Issue 5174: bin then chart with null values should not cause NPE":
- skip:
features:
- headers
- do:
headers:
Content-Type: 'application/json'
ppl:
body:
query: source=issue5174 | bin value span=50 as val_bin | chart count() over val_bin by category

- match: { total: 2 }
- match: { schema: [ { name: val_bin, type: string }, { name: category, type: string }, { name: "count()", type: bigint } ] }
- match: { datarows: [ [ "0-50", "A", 2 ], [ "100-150", "B", 1 ] ] }

---
"Issue 5174: bin then chart with single group and null values":
- skip:
features:
- headers
- do:
headers:
Content-Type: 'application/json'
ppl:
body:
query: source=issue5174 | bin value span=50 as val_bin | chart count() over val_bin

- match: { total: 2 }
- match: { schema: [ { name: val_bin, type: string }, { name: "count()", type: bigint } ] }
- match: { datarows: [ [ "0-50", 2 ], [ "100-150", 1 ] ] }
Loading
Loading