Skip to content

Fix #5166: Handle standard JSONPath prefix in json_extract#5205

Closed
penghuo wants to merge 1 commit into
opensearch-project:mainfrom
penghuo:fix/issue-5166
Closed

Fix #5166: Handle standard JSONPath prefix in json_extract#5205
penghuo wants to merge 1 commit into
opensearch-project:mainfrom
penghuo:fix/issue-5166

Conversation

@penghuo

@penghuo penghuo commented Mar 5, 2026

Copy link
Copy Markdown
Collaborator

Description

json_extract always returned string "null" instead of the actual extracted value when using standard JSONPath notation (e.g., $.name).

Root cause: JsonUtils.convertToJsonPath() unconditionally prepends $. to all input paths. When users pass standard JSONPath like $.name, it becomes $.$.name — an invalid path that matches nothing, causing Calcite's JsonFunctions to return null.

Fix: Strip any existing $. prefix before normalization in convertToJsonPath(), so paths already in standard JSONPath form are handled correctly. Also handles bare $ input. The fix benefits all 5 JSON UDFs that share this code path (json_extract, json_delete, json_set, json_append, json_extend).

Related Issues

#5166

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • New PPL command checklist all confirmed.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff or -s.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

…rtToJsonPath

convertToJsonPath() unconditionally prepended "$." to the input path.
When users passed standard JSONPath notation like "$.name", this produced
the invalid path "$.$.name" which matched nothing, causing json_extract
to return null instead of the actual value.

Strip any existing "$." prefix before normalization so that both the
plugin's custom notation (e.g. "name", "a{}.b") and standard JSONPath
notation (e.g. "$.name", "$.user.name") are handled correctly.

Signed-off-by: Peng Huo <penghuo@gmail.com>
@github-actions

github-actions Bot commented Mar 5, 2026

Copy link
Copy Markdown
Contributor

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

🧪 PR contains tests
🔒 No security concerns identified
✅ No TODO sections
🔀 Multiple PR themes

Sub-PR theme: Fix convertToJsonPath to handle standard JSONPath prefix

Relevant files:

  • core/src/main/java/org/opensearch/sql/expression/function/jsonUDF/JsonUtils.java
  • core/src/test/java/org/opensearch/sql/expression/function/jsonUDF/JsonExtractFunctionImplTest.java
  • core/src/test/java/org/opensearch/sql/expression/json/JsonFunctionsTest.java

Sub-PR theme: Add integration tests for issue 5166 json_extract fix

Relevant files:

  • integ-test/src/yamlRestTest/resources/rest-api-spec/test/issues/5166.yml

⚡ Recommended focus areas for review

Incomplete Stripping

The fix only strips a single $. prefix. If a user passes a path like $$.name or $.[*].name (with bracket notation after $), the stripping logic may not handle it correctly. Additionally, paths like $[0] (starting with $[ rather than $.) are not handled — the code only checks for $. prefix and bare $, but valid JSONPath can also start with $[. This could still result in malformed paths like $.$[0].

if (input.startsWith("$.")) {
  input = input.substring(2);
} else if (input.equals("$")) {
  return "$";
}
Normalization Bypass

When a path starts with $., the prefix is stripped and the remainder is passed through the normalization loop. However, if the remainder itself contains JSONPath-specific syntax (e.g., [*], [0]), the normalization loop may transform it incorrectly — for example, $.scores[*] would become $.scores[*] after stripping to scores[*], but the loop processes { and } for bracket notation, not [ and ]. This means users passing already-normalized JSONPath with bracket notation may get unexpected results.

if (input.startsWith("$.")) {
  input = input.substring(2);
} else if (input.equals("$")) {
  return "$";
}

StringBuilder sb = new StringBuilder("$.");
int i = 0;
while (i < input.length()) {

@github-actions

github-actions Bot commented Mar 5, 2026

Copy link
Copy Markdown
Contributor

PR Code Suggestions ✨

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
Possible issue
Guard against empty path after prefix stripping

After stripping the $. prefix, if the remaining input is empty (i.e., the original
input was exactly "$."), the code will proceed to build "$." with an empty loop
body, resulting in a trailing dot "$.". This edge case should be handled explicitly
by returning "$" when input becomes empty after stripping.

core/src/main/java/org/opensearch/sql/expression/function/jsonUDF/JsonUtils.java [28-35]

 if (input.startsWith("$.")) {
   input = input.substring(2);
+  if (input.isEmpty()) return "$";
 } else if (input.equals("$")) {
   return "$";
 }
 
 StringBuilder sb = new StringBuilder("$.");
Suggestion importance[1-10]: 6

__

Why: This is a valid edge case: if input is exactly "$.", stripping the prefix leaves an empty string, causing the loop to produce "$." with a trailing dot. Adding if (input.isEmpty()) return "$"; correctly handles this case with minimal code change.

Low
Handle bracket-style JSONPath prefix correctly

The current code only handles $. prefix but not paths starting with $[ (e.g., $[0],
$[*]). Stripping only $. will cause $[0] to be processed as [0] and then incorrectly
transformed to $.[0]. Consider also handling the $[ case by stripping just the $
prefix when the input starts with $[.

core/src/main/java/org/opensearch/sql/expression/function/jsonUDF/JsonUtils.java [28-32]

 if (input.startsWith("$.")) {
   input = input.substring(2);
+} else if (input.startsWith("$[")) {
+  input = input.substring(1);
+  StringBuilder sb = new StringBuilder("$");
+  // return early with the bracket-style path processing
+  // (or integrate into the main loop below)
+  return "$" + input;
 } else if (input.equals("$")) {
   return "$";
 }
Suggestion importance[1-10]: 4

__

Why: The suggestion raises a valid concern about $[ style paths not being handled, but the improved_code is incomplete and incorrect - it returns "$" + input which would double the [ character and doesn't properly integrate with the existing loop logic. The concern is real but the proposed fix is not usable as-is.

Low

@penghuo penghuo added the PPL Piped processing language label Mar 5, 2026
@penghuo penghuo closed this Mar 5, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

PPL Piped processing language

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant