[Bug Fix] Fix PPL syntax highlighting accepting invalid field=/comparison source fields#5559
Closed
Hanyu-W wants to merge 1 commit into
Closed
Conversation
…arison source fields The Explore query editor validates PPL in the browser using the runtime grammar bundle it fetches from the cluster (for data sources >= 3.6). That bundle is the serialized form of this `.g4`. Because the grammar typed the source field of `grok`/`parse`/`patterns` (and `spath` `input=`/`output=`) as the broad `expression` rule, the editor accepted syntax that is not valid PPL and drew no error squiggle on it. Concretely, `FIELD` (and `SED`, `MAX_MATCH`, `OFFSET_FIELD`) are valid keyword-as-identifiers, so `grok field=body "..."` matched the `compareExpr` alternative of `expression` (parsed as `Compare(field, =, body)`) and was accepted as well-formed. The documented form is positional: `grok <field> <pattern>`. This bug fix narrows each source-field slot to the tightest grammar rule that matches the command's real contract, so the editor (and the server parser that shares this grammar) flags the invalid form. - grok / parse / patterns: `expression` -> `valueExpression`. `valueExpression` excludes comparison/IN/BETWEEN/IS-NULL (which live only at the `expression` tier) but still includes function-call and arithmetic sources the engine genuinely evaluates — e.g. `parse DATE_FORMAT(HIREDATE, '%Y-%m-%d') '...'` stays valid (this is why narrowing all the way to `fieldExpression` would have been wrong). - spath: `input`/`output` `expression` -> `fieldExpression`. spath stringifies its input into a field name and never evaluates it, so the source must be a bare field path; function calls/comparisons there are now flagged instead of accepted. No AstBuilder change is needed: the labeled `ctx.source_field` accessor and `param.input.getText()` work regardless of the rule's static type, and nothing casts these contexts to a specific subtype. Tests: 8 negative cases asserting the parser rejects the invalid forms (`SyntaxCheckException`: named-param, comparison, keyword-named-param, spath function-call/comparison) plus 3 positive regression guards (dotted/keyword field names; function-call source for parse; dotted field with options for patterns). Verified in the Explore editor: the invalid forms now render a red squiggle and the valid forms do not. Note: this also tightens the server parser (same grammar), so `grok field=body` is now rejected before semantic analysis instead of being silently misparsed. That is a user-visible behavior change on the server, secondary to the highlighting fix. Signed-off-by: Hanyu Wei <weihanyu@amazon.com>
Contributor
Author
|
Closing the PR-I have a better fix upcoming in the linting project. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
The grammar typed the source field of
grok/parse/patterns(andspathinput=/output=) as the broadexpressionrule. Any<name>=<name>form (e.g.grok field=body "...",grok foo=bar "...") matched thecompareExpralternative and was silently accepted — parsed as a comparison instead of a syntax error. On clusters >= 3.6 where the Explore editor uses the runtime grammar bundle (via/_plugins/_ppl/_grammar), there was no error highlighting at all for these forms. The documented form is positional (grok <field> <pattern>).This narrows each source-field slot to the tightest rule matching the command's contract, tightening both the server parser (rejects at parse time instead of misparsing) and the grammar bundle the Explore editor consumes:
expression→valueExpression(drops comparisons, keeps function-call/arithmetic sources likeparse DATE_FORMAT(...) '...')expression→fieldExpression(source is stringified into a field name, never evaluated)Testing
SyntaxCheckExceptionfor invalid formsPOST /_plugins/_pplwith invalid forms returns 400; valid forms execute correctlyRelated Issues
Resolves #5558
Check List
--signoffor-s.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.