Stop misclassifying unquoted string parameters as keywords #21
Merged
YES, THPRES and other uppercase string values in record lines under SCALECRS, EQLOPTS, etc. were highlighted, linted, and shown in the docs panel as if they were keyword declarations. OPM Flow itself only recognises keywords in column 1; align the editor with that rule:
- Grammar: anchor the keywords rule to column 1.
- Analyzer: treat indented keyword-shaped tokens inside an open record block as record values, even when the name is in the index.
- Editor: introduce KEYWORD_LINE_COL1_RE and use it for active-keyword scan, docs panel, hover, and folding so clicking THPRES under EQLOPTS shows EQLOPTS docs rather than the unrelated SOLUTION THPRES.
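A minimal sketch of the column-1 rule applied to the active-keyword scan. The regex below is a hypothetical stand-in for KEYWORD_LINE_COL1_RE (the real pattern is not shown in this PR), and `active_keyword` is an illustrative helper, not the extension's code:

```python
import re

# Hypothetical stand-in for KEYWORD_LINE_COL1_RE: a keyword-shaped token
# that starts in column 1 (no leading whitespace allowed).
KEYWORD_LINE_COL1_RE = re.compile(r"^([A-Z][A-Z0-9_]*)")

def active_keyword(lines, cursor_line):
    """Walk upward from the cursor and return the nearest column-1 keyword."""
    for i in range(cursor_line, -1, -1):
        m = KEYWORD_LINE_COL1_RE.match(lines[i])
        if m:
            return m.group(1)
    return None

deck = [
    "EQLOPTS",
    "  THPRES /",   # indented record value, not a keyword declaration
    "",
    "SCALECRS",
    "  YES /",
]

print(active_keyword(deck, 1))  # EQLOPTS
print(active_keyword(deck, 4))  # SCALECRS
```

With the cursor on the indented THPRES line, the scan skips it and lands on EQLOPTS, which is the behaviour the docs panel and hover are meant to follow.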
Add 17 file extensions used by opm-tests decks (eqldims, regdims, gridopts, swatinit, faults, fipzon, permx, pvtnum, rxvd, trans, vfpprod, grid, prpecl, dat, incl, sched, smry) so these files load with the opm-flow language id and pick up syntax highlighting, diagnostics, hover, etc.
The OPM manual states a FIP keyword name is "FIP as the first three characters followed by up to a five letter character string", producing deck tokens like FIPZON, FIPGL, FIPNL, FIPUNIT, FIPHC. Mark FIP as a templated base name so the existing FIP+[A-Z0-9]+ fallback resolves those tokens to the FIP entry. Direct entries (FIPNUM, FIPOWG, FIPSEP, FIP_PROBE) still win via the direct-lookup path.
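The lookup order described above (direct entry first, then templated base + [A-Z0-9]+) could look roughly like this. The `INDEX` dict and `resolve` function are hypothetical, and trying longer bases first is an assumption of this sketch, not something the PR states:

```python
import re

# Hypothetical miniature of the keyword index; real entries carry more fields.
INDEX = {
    "FIP": {"templated": True},
    "FIPNUM": {"templated": False},
    "FIPOWG": {"templated": False},
}

def resolve(token):
    """Return the index key a deck token resolves to, or None."""
    if token in INDEX:  # direct lookup always wins
        return token
    # Template fallback: a templated base followed by one or more [A-Z0-9].
    for base in sorted(INDEX, key=len, reverse=True):
        if not INDEX[base]["templated"]:
            continue
        if re.fullmatch(re.escape(base) + r"[A-Z0-9]+", token):
            return base
    return None

print(resolve("FIPZON"))  # FIP
print(resolve("FIPNUM"))  # FIPNUM
```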
W/G/C/R/B/A/N/S-prefixed SUMMARY mnemonics take a list of names closed by a single '/'; names may sit inline or be spread across many lines. The previous fixed/1 classification mis-flagged each intermediate name line as missing the terminating '/'. Reclassify these 976 mnemonics as size_kind 'array' so per-line terminator checks are skipped but a missing block-end '/' is still flagged.
Non-F SUMMARY mnemonics may legally appear without a body — bare lines
stacked back-to-back and closed by a single trailing '/' is a common
real-deck pattern:
GMWPR
GMWIN
/
Add an `optional_body` flag on AnalysisEntry and tag all 981 non-F
SUMMARY array mnemonics with it. The close-block terminator check now
skips entries whose body was empty (recordCount === 0); once names are
listed, the missing-'/' diagnostic still fires.
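A sketch of the close-block terminator check with the new flag. Only `optional_body` and the record count come from this commit; the other field and the function name are assumptions made for illustration:

```python
from dataclasses import dataclass

@dataclass
class AnalysisEntry:
    size_kind: str               # e.g. "array" for non-F SUMMARY mnemonics
    optional_body: bool = False  # flag introduced by this commit

def needs_missing_slash(entry, record_count, saw_terminator):
    """Decide whether closing this keyword block should raise the missing-'/' diagnostic."""
    if saw_terminator:
        return False
    # Bare mnemonic with no names listed: the body is legally empty.
    if entry.optional_body and record_count == 0:
        return False
    return True

gmwpr = AnalysisEntry(size_kind="array", optional_body=True)
print(needs_missing_slash(gmwpr, 0, False))  # False: bare GMWPR, no diagnostic
print(needs_missing_slash(gmwpr, 2, False))  # True: names listed but no '/'
```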
UDQ SUMMARY variables are user-named: the manual documents them as placeholders FUXXXXXX, WUXXXXXX, GUXXXXXX, CUXXXXXX, RUXXXXXX, SUXXXXXX where the trailing X's stand for the user-defined name (up to six characters). Real decks write tokens like WUWI1, FUOIL, GUTOT. The 8-char placeholders sat in the index but never resolved: the template fallback requires the base name to be shorter than the deck token. Strip the trailing X's so each entry is keyed by its 2-char scope prefix (FU, WU, GU, CU, RU, SU) and mark them templated. The existing <base>+[A-Z0-9]+ fallback then resolves WUWI1, FUOIL, etc. to the correct shape. Direct lookups for FULLIMP, GUIDECAL, RUNSPEC, SURFACT, ... still win because direct beats template.
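To see why the 8-char placeholders never resolved and the stripped 2-char prefixes do, here is a small sketch. `strip_placeholder` and `matches_template` are hypothetical names; the length condition mirrors the "base shorter than the deck token" requirement stated above:

```python
import re

PLACEHOLDERS = ["FUXXXXXX", "WUXXXXXX", "GUXXXXXX", "CUXXXXXX", "RUXXXXXX", "SUXXXXXX"]

def strip_placeholder(name):
    """Drop the trailing X's that stand for the user-defined part."""
    return re.sub(r"X+$", "", name)

def matches_template(base, token):
    """Template fallback: base must be shorter than the token, rest is [A-Z0-9]+."""
    if len(base) >= len(token):
        return False
    return re.fullmatch(re.escape(base) + r"[A-Z0-9]+", token) is not None

print([strip_placeholder(p) for p in PLACEHOLDERS])
# ['FU', 'WU', 'GU', 'CU', 'RU', 'SU']
print(matches_template("WUXXXXXX", "WUWI1"))  # False: placeholder is longer than the token
print(matches_template("WU", "WUWI1"))        # True
```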
MESSAGES is a single record of 13 INT parameters that real decks
routinely split across two lines — print limits then stop limits,
closed by a single trailing '/':
MESSAGES
80000 10000 5000000 5000 300 1
80000 10000 5000000 80000 10 1 /
opm-common doesn't flag any MESSAGES item as size_type: "ALL", so the
existing variadic-record auto-detection never tagged it. Add a manually
curated VARIADIC_RECORD_KEYWORDS set in the build script and a matching
'variadic_record: true' on the MESSAGES index entry so per-line missing
'/' diagnostics are suppressed.
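The effect on the per-line check can be sketched as follows. The set name comes from this commit; `per_line_missing_slash` and the placeholder keyword `SOMEKW` are hypothetical, and the real analyzer reads the `variadic_record` flag from the index entry rather than consulting the set directly:

```python
VARIADIC_RECORD_KEYWORDS = {"MESSAGES"}

def per_line_missing_slash(keyword, body_lines):
    """Return indices of body lines that would be flagged for a missing '/'."""
    if keyword in VARIADIC_RECORD_KEYWORDS:
        return []  # one record may span several lines
    flagged = []
    for i, line in enumerate(body_lines):
        content = line.split("--")[0].rstrip()  # ignore trailing comments
        if not content.endswith("/"):
            flagged.append(i)
    return flagged

body = [
    "80000 10000 5000000 5000 300 1",
    "80000 10000 5000000 80000 10 1 /",
]
print(per_line_missing_slash("MESSAGES", body))  # []
print(per_line_missing_slash("SOMEKW", body))    # [0]
```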
opm-common classifies VFPPROD and VFPINJ as 'list' (size = string sentinel), which triggered (a) per-line "missing terminating '/'" diagnostics on every line of the multi-line LIQ/THP/WFR/GFR/ALQ/BHP tables and (b) a closeKw "missing terminating '/' to close the record list" because real decks never close the block with a standalone '/' — the next keyword does. Add the keywords to VARIADIC_RECORD_KEYWORDS (suppress per-line check) and a new NO_LIST_TERMINATOR_KEYWORDS set that reclassifies them to 'fixed' with size_count = records_meta length (suppress final terminator check). Both axis-table and BHP-table forms now parse cleanly.
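A rough sketch of how the two curated sets might be applied in the build script. The set names and the `size_kind`, `size_count`, `records_meta` and `variadic_record` fields are taken from the commit messages; `finalize_entry` and the shape of the sample entry are assumptions:

```python
VARIADIC_RECORD_KEYWORDS = {"MESSAGES", "VFPPROD", "VFPINJ"}
NO_LIST_TERMINATOR_KEYWORDS = {"VFPPROD", "VFPINJ"}

def finalize_entry(name, entry):
    """Apply the manual overrides to one index entry and return a new dict."""
    out = dict(entry)
    if name in VARIADIC_RECORD_KEYWORDS:
        out["variadic_record"] = True  # suppress the per-line '/' check
    if name in NO_LIST_TERMINATOR_KEYWORDS:
        # Reclassify 'list' to 'fixed' so no standalone closing '/' is expected.
        out["size_kind"] = "fixed"
        out["size_count"] = len(out.get("records_meta", []))
    return out

# Placeholder entry: three empty record descriptors, purely for illustration.
sample = {"size_kind": "list", "records_meta": [{}, {}, {}]}
print(finalize_entry("VFPPROD", sample))
# {'size_kind': 'fixed', 'records_meta': [{}, {}, {}], 'variadic_record': True, 'size_count': 3}
```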
Keyword line pattern: tighten the keyword rule so it only fires when the line is the keyword alone, optionally followed by a '--' comment or a '/' (this keeps inline 'KEYWORD /' colored). Record lines starting with an uppercase identifier at column 1 (well/group names like OP01, property names like SGL/SGCR/SOWCR in EQUALS, group names like UPPER/FIELD in GRUPTREE, enum values like OPEN/ORAT in WCONHIST) no longer pick up the keyword color.
Add a low-precedence 'unquoted-strings' pattern (string.unquoted.opm-flow) that catches any remaining bare uppercase identifier, so those same record-line tokens now render in the string color instead of falling through unstyled. Quoted strings, variables, defaults, numbers, and terminators keep their existing scopes because they match earlier in the precedence list.
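The "keyword alone" rule can be illustrated with a regex like the one below. It is a hypothetical approximation written for Python's `re`, not the grammar's actual pattern, and it accepts a '/' followed by a comment as well:

```python
import re

# Hypothetical approximation: keyword at column 1, then optionally a '/'
# and/or a '--' comment, and nothing else on the line.
KEYWORD_ALONE_RE = re.compile(r"^([A-Z][A-Z0-9_]*)[ \t]*(?:/[ \t]*)?(?:--.*)?$")

def is_keyword_line(line):
    return KEYWORD_ALONE_RE.match(line) is not None

for line in [
    "WCONHIST",
    "EQLOPTS -- equilibration options",
    "GMWPR /",
    "OP01 OPEN ORAT 100 /",   # record line at column 1
    "  THPRES /",             # indented record value
]:
    print(repr(line), is_keyword_line(line))
```

The first three lines match; the last two do not. A record that is a single bare token at column 1 would still match under this sketch, so the pattern narrows the misclassification rather than ruling it out entirely.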
Six pytest cases still asserted the old 'fixed/size_count=1' shape for non-F SUMMARY mnemonics. Update them to the new 'array' + optional_body=True shape introduced when WOPR/GMWIN multi-line and bare-stack bodies were fixed, and import the new _summary_optional_body helper.
Summary
Fixes #13:
`YES` under `SCALECRS`, `THPRES` under `EQLOPTS`, and similar unquoted string values were highlighted, linted, and shown in the docs panel as if they were keyword declarations. OPM Flow itself only recognises keywords in column 1; align grammar, analyzer, and cursor-driven providers with that rule.

Testing the fix surfaced several related real-deck issues. Each is its own commit so changes can be reviewed independently.
What's in the PR
- Add file extensions (`.eqldims`, `.regdims`, `.gridopts`, `.swatinit`, `.faults`, `.trans`, `.vfpprod`, `.pvtnum`, `.rxvd`, `.grid`, `.fipzon`, `.permx`, …) so those files load with the `opm-flow` language id.
- `FIPZON`, `FIPGL`, `FIPNL`, `FIPUNIT`, `FIPHC` resolve through the existing templating path against a new `FIP` base entry.
- W/G/C/R/B/A/N/S-prefixed SUMMARY mnemonics (`WOPR`, `WGPR`, …): reshape 976 entries from `size_kind: fixed/1` to `array` so multi-line bodies don't get a per-line "missing terminating `/`" diagnostic.
- Bare stacked SUMMARY mnemonics (`GMWPR \n GMWIN \n /`): new `optional_body` flag tagged on 981 non-F mnemonics so empty bodies don't demand a closing `/`; once values appear the diagnostic still fires.
- UDQ SUMMARY variables (`WUWI1`, `FUOIL`, `GUTOT`, …): strip the trailing X's from the manual's `FUXXXXXX`/`WUXXXXXX`/… placeholders and mark the 2-char prefixes templated.
- `MESSAGES`: tagged `variadic_record: true` via a manually-curated set in the build script.
- `VFPPROD` / `VFPINJ`: multi-line tables with no per-line `/`, no standalone closer. Reclassified from `list` to `fixed` plus `variadic_record: true` so per-line and final-terminator checks are suppressed.
- Tighten the `keywords` pattern so it only matches a line that is the keyword alone (optionally followed by a `--` comment or a single `/`). Add a low-precedence `unquoted-strings` (`string.unquoted.opm-flow`) pattern so well/group/property names like `OP01`, `FIELD`, `UPPER`, `SGL`/`SOWCR` in `EQUALS`/`GRUPTREE`/`WCONHIST` records render in the string color rather than being mis-painted as keywords.
Tests
189 jest tests (up from 168), including regression cases for every issue listed above. Grammar changes aren't jest-covered and were spot-checked against the user-reported decks in the Extension Development Host.