Commit 0001e2f

Point parser diagnostics at the offending name (#7742)
When the unquoted-identifier parser finishes, require that the next character is a real word boundary (neither another identifier character nor another '-'). Otherwise the caller wrote something like `pubKeyHash-305478r71`, `foo-bar` or `foo-123-456`: the '-NNN' we just consumed as the numeric unique-suffix is not actually terminal, and the prefix interpretation would silently mis-parse.

In that case, consume the remainder of the extended identifier so the diagnostic can cite the full bad text, then raise a new `InvalidIdentifier` custom parser error with a caret on the start of the identifier and an actionable hint to quote it with backticks.

For the original Scalus 0.16.0 HTLC reproducer this changes the error from `htlc.uplc:448:39: unexpected '(' expecting ')'` (on a lambda 8+ chars past the real site) to `htlc.uplc:447:41: Invalid identifier 'pubKeyHash-305478r71'`, which is on the offending name itself.

The three negative goldens added in the previous commit are updated to the new message; all 3886 tests across plutus-core/untyped-plutus-core/plutus-ir pass unchanged.
1 parent aa23687 commit 0001e2f

6 files changed

Lines changed: 71 additions & 13 deletions

Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+### Changed
+
+- The UPLC/PLC/PIR textual parser now rejects unquoted identifiers that
+  contain a `-` anywhere other than as the terminal numeric unique-suffix
+  separator (e.g. `pubKeyHash-305478r71`, `foo-bar`, `foo-123-456`) with
+  a dedicated `InvalidIdentifier` diagnostic that points directly at the
+  offending name and shows the full bad text. Previously the same inputs
+  silently mis-parsed: the prefix was taken as a name plus unique-suffix
+  and the remainder was picked up as an adjacent term, which surfaced as
+  a confusing "unexpected '(' expecting ')'" message far from the real
+  site (see #7742). To use such a string as a name verbatim, wrap it in
+  backticks: `` `pubKeyHash-305478r71` ``.

plutus-core/plutus-core/src/PlutusCore/Error.hs

Lines changed: 18 additions & 0 deletions
@@ -52,6 +52,12 @@ data ParserError
   = BuiltinTypeNotAStar !T.Text !SourcePos
   | UnknownBuiltinFunction !T.Text !SourcePos ![T.Text]
   | InvalidBuiltinConstant !T.Text !T.Text !SourcePos
+  | {-| An unquoted identifier that violates the grammar: a '-' appeared
+    anywhere other than as the separator of a terminal numeric unique-suffix
+    (e.g. @pubKeyHash-305478r71@, @foo-bar@, @foo-123-456@). The 'Text'
+    carries the full offending text as it appeared in the source, so the
+    user sees their own name back in the diagnostic. -}
+    InvalidIdentifier !T.Text !SourcePos
   deriving stock (Eq, Ord, Generic)
   deriving anyclass (NFData)
 
@@ -192,6 +198,18 @@ instance Pretty ParserError where
       <+> squotes (pretty s)
       <+> "at"
       <+> pretty loc
+  pretty (InvalidIdentifier txt loc) =
+    "Invalid identifier"
+      <+> squotes (pretty txt)
+      <+> "at"
+      <+> pretty loc
+      <> "."
+      <> hardline
+      <> "A '-' inside a name is the numeric unique-suffix separator and must be"
+      <+> "followed only by digits and a word boundary."
+      <> hardline
+      <> "To use this text as a name verbatim, quote it with backticks:"
+      <+> pretty ("`" <> txt <> "`")
 
 instance ShowErrorComponent ParserError where
   showErrorComponent = show . pretty
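
The `Pretty` case added above produces a three-line message. As a rough illustration of that layout, here is a minimal sketch using only `String` functions from `base`. `Pos`, `renderPos` and `renderInvalidIdentifier` are hypothetical names invented for this example; the actual code uses the prettyprinter combinators shown in the hunk and Megaparsec's `SourcePos`.

```haskell
import Data.List (intercalate)

-- Hypothetical stand-in for a source position: file name, line, column.
data Pos = Pos String Int Int

-- Render a position as file:line:column, the form seen in the goldens.
renderPos :: Pos -> String
renderPos (Pos file line col) = intercalate ":" [file, show line, show col]

-- Build the three-line diagnostic: headline, explanation, backtick hint.
renderInvalidIdentifier :: String -> Pos -> String
renderInvalidIdentifier txt loc =
  intercalate
    "\n"
    [ "Invalid identifier '" ++ txt ++ "' at " ++ renderPos loc ++ "."
    , "A '-' inside a name is the numeric unique-suffix separator and must be"
        ++ " followed only by digits and a word boundary."
    , "To use this text as a name verbatim, quote it with backticks: `" ++ txt ++ "`"
    ]
```

Calling `renderInvalidIdentifier "foo-bar" (Pos "test" 1 21)` should give the same three message lines that appear in the updated `foo-bar` golden.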

plutus-core/plutus-core/src/PlutusCore/Parser/ParserCommon.hs

Lines changed: 26 additions & 1 deletion
@@ -13,6 +13,7 @@ import Control.Monad.Except
 import Control.Monad.Reader (ReaderT, ask, local, runReaderT)
 import Control.Monad.State (StateT, evalStateT)
 import Data.Map qualified as M
+import Data.Set qualified as Set
 import Data.Text (Text)
 import Data.Text qualified as Text
 import Text.Megaparsec hiding (ParseError, State, parse, some)
@@ -217,9 +218,33 @@ name = try $ parseUnquoted <|> parseQuoted
   where
     parseUnquoted :: Parser Name
     parseUnquoted = do
+      startOffset <- getOffset
+      startPos <- getSourcePos'
       _ <- lookAhead (satisfy isIdentifierStartingChar)
+      inputBefore <- getInput
       str <- takeWhileP (Just "identifier-unquoted") isIdentifierChar
-      Name str <$> uniqueSuffix str
+      u <- uniqueSuffix str
+      {- The parsed prefix is only a valid identifier if the next character is
+      a real word-boundary. If instead we see more identifier chars or another
+      '-', the user wrote something like `foo-bar` or `pubKeyHash-305478r71`:
+      the '-NNN' run we just treated as a unique-suffix was actually part of
+      their intended name (or they have a stray '-' at all). Fail with a
+      custom diagnostic that points at the whole offending identifier. -}
+      mBad <- optional (lookAhead (satisfy isNameExtensionChar))
+      case mBad of
+        Nothing -> pure (Name str u)
+        Just _ -> do
+          -- Consume the remainder so the reported text covers the full name.
+          _ <- takeWhileP Nothing isNameExtensionChar
+          inputAfter <- getInput
+          let consumed = Text.length inputBefore - Text.length inputAfter
+              fullText = Text.take consumed inputBefore
+          parseError $
+            FancyError startOffset $
+              Set.singleton (ErrorCustom (InvalidIdentifier fullText startPos))
+
+    isNameExtensionChar :: Char -> Bool
+    isNameExtensionChar c = isIdentifierChar c || c == '-'
 
     parseQuoted :: Parser Name
     parseQuoted = do
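
The control flow of the new `parseUnquoted` (read the name, read an optional `-NNN` suffix, then insist on a word boundary) can be sketched without Megaparsec. The following is only an illustration over plain `String`s: `Verdict`, `splitSuffix` and `classify` are hypothetical names, and the character predicates are assumptions about what `isIdentifierStartingChar` and `isIdentifierChar` accept, since their definitions are not part of this diff.

```haskell
import Data.Char (isAlpha, isAscii, isDigit)

-- Outcome of reading one unquoted identifier from the front of the input.
data Verdict
  = Valid String (Maybe Integer) -- name text and optional numeric unique-suffix
  | Invalid String               -- the full offending text, as in InvalidIdentifier
  deriving (Eq, Show)

-- Assumed character classes; the real predicates may differ.
isStartChar, isIdentChar, isExtChar :: Char -> Bool
isStartChar c = (isAscii c && isAlpha c) || c == '_'
isIdentChar c = isStartChar c || isDigit c || c == '\''
isExtChar c = isIdentChar c || c == '-' -- mirrors isNameExtensionChar

-- Split off an optional "-NNN" suffix; anything else is left untouched.
splitSuffix :: String -> (Maybe Integer, String)
splitSuffix ('-' : ds)
  | (digits@(_ : _), rest) <- span isDigit ds = (Just (read digits), rest)
splitSuffix rest = (Nothing, rest)

-- Returns Nothing when the input does not start an identifier at all,
-- otherwise the verdict together with the unconsumed remainder.
classify :: String -> Maybe (Verdict, String)
classify input@(c : _)
  | isStartChar c =
      let (str, afterName) = span isIdentChar input
          (suffix, afterSuffix) = splitSuffix afterName
       in case afterSuffix of
            -- Not a word boundary: report everything up to the real boundary.
            (x : _) | isExtChar x ->
              let (bad, rest) = span isExtChar input
               in Just (Invalid bad, rest)
            _ -> Just (Valid str suffix, afterSuffix)
classify _ = Nothing
```

Under these assumptions, `classify "pubKeyHash-305478r71 ("` reports the whole name as invalid rather than accepting `pubKeyHash` with suffix `305478` and leaving `r71` behind, which is the mis-parse the commit message describes.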
Lines changed: 5 additions & 4 deletions
@@ -1,6 +1,7 @@
-test:1:28:
+test:1:21:
   |
 1 | (program 1.1.0 (lam foo-123-456 foo-123-456))
-  |                            ^
-unexpected '-'
-expecting '`', digit, opening bracket '[', or opening parenthesis '('
+  |                     ^
+Invalid identifier 'foo-123-456' at test:1:21.
+A '-' inside a name is the numeric unique-suffix separator and must be followed only by digits and a word boundary.
+To use this text as a name verbatim, quote it with backticks: `foo-123-456`
Lines changed: 5 additions & 4 deletions
@@ -1,6 +1,7 @@
-test:1:42:
+test:1:21:
   |
 1 | (program 1.1.0 (lam pubKeyHash-305478r71 (lam x x)))
-  |                                          ^
-unexpected '('
-expecting closing parenthesis ')'
+  |                     ^
+Invalid identifier 'pubKeyHash-305478r71' at test:1:21.
+A '-' inside a name is the numeric unique-suffix separator and must be followed only by digits and a word boundary.
+To use this text as a name verbatim, quote it with backticks: `pubKeyHash-305478r71`
Lines changed: 5 additions & 4 deletions
@@ -1,6 +1,7 @@
-test:1:24:
+test:1:21:
   |
 1 | (program 1.1.0 (lam foo-bar foo-bar))
-  |                        ^
-unexpected '-'
-expecting '`', identifier-unquoted, opening bracket '[', or opening parenthesis '('
+  |                     ^
+Invalid identifier 'foo-bar' at test:1:21.
+A '-' inside a name is the numeric unique-suffix separator and must be followed only by digits and a word boundary.
+To use this text as a name verbatim, quote it with backticks: `foo-bar`
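
All three goldens now report column 21 because the error is raised at the offset captured before the identifier was consumed, not at the point where parsing gave up. A small sketch of how an offset maps to a line, a column and a caret line is given below. `sourcePosAt` and `caretLine` are hypothetical helpers written for this example; Megaparsec does its own position tracking, and the gutter here assumes a single-digit line number, as in the goldens.

```haskell
-- Convert a 0-based character offset into a 1-based (line, column) pair.
sourcePosAt :: Int -> String -> (Int, Int)
sourcePosAt offset input =
  let consumed = take offset input
      line = 1 + length (filter (== '\n') consumed)
      -- Column counts characters since the most recent newline.
      col = 1 + length (takeWhile (/= '\n') (reverse consumed))
   in (line, col)

-- Caret line placed under a source line that is printed with a "1 | " gutter.
caretLine :: Int -> String
caretLine col = "  | " ++ replicate (col - 1) ' ' ++ "^"
```

For the input `(program 1.1.0 (lam foo-bar foo-bar))`, offset 20 is the `f` of the first `foo-bar` and maps to column 21, while offset 23 is the `-` where the old message pointed (column 24).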
