Skip to content

Fix #1557: eagerly validate separator after non-root number values#1615

Draft
seonwooj0810 wants to merge 2 commits into
FasterXML:3.xfrom
seonwooj0810:fix/1557-non-root-number-lazy-validation
Draft

Fix #1557: eagerly validate separator after non-root number values#1615
seonwooj0810 wants to merge 2 commits into
FasterXML:3.xfrom
seonwooj0810:fix/1557-non-root-number-lazy-validation

Conversation

@seonwooj0810

@seonwooj0810 seonwooj0810 commented May 29, 2026

Copy link
Copy Markdown
Contributor

Fixes #1557

Problem

Number values inside Arrays / Objects (non-root) were not checked for a
valid trailing separator. Only root-level values were validated (added in #105,
extended in #679 for the second-decimal-point case). For non-root numbers the
parser simply pushed the trailing character back and returned the number token,
so malformed content failed lazily — only when the next token was read:

// all of these reported a number token first, failing one token too late:
[ 123true ]   // -> 123
[ 100k ]      // -> 100
[ 100/ ]      // -> 100
[ 1.5x ]      // -> 1.5
[ 1.5false ]  // -> 1.5

Approach

Generalize the existing root-only _verifyRootSpace into a
_verifyNumberSeparator helper, called at every number-completion site in the
three blocking parsers (ReaderBasedJsonParser, UTF8StreamJsonParser,
UTF8DataInputJsonParser). Root values keep the exact previous behavior; a
non-root number must now be followed by one of:

  • white space (' ', '\t', '\n', '\r')
  • a value separator / structure end: , ] }
  • a comment start / or # when the corresponding JsonReadFeature is enabled
  • end-of-input

otherwise an expected space, comma, or closing bracket/brace ... error is
reported at the offending character (location aligned with the root path).

Performance: the check reuses the trailing character that the number loop
already reads to detect end-of-number (no extra buffer access / look-ahead),
and the accept/reject decision is a constant-time switch. The hot path cost is
therefore negligible, and the change keeps the same performance shape as the
existing _verifyRootSpace it generalizes.

Scope / follow-up

This PR covers the three blocking parsers. The async (non-blocking)
parser still fails lazily for this case — its number state machine completes at
several different points (including buffer-boundary resumption) and deserves a
separate, carefully-tested change. Its reproduction remains under
tofix/async/AsyncTokenBranchNumberErrorTest with @JacksonTestFailureExpected.

@cowtowncoder — do you want async handled in this PR, or as a separate follow-up?
Happy to extend if preferred.

Tests

  • Added nonRootMangledIntegers1557 / nonRootMangledFloats1557 to
    NonStandardNumberParsingTest (verified across ALL_MODES).
  • Added nonRootNumberFollowedByComment1557 to prove the / and # branches:
    numbers immediately adjacent to comments ([1/* c */,2], [1// c\n,2],
    [1#c\n,2]) still parse when comments are enabled.
  • Removed the now-fixed tofix repros for the blocking parsers (assertions moved
    into the suite above), mirroring how Number parsing should fail for trailing dot (period) #679 was handled.

Full ./mvnw clean test is green.

…values

Number values inside Arrays/Objects were only checked for a valid
trailing separator at root level (cf. FasterXML#105 / FasterXML#679). Non-root values
pushed the trailing character back unchecked, so malformed input such
as `[ 123true ]`, `[ 100k ]` or `[ 1.5x ]` reported a number token and
only failed lazily when accessing the *following* token.

Generalize the existing root-only `_verifyRootSpace` into
`_verifyNumberSeparator`, applied at every number-completion site in the
three blocking parsers (Reader / UTF-8 stream / UTF-8 DataInput). A
non-root number must now be followed by white space, `,`, `]`, `}`, a
comment start (when comments are enabled) or end-of-input; otherwise an
"expected space..." error is reported at the offending character.

The check reuses the trailing character already read to detect the end
of the number and is a constant-time switch, so the hot path cost is
negligible.

Tests: add non-root verification (and comment-adjacent cases) to
`NonStandardNumberParsingTest` across ALL_MODES; remove the now-fixed
`tofix` repros for the blocking parsers.

Async (non-blocking) parser is intentionally left as a follow-up; its
`tofix` repro remains under `@JacksonTestFailureExpected`.
@seonwooj0810 seonwooj0810 force-pushed the fix/1557-non-root-number-lazy-validation branch from 77ab9b7 to 1ae30da Compare May 29, 2026 00:45
@github-actions

Copy link
Copy Markdown
Contributor

📈 Overall Code Coverage

Metric Coverage Change
Instructions coverage 83.01% 📈 +0.000%
Branches branches 75.77% 📈 +0.050%

Overall project coverage from JaCoCo test results. Change values compare against the latest base branch build.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Parsing for non-root number values fails lazily

2 participants