
chore(message-parser): all contributions #39853

Merged
ggazzo merged 69 commits into develop from chore/merge-message-parser-prs on Apr 17, 2026

Conversation

@ggazzo
Member

@ggazzo ggazzo commented Mar 25, 2026

  • Investigate HeadingBlock.tsx change for breaking change potential
  • Fix HeadingBlock.tsx to properly render all Inlines[] using InlineElements (consistent with ParagraphBlock)
  • Run code validation

Summary by CodeRabbit

  • New Features

    • Better block parsing (headings, code blocks, lists, quotes, paragraphs).
    • Richer inline content in headings and improved link recognition.
    • Improved URL/autolink handling and timestamp parsing.
  • Bug Fixes

    • Corrected emoji joining and plain-text reduction.
    • Increased parser robustness and determinism.
  • Tests

    • Added fuzz tests, block-splitter/unit suites, and benchmark fixtures.
    • Tightened many tests to require exact parse results.
  • Documentation

    • Added a proposal for a compact AST format.

Task: ARCH-2105

@dionisio-bot
Contributor

dionisio-bot Bot commented Mar 25, 2026

Looks like this PR is ready to merge! 🎉
If you have any trouble, please check the PR guidelines

@changeset-bot

changeset-bot Bot commented Mar 25, 2026

🦋 Changeset detected

Latest commit: dd9d251

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 42 packages
Name Type
@rocket.chat/message-parser Patch
@rocket.chat/meteor Patch
@rocket.chat/core-services Patch
@rocket.chat/core-typings Patch
@rocket.chat/gazzodown Patch
@rocket.chat/livechat Patch
@rocket.chat/rest-typings Patch
@rocket.chat/pdf-worker Patch
rocketchat-services Patch
@rocket.chat/account-service Patch
@rocket.chat/authorization-service Patch
@rocket.chat/ddp-streamer Patch
@rocket.chat/omnichannel-transcript Patch
@rocket.chat/presence-service Patch
@rocket.chat/queue-worker Patch
@rocket.chat/abac Patch
@rocket.chat/federation-matrix Patch
@rocket.chat/network-broker Patch
@rocket.chat/omni-core-ee Patch
@rocket.chat/omnichannel-services Patch
@rocket.chat/presence Patch
@rocket.chat/uikit-playground Patch
@rocket.chat/api-client Patch
@rocket.chat/apps Patch
@rocket.chat/cron Patch
@rocket.chat/ddp-client Patch
@rocket.chat/fuselage-ui-kit Patch
@rocket.chat/http-router Patch
@rocket.chat/model-typings Patch
@rocket.chat/models Patch
@rocket.chat/ui-avatar Patch
@rocket.chat/ui-client Patch
@rocket.chat/ui-contexts Patch
@rocket.chat/ui-voip Patch
@rocket.chat/web-ui-registration Patch
@rocket.chat/license Patch
@rocket.chat/media-calls Patch
@rocket.chat/mock-providers Patch
@rocket.chat/instance-status Patch
@rocket.chat/omni-core Patch
@rocket.chat/server-fetch Patch
@rocket.chat/ui-video-conf Patch


@coderabbitai
Contributor

coderabbitai Bot commented Mar 25, 2026

Walkthrough

Adds a BlockSplitter module, overhauls the PEG grammar and utils, widens heading nodes to contain inline elements, introduces property-based fuzz tests with fast-check, adds many tests and benchmarks, and standardizes existing tests to strict equality assertions.

Changes

Cohort / File(s) Summary
Changesets
.changeset/add-parser-fuzz-testing.md, .changeset/block-splitter-layer1.md, .changeset/message-parser-guards-coverage.md, .changeset/message-parser-joinEmoji-tests.md
New changeset entries documenting fuzz testing, BlockSplitter PoC, guards coverage, and joinEmoji tests.
BlockSplitter Feature
packages/message-parser/src/BlockSplitter.ts, packages/message-parser/tests/blockSplitter.spec.ts
New exported BlockType/Block types and BlockSplitter.split implementation with comprehensive tests for headings, fences, lists, quotes, paragraphs, and edge cases (including incomplete fences).
Grammar Refactor (PEG)
packages/message-parser/src/grammar.pegjs
Large grammar rewrite: removed extractFirstResult, replaced emphasis/emoticon gating with first-char dispatch and PEG lookaheads, simplified timestamp/code/heading/inline/link rules, and adjusted fallback Any rule.
Type & Guards Changes
packages/message-parser/src/definitions.ts, packages/message-parser/src/guards.ts
Heading.value changed from Plain[] → Inlines[]; added KATEX, INLINE_KATEX, TIMESTAMP entries to Types; isNodeOfType generic typing switched from TypesKeys to Types[T].
Utils & Core Logic
packages/message-parser/src/utils.ts
Replaced URL validation with URL.canParse, added hasAbsoluteSchemePrefix, caching for tld options, refactored reducePlainTexts into fast/slow paths, changed timestampFromIsoTime param types from string[] → string, and removed extractFirstResult.
Heading Rendering Updates
packages/gazzodown/src/blocks/HeadingBlock.tsx, packages/gazzodown/src/PreviewMarkup.tsx, ee/packages/pdf-worker/src/.../HeadingBlock.tsx
Heading components now accept Inlines[] and render via shared InlineElements/PreviewInlineElements instead of mapping plain text.
Tests — Assertion Standardization
packages/message-parser/tests/{abuse,any,blockquotes,codeFence,color,email,emoji,emoticons,emphasis,emphasisWithEmoticons,escaped,inlineCode,inlineCodeStrike,katex,lineBreak,link,mention,orderedList,phoneNumber,spoiler,spoilerBlock,strikethrough,strongEmphasis,tasks,timestamp,unorderedList,url}.test.ts
Replaced many toMatchObject(output) assertions with strict toEqual(output) across roughly 28 test files to require exact AST equality.
New Tests & Fuzzing
packages/message-parser/tests/fuzz.test.ts, packages/message-parser/tests/guards.test.ts, packages/message-parser/tests/joinEmoji.test.ts, packages/message-parser/tests/skip-flags-regression.spec.ts
Added fast-check fuzz tests (non-throw and determinism), guards narrowing tests, emoji-joining tests for reducePlainTexts, and skip-flags regression tests for deep nesting/pathological strings.
Benchmarks & Package Config
packages/message-parser/benchmarks/parser.bench.ts, packages/message-parser/package.json
Expanded benchmark fixtures and added fast-check devDependency; added engines.node: 22.16.0.
Docs / Proposal
docs/proposals/compact-ast-format.md
New proposal describing a compact AST format using span references and short type keys to reduce redundant text payloads.
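The BlockSplitter layer summarized above can be sketched roughly as follows. This is a hypothetical reconstruction for illustration only, not the actual contents of packages/message-parser/src/BlockSplitter.ts; the real Block/BlockType API may differ. The heading cap of 4 hashes and the fence regex follow the review comments on this PR.

```typescript
// Hypothetical sketch of a first-pass block splitter: it only decides
// where blocks begin and end; inline parsing happens in a later layer.
type Block =
  | { type: 'HEADING'; level: number; text: string }
  | { type: 'CODE_FENCE'; language: string; lines: string[]; closed: boolean }
  | { type: 'PARAGRAPH'; lines: string[] };

function split(input: string): Block[] {
  const lines = input.split('\n');
  const blocks: Block[] = [];
  let paragraph: string[] = [];

  const flush = () => {
    if (paragraph.length > 0) {
      blocks.push({ type: 'PARAGRAPH', lines: paragraph });
      paragraph = [];
    }
  };

  for (let i = 0; i < lines.length; i++) {
    const line = lines[i];
    // Headings: 1-4 hashes, matching the grammar's HeadingStart cap.
    const heading = /^(#{1,4})\s+(.+)$/.exec(line);
    if (heading) {
      flush();
      blocks.push({ type: 'HEADING', level: heading[1].length, text: heading[2] });
      continue;
    }
    // Fence opener: three backticks plus an optional language token.
    const fence = /^`{3}[ \t]*([A-Za-z0-9_-]*)$/.exec(line);
    if (fence) {
      flush();
      const body: string[] = [];
      let closed = false;
      while (++i < lines.length) {
        if (/^`{3}[ \t]*$/.test(lines[i])) {
          closed = true;
          break;
        }
        body.push(lines[i]);
      }
      blocks.push({ type: 'CODE_FENCE', language: fence[1], lines: body, closed });
      continue;
    }
    if (line.trim() === '') {
      flush();
      continue;
    }
    paragraph.push(line);
  }
  flush();
  return blocks;
}
```

Note how an unterminated fence yields a block with `closed: false` rather than throwing, matching the "incomplete fences" edge case the tests cover.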

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~55 minutes

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 30.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check ❓ Inconclusive The title "chore(message-parser): all contributions" is vague and generic. It uses the non-descriptive term "all contributions" which does not convey meaningful information about the specific changes in the changeset. Replace the vague phrase "all contributions" with a specific summary of the main changes. For example: "chore(message-parser): grammar refactor and performance optimizations" or "chore(message-parser): inline rendering and parser optimizations".
✅ Passed checks (1 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.



Comment @coderabbitai help to get the list of available commands and usage tips.

@ggazzo ggazzo force-pushed the chore/merge-message-parser-prs branch 3 times, most recently from 906ec70 to cfa651e Compare March 25, 2026 17:22
@codecov

codecov Bot commented Mar 25, 2026

Codecov Report

❌ Patch coverage is 96.16519% with 13 lines in your changes missing coverage. Please review.
✅ Project coverage is 69.74%. Comparing base (7d19209) to head (dd9d251).
⚠️ Report is 2 commits behind head on develop.

Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff             @@
##           develop   #39853      +/-   ##
===========================================
- Coverage    70.07%   69.74%   -0.33%     
===========================================
  Files         3289     3290       +1     
  Lines       117541   119068    +1527     
  Branches     20844    21574     +730     
===========================================
+ Hits         82361    83039     +678     
- Misses       31890    32718     +828     
- Partials      3290     3311      +21     
Flag Coverage Δ
e2e 59.64% <ø> (-0.07%) ⬇️
e2e-api 46.24% <ø> (-0.02%) ⬇️
unit 70.49% <96.16%> (-0.43%) ⬇️

Flags with carried forward coverage won't be shown.


@ggazzo ggazzo force-pushed the chore/merge-message-parser-prs branch from cfa651e to 3cd9dcc Compare March 25, 2026 17:28
@ggazzo ggazzo force-pushed the chore/merge-message-parser-prs branch from 3cd9dcc to c59b51e Compare March 25, 2026 17:29
ggazzo and others added 16 commits March 25, 2026 14:30
Ensure ASTNode includes all parser-emitted node types via Types map and improve isNodeOfType narrowing.

Add guard tests for TIMESTAMP, ORDERED_LIST, and IMAGE nodes.

Refs #39057
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 4

🧹 Nitpick comments (15)
packages/message-parser/src/utils.ts (1)

190-215: Remove the fast/slow-path prose from the implementation.

A pair of small helpers would keep this branch readable without inline narration in the hot path.

As per coding guidelines, "Avoid code comments in the implementation".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/message-parser/src/utils.ts` around lines 190 - 215, The
fast/slow-path comments should be removed and the logic split into small
helpers: extract the nested check on flattenableValues into a helper like
isSimpleInline(v: unknown): boolean that returns false for Array.isArray(v) or
(v as Inlines).type === 'EMOJI', then replace the inline loop that sets
needsSlowPath with a call to flattenableValues.every(isSimpleInline); also
extract the plain-text merging logic into a helper like pushOrMerge(result:
Paragraph['value'], current: Inlines) that appends current or merges into the
previous PLAIN_TEXT entry, and then use those helpers in place of the commented
fast/slow branches so the implementation reads clearly without prose.
packages/message-parser/package.json (1)

69-71: Node engine constraint mirrors root but uses exact version pinning.

The "node": "22.16.0" constraint matches the root package.json, suggesting this strict versioning is intentional across the monorepo. However, exact patch-level pinning can cause friction for contributors or CI systems using adjacent versions (e.g., 22.16.1). If strict versioning is required, consider documenting why in the repository or use a .nvmrc file. Otherwise, relaxing to a range (e.g., ">=22.16.0") would improve compatibility without sacrificing stability.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/message-parser/package.json` around lines 69 - 71, The package.json
currently pins Node exactly with "engines": { "node": "22.16.0" } which causes
unnecessary friction; update the Node engine constraint in
packages/message-parser's package.json (the "engines" -> "node" entry) to a more
tolerant range such as ">=22.16.0" (or mirror whatever range the root
package.json uses) or add documentation/.nvmrc if strict pinning is intentional
so contributors and CI can use adjacent patch versions without failing.
packages/message-parser/tests/katex.test.ts (1)

33-34: Use toEqual() for consistency across all three test suites, or document why partial matching is intentional for the second and third suites.

The first test suite (line 21) uses toEqual() for exact matching, while the second and third suites use toMatchObject() for partial matching. Since all helper functions create simple, deterministic objects with predictable properties and all three suites test the same parse() function, strict matching should be consistent across all three. Using toMatchObject() can mask unexpected properties in the parser output.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/message-parser/tests/katex.test.ts` around lines 33 - 34, The test
uses toMatchObject() for the parse(input, { katex: { dollarSyntax: true } })
assertion which allows partial matching and is inconsistent with the first
suite; change that assertion to use toEqual() so the test asserts exact equality
against the expected output object (refer to the parse function, the katex
option object, and the input/output test variables in katex.test.ts) to ensure
strict, deterministic comparisons.
packages/message-parser/tests/fuzz.test.ts (1)

5-5: Remove redundant filter from plainTextArbitrary.

{1,20} already guarantees non-empty strings, so the extra .filter adds unnecessary generator overhead.

♻️ Suggested simplification
-const plainTextArbitrary = fc.stringMatching(/[a-zA-Z0-9 ]{1,20}/).filter((value) => value.length > 0);
+const plainTextArbitrary = fc.stringMatching(/[a-zA-Z0-9 ]{1,20}/);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/message-parser/tests/fuzz.test.ts` at line 5, The generator
plainTextArbitrary currently applies an unnecessary .filter to enforce non-empty
strings even though the regex /[a-zA-Z0-9 ]{1,20}/ already guarantees length
>=1; remove the .filter((value) => value.length > 0) call and keep the
fc.stringMatching(/[a-zA-Z0-9 ]{1,20}/) expression (reference symbol:
plainTextArbitrary) to eliminate redundant overhead.
packages/message-parser/tests/url.test.ts (1)

163-164: Use toEqual instead of toMatchObject for invalid-port autoLink assertions.

This keeps strictness consistent with the rest of the suite and catches unexpected shape drift.

✅ Suggested matcher tightening
-		expect(autoLink('https://rocket.chat:99999')).toMatchObject(link('https://rocket.chat:99999'));
-		expect(autoLink('https://rocket.chat:65536')).toMatchObject(link('https://rocket.chat:65536'));
+		expect(autoLink('https://rocket.chat:99999')).toEqual(link('https://rocket.chat:99999'));
+		expect(autoLink('https://rocket.chat:65536')).toEqual(link('https://rocket.chat:65536'));
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/message-parser/tests/url.test.ts` around lines 163 - 164, Change the
two invalid-port assertions to use strict equality: replace the
expect(...).toMatchObject(...) calls with expect(...).toEqual(...) for the
autoLink('https://rocket.chat:99999') and autoLink('https://rocket.chat:65536')
assertions so the test compares full object equality (using the existing
autoLink and link helpers) rather than a loose shape match.
packages/message-parser/tests/joinEmoji.test.ts (2)

13-23: Prefer toEqual for exact token-shape assertions.

toMatchObject can miss extra/unexpected fields in parser outputs.

🔍 Suggested strict assertions
-		expect(result[0]).toMatchObject({ type: 'PLAIN_TEXT', value: 'hello world' });
+		expect(result[0]).toEqual({ type: 'PLAIN_TEXT', value: 'hello world' });
@@
-		expect(result[0]).toMatchObject({
+		expect(result[0]).toEqual({
 			type: 'PLAIN_TEXT',
 			value: 'hello:smile:world',
 		});
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/message-parser/tests/joinEmoji.test.ts` around lines 13 - 23,
Replace the loose toMatchObject assertions in the joinEmoji tests with strict
equality checks so unexpected fields are caught: in the tests that call
reducePlainTexts([...]) (using helpers plain and emoji) change
expect(result[0]).toMatchObject({...}) to expect(result[0]).toEqual({...}) and
supply the exact token shape you expect (exact type and value) for both the
single-token merge test and the emoji-with-neighbors test so the assertions fail
if extra fields are present.

5-7: Add an explicit length assertion before indexing result[0].

This makes failures clearer when output shape changes.

🧪 Suggested assertion hardening
 	it('keeps emoji when alone', () => {
 		const result = reducePlainTexts([emoji('smile')]);
+		expect(result).toHaveLength(1);
 		expect(result[0].type).toBe('EMOJI');
 	});
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/message-parser/tests/joinEmoji.test.ts` around lines 5 - 7, Add an
explicit length assertion on the result from reducePlainTexts before indexing
it: after calling reducePlainTexts([emoji('smile')]) assert the array length
(e.g., expect(result).toHaveLength(1) or expect(result.length).toBe(1)) and then
assert result[0].type === 'EMOJI'; this ensures failures are clearer when the
output shape changes and references the reducePlainTexts call and the result
variable used in the test.
packages/message-parser/src/compact.ts (5)

69-72: Unused optional property in CBlock heading type.

The r?: Span property on the heading block type ({ t: 'h'; l: 1 | 2 | 3 | 4; c: CInline[]; r?: Span }) is defined but never populated in compactBlock (line 450 only returns { t: 'h', l, c }).

If r is intended for future use, add a comment. Otherwise, consider removing to keep the type surface minimal.

-	| { t: 'h'; l: 1 | 2 | 3 | 4; c: CInline[]; r?: Span }
+	| { t: 'h'; l: 1 | 2 | 3 | 4; c: CInline[] }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/message-parser/src/compact.ts` around lines 69 - 72, The CBlock
heading variant declares an unused optional property r?: Span; update the
declaration or code to match actual use: either remove r from the heading
variant in the CBlock union to keep the type minimal, or if r is reserved, add
an inline comment explaining it's intentionally unused for future use and leave
it; locate the CBlock type and the compactBlock function (which currently
returns { t: 'h', l, c }) and make the type and implementation consistent
(remove r or document its intent).

436-438: Type assertion loses compile-time safety.

Casting node to any bypasses TypeScript's exhaustiveness checking for the switch statement. If a new block type is added to Root, the compiler won't flag this switch as incomplete.

Consider using a typed union or a discriminated union pattern to preserve type safety.

♻️ Suggested approach
-function compactBlock(ctx: Ctx, node: Paragraph | Root[number]): CBlock {
-	const n = node as any;
-	switch (n.type) {
+function compactBlock(ctx: Ctx, node: Root[number]): CBlock {
+	switch (node.type) {
 		case 'PARAGRAPH': {
-			const para = node as Paragraph;
+			const para = node;
 			const c = para.value.map((ch) => compactInline(ctx, ch));

Then use node directly with proper narrowing in each case.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/message-parser/src/compact.ts` around lines 436 - 438, The code
casts node to any in compactBlock (signature: compactBlock(ctx: Ctx, node:
Paragraph | Root[number])) which disables TypeScript's exhaustiveness checking;
remove the `as any` cast, ensure the parameter uses a proper discriminated union
type for block nodes (e.g., a BlockNode union or the correct Root[number]
union), switch on `node.type` directly, and add an exhaustive fallback (use a
helper like assertNever on the unreachable branch) so the compiler will error if
a new block type is added.
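The assertNever pattern recommended here can be illustrated in isolation. Para and Heading below are simplified stand-ins for the real AST node types, not the actual definitions in the package:

```typescript
// Minimal illustration of exhaustiveness checking via assertNever.
type Para = { type: 'PARAGRAPH'; value: string };
type Heading = { type: 'HEADING'; level: number; value: string };
type BlockNode = Para | Heading;

function assertNever(x: never): never {
  throw new Error(`Unhandled node type: ${JSON.stringify(x)}`);
}

function describeNode(node: BlockNode): string {
  switch (node.type) {
    case 'PARAGRAPH':
      return `paragraph(${node.value})`;
    case 'HEADING':
      return `h${node.level}(${node.value})`;
    default:
      // If a new member is added to BlockNode without a case here,
      // `node` is no longer `never` and this line fails to compile.
      return assertNever(node);
  }
}
```

The payoff is purely at compile time: extending the union without updating the switch becomes a type error instead of a silent fallthrough.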

574-592: Custom deepEqual implementation is functional but consider alternatives.

The implementation works correctly for the use case. If the project already uses a testing utility library like Jest's expect, you could leverage expect(a).toEqual(b) in tests while keeping this for runtime validation. Alternatively, a well-tested utility like lodash.isEqual could be considered, though avoiding dependencies is also valid.

The key sorting (lines 588-589) adds unnecessary overhead for equality checking—comparing key sets directly would suffice.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/message-parser/src/compact.ts` around lines 574 - 592, The deepEqual
function should avoid sorting key arrays for performance; instead, compute keysA
and keysB without sorting, check that keysA.length === keysB.length, then verify
every key in keysA exists in keysB (e.g., via a Set or includes) before
comparing values recursively; update the block referencing keysA and keysB
(currently lines creating keysA/keysB and the deepEqual(keysA, keysB) check) to
use this length+existence check so key-ordering and unnecessary sorting overhead
are removed while preserving the recursive value comparison in deepEqual.
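The sort-free comparison suggested above could be sketched like this (a generic illustration, not the code in compact.ts):

```typescript
// deepEqual without key sorting: compare key-set size plus membership,
// then recurse into values. Key order never matters for object equality.
function deepEqual(a: unknown, b: unknown): boolean {
  if (a === b) return true;
  if (typeof a !== 'object' || typeof b !== 'object' || a === null || b === null) return false;
  if (Array.isArray(a) !== Array.isArray(b)) return false;
  if (Array.isArray(a) && Array.isArray(b)) {
    return a.length === b.length && a.every((v, i) => deepEqual(v, b[i]));
  }
  const ra = a as Record<string, unknown>;
  const rb = b as Record<string, unknown>;
  const keysA = Object.keys(ra);
  const keysB = Object.keys(rb);
  if (keysA.length !== keysB.length) return false;
  // Equal counts plus every keyA present in b implies identical key sets.
  return keysA.every((k) => k in rb && deepEqual(ra[k], rb[k]));
}
```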

121-131: Type narrowing could be cleaner.

The conditional 'c' in node && Array.isArray((node as any).c) with subsequent cast is awkward. Since node.t === 'a' at this point, consider using a type guard or restructuring the union type to make narrowing more straightforward.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/message-parser/src/compact.ts` around lines 121 - 131, The narrowing
for case 'a' in compact.ts is awkward; replace the ad-hoc "'c' in node &&
Array.isArray((node as any).c)" check and cast by using a proper type guard or
discriminated union for the inline node (e.g., add or use an InlineA type
predicate that checks node.t === 'a' and Array.isArray(node.c)), then use that
guard so you can treat node as { t: 'a'; c: CInline[]; s: string } without
casting in the block where you call expandInline and construct the Link
(referencing the switch case for 'a', the node variable, expandInline, CInline,
Plain, Markup, and Link).

277-279: Regex created on every loop iteration.

The regex /\s/ is constructed on each iteration of the while loop. Precompiling it as a module-level constant would be marginally more efficient, especially for messages with many whitespace characters.

+const WHITESPACE_RE = /\s/;
+
 function skipWhitespace(ctx: Ctx) {
-	while (ctx.pos < ctx.msg.length && /\s/.test(ctx.msg[ctx.pos])) ctx.pos++;
+	while (ctx.pos < ctx.msg.length && WHITESPACE_RE.test(ctx.msg[ctx.pos])) ctx.pos++;
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/message-parser/src/compact.ts` around lines 277 - 279, The loop in
function skipWhitespace repeatedly constructs the regex /\s/ on each iteration;
predeclare a module-level constant (e.g., const WHITESPACE_RE = /\s/;) and use
that inside skipWhitespace to test ctx.msg[ctx.pos], updating ctx.pos while
ctx.pos < ctx.msg.length && WHITESPACE_RE.test(ctx.msg[ctx.pos]); this avoids
recreating the regex and slightly improves performance for long runs over
ctx.msg.
packages/message-parser/tests/compact.test.ts (3)

336-339: Test output logging may clutter CI logs.

The console.log inside the test body outputs size metrics for every parameterized case. While useful during development, it may add noise to CI output. Consider guarding behind an environment flag or removing.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/message-parser/tests/compact.test.ts` around lines 336 - 339, Remove
or gate the console.log call that prints size metrics in the parameterized test
(the line using `console.log(\`  ${_label}: ${result.sizeOld}B →
${result.sizeNew}B (${result.reduction})\`)`) to avoid cluttering CI; either
delete that statement from the test or wrap it in a conditional that checks an
environment flag (e.g., process.env.SHOW_TEST_LOGS) so the log only appears when
explicitly enabled, leaving the assertions
(expect(result.sizeNew).toBeLessThan(result.sizeOld)) unchanged.

119-134: Type casts bypass compile-time safety.

The as unknown as Root casts here and in similar locations (lines 122, 128, 134, 144, 148) suppress type mismatches between test fixtures and the actual Root type. This could mask regressions if the Root type changes.

Consider defining explicit test fixture types or ensuring fixtures align with the actual AST types to catch breaking changes at compile time.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/message-parser/tests/compact.test.ts` around lines 119 - 134, The
test fixtures in packages/message-parser/tests/compact.test.ts are being
force-cast with "as unknown as Root" (used in expectExpand calls), which hides
type mismatches; update the tests by giving fixtures the correct AST types
instead of using the double cast: either (A) change the fixtures passed to
expectExpand to conform to the real Root structure and remove the casts, or (B)
introduce a typed helper/type alias for the test fixtures (e.g., export a
TestRoot or concrete typed fixture objects) and update expectExpand's signature
to accept that type so you can remove "as unknown as Root"; refer to the
expectExpand calls and the Root type to locate and fix each occurrence.

18-22: Debug logging left in test helper.

The console.log statements will produce noise in test output during CI runs. Consider using a conditional debug flag or removing them.

🧹 Suggested fix
 	if (!result.ok) {
-		console.log('Old:', JSON.stringify(oldMd, null, 2));
-		console.log('Compact:', JSON.stringify(result.compact, null, 2));
-		console.log('Expanded:', JSON.stringify(result.expanded, null, 2));
+		// Failure info is available via result object; Jest's error diff will show the mismatch
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/message-parser/tests/compact.test.ts` around lines 18 - 22, Remove
the stray debug console.log calls left in the test helper in
packages/message-parser/tests/compact.test.ts: the block that prints oldMd,
result.compact, and result.expanded when (!result.ok) should not run
unconditionally in CI; either delete those console.log lines or gate them behind
a debug flag (e.g., check process.env.DEBUG_TESTS or a local constant) so they
only print when explicitly enabled, leaving the failure assertion/reporting
logic for result.ok intact.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/message-parser/src/BlockSplitter.ts`:
- Around line 27-35: The heading detection in BlockSplitter uses
/^(#{1,6})\s+(.+)$/ which allows up to 6 hashes, but the real parser's grammar
(HeadingStart) caps at 4; change the regex in the heading detection (where
headingMatch is computed in BlockSplitter) to limit hashes to {1,4} so
BlockType.HEADING, the currentBlock creation, and the subsequent flush behavior
align with the parser; alternatively, if you intend to accept 5–6 hashes, update
packages/message-parser/src/grammar.pegjs's HeadingStart to match that
range—pick one consistent approach and update the corresponding code
(headingMatch logic in BlockSplitter or HeadingStart in the grammar) so both
components agree.
- Around line 40-52: BlockSplitter currently uses line.startsWith('```') which
treats lines like ```` or '``` js' as fences; update the fence checks to mirror
the PEG grammar by using the same regex matcher instead of startsWith (e.g.,
match /^```[ \t]*[A-Za-z0-9_-]*$/ and capture the optional language token) when
detecting openers and terminators in the BlockSplitter parsing loop (replace the
two startsWith('```') checks around variable line and the while-termination
condition), extract language from the capture group rather than slicing, and
keep the rest of the codePaths (codeLines, closed flag, i++) the same.

In `@packages/message-parser/src/grammar.pegjs`:
- Around line 131-133: The CodeChunkChar rule currently rejects backticks unless
followed by a non-backtick, which prevents lines that end with one or two
backticks from being parsed; update CodeChunkChar so it still prevents consuming
the start of a triple-backtick fence but allows a single "`" or double "``" to
be consumed when they are followed by a line break or end-of-input. Concretely,
adjust CodeChunkChar (the rule used by CodeChunk) to include alternatives that
match "`" and "``" when the next character is a newline or EOF (while keeping
the existing alternative that disallows backticks when followed by another
backtick to avoid matching "```").
- Around line 105-109: The Timezone rule only matches +/-HH:MM, so ISO8601Date
and ISO8601DateWithoutMilliseconds no longer recognize the trailing "Z"; update
the Timezone production to also accept an uppercase or lowercase "Z" (e.g.,
allow "Z" or "z" as a valid tz) and ensure the tz value passed into
timestampFromIsoTime is normalized (e.g., "Z" for UTC or the original "+/-HH:MM"
string) so ISO8601Date and ISO8601DateWithoutMilliseconds will return timestamp
nodes for timestamps ending with Z.
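The Z-suffix fix can be illustrated with an equivalent regex outside the grammar. TZ_RE, normalizeTz, and extractTz are hypothetical names for illustration only; the real change belongs in the Timezone rule of grammar.pegjs and the value handed to timestampFromIsoTime:

```typescript
// Accept "Z"/"z" as well as +/-HH:MM offsets, normalizing to "Z" for UTC.
const TZ_RE = /(?:[Zz]|[+-]\d{2}:\d{2})$/;

function normalizeTz(tz: string): string {
  return tz.toLowerCase() === 'z' ? 'Z' : tz;
}

// Returns the normalized timezone suffix of an ISO 8601 string, or null.
function extractTz(isoTime: string): string | null {
  const m = TZ_RE.exec(isoTime);
  return m ? normalizeTz(m[0]) : null;
}
```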

---

Nitpick comments:
In `@packages/message-parser/package.json`:
- Around line 69-71: The package.json currently pins Node exactly with
"engines": { "node": "22.16.0" } which causes unnecessary friction; update the
Node engine constraint in packages/message-parser's package.json (the "engines"
-> "node" entry) to a more tolerant range such as ">=22.16.0" (or mirror
whatever range the root package.json uses) or add documentation/.nvmrc if strict
pinning is intentional so contributors and CI can use adjacent patch versions
without failing.

In `@packages/message-parser/src/compact.ts`:
- Around line 69-72: The CBlock heading variant declares an unused optional
property r?: Span; update the declaration or code to match actual use: either
remove r from the heading variant in the CBlock union to keep the type minimal,
or if r is reserved, add an inline comment explaining it's intentionally unused
for future use and leave it; locate the CBlock type and the compactBlock
function (which currently returns { t: 'h', l, c }) and make the type and
implementation consistent (remove r or document its intent).
- Around line 436-438: The code casts node to any in compactBlock (signature:
compactBlock(ctx: Ctx, node: Paragraph | Root[number])) which disables
TypeScript's exhaustiveness checking; remove the `as any` cast, ensure the
parameter uses a proper discriminated union type for block nodes (e.g., a
BlockNode union or the correct Root[number] union), switch on `node.type`
directly, and add an exhaustive fallback (use a helper like assertNever on the
unreachable branch) so the compiler will error if a new block type is added.
- Around line 574-592: The deepEqual function should avoid sorting key arrays
for performance; instead, compute keysA and keysB without sorting, check that
keysA.length === keysB.length, then verify every key in keysA exists in keysB
(e.g., via a Set or includes) before comparing values recursively; update the
block referencing keysA and keysB (currently lines creating keysA/keysB and the
deepEqual(keysA, keysB) check) to use this length+existence check so
key-ordering and unnecessary sorting overhead are removed while preserving the
recursive value comparison in deepEqual.
- Around line 121-131: The narrowing for case 'a' in compact.ts is awkward;
replace the ad-hoc "'c' in node && Array.isArray((node as any).c)" check and
cast by using a proper type guard or discriminated union for the inline node
(e.g., add or use an InlineA type predicate that checks node.t === 'a' and
Array.isArray(node.c)), then use that guard so you can treat node as { t: 'a';
c: CInline[]; s: string } without casting in the block where you call
expandInline and construct the Link (referencing the switch case for 'a', the
node variable, expandInline, CInline, Plain, Markup, and Link).
- Around line 277-279: The loop in function skipWhitespace repeatedly constructs
the regex /\s/ on each iteration; predeclare a module-level constant (e.g.,
const WHITESPACE_RE = /\s/;) and use that inside skipWhitespace to test
ctx.msg[ctx.pos], updating ctx.pos while ctx.pos < ctx.msg.length &&
WHITESPACE_RE.test(ctx.msg[ctx.pos]); this avoids recreating the regex and
slightly improves performance for long runs over ctx.msg.
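The last point can be sketched as follows (the `Ctx` shape here is a hypothetical stand-in for the real parser context in compact.ts):

```typescript
// Module-level constant: the regex is compiled once, not once per loop iteration.
const WHITESPACE_RE = /\s/;

interface Ctx {
  msg: string;
  pos: number;
}

// Hypothetical shape of skipWhitespace, mirroring the reviewer's suggestion.
function skipWhitespace(ctx: Ctx): void {
  while (ctx.pos < ctx.msg.length && WHITESPACE_RE.test(ctx.msg[ctx.pos])) {
    ctx.pos++;
  }
}
```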

In `@packages/message-parser/src/utils.ts`:
- Around line 190-215: The fast/slow-path comments should be removed and the
logic split into small helpers: extract the nested check on flattenableValues
into a helper like isSimpleInline(v: unknown): boolean that returns false for
Array.isArray(v) or (v as Inlines).type === 'EMOJI', then replace the inline
loop that sets needsSlowPath with a call to
flattenableValues.every(isSimpleInline); also extract the plain-text merging
logic into a helper like pushOrMerge(result: Paragraph['value'], current:
Inlines) that appends current or merges into the previous PLAIN_TEXT entry, and
then use those helpers in place of the commented fast/slow branches so the
implementation reads clearly without prose.
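A sketch of those two helpers, with a minimal stand-in type (the real Inlines union lives in definitions.ts):

```typescript
// Minimal stand-in for the real Inlines type (hypothetical).
type Inlines = { type: string; value: unknown };

// True when a value can take the fast path: not nested and not an emoji.
const isSimpleInline = (v: unknown): boolean =>
  !Array.isArray(v) && (v as Inlines).type !== 'EMOJI';

// Append `current`, or merge it into a trailing PLAIN_TEXT entry.
const pushOrMerge = (result: Inlines[], current: Inlines): void => {
  const last = result[result.length - 1];
  if (last?.type === 'PLAIN_TEXT' && current.type === 'PLAIN_TEXT') {
    last.value = String(last.value) + String(current.value);
    return;
  }
  result.push(current);
};
```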

In `@packages/message-parser/tests/compact.test.ts`:
- Around line 336-339: Remove or gate the console.log call that prints size
metrics in the parameterized test (the line using `console.log(\`  ${_label}:
${result.sizeOld}B → ${result.sizeNew}B (${result.reduction})\`)`) to avoid
cluttering CI; either delete that statement from the test or wrap it in a
conditional that checks an environment flag (e.g., process.env.SHOW_TEST_LOGS)
so the log only appears when explicitly enabled, leaving the assertions
(expect(result.sizeNew).toBeLessThan(result.sizeOld)) unchanged.
- Around line 119-134: The test fixtures in
packages/message-parser/tests/compact.test.ts are being force-cast with "as
unknown as Root" (used in expectExpand calls), which hides type mismatches;
update the tests by giving fixtures the correct AST types instead of using the
double cast: either (A) change the fixtures passed to expectExpand to conform to
the real Root structure and remove the casts, or (B) introduce a typed
helper/type alias for the test fixtures (e.g., export a TestRoot or concrete
typed fixture objects) and update expectExpand's signature to accept that type
so you can remove "as unknown as Root"; refer to the expectExpand calls and the
Root type to locate and fix each occurrence.
- Around line 18-22: Remove the stray debug console.log calls left in the test
helper in packages/message-parser/tests/compact.test.ts: the block that prints
oldMd, result.compact, and result.expanded when (!result.ok) should not run
unconditionally in CI; either delete those console.log lines or gate them behind
a debug flag (e.g., check process.env.DEBUG_TESTS or a local constant) so they
only print when explicitly enabled, leaving the failure assertion/reporting
logic for result.ok intact.

In `@packages/message-parser/tests/fuzz.test.ts`:
- Line 5: The generator plainTextArbitrary currently applies an unnecessary
.filter to enforce non-empty strings even though the regex /[a-zA-Z0-9 ]{1,20}/
already guarantees length >=1; remove the .filter((value) => value.length > 0)
call and keep the fc.stringMatching(/[a-zA-Z0-9 ]{1,20}/) expression (reference
symbol: plainTextArbitrary) to eliminate redundant overhead.

In `@packages/message-parser/tests/joinEmoji.test.ts`:
- Around line 13-23: Replace the loose toMatchObject assertions in the joinEmoji
tests with strict equality checks so unexpected fields are caught: in the tests
that call reducePlainTexts([...]) (using helpers plain and emoji) change
expect(result[0]).toMatchObject({...}) to expect(result[0]).toEqual({...}) and
supply the exact token shape you expect (exact type and value) for both the
single-token merge test and the emoji-with-neighbors test so the assertions fail
if extra fields are present.
- Around line 5-7: Add an explicit length assertion on the result from
reducePlainTexts before indexing it: after calling
reducePlainTexts([emoji('smile')]) assert the array length (e.g.,
expect(result).toHaveLength(1) or expect(result.length).toBe(1)) and then assert
result[0].type === 'EMOJI'; this ensures failures are clearer when the output
shape changes and references the reducePlainTexts call and the result variable
used in the test.

In `@packages/message-parser/tests/katex.test.ts`:
- Around line 33-34: The test uses toMatchObject() for the parse(input, { katex:
{ dollarSyntax: true } }) assertion which allows partial matching and is
inconsistent with the first suite; change that assertion to use toEqual() so the
test asserts exact equality against the expected output object (refer to the
parse function, the katex option object, and the input/output test variables in
katex.test.ts) to ensure strict, deterministic comparisons.

In `@packages/message-parser/tests/url.test.ts`:
- Around line 163-164: Change the two invalid-port assertions to use strict
equality: replace the expect(...).toMatchObject(...) calls with
expect(...).toEqual(...) for the autoLink('https://rocket.chat:99999') and
autoLink('https://rocket.chat:65536') assertions so the test compares full
object equality (using the existing autoLink and link helpers) rather than a
loose shape match.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 8f89bec5-7c62-4c69-a194-bd91b6f183d3

📥 Commits

Reviewing files that changed from the base of the PR and between 44938fd and 137c237.

📒 Files selected for processing (50)
  • .changeset/add-parser-fuzz-testing.md
  • .changeset/block-splitter-layer1.md
  • .changeset/message-parser-guards-coverage.md
  • .changeset/message-parser-joinEmoji-tests.md
  • ee/packages/pdf-worker/src/templates/ChatTranscript/markup/blocks/HeadingBlock.tsx
  • packages/gazzodown/src/PreviewMarkup.tsx
  • packages/gazzodown/src/blocks/HeadingBlock.tsx
  • packages/message-parser/benchmarks/parser.bench.ts
  • packages/message-parser/package.json
  • packages/message-parser/src/BlockSplitter.ts
  • packages/message-parser/src/compact.ts
  • packages/message-parser/src/definitions.ts
  • packages/message-parser/src/grammar.pegjs
  • packages/message-parser/src/guards.ts
  • packages/message-parser/src/utils.ts
  • packages/message-parser/tests/abuse.test.ts
  • packages/message-parser/tests/any.test.ts
  • packages/message-parser/tests/blockSplitter.spec.ts
  • packages/message-parser/tests/blockquotes.test.ts
  • packages/message-parser/tests/codeFence.test.ts
  • packages/message-parser/tests/color.test.ts
  • packages/message-parser/tests/compact.test.ts
  • packages/message-parser/tests/email.test.ts
  • packages/message-parser/tests/emoji.test.ts
  • packages/message-parser/tests/emoticons.test.ts
  • packages/message-parser/tests/emphasis.test.ts
  • packages/message-parser/tests/emphasisWithEmoticons.test.ts
  • packages/message-parser/tests/escaped.test.ts
  • packages/message-parser/tests/fuzz.test.ts
  • packages/message-parser/tests/guards.test.ts
  • packages/message-parser/tests/heading.test.ts
  • packages/message-parser/tests/image.test.ts
  • packages/message-parser/tests/inlineCode.test.ts
  • packages/message-parser/tests/inlineCodeStrike.test.ts
  • packages/message-parser/tests/joinEmoji.test.ts
  • packages/message-parser/tests/katex.test.ts
  • packages/message-parser/tests/lineBreak.test.ts
  • packages/message-parser/tests/link.test.ts
  • packages/message-parser/tests/mention.test.ts
  • packages/message-parser/tests/orderedList.test.ts
  • packages/message-parser/tests/phoneNumber.test.ts
  • packages/message-parser/tests/skip-flags-regression.spec.ts
  • packages/message-parser/tests/spoiler.test.ts
  • packages/message-parser/tests/spoilerBlock.test.ts
  • packages/message-parser/tests/strikethrough.test.ts
  • packages/message-parser/tests/strongEmphasis.test.ts
  • packages/message-parser/tests/tasks.test.ts
  • packages/message-parser/tests/timestamp.test.ts
  • packages/message-parser/tests/unorderedList.test.ts
  • packages/message-parser/tests/url.test.ts

Comment on lines +27 to +35
const headingMatch = line.match(/^(#{1,6})\s+(.+)$/);
if (headingMatch) {
this.flush(blocks, currentBlock);
currentBlock = {
type: BlockType.HEADING,
content: headingMatch[2],
level: headingMatch[1].length,
};
this.flush(blocks, currentBlock);
⚠️ Potential issue | 🟡 Minor

Keep heading detection aligned with the real parser.

This splitter accepts `#{1,6}`, but `packages/message-parser/src/grammar.pegjs` still caps `HeadingStart` at 4, so `##### title` will be a HEADING here but plain text in the actual parser.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/message-parser/src/BlockSplitter.ts` around lines 27 - 35, The
heading detection in BlockSplitter uses /^(#{1,6})\s+(.+)$/ which allows up to 6
hashes, but the real parser's grammar (HeadingStart) caps at 4; change the regex
in the heading detection (where headingMatch is computed in BlockSplitter) to
limit hashes to {1,4} so BlockType.HEADING, the currentBlock creation, and the
subsequent flush behavior align with the parser; alternatively, if you intend to
accept 5–6 hashes, update packages/message-parser/src/grammar.pegjs's
HeadingStart to match that range—pick one consistent approach and update the
corresponding code (headingMatch logic in BlockSplitter or HeadingStart in the
grammar) so both components agree.
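A minimal sketch of the aligned splitter regex, assuming the grammar's 4-hash cap is the intended behavior (`matchHeading` is a hypothetical helper, not the actual BlockSplitter code):

```typescript
// Cap at 4 hashes to match the grammar's HeadingStart limit.
const HEADING_RE = /^(#{1,4})\s+(.+)$/;

function matchHeading(line: string): { level: number; content: string } | null {
  const m = line.match(HEADING_RE);
  return m ? { level: m[1].length, content: m[2] } : null;
}
```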

Comment on lines +40 to +52
if (line.startsWith('```')) {
this.flush(blocks, currentBlock);
const language = line.slice(3).trim();
const codeLines = [];
let closed = false;
i++;
while (i < lines.length && !lines[i].startsWith('```')) {
codeLines.push(lines[i]);
i++;
}
if (i < lines.length) {
closed = true;
}
⚠️ Potential issue | 🟡 Minor

Mirror the grammar's fence matching here.

The opener/terminator checks use startsWith('```'), so lines like ```` or ```js inside code content are treated as fences here even though the parser only accepts the stricter fence forms from `packages/message-parser/src/grammar.pegjs`.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/message-parser/src/BlockSplitter.ts` around lines 40 - 52,
BlockSplitter currently uses line.startsWith('```') which treats lines like ````
or '``` js' as fences; update the fence checks to mirror the PEG grammar by
using the same regex matcher instead of startsWith (e.g., match /^```[
\t]*[A-Za-z0-9_-]*$/ and capture the optional language token) when detecting
openers and terminators in the BlockSplitter parsing loop (replace the two
startsWith('```') checks around variable line and the while-termination
condition), extract language from the capture group rather than slicing, and
keep the rest of the codePaths (codeLines, closed flag, i++) the same.
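A sketch of the stricter matcher under the reviewer's assumptions (`FENCE_RE` and `matchFence` are hypothetical names):

```typescript
// Exactly three backticks, optional whitespace, optional single language
// token, then end of line — mirroring the stricter fence forms.
const FENCE_RE = /^```[ \t]*([A-Za-z0-9_-]*)$/;

function matchFence(line: string): { language: string } | null {
  const m = line.match(FENCE_RE);
  return m ? { language: m[1] } : null;
}
```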

Comment on lines +105 to +109
Timezone = offset:('+'/'-') tzHour:$(Digit |2|) ":" tzMinute:$(Digit |2|) { return offset + tzHour + ':' + tzMinute; }

ISO8601Date = year:Digit |4| "-" month:Digit |2| "-" day:Digit |2| "T" hours:Digit |2| ":" minutes:Digit|2| ":" seconds:Digit |2| "." milliseconds:Digit |3| tz:Timezone? { return timestampFromIsoTime({year: year.join(''), month: month.join(''), day: day.join(''), hours: hours.join(''), minutes: minutes.join(''), seconds: seconds.join(''), milliseconds: milliseconds.join(''), timezone: tz}) }
ISO8601Date = year:$(Digit |4|) "-" month:$(Digit |2|) "-" day:$(Digit |2|) "T" hours:$(Digit |2|) ":" minutes:$(Digit |2|) ":" seconds:$(Digit |2|) "." milliseconds:$(Digit |3|) tz:Timezone? { return timestampFromIsoTime({ year, month, day, hours, minutes, seconds, milliseconds, timezone: tz }); }

ISO8601DateWithoutMilliseconds = year:Digit |4| "-" month:Digit |2| "-" day:Digit |2| "T" hours:Digit |2| ":" minutes:Digit|2| ":" seconds:Digit |2| tz:Timezone? { return timestampFromIsoTime({year: year.join(''), month: month.join(''), day: day.join(''), hours: hours.join(''), minutes: minutes.join(''), seconds: seconds.join(''), timezone: tz}) }
ISO8601DateWithoutMilliseconds = year:$(Digit |4|) "-" month:$(Digit |2|) "-" day:$(Digit |2|) "T" hours:$(Digit |2|) ":" minutes:$(Digit |2|) ":" seconds:$(Digit |2|) tz:Timezone? { return timestampFromIsoTime({ year, month, day, hours, minutes, seconds, timezone: tz }); }
⚠️ Potential issue | 🟠 Major

Z UTC timestamps stopped matching here.

Timezone now only accepts +/-HH:MM, so the ISO-8601 forms documented above on Lines 89-90, like <t:2025-07-22T10:00:00.000Z>, will fall back to plain text instead of producing a timestamp node.

Suggested fix
-Timezone = offset:('+'/'-') tzHour:$(Digit |2|) ":" tzMinute:$(Digit |2|) { return offset + tzHour + ':' + tzMinute; }
+Timezone
+  = "Z" { return 'Z'; }
+  / offset:('+'/'-') tzHour:$(Digit |2|) ":" tzMinute:$(Digit |2|) { return offset + tzHour + ':' + tzMinute; }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
Timezone = offset:('+'/'-') tzHour:$(Digit |2|) ":" tzMinute:$(Digit |2|) { return offset + tzHour + ':' + tzMinute; }
ISO8601Date = year:Digit |4| "-" month:Digit |2| "-" day:Digit |2| "T" hours:Digit |2| ":" minutes:Digit|2| ":" seconds:Digit |2| "." milliseconds:Digit |3| tz:Timezone? { return timestampFromIsoTime({year: year.join(''), month: month.join(''), day: day.join(''), hours: hours.join(''), minutes: minutes.join(''), seconds: seconds.join(''), milliseconds: milliseconds.join(''), timezone: tz}) }
ISO8601Date = year:$(Digit |4|) "-" month:$(Digit |2|) "-" day:$(Digit |2|) "T" hours:$(Digit |2|) ":" minutes:$(Digit |2|) ":" seconds:$(Digit |2|) "." milliseconds:$(Digit |3|) tz:Timezone? { return timestampFromIsoTime({ year, month, day, hours, minutes, seconds, milliseconds, timezone: tz }); }
ISO8601DateWithoutMilliseconds = year:Digit |4| "-" month:Digit |2| "-" day:Digit |2| "T" hours:Digit |2| ":" minutes:Digit|2| ":" seconds:Digit |2| tz:Timezone? { return timestampFromIsoTime({year: year.join(''), month: month.join(''), day: day.join(''), hours: hours.join(''), minutes: minutes.join(''), seconds: seconds.join(''), timezone: tz}) }
ISO8601DateWithoutMilliseconds = year:$(Digit |4|) "-" month:$(Digit |2|) "-" day:$(Digit |2|) "T" hours:$(Digit |2|) ":" minutes:$(Digit |2|) ":" seconds:$(Digit |2|) tz:Timezone? { return timestampFromIsoTime({ year, month, day, hours, minutes, seconds, timezone: tz }); }
Timezone
= "Z" { return 'Z'; }
/ offset:('+'/'-') tzHour:$(Digit |2|) ":" tzMinute:$(Digit |2|) { return offset + tzHour + ':' + tzMinute; }
ISO8601Date = year:$(Digit |4|) "-" month:$(Digit |2|) "-" day:$(Digit |2|) "T" hours:$(Digit |2|) ":" minutes:$(Digit |2|) ":" seconds:$(Digit |2|) "." milliseconds:$(Digit |3|) tz:Timezone? { return timestampFromIsoTime({ year, month, day, hours, minutes, seconds, milliseconds, timezone: tz }); }
ISO8601DateWithoutMilliseconds = year:$(Digit |4|) "-" month:$(Digit |2|) "-" day:$(Digit |2|) "T" hours:$(Digit |2|) ":" minutes:$(Digit |2|) ":" seconds:$(Digit |2|) tz:Timezone? { return timestampFromIsoTime({ year, month, day, hours, minutes, seconds, timezone: tz }); }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/message-parser/src/grammar.pegjs` around lines 105 - 109, The
Timezone rule only matches +/-HH:MM, so ISO8601Date and
ISO8601DateWithoutMilliseconds no longer recognize the trailing "Z"; update the
Timezone production to also accept an uppercase or lowercase "Z" (e.g., allow
"Z" or "z" as a valid tz) and ensure the tz value passed into
timestampFromIsoTime is normalized (e.g., "Z" for UTC or the original "+/-HH:MM"
string) so ISO8601Date and ISO8601DateWithoutMilliseconds will return timestamp
nodes for timestamps ending with Z.
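The accepted shapes can be sketched outside the grammar as follows (a hypothetical TypeScript mirror of the fixed rule, not the PEG production itself):

```typescript
// Accept "Z"/"z" (UTC designator) in addition to +/-HH:MM offsets,
// normalizing lowercase "z" to "Z".
const TZ_RE = /^(?:[Zz]|[+-]\d{2}:\d{2})$/;

function normalizeTimezone(tz: string): string | null {
  if (!TZ_RE.test(tz)) return null;
  return tz === 'z' ? 'Z' : tz;
}
```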

Comment on lines +131 to +133
// Charclass avoids per-char lookahead; never consume start of "```"
CodeChunkChar = [^\r\n`] / "`" [^`\r\n] / "`" "`" [^`\r\n]
CodeChunk = text:$(CodeChunkChar)+ { return plain(text); }
⚠️ Potential issue | 🟠 Major

Allow code lines to end with one or two backticks.

The new CodeChunkChar only accepts backticks when a non-backtick follows them, so a fenced line ending in ` or `` makes the whole code block unparseable.

Suggested fix
-CodeChunkChar = [^\r\n`] / "`" [^`\r\n] / "`" "`" [^`\r\n]
+CodeChunkChar = [^\r\n`] / "`" !("``") / "`" "`" !"`"
📝 Committable suggestion


Suggested change
// Charclass avoids per-char lookahead; never consume start of "```"
CodeChunkChar = [^\r\n`] / "`" [^`\r\n] / "`" "`" [^`\r\n]
CodeChunk = text:$(CodeChunkChar)+ { return plain(text); }
// Charclass avoids per-char lookahead; never consume start of "```"
CodeChunkChar = [^\r\n`] / "`" !("``") / "`" "`" !"`"
CodeChunk = text:$(CodeChunkChar)+ { return plain(text); }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/message-parser/src/grammar.pegjs` around lines 131 - 133, The
CodeChunkChar rule currently rejects backticks unless followed by a
non-backtick, which prevents lines that end with one or two backticks from being
parsed; update CodeChunkChar so it still prevents consuming the start of a
triple-backtick fence but allows a single "`" or double "``" to be consumed when
they are followed by a line break or end-of-input. Concretely, adjust
CodeChunkChar (the rule used by CodeChunk) to include alternatives that match
"`" and "``" when the next character is a newline or EOF (while keeping the
existing alternative that disallows backticks when followed by another backtick
to avoid matching "```").
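The intended invariant can be sketched as follows (`isCodeContentLine` is a hypothetical helper, not the PEG rule: a code line may contain or end with one or two backticks, but a run of three marks a fence):

```typescript
// A line is plain code content as long as it never contains a
// triple-backtick run, which would start/terminate a fence.
const isCodeContentLine = (line: string): boolean => !line.includes('```');
```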

Move compact AST format out of this PR into a standalone proposal doc,
as it needs further discussion before implementation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/proposals/compact-ast-format.md`:
- Around line 47-77: The compact type mapping table is missing CODE_LINE and
LIST_ITEM and incorrectly lists BLOCKQUOTE (which doesn’t exist) instead of the
exported QUOTE type; update the table to add a `CODE_LINE` entry (key `cl`) and
a `LIST_ITEM` entry (key `li`, note include `number` when present), remove the
`BLOCKQUOTE` row, ensure `QUOTE` remains (key `q`), and update the adjacent
statement that claims the table "covers all current AST node types" so it
accurately reflects the exported Types list.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 536bd759-2928-4fc0-8958-cccac0106313

📥 Commits

Reviewing files that changed from the base of the PR and between 137c237 and f147e26.

📒 Files selected for processing (1)
  • docs/proposals/compact-ast-format.md
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: 📦 Build Packages
  • GitHub Check: cubic · AI code reviewer
  • GitHub Check: CodeQL-Build
  • GitHub Check: CodeQL-Build
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: smirk-dev
Repo: RocketChat/Rocket.Chat PR: 39625
File: apps/meteor/app/api/server/v1/push.ts:85-97
Timestamp: 2026-03-14T14:58:58.834Z
Learning: In RocketChat/Rocket.Chat, the `push.token` POST/DELETE endpoints in `apps/meteor/app/api/server/v1/push.ts` were already migrated to the chained router API pattern on `develop` prior to PR `#39625`. `cleanTokenResult` (which strips `authToken` and returns `PushTokenResult`) and `isPushTokenPOSTProps`/`isPushTokenDELETEProps` validators already exist on `develop`. PR `#39625` only migrates `push.get` and `push.info` to the chained pattern. Do not flag `cleanTokenResult` or `PushTokenResult` as newly introduced behavior-breaking changes when reviewing this PR.
Learnt from: gabriellsh
Repo: RocketChat/Rocket.Chat PR: 37419
File: packages/i18n/src/locales/en.i18n.json:918-921
Timestamp: 2025-11-19T18:20:07.720Z
Learning: Repo: RocketChat/Rocket.Chat — i18n/formatting
Learning: This repository uses a custom message formatting parser in UI blocks/messages; do not assume standard Markdown rules. For keys like Call_ended_bold, Call_not_answered_bold, Call_failed_bold, and Call_transferred_bold in packages/i18n/src/locales/en.i18n.json, retain the existing single-asterisk emphasis unless maintainers request otherwise.
Learnt from: ggazzo
Repo: RocketChat/Rocket.Chat PR: 35995
File: apps/meteor/app/api/server/v1/rooms.ts:1107-1112
Timestamp: 2026-02-23T17:53:18.785Z
Learning: In Rocket.Chat PR reviews, maintain strict scope boundaries—when a PR is focused on a specific endpoint (e.g., rooms.favorite), avoid reviewing or suggesting changes to other endpoints that were incidentally refactored (e.g., rooms.invite) unless explicitly requested by maintainers.
🪛 LanguageTool
docs/proposals/compact-ast-format.md

[uncategorized] ~9-~9: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ...AST is verbose — every node carries its full text content as self-contained strings. For ...

(EN_COMPOUND_ADJECTIVE_INTERNAL)

🔇 Additional comments (1)
docs/proposals/compact-ast-format.md (1)

1-46: Proposal framing and trade-off articulation look solid.

Problem statement, operations (compactify/expand/validateRoundtrip), and open questions are clear and actionable.

Also applies to: 78-103

Comment on lines +47 to +77
### Compact Type Mapping

| Verbose Type | Compact Key | Notes |
|---|---|---|
| `PLAIN_TEXT` | `[start, end]` | Span tuple, no wrapper object |
| `BOLD` | `b` | |
| `ITALIC` | `i` | |
| `STRIKE` | `s` | |
| `SPOILER` | `\|\|` | |
| `INLINE_CODE` | `` ` `` | |
| `MENTION_USER` | `@` | |
| `MENTION_CHANNEL` | `#` | |
| `INLINE_KATEX` | `$` | |
| `LINK` | `a` | |
| `IMAGE` | `img` | |
| `EMOJI` | `:` | |
| `TIMESTAMP` | `ts` | |
| `COLOR` | `c` | Stores RGBA as `[r, g, b, a]` |
| `PARAGRAPH` | `p` | |
| `HEADING` | `h` | Includes level `l: 1..4` |
| `CODE` | ```` ``` ```` | |
| `BLOCKQUOTE` | `>` | |
| `QUOTE` | `q` | |
| `SPOILER_BLOCK` | `\|\|\|` | |
| `ORDERED_LIST` | `ol` | |
| `UNORDERED_LIST` | `ul` | |
| `TASKS` | `tl` | |
| `KATEX` | `$$` | |
| `LINE_BREAK` | `br` | |
| `BIG_EMOJI` | `E` | |


⚠️ Potential issue | 🟠 Major

Compact mapping is incomplete/inconsistent with current AST types.

The table omits CODE_LINE and LIST_ITEM, and it introduces BLOCKQUOTE which is not present in the current exported Types (which has QUOTE). This also makes the “covers all current AST node types” statement on Line 104 inaccurate.

🛠️ Proposed doc fix
 | `CODE` | ```` ``` ```` | |
+| `CODE_LINE` | `cl` | |
-| `BLOCKQUOTE` | `>` | |
 | `QUOTE` | `q` | |
 | `SPOILER_BLOCK` | `\|\|\|` | |
 | `ORDERED_LIST` | `ol` | |
 | `UNORDERED_LIST` | `ul` | |
+| `LIST_ITEM` | `li` | Include `number` when present |
 | `TASKS` | `tl` | |
-A working proof-of-concept implementation exists with full bidirectional conversion and roundtrip validation tests. It covers all current AST node types including BigEmoji, lists, tasks, code blocks, KaTeX, and color nodes.
+A working proof-of-concept implementation exists with full bidirectional conversion and roundtrip validation tests. It is intended to cover all current AST node types (including BigEmoji, lists, tasks, code blocks, KaTeX, and color nodes), pending finalization of the compact type map.

Also applies to: 104-104

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/proposals/compact-ast-format.md` around lines 47 - 77, The compact type
mapping table is missing CODE_LINE and LIST_ITEM and incorrectly lists
BLOCKQUOTE (which doesn’t exist) instead of the exported QUOTE type; update the
table to add a `CODE_LINE` entry (key `cl`) and a `LIST_ITEM` entry (key `li`,
note include `number` when present), remove the `BLOCKQUOTE` row, ensure `QUOTE`
remains (key `q`), and update the adjacent statement that claims the table
"covers all current AST node types" so it accurately reflects the exported Types
list.

@cubic-dev-ai cubic-dev-ai Bot left a comment

1 issue found across 3 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="docs/proposals/compact-ast-format.md">

<violation number="1" location="docs/proposals/compact-ast-format.md:104">
P3: The compact type mapping table is missing `CODE_LINE` and `LIST_ITEM` entries, which are both used in the proof-of-concept implementation (see the removed `compact.ts` where both types are imported and handled). This makes the statement on line 104 ("covers all current AST node types") inaccurate. Consider adding `CODE_LINE` → `cl` and `LIST_ITEM` → `li` to the table, and softening the completeness claim.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.


## Reference

A working proof-of-concept implementation exists with full bidirectional conversion and roundtrip validation tests. It covers all current AST node types including BigEmoji, lists, tasks, code blocks, KaTeX, and color nodes.
@cubic-dev-ai cubic-dev-ai Bot Apr 7, 2026

P3: The compact type mapping table is missing `CODE_LINE` and `LIST_ITEM` entries, which are both used in the proof-of-concept implementation (see the removed `compact.ts` where both types are imported and handled). This makes the statement on line 104 ("covers all current AST node types") inaccurate. Consider adding `CODE_LINE` → `cl` and `LIST_ITEM` → `li` to the table, and softening the completeness claim.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At docs/proposals/compact-ast-format.md, line 104:

<comment>The compact type mapping table is missing `CODE_LINE` and `LIST_ITEM` entries, which are both used in the proof-of-concept implementation (see the removed `compact.ts` where both types are imported and handled). This makes the statement on line 104 ("covers all current AST node types") inaccurate. Consider adding `CODE_LINE` → `cl` and `LIST_ITEM` → `li` to the table, and softening the completeness claim.</comment>

<file context>
@@ -0,0 +1,104 @@
+
+## Reference
+
+A working proof-of-concept implementation exists with full bidirectional conversion and roundtrip validation tests. It covers all current AST node types including BigEmoji, lists, tasks, code blocks, KaTeX, and color nodes.
</file context>

@ggazzo ggazzo added the stat: QA assured Means it has been tested and approved by a company insider label Apr 17, 2026
@dionisio-bot dionisio-bot Bot added the stat: ready to merge PR tested and approved waiting for merge label Apr 17, 2026
@dionisio-bot dionisio-bot Bot enabled auto-merge April 17, 2026 17:33
@ggazzo
Copy link
Copy Markdown
Member Author

ggazzo commented Apr 17, 2026

/jira ARCH-2083


Labels

stat: QA assured Means it has been tested and approved by a company insider stat: ready to merge PR tested and approved waiting for merge type: chore type: feature Pull requests that introduces new feature

10 participants