Commit c158e4a

programad and claude committed
refactor: 🔧 simplify verbatim handling to match Slack's empirical behavior
Empirically tested in Slack's section/mrkdwn renderer (side-by-side Block Kit Builder, both verbatim modes).

Findings:
- Verbatim:true does NOT suppress markdown formatting (`*bold*`, `_italic_`, `~strike~`, `` `code` ``). Slack renders these the same in both modes.
- Verbatim:true does NOT suppress code spans or angle-bracket URLs.
- The ONE thing verbatim:true does suppress in section/mrkdwn (that affects this library) is bare-form `@here` / `@channel` / `@everyone` (without the `<!…>` brackets). Slack interpolates them as chips in verbatim:false and renders them as plain text in verbatim:true.

Implementation changes:
- Drop the separate "split by directive boundaries, render rest as literal text" verbatim path. Both modes now flow through the same pipeline.
- Build two yozora parser instances. The verbatim instance constructs SlackBroadcastTokenizer with `matchTypedBroadcast: false` so bare `@here` etc. stay as plain text. Bracket-form `<!here>` etc. still resolve in both modes.
- Delete `directives.tsx` (no longer needed). Inline DIRECTIVE_PATTERN helpers into `preparse.ts`.

Tests:
- Flip the two tests that asserted the wrong verbatim behavior ("does NOT bold *text* in verbatim" → "renders *bold* in verbatim (matches Slack)", same for `<URL>` autolinking).
- Rename code-span suppression block to call out the known divergence (Slack resolves directives inside code spans; this library doesn't, which is tracked as a follow-up since it needs custom tokenization).
- Add per-broadcast-target tests for the new typed-broadcast suppression.
- Add a DOM-equality test confirming verbatim and non-verbatim produce identical output for non-typed-broadcast content.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent e2ba081 commit c158e4a

8 files changed

Lines changed: 229 additions & 212 deletions

‎.changeset/fix-mrkdwn-directives.md‎

Lines changed: 1 addition & 1 deletion
@@ -2,4 +2,4 @@
 "slack-blocks-to-jsx": patch
 ---

-Resolve Slack directive atoms (`<@U…>`, `<#C…>`, `<!subteam^…>`, `<!channel|here|everyone>`, `<!date^…>`) in `section`/`mrkdwn` text and other mrkdwn-typed text. Directives now fire the same hooks as the rich_text path in both `verbatim: true` and `verbatim: false` modes. Code-span content stays literal (directives inside `` `…` `` or ` ```…``` ` are not resolved). `&amp;` is now decoded alongside the existing `&gt;` / `&lt;` decoding so link `href`s and visible text don't leak literal `&amp;`.
+Resolve Slack directive atoms (`<@U…>`, `<#C…>`, `<!subteam^…>`, `<!channel|here|everyone>`, `<!date^…>`) in `section`/`mrkdwn` text and other mrkdwn-typed text. Directives now fire the same hooks as the `rich_text` path. `verbatim: true` matches Slack's empirical behavior — it suppresses bare-form `@here` / `@channel` / `@everyone` interpolation but is otherwise a no-op (markdown sugar, code spans, angle-bracket URLs, and structured `<!…>` directives all render the same in both modes). `&amp;` is now decoded alongside `&gt;` / `&lt;` so escaped ampersands don't leak into hrefs or visible text.
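
The `&amp;` decoding mentioned in this changeset can be illustrated with a small sketch. The helper name `decodeSlackEntities` is hypothetical and this is not the library's actual code; it assumes only the three entities named above (`&amp;`, `&lt;`, `&gt;`). It decodes in a single pass so that a doubly escaped sequence such as `&amp;lt;` is decoded once, to `&lt;`, rather than twice.

```typescript
// Hypothetical helper for illustration; not the library's implementation.
const ENTITY_MAP: Record<string, string> = {
  "&amp;": "&",
  "&lt;": "<",
  "&gt;": ">",
};

// One regex pass decodes each entity exactly once, so the "&" produced from
// "&amp;" is never re-read as the start of another entity.
function decodeSlackEntities(text: string): string {
  return text.replace(/&(?:amp|lt|gt);/g, (entity) => ENTITY_MAP[entity] ?? entity);
}

console.log(decodeSlackEntities("https://example.com/?a=1&amp;b=2")); // https://example.com/?a=1&b=2
console.log(decodeSlackEntities("a &lt; b &amp;&amp; c &gt; d")); // a < b && c > d
console.log(decodeSlackEntities("&amp;lt;")); // &lt;
```

Chaining three separate `replace` calls also works, but only if `&amp;` is decoded last; the single-pass form avoids depending on that ordering.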

‎src/utils/markdown_parser/__tests__/directives.test.tsx‎

Lines changed: 95 additions & 8 deletions
@@ -162,7 +162,13 @@ describe("date directives", () => {
   });
 });

-describe("code-span suppression", () => {
+// KNOWN DIVERGENCE FROM SLACK: Slack's own renderer resolves directives that appear inside
+// inline code (`` `<@U…>` ``) — empirically confirmed in Slack's section/mrkdwn rendering.
+// This library currently keeps them literal (CommonMark-style code-span opacity), which is
+// what most React/markdown consumers expect. Resolving directives inside code spans is
+// tracked as a follow-up; it requires either a custom inline-code tokenizer or an AST-walk
+// after yozora parse.
+describe("code-span suppression (known divergence from Slack)", () => {
   it("does not fire hooks.user for directives inside inline code (non-verbatim)", () => {
     const user = vi.fn();
     const { container } = renderMrkdwn("Try \\`<@U123>\\`".replace(/\\`/g, "`"), false, {
@@ -208,22 +214,103 @@ describe("regression — non-directive markdown", () => {
     expect(container.querySelector("code")?.textContent).toBe("code");
   });

-  it("does NOT bold *text* in verbatim mode (renders literal)", () => {
+  // Verbatim is effectively a no-op in Slack's renderer (empirically verified — see PR
+  // description). Markdown formatting, code spans, directives, and angle-bracket URLs all
+  // render the same in both modes. Slack only suppresses bare-URL autolinking in verbatim
+  // mode, which this library doesn't do in either mode anyway.
+
+  it("renders *bold* in verbatim mode (matches Slack)", () => {
     const { container } = renderMrkdwn(`Hi *bold* world`, true);
-    expect(container.querySelector("strong")).toBeNull();
-    expect(container.textContent).toContain("*bold*");
+    expect(container.querySelector("strong")?.textContent).toBe("bold");
   });

-  it("auto-links bare URLs in non-verbatim", () => {
+  it("renders <URL> as a link in non-verbatim", () => {
     const { container } = renderMrkdwn(`Visit <https://example.com>`, false);
     const anchor = container.querySelector("a");
     expect(anchor?.getAttribute("href")).toBe("https://example.com");
   });

-  it("does NOT auto-link bare URLs in verbatim", () => {
+  it("renders <URL> as a link in verbatim too (matches Slack)", () => {
     const { container } = renderMrkdwn(`Visit <https://example.com>`, true);
-    expect(container.querySelector("a")).toBeNull();
-    expect(container.textContent).toContain("<https://example.com>");
+    const anchor = container.querySelector("a");
+    expect(anchor?.getAttribute("href")).toBe("https://example.com");
+  });
+
+  it("renders <URL|label> as a link in verbatim too (matches Slack)", () => {
+    const { container } = renderMrkdwn(`Visit <https://example.com|click>`, true);
+    const anchor = container.querySelector("a");
+    expect(anchor?.getAttribute("href")).toBe("https://example.com");
+    expect(anchor?.textContent).toBe("click");
+  });
+
+  it("renders inline code in verbatim too", () => {
+    const { container } = renderMrkdwn("Try `code`", true);
+    expect(container.querySelector("code")?.textContent).toBe("code");
+  });
+});
+
+describe("typed-broadcast suppression in verbatim mode (matches Slack)", () => {
+  // Slack renders bare `@here` / `@channel` / `@everyone` (without `<!…>` brackets) as chips
+  // in verbatim:false and as plain text in verbatim:true. This is the only verbatim difference
+  // this library cares about — empirically verified against Slack's section/mrkdwn renderer.
+
+  it("fires hooks.atHere for typed @here in non-verbatim", () => {
+    const atHere = vi.fn(() => <span data-testid="h">@here</span>);
+    const { getByTestId } = renderMrkdwn(`Hello @here folks`, false, { hooks: { atHere } });
+    expect(atHere).toHaveBeenCalledTimes(1);
+    expect(getByTestId("h")).toBeInTheDocument();
+  });
+
+  it("does NOT fire hooks.atHere for typed @here in verbatim", () => {
+    const atHere = vi.fn();
+    const { container } = renderMrkdwn(`Hello @here folks`, true, { hooks: { atHere } });
+    expect(atHere).not.toHaveBeenCalled();
+    expect(container.textContent).toContain("@here");
+  });
+
+  it("still fires hooks.atHere for bracket-form <!here> in verbatim", () => {
+    const atHere = vi.fn(() => <span data-testid="h">@here</span>);
+    const { getByTestId } = renderMrkdwn(`Hello <!here> folks`, true, { hooks: { atHere } });
+    expect(atHere).toHaveBeenCalledTimes(1);
+    expect(getByTestId("h")).toBeInTheDocument();
+  });
+
+  it("does NOT fire hooks.atChannel for typed @channel in verbatim", () => {
+    const atChannel = vi.fn();
+    renderMrkdwn(`@channel`, true, { hooks: { atChannel } });
+    expect(atChannel).not.toHaveBeenCalled();
+  });
+
+  it("does NOT fire hooks.atEveryone for typed @everyone in verbatim", () => {
+    const atEveryone = vi.fn();
+    renderMrkdwn(`@everyone`, true, { hooks: { atEveryone } });
+    expect(atEveryone).not.toHaveBeenCalled();
+  });
+});
+
+describe("verbatim and non-verbatim render identically (except typed broadcasts)", () => {
+  const payload =
+    "broadcasts: <!here> <!channel> <!everyone>\n" +
+    "date: <!date^1717000000^{date_pretty}|May 29>\n" +
+    "sugar: *bold* _italic_ ~strike~ `code`\n" +
+    "url angle: <https://example.com>\n" +
+    "url angle+label: <https://example.com|click>";
+
+  it("produces identical DOM markup for the same payload", () => {
+    const hooks = {
+      atHere: () => <span data-testid="here">@here</span>,
+      atChannel: () => <span data-testid="channel">@channel</span>,
+      atEveryone: () => <span data-testid="everyone">@everyone</span>,
+      date: () => <span data-testid="date">date</span>,
+    };
+    const a = renderMrkdwn(payload, true, { hooks });
+    const verbatimHtml = a.container.innerHTML;
+    a.unmount();
+
+    const b = renderMrkdwn(payload, false, { hooks });
+    const nonVerbatimHtml = b.container.innerHTML;
+
+    expect(verbatimHtml).toBe(nonVerbatimHtml);
   });
 });

‎src/utils/markdown_parser/directives.tsx‎

Lines changed: 0 additions & 82 deletions
This file was deleted.

‎src/utils/markdown_parser/parser.tsx‎

Lines changed: 29 additions & 45 deletions
@@ -1,7 +1,6 @@
 import YozoraParser from "@yozora/parser";
 import { ReactNode } from "react";
 import { GlobalStore } from "../../store";
-import { renderTextWithDirectives } from "./directives";
 import { Blockquote, Code, Paragraph } from "./elements";
 import { maskProtectedRegions } from "./preparse";
 import {
@@ -14,23 +13,37 @@ import {
 } from "./tokenizers";
 import { MarkdownElement } from "./types";

-const parser = new YozoraParser()
-  .unmountTokenizer("@yozora/tokenizer-list")
-  // Slack directives like `<!subteam^S1|@team>` and `<@U1|name>` contain `@` chars in their
-  // fallback labels. Yozora's autolink tokenizers misread these as email autolinks and steal
-  // them from our directive tokenizers. Disable both — bare URL autolinking is already handled
-  // upstream by the `<X>` / `<X|Y>` regex rewrites that produce `[url](url)` markdown links.
-  .unmountTokenizer("@yozora/tokenizer-autolink")
-  .unmountTokenizer("@yozora/tokenizer-autolink-extension")
-  .useTokenizer(new SlackUserMentionTokenizer())
-  .useTokenizer(new SlackChannelMentionTokenizer())
-  .useTokenizer(new SlackUserGroupMentionTokenizer())
-  .useTokenizer(new SlackBroadcastTokenizer())
-  .useTokenizer(new SlackDateTokenizer())
-  .useTokenizer(new SlackEmojiTokenizer());
+// Slack directives like `<!subteam^S1|@team>` and `<@U1|name>` contain `@` chars in their
+// fallback labels. Yozora's autolink tokenizers misread these as email autolinks and steal
+// them from our directive tokenizers. Disable both — bare URL autolinking is already handled
+// upstream by the `<X>` / `<X|Y>` regex rewrites that produce `[url](url)` markdown links.
+const buildParser = (matchTypedBroadcast: boolean) =>
+  new YozoraParser()
+    .unmountTokenizer("@yozora/tokenizer-list")
+    .unmountTokenizer("@yozora/tokenizer-autolink")
+    .unmountTokenizer("@yozora/tokenizer-autolink-extension")
+    .useTokenizer(new SlackUserMentionTokenizer())
+    .useTokenizer(new SlackChannelMentionTokenizer())
+    .useTokenizer(new SlackUserGroupMentionTokenizer())
+    .useTokenizer(new SlackBroadcastTokenizer({ matchTypedBroadcast }))
+    .useTokenizer(new SlackDateTokenizer())
+    .useTokenizer(new SlackEmojiTokenizer());
+
+// Slack's `verbatim` flag is effectively a no-op in section/mrkdwn rendering EXCEPT for one
+// case: it suppresses interpolation of typed-out `@here` / `@channel` / `@everyone` (without
+// the `<!…>` brackets). Empirically verified — see PR description for the side-by-side. We
+// honor that by using a parser without the bare-form broadcast match in verbatim mode.
+const parserDefault = buildParser(true);
+const parserVerbatim = buildParser(false);

 type Options = {
   markdown: boolean;
+  // In Slack's renderer, `verbatim` only changes two things in section/mrkdwn (empirically
+  // verified): it suppresses bare-URL autolinking, and it suppresses interpolation of bare
+  // `@here` / `@channel` / `@everyone`. Everything else — directives in `<…>` form, markdown
+  // sugar, code spans, angle-bracket URLs — renders identically in both modes. This library
+  // doesn't autolink bare URLs in either mode, so the only thing we gate on `verbatim` is the
+  // bare-broadcast tokenizer match.
   verbatim: boolean;
   users: GlobalStore["users"];
   channels: GlobalStore["channels"];
@@ -49,24 +62,6 @@ function isValidURL(string: string) {
 export const markdown_parser = (markdown: string, options: Options): ReactNode => {
   if (!markdown) return null;

-  // In verbatim mode, Slack semantics say markdown formatting (`*bold*`, `_italic_`, `~strike~`,
-  // bare URLs, code spans) should render as literal text, but Slack-formed directives are atoms
-  // that must still resolve through hooks. Split the text by directive boundaries and render
-  // each segment, preserving newlines as <br/>s.
-  if (options.verbatim) {
-    const segments = renderTextWithDirectives(markdown);
-    return (
-      <div>
-        {segments.map((segment, i) => {
-          if (typeof segment === "string") {
-            return renderVerbatimText(segment, i);
-          }
-          return segment;
-        })}
-      </div>
-    );
-  }
-
   let text_string = markdown;

   // Normalize fenced code so yozora can recognize ``` blocks.
@@ -112,7 +107,7 @@ export const markdown_parser = (markdown: string, options: Options): ReactNode =
   // sees their native shape. Directive tokenizers will tokenize them at parse time.
   text_string = mask.restore(text_string);

-  const parsed_data = parser.parse(text_string);
+  const parsed_data = (options.verbatim ? parserVerbatim : parserDefault).parse(text_string);

   const elements = parsed_data.children as unknown as MarkdownElement[];

@@ -129,14 +124,3 @@ export const markdown_parser = (markdown: string, options: Options): ReactNode =
     </div>
   );
 };
-
-const renderVerbatimText = (text: string, baseKey: number): ReactNode => {
-  if (!text.includes("\n")) return text;
-  const lines = text.split("\n");
-  const out: ReactNode[] = [];
-  lines.forEach((line, idx) => {
-    if (idx > 0) out.push(<br key={`br-${baseKey}-${idx}`} />);
-    if (line) out.push(line);
-  });
-  return <span key={`v-${baseKey}`}>{out}</span>;
-};
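
To make the two-instance approach concrete without depending on yozora, here is a minimal regex-based sketch. Only the `matchTypedBroadcast` option name comes from the diff; `buildBroadcastMatcher`, `findBroadcasts`, and the `Broadcast` type are hypothetical, and the regexes are an assumption about what counts as a bare broadcast (an `@here`, `@channel`, or `@everyone` not preceded by a word character or `|`). The real `SlackBroadcastTokenizer` may behave differently.

```typescript
type Target = "here" | "channel" | "everyone";
type Broadcast = { target: Target; form: "bracket" | "typed"; index: number };

// Hypothetical factory: the flag decides once, at build time, whether the
// typed (bare) form is recognized. The bracket form is always recognized.
function buildBroadcastMatcher(matchTypedBroadcast: boolean) {
  return (text: string): Broadcast[] => {
    const found: Broadcast[] = [];
    let m: RegExpExecArray | null;

    const bracket = /<!(here|channel|everyone)>/g;
    while ((m = bracket.exec(text)) !== null) {
      found.push({ target: m[1] as Target, form: "bracket", index: m.index });
    }

    if (matchTypedBroadcast) {
      // Group 1 is the character before "@" (or empty at start of text).
      const typed = /(^|[^\w|])@(here|channel|everyone)\b/g;
      while ((m = typed.exec(text)) !== null) {
        found.push({ target: m[2] as Target, form: "typed", index: m.index + m[1].length });
      }
    }

    return found.sort((a, b) => a.index - b.index);
  };
}

// Built once at module load, mirroring parserDefault / parserVerbatim.
const matchDefault = buildBroadcastMatcher(true);
const matchVerbatim = buildBroadcastMatcher(false);

function findBroadcasts(text: string, verbatim: boolean): Broadcast[] {
  return (verbatim ? matchVerbatim : matchDefault)(text);
}

console.log(findBroadcasts("Hello @here and <!channel>", false).map((b) => b.form)); // typed, bracket
console.log(findBroadcasts("Hello @here and <!channel>", true).map((b) => b.form)); // bracket
```

Building both matchers up front keeps the per-call path to a single ternary, which is the same trade-off the diff makes by constructing two parser instances instead of reconfiguring one.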

‎src/utils/markdown_parser/preparse.ts‎

Lines changed: 20 additions & 2 deletions
@@ -1,12 +1,30 @@
-import { DIRECTIVE_PATTERN_GLOBAL } from "./directives";
-
 const PLACEHOLDER_OPEN = "";
 const PLACEHOLDER_CLOSE = "";
 const PLACEHOLDER_PATTERN = new RegExp(`${PLACEHOLDER_OPEN}([0-9a-z]+)${PLACEHOLDER_CLOSE}`, "g");

 const FENCED_CODE = /```\n[\s\S]*?\n```/g;
 const INLINE_CODE = /`[^`\n]+`/g;

+// Slack-formed directive atoms. Pre-masked before the URL-rewrite regex pass so it cannot
+// mangle their interiors (defense in depth: `isValidURL` happens to leave directives alone
+// today, but the protection is incidental, not deliberate).
+const DIRECTIVE_USER = /<@[^|>\s]+(?:\|[^>]*)?>/;
+const DIRECTIVE_CHANNEL = /<#[^|>\s]+(?:\|[^>]*)?>/;
+const DIRECTIVE_USERGROUP = /<!subteam\^[^|>\s]+(?:\|[^>]*)?>/;
+const DIRECTIVE_BROADCAST = /<!(?:here|channel|everyone)>/;
+const DIRECTIVE_DATE = /<!date\^[^>]+>/;
+
+const DIRECTIVE_PATTERN_GLOBAL = new RegExp(
+  [
+    DIRECTIVE_USER.source,
+    DIRECTIVE_CHANNEL.source,
+    DIRECTIVE_USERGROUP.source,
+    DIRECTIVE_BROADCAST.source,
+    DIRECTIVE_DATE.source,
+  ].join("|"),
+  "g",
+);
+
 type Mask = {
   masked: string;
   restore: (input: string) => string;
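
The diff shows only the top of `preparse.ts`, so the body of `maskProtectedRegions` is not visible here. As a rough illustration of the pre-masking idea described in the comment, the sketch below swaps each protected match for an opaque placeholder and restores it afterwards. Everything in it is an assumption made for illustration: the function name `maskWithPatterns`, the placeholder delimiters (`\uE000` / `\uE001`), and the choice of which patterns to pass in. Only the `Mask` shape and the two example regexes are taken from the diff.

```typescript
type Mask = {
  masked: string;
  restore: (input: string) => string;
};

// Assumed delimiters: private-use code points unlikely to occur in Slack text.
const OPEN = "\uE000";
const CLOSE = "\uE001";
const PLACEHOLDER = new RegExp(`${OPEN}([0-9a-z]+)${CLOSE}`, "g");

// Hypothetical helper: joins the patterns into one alternation (the same
// `.source` + join("|") technique the diff uses) and replaces in one pass,
// so a stashed region can never contain another placeholder.
function maskWithPatterns(text: string, patterns: RegExp[]): Mask {
  const stash: string[] = [];
  const combined = new RegExp(patterns.map((p) => p.source).join("|"), "g");

  const masked = text.replace(combined, (match) => {
    const id = stash.push(match) - 1;
    return `${OPEN}${id.toString(36)}${CLOSE}`;
  });

  const restore = (input: string): string =>
    input.replace(PLACEHOLDER, (whole, id: string) => stash[parseInt(id, 36)] ?? whole);

  return { masked, restore };
}

const INLINE_CODE_EXAMPLE = /`[^`\n]+`/;
const USER_EXAMPLE = /<@[^|>\s]+(?:\|[^>]*)?>/;

const original = "Ping <@U123|ann> via `<@U9>`";
const mask = maskWithPatterns(original, [INLINE_CODE_EXAMPLE, USER_EXAMPLE]);

// A rewrite pass that targets "<...>" now sees only placeholders.
const rewritten = mask.masked.replace(/<([^>]+)>/g, "[$1]");
console.log(mask.restore(rewritten) === original); // true
```

Listing the code-span pattern alongside the directive patterns means a directive inside backticks is stashed as part of the code span, which matches the library's current "code spans stay literal" behavior.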
