A mental model of how <AIMarkdown> turns a markdown string into rendered React — and why it's structured the way it is. Read this when:
- You're debugging an unexpected render result.
- You're considering writing a custom remark/rehype plugin and want to know where it would go.
- You want to understand what <AIMarkdownDocuments> actually coordinates.
- You're contributing to the library.
<AIMarkdown>
  <AIMarkdownMetadataProvider> ← Context for opaque user metadata
    <AIMarkdownRenderStateProvider> ← Context for streaming/config/scheme/documentId
      <Typography> ← Configurable wrapper (default | Mantine | custom)
        <ExtraStyles?> ← Optional CSS-scope wrapper
          <AIMarkdownContent> ← The actual markdown renderer
            ↳ react-markdown (vendored) with remark/rehype pipeline
            ↳ block-level memoization
            ↳ cross-chunk placeholder resolution
        </ExtraStyles?>
      </Typography>
    </AIMarkdownRenderStateProvider>
  </AIMarkdownMetadataProvider>
</AIMarkdown>
Each layer has a single, documented responsibility:
| Layer | Responsibility |
|---|---|
| <AIMarkdown> | Top-level prop normalization (font size, document id), preprocessing pipeline orchestration, prop-stability tracking |
| <AIMarkdownMetadataProvider> | Isolate opaque user data from render state |
| <AIMarkdownRenderStateProvider> | Hold immutable render state, deep-merge config with defaults |
| <Typography> | Apply font-family, base font-size, theme class names; inject CSS custom properties via style |
| <ExtraStyles> | Optional CSS-scope wrapper (used by Mantine integration for em-based token overrides) |
| <AIMarkdownContent> | Vendor-forked react-markdown pipeline + block memoization + cross-chunk resolution |
AIMarkdownMetadataProvider and AIMarkdownRenderStateProvider are deliberately separate.
Render state (streaming flag, config, font-size, color-scheme, documentId) is what the markdown body subscribes to via useAIMarkdownRenderState. It changes infrequently relative to streaming.
Metadata (user callbacks, ids, app-level data) typically rebuilds every render — a parent rebuilding metadata={{ onCopy, messageId }} is normal React usage.
If both lived in one context, every metadata change would re-render the entire markdown body. With them split, components reading metadata re-render when needed; the body's useAIMarkdownRenderState doesn't see a Provider value change, and block-level memoization stays effective.
See Metadata Context for the consumer-side implications.
<AIMarkdownContent> is where the heavy lifting happens. Per render:
content (string)
│
▼
Stage A: contentPreprocessors
├── Built-in LaTeX normalizer (preprocessLaTeX)
└── User-supplied preprocessors (in order)
│
▼
Stage B: parse (mdast + hast)
├── unified.parse → mdast
├── remark plugins (GFM, math, breaks, emoji, pangu, smartypants, …)
└── remark-rehype + custom mdast handlers → hast
│
▼
Stage C: cross-chunk contributions (if inside <AIMarkdownDocuments>)
├── Extract this chunk's refs/defs
└── Push to Registry (registerChunk)
│
▼
Stage D: block planning
├── Walk hast top-level children
├── Build per-block BlockInfo (raw, position, taint, ctx digest)
└── Compute globalCtx digest from ref/def contributors
│
▼
Stage E: per-block render with cache lookup
├── For each block:
│ ├── Cache key = (raw, occurrence, ctx, startOffset, startLine)
│ ├── Cache hit → return cached ReactNode
│ └── Cache miss → toJsxRuntime(block hast) + memoize result
└── Concatenate ReactNodes
│
▼
Stage F: per-attribute URL transform
└── urlTransform (Gate 2)
— rewrites every URL-bearing attribute at render time;
schema-level allowlist (Gate 1, rehype-sanitize) ran
earlier in Stage B's rehype chain
│
▼
React renders
The LaTeX preprocessor is built-in and always first. It does string-level transforms — normalize \(…\) to $…$, escape | inside math, recognize currency $ vs math $, truncate unclosed $$ blocks for streaming safety. See Content Preprocessors for full details.
User preprocessors run next, in order. They receive the LaTeX-normalized string.
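The ordering described above can be sketched as a simple fold over string-to-string functions. This is a minimal illustration, not the library's code: the `ContentPreprocessor` type, `normalizeLatexSketch`, `runPreprocessors`, and `stripCursor` are hypothetical names, the real `AIMDContentPreprocessor` signature may differ, and the stand-in normalizer handles only the `\(…\)` to `$…$` rewrite.

```typescript
// Hypothetical type: the real AIMDContentPreprocessor signature may differ.
type ContentPreprocessor = (content: string) => string;

// Simplified stand-in for the built-in LaTeX normalizer.
// It only rewrites \( … \) to $ … $; the real one does considerably more.
const normalizeLatexSketch: ContentPreprocessor = (content) =>
  content.replace(/\\\(([\s\S]*?)\\\)/g, (_match, body: string) => `$${body}$`);

// Built-in step first, then user-supplied steps in the order given.
function runPreprocessors(
  content: string,
  userPreprocessors: ContentPreprocessor[] = [],
): string {
  return [normalizeLatexSketch, ...userPreprocessors].reduce(
    (text, step) => step(text),
    content,
  );
}

// Hypothetical user preprocessor: strip a trailing streaming-cursor glyph.
const stripCursor: ContentPreprocessor = (content) => content.replace(/▍$/, "");

const result = runPreprocessors("Euler: \\(e^{i\\pi}\\)▍", [stripCursor]);
console.log(result);
```

Because the user steps run after the built-in one, a user preprocessor that inspects math should expect `$…$` delimiters rather than `\(…\)`.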
unified.parse produces an mdast (Markdown AST). remark plugins (GFM, math, breaks, emoji, pangu, smartypants, mark-highlight, etc., all gated on config) run on mdast. Then remark-rehype converts to hast (HTML AST) using custom mdast handlers that:
- Inject phantom footnote definitions (when preserveOrphanReferences is on and no matching [^x] exists, so mdast-util-to-hast doesn't silently drop the def).
- Emit cross-chunk placeholder elements (<cross-chunk-link>, <cross-chunk-image>, <footnote-sup>) when wrapped in <AIMarkdownDocuments>.
The rehype plugin chain then runs on hast — including rehype-raw (for raw HTML survival), rehype-katex (for math rendering), and rehype-sanitize (Gate 1 of URL sanitization: per-protocol allowlist applied to href/src/cite). Per-attribute URL rewriting (urlTransform, Gate 2) is not part of this chain — it runs later at render time (see Stage F).
When inside <AIMarkdownDocuments>, each chunk reports its refs and defs to the shared Registry. The registry tracks chunk symbols by useId() reactId, with refcount + microtask-deferred cleanup for Strict Mode safety. See Cross-chunk Coordination for the full lifecycle.
buildBlocks cuts the hast into per-block units — each top-level hast child that has an mdast counterpart, plus an optional synthetic footnote section.
For each block, the planner computes:
- raw: source text of the block.
- startOffset, startLine, endOffset: position metadata, used in the cache key so identical text at different positions doesn't collide.
- tainted: whether the block contains footnote/link/image references or definitions.
- ctx digest: the document-wide hash of all ref/def contributions. Used to invalidate tainted blocks when refs/defs change anywhere in the document.
For each block, the renderer either:
- Hits the cache: returns the previously-computed ReactNode (zero work).
- Misses the cache: calls toJsxRuntime on the block's hast, stores the result, returns it.
The cache lives in a useRef-backed Map. Eviction: when a block doesn't appear in a new plan (block was removed), it's dropped on the next pass. The cache is per-<AIMarkdown> instance; cross-chunk coordination uses a separate Registry (which itself caches selectors by version).
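The lookup and eviction behaviour can be sketched without React. The code below is an illustration under assumptions, not the contents of blockMemo.ts: `BlockSketch`, `cacheKey`, and `BlockCacheSketch` are hypothetical names, the key is serialized with `JSON.stringify` for simplicity, and only tainted blocks fold the ctx digest into their key (as the text suggests).

```typescript
// Hypothetical, trimmed-down block description; the real BlockInfo has more fields.
interface BlockSketch {
  raw: string;
  occurrence: number; // n-th appearance of this raw text within the plan
  startOffset: number;
  startLine: number;
  tainted: boolean;
  ctx: string; // document-wide ref/def digest
}

// Key = (raw, occurrence, ctx, startOffset, startLine).
// Assumption: untainted blocks ignore ctx so ref/def changes do not invalidate them.
function cacheKey(block: BlockSketch): string {
  const ctx = block.tainted ? block.ctx : "";
  return JSON.stringify([block.raw, block.occurrence, ctx, block.startOffset, block.startLine]);
}

class BlockCacheSketch<T> {
  private cache = new Map<string, T>();
  misses = 0;

  // Render a plan, reusing results for blocks whose key is unchanged.
  renderPlan(plan: BlockSketch[], render: (block: BlockSketch) => T): T[] {
    const next = new Map<string, T>();
    const output = plan.map((block) => {
      const key = cacheKey(block);
      let node: T;
      if (this.cache.has(key)) {
        node = this.cache.get(key) as T; // cache hit: no conversion work
      } else {
        node = render(block); // cache miss: convert and remember
        this.misses += 1;
      }
      next.set(key, node);
      return node;
    });
    this.cache = next; // blocks absent from this plan are dropped here
    return output;
  }

  get size(): number {
    return this.cache.size;
  }
}
```

In a streaming scenario, each new plan usually repeats every earlier block unchanged and appends or extends the last one, so only the tail block misses.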
Only Gate 2 (urlTransform) runs in this stage — a per-attribute rewriter applied during the hast traversal in renderHastSubtree. Gate 1 (rehype-sanitize schema, per-protocol allowlist) has already run earlier in Stage B as part of the rehype plugin chain. See URL Sanitization & Custom Schemes for the full model.
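A per-attribute rewriter of this kind can be sketched as a pure function. The version below is loosely modeled on the protocol check in upstream react-markdown's defaultUrlTransform; the names `urlTransformSketch` and `transformUrlAttributes`, the protocol list, and the set of URL-bearing attributes are assumptions made for illustration, and the library's real urlTransform signature may differ.

```typescript
// Assumed allowlist for this sketch; the library's real defaults may differ.
const SAFE_PROTOCOLS = ["http", "https", "mailto"];

// Return the URL unchanged when it is relative or uses an allowed protocol,
// otherwise return an empty string.
function urlTransformSketch(
  url: string,
  _attribute: string,
  allowed: string[] = SAFE_PROTOCOLS,
): string {
  const colon = url.indexOf(":");
  const firstDelimiter = url.search(/[/?#]/);
  // No colon, or the colon sits after a path/query/fragment delimiter:
  // this is a relative URL, so there is no protocol to check.
  if (colon === -1 || (firstDelimiter !== -1 && colon > firstDelimiter)) {
    return url;
  }
  const protocol = url.slice(0, colon).toLowerCase();
  return allowed.indexOf(protocol) !== -1 ? url : "";
}

// Assumed set of URL-bearing attributes (the ones the text names for Gate 1).
const URL_ATTRIBUTES = ["href", "src", "cite"];

// Apply the transform to each URL-bearing attribute of one element's properties.
function transformUrlAttributes(
  properties: Record<string, unknown>,
): Record<string, unknown> {
  const out: Record<string, unknown> = { ...properties };
  for (const name of URL_ATTRIBUTES) {
    const value = out[name];
    if (typeof value === "string") {
      out[name] = urlTransformSketch(value, name);
    }
  }
  return out;
}
```

Passing a longer `allowed` list is one way a custom scheme could be admitted at this gate; the schema-level allowlist in Gate 1 would need the same scheme for the attribute to reach this point at all.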
Markdown footnotes and hash links emit <li id="…"> and <a href="#…"> with auto-generated ids. Without namespacing, two <AIMarkdown> instances on the same page would collide:
<!-- Message 1 -->
<a href="#user-content-fn-1">[1]</a>
<li id="user-content-fn-1">…definition A…</li>
<!-- Message 2 -->
<a href="#user-content-fn-1">[1]</a>
<!-- ← scrolls to message 1's footnote! -->
<li id="user-content-fn-1">…definition B…</li>
The fix: prefix every clobberable attribute with a per-document namespace. <AIMarkdown> accepts documentId (or generates one via useId()) and derives clobberPrefix from it:
clobberPrefix = `${encodeURIComponent(shortenDocumentId(documentId))}-user-content-`;
Long ids (>16 chars) are hashed via MurmurHash3 → Base62 before encoding, to keep the rendered HTML compact when consumers pass UUIDs/nanoids. The shortening only affects the rendered prefix: state.documentId retains the raw value, so registry keying and consumer code reading documentId see the original.
Chunks of the same logical document share documentId, so their prefixes align. This is the bridge between <AIMarkdownDocuments> and cross-chunk anchor navigation.
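The prefix derivation can be sketched end to end. The code below is an assumption-laden illustration, not the contents of shortenDocumentId.ts: it uses the 32-bit x86 variant of MurmurHash3 with seed 0 over UTF-8 bytes and a digits-uppercase-lowercase Base62 alphabet, none of which the text confirms, so its digests should not be expected to match the library's. All function names ending in `Sketch`, plus `utf8Bytes` and `toBase62`, are hypothetical.

```typescript
// UTF-8 bytes of a string, derived from encodeURIComponent to avoid extra APIs.
function utf8Bytes(text: string): number[] {
  const encoded = encodeURIComponent(text);
  const bytes: number[] = [];
  for (let i = 0; i < encoded.length; i++) {
    if (encoded.charAt(i) === "%") {
      bytes.push(parseInt(encoded.slice(i + 1, i + 3), 16));
      i += 2;
    } else {
      bytes.push(encoded.charCodeAt(i));
    }
  }
  return bytes;
}

// MurmurHash3 x86 32-bit. Variant and seed are assumptions for this sketch.
function murmur3Sketch(bytes: number[], seed = 0): number {
  const c1 = 0xcc9e2d51;
  const c2 = 0x1b873593;
  let h = seed | 0;
  const blocks = bytes.length >> 2;
  for (let i = 0; i < blocks; i++) {
    const o = i * 4;
    let k = bytes[o] | (bytes[o + 1] << 8) | (bytes[o + 2] << 16) | (bytes[o + 3] << 24);
    k = Math.imul(k, c1);
    k = (k << 15) | (k >>> 17);
    k = Math.imul(k, c2);
    h ^= k;
    h = (h << 13) | (h >>> 19);
    h = (Math.imul(h, 5) + 0xe6546b64) | 0;
  }
  const tail = blocks * 4;
  const rem = bytes.length & 3;
  let k = 0;
  if (rem >= 3) k ^= bytes[tail + 2] << 16;
  if (rem >= 2) k ^= bytes[tail + 1] << 8;
  if (rem >= 1) {
    k ^= bytes[tail];
    k = Math.imul(k, c1);
    k = (k << 15) | (k >>> 17);
    k = Math.imul(k, c2);
    h ^= k;
  }
  h ^= bytes.length;
  h ^= h >>> 16;
  h = Math.imul(h, 0x85ebca6b);
  h ^= h >>> 13;
  h = Math.imul(h, 0xc2b2ae35);
  h ^= h >>> 16;
  return h >>> 0; // unsigned 32-bit result
}

const BASE62 = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

function toBase62(value: number): string {
  if (value === 0) return "0";
  let out = "";
  for (let n = value; n > 0; n = Math.floor(n / 62)) {
    out = BASE62.charAt(n % 62) + out;
  }
  return out;
}

// Ids longer than 16 characters are hashed; shorter ids pass through unchanged.
function shortenDocumentIdSketch(documentId: string): string {
  return documentId.length > 16 ? toBase62(murmur3Sketch(utf8Bytes(documentId))) : documentId;
}

function clobberPrefixSketch(documentId: string): string {
  return `${encodeURIComponent(shortenDocumentIdSketch(documentId))}-user-content-`;
}
```

With a 32-bit hash the shortened form is at most six Base62 characters, so a 36-character UUID shrinks considerably in every rendered id and href. The trade-off is a small collision probability between distinct long ids on the same page.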
Located at packages/core/src/components/documentRegistry.ts. Key invariants:
- Per-documentId partitioning. The wrapper holds a Map<documentId, Registry>. Each unique id gets its own registry.
- Symbol-keyed contributions. Each chunk allocates a Symbol(reactId) on mount and contributes to the registry under that symbol. The symbol is the chunk's identity for the registry's lifetime.
- Refcount + microtask cleanup. releaseSymbol decrements a refcount and schedules deletion via queueMicrotask. This survives React 19 Strict Mode (mount → unmount → mount within a frame) without losing the chunk's identity.
- Monotonic version counter. Every mutation bumps version; subscribers wake via microtask-coalesced fanout.
- labelSet derivation. labelSet.{footnoteLabels, linkLabels} is the union of own-def labels across all live chunks. Used by Stage B's phantom-def injection to know which orphan refs to protect.
- Last-chunk eviction. When the final chunk releases its symbol and the registry becomes empty, an onEmpty callback fires, removing the registry from the wrapper's Map. The next mount with the same id allocates a fresh registry.
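The refcount and deferred-cleanup invariants can be sketched in isolation. This is a minimal model under assumptions, not the code in documentRegistry.ts: `ChunkRegistrySketch` is a hypothetical class, its method signatures (taking a reactId string) are guesses, and the scheduler is injectable only so the deferral can be driven by hand.

```typescript
type Schedule = (task: () => void) => void;

class ChunkRegistrySketch {
  private refcounts = new Map<symbol, number>();
  private symbols = new Map<string, symbol>();
  private onEmpty: () => void;
  private schedule: Schedule;

  constructor(
    onEmpty: () => void = () => {},
    schedule: Schedule = (task) => queueMicrotask(task),
  ) {
    this.onEmpty = onEmpty;
    this.schedule = schedule;
  }

  // Reuse the existing symbol for a reactId if it is still registered.
  allocateSymbol(reactId: string): symbol {
    let sym = this.symbols.get(reactId);
    if (sym === undefined) {
      sym = Symbol(reactId);
      this.symbols.set(reactId, sym);
    }
    this.refcounts.set(sym, (this.refcounts.get(sym) ?? 0) + 1);
    return sym;
  }

  // Decrement now, but defer the actual deletion to a later task.
  releaseSymbol(reactId: string): void {
    const sym = this.symbols.get(reactId);
    if (sym === undefined) return;
    this.refcounts.set(sym, (this.refcounts.get(sym) ?? 1) - 1);
    this.schedule(() => {
      // Identity check: bail out if the slot was replaced or re-acquired.
      if (this.symbols.get(reactId) !== sym) return;
      if ((this.refcounts.get(sym) ?? 0) > 0) return;
      this.refcounts.delete(sym);
      this.symbols.delete(reactId);
      if (this.symbols.size === 0) this.onEmpty(); // last-chunk eviction hook
    });
  }

  has(reactId: string): boolean {
    return this.symbols.has(reactId);
  }
}
```

If deletion happened synchronously inside `releaseSymbol`, a Strict Mode unmount followed immediately by a remount would allocate a new symbol and discard everything contributed under the old one. Deferring the deletion lets the remount bump the refcount back up before the cleanup task inspects it.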
The Registry interface exposes only read methods + selectors. Mutators (registerChunk, allocateSymbol, releaseSymbol, contributeLabels, contributeChunkData) live on the internal RegistryInternal interface, which is not re-exported from the package barrel. Consumer code can't directly drive the registry — only the renderer can.
Located at packages/core/src/components/blockMemo.ts. The invariants:
- buildBlocks is hast-driven, not mdast-driven. Hast top-level children that have an mdast counterpart become blocks 1:1. Mdast-only nodes (e.g. metadata) don't produce blocks.
- Two-tier offset lookup. Position metadata (startOffset, startLine) goes into the cache key so identical content at different positions doesn't false-cache.
- Swap-and-discard semantics. The plan is rebuilt every render; the prior cache is consulted by key, then discarded blocks are dropped.
- Synchronous G3 flush at 12-dep boundary. (Internal invariant about plan-context invalidation timing.)
- globalCtx is the union of ref/def contributors. Tainted blocks include this in their cache key.
These invariants are enforced by tests (byteEquivalence.test.tsx is the harness that verifies byte-identical output across every plugin permutation and blockMemoEnabled on/off).
If you're touching blockMemo.ts or MarkdownContent.tsx, read the design document at the top of blockMemo.ts first.
The library default schema starts from rehype-sanitize's defaultSchema and extends it with three additions:
- <mark> tag + class allowlist (for ==highlight==).
- math-inline and math-display classes on <code> (the markers remark-math emits before rehype-katex consumes them). KaTeX's own output classes (katex, katex-html, …) are not in this allowlist; they survive because rehype-katex runs after rehype-sanitize in the rehype chain.
- Cross-chunk coordination tags: cross-chunk-link, cross-chunk-image, footnote-sup.
Hand-rolling a schema via { ...defaultSchema, … } silently drops these. extendSanitizeSchema always works on a deep clone of the library's default (not rehype-sanitize's), so the additions survive.
The library default is not exported as a value — only the helper. This prevents the shallow-spread footgun by construction: there's no sanitizeSchema constant in the public API to shallow-spread from.
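The difference between the two approaches can be shown with plain objects. Everything below is a stand-in: `SchemaSketch` is a heavily trimmed shape, `upstreamDefault` and `libraryDefault` are toy values rather than the real schemas, and `extendSchemaSketch` is a hypothetical helper whose merge rules (union of tag names, per-tag union of attributes) are an assumption about how extendSanitizeSchema might behave.

```typescript
// Trimmed, hypothetical schema shape for illustration only.
interface SchemaSketch {
  tagNames: string[];
  attributes: Record<string, string[]>;
}

// Toy stand-in for rehype-sanitize's defaultSchema.
const upstreamDefault: SchemaSketch = {
  tagNames: ["a", "code", "p"],
  attributes: { a: ["href"] },
};

// Toy stand-in for the library's internal default: upstream plus its additions.
const libraryDefault: SchemaSketch = {
  tagNames: [
    ...upstreamDefault.tagNames,
    "mark",
    "cross-chunk-link",
    "cross-chunk-image",
    "footnote-sup",
  ],
  attributes: { ...upstreamDefault.attributes, mark: ["className"], code: ["className"] },
};

// Always start from a deep clone of the library default, then merge additions.
function extendSchemaSketch(extension: Partial<SchemaSketch>): SchemaSketch {
  // JSON round-trip is enough for this plain-data sketch.
  const base: SchemaSketch = JSON.parse(JSON.stringify(libraryDefault));
  const tagNames = Array.from(new Set([...base.tagNames, ...(extension.tagNames ?? [])]));
  const attributes: Record<string, string[]> = { ...base.attributes };
  for (const [tag, names] of Object.entries(extension.attributes ?? {})) {
    attributes[tag] = Array.from(new Set([...(attributes[tag] ?? []), ...names]));
  }
  return { tagNames, attributes };
}

// The footgun: spreading from the upstream default loses the library additions.
const handRolled: SchemaSketch = {
  ...upstreamDefault,
  tagNames: [...upstreamDefault.tagNames, "details"],
};

// The helper keeps them while still adding the new tag.
const extended = extendSchemaSketch({ tagNames: ["details"] });

console.log(handRolled.tagNames.includes("mark")); // false
console.log(extended.tagNames.includes("mark")); // true
```

Because the helper clones before merging, repeated calls with different extensions do not leak changes into each other or into the shared default.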
See URL Sanitization & Custom Schemes for the two-gate model.
@ai-react-markdown/mantine is a thin wrapper that:
- Extends the core config with codeBlock.{defaultExpanded, autoDetectUnknownLanguage}.
- Provides MantineAIMarkdownTypography (uses Mantine's <Typography>).
- Provides MantineAIMDefaultExtraStyles (CSS scoping for em-based Mantine token overrides).
- Overrides customComponents.pre with MantineAIMPreCode (CodeHighlight + Mermaid + JSON pretty-print).
- Auto-detects color scheme via Mantine's useComputedColorScheme.
Every one of these uses public extension points from core. No internal access. See Extending via a Sub-package for the template.
- useId() powers the auto-generated documentId: SSR-safe, stable across re-renders, distinct per component instance.
- React 19's Strict Mode double-mount semantics are handled by the microtask-deferred cleanup in documentRegistry (releaseSymbol → microtask → identity check → maybe delete).
- The library doesn't use any React 19-only Hooks; useId itself has been available since React 18. React 19's use() is not yet leveraged.
The library imports react-markdown as an internal module (packages/core/src/components/markdown/). This is a vendor fork, not a redistribution — the source is bundled and adapted for the library's needs:
- Block-level memoization needs control over the conversion stage (
toJsxRuntime) that the upstream component encapsulates. - The pipeline is exposed as three independent stages (parse, plan, render) so block memoization can intercept between stages.
- Cross-chunk placeholder elements need custom handlers in the mdast → hast conversion that aren't available on the upstream component.
The fork is intentional and the surface area is small. Consumers don't need to install react-markdown themselves — the library's wrapper is the only required dependency.
packages/core/src/
├── index.tsx ← <AIMarkdown> + public API re-exports
├── defs.ts ← config, render state, variant/scheme types
├── context.tsx ← render-state + metadata providers + hooks
├── preprocessors/
│ ├── index.ts ← preprocessing pipeline orchestrator
│ ├── defs.ts ← AIMDContentPreprocessor type
│ └── latex.ts ← built-in LaTeX normalizer
├── hooks/
│ ├── useStableValue.ts ← deep-equal reference stabilizer
│ └── useReferenceFlipWarning.ts ← dev-only identity-flip detector
├── components/
│ ├── MarkdownContent.tsx ← the actual markdown renderer
│ ├── markdown/ ← vendored react-markdown wrapper
│ ├── typography/ ← default typography variant
│ ├── blockMemo.ts ← block-level memoization
│ ├── AIMarkdownDocuments.tsx ← cross-chunk wrapper
│ ├── documentRegistry.ts ← cross-chunk shared state
│ ├── crossChunkPlaceholders.tsx ← placeholder element renderers
│ ├── sanitizeSchema.ts ← library default schema (internal)
│ ├── extendSanitizeSchema.ts ← public schema-extension helper
│ ├── crossChunkUrlSanitize.ts ← cross-chunk URL filter
│ ├── shortenDocumentId.ts ← MurmurHash3 → Base62
│ ├── customMdastHandlers.ts ← mdast → hast handlers (phantom defs, footnote sup, …)
│ └── rehypeRebaseHashLinks.ts ← rehype plugin to prefix hash hrefs
└── typings/
└── partial-deep.ts ← PartialDeep<T> type util
packages/mantine/src/
├── index.tsx ← barrel
├── defs.tsx ← Mantine-extended config + default
├── MantineAIMarkdown.tsx ← wrapper component
├── components/
│ ├── typography/
│ │ └── MantineTypography.tsx
│ ├── extra-styles/
│ │ └── DefaultExtraStyles.tsx
│ └── customized/
│ └── PreCode.tsx ← CodeHighlight + Mermaid + JSON
└── hooks/
├── useMantineAIMarkdownRenderState.ts
└── useMantineAIMarkdownMetadata.ts
The trail of file names is intentionally descriptive — when you're debugging or extending, grep is your friend.