Commit 5f54480

docs: document stable block addressing for headless workflows (SD-2069) (#2234)
* docs: document stable block addressing for headless workflows

  SD-2069: The Document API already resolves paraId-first (stable across loads) over sdBlockId (volatile), but nothing in the docs told users this. Add node addressing docs, cross-session workflow example, warn about sdBlockId volatility in Block Node docs, and reference Document API from the AI Agents page.

* chore: clarify stability of paraId
1 parent 17117aa commit 5f54480
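
The paraId-first resolution described in the commit message can be pictured with a minimal TypeScript sketch. This is not SuperDoc's implementation: `BlockAttrs`, `ResolvedId`, and `resolveNodeId` are hypothetical names, and the only thing taken from the commit is the precedence (DOCX-native `paraId` before the runtime `sdBlockId`).

```typescript
// Hypothetical attribute shape. Only the two ID names come from the commit.
interface BlockAttrs {
  paraId?: string | null;
  sdBlockId?: string | null;
}

interface ResolvedId {
  nodeId: string;
  source: 'paraId' | 'sdBlockId';
  // "Best effort" only: the docs in this commit say no ID is guaranteed
  // to survive every Word round-trip.
  bestEffortStable: boolean;
}

// Prefer the DOCX-native paraId; fall back to the session-scoped sdBlockId.
function resolveNodeId(attrs: BlockAttrs): ResolvedId | null {
  if (attrs.paraId) {
    return { nodeId: attrs.paraId, source: 'paraId', bestEffortStable: true };
  }
  if (attrs.sdBlockId) {
    return { nodeId: attrs.sdBlockId, source: 'sdBlockId', bestEffortStable: false };
  }
  return null; // the block carries neither ID
}
```

Returning the ID source alongside the ID lets a caller decide whether an address is worth persisting between sessions.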

4 files changed

Lines changed: 139 additions & 9 deletions

apps/docs/document-api/common-workflows.mdx

Lines changed: 46 additions & 0 deletions
@@ -131,6 +131,52 @@ if (caps.global.trackChanges.enabled) {
 }
 ```
 
+## Cross-session block addressing
+
+When you load a DOCX, close the editor, and load the same file again, `sdBlockId` values change — they're regenerated on every open. The Document API's `find` operation returns addresses whose `nodeId` prefers DOCX-native `paraId`, which is usually stable when reopening the same unchanged DOCX.
+
+This pattern is common in headless pipelines: extract block references in one session, then apply edits in another.
+
+```ts
+import { Editor } from 'superdoc/super-editor';
+import { readFile, writeFile } from 'node:fs/promises';
+
+const docx = await readFile('./contract.docx');
+
+// Session 1: extract block addresses
+const editor1 = await Editor.open(docx);
+const result = editor1.doc.find({
+  select: { type: 'node', nodeType: 'paragraph' },
+  includeNodes: true,
+});
+
+// Save addresses — for DOCX-imported blocks, nodeId uses paraId when available
+const addresses = result.items.map((item) => ({
+  address: item.address,
+  text: item.node?.text,
+}));
+await writeFile('./blocks.json', JSON.stringify(addresses));
+editor1.destroy();
+
+// Session 2: load the same file again and apply edits
+const editor2 = await Editor.open(docx);
+const saved = JSON.parse(await readFile('./blocks.json', 'utf-8'));
+
+// Addresses from session 1 usually resolve when reloading the same unchanged DOCX
+for (const { address } of saved) {
+  const node = editor2.doc.getNode(address); // works across sessions
+}
+editor2.destroy();
+```
+
+<Info>
+`nodeId` stability depends on the ID source. For DOCX-imported content, `nodeId` comes from `paraId` when available and is best-effort stable across loads. For nodes created at runtime, it falls back to `sdBlockId`, which is volatile.
+</Info>
+
+<Warning>
+No ID is guaranteed to survive all Microsoft Word round-trips. Re-extract addresses after major external edits or transformations, since Word (or other tools) may rewrite paragraph IDs and SuperDoc may rewrite duplicate IDs on import.
+</Warning>
+
 ## Dry-run preview
 
 Pass `dryRun: true` to validate an operation without applying it:

apps/docs/document-api/overview.mdx

Lines changed: 30 additions & 0 deletions
@@ -17,3 +17,33 @@ Document API is in <strong>alpha</strong> and subject to breaking changes while
 - Work with predictable inputs and outputs defined per operation.
 - Check capabilities up front and branch safely when features are unavailable.
 
+## Node addressing
+
+Every block in a document has a `nodeId` — a string that uniquely identifies it. When you call `find` or `getNodeById`, addresses use this ID:
+
+```json
+{
+  "kind": "block",
+  "nodeType": "paragraph",
+  "nodeId": "3A2B1C0D"
+}
+```
+
+### Stable IDs across loads
+
+For DOCX documents, `nodeId` is derived from the file's native `w14:paraId` attribute. In practice, this is usually stable when you reopen the same unchanged DOCX across separate editor sessions, machines, or headless CLI pipelines.
+
+For nodes created at runtime (not imported from DOCX), `nodeId` falls back to `sdBlockId`, a UUID generated when the editor opens. This fallback is volatile and changes on every load.
+
+| ID source | Stable across loads? | When used |
+|-----------|---------------------|-----------|
+| `paraId` (from DOCX) | Best effort (usually stable for unchanged DOCX blocks) | Paragraphs, tables, rows, cells imported from DOCX |
+| `sdBlockId` (runtime) | No (session-scoped) | Nodes created programmatically before first export |
+
+<Tip>
+If you need to reference blocks across separate editor sessions, use `editor.doc.find()` to get addresses — don't read `node.attrs.sdBlockId` directly. The Document API resolves `paraId` first for DOCX-imported content.
+</Tip>
+
+<Warning>
+No block ID is guaranteed to survive all Microsoft Word round-trips or external document rewrites. Word and other tools may regenerate `w14:paraId` during structural changes (for example split/merge/rebuild operations), and SuperDoc may rewrite duplicate IDs on import to keep block targeting deterministic.
+</Warning>
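
The ID-source table in this file suggests a cheap sanity check before persisting addresses. The sketch below is not part of the commit and not a SuperDoc API. It assumes that a `paraId`-derived `nodeId` appears as eight hexadecimal digits (as in the `3A2B1C0D` example) and that an `sdBlockId` is a hyphenated UUID. Both patterns and the names `guessIdSource` and `findNonParaIds` are illustrative assumptions.

```typescript
type LikelySource = 'paraId' | 'sdBlockId' | 'unknown';

// Assumption: paraId-derived IDs are 8 hex digits, like "3A2B1C0D".
const PARA_ID_PATTERN = /^[0-9a-f]{8}$/i;
// Assumption: sdBlockId values use the canonical hyphenated UUID layout.
const UUID_PATTERN = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Heuristic guess at where a nodeId came from, based on its shape alone.
function guessIdSource(nodeId: string): LikelySource {
  if (PARA_ID_PATTERN.test(nodeId)) return 'paraId';
  if (UUID_PATTERN.test(nodeId)) return 'sdBlockId';
  return 'unknown';
}

// Collect IDs that do not look paraId-derived, so a pipeline can warn
// before storing addresses that are unlikely to resolve in a later session.
function findNonParaIds(nodeIds: readonly string[]): string[] {
  return nodeIds.filter((id) => guessIdSource(id) !== 'paraId');
}
```

A shape check like this can only flag likely session-scoped IDs. It cannot confirm that a `paraId` will survive an external rewrite of the file.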

apps/docs/getting-started/ai-agents.mdx

Lines changed: 58 additions & 8 deletions
@@ -80,6 +80,56 @@ const redlined = await editor.exportDocx();
 ```
 </CodeGroup>
 
+## Programmatic block access
+
+For multi-step workflows — extract block references, send them to an LLM, then apply edits — use the [Document API](/document-api/overview). It returns stable block addresses that work across separate editor sessions:
+
+<CodeGroup>
+```javascript Usage
+// Find all paragraphs and their text
+const result = editor.doc.find({
+  select: { type: 'node', nodeType: 'paragraph' },
+  includeNodes: true,
+});
+
+// Each item has an address with a best-effort stable nodeId
+const blocks = result.items.map((item) => ({
+  id: item.address.nodeId, // usually stable across loads for unchanged DOCX content
+  type: item.address.nodeType,
+  text: item.node?.text,
+}));
+
+// Send blocks to LLM, get back edits, apply them
+```
+
+```javascript Full Example
+import { readFile } from 'node:fs/promises';
+import { Editor } from 'superdoc/super-editor';
+
+const docx = await readFile('./contract.docx');
+const editor = await Editor.open(docx);
+
+// Find all paragraphs and their text
+const result = editor.doc.find({
+  select: { type: 'node', nodeType: 'paragraph' },
+  includeNodes: true,
+});
+
+// Each item has an address with a best-effort stable nodeId
+const blocks = result.items.map((item) => ({
+  id: item.address.nodeId, // usually stable across loads for unchanged DOCX content
+  type: item.address.nodeType,
+  text: item.node?.text,
+}));
+
+// Send blocks to LLM, get back edits, apply them
+```
+</CodeGroup>
+
+<Info>
+Block addresses from `doc.find()` prefer DOCX-native IDs (`paraId`) for imported blocks. This is the best available cross-session anchor, but no ID is guaranteed to survive all Word round-trips. Don't read `node.attrs.sdBlockId` directly — it's regenerated on every load.
+</Info>
+
 ## LLM quick reference
 
 Point your AI assistant at these URLs for SuperDoc context:
@@ -92,6 +142,14 @@ https://docs.superdoc.dev/llms-full.txt // Complete documentation
 ## Next steps
 
 <CardGroup cols={2}>
+  <Card
+    title="Document API"
+    icon="book"
+    href="/document-api/overview"
+  >
+    Stable block addressing, find/replace, and programmatic edits
+  </Card>
+
   <Card
     title="SuperEditor API"
     icon="code"
@@ -108,14 +166,6 @@ https://docs.superdoc.dev/llms-full.txt // Complete documentation
     Content formats, export options, and round-trip behavior
   </Card>
 
-  <Card
-    title="AI Redlining (Browser)"
-    icon="github"
-    href="https://github.com/superdoc-dev/superdoc/tree/main/examples/features/ai-redlining"
-  >
-    React app: upload a DOCX, LLM reviews it, tracked changes in the UI
-  </Card>
-
   <Card
     title="AI Redlining (Headless)"
     icon="github"

apps/docs/snippets/extensions/block-node.mdx

Lines changed: 5 additions & 1 deletion
@@ -124,10 +124,14 @@ Every block-level node (paragraphs, headings, etc.) automatically receives a uni
 3. **Collaborative editing** - Reference blocks consistently across clients
 4. **Programmatic updates** - Update document structure via APIs
 
+<Warning>
+`sdBlockId` is regenerated on every document load. If you need cross-session block references (e.g., headless CLI pipelines, multi-step AI workflows), use the [Document API](/document-api/overview) instead. `editor.doc.find()` returns addresses whose `nodeId` prefers DOCX-native `paraId` for imported blocks. This is best-effort stable, not a permanent global ID: Word or other tools can rewrite IDs during structural changes, and SuperDoc may rewrite duplicates on import.
+</Warning>
+
 ## Use case
 
 - **Document APIs** - Build REST APIs that manipulate specific blocks
 - **Collaboration** - Track who edited which blocks in real-time
 - **Comments & Annotations** - Attach metadata to specific blocks
 - **Version Control** - Diff documents at the block level
-- **Templates** - Replace placeholder blocks with dynamic content
+- **Templates** - Replace placeholder blocks with dynamic content
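
Several warnings in this commit recommend re-extracting addresses after external edits, because stored IDs may no longer resolve. One possible way to do that is sketched here in TypeScript, under stated assumptions: `SavedBlock` mirrors the `{ address, text }` records written to `blocks.json` in the cross-session example, and `reconcile` is a hypothetical helper, not a SuperDoc function. It matches by `nodeId` first and falls back to exact text only when that text is unique in the fresh extraction.

```typescript
interface BlockAddress {
  kind: 'block';
  nodeType: string;
  nodeId: string;
}

// Same shape as the records saved in the cross-session example.
interface SavedBlock {
  address: BlockAddress;
  text?: string;
}

interface Match {
  saved: SavedBlock;
  current: SavedBlock | null;
  via: 'nodeId' | 'text' | 'none';
}

// Pair previously saved blocks with a fresh extraction of the same document.
function reconcile(saved: readonly SavedBlock[], current: readonly SavedBlock[]): Match[] {
  const byId = new Map<string, SavedBlock>();
  const byText = new Map<string, SavedBlock[]>();
  for (const block of current) {
    byId.set(block.address.nodeId, block);
    if (!block.text) continue;
    const bucket = byText.get(block.text) ?? [];
    bucket.push(block);
    byText.set(block.text, bucket);
  }

  return saved.map((entry): Match => {
    // 1. The stored ID still resolves to a block of the same type.
    const hit = byId.get(entry.address.nodeId);
    if (hit && hit.address.nodeType === entry.address.nodeType) {
      return { saved: entry, current: hit, via: 'nodeId' };
    }
    // 2. The ID was rewritten: accept a text match only if it is unambiguous.
    const [only, ...rest] = entry.text ? byText.get(entry.text) ?? [] : [];
    if (only && rest.length === 0) {
      return { saved: entry, current: only, via: 'text' };
    }
    // 3. Nothing reliable: the caller should treat this block as lost.
    return { saved: entry, current: null, via: 'none' };
  });
}
```

Requiring a unique text match avoids silently retargeting an edit at the wrong paragraph when a document repeats boilerplate text.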
