Commit 46363db

committed
Plan a deferred phase 14 for moving writePdf off the main thread.
1 parent a370c5f commit 46363db

1 file changed

Lines changed: 59 additions & 0 deletions

builder/PLAN-sab-pull-scheduler.md

@@ -1980,3 +1980,62 @@ knows exactly what it's waiting for, and wakes as soon as the main
thread sets the task to DONE and notifies the slot. Before waiting,
the worker checks if other non-dependent work is available (one scan);
if so, it does that work instead of sleeping.

### Phase 14: Move `writePdf` to a worker --- DEFERRED

**Motivation.** `writePdf` depends on `flushJoin`, `mermaid`, and
`resolveBookChapters`; it has no data dependency on the offline pipeline
(`searchData` → `writeAux` → `writeOffline`). But both `writePdf` and
the offline pipeline are `runOnMain`, so they serialize on the main
thread. On a machine where `flushJoin` lands at ~1.2 s, the ~150 ms
`writePdf` cost is 12 % of the build.

**Investigation.** Splitting `writePdf` into two main-thread tasks
(`assemblePdf` + `writePdfFiles`) to measure the compute-vs-I/O
breakdown showed:

```
assemblePdf=160ms writePdfFiles=30ms
```

The compute half (`assembleBook`: chapter walking, body transforms,
href rewriting, html-compress) is ~84 % of the cost (160 of 190 ms).
The file-write half (one `book.html` + 2 CSS files + ~100 images) is
only ~30 ms.

**Consequence.** Moving just the file writes to a worker saves only
~30 ms, which is not enough to justify the SAB broadcast plumbing. The
real win requires moving `assembleBook` itself off main. Two blockers
prevent that:

1. **Live page-object references.** `resolveBookChapters` stores page
   objects in `bookData._chapters[]`, `_foreword`, `_landing`. These
   are identity-linked to `state.pages` entries where `renderedContent`
   was merged after render. Structured clone to a worker breaks the
   identity link.

2. **`site.markdown` dependency.** `assembleBook` →
   `renderPartDivider` calls `site.markdown.render()` for part
   subtitles and intros. The markdown-it instance is not serializable.
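
Both blockers can be reproduced in isolation. The sketch below is not the build's code: `state`, `bookData`, and `fakeMarkdown` are hypothetical stand-ins, and `structuredClone()` is used in-process to mimic the copy that posting a message to a worker performs.

```javascript
// Hypothetical stand-ins for state.pages and bookData (not the real build state).
const state = { pages: [{ permalink: "/intro/", renderedContent: null }] };
const bookData = { _chapters: [state.pages[0]] };

// Blocker 1. On the main thread the chapter entry IS the page object.
const sameObject = bookData._chapters[0] === state.pages[0];

// Sending bookData to a worker applies the structured clone algorithm;
// structuredClone() performs the same kind of deep copy in-process.
const cloned = structuredClone(bookData);
const identityAfterClone = cloned._chapters[0] === state.pages[0];

// A write to the original page is invisible to the cloned copy.
state.pages[0].renderedContent = "<p>Hello</p>";
const cloneSeesUpdate = cloned._chapters[0].renderedContent === "<p>Hello</p>";

// Blocker 2. An object that carries functions cannot be cloned at all.
// fakeMarkdown is a hypothetical stand-in for the markdown-it instance.
const fakeMarkdown = { render: (src) => `<p>${src}</p>` };
let cloneError = null;
try {
  structuredClone(fakeMarkdown);
} catch (err) {
  cloneError = err.name; // functions are rejected by structured clone
}

console.log({ sameObject, identityAfterClone, cloneSeesUpdate, cloneError });
```

The first case fails silently (the worker gets a detached copy), while the second fails loudly with an exception, so the two blockers would likely surface differently during a migration.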

**Paths forward (not yet committed to):**

- **Index-based chapter references.** `resolveBookChapters` stores
  permalink strings instead of page objects; `assembleBook` builds a
  `Map<permalink, Page>` at the start and resolves refs through it.
  Removes blocker 1.

- **Pre-render book text.** Pre-render subtitles/intros during
  `resolveBookChapters` (which runs after `markdownInit`), storing the
  HTML on `bookData` entries. `renderPartDivider` reads the
  pre-rendered strings instead of calling `site.markdown`. Removes
  blocker 2.

- **Full worker migration.** With both blockers removed, the entire
  `writePdf` (compute + I/O) can run on a worker via SAB broadcast of
  a page projection (~10 MB: all pages' `permalink`, `navPath`,
  `renderedContent`, `frontmatter` subset). Packing cost ~30--50 ms;
  net main-thread savings ~100--120 ms.
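
A minimal sketch of how the three paths could fit together, under stated assumptions: `packProjection`, `unpackProjection`, the JSON encoding, and the `parts[].subtitleHtml` field are all hypothetical, since this plan does not specify the SAB broadcast format. Both halves run in one thread here so the round trip is easy to check.

```javascript
// Main-thread side (hypothetical): project pages to plain data and copy the
// serialized bytes into a SharedArrayBuffer.
function packProjection(pages) {
  const projection = pages.map(({ permalink, navPath, renderedContent }) => ({
    permalink,
    navPath,
    renderedContent,
  }));
  const bytes = new TextEncoder().encode(JSON.stringify(projection));
  const sab = new SharedArrayBuffer(bytes.byteLength);
  new Uint8Array(sab).set(bytes);
  return sab;
}

// Worker side (hypothetical): decode the buffer and index pages by permalink.
function unpackProjection(sab) {
  // slice() copies the shared bytes into an ordinary buffer before decoding.
  const copy = new Uint8Array(sab).slice();
  const pages = JSON.parse(new TextDecoder().decode(copy));
  return new Map(pages.map((page) => [page.permalink, page]));
}

const pages = [
  { permalink: "/ch1/", navPath: ["Book", "One"], renderedContent: "<h1>One</h1>" },
  { permalink: "/ch2/", navPath: ["Book", "Two"], renderedContent: "<h1>Two</h1>" },
];

// bookData carries only permalink strings and pre-rendered HTML, so it can be
// cloned to a worker without live page objects or a markdown instance.
const bookData = {
  _chapters: ["/ch2/", "/ch1/"],
  parts: [{ subtitleHtml: "<p>Part <em>one</em></p>" }],
};

const byPermalink = unpackProjection(packProjection(pages));
const chapters = bookData._chapters.map((permalink) => {
  const page = byPermalink.get(permalink);
  if (!page) throw new Error(`unknown chapter permalink: ${permalink}`);
  return page;
});

const bookBody = chapters.map((chapter) => chapter.renderedContent).join("");
console.log(bookBody);
```

In a real worker setup the `SharedArrayBuffer` would be handed over by message rather than passed as a function argument. JSON is used here only for brevity; a packed binary layout may be needed to stay within the ~30--50 ms packing estimate.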

Deferred: the refactoring cost is significant for a saving of at most
~120 ms on a 4 s build (about 3 %). Revisit if the build wall-clock
time shrinks enough that the PDF task becomes a larger fraction.
