Commit 62f7fb4

ds300 and cursoragent authored
refactor(sync): incremental Pierre persistence with per-record files (tldraw#8013)
In order to enable granular git-level diffs in Pierre history, this PR rewrites Pierre persistence to store each tldraw record as an individual file and use the `TLSyncStorage` interface for incremental updates.

**Pierre storage format**: Instead of a single `snapshot.json` blob, each commit now contains `meta.json` (documentClock, schema only) and individual `records/{id}.json` files. Each record file stores only the record state (no lastChangedClock). Tombstone metadata is not stored in Pierre; the file-per-record model represents deletes by removing files. This gives meaningful per-record diffs in Pierre's commit history.

**Incremental persistence**: Uses `storage.getChangesSince(pierreDocClock)` to compute the diff since the last Pierre commit. Only changed records are written per commit. For empty repos, `documentClock=-1` triggers returning all records. On `wipeAll`, stale record files in Pierre are cleaned up via `repo.listFiles()`. Optimistic concurrency via `expectedHeadSha` with retry on CAS conflict.

**getChangesSince(wipeAll)**: When `wipeAll` is true (e.g. empty repo, `sinceClock -1`), storage was still including tombstones in `diff.deletes`, which caused the first Pierre commit to call `deletePath` on non-existent files and could duplicate deletes with the cleanup loop. InMemorySyncStorage and SQLiteSyncStorage now omit `diff.deletes` when `wipeAll`; the durable object applies `diff.deletes` only when `headSha && !wipeAll`.

**Restore fix**: Pierre restore no longer calls `repo.restoreCommit()` (which failed when restoring to HEAD). Instead it reconstructs the snapshot from the Pierre archive via `reconstructSnapshotFromPierre` and loads it into storage with `loadSnapshotIntoStorage`, so the next persist cycle writes the restored state as a new forward commit.

**Snapshot reconstruction**: New `pierreSnapshot.ts` uses `modern-tar`'s `createTarDecoder()` to stream the gzip-compressed tar archive from Pierre. The pipeline is `getArchiveStream` → `DecompressionStream('gzip')` → `createTarDecoder()`; entries are consumed with a reader loop. Record files are parsed as state only; reconstructed documents use `lastChangedClock: 0` (`loadSnapshotIntoStorage` only uses state). History snapshot and restore both call `reconstructSnapshotFromPierre` and return or load the resulting `RoomSnapshot`.

**Other**:

- Added missing `assert` import in `TLSyncStorage.ts`.
- `syncPierreState` uses `listCommits({ limit: 1 })` and `getFileStream('meta.json')` to sync HEAD and documentClock.
- Pierre repo IDs changed from `.../snapshots/...` to `.../files/...`.
- Upgraded `@pierre/storage` to `^1.1.0` and added `modern-tar`.
- Client history snapshot page uses `TlaFileError` in the error boundary and drops the unused error prop from the main component.

### Change type

- [x] `improvement`

### Test plan

1. Open a tldraw file with Pierre enabled
2. Make edits and verify Pierre commits contain individual `records/*.json` entries (state only)
3. View Pierre history: verify snapshot loading works
4. Restore from Pierre history: verify the canvas updates and a new Pierre commit is created
5. Verify incremental commits only touch changed records
6. New file (empty Pierre repo): first persist should succeed without deletePath errors

- [ ] Unit tests
- [ ] End to end tests

### Release notes

- Internal improvement to Pierre version history storage format for better granularity.

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
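The storage format and delete rules described above can be illustrated with a small, self-contained sketch. It is not the code from this commit: `ChangeSet`, `FileOp`, and `planCommit` are hypothetical names, and the real implementation calls the Pierre commit builder directly rather than returning a list of operations. The sketch only shows which files one commit would write or delete for an incremental change, an empty repo, and a `wipeAll` against an existing repo.

```typescript
// Hypothetical shapes, loosely mirroring the diff returned by getChangesSince.
interface ChangeSet {
	wipeAll: boolean
	puts: Record<string, unknown> // record id -> record state
	deletes: string[] // record ids removed since the last commit
}

type FileOp =
	| { kind: 'write'; path: string; content: string }
	| { kind: 'delete'; path: string }

const recordPath = (id: string) => `records/${id}.json`

/**
 * Plan the file operations for one commit.
 * `existingPaths` is the file listing at the current head, or null for an empty repo.
 */
function planCommit(
	changes: ChangeSet,
	documentClock: number,
	schema: unknown,
	existingPaths: string[] | null
): FileOp[] {
	// meta.json is rewritten on every commit so the stored clock stays current.
	const ops: FileOp[] = [
		{ kind: 'write', path: 'meta.json', content: JSON.stringify({ documentClock, schema }) },
	]
	for (const [id, state] of Object.entries(changes.puts)) {
		ops.push({ kind: 'write', path: recordPath(id), content: JSON.stringify(state) })
	}
	if (existingPaths && !changes.wipeAll) {
		// Normal incremental commit: delete exactly what the diff reports.
		for (const id of changes.deletes) ops.push({ kind: 'delete', path: recordPath(id) })
	}
	if (existingPaths && changes.wipeAll) {
		// Full rewrite: drop any record file that is not being written again.
		const keep = new Set(Object.keys(changes.puts).map(recordPath))
		for (const path of existingPaths) {
			if (path.startsWith('records/') && !keep.has(path)) ops.push({ kind: 'delete', path })
		}
	}
	return ops
}
```

With an empty repo (`existingPaths` is `null`) the plan contains only writes, which matches the test-plan expectation that the first persist never calls `deletePath`.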
1 parent 1be9b0a commit 62f7fb4

12 files changed

Lines changed: 238 additions & 63 deletions

apps/dotcom/client/src/tla/pages/file-pierre-history-snapshot.tsx

Lines changed: 3 additions & 3 deletions
@@ -16,7 +16,7 @@ export function ErrorBoundary() {
 	useEffect(() => {
 		captureException(error)
 	}, [error])
-	return <Component error={error} />
+	return <TlaFileError error={error} />
 }
 
 const { loader, useData } = defineLoader(async (args) => {
@@ -36,7 +36,7 @@ const { loader, useData } = defineLoader(async (args) => {
 
 export { loader }
 
-export function Component({ error: _error }: { error?: unknown }) {
+export function Component() {
 	const userId = useMaybeApp()?.userId
 
 	const result = useData()
@@ -54,7 +54,7 @@ export function Component({ error: _error }: { error?: unknown }) {
 		} as TLStoreSnapshot
 	}, [result])
 
-	const error = _error || !result || !snapshot
+	const error = !result || !snapshot
 
 	useEffect(() => {
 		if (error && userId) {

apps/dotcom/sync-worker/package.json

Lines changed: 2 additions & 1 deletion
@@ -23,7 +23,7 @@
 	},
 	"dependencies": {
 		"@clerk/backend": "^1.23.7",
-		"@pierre/storage": "^0.1.1",
+		"@pierre/storage": "^1.1.0",
 		"@rocicorp/zero": "0.25.9",
 		"@supabase/auth-helpers-remix": "^0.2.6",
 		"@supabase/supabase-js": "^2.48.1",
@@ -39,6 +39,7 @@
 		"jose": "^5.9.6",
 		"kysely": "^0.27.5",
 		"lodash.throttle": "^4.1.1",
+		"modern-tar": "^0.7.5",
 		"pg": "^8.13.1",
 		"pg-logical-replication": "^2.0.7",
 		"react": "^19.2.1",

apps/dotcom/sync-worker/src/TLDrawDurableObject.ts

Lines changed: 134 additions & 34 deletions
@@ -1,7 +1,7 @@
 /// <reference no-default-lib="true"/>
 /// <reference types="@cloudflare/workers-types" />
 
-import { RefUpdateError } from '@pierre/storage'
+import { ApiError, RefUpdateError, type Repo } from '@pierre/storage'
 import { SupabaseClient } from '@supabase/supabase-js'
 import {
 	DB,
@@ -53,6 +53,7 @@ import { EventData, writeDataPoint } from './utils/analytics'
 import { createPierreClient } from './utils/createPierreClient'
 import { createSupabaseClient } from './utils/createSupabaseClient'
 import { getRoomDurableObject } from './utils/durableObjects'
+import { reconstructSnapshotFromPierre } from './utils/pierreSnapshot'
 import { isRateLimited } from './utils/rateLimit'
 import { getSlug } from './utils/roomOpenMode'
 import { throttle } from './utils/throttle'
@@ -211,6 +212,7 @@ export class TLFileDurableObject extends DurableObject {
 	// For persistence
 	supabaseClient: SupabaseClient | void
 	pierreClient: ReturnType<typeof createPierreClient>
+	pierreState: PierreState | null = null
 
 	// For analytics
 	measure: Analytics | undefined
@@ -371,16 +373,9 @@ export class TLFileDurableObject extends DurableObject {
 			if (!repo) {
 				return new Response('Pierre not available', { status: 503 })
 			}
-			const fileStream = await repo.getFileStream({
-				path: SNAPSHOT_FILE_NAME,
-				ref: commitHash,
-			})
-			dataText = await fileStream.text()
-			await repo.restoreCommit({
-				targetCommitSha: commitHash,
-				targetBranch: 'main',
-				author: PIERRE_AUTHOR,
-			})
+			const snapshot = await reconstructSnapshotFromPierre(repo, commitHash)
+			dataText = JSON.stringify(snapshot)
+			this.pierreState = null
 		} else {
 			const timestamp = ((await req.json()) as any).timestamp
 			if (!timestamp) {
@@ -973,6 +968,7 @@ export class TLFileDurableObject extends DurableObject {
 
 			const key = getR2KeyForRoom({ slug: slug, isApp: this.documentInfo.isApp })
 			await this._uploadSnapshotToR2(snapshot, key)
+			await this.persistToPierre(storage, snapshot)
 
 			this.logEvent({ type: 'persist_success', attempts: attempt })
 			this._lastPersistedClock = snapshot.documentClock
@@ -1023,7 +1019,6 @@ export class TLFileDurableObject extends DurableObject {
 		// Then upload to version cache
 		const versionKey = `${key}/${new Date().toISOString()}`
 		await this._uploadSnapshotToBucket(this.r2.versionCache, snapshot, versionKey)
-		await this.persistToPierre(versionKey)
 	}
 
 	private async _uploadSnapshotToBucket(bucket: R2Bucket, snapshot: RoomSnapshot, key: string) {
@@ -1120,44 +1115,139 @@ export class TLFileDurableObject extends DurableObject {
 		) {
 			return null
 		}
-		const repoId = `${this.env.TLDRAW_ENV}/snapshots/${this.documentInfo.slug}`
+		const repoId = `${this.env.TLDRAW_ENV}/files/${this.documentInfo.slug}`
 		return (
 			(await this.pierreClient.findOne({ id: repoId })) ??
 			(await this.pierreClient.createRepo({ id: repoId }))
 		)
 	}
 
-	private async persistToPierre(key: string) {
+	/**
+	 * Sync local Pierre tracking state from the remote repo. Fetches HEAD sha
+	 * and meta.json (for documentClock). For empty repos, sets documentClock
+	 * to -1 so getChangesSince returns all records.
+	 */
+	private async syncPierreState(repo: Repo) {
+		let headCommit: { sha: string } | undefined
+		try {
+			const { commits } = await repo.listCommits({ limit: 1 })
+			headCommit = commits[0]
+		} catch (error) {
+			if (error instanceof ApiError && error.status === 404) {
+				this.pierreState = { headSha: undefined, documentClock: -1 }
+				return
+			}
+			throw error
+		}
+
+		if (!headCommit) {
+			this.pierreState = { headSha: undefined, documentClock: -1 }
+			return
+		}
+
+		const metaResp = await repo.getFileStream({ path: 'meta.json', ref: headCommit.sha })
+		const meta = (await metaResp.json()) as PierreMeta
+
+		this.pierreState = {
+			headSha: headCommit.sha,
+			documentClock: meta.documentClock ?? 0,
+		}
+	}
+
+	private async persistToPierre(storage: TLSyncStorage<TLRecord>, snapshot: RoomSnapshot) {
 		try {
 			const repo = await this.getPierreRepo()
 			if (!repo) return
 
-			// Get the snapshot from R2 to create a readable stream
-			const r2Object = await this.r2.versionCache.get(key)
-			if (!r2Object) {
-				console.warn('Failed to get R2 object for Pierre upload:', key)
-				return
-			}
+			const MAX_CAS_RETRIES = 3
+			for (let attempt = 0; attempt < MAX_CAS_RETRIES; attempt++) {
+				if (!this.pierreState) {
+					await this.syncPierreState(repo)
+				}
+
+				const { headSha, documentClock: pierreDocClock } = this.pierreState!
+
+				const { result: changes, documentClock } = storage.transaction((txn) =>
+					txn.getChangesSince(pierreDocClock)
+				)
+
+				if (!changes) return
+
+				const { diff } = changes
+				const hasPuts = Object.keys(diff.puts).length > 0
+				const hasDeletes = diff.deletes.length > 0
 
-			// Create commit with the snapshot
-			const timestamp = new Date().toISOString()
-			await repo
-				.createCommit({
+				if (!hasPuts && !hasDeletes && pierreDocClock === documentClock) {
+					return
+				}
+
+				const timestamp = new Date().toISOString()
+				const commitBuilder = repo.createCommit({
 					targetBranch: 'main',
 					commitMessage: `Snapshot at ${timestamp}`,
 					author: PIERRE_AUTHOR,
+					expectedHeadSha: headSha,
 				})
-				.addFile(SNAPSHOT_FILE_NAME, r2Object.body)
-				.send()
-				.catch((e) => {
-					// ignore no changes to commit errors
-					if (e instanceof RefUpdateError && e.message.match('no changes to commit')) {
-						return
+
+				const meta: PierreMeta = {
+					documentClock,
+					schema: snapshot.schema,
+				}
+				commitBuilder.addFileFromString('meta.json', JSON.stringify(meta))
+
+				for (const [id, put] of Object.entries(diff.puts)) {
+					const state = Array.isArray(put) ? put[1] : put
+					commitBuilder.addFileFromString(`records/${id}.json`, JSON.stringify(state))
+				}
+
+				// Only apply diff.deletes when we have a parent commit and we're not in wipeAll.
+				// - Empty repo (no headSha): those paths don't exist in Pierre; deletePath would fail.
+				// - wipeAll with existing repo: the cleanup loop below already deletes any file not in
+				//   putIds, so applying diff.deletes here would duplicate deletePath for the same file.
+				if (headSha && !changes.wipeAll) {
+					for (const id of diff.deletes) {
+						commitBuilder.deletePath(`records/${id}.json`)
 					}
-					throw e
-				})
+				}
+
+				// On wipeAll with an existing repo, pruned tombstones may not appear in diff.deletes,
+				// so scan Pierre for stale record files and remove them.
+				if (changes.wipeAll && headSha) {
+					const putIds = new Set(Object.keys(diff.puts))
+					const { paths } = await repo.listFiles({ ref: headSha })
+					for (const path of paths) {
+						if (!path.startsWith('records/')) continue
+						const id = path.slice('records/'.length, -'.json'.length)
+						if (!putIds.has(id)) {
+							commitBuilder.deletePath(path)
+						}
+					}
+				}
+
+				try {
+					const result = await commitBuilder.send().catch((e) => {
+						if (e instanceof RefUpdateError && e.message.match('no changes to commit')) {
+							return null
+						}
+						throw e
+					})
+
+					this.pierreState = {
+						headSha: result ? result.refUpdate.newSha : headSha,
+						documentClock,
+					}
+					return
+				} catch (error) {
+					if (error instanceof RefUpdateError) {
+						console.warn('Pierre CAS conflict, retrying:', error.message)
+						this.pierreState = null
+						continue
+					}
+					throw error
+				}
+			}
+			console.error('Pierre: exhausted CAS retries')
 		} catch (error) {
-			// Log but don't fail the main persist operation
 			console.error('Failed to persist to Pierre:', error)
 			this.reportError(error)
 		}
@@ -1368,5 +1458,15 @@ async function listAllObjectKeys(bucket: R2Bucket, prefix: string): Promise<stri
 const PERSIST_RETRIES_NOTIFY_THRESHOLD = 10
 const PERSIST_RETRIES_MAX = 100
 
-const SNAPSHOT_FILE_NAME = 'snapshot.json'
 const PIERRE_AUTHOR = { email: 'huppy@tldraw.com', name: 'huppy [bot]' }
+
+interface PierreState {
+	headSha: string | undefined
+	documentClock: number
+}
+
+/** Shape of meta.json stored in Pierre archives. */
+export interface PierreMeta {
+	documentClock: number
+	schema: RoomSnapshot['schema']
+}

apps/dotcom/sync-worker/src/routes/getPierreHistory.ts

Lines changed: 1 addition & 1 deletion
@@ -32,7 +32,7 @@ export async function getPierreHistory(
 
 	try {
 		const envName = env.TLDRAW_ENV || 'development'
-		const repoId = `${envName}/snapshots/${roomId}`
+		const repoId = `${envName}/files/${roomId}`
 
 		const repo = await pierreClient.findOne({ id: repoId })
 		if (!repo) {

apps/dotcom/sync-worker/src/routes/getPierreHistorySnapshot.ts

Lines changed: 5 additions & 7 deletions
@@ -2,6 +2,7 @@ import { notFound } from '@tldraw/worker-shared'
 import { IRequest } from 'itty-router'
 import { Environment } from '../types'
 import { createPierreClient } from '../utils/createPierreClient'
+import { reconstructSnapshotFromPierre } from '../utils/pierreSnapshot'
 import { isRoomIdTooLong, roomIdIsTooLong } from '../utils/roomIdIsTooLong'
 import { requireWriteAccessToFile } from '../utils/tla/getAuth'
 import { isTestFile } from '../utils/tla/isTestFile'
@@ -24,7 +25,7 @@ export async function getPierreHistorySnapshot(
 		return new Response('Not found', { status: 404 })
 	}
 
-	const commitHash = request.params.timestamp // This is actually the commit hash now
+	const commitHash = request.params.timestamp
 	if (!commitHash) {
 		return new Response('Missing commit hash', { status: 400 })
 	}
@@ -36,21 +37,18 @@ export async function getPierreHistorySnapshot(
 
 	try {
 		const envName = env.TLDRAW_ENV || 'development'
-		const repoId = `${envName}/snapshots/${roomId}`
+		const repoId = `${envName}/files/${roomId}`
 
 		const repo = await pierreClient.findOne({ id: repoId })
 		if (!repo) {
 			return new Response('Not found', { status: 404 })
 		}
 
-		// Get the file stream and return it
-		const fileStream = await repo.getFileStream({ path: 'snapshot.json', ref: commitHash })
+		const snapshot = await reconstructSnapshotFromPierre(repo, commitHash)
 
-		// The content should be the snapshot JSON
-		return new Response(fileStream.body, {
+		return new Response(JSON.stringify(snapshot), {
 			headers: {
 				'content-type': 'application/json',
-				// cache forever
 				'Cache-Control': 'public, max-age=31536000, immutable',
 			},
 		})
Lines changed: 65 additions & 0 deletions
@@ -0,0 +1,65 @@
+import type { Repo } from '@pierre/storage'
+import { RoomSnapshot } from '@tldraw/sync-core'
+import { createTarDecoder } from 'modern-tar'
+import type { PierreMeta } from '../TLDrawDurableObject'
+
+/** Actual format from Pierre: listFiles and getArchiveStream use repo-root paths with no prefix. */
+function isMetaJson(name: string) {
+	return name === 'meta.json'
+}
+
+/** Record files are exactly records/<id>.json; tar also includes a "records/" directory entry (excluded here). */
+function isRecordFile(name: string) {
+	return /^records\/[^/]+\.json$/.test(name)
+}
+
+/**
+ * Reconstruct a RoomSnapshot from a Pierre repo at a given ref.
+ * Streams the tar archive — only the parsed JSON values are held in memory.
+ */
+export async function reconstructSnapshotFromPierre(
+	repo: Repo,
+	ref: string
+): Promise<RoomSnapshot> {
+	const resp = await repo.getArchiveStream({ ref })
+	if (!resp.body) {
+		throw new Error(`Empty archive body from Pierre at ref ${ref}`)
+	}
+	const entries = resp.body
+		.pipeThrough(new DecompressionStream('gzip'))
+		.pipeThrough(createTarDecoder())
+
+	let meta: PierreMeta | null = null
+	const documents: RoomSnapshot['documents'] = []
+
+	const reader = entries.getReader()
+	try {
+		while (true) {
+			const { done, value: entry } = await reader.read()
+			if (done) break
+			const name = entry.header.name
+			if (isMetaJson(name)) {
+				meta = JSON.parse(await new Response(entry.body).text()) as PierreMeta
+			} else if (isRecordFile(name)) {
+				const state = JSON.parse(
+					await new Response(entry.body).text()
+				) as RoomSnapshot['documents'][number]['state']
+				documents.push({ state, lastChangedClock: 0 })
+			} else {
+				await entry.body.cancel()
+			}
+		}
+	} finally {
+		reader.releaseLock()
+	}
+
+	if (!meta) {
+		throw new Error(`No meta.json found in Pierre archive at ref ${ref}`)
+	}
+
+	return {
+		documentClock: meta.documentClock,
+		schema: meta.schema,
+		documents,
+	}
+}
