feat(export): stream /export/dump to R2 with DO alarm resumption (#59)
The legacy /export/dump route buffers the entire dump in memory and runs
synchronously, so it falls over on databases that exceed the 30s Worker
timeout or the Durable Object memory ceiling (currently 1GB, soon 10GB).
This change adds a streaming path that lives inside the Durable Object:
- POST /export/dump kicks off a job, opens an R2 multipart upload, and
returns 202 with a jobId. Supports format=sql|csv|json plus optional
callbackUrl/table/chunkSize.
- GET /export/dump/status/:jobId returns progress (tables, rows, bytes,
parts uploaded) and a downloadUrl once status is 'completed'.
- GET /export/dump/download/:jobId streams the finished object back from
R2 to the client.
- DELETE /export/dump/:jobId aborts an in-flight upload.
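
As a rough illustration of the job lifecycle listed above, a client might drive the endpoints as follows. This is a sketch under assumptions the commit does not confirm: options travel as a JSON body, the 202 response and the status view are flat JSON objects with top-level `jobId`, `status`, and `downloadUrl` fields, and authentication headers (omitted here) are added elsewhere. `FetchLike`, `startDump`, and `waitForDump` are hypothetical names.

```typescript
// Minimal fetch-shaped dependency so the sketch can run without a network.
type FetchLike = (
    url: string,
    init?: { method?: string; headers?: Record<string, string>; body?: string }
) => Promise<{ status: number; json(): Promise<unknown> }>

// ASSUMPTION: flat status view; the commit only names 'completed' and downloadUrl.
interface DumpStatus {
    status: string
    downloadUrl?: string
}

// Start a job and return its jobId. ASSUMPTION: options are sent as a JSON body.
async function startDump(
    fetchFn: FetchLike,
    baseUrl: string,
    format: 'sql' | 'csv' | 'json'
): Promise<string> {
    const res = await fetchFn(`${baseUrl}/export/dump`, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ format }),
    })
    if (res.status !== 202) throw new Error(`expected 202, got ${res.status}`)
    const body = (await res.json()) as { jobId?: string }
    if (!body.jobId) throw new Error('response carried no jobId')
    return body.jobId
}

// Poll the status route until the job reports 'completed', then return downloadUrl.
async function waitForDump(
    fetchFn: FetchLike,
    baseUrl: string,
    jobId: string,
    maxPolls = 60,
    delayMs = 2000
): Promise<string> {
    for (let i = 0; i < maxPolls; i++) {
        const res = await fetchFn(`${baseUrl}/export/dump/status/${jobId}`)
        const view = (await res.json()) as DumpStatus
        if (view.status === 'completed' && view.downloadUrl) return view.downloadUrl
        await new Promise<void>((resolve) => setTimeout(resolve, delayMs))
    }
    throw new Error(`job ${jobId} did not complete after ${maxPolls} polls`)
}
```

In a runtime with a global `fetch`, passing it as `fetchFn` should work; a real client would also want to stop polling early on whatever failure status the server reports, which this sketch does not model.
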
The engine paginates 1000 rows at a time, buffers up to the R2 multipart
5 MiB minimum, flushes parts as they fill, and budgets each tick at 20s.
When a tick yields, the leftover bytes are persisted to a temp R2 object
(DO storage values are capped at 128 KiB and cannot hold the buffer
directly). The DO alarm() handler dispatches dump work first, then falls
through to the existing cron logic, so the two co-exist on the same
alarm channel.
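
The buffering and tick budget described above can be sketched roughly as follows. This is not the code from this commit: `PartBuffer` and `runTick` are hypothetical names, the loop is synchronous for brevity (the real engine awaits queries and R2 uploads), and cutting parts at exactly 5 MiB is a choice made for the sketch.

```typescript
const PART_SIZE = 5 * 1024 * 1024 // R2 multipart minimum named in the commit message

class PartBuffer {
    private chunks: Uint8Array[] = []
    private buffered = 0

    get size(): number {
        return this.buffered
    }

    // Append bytes and return every full part that is now ready to upload.
    push(chunk: Uint8Array): Uint8Array[] {
        this.chunks.push(chunk)
        this.buffered += chunk.length
        const parts: Uint8Array[] = []
        while (this.buffered >= PART_SIZE) parts.push(this.take(PART_SIZE))
        return parts
    }

    // Remove and return the leftover bytes (persisted on yield, or sent as the last part).
    drain(): Uint8Array {
        return this.take(this.buffered)
    }

    // Copy the first n buffered bytes into one array and drop them from the queue.
    private take(n: number): Uint8Array {
        const out = new Uint8Array(n)
        let filled = 0
        while (filled < n) {
            const head = this.chunks[0]
            const want = n - filled
            if (head.length <= want) {
                out.set(head, filled)
                filled += head.length
                this.chunks.shift()
            } else {
                out.set(head.subarray(0, want), filled)
                this.chunks[0] = head.subarray(want)
                filled += want
            }
        }
        this.buffered -= n
        return out
    }
}

// One tick: consume row batches until the source is exhausted or the budget is spent.
function runTick(
    nextBatch: () => Uint8Array | null, // null means no rows remain
    buf: PartBuffer,
    uploadPart: (part: Uint8Array) => void,
    now: () => number,
    budgetMs = 20000
): 'done' | 'yield' {
    const deadline = now() + budgetMs
    while (now() < deadline) {
        const batch = nextBatch()
        if (batch === null) return 'done'
        for (const part of buf.push(batch)) uploadPart(part)
    }
    return 'yield'
}
```

On `'yield'` the caller would persist `buf.drain()` to the temp R2 object and schedule the next alarm; on `'done'` it would upload `buf.drain()` as the final part and complete the multipart upload.
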
A new [[r2_buckets]] binding named DATABASE_DUMPS gates the streaming
path. The legacy GET /export/dump remains untouched for small databases
and existing clients.
Tests: 17 new unit tests covering the engine (mid-tick yield/resume,
multipart flushing at the 5 MiB threshold, error abort, BLOB literals,
empty databases, CSV/JSON formats) and the HTTP routes.
The synchronous endpoint above buffers the whole dump in memory and is bounded by the 30-second Worker timeout. For databases that exceed that budget, switch to the streaming variant — it paginates rows, writes them straight into R2 over (potentially many) Durable Object alarm ticks, and lets you download the finished artifact when ready.

Add an R2 binding called `DATABASE_DUMPS` to your `wrangler.toml`:

<pre>
<code>
[[r2_buckets]]
binding = "DATABASE_DUMPS"
bucket_name = "starbasedb-dumps"
</code>
</pre>

Kick off the job (returns 202 with a `jobId`):

<pre>
<code>
curl -X POST 'https://starbasedb.YOUR-ID-HERE.workers.dev/export/dump' \
`format` may be `sql`, `csv`, or `json`. Optional fields: `callbackUrl` (POSTed the status view on completion), `table` (export a single table only), `chunkSize` (rows per SELECT batch; default 1000).
Poll the status, then download once it reads `completed`:
        sql: 'SELECT name FROM sqlite_master WHERE type = ? AND name = ?;',
        params: ['table', only],
        isRaw: false,
    })) as Record<string, SqlStorageValue>[]
    return exists.length ? [only] : []
}
const rows = (await this.executeQuery({
    sql: "SELECT name FROM sqlite_master WHERE type = 'table' AND name NOT LIKE 'sqlite_%' AND name NOT LIKE 'tmp_%' AND name NOT LIKE '_cf_%' ORDER BY name;",