RESTORE of a 100k-descriptor backup fails at job creation with:
ERROR: write-job-info-insert: command is too large: 110794348 bytes (max: 67108864)
The error fires from InfoStorage.write (pkg/jobs/job_info_storage.go:471), which serializes the entire Payload and writes it via a single INSERT INTO system.job_info ... VALUES ($1, $2, now(), $3). KV rejects the resulting CPut because it exceeds kv.raft.command.max_size (default 64 MiB).
The bulk of the 110 MB comes from RestoreDetails.TableDescs (pkg/backup/restore_planning.go:2353-2363), which stores the full descpb.TableDescriptor proto for every table being restored — ~1 KB per table × 100k tables ≈ 100 MB.
This is a hard failure at 100k descriptors (the rejected command is roughly 1.65× the 64 MiB limit) and prohibitive at the 1M-descriptor goal (projected payload ~1 GB before any other growth).
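The size arithmetic above can be checked with a short Go sketch. The helper name `estimatePayloadBytes` and the flat 1 KiB-per-descriptor figure are illustrative assumptions, not code or measurements from the repository; only the 64 MiB limit comes from the error message.

```go
package main

import "fmt"

// maxRaftCommandSize mirrors the 64 MiB default of kv.raft.command.max_size
// quoted in the error message (67108864 bytes).
const maxRaftCommandSize int64 = 64 << 20

// estimatePayloadBytes is a hypothetical helper: it assumes every descriptor
// serializes to the same average size, which is only a rough approximation.
func estimatePayloadBytes(numDescs, avgDescBytes int64) int64 {
	return numDescs * avgDescBytes
}

func main() {
	for _, n := range []int64{10_000, 100_000, 1_000_000} {
		size := estimatePayloadBytes(n, 1024) // assumed ~1 KiB per table descriptor
		fmt.Printf("%d descriptors: ~%d MiB, fits in one command: %t\n",
			n, size>>20, size <= maxRaftCommandSize)
	}
}
```

Under these assumptions the payload crosses the limit somewhere between 10k and 100k tables, which matches the failure reported at 100k.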
Secondary writebacks compound the problem
Beyond the initial job-creation insert, the executor writes details.TableDescs back twice more during execution — after the prepare step (restore_job.go:2068) and after publishing (restore_job.go:3313) — each via the same single-row system.job_info write. So even if the initial insert were fixed in isolation, the executor would hit the same limit later in the restore.
Approaches
Approach A — chunked payload storage (smallest blast radius, unblocks 100k today). Split the serialized payload across multiple system.job_info rows under separate info_keys (e.g. legacy_payload/0..N), reconstructed on read. No semantic changes; bounded refactor in pkg/jobs. Also applies to the mid-execution writebacks.
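A minimal sketch of the chunking idea in Approach A, assuming an in-memory map stands in for the `system.job_info` rows. The function names, the zero-padded key format, and the chunk size are hypothetical; the real change would live in `pkg/jobs` and use its storage API.

```go
package main

import (
	"bytes"
	"fmt"
)

// chunkKey builds a hypothetical info_key for chunk i. Zero-padding keeps
// the keys in chunk order when sorted lexicographically.
func chunkKey(i int) string {
	return fmt.Sprintf("legacy_payload/%06d", i)
}

// splitPayload cuts the serialized payload into pieces of at most chunkSize
// bytes, keyed by chunkKey. The returned slices alias the input buffer.
func splitPayload(payload []byte, chunkSize int) (map[string][]byte, error) {
	if chunkSize <= 0 {
		return nil, fmt.Errorf("chunk size must be positive, got %d", chunkSize)
	}
	rows := make(map[string][]byte)
	for i := 0; len(payload) > 0; i++ {
		n := chunkSize
		if n > len(payload) {
			n = len(payload)
		}
		rows[chunkKey(i)] = payload[:n]
		payload = payload[n:]
	}
	return rows, nil
}

// joinPayload reassembles the payload, failing if any chunk is missing.
func joinPayload(rows map[string][]byte) ([]byte, error) {
	var buf bytes.Buffer
	for i := 0; i < len(rows); i++ {
		chunk, ok := rows[chunkKey(i)]
		if !ok {
			return nil, fmt.Errorf("missing chunk %d of %d", i, len(rows))
		}
		buf.Write(chunk)
	}
	return buf.Bytes(), nil
}

func main() {
	payload := bytes.Repeat([]byte("d"), 10_000) // stand-in for a serialized Payload
	rows, err := splitPayload(payload, 4096)
	if err != nil {
		panic(err)
	}
	restored, err := joinPayload(rows)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(rows), bytes.Equal(payload, restored)) // prints: 3 true
}
```

The sketch does not cover how readers are kept from seeing a partially replaced chunk set, or how stale chunks are cleaned up when a later write produces fewer chunks. A real implementation would need to address both.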
Approach B — store only IDs and rewrites; re-derive descriptors at execution (better long-term shape). Drop TableDescs (and TypeDescs, SchemaDescs, DatabaseDescs, FunctionDescs) from RestoreDetails. Persist only DescriptorRewrites plus state flags. At execution time, re-read the backup manifest (already loaded for data ingestion), apply DescriptorRewrites, and materialize descriptors in memory. Shrinks the canonical payload from O(descriptors × descriptor_size) to O(descriptors × ~16 bytes), making 1M tractable in steady state.
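The re-derivation step in Approach B might look roughly like the following. The `desc` and `rewrite` types are simplified, hypothetical stand-ins for the real descriptor protos and `DescriptorRewrites` entries, and `materialize` is an illustrative name rather than an existing function.

```go
package main

import "fmt"

// desc is a hypothetical, heavily simplified descriptor as read from the
// backup manifest. The real proto carries far more state.
type desc struct {
	ID, ParentID uint32
	Name         string
}

// rewrite is a hypothetical small fixed-size record: the only per-descriptor
// state the job would persist under this approach.
type rewrite struct {
	ID, ParentID uint32
}

// materialize rebuilds the restored descriptors in memory by applying the
// persisted rewrites to the manifest's descriptors. Descriptors without a
// rewrite are treated as not part of this restore.
func materialize(manifest []desc, rewrites map[uint32]rewrite) ([]desc, error) {
	out := make([]desc, 0, len(rewrites))
	for _, d := range manifest {
		rw, ok := rewrites[d.ID]
		if !ok {
			continue
		}
		d.ID, d.ParentID = rw.ID, rw.ParentID // d is a copy; the manifest is untouched
		out = append(out, d)
	}
	if len(out) != len(rewrites) {
		return nil, fmt.Errorf("manifest covers %d of %d rewrites", len(out), len(rewrites))
	}
	return out, nil
}

func main() {
	manifest := []desc{
		{ID: 100, ParentID: 50, Name: "orders"},
		{ID: 101, ParentID: 50, Name: "customers"},
	}
	rewrites := map[uint32]rewrite{
		100: {ID: 200, ParentID: 60},
		101: {ID: 201, ParentID: 60},
	}
	restored, err := materialize(manifest, rewrites)
	if err != nil {
		panic(err)
	}
	for _, d := range restored {
		fmt.Printf("%s: id=%d parent=%d\n", d.Name, d.ID, d.ParentID)
	}
}
```

The point of the shape is that the persisted map grows with the number of descriptors but not with their size, while the full descriptors exist only in memory during execution.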
Epic: CRDB-62562
Jira issue: CRDB-64110