restore: restore job payload exceeds Raft command size limit at 100k descriptors #170669

@kev-cao

RESTORE of a 100k-descriptor backup fails at job creation with:

ERROR: write-job-info-insert: command is too large: 110794348 bytes (max: 67108864)

The error fires from InfoStorage.write (pkg/jobs/job_info_storage.go:471), which serializes the entire Payload and writes it via a single INSERT INTO system.job_info ... VALUES ($1, $2, now(), $3). KV rejects the resulting CPut because it exceeds kv.raft.command.max_size (default 64 MiB).

The bulk of the 110 MB comes from RestoreDetails.TableDescs (pkg/backup/restore_planning.go:2353-2363), which stores the full descpb.TableDescriptor proto for every table being restored — ~1 KB per table × 100k tables ≈ 100 MB.

This is a hard wall at 100k descriptors and prohibitive at the 1M-descriptor goal (projected payload ~1 GB before any other growth).
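
To make the arithmetic above concrete, here is a rough back-of-the-envelope sketch. It assumes exactly 1 KiB per descriptor and ignores every other field in the payload. The function and constant names are made up for illustration and are not from the codebase. The observed payload (110794348 bytes) is somewhat larger than this estimate, so the real break-even point is lower.

```go
package main

import "fmt"

const (
	// Default kv.raft.command.max_size, matching the "max: 67108864" in the error.
	raftCommandMaxSize = 64 << 20
	// Assumption: ~1 KB per table descriptor, the rough figure used in this issue.
	approxDescBytes = 1 << 10
)

// estimatedPayloadBytes returns the approximate size of the descriptor portion
// of the payload for n tables, under the 1 KiB-per-descriptor assumption.
func estimatedPayloadBytes(n int64) int64 {
	return n * approxDescBytes
}

// maxDescriptorsPerWrite returns how many descriptors fit in a single
// command before the size limit is crossed, under the same assumption.
func maxDescriptorsPerWrite() int64 {
	return raftCommandMaxSize / approxDescBytes
}

func main() {
	for _, n := range []int64{100_000, 1_000_000} {
		size := estimatedPayloadBytes(n)
		fmt.Printf("%d descriptors: ~%d bytes, fits in one command: %t\n",
			n, size, size <= raftCommandMaxSize)
	}
	fmt.Println("approximate break-even descriptor count:", maxDescriptorsPerWrite())
}
```

Under these assumptions the single-row write stops fitting at roughly 65k descriptors, well short of 100k.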

Secondary writebacks compound the problem

Beyond the initial job-creation insert, the executor writes details.TableDescs back twice more during execution — after the prepare step (restore_job.go:2068) and after publishing (restore_job.go:3313) — each via the same single-row system.job_info write. So even if the initial insert were fixed in isolation, the executor would hit the same limit later in the restore.

Approaches

Approach A: chunked payload storage (smallest blast radius, unblocks 100k today). Split the serialized payload across multiple system.job_info rows under separate info_keys (e.g. legacy_payload/0..N) and reassemble it on read. No semantic changes; a bounded refactor in pkg/jobs. The same mechanism also covers the mid-execution writebacks.
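
A minimal sketch of the split and reassemble logic for Approach A, assuming an in-memory map stands in for the system.job_info rows. The helper names splitPayload and joinPayload and the exact key format are hypothetical, not existing pkg/jobs code. How the rows are written and read transactionally is left open here.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// chunkKeyPrefix follows the "legacy_payload/0..N" example from this issue.
// The exact key format is an assumption, not an existing pkg/jobs constant.
const chunkKeyPrefix = "legacy_payload/"

// splitPayload cuts payload into pieces of at most chunkSize bytes and keys
// each piece by its index, standing in for one system.job_info row per piece.
func splitPayload(payload []byte, chunkSize int) (map[string][]byte, error) {
	if chunkSize <= 0 {
		return nil, fmt.Errorf("chunk size must be positive, got %d", chunkSize)
	}
	chunks := make(map[string][]byte)
	for i := 0; len(payload) > 0; i++ {
		n := chunkSize
		if n > len(payload) {
			n = len(payload)
		}
		chunks[chunkKeyPrefix+strconv.Itoa(i)] = payload[:n]
		payload = payload[n:]
	}
	return chunks, nil
}

// joinPayload reassembles the chunks in index order. It fails on malformed or
// duplicate keys and on gaps, so a partial payload is never returned.
func joinPayload(chunks map[string][]byte) ([]byte, error) {
	byIndex := make(map[int][]byte, len(chunks))
	for key, data := range chunks {
		if !strings.HasPrefix(key, chunkKeyPrefix) {
			return nil, fmt.Errorf("unexpected info key %q", key)
		}
		i, err := strconv.Atoi(strings.TrimPrefix(key, chunkKeyPrefix))
		if err != nil || i < 0 {
			return nil, fmt.Errorf("malformed chunk key %q", key)
		}
		if _, dup := byIndex[i]; dup {
			return nil, fmt.Errorf("duplicate chunk index %d", i)
		}
		byIndex[i] = data
	}
	var payload []byte
	for i := 0; i < len(byIndex); i++ {
		data, ok := byIndex[i]
		if !ok {
			return nil, fmt.Errorf("missing chunk %d", i)
		}
		payload = append(payload, data...)
	}
	return payload, nil
}

func main() {
	payload := []byte("0123456789")
	chunks, err := splitPayload(payload, 4)
	if err != nil {
		panic(err)
	}
	restored, err := joinPayload(chunks)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(chunks), string(restored) == string(payload))
}
```

In practice the chunk size would need to sit comfortably below kv.raft.command.max_size to leave room for key and encoding overhead. The right margin is not something this sketch tries to determine.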

Approach B: store only IDs and rewrites, and re-derive descriptors at execution (better long-term shape). Drop TableDescs (and TypeDescs, SchemaDescs, DatabaseDescs, FunctionDescs) from RestoreDetails and persist only DescriptorRewrites plus state flags. At execution time, re-read the backup manifest (already loaded for data ingestion), apply DescriptorRewrites, and materialize descriptors in memory. This shrinks the canonical payload from O(descriptors × descriptor_size) to O(descriptors × ~16 bytes), making 1M tractable in steady state.
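
The re-derivation step in Approach B could look roughly like the sketch below. The descriptor and rewrite types are deliberately simplified, hypothetical stand-ins for descpb.TableDescriptor and a DescriptorRewrites entry (the real types carry more fields), and materialize is an illustrative name rather than an existing function. The two int64 fields in rewrite are where the ~16 bytes per descriptor figure comes from in this simplified model.

```go
package main

import "fmt"

// descriptor is a hypothetical, heavily simplified stand-in for a table
// descriptor as read from the backup manifest.
type descriptor struct {
	ID       int64
	ParentID int64
	Name     string
}

// rewrite is a hypothetical stand-in for one DescriptorRewrites entry: the
// only per-descriptor state the job payload would persist in this model.
type rewrite struct {
	ID       int64
	ParentID int64
}

// materialize rebuilds the descriptors being restored from the manifest.
// A manifest descriptor is included only if rewrites has an entry for its
// original ID; its ID and parent ID are then replaced with the new values.
func materialize(manifest []descriptor, rewrites map[int64]rewrite) ([]descriptor, error) {
	out := make([]descriptor, 0, len(rewrites))
	for _, d := range manifest {
		rw, ok := rewrites[d.ID]
		if !ok {
			continue // not part of this restore
		}
		// d is a copy of the manifest entry, so the manifest is left untouched.
		d.ID, d.ParentID = rw.ID, rw.ParentID
		out = append(out, d)
	}
	if len(out) != len(rewrites) {
		return nil, fmt.Errorf("rewrites cover %d descriptors but the manifest matched %d",
			len(rewrites), len(out))
	}
	return out, nil
}

func main() {
	manifest := []descriptor{
		{ID: 52, ParentID: 50, Name: "a"},
		{ID: 53, ParentID: 50, Name: "b"},
	}
	rewrites := map[int64]rewrite{53: {ID: 104, ParentID: 100}}
	descs, err := materialize(manifest, rewrites)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", descs)
}
```

The count check at the end is one way to catch a payload that no longer matches the manifest it was planned against. Whether that should be a hard error in the real executor is a design question for this approach.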

Epic: CRDB-62562

Jira issue: CRDB-64110
