Huge async jobs preallocate row-group metadata for the full logical dataset

### Priority Level

High (Major functionality broken)

### Describe the bug

Huge async jobs preallocate row-group metadata for the entire logical dataset before useful work starts. This creates memory overhead proportional to total row-group count even when the scheduler only admits a small active row-group horizon.

For fire-and-forget large jobs, a user can configure millions of records and small row groups, but memory is consumed up front for row groups that are not yet active.

### Steps/Code to reproduce bug

Use a setup path that prepares an async run without needing to complete the full job. Configure:

- hundreds of thousands to millions of row groups
- small row-group sizes, for example 2 rows per group
- around 16 columns
- default/public-style active row-group horizon or a small internal horizon
- no tracing/events required

Representative preparation-only measurements:

| Row groups | RSS delta |
| ---: | ---: |
| 10,000 | ~3.6 MiB |
| 50,000 | ~20 MiB |
| 100,000 | ~43 MiB |
| 250,000 | ~114 MiB |
| 500,000 | ~202 MiB |
| 1,000,000 | ~420 MiB |

The observed slope was roughly hundreds of RSS bytes per row group before active endpoint work began.

### Expected behavior

Memory should scale primarily with active admitted row groups, in-flight tasks, retained completed state, and output buffers. It should not scale linearly with all future row groups that have not yet been admitted.

### Agent Diagnostic / Prior Investigation

The investigation traced the preallocation to row-group metadata construction and copying across async builder/scheduler setup:

- the dataset builder constructs a full `row_groups` list for the logical dataset,
- `CompletionTracker.with_graph(...)` copies that list into a row-group-size dictionary,
- the scheduler stores the full row-group list and derives additional maps from it.

This happens before scheduler execution and before most row groups can become active.

### Additional context

Suggested direction:

- Represent row groups lazily or compactly instead of materializing every `(row_group_id, size)` pair.
- Let `CompletionTracker` validate/derive row-group sizes from a compact range representation.
- Keep per-row-group mutable state only for admitted or recently finalized row groups.
- Add a startup memory regression test for million-row-group configurations.

Local artifact paths and machine identifiers from the investigation were intentionally omitted from this issue.

### Checklist

- [x] I reproduced this issue or provided a minimal example
- [x] I searched the docs/issues myself, or had my agent do so
- [x] If I used an agent, I included its diagnostics above


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Huge async jobs preallocate row-group metadata for the full logical dataset #726

Priority Level

Describe the bug

Steps/Code to reproduce bug

Expected behavior

Agent Diagnostic / Prior Investigation

Additional context

Checklist

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Row groups	RSS delta
10,000	~3.6 MiB
50,000	~20 MiB
100,000	~43 MiB
250,000	~114 MiB
500,000	~202 MiB
1,000,000	~420 MiB

Huge async jobs preallocate row-group metadata for the full logical dataset #726

Description

Priority Level

Describe the bug

Steps/Code to reproduce bug

Expected behavior

Agent Diagnostic / Prior Investigation

Additional context

Checklist

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions