What did you do?
Bulk-create many TiCDC changefeeds whose tables have no row traffic.
Observed workload:
- Creating 800 changefeeds can OOM a 96 GB machine.
- Creating 400 changefeeds can quickly push TiCDC memory to about 60 GB and total CPU usage to about 93%.
- After roughly 3 minutes, memory drops back to about 17 GB.
Code inspection on upstream/master found that changefeed creation can schedule many maintainer bootstraps concurrently.
What did you expect to see?
Bulk changefeed creation should respect the configured scheduler concurrency limit and avoid launching hundreds of maintainer bootstraps at the same time.
Memory and CPU should increase gradually during creation, and should not spike high enough to OOM a machine that can comfortably run those changefeeds at steady state.
What did you see instead?
The coordinator is created with hard-coded scheduling settings:
- max task concurrency: 10000
- balance interval: time.Minute
This bypasses the server scheduler config, whose default max-task-concurrency is 10.
The basic scheduler uses this value as its batch size for absent changefeeds. When hundreds of changefeeds are bulk-created, many AddMaintainer operators can be issued almost at once.
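The batching behavior described above can be illustrated with a minimal Go sketch. The names `scheduleBatch` and `limit` are illustrative only, not TiCDC's actual identifiers; the point is that a hard-coded limit of 10000 lets all 800 absent changefeeds be scheduled in a single tick, while the server default of 10 would spread them out.

```go
package main

import "fmt"

// scheduleBatch takes at most `limit` absent changefeeds per scheduling
// tick. With limit=10000 the whole absent set is scheduled at once; with
// the configured default of 10, bootstraps are issued gradually.
func scheduleBatch(absent []string, limit int) []string {
	if limit < len(absent) {
		return absent[:limit]
	}
	return absent
}

func main() {
	absent := make([]string, 800)
	for i := range absent {
		absent[i] = fmt.Sprintf("cf-%d", i)
	}
	// Hard-coded limit of 10000: all 800 AddMaintainer operators at once.
	fmt.Println(len(scheduleBatch(absent, 10000))) // 800
	// Server default max-task-concurrency of 10: only 10 per tick.
	fmt.Println(len(scheduleBatch(absent, 10))) // 10
}
```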
Each maintainer bootstrap performs startup work even when tables have no row traffic, including loading table metadata from schema store and building schema/span info. loadAllPhysicalTablesAtTs currently also loads full table metadata before applying table filters. This makes creation-time memory and CPU scale poorly with the number of concurrently bootstrapping changefeeds.
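One possible shape of a fix is to apply the table filter to table names before loading full metadata, so filtered-out tables never incur the load cost. The sketch below is a hypothetical illustration of that ordering, not the actual loadAllPhysicalTablesAtTs implementation; `tableMeta`, `loadFilteredTables`, and the callbacks are all made-up names.

```go
package main

import (
	"fmt"
	"strings"
)

// tableMeta stands in for full table metadata; in TiCDC this would include
// schema and span info, which is what makes eager loading expensive.
type tableMeta struct {
	name    string
	columns []string // placeholder for the bulk of the metadata
}

// loadFilteredTables applies the filter to names first and only loads full
// metadata for tables that pass, instead of loading everything up front.
func loadFilteredTables(names []string, keep func(string) bool,
	load func(string) tableMeta) []tableMeta {
	var out []tableMeta
	for _, n := range names {
		if !keep(n) { // filter before the expensive load
			continue
		}
		out = append(out, load(n))
	}
	return out
}

func main() {
	names := []string{"test.t1", "test.t2", "metrics.m1"}
	keep := func(n string) bool { return strings.HasPrefix(n, "test.") }
	load := func(n string) tableMeta {
		return tableMeta{name: n, columns: []string{"id"}}
	}
	for _, m := range loadFilteredTables(names, keep, load) {
		fmt.Println(m.name)
	}
}
```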
Versions of the cluster
Upstream TiDB cluster version (execute SELECT tidb_version(); in a MySQL client):
Not captured from the original workload.
Upstream TiKV version (execute tikv-server --version):
Not captured from the original workload.
TiCDC version (execute cdc version):
Code issue verified by inspection on upstream/master at 0a418b4132466aa084517ec7137b3d5f24013dcc.