[Feature]: Expose per-phase timings and RSTEST_PROFILE trace output for diagnosing slow runs

## What problem does this feature solve?

When users report that test runs are slow, the current summary lacks the granularity to locate the bottleneck:

```
Duration 3149s (build 39s, tests 51m 50s)
```

The single `tests` number collapses worker pool startup, environment setup (jsdom / happy-dom), setup files loading, test file import, test execution, coverage collection, and teardown into one value. When a user opens an issue saying "my tests are slow", there is no data to distinguish whether the cost comes from barrel-file imports, setup files, environment creation, or something else — which means we either have to guess or ask the user to re-run with ad-hoc instrumentation, and the feedback loop often breaks.

A real-world example: in #1148 (1135 files, coverage enabled, ~52 min total) Vitest on the same repo reports `(transform 291s, setup 2305s, import 3118s, tests 155s, environment 265s)`, which immediately points at `setup` and `import` as the dominant phases. Rstest cannot surface equivalent data today.

The existing [profiling guide](https://rstest.rs/guide/advanced/profiling) lists general-purpose fallbacks (Rsdoctor for build, Samply for CPU sampling, `--heap-prof` for memory), but none of them answer "which Rstest phase is slow" without significant manual interpretation — they either cover only the build side or require reading function-level flamegraphs. We need first-party, phase-level instrumentation so that:

1. The default summary carries enough data to triage remote bug reports from the initial paste.
2. Users can opt into a deeper trace and attach it to issues for us to analyze without needing to reproduce.

## What does the proposed API look like?

Two layers, separated by cost.

### Layer 1 — enriched default summary (always on)

Replace the single-line duration with a per-phase breakdown. Numbers only, no interpretation or hints:

```
Test Files  1135 passed
     Tests  9191 passed | 16 skipped | 2 todo (9209)
  Duration  3149s
      build          39s
      prepare        12s
      env-setup       8s
      setup-files   240s
      import       1800s
      tests         800s
      coverage      250s
      teardown      0.3s
```

Phase times after `build` are aggregated across workers (worker-time, not wall-clock); the output will label this so users don't compare against the wall-clock total and get confused.

### Layer 2 — opt-in trace file

Produce a [Chrome Trace Event Format](https://docs.google.com/document/d/1CvAClvFfyA5R-PhYUmn5OOQtYMH4h6I0nSsKchNAySU/preview) JSON file, viewable at [ui.perfetto.dev](https://ui.perfetto.dev). Each worker becomes a separate swim lane, each phase a span, so parallelism gaps and outlier files are visible at a glance.

Enabled via env var (matches `RSPACK_PROFILE` naming used by Rspack / Rsbuild):

```bash
RSTEST_PROFILE=1 rstest run
# writes ./rstest-profile.json

RSTEST_PROFILE=/tmp/trace.json rstest run
# writes to the specified path
```

Or via CLI flag for local iteration:

```bash
rstest run --profile
rstest run --profile /tmp/trace.json
```

### Phases captured

- `build` — Rsbuild compile
- `prepare` — worker spin-up and runtime init
- `env-setup` — jsdom / happy-dom setup + teardown
- `setup-files` — each setup entry load
- `import` — test file module evaluation
- `tests` — `it()` execution aggregate
- `coverage` — istanbul collect / merge
- `teardown` — per-file cleanup

### Relationship to existing profiling tools

| Tool | Scope |
|---|---|
| `RSTEST_PROFILE` (new) | Rstest phase-level spans — triage which phase is slow |
| Rsdoctor | Build-side details (bundle, loaders, plugins) |
| Samply / `--cpu-prof` | Function-level CPU sampling |
| `--heap-prof` | Memory |

`RSTEST_PROFILE` becomes the first-line tool; the others remain for deeper drill-downs once the phase is known.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[Feature]: Expose per-phase timings and RSTEST_PROFILE trace output for diagnosing slow runs #1195

What problem does this feature solve?

What does the proposed API look like?

Layer 1 — enriched default summary (always on)

Layer 2 — opt-in trace file

Phases captured

Relationship to existing profiling tools

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Tool	Scope
`RSTEST_PROFILE` (new)	Rstest phase-level spans — triage which phase is slow
Rsdoctor	Build-side details (bundle, loaders, plugins)
Samply / `--cpu-prof`	Function-level CPU sampling
`--heap-prof`	Memory

Uh oh!

[Feature]: Expose per-phase timings and RSTEST_PROFILE trace output for diagnosing slow runs #1195

Description

What problem does this feature solve?

What does the proposed API look like?

Layer 1 — enriched default summary (always on)

Layer 2 — opt-in trace file

Phases captured

Relationship to existing profiling tools

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions