Closed
136 commits
91b3cde
feat: InferenceX AgentX MVP scenario, Weka agentic-coding trace repla…
ajcasagrande May 1, 2026
4b895bb
fix: test fixes for unit, component-integration, and integration suites
ajcasagrande May 2, 2026
9616b77
feat: metrics-accumulator pipeline + DAG/agentic-replay hardening + r…
ajcasagrande May 4, 2026
438b311
feat(metrics): expand API usage metrics with multi-vendor support
ajcasagrande May 4, 2026
7e0ac05
feat(metrics): EFFECTIVE/ACTIVE console groups for analyzer outputs
ajcasagrande May 4, 2026
81455ee
feat(metrics): finalize console groups - DEFAULT last + clean IPC/pub…
ajcasagrande May 5, 2026
24d6739
Merge remote-tracking branch 'origin/main' into ajc/inferencex-agentx…
ajcasagrande May 5, 2026
43b473c
docs(specs): dag5 best-of-both DAG branch design
ajcasagrande May 5, 2026
350ea9b
docs(specs): fix attribution and path errors in dag5 spec
ajcasagrande May 5, 2026
2ffb569
docs(plans): dag5 Plan 1 foundation implementation plan
ajcasagrande May 5, 2026
6bffde5
docs(tutorials): clarify agentx-mvp scenario semantics and error mess…
ajcasagrande May 5, 2026
7d342c5
docs+validator: phrase agentic_replay as scenario-driven mode
ajcasagrande May 5, 2026
29418ea
weka: add live-assistant-response mode (env-gated, default off)
cquil11 May 5, 2026
b9f44ea
agentic_replay: terminate trajectory on mid-turn context-overflow
cquil11 May 5, 2026
dc943e7
records_manager: fix _render_realtime_block to use PhaseRecordsStats
cquil11 May 5, 2026
7e6b2e0
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 5, 2026
26d1e3a
records_manager: enrich realtime stats line with input throughput + I…
cquil11 May 5, 2026
9b2ed6c
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 5, 2026
64a8b0a
records_manager: realtime ITL row uses scalar inter_token_latency
cquil11 May 5, 2026
5af870f
records_manager: realtime tput row uses total_token_throughput
cquil11 May 5, 2026
7e20991
metrics: add InputTokenThroughputMetric (system-level prefill TPS)
cquil11 May 5, 2026
b6ebc19
records_manager: server-side cumulative stats in realtime block
cquil11 May 5, 2026
9902357
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 5, 2026
56e8209
metrics: input_token_throughput shows in realtime block (fix console_…
cquil11 May 6, 2026
1eaded4
fix(timing): drain-observer hook eliminates DAG completion-signal rac…
ajcasagrande May 6, 2026
903d1e6
test(timing): regression suite for drain-observer hook (commit 1eaded…
ajcasagrande May 6, 2026
88a73f9
records_manager: fix NameError in server_snapshot exception handler
cquil11 May 6, 2026
95d25cc
diag: SIGUSR1 -> dump_traceback_all for live hang debugging
cquil11 May 6, 2026
9b3c3d7
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 6, 2026
d6d2944
fix(config): allow --warmup-grace-period under agentic_replay without…
ajcasagrande May 6, 2026
ddc9711
orchestrator: skip DAG spawn during WARMUP
cquil11 May 6, 2026
8157cb0
records_manager: realtime block adds p75, per-user tput percentiles, …
cquil11 May 6, 2026
77e4d2f
records_manager: realtime block reads raw_metrics (not filtered)
cquil11 May 6, 2026
a002253
fix(timing): skip DAG branch spawn during WARMUP credit phase
ajcasagrande May 7, 2026
9874cc0
docs: explain --use-server-token-count fixes OSL mismatch under ignor…
ajcasagrande May 7, 2026
b2451e1
docs(architecture): expand Credit System with lifecycle, fields, and …
ajcasagrande May 7, 2026
9e03d93
fix(scenario): make context-overflow rate threshold configurable via env
ajcasagrande May 7, 2026
2201efc
agentic_replay: terminate trajectory on mid-turn context-overflow
cquil11 May 5, 2026
95f3cff
records_manager: fix _render_realtime_block to use PhaseRecordsStats
cquil11 May 5, 2026
2da25a7
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 5, 2026
523cdef
records_manager: enrich realtime stats line with input throughput + I…
cquil11 May 5, 2026
5a70bb8
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 5, 2026
7f72266
records_manager: realtime ITL row uses scalar inter_token_latency
cquil11 May 5, 2026
d7b4a7d
records_manager: realtime tput row uses total_token_throughput
cquil11 May 5, 2026
423ebd3
metrics: add InputTokenThroughputMetric (system-level prefill TPS)
cquil11 May 5, 2026
81ee574
records_manager: server-side cumulative stats in realtime block
cquil11 May 5, 2026
50b037e
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 5, 2026
9feb8f1
metrics: input_token_throughput shows in realtime block (fix console_…
cquil11 May 6, 2026
84be6c6
records_manager: fix NameError in server_snapshot exception handler
cquil11 May 6, 2026
b331dcf
diag: SIGUSR1 -> dump_traceback_all for live hang debugging
cquil11 May 6, 2026
676e0ed
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 6, 2026
31a123c
records_manager: realtime block adds p75, per-user tput percentiles, …
cquil11 May 6, 2026
1472c81
records_manager: realtime block reads raw_metrics (not filtered)
cquil11 May 6, 2026
f586d8b
feat(dataset): content-addressed mmap cache for tokenized datasets
ajcasagrande May 7, 2026
9165125
fix(scenario): auto-inject locked --cache-bust like other scenario fi…
ajcasagrande May 7, 2026
55b8e9e
weka: add live-assistant-response mode (env-gated, default off)
cquil11 May 5, 2026
053f8d8
chore: regenerate ruff baseline after Cameron cherry-pick stack
ajcasagrande May 7, 2026
f12e0e6
chore: post-cherry-pick lint cleanup
ajcasagrande May 7, 2026
aa56ced
Merge cquil11/aiperf cjq/weka-live-assistant-responses
ajcasagrande May 7, 2026
dc152ae
test(timing): harden Environment.DAG.FAIL_FAST monkeypatch against xd…
ajcasagrande May 7, 2026
8afa0a9
fix: harden integration suite, faulthandler, and dataset cache keys
ajcasagrande May 7, 2026
a9c48fe
feat(exporter): write console output to profile_export_console.txt
ajcasagrande May 7, 2026
894062b
realtime: surface cpu_kv_usage, queue depth, sglang retractions
cquil11 May 7, 2026
80b4d08
fix(composer): clamp max_tokens to >= 1 to avoid server rejection
ajcasagrande May 7, 2026
8ac10e1
feat(input): --max-context-length to pre-filter oversized conversations
ajcasagrande May 7, 2026
e40bf1f
agentic_replay: recycle queue spans full pool with active-trace skip
ajcasagrande May 7, 2026
b55d884
fix(scenario): InferenceX AgentX cache-bust target -> FIRST_TURN_PREFIX
ajcasagrande May 7, 2026
c20d19b
Merge cquil11/aiperf cjq/weka-live-assistant-responses (894062b65)
ajcasagrande May 7, 2026
e53ab77
refactor(accumulator): hoist realtime_snapshot's inner helpers to met…
ajcasagrande May 7, 2026
70fecb2
refactor(input): move --max-context-length filter into WekaTraceLoader
ajcasagrande May 7, 2026
7723e14
fix: stop setting HF_HUB_OFFLINE in parent process
ishandhanani May 7, 2026
bcccfb8
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 7, 2026
0d292da
docs(specs): design for `aiperf report weka-trace` HTML reports
ajcasagrande May 8, 2026
963f8fc
docs(plans): implementation plan for `aiperf report weka-trace`
ajcasagrande May 8, 2026
a773d10
feat(reporting): add light weka -> ParsedTurn reader (parent-only)
ajcasagrande May 8, 2026
63d628d
test(reporting): cover directory mode + duplicate trace id
ajcasagrande May 8, 2026
fb2e0cc
test(reporting): tighten directory-order assertion to explicit list
ajcasagrande May 8, 2026
1ea33bc
feat(reporting): emit subagent sessions from weka light reader
ajcasagrande May 8, 2026
c05bf58
feat(reporting): add max_context_length pre-filter to weka reader
ajcasagrande May 8, 2026
2291b8e
feat(reporting): add parsed_to_sim_sessions for simulation HTML
ajcasagrande May 8, 2026
e1e295c
fix(reporting): skip target-vs-observed table when no comparisons
ajcasagrande May 8, 2026
13bd822
feat(reporting): write_cache_structure accepts block_size override
ajcasagrande May 8, 2026
00f7120
feat(cli): add `aiperf report weka-trace` HTML report command
ajcasagrande May 8, 2026
460d083
fix(reporting): suppress target-vs-observed when manifest is None
ajcasagrande May 8, 2026
4570860
fix(reporting): honor weka trace semantics in HTML reports
ajcasagrande May 8, 2026
c5385f8
test(weka): add async-subagent + parallel-inner regression fixture
ajcasagrande May 11, 2026
84c6433
feat(weka): reclassify subagent branch as async when sa_end > followi…
ajcasagrande May 11, 2026
2a7f82f
feat(weka): split overlapping subagent inner requests into parallel s…
ajcasagrande May 11, 2026
531c0a3
feat(weka): mirror async-detection and stream-splitting in parallel r…
ajcasagrande May 11, 2026
25e6541
test(weka): regression for real async-subagent trace with parallel in…
ajcasagrande May 11, 2026
2ef4968
docs(plans): plan for async-subagent + parallel-inner-request support
ajcasagrande May 11, 2026
8776da1
feat(weka): honor per-trace block_size with user-override precedence
ajcasagrande May 12, 2026
b05cbf9
Merge PR #899: fix: stop setting HF_HUB_OFFLINE in parent process
ajcasagrande May 12, 2026
9ca6e40
style(weka): ruff-format wrap long lines in block_size test
ajcasagrande May 12, 2026
1848624
feat(weka): relax init_turn_0 to synthesize tail when hash_ids are tr…
ajcasagrande May 12, 2026
41e8fce
feat(weka): relax advance_turn to synthesize tail when curr_hash_ids …
ajcasagrande May 12, 2026
f0a3fb3
test(weka): remove block_size workaround now that loader honors trace…
ajcasagrande May 12, 2026
55d1f4b
build: track transformers main for deepseek v4 support
ajcasagrande May 12, 2026
fb2d982
remove
ajcasagrande May 12, 2026
acbfa05
update docs
ajcasagrande May 12, 2026
65cb61f
docs(specs): trajectory reuse + user_config trace-scan bug fix design
ajcasagrande May 12, 2026
d84a31e
fix: in-flight edits (user_config log spam, DAG-child recycle, tokeni…
ajcasagrande May 12, 2026
db17d9e
plan: trajectory reuse + user_config trace-scan bug fix
ajcasagrande May 12, 2026
2178dcc
fix(user_config): guard timestamp scan against bare-scalar JSON lines
ajcasagrande May 12, 2026
fd69129
feat(trajectory_source): add wrap-fill helper for lane reuse
ajcasagrande May 12, 2026
2bf230c
feat(trajectory_source): auto wrap-fill when concurrency > pool
ajcasagrande May 12, 2026
789467e
feat(agentic_replay): Counter-based _active_traces + _lanes_per_trace
ajcasagrande May 12, 2026
9571f9a
feat(agentic_replay): correlation-id-keyed double-recycle guard
ajcasagrande May 12, 2026
74a4e55
feat(agentic_replay): WARN on wrap-fill with cache_bust=NONE
ajcasagrande May 12, 2026
4163b66
refactor: drop InsufficientTrajectoriesError, supplant w/ wrap-fill
ajcasagrande May 12, 2026
555ba25
test(agentic_replay): E2E wrap-fill happy path
ajcasagrande May 12, 2026
e265ad5
test(agentic_replay): mint unique correlation_ids in component-integr…
ajcasagrande May 12, 2026
8f44b2f
docs(whitepapers): add Effective vs Active metrics technical brief
ajcasagrande May 12, 2026
85652cf
feat(dataset): add semianalysis_cc_traces_weka_no_subagents HF loader
ajcasagrande May 12, 2026
fef78a9
feat(scenario): switch AgentX MVP default corpus to no-subagents 0512…
ajcasagrande May 12, 2026
e9a0092
loader(weka): switch HF dataset to cc-traces-weka-no-subagents-051226
cquil11 May 12, 2026
7da6376
feat(mmap-cache): flock-serialized populates for safe shared-cache use
cquil11 May 12, 2026
fc4bc99
realtime: show ISL/OSL p50/p75/p90/p99 instead of just avg
cquil11 May 13, 2026
d765c87
agentic_replay: log per-warmup-completion progress for non-TTY runs
cquil11 May 13, 2026
7f62d3f
realtime: replace tin/tout per-user rows with single interactivity row
cquil11 May 13, 2026
11ed1f5
realtime: add server-side running-average token throughput to srv row
cquil11 May 13, 2026
4efdd6e
[RecordProcessor] Drop context-overflow records for AGENTIC_REPLAY sc…
camatsemianalysis May 15, 2026
8f41bc7
[LoadGen] Add --failed-request-threshold for in-flight abort
camatsemianalysis May 15, 2026
343f33c
[AgenticReplay] Configurable trajectory start range + per-trajectory …
camatsemianalysis May 15, 2026
fccb847
[AgenticReplay] Log trajectory start-position table at construction
camatsemianalysis May 15, 2026
7d880a1
[RecordProcessor] Fix context-overflow drop hanging at end-of-phase
camatsemianalysis May 15, 2026
929aa76
Merge remote-tracking branch 'upstream/ajc/inferencex-agentx-mvp' int…
cquil11 May 18, 2026
9b858ae
fix(timing): wake outer runner awaits on cancel to avoid full-phase hang
cquil11 May 18, 2026
8aad400
quiet noisy phase-shutdown warnings + tqdm default in non-tty
cquil11 May 19, 2026
a6812b0
fix: UIType.TQDM does not exist — use UIType.SIMPLE
cquil11 May 19, 2026
4eb1a73
Revert "fix: UIType.TQDM does not exist — use UIType.SIMPLE"
cquil11 May 19, 2026
2f30ea8
Revert "quiet noisy phase-shutdown warnings + tqdm default in non-tty"
cquil11 May 19, 2026
61a9ed8
trajectory_source: surface per-lane start-token counts in summary log
cquil11 May 19, 2026
a2b9d6b
bump cc-traces-weka-no-subagents dataset to 051826 (98 traces)
cquil11 May 19, 2026
90c93ab
Revert "trajectory_source: surface per-lane start-token counts in sum…
cquil11 May 19, 2026
a61553f
records_manager: drop preemptions from realtime server-side log row
cquil11 May 19, 2026
42 changes: 16 additions & 26 deletions .cursor/rules/python.mdc
@@ -10,7 +10,7 @@ SPDX-License-Identifier: Apache-2.0

 # AIPerf

-Python 3.10+ async AI benchmarking tool for measuring LLM inference server performance. 9 services communicate via ZMQ message bus.
+Python 3.10+ async AI benchmarking tool for measuring LLM inference server performance. 10 services communicate via ZMQ message bus.

 **Reference documentation:**
 - [`docs/architecture.md`](docs/architecture.md) - Three-plane architecture, core components, credit system, data flow, communication patterns
@@ -31,7 +31,7 @@ Python 3.10+ async AI benchmarking tool for measuring LLM inference server perfo
 - `BaseComponentService` for services, `BaseService` for SystemController only.
 - Message bus for inter-service communication - no shared mutable state.
 - CLI commands: one file per command in `cli_commands/`, lazily loaded via import strings in `cli.py`. See `docs/dev/patterns.md`.
-- YAML plugin registry for extensible features (`plugins.yaml`).
+- YAML plugin registry for extensible features (`src/aiperf/plugin/plugins.yaml`).
 - Lambda for expensive logs: `self.debug(lambda: f"{self._x()}")`. Direct string for cheap ones.
 - Always `orjson.loads(s)`, `orjson.dumps(d)` for JSON.
 - No `Optional[X]` or `Union[X, Y]` - use `X | Y`.
@@ -42,6 +42,7 @@ Python 3.10+ async AI benchmarking tool for measuring LLM inference server perfo
 - Do not create markdown files to document code changes or decisions.
 - Do not over-comment code. Removing code is fine without adding comments to explain why.
 - No emojis in code or comments.
+- Hide a metric from the console table with `console_group = MetricConsoleGroup.NONE`; group it into a separate section with `MetricConsoleGroup.{USAGE,CACHE,PREDICTION,AUDIO,REASONING}`. Default is `DEFAULT`. See `docs/metrics-reference.md` "Metric Console Group Reference".

 ## Build and Test Commands

@@ -68,27 +69,27 @@ pre-commit run # Staged files only
 pre-commit run --all-files # All files (recommended after significant changes)
 ```

-Hooks: `check-ast`, `debug-statements`, `detect-private-key`, `check-added-large-files`, `check-case-conflict`, `check-executables-have-shebangs`, `check-merge-conflict`, `check-json`, `check-toml`, `check-yaml`, `check-shebang-scripts-are-executable`, `end-of-file-fixer`, `mixed-line-ending`, `no-commit-to-branch`, `requirements-txt-fixer`, `trailing-whitespace`, `codespell`, `add-license`, `generate-cli-docs`, `generate-env-vars-docs`, `generate-plugin-artifacts`, `validate-plugin-schemas`, `test-imports`, `check-agent-files-sync`, `check-ergonomics`, `check-ruff-baselined`, `ruff`, `ruff-format`.
+Hooks: `check-ast`, `debug-statements`, `detect-private-key`, `check-added-large-files`, `check-case-conflict`, `check-merge-conflict`, `check-executables-have-shebangs`, `check-shebang-scripts-are-executable`, `check-json`, `check-toml`, `check-yaml`, `end-of-file-fixer`, `trailing-whitespace`, `mixed-line-ending`, `no-commit-to-branch`, `requirements-txt-fixer`, `codespell`, `add-license`, `generate-cli-docs`, `generate-env-vars-docs`, `generate-plugin-artifacts`, `validate-plugin-schemas`, `test-imports`, `ruff`, `ruff-format`.

 ## Adding a New Service

 1. Create class extending `BaseComponentService` with `@on_message` handlers
-2. Register in `plugins.yaml` under `service` category with `class`, `description`, `metadata`
-3. Add message type to `common/enums/enums.py` if new messages needed
-4. Create message class in `messages/` with `message_type` field
-5. Validate with `aiperf plugins --validate`
+2. Register in `src/aiperf/plugin/plugins.yaml` under `service` category with `class`, `description`, `metadata`
+3. Add message type to `src/aiperf/common/enums/enums.py` if new messages needed
+4. Create message class in `src/aiperf/common/messages/` with `message_type` field
+5. Validate with `make validate-plugin-schemas`

 ## Adding a New Message

-1. Add enum value to `MessageType` in `common/enums/enums.py`
-2. Create message class in `messages/` inheriting from `Message` with `message_type` field set
+1. Add enum value to `MessageType` in `src/aiperf/common/enums/enums.py`
+2. Create message class in `src/aiperf/common/messages/` inheriting from `Message` with `message_type` field set
 3. Add `@on_message(MessageType.X)` handler in the receiving service
 4. Auto-subscription happens during `@on_init` phase

 ## Adding a New Plugin

 1. Create plugin class implementing the appropriate base
-2. Add entry to `plugins.yaml` with `class`, `description`, `metadata`
+2. Add entry to `src/aiperf/plugin/plugins.yaml` with `class`, `description`, `metadata`
 3. Validate with `make validate-plugin-schemas`
 4. Use via `plugins.get_class(PluginType.X, 'name')`

@@ -98,16 +99,6 @@ Hooks: `check-ast`, `debug-statements`, `detect-private-key`, `check-added-large
 - `from tests.harness import mock_plugin` for plugin mocking
 - Name: `test_<function>_<scenario>_<expected>` e.g. `test_parse_config_missing_field_raises_error`
 - Imports at file top, fixtures for setup, one focus per test
-- Use `from pytest import param` and put `# fmt: skip` on the `)` line:
-```python
-@pytest.mark.parametrize(
-    "arg",
-    [
-        param(..., id="case1"),
-        param(..., id="case2"),
-    ],
-) # fmt: skip
-```
 - Auto-fixtures (always active): asyncio.sleep runs instantly, RNG=42, singletons reset between tests

 ## Git Workflow
@@ -123,6 +114,7 @@ Feature branches use `<username>/feature-name` format, forked from `main`. One P
 - Decorators: `@on_init`, `@on_start`, `@on_stop`, `@on_message`, `@on_command`, `@background_task`, `@on_pull_message`, `@on_request`.
 - Communication: `publish()` for broadcast, `@on_message` to subscribe, `send_command_and_wait_for_response()` for sync.
 - `AIPerfLifecycleMixin` for standalone components: `CREATED` -> `INITIALIZING` -> `INITIALIZED` -> `STARTING` -> `RUNNING` -> `STOPPING` -> `STOPPED`; `FAILED` terminal.
+- `dag_jsonl` input type: conversation DAG benchmarks (fork + spawn modes); see `docs/benchmark-modes/dag.md`.

 ## Pre-Commit Checklist

@@ -133,20 +125,18 @@ Feature branches use `<username>/feature-name` format, forked from `main`. One P
 5. `Field(description=...)` on all Pydantic fields
 6. `git commit -s`

-## Four-File Sync Rule
+## Three-File Sync Rule

-`AGENTS.md`, `CLAUDE.md`, `.github/copilot-instructions.md`, and `.cursor/rules/python.mdc` must contain identical content (only headers/frontmatter differ). When updating one, update all four. Run `make check-agent-files-sync` after editing to confirm sync — pre-commit enforces this on every commit that touches one of these files.
+`CLAUDE.md`, `.github/copilot-instructions.md`, and `.cursor/rules/python.mdc` must contain identical content (only headers/frontmatter differ). When updating one, update all three. Always diff them after editing to confirm sync.

 ## Documentation Updates

-> **DOCUMENTATION IS REQUIRED, NOT OPTIONAL.** Any PR that adds or changes a feature, CLI option, env var, plugin, message type, or service without updating the relevant docs is incomplete and will not be merged.
-
-When making changes, update the appropriate documentation files using the table below. When adding a new tutorial, also add it to `README.md`'s tutorial index. **Any new file under `docs/` must also be added to `docs/index.yml`** (the Fern site index) — `tools/check_docs_index.py` enforces this in CI. If the change is internal-only and not user-facing (e.g. developer reference, internal mechanics, debugging notes), put the doc under `docs/reference/` rather than skipping documentation.
+When making changes, update the appropriate documentation files. When adding a new tutorial, also add it to `README.md`'s tutorial index.

 | Change type | Files to update |
 |---|---|
 | Architecture, components, data flow, communication | `docs/architecture.md` |
-| Coding standards, build commands, new patterns | `AGENTS.md` + `CLAUDE.md` + `.github/copilot-instructions.md` + `.cursor/rules/python.mdc` |
+| Coding standards, build commands, new patterns | `CLAUDE.md` + `.github/copilot-instructions.md` + `.cursor/rules/python.mdc` |
 | Code patterns, examples, base classes | `docs/dev/patterns.md` |
 | CLI arguments or commands | `docs/cli-options.md` (auto-generated via `make generate-cli-docs`) |
 | Environment variables | `docs/environment-variables.md` (auto-generated via `make generate-env-vars-docs`) |
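
Reviewer note: the sync rule in this file asks that the agent-instruction files stay identical apart from headers/frontmatter, and that they be diffed after editing. A minimal sketch of such a check follows. It assumes the only per-file difference is a leading `---` delimited frontmatter block; the helper names `strip_frontmatter` and `out_of_sync` are hypothetical and not part of the repository, and real files may carry other header forms (for example license comments) that this sketch does not strip.

```python
from pathlib import Path


def strip_frontmatter(text: str) -> str:
    """Drop a leading '---' delimited frontmatter block, if one is present."""
    lines = text.splitlines()
    if lines and lines[0].strip() == "---":
        for i in range(1, len(lines)):
            if lines[i].strip() == "---":
                # Everything after the closing delimiter is the shared body.
                return "\n".join(lines[i + 1:]).strip()
    # No frontmatter (or an unterminated one): compare the text as-is.
    return text.strip()


def out_of_sync(paths: list[Path]) -> list[Path]:
    """Return the paths whose body differs from the body of the first path."""
    reference = strip_frontmatter(paths[0].read_text(encoding="utf-8"))
    return [
        p
        for p in paths[1:]
        if strip_frontmatter(p.read_text(encoding="utf-8")) != reference
    ]
```

Called with the three paths named in the rule, an empty result would mean the bodies match; any returned path is a file to re-diff by hand.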
42 changes: 16 additions & 26 deletions .github/copilot-instructions.md
@@ -5,7 +5,7 @@ SPDX-License-Identifier: Apache-2.0

 # AIPerf

-Python 3.10+ async AI benchmarking tool for measuring LLM inference server performance. 9 services communicate via ZMQ message bus.
+Python 3.10+ async AI benchmarking tool for measuring LLM inference server performance. 10 services communicate via ZMQ message bus.

 **Reference documentation:**
 - [`docs/architecture.md`](docs/architecture.md) - Three-plane architecture, core components, credit system, data flow, communication patterns
@@ -26,7 +26,7 @@ Python 3.10+ async AI benchmarking tool for measuring LLM inference server perfo
 - `BaseComponentService` for services, `BaseService` for SystemController only.
 - Message bus for inter-service communication - no shared mutable state.
 - CLI commands: one file per command in `cli_commands/`, lazily loaded via import strings in `cli.py`. See `docs/dev/patterns.md`.
-- YAML plugin registry for extensible features (`plugins.yaml`).
+- YAML plugin registry for extensible features (`src/aiperf/plugin/plugins.yaml`).
 - Lambda for expensive logs: `self.debug(lambda: f"{self._x()}")`. Direct string for cheap ones.
 - Always `orjson.loads(s)`, `orjson.dumps(d)` for JSON.
 - No `Optional[X]` or `Union[X, Y]` - use `X | Y`.
@@ -37,6 +37,7 @@ Python 3.10+ async AI benchmarking tool for measuring LLM inference server perfo
 - Do not create markdown files to document code changes or decisions.
 - Do not over-comment code. Removing code is fine without adding comments to explain why.
 - No emojis in code or comments.
+- Hide a metric from the console table with `console_group = MetricConsoleGroup.NONE`; group it into a separate section with `MetricConsoleGroup.{USAGE,CACHE,PREDICTION,AUDIO,REASONING}`. Default is `DEFAULT`. See `docs/metrics-reference.md` "Metric Console Group Reference".

 ## Build and Test Commands

@@ -63,27 +64,27 @@ pre-commit run # Staged files only
 pre-commit run --all-files # All files (recommended after significant changes)
 ```

-Hooks: `check-ast`, `debug-statements`, `detect-private-key`, `check-added-large-files`, `check-case-conflict`, `check-executables-have-shebangs`, `check-merge-conflict`, `check-json`, `check-toml`, `check-yaml`, `check-shebang-scripts-are-executable`, `end-of-file-fixer`, `mixed-line-ending`, `no-commit-to-branch`, `requirements-txt-fixer`, `trailing-whitespace`, `codespell`, `add-license`, `generate-cli-docs`, `generate-env-vars-docs`, `generate-plugin-artifacts`, `validate-plugin-schemas`, `test-imports`, `check-agent-files-sync`, `check-ergonomics`, `check-ruff-baselined`, `ruff`, `ruff-format`.
+Hooks: `check-ast`, `debug-statements`, `detect-private-key`, `check-added-large-files`, `check-case-conflict`, `check-merge-conflict`, `check-executables-have-shebangs`, `check-shebang-scripts-are-executable`, `check-json`, `check-toml`, `check-yaml`, `end-of-file-fixer`, `trailing-whitespace`, `mixed-line-ending`, `no-commit-to-branch`, `requirements-txt-fixer`, `codespell`, `add-license`, `generate-cli-docs`, `generate-env-vars-docs`, `generate-plugin-artifacts`, `validate-plugin-schemas`, `test-imports`, `ruff`, `ruff-format`.

 ## Adding a New Service

 1. Create class extending `BaseComponentService` with `@on_message` handlers
-2. Register in `plugins.yaml` under `service` category with `class`, `description`, `metadata`
-3. Add message type to `common/enums/enums.py` if new messages needed
-4. Create message class in `messages/` with `message_type` field
-5. Validate with `aiperf plugins --validate`
+2. Register in `src/aiperf/plugin/plugins.yaml` under `service` category with `class`, `description`, `metadata`
+3. Add message type to `src/aiperf/common/enums/enums.py` if new messages needed
+4. Create message class in `src/aiperf/common/messages/` with `message_type` field
+5. Validate with `make validate-plugin-schemas`

 ## Adding a New Message

-1. Add enum value to `MessageType` in `common/enums/enums.py`
-2. Create message class in `messages/` inheriting from `Message` with `message_type` field set
+1. Add enum value to `MessageType` in `src/aiperf/common/enums/enums.py`
+2. Create message class in `src/aiperf/common/messages/` inheriting from `Message` with `message_type` field set
 3. Add `@on_message(MessageType.X)` handler in the receiving service
 4. Auto-subscription happens during `@on_init` phase

 ## Adding a New Plugin

 1. Create plugin class implementing the appropriate base
-2. Add entry to `plugins.yaml` with `class`, `description`, `metadata`
+2. Add entry to `src/aiperf/plugin/plugins.yaml` with `class`, `description`, `metadata`
 3. Validate with `make validate-plugin-schemas`
 4. Use via `plugins.get_class(PluginType.X, 'name')`

@@ -93,16 +94,6 @@ Hooks: `check-ast`, `debug-statements`, `detect-private-key`, `check-added-large
 - `from tests.harness import mock_plugin` for plugin mocking
 - Name: `test_<function>_<scenario>_<expected>` e.g. `test_parse_config_missing_field_raises_error`
 - Imports at file top, fixtures for setup, one focus per test
-- Use `from pytest import param` and put `# fmt: skip` on the `)` line:
-```python
-@pytest.mark.parametrize(
-    "arg",
-    [
-        param(..., id="case1"),
-        param(..., id="case2"),
-    ],
-) # fmt: skip
-```
 - Auto-fixtures (always active): asyncio.sleep runs instantly, RNG=42, singletons reset between tests

 ## Git Workflow
@@ -118,6 +109,7 @@ Feature branches use `<username>/feature-name` format, forked from `main`. One P
 - Decorators: `@on_init`, `@on_start`, `@on_stop`, `@on_message`, `@on_command`, `@background_task`, `@on_pull_message`, `@on_request`.
 - Communication: `publish()` for broadcast, `@on_message` to subscribe, `send_command_and_wait_for_response()` for sync.
 - `AIPerfLifecycleMixin` for standalone components: `CREATED` -> `INITIALIZING` -> `INITIALIZED` -> `STARTING` -> `RUNNING` -> `STOPPING` -> `STOPPED`; `FAILED` terminal.
+- `dag_jsonl` input type: conversation DAG benchmarks (fork + spawn modes); see `docs/benchmark-modes/dag.md`.

 ## Pre-Commit Checklist

@@ -128,20 +120,18 @@ Feature branches use `<username>/feature-name` format, forked from `main`. One P
 5. `Field(description=...)` on all Pydantic fields
 6. `git commit -s`

-## Four-File Sync Rule
+## Three-File Sync Rule

-`AGENTS.md`, `CLAUDE.md`, `.github/copilot-instructions.md`, and `.cursor/rules/python.mdc` must contain identical content (only headers/frontmatter differ). When updating one, update all four. Run `make check-agent-files-sync` after editing to confirm sync — pre-commit enforces this on every commit that touches one of these files.
+`CLAUDE.md`, `.github/copilot-instructions.md`, and `.cursor/rules/python.mdc` must contain identical content (only headers/frontmatter differ). When updating one, update all three. Always diff them after editing to confirm sync.

 ## Documentation Updates

-> **DOCUMENTATION IS REQUIRED, NOT OPTIONAL.** Any PR that adds or changes a feature, CLI option, env var, plugin, message type, or service without updating the relevant docs is incomplete and will not be merged.
-
-When making changes, update the appropriate documentation files using the table below. When adding a new tutorial, also add it to `README.md`'s tutorial index. **Any new file under `docs/` must also be added to `docs/index.yml`** (the Fern site index) — `tools/check_docs_index.py` enforces this in CI. If the change is internal-only and not user-facing (e.g. developer reference, internal mechanics, debugging notes), put the doc under `docs/reference/` rather than skipping documentation.
+When making changes, update the appropriate documentation files. When adding a new tutorial, also add it to `README.md`'s tutorial index.

 | Change type | Files to update |
 |---|---|
 | Architecture, components, data flow, communication | `docs/architecture.md` |
-| Coding standards, build commands, new patterns | `AGENTS.md` + `CLAUDE.md` + `.github/copilot-instructions.md` + `.cursor/rules/python.mdc` |
+| Coding standards, build commands, new patterns | `CLAUDE.md` + `.github/copilot-instructions.md` + `.cursor/rules/python.mdc` |
 | Code patterns, examples, base classes | `docs/dev/patterns.md` |
 | CLI arguments or commands | `docs/cli-options.md` (auto-generated via `make generate-cli-docs`) |
 | Environment variables | `docs/environment-variables.md` (auto-generated via `make generate-env-vars-docs`) |
5 changes: 5 additions & 0 deletions .gitignore
@@ -24,6 +24,8 @@ profile.json
 profile.html
 .vscode
 *.jsonl
+!examples/**/*.jsonl
+!tests/fixtures/**/*.jsonl
 coverage.xml
 *.egg-info/
 coverage.json
@@ -50,3 +52,6 @@ src/aiperf/_build_info.py
 .cursor/*
 !.cursor/rules/
 .worktrees/
+
+# dev/benchmarks output dir — local benchmark runs, not committed
+dev/benchmarks/results/
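
Reviewer note: the `.gitignore` change above keeps `*.jsonl` ignored while re-including files under `examples/` and `tests/fixtures/` through `!` negation patterns, which works because the last matching pattern decides. The sketch below illustrates only that rule for these three patterns. It is a deliberately simplified model, not Git's matcher: it handles just `*` and `**/`, ignores directory-only patterns such as `dev/benchmarks/results/`, and does not model Git's restriction that a file cannot be re-included when its parent directory is excluded. The function names are hypothetical.

```python
import re

PATTERNS = ["*.jsonl", "!examples/**/*.jsonl", "!tests/fixtures/**/*.jsonl"]


def _translate(pattern: str) -> re.Pattern[str]:
    """Translate a simplified gitignore pattern into a compiled regex."""
    # A pattern containing a slash is anchored to the repository root;
    # one without a slash may match at any directory depth.
    anchored = "/" in pattern.rstrip("/")
    parts: list[str] = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            parts.append("(?:.*/)?")  # zero or more leading directories
            i += 3
        elif pattern[i] == "*":
            parts.append("[^/]*")  # anything except a path separator
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    prefix = "" if anchored else "(?:.*/)?"
    return re.compile(prefix + "".join(parts))


def is_ignored(path: str, patterns: list[str]) -> bool:
    """Apply patterns in order; the last one that matches wins."""
    ignored = False
    for raw in patterns:
        negated = raw.startswith("!")
        if _translate(raw[1:] if negated else raw).fullmatch(path):
            ignored = not negated
    return ignored
```

Under this model a path such as `data/run.jsonl` stays ignored, while `examples/dag/a.jsonl` and `tests/fixtures/x/y.jsonl` are re-included; `git check-ignore -v <path>` is the authoritative way to confirm the behavior in a real checkout.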