Commit a007820
Add CLAUDE.md (#956)
### What does this PR do?

Add CLAUDE.md file with repo overview for AI agents.

Signed-off-by: Rohan Joshi <rohjoshi@nvidia.com>
1 parent 1ccd945 commit a007820

1 file changed: CLAUDE.md (106 additions & 0 deletions)
# CLAUDE.md

NVIDIA Model Optimizer (ModelOpt): open-source library for model optimization techniques including
quantization, pruning, distillation, sparsity, and speculative decoding to accelerate inference.
Primarily Python codebase with optional C++/CUDA extensions supporting PyTorch, ONNX, and Hugging Face/Megatron models.

> If a `CLAUDE.local.md` file exists alongside this file, read and respect it — it contains
> developer-specific overrides that supplement this shared guidance.

## Rules (Read First)

**CRITICAL (YOU MUST):**

- NVIDIA Apache 2.0 license header on ALL new Python/C++/CUDA files (see `LICENSE_HEADER`)
- `git commit -s -S` (DCO sign-off + cryptographic signing required). Never attribute AI tools in the sign-off line
- `pre-commit` hooks run on commit — if files are modified by hooks, re-stage and commit again
- PRs require CODEOWNERS review (auto-assigned based on `.github/CODEOWNERS`)
- After rebasing, always re-run tests locally before pushing
- All code must follow the security guidelines in `SECURITY.md` — violations are blocked as pre-merge errors
- For contribution guidelines, commit conventions, and PR requirements, see `CONTRIBUTING.md`

## Common Commands

| Task | Command |
|------|---------|
| Install (editable + dev) | `pip install -e ".[dev]"` |
| CPU unit tests | `python -m pytest tests/unit` |
| GPU unit tests | `python -m pytest tests/gpu` |
| Megatron GPU tests | `python -m pytest tests/gpu_megatron` |
| TRT-LLM GPU tests | `python -m pytest tests/gpu_trtllm` |
| Pattern match | `pytest tests/unit -k "test_quantize"` |
| Lint + format (all files) | `pre-commit run --all-files` |
| Lint (diff only) | `pre-commit run --from-ref origin/main --to-ref HEAD` |
| Run via tox (CPU unit) | `tox -e py312-torch210-tf_latest-unit` |
| Build docs | `tox -e build-docs` |
| Build wheel | `tox -e build-wheel` |

## Architecture

ModelOpt is organized into three top-level namespaces:

| Namespace | Path | Role |
|-----------|------|------|
| `modelopt.torch` | `modelopt/torch/` | Core PyTorch optimization library |
| `modelopt.onnx` | `modelopt/onnx/` | ONNX model quantization and export |
| `modelopt.deploy` | `modelopt/deploy/` | Deployment utilities for LLMs |

### `modelopt.torch` Sub-packages

| Sub-package | Path | Role |
|-------------|------|------|
| `opt` | `modelopt/torch/opt/` | Core optimization infrastructure (modes, config, state dicts) |
| `quantization` | `modelopt/torch/quantization/` | PTQ, QAT, and quantization-aware algorithms |
| `prune` | `modelopt/torch/prune/` | Structured and unstructured pruning |
| `distill` | `modelopt/torch/distill/` | Knowledge distillation |
| `sparsity` | `modelopt/torch/sparsity/` | Weight and activation sparsity |
| `speculative` | `modelopt/torch/speculative/` | Speculative decoding (Medusa, EAGLE, etc.) |
| `nas` | `modelopt/torch/nas/` | Neural architecture search |
| `export` | `modelopt/torch/export/` | Checkpoint export for TRT-LLM / Megatron |
| `peft` | `modelopt/torch/peft/` | QLoRA and PEFT integration |
| `_deploy` | `modelopt/torch/_deploy/` | Internal deployment utilities |
| `utils` | `modelopt/torch/utils/` | Shared utilities and plugin infrastructure |

### Core Abstraction: Modes

A **mode** is the unit of model optimization in ModelOpt. Each algorithm (quantization, pruning,
etc.) is implemented as one or more modes. Modes are recorded in the model's `modelopt_state` so
optimization workflows can be composed, saved, and restored.

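The mode lifecycle above can be sketched as a toy Python example. Everything here (`Mode`, `apply_mode`, `restore`, the list-based `modelopt_state`) is a simplified stand-in for illustration, not ModelOpt's actual classes; the real entry points live in `modelopt/torch/opt/`.

```python
# Toy illustration of the "mode" pattern: each applied mode is recorded
# in modelopt_state so the optimization pipeline can be replayed later.
# All names here are hypothetical stand-ins, not ModelOpt's real API.
from dataclasses import dataclass, field


@dataclass
class Mode:
    """One optimization step, e.g. 'quantize' or 'prune'."""
    name: str
    config: dict = field(default_factory=dict)


@dataclass
class Model:
    """Stand-in for a torch.nn.Module carrying a modelopt_state."""
    modelopt_state: list = field(default_factory=list)


def apply_mode(model: Model, mode: Mode) -> Model:
    # Record the mode so the full optimization history travels with the model.
    model.modelopt_state.append((mode.name, mode.config))
    return model


def restore(fresh: Model, state: list) -> Model:
    # Re-apply recorded modes in order to reproduce the optimized model.
    for name, config in state:
        apply_mode(fresh, Mode(name, config))
    return fresh


model = apply_mode(Model(), Mode("quantize", {"bits": 8}))
model = apply_mode(model, Mode("prune", {"ratio": 0.5}))
clone = restore(Model(), list(model.modelopt_state))
assert clone.modelopt_state == model.modelopt_state
```

The key property is composability: because the state is an ordered record of modes, a saved checkpoint can be restored onto a fresh model by replaying the same sequence.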
## Key Files

| File | Role |
|------|------|
| `modelopt/torch/opt/mode.py` | Base class for all optimization modes |
| `modelopt/torch/opt/config.py` | Configuration system for modes |
| `modelopt/torch/opt/conversion.py` | `apply_mode()` / `restore()` entry points |
| `modelopt/torch/quantization/__init__.py` | PTQ/QAT public API |
| `modelopt/torch/export/unified_export_hf.py` | Unified HF checkpoint export |
| `modelopt/torch/export/model_config_export.py` | TRT-LLM model config export |
| `modelopt/deploy/llm/` | LLM deployment utilities |
| `pyproject.toml` | Optional dependency groups (`[onnx]`, `[hf]`, `[all]`, `[dev]`); ruff, mypy, pytest, bandit, and coverage config |
| `.pre-commit-config.yaml` | Pre-commit hooks (ruff, mypy, clang-format, license headers) |
| `tox.ini` | Test environment definitions |

## Design Patterns

| Pattern | Key Points |
|---------|------------|
| **Mode composition** | Optimization algorithms are composed as sequences of modes, each recorded in `modelopt_state` |
| **Plugin system** | Optional integrations (HuggingFace, Megatron, etc.) loaded lazily via `import_plugin()` |
| **Optional dependencies** | Features gated by install extras (`[onnx]`, `[hf]`, `[all]`); avoid hard imports at module level |
| **Config dataclasses** | Each mode has a typed config; use Pydantic or dataclass conventions |
| **State dict** | Models carry `modelopt_state` for checkpoint save/restore across optimization steps |

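The plugin and optional-dependency rows above can be illustrated with a minimal sketch. The helper name `optional_import` is hypothetical (ModelOpt's own lazy loader is `import_plugin()`); this only shows the general pattern of probing for an extra at runtime instead of a hard module-level import.

```python
# Sketch of the optional-dependency pattern: probe for an extra at
# runtime so core code stays importable when the extra is missing.
# `optional_import` is an illustrative helper, not ModelOpt's API.
import importlib.util


def optional_import(module_name: str):
    """Return the module if installed, else None (feature disabled)."""
    if importlib.util.find_spec(module_name) is None:
        return None
    return importlib.import_module(module_name)


# Core code degrades gracefully instead of failing at import time:
transformers = optional_import("transformers")
if transformers is None:
    print("HF plugin disabled: install the [hf] extra to enable it")
```

This is why the guidance says to avoid hard imports at module level: a missing `[hf]` or `[onnx]` extra should disable a feature, not break `import modelopt`.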
## CI / Testing

| Layer | Location | Notes |
|-------|----------|-------|
| CPU unit tests | `tests/unit/` | Fast, no GPU needed; run in pre-merge CI |
| GPU unit tests | `tests/gpu/` | Requires CUDA GPU |
| Megatron GPU tests | `tests/gpu_megatron/` | Requires Megatron-Core + GPU |
| TRT-LLM GPU tests | `tests/gpu_trtllm/` | Requires TensorRT-LLM + GPU |
| Example/integration tests | `tests/examples/` | Integration tests for examples; see `tests/examples/README.md` |
| Pre-commit / lint | `.pre-commit-config.yaml` | ruff, mypy, clang-format, license headers, bandit |
| Coverage | `pyproject.toml` | 70% minimum on `modelopt/*` |
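A minimal sketch of what a fast CPU-only test in `tests/unit/` might look like. `clamp_scale` is a made-up helper for illustration only; real tests exercise ModelOpt APIs and typically run under pytest.

```python
# Hypothetical unit test in the style of tests/unit: pure-CPU, no GPU
# or heavy dependency required. `clamp_scale` is invented for this
# sketch; it is not a ModelOpt function.


def clamp_scale(scale: float, eps: float = 1e-8) -> float:
    """Hypothetical helper: keep a quantization scale strictly positive."""
    return max(scale, eps)


def test_clamp_scale_floors_non_positive_values():
    assert clamp_scale(0.5) == 0.5
    assert clamp_scale(0.0) == 1e-8
    assert clamp_scale(-3.0) == 1e-8


if __name__ == "__main__":
    test_clamp_scale_floors_non_positive_values()
    print("ok")
```

Tests in this layer run in pre-merge CI, so they should stay dependency-light and fast; GPU-dependent assertions belong in `tests/gpu/` instead.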
