Thank you for considering contributing to Flash-SAE. This document outlines the process and standards for contributions.
# Clone and install in development mode
git clone https://github.com/alepot55/flash-sae.git
cd flash-sae
python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"

- Formatter: Black with line-length 100
- Linter: Ruff
- Type hints: Required for all public APIs
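
As a rough illustration of these style rules, the sketch below shows a small public function with full type hints, laid out the way Black would leave it at line length 100. The function `top_k_indices` is a hypothetical helper invented for this example and is not part of the Flash-SAE API.

```python
from __future__ import annotations

from collections.abc import Sequence


def top_k_indices(values: Sequence[float], k: int) -> list[int]:
    """Return the indices of the k largest values, largest first.

    Hypothetical helper used only to illustrate the style rules; it is not part of Flash-SAE.
    """
    if k < 0:
        raise ValueError(f"k must be non-negative, got {k}")
    # sorted() is stable, so equal values keep their original relative order.
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    return order[:k]
```

Both the parameters and the return value are annotated, which is what the type-hint requirement asks of public APIs.
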
# Format code
black flash_sae/ tests/ benchmarks/
# Lint
ruff check flash_sae/ tests/

All contributions must pass the test suite:
pytest tests/ -v

New features require corresponding tests. Target coverage: 80%+.
Follow Conventional Commits:
feat: add FP8 quantization support
fix: correct gradient computation in encoder backward
perf: optimize decoder gather pattern for H100
docs: update API reference
test: add ghost gradient unit tests
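
The commit format above can be checked mechanically. The following is a minimal sketch, not a tool shipped with this repository: the function name `is_conventional` is hypothetical, and the set of allowed types is an assumption inferred from the examples above (the Conventional Commits convention itself permits others, such as `chore` or `refactor`).

```python
import re

# Types taken from the examples above; the full allowed set is an assumption.
ALLOWED_TYPES = ("feat", "fix", "perf", "docs", "test")

HEADER_RE = re.compile(
    r"^(?P<type>[a-z]+)"  # commit type, e.g. feat
    r"(\((?P<scope>[^()\s]+)\))?"  # optional scope in parentheses
    r"!?"  # optional breaking-change marker
    r": (?P<desc>\S.*)$"  # colon, one space, non-empty description
)


def is_conventional(header: str) -> bool:
    """Return True if the first line of a commit message matches the pattern."""
    match = HEADER_RE.match(header)
    return match is not None and match.group("type") in ALLOWED_TYPES
```

For instance, `is_conventional("perf: optimize decoder gather pattern for H100")` is accepted, while a header without a type prefix such as `"Added stuff"` is rejected.
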
- Fork the repository
- Create a feature branch from main
- Implement changes with tests
- Run the full test suite locally
- Submit PR with clear description
- Tests pass (pytest tests/ -v)
- Code formatted (black --check .)
- Linting passes (ruff check .)
- Documentation updated if needed
- Commit messages follow convention
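
The three command-based checklist items can be run together before opening a PR. The sketch below is one possible way to do that; the names `CHECKS` and `run_checks` are hypothetical and no such script is assumed to exist in the repository. The commands themselves are copied from the checklist.

```python
from __future__ import annotations

import subprocess
from collections.abc import Callable, Sequence

# Commands copied from the checklist above.
CHECKS: tuple[tuple[str, ...], ...] = (
    ("pytest", "tests/", "-v"),
    ("black", "--check", "."),
    ("ruff", "check", "."),
)


def _run(cmd: Sequence[str]) -> int:
    """Run one command in the current directory and return its exit code."""
    return subprocess.run(list(cmd)).returncode


def run_checks(runner: Callable[[Sequence[str]], int] = _run) -> list[str]:
    """Run every check and return the failed commands as strings."""
    failed = []
    for cmd in CHECKS:
        # Keep going after a failure so that all problems are reported at once.
        if runner(cmd) != 0:
            failed.append(" ".join(cmd))
    return failed
```

Calling `run_checks()` from the repository root, with the development dependencies installed, returns an empty list when all three checks pass. The `runner` parameter exists only so the function can be exercised without invoking the real tools.
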
Triton kernel contributions require:
- Correctness tests comparing against PyTorch reference
- Benchmark results on at least one GPU (report model)
- Memory analysis if claiming efficiency improvements
- Autograd wrapper for training support
python benchmarks/benchmark.py --batch 256 512 1024 2048

Include benchmark output in PR description.
Major changes require discussion before implementation. Open an issue with:
- Problem statement: What limitation are you addressing?
- Proposed solution: High-level approach
- Alternatives considered: Why not other approaches?
- Performance implications: Expected speedup/memory impact
By contributing, you agree that your contributions will be licensed under the MIT License.