CuFlash-Attn is in final-governance / archive-ready stabilization. Contributions should bias toward spec alignment, documentation quality, workflow simplification, and bug cleanup over feature expansion.
- NVIDIA GPU with Compute Capability 7.0+ (V100+)
- CUDA Toolkit 12.x
- CMake 3.18+, GCC 9+, C++17
This project uses OpenSpec methodology. All changes must follow the spec-driven cycle:
/opsx:propose <change-name> → review specs → /opsx:apply → /verify → /opsx:archive
- Read specs first: openspec/specs/ is the single source of truth
- Propose before coding: Use /opsx:propose <name> to create a change proposal
- Reference spec IDs in tests: e.g., // Validates REQ-1.1, Property 1
- Format before commit: find . -name "*.cu" -o -name "*.cuh" -o -name "*.cpp" -o -name "*.h" | xargs clang-format -i
- Run review before concluding major work: use /review on meaningful changesets
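The spec-ID comment convention from the list above might look like the following in a test helper. This is a minimal sketch: `reference_softmax` and `softmax_row_sums_to_one` are hypothetical names, and the property being checked (softmax outputs sum to 1) is an assumption chosen for illustration, not the actual content of REQ-1.1 or Property 1.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical CPU reference, used only to illustrate the comment convention.
std::vector<float> reference_softmax(const std::vector<float>& logits) {
  // Subtract the maximum before exponentiating for numerical stability.
  float max_logit = logits[0];
  for (float v : logits) max_logit = std::fmax(max_logit, v);

  std::vector<float> out(logits.size());
  float sum = 0.0f;
  for (std::size_t i = 0; i < logits.size(); ++i) {
    out[i] = std::exp(logits[i] - max_logit);
    sum += out[i];
  }
  for (float& v : out) v /= sum;
  return out;
}

// Validates REQ-1.1, Property 1
// (IDs copied from the guideline; what they actually require is assumed here.)
bool softmax_row_sums_to_one(const std::vector<float>& logits) {
  const std::vector<float> probs = reference_softmax(logits);
  const float total = std::accumulate(probs.begin(), probs.end(), 0.0f);
  return std::fabs(total - 1.0f) < 1e-5f;
}
```

Placing the spec reference directly above the check keeps the link between a requirement and the code that exercises it visible during review.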
CuFlash-Attn is in a final-governance phase. Prefer short, explicit cleanup work over broad feature work:
- Behavior or API change → create or update an OpenSpec change first
- Docs / workflow / metadata cleanup → land it under an existing governance-oriented change when possible
- Avoid speculative expansion → if the work does not improve correctness, maintainability, docs quality, or handoff readiness, it is probably out of scope
- Use /review before landing non-trivial diffs → especially for cross-file refactors, workflow edits, and API-adjacent changes
- Stay in one focused lane → finish one OpenSpec change cleanly before starting another
cmake --preset release
cmake --build --preset release
ctest --preset release --output-on-failure
- Preferred local stack: clangd + CMake Tools (or any editor that consumes .clangd)
- Generate build/release/compile_commands.json with cmake --preset release
- Machines without nvcc can still edit docs, specs, workflow files, and most headers; .clangd fallback flags provide partial completion/navigation, but full .cu diagnostics and configure/build steps require a CUDA-capable toolchain
- Keep editor tooling minimal. Do not add project-local MCP daemons or Copilot plugins unless a recurring repository-wide problem cannot be solved by OpenSpec docs, CLI skills, gh, or the existing workspace settings
- Format: Google style via clang-format (.clang-format)
- Naming: namespaces lower_case, classes CamelCase, functions lower_case
- Commits: Conventional Commits format, e.g. feat(scope): description, fix(scope): description
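As a rough illustration of the naming rules above, the sketch below applies them to a small header-style snippet. Every identifier (`cuflash_example`, `TileConfig`, `tiles_needed`) is hypothetical and not taken from the repository; the trailing-underscore member names follow common Google-style practice and are an assumption, since the guide does not state a rule for data members.

```cpp
#include <cassert>
#include <cstddef>

namespace cuflash_example {  // namespace: lower_case (hypothetical name)

// class: CamelCase
class TileConfig {
 public:
  TileConfig(std::size_t rows, std::size_t cols) : rows_(rows), cols_(cols) {}

  // member function: lower_case
  std::size_t element_count() const { return rows_ * cols_; }

 private:
  std::size_t rows_;  // trailing underscore is an assumed convention
  std::size_t cols_;
};

// free function: lower_case
inline std::size_t tiles_needed(std::size_t length, std::size_t tile) {
  return (length + tile - 1) / tile;  // ceiling division
}

}  // namespace cuflash_example
```

Running clang-format with the repository's .clang-format remains the authority on layout; the snippet only shows the identifier casing.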
- Work directly on master
- Use git tag v0.x.x for releases
- Keep feature work in short-lived commits, not long-lived branches
- Avoid branch and workflow sprawl; prefer one focused OpenSpec change at a time
- Avoid /fleet-style parallel branch proliferation for this repository