| 1 | +# CodeFlare SDK |
| 2 | + |
| 3 | +Python SDK for simplifying the management of distributed computing resources |
| 4 | +on Kubernetes. Provides interfaces for Ray cluster lifecycle, job submission, |
| 5 | +and Kueue integration. Apache-2.0 licensed, Python ^3.11. |
| 6 | + |
| 7 | +## Repository Structure |
| 8 | + |
| 9 | +| Directory | Description | |
| 10 | +| --- | --- | |
| 11 | +| `src/codeflare_sdk/` | Main package | |
| 12 | +| `src/codeflare_sdk/common/` | Shared utilities (auth, Kueue, widgets) | |
| 13 | +| `src/codeflare_sdk/ray/` | Ray cluster and job management | |
| 14 | +| `src/codeflare_sdk/vendored/` | Vendored KubeRay client — DO NOT MODIFY | |
| 15 | +| `tests/` | E2E and upgrade test suites | |
| 16 | +| `demo-notebooks/` | Jupyter demo notebooks | |
| 17 | +| `docs/` | Sphinx documentation | |
| 18 | +| `images/` | Docker build files | |
| 19 | + |
| 20 | +### Key Packages |
| 21 | + |
| 22 | +``` |
| 23 | +src/codeflare_sdk/ |
| 24 | +  common/ |
| 25 | +    kubernetes_cluster/   # Auth, API client, error handling |
| 26 | +    kueue/                # Local queue listing, default queue resolution |
| 27 | +    utils/                # Constants, helpers, validation |
| 28 | +    widgets/              # Jupyter/IPython widgets |
| 29 | +  ray/ |
| 30 | +    cluster/              # Cluster create/config/status/delete |
| 31 | +    rayjobs/              # RayJob submit, tracking, runtime env |
| 32 | +    client/               # Ray JobSubmissionClient wrapper |
| 33 | +``` |
| 34 | + |
| 35 | +## Setup |
| 36 | + |
| 37 | +```sh |
| 38 | +# Install (development) |
| 39 | +poetry install |
| 40 | + |
| 41 | +# Install with test dependencies |
| 42 | +poetry install --with test |
| 43 | + |
| 44 | +# Install with test + docs dependencies |
| 45 | +poetry install --with test,docs |
| 46 | + |
| 47 | +# Install pre-commit hooks |
| 48 | +pre-commit install |
| 49 | +``` |
| 50 | + |
| 51 | +## Build and Test Commands |
| 52 | + |
| 53 | +```sh |
| 54 | +# Pre-commit (formatting + checks) |
| 55 | +pre-commit run --show-diff-on-failure --color=always --all-files |
| 56 | + |
| 57 | +# Unit tests with coverage (excludes E2E, notebooks, vendored) |
| 58 | +coverage run \ |
| 59 | + --omit="src/**/test_*.py,src/codeflare_sdk/common/utils/unit_test_support.py,src/codeflare_sdk/vendored/**" \ |
| 60 | + -m pytest \ |
| 61 | + --ignore=tests/e2e --ignore=tests/e2e_v2 --ignore=tests/upgrade \ |
| 62 | + --ignore=demo-notebooks --ignore=tests/ui |
| 63 | + |
| 64 | +# Coverage report |
| 65 | +coverage report -m |
| 66 | + |
| 67 | +# Check patch coverage for specific files |
| 68 | +coverage report -m --include="path/to/changed1.py,path/to/changed2.py" |
| 69 | +``` |
| 70 | + |
| 71 | +### Single-File Commands |
| 72 | + |
| 73 | +```sh |
| 74 | +# Format a single file |
| 75 | +black path/to/file.py |
| 76 | + |
| 77 | +# Check formatting without modifying |
| 78 | +black --check path/to/file.py |
| 79 | +``` |
| 80 | + |
| 81 | +### Coverage Requirements |
| 82 | + |
| 83 | +- **Project**: >= 90% (enforced in CI) |
| 84 | +- **Patch**: >= 85% for new/changed files |
| 85 | +- CI uses codecov with patch threshold 85%, overall threshold 2.5% |
| 86 | + |
| 87 | +## Coding Conventions |
| 88 | + |
| 89 | +### Python Style |
| 90 | + |
| 91 | +- **Formatter**: black (via pre-commit) |
| 92 | +- **Naming**: snake_case for functions/variables/modules, PascalCase for classes |
| 93 | +- **Type hints**: required for function parameters and return types |
| 94 | +- **Docstrings**: Google-style (Args, Returns, Raises sections) |
| 95 | +- **License header**: Apache-2.0 at top of every new file |
| 96 | +- **Import order**: standard library, third-party, local (blank line between groups) |
| 97 | +- **Local imports**: use relative imports within the same package, absolute |
| 98 | + `from codeflare_sdk...` when crossing package boundaries or in tests |
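
As an illustration of the style rules above, the following is a minimal sketch of a function with type hints and a Google-style docstring, formatted the way black would leave it. The function `memory_gib_to_quantity` is hypothetical and is not part of the SDK; the Apache-2.0 license header is omitted here for brevity.

```python
def memory_gib_to_quantity(memory_gib: int) -> str:
    """Convert a memory size in GiB to a Kubernetes quantity string.

    Args:
        memory_gib: Memory size in gibibytes; must be a positive integer.

    Returns:
        The quantity string, for example ``"8Gi"``.

    Raises:
        ValueError: If ``memory_gib`` is not a positive integer.
    """
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(memory_gib, bool) or not isinstance(memory_gib, int):
        raise ValueError("memory_gib must be an integer")
    if memory_gib <= 0:
        raise ValueError("memory_gib must be positive")
    return f"{memory_gib}Gi"
```

The docstring names each parameter, the return value, and the exception, which is the minimum the Args/Returns/Raises convention asks for.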
| 99 | + |
| 100 | +### Public API |
| 101 | + |
| 102 | +Export new public classes and functions in `src/codeflare_sdk/__init__.py`. |
| 103 | +Do not add public API without listing it there. |
| 104 | + |
| 105 | +### Vendored Code |
| 106 | + |
| 107 | +The `src/codeflare_sdk/vendored/` directory contains a vendored KubeRay Python |
| 108 | +client. Do not modify files in this directory. Do not import directly from |
| 109 | +vendored modules — use the SDK's own wrappers. |
| 110 | + |
| 111 | +### Kubernetes API Patterns |
| 112 | + |
| 113 | +- Call `config_check()` before Kubernetes API calls |
| 114 | +- Use `get_api_client()` to obtain the client — do not instantiate directly |
| 115 | +- Handle `ApiException` with `_kube_api_error_handling(e)` — do not add new |
| 116 | + ad-hoc exception handling patterns |
| 117 | +- Use safe access (`.get()`, `try/except`) when parsing Custom Resource dicts |
| 118 | +- Reuse existing enums (e.g., `RayClusterStatus`) — do not introduce new |
| 119 | + string-based status fields for concepts already modeled |
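
The safe-access rule for Custom Resource dicts could look roughly like the sketch below. It is illustrative only: `get_cluster_state` is a hypothetical helper, not an SDK function, and it assumes the state lives under `status.state` in the payload. In the SDK itself, the returned string would be mapped onto an existing enum such as `RayClusterStatus` rather than passed around as a raw string.

```python
from typing import Any, Optional


def get_cluster_state(cr: Any) -> Optional[str]:
    """Read ``status.state`` from a Custom Resource dict without raising.

    Args:
        cr: A Custom Resource payload; may be partial or malformed.

    Returns:
        The state string, or None when it cannot be read safely.
    """
    # A malformed response may not be a dict at all.
    if not isinstance(cr, dict):
        return None
    # "status" can be missing or null on a freshly created resource.
    status = cr.get("status")
    if not isinstance(status, dict):
        return None
    state = status.get("state")
    return state if isinstance(state, str) else None
```

Returning `None` instead of raising lets the caller decide how to report an unknown state, and it avoids a `KeyError` when the resource has no status yet.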
| 120 | + |
| 121 | +## Testing |
| 122 | + |
| 123 | +- **Framework**: pytest with pytest-mock and pytest-timeout (900s default) |
| 124 | +- **Unit tests**: colocated with source in `src/codeflare_sdk/**/test_*.py` |
| 125 | +- **E2E tests**: in `tests/e2e/`, require a Kubernetes cluster (not run locally) |
| 126 | +- **Global fixtures**: `src/codeflare_sdk/conftest.py` auto-mocks K8s API clients |
| 127 | +- **Mocking**: use `mocker` (pytest-mock) for K8s/API calls |
| 128 | +- **Test helpers**: use functions from `common/utils/unit_test_support.py` |
| 129 | + (e.g., `get_ray_obj_with_status`, `create_cluster_config`) — never hardcode |
| 130 | + raw Kubernetes JSON payloads in test files |
| 131 | +- **Edge cases**: when parsing K8s CRs, add tests with malformed/partial |
| 132 | + payloads (empty items, missing spec/status) |
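
An edge-case test for malformed or partial payloads might be sketched as follows. Both `list_cluster_names` and the test are hypothetical and kept in one block so the example is self-contained. The payloads are written inline only for that reason; in the repository they would come from the helpers in `common/utils/unit_test_support.py`.

```python
from typing import Any, List


def list_cluster_names(payload: Any) -> List[str]:
    """Collect ``metadata.name`` from each item, skipping malformed entries."""
    if not isinstance(payload, dict):
        return []
    items = payload.get("items")
    if not isinstance(items, list):
        return []
    names: List[str] = []
    for item in items:
        if not isinstance(item, dict):
            continue
        metadata = item.get("metadata")
        if isinstance(metadata, dict) and isinstance(metadata.get("name"), str):
            names.append(metadata["name"])
    return names


def test_list_cluster_names_tolerates_malformed_payloads() -> None:
    # Empty or missing "items".
    assert list_cluster_names({}) == []
    assert list_cluster_names({"items": []}) == []
    assert list_cluster_names({"items": None}) == []
    # Items without metadata or spec/status are skipped, not fatal.
    assert list_cluster_names({"items": [{}]}) == []
    mixed = {"items": [{"metadata": {"name": "a"}}, {"spec": {}}]}
    assert list_cluster_names(mixed) == ["a"]
```

pytest collects the `test_` function by name, so the block needs no pytest import.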
| 133 | + |
| 134 | +### Pre-Commit Hooks |
| 135 | + |
| 136 | +Pre-commit hooks enforce: |
| 137 | + |
| 138 | +- trailing-whitespace removal |
| 139 | +- end-of-file newline |
| 140 | +- YAML validation |
| 141 | +- Large file checks |
| 142 | +- black formatting |
| 143 | + |
| 144 | +## Cursor Rules (extended guidance) |
| 145 | + |
| 146 | +This repository has more detailed AI coding rules in `.cursor/rules/`: |
| 147 | + |
| 148 | +- `.cursor/rules/01-project-context.mdc` — Grounding, personas, hallucination avoidance |
| 149 | +- `.cursor/rules/02-python-standards.mdc` — Python style, canonical examples, common pitfalls |
| 150 | +- `.cursor/rules/03-testing-and-ci.mdc` — CI workflows, demo notebooks, KinD adaptations |