|
1 | 1 | # Contributing to Cluster Health Monitor |
2 | 2 |
|
3 | | -First off, thanks for taking the time to contribute! 🎉 |
| 3 | +First off, thanks for taking the time to go through my project! |
4 | 4 |
|
5 | | -This project started as a personal tool for local monitoring and testing of AI models. It has grown into a lightweight, "nvidia-smi wrapper on steroids" that aims to simplify GPU management for developers and researchers. |
| 5 | +This project started as a personal tool for monitoring my local GPU setup while I experiment with AI models. It has grown into a lightweight "nvidia-smi wrapper on steroids" that makes GPU management easy for developers and researchers.
6 | 6 |
|
7 | | -We welcome contributions of all kinds—bug fixes, new features, documentation improvements, and more. |
| 7 | +All contributions are welcome: bug fixes, new features, documentation improvements, and more.
8 | 8 |
|
9 | 9 | ## Getting Started |
10 | 10 |
|
11 | 11 | ### Prerequisites |
12 | | -- **OS**: Windows 10/11 (currently the primary target, though Linux/Mac support is on the roadmap). |
13 | | -- **Python**: 3.8+ |
| 12 | + |
| 13 | +- **OS**: Windows 10/11 |
| 14 | +- **Python**: 3.10+ |
14 | 15 | - **CUDA**: Toolkit 12.x (required for GPU benchmarking features). |
15 | 16 |
|
16 | 17 | ### Setting Up the Development Environment |
17 | 18 |
|
18 | 19 | 1. **Clone the repository**: |
| 20 | + |
| 21 | + ```bash |
| 22 | + git clone https://github.com/DataBoySu/Local-GPUMonitor.git |
| 23 | + # Contributing to GPU Health Monitor |
| 24 | + |
| 25 | + Thank you for taking an interest in contributing. This document describes how to get the repository locally, coding and commit guidelines, and the process for submitting changes. |
| 26 | + |
| 27 | + Repository: https://github.com/DataBoySu/Local-GPUMonitor |
| 28 | + |
| 29 | + ## Quick Start (clone & run) |
| 30 | + |
| 31 | + 1. Fork the repository on GitHub and clone your fork: |
| 32 | + |
19 | 33 | ```bash |
20 | | - git clone https://github.com/yourusername/cluster-health-monitor.git |
21 | | - cd cluster-health-monitor |
| 34 | + git clone https://github.com/DataBoySu/Local-GPUMonitor.git |
| 35 | + cd Local-GPUMonitor |
22 | 36 | ``` |
23 | 37 |
|
24 | | -2. **Create a virtual environment**: |
| 38 | + 2. Create and activate a Python virtual environment: |
| 39 | + |
25 | 40 | ```powershell |
26 | 41 | python -m venv .venv |
27 | 42 | .\.venv\Scripts\Activate.ps1 |
28 | 43 | ``` |
29 | 44 |
|
30 | | -3. **Install dependencies**: |
| 45 | + 3. Install dependencies: |
| 46 | + |
31 | 47 | ```powershell |
32 | 48 | pip install -r requirements.txt |
33 | 49 | ``` |
34 | 50 |
|
35 | | -4. **Run the application**: |
| 51 | + 4. Run the application (web dashboard): |
| 52 | + |
36 | 53 | ```powershell |
37 | | - # Run the web server |
38 | 54 | python health_monitor.py web |
39 | 55 | ``` |
40 | 56 |
|
41 | | -## Project Structure |
| 57 | + Or run the CLI mode: |
| 58 | + |
| 59 | + ```powershell |
| 60 | + python health_monitor.py cli |
| 61 | + ``` |
| 62 | + |
| 63 | + ## Branching & Commit Guidelines |
| 64 | + |
| 65 | + - Branch from `main` for new work: `git checkout -b feat/short-description` or `fix/short-description`. |
| 66 | + - Keep commits small and focused. Use clear commit messages (imperative present tense): |
| 67 | + |
| 68 | + `Add VRAM cap enforcement for per-process watchlist` |
| 69 | + |
| 70 | + - Rebase or squash when appropriate before opening a PR to keep history tidy. |
| 71 | + |
| 72 | + ## Pull Requests |
42 | 73 |
|
43 | | -- `monitor/`: Core package source. |
44 | | - - `api/`: FastAPI server and static assets (frontend). |
45 | | - - `collectors/`: System and GPU metric collectors. |
46 | | - - `benchmark/`: GPU stress testing and particle simulation logic. |
47 | | - - `alerting/`: Notification logic (Windows toasts). |
48 | | -- `health_monitor.py`: Main entry point. |
| 74 | + 1. Push your branch to your fork and open a Pull Request against `DataBoySu/Local-GPUMonitor:main`. |
| 75 | + 2. In the PR description include: |
| 76 | + - A short summary of the change, with images if possible.
| 77 | + - Motivation and any relevant issue links. |
| 78 | + - Testing steps to reproduce or verify the change. |
| 79 | + 3. Ensure CI (if any) passes and address review comments promptly. |
49 | 80 |
|
50 | | -## Roadmap & Future Work |
| 81 | + ## Code Style & Tests |
51 | 82 |
|
52 | | -We are actively looking for help with: |
53 | | -- **Multi-GPU Support**: robust handling of multi-card setups. |
54 | | -- **Cross-Platform Support**: Porting to Linux and macOS. |
55 | | -- **Containerization**: Docker support for easy deployment. |
56 | | -- **Remote Access**: SSH tunneling or secure remote monitoring capabilities. |
57 | | -- **Hardware Support**: AMD (ROCm) and Intel (Arc) GPU support. |
| 83 | + - Python: follow PEP 8 and use type hints where appropriate. We prefer readable, explicit code. |
| 84 | + - JavaScript: keep vanilla JS simple and modular. Follow consistent indentation and naming. |
| 85 | + - Add tests where appropriate (unit tests for collectors, integration tests for API endpoints). If you add tests, include instructions to run them in your PR. |
58 | 86 |
|
59 | | -## Submitting Changes |
| 87 | + ## Running Locally (developer tips) |
60 | 88 |
|
61 | | -1. **Fork the repo** and create your branch from `main`. |
62 | | -2. **Make your changes**. Ensure code is clean and commented where necessary. |
63 | | -3. **Test your changes**. Run the monitor in both CLI and Web modes to ensure no regressions. |
64 | | -4. **Submit a Pull Request**. Describe your changes in detail and link to any relevant issues. |
| 89 | + - To run the web server with auto-reload during development, use your editor's Python run configuration, or run it under a tool such as `watchdog` or `honcho` if you add one.
| 90 | + - For debugging GPU collectors on non-GPU machines, mock or stub out GPU calls (see `monitor/collectors` for structure). |
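
As a rough illustration of the stubbing tip above, the sketch below swaps a GPU query for a `unittest.mock.Mock` so a collector can be exercised on a machine without a GPU. The names `query_gpu` and `collect_gpu_metrics` and the metric keys are hypothetical and are not taken from `monitor/collectors`; adapt them to whatever the real collectors expose.

```python
from unittest import mock


def query_gpu():
    """Hypothetical stand-in for a real GPU query; fails without a GPU."""
    raise RuntimeError("no GPU available on this machine")


def collect_gpu_metrics(query=query_gpu):
    """Hypothetical collector: turn a raw GPU reading into a metrics dict.

    The query function is passed in so tests can replace it with a stub.
    """
    raw = query()
    return {
        "utilization_pct": raw["utilization"],
        "memory_used_mb": raw["memory_used"],
    }


# Stub the GPU call with a canned reading instead of touching hardware.
fake_query = mock.Mock(return_value={"utilization": 42, "memory_used": 1024})
metrics = collect_gpu_metrics(query=fake_query)

# The collector should have called the stub exactly once.
fake_query.assert_called_once()
print(metrics)
```

Passing the query function as a parameter keeps the stub local to the test. If the real collectors call a library directly, `mock.patch` on that call would be the closer equivalent.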
65 | 91 |
|
66 | | -## Code Style |
| 92 | + ## Communication, Reporting Issues & Security |
67 | 93 |
|
68 | | -- **Python**: Follow PEP 8. We use type hints where possible. |
69 | | -- **JavaScript**: Keep it vanilla and simple. We avoid heavy frontend frameworks to keep the project lightweight. |
| 94 | + - Use GitHub Discussions for general conversation and design proposals: <https://github.com/DataBoySu/Local-GPUMonitor/discussions/9> |
| 95 | + - Open issues for bugs, feature requests, and design discussions.
| 96 | + - For sensitive security issues, please contact the repository owner directly instead of opening a public issue. |
70 | 97 |
|
71 | | -## Community |
| 98 | + ## License |
72 | 99 |
|
73 | | -If you have questions or want to discuss ideas, please open a [Discussion](https://github.com/yourusername/cluster-health-monitor/discussions) or an Issue. |
| 100 | + This project is distributed under the MIT License. See `LICENSE` for details. |
74 | 101 |
|
75 | | -Happy coding! 🚀 |
| 102 | + With your help, I would like to keep this project useful and evolving.