| 1 | +# GitHub Actions CI/CD |
| 2 | + |
| 3 | +This directory contains GitHub Actions workflows for automated testing and deployment. |
| 4 | + |
| 5 | +## Workflows |
| 6 | + |
| 7 | +### 1. `ci-cd.yml` - Main CI/CD Pipeline |
| 8 | +**Triggers:** Push to main/develop, Pull Requests, Releases |
| 9 | + |
| 10 | +**Jobs:** |
| 11 | +- **Code Quality** - Linting, formatting checks (non-blocking) |
| 12 | +- **Unit Tests** - Python 3.8, 3.9, 3.10 compatibility |
| 13 | +- **Integration Tests** - End-to-end testing (CPU only in CI) |
| 14 | +- **Quick Functional Test** - Smoke test with dummy data |
| 15 | +- **Docker Build** - Container image verification |
| 16 | +- **Deploy** - Staging/Production deployment (simulated in CI) |
| 17 | + |
| 18 | +### 2. `tests.yml` - Comprehensive Testing |
| 19 | +**Triggers:** Push, Pull Requests |
| 20 | + |
| 21 | +**Jobs:** |
| 22 | +- Unit tests across Python versions |
| 23 | +- Integration tests (CPU-based) |
| 24 | +- Performance tests (requires GPU, self-hosted) |
| 25 | +- PyTorch compatibility tests |
| 26 | +- Docker build verification |
| 27 | + |
| 28 | +### 3. `security.yml` - Security Scanning |
| 29 | +**Triggers:** Push, Pull Requests, Daily schedule |
| 30 | + |
| 31 | +**Jobs:** |
| 32 | +- Dependency vulnerability scanning |
| 33 | +- Code security analysis (Bandit) |
| 34 | +- Secret scanning (TruffleHog) |
| 35 | +- Docker image security (Trivy) |
| 36 | +- CodeQL analysis |
| 37 | +- License compliance |
| 38 | + |
| 39 | +## CI/CD Behavior |
| 40 | + |
| 41 | +### Tests Run on CPU |
| 42 | +Most tests run on GitHub-hosted runners (Ubuntu, CPU only): |
| 43 | +- ✅ Import tests |
| 44 | +- ✅ Configuration tests |
| 45 | +- ✅ Data loading tests |
| 46 | +- ✅ Single-process unit tests |
| 47 | +- ⚠️ GPU tests are skipped (marked with `@pytest.mark.gpu`) |
| 48 | +- ⚠️ Multi-GPU tests are skipped (requires self-hosted runners) |
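
In addition to deselecting GPU tests with `pytest -m "not gpu"`, the markers can be registered and skipped automatically from a `conftest.py`. The sketch below is an assumption about how this could be wired up, not the repository's actual configuration; the `cuda_available` helper and the marker descriptions are hypothetical.

```python
# conftest.py (hypothetical sketch, not the project's actual file)
import importlib.util


def cuda_available() -> bool:
    """Return True only if PyTorch is installed and reports a usable CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return False
    import torch

    return bool(torch.cuda.is_available())


def pytest_configure(config):
    # Register the custom markers so pytest does not warn about unknown marks.
    config.addinivalue_line("markers", "gpu: test requires a CUDA-capable GPU")
    config.addinivalue_line("markers", "slow: test takes a long time to run")


def pytest_collection_modifyitems(config, items):
    # On machines with a GPU, leave the collected tests untouched.
    if cuda_available():
        return
    import pytest

    skip_gpu = pytest.mark.skip(reason="no CUDA device available")
    for item in items:
        if "gpu" in item.keywords:
            item.add_marker(skip_gpu)
```

With a hook like this, a plain `pytest` run on a CPU-only machine reports GPU tests as skipped instead of failing them.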
| 49 | + |
| 50 | +### What Gets Tested |
| 51 | +1. **Code compiles** - All imports work |
| 52 | +2. **Functionality** - Core features work on CPU |
| 53 | +3. **Data pipeline** - Synthetic data generation |
| 54 | +4. **Configuration** - Config loading and validation |
| 55 | +5. **Docker** - Image builds successfully |
| 56 | +6. **Security** - No known vulnerabilities or committed secrets detected by the scanners |
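
The "all imports work" check in item 1 can be approximated with a small import smoke test. This is a minimal sketch under stated assumptions: `MODULES_TO_CHECK` is a hypothetical placeholder list (standard-library names are used so the example runs anywhere) and should be replaced with the project's own packages.

```python
import importlib

# Hypothetical module list; replace with the project's own package names.
MODULES_TO_CHECK = ["json", "pathlib"]


def find_import_failures(module_names):
    """Return {module_name: error message} for every module that fails to import."""
    failures = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception as exc:  # ImportError, or any error raised at import time
            failures[name] = f"{type(exc).__name__}: {exc}"
    return failures


def test_all_modules_import():
    # pytest collects this function; an empty dict means every import succeeded.
    assert find_import_failures(MODULES_TO_CHECK) == {}
```

Collecting all failures before asserting means a single run reports every broken import, not only the first one.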
| 57 | + |
| 58 | +### What Doesn't Get Tested in CI |
| 59 | +- GPU-specific features (FSDP, mixed precision) |
| 60 | +- Multi-GPU distributed training |
| 61 | +- Performance benchmarks |
| 62 | +- Kubernetes deployment (no cluster in CI) |
| 63 | + |
| 64 | +## Local Testing |
| 65 | + |
| 66 | +### Run All Tests |
| 67 | +```bash |
| 68 | +# Run everything |
| 69 | +pytest |
| 70 | + |
| 71 | +# Skip GPU tests |
| 72 | +pytest -m "not gpu" |
| 73 | + |
| 74 | +# Skip slow tests |
| 75 | +pytest -m "not slow" |
| 76 | + |
| 77 | +# Specific test file |
| 78 | +pytest tests/test_distributed.py -v |
| 79 | +``` |
| 80 | + |
| 81 | +### Run CI Locally |
| 82 | +```bash |
| 83 | +# Install act (GitHub Actions locally) |
| 84 | +brew install act # or: curl https://raw.githubusercontent.com/nektos/act/master/install.sh | sudo bash |
| 85 | + |
| 86 | +# Run workflows |
| 87 | +act push # Simulate push event |
| 88 | +act pull_request # Simulate PR |
| 89 | +``` |
| 90 | + |
| 91 | +## Configuration |
| 92 | + |
| 93 | +### Required Secrets (for actual deployment) |
| 94 | +- `DOCKERHUB_USERNAME` - Docker Hub username |
| 95 | +- `DOCKERHUB_TOKEN` - Docker Hub access token |
| 96 | +- `SLACK_WEBHOOK` - Slack webhook URL (optional) |
| 97 | + |
| 98 | +### Self-Hosted Runner Setup (for GPU tests) |
| 99 | +```bash |
| 100 | +# Install runner on GPU machine |
| 101 | +# Settings → Actions → Runners → New self-hosted runner |
| 102 | + |
| 103 | +# Add the gpu label when configuring the runner |
| 104 | +./config.sh --labels gpu |
| 105 | +``` |
| 106 | + |
| 107 | +## Troubleshooting |
| 108 | + |
| 109 | +### Tests Failing in CI |
| 110 | +**Common issues:** |
| 111 | + |
| 112 | +1. **Import errors** |
| 113 | + - Ensure all dependencies are listed in `requirements.txt` |
| 114 | + - Run `pip install -e .` before tests |
| 115 | + |
| 116 | +2. **GPU tests failing** |
| 117 | + - Mark GPU tests: `@pytest.mark.gpu` |
| 118 | + - CI runs with: `pytest -m "not gpu"` |
| 119 | + |
| 120 | +3. **Timeout errors** |
| 121 | + - Increase the `timeout` setting in `pytest.ini` (requires the `pytest-timeout` plugin) |
| 122 | + - Or add `@pytest.mark.timeout(600)` to individual slow tests |
| 123 | + |
| 124 | +4. **Data not found** |
| 125 | + - Generate dummy data first: `python scripts/generate_dummy_data.py` |
| 126 | + - Or use synthetic data (automatic fallback) |
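
The "automatic fallback" to synthetic data in item 4 could look roughly like the sketch below. This illustrates the pattern only and is not the project's actual loader: the function name `load_or_synthesize`, the comma-separated file format, and the default sizes are all hypothetical.

```python
import random
from pathlib import Path


def load_or_synthesize(data_path, num_samples=8, sample_len=4, seed=0):
    """Return (rows, source), where source is "file" or "synthetic".

    Assumes a hypothetical data file with one comma-separated row of floats per line.
    """
    path = Path(data_path)
    if path.is_file():
        rows = [
            [float(value) for value in line.split(",")]
            for line in path.read_text().splitlines()
            if line.strip()
        ]
        return rows, "file"

    # Fallback: deterministic synthetic rows so tests are reproducible in CI.
    rng = random.Random(seed)
    rows = [[rng.random() for _ in range(sample_len)] for _ in range(num_samples)]
    return rows, "synthetic"
```

Returning the source alongside the data lets a test log or assert which path was taken, which helps when diagnosing "data not found" failures.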
| 127 | + |
| 128 | +### Fixing CI/CD |
| 129 | + |
| 130 | +The updated `ci-cd.yml` now: |
| 131 | +- ✅ Generates dummy data before tests |
| 132 | +- ✅ Skips GPU-only tests |
| 133 | +- ✅ Uses `continue-on-error: true` for optional checks |
| 134 | +- ✅ Runs quick functional test |
| 135 | +- ✅ Has proper timeouts |
| 136 | + |
| 137 | +## Status Badges |
| 138 | + |
| 139 | +Add to your README.md: |
| 140 | + |
| 141 | +```markdown |
| 142 | + |
| 143 | + |
| 144 | + |
| 145 | +``` |
| 146 | + |
| 147 | +## Next Steps |
| 148 | + |
| 149 | +1. ✅ Push code to GitHub |
| 150 | +2. ✅ Workflows run automatically |
| 151 | +3. ✅ Check Actions tab for results |
| 152 | +4. ⚠️ Some tests may be skipped (GPU tests) |
| 153 | +5. ✅ All CPU tests should pass |
| 154 | + |
| 155 | +The pipeline is designed to verify that the code works on CPU, even without GPU access. |