Commit 1bd487e

Merge PR #14: GitHub Actions CI
Co-authored-by: FluffyAIcode <FluffyAIcode@users.noreply.github.com>
2 parents e6e8df5 + 654a67c commit 1bd487e

2 files changed

Lines changed: 167 additions & 0 deletions

.github/workflows/ci.yaml

Lines changed: 142 additions & 0 deletions
@@ -0,0 +1,142 @@
+name: CI
+
+# Runs on every push to main and on every PR targeting main.
+#
+# Scope: platform-neutral unit tests with 100% line coverage on the
+# library modules we actually ship for this commit. We deliberately
+# DO NOT run:
+#
+#   * tests/core/: needs HuggingFace weights
+#   * tests/system/: same, plus is slow
+#   * tests/inference_engine/proposer/: uses real Qwen3 sparse
+#     proposer; HF-cache-bound
+#   * tests/backends/mlx/test_{verifier,proposer,cache,torch_bridge}.py
+#     (Apple-Silicon only)
+#
+# Mac and CUDA contributors run the full suite locally via
+# scripts/run_platform_tests.sh and push the platform-test reports to
+# the PR branch as evidence; this CI workflow guards the platform-
+# neutral surface so a regression there cannot land on main.
+
+on:
+  push:
+    branches: [main]
+  pull_request:
+    branches: [main]
+  workflow_dispatch: {}
+
+# Cancel superseded runs on the same branch; saves CI time on
+# rapid-fire pushes.
+concurrency:
+  group: ${{ github.workflow }}-${{ github.ref }}
+  cancel-in-progress: true
+
+jobs:
+  unit-tests:
+    name: unit tests + 100% coverage
+    runs-on: ubuntu-latest
+    strategy:
+      fail-fast: false
+      matrix:
+        python-version: ["3.12"]
+    steps:
+      - name: Check out
+        uses: actions/checkout@v4
+
+      - name: Set up Python ${{ matrix.python-version }}
+        uses: actions/setup-python@v5
+        with:
+          python-version: ${{ matrix.python-version }}
+          cache: pip
+
+      - name: Install dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install -r requirements.txt
+
+      - name: Show installed key versions
+        run: |
+          python -c "import torch, fastapi, pydantic, prometheus_client, transformers; \
+          print('torch', torch.__version__); \
+          print('fastapi', fastapi.__version__); \
+          print('pydantic', pydantic.VERSION); \
+          print('transformers', transformers.__version__); \
+          print('prometheus_client', __import__('importlib.metadata', fromlist=['version']).version('prometheus_client'))"
+
+      - name: Run platform-neutral test suite with 100% coverage
+        env:
+          PYTHONPATH: .
+        run: |
+          pytest \
+            tests/inference_engine/server/ \
+            tests/inference_engine/memory/ \
+            tests/inference_engine/scheduler/ \
+            tests/inference_engine/pipeline/ \
+            tests/training/repr_align/ \
+            tests/backends/mlx/test_env.py \
+            --cov=inference_engine.server \
+            --cov=inference_engine.memory \
+            --cov=inference_engine.scheduler \
+            --cov=inference_engine.pipeline \
+            --cov=training.repr_align \
+            --cov-report=term \
+            --cov-report=xml:coverage.xml \
+            --cov-fail-under=100 \
+            --junitxml=junit.xml \
+            -v
+
+      - name: Upload coverage artifact
+        if: always()
+        uses: actions/upload-artifact@v4
+        with:
+          name: coverage-py${{ matrix.python-version }}
+          path: |
+            coverage.xml
+            junit.xml
+          if-no-files-found: warn
+          retention-days: 14
+
+  package-import-smoke:
+    name: package import smoke
+    runs-on: ubuntu-latest
+    steps:
+      - name: Check out
+        uses: actions/checkout@v4
+      - name: Set up Python 3.12
+        uses: actions/setup-python@v5
+        with:
+          python-version: "3.12"
+          cache: pip
+      - name: Install dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install -r requirements.txt
+      - name: Import every shipping subpackage
+        env:
+          PYTHONPATH: .
+        run: |
+          python -c "import inference_engine; \
+          import inference_engine.server; \
+          import inference_engine.server.app; \
+          import inference_engine.server.config; \
+          import inference_engine.server.engine; \
+          import inference_engine.server.metrics; \
+          import inference_engine.server.errors; \
+          import inference_engine.server.auth; \
+          import inference_engine.server.tokenizer; \
+          import inference_engine.server.streaming; \
+          import inference_engine.server.schemas; \
+          import inference_engine.memory; \
+          import inference_engine.memory.slab; \
+          import inference_engine.memory.pool; \
+          import inference_engine.scheduler; \
+          import inference_engine.scheduler.config; \
+          import inference_engine.scheduler.scheduler; \
+          import inference_engine.scheduler.session; \
+          import inference_engine.pipeline; \
+          import inference_engine.pipeline.coordinator; \
+          import inference_engine.proposer; \
+          import inference_engine.proposer.sparse_logits; \
+          import inference_engine.backends.mlx.env; \
+          import training.repr_align; \
+          print('all imports succeeded')"
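
The `package-import-smoke` job above chains every import into one `python -c` string, so the first failing import stops the run and hides any later ones. As an illustration only, the same check could be sketched with `importlib` so that every failure is reported. The helper name `smoke_import` and the stand-in module names are assumptions made for this example; they are not part of the commit.

```python
import importlib


def smoke_import(module_names):
    """Try to import each module; return {name: "ErrorType: message"} for failures."""
    failures = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception as exc:  # any import-time error counts, not only ImportError
            failures[name] = f"{type(exc).__name__}: {exc}"
    return failures


# Stand-in names so the sketch runs anywhere. In the repository these would be
# the modules listed in the workflow step, e.g. "inference_engine.server.app".
demo_names = ["json", "os.path", "no_such_module_for_smoke_demo"]
demo_failures = smoke_import(demo_names)
for failed_name, error in demo_failures.items():
    print(f"FAILED {failed_name}: {error}")
print(f"{len(demo_names) - len(demo_failures)}/{len(demo_names)} imports succeeded")
```

A CI step built on this sketch would still need to exit non-zero when the returned dictionary is non-empty; that part is left out here so the example stays side-effect free.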

README.md

Lines changed: 25 additions & 0 deletions
@@ -1,5 +1,10 @@
# DLM Proposer + AR Verifier — runnable KV-cache-saving framework

+[![CI](https://github.com/FluffyAIcode/Kakeya-LLM-Inference-engine/actions/workflows/ci.yaml/badge.svg?branch=main)](https://github.com/FluffyAIcode/Kakeya-LLM-Inference-engine/actions/workflows/ci.yaml)
+[![Release](https://img.shields.io/badge/release-v0.1.0-blue)](https://github.com/FluffyAIcode/Kakeya-LLM-Inference-engine/releases/tag/v0.1.0)
+[![Platform](https://img.shields.io/badge/platform-Apple%20Silicon-lightgrey)](docs/local-inference-engine.md)
+[![ADRs](https://img.shields.io/badge/ADRs-0001%20%7C%200002-green)](docs/adr/)
+
 Runs the speculative-decoding architecture designed in the prior product
 discussion using **real, public** weights:

@@ -387,6 +392,26 @@ admissions, and releases all slabs before the process exits.
 Configuration is via env vars (all prefixed `KAKEYA_*`): see the
 docstring of [`inference_engine/server/config.py`](inference_engine/server/config.py).

+## Continuous integration
+
+Every push to `main` and every PR runs the platform-neutral test
+suite on GitHub Actions ([`.github/workflows/ci.yaml`](.github/workflows/ci.yaml)),
+enforcing **100% line coverage** on the shipping library modules:
+
+```
+inference_engine.server inference_engine.memory inference_engine.scheduler
+inference_engine.pipeline inference_engine.proposer training.repr_align
+```
+
+Tests that need real Qwen3 weights (`tests/core/`, `tests/system/`,
+`tests/inference_engine/proposer/`) are run locally on hosts with the
+HuggingFace cache populated; backend-specific suites
+(`tests/backends/mlx/test_{verifier,proposer,cache,torch_bridge}.py`)
+run on Apple Silicon contributors' machines via
+`scripts/run_platform_tests.sh --backend mlx`. The CI workflow
+guards the platform-neutral surface so a regression there cannot
+land on `main`.
+
 ## Architecture Decision Records

 Design decisions that the rest of the codebase depends on are recorded
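
The module list in the README's "Continuous integration" section and the `--cov` targets in `.github/workflows/ci.yaml` can be compared mechanically. The sketch below is illustrative only: both lists are copied by hand from this diff, and `compare_coverage_targets` is a hypothetical helper, not something the repository provides.

```python
import re

# pytest coverage flags as they appear in the workflow's test step (copied from the diff).
WORKFLOW_ARGS = """
--cov=inference_engine.server
--cov=inference_engine.memory
--cov=inference_engine.scheduler
--cov=inference_engine.pipeline
--cov=training.repr_align
--cov-report=term
--cov-fail-under=100
"""

# Module list from the README's "Continuous integration" section (copied from the diff).
README_MODULES = """
inference_engine.server inference_engine.memory inference_engine.scheduler
inference_engine.pipeline inference_engine.proposer training.repr_align
"""


def compare_coverage_targets(workflow_args, readme_modules):
    """Return (documented but not measured, measured but not documented), both sorted."""
    # "--cov=" must be followed directly by a dotted module path, so flags such as
    # --cov-report and --cov-fail-under are not picked up.
    measured = set(re.findall(r"--cov=([\w.]+)", workflow_args))
    documented = set(readme_modules.split())
    return sorted(documented - measured), sorted(measured - documented)


readme_only, workflow_only = compare_coverage_targets(WORKFLOW_ARGS, README_MODULES)
print("listed in README but not measured:", readme_only)
print("measured but not listed in README:", workflow_only)
```

With the lists as they appear in this commit, the only difference is `inference_engine.proposer`: the README names it, but the workflow does not pass it to `--cov`, which matches `tests/inference_engine/proposer/` being excluded from the CI run.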
