v1.0.0 — BioGraph Edge Quantizer (INT8 GNN Benchmark)
Full Changelog: https://github.com/swapins/BioGraph-Edge-Quantizer/commits/v1.0.0
BioGraph-Edge-Quantizer v1.0.0
Initial release of a resource-aware Graph Neural Network (GNN) pipeline with validated INT8 quantization for edge-constrained biological graph inference.
Scope
- GraphSAGE-based node classification on biological interaction graphs (see the model sketch after this list)
- INT8 weight quantization with controlled accuracy degradation
- CPU-only execution for edge environments
- Reproducible benchmarking with fixed evaluation protocol
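
For orientation, here is a minimal sketch of the kind of GraphSAGE classifier this scope implies, built with PyTorch Geometric. Layer sizes, depth, and the class count are illustrative assumptions, not the release's exact architecture.

```python
# Minimal GraphSAGE node classifier sketch (illustrative, not the release's exact model).
import torch
import torch.nn.functional as F
from torch_geometric.nn import SAGEConv

class SAGEClassifier(torch.nn.Module):
    def __init__(self, in_dim: int, hidden_dim: int, num_classes: int):
        super().__init__()
        self.conv1 = SAGEConv(in_dim, hidden_dim)
        self.conv2 = SAGEConv(hidden_dim, num_classes)

    def forward(self, x, edge_index):
        # Two rounds of neighborhood aggregation, then per-node class logits.
        x = F.relu(self.conv1(x, edge_index))
        return self.conv2(x, edge_index)

# Example usage on a PyG `Data` object:
# model = SAGEClassifier(data.num_node_features, 64, num_classes)
# logits = model(data.x, data.edge_index)
```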
Key Features
- FP32 → INT8 manual weight packing (see the packing sketch below this list)
- ~75% model size reduction (64 MB → 16 MB)
- Stable latency with bounded variance
- PyTorch + PyTorch Geometric pipeline
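
The packing step can be pictured as a per-tensor symmetric quantization round trip: store int8 values plus one FP32 scale, and dequantize before layers that expect float inputs. The sketch below is an assumption about how such manual packing might look; the release's actual scheme (per-channel scales, zero points, storage format) may differ.

```python
# Sketch of per-tensor symmetric FP32 -> INT8 weight packing (assumed scheme).
import torch

def pack_int8(weight: torch.Tensor):
    """Quantize an FP32 weight tensor to int8 plus a single FP32 scale."""
    scale = weight.abs().max().clamp(min=1e-8) / 127.0
    q = torch.clamp(torch.round(weight / scale), -127, 127).to(torch.int8)
    return q, scale

def unpack_int8(q: torch.Tensor, scale: torch.Tensor):
    """Dequantize back to FP32 for inference on layers that expect float input."""
    return q.to(torch.float32) * scale

# Example: pack every FP32 tensor in a trained model's state_dict.
# packed = {k: pack_int8(v) for k, v in model.state_dict().items()
#           if v.dtype == torch.float32}
```

Storing int8 values with a single float scale per tensor is what yields the roughly 4× (≈75%) size reduction reported below.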
Performance Results
| Metric | FP32 | INT8 | Observation |
|---|---|---|---|
| Model Size | 64.03 MB | 16.02 MB | ~75% reduction |
| Avg Latency | 323.36 ms | 313.64 ms | ~3% improvement |
| P95 Latency | 334.77 ms | 333.91 ms | negligible change |
| Jitter | ±13.90 ms | ±14.46 ms | bounded variance |
🎯 Accuracy Evaluation
- Dataset: STRING v12.0-derived graph
- Nodes: ~10,000
- Edges: ~50,000
- Split: held-out evaluation
| Model | Accuracy | Precision | Recall | Δ |
|---|---|---|---|---|
| FP32 | 91.8% | 90.5% | 92.3% | — |
| INT8 | 90.9% | 89.7% | 91.5% | -0.9% |
Observation: INT8 quantization preserves predictive behavior with <1% accuracy degradation.
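
For context, a minimal sketch of how such held-out metrics could be computed from a PyTorch Geometric `Data` object with scikit-learn is shown below; the mask name, macro averaging, and helper function are assumptions, not the release's evaluation code.

```python
# Sketch of held-out evaluation (assumed helper, not the release's exact protocol).
import torch
from sklearn.metrics import accuracy_score, precision_score, recall_score

@torch.no_grad()
def evaluate(model, data, mask):
    """Compute accuracy/precision/recall on the nodes selected by `mask`."""
    model.eval()
    preds = model(data.x, data.edge_index).argmax(dim=-1)
    y_true = data.y[mask].cpu().numpy()
    y_pred = preds[mask].cpu().numpy()
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision_score(y_true, y_pred, average="macro"),
        "recall": recall_score(y_true, y_pred, average="macro"),
    }
```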
Key Insight
Quantization reduces memory footprint significantly, but:
- does not substantially improve latency (compute is graph-aggregation bound)
- improves stability in memory-constrained environments (especially on ARM)
⚠️ Limitations
- Latency gains are minimal on x86 systems
- Subprocess-based execution introduces overhead
- Not optimized for high-throughput workloads
- No distributed or multi-node support
Reproducibility
- Random seed: 42
- Runs per benchmark: 100
- Execution: CPU-only, single-thread
- Hardware: Intel i5-10210U, 8GB RAM
All results are reproducible under identical conditions.
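
A minimal sketch of this benchmarking protocol (seed 42, single CPU thread, 100 timed runs summarized as mean, P95, and jitter) is shown below; `run_inference` is a placeholder for one forward pass of the model over the graph.

```python
# Sketch of the fixed benchmarking protocol described above.
import statistics
import time

import torch

torch.manual_seed(42)        # fixed random seed
torch.set_num_threads(1)     # single-thread, CPU-only execution

def benchmark(run_inference, runs: int = 100):
    """Time `runs` forward passes and summarize latency in milliseconds."""
    latencies_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        run_inference()  # placeholder: one forward pass over the graph
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    latencies_ms.sort()
    return {
        "avg_ms": statistics.mean(latencies_ms),
        "p95_ms": latencies_ms[int(0.95 * runs) - 1],
        "jitter_ms": statistics.stdev(latencies_ms),
    }
```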
Edge Validation (ARM)
Tested on Raspberry Pi 4:
- INT8 shows improved performance due to reduced memory pressure
- Confirms edge-device relevance of quantization
Notes
This release represents a benchmark-focused research prototype for:
- model compression
- edge inference behavior
- reproducible GNN evaluation
Next Steps
- Persistent inference service (gRPC / FastAPI)
- ONNX INT8 deployment
- Sparse GNN optimization
- Hardware-aware scheduling