v1.0.0 — BioGraph Edge Quantizer (INT8 GNN Benchmark)

Full Changelog: https://github.com/swapins/BioGraph-Edge-Quantizer/commits/v1.0.0

Initial release of a resource-aware Graph Neural Network (GNN) pipeline with validated INT8 quantization for edge-constrained biological graph inference.


Scope

  • GraphSAGE-based node classification on biological interaction graphs (see the model sketch after this list)
  • INT8 weight quantization with controlled accuracy degradation
  • CPU-only execution for edge environments
  • Reproducible benchmarking with fixed evaluation protocol
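
The release pins GraphSAGE but not an exact architecture; the following is a minimal PyTorch Geometric sketch of the kind of model benchmarked here, where layer widths, depth, and dropout are illustrative assumptions rather than the shipped configuration:

```python
# Minimal GraphSAGE node classifier (PyTorch Geometric).
# Hidden width, depth, and dropout are illustrative assumptions.
import torch
import torch.nn.functional as F
from torch_geometric.nn import SAGEConv


class SAGENodeClassifier(torch.nn.Module):
    def __init__(self, in_dim: int, hidden_dim: int, num_classes: int):
        super().__init__()
        self.conv1 = SAGEConv(in_dim, hidden_dim)      # mean aggregation by default
        self.conv2 = SAGEConv(hidden_dim, num_classes)

    def forward(self, x, edge_index):
        x = F.relu(self.conv1(x, edge_index))
        x = F.dropout(x, p=0.5, training=self.training)
        return self.conv2(x, edge_index)               # per-node class logits
```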

Key Features

  • FP32 → INT8 manual weight packing (see the packing sketch after this list)
  • ~75% model size reduction (64 MB → 16 MB)
  • Stable latency with bounded variance
  • PyTorch + PyTorch Geometric pipeline
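
Manual packing here means storing each weight tensor as int8 plus a single FP32 scale, rather than using a framework quantizer. A minimal sketch of symmetric per-tensor packing (the helper names pack_int8 / unpack_int8 are illustrative, not the release's API):

```python
# Symmetric per-tensor INT8 weight packing: store int8 values plus one
# FP32 scale per tensor (~4x smaller than FP32 storage). Helper names
# are illustrative, not the release's actual API.
import torch


def pack_int8(weight: torch.Tensor):
    scale = weight.abs().max() / 127.0                              # map max |w| to 127
    q = torch.clamp(torch.round(weight / scale), -127, 127).to(torch.int8)
    return q, scale


def unpack_int8(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    # Dequantize back to FP32 at load/inference time.
    return q.to(torch.float32) * scale
```

Because compute still runs in FP32 after unpacking, this design shrinks storage by ~75% without materially changing per-inference arithmetic, consistent with the latency results below.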

Performance Results

| Metric      | FP32      | INT8      | Observation        |
|-------------|-----------|-----------|--------------------|
| Model Size  | 64.03 MB  | 16.02 MB  | ~75% reduction     |
| Avg Latency | 323.36 ms | 313.64 ms | ~3% improvement    |
| P95 Latency | 334.77 ms | 333.91 ms | negligible change  |
| Jitter      | ±13.90 ms | ±14.46 ms | bounded variance   |

🎯 Accuracy Evaluation

  • Dataset: STRING v12.0-derived graph
  • Nodes: ~10,000
  • Edges: ~50,000
  • Split: held-out evaluation

| Model | Accuracy | Precision | Recall | Δ Accuracy |
|-------|----------|-----------|--------|------------|
| FP32  | 91.8%    | 90.5%     | 92.3%  | baseline   |
| INT8  | 90.9%    | 89.7%     | 91.5%  | -0.9%      |

Observation:
INT8 quantization preserves predictive behavior, with under 1% absolute accuracy degradation on the held-out split.
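
A minimal sketch of how these metrics could be computed, assuming a PyG-style Data object with a boolean eval_mask and macro-averaged precision/recall (both the mask name and the averaging scheme are assumptions):

```python
# Held-out evaluation sketch. `eval_mask` and macro averaging are
# assumptions; the release does not document the exact split fields.
import torch
from sklearn.metrics import accuracy_score, precision_score, recall_score


@torch.no_grad()
def evaluate(model, data, eval_mask):
    model.eval()
    logits = model(data.x, data.edge_index)
    preds = logits.argmax(dim=1)[eval_mask].cpu()
    labels = data.y[eval_mask].cpu()
    return {
        "accuracy": accuracy_score(labels, preds),
        "precision": precision_score(labels, preds, average="macro"),
        "recall": recall_score(labels, preds, average="macro"),
    }
```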


Key Insight

Quantization reduces memory footprint significantly, but:

  • does not substantially improve latency (compute is graph-aggregation bound)
  • improves stability in memory-constrained environments (especially on ARM)

⚠️ Limitations

  • Latency gains are minimal on x86 systems
  • Subprocess-based execution introduces overhead
  • Not optimized for high-throughput workloads
  • No distributed or multi-node support

Reproducibility

  • Random seed: 42
  • Runs per benchmark: 100
  • Execution: CPU-only, single-thread (timing sketch below)
  • Hardware: Intel i5-10210U, 8 GB RAM
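
A minimal sketch of the timing protocol under these settings; the metric names mirror the table above, and the harness itself is an illustration rather than the release's actual benchmark script:

```python
# Benchmark harness sketch: fixed seed, single CPU thread, 100 timed runs.
# Illustrative only; not the release's benchmark script.
import statistics
import time

import torch

torch.manual_seed(42)     # fixed seed from the protocol
torch.set_num_threads(1)  # single-thread CPU execution


def benchmark(model, data, runs: int = 100):
    model.eval()
    latencies_ms = []
    with torch.no_grad():
        for _ in range(runs):
            start = time.perf_counter()
            model(data.x, data.edge_index)
            latencies_ms.append((time.perf_counter() - start) * 1000.0)
    latencies_ms.sort()
    return {
        "avg_ms": statistics.mean(latencies_ms),
        "p95_ms": latencies_ms[int(0.95 * runs) - 1],  # 95th percentile
        "jitter_ms": statistics.stdev(latencies_ms),   # reported as ± variance
    }
```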

All results are reproducible under identical conditions.


Edge Validation (ARM)

Tested on Raspberry Pi 4:

  • INT8 shows improved performance due to reduced memory pressure
  • Confirms edge-device relevance of quantization

Notes

This release represents a benchmark-focused research prototype for:

  • model compression
  • edge inference behavior
  • reproducible GNN evaluation

Next Steps

  • Persistent inference service (gRPC / FastAPI)
  • ONNX INT8 deployment
  • Sparse GNN optimization
  • Hardware-aware scheduling