Commit a44751f

danmcleran and claude committed
README: cover Phase 8 quantization additions
- qactivations.hpp now hosts 256-entry int8 sigmoid/tanh LUT builders; surface them in the quantization feature bullets and project tree
- examples/pytorch_quant/xor/ added as second end-to-end quantization example; list it alongside kws_cortex_m_int8 in the feature bullets, Build Examples block, and project tree
- Note pytorch/ as the Q-format pipeline counterpart so the two PyTorch-to-TinyMind paths read clearly side by side

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent c5fc543 commit a44751f

1 file changed

Lines changed: 9 additions & 3 deletions

README.md
@@ -67,9 +67,12 @@ A parallel TFLite/CMSIS-NN style affine quantization path that runs **alongside*
 
 - **`QConv2D`**, **`QDepthwiseConv2D`** (per-channel weight scale, TFLite mandate), **`QPointwiseConv2D`**, **`QMaxPool2D`**, **`QAvgPool2D`**, **`QGlobalAvgPool2D`**, **`QDense`** -- int8 weights/activations, int32 accumulators, integer requantization between layers via gemmlowp-style `Requantizer<int32, int8>` (Q0.31 multiplier + shift)
 - **`qrelu` / `qrelu6`** activations plus `clampForRelu` / `clampForRelu6` helpers that fold the activation into the upstream Requantizer's saturation pass for runtime efficiency
+- **256-entry int8 sigmoid / tanh lookup tables** built host-side via `buildQSigmoidLUT` / `buildQTanhLUT` and applied at runtime via `qApplyLUT` / `qApplyLUTBuffer` -- single load per element, no `<cmath>` on the inference path
 - **Host-side calibration** in `cpp/include/qcalibration.hpp`: `RangeObserver`, `computeAffineParamsAsymmetric` / `Symmetric`, `computePerChannelSymmetricScales`, `quantizeBuffer`, `buildRequantizer` -- gated on `FLOAT && STD` so the deployable inference binary never pulls in float math
 - **Pure integer at runtime**: the inference path compiles freestanding (`FLOAT=0 STD=0 QUANT=1`); `unit_test/embedded` exercises this corner as `quant_freestanding` and the `unit_test/quantization` Boost.Test suite covers the math
-- **End-to-end example**: [`examples/kws_cortex_m_int8/`](examples/kws_cortex_m_int8/) is a side-by-side counterpart to `examples/kws_cortex_m/` -- same KWS pipeline, comparable CSV cycle/byte report, ~10x smaller weight footprint on the convolutional layers
+- **End-to-end examples**:
+  - [`examples/pytorch_quant/xor/`](examples/pytorch_quant/xor/) -- PyTorch float training + per-tensor calibration + `weights.hpp` emission, then a pure-integer C++ forward pass through `QDense` + `qrelu` + `QDense` + int8 sigmoid LUT
+  - [`examples/kws_cortex_m_int8/`](examples/kws_cortex_m_int8/) -- side-by-side counterpart to `examples/kws_cortex_m/`; same MobileNet-style KWS pipeline, comparable CSV cycle/byte report, ~4x smaller weight footprint on the convolutional layers
 
 ### Activation Functions
 
@@ -689,6 +692,8 @@ cd examples/lstm_sinusoid && make clean && make
 cd examples/maze && make clean && make
 cd examples/dqn_maze && make clean && make
 cd examples/kws_cortex_m && make clean && make
+cd examples/kws_cortex_m_int8 && make clean && make
+cd examples/pytorch_quant/xor && make clean && make
 cd examples/predictive_maintenance && make clean && make
 ```
 
@@ -738,7 +743,7 @@ tinymind/
 qpointwiseconv2d.hpp # Quantized 1x1 pointwise conv
 qpool2d.hpp # QMaxPool2D, QAvgPool2D, QGlobalAvgPool2D
 qdense.hpp # Quantized fully-connected layer
-qactivations.hpp # Quantized ReLU / ReLU6 + fused-clamp helpers
+qactivations.hpp # Quantized ReLU / ReLU6 + fused-clamp helpers + int8 sigmoid/tanh LUTs
 activationFunctions.hpp # Activation function policies (9 functions)
 fixedPointTransferFunctions.hpp
 adam.hpp # Adam optimizer policy
@@ -773,7 +778,8 @@ tinymind/
 kws_cortex_m/ # Depthwise-separable CNN pipeline with bench harness
 kws_cortex_m_int8/ # int8 quantized counterpart of kws_cortex_m (parallel Q* layers)
 predictive_maintenance/ # Binary classifier on AI4I 2020 dataset (Q16.16 MLP)
-pytorch/ # PyTorch weight import (MLP + GRU export)
+pytorch/ # PyTorch weight import (MLP + GRU export, Q-format pipeline)
+pytorch_quant/xor/ # PyTorch -> int8 affine quantization end-to-end (XOR)
 unit_test/
 nn/ # Neural network tests (171 test cases)
 kan/ # KAN tests (16 test cases)
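
The 256-entry int8 sigmoid lookup table that this commit documents can be illustrated with a minimal C++ sketch. The names `AffineParams`, `buildSigmoidLutSketch`, and `applyLutSketch` are hypothetical; they are not TinyMind's `buildQSigmoidLUT` / `qApplyLUT` signatures, which this diff does not show. The affine convention `real = scale * (q - zeroPoint)` is also an assumption.

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Hypothetical affine quantization parameters (assumption, not TinyMind's type):
// real = scale * (q - zeroPoint)
struct AffineParams {
    float scale;
    std::int32_t zeroPoint;
};

// Host-side builder: for each of the 256 possible int8 input codes,
// dequantize, evaluate sigmoid in float, and requantize to an int8 output code.
std::array<std::int8_t, 256> buildSigmoidLutSketch(AffineParams in, AffineParams out)
{
    std::array<std::int8_t, 256> lut{};
    for (int q = -128; q <= 127; ++q) {
        const float x = in.scale * static_cast<float>(q - in.zeroPoint);
        const float y = 1.0f / (1.0f + std::exp(-x));
        const long r = std::lround(y / out.scale) + out.zeroPoint;
        const long clamped = std::min(127L, std::max(-128L, r)); // saturate to int8
        lut[static_cast<std::size_t>(q + 128)] = static_cast<std::int8_t>(clamped);
    }
    return lut;
}

// Runtime path: a single table load per element, no floating point.
inline std::int8_t applyLutSketch(const std::array<std::int8_t, 256>& lut, std::int8_t q)
{
    return lut[static_cast<std::size_t>(static_cast<int>(q) + 128)];
}
```

With an input scale of 1/16 and zero point 0, and an output scale of 1/256 with zero point -128 (values chosen for illustration), input code 0 maps to output code 0, which represents sigmoid(0) = 0.5. Only the builder needs `<cmath>`; the lookup itself is integer-only, which matches the split between host-side construction and runtime application described in the README bullet.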
