| 1 | +# DeepQuant |
| 2 | + |
| 3 | +A Python library for exporting Brevitas quantized neural networks. |
| 4 | + |
| 5 | +## Installation |
| 6 | + |
| 7 | +### Requirements |
| 8 | + |
| 9 | +- Python 3.11 or higher |
| 10 | +- PyTorch 2.1.2 or higher |
| 11 | +- Brevitas 0.11.0 or higher |
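The requirements above can be checked before installing the package. The following is a minimal sketch using only the standard library; it assumes the distributions are published under the names `torch` and `brevitas`, and the helper names `parse_version` and `check_requirements` are hypothetical, not part of DeepQuant. It compares only the leading numeric part of each version, so pre-release suffixes are ignored.

```python
import re
import sys
from importlib import metadata

# Minimum versions taken from the requirements list.
# Assumption: the distributions are named "torch" and "brevitas".
MINIMUMS = {"torch": (2, 1, 2), "brevitas": (0, 11, 0)}
MIN_PYTHON = (3, 11)


def parse_version(text):
    """Return the leading release numbers, padded to three parts.

    Local or pre-release suffixes (for example "+cu118") are dropped.
    """
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        return (0, 0, 0)
    parts = tuple(int(part) for part in match.group().split("."))
    return parts + (0,) * max(0, 3 - len(parts))


def check_requirements():
    """Map each requirement to 'ok', 'too old' or 'missing'."""
    python_ok = sys.version_info[:2] >= MIN_PYTHON
    report = {"python": "ok" if python_ok else "too old"}
    for name, minimum in MINIMUMS.items():
        try:
            installed = parse_version(metadata.version(name))
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        report[name] = "ok" if installed >= minimum else "too old"
    return report


if __name__ == "__main__":
    for name, status in check_requirements().items():
        print(f"{name}: {status}")
```

Running the script prints one line per requirement; anything other than `ok` means the corresponding install step below still needs attention.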
| 12 | + |
| 13 | +### Setup Environment |
| 14 | + |
| 15 | +First, create and activate a new conda environment (the commands below use `mamba`; `conda` accepts the same arguments): |
| 16 | + |
| 17 | +```bash |
| 18 | +mamba create -n brevitas_env python=3.11 |
| 19 | +mamba activate brevitas_env |
| 20 | +``` |
| 21 | + |
| 22 | +### Install Dependencies |
| 23 | + |
| 24 | +Install PyTorch and its related packages: |
| 25 | + |
| 26 | +```bash |
| 27 | +mamba install pytorch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 -c pytorch |
| 28 | +``` |
| 29 | + |
| 30 | +### Install the Package |
| 31 | + |
| 32 | +Clone the repository, then install the package in editable (development) mode from the repository root: |
| 33 | + |
| 34 | +```bash |
| 35 | +cd DeepQuant  # root of your local clone |
| 36 | +pip install -e . |
| 37 | +``` |
| 38 | + |
| 39 | +## Running Tests |
| 40 | + |
| 41 | +### Using Make (Recommended) |
| 42 | + |
| 43 | +The project includes a Makefile with several testing commands: |
| 44 | + |
| 45 | +```bash |
| 46 | +# Run all tests with verbose output |
| 47 | +make test |
| 48 | + |
| 49 | +# Run only the neural network test |
| 50 | +make test-nn |
| 51 | + |
| 52 | +# Run only the multi-head attention test |
| 53 | +make test-mha |
| 54 | + |
| 55 | +# Run only the CNN test |
| 56 | +make test-cnn |
| 57 | + |
| 58 | +# Run only the ResNet18 test |
| 59 | +make test-resnet |
| 60 | + |
| 61 | +# Run a specific test file |
| 62 | +make test-single TEST=test_simple_nn.py |
| 63 | + |
| 64 | +# Show all available make commands |
| 65 | +make help |
| 66 | +``` |
| 67 | + |
| 68 | +### Using pytest directly |
| 69 | + |
| 70 | +You can also invoke pytest directly (`-v` lists each test as it runs; `-s` disables output capture so printed output is shown): |
| 71 | + |
| 72 | +```bash |
| 73 | +# Run all tests |
| 74 | +python -m pytest src/DeepQuant/tests -v -s |
| 75 | + |
| 76 | +# Run a specific test file |
| 77 | +python -m pytest src/DeepQuant/tests/test_simple_nn.py -v -s |
| 78 | +``` |
| 79 | + |
| 80 | +## Project Structure |
| 81 | + |
| 82 | +``` |
| 83 | +DeepQuant/ |
| 84 | +├── Makefile |
| 85 | +├── pyproject.toml |
| 86 | +├── conftest.py |
| 87 | +└── src/ |
| 88 | +    └── DeepQuant/ |
| 89 | +        ├── custom_forwards/ |
| 90 | +        │   ├── activations.py |
| 91 | +        │   ├── linear.py |
| 92 | +        │   └── multiheadattention.py |
| 93 | +        ├── injects/ |
| 94 | +        │   ├── base.py |
| 95 | +        │   ├── executor.py |
| 96 | +        │   └── transformations.py |
| 97 | +        ├── tests/ |
| 98 | +        │   ├── test_simple_mha.py |
| 99 | +        │   ├── test_simple_nn.py |
| 100 | +        │   └── test_simple_cnn.py |
| 101 | +        ├── custom_tracer.py |
| 102 | +        └── export_brevitas.py |
| 103 | +``` |
| 104 | + |
| 105 | +### Key Components |
| 106 | + |
| 107 | +- **Makefile**: Provides automation commands for testing |
| 108 | +- **pyproject.toml**: Defines project metadata and dependencies for editable installation |
| 109 | +- **conftest.py**: Pytest configuration file that handles warning filters |
| 110 | + |
| 111 | +The source code is organized into several key modules: |
| 112 | + |
| 113 | +- **custom_forwards/**: Contains the unrolled forward implementations for: |
| 114 | + |
| 115 | +  - Linear layers (QuantLinear, QuantConv2d) |
| 116 | +  - Activation functions (QuantReLU, QuantSigmoid, etc.) |
| 117 | +  - Multi-head attention (QuantMultiheadAttention) |
| 118 | + |
| 119 | +- **injects/**: Contains the transformation infrastructure: |
| 120 | + |
| 121 | +  - Base transformation class and executor |
| 122 | +  - Module-specific transformations |
| 123 | +  - Validation and verification logic |
| 124 | + |
| 125 | +- **tests/**: Example tests demonstrating the exporter usage: |
| 126 | + |
| 127 | +  - Simple neural network (linear + activations) |
| 128 | +  - Multi-head attention model |
| 129 | +  - Convolutional neural network |
| 130 | +  - ResNet18 |
| 131 | + |
| 132 | +- **custom_tracer.py**: Implements a specialized `CustomBrevitasTracer` for FX tracing: |
| 133 | + |
| 134 | +  - Handles Brevitas-specific module traversal |
| 135 | +  - Ensures proper graph capture of quantization operations |
| 136 | + |
| 137 | +- **export_brevitas.py**: Main API for end-to-end model export: |
| 138 | +  - Orchestrates the transformation passes |
| 139 | +  - Performs the final FX tracing |
| 140 | +  - Validates model outputs throughout the process |
| 141 | + |
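The "apply a transformation, then verify the outputs" pattern described for `injects/` and `export_brevitas.py` can be sketched in plain Python. This is an illustration under stated assumptions, not DeepQuant's actual code: models are reduced to functions over lists of floats, and every name here (`run_passes`, `outputs_match`, `rewrap_pass`, `broken_pass`, `toy_model`) is hypothetical.

```python
import math


def outputs_match(a, b, tol=1e-6):
    """True when both outputs have the same length and agree element-wise."""
    return len(a) == len(b) and all(
        math.isclose(x, y, abs_tol=tol) for x, y in zip(a, b)
    )


def run_passes(model, passes, example_input, debug=False):
    """Apply each (name, transform) pair in order.

    After every pass the rewritten model is run on example_input and
    compared with the output of the original model.
    """
    reference = model(example_input)
    for name, transform in passes:
        candidate = transform(model)
        if not outputs_match(reference, candidate(example_input)):
            raise RuntimeError(f"{name} transformation changed the model output")
        if debug:
            print(f"{name} transformation successful - outputs match")
        model = candidate
    return model


def toy_model(xs):
    """Stand-in for a real network: scale, then shift."""
    return [2.0 * x + 1.0 for x in xs]


def rewrap_pass(model):
    """Hypothetical pass that rewrites the model without changing its result."""
    return lambda xs: list(model(xs))


def broken_pass(model):
    """Hypothetical faulty pass, used to show the check failing."""
    return lambda xs: [y + 0.5 for y in model(xs)]


exported = run_passes(toy_model, [("Rewrap", rewrap_pass)], [1.0, 2.0], debug=True)
```

Passing `broken_pass` instead raises a `RuntimeError`, which is the point of validating after each step: a faulty rewrite is caught at the pass that introduced it rather than at the end of the export.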
| 142 | +## Usage |
| 143 | + |
| 144 | +### Main Function: exportBrevitas |
| 145 | + |
| 146 | +The main function of this library is `exportBrevitas`, which exports a Brevitas-based model to an FX GraphModule with unrolled quantization steps. |
| 147 | + |
| 148 | +```python |
| 149 | +import torch |
| 150 | +from DeepQuant.export_brevitas import exportBrevitas |
| 151 | + |
| 152 | +model = YourBrevitasModel().eval()  # placeholder: your Brevitas model, in eval mode |
| 153 | + |
| 154 | +# Create an example input with the shape your model expects (placeholder dimensions) |
| 155 | +example_input = torch.randn(1, input_channels, height, width) |
| 156 | + |
| 157 | +# Export the model (with debug information) |
| 158 | +fx_model = exportBrevitas(model, example_input, debug=True) |
| 159 | +``` |
| 160 | + |
| 161 | +Arguments: |
| 162 | + |
| 163 | +- `model`: The Brevitas-based model to export |
| 164 | +- `example_input`: A representative input tensor for shape tracing |
| 165 | +- `debug`: If True, prints transformation progress (default: False) |
| 166 | + |
| 167 | +With `debug=True`, the exporter prints the progress of each transformation, for example: |
| 168 | + |
| 169 | +``` |
| 170 | +✓ MHA transformation successful - outputs match |
| 171 | +✓ Linear transformation successful - outputs match |
| 172 | +✓ Activation transformation successful - outputs match |
| 173 | +All transformations completed successfully! |
| 174 | +``` |
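For readers unfamiliar with the term, an explicit quantization step usually means a quantize operation followed by a dequantize operation. The sketch below shows the standard affine integer scheme in plain Python as a general illustration; it is not taken from DeepQuant, and the graph that `exportBrevitas` produces may differ. The function names and default parameters are assumptions chosen for the example.

```python
def quantize(x, scale, zero_point=0, bit_width=8, signed=True):
    """Map a float to an integer code: clamp(round(x / scale) + zero_point)."""
    if signed:
        qmin, qmax = -(2 ** (bit_width - 1)), 2 ** (bit_width - 1) - 1
    else:
        qmin, qmax = 0, 2 ** bit_width - 1
    # Python's round() rounds halves to the nearest even integer.
    q = round(x / scale) + zero_point
    return max(qmin, min(qmax, q))


def dequantize(q, scale, zero_point=0):
    """Map an integer code back to an approximate float value."""
    return (q - zero_point) * scale


def fake_quantize(x, scale, zero_point=0, bit_width=8, signed=True):
    """Quantize then dequantize, so the result is a float on the integer grid."""
    q = quantize(x, scale, zero_point, bit_width, signed)
    return dequantize(q, scale, zero_point)
```

Values outside the representable range saturate: with `scale=0.1` and 8 signed bits, any input above 12.7 maps to the code 127.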
| 175 | + |
| 176 | +### Example Usage |
| 177 | + |
| 178 | +A simple example script can be found in `example_usage.py` in the root directory of the project. |
| 179 | + |
| 180 | +```python |
| 181 | +import torch |
| 182 | +import torch.nn as nn |
| 183 | +import brevitas.nn as qnn |
| 184 | +from brevitas.quant.scaled_int import Int8ActPerTensorFloat, Int32Bias |
| 185 | +from DeepQuant.export_brevitas import exportBrevitas |
| 186 | + |
| 187 | +# Define a simple quantized model |
| 188 | +class SimpleQuantModel(nn.Module): |
| 189 | +    def __init__(self): |
| 190 | +        super().__init__() |
| 191 | +        self.input_quant = qnn.QuantIdentity(return_quant_tensor=True) |
| 192 | +        self.conv = qnn.QuantConv2d( |
| 193 | +            in_channels=3, |
| 194 | +            out_channels=16, |
| 195 | +            kernel_size=3, |
| 196 | +            bias=True, |
| 197 | +            weight_bit_width=4, |
| 198 | +            bias_quant=Int32Bias, |
| 199 | +            output_quant=Int8ActPerTensorFloat, |
| 200 | +        ) |
| 201 | + |
| 202 | +    def forward(self, x): |
| 203 | +        x = self.input_quant(x) |
| 204 | +        x = self.conv(x) |
| 205 | +        return x |
| 206 | + |
| 207 | +# Export the model |
| 208 | +model = SimpleQuantModel().eval() |
| 209 | +dummy_input = torch.randn(1, 3, 32, 32) |
| 210 | +fx_model = exportBrevitas(model, dummy_input, debug=True) |
| 211 | +``` |