From dcb2adfcf39d66bd19b6ffd0b16c93d9e1d0b5cd Mon Sep 17 00:00:00 2001
From: Minh Nguyen <20390942+minhnd3796@users.noreply.github.com>
Date: Tue, 19 Jun 2018 14:43:15 +0700
Subject: [PATCH 1/2] Update Quant_guide.md

---
 Deployment/Quant_guide.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Deployment/Quant_guide.md b/Deployment/Quant_guide.md
index e76a7566..25df2bcb 100644
--- a/Deployment/Quant_guide.md
+++ b/Deployment/Quant_guide.md
@@ -1,4 +1,4 @@
-Deep learning models are typically trained with floating point data but they can quantized into integers during inference without any loss of performance (i.e. accuracy). Quantizing models includes quantizing both the weights and activation data (or layer input/outputs). In this work, we quantize the floating point weights/activation data to [Qm.n format](https://en.wikipedia.org/wiki/Q_(number_format)), where m,n are fixed within a layer but can vary across different network layers.
+Deep learning models are typically trained with floating point data but they can b quantized into integers during inference without any loss of performance (i.e. accuracy). Quantizing models includes quantizing both the weights and activation data (or layer input/outputs). In this work, we quantize the floating point weights/activation data to [Qm.n format](https://en.wikipedia.org/wiki/Q_(number_format)), where m,n are fixed within a layer but can vary across different network layers.
 
 ## Quantize weights
 Quantizing weights is fairly simple, as the weights are fixed after the training and we know their min/max range. Using these ranges, the weights are quantized or discretized to 256 levels. Here is the code snippet for quantizing the weights and biases to 8-bit integers.
From b6e199cc8c57f61348c218cac2f49eaeaa7cc207 Mon Sep 17 00:00:00 2001
From: Minh Nguyen <20390942+minhnd3796@users.noreply.github.com>
Date: Tue, 19 Jun 2018 14:46:40 +0700
Subject: [PATCH 2/2] Update Quant_guide.md

---
 Deployment/Quant_guide.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Deployment/Quant_guide.md b/Deployment/Quant_guide.md
index 25df2bcb..7442e724 100644
--- a/Deployment/Quant_guide.md
+++ b/Deployment/Quant_guide.md
@@ -1,4 +1,4 @@
-Deep learning models are typically trained with floating point data but they can b quantized into integers during inference without any loss of performance (i.e. accuracy). Quantizing models includes quantizing both the weights and activation data (or layer input/outputs). In this work, we quantize the floating point weights/activation data to [Qm.n format](https://en.wikipedia.org/wiki/Q_(number_format)), where m,n are fixed within a layer but can vary across different network layers.
+Deep learning models are typically trained with floating point data but they can be quantized into integers during inference without any loss of performance (i.e. accuracy). Quantizing models includes quantizing both the weights and activation data (or layer input/outputs). In this work, we quantize the floating point weights/activation data to [Qm.n format](https://en.wikipedia.org/wiki/Q_(number_format)), where m,n are fixed within a layer but can vary across different network layers.
 
 ## Quantize weights
 Quantizing weights is fairly simple, as the weights are fixed after the training and we know their min/max range. Using these ranges, the weights are quantized or discretized to 256 levels. Here is the code snippet for quantizing the weights and biases to 8-bit integers.
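The diff context above mentions a code snippet for 8-bit weight quantization but cuts off before showing it. As a hedged illustration (not the guide's actual snippet), min/max-based quantization to signed 8-bit Qm.n fixed point, where `m` is chosen from the weights' range and `n = 7 - m` fractional bits give 256 levels, might be sketched like this; the function name `quantize_q7` and its interface are assumptions for this example:

```python
import numpy as np

def quantize_q7(weights):
    """Quantize float weights to signed 8-bit Qm.n fixed point.

    m (integer bits) is the smallest count that covers the weights'
    min/max range; n = 7 - m fractional bits, so the 8-bit values
    span 256 discrete levels. Returns (int8 weights, n).
    """
    max_abs = np.max(np.abs(weights))
    # Smallest m such that all weights fit in [-2^m, 2^m)
    m = max(0, int(np.ceil(np.log2(max_abs)))) if max_abs > 0 else 0
    n = 7 - m  # remaining fractional bits in an 8-bit signed value
    scaled = np.round(weights * (2.0 ** n))
    return np.clip(scaled, -128, 127).astype(np.int8), n
```

For example, weights in (-1, 1) yield Q0.7 (`n = 7`), so 0.5 maps to 64 and -0.25 to -32; biases can be quantized the same way with their own range. The returned `n` must be tracked per layer, since (as the guide's intro notes) m and n are fixed within a layer but vary across layers.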