Commit 0ab266e: README updates

Parent: ef366aa
2 files changed: 12 additions & 7 deletions

meta-llama-Llama-3.1-8B-Instruct/QAIRT/README.md (9 additions, 7 deletions)

@@ -1,10 +1,10 @@
-# Llama3.1-8B Model Optimization
+# Llama3.1-8B-Instruct Model Optimization
 
-This directory demonstrates the optimization of the [Llama3.1-8B](https://huggingface.co/meta-llama/Llama-3.1-8B) model using various AIMET quantization techniques.
+This directory demonstrates the optimization of the [Llama3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B) model using various AIMET quantization techniques.
 
 ## Overview
 
-This workflow utilizes a Llama3.1-8B script to perform quantization based on the [Qualcomm-distributed Jupyter notebook](https://qpm.qualcomm.com/#/main/tools/details/Tutorial_for_Llama3_1_Compute) for Llama3.1-8B (v1.0.1.260219) which is available for download via QPM.
+This workflow utilizes a Llama3.1-8B-Instruct script to perform quantization based on the [Qualcomm-distributed Jupyter notebook](https://qpm.qualcomm.com/#/main/tools/details/Tutorial_for_Llama3_1_Compute) for Llama3.1-8B-Instruct (v1.0.1.260219) which is available for download via QPM.
 
 After quantization, the QAIRT GenAIBuilder API is utilized to apply additional model transformations, perform conversion, and compile the model for execution on the HTP backend.
 
@@ -13,19 +13,21 @@ Finally, a prepared QAIRT DLC is encapsulated in an ONNX protobuf and exported t
 ## Requirements
 
 This workflow has been tested using the following host configuration:
-* Python 3.10
-* QAIRT 2.45.40
+* Python 3.10
+* qairt-dev 0.5.0
+* QAIRT 2.45.40
 
 Further, this workflow has been tested on the following target configurations:
-* HTP backend on SC8380XP
-* HTP backend on SC8480XP
+* HTP backend on SC8480XP
 
 ## Preparation Instructions
 
 1. Install olive[qairt]
 
 ```bash
 pip install olive[qairt]
+pip list | grep qairt-dev # Ensure the proper qairt-dev version was installed
+pip install qairt-dev[onnx]==<version> # Install the proper qairt-dev version, if not installed
 ```
 
 2. (Optional) Use qairt-vm to install a non-default version of QAIRT and set QAIRT_SDK_ROOT
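The added `pip list | grep qairt-dev` step checks the installed version by eye; the same check can be done programmatically. Below is a minimal sketch, assuming only the Python standard library (`importlib.metadata`, available since Python 3.8); `check_version` is a hypothetical helper, not part of olive or qairt-dev, and `0.5.0` is the version pinned in the Requirements section above.

```python
from importlib import metadata

REQUIRED = "0.5.0"  # qairt-dev version listed in the README's Requirements section

def check_version(package: str, required: str) -> bool:
    """Return True if `package` is installed at exactly version `required`.

    Returns False when the package is not installed at all, mirroring the
    manual `pip list | grep qairt-dev` check from the README.
    """
    try:
        return metadata.version(package) == required
    except metadata.PackageNotFoundError:
        return False

# Usage: prints True only when qairt-dev 0.5.0 is installed in this environment.
print(check_version("qairt-dev", REQUIRED))
```

If the check fails, the README's follow-up step (`pip install qairt-dev[onnx]==<version>`) installs the pinned version.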

microsoft-Phi-4-reasoning/QAIRT/README.md (3 additions, 0 deletions)

@@ -14,6 +14,7 @@ Finally, a prepared QAIRT DLC is encapsulated in an ONNX protobuf and exported t
 
 This workflow has been tested using the following host configuration:
 * Python 3.10
+* qairt-dev 0.5.0
 * QAIRT 2.45.40
 
 Further, this workflow has been tested on the following target configurations:
@@ -25,6 +26,8 @@ Further, this workflow has been tested on the following target configurations:
 
 ```bash
 pip install olive[qairt]
+pip list | grep qairt-dev # Ensure the proper qairt-dev version was installed
+pip install qairt-dev[onnx]==<version> # Install the proper qairt-dev version, if not installed
 ```
 
 2. (Optional) Use qairt-vm to install a non-default version of QAIRT and set QAIRT_SDK_ROOT
