
Commit 308643f

Merge pull request #205 from foundation-model-stack/fasoli/fp8_readme
feat: update readme with fp8 dependency groups
2 parents e79e65c + 1854bf3 commit 308643f

1 file changed

Lines changed: 3 additions & 2 deletions


README.md

@@ -41,7 +41,7 @@ FMS Model Optimizer is a framework for developing reduced precision neural netwo
 
 *Optional packages based on optimization functionality required:*
 
-- **GPTQ** is a popular compression method for LLMs:
+- **GPTQ** is a popular compression method for LLMs:
 - [gptqmodel](https://pypi.org/project/gptqmodel/) or build from [source](https://github.com/ModelCloud/GPTQModel)
 - If you want to experiment with **INT8** deployment in [QAT](./examples/QAT_INT8/) and [PTQ](./examples/PTQ_INT8/) examples:
 - Nvidia GPU with compute capability > 8.0 (A100 family or higher)
@@ -100,7 +100,8 @@ pip install -e .
 
 #### Optional Dependencies
 The following optional dependencies are available:
-- `fp8`: `llmcompressor` package for fp8 quantization
+- `fp8`: `llmcompressor` and `torchao` packages for fp8 quantization and inference
+- `fp8-infer`: `torchao` package for fp8 inference
 - `gptq`: `GPTQModel` package for W4A16 quantization
 - `mx`: `microxcaling` package for MX quantization
 - `opt`: Shortcut for `fp8`, `gptq`, and `mx` installs
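
The dependency groups listed in this hunk can be checked after installation with a small helper. The following is a minimal sketch, not part of the repository: the `EXTRAS` mapping and the `missing_modules` helper are hypothetical, and the import names (`llmcompressor`, `torchao`, `gptqmodel`, `mx`) are assumptions about how each package is imported. If the groups are exposed as standard pip extras, they would typically be installed with something like `pip install -e ".[fp8-infer]"`.

```python
from importlib.util import find_spec

# Hypothetical mapping from each dependency group named in the README diff
# to the module names it is assumed to provide (import names are assumptions).
EXTRAS = {
    "fp8": ["llmcompressor", "torchao"],  # fp8 quantization and inference
    "fp8-infer": ["torchao"],             # fp8 inference only
    "gptq": ["gptqmodel"],                # W4A16 quantization
    "mx": ["mx"],                         # microxcaling, assumed to import as `mx`
}
# The README describes `opt` as a shortcut for `fp8`, `gptq`, and `mx`.
EXTRAS["opt"] = sorted({m for g in ("fp8", "gptq", "mx") for m in EXTRAS[g]})


def missing_modules(extra, is_available=None):
    """Return the modules of a dependency group that cannot be found.

    `is_available` is an optional predicate taking a module name; by default
    it asks importlib whether the module can be located, without importing it.
    """
    if is_available is None:
        is_available = lambda name: find_spec(name) is not None
    return [m for m in EXTRAS[extra] if not is_available(m)]


if __name__ == "__main__":
    for extra in EXTRAS:
        missing = missing_modules(extra)
        status = "ok" if not missing else "missing: " + ", ".join(missing)
        print(f"{extra}: {status}")
```

Under these assumptions, an environment that only needs fp8 inference would report `fp8-infer` as satisfied while `fp8` still lists `llmcompressor` as missing, which mirrors the split this commit documents.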
