feat: GPTQModel Migration #102
chichun-charlie-liu merged 15 commits into foundation-model-stack:main
Conversation
Signed-off-by: Thara Palanivel <130496890+tharapalanivel@users.noreply.github.com>
fix: Update gh-action-pypi-publish version
Signed-off-by: Thara Palanivel <130496890+tharapalanivel@users.noreply.github.com>
Everything looks fine. The only question I have is whether our unit tests have enough coverage. Could you please confirm that our GPTQ example can still run successfully?
Signed-off-by: chichun-charlie-liu <57839396+chichun-charlie-liu@users.noreply.github.com>
chichun-charlie-liu left a comment
Included all the feedback from Bayo, except the OoM verifications.
Signed-off-by: chichun-charlie-liu <57839396+chichun-charlie-liu@users.noreply.github.com>
```diff
 dev = ["pre-commit>=3.0.4,<5.0"]
 fp8 = ["llmcompressor"]
-gptq = ["auto_gptq>0.4.2", "optimum>=1.15.0"]
+gptq = ["Cython", "gptqmodel>=1.7.3"]
```
Should we be including exllama and exllamav2 here as well? Seems like they are "required" for gptq to work.
exllama and exllamav2 require a GPU for installation and are indeed needed to run GPTQ modules on GPU, but not to run GPTQ on CPU or AIU via our add-ons for FMS. Since we must support users running FMS-MO in an environment without GPUs, we can't add these two packages to our requirements.
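A minimal sketch of how code can degrade gracefully on GPU-less machines instead of declaring the kernels as hard dependencies; the extension module name is assumed from the commit message in this PR, not verified against gptqmodel:

```python
import torch

def exllama_available() -> bool:
    # The exllama kernels only install (and only make sense) on GPU machines,
    # so probe for them at runtime rather than requiring them in pyproject.
    if not torch.cuda.is_available():
        return False
    try:
        import gptqmodel_exllama_kernels  # noqa: F401  # module name assumed
    except ImportError:
        return False
    return True
```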
Previously these two exllama kernel packages came from auto_gptq (see autogptq's setup.py; they are not declared as dependencies but are embedded in the auto_gptq package installation), so we didn't need to install them separately. The new gptqmodel package has renamed the embedded packages, however, so we still need to update our code accordingly. (Done and pushed to this PR.)
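A sketch of what the rename handling might look like; `gptqmodel_exllama_kernels` appears in this PR's commit message, while the old embedded name `exllama_kernels` from auto_gptq is an assumption:

```python
try:
    # new name embedded in the gptqmodel wheels (per this PR's commit message)
    import gptqmodel_exllama_kernels as exllama_kernels
except ImportError:
    # fall back to the name embedded by the old auto_gptq installation (assumed)
    import exllama_kernels  # type: ignore[no-redef]
```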
…tqmodel_exllama_kernels`
chichun-charlie-liu merged commit 9fef5b2 into foundation-model-stack:main
Description of the change
Related issue number
How to verify the PR
Was the PR tested