|
1 | 1 | --- |
2 | 2 | sidebar_position: 4 |
3 | | -title: 🦾 Supported models |
4 | | -description: Models Supported by xTuring |
| 3 | +title: 🦾 Supported models and variants |
| 4 | +description: Choose model keys and variant templates for task-based notebooks |
5 | 5 | --- |
6 | 6 | |
7 | | -<!-- # Models supported by xTuring --> |
8 | | -## Base versions |
9 | | -| Model | Model Key | LoRA | INT8 | LoRA + INT8 | LoRA + INT4 | |
10 | | -| ------ | --- | :---: | :---: | :---: | :---: | |
11 | | -| BLOOM 1.1B| bloom | ✅ | ✅ | ✅ | ✅ | |
12 | | -| Cerebras 1.3B| cerebras | ✅ | ✅ | ✅ | ✅ | |
13 | | -| DistilGPT-2 | distilgpt2 | ✅ | ✅ | ✅ | ✅ | |
14 | | -| Falcon 7B | falcon | ✅ | ✅ | ✅ | ✅ | |
15 | | -| Galactica 6.7B| galactica | ✅ | ✅ | ✅ | ✅ | |
16 | | -| GPT-J 6B | gptj | ✅ | ✅ | ✅ | ✅ | |
17 | | -| GPT-2 | gpt2 | ✅ | ✅ | ✅ | ✅ | |
18 | | -| LLaMA 7B | llama | ✅ | ✅ | ✅ | ✅ | |
19 | | -| LLaMA2 | llama2 | ✅ | ✅ | ✅ | ✅ | |
20 | | -| MiniMaxM2 | minimax_m2 | ✅ | ✅ | ✅ | ✅ | |
21 | | -| Qwen3 0.6B | qwen3_0_6b | ✅ | ✅ | ✅ | ✅ | |
22 | | -| OPT 1.3B | opt | ✅ | ✅ | ✅ | ✅ | |
23 | | - |
24 | | -### Memory-efficient versions |
25 | | -> The above mentioned are the base variants of the LLMs. Below are the templates to get their `LoRA`, `INT8`, `INT8 + LoRA` and `INT4 + LoRA` versions. |
26 | | - |
27 | | -| Version | Template | |
28 | | -| -- | -- | |
29 | | -| LoRA | <model_key>_lora| |
30 | | -| INT8 | <model_key>_int8| |
31 | | -| INT8 + LoRA | <model_key>_lora_int8| |
32 | | - |
33 | | -### INT4 Precision model versions |
34 | | -> In order to load any model's __`INT4+LoRA`__ version, you will need to make use of `GenericLoraKbitModel` class from `xturing.models`. Below is how to use it: |
| 7 | +Use one task notebook from `examples/notebooks/`, then choose a model key from this page. |
| 8 | + |
| 9 | +## Naming templates |
| 10 | + |
| 11 | +| Variant | Key template | |
| 12 | +| --- | --- | |
| 13 | +| Base | `<model_key>` | |
| 14 | +| LoRA | `<model_key>_lora` | |
| 15 | +| INT8 | `<model_key>_int8` | |
| 16 | +| LoRA + INT8 | `<model_key>_lora_int8` | |
| 17 | +| LoRA + K-bit (INT4 flow) | `<model_key>_lora_kbit` | |
| 18 | + |
| 19 | +## Model keys |
| 20 | + |
| 21 | +| Model | Base key | Available variants | |
| 22 | +| --- | --- | --- | |
| 23 | +| BLOOM 1.1B | `bloom` | `base`, `lora`, `int8`, `lora_int8` | |
| 24 | +| Cerebras 1.3B | `cerebras` | `base`, `lora`, `int8`, `lora_int8` | |
| 25 | +| DistilGPT-2 | `distilgpt2` | `base`, `lora` | |
| 26 | +| Falcon 7B | `falcon` | `base`, `lora`, `int8`, `lora_int8`, `lora_kbit` | |
| 27 | +| Galactica 6.7B | `galactica` | `base`, `lora`, `int8`, `lora_int8` | |
| 28 | +| Generic wrapper | `generic` | `base`, `lora`, `int8`, `lora_int8`, `lora_kbit` | |
| 29 | +| GPT-J 6B | `gptj` | `base`, `lora`, `int8`, `lora_int8` | |
| 30 | +| GPT-2 | `gpt2` | `base`, `lora`, `int8`, `lora_int8` | |
| 31 | +| GPT-OSS 20B | `gpt_oss_20b` | `base`, `lora`, `int8`, `lora_int8`, `lora_kbit` | |
| 32 | +| GPT-OSS 120B | `gpt_oss_120b` | `base`, `lora`, `int8`, `lora_int8`, `lora_kbit` | |
| 33 | +| LLaMA | `llama` | `base`, `lora`, `int8`, `lora_int8`, `lora_kbit` | |
| 34 | +| LLaMA 2 | `llama2` | `base`, `lora`, `int8`, `lora_int8`, `lora_kbit` | |
| 35 | +| Mamba | `mamba` | `base` | |
| 36 | +| MiniMaxM2 | `minimax_m2` | `base`, `lora`, `int8`, `lora_int8`, `lora_kbit` | |
| 37 | +| OPT 1.3B | `opt` | `base`, `lora`, `int8`, `lora_int8` | |
| 38 | +| Qwen3 0.6B | `qwen3_0_6b` | `base`, `lora`, `int8`, `lora_int8`, `lora_kbit` | |
| 39 | +| Stable Diffusion | `stable_diffusion` | `base` | |
| 40 | + |
| 41 | +## INT4-style workflow |
| 42 | + |
| 43 | +For models that expose `*_lora_kbit`, you can still use the generic K-bit API directly: |
| 44 | + |
35 | 45 | ```python |
36 | 46 | from xturing.models import GenericLoraKbitModel |
37 | | -model = GenericLoraKbitModel('/path/to/model') |
| 47 | +model = GenericLoraKbitModel("/path/to/model") |
38 | 48 | ``` |
39 | | -The `/path/to/model` can be replaced with you local directory or any HuggingFace library model like `facebook/opt-1.3b`. |
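
The "Naming templates" and "Model keys" tables added in this diff can be combined mechanically. The sketch below is one possible way to do that and is not part of xTuring: `build_model_key`, `VARIANT_SUFFIXES`, and `AVAILABLE_VARIANTS` are hypothetical names introduced for illustration, and the availability map copies only three rows of the table.

```python
# Suffix appended to a base key for each variant, per the "Naming templates" table.
VARIANT_SUFFIXES = {
    "base": "",
    "lora": "_lora",
    "int8": "_int8",
    "lora_int8": "_lora_int8",
    "lora_kbit": "_lora_kbit",
}

# Hypothetical excerpt of the "Model keys" table; extend it with the rows you need.
AVAILABLE_VARIANTS = {
    "distilgpt2": ("base", "lora"),
    "llama2": ("base", "lora", "int8", "lora_int8", "lora_kbit"),
    "mamba": ("base",),
}


def build_model_key(base_key: str, variant: str = "base") -> str:
    """Return the full key for a base key and variant name."""
    if variant not in VARIANT_SUFFIXES:
        raise ValueError(f"unknown variant: {variant!r}")
    allowed = AVAILABLE_VARIANTS.get(base_key)
    # Base keys missing from the excerpt are not checked for availability.
    if allowed is not None and variant not in allowed:
        raise ValueError(f"{base_key!r} does not list variant {variant!r}")
    return base_key + VARIANT_SUFFIXES[variant]


key = build_model_key("llama2", "lora_int8")
print(key)

# Assumption: the resulting key is what you would pass to xTuring's model
# factory, for example BaseModel.create(key); check the xTuring docs first.
```

Validating against the table before loading gives a readable error for combinations the page does not list, such as `distilgpt2` with `int8`, before any model download starts.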