README.md
Set `peft_method` to `"lora"`. You can additionally pass any arguments from `LoraConfig`.

```py
r: int = 8
lora_alpha: int = 32
target_modules: List[str] = field(
    default=None,
    metadata={
        "help": "The names of the modules to apply LORA to. LORA selects modules which either "
        "completely match or end with one of the strings. If the value is "
        '["all-linear"], then LORA selects all linear and Conv1D modules '
        "except for the output layer."
    },
)
bias = "none"
lora_dropout: float = 0.05
```
Equally, you can pass in a JSON configuration for running tuning. See [build doc] for more details.

}
```

Notice that `target_modules` are the names of the modules to apply the adapter to.
- If this is specified, only the modules with the specified names will be replaced. When passing a list of strings, either an exact match is performed, or a module is selected if its name ends with one of the passed strings. If this is specified as `all-linear`, then all linear/Conv1D modules are chosen, excluding the output layer. If this is specified as `lm_head`, which is an output layer, the `lm_head` layer will be chosen. See the note in this [section](#recommended-target-modules-per-model-architecture) on recommended target modules per model architecture.
- If this is not specified, modules will be chosen according to the model architecture. If the architecture is not known, an error will be raised; in this case, you should specify the target modules manually. See the [HuggingFace docs](https://huggingface.co/docs/peft/en/package_reference/lora#peft.LoraConfig) for more details.
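The matching rule described above can be sketched as follows (a simplified illustration of the exact/suffix matching, not PEFT's actual implementation):

```python
def matches_target(module_name: str, target_modules: list[str]) -> bool:
    """Select a module if its name exactly equals a target, or ends with
    '.<target>' (suffix match), mirroring the rule described above."""
    return any(
        module_name == t or module_name.endswith("." + t)
        for t in target_modules
    )

# e.g. "model.layers.0.self_attn.q_proj" is selected for ["q_proj", "v_proj"]
```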
#### How to get the list of LoRA target_modules of a model
For each model, the `target_modules` will depend on the type of model architecture. You can specify linear or attention layers in `target_modules`. To obtain the list of `target_modules` for a model:
```py
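# The original snippet body is elided from this excerpt; a minimal sketch along
# these lines prints the module tree (the model path below is illustrative):
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
print(model)  # each printed module name is a candidate target_modules value
```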
For example, for a LLaMA model the modules look like:

You can specify attention or linear layers. With the CLI, you can specify layers with `--target_modules "q_proj" "v_proj" "k_proj" "o_proj"` or `--target_modules "all-linear"`.

#### Recommended target modules per model architecture
As per the [LoRA paper](https://arxiv.org/pdf/2106.09685), section 4.2, adapting the query and value projection matrices achieves reasonable quality with efficient GPU utilization. Hence, when deciding which modules to apply LoRA to, we recommend starting with the query and value matrices. You can also refer to the defaults specified by the PEFT library for popular model architectures in [TRANSFORMERS_MODELS_TO_LORA_TARGET_MODULES_MAPPING](https://github.com/huggingface/peft/blob/7b1c08d2b5e13d3c99b7d6ee83eab90e1216d4ba/src/peft/utils/constants.py#L70) as a good starting point.
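Following that recommendation, a minimal JSON fragment for the LoRA arguments might look like this (the values are illustrative, not tuned defaults):

```json
{
    "peft_method": "lora",
    "r": 8,
    "lora_alpha": 32,
    "target_modules": ["q_proj", "v_proj"]
}
```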
<details>

<summary>How to specify lm_head as a target module</summary>

Since `lm_head` is an output layer, it will **not** be included as a target module if you specify `all-linear`. You can, however, apply the LoRA adapter to the `lm_head` layer by explicitly naming it in the `target_modules` arg.

**NOTE**: Specifying `["lm_head", "all-linear"]` will not tune the `lm_head` layer; it will run the equivalent of `["all-linear"]`. To include `lm_head`, you must explicitly list all of the layers to tune. Using the Llama model example above, you would need to list `"q_proj" "v_proj" "k_proj" "o_proj" "lm_head"` to tune all linear layers including `lm_head`. These five layers will be produced in the LoRA adapter.

Example 1:
```json
{
    "target_modules": ["lm_head"] // this produces the lm_head layer only
}
```

Example 2:
```json
{
    "target_modules": ["lm_head", "c_proj", "c_attn", "c_fc"] // this produces the lm_head, c_proj, c_attn and c_fc layers
}
```

Example 3:
```json
{
    "target_modules": ["lm_head", "all-linear"] // this produces the equivalent of all-linear only, no lm_head
}
```

</details>