Commit 6b0ad8f
fix: use alpha/rank scaling in LoRaLayer (standard LoRA convention) (#846)
* fix: use alpha/rank scaling in LoRaLayer (standard LoRA convention)
LoRaLayer used raw `alpha` as the scaling factor instead of `alpha / rank`.
With the defaults alpha=16 and rank=8, this made the LoRA contribution 8x
larger than under the convention used by PEFT, the original LoRA paper, and mlx-lm.
Before: scale = alpha = 16.0
After: scale = alpha / rank = 2.0
Also fixes replace_lora_with_linear to use the same corrected scale.
Added tests verifying:
- scale = alpha / rank
- Forward pass produces (alpha/rank) * (x @ A @ B)
- Default settings give 2x scaling, not 16x
Fixes #845
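
The effect of the scaling change can be reproduced with a minimal sketch in plain Python. This is not the repository's `LoRaLayer` code: `matmul`, `lora_delta`, and the `use_rank_scaling` flag are hypothetical names introduced for illustration, and the `x @ A @ B` layout is taken from the test description above.

```python
# Hypothetical sketch; the real layer is implemented with MLX arrays.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_delta(x, A, B, alpha=16.0, rank=8, use_rank_scaling=True):
    """Return scale * (x @ A @ B), the low-rank update added to the base output."""
    # Corrected behaviour: alpha / rank. Previous behaviour: raw alpha.
    scale = alpha / rank if use_rank_scaling else alpha
    xab = matmul(matmul(x, A), B)
    return [[scale * v for v in row] for row in xab]


rank = 8
x = [[1.0, 2.0]]                         # one input row with 2 features
A = [[0.5] * rank, [0.25] * rank]        # shape (2, rank)
B = [[1.0, -1.0] for _ in range(rank)]   # shape (rank, 2)

fixed = lora_delta(x, A, B, rank=rank)                            # scale = 2.0
legacy = lora_delta(x, A, B, rank=rank, use_rank_scaling=False)   # scale = 16.0
ratio = legacy[0][0] / fixed[0][0]
print(fixed, legacy, ratio)
```

With these inputs the legacy update is exactly `rank` times the corrected one, which matches the 8x discrepancy described above.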
* test: add B=0 initialization test for LoRaLayer
Verifies that when B is zeros (default init), the LoRA layer output
equals the base linear layer output exactly (no LoRA contribution).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
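
The property checked by the B=0 test can be sketched in plain Python as follows. The names `matmul` and `lora_forward` and the weight values are assumptions made for illustration, not the repository's test code.

```python
# Hypothetical sketch of the B=0 property; not the actual test in test_trainer.py.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_forward(x, W, A, B, alpha=16.0, rank=8):
    """Return base output plus (alpha / rank) * (x @ A @ B)."""
    base = matmul(x, W)
    scale = alpha / rank
    delta = matmul(matmul(x, A), B)
    return [
        [b + scale * d for b, d in zip(base_row, delta_row)]
        for base_row, delta_row in zip(base, delta)
    ]


rank = 8
x = [[1.0, 2.0]]
W = [[0.5, -1.0], [2.0, 0.25]]                   # base linear weight, shape (2, 2)
A = [[0.1 * (i + 1) for i in range(rank)],       # arbitrary non-zero A, shape (2, rank)
     [0.2 * (i + 1) for i in range(rank)]]
B_zero = [[0.0, 0.0] for _ in range(rank)]       # default init: B is all zeros

out = lora_forward(x, W, A, B_zero, rank=rank)
base_out = matmul(x, W)
print(out == base_out)
```

Because every entry of `x @ A @ B` is zero when B is zero, the scale factor has no effect and the output matches the base layer exactly, whatever values A holds.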
* refactor: move LoRA scaling tests into test_trainer.py
Move TestLoRaScaling class from test_trainer_utils.py into
test_trainer.py as suggested in review, and revert
test_trainer_utils.py to its original state.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Prince Canuma <prince.gdt@gmail.com>
1 parent e2e9e67 commit 6b0ad8f
2 files changed
Lines changed: 68 additions & 3 deletions