Commit 5bb8fbc
fix: Handle wide matrices in orthogonal initializer (#629)
* fix: Handle wide matrices in orthogonal initializer
QR decomposition of an {m, n} matrix with n > m produces Q of shape {m, m},
which has too few columns for an {m, n} result (e.g. LSTM weights {hidden, 4*hidden}).
Generate a {max(m,n), max(m,n)} square random matrix so QR always
produces enough orthogonal columns, then slice to {m, n}.
Adds tests for wide 2D and high-rank shapes.
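
The square-then-slice idea described above can be sketched in plain Python. This is an illustration under stated assumptions, not the code from this commit: `orthogonal_init` is a hypothetical name, and a hand-written modified Gram-Schmidt pass stands in for a library QR routine.

```python
import math
import random


def orthogonal_init(m, n, seed=0):
    """Hypothetical helper: return an m x n nested list sliced from a k x k
    orthogonal matrix, where k = max(m, n)."""
    rng = random.Random(seed)
    k = max(m, n)

    # Square k x k Gaussian matrix, stored as a list of k column vectors.
    cols = [[rng.gauss(0.0, 1.0) for _ in range(k)] for _ in range(k)]

    # Modified Gram-Schmidt: orthonormalise the columns one by one.
    # The resulting columns form the Q factor of a QR decomposition.
    q_cols = []
    for v in cols:
        w = list(v)
        for q in q_cols:
            proj = sum(a * b for a, b in zip(q, w))
            w = [a - proj * b for a, b in zip(w, q)]
        norm = math.sqrt(sum(a * a for a in w))
        q_cols.append([a / norm for a in w])

    # Q[i][j] is q_cols[j][i]; keep the first m rows and the first n columns.
    return [[q_cols[j][i] for j in range(n)] for i in range(m)]


def gram(rows):
    """Pairwise dot products of the given vectors."""
    return [[sum(a * b for a, b in zip(r, s)) for s in rows] for r in rows]
```

Because Q is square and orthogonal, a wide slice such as `orthogonal_init(2, 8)` has orthonormal rows, and a tall slice such as `orthogonal_init(5, 3)` has orthonormal columns, so `gram` of those rows (or of the transposed columns) is close to the identity.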
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Apply suggestions from code review
* Apply suggestion from @polvalente
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Paulo Valente <16843419+polvalente@users.noreply.github.com>
1 parent d5ecacb commit 5bb8fbc
2 files changed
Lines changed: 34 additions & 2 deletions