Commit 8176173
perf: use torch.embedding for type embedding (deepmodeling#4747)
The backward pass of the indexing operation is costly on GPU. Using the
dedicated `torch.embedding` op mitigates this problem.
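
As a rough illustration of the swap described above, the sketch below looks up rows of an embedding table both by plain indexing and with `torch.embedding`, then checks that the outputs and gradients agree. The names `table_index`, `table_embed`, `atype` and the sizes are hypothetical and are not taken from the changed file; the snippet runs on CPU and makes no timing claims.

```python
import torch

# Hypothetical sizes, chosen only for illustration.
ntypes, embed_dim, natoms = 4, 8, 1000

torch.manual_seed(0)
# One integer type index per atom (assumed layout, not the project's code).
atype = torch.randint(0, ntypes, (natoms,))

# Two independent copies of the same table so the gradients can be compared.
table_index = torch.randn(ntypes, embed_dim, requires_grad=True)
table_embed = table_index.detach().clone().requires_grad_(True)

# Lookup by advanced indexing: autograd handles this as a generic index op.
out_index = table_index[atype]
# Lookup by the dedicated op: autograd uses the embedding-specific backward.
out_embed = torch.embedding(table_embed, atype)

out_index.sum().backward()
out_embed.sum().backward()

# Both paths should give the same values and the same table gradient.
same_forward = torch.equal(out_index, out_embed)
same_grad = torch.allclose(table_index.grad, table_embed.grad)
```

Since the two lookups are numerically interchangeable in this sketch, a change of this kind should only affect which backward kernel runs, not the results.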
<details><summary>Profiling results</summary>
<p>
Before: 32ms
<img width="612" alt="image"
src="https://github.com/user-attachments/assets/6e2a4de1-433a-4b6a-8b59-a8458d66897c"
/>
---
After: 0.5ms
<img width="334" alt="image"
src="https://github.com/user-attachments/assets/199ac925-8382-4a43-a6bb-584bb60159b2"
/>
</p>
</details>
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
- **Refactor**
  - Updated the embedding lookup mechanism in the model, potentially
    improving how embeddings are retrieved internally. No changes to the
    user interface or method signatures.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
1 parent 30b762e commit 8176173
1 file changed
Lines changed: 1 addition & 1 deletion
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 286 | 286 | |
| 287 | 287 | |
| 288 | 288 | |
| 289 | | - |
| | 289 | + |
| 290 | 290 | |
| 291 | 291 | |
| 292 | 292 | |