Commit f412e29
fix: address PR review feedback (realAsma)
- Fold pre_quant_scale on GPU before .cpu() move (perf fix)
- Use torch.allclose instead of torch.equal in test (nit)
Signed-off-by: Sungsoo Ha <sungsooh@nvidia.com>
1 parent c6b93b9 commit f412e29
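The second bullet can be illustrated with a rough, pure-Python sketch. It is not the code from this commit, which works on torch tensors. The sketch assumes that "folding" `pre_quant_scale` means multiplying each input-channel column of the weight by its scale. The helpers `allclose`, `linear`, and `fold_scale` are hypothetical stand-ins; `allclose` mirrors the elementwise criterion documented for `torch.allclose` (`|a - b| <= atol + rtol * |b|`, with default `rtol=1e-5` and `atol=1e-8`).

```python
# Pure-Python stand-ins for illustration only; the commit operates on torch tensors.

def allclose(a, b, rtol=1e-5, atol=1e-8):
    """Elementwise |a - b| <= atol + rtol * |b| (the documented torch.allclose criterion)."""
    return len(a) == len(b) and all(
        abs(x - y) <= atol + rtol * abs(y) for x, y in zip(a, b)
    )

def linear(weight, x):
    """Matrix-vector product: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weight]

def fold_scale(weight, scale):
    """Assumption: folding multiplies each input-channel column by its scale."""
    return [[w * s for w, s in zip(row, scale)] for row in weight]

weight = [[0.1, 0.2, 0.3], [0.7, 0.1, 0.9]]
scale = [3.0, 0.1, 7.0]
x = [0.3, 0.6, 0.9]

# Path A: scale the input, then apply the original weight.
scaled_input = linear(weight, [xi * s for xi, s in zip(x, scale)])
# Path B: fold the scale into the weight, then apply it to the raw input.
folded_weight = linear(fold_scale(weight, scale), x)

# The two paths group the floating-point multiplications differently, so
# bitwise equality is not guaranteed; a tolerance-based check is the safer test.
exactly_equal = scaled_input == folded_weight
close = allclose(scaled_input, folded_weight)
print(exactly_equal, close)
```

An exact comparison (the analogue of `torch.equal`) can fail on differences in the last bit that do not matter for the export. The tolerance check passes whenever the two results agree to within `rtol` and `atol`.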
2 files changed
Lines changed: 3 additions & 3 deletions
File tree
- modelopt/torch/export/plugins
- tests/gpu/torch/export
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 105 | 105 | |
| 106 | 106 | |
| 107 | 107 | |
| 108 | | - |
| | 108 | + |
| 109 | 109 | |
| 110 | 110 | |
| 111 | 111 | |
| 112 | | - |
| | 112 | + |
| 113 | 113 | |
| 114 | 114 | |
| 115 | 115 | |
Lines changed: 1 addition & 1 deletion
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 80 | 80 | |
| 81 | 81 | |
| 82 | 82 | |
| 83 | | - |
| | 83 | + |
| 84 | 84 | |
| 85 | 85 | |
| 86 | 86 | |