Commit 9374e3f
Skip scratch pad eviction data in enrichment mode to avoid cudaFree overhead (pytorch#5645)
Summary:
Pull Request resolved: pytorch#5645
X-link: https://github.com/facebookresearch/FBGEMM/pull/2593
CONTEXT: In KVZCH enrichment mode (_enrichment_enabled), the ssd_scratch_pad_eviction_data list accumulates UVA tensors every forward pass via _prefetch. The backward hook _evict_from_scratch_pad pops entries but does nothing useful (evict() is skipped in embedding_cache_mode, RES is disabled). The .clear() call in enrichment_query_id then triggers expensive cudaFree calls when releasing those UVA tensors, causing GPU stalls visible in Perfetto traces.
WHAT: Skip appending to ssd_scratch_pad_eviction_data in _prefetch when _enrichment_enabled is True. Add early return in _evict_from_scratch_pad for enrichment mode. Remove the now-unnecessary .clear() in enrichment_query_id since the list is always empty.
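The change described above can be sketched in isolation. This is a minimal toy model, not the FBGEMM implementation: the class name `ScratchPadSketch`, the `evicted` counter, and the `bytearray` stand-in for a UVA tensor are assumptions made for illustration. Only `_enrichment_enabled`, `ssd_scratch_pad_eviction_data`, `_prefetch`, and `_evict_from_scratch_pad` are names taken from the commit message.

```python
from typing import List


class ScratchPadSketch:
    """Hypothetical stand-in for the module in this commit (not the real class)."""

    def __init__(self, enrichment_enabled: bool) -> None:
        self._enrichment_enabled = enrichment_enabled
        # Stand-in for the list that keeps UVA tensors alive until eviction.
        self.ssd_scratch_pad_eviction_data: List[bytearray] = []
        self.evicted = 0  # hypothetical counter, used only to observe behavior

    def _prefetch(self, nbytes: int = 16) -> None:
        buf = bytearray(nbytes)  # placeholder for a UVA tensor
        # In enrichment mode nothing consumes the entry, so it is never
        # retained and there is nothing to free later.
        if not self._enrichment_enabled:
            self.ssd_scratch_pad_eviction_data.append(buf)

    def _evict_from_scratch_pad(self) -> None:
        if self._enrichment_enabled:
            # Early return: the list is always empty in enrichment mode.
            return
        if self.ssd_scratch_pad_eviction_data:
            self.ssd_scratch_pad_eviction_data.pop(0)
            self.evicted += 1
```

Because the list never grows in enrichment mode, a later `.clear()` would have nothing to release, which is presumably why the commit can drop that call from `enrichment_query_id`.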
Reviewed By: emlin
Differential Revision: D101102800
fbshipit-source-id: 16dcd8d32d55f77478235f4a27a3be10f692e288
1 parent 9637997 commit 9374e3f
1 file changed
Lines changed: 12 additions & 8 deletions
| Hunk | Original lines | New lines | Lines added | Lines removed |
|---|---|---|---|---|
| 1 | 1652-1657 | 1652-1660 | 3 (new 1655-1657) | 0 |
| 2 | 1665-1670 | 1668-1678 | 5 (new 1671-1675) | 0 |
| 3 | 2421-2427 | 2429-2438 | 4 (new 2432-2435) | 1 (original 2424) |
| 4 | 5228-5240 | 5239-5244 | 0 | 7 (original 5231-5237) |