Commit 4e605fa
fix dram_kv.hit_rate_pct normalization (#5777)
Summary:
X-link: facebookresearch/FBGEMM#2706
The `dram_kv.hit_rate_pct` metric in `SSDTableBatchedEmbeddingBags` was computed as `dram_read_hit_count / (dram_read_hit_count + dram_read_miss_count)` — the denominator only counts requests that reached DRAM, i.e. L1 misses. When `l1_cache_size` grows, L1 absorbs more keys and only the long-tail keys fall through to DRAM, so the L1-conditional DRAM hit rate drops mechanically even though the system is doing more — not less — work in the cheaper tier.
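To make the effect concrete, here is a small sketch with made-up counts. The helper name `conditional_dram_hit_rate_pct` and all numbers are illustrative assumptions, not taken from the FBGEMM code. It keeps the DRAM miss count fixed and grows the number of L1 hits, so fewer requests reach DRAM.

```python
def conditional_dram_hit_rate_pct(
    dram_read_hit_count: int, dram_read_miss_count: int
) -> float:
    """Old normalization: DRAM hits over requests that reached DRAM (L1 misses)."""
    reached_dram = dram_read_hit_count + dram_read_miss_count
    if reached_dram == 0:
        # Guard chosen for this sketch; nothing reached DRAM.
        return 0.0
    return 100.0 * dram_read_hit_count / reached_dram


# Hypothetical batch of 1000 unique indices, 50 of which always miss at DRAM.
dram_miss = 50

# Small L1: 200 L1 hits, so 800 requests reach DRAM and 750 of them hit.
small_l1 = conditional_dram_hit_rate_pct(750, dram_miss)

# Large L1: 900 L1 hits, so 100 requests reach DRAM and 50 of them hit.
large_l1 = conditional_dram_hit_rate_pct(50, dram_miss)

print(small_l1, large_l1)  # 93.75 50.0
```

In both scenarios 950 of the 1000 unique indices are served without a DRAM miss, yet the old metric falls from 93.75 to 50.0 purely because the denominator shrinks.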
This diff changes `dram_kv.hit_rate_pct` to be normalized against `num_unique` (total unique indices in the batch, captured from the L1 reporting path):
`hit_rate_pct = 100.0 * (num_unique - dram_read_miss_count) / num_unique`
Semantically this is now the overall (L1 + DRAM) hit rate — the fraction of unique requests that did not miss at DRAM. The value stays stable as cache sizes shift between tiers.
This is algebraically equivalent to the expanded form `L1_hit + (1 - L1_hit) * DRAM_hit_conditional`, assuming every L1 miss reaches DRAM (the only path today). A code comment documents this caveat in case a future SSD-bypass path is added.
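The equivalence can be checked numerically. The sketch below is not the FBGEMM implementation: the function names are hypothetical, and it assumes `num_unique = l1_hits + dram_read_hit_count + dram_read_miss_count`, which is the "every L1 miss reaches DRAM" condition stated above.

```python
import math


def overall_hit_rate_pct(num_unique: int, dram_read_miss_count: int) -> float:
    """New normalization: fraction of unique requests that did not miss at DRAM."""
    if num_unique <= 0:
        # Guard chosen for this sketch.
        return 0.0
    return 100.0 * (num_unique - dram_read_miss_count) / num_unique


def expanded_hit_rate_pct(
    l1_hits: int, dram_read_hit_count: int, dram_read_miss_count: int
) -> float:
    """Expanded form: L1_hit + (1 - L1_hit) * DRAM_hit_conditional."""
    reached_dram = dram_read_hit_count + dram_read_miss_count
    # Assumption: every L1 miss reaches DRAM, so the counts partition the batch.
    num_unique = l1_hits + reached_dram
    l1_hit = l1_hits / num_unique
    dram_hit_conditional = dram_read_hit_count / reached_dram if reached_dram else 0.0
    return 100.0 * (l1_hit + (1.0 - l1_hit) * dram_hit_conditional)


# Two hypothetical batches of 1000 unique indices with 50 DRAM misses each:
# (l1_hits, dram_read_hit_count, dram_read_miss_count)
scenarios = [(200, 750, 50), (900, 50, 50)]
for l1_hits, dram_hits, dram_misses in scenarios:
    total = l1_hits + dram_hits + dram_misses
    new_form = overall_hit_rate_pct(total, dram_misses)
    expanded = expanded_hit_rate_pct(l1_hits, dram_hits, dram_misses)
    assert math.isclose(new_form, expanded)
    print(new_form)
```

Both scenarios report the same overall rate even though the split between L1 and DRAM differs, which is the stability property the new normalization is meant to provide.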
Implementation:
- `_report_uvm_cache_stats` stashes `num_unique` into `_last_l1_num_unique` so `_report_dram_kv_perf_stats` can use it as the normalization denominator without re-reading L1 counters. Both reporters fire from the same `should_report(self.step)` cadence, so the stashed value corresponds to the same reporting window.
- `l1_hit_rate_pct` and `l2_cache.hit_rate_pct` are untouched.
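The stash-and-reuse pattern described above can be sketched as follows. This is a minimal stand-in, not the actual `SSDTableBatchedEmbeddingBags` code: the class `HitRateReporter`, its method names, the modulo-based `should_report`, and the `None` guard are all assumptions made for illustration. Only the attribute name `_last_l1_num_unique` and the formula come from the commit message.

```python
from typing import Optional


class HitRateReporter:
    """Hypothetical stand-in for the two reporters sharing one cadence."""

    def __init__(self, report_interval: int) -> None:
        self.report_interval = report_interval
        self.step = 0
        # Stashed by the L1 reporter, consumed by the DRAM reporter.
        self._last_l1_num_unique: Optional[int] = None

    def should_report(self, step: int) -> bool:
        # Assumed cadence: report every `report_interval` steps.
        return step % self.report_interval == 0

    def report_l1(self, num_unique: int) -> None:
        if not self.should_report(self.step):
            return
        # Stash the denominator so the DRAM reporter need not re-read L1 counters.
        self._last_l1_num_unique = num_unique

    def report_dram(self, dram_read_miss_count: int) -> Optional[float]:
        if not self.should_report(self.step):
            return None
        num_unique = self._last_l1_num_unique
        if not num_unique:
            # Assumed guard: skip the metric if nothing was stashed or it is zero.
            return None
        return 100.0 * (num_unique - dram_read_miss_count) / num_unique


reporter = HitRateReporter(report_interval=10)
reporter.step = 10
reporter.report_l1(num_unique=1000)
pct = reporter.report_dram(dram_read_miss_count=50)
print(pct)  # 95.0
```

Because both methods gate on the same `should_report(self.step)` check, the stashed value and the DRAM counters refer to the same reporting window in this sketch.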
Differential Revision: D105727013
1 parent 0fcdfc0 commit 4e605fa
1 file changed
Lines changed: 22 additions & 2 deletions
| Hunk | Original file lines | Diff lines | Change |
|---|---|---|---|
| 1 | 1233 to 1238 | 1233 to 1242 | 4 lines added (diff lines 1236 to 1239) |
| 2 | 4185 to 4190 | 4189 to 4195 | 1 line added (diff line 4192) |
| 3 | 4847 to 4854 | 4852 to 4874 | 2 lines removed (original lines 4850 to 4851), 17 lines added (diff lines 4855 to 4871) |