Commit 9fed2a5
add overall L1→DRAM hit rate metric (#5777)
Summary:
X-link: facebookresearch/FBGEMM#2706
`SSDTableBatchedEmbeddingBags` already emits per-tier hit rates (`ssd_tbe.prefetch.l1_hit_rate_pct`, `l2_cache.hit_rate_pct`, and `dram_kv.hit_rate_pct`), but each is conditional on the requests that reached that tier. As `l1_cache_size` grows, L1 absorbs more keys and only the long-tail keys fall through to DRAM, so the DRAM hit rate, which is conditional on an L1 miss, drops mechanically even though the system is serving more requests, not fewer, from the cheaper tier. None of the existing per-tier metrics gives an at-a-glance answer to "what fraction of unique requests were served from cache (L1 or DRAM), without paying SSD cost?".
This diff adds an `ssd_tbe.overall_hit_rate_pct` aggregate metric (per-TBE: `ssd_tbe.tbe_id{N}.overall_hit_rate_pct`) defined as:
overall_hit_rate_pct = 100.0 * (num_unique - dram_read_miss_count) / num_unique
i.e. the fraction of unique requests that did not miss at DRAM. The value stays stable as cache sizes shift between L1 and DRAM.
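As a rough illustration of why the aggregate stays stable while the conditional DRAM rate moves, the sketch below evaluates both on two made-up reporting windows. The helper names, the `l1_misses` argument, the zero-denominator handling, and all numbers are assumptions for illustration; they are not taken from the diff.

```python
def overall_hit_rate_pct(num_unique: int, dram_read_miss_count: int) -> float:
    """Percentage of unique requests that did not miss at DRAM."""
    if num_unique <= 0:
        # Assumption: report 0.0 for an empty window instead of dividing by zero.
        return 0.0
    return 100.0 * (num_unique - dram_read_miss_count) / num_unique


def dram_conditional_hit_rate_pct(l1_misses: int, dram_read_miss_count: int) -> float:
    """DRAM hit rate over only the requests that missed L1 (hypothetical helper)."""
    if l1_misses <= 0:
        return 0.0
    return 100.0 * (l1_misses - dram_read_miss_count) / l1_misses


# Two hypothetical windows with the same traffic: 1000 unique requests,
# 50 of which miss at DRAM and must be read from SSD.
small_l1 = {"num_unique": 1000, "l1_misses": 500, "dram_read_miss_count": 50}
large_l1 = {"num_unique": 1000, "l1_misses": 100, "dram_read_miss_count": 50}

for name, w in (("small L1", small_l1), ("large L1", large_l1)):
    cond = dram_conditional_hit_rate_pct(w["l1_misses"], w["dram_read_miss_count"])
    overall = overall_hit_rate_pct(w["num_unique"], w["dram_read_miss_count"])
    print(f"{name}: DRAM conditional = {cond:.1f}%, overall = {overall:.1f}%")
```

With these inputs the conditional DRAM rate falls from 90.0% to 50.0% as L1 grows, while the overall rate is 95.0% in both windows.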
This is algebraically equivalent to the expanded form `L1_hit + (1 - L1_hit) * DRAM_hit_conditional`, with both rates expressed as fractions, under the assumption that every L1 miss reaches DRAM (the only path today). A code comment documents this caveat in case a future SSD-bypass path is added.
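The equivalence can be checked in a few lines. The symbols here are introduced only for this derivation and are not names from the code: $U$ is `num_unique`, $M_1$ is the number of L1 misses, and $M_D$ is `dram_read_miss_count`. The step that sets the DRAM lookup count equal to $M_1$ is exactly the "every L1 miss reaches DRAM" assumption.

```latex
\begin{align*}
h_{L1} &= \frac{U - M_1}{U}, \qquad
h_{D \mid \text{L1 miss}} = \frac{M_1 - M_D}{M_1} \\[4pt]
h_{L1} + (1 - h_{L1})\, h_{D \mid \text{L1 miss}}
  &= \frac{U - M_1}{U} + \frac{M_1}{U} \cdot \frac{M_1 - M_D}{M_1} \\
  &= \frac{U - M_1 + M_1 - M_D}{U}
   = \frac{U - M_D}{U}
\end{align*}
```

Multiplying the last expression by 100 gives `overall_hit_rate_pct`. If some L1 misses skipped DRAM, the DRAM lookup count would be smaller than $M_1$ and the cancellation in the second line would no longer hold.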
The existing per-tier metrics (`l1_hit_rate_pct`, `l2_cache.hit_rate_pct`, `dram_kv.hit_rate_pct`) are left unchanged — they remain useful for diagnosing per-tier behavior.
Implementation:
- `_report_uvm_cache_stats` stashes `num_unique` into `_last_l1_num_unique` so `_report_dram_kv_perf_stats` can use it as the normalization denominator without re-reading L1 counters. Both reporters fire from the same `should_report(self.step)` cadence, so the stashed value corresponds to the same reporting window.
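The stash-and-reuse pattern described above might look roughly like the sketch below. Only the method names, `_last_l1_num_unique`, `should_report(self.step)`, and the metric names come from the commit message; the class, its constructor, the method arguments, the `metrics` dict, and the early-return guards are hypothetical stand-ins, not the actual FBGEMM code.

```python
from typing import Dict, Optional


class HitRateReporter:
    """Hypothetical stand-in for the two stats reporters in the TBE module."""

    def __init__(self, report_interval: int) -> None:
        self.report_interval = report_interval
        self.step = 0
        # Denominator stashed by the L1 reporter for the DRAM reporter to reuse.
        self._last_l1_num_unique: Optional[int] = None
        self.metrics: Dict[str, float] = {}

    def should_report(self, step: int) -> bool:
        # Assumption: report every `report_interval` steps.
        return step % self.report_interval == 0

    def _report_uvm_cache_stats(self, num_unique: int, l1_misses: int) -> None:
        if not self.should_report(self.step):
            return
        if num_unique > 0:
            self.metrics["ssd_tbe.prefetch.l1_hit_rate_pct"] = (
                100.0 * (num_unique - l1_misses) / num_unique
            )
        # Stash the denominator so the DRAM reporter need not re-read L1 counters.
        self._last_l1_num_unique = num_unique

    def _report_dram_kv_perf_stats(self, dram_read_miss_count: int) -> None:
        if not self.should_report(self.step):
            return
        num_unique = self._last_l1_num_unique
        if not num_unique:
            # Nothing stashed yet, or an empty window: skip the aggregate metric.
            return
        self.metrics["ssd_tbe.overall_hit_rate_pct"] = (
            100.0 * (num_unique - dram_read_miss_count) / num_unique
        )


reporter = HitRateReporter(report_interval=10)
reporter.step = 10
reporter._report_uvm_cache_stats(num_unique=1000, l1_misses=100)
reporter._report_dram_kv_perf_stats(dram_read_miss_count=50)
print(reporter.metrics)
```

Because both methods gate on the same `should_report(self.step)` check, the stashed value and the DRAM miss count describe the same reporting window, provided the L1 reporter runs first within a step.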
Reviewed By: kausv
Differential Revision: D105727013
1 parent 07767a8 commit 9fed2a5
1 file changed
Lines changed: 34 additions & 0 deletions
| Added lines (new file numbering) | Lines added |
|---|---|
| 1236-1239 | 4 |
| 1320-1322 | 3 |
| 1376 | 1 |
| 4196 | 1 |
| 4876-4900 | 25 |