i#7731: Optimize reuse distance collection using splay tree#7732
i#7731: Optimize reuse distance collection using splay tree#7732artur-abd wants to merge 36 commits intoDynamoRIO:masterfrom
Conversation
4b6bffb to
3961c95
Compare
Improve reuse distance using splay tree instead of linked list. Splay tree have asymptotic O(log n) amortizated time for all operations instead of O(n / const) for linked list with skip list. Perfomance improve achives in avarage 10-40% based on different benchmarks. Fixes DynamoRIO#7731
3961c95 to
1b8aa74
Compare
|
Thank you for contributing. Please mark for review when ready. Also, please don't force-push a shared branch: see https://dynamorio.org/page_code_reviews.html about squash-and-merge, where the final merge commit message comes from, and how force pushes break the review process. Use separate incremental commits with commit messages that describe what each one is doing (and not just repeating the merge message). |
Fixes description for line_ref_splay_t and it's methods Fixes: DynamoRIO#7731
|
Thanks for your comments. PR is ready for review. |
There was a problem hiding this comment.
Pull request overview
Optimizes drcachesim’s reuse-distance collection by replacing the linked-list (+skip list) data structure with a splay tree to reduce per-access overhead.
Changes:
- Replaced the internal reuse-distance tracking structure with a splay tree implementation.
- Updated reuse-distance tool/shard wiring to use the new node type and tree container.
- Added unit tests for the splay tree and marked skip-list-related CLI options as deprecated.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
| clients/drcachesim/tools/reuse_distance.h | Replaces list/skip-list logic with a splay tree, adds splay operations and subtree-size tracking. |
| clients/drcachesim/tools/reuse_distance.cpp | Updates shard initialization and reuse-distance processing to use the new node/tree types. |
| clients/drcachesim/tests/reuse_distance_test.cpp | Adds a dedicated splay-tree correctness/invariant test. |
| clients/drcachesim/common/options.h | Renames the skip-list distance option variable to reflect deprecation. |
| clients/drcachesim/common/options.cpp | Deprecates skip-list-related option help text (and keeps CLI flag name). |
| clients/drcachesim/analyzer_multi.cpp | Updates option wiring to use the renamed deprecated option variable. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
@abhinav92003, thanks for review. |
|
@abhinav92003 |
Ack. I'll review the new changes. |
| return 0; | ||
| splay(ref); | ||
| // Get the reuse distance of ref. | ||
| int64_t dist = get_size(ref->left); |
There was a problem hiding this comment.
get_size(ref->left) is the reuse distance of ref only because we did the splay operation on ref before this correct? Add comment.
There was a problem hiding this comment.
Yes, after the splay operation, ref becomes the root of the splay tree, and the index of these elements is exactly equal to the size of the left subtree.
There was a problem hiding this comment.
Adding this to the code comment would be helpful to a future reader.
| struct line_ref_node_t *right; // the right child of the node | ||
| struct line_ref_node_t *parent; // the parent of the node | ||
| uint64_t size; // Size of the subtree | ||
| uint64_t time_stamp; // the most recent reference time stamp on this line |
There was a problem hiding this comment.
The splay tree maintains the ordered binary tree invariant on time_stamp right (larger timestamps on the left subtree), right? Add to comment.
There was a problem hiding this comment.
The splay tree maintains this invariant, but time_stamp is just additional data to verify that the ref is distant.
The splay tree implicitly supports an array, where the element index is the number of node that are to the left. I took this idea from an implicit treap. I also benchmarked implementation based on implicit treap, but it was slowed by 5-15%.
There was a problem hiding this comment.
The splay tree maintains this invariant, but time_stamp is just additional data to verify that the ref is distant.
Add to comment?
Returned the skip list implementation, implemented base classes for containers and refs, and added the ability to select an implementation using options.
|
@abhinav92003, thanks for your comments, I added an implementation selection using a flag, and also fixed the tests to test both implementations. |
Thanks; will review the new changes. |
abhinav92003
left a comment
There was a problem hiding this comment.
I have just a few minor comments. LGTM.
Thanks for the contribution. This is a useful optimization. And apologies for the (much) longer than usual turnaround time in my reviews.
| DROPTION_SCOPE_FRONTEND, "reuse_splay_tree", false, | ||
| "Use splay tree for reuse distance calculation.", | ||
| "Uses a splay tree data structure instead of a skip list for tracking reuse " | ||
| "distances. Splay tree provide better complexity and consume less memory " |
There was a problem hiding this comment.
s/better complexity/better time complexity/
| }; | ||
|
|
||
| struct line_ref_entry_t { | ||
| uint64_t time_stamp; // the most recent reference time stamp on this line |
There was a problem hiding this comment.
Either delete the default constructor if we don't need it, or add in-place init for these members. Same for other structs.
| } | ||
| }; | ||
|
|
||
| /* A splay tree node for the cache line reference info */ |
There was a problem hiding this comment.
Add banner comment (example at https://dynamorio.org/page_code_style.html) to separate out the splay tree related functions.
| { | ||
| std::cerr << "Reuse tag list:\n"; | ||
| line_ref_node_t *node = root_; | ||
| // the last visited node |
There was a problem hiding this comment.
I suppose this is the last visited node in the current node's subtree? Or if there's none such, it's nullptr. Update comment.
| // more efficient computation of the depth. Each node in the skip | ||
| // list stores its depth from the front. | ||
| struct line_ref_list_t { | ||
| struct line_ref_list_t : public line_ref_container_t { |
There was a problem hiding this comment.
As per https://dynamorio.org/page_code_style.html, this should be a class as well because it has methods other than the constructor.
| // The splay tree is a binary search tree that uses the splay operation for balancing, | ||
| // which allows deleting and inserting an element at any position in O(log n) amortized | ||
| // time. | ||
| class line_ref_splay_t : public line_ref_container_t { |
There was a problem hiding this comment.
It will be nicer if the implementation could be moved to the cpp file, instead of the header.
| new_parent->parent = node->parent; | ||
| } | ||
| if (node->parent == nullptr) | ||
| root_ = new_parent; |
There was a problem hiding this comment.
Okay, since splay has the preconditions to call it. Add comment by the assert at L873?
| } | ||
|
|
||
| if (node->parent == nullptr) | ||
| root_ = new_parent; |
There was a problem hiding this comment.
Okay, since splay has the preconditions to call it. Add comment by the assert at L848?
| return 0; | ||
| splay(ref); | ||
| // Get the reuse distance of ref. | ||
| int64_t dist = get_size(ref->left); |
There was a problem hiding this comment.
Adding this to the code comment would be helpful to a future reader.
| struct line_ref_node_t *right; // the right child of the node | ||
| struct line_ref_node_t *parent; // the parent of the node | ||
| uint64_t size; // Size of the subtree | ||
| uint64_t time_stamp; // the most recent reference time stamp on this line |
There was a problem hiding this comment.
The splay tree maintains this invariant, but time_stamp is just additional data to verify that the ref is distant.
Add to comment?
Improve reuse distance using splay tree instead of linked list. Splay tree have asymptotic O(log n) amortizated time for all operations instead of O(n / const) for linked list with skip list.
Perfomance improve achives in avarage 10-40% based on different benchmarks.
Fixes #7731