During analysis of two heapdumps shared here the following issue was discovered as a hotspot that can benefit from optimization:
Performance Data
| Metric |
WITH transitive |
WITHOUT transitive |
Ratio |
| indexOfChild() self time (µs) |
70,405,568 |
35,739,594 |
2.0× |
| DeltaDataTree.lookup() (µs) |
55,328,734 |
24,397,133 |
2.3× |
Description
indexOfChild() performs a binary search on a sorted array of child nodes:
protected int indexOfChild(String localName) {
AbstractDataTreeNode[] nodes = this.children;
int left = 0, right = nodes.length - 1;
while (left <= right) {
int mid = (left + right) / 2;
int compare = localName.compareTo(nodes[mid].name);
// ...
}
return -1;
}
While the binary search itself is O(log n), DeltaDataTree.lookup() calls indexOfChild() at each path segment level:
public DataTreeLookup lookup(IPath key) {
int keyLength = key.segmentCount();
for (DeltaDataTree tree = this; tree != null; tree = tree.parent) {
AbstractDataTreeNode node = tree.rootNode;
for (int i = 0; i < keyLength; i++) {
node = node.childAtOrNull(key.segment(i)); // calls indexOfChild
}
}
}
The delta tree chain (tree.parent) can be deep, and each delta layer requires re-traversing the path from root.
The combined indexOfChild() time (70.4 seconds) represents the single largest CPU consumer in the entire build.
Suggested Fix
- Compact delta chains more aggressively: The parent chain walk is expensive. More frequent immobilization of the delta tree would reduce the number of layers to traverse.
- Cache frequent lookups: If the same resource paths are looked up repeatedly (e.g., project roots, source folders), a small LRU cache on
DeltaDataTree.lookup() could avoid repeated traversals.
- Consider hybrid data structure: For nodes with many children (> 100), consider switching from sorted array + binary search to a
HashMap<String, AbstractDataTreeNode> for O(1) child lookup. This is particularly relevant for directories containing many files (e.g., classpath container folders).
- Reduce
key.segment(i) overhead: Each call to IPath.segment(i) on Path objects does array access. Ensure paths are not being reconstructed or re-parsed in hot loops.
During analysis of two heapdumps shared here the following issue was discovered as a hotspot that can benefit from optimization:
Performance Data
Description
indexOfChild()performs a binary search on a sorted array of child nodes:While the binary search itself is O(log n),
DeltaDataTree.lookup()callsindexOfChild()at each path segment level:The delta tree chain (
tree.parent) can be deep, and each delta layer requires re-traversing the path from root.The combined
indexOfChild()time (70.4 seconds) represents the single largest CPU consumer in the entire build.Suggested Fix
DeltaDataTree.lookup()could avoid repeated traversals.HashMap<String, AbstractDataTreeNode>for O(1) child lookup. This is particularly relevant for directories containing many files (e.g., classpath container folders).key.segment(i)overhead: Each call toIPath.segment(i)onPathobjects does array access. Ensure paths are not being reconstructed or re-parsed in hot loops.