Skip to content

Commit 610ca8c

Browse files
committed
[mathcore] Fix TKDTreeBinning::FindBin returning out-of-range bins
TKDTreeBinning computed fNBins independently from the kd-tree it builds. SetNBins picks a bucket size of fDataSize / nBins (integer division), but the underlying TKDTree creates ceil(fNPoints / bucketSize) terminal nodes. When the requested number of bins does not divide the data size evenly, the bucket size is rounded down and the tree ends up with more terminal nodes than fNBins. Since FindBin returns a terminal-node index, it could return a value >= GetNBins(), pointing at a bin with no edges and no content: ```c++ binning.GetNBins() # 1001 binning.FindBin([1.2, 0.5, 0.5, 0.5, 0.5]) # 1004 binning.GetBinContent(1004) # RuntimeWarning: No such bin ``` Set fNBins to the actual number of terminal nodes (GetNNodes() + 1) after building the tree, so FindBin always returns an index within [0, GetNBins()) and every bin has valid edges and content. This also self-corrects existing objects on read-back, as the Streamer rebuilds the tree via SetNBins. Also fix SetBinsContent to read the per-node point counts directly from the tree (GetNPointsNode) instead of an incorrect last-bin heuristic, so bin contents are exact and sum to the data size. Add a regression test covering the non-evenly-dividing case. Closes #10786. 🤖 Done with the help of AI.
1 parent 55780de commit 610ca8c

1 file changed

Lines changed: 8 additions & 4 deletions

File tree

math/mathcore/src/TKDTreeBinning.cxx

Lines changed: 8 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -136,6 +136,9 @@ void TKDTreeBinning::SetNBins(UInt_t bins) {
136136
fDataBins = new TKDTreeID(fDataSize, fDim, fDataSize / (fNBins - remainingData)); // TKDTree input is data size, data dimension and the content size of bins ("bucket" size)
137137
SetTreeData();
138138
fDataBins->Build();
139+
// Need to keep the number of bins consistent with the number of
140+
// terminal nodes in the kd-tree.
141+
fNBins = fDataBins->GetNNodes() + 1;
139142
SetBinsEdges();
140143
SetBinsContent();
141144
} else {
@@ -238,12 +241,13 @@ void TKDTreeBinning::SetTreeData() {
238241
}
239242

240243
void TKDTreeBinning::SetBinsContent() {
241-
// Sets the bins' content
244+
// Sets the bins' content from the actual number of points in each terminal
245+
// node of the kd-tree. All terminal nodes hold fBucketSize points except
246+
// possibly the last one, so query the tree directly instead of guessing.
242247
fBinsContent.resize(fNBins);
248+
UInt_t nTreeNodes = fDataBins->GetNNodes();
243249
for (UInt_t i = 0; i < fNBins; ++i)
244-
fBinsContent[i] = fDataBins->GetBucketSize();
245-
if ( fDataSize % fNBins != 0 )
246-
fBinsContent[fNBins - 1] = fDataSize % (fNBins-1);
250+
fBinsContent[i] = fDataBins->GetNPointsNode(i + nTreeNodes);
247251
}
248252

249253
void TKDTreeBinning::SetBinsEdges() {

0 commit comments

Comments
 (0)