Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions roottest/root/tree/chain/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -33,6 +33,9 @@ ROOTTEST_ADD_TEST(TChainCleanup
MACRO execTChainCleanup.cxx+
OUTREF execTChainCleanup.ref)

ROOTTEST_ADD_TEST(TChainAddOtherChainGetEntries
MACRO chain_add_getentries.C+)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The aclic'd macro works great. I wonder if an executable can be better, making testing a bit faster, of course if no dictionary is required to be generated on the fly.


# to avoid copying of ROOT files - just run from source directory
ROOTTEST_ADD_TEST(subdir
MACRO runsubdir.C
Expand Down
103 changes: 103 additions & 0 deletions roottest/root/tree/chain/chain_add_getentries.C
Original file line number Diff line number Diff line change
@@ -0,0 +1,103 @@
// Minimal reproducer: TChain::Add(TChain*) does not update the cached entry
// count. After merging chains this way, GetEntries() / GetEntriesFast() return
// a wrong (too small) value, even though every file and every entry is present
// in the merged chain (a LoadTree walk and CopyTree both see the full sample).
//
// Run: root -l -b -q chain_add_getentries.C
//
// Observed with ROOT 6.40.00.

#include "TChain.h"
#include "TFile.h"
#include "TTree.h"
#include "TRandom3.h"
#include "TSystem.h"
#include "TROOT.h"

namespace {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this then is executed as a macro, why this namespace?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(the macro is a copy/paste of the user report).

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah I see: great to have such reproducers delivered to us! Perhaps this can be streamlined before integrating.


// Create nfiles files under dir, each with a TTree "tree" holding a random
// number of entries. Returns the total number of entries written.
Long64_t make_sample(const std::string& dir, int nfiles, TRandom3& rng)
{
gSystem->mkdir(dir.c_str(), kTRUE);
Long64_t total = 0;
for (int f = 0; f < nfiles; ++f) {
const std::string fname = dir + "/f_" + std::to_string(f) + ".root";
TFile out(fname.c_str(), "RECREATE");
TTree t("tree", "tree");

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just adding here that time and again we've had issues with tests opening a TFile and creating a TTree on the stack. Maybe to avoid headeaches better to do both on the heap.

Double_t x;
t.Branch("x", &x);
const Long64_t n = 50 + rng.Integer(200); // 50..249 entries per file
for (Long64_t i = 0; i < n; ++i) { x = rng.Gaus(); t.Fill(); }
t.Write();
out.Close();
total += n;
}
return total;
}

TChain* make_chain(const std::string& dir, int nfiles)
{
TChain* c = new TChain("tree");

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are many places in the test with a bare new without a corresponding delete, it would be much better to handle these via unique_ptr

for (int f = 0; f < nfiles; ++f)
c->Add((dir + "/f_" + std::to_string(f) + ".root").c_str());
return c;
}

} // namespace

int chain_add_getentries()
{
printf("ROOT version: %s\n\n", gROOT->GetVersion());

TRandom3 rng(42);
const std::string base = "chain_add_repro_files";
gSystem->Exec(("rm -rf " + base).c_str());

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe something xplatform would be better here?


// Three independent samples with different file counts.
const int nA = 4, nB = 7, nC = 5;
const Long64_t tA = make_sample(base + "/A", nA, rng);
const Long64_t tB = make_sample(base + "/B", nB, rng);
const Long64_t tC = make_sample(base + "/C", nC, rng);
const Long64_t truth = tA + tB + tC;
printf("entries written: A=%lld B=%lld C=%lld → true total = %lld\n",
tA, tB, tC, truth);

// One chain per sample; query each count (e.g. to print per-sample yields).
TChain* a = make_chain(base + "/A", nA);
TChain* b = make_chain(base + "/B", nB);
TChain* c = make_chain(base + "/C", nC);
printf("per-chain GetEntries(): a=%lld b=%lld c=%lld (sum=%lld)\n",
a->GetEntries(), b->GetEntries(), c->GetEntries(),
a->GetEntries() + b->GetEntries() + c->GetEntries());

// Merge by adding TChains into a TChain.
a->Add(b);
a->Add(c);

bool result = (a->GetEntries() == truth);

printf("\nafter a->Add(b); a->Add(c):\n");
printf(" GetEntries() = %lld ← %s (expected %lld)\n",
a->GetEntries(), result ? "Correct" : "WRONG", truth);
printf(" GetEntriesFast() = %lld ← %s\n", a->GetEntriesFast(), (a->GetEntriesFast() == truth) ? "Correct" : "WRONG");
printf(" files in chain = %d (all present)\n",
a->GetListOfFiles()->GetEntries());

// The entries really are all there: an explicit LoadTree walk reaches them.
Long64_t nav = 0;
while (a->LoadTree(nav) >= 0) ++nav;
printf(" reachable via LoadTree loop = %lld ← correct total\n", nav);

// Workaround: add the files into a single chain instead of chaining chains.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I'm not following the logic well, but why is the workaround part of the test?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose it is no longer strictly necessary but harmless to do the additional cross check. I will update the comment.

TChain* good = new TChain("tree");
for (int f = 0; f < nA; ++f) good->Add((base + "/A/f_" + std::to_string(f) + ".root").c_str());
for (int f = 0; f < nB; ++f) good->Add((base + "/B/f_" + std::to_string(f) + ".root").c_str());
for (int f = 0; f < nC; ++f) good->Add((base + "/C/f_" + std::to_string(f) + ".root").c_str());
printf("\nsingle chain (Add per file): GetEntries() = %lld ← %s\n",
good->GetEntries(), (good->GetEntries() == truth) ? "Correct" : "WRONG");

return result ? 0 : 1;
}

14 changes: 10 additions & 4 deletions tree/tree/src/TChain.cxx
Original file line number Diff line number Diff line change
Expand Up @@ -222,19 +222,25 @@ Int_t TChain::Add(TChain* chain)
delete[] fTreeOffset;
fTreeOffset = trees;
}
chain->GetEntries(); //to force the computation of nentries

TIter next(chain->GetListOfFiles());
Int_t nf = 0;
TChainElement* element = nullptr;
while ((element = (TChainElement*) next())) {
Long64_t nentries = element->GetEntries();
if (fTreeOffset[fNtrees] == TTree::kMaxEntries) {
if (nentries == TTree::kMaxEntries) {
fEntries = TTree::kMaxEntries;

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this line completely. fEntries is the member of this chain (the one where Add(TChain *other) is being called). So why are we setting the number of entries to kMaxEntries if the entries of the other chain are not known? Aren't we losing info on the preexisting number of entries in this chain?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are not. fEntries == TTree::kMaxEntries indicates that the total number of entries is not known, which is now the case since we don't know the number of element of one of the element of the chain.

fTreeOffset[fNtrees+1] = TTree::kMaxEntries;
} else {
fTreeOffset[fNtrees+1] = fTreeOffset[fNtrees] + nentries;
if (fTreeOffset[fNtrees] == TTree::kMaxEntries) {
fTreeOffset[fNtrees+1] = TTree::kMaxEntries;
} else {
fTreeOffset[fNtrees+1] = fTreeOffset[fNtrees] + nentries;
}
if (fEntries != TTree::kMaxEntries)
fEntries += nentries;
}
fNtrees++;
fEntries += nentries;
TChainElement* newelement = new TChainElement(element->GetName(), element->GetTitle());
newelement->SetPacketSize(element->GetPacketSize());
newelement->SetNumberEntries(nentries);
Expand Down
Loading