[io] Several fixes to handling of gaps and recovery by jblomer · Pull Request #22481 · root-project/root

jblomer · 2026-06-04T13:44:36Z

Fixes

Free segment recovery
Free segment header writing for large gaps (>2GB)

Replaces #19251
Fixes #19245

pcanal · 2026-06-04T15:19:06Z

+         }
+         if (fWritable) {
+            const Long64_t last = idcur - nbytes - 1;
+            if (fFree->Last() && static_cast<TFree *>(fFree->Last())->GetLast() + 1 == idcur) {


It feels like rather than equality a greater than would be more 'stable', isn't it?

I think I'd prefer the strict equality because we construct it this way in the loop before. I.e., we want to pass the condition exactly when we added a free segment in the last loop iteration and in this case we constructed idcur to point just after the last free byte.

github-actions · 2026-06-04T16:04:46Z

Test Results

20 files 20 suites 3d 6h 6m 25s ⏱️
3 861 tests 3 826 ✅ 1 💤 34 ❌
69 939 runs 69 766 ✅ 139 💤 34 ❌

For more details on these failures, see this check.

Results for commit 1d071c7.

♻️ This comment has been updated with latest results.

The free segment header for gaps >2GB was previously truncated, leaving the chain of segments broken. Now we will link together several smaller gaps on disk to keep the chain intact. The free list, however, will still contain one large gap.

In TFile::Recover(), incorrect free segments were added during recovery. The file offset counter should be incremented _after_ adding the free segment, as it is done now. Also, remove an unnecessary `Seek()`, which will take place at the next loop iteration anyway.

Do not use an existing free list during file recovery. It will result in stale and/or duplicated entries in the recovered free list. Instead, only use the information picked up from the segment headers.

If a new key is created in a free segment such that the remaining size is larger than 2G, we have to first read the original free segment header before we update the new, smaller header. Along the way, we change TKey::fLeft from Int_t to Long64_t.

jblomer · 2026-06-12T07:58:14Z

@pcanal I'm afraid this requires a re-review because the fix for a key written in a large gap are substantial. In order to leave the new key and the new, smaller gap with a valid linked list of segments, I took the approach of reading the links in the bytes that are going to be overwritten by the key. I hope this is ok, even though "writing" can now also involve reading some bytes. The advantage is that all the I/O remains local to the location of the key. There is another, write-only solution where I would simply create a new linked list of segments after the key until the end of the gap. That, however, may result in significant seeking and I think that's worse.

silverweed · 2026-06-12T07:33:03Z

+   f->Write();
+   f->Close();
+
+   EXPECT_EQ(2, fnCountGaps(fileGuard.GetPath()));


What happens if we also delete va0? Do all the gaps coalesce and we remain with a single gap or does the file get truncated and all gaps disappear? Should we test this?

silverweed · 2026-06-12T07:35:05Z

+   Long64_t nlast = newfree->GetLast();
+   assert(nfirst > 0 && nfirst <= first && nlast >= last);
+   Long64_t nbytesl = nlast - nfirst + 1;
+   assert(nbytesl >= static_cast<Long64_t>(sizeof(Int_t)));


Are these assertions valid even if the current TFile comes from opening a "corrupted" file?
If yes, probably a comment explaining why would be beneficial; if no, this should be an exception

silverweed · 2026-06-12T07:36:48Z

   enum { kStartBigFile  = 2000000000 };
+   enum {
+      kMaxGapSize = 2000000000
+   }; // Maximum absolute value of the free segment on-disk size marker


We might wanna stress out (via a static_assert or comment) that this MUST be less than numeric_limits<Int_t>::max()

silverweed · 2026-06-12T07:39:05Z

+   const Long64_t nlast = newfree->GetLast();
   assert(nfirst > 0 && nfirst <= first && nlast >= last);
-   Long64_t nbytesl = nlast - nfirst + 1;
+   Long64_t nbytesl = std::min(nlast, fEND) - nfirst + 1;


Should this change to std::min be already part of the previous commit?

silverweed · 2026-06-12T07:44:15Z

   }
+   fnWriteGapHeader(offset, nbytesl);
+
+   if (last == fEND - 1)


As far as I understand this check is not influenced by the while above, so in principle couldn't this be done before it and, if positive, used to skip the loop altogether? (since in this case we are truncating the file before the whole free chain if I understand correctly)

pcanal · 2026-06-12T21:45:54Z

   Bool_t result = f->WriteBuffer(buffer,nsize);
+
+   if (nGaps == 2) {
+      f->WriteBuffer(gapHeaderBuffers[1].data(), sizeof(Int_t));


Why don't we need a f->Seek(end_of_first_gap) here?

Because for just wrote buffer before which included the first gap.

jblomer self-assigned this Jun 4, 2026

jblomer requested review from bellenot and pcanal as code owners June 4, 2026 13:44

jblomer added the in:I/O label Jun 4, 2026

jblomer mentioned this pull request Jun 4, 2026

TFile issues when coalescing gaps #19245

Open

1 task

jblomer changed the title ~~[io] several fixes to handling of gaps and recovery~~ [io] Several fixes to handling of gaps and recovery Jun 4, 2026

jblomer mentioned this pull request Jun 4, 2026

Fix TFile gap coalescing #19251

Closed

jblomer force-pushed the tfile-fix-gaps-v2 branch 2 times, most recently from 84edc84 to b13a691 Compare June 4, 2026 14:06