Fix/hot strict binna conformance by JohannesLichtenberger · Pull Request #978 · sirixdb/sirix

JohannesLichtenberger · 2026-05-21T19:42:12Z

No description provided.

…tation Adds the infrastructure needed to reason about strict-Binna conformance violations on the HOT trie writer: - createBiNodeTraced wrapping every BiNode construction site in HOTTrieWriter (7 sites). Each call carries a label identifying which source location built the node, so probeConstancyOnBuild can pin violations to specific construction paths. - probeConstancyOnBuild: gated on -Dhot.debug.constancy=true. After every indirect-page construction, walks each child's subtree and verifies the captured disc bits are constant on every reachable leaf-key. Logs label, page key, non-constant child indices, and a trimmed stack trace when a violation is detected. - discBitsAsAbsBitArray helper: extracts absolute bit positions from a SingleMask DiscBitsInfo for the probe's input. - buildFlatNonStrict: extracted from createNodeFromChildren's multi- child branch (no behavior change). Isolates the dense-PEXT rebuild path so future surgery can target it specifically. - diagnosticMicrobenchPatternReproducer test: 0.8s reproducer at N=50K mirroring smallCombinedMicrobench's exact insertion pattern (5K warmup writes at offset N+1M, then N main writes at [0..N)). Reports observedHeight, total violations, and violation-type breakdown. Reads the system properties hot.debug.constancy, hot.debug.i6trace, and hot.strict.binna for diagnostic gating. - diagnosticStaleRouteNThreshold: companion test that probes which N values surface violations. - docs/HOT_STRICT_BINNA_DESIGN.md: end-to-end design context — six prior in-session attempts and their failure modes, the phased work plan, and the architectural rationale for why no indirect-only algorithm fixes Sirix's multi-entry-leaf bit-conflict problem. Default behavior unchanged: probeConstancyOnBuild is silent unless -Dhot.debug.constancy=true is set; the strict-binna flag is read but not currently consumed by production code paths. Baseline reproducer result unchanged at 127 I6 routing violations on N=50K.

…rSplitWithPath When `addEntryWithPDep` cannot extend the parent's disc-bit set without breaking the constancy invariant — concretely: the new disc bit varies within a non-split sibling's subtree (multi-entry leaves can hold keys spanning future disc bits) — the previous code path fell back to `rebuildParentAbsorbingSplit`, which uses `computeDiscBits`'s adjacent- pair XOR scan to derive a fresh disc-bit set from children's first/last keys. That scan captures bits that are non-constant within some child's subtree, producing indirects where some stored keys cannot PEXT-route back to their containing leaf (HOT I6 — `pext-routes-to-leaf`). This commit adds a strict-Binna fallback path, gated on `-Dhot.strict.binna=true`: instead of rebuilding the parent with a fresh mask, it absorbs the leaf split by inserting an intermediate BiNode at the original child's slot. The parent's mask stays unchanged, so all existing routing remains correct; the new BiNode distinguishes the two split halves at a level below the parent. Cost: tree height grows by 1 locally on the affected path. Benefit: strict I-Binna sparse-path encoding preserved end-to-end (every captured disc bit is constant in every child's subtree at construction time). A pre-condition guard `intermediateBiNodePreservesSlotOrder` checks that the new BiNode's firstKey (= leftChild.firstKey) doesn't drop below the previous slot's firstKey — leaf splits can decrease the leaf's firstKey when a smaller new key lands in the left half. When the guard fails the code falls through to `splitParentAndRecurse` for a genuine split. Default behavior unchanged: without `-Dhot.strict.binna=true` the legacy `rebuildParentAbsorbingSplit`-then-`splitParentAndRecurse` path is used. Verification on the diagnostic reproducer (`HOTFormalVerificationTest.diagnosticMicrobenchPatternReproducer`, N=50K, mirrors `smallCombinedMicrobench`'s warmup+main pattern): baseline (default path): 127 I6 violations strict-binna (this commit): 1 I8 violation (cosmetic ordering; PEXT routing OK) Multi-revision and CoW correctness preserved across all 76 targeted tests (formal verification, fragment chain, leaf page CoW, large-scale integration, real-world E2E, merge-split stress).

…prep) Re-introduces the helper pulled in session 1 (was untested then), now covered by a same-package round-trip test that verifies decode against DiscriminativeBitComputer.isBitSet semantics. The helper walks the descent path and unions every indirect's absolute disc-bit positions, returning ascending-sorted (MSB-first absolute) ints. Handles SingleMask (one PEXT mask within an 8-byte window) and MultiMask (per-byte extraction positions across non-contiguous bytes), per the BE word convention used by HOTIndirectPage.getKeyWordAt. This is Phase 1 from docs/HOT_STRICT_BINNA_DESIGN.md: it provides the metadata Phase 2's constancy-aware leaf split needs to choose a split bit that's already an ancestor disc bit (so both halves are constant on every ancestor disc bit by construction). Test coverage: - SingleMask round-trip (window starts at byte 0). - SingleMask round-trip (shifted window, initialBytePos \!= 0). - MultiMask round-trip across non-contiguous byte positions. - Multi-level path: union+dedup+sort across multiple ancestors. - Empty path edge case. HFT-grade: pre-sized buffer (path_depth × 64), no boxing, single bounded allocation for the trimmed return. Default behavior unchanged.

…rement) Adds an atomic counter that increments every time the strict-Binna intermediate-BiNode fallback in updateParentForSplitWithPath fires. Each firing creates a persisted 2-entry BiNode at the original child's slot, growing the affected path's depth by 1 — quantifying how often strict-Binna mode falls off the height-optimal path. Public accessors: - getIntermediateBiNodeFallbackFirings() → current count - resetIntermediateBiNodeFallbackFirings() → zero (test-harness setUp) Wired into the diagnosticMicrobenchPatternReproducer test: the print line now includes the firing count, alongside observedHeight and violations. 50K reproducer baseline (strict-Binna ON): observedHeight=7 violations=1 (I8) intermediate-binode-fallbacks=53 Phase 3 (lazy retroactive sibling rebalance) and Phase 4b (Binna-faithful rebuild paths) aim to drive this counter to 0 on benign workloads. Documents the Phase 2 design analysis in handleLeafSplitAndInsertInternal: the leaf's only contiguous-partition-friendly split bit is MSDB itself; sibling-aware MSDB selection adds no real value because the existing fallback chain already handles parent rejection. Real Phase 2 leverage requires Phase 3. HFT-grade: the atomic increment only fires on the rare fallback branch, not on the hot common addEntryWithPDep success path. No allocation. Default behavior unchanged.

Adds three public methods on HOTLeafPage as Phase-2-and-beyond infrastructure for the strict-Binna conformance plan: - splitToWithInsertOnBit(target, key, ..., explicitSplitBit) — like splitToWithInsert but splits on a caller-supplied bit. The partition algorithm handles both contiguous and non-contiguous partitions; the caller is responsible for choosing a bit that preserves contiguous partition (= MSDB) so the post-split structure is sensible. - computeMsdbWithOptionalNewKey(newKey) — returns the absolute MSB-first bit position of MSDB for current entries plus an optional virtual new key. Lets writer code pre-compute MSDB without a full split attempt. - isBitNonConstantInLeafPlusNewKey(absBit, newKey, isNew) — returns whether the leaf's existing entries plus an optional new key span both bit values at a given absolute bit position. Useful for ancestor- bit-non-constant detection (Phase 2 corrective case). These methods are not called by Phase 2's current writer code (Phase 2's leaf-level fix is structurally limited to MSDB splits because non-MSDB splits break the children-sorted-by-firstkey invariant — see docs/HOT_STRICT_BINNA_DESIGN.md §4 / Phase 2 notes). They land now to support Phase 3 (lazy retroactive sibling rebalance), which needs to identify dirty bits, scan candidate sibling leaves, and split them on specific bits the parent expects. HFT-grade: zero allocation beyond unavoidable byte[] for transferred keys/values. Bit-set checks are primitive byte loads + bit masks via DiscriminativeBitComputer.isBitSet. Default behavior unchanged — these are additive helpers.

When addEntryWithPDep rejects in updateParentForSplitWithPath because some non-split sibling is non-constant on the new disc bit β (Case 2b-ii — the formal-verification residual that the c868e66 intermediate-BiNode fallback absorbs at +1 depth), Phase 3 instead walks each non-constant sibling, splits its leaves on β, and rebuilds the parent with β-constant children at every slot via mask inheritance from the original parent. Mask inheritance (vs. createNodeFromChildren-N's adjacent-pair re-derivation) avoids the Phase-4-class BUILD-VIOLATIONs where re-derivation captures bits that aren't constant within some child's subtree. Every non-constant sibling slot is replaced by two slots: leftHalf (β=0) and rightHalf (β=1). The original splitChild slot is also replaced by (leftChild, rightChild) — the leaf-split products from the upstream MSDB split. Partial keys are repositioned via Long.expand (PDEP), with β-bit set to each slot's β-value. CoW correctness: rebalanceAndIntegrate / splitSubtreeOnBit / splitLeafOnBit / splitIndirectOnBit all allocate fresh pageKeys, register fresh PageReferences in the TIL, and never mutate the original sibling subtrees in place. buildBucketWithInheritedMask reuses the original sibling indirect's mask (possibly with β stripped) so bucket subtrees stay HOT-faithful. Failure modes — falls through to c868e66's intermediate-BiNode fallback: - MultiMask parent (mask inheritance not yet ported to MultiMask) - β cross-window vs parent's PEXT window - β already in parent's mask (requires subtree merge — out of scope) - partial-key collision after rebalance (rare) - new fan-out > 32 (would need genuine parent split) 50K reproducer (-Dhot.strict.binna=true): before: violations=1, observedHeight=7, intermediate-binode-fallbacks=53 after: violations=1, observedHeight=6, intermediate-binode-fallbacks=48, phase3-rebalance-firings=17 The single residual I8 violation pre-existed (= same indirect, same byte pattern as baseline) and originates from the intermediate-BiNode fallback firing on a slot where leftChild.firstKey breaks I8 ordering — Phase 3 doesn't address that fallback's own correctness. Default behavior unchanged — Phase 3 is gated on -Dhot.strict.binna=true. New on HOTLeafPage: - splitToOnBit(target, splitBit): pure no-insert partition by absolute bit - isBitNonConstantInLeaf(absBit): cheap β-constancy probe New on HOTTrieWriter: - rebalanceAndIntegrate: Phase 3 entry point - buildRebalancedParentWithInheritedMask: mask-inheritance integrator - splitSubtreeOnBit / splitLeafOnBit / splitIndirectOnBit: recursive splitter - buildBucketWithInheritedMask: bucket subtree builder reusing parent's mask - wrapBucketInSubtree: 1- vs N-child bucket helper - PHASE3_REBALANCE_FIRINGS counter + getPhase3RebalanceFirings/reset - diagnosePhase3Skip / diagnoseIntegrateFail (gated on -Dhot.debug.phase3) All 17 formal-verification + multi-version + multi-rev + multi-layer + bug-fix regression tests green.

Adds Phase 4 infrastructure for β-already-in-mask subtree merge: - PHASE4_SUBTREE_MERGE_FIRINGS counter + accessors. - tryPhase4SubtreeMerge dispatch wired into updateParentForSplitWithPath between Phase 3 rebalance and intermediate-BiNode fallback. - subtreeMerge entry point covering both SingleMask and MultiMask cases. - Diagnostic skip-reason logging gated on -Dhot.debug.phase4=1. Status: implementation compiles + all tests green, but Phase 4 doesn't yet fire on the 50K reproducer (counter=0). The 48 fallback firings remain. Next session: trace why tryPhase4SubtreeMerge returns false under -Dhot.debug.phase4=1 and complete the wire-up.

Boolean.getBoolean only matches the literal string "true". The test was setting "1" so the diagnostic skip-reason log was silently no-op. With the fix, Phase 4 skip-reason diagnostic on the 50K reproducer: 42 firings → reason=no-sibling-slot 6 firings → reason=not-beta-in-mask This refines the case taxonomy: β-already-in-mask has two sub-cases. (1) parent has both β=0 and β=1 sibling slots — Phase 4 subtree merge handles this (0 firings observed → algorithm is correct for it). (2) parent has β in mask but only one polarity represented in children (sparse-path off-path) — the dominant 42 firings. Resolution requires hoisting moveHalf's keys to the ancestor where β actually discriminates between subtrees. Also updates docs/HOT_STRICT_BINNA_DESIGN.md with the live status.

Walks the descend path UP from the rejecting parent looking for the FIRST ancestor whose mask contains β AND whose stored partials include the EXACT target — descend slot's stored partial XOR-ed with the β-bit-position. When such an ancestor exists, replaces parent's slot[splitChildIdx] with the keep half and bulk-inserts the move half's keys into the hoist ancestor's exact target slot. CoW-correct via withUpdatedChild's pageKey-preserving semantics. The strict criterion (exact target match) is required for I6 PEXT-routing preservation: a loose criterion ("any slot with β=¬v_L") routes moveHalf keys into a subtree whose OTHER disc-bit values disagree, so subsequent lookups subset-match the descend slot at the hoist ancestor and land in originalParent where the keys no longer reside. Empirically the loose criterion produced 11,776 I6 violations on the diagnostic 50K reproducer (every moved key misrouted) — the strict criterion preserves I6 by guaranteeing exact-match routing for moveHalf keys after the move. Empirical activation profile: on the diagnostic microbench-pattern reproducer the strict criterion never fires (all 42 no-sibling-slot cases have no ancestor with the exact-target slot — β is genuinely vestigial at every level above parent, either because parent IS the root or because higher ancestors discriminate on earlier-byte disc bits and don't carry β). Those cases continue to fall through to the intermediate-BiNode fallback (48 firings — unchanged from before this commit). The hoist machinery is in place for workloads where the strict-target ancestor exists.

…s (Case 2b-iv-b-β) When the new disc bit β from a leaf split is already a parent disc bit and no existing sibling encodes the inverse polarity, addEntryWithPDep can now add a new sibling slot for the moveHalf at partial = splitChild.partial XOR β-bit, growing fan-out from k to k+1 with the parent's mask unchanged. Mirrors Binna's reference single-key-leaf semantics for the β-already-in-mask case. Three stored-partial gates protect routing soundness: 1. HOT I3 — partials must remain unique (rejects Case 2b-iv-a where Phase 4 subtree-merge applies). 2. Fresh polarity — every existing sibling must encode β=v_L. If a sibling Y has Y[β]=¬v_L at a partial != newPartial, post-firing routing for stored keys whose densePK ⊇ L1.partial would mis-route to L1 via subset-match. 3. Subtree-β constancy — declared partial-β bit must match each sibling's actual subtree-β value (catches multi-entry-leaf pollution). When all three gates pass, the new slot is added with partial=splitChildPartial XOR β-bit, children sorted by partial-key (HOT I7), parent's pageKey preserved (no upstream CoW required). Gated on hot.strict.binna=true. Fires zero times on the diagnostic 50K microbench-pattern reproducer (every β-already-in-mask case there has either a polluted sibling or a sibling at the inverse polarity, so the gates reject and the existing fallback path runs unchanged). Other workloads with genuinely fresh polarity benefit without growing tree depth. Diagnostic counter ADDENTRY_FRESH_POLARITY_FIRINGS exposed via getAddEntryFreshPolarityFirings() / resetAddEntryFreshPolarityFirings(), plumbed into the diagnostic reproducer's metrics line.

…ns (Case 2b-iv-b-β) MultiMask analog of the SingleMask fresh-polarity helper (commit 6199de0). When the new disc bit β is already in a MultiMask parent's extraction tables and the gates pass, addEntryMultiMask now adds a new sibling slot for the moveHalf at partial = splitChild.partial XOR β-bit, with the parent's extraction-positions and extraction-masks unchanged. Same three gates as the SingleMask version: 1. HOT I3 — newPartial must not collide with any existing sibling. 2. Fresh polarity — every existing sibling must encode β=v_L. 3. Subtree-β constancy — declared partial bit must match each sibling's actual subtree-β value. β output position decoding uses multiMaskBetaOutputPos (the same decoder shared with Phase 4 subtree-merge) — walks extractionPositions in order, counting set mask bits MSB-first within each byte. mostSignificantBitIndex recomputed defensively from the (unchanged) tables. Gated on hot.strict.binna=true. Fires zero times on the diagnostic 50K microbench-pattern reproducer (no MultiMask β-already-in-mask case there satisfies the gates). Other workloads with genuinely fresh polarity and MultiMask parents benefit without growing tree depth. Increments the same ADDENTRY_FRESH_POLARITY_FIRINGS counter as the SingleMask path so the diagnostic line reports a single combined total.

@disabled

…agnosis Adds 6 AtomicLong counters (BCH_FALLBACK_*) wired to each createNodeFromChildren fallback in buildCompressedHalf — MultiMask parent, identical keys, cross-window split bit, newMask==0, unknown child, partial-key collision. Plus a [bch.multimask] debug print on the MultiMask path gated on -Dhot.debug.bchfallback=true. The microbench-pattern reproducer now reports per-fallback firings alongside the existing intermediate-binode / phase3 / phase4 counters. With strict-Binna and the I8-gate next-side check tightened, exactly 1 MultiMask-parent fallback fires — producing the 5993 I6-pext-routes-to-leaf cascade Phase 4b is meant to address. The other five fallbacks fire 0 times on this workload, so Phase 4b's scope is substantially smaller than the design doc's 4-named-cases plan suggested. docs/HOT_PHASE_4B_DIAGNOSIS.md captures the diagnosis and a 7-step fix plan (4b.0 helpers + round-trip test → 4b.6 hard-gate createNodeFromChildren-N) with test gates per step. Also adds a @disabled probe test (diagnosticPhase4SubtreeMergeFiringSweep) that empirically confirmed Phase 4 subtree-merge fires zero times across 21 feasible configs of natural CAS workloads — Case 2b-iv-a's preconditions are too narrow to arise from realistic insert sequences. Default behavior unchanged: counters start at zero, prints fire only under the debug flag, intermediateBiNodePreservesSlotOrder doc-comment updated to warn future readers that the marginal I8 violation is load-bearing until Phase 4b lands. Reproducer baseline (1 violation, height 6, 48 fallbacks, 17 phase3 firings) preserved bit-identical.

Adds three package-private static helpers in HOTTrieWriter that Phase 4b.2 will use to port C++'s compressEntries{,AndAddOneEntryIntoNewNode} algorithm to the MultiMask-parent path of buildCompressedHalf: 1. computeRelevantBitsFromPartials(partials, indices) — extracts the existing adjacent-pair OR scan from buildCompressedHalf into a layout-independent helper. Mirrors C++'s SparsePartialKeys::getRelevantBitsForRange. 2. extractMultiMaskSubset(parent, relevant) — returns a MultiMaskSubLayout record (extractionPositions, extractionMasks, numExtractionBytes, totalBits, msbIndex) containing only parent's mask bits whose output positions are set in `relevant`. Mirrors C++'s DiscriminativeBitsRepresentation.extract template. 3. (computePartialKeyMultiMaskDirect — already existed; now package-private so tests can use it as the encoding oracle.) The round-trip identity proven by 9 same-package tests in MultiMaskSubLayoutTest: computePartialKeyMultiMaskDirect(K, newLayout) == (int) Long.expand(Long.compress(oldPartial, relevant), allOnes(newLayout.totalBits)) — directly encoding key K under the sub-layout matches repositioning K's old partial via Long.compress→Long.expand. This is the precondition for Phase 4b.2's MultiMask-parent path: existing children's partials can be repositioned in the new sub-layout without re-scanning their full first keys. Also refactors buildCompressedHalf's SingleMask path to call the helper — bit-identical baseline preserved (50K reproducer: 1 violation, height 6, 48 intermediate-binode-fallbacks, 17 phase3-firings, all bch-fallbacks=0). Test gates per docs/HOT_PHASE_4B_DIAGNOSIS.md step 4b.0: - Round-trip green (9/9 tests pass) - Existing tests unchanged (reproducer bit-identical)

…virtual-BiNode insight Adds buildCompressedHalfMultiMask: structural port of the SingleMask inheritance code to MultiMask parents. Mirrors C++'s template extractAndExecuteWithCorrectMaskAndDiscriminativeBitsRepresentation by explicitly branching on parent.getLayoutType() (Java has no template equivalent). The MultiMask path: 1. computes relevant bits via the helper from 4b.0, 2. extracts a sub-MultiMask layout via extractMultiMaskSubset, 3. extends the layout with the leaf-split's β if both halves land here, 4. encodes split-product partials via computePartialKeyMultiMaskDirect, 5. repositions inherited partials via Long.compress / Long.expand, 6. enforces I3 uniqueness, sorts by partial-key (HOT I7), 7. constructs MultiMask SpanNode/MultiNode (or BiNode for length 2). Each fallback path (identical-keys, newMask-zero, unknown-child, partial-collision) increments its corresponding BCH_FALLBACK_* counter, preserving the diagnostic visibility from fdaba3c. ARCHITECTURAL CORRECTION captured in HOT_PHASE_4B_DIAGNOSIS.md: Re-reading the C++ reference revealed that BiNode is a plain value-type struct (BiNodeInterface.hpp), never persisted as a tree node. C++'s integrateBiNodeIntoTree (HOTSingleThreaded.hpp:493-547) treats the leaf-split's two halves as ONE addition (valueToInsert) + ONE replacement (valueToReplace) — never two separate sibling insertions. Sirix's c868e66 intermediate-BiNode workaround materializes a HOTIndirectPage where C++ uses a stack-local struct. That extra persisted node creates the marginal I8 violation we see today, and the bothSplitHere case in splitParentAndRecurse + buildCompressedHalf is a structural mismatch with the C++ algorithm's semantics. The Phase 4b.2 MultiMask path lands here as a structural port that handles all healthy cases correctly. The bothSplitHere case with multi-entry-leaf pathology falls back via BCH_FALLBACK_PARTIAL_COLLISION (no worse than pre-Phase-4b on that case). The complete fix requires the virtual-BiNode refactor (Phase 4b-vb) — adopting C++'s one-insert-one-replace integration semantics throughout the leaf-split → parent integration backbone. ~3 weeks of follow-up work, scoped in HOT_PHASE_4B_DIAGNOSIS.md. Default reproducer state preserved bit-identically (1 violation, height 6, 48 intermediate-binode-fallbacks, 17 phase3-firings, all bch-fallbacks=0). MultiMaskSubLayoutTest still 9/9 green.

…load Adds an overload of HOTLeafPage.splitToWithInsert that takes an optional int[] newSideOut output array. On success, newSideOut[0] receives 0 if the just-inserted key landed in `this` (LEFT, β=0) or 1 if it landed in `target` (RIGHT, β=1). Untouched on failure. The 5-arg overload is preserved as a thin wrapper passing null — no churn to existing callers. Phase 4b-vb consumers (the C++-faithful integration path) need this to compute valueToInsert / valueToReplace per integrateBiNodeIntoTree's semantics — pass the half WITHOUT the new key as valueToReplace (replaces splitChild's slot in-place) and the half WITH the new key as valueToInsert (the single new entry added at the parent level). Adds same-class round-trip test (Phase 4b-vb.0 gate). 4 cases: - new key smaller (β=0) → newSideOut=0, key lands in `this`, - new key larger (β=1) → newSideOut=1, key lands in `target`, - null newSideOut accepted (backward-compat), - failure path leaves sentinel value unchanged. Existing 5-test SplitToWithInsertEdgeCases nested suite still green. No behavior change for any production caller (all currently use the 5-arg overload). Future Phase 4b-vb.1 work will switch select callers to the 6-arg form to thread newSide upward.

…ateParentForSplitWithPath Adds an `int newSide` parameter to updateParentForSplitWithPath. Threads the value from splitToWithInsert's newSideOut through: - the main leaf-overflow path (HOTTrieWriter:691 area) — captures splitToWithInsert's newSide and passes to updateParentForSplitWithPath, - the Phase 4 sibling-subtree leaf-split path (splitLeafAndIntegrateInSiblingSubtree) — captures locally and passes, - the two recursive splitParentAndRecurse → updateParentForSplitWithPath callers — passes newSide=1 (C++ recursive convention from integrateBiNodeIntoTree, which always passes newIsRight=true on recursion because parentNode.split() places the new entry in mRight by construction at the upper level). Phase 4b-vb.3 will refine these when splitParentAndRecurse adopts proper one-insert-one-replace semantics. The parameter is currently unused inside updateParentForSplitWithPath — the threading is a mechanical one-shot change so future Phase 4b-vb.2 / 4b-vb.3 edits can consume it without re-touching every call site. Default reproducer state preserved bit-identically (1 violation, height 6, 48 fallbacks, 17 phase3-firings, all bch-fallbacks=0). HOTLeafPageSplitTest + MultiMaskSubLayoutTest still green.

… semantics Adds SplitProductRoles record and classifySplitProducts() helper. Mirrors C++'s integrateBiNodeIntoTree lines 512-514 (HOTSingleThreaded.hpp): given (leftChild, rightChild, newSide), classify into: - valueToInsert = the half receiving the just-inserted key (newSide==1 → rightChild; newSide==0 → leftChild). Treated as the SINGLE new entry from parent's perspective. - valueToReplace = the OTHER half. Conceptually overwrites splitChild's existing slot pointer at parent. - entryOffset = the C++ offset variable (0 if valueToReplace==leftChild, 1 otherwise) for the slot-overwrite bookkeeping. Sirix's addEntryWithPDep already places both products by partial-key sort, so the role distinction is currently information-only. Phase 4b-vb.3 will consume it in splitParentAndRecurse's refactor — the half receiving the new entry is built with valueToInsert as its single new child (no bothSplitHere), then the other half (or splitChildIdx's slot in the same half) gets valueToReplace overwritten in. Three same-class tests in MultiMaskSubLayoutTest cover newSide=1, newSide=0, and the -1 sentinel (falls through to newSide=0 semantics). 12/12 tests pass. Default reproducer state preserved bit-identically. No call sites use classifySplitProducts yet — the helper is the foundation for Phase 4b-vb.3 to consume mechanically without re-deriving the offsets.

… popcount==1 In buildCompressedHalf and buildCompressedHalfMultiMask, the 2-child case unconditionally produced a BiNode with disc-bit computed from boundary keys (child[0].lastKey vs child[1].firstKey). On multi-entry-leaf trees that's unsound: child[0]'s subtree may span both values of bit `db` (the boundary keys disambiguate but interior keys may not), violating the BiNode's constancy invariant. Surfaced when the I8 gate next-side check is tightened (splitParentAndRecurse + buildCompressedHalfMultiMask running) — produced a constancy build-violation at the 2-child output node. Fix: only emit a BiNode when newMask has exactly one bit set (SingleMask: Long.bitCount == 1; MultiMask: finalTotalBits == 1). For multi-bit cases fall through to the SpanNode/SpanNodeMultiMask path, which preserves the full disc-bit set in the indirect's mask + multi-bit partials. Also documents the failed Phase 4b-vb.3 attempt at including originalChildIndex in parentIndices: the change captured bits distinguishing splitChild's projection from siblings (matching C++'s getRelevantBitsForRange semantics) but produced incorrect stored partials in the resulting indirect (I-Binna-sparse-path violations) due to additional subtleties in the relevant→newMask→inherited-encoding chain. Reverted; the fix shape is left as a comment in splitParentAndRecurse for the next session's analysis. Default reproducer state preserved bit-identically (1 violation, height 6, 48 intermediate-binode-fallbacks, 17 phase3-firings, all bch-fallbacks=0). MultiMaskSubLayoutTest and HOTLeafPageSplitTest still green.

Adds the diagnostic infrastructure used by this session's deep dive into the I-Binna-sparse-path violations, and the Phase 2 plumbing for future constancy-aware leaf splits. None of this changes default behavior. HOTIndirectPage: - Adds a public static volatile POST_CREATE_HOOK Consumer<HOTIndirectPage> that runs after every indirect factory invocation. Default: null (zero overhead). HOTTrieWriter's class-load static block installs it. HOTTrieWriter: - probeSparsePathStatic(built, label) — runs the sparse-path NECESSARY check (stored & ~dense) == 0 on every child of a freshly-built indirect. Logs to /tmp/sirix-sparse-path-violations.log with stack trace + frame trim. Gated on -Dhot.debug.sparsepath=true. - staticGetFirstKeyFromChild — null-safe first-key access from already-attached pages, for use from the static probe (no activeReader available). - bytesToHex helper. - BCH_SINGLEMASK_ENTRIES + BCH_ENCODING_MISMATCHES counters + accessors + reset. - findOffendingAncestorBit(pathNodes, pathDepth, leaf, keyBuf) — Phase 2 helper that returns the first ancestor disc bit β where keyBuf's value differs from leaf.firstKey's value (= insertion would break β-constancy). Currently unused; preserved for future Phase 1+2 integration. - handleLeafSplitAndInsert(explicitSplitBit) overload — Phase 2 plumbing that splits the leaf on an explicit ancestor bit via splitToWithInsertOnBit instead of the MSDB-based splitToWithInsert. The 13-arg form remains as the default (explicitSplitBit = -1). AbstractHOTIndexWriter: - doIndex now contains a comment block documenting the Phase 2 eager constancy-aware split experiment that was tried + reverted. The eager split fired on nearly every insert in early-stage trie construction, causing tree height to explode 6 → 12 and violations 1 → 635 on the 50K reproducer (per design doc §6 Variant A's warning that non-MSDB splits break contiguous-partition invariants). Phase 2 needs Phase 1 (per-leaf ancestor-bit tracking) infrastructure first. HOTFormalVerificationTest: - diagnosticMicrobenchPatternReproducer sets hot.debug.sparsepath + hot.debug.bch.encoding properties; prints bch-encoding counters per run. docs/HOT_PHASE_4B_DIAGNOSIS.md: - Updates with the 4-step causal chain of the I-Binna-sparse-path violation (= multi-entry-leaf β-constancy break via subsequent inserts). - Documents three reverted attempts (parentIndices-includes-splitChildIdx, OR-removal-in-addEntryWithPDep, eager-constancy-aware-leaf-split) and why each didn't fix the root cause. - Path forward: Phase 1 + Phase 2 (proper, contiguous-partition-preserving) + Phase 3 of the original design plan. ~3-4 weeks. Default reproducer state preserved bit-identically: N=50000 · violations=1 · intermediate-binode-fallbacks=48 · phase3-rebalance-firings=17 · all bch-fallbacks=0 · all bch-encoding=0. HOT regression suite (HOTFormalVerificationTest + HOTLeafPageSplitTest + MultiMaskSubLayoutTest): BUILD SUCCESSFUL in 5m 20s.

…reverted; helpers stay Adds: - HOTLeafPage.computeMsdbWithKey(byte[] key) — public helper that returns the MSDB this leaf would have if `key` were inserted. Wraps existing private findEntry + findMsdbWithNewKey logic. Constrains the existing findOffendingAncestorBit helper to only return a non-negative β when β EQUALS the leaf's MSDB-with-K (= contiguous-partition case where splitting on β = splitting on MSDB, which preserves HOT I8). The naive "any non-constant ancestor β" version (= attempt #1, reverted earlier) fired non-MSDB splits which broke contiguity. Tried re-enabling the eager Phase 2 path in AbstractHOTIndexWriter.doIndex with this constrained helper. Result: 1 → 645 violations (vs attempt #1's 1 → 635). Even contiguous β = MSDB splits expose downstream cascade issues in addEntryWithPDep / splitParentAndRecurse / intermediate-BiNode paths — they construct stored partials assuming I-Binna invariant, which is the same assumption the eager split tries to enforce (circular). Conclusion: Phase 2 alone is insufficient. The full fix requires Phase 4b's sparse-path-encoding-throughout-indirect-construction refactor FIRST, then Phase 2 to maintain leaf-level β-constancy. Multi-week per phase. Phase 2 disabled in AbstractHOTIndexWriter.doIndex (= falls through to normal merge + standard MSDB-based overflow handling). The helpers and the handleLeafSplitAndInsert(explicitSplitBit) overload remain in place for the future Phase 1+2 integration session(s). Default reproducer state preserved bit-identically (1 violation, height 6, 48 intermediate-binode-fallbacks, 17 phase3-firings, all bch-fallbacks=0, all encoding-mismatches=0). HOT regression suite (HOTFormalVerificationTest + HOTLeafPageSplitTest + MultiMaskSubLayoutTest): BUILD SUCCESSFUL in 5m 45s.

Adds an explicit I5-leaf-constancy check that walks each child's ENTIRE subtree (not just c.firstKey) and verifies the sparse-path subset condition holds for every key in the subtree: for each indirect i: for each child c at slot s with c.stored \!= 0: for each key K in c's subtree (recursive walk): require (c.stored & ~PEXT(K, i.mask)) == 0 The existing I-Binna-sparse-path check tested this only against c.firstKey. That missed the multi-entry-leaf case where firstKey was correct but interior keys violated constancy — exactly the gap our Phase 4b deep-dive kept hitting without being able to localize. Empirical confirmation: - Default reproducer (I8 gate as-is): N=50000, 1 violation = I8 only. Trie's leaf-level β-constancy IS preserved in default mode. - I8 gate tightened (next-side check): 5995 violations = 2 I5-leaf-constancy + 5993 I6-pext-routes-to-leaf cascade. The I6 cascade is a downstream consequence of the I5 violations, not an independent fault. This is the missed case the user predicted ("I think we always miss cases"). Now we can localize multi-entry-leaf β-mixing precisely at the indirect/child level, with the offending subtree-key dumped in the violation message — giving the next session a concrete starting point for designing the fix. Walks subtrees recursively via walkLeavesUntilFalse (= predicate-style visitor). Cost: O(N) per indirect where N = subtree key count. Validator is test-only, so the cost is acceptable for full-trie verification. HOT regression suite: BUILD SUCCESSFUL in 5m 59s. Default reproducer state preserved bit-identically (1 violation, height 6, 48 fallbacks, 17 phase3, all bch-fallbacks=0, all encoding-mismatches=0).

… helpers Extends probeSparsePathStatic with an I5-strict subtree walk: after detecting the existing sparse-path violation against c.firstKey, also walks c's swizzled subtree to find any interior key K where (c.stored & ~PEXT(K, mask)) \!= 0. Logs to /tmp/sirix-sparse-path-violations.log under -Dhot.debug.sparsepath=true, including the offending failingKey for direct correlation with HOTInvariantValidator's I5-leaf-constancy violations. Adds two helper methods for future Phase 1+2+3 implementation: - computeSubtreeIntersectionDenseSingleMask(c, initialBytePos, newMask) - computeSubtreeIntersectionDenseMultiMask(c, extPos, extMasks, numExtractionBytes) Each walks c's subtree (via the activeReader) and returns the AND of dense PEXT over every key. The result is the strongest sparse-path-encoded c.stored that provably satisfies I5-strict (sparse ⊆ dense_K for every K in subtree). Tried using these helpers in buildCompressedHalf to replace the inherit-from- old-partial encoding. Reverted: the intersection produces MORE collisions than the old approach (= more partial-collision fallbacks → more createNodeFromChildren calls → MORE I5 violations). The helpers remain in place as infrastructure for the broader Phase 1+2+3 refactor that maintains leaf-level β-constancy at insert time, where intersection encoding becomes safe. Also adds staticWalkLeavesUntilFalse + walkLeavesUntilFalseInstance helpers (predicate-style subtree walks) used by the probe + intersection helpers. Default reproducer state preserved bit-identically (1 violation, height 6, 48 intermediate-binode-fallbacks, 17 phase3-firings, all bch-fallbacks=0, all encoding-mismatches=0).

…urces Multi-session formal-verification effort, Stage A: enumerate every invariant the HOT design requires, cross-referenced from primary sources (C++ reference implementation at speedskater/hot, Sirix design doc, Phase 4b diagnosis, existing HOTInvariantValidator, writer source). The catalog systematically lists 16 invariants — I1 through I12 plus the 4 supplementary ones (I-Binna-sparse-path, I-PDEP-PEXT-identity, I-CoW-pageKey, I-CoW-multi-rev). For each: - Stable name (Sirix I-prefix where possible) - Formal predicate - Source citation (file + line) - Operation × invariant preservation table - Current Sirix-side coverage status (validator + writer) - Sirix-specific notes (= multi-entry-leaf concerns) GAPS identified (= invariants in C++ but not currently checked by Sirix): - I4-first-partial-zero: never checked. Each indirect's partialKeys[0] must equal 0 per Binna's "first mask always zero" rule (HOT SingleThreadedNode.hpp:179, 244, 267, 292, 320). Currently only implicit via partial-key sort (= 0 sorts to slot 0). - I11-trie-condition: never explicitly checked. Disc bits must become strictly less significant down the tree (parent.MSB < child.discBit per HOTSingleThreaded.hpp:521-523). - I12-disc-bit-monotone: subsumed by I11 if added. Operation × invariant preservation matrix (§3) shows every construction site's per-invariant coverage. Reveals the campaign's recurring failure pattern: every fix attempt addressed ONE column without checking others. The cascades came from violations of an UNCHECKED invariant in a DIFFERENT operation. Stage F's design must close all 9 columns × 11 rows simultaneously. Multi-entry-leaf specific concerns documented in §4. Three architectural options per design doc §3 — eager / lazy / single-key-leaves — with trade-offs. Operations inventory in §5 (~25 ops × 14 invariants = 350-cell matrix for Stage B). Path forward (§6): A→B→C→D→E→F→G→H. Stage B next: per-operation preservation matrix sourced from writer's code, not intuition. This document is the source of truth for all subsequent stages. No existing code changed. Default reproducer state preserved bit-identically (1 violation, height 6, 48 intermediate-binode-fallbacks, 17 phase3-firings).

Per-operation analysis of all 35 mutating operations across HOTLeafPage, HOTIndirectPage, and HOTTrieWriter. Each operation scored against the 16 invariants from Stage A's catalog with one of: PRESERVES (with mechanism citation), WEAK (preserved on happy path only), VIOLATES (known broken), N/A, or UNKNOWN. Sourced from the writer's actual code, not intuition. The doc is research output — no code changes. Hard violations surfaced: - Op 1/2 (leaf.put / mergeWithNodeRefs): I5/I-SP. Leaf inserts have no awareness of ancestor disc bits — the root cause of every cascade in the campaign. - Op 22/23 (buildCompressedHalf SM/MM): I5/I-SP/I6. "Constancy check removed" decision is correct under sparse-path encoding *if* upstream data is sound; multi-entry-leaf invalidates that pre-condition. - Op 24/25 (rebuildParentAbsorbingSplit / createNodeFromChildren-N): I5/I-SP/I6. Documented at line 5298-5304. - Op 26 (createBiNodeTraced as intermediate-BiNode): I8 next-side, the documented marginal violation. - Op 34/35 (Phase 4 bulkInsert): inherit Op 1's I5 via recursion. Recurring failure pattern explained: each fix attempt (parentIndices / sparse-path no-OR / eager Phase 2 / constrained Phase 2 / intersection- encoding) closed ONE column of the matrix while opening multiple others. Stage F's design must close all hard-violation cells simultaneously. Convergence: every cascade traces to leaf-level I5 violation. Stage F must enforce I5 at INSERT time, not OVERFLOW time. Stage C next: extend HOTInvariantValidator with I4, I11, and the leaf- insert pre-condition check that surfaces Op 1/2's I5 gap.

Three new structural checks land in HOTInvariantValidator: I4-first-partial-zero Per Binna's C++ HOTSingleThreadedNode.hpp:179-320 ("THE FIRST MASK ALWAYS IS ZERO\!\!"). Verifies the smallest stored partial across all child slots equals zero — under sparse-path encoding the leftmost child by partial-key sort takes LEFT at every BiNode on its path, so all on-path bits are 0. I11-trie-condition Per C++ HOTSingleThreaded.hpp:521-523 (assert(parent.MSB < splitBit)). For every parent → child indirect edge, child.mask MSB must be strictly less significant (= larger absolute bit position) than parent.MSB. Run during walk(), so it covers every edge in the tree. I-leaf-insert-precondition Static helper checkLeafInsertPreservesI5(rootRef, newKey, reader). Walks the PEXT-descent path; at each ancestor scans the chosen child's subtree for each captured β bit. If the subtree is β-constant at value v but newKey.β \!= v, reports a violation: insert would change subtree from β-constant to β-mixed. Surfaces Op 1/2 (leaf.put) I5 gap at the operation that creates it (Stage B §5.4 convergence point). Mixed existing values do NOT trigger — those are either off-path bits (legal) or already-broken I5 (separately reported). HOTInvariantValidatorChecksTest exercises positive + negative pairs for each check (6 tests, all pass). Synthetic indirect/leaf pages built via public factory methods + in-memory PageReferences; Mockito stub reader (unused since pages are pre-loaded). Empirical Stage E preview from the 50K microbench-pattern reproducer (hot.strict.binna=true): I4 fires once at indirect 11 (smallest partial 0x38, must be 0). I11 does NOT fire — trie condition holds throughout. The 100K CAS workload (default mode, no strict-binna paths) passes with zero violations. So I4 is specifically a strict-Binna-paths gap, not a default-writer gap. Narrows Stage F's fix scope to rebalanceAndIntegrate / intermediate-BiNode / Phase 4 paths. Updated class Javadoc to enumerate I1..I11 + I-Binna-sparse-path + I-leaf-insert-precondition with cross-references to Stage A's catalog.

Stage D — embeds a post-mutation validation gate into diagnosticMicrobenchPatternReproducer, gated on -Dhot.strict.validate=true (forwarded via build.gradle systemProperty). When enabled: - Captures violation/firing-counter checkpoints every 250 main-phase inserts. - Calls HOTInvariantValidator.checkLeafInsertPreservesI5(...) BEFORE every main-phase insert, recording which keys would break I5 if inserted. - Counter deltas attribute violations to the operations that fired between adjacent checkpoints. Stage E — docs/HOT_EMPIRICAL_FAILURE_TABLE.md (270 lines). Empirical analysis of one full Stage D run answers the 5 open questions from Stage B §7 and flips multiple ⚠/? cells in the master matrix to ❌ or ✅ based on observation. Headline findings: - First-failure indices: I8/I11 at idx 500, I-leaf-insert-precond at 515, I4 at idx 1750. - 195 of 202 checkpoints show {I4=1, I8=1} — the de-facto baseline. - I11 fired once at idx 500 then SELF-CLEARED by main+1500 (later happy- path operation restored monotonicity at the offending edge). - All four violation types appear with ZERO recovery-counter deltas in the preceding window. The campaign was patching the wrong subsystem — these violations are caused by HAPPY-PATH addEntryWithPDep / splitParentAndRecurse / buildCompressedHalf, not by the recovery fallbacks (Phase 3 / Phase 4 / fresh-polarity / intermediate-BiNode / BCH fallbacks). - Phase 4 / fresh-polarity / BCH fallbacks: 0 firings across 50K inserts. Confirmed inert on this workload. - Pre-condition fires 72 times (0.14% of inserts) — the keys that would turn β-constant subtrees β-mixed. Stage F implication: the fix scope is happy-path correctness in addEntryWithPDep and splitParentAndRecurse, NOT recovery-path patching. The earlier session-narrative ("Op 1 leaf.put silently breaks I5") was correct directionally but missed the actual error: the writer's PDEP repositioning logic is producing structurally invalid layouts on the default success path. Stage F's per-operation preservation logic must target lines 1430-1448 (addEntryWithPDep partial repositioning), 4319-4467 (splitParentAndRecurse), 4663-4898 (buildCompressedHalf SingleMask main path), and the splitToWithInsert MSDB fall-back at handleLeafSplitAndInsertInternal:1031-1037. build.gradle: forward -Dhot.strict.validate to test JVM.

…minated Stage F deliverable: docs/HOT_FIX_DESIGN.md (290 lines). Per-operation preservation logic for the 4 happy-path bugs Stage E surfaced. Each bug (B1-B5) gets mechanism + fix + blast radius + validation strategy. Sourced from Stage E's empirical data, not intuition. Stage G partial: G.1 (Bug B1, addEntryWithPDep) — I4 self-check rejects when no slot's partial = 0. Counter G1_I4_REJECT_ADDENTRY tracks firings. G.1b (Bug B1 MultiMask variant, addEntryMultiMask) — same self-check. POST_CREATE_HOOK trace pinpointed addEntryMultiMask:3636 as the actual source of "indirect 11 smallest partial 0x38" — not addEntryWithPDep as Stage F initially predicted. G.3 (Bug B3, buildCompressedHalf) — I4 self-check, falls back to createNodeFromChildren-N if no zero partial. Counter G3_I4_REJECT_BCH. Stage D gate result on 50K reproducer (-Dhot.strict.binna=true -Dhot.strict.validate=true): Before: violations=2 {I4=1, I8=1}, BN=48, P3=17, FP=0, P4=0 After: violations=1 {I8=1}, BN=48, P3=25, FP=0, P4=0 G.1 rejected once; Phase 3 absorbed 8 cases (17 → 25 firings). Stage G.4 (next-side I8 gate tightening) ATTEMPTED + REVERTED. Cascaded 1 → 5998 violations (5993 I6, 2 I5, 3 I11) from splitParentAndRecurse's sparse-path encoding bugs in buildCompressedHalf. The intermediate-BiNode workaround was hiding deeper bugs. The 1 remaining I8 violation cannot be tightened until splitParentAndRecurse / buildCompressedHalf produce correct I6-routing partials. Documented in intermediateBiNodePreservesSlotOrder. 100K CAS default-mode test still passes 0 violations. Stage G continues next session: investigate buildCompressedHalf SingleMask path I6 violation root cause. Until that's fixed, the 1 marginal I8 from intermediate-BiNode is the lesser-evil baseline.

…sedHalf G.5 (computeRelevantBitsFromPartials): replace adjacent-XOR with OR(partials) ^ AND(partials). Equivalent on sorted-contiguous sequences (Sirix's current pattern), but strictly correct on any partial sequence. Defensive against future workloads with non-contiguous partial subsets. G.6 (buildCompressedHalf SingleMask inherited-children): replace sparse PEXT-via-relevant + PDEP-into-newLayout with DIRECT PEXT from the child's firstKey under newMask. The sparse-path encoding's bit-elision optimization is correct ONLY when upstream β-constancy holds (= dropped bits are truly constant in the child's subtree). Under multi-entry-leaf workloads that breaks. Direct PEXT is structurally sound for I6 routing of firstKey, and the I-leaf-insert-precondition still surfaces the leaf-level β-mixing as a separate gap. Trade-off: loses the bit-elision optimization (a few mask bits per half), gains routing correctness on cascade. Neither fix changes the 50K reproducer's baseline (still 1 marginal I8 from intermediate-BiNode workaround). 100K CAS default test still passes 0 violations. The 1 remaining I8 violation requires deeper investigation — G.4 next-side gate retried with G.5+G.6 still cascades to 5998 I6/I5/I11 violations from a path NOT covered by these fixes (likely createNodeFromChildren-N or a downstream addEntryWithPDep success). Next session must trace the cascade source.

Phase 2 retry attempt with G.1/G.1b/G.3/G.5/G.6 in place: cascaded 1 → 517 violations (512 I6, 2 I-Binna-sparse-path, 2 I5, 1 I8). G-fixes' I4 self-checks didn't prevent the cascade. Empirical pin (Stage G PEXT trace at idx 1750+): cascade source is indirect 26 child[0] with stored partial 0x0 ("don't care") which matches keys with bit 0x8000=1 even though child[0]'s subtree was originally β=0-only. Routing prefers more-specific match (child[1]'s partial 0x8000) over child[0]'s 0x0; the actual key lives in child[0]'s subtree → I6 violation. The fundamental issue: under sparse-path encoding, partial 0x0 means BOTH "leftmost on every BiNode" AND "subtree has all bit-0 paths". Multi-entry-leaf β-mixing breaks the second claim AT INSERT TIME, but routing soundness only verifies the first via subset-match. Phase 2 split-on-β fixes ONE leaf but the indirect structure already encodes the broken β-constancy claim. Required fix: split-on-β PLUS reroute the misfit half to a new sibling slot at the appropriate ancestor level (= Phase 4 subtree-merge for the leaf-insert case). Phase 4 currently only fires for addEntryWithPDep-rejection cases — extending it to the leaf-insert pre-emptive split case is a multi-week integration task. Stable state preserved: 1 I8 violation on 50K reproducer + 0 on 100K default. Helpers (findOffendingAncestorBit, computeMsdbWithKey, handleLeafSplitAndInsert explicit-bit overload) remain in place.

After 11 distinct Stage G fix attempts (G.1 through G.12 plus retries), documented in HOT_FIX_DESIGN_V2.md, the empirical evidence is conclusive: - The 1 marginal I8 violation cannot be fixed via localized patches in the existing writer architecture. - Every gate tightening (G.4, G.7, G.8, G.10) cascades into 5x to 5998x more violations because the rejected case forces a fall- through path that produces DIFFERENTLY-broken layouts. - Multi-entry leaves vs sparse-path encoding semantics are fundamentally incompatible under random workloads. Sirix's writer has 14+ indirect-construction operations, each with subtly different invariant-preservation properties — patching one shifts the bug to another. Tested architectural fixes: G.12 (single-entry leaves under -Dhot.strict.binna): correctness- wise sound. N=5K still produces 2 I8 violations (the intermediate-BiNode workaround creates them independently of leaf semantics). N=50K times out at 600s (every insert triggers split → ~2495 BN firings at 5K, scales linearly). G.12 + G.4 (single-entry + next-side I8 gate): same 2 I8 at N=5K. G.12 + disable-intermediate-BN: 5288 violations cascade. The honest finding: the marginal I8 violation requires a multi-week architectural rework (Option B in HOT_FIX_DESIGN_V2.md §3.2): constancy-aware leaf split + Phase 4 routing-restructure for the leaf-insert-precondition cases, integrated across the 14+ indirect-construction ops via a single canonical builder. None of the existing helpers are missing; the integration is what's missing. Stage G's landed fixes (G.1, G.1b, G.3, G.5, G.6) preserved: they eliminated the I4 violation and added defensive correctness without regression. The 1-violation baseline holds.

Independent proof-review (post-commit 0234f43) flagged 4 critical errors, 5 soft spots, and 3 notation issues. This commit addresses them: Critical fixes: - S6.10 Theorem 2: construction violated (F3). Re-construct with densePK_K = S_{k+1} OR 2^{m-1} so densePK_K differs from every S_c and (F3) holds. Sub-case A's I7 argument now follows from the bit-m-1 extension. - S6.6 Case (iv.a) Exact-match: clean case-split on c_ins relative to p: in (p, a] -> I8 fails via lex chain; <= p -> I7 fails because densePK_K > S_a > S_{c_ins}. - S6.6 Case (v.a) Exact-match: removed (F3) miscite, replaced with I7 contradiction (sigma_L > S_a > S_c for c < a). - S6.10 Sub-case B counting: restated cleanly as three-category fate (untouched/preserved/extracted) with at-most-one survivor at index 0, forcing k+1 extractions = k+1 touches. Soft spot fixes: - Lemma 7: expanded from 3 sub-cases to 4 covering ADD/SPLIT-L/SPLIT-R/ REPLACE/merged-content; clarified proof references Definition 4 directly. - Lemma 9: distinguished SRC bound from SRC-tightness; weaker statement S'_{c'} ⊆ intersection densePK is what SRC alone guarantees; tightness is a separate convention; theorems now use the appropriate bound. - Theorem 5: scope clarified -- TB-h inherits Theorem 2 (correctness fails), TB-s/TB-m show SRC-impossibility for adversarial K'; the table summary in S6.16 updated accordingly. - Theorem 6: insert cost now stated as O(depth) per insert (every ancestor re-ORs), not O(1); delete cost remains Omega(content). - Observation 1 in S6.15.3: added explicit citation to HOTLeafPage.findEntry and Binna's S3.3 for the single-entry-leaf contrast. Notation: - Added Lemma 7.5 capturing the firing-implied lex chain K < firstKey(p) < firstKey(a) < ... so subsequent proofs cite it directly. - Case (iv.a) non-exact-match regime now made explicit as a sub-case (not buried in a parenthetical). Doc grows from 1228 to 1358 lines. Six theorems and two corollaries remain; the table in S6.16 is updated to reflect Theorem 5's revised scope.

HOTIndexStressTest.testLargeNameIndex (50K unique names) failed with IllegalArgumentException from HOTIncrementalInsert.addEntry: "split bit N is already a discriminative bit of the node — not a valid integration". Root cause. The multi-value-leaf adaptation accumulates discriminative bits in ancestor masks via addEntry's zero-fill convention. Over many revisions, this can produce cross-level mask overlap: a parent's discBits include a bit β that equals one of its child node's MSB. When the writer subsequently splits that child via splitIndirect or splitIndirectWithEntry, the resulting BiNode has β = child.MSB. integrate then calls addEntry(parent, biNode), which requires β NOT to be in parent's mask — but it is. addEntry throws. The trie invariant (Binna's Theorem II(d): parent.MSB < child.MSB and parent.discBits disjoint from child.discBits) is the implicit precondition. Under single-entry leaves it holds by construction; under multi-value-leaf adaptation it can be violated. Fix. Add canIntegrateBiNodeCleanly() that simulates integrate's cascade upfront — at each spine level, checks whether the cascade's β at that level collides with the target node's mask. Three call sites in tryBranchIncremental now pre-check before invoking integrate: * branchFullNodeAtExistingBit (the betaIsDiscBit-full path; β = node.MSB after splitIndirect). * branchSplitFullNode (the betaIsNewDiscBit + full path; β = node.MSB after splitIndirectWithEntry). * singleEntry && leafEntry case (β = the branching bit; biNode height=1). When the simulator detects a cascade-time collision, the call site returns false and the caller (branchAboveLeaf) falls back to scoped rebuildSubtree(insertDepth) — same fallback discipline as the other non-resolvable cases (Direction-1 I8-unsafe and C2 collision). Cost is O(subtree-at-insertDepth), bounded. Verified: * HOTIndexStressTest.testLargeNameIndex passes (23.7s, 50K names). * 927-test HOT suite green (no regressions). The pre-check is a clean predicate (no exceptions for control flow). The canonical incremental fix for cross-level mask overlap is open work — it would require recognizing the overlap as a structural opportunity and folding the BiNode's two halves into parent's existing β-aligned slots. That's a meaningful primitive design exercise; deferred to a follow-up.

Replaces the scoped-rebuild fallback added in 7507cbf with a genuine incremental primitive. The cross-level mask overlap case (parent's mask already contains the split BiNode's β = child.MSB) is now folded in place rather than triggering rebuildSubtree. New primitive HOTIncrementalInsert.mergeBiNodeAtExistingDiscBit: when β is already a discriminative bit of the target node (column c), the node's slot at affectedChildIndex encoded the pre-split child as β-column=0 (the off-path- straddle that addEntry's zero-fill creates). splitIndirect puts the β=0 keys (incl. the child's firstKey, since β=child.MSB) in biNode.left, so the left half keeps the slot's existing partial and the β=1 straddle half gets oldPartial | β-bit -- identical to addChildAtCombination's comboPartial, which BetaIsDiscBitRoutingProbe already proves routes correctly. An assertion documents the v==0 invariant (it never fires across the 927-test HOT suite). integrate now dispatches on β-membership at each cascade level: - β fresh to parent's mask -> addEntry (canonical). - β in parent's mask, parent not full -> mergeBiNodeAtExistingDiscBit. - β in parent's mask, parent full -> splitIndirectWithSlotReplaceAndInsertion (the full-node analog), then recurse. Membership uses a new zero-alloc HOTIndirectPage.isDiscriminativeBit(beta) (constant-time bit test for SINGLE_MASK, bounded scan for MULTI_MASK) so the common β-fresh path adds no allocation -- only the rare overlap branch materializes the disc-bit array for the column index. Removed: the canIntegrateBiNodeCleanly pre-check and its three call sites in AbstractHOTIndexWriter (the cascade simulation is no longer needed -- integrate handles the overlap directly instead of bailing to rebuild). This eliminates one of the two engineering-deferred rebuild fallbacks (the other -- Direction-1 I8-unsafe -- is the theoretically-necessary one proved in HOT_PAPER_IMPOSSIBILITY.md). Advances the paper's S9.3 sparse-mu gap. Verified: * HOTIndexStressTest 50K names: 23.7s (rebuild) -> 4.06s incremental (5.8x). * 927-test HOT suite green with assertions enabled (v==0 invariant holds). * 4/4 HOT stress tests + full sirix-query suite green. * 500K microbench: 0 validator violations, 0 missed values, height 2.

Adds S8.4 "Two fallback classes — only one is theoretically necessary": - Class 1 (Direction-1 I8-unsafe): the impossibility-theorem subject; scoped rebuild provably optimal (Theorems 1-6). - Class 2 (cross-level mask overlap): NOT a theoretical impossibility — a multi-value-leaf accumulation artifact, now eliminated by the incremental mergeBiNodeAtExistingDiscBit primitive (commit c78da2d). No rebuild fires. Clarifies that Theorem 4's "scoped rebuild is canonical" claim applies only to Class 1; conflating the two would overstate the impossibility. Updates the S9.3 sparse-mu open-question to note the Class 2 case is already O(node-width).

…corner) Code review of c78da2d found a failure-mode regression: removing the canIntegrateBiNodeCleanly pre-check meant the rare un-mergeable cross-level- overlap corners now THREW out of integrate instead of falling back to rebuild. Two such corners: - C2 collision: the flipped straddle partial (oldPartial ^ beta-bit) already occupies another slot. - v \!= 1 orientation guard (the off-path-straddle invariant says v==0; the review confirmed it sound, but defense-in-depth bails rather than corrupt). The 1M + fuzz + multi-rev strict-validation canaries did NOT trigger either corner, but both are reachable in principle, so a production index must fall back gracefully, not crash the insert. Fix (clean pre-check predicate, per the chosen approach — no exceptions for control flow): - HOTIncrementalInsert.canMergeBiNodeAtExistingDiscBit(node, beta, slot): read-only, zero-alloc; returns false for the two un-mergeable corners. - AbstractHOTIndexWriter.canIntegrateBiNodeCleanly(pathNodes, childSlots, currentDepth, beta): walks integrate's cascade, checking canMerge at every beta-in-mask level. CRASH-SAFE BY CONSTRUCTION: the beta evolution matches integrate's full-node cascade exactly (beta -> parent.MSB after a split), so it never returns true when integrate would throw; it is height-free (does not model integrate's intermediate-placement short-circuit), which can only cause an occasional *unnecessary* rebuild, never a missed crash. Wired the pre-check at the five branch-path integrate sites (all reached via branchAboveLeaf, which falls back to scoped rebuildSubtree on a false return): branchFullNodeAtExistingBit, branchSplitFullNode, the affectedCount==numChildren pull-up, the singleEntry+leafEntry pairing, and the full-boundary-child wrap. The merge-path sites are unchanged: handleOffPathOverflow already try/catches its immediate-parent fold, and the deeper-cascade exposure there is no worse than before this campaign (addEntry threw on the same overlap pre-existing). Verified: 927 HOT tests green (assertions on), 4/4 HOT stress tests, 50K name index still incremental (no rebuild fired -> fast). The mergeBiNodeAtExisting- DiscBit assert(v==0) never fires, confirming the invariant the review affirmed.

@disabled

Adds interleavedInsertDeleteScaleFuzz (interleaved insert/delete, 3 seeds × 15 revs × 2000 entries/rev, strict HOTInvariantValidator gate at EVERY committed revision) to close the production-readiness coverage gap: the existing interleavedInsertDeleteMultiRev runs at 1000/rev and passes, sitting just under the threshold where a multi-value-leaf correctness bug appears. The new test FAILS (so it is @disabled to keep CI green) — it surfaces a pre-existing data-loss defect at 2000/rev: - seed=DEADBEEF fails at rev 3 (same seed passes at 1000/rev); - violations: I1 cross-leaf key duplication, I6 PEXT-descent misroutes (keys stored but unretrievable), I5/I8/sparse-path mismatches; - persists with the cross-level-overlap incremental merge disabled (-Dhot.diag.disableMerge), so it is NOT the recently-added fold — it is an independent, scale-triggered multi-value-leaf bug. This is a hard production-readiness blocker (data loss). The test is retained as the minimal repro; re-enable once the scale bug is fixed.

@disabled

While root-causing the 2000/rev data-loss repro, found that ad-hoc -Dhot.diag.* flags were NOT reaching the forked test JVM: sirix-core/build.gradle only forwarded an explicit whitelist of hot.* properties. This silently no-op'd newly-added debug flags (disableMerge, disableConsolidation, perInsertValidate), invalidating three earlier "rule-out" experiments. Fix: forward ALL hot.* system properties via a pattern (mirroring the root build.gradle's sirix.* forwarding), so future -Dhot.diag.* flags work without per-flag whitelisting. With the flags actually propagating, the invalidated rule-outs were re-verified correctly: - NOT the cross-level-overlap incremental fold: the repro reproduces with main src checked out at the pre-fold commit 7507cbf. (My fold c78da2d is exonerated -- not a regression.) - NOT consolidation: -Dhot.diag.disableConsolidation=true still fails. The bug is pre-existing and scale-triggered (1000/rev clean, 2000/rev corrupt), created by the cycle-3 reinsert (live-key I1 duplication, pre-commit), and reproduces under FULL versioning. Corrected the @disabled repro's rationale accordingly.

The incremental branch insert added a child partial (comboPartial) assuming the affected subtree was one-sided on the split bit beta -- Binna's single-TID premise. With multi-value leaves the subtree can straddle beta, so under equality-/most-specific-preferred PEXT routing the new single-key child became the route for an existing sibling key WITHOUT migrating it, leaving the key live in two leaves (I1) plus an I6 misroute = silent data loss. It surfaced at 2000 entries/rev (clean at 1000/rev) on interleaved delete+reinsert of overlapping values; HOT is the default secondary-index backend, so this was a default-path bug. Guard all seven partial-adding branch sites: simulate routing on the candidate node and, if any existing key would route to the freshly added single-key child, abandon the branch and merge the key into its descended leaf instead (routing- consistent by the descent tautology, no competing partial -> nothing stranded). A later capacity-driven leaf split migrates the whole densePK class atomically. Re-enable the scale-fuzz repro and add a fast FULL-versioning regression test. Document the finding in the impossibility paper: Class 3 fallback (8.4a), height-bounded \!= minimum-height (8.4b), and the structure-vs-layout / third-axis framing (1.1). Validated: HOT 999, sirix-core 9118, sirix-query 772 -- 0 failures.

… + benchmark Broader fuzzing (8 seeds x 20 revs, varied distributions) showed the strand guard's merge-into-descended-leaf fallback was itself unsafe: forcing K into a leaf it routes to but is lex-out-of-place in leaves a straddling leaf that a later branch can mis-encode, surfacing as I8/I5/I6 violations (seed a5a5a5a5, rev 5 -- ordering/constancy/misroute, not duplication). Fall back to the canonical rebuildSubtree(insertDepth) instead at all seven branch-add strand sites -- straddle-free and I5/I6/I8-clean by construction. Strands are simulation-gated, so the rebuilds stay rare. Add the validation that caught it: - aggressiveInterleaveFuzz: 8 seeds, ascending/descending/clustered/dup-heavy/ random shapes, strict per-revision validation + oracle retrievability. - chunkIdxBoundaryRetrievability: >65536 nodes so a value's nodeKeys span chunkIdx 0 and 1; asserts full retrievability (no chunk lost). - branchHeavyInsertGuardCost: 200k-entry insert (~160k/s) -- confirms the O(subtree) strand check adds no O(N^2) cliff. Full HOT stress class green.

The committed §8.4a described the strand-guard fallback as merging K into its descended leaf. That mechanism was superseded: it left a straddling leaf that broke I8/I5/I6 under aggressive fuzzing, so the fix discharges via the canonical rebuildSubtree instead. Update §8.4a to match the code, and record the merge-deferral as an instructive failed attempt (it reinforces the thesis: under multi-value leaves no localized structure-preserving primitive absorbs the firing).

…mmon case The strand guard previously always fell back to rebuildSubtree(insertDepth) (the whole insert-depth subtree). Measurement across the stress suite shows ~99% of strands have a single source leaf, and the strandable keys are confined to the DESCENDED leaf -- where K also routes -- in ~57% of fires. For those, rebuild only `leaf ∪ {K}` into a canonical mini-HOT and splice it into the leaf's slot, propagating height/partial up the spine (leafScopedRebuild): O(one leaf + path). The remaining fires (off-path or multi-leaf source -- K and the strandable keys occupy different node slots) keep rebuildSubtree(insertDepth), which is the minimal correct scope there, not a fallback by laziness. Reuses the trusted machinery (collectSubtreeEntries, HOTBulkBuilder, propagateRebuildUpSpine), so the only new logic is the confinement check and the leaf splice. Adds STRAND_LEAF_REBUILD / STRAND_FULL_FALLBACK observability counters. Validated: full HOT stress class (incl. 8-seed aggressive fuzz, chunkIdx, benchmark) 0 violations; 200k branch-heavy insert ~200k entries/s. Paper §8.4a updated.

Off-path strands (strandable keys in a sibling leaf, K descending to a different node slot) previously always fell back to rebuildSubtree(insertDepth). When the strandable keys are confined to a single sibling leaf, migrate instead: build the new child as bulk-build(K + strandable), replace the source leaf with bulk-build(source minus strandable), and re-encode the node KEEPING its sparse partials (recomputing them from first keys would emit dense PEXT values that break I4's leftmost-zero rule). Commit only if HOTMalformedSubtreeDetector passes. This makes the detector a per-firing decision procedure for the impossibility: moving keys to satisfy routing (I6) either preserves lex-order (I8) -- the migration succeeds -- or it does not, which IS the Class-1 I8-unsafe firing of Theorems 1-4, caught as an I8-malformed candidate and bounced to the canonical rebuild. On the 8-seed aggressive fuzz: 3 of 9 off-path strands migrate, 5 are I8-unsafe -> rebuild, 1 multi-source. With the on-path leaf rebuild, ~63% of strands discharge in O(leaves + path); the residual ~37% are genuine Class-1 firings + whole-subtree wrap sources. Defensive: any unexpected edge (build/assemble/propagate) is caught and falls back to the rebuild, which re-derives structure from the collected keys regardless. Validated: full HOT stress class (8-seed aggressive fuzz, chunkIdx, benchmark) 0 violations; 200k branch-heavy insert ~190k entries/s. Paper §8.4a updated.

…scale bug) Adds soakWithConcurrentReaders: sustained remove-whole-array + reinsert-overlapping across many revisions (live set stays small, but cumulative nodeKeys grow past 65536 -> chunkIdx >= 1), with N reader threads doing HOT-direct index validation concurrent with the writer. Gated by -Dhot.soak.run=true (Assumptions.assumeTrue -> skipped in CI); build.gradle bumps the soak fork heap to 8g only when that flag is set. It reproduces a GENUINE, committed HOT I8 violation (children-sorted-by-firstkey: sparse partials stay ascending but firstKeys go out of order -- the I7-vs-I8 divergence) at ~rev 210, caught WRITER-side (run with -Dhot.soak.validateEvery=1), independent of the concurrent readers. The 8-seed/20-rev aggressive fuzz never reached chunkIdx >= 1 under churn, so it missed this. Root cause not yet localized; HOT is not production-ready until fixed. Uses a single write transaction for index creation + all mutation (the index listeners are bound to that wtx's path summary; reusing across wtx surfaces a separate session-core listener-rebinding bug).

Under sustained remove+reinsert past the 65536-key chunk boundary (chunkIdx>=1), a branch combo-add placed K's fresh single-key child at its sparse-partial slot while K's firstKey carried an off-path bit (set in densePK, elided from the sparse partial) that pushed it past a later sibling's firstKey: K >= next.firstKey, breaking I8 (children-sorted-by-firstkey) while I7 stayed ascending. The existing branchAddStrandsExisting guard checks routing only, not firstKey order, so it slipped through and corrupted the committed CAS index (surfaced at rev 210 of the soak). It is the dual of the Direction-1 I8-unsafe firing (K < prev.firstKey). Add nodeViolatesI8 -- the firstKey-order complement to the routing guard -- at the 5 partial-sorted combo-add sites; on a violation, discharge via the canonical rebuildSubtree(insertDepth) (HOTBulkBuilder, I8-clean by construction). The 2 BiNode-wrap sites need no check (2-child node ordered by the genuine MSDB, I7==I8). Also add an opt-in per-insert localizer (-Dhot.localize.i8) that pinpoints the introducing handler; zero cost when off. Validated: soak 260 revs / 520k inserts with per-revision validation + concurrent readers, 0 violations (was: fail @ rev 210); full HOT suite, sirix-query, and default-config sirix-core all green. Cost: 1 BRANCH_I8_UNSAFE_REBUILD in 520k inserts; committed height 4 (bounded). Documented in docs/HOT_PAPER_IMPOSSIBILITY.md sec 8.4a-2 / 8.4b.

… churn A multi-seed high-chunkIdx soak (six seeds, per-revision validation) exposed a malformation the single-seed run missed: under seed 0xA5A5A5A5 a merge-path leaf-split fold (integrate) produced an I4 first-partial-zero violation at rev 127 (a node whose smallest stored partial is 0x1, so no child takes the all-left path). The branch combo-add guard from the previous fix does not cover the merge/integrate handlers. Add a uniform defense-in-depth backstop: after any structural change, walk the inserted key's current descent path from the root and, at the shallowest indirect that is structurally malformed (the cheap I4/I7/I8 invariants via nodeStructurallyMalformed), discharge by a scoped canonical rebuild of that node's subtree (healStructuralViolationOnPath / rebuildExistingSubtree, STRUCTURAL_SELFHEAL_REBUILD). A fold can only malform nodes on the key's path, so the O(height) walk is sufficient; rebuilding the shallowest violator subsumes its descendants (Binna Lemma 3). It runs only on structural-change inserts (the fast merge is exempt) and rebuilds only on a real violation (1 firing per 280k inserts on the I4 seed). The pre-commit combo-add guard now also covers I4/I7 (nodeViolatesI8 -> nodeStructurallyMalformed). Disable via -Dhot.selfheal.structural.disable. Also parameterize the soak seed (hot.soak.seed) for multi-distribution coverage. Validated: six-seed / 260-rev / ~3.1M-insert soak with per-rev validation, full HOT suite, and default-config sirix-core all clean. Documented in docs/HOT_PAPER_IMPOSSIBILITY.md sec 8.4a-3.

…ti-index soak Close the one concrete gap left by the per-insert self-heal: consolidateSubtree is a whole-subtree post-order leaf-merge sweep that runs after dispatchInsert, so it can touch nodes OFF the inserted key's path that healStructuralViolationOnPath cannot see. Add healMalformedSubtrees: a pre-order sweep that, at every highest structurally malformed (I4/I7/I8) indirect in the consolidated subtree, discharges via a scoped canonical rebuild (Binna Lemma 3 — the rebuild subsumes descendants). Runs at consolidation cadence only (every CONSOLIDATION_INTERVAL inserts), so it is amortized cheap. Also parameterize the soak for coverage: hot.soak.seed (multi-distribution) and hot.soak.index=cas|name (NameKeySerializer key bit layout, a different composite-key shape; node keys grow document-globally so it still crosses chunkIdx>=1). Validated: 16-seed CAS sweep (revs=140, validate every rev) 16/16 clean; 6-seed name soak (~1M reader validations each) 6/6 clean; full default-config sirix-core BUILD SUCCESSFUL.

…ders dereferenceRecordPageForModification asserted, for non-FULL versioning, that each combined fragment has getGuardCount()==0 after the writer releases its getPageFragments() guard. That is a false invariant under concurrent readers: guardCount is a single shared AtomicInteger with no reader/writer ownership, so a reader thread that guarded the same fragment (it read the old fragment while the writer copy-on-writes a fresh modify page) leaves a residual count the writer cannot distinguish from its own. The writer's accounting is already balanced by the unconditional release loop, and the ClockSweeper only evicts at guardCount==0, so the reader's guard is safe and releases when its (try-with-resources) read transaction closes. The assertion (debug-only, -ea) spuriously failed HOTVersionedLeafStressTest.soakWithConcurrent- Readers once read throughput was high enough to reliably overlap a reader's guard with a writer remove (exposed by hot.soak.index=name, ~1M reader validations/run). Replace it with a comment explaining why a residual reader guard is legitimate; FULL versioning already documented this. Validated: full default-config sirix-core BUILD SUCCESSFUL; 6-seed name soak 6/6 (was 0/2 on the racing seeds before the fix).

…s-branch Two changes to the HOT index writer, validated together. (1) Upgrade the post-structural-change self-heal from the cheap I4/I7/I8 check to the FULL invariant detector (HOTMalformedSubtreeDetector: I3/I4/I5/I7/I8/I11), scoped to the just-spliced subtree (captured at the registerFreshSubtree choke point as selfHealScope). Because I5 (leaf-constancy) is routing-soundness (foundation Theorem 2), this makes correct routing (I6) and no-cross-leaf-duplicate (I1) a per-insert RUNTIME GUARANTEE in the touched region, not merely an empirical commit-time observation — discharged via the Theorem-4 scoped rebuild. A mutation only malforms nodes it touched, so the O(subtree) detector is scoped (not a from-root scan); the cheap path-walk still covers ancestor firstKey ordering. Consolidation uses the same full detector. Cost: detector runs only on structural-change inserts (the fast merge is exempt); the soak fires ~18-28 eager I5 heals per ~300k inserts with no throughput regression (~28k inserts/s on the churn workload). (2) HFT: leastSignificantDiscBit no longer allocates an int[] via discriminativeBits just to read its last element — it is on the per-insert merge-vs-branch decision path. Compute the max disc-bit position allocation-free (single-mask: initialBytePos*8 + 63 - ntz(bitMask); MULTI_MASK: highest extraction byte's lowest-order on-path bit). Also remove the nodeViolatesI8 pass-through wrapper (call nodeStructurallyMalformed directly; its doc was stale — the check covers I4/I7/I8). Validated: 5-config CAS+name soak (per-rev validation, both the I4 and I8 regions), single seed 440k-insert soak 0 violations, full HOT suite, and default-config sirix-core all green.

The 16-wide short-vector path computed `searchReg.and(haystack1).reinterpretAsBytes()` into an unused local (`andResult1`), then recomputed `searchReg.and(haystack1)` for the actual mask — a wasted vector AND per routing lookup. Remove the dead computation.

…path Three allocation removals on the dominant secondary-index insert path: - Merge-with-existing inserts deserialized BOTH operands into Roaring64Bitmaps just to OR in a single key. Add NodeReferencesSerializer.mergePackedSingleBit, a byte-level binary-search insert into the sorted packed format: returns the same array on a present-key no-op (skips the slot rewrite), a fresh packed byte[] on insert, or null to fall back to the slow path. Drops 2 Roaring64Bitmap + 2 NodeReferences allocations per insert and is byte-identical to the slow path (5000-trial differential test). NodeReferences itself (shared with the RBTree backend) is untouched. - The key was trimmed from the 4KB thread-local buffer 3x per insert. doIndex now trims once and reuses the slice for prepareLeafOfTree and mergeWithNodeRefs (their internal trims become no-ops; handlePrefixForInsert is read-only on the key). - analyzeDescent's binary search compared via Arrays.compareUnsigned(getKey(mid), key), allocating a byte[] per probe; use the existing zero-alloc compareKeyWithBound. Validated: sirix-core + sirix-query suites, HOT suite (fuzz + formal verification), 3-seed per-rev soak (CAS + name, 240k inserts each) clean.

…ocate test-only API - HOTIndirectPage: remove the unused LayoutType.POSITION_SEQUENCE constant and the never-assigned bitPositions field (plus its two clone blocks). layoutType is only ever SINGLE_MASK/MULTI_MASK and LayoutType is serialized by ordinal (PageKind); the trailing constant is never written, so removing it leaves ordinals 0/1 unchanged -- no on-disk format change. - HOTIndirectPage: the no-height createSpanNode now delegates to the height overload (identical body bar height=0). createMultiNode is left as-is -- its two overloads are genuinely distinct (childIndex direct-byte vs partialKeys PEXT), not a height/no-height pair. - SparsePartialKeys: move findMasksByPattern / getRelevantBitsForRange / determineValueOfDiscriminatingBit (zero production callers, reference-parity only) into SparsePartialKeysTest as static helpers that read via the public getters. Validated: HOT suite (fuzz + formal verification) + HOT page tests green.

…not maxNodeKey Names.fromStorage pre-sized its three fastutil maps to maxNodeKey/0.75, where maxNodeKey is the monotonic high-water of node keys ever allocated in the dictionary. Under name churn that high-water grows without bound (every remove-then-re-add of a name does maxNodeKey++ in setName), so each Names instance ballooned to O(historical-slot-count) (~17MB at ~400k high-water) while holding only ~1024 live names. The bounded names cache retained hundreds of these, OOMing the Java heap. The name-index soak (1024 distinct names churned per revision) OOM'd at rev 196/520k; the CAS soak (1 name) was unaffected. Size the maps by the live entry count (default-grown) instead. Diagnosed via async-profiler alloc,live (Names.<init> the top retained-array allocator). The residual O(maxNodeKey) reconstruction scan (reads every historical slot, including tombstones) is a separate, latent perf issue; design, invariants and formal-verification plan in docs/NAME_DICTIONARY_RECONSTRUCTION_PLAN.md. Also harden the soak harness: forward hot.soak.seed/index + sirix.allocator.maxSize to the worker, and ExitOnOutOfMemoryError + HeapDumpOnOutOfMemoryError so an OOM fails fast (it previously hung the forked JVM for hours) and is diagnosable. Validated: full sirix-core suite green; 520k/260-rev name soak completes (was OOM@196), 0 reader errors.

…e-key bitmap Names.fromStorage scanned 1..maxNodeKey to rebuild a dictionary, where maxNodeKey is the monotonic node-key high-water that grows without bound under name churn (remove+re-add allocates fresh keys). Cold reconstruction was therefore O(historical slots) -- probing every dead/tombstoned slot -- even for a handful of live names. Persist, per dictionary, a Roaring64Bitmap of the live HashEntryNode node-keys in the NamePage (serialized alongside maxNodeKeys; binary format extended, no migration). Reconstruction now iterates only the live entries (O(live)), reading each entry + count node. The 1..maxNodeKey scan remains as the fallback when the bitmap is absent and as the differential-test oracle. Versioning-safe: node keys stay monotonic and are never reused (I-N4/I-N5); the bitmap is per-revision CoW state, deep-copied on copy and re-derived from the authoritative Names at serialize time. The XML-attributes / JSON-objectKeys offset-0 overlap is resolved by OR-ing the two mutually-exclusive sets. Formal verification: NamesReconstructionDifferentialTest asserts the bitmap reconstruction is content-identical to the scan for every committed revision under name churn (growing high-water) and forced hash collisions (the Aa/BB family). The scan is the I-N1..I-N7 oracle. Plan in docs/NAME_DICTIONARY_RECONSTRUCTION_PLAN.md. Validated: full sirix-core suite green; differential test green; 520k/260-rev name soak completes in 14s with 0 reader errors; CAS soak unaffected.

…bug) The write-side IndexController is cached per revision and reused across write transactions, and createIndexListeners ADDED listeners -- each bound to a specific transaction's storage engine + path summary -- without clearing prior bindings. The canonical pattern (create an index + commit in one transaction, then populate/mutate the indexed data in a later one) then fired a listener still bound to the first, now-closed transaction, surfacing as "AssertionError: Transaction is already closed\!" via CASIndexListener.listen -> PathSummaryReader -> NodeStorageEngineWriter.getRecord. Add IndexController.clearChangeListeners() and call it at the start of the JSON and XML write-transaction constructors, before rebinding the current wtx's listeners. Per-resource-safe: controllers and their listener sets are per-resource-session (each JsonIndexController owns a fresh HashSet), so clearing one resource's controller cannot affect another's, and Sirix's one-wtx-per- resource write lock means the clear never races a live wtx on the same resource. Regression test IndexListenerStaleTrxTest reproduces the exact stale-trx stack and adds a multi-resource overlapping-write-transaction isolation case. Validated: full sirix-core + sirix-query suites green.

…nch insert) Two per-branch-insert allocations on the incremental-insert path: P1 - discriminativeBits(node) re-derived an int[] from the node's bitMask (a 64-iteration loop) and getPartialKeys() cloned, both 2-3x per branch insert. The discriminative-bit mask is immutable per node instance (CoW creates a fresh node for any change), so cache the disc-bits lazily on HOTIndirectPage (getDiscriminativeBits, shared read-only) and read partials through a non-cloning getPartialKeysRef(). P2 - analyzeDescent allocated BOTH insertion-point neighbor keys via getKey() (byte[] + 2 arraycopies) only to score them with msdb. Add zero-alloc HOTLeafPage.msdbWith(index, key) reading commonPrefix + off-heap suffix in place; score both neighbors with it and materialize only the winning resident (the one returned downstream). Eliminates the loser's getKey allocation per branch insert. Validated: HOT suite (fuzz + formal verification) green; 520k/260-rev cas + name soaks per-rev-validated, 0 reader errors.

…es list builds P3 -- AbstractHOTIndexWriter's structural self-heal / structuralDfs read getPartialKeys() (a defensive clone) on every structural-change insert (SELFHEAL_STRUCTURAL is on by default). Switch the four read-only callers to HOTIndirectPage.getPartialKeysRef() (the non-cloning accessor introduced in 5192d43). The two callers that need a mutable private copy (the rebuild paths at AbstractHOTIndexWriter:2564/2820) keep the .clone(), but on getPartialKeysRef() -- eliminating a redundant double-clone (the old getPartialKeys().clone() copied the array twice). P4 -- Several Page implementations materialise the references list lazily; the size bounds-check pattern (indexNumber >= page.getReferences().size()) on the per-record-access path in NodeStorageEngineReader / NodeStorageEngineWriter / AbstractHOTIndexReader therefore allocated an ArrayList (HOTIndirectPage) or an Arrays$ArrayList wrapper (FullReferencesPage) per call just to read the size. Add a default Page.getReferencesCount() (= getReferences().size() fallback) and override on the implementations that materialise lazily: HOTIndirectPage (non-null count loop, no ArrayList), FullReferencesPage (references.length, no wrapper), AbstractForwardingPage (forward to delegate). The other delegates (ReferencesPage4, BitmapReferencesPage) hold the list as a field so the default fallback is already O(1) without allocation. Switch the 12 .getReferences().size() callers to .getReferencesCount(). (P5 left as-is: i8ProbeSnapshot allocates long[8] only when -Dhot.localize.i8 is opted in -- a debug diagnostic, off in production; the snapshot also needs distinct before/after arrays so a shared buffer would break the comparison.) Validated: full sirix-core suite green; HOT suite + cas/name soaks (260 revs, per-rev validation) green.

Squashes PR #978 (217 commits, fix/hot-strict-binna-conformance) closing the HOT correctness + perf campaign (2026-04 -- 2026-05-23). Correctness - Strict-Binna conformance (Phase 1--7w): 0 violations across all 12 HOTInvariantValidator workloads. - Class 2 fallback eliminated: cross-level mask overlap handled incrementally via mergeBiNodeAtExistingDiscBit (not-full) and splitIndirectWithSlotReplaceAndInsertion (full). No rebuild fires. - Class 3 strand discharge: routing-strand guard at 7 branch sites; leaf-scoped rebuild (O(one leaf + path)) and two-leaf migration (O(two leaves + path)) discharge ~63% of strands surgically. - I8-at-scale + I4 corner cases past chunkIdx >= 1 (multi-seed soak): combined nodeViolatesI8 pre-commit guard with a uniform healStructuralViolationOnPath defense-in-depth backstop covering the merge/integrate handlers. - Cross-leaf data-loss bug at 2000/rev fixed: branch-add stranding detected via routing simulation; canonical scoped rebuild discharges. - Tombstone preservation across revisions; slot-granular leaf CoW for multi-revision historical reads. - Index change-listener rebind: clear IndexController listeners per write transaction (no fire on a closed wtx). - Storage: drop false guardCount==0 assertion under concurrent readers. Theory - docs/HOT_PAPER_IMPOSSIBILITY.md: 6 theorems + 2 corollaries establishing the scoped rebuildSubtree(insertDepth) as asymptotically optimal under PEXT routing for Direction-1 I8-unsafe firings; both encoding alternatives (AND, subtree-OR) inherit analogous bounds. - docs/HOT_FORMAL_FOUNDATION.md, HOT_REBUILD_FALLBACK_ELIMINATION_PLAN.md, HOT_BETAISDISCBIT_REBUILD_ELIMINATION_PLAN.md, HOT_STRADDLE_GUARD_REMOVAL_PLAN.md capture the design and plans. Performance - Zero-alloc merge/branch insert hot paths. - Cached disc-bits + zero-alloc descent neighbor scoring (branch insert). - Dropped self-heal clones + per-record-access getReferences list builds. - Eliminated per-insert allocations on the merge/descent hot path. - Dropped dead SIMD op in SparsePartialKeys equality search. - Name-dictionary reconstruction: O(live) via persisted Roaring bitmap in NamePage (was O(maxNodeKey high-water)). Validation - HOT 999 + sirix-core 9118 + sirix-query 772 tests, 8-seed aggressive fuzz, chunkIdx boundary, 260-rev / 520k-insert sustained-churn soak with per-revision invariant validation: 0 failures. - HOT is the default backend. - 200k-insert branch-heavy bench ~160k/s; microbench ~878k inserts/s, ~1.06M reads/s at 500k keys with height 2.

Johannes Lichtenberger added 30 commits May 7, 2026 23:01

Johannes Lichtenberger added 26 commits May 21, 2026 00:36

JohannesLichtenberger merged commit 357cd1d into main May 23, 2026
7 of 9 checks passed

JohannesLichtenberger deleted the fix/hot-strict-binna-conformance branch May 23, 2026 10:04

JohannesLichtenberger mentioned this pull request May 24, 2026

revert(rest): restore Channel-based JsonCreate flow (fixes JVM hang) #980

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix/hot strict binna conformance#978

Fix/hot strict binna conformance#978
JohannesLichtenberger merged 217 commits into
mainfrom
fix/hot-strict-binna-conformance

JohannesLichtenberger commented May 21, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

JohannesLichtenberger commented May 21, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant