
New Arena Range #855

Draft
mjp41 wants to merge 19 commits into main from BackendArenaRange

Conversation

mjp41 (Member) commented Jun 9, 2026

This PR replaces snmalloc's buddy-allocator ranges with a new backend that
handles snmalloc's full (exponent, mantissa) size class sequence end-to-end,
and still maintains the snmalloc invariant that any allocation is aligned to
the largest power of two that divides its size.

Free blocks are kept in a set of bins, with a few extra bins to handle the
alignment subtleties: a 5-unit block at the wrong address cannot serve a
4-unit request (because of the higher alignment), whereas a 6-unit block
can serve every smaller size. A small precomputed mask per request hides
exactly those bins that cannot serve it.
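
To make the alignment subtlety concrete, here is a minimal sketch working in units rather than bytes. The helper names `natural_alignment` and `can_serve` are illustrative and are not part of snmalloc; the sketch assumes only the invariant stated above.

```cpp
#include <cassert>
#include <cstddef>

// Largest power of two that divides `size` (size > 0): isolate the lowest
// set bit.
constexpr std::size_t natural_alignment(std::size_t size)
{
  return size & (~size + 1);
}

// Hypothetical helper: can the free block [addr, addr + block_size) hold a
// request of `req` units placed at an address aligned to
// natural_alignment(req)?
inline bool can_serve(std::size_t addr, std::size_t block_size, std::size_t req)
{
  const std::size_t align = natural_alignment(req);
  // Round the block start up to the first suitably aligned address.
  const std::size_t start = (addr + align - 1) & ~(align - 1);
  return start + req <= addr + block_size;
}
```

With these definitions, `can_serve(0, 5, 4)` holds but `can_serve(1, 5, 4)` does not: the first 4-aligned address inside the block is 4, and a 4-unit allocation starting there would run past the block's end at 6.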

Consolidation is Doug Lea-style: on free we look left and right and merge
maximally. Blocks live in two layers of red-black trees: one tree per
non-empty bin (for selection within a bin) and one address-keyed tree over
all free blocks in the allocator (for left/right neighbour lookup on
free). Because the trees scale, the same backend stacks at multiple levels
of the range pipeline, the way buddy allocators used to.
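
The merge-on-free step can be sketched as follows. This is a simplification under stated assumptions: `std::map` stands in for the address-keyed red-black tree, the per-bin trees are omitted, and `FreeBlocks` and `free_block` are hypothetical names rather than the PR's API.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <map>

// Address-keyed set of free blocks: start address -> size.
struct FreeBlocks
{
  std::map<std::size_t, std::size_t> by_addr;

  // Free [addr, addr + size), merging with adjacent free blocks on both
  // sides so that no two free blocks are ever left touching.
  void free_block(std::size_t addr, std::size_t size)
  {
    auto next = by_addr.lower_bound(addr);

    // Left neighbour: the block just before `addr`, if it ends at `addr`.
    if (next != by_addr.begin())
    {
      auto prev = std::prev(next);
      if (prev->first + prev->second == addr)
      {
        addr = prev->first;
        size += prev->second;
        by_addr.erase(prev); // does not invalidate `next`
      }
    }

    // Right neighbour: the block starting exactly where this one ends.
    if (next != by_addr.end() && next->first == addr + size)
    {
      size += next->second;
      by_addr.erase(next);
    }

    by_addr[addr] = size;
  }
};
```

Freeing `[0, 4)` and `[8, 12)` and then `[4, 8)` leaves a single 12-unit block at address 0, because the last free merges with both neighbours.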

github-actions (bot) commented Jun 10, 2026

Coverage report (cross-platform merged)

Lines covered (src/snmalloc/**): 3059 / 3432 (89.13%)

Merged line coverage is the per-line union across all platforms. Region coverage is reported per-platform only; no cross-platform region total is computed.

Per-directory breakdown

| Directory | Lines covered | Lines executable | % |
| --- | --- | --- | --- |
| src/snmalloc/stl | 2 | 4 | 50.00% |
| src/snmalloc/override | 94 | 156 | 60.26% |
| src/snmalloc/pal | 330 | 449 | 73.50% |
| src/snmalloc/mitigations | 14 | 18 | 77.78% |
| src/snmalloc/aal | 51 | 59 | 86.44% |
| src/snmalloc/global | 265 | 306 | 86.60% |
| src/snmalloc/ds_core | 390 | 440 | 88.64% |
| src/snmalloc/ds | 339 | 366 | 92.62% |
| src/snmalloc/mem | 842 | 876 | 96.12% |
| src/snmalloc/ds_aal | 108 | 112 | 96.43% |
| src/snmalloc/backend_helpers | 522 | 541 | 96.49% |
| src/snmalloc/backend | 102 | 105 | 97.14% |

Per-platform contributions (advisory)

| Platform | Lines covered | Lines executable | Lines % | Regions covered | Regions executable | Regions % |
| --- | --- | --- | --- | --- | --- | --- |
| freebsd-14 | 6084 | 6677 | 91.12% | 6007 | 9481 | 63.36% |
| linux-self-host-shim-checks | 6222 | 6810 | 91.37% | 6274 | 10283 | 61.01% |
| linux-self-host-shim-checks-selfhost | 1871 | 2566 | 72.92% | 1900 | 3777 | 50.30% |
| macos-14 | 6102 | 6636 | 91.95% | 6123 | 9835 | 62.26% |
| windows-2022 | 6006 | 6656 | 90.23% | 6047 | 9789 | 61.77% |

@mjp41 mjp41 force-pushed the BackendArenaRange branch 3 times, most recently from 89fc74d to 0c47339 Compare June 12, 2026 13:12
mjp41 and others added 19 commits June 12, 2026 14:28
get_mut<true> base-adjusted p before calling register_range, which then
re-applied the base subtraction internally and tripped its out-of-range
guard for legitimate in-range addresses. The path is reachable on PALs
without LazyCommit (e.g. PALNoAlloc<PALLinux>) when get<true>/get_mut<true>
is called on an in-range address of a bounded pagemap.

Move the register_range call before the p = p - base adjust so it sees
the un-adjusted address that its bounds check expects. Add a regression
test in func-pagemap that wraps DefaultPal with a stub stripping
LazyCommit; this exercises the previously-broken path.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
snmalloc::alloc<size, Conts, align>() applies aligned_size(align, size)
internally; snmalloc::dealloc<size>(p) did not. When the alignment
upgrade pushed the reservation into a different sizeclass than `size`,
check_size fired under the check flavour. Reproducer:
alloc<33*1024, _, 128*1024>(); dealloc<33*1024>(p)
=> "Dealloc rounded size mismatch: 0xa000 != 0x20000".

Merge dealloc<size> into a single template `dealloc<size, align = 1>`
applying aligned_size(align, size) before check_size. The default
align=1 preserves existing one-argument-template behaviour because
aligned_size(1, size) == size.
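
A small sketch of the arithmetic behind the reproducer. It assumes that `aligned_size` rounds `size` up to a multiple of a power-of-two `alignment`, which is consistent with the `0x20000` in the error message; the stand-in below is deliberately named `aligned_size_sketch` and is not the snmalloc function itself.

```cpp
#include <cstddef>

// Round `size` up to a multiple of `alignment`.
// Assumes `alignment` is a non-zero power of two and `size` is non-zero.
constexpr std::size_t
aligned_size_sketch(std::size_t alignment, std::size_t size)
{
  return ((alignment - 1) | (size - 1)) + 1;
}

// The alloc side sees the upgraded size ...
static_assert(
  aligned_size_sketch(128 * 1024, 33 * 1024) == 0x20000,
  "33 KiB at 128 KiB alignment reserves 128 KiB");

// ... while align = 1 leaves the size untouched, which is why the default
// template argument preserves the old one-argument behaviour.
static_assert(
  aligned_size_sketch(1, 33 * 1024) == 33 * 1024,
  "alignment 1 is the identity");
```

Applying the same rounding on the dealloc side is what makes both sides agree on the size class.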

Move aligned_size from sizeclasstable.h to sizeclassstatic.h so the
test library header can use it without pulling in the full runtime
sizeclass machinery. Existing consumers still get it transitively via
the pal.h -> ds_core.h -> sizeclassstatic.h include chain.

Mirror the merge in the test library header: dealloc<size, align=1>
and alloc<size, ZeroMem, align=1>. Add aligned_dealloc to
TESTLIB_ONLY_TESTS.

Includes src/test/func/aligned_dealloc/ with the canonical reproducer
and additional (S, A) pairs.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Introduces src/snmalloc/backend_helpers/arenabins.h, which owns the
chunk-unit size-class scheme and the non-empty-bins bitmap that later
commits will use to drive bin selection inside Arena.

Public surface (the integration contract for later commits):

  * range_t, carve_t, carve(block, n_chunks), max_supported_chunks().
  * Nested Bitmap with add(block), find_for_request(n_chunks),
    clear(bin_id), and TOTAL_BINS.

Everything else (the size-class encoding, the per-SC tables, the
free-side classifier bin_index) is private. The unit test reaches
it via a friend struct ArenaBinsTestAccess<B> that is only
forward-declared in the header and defined in the test translation
unit, so the production header carries no test-only surface.

Implementation:

  * Two power-of-two-sized rodata tables indexed by raw sc id with
    shift+add. bitmap_info_t (4 words via alignas) feeds
    Bitmap::find_for_request; carve_info_t (2 words) feeds carve
    and the free-side cascade-fit predicate.
  * bitmap_info_t fields (start_word, first_mask, second_mask) are
    pre-shifted into the bitmap's word layout so find_for_request
    is two ANDs on the hot word + word-boundary fall-through.
  * Tables are populated at constexpr build time by BinTable()
    consuming the canonical bin_subsets table; the strict-chain
    invariant on bin_subsets is checked at compile time via throw
    in the constexpr constructor.
  * Fast path uses the runtime CLZ intrinsic via the new
    bits::to_exp_mant<MANTISSA_BITS, LOW_BITS> (paired with the
    existing to_exp_mant_const); the _const variant is restricted
    to constexpr table construction and test static_asserts.
    bits::prev_pow2_bits / prev_pow2_bits_const are added alongside
    for symmetric runtime / constexpr access.
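
The masked-bitmap lookup can be illustrated with a one-word sketch. It is a simplification: the commit describes two pre-shifted masks and a word-boundary fall-through, while this version uses a single 64-bit word and a single mask. `find_bin`, its parameters, and the choice of returning the lowest-numbered candidate are assumptions made for illustration.

```cpp
#include <cstdint>

// Returns the lowest-numbered non-empty bin permitted by `serve_mask`,
// or -1 if no permitted bin is non-empty.
//   non_empty  : bit i set  <=> bin i holds at least one free block
//   serve_mask : bit i set  <=> bin i can serve this request
inline int find_bin(std::uint64_t non_empty, std::uint64_t serve_mask)
{
  std::uint64_t candidates = non_empty & serve_mask;
  if (candidates == 0)
    return -1;

  // Portable count-trailing-zeros; a real fast path would use an intrinsic.
  int bin = 0;
  while ((candidates & 1) == 0)
  {
    candidates >>= 1;
    ++bin;
  }
  return bin;
}
```

For example, with bins 2 and 5 non-empty and a mask that only admits bins 4 to 7, the lookup skips bin 2 and returns 5.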

The new test cross-checks bin classification, carve, and
find_for_request against a brute-force scanner derived directly
from bin_subsets, for B in {1, 2, 3}. Exhaustive single-bit and
multi-bit randomised bitmap states are covered, plus word-boundary
straddle cases enumerated automatically from the table.

No production code path is changed: ArenaBins<B> is unused
in the build until later commits compose it into Arena.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Adds a public RBTree method that returns the strict neighbours of a
probe value K in a single root-to-leaf descent:

  - every left turn (parent > K) records the parent as the current
    successor candidate
  - every right turn (parent < K) records the parent as the current
    predecessor candidate

At loop exit the tightest neighbours are returned as
`stl::Pair<K, K>{pred, succ}`; either component is `Rep::null` when no
such neighbour exists.
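
The descent described above can be sketched on a plain binary search tree. The following is an illustration under assumptions, not the RBTree code: it uses an unbalanced tree stored in a vector, `std::pair` in place of `stl::Pair`, and the key `0` as a stand-in for `Rep::null`. `TinyTree` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

constexpr std::uint64_t NULL_KEY = 0; // stand-in for Rep::null

class TinyTree
{
  struct Node
  {
    std::uint64_t key;
    int left;
    int right;
  };
  std::vector<Node> nodes;
  int root = -1;

public:
  void insert(std::uint64_t key)
  {
    const int idx = static_cast<int>(nodes.size());
    nodes.push_back({key, -1, -1});
    if (root < 0)
    {
      root = idx;
      return;
    }
    int cur = root;
    for (;;)
    {
      int& next = key < nodes[cur].key ? nodes[cur].left : nodes[cur].right;
      if (next < 0)
      {
        next = idx;
        return;
      }
      cur = next;
    }
  }

  // Strict neighbours of k in one root-to-leaf descent.
  // Precondition: k is not in the tree.
  std::pair<std::uint64_t, std::uint64_t> neighbours(std::uint64_t k) const
  {
    std::uint64_t pred = NULL_KEY;
    std::uint64_t succ = NULL_KEY;
    int cur = root;
    while (cur >= 0)
    {
      const Node& n = nodes[cur];
      assert(n.key != k);
      if (n.key > k)
      {
        succ = n.key; // left turn: parent is the tightest successor so far
        cur = n.left;
      }
      else
      {
        pred = n.key; // right turn: parent is the tightest predecessor so far
        cur = n.right;
      }
    }
    return {pred, succ};
  }
};
```

Each later turn in the same direction overwrites the earlier candidate with a closer one, so the values held at loop exit are the tightest neighbours.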

The "K not in tree" precondition is asserted via SNMALLOC_ASSERT and
expands to nothing in Release. Arena, the planned caller, relies
on the invariant that two free blocks cannot share a starting address.

test_neighbours exercises the algorithm against std::set::lower_bound /
upper_bound as oracle. Boundary probes (K=0, K=size+1) plus random
probes that skip oracle hits keep every call within the precondition.
The sweep reuses the existing test()'s size range but caps to the first
few seeds per size to keep the per-test time budget in check.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Introduce Arena, a free-range allocator that manages chunks
within a bounded arena using a dual red-black-tree scheme:

- Bin trees: one per size-class bin, for best-fit allocation lookups
  driven by a non-empty-bins bitmap.
- Range tree: keyed by address, for O(log n) neighbour lookup during
  consolidation of adjacent free blocks.

Key design decisions:
- Single-chunk (min-size) blocks live only in bin tree 0, not the
  range tree, keeping range-tree overhead proportional to multi-chunk
  blocks. The min-size bin is probed as a fallback during consolidation.
- Three-variant encoding (Min/TwoMin/Large) in pagemap metadata bits
  avoids a range-tree lookup for the common 1-chunk and 2-chunk cases.
- WordRef handle and TreeRep<RefFn> template follow the existing
  BackendStateWordRef / BuddyChunkRep patterns from largebuddyrange.h.
- Consolidation in add_block checks predecessor then successor,
  merging adjacent blocks and re-inserting the result.
- remove_block uses Bins::carve to split oversized blocks, re-inserting
  remainders.

Also:
- Add neighbours() to RBTree: single-descent strict-neighbour query.
- Add for_each() to RBTree: in-order traversal for invariant checking.
- Make ArenaBins::bin_index public (sole consumer is Arena).
- Add ArenaBins::Bitmap::test() for invariant verification.
- Five-clause structural invariant gated on bool parameter (defaults to
  Debug), checked at entry/exit of add_block and remove_block.
- Comprehensive test suite: word-level round-trips, tree operations,
  empty-state invariant, add/remove without consolidation, consolidation
  case matrix (8 pred/succ combinations), overflow detection, and
  randomised stress test with oracle validation (50 seeds x 500 ops).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Make Arena fully generic over its Rep, mirroring the
  Buddy/Rep layering. The class no longer holds any bit-layout
  constants; Rep supplies the full RBTree Rep for both the bin
  trees and the range tree, owning red-bit (and any tag-bit)
  packing privately.
- Rep concept now requires:
    using BinRep         -- full RBTree Rep for the bin trees
    using RangeRep       -- full RBTree Rep for the range tree
    get_variant / set_variant
    get_large_size_chunks / set_large_size_chunks
    can_consolidate(higher_addr) -> bool
- Add can_consolidate checks in add_block before each (predecessor
  and successor) merge, and update the invariants to tolerate
  boundary-blocked adjacency.
- MockRep grows inner BinRep / RangeRep structs that each provide
  the full RBTree Rep interface over the mock-entry array, with a
  private red-bit at bit 8.
- New tests verify that can_consolidate returning false at a
  specific address prevents predecessor- and successor-side merges
  independently, including at min-block boundaries.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Adds the LargeArenaRange wrapper that drops into the
LargeBuddyRange slot, generalises Arena and ArenaBins
on MIN_SIZE_BITS, and converts the arena/range API boundary to bytes
throughout.

* LargeArenaRange<REFILL_SIZE_BITS, MAX_SIZE_BITS, Pagemap,
  MIN_REFILL_SIZE_BITS> with a PagemapRep that packs variant tag, RB
  red bit and the consolidated large-block size into the first pagemap
  word, and uses the second word for in-tree links. Provides
  alloc_range / dealloc_range / add_range over the bin-tree arena.
* parent_dealloc unifies the old parent_dealloc_range and
  dealloc_overflow paths; add_range uses bits::align_up /
  bits::align_down for parent-input trimming.
* ArenaBins<B, MIN_SIZE_BITS> generalises the bin scheme so its
  range_t, carve and find_for_request all speak bytes (multiples of
  UNIT_SIZE = 1 << MIN_SIZE_BITS). Tests cover MIN_SIZE_BITS in {0, 4,
  14}.
* Arena<Rep, MIN_SIZE_BITS, MAX_SIZE_BITS>: add_block /
  remove_block / variant_of / insert_block / range_from_addr /
  invariants all work in bytes. remove_block returns a scalar address
  (0 = failure); the size half of the old pair was tautological.
  CHUNKS_BITS / addr_to_chunk / chunk_to_addr removed.
* PagemapRep::get_large_size / set_large_size are bytes-in / bytes-out;
  storage still scales by MIN_SIZE_BITS so the shifted field fits a
  pagemap word.
* Tests: func-largearenarange exercises alloc/dealloc/refill/large
  paths against a mock parent; func-arena and
  func-arenabins updated for the bytes-throughout convention
  (chunk_size(N) helper at the test boundary).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…lines

Mechanical substitution of every LargeBuddyRange instantiation in the
default in-tree range pipelines:

- src/snmalloc/backend/standard_range.h (GlobalR, LargeObjectRange).
- src/snmalloc/backend/meta_protected_range.h (GlobalR, CentralObjectRange,
  CentralMetaRange, the conditional_t huge-page cache, ObjectRange,
  MetaRange).

After this change snmalloc uses the Arena bin-tree allocator
instead of the power-of-two buddy for all large-range management in
the default pipelines. LargeBuddyRange and BuddyChunkRep remain in
the tree, available for alternative configurations.

Two issues uncovered during testing and fixed here:

1. arena.h: Arena::add_block's successor-min branch
   called Rep::can_consolidate(succ_addr) before contains_min(succ_addr)
   confirmed succ_addr is in our region. For a block added at the very
   top of a registered region (e.g. last 8 MiB of a 256 MiB fixed
   region), succ_addr = addr + size sits one chunk past the pagemap's
   mapped backing, and the can_consolidate probe segfaults. The fix
   reorders the checks so the tree-membership test gates the pagemap
   read, matching the documented pattern in buddy.h:90-93.

   Regression coverage: MockRep gains a per-chunk `boundary` field on
   `mock_entry`. `MockRep::can_consolidate(addr)` now returns
   `!mock_store[mock_index(addr)].boundary` — faithful to the real
   `PagemapRep::can_consolidate` reading `entry.is_boundary()`. The
   `mock_index` bounds assertion fires on any out-of-range probe, so
   the unsafe pattern trips in unit tests rather than only as a
   segfault in production. A new test_block_at_arena_top_edge adds a
   block whose succ_addr would address chunk MOCK_ARENA_CHUNKS;
   without the reorder this reproduces the original failure.

   This unification also subsumed the previous BoundaryMockRep and its
   boundary_addrs global std::set: the four boundary tests now run on
   Arena<K> and set mock_store[mock_index(addr)].boundary = true
   instead. Net -35 lines in arena.cc.

2. arenabins.h: the BinTable constexpr constructor used
   throw "..." as a constexpr-eval-fails trick to surface invariant
   violations as compile errors. throw requires exception support,
   which is disabled in the main allocator (-fno-exceptions), so this
   broke builds. Replaced with SNMALLOC_CHECK(false && "..."),
   which calls a non-constexpr error path and achieves the same
   compile-time failure without runtime exception machinery.
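
The compile-time check in item 2 relies on a general C++ property: reaching a call to a non-constexpr function during constant evaluation is an error. A minimal sketch of that technique follows; `invariant_failed` and `StrictChain` are hypothetical names, and the sketch needs C++14 or later for the loop in the constexpr constructor.

```cpp
#include <cstdlib>

// Deliberately NOT constexpr: if constant evaluation reaches this call,
// the enclosing constexpr initialisation fails to compile.
[[noreturn]] inline void invariant_failed(const char*)
{
  std::abort();
}

struct StrictChain
{
  int v[4];

  constexpr explicit StrictChain(int step) : v{0, step, 2 * step, 3 * step}
  {
    // Invariant: entries are strictly increasing.
    for (int i = 1; i < 4; ++i)
      if (!(v[i - 1] < v[i]))
        invariant_failed("entries must be strictly increasing");
  }
};

// Compiles: the error path is never taken during constant evaluation.
constexpr StrictChain ok_chain{1};

// Would be rejected at compile time, with no exception support required:
// constexpr StrictChain bad_chain{0};
```

No `throw` appears anywhere, so the same check builds with exceptions disabled.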

Full ctest suite passes (86/86, --timeout 120 -j 4).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replaces the tagged small/large encoding and the leading-zero-count
large-class indexing with a single uniform exp+mantissa scheme:

    value == 0                              : unmapped sentinel
    value in [1, 1 + NUM_SMALL_SIZECLASSES) : small  (sc = value - 1)
    value in [1 + NUM_SMALL_SIZECLASSES,
             1 + NUM_SMALL_SIZECLASSES + NUM_LARGE_CLASSES)
                                            : large  (lc = ...)

Small classes use `from_exp_mant(sc)` (unchanged). Large classes
continue the same exp+mantissa namespace as
`from_exp_mant(NUM_SMALL_SIZECLASSES + lc)`. The discriminator tag bit
is gone — small and large share one contiguous index space — and the
sentinel slot 0 lets the size-lookup fast path return 0 / 0 for
unmapped pointers without a branch.
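
A sketch of how one contiguous index space with a zero sentinel can be decoded. The class counts and the sizes in the table are made up for illustration and do not reflect snmalloc's configuration; `classify` and `size_of_value` are hypothetical helpers.

```cpp
#include <cstddef>

// Illustrative counts only.
constexpr std::size_t NUM_SMALL = 4;
constexpr std::size_t NUM_LARGE = 3;

enum class Kind
{
  Unmapped,
  Small,
  Large,
  Invalid
};

// Decode a raw value using only range comparisons; no tag bit is needed.
constexpr Kind classify(std::size_t value)
{
  return value == 0                        ? Kind::Unmapped :
    value < 1 + NUM_SMALL                  ? Kind::Small :
    value < 1 + NUM_SMALL + NUM_LARGE      ? Kind::Large :
                                             Kind::Invalid;
}

// One table indexed directly by the raw value. Slot 0 is the sentinel, so
// an unmapped pointer reads size 0 with no extra branch.
constexpr std::size_t SIZE_BY_VALUE[1 + NUM_SMALL + NUM_LARGE] = {
  0, 16, 32, 48, 64, 65536, 81920, 98304};

// Precondition: value < 1 + NUM_SMALL + NUM_LARGE.
constexpr std::size_t size_of_value(std::size_t value)
{
  return SIZE_BY_VALUE[value];
}
```

Because small and large values are adjacent, the table lookup is the same instruction sequence for every kind of entry, including the sentinel.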

The `SIZECLASS_REP_SIZE` / `REMOTE_BACKEND_MARKER` / `REMOTE_MIN_ALIGN`
chain is re-derived from the new `SIZECLASS_BITS` (renamed from
`TAG_SIZECLASS_BITS`); RED_BIT / VARIANT_SHIFT / LARGE_SIZE_SHIFT in
`largearenarange.h` and RED_BIT in `largebuddyrange.h` derive
from the new public `MetaEntryBase::BACKEND_LAYOUT_FIRST_FREE_BIT` so
future widenings propagate automatically.

A new `MAX_LARGE_SIZECLASS_SIZE` constant gates user-supplied sizes at
the API boundary (`alloc_not_small`, `round_size`, `check_size`,
`rust_realloc`) — replacing the loose `> 2^63` bound. `ENCODED_ADDRESS_BITS`
caps the encoding at `BITS - 1` so the constant survives 32-bit
platforms where `DefaultPal::address_bits == BITS`.

The `large_size_to_chunk_sizeclass` helper is removed —
its `+NUM_SMALL_SIZECLASSES` / `-NUM_SMALL_SIZECLASSES` round-trip
through an `lc` index cancels in the uniform scheme, so
`size_to_sizeclass_full`'s large branch inlines the `to_exp_mant`
directly.

Front-end semantics are unchanged: `large_size_to_chunk_size` still
returns `next_pow2(size)` and the front end still reserves pow2 chunk
sizes. The non-pow2 large sizeclasses exist in `sizeclass_metadata`
(with `slab_mask = info.align - 1`) but are unreachable from
`size_to_sizeclass_full` until a follow-up commit drops the
`next_pow2` rounding.

Tests:
- `sizeclass.cc`: sentinel sanity, raw-value adjacency, range disjoint,
  large monotonicity, pow2 round-trip, non-pow2 rounds up.
- `rounding.cc`: extends to pow2 large sizeclasses, verifying
  `index_in_object` / `is_start_of_object` at representative offsets.
- `cheri.cc`: large-class verification loop bound updated to
  `NUM_LARGE_CLASSES`.
- Loop bounds in tests use `ENCODED_ADDRESS_BITS` to avoid
  `bits::one_at_bit(BITS)` UB on 32-bit.

ctest: 86/86 passing.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Encode (sizeclass, slab-offset) jointly in the pagemap entry so the
front end can recover the allocation start for an arbitrary interior
chunk of a multi-slab-tile large allocation. The front end still
only issues pow2 large requests, so every materialised entry today
has offset=0; this lays the groundwork for non-pow2 large support without front-end changes.

Key pieces:
- offset_and_sizeclass_t packs sizeclass into the low SIZECLASS_BITS
  and per-chunk offset into the next OFFSET_BITS of one word.
- Backend::alloc_chunk loops over slab tiles, writing each tile's
  slab_index into the offset bits of its pagemap entry.
- SizeClassTable is split into three by purpose:
  * start_ (sizeclass_data_start, 32B/row, indexed by osc): hot
    path for start_of_object on every dealloc.
  * align_ (sizeclass_data_align, 16B/row, indexed by sc): used by
    is_start_of_object alignment check in -check builds.
  * slab_ (sizeclass_data_slab, 4B/row, indexed by sc): cold; slab
    init thresholds.
- start_of_object branches on osc.offset() == 0 (testable from bits
  already loaded in osc.raw()), so the offset=0 hot path skips the
  offset_bytes load and offset-shift arithmetic. Combined with the
  table split, perf-external_pointer-fast matches the baseline
  (~290 ms median) with no regression; perf-singlethread-check is
  within noise.
- New src/test/func/large_offset targeted test reaches the
  multi-slab-tile branch via the public backend API.
- check_invariant in Arena now uses SNMALLOC_CHECK rather
  than SNMALLOC_ASSERT, so callers that opt in via enabled=true get
  the invariant checks even in Release builds (which is what the
  tests want); the #ifndef NDEBUG wrapper is no longer needed.
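
The joint encoding can be pictured with the following sketch. The bit widths are placeholders, `OffsetAndSizeclass` and `start_of_allocation` are hypothetical names, and the recovery step assumes that slab tiles are power-of-two sized and that each tile's offset bits hold its slab index; the real `offset_and_sizeclass_t` and `start_of_object` may differ.

```cpp
#include <cstddef>
#include <cstdint>

// Placeholder widths for this sketch, not snmalloc's values.
constexpr unsigned SIZECLASS_BITS = 8;
constexpr unsigned OFFSET_BITS = 8;

struct OffsetAndSizeclass
{
  std::uintptr_t raw;

  // Sizeclass in the low SIZECLASS_BITS, offset in the next OFFSET_BITS.
  static constexpr OffsetAndSizeclass
  pack(std::uintptr_t sizeclass, std::uintptr_t offset)
  {
    return OffsetAndSizeclass{sizeclass | (offset << SIZECLASS_BITS)};
  }

  constexpr std::uintptr_t sizeclass() const
  {
    return raw & ((std::uintptr_t(1) << SIZECLASS_BITS) - 1);
  }

  constexpr std::uintptr_t offset() const
  {
    return (raw >> SIZECLASS_BITS) & ((std::uintptr_t(1) << OFFSET_BITS) - 1);
  }
};

// Recover the start of a multi-tile allocation from an interior address.
inline std::uintptr_t start_of_allocation(
  std::uintptr_t addr, OffsetAndSizeclass osc, std::uintptr_t slab_size)
{
  const std::uintptr_t tile_start = addr & ~(slab_size - 1);
  if (osc.offset() == 0)
    return tile_start; // hot path: first tile, no offset arithmetic
  return tile_start - osc.offset() * slab_size;
}
```

An address inside the fourth tile (offset 3) of an allocation starting at `0x40000` with 16 KiB tiles maps back to `0x40000`, while an address in the first tile takes the branch that skips the multiplication.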

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
A request like malloc(70 KiB) at the default INTERMEDIATE_BITS = 2
now reserves the smallest enclosing exp+mantissa sizeclass (80 KiB)
rather than next_pow2(size) (128 KiB). Sizes that already land on a
class boundary reserve exactly that size; mid-exponent sizes shrink
by up to ~33%.
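
The 70 KiB example can be checked with a short sketch. It is a simplified model of an exp+mantissa sequence (keep `mantissa_bits` bits after the leading one and round up), not the `to_exp_mant` code itself; both helper names are hypothetical.

```cpp
#include <cstddef>

// Round `size` (> 0) up to the next value that has at most `mantissa_bits`
// significant bits after its leading one.
inline std::size_t round_up_exp_mant(std::size_t size, unsigned mantissa_bits)
{
  std::size_t top = 1;
  while (top <= size / 2)
    top *= 2; // largest power of two <= size

  std::size_t step = top >> mantissa_bits; // spacing of classes in [top, 2*top)
  if (step == 0)
    step = 1;

  return (size + step - 1) / step * step;
}

// The previous behaviour for large requests: round up to a power of two.
inline std::size_t next_pow2_sketch(std::size_t size)
{
  std::size_t p = 1;
  while (p < size)
    p *= 2;
  return p;
}
```

With two mantissa bits, the classes between 64 KiB and 128 KiB are 64, 80, 96 and 112 KiB, so `round_up_exp_mant(70 * 1024, 2)` gives 80 KiB where `next_pow2_sketch(70 * 1024)` gives 128 KiB.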

Mechanics:

  sizeclasstable.h
    - size_to_sizeclass_full drops next_pow2(size); to_exp_mant ceils
      directly to the smallest enclosing class.
    - round_size's large branch matches the reservation
      (sizeclass_full_to_size of the chosen class), so
      DefaultConts::success zeroes exactly the reservation for calloc.
    - large_size_to_chunk_size removed (the one caller in corealloc
      uses sizeclass_full_to_size(sc) directly with a hoisted sc).
    - compute_max_large_slab_index tightened to meta.size / slab_size
      - 1 (the actual worst case the runtime pagemap loop writes).

  backend.h
    - alloc_chunk's pow2 precondition relaxed to the slab-tile
      invariant: size is a positive multiple of slab_size.

  corealloc.h
    - large alloc path hoists size_to_sizeclass_full / chunk size into
      locals so each table lookup happens once.

Tests:

  - large_offset_frontend/: new front-end counterpart to
    large_offset/. Exhaustively round-trips every large sizeclass and
    walks every chunk-aligned interior pointer for a boundary and a
    non-boundary request.
  - memory/: adds test_calloc_non_pow2_large as a calloc zeroing smoke
    test; clamps the end-of-stride probe in check_external_pointer_large
    since non-pow2 reservations are tighter than the next pow2.
  - sizeclass/: deterministic round_size gate over every large class
    (S maps to itself; S_prev+1 ceils to S).
  - large_offset/: backend test now passes the chunk-multiple reserve
    (= sizeclass_full_to_size(sc)) instead of next_pow2(size).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Note that MetaEntryBase::operator= is load-bearing: the pagemap
  writes back through it, so META_BOUNDARY_BIT survives every
  metadata mutation without explicit preservation by callers.
- Correct the BackendStateWordRef single-pointer ctor comment: it
  is required by RBRepMethods for sentinel construction from
  &Rep::root, not a legacy convenience.
- Arena: drop spurious const on contains_min and
  check_invariant (no const callers exist), removing the
  const_cast laundering.
- Arena::check_invariant: lift the five clause titles into
  the docblock; trim the inline labels to single-line markers.
- Arena::add_block: drop cross-file line-number reference
  to buddy.h.
- largearenarange.h / arenabins.h: replace
  SNMALLOC_CHECK(false && "msg") with SNMALLOC_CHECK_MSG.
- largearenarange.h: rename `auto refill` to `refill_range`
  to avoid shadowing the enclosing function.
- Tests: use "test/..." quoted include style for consistency.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Introduces the building blocks for the SmallBuddyRange ->
SmallArenaRange migration. Nothing is wired into the production pipeline
yet (the existing SmallBuddyRange remains the LocalMetaRange) — this
commit only adds the new components and their gate test.

* InplaceRep<Authmap, ChunkBounds>: in-band red-black-tree node Rep
  for Arena that stores the tree pointers inside the free block
  itself. Supports CHERI provenance via the Authmap mechanism (the same
  write-once cap table used by dealloc_meta_data); node accesses go
  through Authmap::amplify_from_address. can_consolidate refuses
  merging across MIN_CHUNK_SIZE boundaries to keep Arena's
  MAX_SIZE_BITS == MIN_CHUNK_BITS invariant intact.

* SmallArenaRange<Authmap>::Type<ParentRange>: a wrapper around
  Arena<InplaceRep<Authmap, ChunkBounds>, MIN_BITS,
  MIN_CHUNK_BITS> presenting the standard Range interface. Serves
  arbitrarily-unit-aligned sizes (not just powers of two). Replaces
  the historical alloc_range_with_leftover with
  alloc_size_with_align(size, align), which makes alignment an
  explicit parameter and donates the unit-aligned tail back to the
  arena.

* amplify_from_address<bool potentially_out_of_range>(address_t) on
  DummyAuthmap (pass-through reinterpret_cast) and BasicAuthmap (lookup
  + pointer_offset). Lets InplaceRep recover an arena cap for an
  address it knows only as an integer.

* New test target smallarenarange covering the rep accessor
  round-trips, arena add/remove/consolidation/carve, a 30-seed x 500-op
  stress, the can_consolidate chunk-boundary refusal, and four
  alloc_size_with_align scenarios (exact fit, pow2 align over non-pow2
  size, align larger than size, MIN_CHUNK_SIZE bypass).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* StandardLocalState and MetaProtectedRangeLocalState gain an
  Authmap template parameter, plumbed through alongside Pagemap.
  Both configs and the domestication test pass their Authmap into
  the LocalState instantiation.
* The three SmallBuddyRange uses in the meta-range pipes are
  replaced with SmallArenaRange<Authmap>.
* BackendAllocator::alloc_meta_data calls the new
  alloc_size_with_align(size, alignment) primitive, with
  alignment = max(next_pow2(size), MetaRangeT::UNIT_SIZE). The
  next_pow2 keeps behaviour identical to the previous
  buddy-rounded path; the max floors the alignment at the meta
  range's UNIT_SIZE so alloc_size_with_align's precondition holds
  for any positive size.
* FixedRangeConfig's inline Authmap gains amplify_from_address
  (the new SmallArenaRange path needs it).

SmallBuddyRange.h is now orphaned but stays in tree until the cleanup commit
removes it.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
LargeBuddyRange, SmallBuddyRange and their shared buddy.h are no
longer reachable now that SmallArenaRange owns the metadata path
and LargeArenaRange owns the large-range path. Delete them
(-848 lines) and clean up stale references in comments, README,
AddressSpace.md, and the MIN_HEAP_SIZE_FOR_THREAD_LOCAL_BUDDY
constant (renamed ..._CACHE since the per-thread cache it gates
is no longer specifically a buddy).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
With only one Arena type and a Small/Large pair of range adapters
built on it, drop the redundant 'Backend' prefix and pair the range
adapters by size:

  BackendArena       -> Arena
  BackendArenaBins   -> ArenaBins
  BackendArenaRange  -> LargeArenaRange   (pairs with SmallArenaRange)

The bulk of this rename has been distributed across the prior commits
so each one introduces its symbols and files under the new name. What
remains here is content that doesn't translate to a pure text-or-path
substitution:

* Test-internal `using Arena = ...;` aliases became `using TestArena =
  ...;` to avoid colliding with the renamed class template, and call
  sites in test bodies were updated to match.
* Documentation/comment polish replacing residual "buddy" terminology
  with "thread-local cache range" wording matching the post-removal
  pipeline.
* CMakeLists test-name reorder to keep the alphabetical ordering.
* A small clang-format reflow of the friend declaration in arenabins.h.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
LargeArenaRange / SmallArenaRange accept any UNIT_SIZE-aligned
request; the pow2 rounding the backend was applying to metadata
sizes was a leftover from the buddy era and inflated every slab's
metadata block to the next power of two. With a ClientMeta
provider whose per-slab storage is non-pow2 (e.g. allocation
bitmap + small fixed header), this rounding doubled the metadata
overhead.
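
A rough illustration of the overhead difference, using made-up numbers: a 520-byte per-slab metadata block and a 16-byte unit. `round_to_pow2` and `align_up_to_unit` are hypothetical helpers, not the functions in backend.h.

```cpp
#include <cstddef>

// Buddy-era rounding: next power of two at or above `size`.
inline std::size_t round_to_pow2(std::size_t size)
{
  std::size_t p = 1;
  while (p < size)
    p *= 2;
  return p;
}

// Unit rounding: next multiple of `unit` (a power of two) at or above `size`.
inline std::size_t align_up_to_unit(std::size_t size, std::size_t unit)
{
  return (size + unit - 1) & ~(unit - 1);
}
```

For the 520-byte example, `round_to_pow2(520)` reserves 1024 bytes while `align_up_to_unit(520, 16)` reserves 528, which is the near-doubling described above.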

Publish MIN_META_ALIGN on each LocalState (= MetaRange::UNIT_SIZE).
Add BackendAllocator::meta_size_round, which pads to MIN_META_ALIGN
and steps up to MIN_CHUNK_SIZE for requests that would bypass the
small range to the parent. Replace all four next_pow2-rounded
metadata sites in backend.h with this helper.

A new test func/client_meta_nonpow2 installs a ClientMetaDataProvider
whose per-slab storage is non-pow2 and exercises alloc/dealloc
round-tripping across several sizeclasses; any disagreement between
alloc-side and dealloc-side rounding would trip the meta range's
dealloc_range assertions.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Drop the LargeBuddyRange framing (that range no longer exists).
* Align the mechanism description with the in-tree code, which builds
  positive serve masks rather than the inverse skip masks the original
  sketch used.
* Add brief sections on the two-tree structure (one bin tree per
  non-empty bin + one range tree for coalescing) and on the two reps
  Arena ships with: PagemapRep behind LargeArenaRange for whole-chunk
  allocations, InplaceRep behind SmallArenaRange for sub-chunk metadata.
* Link out to AddressSpace.md.

Also add PLAN.md to .gitignore so contributors can keep a local
planning document without committing it.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Three semantic CI fixes whose intent doesn't translate back into the
historical APIs of their natural target commits:

* sizeclasstable.h: restore sentinel slot's slab_mask = ~size_t(0) so
  the bounds-checked memcpy shim treats foreign pointers as unbounded
  and any memcpy bound check trivially passes the sentinel through to
  the destination's native checks.
* arena.cc: replace the indirect mock_index probe in can_consolidate
  with an explicit in-range guard so GCC's release-mode -Warray-bounds
  analysis sees a visible guard covering the mock_store[] read.
* largearenarange.cc: pass MinBaseSizeBits<Pal>() as the
  MIN_REFILL_SIZE_BITS template arg so PalRange has enough to reserve
  on Windows.

Also rolls in small clang-format-15 drift and a stray duplicate include
that surfaced after re-targeting the bulk of the original CI fix into
the commits that introduced each affected line.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@mjp41 mjp41 force-pushed the BackendArenaRange branch from 0c47339 to 5e31187 Compare June 12, 2026 14:51