pbr-cpp-memory-pool v0.4.0 — Milestone 4: Thread-Safe Variant

Fourth tagged release of pbr-cpp-memory-pool. Milestone 4 adds optional, compile-time-configurable thread safety — and proves the single-threaded fast path is preserved at zero cost (spec §2.4). The C ABI, the C++ wrapper, and the public headers are unchanged: thread safety is a property of how the library is built, not of its interface.

What's in the box

A compile-time thread-safety Strategy on a Template Method skeleton (ADR-0020, ADR-0021)

memory_pool_alloc / memory_pool_free are refactored into an alloc_skeleton / free_skeleton Template Method: the skeleton owns the invariant, race-free guards (null pool / null block / foreign-pointer range check — all reading only post-creation-immutable fields, kept outside any lock) and delegates the synchronized free-list head mutation to two compile-time policy hooks, Policy::pop_head / Policy::push_head. The exhaustion test lives inside pop_head, so the lock-free policy can re-test inside its CAS loop — the design pivot that lets one skeleton + two hooks fit all three policies.
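The shape of that refactor can be illustrated with a minimal sketch. It assumes a simplified pool: `DemoPool`, `FreeNode`, `demo_init`, and the `bool` return of `free_skeleton` are hypothetical names and choices made for illustration, not the library's actual definitions. Only the single-threaded policy is shown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Hypothetical intrusive free-list node stored inside each free block.
struct FreeNode { FreeNode* next; };

// Hypothetical, simplified pool. The first three fields never change after
// creation, so the skeleton guards may read them without synchronization.
struct DemoPool {
    unsigned char* base;
    std::size_t block_size;
    std::size_t block_count;
    FreeNode* head;  // mutated only through the policy hooks
};

// Single-threaded policy: plain head pop/push, no synchronization.
struct SingleThreadedPolicy {
    // The exhaustion test lives inside the hook.
    static void* pop_head(DemoPool& pool) {
        FreeNode* node = pool.head;
        if (node == nullptr) return nullptr;  // pool exhausted
        pool.head = node->next;
        return node;
    }
    static void push_head(DemoPool& pool, void* block) {
        pool.head = new (block) FreeNode{pool.head};
    }
};

// Skeleton: invariant guards first, then delegate the head mutation.
template <class Policy>
void* alloc_skeleton(DemoPool* pool) {
    if (pool == nullptr) return nullptr;
    return Policy::pop_head(*pool);
}

template <class Policy>
bool free_skeleton(DemoPool* pool, void* block) {
    if (pool == nullptr || block == nullptr) return false;
    const auto addr = reinterpret_cast<std::uintptr_t>(block);
    const auto lo = reinterpret_cast<std::uintptr_t>(pool->base);
    const auto hi = lo + pool->block_size * pool->block_count;
    if (addr < lo || addr >= hi) return false;  // foreign pointer
    Policy::push_head(*pool, block);
    return true;
}

// Hypothetical helper: thread every block onto the free list, block 0 on top.
inline void demo_init(DemoPool& pool, unsigned char* storage,
                      std::size_t block_size, std::size_t block_count) {
    assert(block_size >= sizeof(FreeNode));
    pool.base = storage;
    pool.block_size = block_size;
    pool.block_count = block_count;
    pool.head = nullptr;
    for (std::size_t i = block_count; i-- > 0;) {
        SingleThreadedPolicy::push_head(pool, storage + i * block_size);
    }
}
```

Because the policy is a template parameter, a synchronized variant would only need to supply its own `pop_head` / `push_head`; the guards in the skeleton stay untouched.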

Thread safety is a Strategy bound at compile time (policy-based, not runtime-virtual), so the single-threaded build pays literally nothing — no atomic, no branch, no indirect call. Exactly one policy is compiled, selected by the new PBR_MEMORY_POOL_THREAD_SAFETY macro and aliased to ActivePolicy:
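As a rough illustration of compile-time selection, the sketch below picks one policy type with the preprocessor and aliases it to `ActivePolicy`. The macro `DEMO_THREAD_SAFETY` and the numeric `DEMO_MODE_*` constants are assumptions made for this example; how the library encodes `PBR_MEMORY_POOL_THREAD_SAFETY` internally is not stated in these notes.

```cpp
#include <type_traits>

// Hypothetical numeric encoding of the three modes (an assumption of this
// sketch, not the library's actual compile definition).
#define DEMO_MODE_NONE 0
#define DEMO_MODE_MUTEX 1
#define DEMO_MODE_LOCKFREE 2

// Default to the single-threaded fast path when the build sets nothing.
#ifndef DEMO_THREAD_SAFETY
#define DEMO_THREAD_SAFETY DEMO_MODE_NONE
#endif

// Placeholder policies; real ones would carry pop_head / push_head.
struct SingleThreadedPolicy {
    static constexpr const char* name() { return "NONE"; }
};
struct MutexPolicy {
    static constexpr const char* name() { return "MUTEX"; }
};
struct LockFreePolicy {
    static constexpr const char* name() { return "LOCKFREE"; }
};

// Exactly one alias survives preprocessing, so no runtime branch or
// indirect call is needed to reach the chosen policy.
#if DEMO_THREAD_SAFETY == DEMO_MODE_NONE
using ActivePolicy = SingleThreadedPolicy;
#elif DEMO_THREAD_SAFETY == DEMO_MODE_MUTEX
using ActivePolicy = MutexPolicy;
#elif DEMO_THREAD_SAFETY == DEMO_MODE_LOCKFREE
using ActivePolicy = LockFreePolicy;
#else
#error "unknown DEMO_THREAD_SAFETY value"
#endif

// The single-threaded policy carries no state at all.
static_assert(std::is_empty<SingleThreadedPolicy>::value,
              "the fast-path policy should add no per-pool data");

inline const char* active_mode_name() { return ActivePolicy::name(); }
```

Compiling the same file with `-DDEMO_THREAD_SAFETY=2` would switch the alias to `LockFreePolicy` without touching any call site.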

| Mode | Policy | Mechanism |
| --- | --- | --- |
| NONE (default) | SingleThreadedPolicy | No synchronization; the v0.3.0 head pop/push verbatim. |
| MUTEX | MutexPolicy | A std::mutex held across the O(1) pop/push. |
| LOCKFREE | LockFreePolicy | A Treiber-stack compare_exchange_weak loop on an ABA-tagged std::atomic<TaggedHead> head. |
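To make the LOCKFREE row concrete, here is a minimal Treiber-stack sketch with an ABA tag. It deliberately differs from the release in two ways, both assumptions made for portability: the head packs a 32-bit block index and a 32-bit tag into one 64-bit atomic (rather than a 16-byte pointer-plus-tag `TaggedHead`), and the next links are themselves atomics so the example stays free of data races. `TaggedIndexStack` and `kNil` are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <vector>

// Sentinel index meaning "no block" (hypothetical).
constexpr std::uint32_t kNil = 0xFFFFFFFFu;

class TaggedIndexStack {
public:
    // Builds the free list 0 -> 1 -> ... -> count-1 with block 0 on top.
    explicit TaggedIndexStack(std::uint32_t count) : next_(count) {
        for (std::uint32_t i = 0; i < count; ++i) {
            next_[i].store(i + 1 < count ? i + 1 : kNil,
                           std::memory_order_relaxed);
        }
        head_.store(pack(count > 0 ? 0u : kNil, 0u),
                    std::memory_order_release);
    }

    // pop_head analogue: returns a block index, or kNil when exhausted.
    std::uint32_t pop() {
        std::uint64_t old_head = head_.load(std::memory_order_acquire);
        for (;;) {
            const std::uint32_t index = index_of(old_head);
            if (index == kNil) return kNil;  // exhaustion re-tested per retry
            const std::uint32_t next =
                next_[index].load(std::memory_order_relaxed);
            // Bumping the tag makes a recycled index compare unequal (ABA).
            const std::uint64_t desired = pack(next, tag_of(old_head) + 1u);
            if (head_.compare_exchange_weak(old_head, desired,
                                            std::memory_order_acq_rel,
                                            std::memory_order_acquire)) {
                return index;
            }
            // On failure old_head now holds the current head; retry.
        }
    }

    // push_head analogue: returns a block index to the free list.
    void push(std::uint32_t index) {
        assert(index < next_.size());
        std::uint64_t old_head = head_.load(std::memory_order_relaxed);
        for (;;) {
            next_[index].store(index_of(old_head), std::memory_order_relaxed);
            const std::uint64_t desired = pack(index, tag_of(old_head) + 1u);
            if (head_.compare_exchange_weak(old_head, desired,
                                            std::memory_order_release,
                                            std::memory_order_relaxed)) {
                return;
            }
        }
    }

private:
    static std::uint64_t pack(std::uint32_t index, std::uint32_t tag) {
        return (static_cast<std::uint64_t>(tag) << 32) | index;
    }
    static std::uint32_t index_of(std::uint64_t v) {
        return static_cast<std::uint32_t>(v & 0xFFFFFFFFu);
    }
    static std::uint32_t tag_of(std::uint64_t v) {
        return static_cast<std::uint32_t>(v >> 32);
    }

    std::vector<std::atomic<std::uint32_t>> next_;
    std::atomic<std::uint64_t> head_{0};
};
```

The tag is what defeats ABA: if a block is popped and pushed back between a thread's load and its compare-exchange, the index may match again but the tag will not, so the stale exchange fails and the loop retries. A 32-bit tag can wrap in principle; a wider tag, as in the 16-byte head the release describes, makes that far less likely.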

The mode is fixed library-wide at build time via a CMake option (-DPBR_MEMORY_POOL_THREAD_SAFETY=NONE|MUTEX|LOCKFREE), mapped to a PRIVATE compile definition — a per-pool runtime flag is rejected because it would re-introduce a hot-path branch (spec §2.4). struct memory_pool gains policy state conditionally (a 16-byte atomic tagged head under LOCKFREE, a std::mutex under MUTEX); the ADR-0015 static_assert(sizeof(memory_pool) <= 128) holds in all three modes. Per-thread caches are deferred (ADR-0020 §4) — the Strategy seam keeps them a non-breaking future addition.
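The size budget can be illustrated with a compile-time check. The sketch below is an assumption-laden stand-in: it models the three modes as three separate hypothetical structs (`PoolNone`, `PoolMutex`, `PoolLockFree`) so that all of them can be checked in one translation unit, whereas the library conditionally adds the state to a single `struct memory_pool`. The core fields are also illustrative.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <mutex>

// Pointer plus tag: 16 bytes on a typical 64-bit target.
struct TaggedHead {
    void* ptr;
    std::uintptr_t tag;
};

// Hypothetical fields shared by every mode.
struct PoolCore {
    unsigned char* base;
    std::size_t block_size;
    std::size_t block_count;
};

// One struct per mode, standing in for the conditionally compiled state.
struct PoolNone : PoolCore {
    void* head;
};
struct PoolMutex : PoolCore {
    void* head;
    std::mutex lock;
};
struct PoolLockFree : PoolCore {
    std::atomic<TaggedHead> head;
};

// Mirrors the ADR-0015 budget quoted above; a build that outgrows it fails
// at compile time rather than at run time.
static_assert(sizeof(PoolNone) <= 128, "NONE exceeds the size budget");
static_assert(sizeof(PoolMutex) <= 128, "MUTEX exceeds the size budget");
static_assert(sizeof(PoolLockFree) <= 128, "LOCKFREE exceeds the size budget");
```

The actual sizes depend on the platform's `std::mutex` and atomic alignment, which is why a `static_assert` evaluated in every build configuration is a useful guard.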

Concurrent stress tests + ThreadSanitizer (M4.4)

The concurrency_stress CTest binary drives Pool from eight threads, checking three invariants under contention: no over-vend / distinctness (a concurrent drain hands out exactly block_count distinct blocks), full recovery / no leak (exact block_count recovered after heavy churn), and exclusive ownership (a per-thread byte marker proves no double-vend). The suite runs under the thread-safety CI job (MUTEX + LOCKFREE × GCC + Clang). A new tsan CI job runs it under MUTEX with ThreadSanitizer, verifying the mutex-guarded path is data-race free. LOCKFREE is intentionally excluded from TSan (its Treiber-stack next-link reads are a benign, not-cleanly-expressible-in-C++17 race; correctness is covered by the logical invariants + ADR-0020 §3).
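The distinctness and recovery invariants can be sketched as follows. This is not the `concurrency_stress` source: `MutexIndexPool`, `DrainResult`, and `concurrent_drain` are hypothetical, and the stand-in pool hands out block indices rather than memory so the checks stay short.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a MUTEX-mode pool that vends block indices.
class MutexIndexPool {
public:
    explicit MutexIndexPool(std::size_t count) {
        for (std::size_t i = count; i-- > 0;) free_.push_back(i);
    }
    bool try_allocate(std::size_t& out) {
        std::lock_guard<std::mutex> guard(lock_);
        if (free_.empty()) return false;  // exhausted
        out = free_.back();
        free_.pop_back();
        return true;
    }
    void deallocate(std::size_t index) {
        std::lock_guard<std::mutex> guard(lock_);
        free_.push_back(index);
    }
    std::size_t available() {
        std::lock_guard<std::mutex> guard(lock_);
        return free_.size();
    }

private:
    std::mutex lock_;
    std::vector<std::size_t> free_;
};

struct DrainResult {
    std::size_t total_vended;  // blocks handed out across all threads
    std::size_t distinct;      // how many of those were unique
    std::size_t recovered;     // blocks available after everything is freed
};

// Drains the pool from `threads` threads, then frees every block.
inline DrainResult concurrent_drain(std::size_t block_count, unsigned threads) {
    assert(threads > 0);
    MutexIndexPool pool(block_count);
    std::vector<std::vector<std::size_t>> vended(threads);
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < threads; ++t) {
        workers.emplace_back([&pool, &vended, t] {
            std::size_t index = 0;
            // Each thread writes only to its own vector, so no extra locking.
            while (pool.try_allocate(index)) vended[t].push_back(index);
        });
    }
    for (auto& worker : workers) worker.join();

    std::vector<std::size_t> all;
    for (const auto& v : vended) all.insert(all.end(), v.begin(), v.end());
    const std::size_t total = all.size();
    std::sort(all.begin(), all.end());
    const std::size_t distinct = static_cast<std::size_t>(
        std::unique(all.begin(), all.end()) - all.begin());

    for (const auto& v : vended) {
        for (std::size_t index : v) pool.deallocate(index);
    }
    return DrainResult{total, distinct, pool.available()};
}
```

A correct pool yields `total_vended == distinct == recovered == block_count`; over-vending shows up as `total_vended > distinct`, and a leak as `recovered < block_count`.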

Comparative benchmark — fast path vs. concurrent path (M4.5)

pool_vs_malloc_bench gains a concurrent scenario (--scenario concurrent, --threads N): N threads run the interleaved loop on a shared pool, and the benchmark reports aggregate ns/op against malloc. Canonical numbers (Intel i5-6600K Skylake, MSVC 19.51 Release, 4 threads; full report at docs/bench/v0.4.0-windows-msvc-x64-threading.md):

| Measure | NONE | MUTEX | LOCKFREE |
| --- | --- | --- | --- |
| single-thread interleaved (ns/op) | 9.3 | 47.2 | 31.7 |
| concurrent, 4 threads, pool (ns/op) | 9.5¹ | 69.5 | 41.8 |

¹ NONE has no synchronization and would race on a shared pool, so the benchmark clamps it to one thread; the figure serves as the fast-path baseline.

  • The single-thread fast path is preserved: NONE matches the M2.9 numbers, ~5× faster than malloc.
  • Synchronization has a real uncontended cost (47.2 ns/op for MUTEX and 31.7 ns/op for LOCKFREE, against 9.3 ns/op for NONE); LOCKFREE < MUTEX.
  • Under contention, LOCKFREE beats MUTEX, but a single-shared-head pool cannot out-scale malloc's per-thread arenas — the evidence motivating the deferred per-thread caches.
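For readers who want to reproduce the shape of the concurrent measurement, the sketch below shows one way an aggregate ns/op figure could be computed. It is not the `pool_vs_malloc_bench` source. It assumes that one "op" is an allocate/deallocate pair and that the aggregate is wall-clock time divided by the total pair count; `MutexIndexPool`, `BenchResult`, and `run_concurrent` are hypothetical, and the timing here includes thread start-up, which a careful benchmark would exclude.

```cpp
#include <chrono>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical mutex-guarded pool of block indices shared by all threads.
class MutexIndexPool {
public:
    explicit MutexIndexPool(std::size_t count) {
        for (std::size_t i = count; i-- > 0;) free_.push_back(i);
    }
    bool try_allocate(std::size_t& out) {
        std::lock_guard<std::mutex> guard(lock_);
        if (free_.empty()) return false;
        out = free_.back();
        free_.pop_back();
        return true;
    }
    void deallocate(std::size_t index) {
        std::lock_guard<std::mutex> guard(lock_);
        free_.push_back(index);
    }

private:
    std::mutex lock_;
    std::vector<std::size_t> free_;
};

struct BenchResult {
    std::size_t total_ops;  // completed allocate/deallocate pairs
    double ns_per_op;       // wall-clock nanoseconds per pair, aggregated
};

inline BenchResult run_concurrent(unsigned threads, std::size_t iters) {
    // One block per thread: each thread holds at most one block at a time,
    // so allocation cannot fail.
    MutexIndexPool pool(threads);
    std::vector<std::size_t> done(threads, 0);
    std::vector<std::thread> workers;

    const auto start = std::chrono::steady_clock::now();
    for (unsigned t = 0; t < threads; ++t) {
        workers.emplace_back([&pool, &done, t, iters] {
            for (std::size_t i = 0; i < iters; ++i) {
                std::size_t index = 0;
                if (pool.try_allocate(index)) {  // interleaved alloc/free
                    pool.deallocate(index);
                    ++done[t];
                }
            }
        });
    }
    for (auto& worker : workers) worker.join();
    const auto stop = std::chrono::steady_clock::now();

    std::size_t total = 0;
    for (std::size_t n : done) total += n;
    const double elapsed_ns =
        std::chrono::duration<double, std::nano>(stop - start).count();
    return BenchResult{total, total > 0 ? elapsed_ns / total : 0.0};
}
```

With every thread contending on one lock, adding threads mostly adds waiting, which is consistent with the observation above that a single shared head does not scale like per-thread arenas.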

Architecture Decision Records

Two ADRs accepted in Milestone 4, taking the running total from 19 to 21:

  • ADR-0020 — Thread-safety Strategy and compile-time configuration knob.
  • ADR-0021 — Template Method allocation skeleton with thread-safety hook points.

Design-patterns catalogue

Two patterns flip to Implemented in docs/patterns/README.md: Strategy (the three policies + macro) and Template Method (the alloc_skeleton / free_skeleton frame). Composite, Decorator, and Observer remain Planned against the Milestone 5–6 items.

Spec Coverage Map

Two rows flip to ✅: §2.4 (optional, configurable thread safety; single-thread fast path preserved) and §6.3 (benchmark vs malloc — the concurrent comparative re-run completes the contract). Coverage at the close of Milestone 4 is ten rows ✅; the remaining ⏳ work is §2.2's dynamic-growth half (Milestone 5) and the instrumentation items (Milestone 6).

What this release does not contain

  • Dynamic growth on exhaustion — Milestone 5 → v0.5.0. Allocation surfaces exhaustion in fixed mode.
  • Per-thread caches — deferred (ADR-0020 §4); the Strategy seam keeps them a future addition.
  • Instrumented / observable variants (Decorator, Observer, statistics) — Milestone 6 → v0.6.0.
  • Doxygen-rendered API site, install / packaging (vcpkg, Conan) — Milestone 7 → v1.0.0.

Verifying the release

Each platform tarball produced by release.yml contains the public headers under include/it/d4np/memorypool/, the static archive under lib/, and the top-level LICENSE, README.md, CHANGELOG.md. SHA-256 checksums of every attached artifact live in SHA256SUMS:

```sh
sha256sum --check SHA256SUMS
```

The release pipeline re-runs the full CI matrix against the tagged commit — a green release workflow run is the canonical "reproducible from a cold runner" signal.

Build and use

The default build is single-threaded (the fast path). To build a thread-safe library:

```sh
cmake --preset release -DPBR_MEMORY_POOL_THREAD_SAFETY=LOCKFREE
cmake --build --preset release
```

The public API is identical across modes:

```cpp
#include <it/d4np/memorypool/memory_pool.hpp>

int main() {
    using namespace it::d4np::memorypool;
    if (auto pool = Pool::make(64, 1024)) {     // thread-safe iff the library was built so
        void* block = pool->try_allocate();
        // ...
        pool->deallocate(block);
    }
}
```

Links