Skip to content

perf: O(log n) priority_queue delete via in-element index tracking (#263)#401

Draft
bushidocodes wants to merge 1 commit into
masterfrom
feat/priority-queue-olog-n-delete
Draft

perf: O(log n) priority_queue delete via in-element index tracking (#263)#401
bushidocodes wants to merge 1 commit into
masterfrom
feat/priority-queue-olog-n-delete

Conversation

@bushidocodes

Copy link
Copy Markdown
Contributor

Closes #263.

What

The min-heap priority_queue located elements for deletion with an O(n) linear scan (priority_queue_delete_nolock). This makes delete O(log n): each element stores its own 1-based heap index, the queue keeps that slot in sync on every move, and delete reads it directly and repairs the heap in O(log n). This is the approach from the composite codebase referenced in the issue — and the local_runqueue_mtds.c delete path even carried a TODO: Apply the composite way of PQ deletion O(logn).

Design

  • Optional, opt-in per queue. priority_queue_initialize takes a new get_index_fn accessor that returns a pointer to the element's in-struct size_t index slot. Pass NULL to keep the legacy O(n) linear-scan delete — so queues that are never delete()'d from (global_request_scheduler_minheap, tgrq->sandbox_requests) are untouched and behavior-preserving.
  • Index maintained on every move (append, swaps in percolate up/down, dequeue, delete) via small record/clear/swap helpers; percolate_up is generalized to percolate_up_from(start).
  • Latent bug fixed. Arbitrary-position delete must repair the heap up or down — the old code only percolated down, which left the heap invalid whenever the element moved into the hole was smaller than the hole's parent. The new delete picks the correct direction.
  • Added a size_t pq_idx slot + accessor to the four element types that are delete()'d (sandbox, tenant_global_request_queue, perworker_tenant_sandbox_queue, tenant_timeout) and wired the accessor into the 7 queues that call delete. Each tracked element is a member of at most one tracked queue at a time, so a single slot suffices.

Correctness

Deterministic unit test against the real header (heap-invariant assertions, the percolate-up-on-delete case the old code got wrong, absent-element delete returning −1, and 200k random enqueue/delete/dequeue ops proving the tracked O(log n) path is behaviorally identical to the legacy O(n) path). Clean under ASan + UBSan.

Scheduler validation (in the dev container)

Experiment Schedulers exercised Result
traps EDF/FIFO/MTDS × pre/non-preempt (all 5 common/*.env)
stack_overflow all 5 scheduler configs
fibonacci/bimodal EDF preemption (heavy sandbox churn)
empty/concurrency 10k req × concurrency 1–100

traps + stack_overflow sweep MTDS, exercising all four tracked element types' delete paths (sandbox, tgrq, pwt, tenant_timeout).

deadline_description failed in its post-run perf-log analysis ("Failed to get baseline execution for cifar10") — but it fails identically on master (verified by rebuilding the pre-change binary and re-running). All HTTP requests served fine; the panic is in offline bash analysis of perf.log, unrelated to this change. It is not in CI or test.mk all.

Microbenchmark — changed algorithm vs existing O(n)

Standalone harness, container clang-13 -O3, timing delete-all-in-random-order at heap size N (best of 9). old O(n) is the true pre-change header; new O(logn) is this PR.

N old O(n) ns new O(logn) ns speedup
8 17.6 21.4 0.82×
32 20.5 27.1 0.76×
128 31.0 31.3 0.99×
256 43.1 32.3 1.33×
512 68.6 32.6 2.10×
1024 121.1 33.7 3.59×
2048 224.1 37.0 6.06×
4096 441.1 40.3 10.95×

Indexed delete stays ~flat (~21→40 ns) as the heap grows; the O(n) scan grows linearly (18→441 ns). Crossover ≈ N=128: for tiny heaps the index-maintenance constant makes it marginally slower, but it wins increasingly past that — up to ~11× at 4096.

Mapping to real queue sizes: the MTDS tenant queues are small (RUNTIME_MAX_TENANT_COUNT=32, ~neutral), but the local runqueue (RUNTIME_RUNQUEUE_SIZE=256, grows) and per-tenant sandbox queues (RUNTIME_TENANT_QUEUE_SIZE=4096) get deep under load — exactly where the old O(n) delete hurt. Beyond average throughput, this bounds worst-case delete latency (O(n) on a full queue is ~440 ns and climbs with load; O(log n) stays ~40 ns), which helps scheduler tail-latency/predictability under overload.

🤖 Generated with Claude Code

…ng (#263)

The min-heap priority_queue located elements for deletion with an O(n) linear
scan. This makes delete O(log n) by having each element store its own 1-based
heap index, which the queue keeps in sync on every move (append, swaps during
percolate up/down, dequeue, delete). delete then reads the element's slot
directly and repairs the heap in O(log n). Mirrors the approach in the
composite codebase referenced in the issue.

- priority_queue.h:
  * Add an optional `get_index_fn` accessor (returns a pointer to the element's
    in-struct size_t index slot). NULL preserves the legacy O(n) linear-scan
    delete, so the change is opt-in per queue and behavior-preserving where not
    wired up.
  * Maintain the index slot on every element move via small record/clear/swap
    helpers; generalize percolate_up into percolate_up_from(start_index).
  * Rewrite delete_nolock: O(1) lookup via the slot (with a defensive
    items[i]==value validation that still returns -1 for absent elements), then
    repair the heap. Also fixes a latent correctness bug: arbitrary-position
    delete must percolate the replacement UP or DOWN, not down-only — the old
    code could leave the heap invalid when the moved last element was smaller
    than the hole's parent.

- Add a `size_t pq_idx` slot + accessor to the four element types that are
  delete()'d (sandbox, tenant_global_request_queue, perworker_tenant_sandbox_queue,
  tenant_timeout) and wire the accessor into the 7 queues that call delete.
  Queues that are only enqueued/dequeued (global_request_scheduler_minheap,
  tgrq->sandbox_requests) keep the NULL/legacy path. Each tracked element is a
  member of at most one tracked queue at a time, so a single slot is sufficient.

Verified with a deterministic unit test against the real header (heap-invariant
checks, the percolate-up-on-delete case, absent-element delete, and 200k random
ops proving the tracked O(log n) path is behaviorally identical to the legacy
O(n) path), clean under ASan+UBSan.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@bushidocodes bushidocodes requested a review from emil916 June 23, 2026 01:34
@bushidocodes

Copy link
Copy Markdown
Contributor Author

@emil916 - I'm pretty fried and don't have the braincells to review this closely and compare to the composite link you posted way back when, but here's an Opus 4.8 attempt at what you want with some preliminary perf numbers. I'm leaving this as a draft because I haven't reviewed this as closely. You might be more familiar here and be able to determine if this is an improvement a bit faster.

@emil916

emil916 commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

I have already implemented this in my feat branch. I thought this was already merged. I remember there were some complications when I did it. So, it is good to have another view. I'll look at this later, as well. Thank you, Sean!

@bushidocodes

Copy link
Copy Markdown
Contributor Author

@emil916 - Feel free to close this out if it's a dupe of what you're going to merge.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Sledge heap removal takes O(n)

2 participants