perf(plugin): per-batch atomic Postgres write optimisations for bulk-apply#172
Draft
ldrozdz93 wants to merge 4 commits into
Draft
perf(plugin): per-batch atomic Postgres write optimisations for bulk-apply#172ldrozdz93 wants to merge 4 commits into
ldrozdz93 wants to merge 4 commits into
Conversation
PR 2 of BULK-ORM Sprint 1.
Adds `_preload_changeset_cache(change_set, request)` called at the top of
`apply_changeset`. It walks the changeset once, collects every distinct
object_type and every tag slug, then:
- Calls `ContentType.objects.get_for_models(*models)` — a single query
that warms Django's per-process content-type cache so subsequent
per-change `get_for_model()` calls are free.
- Issues one `Tag.objects.filter(slug__in=...).values_list('id','slug')`
query and stashes `tag_ids_by_slug` on `request._diode_preload`.
PR 4 will reuse `tag_ids_by_slug` to bulk-write `TaggedItem` rows.
Test `test_preload_cache.py` asserts:
- After preload, `get_for_model(Site)` runs in 0 queries.
- Tag IDs for slugs referenced by non-NOOP changes are resolved in <= 2
queries (1 ContentType warm + 1 Tag fetch).
b7acf92 to
c1b47bc
Compare
NetBox's core.signals.handle_changed_object fires ObjectChange.save() once per post_save / m2m_changed event. For a multi-thousand-entity batch the per-instance INSERT becomes the dominant write cost. Install a module-level monkey-patch on ObjectChange.save that, when a thread-local flag is active, buffers the instance instead of writing it. The deferred_changelog() context manager yields a _Deferred accumulator with commit_pending() / rollback_pending() / flush() methods: * On success of a changeset, the caller calls commit_pending(): the changeset's pending audit-log rows promote into the batch buffer. * On failure (ChangeSetException -> savepoint rollback), the caller calls rollback_pending(): the failed changeset's audit rows are discarded so they cannot leak past the rollback. * At batch end, flush() drains the batch buffer with one ObjectChange.objects.bulk_create(buffer, batch_size=500). m2m merge dedup is preserved by a per-window in-memory index keyed on (content_type_id, object_id, request_id): a follow-up m2m_changed fire collapses onto the prior buffered row instead of producing a duplicate audit entry. ApplyChangeSetView and BulkApplyView both wrap their work with deferred_changelog(), so the batch's audit-log rows commit together with the batch's data writes (per-batch atomicity to readers). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
NetBox's instance.tags.set([...]) triggers m2m_changed per tagged instance. The signal handler then (a) writes another ObjectChange row and (b) re-runs serialize_for_event(instance) for the events_queue. For changesets with thousands of tagged objects this dominates apply cost on top of the per-object signal overhead. Add bulk_tags.apply_tags_bulk(instance_tag_pairs, request) which: - Bypasses the through-table m2m signal by bulk-writing extras_taggeditem rows directly via bulk_create(ignore_conflicts= True, batch_size=500). _diode_preload['tag_ids_by_slug'] (from the preload step) is reused; any missing slugs are resolved via one fallback query. - Mirrors instance.tags.set([...]) replace-set semantics by deleting existing TaggedItem rows for the (content_type, object_id) targets in one combined DELETE before the bulk INSERT. - Re-emits a single enqueue_event(OBJECT_UPDATED) per tagged instance so the events_queue payload reflects the post-tag state, without the duplicate ObjectChange row. applier._apply_change pops 'tags' from 'data' before serializer.save() runs and buffers (instance, tag_input) on request._diode_tag_pairs; apply_changeset flushes via apply_tags_bulk after its main loop. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The whole batch runs inside an outer transaction.atomic(), so all data + tag + audit writes for the batch become visible to readers in one atomic transition. The trade-off is a longer lock-hold window per batch, which can cause Postgres deadlocks (SQLSTATE 40P01) when two concurrent batches touch the same rows in different orders. Wrap BulkApplyView's outer transaction.atomic() in a bounded retry loop. On a Postgres deadlock or serialization failure (40001), the whole batch transaction is rolled back; the retry loop re-enters with fresh deferred_changelog accumulators (no stale state from the aborted attempt). Behaviour: - New plugin setting: batch_apply_deadlock_retry_max_count, default 3 (three retries after the initial attempt = up to 4 attempts total). - 0 disables retries entirely. - Backoff is jittered exponential: 50ms * 2^attempt * U(0.5, 1.5). - On success after retries, the response carries X-Diode-Batch-Retries: <n> so the reconciler-side log can correlate. - On exhaustion, the original OperationalError is re-raised; caller handles the failure as before. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
c1b47bc to
de8aa33
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
[claude]
Summary
Three Postgres-side optimisations stacked on top of the
bulk-applyendpoint, plus a configurable retry on Postgres deadlock. All writes happen inside the batch's outertransaction.atomic(), so the batch remains atomic from a concurrent reader's perspective.Draft PR — not for merge yet.
Status: NOT tested after rebase
This branch has not been exercised end-to-end against
develop's code. The optimisations were originally developed and benchmarked on a separate branch (perf/bulk-orm-no-ttl-bump) that pre-dated develop's own/bulk-apply/endpoint. To produce this PR, the branch was rebased onto the currentdevelopand re-targeted atBulkApplyViewinstead of the olderApplyChangeSetBatchView.The rebased branch has only been:
python ast.parse)ruffclean on changed files)It has NOT been:
Anyone using this branch for local performance testing should run the plugin's test suite as a first sanity check, then run their own end-to-end ingest before drawing conclusions. The bench numbers below were measured on the pre-rebase branch — they should hold (the optimisations are structurally identical) but a fresh sweep is the right thing to confirm.
Single-changeset endpoint also changes — drop or test before merge
In addition to the batch endpoint, this PR also modifies the existing single-changeset endpoint (
ApplyChangeSetViewat/apply-change-set/). Before this PR, that endpoint just did:After this PR, it does:
So the single-changeset endpoint also picks up:
bulk_createat the end inside the same atomic).apply_tags_bulkfromapplier.py).applier.py).It does NOT pick up the deadlock retry (that's only on
BulkApplyView).These changes were not the explicit target of the original optimisation work — they came along because
applier.pyis shared and because audit-log buffering had to be wired into both endpoints to be self-consistent. They have a few risks:apply_tags_bulkissues aDELETE+ bulkINSERT(replace-set semantics, matchinginstance.tags.set([...])). On a single-changeset call this is functionally the same as the prior path, but the SQL shape is different (one transaction wrapping one DELETE + one INSERT, instead of multiplem2m_changedtriggers). Anything monitoringextras_taggeditemmutations via signals will see a different pattern.Recommendation before merge: pick one of
ApplyChangeSetView._postunchanged from develop, scope this PR strictly to the batch endpoint. Lower risk, narrower change surface, matches what was actually benchmarked. The plugin code inapplier.pyanddeferred_changelog.pyis shared, but the calling convention from the single-changeset view can stay outside the new wrapper.ApplyChangeSetView, plus add coverage for the new audit-log / tag-write paths under that endpoint. Check for any downstream consumer (webhooks, event subscribers, observers) that depends on the old per-save audit-row pattern.Option 1 is the cheaper / lower-risk path if the only goal is the bench win on the batch endpoint. Option 2 is the right path if you want the optimisations to also apply to single-changeset callers (smaller absolute win, but the per-changeset CPU drop should still help).
What changed, in simple words
1. Preload ContentType + tag IDs once per changeset
Before: every entity save did one or more small
SELECTqueries to look up "what's the integer ID for this model type?" and "what's the integer ID for thediodetag?". With many entities, that's hundreds of small round-trips to Postgres.After: at the start of each changeset, the plugin runs one
SELECT * FROM django_content_typeand oneSELECT * FROM extras_tag, caches the results in a Python dict on the request, and every subsequent ID lookup inside the changeset is an in-memory dict access instead of a Postgres call.2. Batch-wide single bulk
INSERTof NetBox audit-log rowsBefore: every time an entity is saved, NetBox's signal handler inserts one row into
core_objectchange(the audit log). For a 100-changeset batch with ~280 entities each, that's ~28,000 individual INSERTs into the audit-log table.After: a thread-local buffer collects
ObjectChangerows in memory across all changesets in the batch. At the very end of the batch, the buffer is flushed with oneObjectChange.objects.bulk_create(...). ~28,000 small INSERTs become 1 batched INSERT.A failed-changeset accumulator (pending vs committed buffers) makes sure that if changeset N rolls back via its savepoint, its pending audit rows are discarded — they don't leak into the final bulk_create. So audit-log consistency is preserved: a row exists in the audit log only if the data it audits was actually committed.
3. Per-changeset bulk
INSERTofTaggedItemrows + suppress redundant signalBefore: every
device.tags.add(...)issued oneINSERT INTO extras_taggeditemper tag. For ~280 entities × ~3 tags each, that's ~840 small INSERTs per changeset, and NetBox'sm2m_changedhandler also fires one extraObjectChangeaudit row per tag attachment (~840 extra audit rows per changeset on top).After: tag-link rows are accumulated in memory per changeset and written in one
TaggedItem.objects.bulk_create(...)at the end of each changeset. The redundantm2m_changedaudit-row fire is suppressed, so a tag attachment no longer doubles the audit-log volume.Measured impact (pre-rebase numbers)
Workload:
meraki-x10.yaml(28,310 entities), single host, RPS=100,APPLY_BATCH_SIZE=100,USE_BATCH_APPLY=true,AUTO_APPLY_CHANGESETS=false. Three runs each cell, mean vs mean:Range no-overlap on all three metrics: every optimised run was faster, less CPU-busy, and easier on Postgres than every baseline run. CV around 7–8% on both cells over the three runs.
These numbers were taken on the pre-rebase branch and measured only the batch endpoint. They have not been re-measured after the rebase, and they do not speak to the single-changeset endpoint's behaviour (see Status section).
Rare deadlock risk: concurrent batches modifying the same entities in different order
Because the whole batch runs inside one outer
transaction.atomic(), every row this batch writes stays row-locked in Postgres until the outer commit (which can take seconds, depending on batch size). That trade-off is the cost of giving readers a consistent per-batch view, and it opens up a rare but real deadlock scenario.The scenario. Two different ingest requests are handled by two separate reconciler workers in parallel. The two resulting batch-API calls happen to touch the same entities, but in a different order. For example:
X takes the row lock on A; Y takes the row lock on B. Then X tries to write B (blocked, Y has it) and Y tries to write A (blocked, X has it). Each is waiting for the other to release. This is the classic A-B / B-A deadlock pattern.
Postgres auto-resolves this. When Postgres detects the wait-cycle (default
deadlock_timeout= 1 second), it picks one of the two transactions as the "victim" and aborts it with SQLSTATE40P01(deadlock_detected). The other transaction proceeds to commit normally. So the system doesn't hang — but this should be tested empirically. The above reasoning was assisted by AI analysis and should not be trusted blindly — please verify the actual behaviour in your local test setup before relying on it for capacity planning.Mitigation (implemented in this PR). The aborted batch raises Django's
OperationalErrorwithpgcode='40P01'. The plugin catches this and retries the whole batch with jittered exponential backoff. Max retries is configurable:3retries (so up to 4 attempts total).0disables retries entirely.On success after retry, the response carries an
X-Diode-Batch-Retries: <n>header so it shows up in the caller's logs.The deadlock retry is implemented only on the batch endpoint. The single-changeset endpoint inherits the outer atomic + audit-buffer wrapping but does not retry on deadlock; a deadlock there bubbles up to the caller as a 500 (as it does today on develop).
Alternative: client-side retry on
40P01. The reconciler client could in principle catch the deadlock error code on the gRPC/HTTP response and retry the call from its side. That would work, but it's more complicated than retrying on the server side: the reconciler has to re-encode the gRPC request, re-burn rate-limiter budget, re-traverse the network, and the visibility of "which batch is being retried" gets harder to correlate across logs. Doing it inside the plugin keeps the retry cheap (milliseconds), local, and observable via a single response header.Effect on concurrent NetBox UI users during a batch
The same row-level locking that enables per-batch atomicity also means that any concurrent NetBox UI user (or any other writer) will block if they try to modify an entity that the in-flight batch has already locked. The block lasts until the outer transaction commits — which, depending on batch size, can be a few seconds for typical workloads. Plain UI reads (list pages, filter views, detail pages) are not blocked — under Postgres's default MVCC, readers see the pre-batch snapshot and never wait on writers.
So during a heavy ingest, a user who happens to be editing a device that the ingest is also touching will see a multi-second hang on "Save". For most workloads this is invisible (no overlap), but it's a real behaviour worth knowing about.
🤖 Generated with Claude Code