
Commit 141b1e2

Maffooch and claude committed
⚡ speed up migrate_endpoints_to_locations (~14× fewer queries)
Reduces per-endpoint cost in the Endpoint→Location data migration from ~613 queries / 240 ms to ~44 queries / 17 ms on a 50-endpoint / 1,000-finding local benchmark: a 13.9× query reduction and a 14.1× wall-clock improvement that should bring a 16-hour prod run down to roughly one hour.

As a side effect, fixes the latent associate_with_product short-circuit bug where 'Mitigated' could stick on a LocationProductReference even after later Active findings came in for the same product.

Changes (kept inside this management command; no edits to shared Location model methods):

- select_related/prefetch_related on the main Endpoint queryset so the per-endpoint loop has no hidden joins through tags, endpoint_meta, status_endpoint, or finding→test→engagement→product/mitigated_by.
- tags.add(*names) splat instead of one round-trip per tag.
- DojoMeta.bulk_create(ignore_conflicts=True) per endpoint instead of get_or_create per row (DojoMeta.unique_together = (location, name) makes ignore_conflicts semantically equivalent here).
- LocationFindingReference and LocationProductReference are bulk_created per endpoint instead of going through Location.associate_with_finding / associate_with_product. This bypasses BaseModel.save's full_clean() validate_unique queries AND the inherit_tags_on_linked_instance post_save signal (which fires all_related_products through the finding→test→engagement→product chain on every save). Product status is derived in-memory across all of an endpoint's finding statuses.
- _suspend_auto_now_add wraps the LocationFindingReference bulk write so the explicit 'created' value (= source Endpoint_Status.date) is honored. Django's SQLInsertCompiler.pre_save_val calls Field.pre_save(add=True) even from bulk_create; auto_now_add would otherwise overwrite our value with now().
- New CLI flags for ops visibility on long runs: --batch-size, --progress-every, --benchmark, --query-count. Default progress line: 'Migrated X/Y (z%) — N ep/sec — ETA …'.

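The auto_now_add point above can be illustrated without a database. This is only a sketch of the idea, not the command's actual _suspend_auto_now_add helper (its body is not shown in this message). FakeDateTimeField and Row are hypothetical stand-ins that mimic how a Django DateTimeField's pre_save treats auto_now_add on insert; suspend_auto_now_add is an assumed shape for such a context manager.

```python
from contextlib import contextmanager
from datetime import datetime, timezone


class FakeDateTimeField:
    """Hypothetical stand-in mimicking DateTimeField.pre_save with auto_now_add."""

    def __init__(self, attname, auto_now_add=False):
        self.attname = attname
        self.auto_now_add = auto_now_add

    def pre_save(self, instance, add):
        # On insert, auto_now_add replaces whatever value the caller set.
        if self.auto_now_add and add:
            value = datetime.now(timezone.utc)
            setattr(instance, self.attname, value)
            return value
        return getattr(instance, self.attname)


@contextmanager
def suspend_auto_now_add(*fields):
    """Switch auto_now_add off on the given fields, restoring it on exit."""
    saved = [(field, field.auto_now_add) for field in fields]
    for field in fields:
        field.auto_now_add = False
    try:
        yield
    finally:
        # Restore even if the wrapped write raises.
        for field, original in saved:
            field.auto_now_add = original


class Row:
    """Hypothetical row object carrying an explicit 'created' value."""

    def __init__(self, created):
        self.created = created


created_field = FakeDateTimeField("created", auto_now_add=True)
source_date = datetime(2020, 1, 2, tzinfo=timezone.utc)

# Inside the context manager the explicit value survives the insert path.
with suspend_auto_now_add(created_field):
    kept = created_field.pre_save(Row(source_date), add=True)

# Outside it, the field overwrites the explicit value with "now".
overwritten = created_field.pre_save(Row(source_date), add=True)
```

With a real model the field object would come from Model._meta.get_field("created"). Because the flag lives on the shared field instance, a toggle like this affects the whole process while it is active, which is tolerable in a single-threaded management command but would not be in request-serving code.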
Per-step measurements (50 ep / 1,000 findings, V3_FEATURE_LOCATIONS=True, local docker postgres):

  step                       wall     queries/ep   verifier
  baseline (instrumented)    12.00s   613          14 LPR-status warnings (pre-existing bug)
  + prefetch_related         10.63s   528          same
  + tags splat               10.08s   507          same
  + DojoMeta bulk_create     10.24s   498          same
  + bulk LFR/LPR + fix        0.85s    44          all strict checks pass

Idempotent re-runs validated. Verifier checks counts (URLs, Locations, LFR, LPR, location-DojoMeta), per-row LFR fields (status, created, audit_time, auditor), endpoint→location tag subset, and DojoMeta (location, name) parity.

Intentional behavioral diffs vs. the previous code:

1. LocationProductReference.status now reflects 'Active iff any finding for this (location, product) is Active', which fixes the associate_with_product first-write-wins bug. Previously order-dependent; ~28% of product refs were mis-statused on the seeded distribution.
2. Tag inheritance via the inherit_tags_on_linked_instance post_save signal does NOT fire (bulk_create skips signals). For deployments with enable_product_tag_inheritance=True on products (or the system setting on), inherited product tags will not be propagated onto migrated Locations during this command. The seed used in benchmarking does not exercise this path. If your environment uses product tag inheritance, follow up with a one-time Location.inherit_tags pass after this command, or call out and we can bake _bulk_inherit_tags into the migration.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
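To make the 'Active iff any finding is Active' rule concrete, here is a minimal sketch in plain Python that contrasts it with first-write-wins behaviour. The function names, the tuple layout, and the assumption that every non-Active status collapses to 'Mitigated' are illustrative only and are not taken from the command's code.

```python
from collections import defaultdict

ACTIVE = "Active"
MITIGATED = "Mitigated"


def derive_product_statuses(finding_refs):
    """Map (location_id, product_id) to a status, order-independently.

    finding_refs is an iterable of (location_id, product_id, finding_status).
    Assumption: any status other than 'Active' counts as 'Mitigated'.
    """
    any_active = defaultdict(bool)
    for location_id, product_id, status in finding_refs:
        any_active[(location_id, product_id)] |= status == ACTIVE
    return {
        key: ACTIVE if is_active else MITIGATED
        for key, is_active in any_active.items()
    }


def first_write_wins(finding_refs):
    """Order-dependent behaviour: the first status seen for a key sticks."""
    result = {}
    for location_id, product_id, status in finding_refs:
        result.setdefault((location_id, product_id), status)
    return result


refs = [
    (1, 10, MITIGATED),  # seen first, so it would stick under first-write-wins
    (1, 10, ACTIVE),
    (2, 10, MITIGATED),
]

derived = derive_product_statuses(refs)
derived_reversed = derive_product_statuses(reversed(refs))
sticky = first_write_wins(refs)
```

Here derived marks (1, 10) as Active regardless of input order, while sticky leaves it Mitigated because that status arrived first.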
1 parent b70c293 commit 141b1e2

1 file changed

Lines changed: 318 additions & 51 deletions
