perf(tags): batch_mode + per-batch bulk inheritance during import (Phase B Stage 2)
Wraps the import / reimport hot loop in `tag_inheritance.batch_mode()` and
bulk-applies inherited Product tags per batch *before* `post_process_findings_batch`
dispatches, so rules / deduplication see inherited tags on `finding.tags`.
Changes:
- `process_findings` (DefaultImporter + DefaultReImporter) now runs its
finding-creation loop inside `batch_mode()`. Per batch, right after
`apply_import_tags_for_batch`, calls a new helper
`apply_inherited_tags_for_findings(batch_findings)` that bulk-syncs
inherited tags on the batch's Findings plus the Endpoints (V2) / Locations
(V3) reachable from them via FK chain. Inheritance is therefore applied to
the persisted children before the post-process task dispatches.
- `inherit_instance_tags` in `dojo/tags_signals.py` now returns early when
`tag_inheritance.is_in_batch_mode()` is true, so the batch wrap transparently
suppresses per-row inheritance work for every caller, including
`bulk_create` cleanup paths that invoke it manually. The
`inherit_tags_on_instance` post_save handler delegates to that helper, so
the gate also covers signal-driven calls.
- `EndpointManager.get_or_create_endpoints` replaces its per-row
`inherit_instance_tags(ep)` cleanup loop with a single
`apply_inherited_tags_for_endpoints(created)` bulk call. Inside the importer
the per-batch helper already covers these endpoints via
`Endpoint.status_endpoint.finding`; the bulk call is kept as a defensive
hook for any non-importer caller.
- `propagate_tags_on_product_sync` (used by the product-tag-toggle Celery
task) gains an early-exit when neither system-wide nor per-product
inheritance is enabled, eliminating ~9 wasted reads per call on
inheritance-off products. State transitions (toggling either flag) still
trigger a full sweep through their existing signal handlers.
- `Location` gains `iter_related_products()`: an implementation of
`all_related_products()` built on the related managers (`self.products` +
`self.findings`) that returns `list[Product]`. Callers that pre-issue
`prefetch_related("products__product__tags",
"findings__finding__test__engagement__product__tags")` get zero extra
queries per Location. The existing JOIN'd `all_related_products()` is kept
for the per-instance signal path where prefetching is not possible.
- `_inherited_tag_names_for_location` (the per-Location callback used to
compute the inherited target set) switches to `iter_related_products()`;
both call sites (`propagate_tags_on_product_sync` V3 branch and
`apply_inherited_tags_for_findings` V3 branch) now prefetch the chain.
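
The batch-mode gate described in the first two bullets can be sketched as follows. This is a minimal illustration, not the code in this commit: it assumes `batch_mode()` is a context manager backed by a `contextvars.ContextVar`, and the `calls` list and the body of `inherit_instance_tags` are hypothetical stand-ins for the real per-row work.

```python
import contextvars
from contextlib import contextmanager

# Hypothetical flag; the real module may track batch mode differently.
_in_batch = contextvars.ContextVar("tag_inheritance_batch", default=False)


@contextmanager
def batch_mode():
    """Suppress per-row inheritance for the duration of the block."""
    token = _in_batch.set(True)
    try:
        yield
    finally:
        # Restore the previous value so nested or failed blocks unwind cleanly.
        _in_batch.reset(token)


def is_in_batch_mode():
    return _in_batch.get()


calls = []  # stand-in for the per-row inheritance work


def inherit_instance_tags(instance):
    if is_in_batch_mode():
        return  # the per-batch helper is expected to sync this instance in bulk
    calls.append(instance)


inherit_instance_tags("finding-1")      # runs: not in batch mode
with batch_mode():
    inherit_instance_tags("finding-2")  # skipped by the gate
inherit_instance_tags("finding-3")      # runs again after the block
```

Resetting the token in `finally` means the flag is restored even if the import loop raises, so later per-instance saves are not silently skipped.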
Query-count impact on `unittests/test_tag_inheritance_perf.py` (pinned
query counts updated in this commit):
| Hot path | Before | After | Δ |
|-----------------------------------------|--------:|-------:|------:|
| ZAP scan import V2 (19 findings) | 1385 | 477 | -908 |
| ZAP scan import V3 | 1263 | 945 | -318 |
| ZAP reimport no-change V2 | 69 | 75 | +6 |
| ZAP reimport no-change V3 | 87 | 102 | +15 |
| Product tag add → 100 locations (V3) | 316 | 125 | -191 |
| Product tag remove → 100 locations (V3) | 266 | 75 | -191 |
The small reimport-no-change regressions are the unavoidable read cost of
the per-batch helper (2 reads × Finding + 2 reads × Endpoint/Location + 1
product tags read). Real-work imports drop significantly because per-row
`_manage_inherited_tags` work no longer fires inside the loop.
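
The trade-off between a fixed per-batch cost and a per-row cost can be illustrated with a toy model. This is only a sketch under simplifying assumptions: `CountingStore`, both `sync_*` functions, and the tag names are hypothetical, and each row is modelled as costing a single read, whereas the real paths measured in the table issue several queries per object.

```python
class CountingStore:
    """Toy stand-in for the database that counts reads."""

    def __init__(self, product_tags):
        self.product_tags = set(product_tags)
        self.reads = 0

    def read_product_tags(self):
        self.reads += 1
        return set(self.product_tags)


def sync_per_row(store, findings):
    # One product-tag read per finding, as a per-row handler would do.
    for tags in findings:
        tags |= store.read_product_tags()


def sync_per_batch(store, findings):
    # One product-tag read shared by the whole batch.
    inherited = store.read_product_tags()
    for tags in findings:
        tags |= inherited


per_row_store = CountingStore({"pci", "external"})
per_row_findings = [set() for _ in range(19)]
sync_per_row(per_row_store, per_row_findings)

batch_store = CountingStore({"pci", "external"})
batch_findings = [set() for _ in range(19)]
sync_per_batch(batch_store, batch_findings)

print(per_row_store.reads, batch_store.reads)  # prints: 19 1
```

Both strategies leave the findings with the same tags; only the number of reads differs. The batch read is paid even when there is nothing to update, which is the shape of the small no-change regression noted above.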