perf(watson): prefetch relations + force async indexing (#14881)
* perf(watson): prefetch relations + force async indexing
Watson's SearchAdapter resolves __-separated relation paths via per-instance
getattr, triggering an N+1 query storm during async indexing. For Finding
(test__engagement__product__name + jira_issue__jira_key) and Vulnerability_Id
(finding__test__engagement__product__name) on a 1000-row batch this adds
thousands of extra queries per task.
dojo/utils_watson_prefetch.py auto-derives select_related / prefetch_related
paths from each adapter's fields/store by walking model._meta, then applies
them in update_watson_search_index_for_model. Toggle:
DD_WATSON_INDEX_PREFETCH_ENABLED (default True). On any error we log loudly
and fall back to the plain queryset so indexing still completes.
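A minimal, Django-free sketch of the derivation idea follows. It assumes that single-valued hops (foreign key, one-to-one) can go to select_related while any path crossing a multi-valued hop goes to prefetch_related. The `SCHEMA` table, the `tags` relation, and the `derive_paths` name are hypothetical stand-ins for walking `model._meta`; the real helper in dojo/utils_watson_prefetch.py may classify paths differently.

```python
# Hypothetical toy schema: model -> {attribute: (kind, target_model)}.
# kind is "field" (plain column), "single" (FK / one-to-one)
# or "many" (reverse FK / many-to-many).
SCHEMA = {
    "Finding": {
        "title": ("field", None),
        "test": ("single", "Test"),
        "jira_issue": ("single", "JiraIssue"),
        "tags": ("many", "Tag"),  # assumed relation, for illustration only
    },
    "Test": {"engagement": ("single", "Engagement")},
    "Engagement": {"product": ("single", "Product")},
    "Product": {"name": ("field", None)},
    "JiraIssue": {"jira_key": ("field", None)},
    "Tag": {"name": ("field", None)},
}


def derive_paths(model, adapter_fields):
    """Split adapter field paths into (select_related, prefetch_related) lists."""
    select, prefetch = set(), set()
    for path in adapter_fields:
        current, hops, many_seen, valid = model, [], False, True
        for segment in path.split("__"):
            entry = SCHEMA.get(current, {}).get(segment)
            if entry is None:
                valid = False  # unknown attribute: drop the whole path
                break
            kind, target = entry
            if kind == "field":
                break  # reached a plain column; no further joins needed
            hops.append(segment)
            many_seen = many_seen or kind == "many"
            current = target
        if valid and hops:
            (prefetch if many_seen else select).add("__".join(hops))
    return sorted(select), sorted(prefetch)


select, prefetch = derive_paths(
    "Finding",
    ["title", "test__engagement__product__name", "jira_issue__jira_key",
     "tags__name", "bogus__name"],
)
```

In this sketch the trailing column name is stripped, so `test__engagement__product__name` contributes the relation path `test__engagement__product`, and the unknown `bogus__name` path is dropped silently.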
Also adds force_async=True to dojo_dispatch_task / we_want_async — keeps
the watson indexer in the background even when the caller is a
block_execution=True user, since index updates are slow and never need
to be synchronous from the user's perspective.
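The precedence described above can be sketched as a small decision function. The name `wants_async` and its keyword-only signature are assumptions for illustration; the actual we_want_async in the codebase takes different arguments and inspects the calling user.

```python
def wants_async(*, sync=False, block_execution=False, force_async=False):
    """Return True when the task should be sent to the background queue."""
    if force_async:
        # force_async wins over every synchronous hint.
        return True
    if sync or block_execution:
        # Either the caller or the user profile asked for inline execution.
        return False
    return True
```

Checking force_async first is what keeps the indexer in the background for a block_execution=True user.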
Tests:
- unittests/test_watson_index_prefetch.py (10 tests) — path classification
for Product/Finding/Vulnerability_Id/Endpoint, unknown-path drop, setting
toggle, derivation-raise fallback with log assertion.
- unittests/test_celery_dispatch_force_async.py (4 tests) — force_async
precedence over sync=True and block_execution.
* test(watson): fix CI failures from watson prefetch + force_async
- test_tag_inheritance_perf: update V2/V3 import baselines (-52 each)
to reflect adapter-derived select_related/prefetch_related in the
async watson indexer running inline under CELERY_TASK_ALWAYS_EAGER.
- test_watson_async_search_index: add CELERY_TASK_ALWAYS_EAGER=True to
the threshold=0 case. force_async=True now always dispatches via
apply_async; without eager mode the task never runs and the index
stays empty.
* perf(watson): intermediate flush + always-async index dispatch
Wrap watson.search_context_manager.add_to_context with a size-based hook
that drains the per-request context to async celery tasks as soon as it
reaches WATSON_ASYNC_INDEX_UPDATE_BATCH_SIZE, instead of waiting for
end-of-request. Bounds in-memory growth on long-running imports and lets
celery workers start indexing batches earlier (parallel fanout).
Hook installed once in dojo.apps.ready(). BATCH_SIZE doubles as
threshold; set to 0/negative to disable the intermediate flush.
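A rough sketch of the size-based hook, using a toy context object rather than watson's real search context manager. `ToyContext`, `install_flush_hook`, and the `drain` callback are hypothetical names; only the wrapping pattern is the point.

```python
class ToyContext:
    """Stand-in for a per-request search context holding pending objects."""

    def __init__(self):
        self.objects = set()

    def add_to_context(self, obj):
        self.objects.add(obj)


def install_flush_hook(context, batch_size, drain):
    """Wrap add_to_context so the context is drained once it reaches batch_size."""
    if batch_size <= 0:
        return  # 0 or negative disables the intermediate flush

    original = context.add_to_context

    def add_and_maybe_flush(obj):
        original(obj)
        if len(context.objects) >= batch_size:
            drain(context.objects)

    context.add_to_context = add_and_maybe_flush


dispatched = []


def drain(objects):
    # A real drain would hand the batch to a celery task; here we record it.
    dispatched.append(sorted(objects))
    objects.clear()


ctx = ToyContext()
install_flush_hook(ctx, 3, drain)
for i in range(7):
    ctx.add_to_context(i)
```

With a batch size of 3 and seven additions, two batches are drained mid-request and one object is left for the end-of-request path.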
Drop WATSON_ASYNC_INDEX_UPDATE_THRESHOLD: index dispatch is now
unconditionally async. Removes the sub-threshold sync branch (which
blocked the request on _bulk_save_search_entries) and the
disable-async path.
Consolidate _extract_tasks_for_async + _trigger_async_index_update +
_dispatch_async_index_batches + _flush_search_context_intermediate into
one helper `_drain_search_context_to_async` that groups, dispatches,
and discards entries from the set in place. With the set drained,
watson's end() bulk-saves an empty iterator — no explicit invalidate()
needed.
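The group, dispatch, and discard steps can be sketched as follows. The entry shape `(model_name, pk)`, the `dispatch` callback, and the function name `drain_to_async` are assumptions made for this example; the real `_drain_search_context_to_async` works on watson's own context entries and dispatches celery tasks.

```python
from collections import defaultdict


def drain_to_async(pending, batch_size, dispatch):
    """Group pending entries by model, dispatch them in batches, empty the set in place.

    batch_size is assumed to be a positive integer.
    """
    by_model = defaultdict(list)
    for model_name, pk in pending:
        by_model[model_name].append(pk)

    for model_name, pks in sorted(by_model.items()):
        pks.sort()
        for start in range(0, len(pks), batch_size):
            dispatch(model_name, pks[start:start + batch_size])

    # Emptying the same set object means whoever iterates it later sees nothing.
    pending.clear()


calls = []
pending = {("Finding", 1), ("Finding", 2), ("Finding", 3), ("Product", 7)}
drain_to_async(pending, 2, lambda model, pks: calls.append((model, pks)))
```

Because the set is cleared in place rather than replaced, any code holding a reference to it (such as an end-of-request save) iterates over an empty collection, which matches the "no explicit invalidate()" note above.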
Tests:
- test_watson_intermediate_flush: new — drain dispatches + clears,
threshold-triggered hook, threshold=0 disables, invalid context skips.
- test_watson_async_search_index: collapse three threshold-variant
tests into one, class-level CELERY_TASK_ALWAYS_EAGER=True.
- test_tag_inheritance_perf: reimport no-change baselines V2 69→74,
V3 87→92 (always-async path adds 5 queries vs prior sub-threshold
sync branch).
* upgrade notes
* test(watson): query-count assertion for prefetch helper
Lock in the N+1 elimination claim directly with CaptureQueriesContext —
previously only observed indirectly via the ZAP import perf test.
* test(watson): supply Product.description in flush hook fixtures
CI runs the V3_FEATURE_LOCATIONS=True matrix where BaseModel.save calls
full_clean — Product.description is blank=False, so the bare fixture
ValidationErrors out. Local default (V3 off) skips validation, masking
this in the prior run.
docs/content/releases/os_upgrading/2.59.md
5 additions & 0 deletions
@@ -88,4 +88,9 @@ As announced in DefectDojo 2.57.0, the Stub Findings feature has been removed. T
 
 Any requests to this endpoint will now return a 404 Not Found error. The Stub Findings UI is no longer available.
 
+## Configuration change in Watson Search Indexing
+
+In [PR 14881](https://github.com/DefectDojo/django-DefectDojo/pull/14881) we optimized the way the Django Watson search index is updated during imports and reimports. There is now a single configuration setting to manage the threshold: `DD_WATSON_ASYNC_INDEX_UPDATE_BATCH_SIZE`. The default value should work fine for most instances.
+
+
 For more information, check the [Release Notes](https://github.com/DefectDojo/django-DefectDojo/releases/tag/2.59.0).