Commit dbdcd06
feat(celery Wave 5 P4): production robustness 3-pack — cleanup transient/intentional split + parse_version short-circuit + reconciler stuck-parse re-enqueue
Wave 5 Phase 4 (PR-C / task #29) closes three latent issues surfaced
through the Wave 4 ratify trail:
* **P4-1 cleanup transient-vs-intentional split**
(per architect msg=6aa8ca88 T2 ratify minor obs A):
Pre-Wave-5, ``_resolve_cleanup_worker`` collapsed any factory
exception into "drop the DB row". Transient infra errors
(Qdrant blip, ES network glitch) lost the retry signal because the
DB row was deleted before the next cycle could retry. Wave 5 P4
returns a new ``CleanupWorkerResolution(worker, transient)`` that
distinguishes the two cases: ``WorkerFactoryError`` is a by-design
gate (drop the row to bound index growth); any other ``Exception``
is transient (skip the DB row drop so the next cycle retries once
the backend recovers). Counts surface ``transient_deferred`` so
operators can track the recovery rate.
* **P4-2 parse_version short-circuit**
(per huangheng T3 chunk 2 obs B + Wave 5 backlog item):
Pre-Wave-5, ``parse_document`` always re-ran DocParser even when
the resulting artifact directory was byte-identical (parse_version
is content-derived). Wave 5 P4 adds a
``short_circuit_if_artifacts_exist`` parameter (default ``True``):
if all three derived artifacts (markdown.md / outline.json /
chunks.jsonl) already exist under the canonical
``derived/parse_<version>/`` path, DocParser and the writes are
skipped entirely. This eliminates the ~30s OCR / Word rerun cost on
rebuilds of unchanged content. Tests can pass ``False`` to force a
re-parse.
* **P4-3 reconciler stuck-document parse re-enqueue**
(per architect Wave 4 T3 chunk 2 obs A — production gap close):
Pre-Wave-5, parse failures (DocParser raise / source missing)
silently dropped the document_id in the parse worker; the operator
saw ``document.status == PENDING`` forever, with no signal to
re-trigger. Wave 5 P4 adds
``reconcile_stuck_documents_for_parse_reenqueue`` to the
reconciler loop. It detects documents with
``Document.gmt_created < now - cooldown_seconds`` AND zero
``document_index`` rows AND ``Document.gmt_updated < now -
cooldown_seconds`` (the cooldown filter prevents a 30-s tick storm),
then pushes a fresh ``ParseDispatchPayload`` matching the upload
handler's contract. ``gmt_updated`` bumps after each push so
the cooldown predicate rate-limits re-enqueue.
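The P4-3 selection predicate and cooldown bump can be illustrated with a minimal in-memory sketch. It is not the reconciler's SQL query: the ``Doc`` dataclass, ``select_stuck``, ``mark_reenqueued``, and the 300-second cooldown are hypothetical stand-ins (the commit names ``STUCK_PARSE_COOLDOWN_SECONDS`` but does not state its value).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Assumed value for illustration only; the real constant's value is not given.
COOLDOWN_SECONDS = 300


@dataclass
class Doc:
    """Hypothetical stand-in for a Document row."""
    id: str
    gmt_created: datetime
    gmt_updated: datetime
    index_row_count: int  # number of document_index rows for this document


def select_stuck(docs, now, cooldown_seconds=COOLDOWN_SECONDS):
    """Pick docs created and last updated before the cutoff, with zero index rows."""
    cutoff = now - timedelta(seconds=cooldown_seconds)
    return [
        d for d in docs
        if d.gmt_created < cutoff
        and d.gmt_updated < cutoff
        and d.index_row_count == 0
    ]


def mark_reenqueued(docs, now):
    """Bump gmt_updated so the cooldown predicate skips these docs for a while."""
    for d in docs:
        d.gmt_updated = now


now = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
old = now - timedelta(hours=1)
docs = [
    Doc("stuck", old, old, 0),    # old, never indexed: should be re-enqueued
    Doc("indexed", old, old, 3),  # already has index rows: skipped
    Doc("fresh", now, now, 0),    # just uploaded: still inside the cooldown
]

first = select_stuck(docs, now)
mark_reenqueued(first, now)
# One reconciler tick later, the bumped gmt_updated suppresses a second push.
second = select_stuck(docs, now + timedelta(seconds=30))
```

In this sketch the bump is what turns a per-tick scan into a rate-limited one: a document that keeps failing is re-selected only after a full cooldown has elapsed since its last push.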
Three-class tag (Wave 3 production-readiness invariant):
* must-be-real: cleanup transient/intentional split prevents
silent retry-signal loss; parse_version short-circuit eliminates
OCR rerun waste; stuck-parse reconciler closes the document.status
PENDING-forever gap
* may-be-gated: ``short_circuit_if_artifacts_exist=False`` for
tests pinning DocParser invocation count
* fully-resolves: 3 of Wave 5 P1 backlog items (per architect
ratify trail msg=6aa8ca88 + huangheng obs trail)
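The gated flag above can be illustrated with a minimal sketch of the short-circuit, assuming the three artifact file names and the ``derived/parse_<version>/`` layout from the message. The function bodies, the string return values, the ``fake_parser`` callable, and the literal version ``"v1"`` are hypothetical; the real ``parse_document`` returns a ``ParseResult`` and derives the version from content.

```python
import tempfile
from pathlib import Path

ARTIFACT_NAMES = ("markdown.md", "outline.json", "chunks.jsonl")


def all_artifacts_present(artifact_dir: Path) -> bool:
    """True only when every derived artifact already exists as a file."""
    return all((artifact_dir / name).is_file() for name in ARTIFACT_NAMES)


def parse_document(root: Path, version: str, run_parser,
                   short_circuit_if_artifacts_exist: bool = True) -> str:
    """Hypothetical simplification: returns a label instead of a ParseResult."""
    artifact_dir = root / "derived" / f"parse_{version}"
    if short_circuit_if_artifacts_exist and all_artifacts_present(artifact_dir):
        return "short-circuited"  # skip the parser and all writes
    artifact_dir.mkdir(parents=True, exist_ok=True)
    for name, body in run_parser().items():
        (artifact_dir / name).write_text(body)
    return "parsed"


calls = []


def fake_parser():
    """Stand-in for DocParser that records each invocation."""
    calls.append(1)
    return {name: "" for name in ARTIFACT_NAMES}


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    outcomes = [
        parse_document(root, "v1", fake_parser),  # first run writes artifacts
        parse_document(root, "v1", fake_parser),  # second run is skipped
        parse_document(root, "v1", fake_parser,
                       short_circuit_if_artifacts_exist=False),  # forced re-parse
    ]
```

A test that pins the DocParser invocation count would pass ``False`` so that the count does not depend on leftover artifacts from an earlier run.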
Deltas:
* ``aperag/indexing/cleanup.py`` — ``CleanupWorkerResolution``
dataclass + WorkerFactoryError-vs-Exception split in
``_resolve_cleanup_worker``; both ``cleanup_orphan_parse_versions``
+ ``cleanup_for_deleted_documents`` honor ``resolution.transient``
to skip DB row drop on transient infra failures; collection
cascade aggregates ``transient_deferred``.
* ``aperag/indexing/parser.py`` — ``_all_artifacts_present``
predicate + ``short_circuit_if_artifacts_exist`` parameter +
early-return ParseResult when all 3 artifacts present.
* ``aperag/indexing/reconciler.py`` — ``STUCK_PARSE_COOLDOWN_SECONDS``
+ ``reconcile_stuck_documents_for_parse_reenqueue`` async scan +
``_select_stuck_documents_for_reenqueue`` SQL query +
``_build_parse_payload_for_document`` payload reconstruction +
``_resolve_collection_parser_config`` / ``_resolve_collection_modalities``
(mirror ``document_service`` shape) + ``_mark_stuck_documents_reenqueued``
bumps gmt_updated; ``run_reconcile_loop`` calls the new scan.
* ``aperag/indexing/__init__.py`` — re-export new symbols.
* ``tests/integration/test_p4_robustness_3pack.py`` (new) — 13
tests across 3 layers (cleanup transient/intentional split / parse
short-circuit / reconciler re-enqueue with cooldown + payload
shape + skip-when-indexed + skip-when-no-object-path).
* ``tests/unit_test/indexing/test_t3_1_dispatcher_path_c.py`` —
added ``transient_deferred: 0`` to expected empty-counts dict.
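As a rough illustration of the ``cleanup.py`` delta, the sketch below shows one way a resolution object could separate the by-design gate from transient failures. Only the names ``CleanupWorkerResolution``, ``WorkerFactoryError``, and ``transient_deferred`` come from the message; the function bodies, the ``dropped`` counter, and the three example factories are assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Optional


class WorkerFactoryError(Exception):
    """Stand-in for the by-design gate exception named in the commit."""


@dataclass(frozen=True)
class CleanupWorkerResolution:
    worker: Optional[object]
    transient: bool


def resolve_cleanup_worker(factory: Callable[[], object]) -> CleanupWorkerResolution:
    """Classify factory outcomes: worker built, intentional gate, or transient failure."""
    try:
        return CleanupWorkerResolution(worker=factory(), transient=False)
    except WorkerFactoryError:
        # Intentional gate: no worker, but the row may still be dropped.
        return CleanupWorkerResolution(worker=None, transient=False)
    except Exception:
        # Anything else is treated as transient: keep the row for the next cycle.
        return CleanupWorkerResolution(worker=None, transient=True)


def cleanup_rows(rows):
    """Hypothetical driver: drop rows unless the resolution is transient."""
    counts = {"dropped": 0, "transient_deferred": 0}
    remaining = {}
    for row_id, factory in rows.items():
        resolution = resolve_cleanup_worker(factory)
        if resolution.transient:
            counts["transient_deferred"] += 1
            remaining[row_id] = factory  # retry signal preserved
            continue
        # Backend cleanup would run here when resolution.worker is not None.
        counts["dropped"] += 1
    return counts, remaining


def healthy():
    return object()


def gated():
    raise WorkerFactoryError("index type disabled")


def flaky():
    raise ConnectionError("backend unreachable")


counts, remaining = cleanup_rows({"a": healthy, "b": gated, "c": flaky})
```

The order of the ``except`` clauses matters here: ``WorkerFactoryError`` must be caught before the generic ``Exception`` handler, otherwise the intentional gate would be misclassified as transient.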
Local gates:
* ``pytest tests/unit_test/indexing/ tests/integration/`` — 232
passed / 48 skipped (incl. 13 new ``test_p4_robustness_3pack``).
* ``ruff check aperag/ tests/integration/test_p4_robustness_3pack.py``
— clean.
* ``ruff format`` — applied.
Branch: local ``chenyexuan/celery-wave5-p4`` based on main
``19d3d70`` (Wave 4 PR #1731 squash); will rebase onto
``bryce/celery-wave5`` once Bryce opens the Wave 5 draft PR.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
1 parent e070716 commit dbdcd06
6 files changed
Lines changed: 1014 additions & 49 deletions
File tree
- aperag/indexing
- tests
- integration
- unit_test/indexing