Limit chunked ingest chord fan-out to respect parser concurrency limits #2087

Open
JSv4 wants to merge 2 commits into main from codex/propose-fix-for-unbounded-chunk-processing

JSv4 (Collaborator) commented Jun 29, 2026

Motivation

  • The non-eager chunked ingest path previously built a Celery chord over every prepared chunk, bypassing the parser's max_concurrent_chunks throttling. An authenticated uploader could therefore enqueue an unbounded number of chunk tasks and degrade availability.

Description

  • Enforce the parser concurrency limit in the non-eager branch by only creating the Celery chord when len(chunk_inputs) <= parser_instance.max_concurrent_chunks in opencontractserver/tasks/doc_tasks.py.
  • When the chunk count exceeds max_concurrent_chunks, fall back to the existing in-process process_document path so the synchronous dispatcher and its thread-pool throttling still control parser-service concurrency.
  • Add a regression test test_ingest_doc_large_pdf_falls_back_when_chunk_count_exceeds_limit in opencontractserver/tests/test_chunk_tasks.py that asserts oversized chunk sets do not call self.replace(...) / dispatch a chord and instead use the synchronous path.
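
The branch described above can be sketched in isolation. This is a minimal illustration rather than the code in doc_tasks.py: `choose_ingest_path`, `ParserStub`, and the two callables are hypothetical names introduced here, standing in for the chord dispatch (`self.replace(...)`) and the in-process `process_document` path. Only `chunk_inputs` and `max_concurrent_chunks` come from the description.

```python
from dataclasses import dataclass
from typing import Any, Callable, Sequence


@dataclass
class ParserStub:
    """Hypothetical stand-in for the parser instance; only the limit matters here."""
    max_concurrent_chunks: int


def choose_ingest_path(
    chunk_inputs: Sequence[Any],
    parser_instance: ParserStub,
    dispatch_chord: Callable[[Sequence[Any]], Any],
    process_in_process: Callable[[Sequence[Any]], Any],
) -> Any:
    """Fan out as a chord only when the chunk count fits the parser limit.

    Both callables are assumptions: dispatch_chord represents the Celery
    chord path, process_in_process represents the throttled synchronous path.
    """
    if len(chunk_inputs) <= parser_instance.max_concurrent_chunks:
        return dispatch_chord(chunk_inputs)
    # Oversized chunk sets fall back to the throttled in-process path.
    return process_in_process(chunk_inputs)


# Usage: record which path each call takes.
calls = []
parser = ParserStub(max_concurrent_chunks=4)


def fake_chord(chunks):
    calls.append(("chord", len(chunks)))
    return "chord"


def fake_sync(chunks):
    calls.append(("sync", len(chunks)))
    return "sync"


small = choose_ingest_path(list(range(4)), parser, fake_chord, fake_sync)
large = choose_ingest_path(list(range(5)), parser, fake_chord, fake_sync)
```

With a limit of 4, the four-chunk input takes the chord path and the five-chunk input takes the synchronous path. The comparison is inclusive, matching the `<=` in the description.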

Testing

  • Static sanity check: compiled changed files with python -m py_compile opencontractserver/tasks/doc_tasks.py opencontractserver/tests/test_chunk_tasks.py (succeeded).
  • Django test run attempted: python manage.py test opencontractserver.tests.test_chunk_tasks --keepdb (blocked in this environment because the Django dependency is missing, so the tests were not executed end-to-end here).

Codex Task

codecov Bot commented Jun 29, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.