fix(taskprocessing): claim tasks atomically so parallel workers don't duplicate#61053
fix(taskprocessing): claim tasks atomically so parallel workers don't duplicate#61053bygadd wants to merge 3 commits into
Conversation
There was a problem hiding this comment.
Pull request overview
This PR fixes duplicate task execution when running multiple occ taskprocessing:worker processes in parallel by making task claiming atomic and preventing already-finished tasks from being re-claimed.
Changes:
- Updated
WorkerCommand::processNextTask()to loop until it can successfullylockTask()the next scheduled task (skipping tasks that can’t be claimed using an ignore list). - Tightened
TaskMapper::lockTask()to only claim tasks inSTATUS_SCHEDULED, preventing re-claiming completed tasks.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| lib/private/TaskProcessing/Db/TaskMapper.php | Restricts atomic claim update to scheduled tasks only. |
| core/Command/TaskProcessing/WorkerCommand.php | Adds atomic lock-and-retry loop when selecting the next task to process. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Production follow-up: this fixes duplicates, but the claim-loop can't keep up under backlog — proposing an atomic
|
|
@bygadd Thank you for the thorough investigation! Yes, please update this PR to the SKIP LOCKED approach! Please note that we have a dedicated process for adding new indexes on existing tables: The index-adding-code should be added in the existing migration that adds the table (or the last created column), and there should be code to add the index on existing instances using |
df826bb to
b00f32b
Compare
|
@marcelklehr updated to the
One open point from your note: I placed the index in the table-creating migration |
|
Hello, Please see https://github.com/nextcloud/server/blob/master/.github/CONTRIBUTING.md#ai-assisted-contributions regarding AI contributions. |
b00f32b to
1cd198a
Compare
|
Thanks @come-nc, and sorry for the missing disclosure — I've updated the commit message and will add a note to the PR description. To give some context on ownership: I run the production instance where this actually bit us. We had a few thousand core:text2text tasks piling up in the queue — mail summaries and the Assistant were barely draining and effectively unusable. I investigated it using Claude Code as a tool, but I drove the diagnosis and made the decisions. I validated the fix on my own instance: deployed it reversibly, injected a backlog, watched throughput go from ~0–5/min to ~79/min, and confirmed no duplicate processing and that started_at is recorded correctly. Happy to walk through any part of it. |
|
Thank you for updating the PR. It looks solid. |
Oracle is being difficult as always. |
|
I wonder if we should use this new mechanism in the ex app endpoint as well. It was designed when we didn't have support for kubernetes yet, but when running ex apps in kubernetes, there may be multiple instances competing for the same taks as well. |
Good point @marcelklehr — multiple ex-app instances under k8s can definitely race. Worth noting this PR already closes the correctness side there: the endpoint goes through lockTask, which now guards = SCHEDULED, so two instances can't both claim the same task — the loser just moves on. What's left is the efficiency win (dropping the fetch-then-lock ignore-list loop for the atomic SKIP LOCKED claim). That endpoint filters by preferred provider and has a batch variant, so I'd rather do it as a focused follow-up than widen this PR. Happy to open it. |
|
Pushed d20c487 with the review feedback addressed.
For Oracle, I switched it to the same bounded-retry fallback we use for SQLite. The issue was that Oracle does not allow the row-limiting clause together with I also cleaned up the ordering comments, added a regression test, and tested the full change on a production MariaDB instance with multiple workers. The backlog drains correctly, there are no duplicate picks, and |
That would be great! Thank you |
…ite index Replace the worker retry/ignore-list claim-loop with a single atomic SELECT ... FOR UPDATE SKIP LOCKED claim (SQLite bounded-retry fallback), preserving the no-duplicate guarantee while removing the thundering-herd contention that throttled backlog draining. Add a (status,type,last_updated) index via the table-creating migration + db:add-missing-indices listener. Signed-off-by: Yoan Bozhilov <bygadd@gmail.com> Assisted-by: Claude Code:claude-opus-4-8
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> Signed-off-by: Marcel Klehr <mklehr@gmx.net>
…Oracle fallback Address review feedback (@marcelklehr, Copilot): - lockTask claims only SCHEDULED tasks (was status != RUNNING) and stamps started_at in the same atomic UPDATE, so a finished task cannot be re-claimed and the external-provider claim path records started_at as well. - claimWithBoundedRetry re-reads after lockTask instead of a follow-up UPDATE. - Oracle joins SQLite on the bounded-retry fallback: Oracle cannot combine a row-limiting clause with FOR UPDATE (ORA-02014), which failed the claim tests on Oracle CI. - Reword the worker docblock/comments to "prefer oldest available" (parallel SKIP LOCKED does not guarantee a strict global order). - Add a regression test that lockTask does not resurrect a finished task. Signed-off-by: Yoan Bozhilov <bygadd@gmail.com> Assisted-by: Claude Code:claude-opus-4-8
d20c487 to
f1f1659
Compare
Developed with AI assistance (Claude Code)
Summary
Replace the TaskProcessing worker's retry/ignore-list claim loop with a single atomic
SELECT ... FOR UPDATE SKIP LOCKEDclaim, and add the supporting composite index via the table-creating migration plus thedb:add-missing-indiceslistener (per maintainer guidance). This supersedes the earlier dup-fix retry-loop approach on this PR.Production context
On a real deployment we observed a TaskProcessing backlog of ~6,200 scheduled tasks draining at only ~5 tasks/min with multiple workers running. Instrumentation showed the claim phase consumed ~99% of each worker's wall time — not the actual task execution. Two compounding faults:
SELECTed the same oldest scheduled row, then locked it separately. The losers built a per-workerid NOT IN (...)ignore list and retried, which made each subsequentSELECTprogressively heavier under contention.WHERE status = ? [AND type IN (...)] ORDER BY last_updated ASC LIMIT 1claim had no covering index. On a large tasks table this degraded to a full scan (observed ~11s on a 213k-row table), so the claim loop alone took 50–240s per task.After the change, the same backlog drains at ~79 tasks/min — the worker spends its time executing tasks, not contending for them.
Approach
TaskMapper::claimOldestScheduledTask()claims the oldest scheduled task in one atomic step. On MySQL/MariaDB/PostgreSQL it runsSELECT ... ORDER BY last_updated ASC LIMIT 1 FOR UPDATE SKIP LOCKEDinside a transaction, then a guardedUPDATEtoRUNNING. Concurrent workers skip rows already locked by another transaction, so no two workers ever claim the same task and no retry/ignore-list is needed. The no-duplicate guarantee is preserved.SQLite fallback. SKIP LOCKED is unsupported on SQLite (Doctrine throws), so we feature-detect via
IDBConnection::getDatabaseProvider()and fall back to the existing bounded lock-and-retry claim (claimWithBoundedRetry). This stays correct because theUPDATE ... WHERE status = SCHEDULEDis atomic and SQLite serialises writers.findOldestScheduledByType()/lockTask()are kept for the fallback and backward compatibility.Manager::claimNextScheduledTask()exposes the claim on the publicIManagerinterface.WorkerCommand::processNextTask()now makes a single claim call instead of fetch-then-lock:null-> no work, otherwise process. The old retry/ignore-list loop is removed.Composite index
taskp_status_type_updon(status, type, last_updated)so the ordered oldest-scheduled claim is index-served (~11s -> ms). Following @marcelklehr's guidance, the index is not a standalone migration: it is added to the table-creating migration (Version30000Date20240429122720) for fresh installs, and registered incore/Listener/AddMissingIndicesListener.phpso existing installs pick it up viaocc db:add-missing-indices.started_atrecorded at claim time. Since the worker now receives the task alreadyRUNNING, the laterManager::setTaskStatusSCHEDULED -> RUNNINGedge (which used to setstarted_at) no longer fires; the claim'sUPDATEpersistsstarted_atdirectly, so the timestamp is preserved.A task is only ever transitioned
SCHEDULED -> RUNNINGby claiming; it is never markedFAILEDby the claim path.Testing
composer cs:fix+composer cs:check: clean on the touched files.composer psalm: no new errors introduced by these changes.tests/lib/TaskProcessing/) green, including the SKIP LOCKED claim tests (claim returns the running task, returns null when nothing is schedulable, and never double-claims), astarted_at-recorded-at-claim test, and a SQL-shape test asserting the claim emitsFOR UPDATE/SKIP LOCKEDon non-SQLite.WorkerCommandTestupdated for the single-claim flow.taskprocessing:workerprocesses): an injected backlog drains via the SKIP LOCKED claim, no duplicates, andstarted_atis populated for every claimed task.occ db:add-missing-indicesis a no-op there because the index already exists (idempotenthasIndexcheck) — i.e. no re-index is triggered on installs that already have it.Note for reviewers
@marcelklehr suggested adding the index to "the existing migration that adds the table (or the last created column)." This PR adds it to the table-creating migration
Version30000. If you'd prefer the latest column-adding migration instead, that is a one-line move of the sameaddIndexcall to a different migration file — happy to adjust.