Description:
PR #4901 fixed exception handling in TaskQueue.execute(), but TaskQueue.run() has equivalent unprotected paths:
TaskQueue
    if (execute.exec != currentExecutor) {
      tasks.addFirst(execute);
      execute.exec.execute(runner); // uncaught: any Throwable kills the runner
      currentExecutor = execute.exec; // never reached
      return;
    }
In the case above, if execute.exec.execute() throws, the assignment to currentExecutor is never reached and currentExecutor keeps its previous non-null value while the runner terminates. Subsequent calls to execute() therefore only enqueue tasks and never submit a new runner, so the event-bus pending task queue keeps growing with no consumer.
Possible fix:
    if (execute.exec != currentExecutor) {
      tasks.addFirst(execute);
      try {
        execute.exec.execute(runner);
      } catch (Throwable t) {
        tasks.pollFirst();      // remove the item added above
        currentExecutor = null; // allow the next execute() to submit a new runner
        throw t;                // propagate to the caller
      }
      currentExecutor = execute.exec;
      return;
    }
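The stall, and the effect of resetting the state on failure, can be illustrated with a small single-threaded model. This is only a sketch and not the Vert.x TaskQueue source: `StallSketch`, `MiniQueue`, `Task`, the `guarded` flag, and `pendingAfterFailure` are hypothetical names introduced for illustration, the model omits all locking, and it catches RuntimeException rather than Throwable to keep the example short.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.Executor;
import java.util.concurrent.RejectedExecutionException;

// Hypothetical single-threaded model of the stall; NOT the Vert.x TaskQueue source.
public class StallSketch {

  static final class Task {
    final Runnable body;
    final Executor exec;
    Task(Runnable body, Executor exec) { this.body = body; this.exec = exec; }
  }

  static final class MiniQueue {
    final Deque<Task> tasks = new ArrayDeque<>();
    Executor currentExecutor;            // non-null means "a runner is active"
    final boolean guarded;               // true = reset state when the hand-off fails
    final Runnable runner = this::run;

    MiniQueue(boolean guarded) { this.guarded = guarded; }

    void execute(Runnable body, Executor exec) {
      tasks.add(new Task(body, exec));
      if (currentExecutor == null) {     // only submit a runner when none is active
        currentExecutor = exec;
        exec.execute(runner);
      }
    }

    private void run() {
      for (;;) {
        Task next = tasks.poll();
        if (next == null) { currentExecutor = null; return; }
        if (next.exec != currentExecutor) {
          tasks.addFirst(next);
          try {
            next.exec.execute(runner);   // hand the runner over to the other executor
          } catch (RuntimeException e) {
            if (guarded) {
              tasks.pollFirst();         // remove the item added above
              currentExecutor = null;    // let the next execute() submit a new runner
            }
            throw e;
          }
          currentExecutor = next.exec;
          return;
        }
        next.body.run();
      }
    }
  }

  /** Returns how many tasks are still pending after a failed hand-off and two more submissions. */
  public static int pendingAfterFailure(boolean guarded) {
    MiniQueue queue = new MiniQueue(guarded);
    Executor direct = Runnable::run;     // runs the runner inline
    Executor failing = r -> { throw new RejectedExecutionException("simulated failure"); };
    try {
      // The first task enqueues a second task bound to the failing executor.
      queue.execute(() -> queue.execute(() -> {}, failing), direct);
    } catch (RejectedExecutionException expected) {
      // The hand-off failed; what matters is the queue state left behind.
    }
    queue.execute(() -> {}, direct);
    queue.execute(() -> {}, direct);
    return queue.tasks.size();
  }

  public static void main(String[] args) {
    System.out.println("unguarded pending = " + pendingAfterFailure(false)); // prints 3
    System.out.println("guarded pending = " + pendingAfterFailure(true));    // prints 0
  }
}
```

In the unguarded run, currentExecutor stays non-null after the failure, so the two later submissions only pile up behind the task that could not be handed off. In the guarded run, the failed task is dropped and later submissions are consumed normally.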
- Vert.x: 5.0.7
- JDK: OpenJDK 21.0.10+7-LTS
Do you have a reproducer?
Reproducer added: taskqueue-taskstall-reproducer
The reproducer uses TaskQueue directly. It relies on a Log4j2 IllegalStateException to simulate an uncaught exception that terminates the runner.
In our production environment, the task stall was caused by Log4j2's java.lang.IllegalStateException: Recursion depth became negative: -1, thrown by AbstractLogger.decrementRecursionDepth.
After that, tasks submitted to the event bus were no longer consumed. A Log4j issue should not break the internal TaskQueue used by Vert.x.
The reproducer specifically targets the log.error call at line 82, but exceptions could escape from other places as well.
The same class of fix should be applied to line 66 (resume.latch.run()) and lines 70-74 (execute.exec.execute(runner)), which share the same vulnerability pattern.