Skip to content

Commit 5e41abb

Browse files
committed
schedule: zephyr_ll: fix use-after-free when freeing an active user-space task
zephyr_ll_task_sched_free() frees an active (RUNNING/RESCHEDULE) task by setting pdata->freeing and waiting on pdata->sem for the scheduler thread to remove the task from its run list before the memory is released. Under CONFIG_SOF_USERSPACE_LL this function runs in kernel context while pdata->sem is a sys_sem allocated on the user heap. sys_sem_take() returns -EINVAL immediately when called from kernel context, so the wait is a no-op: pdata is freed (and the struct task is subsequently freed by pipeline_free()) while the task is still linked in sch->tasks with n_tasks != 0 and the scheduling domain handler still set. Because n_tasks is non-zero, schedule_free() does not stop the LL timer, and the next timer tick runs zephyr_ll_run() over the dangling task, dereferencing freed memory and taking a load/store-privilege exception (EXCCAUSE 26) in the user-space LL thread. Stop relying on the cross-privilege semaphore handshake in this path. When the task must be waited for, mark it cancelled so that, should it actually be mid-execution on the scheduler's temporary list, it is removed via the cancel path without re-running task->run() on resources the caller may already have freed. If the task is still linked on the run list, the scheduler thread is provably not executing it (a running task is moved off sch->tasks with the lock dropped), so remove it directly and skip the wait. This guarantees the task is delisted (n_tasks -> 0, handler -> NULL) before pdata is freed, eliminating both the dangling list entry and the stray timer wakeups. Verified on PTL with the standalone user-space LL boot tests: the userspace_ll suite, including pipeline_two_components_user, now passes without the fatal exception at teardown. Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
1 parent 9e3baae commit 5e41abb

1 file changed

Lines changed: 41 additions & 0 deletions

File tree

src/schedule/zephyr_ll.c

Lines changed: 41 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -118,6 +118,20 @@ static void zephyr_ll_task_done(struct zephyr_ll *sch,
118118
domain_unregister(sch->ll_domain, task, --sch->n_tasks);
119119
}
120120

121+
#if CONFIG_SOF_USERSPACE_LL
122+
/* The caller must hold the scheduler lock */
123+
static bool zephyr_ll_task_on_run_list(struct zephyr_ll *sch, struct task *task)
124+
{
125+
struct list_item *list;
126+
127+
list_for_item(list, &sch->tasks)
128+
if (list == &task->list)
129+
return true;
130+
131+
return false;
132+
}
133+
#endif
134+
121135
/* The caller must hold the lock and possibly disable interrupts */
122136
static void zephyr_ll_task_insert_unlocked(struct zephyr_ll *sch, struct task *task)
123137
{
@@ -459,6 +473,33 @@ static int zephyr_ll_task_free(void *data, struct task *task)
459473

460474
pdata->freeing = true;
461475

476+
#if CONFIG_SOF_USERSPACE_LL
477+
/*
478+
* Under CONFIG_SOF_USERSPACE_LL this function runs in kernel context
479+
* while the scheduler itself runs in a user-mode thread. The pdata->sem
480+
* handshake used below cannot synchronise across that privilege boundary
481+
* (sys_sem_take() returns -EINVAL when called from kernel context), so
482+
* relying on it would free an active task while it is still linked in
483+
* sch->tasks, leaving the scheduler to dereference freed memory on the
484+
* next timer tick.
485+
*
486+
* Instead remove the task directly here. Mark it cancelled first so that,
487+
* in the unlikely event it is currently mid-execution on the scheduler's
488+
* temporary list, it is removed via the cancel path without re-running
489+
* task->run() on resources the caller (e.g. pipeline_free()) may already
490+
* have freed. If the task is still linked on the run list, the scheduler
491+
* thread is not executing it (a running task is moved off sch->tasks with
492+
* the lock dropped), so it is safe to remove it now and skip the wait.
493+
*/
494+
if (must_wait) {
495+
task->state = SOF_TASK_STATE_CANCEL;
496+
if (zephyr_ll_task_on_run_list(sch, task)) {
497+
zephyr_ll_task_done(sch, task);
498+
must_wait = false;
499+
}
500+
}
501+
#endif
502+
462503
zephyr_ll_unlock(sch, &flags);
463504

464505
if (must_wait)

0 commit comments

Comments
 (0)