Skip to content

Commit 875e23d

Browse files
committed
schedule: zephyr_ll: fix use-after-free when freeing an active user-space task
zephyr_ll_task_sched_free() frees an active (RUNNING/RESCHEDULE) task by setting pdata->freeing and waiting on pdata->sem for the scheduler thread to remove the task from its run list before the memory is released. Under CONFIG_SOF_USERSPACE_LL this function runs in kernel context while pdata->sem is a sys_sem allocated on the user heap. sys_sem_take() returns -EINVAL immediately when called from kernel context, so the wait is a no-op: pdata is freed (and the struct task is subsequently freed by pipeline_free()) while the task is still linked in sch->tasks with n_tasks != 0 and the scheduling domain handler still set. Because n_tasks is non-zero, schedule_free() does not stop the LL timer, and the next timer tick runs zephyr_ll_run() over the dangling task, dereferencing freed memory and taking a load/store-privilege exception (EXCCAUSE 26) in the user-space LL thread. Stop relying on the cross-privilege semaphore handshake in this path. When the task must be waited for, mark it cancelled so that, should it actually be mid-execution on the scheduler's temporary list, it is removed via the cancel path without re-running task->run() on resources the caller may already have freed. If the task is still linked on the run list, the scheduler thread is provably not executing it (a running task is moved off sch->tasks with the lock dropped), so remove it directly and skip the wait. This guarantees the task is delisted (n_tasks -> 0, handler -> NULL) before pdata is freed, eliminating both the dangling list entry and the stray timer wakeups. Verified on PTL with the standalone user-space LL boot tests: the userspace_ll suite, including pipeline_two_components_user, now passes without the fatal exception at teardown. Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
1 parent 9b146ad commit 875e23d

1 file changed

Lines changed: 41 additions & 0 deletions

File tree

src/schedule/zephyr_ll.c

Lines changed: 41 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -118,6 +118,20 @@ static void zephyr_ll_task_done(struct zephyr_ll *sch,
118118
domain_unregister(sch->ll_domain, task, --sch->n_tasks);
119119
}
120120

121+
#if CONFIG_SOF_USERSPACE_LL
122+
/* The caller must hold the scheduler lock */
123+
static bool zephyr_ll_task_on_run_list(struct zephyr_ll *sch, struct task *task)
124+
{
125+
struct list_item *list;
126+
127+
list_for_item(list, &sch->tasks)
128+
if (list == &task->list)
129+
return true;
130+
131+
return false;
132+
}
133+
#endif
134+
121135
/* The caller must hold the lock and possibly disable interrupts */
122136
static void zephyr_ll_task_insert_unlocked(struct zephyr_ll *sch, struct task *task)
123137
{
@@ -469,6 +483,33 @@ static int zephyr_ll_task_free(void *data, struct task *task)
469483

470484
pdata->freeing = true;
471485

486+
#if CONFIG_SOF_USERSPACE_LL
487+
/*
488+
* Under CONFIG_SOF_USERSPACE_LL this function runs in kernel context
489+
* while the scheduler itself runs in a user-mode thread. The pdata->sem
490+
* handshake used below cannot synchronise across that privilege boundary
491+
* (sys_sem_take() returns -EINVAL when called from kernel context), so
492+
* relying on it would free an active task while it is still linked in
493+
* sch->tasks, leaving the scheduler to dereference freed memory on the
494+
* next timer tick.
495+
*
496+
* Instead remove the task directly here. Mark it cancelled first so that,
497+
* in the unlikely event it is currently mid-execution on the scheduler's
498+
* temporary list, it is removed via the cancel path without re-running
499+
* task->run() on resources the caller (e.g. pipeline_free()) may already
500+
* have freed. If the task is still linked on the run list, the scheduler
501+
* thread is not executing it (a running task is moved off sch->tasks with
502+
* the lock dropped), so it is safe to remove it now and skip the wait.
503+
*/
504+
if (must_wait) {
505+
task->state = SOF_TASK_STATE_CANCEL;
506+
if (zephyr_ll_task_on_run_list(sch, task)) {
507+
zephyr_ll_task_done(sch, task);
508+
must_wait = false;
509+
}
510+
}
511+
#endif
512+
472513
zephyr_ll_unlock(sch, &flags);
473514

474515
if (must_wait)

0 commit comments

Comments
 (0)