Commit 1c36ab5
net: macb: re-check ISR after IER re-enable in macb_tx_poll

macb_tx_poll() runs with TCOMP masked, drains the TX ring, then
calls napi_complete_done() and re-enables TCOMP via IER. An
existing comment in the function notes:

    /* Packet completions only seem to propagate to raise
     * interrupts when interrupts are enabled at the time, so if
     * packets were sent while interrupts were disabled,
     * they will not cause another interrupt to be generated when
     * interrupts are re-enabled.
     */

and mitigates this by calling macb_tx_complete_pending(), which
inspects driver-visible ring state (descriptor->ctrl, after rmb())
and reschedules NAPI if a completion is observable in memory.

On PCIe-attached parts (BCM2712 + RP1 on Raspberry Pi 5 is the
setup we have in front of us), the descriptor DMA write that sets
TX_USED may not have retired to system memory at the point
macb_tx_complete_pending() runs. The rmb() only orders the CPU's
own loads against one another; it does not force an in-flight
peripheral DMA write to retire. Under that ordering the in-memory
descriptor can still read TX_USED=0 even though the hardware has
completed the frame. The check then returns false, NAPI exits, the
quirk above keeps the re-enabled TCOMP interrupt from firing, and
the ring goes quiescent.
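
The window described above can be illustrated with a small userspace toy model. This is only a sketch: the struct, the toy_* helpers and the TOY_TX_USED bit position are hypothetical stand-ins, not driver code, and the "in flight" state is a simplification of a posted PCIe write that has left the device but has not yet landed in system memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit position only; not taken from the driver headers. */
#define TOY_TX_USED (1u << 31)

/* Toy model: one TX descriptor seen from both ends of the PCIe link. */
struct toy_desc {
	uint32_t ctrl_in_memory;  /* what a CPU load returns right now */
	uint32_t ctrl_in_flight;  /* posted DMA write that has not retired */
	bool     write_in_flight;
};

/* Device finishes a frame: the TX_USED write is posted, not yet retired. */
static void toy_hw_complete_frame(struct toy_desc *d)
{
	assert(!d->write_in_flight); /* the model tracks a single write */
	d->ctrl_in_flight = d->ctrl_in_memory | TOY_TX_USED;
	d->write_in_flight = true;
}

/* The posted write eventually lands in system memory. */
static void toy_retire_dma(struct toy_desc *d)
{
	if (d->write_in_flight) {
		d->ctrl_in_memory = d->ctrl_in_flight;
		d->write_in_flight = false;
	}
}

/* Memory-only check, in the spirit of macb_tx_complete_pending(). */
static bool toy_tx_complete_pending(const struct toy_desc *d)
{
	/* A CPU read barrier here would not retire the in-flight write. */
	return (d->ctrl_in_memory & TOY_TX_USED) != 0;
}
```

In this model, calling toy_hw_complete_frame() and then toy_tx_complete_pending() returns false until toy_retire_dma() runs, which is the ordering under which the memory-only check misses a completed frame.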
Add an explicit ISR read after the IER write. The MMIO read
serves two independent purposes:

  (1) It is an architected PCIe read barrier for earlier
      peripheral-originated DMA writes on the same path, so a
      subsequent macb_tx_complete_pending() observes any TX_USED
      write that was in flight at the time of the barrier.

  (2) It samples the hardware ISR directly, so a TCOMP bit that
      the hardware set while TCOMP was masked is visible here,
      independently of whether the descriptor DMA has retired.

If either signal indicates pending work, reschedule NAPI via the
same path as the existing check.

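
A minimal userspace sketch of the re-check logic follows. It is not the patch itself: every toy_* name, the struct layout and the bit positions are assumptions made for illustration. It models the MMIO read as pushing any in-flight descriptor write into memory, and it deliberately ignores clear-on-read or write-to-clear behaviour of the real ISR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit positions only; not taken from the driver headers. */
#define TOY_TX_USED (1u << 31)
#define TOY_TCOMP   (1u << 7)

struct toy_macb {
	uint32_t isr;             /* device interrupt status */
	uint32_t masked;          /* interrupt bits currently masked */
	uint32_t ctrl_in_memory;  /* descriptor word as the CPU sees it */
	bool     dma_in_flight;   /* TX_USED write posted, not yet retired */
	bool     irq_raised;
	bool     napi_rescheduled;
};

/* Device completes a frame; an IRQ is raised only if TCOMP is unmasked. */
static void toy_hw_complete_frame(struct toy_macb *bp)
{
	bp->dma_in_flight = true;
	bp->isr |= TOY_TCOMP;
	if (!(bp->masked & TOY_TCOMP))
		bp->irq_raised = true;
}

/* IER write: unmask only. Per the quirk, latched bits do not re-fire. */
static void toy_write_ier(struct toy_macb *bp, uint32_t bits)
{
	bp->masked &= ~bits;
}

/* MMIO read: its completion arrives after earlier posted device writes. */
static uint32_t toy_read_isr(struct toy_macb *bp)
{
	if (bp->dma_in_flight) {
		bp->ctrl_in_memory |= TOY_TX_USED;
		bp->dma_in_flight = false;
	}
	return bp->isr;
}

static bool toy_tx_complete_pending(const struct toy_macb *bp)
{
	return (bp->ctrl_in_memory & TOY_TX_USED) != 0;
}

/* Tail of the poll routine, with and without the added ISR re-check. */
static void toy_tx_poll_tail(struct toy_macb *bp, bool recheck_isr)
{
	uint32_t status = 0;

	assert(bp->masked & TOY_TCOMP); /* poll runs with TCOMP masked */
	toy_write_ier(bp, TOY_TCOMP);
	if (recheck_isr)
		status = toy_read_isr(bp);
	if ((status & TOY_TCOMP) || toy_tx_complete_pending(bp))
		bp->napi_rescheduled = true;
}
```

With recheck_isr set to false, a frame completed while TCOMP was masked leaves napi_rescheduled false in this model; with it set to true, both the sampled TCOMP bit and the now-visible TX_USED bit indicate pending work.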
This patch addresses one of three candidate races for the silent
TX stall described in the cover letter. Whether it is sufficient
by itself, or whether it requires the PCIe posted-write flush in
patch 1/3 to cover the observed behaviour, we have not yet
verified at runtime.
Link: cilium/cilium#43198
Link: https://bugs.launchpad.net/ubuntu/+source/linux-raspi/+bug/2133877
Signed-off-by: Lukasz Raczylo <lukasz@raczylo.com>

1 parent c20bbf3 commit 1c36ab5
1 file changed
Lines changed: 18 additions & 10 deletions
[Diff content not preserved. Single hunk covering original lines 2000-2016 and new lines 2000-2024: original lines 2003-2011 are replaced by new lines 2003-2018, and original line 2013 is replaced by new lines 2020-2021.]