Commit 1c36ab5

net: macb: re-check ISR after IER re-enable in macb_tx_poll

macb_tx_poll() runs with TCOMP masked, drains the TX ring, then calls
napi_complete_done() and re-enables TCOMP via IER. An existing comment
in the function notes:

    /* Packet completions only seem to propagate to raise
     * interrupts when interrupts are enabled at the time, so if
     * packets were sent while interrupts were disabled,
     * they will not cause another interrupt to be generated when
     * interrupts are re-enabled.
     */

and mitigates this by calling macb_tx_complete_pending(), which
inspects driver-visible ring state (descriptor->ctrl, after rmb()) and
reschedules NAPI if a completion is observable in memory.

On PCIe-attached parts (BCM2712 + RP1 on Raspberry Pi 5 is the setup
we have in front of us), the descriptor DMA write that sets TX_USED may
not have retired to system memory at the point
macb_tx_complete_pending() runs. The rmb() only orders the CPU's own
reads; it does not force an in-flight peripheral DMA write to retire.
Under that ordering the in-memory descriptor can still read TX_USED=0
when the hardware has in fact completed the frame; the check returns
false; NAPI exits; the quirk above prevents the re-enabled IER from
re-firing; the ring goes quiescent.

Add an explicit ISR read after the IER write. The MMIO read serves two
independent purposes:

(1) It acts as an architected PCIe read barrier for earlier
    peripheral-originated DMA writes on the same path, so a subsequent
    macb_tx_complete_pending() observes any TX_USED write that was in
    flight at the time of the barrier.

(2) It samples the hardware ISR directly, so a TCOMP bit that the
    hardware set while TCOMP was masked is visible here, independently
    of whether the descriptor DMA has retired.

If either signal indicates pending work, reschedule NAPI via the same
path as the existing check.

This patch addresses one of three candidate races for the silent TX
stall described in the cover letter. We have not yet verified at
runtime whether it is sufficient by itself or also requires the PCIe
posted-write flush in patch 1/3 to cover the observed behaviour.

Link: cilium/cilium#43198
Link: https://bugs.launchpad.net/ubuntu/+source/linux-raspi/+bug/2133877
Signed-off-by: Lukasz Raczylo <lukasz@raczylo.com>
1 parent c20bbf3 commit 1c36ab5

1 file changed

Lines changed: 18 additions & 10 deletions

drivers/net/ethernet/cadence/macb_main.c

@@ -2000,17 +2000,25 @@ static int macb_tx_poll(struct napi_struct *napi, int budget)
 	if (work_done < budget && napi_complete_done(napi, work_done)) {
 		queue_writel(queue, IER, MACB_BIT(TCOMP));
 
-		/* Packet completions only seem to propagate to raise
-		 * interrupts when interrupts are enabled at the time, so if
-		 * packets were sent while interrupts were disabled,
-		 * they will not cause another interrupt to be generated when
-		 * interrupts are re-enabled.
-		 * Check for this case here to avoid losing a wakeup. This can
-		 * potentially race with the interrupt handler doing the same
-		 * actions if an interrupt is raised just after enabling them,
-		 * but this should be harmless.
+		/*
+		 * TCOMP events that fire while the interrupt is masked do
+		 * not re-fire when IER is re-enabled. Catch this two ways
+		 * to avoid losing a wakeup:
+		 *
+		 * (1) Read ISR -- catches completions the hardware flagged
+		 *     but that we did not see as an interrupt. The MMIO
+		 *     read doubles as a PCIe read barrier, flushing any
+		 *     in-flight descriptor TX_USED DMA writes into memory.
+		 * (2) macb_tx_complete_pending() inspects the ring after
+		 *     that flush, catching a descriptor whose TX_USED is
+		 *     now visible as a result of the barrier.
+		 *
+		 * This can race with the interrupt handler taking the same
+		 * path if an interrupt fires just after the IER write;
+		 * rescheduling NAPI in that case is harmless.
 		 */
-		if (macb_tx_complete_pending(queue)) {
+		if ((queue_readl(queue, ISR) & MACB_BIT(TCOMP)) ||
+		    macb_tx_complete_pending(queue)) {
 			queue_writel(queue, IDR, MACB_BIT(TCOMP));
 			if (bp->caps & MACB_CAPS_ISR_CLEAR_ON_WRITE)
 				queue_writel(queue, ISR, MACB_BIT(TCOMP));
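
The ordering argument in the commit message can be illustrated with a small user-space model. This is only a sketch, not driver code: every name below (struct model_dev, the model_* helpers, MODEL_TCOMP) is hypothetical, and the model simply assumes the two behaviours the commit message describes, namely that a completion latched while TCOMP is masked raises no interrupt later, and that an MMIO read pushes a pending descriptor write into memory. The commit itself notes that this has not yet been verified at runtime.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only. Names are hypothetical; the bit position is arbitrary. */
#define MODEL_TCOMP (1u << 0)

struct model_dev {
	uint32_t isr;          /* device-side interrupt status */
	uint32_t enabled;      /* interrupt bits currently unmasked */
	bool irq_raised;       /* an interrupt reached the CPU */
	bool dma_in_flight;    /* TX_USED write posted, not yet in memory */
	bool desc_tx_used;     /* TX_USED as the CPU sees it in memory */
};

/* Device finishes a frame: posts the descriptor write and latches TCOMP. */
static void model_hw_complete(struct model_dev *d)
{
	d->dma_in_flight = true;
	d->isr |= MODEL_TCOMP;
	/* Assumed quirk: an interrupt is raised only if unmasked right now. */
	if (d->enabled & MODEL_TCOMP)
		d->irq_raised = true;
}

/* IER write: unmasks the bit but does not replay already-latched events. */
static void model_write_ier(struct model_dev *d, uint32_t bits)
{
	d->enabled |= bits;
}

/* ISR read: assumed to flush any posted descriptor write before returning. */
static uint32_t model_read_isr(struct model_dev *d)
{
	if (d->dma_in_flight) {
		d->desc_tx_used = true;
		d->dma_in_flight = false;
	}
	return d->isr;
}

/* Memory-only check, standing in for macb_tx_complete_pending(). */
static bool model_complete_pending(const struct model_dev *d)
{
	return d->desc_tx_used;
}

/* Exit path before the patch: returns true if NAPI would be rescheduled. */
static bool model_poll_exit_old(struct model_dev *d)
{
	model_write_ier(d, MODEL_TCOMP);
	return model_complete_pending(d);
}

/* Exit path after the patch: ISR read first, then the memory check. */
static bool model_poll_exit_new(struct model_dev *d)
{
	model_write_ier(d, MODEL_TCOMP);
	return (model_read_isr(d) & MODEL_TCOMP) || model_complete_pending(d);
}

/* Runs both exit paths on the same event; bit 0 = old path rescheduled,
 * bit 1 = new path rescheduled.
 */
static int model_demo(void)
{
	struct model_dev old_path = {0};
	struct model_dev new_path = {0};

	model_hw_complete(&old_path);
	model_hw_complete(&new_path);

	/* TCOMP was masked, so neither completion raised an interrupt. */
	assert(!old_path.irq_raised && !new_path.irq_raised);

	bool old_resched = model_poll_exit_old(&old_path);
	bool new_resched = model_poll_exit_new(&new_path);

	return (old_resched ? 1 : 0) | (new_resched ? 2 : 0);
}
```

Under these assumptions the old exit path sees TX_USED=0 and leaves the ring idle, while the new path reschedules on the ISR bit alone and also leaves the descriptor visible to the memory check. The model says nothing about real RP1 behaviour; it only shows why the two checks in the patch are complementary.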