Commit bf35444
kernel: macb silent TX stall — v2 patch series
Update the three net-macb silent-TX-stall patches from RFC v1 to
PATCH net-next v2. The v2 series is on lore at:
https://lore.kernel.org/netdev/20260514215459.36109-1-lukasz@raczylo.com/T/
v2 changes from v1 (already merged in #1526):
* 0001 (PCIe posted-write flush after TSTART doorbell) — now gated
behind a new MACB_CAPS_PCIE_POSTED_WRITES capability, set only on
raspberrypi_rp1_config. v1 applied the readback to every macb
variant; SoC-integrated parts (Atmel, Microchip, SiFive, Xilinx)
have no fabric posted-write concern and were paying the
non-posted-read latency for nothing.
* 0002 (PCIe read barrier before TX completion descriptor check) —
replaces the v1 form, which was a regression on read-clear ISR
silicon. v1 read ISR with a TCOMP mask in macb_tx_poll(); on
raspberrypi_rp1_config (where MACB_CAPS_ISR_CLEAR_ON_WRITE is
not set) that read consumes every bit set in ISR, but the
use-site masks down to TCOMP and discards the rest -- any
RCOMP / ROVR / TXUBR bit at that instant is silently consumed
and the IRQ handler that would have processed it sees ISR=0.
On a level-triggered IRQ, clearing the bit deasserts the line
before the GIC delivers it, so the interrupt is lost entirely.
Caught by self-audit while preparing v2; disclosed to netdev in
the reply thread.
v2 replaces the destructive ISR read with (void)queue_readl(
queue, IMR), the read-only mask mirror -- non-destructive, same
PCIe-barrier effect on prior peripheral DMA writes.
* 0003 (TX stall watchdog) — tracks tail movement via a bool flag
set by macb_tx_complete() instead of a tx_tail snapshot
(form suggested by Phil Elwell on raspberrypi/linux#7340).
Adds a netif_carrier_ok() gate at the top of the watchdog tick
so the false positive seen on boots where autonegotiation is
still pending no longer fires (observed at a ~25% rate during a
rolling reboot of a 24-node fleet under v1). Swaps
netdev_warn_once for netdev_warn_ratelimited so operators can
count real events across the netdev lifetime.
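The read-clear hazard described for 0002 can be reproduced in a small user-space model. This is only a sketch: `struct sim_queue`, the `sim_*` helpers and `barrier_v1()` / `barrier_v2()` are hypothetical names invented for illustration and are not the driver's code, and the bit positions are illustrative constants. The model assumes a status register that clears every pending bit on read, as on silicon where MACB_CAPS_ISR_CLEAR_ON_WRITE is not set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative bit positions only; treat them as assumptions. */
#define SIM_ISR_RCOMP (1u << 1)
#define SIM_ISR_TCOMP (1u << 7)

/* Toy model of one queue's interrupt registers (hypothetical). */
struct sim_queue {
    uint32_t isr; /* pending status; every bit clears on read */
    uint32_t imr; /* mask mirror; reading it has no side effect */
};

/* Read-to-clear status: return the pending bits, then drop them all. */
static uint32_t sim_read_isr(struct sim_queue *q)
{
    uint32_t pending = q->isr;

    q->isr = 0;
    return pending;
}

/* Plain register read: state is left untouched. */
static uint32_t sim_read_imr(const struct sim_queue *q)
{
    return q->imr;
}

/* v1-style barrier: destructive read, caller keeps only TCOMP. */
static uint32_t barrier_v1(struct sim_queue *q)
{
    assert(q != NULL);
    return sim_read_isr(q) & SIM_ISR_TCOMP;
}

/* v2-style barrier: non-destructive read, result discarded. */
static void barrier_v2(struct sim_queue *q)
{
    assert(q != NULL);
    (void)sim_read_imr(q);
}

/* What an IRQ handler running afterwards would find pending. */
static uint32_t irq_handler_sees(struct sim_queue *q)
{
    return sim_read_isr(q);
}
```

With RCOMP and TCOMP both pending, calling `barrier_v1()` leaves nothing for `irq_handler_sees()` to find, so the RX completion is lost. After `barrier_v2()` the handler still sees both bits.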
The v2 patch 2 IMR-barrier form has been running in production on a
24-node Raspberry Pi 5 fleet since 2026-05-14; ~190 cumulative
node-hours so far, zero mid-runtime TX stalls, zero user-space
watchdog RECOVER events. The pre-patch baseline (~0.5 stalls per
node-hour at fleet level) would have predicted ~95 mid-runtime
stalls in that window; none were observed.
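As a rough check of those figures, assume stalls arrive independently at a constant rate. This Poisson model is an assumption made here for illustration, not something the series claims. The expected count and the chance of observing none are then:

```latex
\lambda = 0.5\ \tfrac{\text{stalls}}{\text{node-hour}} \times 190\ \text{node-hours} = 95,
\qquad
P(N = 0) = e^{-\lambda} = e^{-95} \approx 5.5 \times 10^{-42}
```

Under that assumption, zero observed stalls is very unlikely to be chance, although real stalls may cluster by node or workload, which the model ignores.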
Related:
* netdev v1 RFC thread: https://lore.kernel.org/netdev/cover.1777064117.git.lukasz@raczylo.com/T/
* netdev v2 series: https://lore.kernel.org/netdev/20260514215459.36109-1-lukasz@raczylo.com/T/
* raspberrypi/linux merge: raspberrypi/linux#7340
* raspberrypi/linux v2 PR: raspberrypi/linux#7369
Signed-off-by: Lukasz Raczylo <lukasz@raczylo.com>
1 parent c5a1685 commit bf35444
6 files changed
Lines changed: 333 additions & 315 deletions
File tree
- kernel/build/patches
Lines changed: 72 additions & 48 deletions
Lines changed: 66 additions & 0 deletions
Lines changed: 0 additions & 106 deletions
This file was deleted.