Commit 6451d58
sockmap: Fix sk_psock_drop() race vs sock_map_{unhash,close,destroy}().
syzbot reported a splat in sock_map_destroy() [0], where psock was
NULL even though sk->sk_prot still pointed to tcp_bpf_prots[][].
The stack trace shows how heavily the path was exercised:
inet_release() called tcp_close(), not sock_map_close(), and yet
sock_map_destroy() was eventually reached.
The root cause is a lack of synchronisation.
Even if sk_psock_get() fails to bump psock->refcnt, it does not
guarantee that sk_psock_drop() has finished, and thus sk->sk_prot
might not have been restored to the original one.
Commit 4b4647a ("sock_map: avoid race between sock_map_close
and sk_psock_put") attempted to address this, but it was insufficient
for two reasons.
It did not cover sock_map_unhash() and sock_map_destroy(), and
it missed the corner case where sk_psock() is NULL.
On non-x86 platforms, sk_psock_restore_proto(sk, psock) and
rcu_assign_sk_user_data(sk, NULL) can be reordered because there
is no address dependency between sk->sk_prot and sk->sk_user_data.
sk_psock_get() returning NULL implies nothing about sk->sk_prot.
Let's simply retry sk_psock_get() in the unlikely case.
Note that we cannot avoid the loop even if we added memory barriers
in sk_psock_drop() and sock_map_psock_get_checked().
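
The retry described above can be modelled in userspace. The following is a minimal single-threaded sketch, not the kernel patch: every name in it (fake_sock, fake_psock, psock_get(), hooked_close(), drop_one_step()) is hypothetical, and the concurrent sk_psock_drop() is simulated by advancing a "dropper" one store per retry, in the reordered sequence (user_data cleared first, proto restored last). It only illustrates why a failed get says nothing about sk_prot and why the close path has to loop until the proto is restored.

```c
#include <stdatomic.h>
#include <stddef.h>

struct fake_sock;
typedef int (*close_fn)(struct fake_sock *sk);

struct fake_proto {
	close_fn close;
};

struct fake_psock {
	atomic_int refcnt;              /* 0: teardown has started */
	const struct fake_proto *saved; /* proto to restore on drop */
};

struct fake_sock {
	_Atomic(const struct fake_proto *) prot;
	_Atomic(struct fake_psock *) user_data;
	int drop_step; /* progress of the simulated dropper */
	int retries;   /* how many times the close path looped */
};

int hooked_close(struct fake_sock *sk);

static int orig_close(struct fake_sock *sk)
{
	(void)sk;
	return 1; /* marks that the original handler ran */
}

static const struct fake_proto orig_proto = { .close = orig_close };
static const struct fake_proto hooked_proto = { .close = hooked_close };

/* Put the socket in the hooked state with the given psock refcount. */
void sock_init(struct fake_sock *sk, struct fake_psock *psock, int refcnt)
{
	psock->saved = &orig_proto;
	atomic_init(&psock->refcnt, refcnt);
	atomic_init(&sk->prot, &hooked_proto);
	atomic_init(&sk->user_data, psock);
	sk->drop_step = 0;
	sk->retries = 0;
}

/* Take a reference unless the refcount has already dropped to zero. */
static struct fake_psock *psock_get(struct fake_sock *sk)
{
	struct fake_psock *psock = atomic_load(&sk->user_data);
	int old;

	if (!psock)
		return NULL;
	old = atomic_load(&psock->refcnt);
	while (old != 0) {
		if (atomic_compare_exchange_weak(&psock->refcnt, &old, old + 1))
			return psock;
	}
	return NULL;
}

/* Stand-in for the other CPU: performs one store of the teardown per call. */
static void drop_one_step(struct fake_sock *sk)
{
	if (sk->drop_step == 0)
		atomic_store(&sk->user_data, (struct fake_psock *)NULL);
	else if (sk->drop_step == 1)
		atomic_store(&sk->prot, &orig_proto);
	sk->drop_step++;
}

int hooked_close(struct fake_sock *sk)
{
	for (;;) {
		struct fake_psock *psock = psock_get(sk);
		close_fn fn;

		if (psock) {
			fn = psock->saved->close;
			atomic_fetch_sub(&psock->refcnt, 1);
			return fn(sk);
		}
		/* No psock: only trust sk->prot once it is no longer ours. */
		fn = atomic_load(&sk->prot)->close;
		if (fn != hooked_close)
			return fn(sk);
		/* Teardown still in flight; here we advance it by hand. */
		drop_one_step(sk);
		sk->retries++;
	}
}
```

In this model, starting with refcnt == 0 makes psock_get() fail while both user_data and the hooked proto are still in place, so hooked_close() loops until the simulated dropper has restored the proto. In a real concurrent setting the loop would wait for the other CPU instead of calling drop_one_step() itself.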
Also note that sock_map_destroy() cannot be called from softirq
while sock_map_close() is still running.
This is because sock_map_destroy() requires SOCK_DEAD, so sock_map_destroy()
cannot happen until sock_map_close() has finished the saved_close()
(which is tcp_close()).
[0]:
WARNING: CPU: 1 PID: 8459 at net/core/sock_map.c:1667 sock_map_destroy+0x28b/0x2b0 net/core/sock_map.c:1667
Modules linked in:
CPU: 1 UID: 0 PID: 8459 Comm: syz.0.1109 Not tainted syzkaller #0 PREEMPT_{RT,(full)}
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/12/2025
RIP: 0010:sock_map_destroy+0x28b/0x2b0 net/core/sock_map.c:1667
Code: 8b 36 49 83 c6 38 4c 89 f0 48 c1 e8 03 42 80 3c 38 00 74 08 4c 89 f7 e8 93 62 22 f9 4d 8b 3e e9 79 ff ff ff e8 a6 2b c3 f8 90 <0f> 0b 90 eb 9c e8 9b 2b c3 f8 4c 89 e7 be 03 00 00 00 e8 0e 4e bc
RSP: 0018:ffffc9000d067be8 EFLAGS: 00010293
RAX: ffffffff88fb30aa RBX: ffff888024832000 RCX: ffff888024283b80
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000
R10: dffffc0000000000 R11: ffffed100862e946 R12: dffffc0000000000
R13: ffff888024832000 R14: ffffffff995b2208 R15: ffffffff88fb2e20
FS: 0000555579a7d500(0000) GS:ffff8881269c2000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00002000000048c0 CR3: 000000003713a000 CR4: 00000000003526f0
Call Trace:
<TASK>
inet_csk_destroy_sock+0x166/0x3a0 net/ipv4/inet_connection_sock.c:1294
__tcp_close+0xcc1/0xfd0 net/ipv4/tcp.c:3262
tcp_close+0x28/0x110 net/ipv4/tcp.c:3274
inet_release+0x144/0x190 net/ipv4/af_inet.c:435
__sock_release net/socket.c:649 [inline]
sock_close+0xc0/0x240 net/socket.c:1439
__fput+0x45b/0xa80 fs/file_table.c:468
task_work_run+0x1d4/0x260 kernel/task_work.c:227
resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
exit_to_user_mode_loop+0xec/0x110 kernel/entry/common.c:43
exit_to_user_mode_prepare include/linux/irq-entry-common.h:225 [inline]
syscall_exit_to_user_mode_work include/linux/entry-common.h:175 [inline]
syscall_exit_to_user_mode include/linux/entry-common.h:210 [inline]
do_syscall_64+0x2bd/0x3b0 arch/x86/entry/syscall_64.c:100
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f265847ebe9
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffd158dfbd8 EFLAGS: 00000246 ORIG_RAX: 00000000000001b4
RAX: 0000000000000000 RBX: 000000000002ddb0 RCX: 00007f265847ebe9
RDX: 0000000000000000 RSI: 000000000000001e RDI: 0000000000000003
RBP: 00007f26586a7da0 R08: 0000000000000001 R09: 0000000e158dfecf
R10: 0000001b30a20000 R11: 0000000000000246 R12: 00007f26586a5fac
R13: 00007f26586a5fa0 R14: ffffffffffffffff R15: 00007ffd158dfcf0
</TASK>
Fixes: 1aa12bd ("bpf: sockmap, add sock close() hook to remove socks")
Fixes: b05545e ("bpf: sockmap, fix transition through disconnect without close")
Fixes: d8616ee ("bpf, sockmap: Fix sk->sk_forward_alloc warn_on in sk_stream_kill_queues")
Reported-by: syzbot+b0842d38af58376d1fdc@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/bpf/69cec5ef.050a0220.2dbe29.0009.GAE@google.com/
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Reviewed-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Link: https://patch.msgid.link/20260420194846.1089595-1-kuniyu@google.com
1 parent 375e4e3 commit 6451d58
1 file changed
Lines changed: 25 additions & 14 deletions
Diff hunks (line contents not captured):
@@ -1630,18 +1630,23 @@ (7 additions, 2 deletions)
@@ -1652,20 +1657,25 @@ (7 additions, 2 deletions)
@@ -1676,32 +1686,33 @@ (11 additions, 10 deletions)