Fix three endianness issues in test_progs #32

kernel-patches-bot · 2020-09-10T05:47:18Z

Pull request for series with
subject: Fix three endianness issues in test_progs
version: 1
url: https://patchwork.ozlabs.org/project/netdev/list/?series=200681

kernel-patches-bot · 2020-09-10T05:47:19Z

Master branch: 8081ede
series: https://patchwork.ozlabs.org/project/netdev/list/?series=200681
version: 1

patch https://patchwork.ozlabs.org/project/netdev/patch/20200909232443.3099637-2-iii@linux.ibm.com/ applied successfully
patch https://patchwork.ozlabs.org/project/netdev/patch/20200909232443.3099637-3-iii@linux.ibm.com/ applied successfully
patch https://patchwork.ozlabs.org/project/netdev/patch/20200909232443.3099637-4-iii@linux.ibm.com/ applied successfully

the resulting int. It is ok on little endian, but not on big endian. Fix by checking char instead. Fixes: 8a027dc ("selftests/bpf: add sockopt test that exercises sk helpers") Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> --- tools/testing/selftests/bpf/prog_tests/sockopt_sk.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)

endian architecture, and therefore fails on s390. Fix by introducing LSB and LSW macros and using them to perform narrow loads. Fixes: 0ab5539 ("selftests/bpf: Tests for BPF_SK_LOOKUP attach point") Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> --- .../selftests/bpf/progs/test_sk_lookup.c | 264 ++++++++++-------- 1 file changed, 149 insertions(+), 115 deletions(-)

This sort of works on x86 (unless followed by non-0), but hard fails on s390. Fix by using long instead of int. Fixes: 2d7824f ("selftests: bpf: Add test for sk_assign") Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> --- tools/testing/selftests/bpf/prog_tests/sk_assign.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)

kernel-patches-bot · 2020-09-10T21:26:48Z

Master branch: 2f7de98
series: https://patchwork.ozlabs.org/project/netdev/list/?series=200681
version: 1

patch https://patchwork.ozlabs.org/project/netdev/patch/20200909232443.3099637-2-iii@linux.ibm.com/ applied successfully
patch https://patchwork.ozlabs.org/project/netdev/patch/20200909232443.3099637-3-iii@linux.ibm.com/ applied successfully
patch https://patchwork.ozlabs.org/project/netdev/patch/20200909232443.3099637-4-iii@linux.ibm.com/ applied successfully

kernel-patches-bot · 2020-09-11T00:22:17Z

At least one diff in series https://patchwork.ozlabs.org/project/netdev/list/?series=200681 expired. Closing PR.

When the dwc2 platform device is removed, it unregisters the generic phy. usb_remove_phy() is called and the dwc2 usb_phy is removed from the "phy_list", but the uevent may still attempt to get the usb_phy from the list, resulting in a page fault bug. Currently we can't access the usb_phy from the "phy_list" after the device is removed. As a fix check to make sure that we can get the usb_phy before moving forward with the uevent. [ 84.949345] BUG: unable to handle page fault for address:00000007935688d8 [ 84.949349] #PF: supervisor read access in kernel mode [ 84.949351] #PF: error_code(0x0000) - not-present page [ 84.949353] PGD 0 P4D 0 [ 84.949356] Oops: 0000 [#1] SMP PTI [ 84.949360] CPU: 2 PID: 2081 Comm: rmmod Not tainted 5.13.0-rc4-snps-16547-ga8534cb092d7-dirty #32 [ 84.949363] Hardware name: Hewlett-Packard HP Z400 Workstation/0B4Ch, BIOS 786G3 v03.54 11/02/2011 [ 84.949365] RIP: 0010:usb_phy_uevent+0x99/0x121 [ 84.949372] Code: 8d 83 f8 00 00 00 48 3d b0 12 22 94 74 05 4c 3b 23 75 5b 8b 83 9c 00 00 00 be 32 00 00 00 48 8d 7c 24 04 48 c7 c2 d4 5d 7b 93 <48> 8b 0c c5 e0 88 56 93 e8 0f 63 8a ff 8b 83 98 00 00 00 be 32 00 [ 84.949375] RSP: 0018:ffffa46bc0f2fc70 EFLAGS: 00010246 [ 84.949378] RAX: 00000000ffffffff RBX: ffffffff942211b8 RCX: 0000000000000027 [ 84.949380] RDX: ffffffff937b5dd4 RSI: 0000000000000032 RDI: ffffa46bc0f2fc74 [ 84.949383] RBP: ffff94a306613000 R08: 0000000000000000 R09: 00000000fffeffff [ 84.949385] R10: ffffa46bc0f2faa8 R11: ffffa46bc0f2faa0 R12: ffff94a30186d410 [ 84.949387] R13: ffff94a32d188a80 R14: ffff94a30029f960 R15: ffffffff93522dd0 [ 84.949389] FS: 00007efdbd417540(0000) GS:ffff94a513a80000(0000) knlGS:0000000000000000 [ 84.949392] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 84.949394] CR2: 00000007935688d8 CR3: 0000000165606000 CR4: 00000000000006e0 [ 84.949396] Call Trace: [ 84.949401] dev_uevent+0x190/0x1ad [ 84.949408] kobject_uevent_env+0x18e/0x46c [ 84.949414] device_release_driver_internal+0x17f/0x18e [ 84.949418] bus_remove_device+0xd3/0xe5 [ 84.949421] device_del+0x1c3/0x31d [ 84.949425] ? kobject_put+0x97/0xa8 [ 84.949428] platform_device_del+0x1c/0x63 [ 84.949432] platform_device_unregister+0xa/0x11 [ 84.949436] dwc2_pci_remove+0x1e/0x2c [dwc2_pci] [ 84.949440] pci_device_remove+0x31/0x81 [ 84.949445] device_release_driver_internal+0xea/0x18e [ 84.949448] driver_detach+0x68/0x72 [ 84.949450] bus_remove_driver+0x63/0x82 [ 84.949453] pci_unregister_driver+0x1a/0x75 [ 84.949457] __do_sys_delete_module+0x149/0x1e9 [ 84.949462] ? task_work_run+0x64/0x6e [ 84.949465] ? exit_to_user_mode_prepare+0xd4/0x10d [ 84.949471] do_syscall_64+0x5d/0x70 [ 84.949475] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 84.949480] RIP: 0033:0x7efdbd563bcb [ 84.949482] Code: 73 01 c3 48 8b 0d c5 82 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 95 82 0c 00 f7 d8 64 89 01 48 [ 84.949485] RSP: 002b:00007ffe944d7d98 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0 [ 84.949489] RAX: ffffffffffffffda RBX: 00005651072eb700 RCX: 00007efdbd563bcb [ 84.949491] RDX: 000000000000000a RSI: 0000000000000800 RDI: 00005651072eb768 [ 84.949493] RBP: 00007ffe944d7df8 R08: 0000000000000000 R09: 0000000000000000 [ 84.949495] R10: 00007efdbd5dfac0 R11: 0000000000000206 R12: 00007ffe944d7fd0 [ 84.949497] R13: 00007ffe944d8610 R14: 00005651072eb2a0 R15: 00005651072eb700 [ 84.949500] Modules linked in: uas configfs dwc2_pci(-) phy_generic fuse crc32c_intel [last unloaded: udc_core] [ 84.949508] CR2: 00000007935688d8 [ 84.949510] ---[ end trace e40c871ca3e4dc9e ]--- [ 84.949512] RIP: 0010:usb_phy_uevent+0x99/0x121 Fixes: a8534cb ("usb: phy: introduce usb_phy device type with its own uevent handler") Reviewed-by: Peter Chen <peter.chen@kernel.org> Signed-off-by: Artur Petrosyan <Arthur.Petrosyan@synopsys.com> Signed-off-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com> Link: https://lore.kernel.org/r/20210710092247.D7AFEA005D@mailhost.synopsys.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

The following error is reported when running "./test_progs -t for_each" under arm64: bpf_jit: multi-func JIT bug 58 != 56 ...... JIT doesn't support bpf-to-bpf calls The root cause is the size of BPF_PSEUDO_FUNC instruction increases from 2 to 3 after the address of called bpf-function is settled and there are two bpf-to-bpf calls in test_pkt_access. The generated instructions are shown below: >before callback_fn is jited, its addr is 0x1-00000001 0x48: 21 00 C0 D2 movz x1, #0x1, lsl #32 0x4c: 21 00 80 F2 movk x1, #0x1 >after callback_fn is jited, its addr is 0xfffffe0017f2fb84 0x48: E1 3F C0 92 movn x1, #0x1ff, lsl #32 0x4c: 41 FE A2 F2 movk x1, #0x17f2, lsl #16 0x50: 81 70 9F F2 movk x1, #0xfb84 Fixing it by using emit_addr_mov_i64() for BPF_PSEUDO_FUNC, so the size of jited image will not change. Fixes: 69c087b ("bpf: Add bpf_for_each_map_elem() helper") Signed-off-by: Hou Tao <houtao1@huawei.com>

The following error is reported when running "./test_progs -t for_each" under arm64: bpf_jit: multi-func JIT bug 58 != 56 [...] JIT doesn't support bpf-to-bpf calls The root cause is the size of BPF_PSEUDO_FUNC instruction increases from 2 to 3 after the address of called bpf-function is settled and there are two bpf-to-bpf calls in test_pkt_access. The generated instructions are shown below: 0x48: 21 00 C0 D2 movz x1, #0x1, lsl #32 0x4c: 21 00 80 F2 movk x1, #0x1 0x48: E1 3F C0 92 movn x1, #0x1ff, lsl #32 0x4c: 41 FE A2 F2 movk x1, #0x17f2, lsl #16 0x50: 81 70 9F F2 movk x1, #0xfb84 Fixing it by using emit_addr_mov_i64() for BPF_PSEUDO_FUNC, so the size of jited image will not change. Fixes: 69c087b ("bpf: Add bpf_for_each_map_elem() helper") Signed-off-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20211231151018.3781550-1-houtao1@huawei.com

The following error is reported when running "./test_progs -t for_each" under arm64: bpf_jit: multi-func JIT bug 58 != 56 ...... JIT doesn't support bpf-to-bpf calls The root cause is the size of BPF_PSEUDO_FUNC instruction increases from 2 to 3 after the address of called bpf-function is settled and there are two bpf-to-bpf calls in test_pkt_access. The generated instructions are shown below: >before callback_fn is jited, its addr is 0x1-00000001 0x48: 21 00 C0 D2 movz x1, #0x1, lsl #32 0x4c: 21 00 80 F2 movk x1, #0x1 >after callback_fn is jited, its addr is 0xfffffe0017f2fb84 0x48: E1 3F C0 92 movn x1, #0x1ff, lsl #32 0x4c: 41 FE A2 F2 movk x1, #0x17f2, lsl #16 0x50: 81 70 9F F2 movk x1, #0xfb84 Fixing it by using emit_addr_mov_i64() for BPF_PSEUDO_FUNC, so the size of jited image will not change. Fixes: 69c087b ("bpf: Add bpf_for_each_map_elem() helper") Signed-off-by: Hou Tao <houtao1@huawei.com>

Rafael reports that on a system with LX2160A and Marvell DSA switches, if a reboot occurs while the DSA master (dpaa2-eth) is up, the following panic can be seen: systemd-shutdown[1]: Rebooting. Unable to handle kernel paging request at virtual address 00a0000800000041 [00a0000800000041] address between user and kernel address ranges Internal error: Oops: 96000004 [#1] PREEMPT SMP CPU: 6 PID: 1 Comm: systemd-shutdow Not tainted 5.16.5-00042-g8f5585009b24 #32 pc : dsa_slave_netdevice_event+0x130/0x3e4 lr : raw_notifier_call_chain+0x50/0x6c Call trace: dsa_slave_netdevice_event+0x130/0x3e4 raw_notifier_call_chain+0x50/0x6c call_netdevice_notifiers_info+0x54/0xa0 __dev_close_many+0x50/0x130 dev_close_many+0x84/0x120 unregister_netdevice_many+0x130/0x710 unregister_netdevice_queue+0x8c/0xd0 unregister_netdev+0x20/0x30 dpaa2_eth_remove+0x68/0x190 fsl_mc_driver_remove+0x20/0x5c __device_release_driver+0x21c/0x220 device_release_driver_internal+0xac/0xb0 device_links_unbind_consumers+0xd4/0x100 __device_release_driver+0x94/0x220 device_release_driver+0x28/0x40 bus_remove_device+0x118/0x124 device_del+0x174/0x420 fsl_mc_device_remove+0x24/0x40 __fsl_mc_device_remove+0xc/0x20 device_for_each_child+0x58/0xa0 dprc_remove+0x90/0xb0 fsl_mc_driver_remove+0x20/0x5c __device_release_driver+0x21c/0x220 device_release_driver+0x28/0x40 bus_remove_device+0x118/0x124 device_del+0x174/0x420 fsl_mc_bus_remove+0x80/0x100 fsl_mc_bus_shutdown+0xc/0x1c platform_shutdown+0x20/0x30 device_shutdown+0x154/0x330 __do_sys_reboot+0x1cc/0x250 __arm64_sys_reboot+0x20/0x30 invoke_syscall.constprop.0+0x4c/0xe0 do_el0_svc+0x4c/0x150 el0_svc+0x24/0xb0 el0t_64_sync_handler+0xa8/0xb0 el0t_64_sync+0x178/0x17c It can be seen from the stack trace that the problem is that the deregistration of the master causes a dev_close(), which gets notified as NETDEV_GOING_DOWN to dsa_slave_netdevice_event(). But dsa_switch_shutdown() has already run, and this has unregistered the DSA slave interfaces, and yet, the NETDEV_GOING_DOWN handler attempts to call dev_close_many() on those slave interfaces, leading to the problem. The previous attempt to avoid the NETDEV_GOING_DOWN on the master after dsa_switch_shutdown() was called seems improper. Unregistering the slave interfaces is unnecessary and unhelpful. Instead, after the slaves have stopped being uppers of the DSA master, we can now reset to NULL the master->dsa_ptr pointer, which will make DSA start ignoring all future notifier events on the master. Fixes: 0650bf5 ("net: dsa: be compatible with masters which unregister on shutdown") Reported-by: Rafael Richter <rafael.richter@gin.de> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>

The BPF STX/LDX instruction uses offset relative to the FP to address stack space. Since the BPF_FP locates at the top of the frame, the offset is usually a negative number. However, arm64 str/ldr immediate instruction requires that offset be a positive number. Therefore, this patch tries to convert the offsets. The method is to find the negative offset furthest from the FP firstly. Then add it to the FP, calculate a bottom position, called FPB, and then adjust the offsets in other STR/LDX instructions relative to FPB. FPB is saved using the callee-saved register x27 of arm64 which is not used yet. Before adjusting the offset, the patch checks every instruction to ensure that the FP does not change in run-time. If the FP may change, no offset is adjusted. For example, for the following bpftrace command: bpftrace -e 'kprobe:do_sys_open { printf("opening: %s\n", str(arg1)); }' Without this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: mov x25, sp 1c: mov x26, #0x0 // #0 20: bti j 24: sub sp, sp, #0x90 28: add x19, x0, #0x0 2c: mov x0, #0x0 // #0 30: mov x10, #0xffffffffffffff78 // #-136 34: str x0, [x25, x10] 38: mov x10, #0xffffffffffffff80 // #-128 3c: str x0, [x25, x10] 40: mov x10, #0xffffffffffffff88 // #-120 44: str x0, [x25, x10] 48: mov x10, #0xffffffffffffff90 // #-112 4c: str x0, [x25, x10] 50: mov x10, #0xffffffffffffff98 // #-104 54: str x0, [x25, x10] 58: mov x10, #0xffffffffffffffa0 // #-96 5c: str x0, [x25, x10] 60: mov x10, #0xffffffffffffffa8 // #-88 64: str x0, [x25, x10] 68: mov x10, #0xffffffffffffffb0 // #-80 6c: str x0, [x25, x10] 70: mov x10, #0xffffffffffffffb8 // #-72 74: str x0, [x25, x10] 78: mov x10, #0xffffffffffffffc0 // #-64 7c: str x0, [x25, x10] 80: mov x10, #0xffffffffffffffc8 // #-56 84: str x0, [x25, x10] 88: mov x10, #0xffffffffffffffd0 // #-48 8c: str x0, [x25, x10] 90: mov x10, #0xffffffffffffffd8 // #-40 94: str x0, [x25, x10] 98: mov x10, #0xffffffffffffffe0 // #-32 9c: str x0, [x25, x10] a0: mov x10, #0xffffffffffffffe8 // #-24 a4: str x0, [x25, x10] a8: mov x10, #0xfffffffffffffff0 // #-16 ac: str x0, [x25, x10] b0: mov x10, #0xfffffffffffffff8 // #-8 b4: str x0, [x25, x10] b8: mov x10, #0x8 // #8 bc: ldr x2, [x19, x10] [...] With this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: stp x27, x28, [sp, #-16]! 1c: mov x25, sp 20: sub x27, x25, #0x88 24: mov x26, #0x0 // #0 28: bti j 2c: sub sp, sp, #0x90 30: add x19, x0, #0x0 34: mov x0, #0x0 // #0 38: str x0, [x27] 3c: str x0, [x27, #8] 40: str x0, [x27, #16] 44: str x0, [x27, #24] 48: str x0, [x27, #32] 4c: str x0, [x27, #40] 50: str x0, [x27, #48] 54: str x0, [x27, #56] 58: str x0, [x27, #64] 5c: str x0, [x27, #72] 60: str x0, [x27, #80] 64: str x0, [x27, #88] 68: str x0, [x27, #96] 6c: str x0, [x27, #104] 70: str x0, [x27, #112] 74: str x0, [x27, #120] 78: str x0, [x27, #128] 7c: ldr x2, [x19, #8] [...] Signed-off-by: Xu Kuohai <xukuohai@huawei.com>

Every time run this BPF selftests (./test_sockmap) on a Loongarch platform, a Kernel panic occurs: ''' Oops[kernel-patches#1]: CPU: 20 PID: 23245 Comm: test_sockmap Tainted: G OE 6.10.0-rc2+ kernel-patches#32 Hardware name: LOONGSON Dabieshan/Loongson-TC542F0, BIOS Loongson-UDK2018 ... ... ra: 90000000043a315c tcp_bpf_sendmsg+0x23c/0x420 ERA: 900000000426cd1c sk_msg_memcopy_from_iter+0xbc/0x220 CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE) PRMD: 0000000c (PPLV0 +PIE +PWE) EUEN: 00000007 (+FPE +SXE +ASXE -BTE) ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7) ESTAT: 00010000 [PIL] (IS= ECode=1 EsubCode=0) BADV: 0000000000000040 PRID: 0014c011 (Loongson-64bit, Loongson-3C5000) Modules linked in: tls xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT Process test_sockmap (pid: 23245, threadinfo=00000000aeb68043, task=...) Stack : ... ... ... Call Trace: [<900000000426cd1c>] sk_msg_memcopy_from_iter+0xbc/0x220 [<90000000043a315c>] tcp_bpf_sendmsg+0x23c/0x420 [<90000000041cafc8>] __sock_sendmsg+0x68/0xe0 [<90000000041cc4bc>] ____sys_sendmsg+0x2bc/0x360 [<90000000041cea18>] ___sys_sendmsg+0xb8/0x120 [<90000000041cf1f8>] __sys_sendmsg+0x98/0x100 [<90000000045b76ec>] do_syscall+0x8c/0xc0 [<90000000030e1da4>] handle_syscall+0xc4/0x160 Code: ... ---[ end trace 0000000000000000 ]--- ''' This crash is because a NULL pointer is passed to page_address() in sk_msg_memcopy_from_iter(). Due to the difference in architecture, page_address(0) will not trigger a panic on the X86 platform but will panic on the Loogarch platform. So this bug was hidden on the x86 platform, but now it is exposed on the Loogarch platform. This bug is a logic error indeed. In sk_msg_memcopy_from_iter(), an invalid "sge" is always used: if (msg->sg.copybreak >= sge->length) { msg->sg.copybreak = 0; sk_msg_iter_var_next(i); if (i == msg->sg.end) break; sge = sk_msg_elem(msg, i); } If the value of i is 2, msg->sg.end is also 2 when entering this if block. sk_msg_iter_var_next() increases i by 1, and now i is 3, which is no longer equal to msg->sg.end. The break will not be triggered, and the next sge obtained by sk_msg_elem(3) will be an invalid one. The correct approach is to check (i == msg->sg.end) first, and then invoke sk_msg_iter_var_next() if they are not equal. Fixes: 604326b ("bpf, sockmap: convert to generic sk_msg interface") Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Signed-off-by: NipaLocal <nipa@local>

Every time run this BPF selftests (./test_sockmap) on a Loongarch platform, a Kernel panic occurs: ''' Oops[kernel-patches#1]: CPU: 20 PID: 23245 Comm: test_sockmap Tainted: G OE 6.10.0-rc2+ kernel-patches#32 Hardware name: LOONGSON Dabieshan/Loongson-TC542F0, BIOS Loongson-UDK2018 ... ... ra: 90000000043a315c tcp_bpf_sendmsg+0x23c/0x420 ERA: 900000000426cd1c sk_msg_memcopy_from_iter+0xbc/0x220 CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE) PRMD: 0000000c (PPLV0 +PIE +PWE) EUEN: 00000007 (+FPE +SXE +ASXE -BTE) ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7) ESTAT: 00010000 [PIL] (IS= ECode=1 EsubCode=0) BADV: 0000000000000040 PRID: 0014c011 (Loongson-64bit, Loongson-3C5000) Modules linked in: tls xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT Process test_sockmap (pid: 23245, threadinfo=00000000aeb68043, task=...) Stack : ... ... ... Call Trace: [<900000000426cd1c>] sk_msg_memcopy_from_iter+0xbc/0x220 [<90000000043a315c>] tcp_bpf_sendmsg+0x23c/0x420 [<90000000041cafc8>] __sock_sendmsg+0x68/0xe0 [<90000000041cc4bc>] ____sys_sendmsg+0x2bc/0x360 [<90000000041cea18>] ___sys_sendmsg+0xb8/0x120 [<90000000041cf1f8>] __sys_sendmsg+0x98/0x100 [<90000000045b76ec>] do_syscall+0x8c/0xc0 [<90000000030e1da4>] handle_syscall+0xc4/0x160 Code: ... ---[ end trace 0000000000000000 ]--- ''' This crash is because a NULL pointer is passed to page_address() in sk_msg_memcopy_from_iter(). Due to the difference in architecture, page_address(0) will not trigger a panic on the X86 platform but will panic on the Loogarch platform. So this bug was hidden on the x86 platform, but now it is exposed on the Loogarch platform. This bug is a logic error indeed. In sk_msg_memcopy_from_iter(), an invalid "sge" is always used: if (msg->sg.copybreak >= sge->length) { msg->sg.copybreak = 0; sk_msg_iter_var_next(i); if (i == msg->sg.end) break; sge = sk_msg_elem(msg, i); } If the value of i is 2, msg->sg.end is also 2 when entering this if block. sk_msg_iter_var_next() increases i by 1, and now i is 3, which is no longer equal to msg->sg.end. The break will not be triggered, and the next sge obtained by sk_msg_elem(3) will be an invalid one. The correct approach is to check (i == msg->sg.end) first, and then invoke sk_msg_iter_var_next() if they are not equal. Fixes: 604326b ("bpf, sockmap: convert to generic sk_msg interface") Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Reviewed-by: D. Wythe <alibuda@linux.alibaba.com> Signed-off-by: NipaLocal <nipa@local>

KMSAN reported uninit-value access in __unix_walk_scc() [1]. In the list_for_each_entry_reverse() loop, when the vertex's index equals it's scc_index, the loop uses the variable vertex as a temporary variable that points to a vertex in scc. And when the loop is finished, the variable vertex points to the list head, in this case scc, which is a local variable on the stack (more precisely, it's not even scc and might underflow the call stack of __unix_walk_scc(): container_of(&scc, struct unix_vertex, scc_entry)). However, the variable vertex is used under the label prev_vertex. So if the edge_stack is not empty and the function jumps to the prev_vertex label, the function will access invalid data on the stack. This causes the uninit-value access issue. Fix this by introducing a new temporary variable for the loop. [1] BUG: KMSAN: uninit-value in __unix_walk_scc net/unix/garbage.c:478 [inline] BUG: KMSAN: uninit-value in unix_walk_scc net/unix/garbage.c:526 [inline] BUG: KMSAN: uninit-value in __unix_gc+0x2589/0x3c20 net/unix/garbage.c:584 __unix_walk_scc net/unix/garbage.c:478 [inline] unix_walk_scc net/unix/garbage.c:526 [inline] __unix_gc+0x2589/0x3c20 net/unix/garbage.c:584 process_one_work kernel/workqueue.c:3231 [inline] process_scheduled_works+0xade/0x1bf0 kernel/workqueue.c:3312 worker_thread+0xeb6/0x15b0 kernel/workqueue.c:3393 kthread+0x3c4/0x530 kernel/kthread.c:389 ret_from_fork+0x6e/0x90 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 Uninit was stored to memory at: unix_walk_scc net/unix/garbage.c:526 [inline] __unix_gc+0x2adf/0x3c20 net/unix/garbage.c:584 process_one_work kernel/workqueue.c:3231 [inline] process_scheduled_works+0xade/0x1bf0 kernel/workqueue.c:3312 worker_thread+0xeb6/0x15b0 kernel/workqueue.c:3393 kthread+0x3c4/0x530 kernel/kthread.c:389 ret_from_fork+0x6e/0x90 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 Local variable entries created at: ref_tracker_free+0x48/0xf30 lib/ref_tracker.c:222 netdev_tracker_free include/linux/netdevice.h:4058 [inline] netdev_put include/linux/netdevice.h:4075 [inline] dev_put include/linux/netdevice.h:4101 [inline] update_gid_event_work_handler+0xaa/0x1b0 drivers/infiniband/core/roce_gid_mgmt.c:813 CPU: 1 PID: 12763 Comm: kworker/u8:31 Not tainted 6.10.0-rc4-00217-g35bb670d65fc kernel-patches#32 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-2.fc40 04/01/2014 Workqueue: events_unbound __unix_gc Fixes: 3484f06 ("af_unix: Detect Strongly Connected Components.") Reported-by: syzkaller <syzkaller@googlegroups.com> Signed-off-by: Shigeru Yoshida <syoshida@redhat.com> Signed-off-by: NipaLocal <nipa@local>

KMSAN reported uninit-value access in __unix_walk_scc() [1]. In the list_for_each_entry_reverse() loop, when the vertex's index equals it's scc_index, the loop uses the variable vertex as a temporary variable that points to a vertex in scc. And when the loop is finished, the variable vertex points to the list head, in this case scc, which is a local variable on the stack (more precisely, it's not even scc and might underflow the call stack of __unix_walk_scc(): container_of(&scc, struct unix_vertex, scc_entry)). However, the variable vertex is used under the label prev_vertex. So if the edge_stack is not empty and the function jumps to the prev_vertex label, the function will access invalid data on the stack. This causes the uninit-value access issue. Fix this by introducing a new temporary variable for the loop. [1] BUG: KMSAN: uninit-value in __unix_walk_scc net/unix/garbage.c:478 [inline] BUG: KMSAN: uninit-value in unix_walk_scc net/unix/garbage.c:526 [inline] BUG: KMSAN: uninit-value in __unix_gc+0x2589/0x3c20 net/unix/garbage.c:584 __unix_walk_scc net/unix/garbage.c:478 [inline] unix_walk_scc net/unix/garbage.c:526 [inline] __unix_gc+0x2589/0x3c20 net/unix/garbage.c:584 process_one_work kernel/workqueue.c:3231 [inline] process_scheduled_works+0xade/0x1bf0 kernel/workqueue.c:3312 worker_thread+0xeb6/0x15b0 kernel/workqueue.c:3393 kthread+0x3c4/0x530 kernel/kthread.c:389 ret_from_fork+0x6e/0x90 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 Uninit was stored to memory at: unix_walk_scc net/unix/garbage.c:526 [inline] __unix_gc+0x2adf/0x3c20 net/unix/garbage.c:584 process_one_work kernel/workqueue.c:3231 [inline] process_scheduled_works+0xade/0x1bf0 kernel/workqueue.c:3312 worker_thread+0xeb6/0x15b0 kernel/workqueue.c:3393 kthread+0x3c4/0x530 kernel/kthread.c:389 ret_from_fork+0x6e/0x90 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 Local variable entries created at: ref_tracker_free+0x48/0xf30 lib/ref_tracker.c:222 netdev_tracker_free include/linux/netdevice.h:4058 [inline] netdev_put include/linux/netdevice.h:4075 [inline] dev_put include/linux/netdevice.h:4101 [inline] update_gid_event_work_handler+0xaa/0x1b0 drivers/infiniband/core/roce_gid_mgmt.c:813 CPU: 1 PID: 12763 Comm: kworker/u8:31 Not tainted 6.10.0-rc4-00217-g35bb670d65fc kernel-patches#32 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-2.fc40 04/01/2014 Workqueue: events_unbound __unix_gc Fixes: 3484f06 ("af_unix: Detect Strongly Connected Components.") Reported-by: syzkaller <syzkaller@googlegroups.com> Signed-off-by: Shigeru Yoshida <syoshida@redhat.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: NipaLocal <nipa@local>

KMSAN reported uninit-value access in raw_lookup() [1]. Diag for raw sockets uses the pad field in struct inet_diag_req_v2 for the underlying protocol. This field corresponds to the sdiag_raw_protocol field in struct inet_diag_req_raw. inet_diag_get_exact_compat() converts inet_diag_req to inet_diag_req_v2, but leaves the pad field uninitialized. So the issue occurs when raw_lookup() accesses the sdiag_raw_protocol field. Fix this by initializing the pad field in inet_diag_get_exact_compat(). Also, do the same fix in inet_diag_dump_compat() to avoid the similar issue in the future. [1] BUG: KMSAN: uninit-value in raw_lookup net/ipv4/raw_diag.c:49 [inline] BUG: KMSAN: uninit-value in raw_sock_get+0x657/0x800 net/ipv4/raw_diag.c:71 raw_lookup net/ipv4/raw_diag.c:49 [inline] raw_sock_get+0x657/0x800 net/ipv4/raw_diag.c:71 raw_diag_dump_one+0xa1/0x660 net/ipv4/raw_diag.c:99 inet_diag_cmd_exact+0x7d9/0x980 inet_diag_get_exact_compat net/ipv4/inet_diag.c:1404 [inline] inet_diag_rcv_msg_compat+0x469/0x530 net/ipv4/inet_diag.c:1426 sock_diag_rcv_msg+0x23d/0x740 net/core/sock_diag.c:282 netlink_rcv_skb+0x537/0x670 net/netlink/af_netlink.c:2564 sock_diag_rcv+0x35/0x40 net/core/sock_diag.c:297 netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline] netlink_unicast+0xe74/0x1240 net/netlink/af_netlink.c:1361 netlink_sendmsg+0x10c6/0x1260 net/netlink/af_netlink.c:1905 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg+0x332/0x3d0 net/socket.c:745 ____sys_sendmsg+0x7f0/0xb70 net/socket.c:2585 ___sys_sendmsg+0x271/0x3b0 net/socket.c:2639 __sys_sendmsg net/socket.c:2668 [inline] __do_sys_sendmsg net/socket.c:2677 [inline] __se_sys_sendmsg net/socket.c:2675 [inline] __x64_sys_sendmsg+0x27e/0x4a0 net/socket.c:2675 x64_sys_call+0x135e/0x3ce0 arch/x86/include/generated/asm/syscalls_64.h:47 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xd9/0x1e0 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f Uninit was stored to memory at: raw_sock_get+0x650/0x800 net/ipv4/raw_diag.c:71 raw_diag_dump_one+0xa1/0x660 net/ipv4/raw_diag.c:99 inet_diag_cmd_exact+0x7d9/0x980 inet_diag_get_exact_compat net/ipv4/inet_diag.c:1404 [inline] inet_diag_rcv_msg_compat+0x469/0x530 net/ipv4/inet_diag.c:1426 sock_diag_rcv_msg+0x23d/0x740 net/core/sock_diag.c:282 netlink_rcv_skb+0x537/0x670 net/netlink/af_netlink.c:2564 sock_diag_rcv+0x35/0x40 net/core/sock_diag.c:297 netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline] netlink_unicast+0xe74/0x1240 net/netlink/af_netlink.c:1361 netlink_sendmsg+0x10c6/0x1260 net/netlink/af_netlink.c:1905 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg+0x332/0x3d0 net/socket.c:745 ____sys_sendmsg+0x7f0/0xb70 net/socket.c:2585 ___sys_sendmsg+0x271/0x3b0 net/socket.c:2639 __sys_sendmsg net/socket.c:2668 [inline] __do_sys_sendmsg net/socket.c:2677 [inline] __se_sys_sendmsg net/socket.c:2675 [inline] __x64_sys_sendmsg+0x27e/0x4a0 net/socket.c:2675 x64_sys_call+0x135e/0x3ce0 arch/x86/include/generated/asm/syscalls_64.h:47 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xd9/0x1e0 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f Local variable req.i created at: inet_diag_get_exact_compat net/ipv4/inet_diag.c:1396 [inline] inet_diag_rcv_msg_compat+0x2a6/0x530 net/ipv4/inet_diag.c:1426 sock_diag_rcv_msg+0x23d/0x740 net/core/sock_diag.c:282 CPU: 1 PID: 8888 Comm: syz-executor.6 Not tainted 6.10.0-rc4-00217-g35bb670d65fc kernel-patches#32 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-2.fc40 04/01/2014 Fixes: 432490f ("net: ip, diag -- Add diag interface for raw sockets") Reported-by: syzkaller <syzkaller@googlegroups.com> Signed-off-by: Shigeru Yoshida <syoshida@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: NipaLocal <nipa@local>