forked from torvalds/linux
-
Notifications
You must be signed in to change notification settings - Fork 133
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Clean up power8 occ hwmon driver #44
Labels
Comments
@adamliyi can you please take a look at these warnings. Although the driver is in the tree, they must be fixed before I will review again. |
OK. I will fix the warning. |
adamliyi
added a commit
to adamliyi/openbmc_linux
that referenced
this issue
Feb 29, 2016
Fixed warning message when compiling occ hwmon driver. The issue is tracked by: openbmc#44 Signed-off-by: Yi Li <adamliyi@msn.com>
I've commented on the patch, it looks ok. Is __beXX what we should be using? |
amboar
pushed a commit
to amboar/linux
that referenced
this issue
Dec 23, 2016
…nitcall fs_initcall is definitely too late to initialize DMA-debug hash tables, because some drivers might get probed and use DMA mapping framework already in core_initcall. Late initialization of DMA-debug results in false warning about accessing memory, that was not allocated, like this one: ------------[ cut here ]------------ WARNING: CPU: 5 PID: 1 at lib/dma-debug.c:1104 check_unmap+0xa1c/0xe50 exynos-sysmmu 10a60000.sysmmu: DMA-API: device driver tries to free DMA memory it has not allocated [device address=0x000000006ebd0000] [size=16384 bytes] Modules linked in: CPU: 5 PID: 1 Comm: swapper/0 Not tainted 4.9.0-rc5-00028-g39dde3d-dirty openbmc#44 Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) [<c0119dd4>] (unwind_backtrace) from [<c01122bc>] (show_stack+0x20/0x24) [<c01122bc>] (show_stack) from [<c062714c>] (dump_stack+0x84/0xa0) [<c062714c>] (dump_stack) from [<c0132560>] (__warn+0x14c/0x180) [<c0132560>] (__warn) from [<c01325dc>] (warn_slowpath_fmt+0x48/0x50) [<c01325dc>] (warn_slowpath_fmt) from [<c06814f8>] (check_unmap+0xa1c/0xe50) [<c06814f8>] (check_unmap) from [<c06819c4>] (debug_dma_unmap_page+0x98/0xc8) [<c06819c4>] (debug_dma_unmap_page) from [<c076c3e8>] (exynos_iommu_domain_free+0x158/0x380) [<c076c3e8>] (exynos_iommu_domain_free) from [<c0764a30>] (iommu_domain_free+0x34/0x60) [<c0764a30>] (iommu_domain_free) from [<c011f168>] (release_iommu_mapping+0x30/0xb8) [<c011f168>] (release_iommu_mapping) from [<c011f23c>] (arm_iommu_release_mapping+0x4c/0x50) [<c011f23c>] (arm_iommu_release_mapping) from [<c0b061ac>] (s5p_mfc_probe+0x640/0x80c) [<c0b061ac>] (s5p_mfc_probe) from [<c07e6750>] (platform_drv_probe+0x70/0x148) [<c07e6750>] (platform_drv_probe) from [<c07e25c0>] (driver_probe_device+0x12c/0x6b0) [<c07e25c0>] (driver_probe_device) from [<c07e2c6c>] (__driver_attach+0x128/0x17c) [<c07e2c6c>] (__driver_attach) from [<c07df74c>] (bus_for_each_dev+0x88/0xc8) [<c07df74c>] (bus_for_each_dev) from [<c07e1b6c>] (driver_attach+0x34/0x58) [<c07e1b6c>] (driver_attach) from [<c07e1350>] (bus_add_driver+0x18c/0x32c) [<c07e1350>] (bus_add_driver) from [<c07e4198>] (driver_register+0x98/0x148) [<c07e4198>] (driver_register) from [<c07e5cb0>] (__platform_driver_register+0x58/0x74) [<c07e5cb0>] (__platform_driver_register) from [<c174cb30>] (s5p_mfc_driver_init+0x1c/0x20) [<c174cb30>] (s5p_mfc_driver_init) from [<c0102690>] (do_one_initcall+0x64/0x258) [<c0102690>] (do_one_initcall) from [<c17014c0>] (kernel_init_freeable+0x3d0/0x4d0) [<c17014c0>] (kernel_init_freeable) from [<c116eeb4>] (kernel_init+0x18/0x134) [<c116eeb4>] (kernel_init) from [<c010bbd8>] (ret_from_fork+0x14/0x3c) ---[ end trace dc54c54bd3581296 ]--- This patch moves initialization of DMA-debug to core_initcall. This is safe from the initialization perspective. dma_debug_do_init() internally calls debugfs functions and debugfs also gets initialised at core_initcall(), and that is earlier than arch code in the link order, so it will get initialized just before the DMA-debug. Reported-by: Seung-Woo Kim <sw0312.kim@samsung.com> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
shenki
pushed a commit
that referenced
this issue
Jan 3, 2019
[ Upstream commit ca08987 ] There is no expression deactivation call from the rule replacement path, hence, chain counter is not decremented. A few steps to reproduce the problem: %nft add table ip filter %nft add chain ip filter c1 %nft add chain ip filter c1 %nft add rule ip filter c1 jump c2 %nft replace rule ip filter c1 handle 3 accept %nft flush ruleset <jump c2> expression means immediate NFT_JUMP to chain c2. Reference count of chain c2 is increased when the rule is added. When rule is deleted or replaced, the reference counter of c2 should be decreased via nft_rule_expr_deactivate() which calls nft_immediate_deactivate(). Splat looks like: [ 214.396453] WARNING: CPU: 1 PID: 21 at net/netfilter/nf_tables_api.c:1432 nf_tables_chain_destroy.isra.38+0x2f9/0x3a0 [nf_tables] [ 214.398983] Modules linked in: nf_tables nfnetlink [ 214.398983] CPU: 1 PID: 21 Comm: kworker/1:1 Not tainted 4.20.0-rc2+ #44 [ 214.398983] Workqueue: events nf_tables_trans_destroy_work [nf_tables] [ 214.398983] RIP: 0010:nf_tables_chain_destroy.isra.38+0x2f9/0x3a0 [nf_tables] [ 214.398983] Code: 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 8e 00 00 00 48 8b 7b 58 e8 e1 2c 4e c6 48 89 df e8 d9 2c 4e c6 eb 9a <0f> 0b eb 96 0f 0b e9 7e fe ff ff e8 a7 7e 4e c6 e9 a4 fe ff ff e8 [ 214.398983] RSP: 0018:ffff8881152874e8 EFLAGS: 00010202 [ 214.398983] RAX: 0000000000000001 RBX: ffff88810ef9fc28 RCX: ffff8881152876f0 [ 214.398983] RDX: dffffc0000000000 RSI: 1ffff11022a50ede RDI: ffff88810ef9fc78 [ 214.398983] RBP: 1ffff11022a50e9d R08: 0000000080000000 R09: 0000000000000000 [ 214.398983] R10: 0000000000000000 R11: 0000000000000000 R12: 1ffff11022a50eba [ 214.398983] R13: ffff888114446e08 R14: ffff8881152876f0 R15: ffffed1022a50ed6 [ 214.398983] FS: 0000000000000000(0000) GS:ffff888116400000(0000) knlGS:0000000000000000 [ 214.398983] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 214.398983] CR2: 00007fab9bb5f868 CR3: 000000012aa16000 CR4: 00000000001006e0 [ 214.398983] Call Trace: [ 214.398983] ? nf_tables_table_destroy.isra.37+0x100/0x100 [nf_tables] [ 214.398983] ? __kasan_slab_free+0x145/0x180 [ 214.398983] ? nf_tables_trans_destroy_work+0x439/0x830 [nf_tables] [ 214.398983] ? kfree+0xdb/0x280 [ 214.398983] nf_tables_trans_destroy_work+0x5f5/0x830 [nf_tables] [ ... ] Fixes: bb7b40a ("netfilter: nf_tables: bogus EBUSY in chain deletions") Reported by: Christoph Anton Mitterer <calestyo@scientia.net> Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=914505 Link: https://bugzilla.kernel.org/show_bug.cgi?id=201791 Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
shenki
pushed a commit
that referenced
this issue
Jan 16, 2019
…lace() [ Upstream commit 4c05ec4 ] basechain->stats is rcu protected data which is updated from nft_chain_stats_replace(). This function is executed from the commit phase which holds the pernet nf_tables commit mutex - not the global nfnetlink subsystem mutex. Test commands to reproduce the problem are: %iptables-nft -I INPUT %iptables-nft -Z %iptables-nft -Z This patch uses RCU calls to handle basechain->stats updates to fix a splat that looks like: [89279.358755] ============================= [89279.363656] WARNING: suspicious RCU usage [89279.368458] 4.20.0-rc2+ #44 Tainted: G W L [89279.374661] ----------------------------- [89279.379542] net/netfilter/nf_tables_api.c:1404 suspicious rcu_dereference_protected() usage! [...] [89279.406556] 1 lock held by iptables-nft/5225: [89279.411728] #0: 00000000bf45a000 (&net->nft.commit_mutex){+.+.}, at: nf_tables_valid_genid+0x1f/0x70 [nf_tables] [89279.424022] stack backtrace: [89279.429236] CPU: 0 PID: 5225 Comm: iptables-nft Tainted: G W L 4.20.0-rc2+ #44 [89279.430135] Call Trace: [89279.430135] dump_stack+0xc9/0x16b [89279.430135] ? show_regs_print_info+0x5/0x5 [89279.430135] ? lockdep_rcu_suspicious+0x117/0x160 [89279.430135] nft_chain_commit_update+0x4ea/0x640 [nf_tables] [89279.430135] ? sched_clock_local+0xd4/0x140 [89279.430135] ? check_flags.part.35+0x440/0x440 [89279.430135] ? __rhashtable_remove_fast.constprop.67+0xec0/0xec0 [nf_tables] [89279.430135] ? sched_clock_cpu+0x126/0x170 [89279.430135] ? find_held_lock+0x39/0x1c0 [89279.430135] ? hlock_class+0x140/0x140 [89279.430135] ? is_bpf_text_address+0x5/0xf0 [89279.430135] ? check_flags.part.35+0x440/0x440 [89279.430135] ? __lock_is_held+0xb4/0x140 [89279.430135] nf_tables_commit+0x2555/0x39c0 [nf_tables] Fixes: f102d66 ("netfilter: nf_tables: use dedicated mutex to guard transactions") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Driver was reworked and is now upstream |
shenki
pushed a commit
that referenced
this issue
Feb 4, 2019
[ Upstream commit 2383a76 ] Vega10 has multiple interrupt rings, so this can be called from multiple calles at the same time resulting in: [ 71.779334] ================================ [ 71.779406] WARNING: inconsistent lock state [ 71.779478] 4.19.0-rc1+ #44 Tainted: G W [ 71.779565] -------------------------------- [ 71.779637] inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage. [ 71.779740] kworker/6:1/120 [HC0[0]:SC0[0]:HE1:SE1] takes: [ 71.779832] 00000000ad761971 (&(&kfd->interrupt_lock)->rlock){?...}, at: kgd2kfd_interrupt+0x75/0x100 [amdgpu] [ 71.780058] {IN-HARDIRQ-W} state was registered at: [ 71.780115] _raw_spin_lock+0x2c/0x40 [ 71.780180] kgd2kfd_interrupt+0x75/0x100 [amdgpu] [ 71.780248] amdgpu_irq_callback+0x6c/0x150 [amdgpu] [ 71.780315] amdgpu_ih_process+0x88/0x100 [amdgpu] [ 71.780380] amdgpu_irq_handler+0x20/0x40 [amdgpu] [ 71.780409] __handle_irq_event_percpu+0x49/0x2a0 [ 71.780436] handle_irq_event_percpu+0x30/0x70 [ 71.780461] handle_irq_event+0x37/0x60 [ 71.780484] handle_edge_irq+0x83/0x1b0 [ 71.780506] handle_irq+0x1f/0x30 [ 71.780526] do_IRQ+0x53/0x110 [ 71.780544] ret_from_intr+0x0/0x22 [ 71.780566] cpuidle_enter_state+0xaa/0x330 [ 71.780591] do_idle+0x203/0x280 [ 71.780610] cpu_startup_entry+0x6f/0x80 [ 71.780634] start_secondary+0x1b0/0x200 [ 71.780657] secondary_startup_64+0xa4/0xb0 Fix this by always using irq save spin locks. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
shenki
pushed a commit
that referenced
this issue
Oct 14, 2019
[ Upstream commit 768fb61 ] BPF_PROG_TYPE_SOCK_OPS program can reenter bpf_event_output because it can be called from atomic and non-atomic contexts since we don't have bpf_prog_active to prevent it happen. This patch enables 3 levels of nesting to support normal, irq and nmi context. We can easily reproduce the issue by running netperf crr mode with 100 flows and 10 threads from netperf client side. Here is the whole stack dump: [ 515.228898] WARNING: CPU: 20 PID: 14686 at kernel/trace/bpf_trace.c:549 bpf_event_output+0x1f9/0x220 [ 515.228903] CPU: 20 PID: 14686 Comm: tcp_crr Tainted: G W 4.15.0-smp-fixpanic #44 [ 515.228904] Hardware name: Intel TBG,ICH10/Ikaria_QC_1b, BIOS 1.22.0 06/04/2018 [ 515.228905] RIP: 0010:bpf_event_output+0x1f9/0x220 [ 515.228906] RSP: 0018:ffff9a57ffc03938 EFLAGS: 00010246 [ 515.228907] RAX: 0000000000000012 RBX: 0000000000000001 RCX: 0000000000000000 [ 515.228907] RDX: 0000000000000000 RSI: 0000000000000096 RDI: ffffffff836b0f80 [ 515.228908] RBP: ffff9a57ffc039c8 R08: 0000000000000004 R09: 0000000000000012 [ 515.228908] R10: ffff9a57ffc1de40 R11: 0000000000000000 R12: 0000000000000002 [ 515.228909] R13: ffff9a57e13bae00 R14: 00000000ffffffff R15: ffff9a57ffc1e2c0 [ 515.228910] FS: 00007f5a3e6ec700(0000) GS:ffff9a57ffc00000(0000) knlGS:0000000000000000 [ 515.228910] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 515.228911] CR2: 0000537082664fff CR3: 000000061fed6002 CR4: 00000000000226f0 [ 515.228911] Call Trace: [ 515.228913] <IRQ> [ 515.228919] [<ffffffff82c6c6cb>] bpf_sockopt_event_output+0x3b/0x50 [ 515.228923] [<ffffffff8265daee>] ? bpf_ktime_get_ns+0xe/0x10 [ 515.228927] [<ffffffff8266fda5>] ? __cgroup_bpf_run_filter_sock_ops+0x85/0x100 [ 515.228930] [<ffffffff82cf90a5>] ? tcp_init_transfer+0x125/0x150 [ 515.228933] [<ffffffff82cf9159>] ? tcp_finish_connect+0x89/0x110 [ 515.228936] [<ffffffff82cf98e4>] ? tcp_rcv_state_process+0x704/0x1010 [ 515.228939] [<ffffffff82c6e263>] ? sk_filter_trim_cap+0x53/0x2a0 [ 515.228942] [<ffffffff82d90d1f>] ? tcp_v6_inbound_md5_hash+0x6f/0x1d0 [ 515.228945] [<ffffffff82d92160>] ? tcp_v6_do_rcv+0x1c0/0x460 [ 515.228947] [<ffffffff82d93558>] ? tcp_v6_rcv+0x9f8/0xb30 [ 515.228951] [<ffffffff82d737c0>] ? ip6_route_input+0x190/0x220 [ 515.228955] [<ffffffff82d5f7ad>] ? ip6_protocol_deliver_rcu+0x6d/0x450 [ 515.228958] [<ffffffff82d60246>] ? ip6_rcv_finish+0xb6/0x170 [ 515.228961] [<ffffffff82d5fb90>] ? ip6_protocol_deliver_rcu+0x450/0x450 [ 515.228963] [<ffffffff82d60361>] ? ipv6_rcv+0x61/0xe0 [ 515.228966] [<ffffffff82d60190>] ? ipv6_list_rcv+0x330/0x330 [ 515.228969] [<ffffffff82c4976b>] ? __netif_receive_skb_one_core+0x5b/0xa0 [ 515.228972] [<ffffffff82c497d1>] ? __netif_receive_skb+0x21/0x70 [ 515.228975] [<ffffffff82c4a8d2>] ? process_backlog+0xb2/0x150 [ 515.228978] [<ffffffff82c4aadf>] ? net_rx_action+0x16f/0x410 [ 515.228982] [<ffffffff830000dd>] ? __do_softirq+0xdd/0x305 [ 515.228986] [<ffffffff8252cfdc>] ? irq_exit+0x9c/0xb0 [ 515.228989] [<ffffffff82e02de5>] ? smp_call_function_single_interrupt+0x65/0x120 [ 515.228991] [<ffffffff82e020e1>] ? call_function_single_interrupt+0x81/0x90 [ 515.228992] </IRQ> [ 515.228996] [<ffffffff82a11ff0>] ? io_serial_in+0x20/0x20 [ 515.229000] [<ffffffff8259c040>] ? console_unlock+0x230/0x490 [ 515.229003] [<ffffffff8259cbaa>] ? vprintk_emit+0x26a/0x2a0 [ 515.229006] [<ffffffff8259cbff>] ? vprintk_default+0x1f/0x30 [ 515.229008] [<ffffffff8259d9f5>] ? vprintk_func+0x35/0x70 [ 515.229011] [<ffffffff8259d4bb>] ? printk+0x50/0x66 [ 515.229013] [<ffffffff82637637>] ? bpf_event_output+0xb7/0x220 [ 515.229016] [<ffffffff82c6c6cb>] ? bpf_sockopt_event_output+0x3b/0x50 [ 515.229019] [<ffffffff8265daee>] ? bpf_ktime_get_ns+0xe/0x10 [ 515.229023] [<ffffffff82c29e87>] ? release_sock+0x97/0xb0 [ 515.229026] [<ffffffff82ce9d6a>] ? tcp_recvmsg+0x31a/0xda0 [ 515.229029] [<ffffffff8266fda5>] ? __cgroup_bpf_run_filter_sock_ops+0x85/0x100 [ 515.229032] [<ffffffff82ce77c1>] ? tcp_set_state+0x191/0x1b0 [ 515.229035] [<ffffffff82ced10e>] ? tcp_disconnect+0x2e/0x600 [ 515.229038] [<ffffffff82cecbbb>] ? tcp_close+0x3eb/0x460 [ 515.229040] [<ffffffff82d21082>] ? inet_release+0x42/0x70 [ 515.229043] [<ffffffff82d58809>] ? inet6_release+0x39/0x50 [ 515.229046] [<ffffffff82c1f32d>] ? __sock_release+0x4d/0xd0 [ 515.229049] [<ffffffff82c1f3e5>] ? sock_close+0x15/0x20 [ 515.229052] [<ffffffff8273b517>] ? __fput+0xe7/0x1f0 [ 515.229055] [<ffffffff8273b66e>] ? ____fput+0xe/0x10 [ 515.229058] [<ffffffff82547bf2>] ? task_work_run+0x82/0xb0 [ 515.229061] [<ffffffff824086df>] ? exit_to_usermode_loop+0x7e/0x11f [ 515.229064] [<ffffffff82408171>] ? do_syscall_64+0x111/0x130 [ 515.229067] [<ffffffff82e0007c>] ? entry_SYSCALL_64_after_hwframe+0x3d/0xa2 Fixes: a5a3a82 ("bpf: add perf event notificaton support for sock_ops") Signed-off-by: Allan Zhang <allanzhang@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Stanislav Fomichev <sdf@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20190925234312.94063-2-allanzhang@google.com Signed-off-by: Sasha Levin <sashal@kernel.org>
shenki
pushed a commit
that referenced
this issue
Mar 5, 2020
commit f2e97dc upstream. Fix following build error. We could push a tcp.h header into one of the include paths, but I think its easy enough to simply pull in the three defines we need here. If we end up using more of tcp.h at some point we can pull it in later. /home/john/git/bpf/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c: In function ‘connected_socket_v4’: /home/john/git/bpf/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c:20:11: error: ‘TCP_REPAIR_ON’ undeclared (first use in this function) repair = TCP_REPAIR_ON; ^ /home/john/git/bpf/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c:20:11: note: each undeclared identifier is reported only once for each function it appears in /home/john/git/bpf/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c:29:11: error: ‘TCP_REPAIR_OFF_NO_WP’ undeclared (first use in this function) repair = TCP_REPAIR_OFF_NO_WP; Then with fix, $ ./test_progs -n 44 #44/1 sockmap create_update_free:OK #44/2 sockhash create_update_free:OK #44 sockmap_basic:OK Fixes: 5d3919a ("selftests/bpf: Test freeing sockmap/sockhash with a socket in it") Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/bpf/158131347731.21414.12120493483848386652.stgit@john-Precision-5820-Tower Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
shenki
pushed a commit
that referenced
this issue
Aug 10, 2020
[ Upstream commit 3c8ce24 ] The lifetime of EMAD transactions (i.e., 'struct mlxsw_reg_trans') is managed using RCU. They are freed using kfree_rcu() once the transaction ends. However, in case the transaction failed it is freed immediately after being removed from the active transactions list. This is problematic because it is still possible for a different CPU to dereference the transaction from an RCU read-side critical section while traversing the active transaction list in mlxsw_emad_rx_listener_func(). In which case, a use-after-free is triggered [1]. Fix this by freeing the transaction after a grace period by calling kfree_rcu(). [1] BUG: KASAN: use-after-free in mlxsw_emad_rx_listener_func+0x969/0xac0 drivers/net/ethernet/mellanox/mlxsw/core.c:671 Read of size 8 at addr ffff88800b7964e8 by task syz-executor.2/2881 CPU: 0 PID: 2881 Comm: syz-executor.2 Not tainted 5.8.0-rc4+ #44 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 Call Trace: <IRQ> __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0xf6/0x16e lib/dump_stack.c:118 print_address_description.constprop.0+0x1c/0x250 mm/kasan/report.c:383 __kasan_report mm/kasan/report.c:513 [inline] kasan_report.cold+0x1f/0x37 mm/kasan/report.c:530 mlxsw_emad_rx_listener_func+0x969/0xac0 drivers/net/ethernet/mellanox/mlxsw/core.c:671 mlxsw_core_skb_receive+0x571/0x700 drivers/net/ethernet/mellanox/mlxsw/core.c:2061 mlxsw_pci_cqe_rdq_handle drivers/net/ethernet/mellanox/mlxsw/pci.c:595 [inline] mlxsw_pci_cq_tasklet+0x12a6/0x2520 drivers/net/ethernet/mellanox/mlxsw/pci.c:651 tasklet_action_common.isra.0+0x13f/0x3e0 kernel/softirq.c:550 __do_softirq+0x223/0x964 kernel/softirq.c:292 asm_call_on_stack+0x12/0x20 arch/x86/entry/entry_64.S:711 </IRQ> __run_on_irqstack arch/x86/include/asm/irq_stack.h:22 [inline] run_on_irqstack_cond arch/x86/include/asm/irq_stack.h:48 [inline] do_softirq_own_stack+0x109/0x140 arch/x86/kernel/irq_64.c:77 invoke_softirq kernel/softirq.c:387 [inline] __irq_exit_rcu kernel/softirq.c:417 [inline] irq_exit_rcu+0x16f/0x1a0 kernel/softirq.c:429 sysvec_apic_timer_interrupt+0x4e/0xd0 arch/x86/kernel/apic/apic.c:1091 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:587 RIP: 0010:arch_local_irq_restore arch/x86/include/asm/irqflags.h:85 [inline] RIP: 0010:__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:160 [inline] RIP: 0010:_raw_spin_unlock_irqrestore+0x3b/0x40 kernel/locking/spinlock.c:191 Code: e8 2a c3 f4 fc 48 89 ef e8 12 96 f5 fc f6 c7 02 75 11 53 9d e8 d6 db 11 fd 65 ff 0d 1f 21 b3 56 5b 5d c3 e8 a7 d7 11 fd 53 9d <eb> ed 0f 1f 00 55 48 89 fd 65 ff 05 05 21 b3 56 ff 74 24 08 48 8d RSP: 0018:ffff8880446ffd80 EFLAGS: 00000286 RAX: 0000000000000006 RBX: 0000000000000286 RCX: 0000000000000006 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffa94ecea9 RBP: ffff888012934408 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000001 R11: fffffbfff57be301 R12: 1ffff110088dffc1 R13: ffff888037b817c0 R14: ffff88802442415a R15: ffff888024424000 __do_sys_perf_event_open+0x1b5d/0x2bd0 kernel/events/core.c:11874 do_syscall_64+0x56/0xa0 arch/x86/entry/common.c:384 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x473dbd Code: Bad RIP value. RSP: 002b:00007f21e5e9cc28 EFLAGS: 00000246 ORIG_RAX: 000000000000012a RAX: ffffffffffffffda RBX: 000000000057bf00 RCX: 0000000000473dbd RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000020000040 RBP: 000000000057bf00 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000003 R11: 0000000000000246 R12: 000000000057bf0c R13: 00007ffd0493503f R14: 00000000004d0f46 R15: 00007f21e5e9cd80 Allocated by task 871: save_stack+0x1b/0x40 mm/kasan/common.c:48 set_track mm/kasan/common.c:56 [inline] __kasan_kmalloc mm/kasan/common.c:494 [inline] __kasan_kmalloc.constprop.0+0xc2/0xd0 mm/kasan/common.c:467 kmalloc include/linux/slab.h:555 [inline] kzalloc include/linux/slab.h:669 [inline] mlxsw_core_reg_access_emad+0x70/0x1410 drivers/net/ethernet/mellanox/mlxsw/core.c:1812 mlxsw_core_reg_access+0xeb/0x540 drivers/net/ethernet/mellanox/mlxsw/core.c:1991 mlxsw_sp_port_get_hw_xstats+0x335/0x7e0 drivers/net/ethernet/mellanox/mlxsw/spectrum.c:1130 update_stats_cache+0xf4/0x140 drivers/net/ethernet/mellanox/mlxsw/spectrum.c:1173 process_one_work+0xa3e/0x17a0 kernel/workqueue.c:2269 worker_thread+0x9e/0x1050 kernel/workqueue.c:2415 kthread+0x355/0x470 kernel/kthread.c:291 ret_from_fork+0x22/0x30 arch/x86/entry/entry_64.S:293 Freed by task 871: save_stack+0x1b/0x40 mm/kasan/common.c:48 set_track mm/kasan/common.c:56 [inline] kasan_set_free_info mm/kasan/common.c:316 [inline] __kasan_slab_free+0x12c/0x170 mm/kasan/common.c:455 slab_free_hook mm/slub.c:1474 [inline] slab_free_freelist_hook mm/slub.c:1507 [inline] slab_free mm/slub.c:3072 [inline] kfree+0xe6/0x320 mm/slub.c:4052 mlxsw_core_reg_access_emad+0xd45/0x1410 drivers/net/ethernet/mellanox/mlxsw/core.c:1819 mlxsw_core_reg_access+0xeb/0x540 drivers/net/ethernet/mellanox/mlxsw/core.c:1991 mlxsw_sp_port_get_hw_xstats+0x335/0x7e0 drivers/net/ethernet/mellanox/mlxsw/spectrum.c:1130 update_stats_cache+0xf4/0x140 drivers/net/ethernet/mellanox/mlxsw/spectrum.c:1173 process_one_work+0xa3e/0x17a0 kernel/workqueue.c:2269 worker_thread+0x9e/0x1050 kernel/workqueue.c:2415 kthread+0x355/0x470 kernel/kthread.c:291 ret_from_fork+0x22/0x30 arch/x86/entry/entry_64.S:293 The buggy address belongs to the object at ffff88800b796400 which belongs to the cache kmalloc-512 of size 512 The buggy address is located 232 bytes inside of 512-byte region [ffff88800b796400, ffff88800b796600) The buggy address belongs to the page: page:ffffea00002de500 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 head:ffffea00002de500 order:2 compound_mapcount:0 compound_pincount:0 flags: 0x100000000010200(slab|head) raw: 0100000000010200 dead000000000100 dead000000000122 ffff88806c402500 raw: 0000000000000000 0000000000100010 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff88800b796380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff88800b796400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >ffff88800b796480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff88800b796500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff88800b796580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb Fixes: caf7297 ("mlxsw: core: Introduce support for asynchronous EMAD register access") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
shenki
pushed a commit
that referenced
this issue
Aug 31, 2020
[ Upstream commit 6184358 ] When enable LOCKDEP, static_obj() will cause error. Because some __initdata static variables is before _stext: static int static_obj(const void *obj) { unsigned long start = (unsigned long) &_stext, end = (unsigned long) &_end, addr = (unsigned long) obj; /* * static variable? */ if ((addr >= start) && (addr < end)) return 1; [ 0.067192] INFO: trying to register non-static key. [ 0.067325] the code is fine but needs lockdep annotation. [ 0.067449] turning off the locking correctness validator. [ 0.067718] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.7.0-rc7-dirty #44 [ 0.067945] Call Trace: [ 0.068369] [<ffffffe00020323c>] walk_stackframe+0x0/0xa4 [ 0.068506] [<ffffffe000203422>] show_stack+0x2a/0x34 [ 0.068631] [<ffffffe000521e4e>] dump_stack+0x94/0xca [ 0.068757] [<ffffffe000255a4e>] register_lock_class+0x5b8/0x5bc [ 0.068969] [<ffffffe000255abe>] __lock_acquire+0x6c/0x1d5c [ 0.069101] [<ffffffe0002550fe>] lock_acquire+0xae/0x312 [ 0.069228] [<ffffffe000989a8e>] _raw_spin_lock_irqsave+0x40/0x5a [ 0.069357] [<ffffffe000247c64>] complete+0x1e/0x50 [ 0.069479] [<ffffffe000984c38>] rest_init+0x1b0/0x28a [ 0.069660] [<ffffffe0000016a2>] 0xffffffe0000016a2 [ 0.069779] [<ffffffe000001b84>] 0xffffffe000001b84 [ 0.069953] [<ffffffe000001092>] 0xffffffe000001092 static __initdata DECLARE_COMPLETION(kthreadd_done); noinline void __ref rest_init(void) { ... complete(&kthreadd_done); Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
shenki
pushed a commit
that referenced
this issue
May 24, 2021
[ Upstream commit d5027ca ] Ritesh reported a bug [1] against UML, noting that it crashed on startup. The backtrace shows the following (heavily redacted): (gdb) bt ... #26 0x0000000060015b5d in sem_init () at ipc/sem.c:268 #27 0x00007f89906d92f7 in ?? () from /lib/x86_64-linux-gnu/libcom_err.so.2 #28 0x00007f8990ab8fb2 in call_init (...) at dl-init.c:72 ... #40 0x00007f89909bf3a6 in nss_load_library (...) at nsswitch.c:359 ... #44 0x00007f8990895e35 in _nss_compat_getgrnam_r (...) at nss_compat/compat-grp.c:486 #45 0x00007f8990968b85 in __getgrnam_r [...] #46 0x00007f89909d6b77 in grantpt [...] #47 0x00007f8990a9394e in __GI_openpty [...] #48 0x00000000604a1f65 in openpty_cb (...) at arch/um/os-Linux/sigio.c:407 #49 0x00000000604a58d0 in start_idle_thread (...) at arch/um/os-Linux/skas/process.c:598 #50 0x0000000060004a3d in start_uml () at arch/um/kernel/skas/process.c:45 #51 0x00000000600047b2 in linux_main (...) at arch/um/kernel/um_arch.c:334 #52 0x000000006000574f in main (...) at arch/um/os-Linux/main.c:144 indicating that the UML function openpty_cb() calls openpty(), which internally calls __getgrnam_r(), which causes the nsswitch machinery to get started. This loads, through lots of indirection that I snipped, the libcom_err.so.2 library, which (in an unknown function, "??") calls sem_init(). Now, of course it wants to get libpthread's sem_init(), since it's linked against libpthread. However, the dynamic linker looks up that symbol against the binary first, and gets the kernel's sem_init(). Hajime Tazaki noted that "objcopy -L" can localize a symbol, so the dynamic linker wouldn't do the lookup this way. I tried, but for some reason that didn't seem to work. Doing the same thing in the linker script instead does seem to work, though I cannot entirely explain - it *also* works if I just add "VERSION { { global: *; }; }" instead, indicating that something else is happening that I don't really understand. It may be that explicitly doing that marks them with some kind of empty version, and that's different from the default. Explicitly marking them with a version breaks kallsyms, so that doesn't seem to be possible. Marking all the symbols as local seems correct, and does seem to address the issue, so do that. Also do it for static link, nsswitch libraries could still be loaded there. [1] https://bugs.debian.org/983379 Reported-by: Ritesh Raj Sarraf <rrs@debian.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com> Tested-By: Ritesh Raj Sarraf <rrs@debian.org> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Sasha Levin <sashal@kernel.org>
shenki
pushed a commit
that referenced
this issue
Aug 19, 2021
[ Upstream commit 5e21bb4 ] These two types of XDP progs (BPF_XDP_DEVMAP, BPF_XDP_CPUMAP) will not be executed directly in the driver, therefore we should also not directly run them from here. To run in these two situations, there must be further preparations done, otherwise these may cause a kernel panic. For more details, see also dev_xdp_attach(). [ 46.982479] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 46.984295] #PF: supervisor read access in kernel mode [ 46.985777] #PF: error_code(0x0000) - not-present page [ 46.987227] PGD 800000010dca4067 P4D 800000010dca4067 PUD 10dca6067 PMD 0 [ 46.989201] Oops: 0000 [#1] SMP PTI [ 46.990304] CPU: 7 PID: 562 Comm: a.out Not tainted 5.13.0+ #44 [ 46.992001] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/24 [ 46.995113] RIP: 0010:___bpf_prog_run+0x17b/0x1710 [ 46.996586] Code: 49 03 14 cc e8 76 f6 fe ff e9 ad fe ff ff 0f b6 43 01 48 0f bf 4b 02 48 83 c3 08 89 c2 83 e0 0f c0 ea 04 02 [ 47.001562] RSP: 0018:ffffc900005afc58 EFLAGS: 00010246 [ 47.003115] RAX: 0000000000000000 RBX: ffffc9000023f068 RCX: 0000000000000000 [ 47.005163] RDX: 0000000000000000 RSI: 0000000000000079 RDI: ffffc900005afc98 [ 47.007135] RBP: 0000000000000000 R08: ffffc9000023f048 R09: c0000000ffffdfff [ 47.009171] R10: 0000000000000001 R11: ffffc900005afb40 R12: ffffc900005afc98 [ 47.011172] R13: 0000000000000001 R14: 0000000000000001 R15: ffffffff825258a8 [ 47.013244] FS: 00007f04a5207580(0000) GS:ffff88842fdc0000(0000) knlGS:0000000000000000 [ 47.015705] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 47.017475] CR2: 0000000000000000 CR3: 0000000100182005 CR4: 0000000000770ee0 [ 47.019558] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 47.021595] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 47.023574] PKRU: 55555554 [ 47.024571] Call Trace: [ 47.025424] __bpf_prog_run32+0x32/0x50 [ 47.026296] ? printk+0x53/0x6a [ 47.027066] ? ktime_get+0x39/0x90 [ 47.027895] bpf_test_run.cold.28+0x23/0x123 [ 47.028866] ? printk+0x53/0x6a [ 47.029630] bpf_prog_test_run_xdp+0x149/0x1d0 [ 47.030649] __sys_bpf+0x1305/0x23d0 [ 47.031482] __x64_sys_bpf+0x17/0x20 [ 47.032316] do_syscall_64+0x3a/0x80 [ 47.033165] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 47.034254] RIP: 0033:0x7f04a51364dd [ 47.035133] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 48 [ 47.038768] RSP: 002b:00007fff8f9fc518 EFLAGS: 00000213 ORIG_RAX: 0000000000000141 [ 47.040344] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f04a51364dd [ 47.041749] RDX: 0000000000000048 RSI: 0000000020002a80 RDI: 000000000000000a [ 47.043171] RBP: 00007fff8f9fc530 R08: 0000000002049300 R09: 0000000020000100 [ 47.044626] R10: 0000000000000004 R11: 0000000000000213 R12: 0000000000401070 [ 47.046088] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 47.047579] Modules linked in: [ 47.048318] CR2: 0000000000000000 [ 47.049120] ---[ end trace 7ad34443d5be719a ]--- [ 47.050273] RIP: 0010:___bpf_prog_run+0x17b/0x1710 [ 47.051343] Code: 49 03 14 cc e8 76 f6 fe ff e9 ad fe ff ff 0f b6 43 01 48 0f bf 4b 02 48 83 c3 08 89 c2 83 e0 0f c0 ea 04 02 [ 47.054943] RSP: 0018:ffffc900005afc58 EFLAGS: 00010246 [ 47.056068] RAX: 0000000000000000 RBX: ffffc9000023f068 RCX: 0000000000000000 [ 47.057522] RDX: 0000000000000000 RSI: 0000000000000079 RDI: ffffc900005afc98 [ 47.058961] RBP: 0000000000000000 R08: ffffc9000023f048 R09: c0000000ffffdfff [ 47.060390] R10: 0000000000000001 R11: ffffc900005afb40 R12: ffffc900005afc98 [ 47.061803] R13: 0000000000000001 R14: 0000000000000001 R15: ffffffff825258a8 [ 47.063249] FS: 00007f04a5207580(0000) GS:ffff88842fdc0000(0000) knlGS:0000000000000000 [ 47.065070] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 47.066307] CR2: 0000000000000000 CR3: 0000000100182005 CR4: 0000000000770ee0 [ 47.067747] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 47.069217] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 47.070652] PKRU: 55555554 [ 47.071318] Kernel panic - not syncing: Fatal exception [ 47.072854] Kernel Offset: disabled [ 47.073683] ---[ end Kernel panic - not syncing: Fatal exception ]--- Fixes: 9216477 ("bpf: cpumap: Add the possibility to attach an eBPF program to cpumap") Fixes: fbee97f ("bpf: Add support to attach bpf program to a devmap entry") Reported-by: Abaci <abaci@linux.alibaba.com> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Dust Li <dust.li@linux.alibaba.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: David Ahern <dsahern@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20210708080409.73525-1-xuanzhuo@linux.alibaba.com Signed-off-by: Sasha Levin <sashal@kernel.org>
shenki
pushed a commit
that referenced
this issue
Jan 31, 2022
[ Upstream commit c1020d3 ] On an arm64 platform with the Spectrum ASIC, after loading and executing a new kernel via kexec, the following trace [1] is observed. This seems to be caused by the fact that the device is not properly shutdown before executing the new kernel. Fix this by implementing a shutdown method which mirrors the remove method, as recommended by the kexec maintainer [2][3]. [1] BUG: Bad page state in process devlink pfn:22f73d page:fffffe00089dcf40 refcount:-1 mapcount:0 mapping:0000000000000000 index:0x0 flags: 0x2ffff00000000000() raw: 2ffff00000000000 0000000000000000 ffffffff089d0201 0000000000000000 raw: 0000000000000000 0000000000000000 ffffffffffffffff 0000000000000000 page dumped because: nonzero _refcount Modules linked in: CPU: 1 PID: 16346 Comm: devlink Tainted: G B 5.8.0-rc6-custom-273020-gac6b365b1bf5 #44 Hardware name: Marvell Armada 7040 TX4810M (DT) Call trace: dump_backtrace+0x0/0x1d0 show_stack+0x1c/0x28 dump_stack+0xbc/0x118 bad_page+0xcc/0xf8 check_free_page_bad+0x80/0x88 __free_pages_ok+0x3f8/0x418 __free_pages+0x38/0x60 kmem_freepages+0x200/0x2a8 slab_destroy+0x28/0x68 slabs_destroy+0x60/0x90 ___cache_free+0x1b4/0x358 kfree+0xc0/0x1d0 skb_free_head+0x2c/0x38 skb_release_data+0x110/0x1a0 skb_release_all+0x2c/0x38 consume_skb+0x38/0x130 __dev_kfree_skb_any+0x44/0x50 mlxsw_pci_rdq_fini+0x8c/0xb0 mlxsw_pci_queue_fini.isra.0+0x28/0x58 mlxsw_pci_queue_group_fini+0x58/0x88 mlxsw_pci_aqs_fini+0x2c/0x60 mlxsw_pci_fini+0x34/0x50 mlxsw_core_bus_device_unregister+0x104/0x1d0 mlxsw_devlink_core_bus_device_reload_down+0x2c/0x48 devlink_reload+0x44/0x158 devlink_nl_cmd_reload+0x270/0x290 genl_rcv_msg+0x188/0x2f0 netlink_rcv_skb+0x5c/0x118 genl_rcv+0x3c/0x50 netlink_unicast+0x1bc/0x278 netlink_sendmsg+0x194/0x390 __sys_sendto+0xe0/0x158 __arm64_sys_sendto+0x2c/0x38 el0_svc_common.constprop.0+0x70/0x168 do_el0_svc+0x28/0x88 el0_sync_handler+0x88/0x190 el0_sync+0x140/0x180 [2] https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1195432.html [3] https://patchwork.kernel.org/project/linux-scsi/patch/20170212214920.28866-1-anton@ozlabs.org/#20116693 Cc: Eric Biederman <ebiederm@xmission.com> Signed-off-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
shenki
pushed a commit
that referenced
this issue
Aug 4, 2022
commit 84e7051 upstream. The return thunk call makes the fastop functions larger, just like IBT does. Consider a 16-byte FASTOP_SIZE when CONFIG_RETHUNK is enabled. Otherwise, functions will be incorrectly aligned and when computing their position for differently sized operators, they will executed in the middle or end of a function, which may as well be an int3, leading to a crash like: [ 36.091116] int3: 0000 [#1] SMP NOPTI [ 36.091119] CPU: 3 PID: 1371 Comm: qemu-system-x86 Not tainted 5.15.0-41-generic #44 [ 36.091120] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014 [ 36.091121] RIP: 0010:xaddw_ax_dx+0x9/0x10 [kvm] [ 36.091185] Code: 00 0f bb d0 c3 cc cc cc cc 48 0f bb d0 c3 cc cc cc cc 0f 1f 80 00 00 00 00 0f c0 d0 c3 cc cc cc cc 66 0f c1 d0 c3 cc cc cc cc <0f> 1f 80 00 00 00 00 0f c1 d0 c3 cc cc cc cc 48 0f c1 d0 c3 cc cc [ 36.091186] RSP: 0018:ffffb1f541143c98 EFLAGS: 00000202 [ 36.091188] RAX: 0000000089abcdef RBX: 0000000000000001 RCX: 0000000000000000 [ 36.091188] RDX: 0000000076543210 RSI: ffffffffc073c6d0 RDI: 0000000000000200 [ 36.091189] RBP: ffffb1f541143ca0 R08: ffff9f1803350a70 R09: 0000000000000002 [ 36.091190] R10: ffff9f1803350a70 R11: 0000000000000000 R12: ffff9f1803350a70 [ 36.091190] R13: ffffffffc077fee0 R14: 0000000000000000 R15: 0000000000000000 [ 36.091191] FS: 00007efdfce8d640(0000) GS:ffff9f187dd80000(0000) knlGS:0000000000000000 [ 36.091192] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 36.091192] CR2: 0000000000000000 CR3: 0000000009b62002 CR4: 0000000000772ee0 [ 36.091195] PKRU: 55555554 [ 36.091195] Call Trace: [ 36.091197] <TASK> [ 36.091198] ? fastop+0x5a/0xa0 [kvm] [ 36.091222] x86_emulate_insn+0x7b8/0xe90 [kvm] [ 36.091244] x86_emulate_instruction+0x2f4/0x630 [kvm] [ 36.091263] ? kvm_arch_vcpu_load+0x7c/0x230 [kvm] [ 36.091283] ? vmx_prepare_switch_to_host+0xf7/0x190 [kvm_intel] [ 36.091290] complete_emulated_mmio+0x297/0x320 [kvm] [ 36.091310] kvm_arch_vcpu_ioctl_run+0x32f/0x550 [kvm] [ 36.091330] kvm_vcpu_ioctl+0x29e/0x6d0 [kvm] [ 36.091344] ? kvm_vcpu_ioctl+0x120/0x6d0 [kvm] [ 36.091357] ? __fget_files+0x86/0xc0 [ 36.091362] ? __fget_files+0x86/0xc0 [ 36.091363] __x64_sys_ioctl+0x92/0xd0 [ 36.091366] do_syscall_64+0x59/0xc0 [ 36.091369] ? syscall_exit_to_user_mode+0x27/0x50 [ 36.091370] ? do_syscall_64+0x69/0xc0 [ 36.091371] ? syscall_exit_to_user_mode+0x27/0x50 [ 36.091372] ? __x64_sys_writev+0x1c/0x30 [ 36.091374] ? do_syscall_64+0x69/0xc0 [ 36.091374] ? exit_to_user_mode_prepare+0x37/0xb0 [ 36.091378] ? syscall_exit_to_user_mode+0x27/0x50 [ 36.091379] ? do_syscall_64+0x69/0xc0 [ 36.091379] ? do_syscall_64+0x69/0xc0 [ 36.091380] ? do_syscall_64+0x69/0xc0 [ 36.091381] ? do_syscall_64+0x69/0xc0 [ 36.091381] entry_SYSCALL_64_after_hwframe+0x61/0xcb [ 36.091384] RIP: 0033:0x7efdfe6d1aff [ 36.091390] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <41> 89 c0 3d 00 f0 ff ff 77 1f 48 8b 44 24 18 64 48 2b 04 25 28 00 [ 36.091391] RSP: 002b:00007efdfce8c460 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ 36.091393] RAX: ffffffffffffffda RBX: 000000000000ae80 RCX: 00007efdfe6d1aff [ 36.091393] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 000000000000000c [ 36.091394] RBP: 0000558f1609e220 R08: 0000558f13fb8190 R09: 00000000ffffffff [ 36.091394] R10: 0000558f16b5e950 R11: 0000000000000246 R12: 0000000000000000 [ 36.091394] R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000000 [ 36.091396] </TASK> [ 36.091397] Modules linked in: isofs nls_iso8859_1 kvm_intel joydev kvm input_leds serio_raw sch_fq_codel dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua ipmi_devintf ipmi_msghandler drm msr ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel virtio_net net_failover crypto_simd ahci xhci_pci cryptd psmouse virtio_blk libahci xhci_pci_renesas failover [ 36.123271] ---[ end trace db3c0ab5a48fabcc ]--- [ 36.123272] RIP: 0010:xaddw_ax_dx+0x9/0x10 [kvm] [ 36.123319] Code: 00 0f bb d0 c3 cc cc cc cc 48 0f bb d0 c3 cc cc cc cc 0f 1f 80 00 00 00 00 0f c0 d0 c3 cc cc cc cc 66 0f c1 d0 c3 cc cc cc cc <0f> 1f 80 00 00 00 00 0f c1 d0 c3 cc cc cc cc 48 0f c1 d0 c3 cc cc [ 36.123320] RSP: 0018:ffffb1f541143c98 EFLAGS: 00000202 [ 36.123321] RAX: 0000000089abcdef RBX: 0000000000000001 RCX: 0000000000000000 [ 36.123321] RDX: 0000000076543210 RSI: ffffffffc073c6d0 RDI: 0000000000000200 [ 36.123322] RBP: ffffb1f541143ca0 R08: ffff9f1803350a70 R09: 0000000000000002 [ 36.123322] R10: ffff9f1803350a70 R11: 0000000000000000 R12: ffff9f1803350a70 [ 36.123323] R13: ffffffffc077fee0 R14: 0000000000000000 R15: 0000000000000000 [ 36.123323] FS: 00007efdfce8d640(0000) GS:ffff9f187dd80000(0000) knlGS:0000000000000000 [ 36.123324] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 36.123325] CR2: 0000000000000000 CR3: 0000000009b62002 CR4: 0000000000772ee0 [ 36.123327] PKRU: 55555554 [ 36.123328] Kernel panic - not syncing: Fatal exception in interrupt [ 36.123410] Kernel Offset: 0x1400000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) [ 36.135305] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]--- Fixes: aa3d480 ("x86: Use return-thunk in asm code") Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Co-developed-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Borislav Petkov <bp@suse.de> Cc: Josh Poimboeuf <jpoimboe@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Reported-by: Linux Kernel Functional Testing <lkft@linaro.org> Message-Id: <20220713171241.184026-1-cascardo@canonical.com> Tested-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
shenki
pushed a commit
that referenced
this issue
Jun 19, 2023
[ Upstream commit 71c6aa0 ] We encountered a crash when using SMCRv2. It is caused by a logical error in smc_llc_fill_ext_v2(). BUG: kernel NULL pointer dereference, address: 0000000000000014 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP PTI CPU: 7 PID: 453 Comm: kworker/7:4 Kdump: loaded Tainted: G W E 6.4.0-rc3+ #44 Workqueue: events smc_llc_add_link_work [smc] RIP: 0010:smc_llc_fill_ext_v2+0x117/0x280 [smc] RSP: 0018:ffffacb5c064bd88 EFLAGS: 00010282 RAX: ffff9a6bc1c3c02c RBX: ffff9a6be3558000 RCX: 0000000000000000 RDX: 0000000000000002 RSI: 0000000000000002 RDI: 000000000000000a RBP: ffffacb5c064bdb8 R08: 0000000000000040 R09: 000000000000000c R10: ffff9a6bc0910300 R11: 0000000000000002 R12: 0000000000000000 R13: 0000000000000002 R14: ffff9a6bc1c3c02c R15: ffff9a6be3558250 FS: 0000000000000000(0000) GS:ffff9a6eefdc0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000014 CR3: 000000010b078003 CR4: 00000000003706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> smc_llc_send_add_link+0x1ae/0x2f0 [smc] smc_llc_srv_add_link+0x2c9/0x5a0 [smc] ? cc_mkenc+0x40/0x60 smc_llc_add_link_work+0xb8/0x140 [smc] process_one_work+0x1e5/0x3f0 worker_thread+0x4d/0x2f0 ? __pfx_worker_thread+0x10/0x10 kthread+0xe5/0x120 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x2c/0x50 </TASK> When an alernate RNIC is available in system, SMC will try to add a new link based on the RNIC for resilience. All the RMBs in use will be mapped to the new link. Then the RMBs' MRs corresponding to the new link will be filled into SMCRv2 LLC ADD LINK messages. However, smc_llc_fill_ext_v2() mistakenly accesses to unused RMBs which haven't been mapped to the new link and have no valid MRs, thus causing a crash. So this patch fixes the logic. Fixes: b4ba465 ("net/smc: extend LLC layer for SMC-Rv2") Signed-off-by: Wen Gu <guwen@linux.alibaba.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
amboar
pushed a commit
that referenced
this issue
Feb 11, 2024
[ Upstream commit 4849775 ] If the net_cls subsystem is already mounted, attempting to mount it again in setup_classid_environment() will result in a failure with the error code EBUSY. Despite this, tmpfs will have been successfully mounted at /sys/fs/cgroup/net_cls. Consequently, the /sys/fs/cgroup/net_cls directory will be empty, causing subsequent setup operations to fail. Here's an error log excerpt illustrating the issue when net_cls has already been mounted at /sys/fs/cgroup/net_cls prior to running setup_classid_environment(): - Before that change $ tools/testing/selftests/bpf/test_progs --name=cgroup_v1v2 test_cgroup_v1v2:PASS:server_fd 0 nsec test_cgroup_v1v2:PASS:client_fd 0 nsec test_cgroup_v1v2:PASS:cgroup_fd 0 nsec test_cgroup_v1v2:PASS:server_fd 0 nsec run_test:PASS:skel_open 0 nsec run_test:PASS:prog_attach 0 nsec test_cgroup_v1v2:PASS:cgroup-v2-only 0 nsec (cgroup_helpers.c:248: errno: No such file or directory) Opening Cgroup Procs: /sys/fs/cgroup/net_cls/cgroup.procs (cgroup_helpers.c:540: errno: No such file or directory) Opening cgroup classid: /sys/fs/cgroup/net_cls/cgroup-test-work-dir/net_cls.classid run_test:PASS:skel_open 0 nsec run_test:PASS:prog_attach 0 nsec (cgroup_helpers.c:248: errno: No such file or directory) Opening Cgroup Procs: /sys/fs/cgroup/net_cls/cgroup-test-work-dir/cgroup.procs run_test:FAIL:join_classid unexpected error: 1 (errno 2) test_cgroup_v1v2:FAIL:cgroup-v1v2 unexpected error: -1 (errno 2) (cgroup_helpers.c:248: errno: No such file or directory) Opening Cgroup Procs: /sys/fs/cgroup/net_cls/cgroup.procs #44 cgroup_v1v2:FAIL Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED - After that change $ tools/testing/selftests/bpf/test_progs --name=cgroup_v1v2 #44 cgroup_v1v2:OK Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Link: https://lore.kernel.org/r/20231111090034.4248-3-laoar.shao@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
amboar
pushed a commit
that referenced
this issue
Jul 10, 2024
[ Upstream commit 5d92c7c ] If the ata_port_alloc() call in ata_host_alloc() fails, ata_host_release() will get called. However, the code in ata_host_release() tries to free ata_port struct members unconditionally, which can lead to the following: BUG: unable to handle page fault for address: 0000000000003990 PGD 0 P4D 0 Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI CPU: 10 PID: 594 Comm: (udev-worker) Not tainted 6.10.0-rc5 #44 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-2.fc40 04/01/2014 RIP: 0010:ata_host_release.cold+0x2f/0x6e [libata] Code: e4 4d 63 f4 44 89 e2 48 c7 c6 90 ad 32 c0 48 c7 c7 d0 70 33 c0 49 83 c6 0e 41 RSP: 0018:ffffc90000ebb968 EFLAGS: 00010246 RAX: 0000000000000041 RBX: ffff88810fb52e78 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffff88813b3218c0 RDI: ffff88813b3218c0 RBP: ffff88810fb52e40 R08: 0000000000000000 R09: 6c65725f74736f68 R10: ffffc90000ebb738 R11: 73692033203a746e R12: 0000000000000004 R13: 0000000000000000 R14: 0000000000000011 R15: 0000000000000006 FS: 00007f6cc55b9980(0000) GS:ffff88813b300000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000003990 CR3: 00000001122a2000 CR4: 0000000000750ef0 PKRU: 55555554 Call Trace: <TASK> ? __die_body.cold+0x19/0x27 ? page_fault_oops+0x15a/0x2f0 ? exc_page_fault+0x7e/0x180 ? asm_exc_page_fault+0x26/0x30 ? ata_host_release.cold+0x2f/0x6e [libata] ? ata_host_release.cold+0x2f/0x6e [libata] release_nodes+0x35/0xb0 devres_release_group+0x113/0x140 ata_host_alloc+0xed/0x120 [libata] ata_host_alloc_pinfo+0x14/0xa0 [libata] ahci_init_one+0x6c9/0xd20 [ahci] Do not access ata_port struct members unconditionally. Fixes: 633273a ("libata-pmp: hook PMP support and enable it") Cc: stable@vger.kernel.org Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20240629124210.181537-7-cassel@kernel.org Signed-off-by: Niklas Cassel <cassel@kernel.org> Stable-dep-of: f6549f5 ("ata,scsi: libata-core: Do not leak memory for ata_port struct members") Signed-off-by: Sasha Levin <sashal@kernel.org>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
The text was updated successfully, but these errors were encountered: