Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

spi: stm32-qspi: Fix wait_cmd timeout in APM mode #16

Conversation

ebstoll
Copy link

@ebstoll ebstoll commented Apr 12, 2022

In APM mode calling stm32_qspi_wait_cmd() leads to a timeout.
The reason for this is that stm32_qspi_wait_cmd() requires the
TCF/TEF flags to be set to finish. In APM mode these flags are
not used.
Currently it is working for most cases because the TCF/TEF flags
are set by a previous command and not cleared when the
command finishes or a new command is started.

Using two flashes and doing data transfers simoutanously on
them triggers situations where the previous command doesn't
finish with TCF flag set. Then the call to stm32_qspi_wait_cmd()
leads to a timeout and e.g. into a read only UBI file system.

In APM mode calling stm32_qspi_wait_cmd() leads to a timeout.
The reason for this is that stm32_qspi_wait_cmd() requires the
TCF/TEF flags to be set to finish. In APM mode these flags are
not used.
Currently it is working for most cases because the TCF/TEF flags
are set by a previous command and not cleared when the
command finishes or a new command is started.

Using two flashes and doing data transfers simoutanously on
them triggers situations where the previous command doesn't
finish with TCF flag set. Then the call to stm32_qspi_wait_cmd()
leads to a timeout and e.g. into a read only UBI file system.
@ST-dot-com
Copy link

This pull request has been refused, the Contribution License Agreement must be signed.

@ST-dot-com ST-dot-com closed this Apr 12, 2022
mcarlin-ds pushed a commit to DatumSystems/linux that referenced this pull request Oct 2, 2022
[ Upstream commit 4224cfd ]

When bringing down the netdevice or system shutdown, a panic can be
triggered while accessing the sysfs path because the device is already
removed.

    [  755.549084] mlx5_core 0000:12:00.1: Shutdown was called
    [  756.404455] mlx5_core 0000:12:00.0: Shutdown was called
    ...
    [  757.937260] BUG: unable to handle kernel NULL pointer dereference at           (null)
    [  758.031397] IP: [<ffffffff8ee11acb>] dma_pool_alloc+0x1ab/0x280

    crash> bt
    ...
    PID: 12649  TASK: ffff8924108f2100  CPU: 1   COMMAND: "amsd"
    ...
     #9 [ffff89240e1a38b0] page_fault at ffffffff8f38c778
        [exception RIP: dma_pool_alloc+0x1ab]
        RIP: ffffffff8ee11acb  RSP: ffff89240e1a3968  RFLAGS: 00010046
        RAX: 0000000000000246  RBX: ffff89243d874100  RCX: 0000000000001000
        RDX: 0000000000000000  RSI: 0000000000000246  RDI: ffff89243d874090
        RBP: ffff89240e1a39c0   R8: 000000000001f080   R9: ffff8905ffc03c00
        R10: ffffffffc04680d4  R11: ffffffff8edde9fd  R12: 00000000000080d0
        R13: ffff89243d874090  R14: ffff89243d874080  R15: 0000000000000000
        ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
    #10 [ffff89240e1a39c8] mlx5_alloc_cmd_msg at ffffffffc04680f3 [mlx5_core]
    #11 [ffff89240e1a3a18] cmd_exec at ffffffffc046ad62 [mlx5_core]
    #12 [ffff89240e1a3ab8] mlx5_cmd_exec at ffffffffc046b4fb [mlx5_core]
    STMicroelectronics#13 [ffff89240e1a3ae8] mlx5_core_access_reg at ffffffffc0475434 [mlx5_core]
    STMicroelectronics#14 [ffff89240e1a3b40] mlx5e_get_fec_caps at ffffffffc04a7348 [mlx5_core]
    STMicroelectronics#15 [ffff89240e1a3bb0] get_fec_supported_advertised at ffffffffc04992bf [mlx5_core]
    STMicroelectronics#16 [ffff89240e1a3c08] mlx5e_get_link_ksettings at ffffffffc049ab36 [mlx5_core]
    STMicroelectronics#17 [ffff89240e1a3ce8] __ethtool_get_link_ksettings at ffffffff8f25db46
    STMicroelectronics#18 [ffff89240e1a3d48] speed_show at ffffffff8f277208
    STMicroelectronics#19 [ffff89240e1a3dd8] dev_attr_show at ffffffff8f0b70e3
    STMicroelectronics#20 [ffff89240e1a3df8] sysfs_kf_seq_show at ffffffff8eedbedf
    STMicroelectronics#21 [ffff89240e1a3e18] kernfs_seq_show at ffffffff8eeda596
    STMicroelectronics#22 [ffff89240e1a3e28] seq_read at ffffffff8ee76d10
    STMicroelectronics#23 [ffff89240e1a3e98] kernfs_fop_read at ffffffff8eedaef5
    STMicroelectronics#24 [ffff89240e1a3ed8] vfs_read at ffffffff8ee4e3ff
    STMicroelectronics#25 [ffff89240e1a3f08] sys_read at ffffffff8ee4f27f
    STMicroelectronics#26 [ffff89240e1a3f50] system_call_fastpath at ffffffff8f395f92

    crash> net_device.state ffff89443b0c0000
      state = 0x5  (__LINK_STATE_START| __LINK_STATE_NOCARRIER)

To prevent this scenario, we also make sure that the netdevice is present.

Signed-off-by: suresh kumar <suresh2514@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
mcarlin-ds pushed a commit to DatumSystems/linux that referenced this pull request Apr 18, 2023
[ Upstream commit 4e264be ]

When a system with E810 with existing VFs gets rebooted the following
hang may be observed.

 Pid 1 is hung in iavf_remove(), part of a network driver:
 PID: 1        TASK: ffff965400e5a340  CPU: 24   COMMAND: "systemd-shutdow"
  #0 [ffffaad04005fa50] __schedule at ffffffff8b3239cb
  #1 [ffffaad04005fae8] schedule at ffffffff8b323e2d
  #2 [ffffaad04005fb00] schedule_hrtimeout_range_clock at ffffffff8b32cebc
  #3 [ffffaad04005fb80] usleep_range_state at ffffffff8b32c930
  #4 [ffffaad04005fbb0] iavf_remove at ffffffffc12b9b4c [iavf]
  #5 [ffffaad04005fbf0] pci_device_remove at ffffffff8add7513
  #6 [ffffaad04005fc10] device_release_driver_internal at ffffffff8af08baa
  #7 [ffffaad04005fc40] pci_stop_bus_device at ffffffff8adcc5fc
  #8 [ffffaad04005fc60] pci_stop_and_remove_bus_device at ffffffff8adcc81e
  #9 [ffffaad04005fc70] pci_iov_remove_virtfn at ffffffff8adf9429
 #10 [ffffaad04005fca8] sriov_disable at ffffffff8adf98e4
 #11 [ffffaad04005fcc8] ice_free_vfs at ffffffffc04bb2c8 [ice]
 #12 [ffffaad04005fd10] ice_remove at ffffffffc04778fe [ice]
 STMicroelectronics#13 [ffffaad04005fd38] ice_shutdown at ffffffffc0477946 [ice]
 STMicroelectronics#14 [ffffaad04005fd50] pci_device_shutdown at ffffffff8add58f1
 STMicroelectronics#15 [ffffaad04005fd70] device_shutdown at ffffffff8af05386
 STMicroelectronics#16 [ffffaad04005fd98] kernel_restart at ffffffff8a92a870
 STMicroelectronics#17 [ffffaad04005fda8] __do_sys_reboot at ffffffff8a92abd6
 STMicroelectronics#18 [ffffaad04005fee0] do_syscall_64 at ffffffff8b317159
 STMicroelectronics#19 [ffffaad04005ff08] __context_tracking_enter at ffffffff8b31b6fc
 STMicroelectronics#20 [ffffaad04005ff18] syscall_exit_to_user_mode at ffffffff8b31b50d
 STMicroelectronics#21 [ffffaad04005ff28] do_syscall_64 at ffffffff8b317169
 STMicroelectronics#22 [ffffaad04005ff50] entry_SYSCALL_64_after_hwframe at ffffffff8b40009b
     RIP: 00007f1baa5c13d7  RSP: 00007fffbcc55a98  RFLAGS: 00000202
     RAX: ffffffffffffffda  RBX: 0000000000000000  RCX: 00007f1baa5c13d7
     RDX: 0000000001234567  RSI: 0000000028121969  RDI: 00000000fee1dead
     RBP: 00007fffbcc55ca0   R8: 0000000000000000   R9: 00007fffbcc54e90
     R10: 00007fffbcc55050  R11: 0000000000000202  R12: 0000000000000005
     R13: 0000000000000000  R14: 00007fffbcc55af0  R15: 0000000000000000
     ORIG_RAX: 00000000000000a9  CS: 0033  SS: 002b

During reboot all drivers PM shutdown callbacks are invoked.
In iavf_shutdown() the adapter state is changed to __IAVF_REMOVE.
In ice_shutdown() the call chain above is executed, which at some point
calls iavf_remove(). However iavf_remove() expects the VF to be in one
of the states __IAVF_RUNNING, __IAVF_DOWN or __IAVF_INIT_FAILED. If
that's not the case it sleeps forever.
So if iavf_shutdown() gets invoked before iavf_remove() the system will
hang indefinitely because the adapter is already in state __IAVF_REMOVE.

Fix this by returning from iavf_remove() if the state is __IAVF_REMOVE,
as we already went through iavf_shutdown().

Fixes: 9745780 ("iavf: Add waiting so the port is initialized in remove")
Fixes: a841733 ("iavf: Fix race condition between iavf_shutdown and iavf_remove")
Reported-by: Marius Cornea <mcornea@redhat.com>
Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
arnopo pushed a commit to arnopo/linux that referenced this pull request Aug 31, 2023
Christoph Biedl reported early OOM on recent kernels:

    swapper: page allocation failure: order:0, mode:0x100(__GFP_ZERO),
nodemask=(null)
    CPU: 0 PID: 0 Comm: swapper Not tainted 6.3.0-rc4+ STMicroelectronics#16
    Hardware name: 9000/785/C3600
    Backtrace:
     [<10408594>] show_stack+0x48/0x5c
     [<10e152d8>] dump_stack_lvl+0x48/0x64
     [<10e15318>] dump_stack+0x24/0x34
     [<105cf7f8>] warn_alloc+0x10c/0x1c8
     [<105d068c>] __alloc_pages+0xbbc/0xcf8
     [<105d0e4c>] __get_free_pages+0x28/0x78
     [<105ad10c>] __pte_alloc_kernel+0x30/0x98
     [<10406934>] set_fixmap+0xec/0xf4
     [<10411ad4>] patch_map.constprop.0+0xa8/0xdc
     [<10411bb0>] __patch_text_multiple+0xa8/0x208
     [<10411d78>] patch_text+0x30/0x48
     [<1041246c>] arch_jump_label_transform+0x90/0xcc
     [<1056f734>] jump_label_update+0xd4/0x184
     [<1056fc9c>] static_key_enable_cpuslocked+0xc0/0x110
     [<1056fd08>] static_key_enable+0x1c/0x2c
     [<1011362c>] init_mem_debugging_and_hardening+0xdc/0xf8
     [<1010141c>] start_kernel+0x5f0/0xa98
     [<10105da8>] start_parisc+0xb8/0xe4

    Mem-Info:
    active_anon:0 inactive_anon:0 isolated_anon:0
     active_file:0 inactive_file:0 isolated_file:0
     unevictable:0 dirty:0 writeback:0
     slab_reclaimable:0 slab_unreclaimable:0
     mapped:0 shmem:0 pagetables:0
     sec_pagetables:0 bounce:0
     kernel_misc_reclaimable:0
     free:0 free_pcp:0 free_cma:0
    Node 0 active_anon:0kB inactive_anon:0kB active_file:0kB
inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB
mapped:0kB dirty:0kB writeback:0kB shmem:0kB
+writeback_tmp:0kB kernel_stack:0kB pagetables:0kB sec_pagetables:0kB
all_unreclaimable? no
    Normal free:0kB boost:0kB min:0kB low:0kB high:0kB
reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB
inactive_file:0kB unevictable:0kB writepending:0kB
+present:1048576kB managed:1039360kB mlocked:0kB bounce:0kB free_pcp:0kB
local_pcp:0kB free_cma:0kB
    lowmem_reserve[]: 0 0
    Normal: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB
0*1024kB 0*2048kB 0*4096kB = 0kB
    0 total pagecache pages
    0 pages in swap cache
    Free swap  = 0kB
    Total swap = 0kB
    262144 pages RAM
    0 pages HighMem/MovableOnly
    2304 pages reserved
    Backtrace:
     [<10411d78>] patch_text+0x30/0x48
     [<1041246c>] arch_jump_label_transform+0x90/0xcc
     [<1056f734>] jump_label_update+0xd4/0x184
     [<1056fc9c>] static_key_enable_cpuslocked+0xc0/0x110
     [<1056fd08>] static_key_enable+0x1c/0x2c
     [<1011362c>] init_mem_debugging_and_hardening+0xdc/0xf8
     [<1010141c>] start_kernel+0x5f0/0xa98
     [<10105da8>] start_parisc+0xb8/0xe4

    Kernel Fault: Code=15 (Data TLB miss fault) at addr 0f7fe3c0
    CPU: 0 PID: 0 Comm: swapper Not tainted 6.3.0-rc4+ STMicroelectronics#16
    Hardware name: 9000/785/C3600

This happens because patching static key code temporarily maps it via
fixmap and if it happens before page allocator is initialized set_fixmap()
cannot allocate memory using pte_alloc_kernel().

Make sure that fixmap page tables are preallocated early so that
pte_offset_kernel() in set_fixmap() never resorts to pte allocation.

Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Helge Deller <deller@gmx.de>
Tested-by: Christoph Biedl <linux-kernel.bfrz@manchmal.in-ulm.de>
Tested-by: John David Anglin <dave.anglin@bell.net>
Cc: <stable@vger.kernel.org> # v6.4+
mcarlin-ds pushed a commit to DatumSystems/linux that referenced this pull request Sep 13, 2024
[ Upstream commit 19ecbe8 ]

If komeda_pipeline_unbound_components() returns -EDEADLK,
it means that a deadlock happened in the locking context.
Currently, komeda is not dealing with the deadlock properly,producing the
following output when CONFIG_DEBUG_WW_MUTEX_SLOWPATH is enabled:

 ------------[ cut here ]------------
[   26.103984] WARNING: CPU: 2 PID: 345 at drivers/gpu/drm/arm/display/komeda/komeda_pipeline_state.c:1248
	       komeda_release_unclaimed_resources+0x13c/0x170
[   26.117453] Modules linked in:
[   26.120511] CPU: 2 PID: 345 Comm: composer@2.1-se Kdump: loaded Tainted: G   W  5.10.110-SE-SDK1.8-dirty STMicroelectronics#16
[   26.131374] Hardware name: Siengine Se1000 Evaluation board (DT)
[   26.137379] pstate: 20400009 (nzCv daif +PAN -UAO -TCO BTYPE=--)
[   26.143385] pc : komeda_release_unclaimed_resources+0x13c/0x170
[   26.149301] lr : komeda_release_unclaimed_resources+0xbc/0x170
[   26.155130] sp : ffff800017b8b8d0
[   26.158442] pmr_save: 000000e0
[   26.161493] x29: ffff800017b8b8d0 x28: ffff000cf2f96200
[   26.166805] x27: ffff000c8f5a8800 x26: 0000000000000000
[   26.172116] x25: 0000000000000038 x24: ffff8000116a0140
[   26.177428] x23: 0000000000000038 x22: ffff000cf2f96200
[   26.182739] x21: ffff000cfc300300 x20: ffff000c8ab77080
[   26.188051] x19: 0000000000000003 x18: 0000000000000000
[   26.193362] x17: 0000000000000000 x16: 0000000000000000
[   26.198672] x15: b400e638f738ba38 x14: 0000000000000000
[   26.203983] x13: 0000000106400a00 x12: 0000000000000000
[   26.209294] x11: 0000000000000000 x10: 0000000000000000
[   26.214604] x9 : ffff800012f80000 x8 : ffff000ca3308000
[   26.219915] x7 : 0000000ff3000000 x6 : ffff80001084034c
[   26.225226] x5 : ffff800017b8bc40 x4 : 000000000000000f
[   26.230536] x3 : ffff000ca3308000 x2 : 0000000000000000
[   26.235847] x1 : 0000000000000000 x0 : ffffffffffffffdd
[   26.241158] Call trace:
[   26.243604] komeda_release_unclaimed_resources+0x13c/0x170
[   26.249175] komeda_crtc_atomic_check+0x68/0xf0
[   26.253706] drm_atomic_helper_check_planes+0x138/0x1f4
[   26.258929] komeda_kms_check+0x284/0x36c
[   26.262939] drm_atomic_check_only+0x40c/0x714
[   26.267381] drm_atomic_nonblocking_commit+0x1c/0x60
[   26.272344] drm_mode_atomic_ioctl+0xa3c/0xb8c
[   26.276787] drm_ioctl_kernel+0xc4/0x120
[   26.280708] drm_ioctl+0x268/0x534
[   26.284109] __arm64_sys_ioctl+0xa8/0xf0
[   26.288030] el0_svc_common.constprop.0+0x80/0x240
[   26.292817] do_el0_svc+0x24/0x90
[   26.296132] el0_svc+0x20/0x30
[   26.299185] el0_sync_handler+0xe8/0xf0
[   26.303018] el0_sync+0x1a4/0x1c0
[   26.306330] irq event stamp: 0
[   26.309384] hardirqs last  enabled at (0): [<0000000000000000>] 0x0
[   26.315650] hardirqs last disabled at (0): [<ffff800010056d34>] copy_process+0x5d0/0x183c
[   26.323825] softirqs last  enabled at (0): [<ffff800010056d34>] copy_process+0x5d0/0x183c
[   26.331997] softirqs last disabled at (0): [<0000000000000000>] 0x0
[   26.338261] ---[ end trace 20ae984fa860184a ]---
[   26.343021] ------------[ cut here ]------------
[   26.347646] WARNING: CPU: 3 PID: 345 at drivers/gpu/drm/drm_modeset_lock.c:228 drm_modeset_drop_locks+0x84/0x90
[   26.357727] Modules linked in:
[   26.360783] CPU: 3 PID: 345 Comm: composer@2.1-se Kdump: loaded Tainted: G   W  5.10.110-SE-SDK1.8-dirty STMicroelectronics#16
[   26.371645] Hardware name: Siengine Se1000 Evaluation board (DT)
[   26.377647] pstate: 20400009 (nzCv daif +PAN -UAO -TCO BTYPE=--)
[   26.383649] pc : drm_modeset_drop_locks+0x84/0x90
[   26.388351] lr : drm_mode_atomic_ioctl+0x860/0xb8c
[   26.393137] sp : ffff800017b8bb10
[   26.396447] pmr_save: 000000e0
[   26.399497] x29: ffff800017b8bb10 x28: 0000000000000001
[   26.404807] x27: 0000000000000038 x26: 0000000000000002
[   26.410115] x25: ffff000cecbefa00 x24: ffff000cf2f96200
[   26.415423] x23: 0000000000000001 x22: 0000000000000018
[   26.420731] x21: 0000000000000001 x20: ffff800017b8bc10
[   26.426039] x19: 0000000000000000 x18: 0000000000000000
[   26.431347] x17: 0000000002e8bf2c x16: 0000000002e94c6b
[   26.436655] x15: 0000000002ea48b9 x14: ffff8000121f0300
[   26.441963] x13: 0000000002ee2ca8 x12: ffff80001129cae0
[   26.447272] x11: ffff800012435000 x10: ffff000ed46b5e88
[   26.452580] x9 : ffff000c9935e600 x8 : 0000000000000000
[   26.457888] x7 : 000000008020001e x6 : 000000008020001f
[   26.463196] x5 : ffff80001085fbe0 x4 : fffffe0033a59f20
[   26.468504] x3 : 000000008020001e x2 : 0000000000000000
[   26.473813] x1 : 0000000000000000 x0 : ffff000c8f596090
[   26.479122] Call trace:
[   26.481566] drm_modeset_drop_locks+0x84/0x90
[   26.485918] drm_mode_atomic_ioctl+0x860/0xb8c
[   26.490359] drm_ioctl_kernel+0xc4/0x120
[   26.494278] drm_ioctl+0x268/0x534
[   26.497677] __arm64_sys_ioctl+0xa8/0xf0
[   26.501598] el0_svc_common.constprop.0+0x80/0x240
[   26.506384] do_el0_svc+0x24/0x90
[   26.509697] el0_svc+0x20/0x30
[   26.512748] el0_sync_handler+0xe8/0xf0
[   26.516580] el0_sync+0x1a4/0x1c0
[   26.519891] irq event stamp: 0
[   26.522943] hardirqs last  enabled at (0): [<0000000000000000>] 0x0
[   26.529207] hardirqs last disabled at (0): [<ffff800010056d34>] copy_process+0x5d0/0x183c
[   26.537379] softirqs last  enabled at (0): [<ffff800010056d34>] copy_process+0x5d0/0x183c
[   26.545550] softirqs last disabled at (0): [<0000000000000000>] 0x0
[   26.551812] ---[ end trace 20ae984fa860184b ]---

According to the call trace information,it can be located to be
WARN_ON(IS_ERR(c_st)) in the komeda_pipeline_unbound_components function;
Then follow the function.
komeda_pipeline_unbound_components
-> komeda_component_get_state_and_set_user
  -> komeda_pipeline_get_state_and_set_crtc
    -> komeda_pipeline_get_state
      ->drm_atomic_get_private_obj_state
        -> drm_atomic_get_private_obj_state
          -> drm_modeset_lock

komeda_pipeline_unbound_components
-> komeda_component_get_state_and_set_user
  -> komeda_component_get_state
    -> drm_atomic_get_private_obj_state
     -> drm_modeset_lock

ret = drm_modeset_lock(&obj->lock, state->acquire_ctx); if (ret)
	return ERR_PTR(ret);
Here it return -EDEADLK.

deal with the deadlock as suggested by [1], using the
function drm_modeset_backoff().
[1] https://docs.kernel.org/gpu/drm-kms.html?highlight=kms#kms-locking

Therefore, handling this problem can be solved
by adding return -EDEADLK back to the drm_modeset_backoff processing flow
in the drm_mode_atomic_ioctl function.

Signed-off-by: baozhu.liu <lucas.liu@siengine.com>
Signed-off-by: menghui.huang <menghui.huang@siengine.com>
Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230804013117.6870-1-menghui.huang@siengine.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
mcarlin-ds pushed a commit to DatumSystems/linux that referenced this pull request Sep 13, 2024
[ Upstream commit e3e82fc ]

When creating ceq_0 during probing irdma, cqp.sc_cqp will be sent as a
cqp_request to cqp->sc_cqp.sq_ring. If the request is pending when
removing the irdma driver or unplugging its aux device, cqp.sc_cqp will be
dereferenced as wrong struct in irdma_free_pending_cqp_request().

  PID: 3669   TASK: ffff88aef892c000  CPU: 28  COMMAND: "kworker/28:0"
   #0 [fffffe0000549e38] crash_nmi_callback at ffffffff810e3a34
   #1 [fffffe0000549e40] nmi_handle at ffffffff810788b2
   #2 [fffffe0000549ea0] default_do_nmi at ffffffff8107938f
   #3 [fffffe0000549eb8] do_nmi at ffffffff81079582
   #4 [fffffe0000549ef0] end_repeat_nmi at ffffffff82e016b4
      [exception RIP: native_queued_spin_lock_slowpath+1291]
      RIP: ffffffff8127e72b  RSP: ffff88aa841ef778  RFLAGS: 00000046
      RAX: 0000000000000000  RBX: ffff88b01f849700  RCX: ffffffff8127e47e
      RDX: 0000000000000000  RSI: 0000000000000004  RDI: ffffffff83857ec0
      RBP: ffff88afe3e4efc8   R8: ffffed15fc7c9dfa   R9: ffffed15fc7c9dfa
      R10: 0000000000000001  R11: ffffed15fc7c9df9  R12: 0000000000740000
      R13: ffff88b01f849708  R14: 0000000000000003  R15: ffffed1603f092e1
      ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0000
  -- <NMI exception stack> --
   #5 [ffff88aa841ef778] native_queued_spin_lock_slowpath at ffffffff8127e72b
   #6 [ffff88aa841ef7b0] _raw_spin_lock_irqsave at ffffffff82c22aa4
   #7 [ffff88aa841ef7c8] __wake_up_common_lock at ffffffff81257363
   #8 [ffff88aa841ef888] irdma_free_pending_cqp_request at ffffffffa0ba12cc [irdma]
   #9 [ffff88aa841ef958] irdma_cleanup_pending_cqp_op at ffffffffa0ba1469 [irdma]
   #10 [ffff88aa841ef9c0] irdma_ctrl_deinit_hw at ffffffffa0b2989f [irdma]
   #11 [ffff88aa841efa28] irdma_remove at ffffffffa0b252df [irdma]
   #12 [ffff88aa841efae8] auxiliary_bus_remove at ffffffff8219afdb
   STMicroelectronics#13 [ffff88aa841efb00] device_release_driver_internal at ffffffff821882e6
   STMicroelectronics#14 [ffff88aa841efb38] bus_remove_device at ffffffff82184278
   STMicroelectronics#15 [ffff88aa841efb88] device_del at ffffffff82179d23
   STMicroelectronics#16 [ffff88aa841efc48] ice_unplug_aux_dev at ffffffffa0eb1c14 [ice]
   STMicroelectronics#17 [ffff88aa841efc68] ice_service_task at ffffffffa0d88201 [ice]
   STMicroelectronics#18 [ffff88aa841efde8] process_one_work at ffffffff811c589a
   STMicroelectronics#19 [ffff88aa841efe60] worker_thread at ffffffff811c71ff
   STMicroelectronics#20 [ffff88aa841eff10] kthread at ffffffff811d87a0
   STMicroelectronics#21 [ffff88aa841eff50] ret_from_fork at ffffffff82e0022f

Fixes: 44d9e52 ("RDMA/irdma: Implement device initialization definitions")
Link: https://lore.kernel.org/r/20231130081415.891006-1-lishifeng@sangfor.com.cn
Suggested-by: "Ismail, Mustafa" <mustafa.ismail@intel.com>
Signed-off-by: Shifeng Li <lishifeng@sangfor.com.cn>
Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
mcarlin-ds pushed a commit to DatumSystems/linux that referenced this pull request Sep 13, 2024
…alid

[ Upstream commit f2f11fc ]

If smb2 request from client is invalid, The following kernel oops could
happen. The patch e2b76ab: "ksmbd: add support for read compound"
leads this issue. When request is invalid, It doesn't set anything in
the response buffer. This patch add missing set invalid parameter error
response.

[  673.085542] ksmbd: cli req too short, len 184 not 142. cmd:5 mid:109
[  673.085580] BUG: kernel NULL pointer dereference, address: 0000000000000000
[  673.085591] #PF: supervisor read access in kernel mode
[  673.085600] #PF: error_code(0x0000) - not-present page
[  673.085608] PGD 0 P4D 0
[  673.085620] Oops: 0000 [#1] PREEMPT SMP NOPTI
[  673.085631] CPU: 3 PID: 1039 Comm: kworker/3:0 Not tainted 6.6.0-rc2-tmt STMicroelectronics#16
[  673.085643] Hardware name: AZW U59/U59, BIOS JTKT001 05/05/2022
[  673.085651] Workqueue: ksmbd-io handle_ksmbd_work [ksmbd]
[  673.085719] RIP: 0010:ksmbd_conn_write+0x68/0xc0 [ksmbd]
[  673.085808] RAX: 0000000000000000 RBX: ffff88811ade4f00 RCX: 0000000000000000
[  673.085817] RDX: 0000000000000000 RSI: ffff88810c2a9780 RDI: ffff88810c2a9ac0
[  673.085826] RBP: ffffc900005e3e00 R08: 0000000000000000 R09: 0000000000000000
[  673.085834] R10: ffffffffa3168160 R11: 63203a64626d736b R12: ffff8881057c8800
[  673.085842] R13: ffff8881057c8820 R14: ffff8882781b2380 R15: ffff8881057c8800
[  673.085852] FS:  0000000000000000(0000) GS:ffff888278180000(0000) knlGS:0000000000000000
[  673.085864] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  673.085872] CR2: 0000000000000000 CR3: 000000015b63c000 CR4: 0000000000350ee0
[  673.085883] Call Trace:
[  673.085890]  <TASK>
[  673.085900]  ? show_regs+0x6a/0x80
[  673.085916]  ? __die+0x25/0x70
[  673.085926]  ? page_fault_oops+0x154/0x4b0
[  673.085938]  ? tick_nohz_tick_stopped+0x18/0x50
[  673.085954]  ? __irq_work_queue_local+0xba/0x140
[  673.085967]  ? do_user_addr_fault+0x30f/0x6c0
[  673.085979]  ? exc_page_fault+0x79/0x180
[  673.085992]  ? asm_exc_page_fault+0x27/0x30
[  673.086009]  ? ksmbd_conn_write+0x68/0xc0 [ksmbd]
[  673.086067]  ? ksmbd_conn_write+0x46/0xc0 [ksmbd]
[  673.086123]  handle_ksmbd_work+0x28d/0x4b0 [ksmbd]
[  673.086177]  process_one_work+0x178/0x350
[  673.086193]  ? __pfx_worker_thread+0x10/0x10
[  673.086202]  worker_thread+0x2f3/0x420
[  673.086210]  ? _raw_spin_unlock_irqrestore+0x27/0x50
[  673.086222]  ? __pfx_worker_thread+0x10/0x10
[  673.086230]  kthread+0x103/0x140
[  673.086242]  ? __pfx_kthread+0x10/0x10
[  673.086253]  ret_from_fork+0x39/0x60
[  673.086263]  ? __pfx_kthread+0x10/0x10
[  673.086274]  ret_from_fork_asm+0x1b/0x30

Fixes: e2b76ab ("ksmbd: add support for read compound")
Reported-by: Tom Talpey <tom@talpey.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants