-
Notifications
You must be signed in to change notification settings - Fork 53.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[hackers.mu][zero content-prevent cryptographic leaks] #565
Conversation
Hi @yasirmx! Thanks for your contribution to the Linux kernel! Linux kernel development happens on mailing lists, rather than on GitHub - this GitHub repository is a read-only mirror that isn't used for accepting contributions. So that your change can become part of Linux, please email it to us as a patch. Sending patches isn't quite as simple as sending a pull request, but fortunately it is a well documented process. Here's what to do:
How do I format my contribution?The Linux kernel community is notoriously picky about how contributions are formatted and sent. Fortunately, they have documented their expectations. Firstly, all contributions need to be formatted as patches. A patch is a plain text document showing the change you want to make to the code, and documenting why it is a good idea. You can create patches with Secondly, patches need 'commit messages', which is the human-friendly documentation explaining what the change is and why it's necessary. Thirdly, changes have some technical requirements. There is a Linux kernel coding style, and there are licensing requirements you need to comply with. Both of these are documented in the Submitting Patches documentation that is part of the kernel. Note that you will almost certainly have to modify your existing git commits to satisfy these requirements. Don't worry: there are many guides on the internet for doing this. Who do I send my contribution to?The Linux kernel is composed of a number of subsystems. These subsystems are maintained by different people, and have different mailing lists where they discuss proposed changes. If you don't already know what subsystem your change belongs to, the
Make sure that your list of recipients includes a mailing list. If you can't find a more specific mailing list, then LKML - the Linux Kernel Mailing List - is the place to send your patches. It's not usually necessary to subscribe to the mailing list before you send the patches, but if you're interested in kernel development, subscribing to a subsystem mailing list is a good idea. (At this point, you probably don't need to subscribe to LKML - it is a very high traffic list with about a thousand messages per day, which is often not useful for beginners.) How do I send my contribution?Use For more information about using How do I get help if I'm stuck?Firstly, don't get discouraged! There are an enormous number of resources on the internet, and many kernel developers who would like to see you succeed. Many issues - especially about how to use certain tools - can be resolved by using your favourite internet search engine. If you can't find an answer, there are a few places you can turn:
If you get really, really stuck, you could try the owners of this bot, @daxtens and @ajdlinux. Please be aware that we do have full-time jobs, so we are almost certainly the slowest way to get answers! I sent my patch - now what?You wait. You can check that your email has been received by checking the mailing list archives for the mailing list you sent your patch to. Messages may not be received instantly, so be patient. Kernel developers are generally very busy people, so it may take a few weeks before your patch is looked at. Then, you keep waiting. Three things may happen:
Further information
Happy hacking! This message was posted by a bot - if you have any questions or suggestions, please talk to my owners, @ajdlinux and @daxtens, or raise an issue at https://github.com/ajdlinux/KernelPRBot. |
rust: gpio: add support for registering irq chips with gpio chip.
When an MST connector stays enabled during a commit the connector's MST state needs to be added to the atomic state, but the corresponding MST payload allocation shouldn't be set for deletion; fix such modesets by ensuring the above even if the connector was already enabled before the modeset. The issue led to the following: [ 761.992923] i915 0000:00:02.0: drm_WARN_ON(payload->delete) [ 761.992949] WARNING: CPU: 6 PID: 1401 at drivers/gpu/drm/display/drm_dp_mst_topology.c:4221 drm_dp_atomic_find_time_slots+0x236/0x280 [drm_display_helper] [ 761.992955] Modules linked in: snd_hda_intel i915 drm_buddy drm_display_helper drm_kms_helper ttm drm snd_hda_codec_hdmi snd_intel_dspcfg snd_hda_codec snd_hwdep snd_hda_core snd_pcm prime_numbers i2c_algo_bit syscopyarea sysfillrect sysimgblt fb_sys_fops x86_pkg_temp_thermal cdc_ether coretemp crct10dif_pclmul usbnet crc32_pclmul mii ghash_clmulni_intel e1000e mei_me ptp i2c_i801 pps_core mei i2c_smbus intel_lpss_pci fuse [last unloaded: drm] [ 761.992986] CPU: 6 PID: 1401 Comm: testdisplay Tainted: G U 6.0.0-rc4-imre+ torvalds#565 [ 761.992989] Hardware name: Intel Corporation Alder Lake Client Platform/AlderLake-P DDR5 RVP, BIOS ADLPFWI1.R00.3135.A00.2203251419 03/25/2022 [ 761.992990] RIP: 0010:drm_dp_atomic_find_time_slots+0x236/0x280 [drm_display_helper] [ 761.992994] Code: 4c 8b 67 50 4d 85 e4 75 03 4c 8b 27 e8 03 28 4e e1 48 c7 c1 8b 26 2c a0 4c 89 e2 48 c7 c7 a8 26 2c a0 48 89 c6 e8 31 d5 88 e1 <0f> 0b 49 8b 85 d0 00 00 00 4c 89 fa 48 c7 c6 a0 41 2c a0 48 8b 78 [ 761.992995] RSP: 0018:ffffc9000177ba60 EFLAGS: 00010286 [ 761.992998] RAX: 0000000000000000 RBX: ffff88810d2f1540 RCX: 0000000000000000 [ 761.992999] RDX: 0000000000000001 RSI: ffffffff82368a25 RDI: 00000000ffffffff [ 761.993000] RBP: ffff888142299d80 R08: ffff8884adbfdfe8 R09: 00000000ffefffff [ 761.993001] R10: ffff8884a6bfe000 R11: ffff8884ac443c30 R12: ffff888102972f90 [ 761.993002] R13: ffff8881163e2cf0 R14: 00000000000003ac R15: ffff88810c501000 [ 761.993003] FS: 00007f81e4c459c0(0000) GS:ffff888496500000(0000) knlGS:0000000000000000 [ 761.993004] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 761.993005] CR2: 0000555dac962a98 CR3: 0000000123a34006 CR4: 0000000000770ee0 [ 761.993006] PKRU: 55555554 [ 761.993007] Call Trace: [ 761.993009] <TASK> [ 761.993012] intel_dp_mst_compute_config+0x19a/0x350 [i915] [ 761.993090] intel_atomic_check+0xf37/0x3180 [i915] [ 761.993168] drm_atomic_check_only+0x5d3/0xa60 [drm] [ 761.993182] drm_atomic_commit+0x56/0xc0 [drm] [ 761.993192] ? drm_plane_get_damage_clips.cold+0x1c/0x1c [drm] [ 761.993204] drm_atomic_helper_set_config+0x78/0xc0 [drm_kms_helper] [ 761.993214] drm_mode_setcrtc+0x1ed/0x750 [drm] [ 761.993232] ? drm_mode_getcrtc+0x180/0x180 [drm] [ 761.993241] drm_ioctl_kernel+0xb5/0x150 [drm] [ 761.993252] drm_ioctl+0x203/0x3d0 [drm] [ 761.993261] ? drm_mode_getcrtc+0x180/0x180 [drm] [ 761.993276] __x64_sys_ioctl+0x8a/0xb0 [ 761.993281] do_syscall_64+0x38/0x90 [ 761.993285] entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 761.993287] RIP: 0033:0x7f81e551aaff [ 761.993288] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <41> 89 c0 3d 00 f0 ff ff 77 1f 48 8b 44 24 18 64 48 2b 04 25 28 00 [ 761.993290] RSP: 002b:00007fff4304af10 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ 761.993292] RAX: ffffffffffffffda RBX: 00007fff4304afa0 RCX: 00007f81e551aaff [ 761.993293] RDX: 00007fff4304afa0 RSI: 00000000c06864a2 RDI: 0000000000000004 [ 761.993294] RBP: 00000000c06864a2 R08: 0000000000000000 R09: 0000555dac8a9c68 [ 761.993294] R10: 0000000000000000 R11: 0000000000000246 R12: 00000000000008c4 [ 761.993295] R13: 0000000000000004 R14: 0000555dac8a9c68 R15: 00007fff4304b098 [ 761.993301] </TASK> Fixes: 083351e ("drm/display/dp_mst: Fix modeset tracking in drm_dp_atomic_release_vcpi_slots()") Testcase: igt@testdisplay Cc: Lyude Paul <lyude@redhat.com> Signed-off-by: Imre Deak <imre.deak@intel.com>
When an MST connector stays enabled during a commit the connector's MST state needs to be added to the atomic state, but the corresponding MST payload allocation shouldn't be set for deletion; fix such modesets by ensuring the above even if the connector was already enabled before the modeset. The issue led to the following: [ 761.992923] i915 0000:00:02.0: drm_WARN_ON(payload->delete) [ 761.992949] WARNING: CPU: 6 PID: 1401 at drivers/gpu/drm/display/drm_dp_mst_topology.c:4221 drm_dp_atomic_find_time_slots+0x236/0x280 [drm_display_helper] [ 761.992955] Modules linked in: snd_hda_intel i915 drm_buddy drm_display_helper drm_kms_helper ttm drm snd_hda_codec_hdmi snd_intel_dspcfg snd_hda_codec snd_hwdep snd_hda_core snd_pcm prime_numbers i2c_algo_bit syscopyarea sysfillrect sysimgblt fb_sys_fops x86_pkg_temp_thermal cdc_ether coretemp crct10dif_pclmul usbnet crc32_pclmul mii ghash_clmulni_intel e1000e mei_me ptp i2c_i801 pps_core mei i2c_smbus intel_lpss_pci fuse [last unloaded: drm] [ 761.992986] CPU: 6 PID: 1401 Comm: testdisplay Tainted: G U 6.0.0-rc4-imre+ torvalds#565 [ 761.992989] Hardware name: Intel Corporation Alder Lake Client Platform/AlderLake-P DDR5 RVP, BIOS ADLPFWI1.R00.3135.A00.2203251419 03/25/2022 [ 761.992990] RIP: 0010:drm_dp_atomic_find_time_slots+0x236/0x280 [drm_display_helper] [ 761.992994] Code: 4c 8b 67 50 4d 85 e4 75 03 4c 8b 27 e8 03 28 4e e1 48 c7 c1 8b 26 2c a0 4c 89 e2 48 c7 c7 a8 26 2c a0 48 89 c6 e8 31 d5 88 e1 <0f> 0b 49 8b 85 d0 00 00 00 4c 89 fa 48 c7 c6 a0 41 2c a0 48 8b 78 [ 761.992995] RSP: 0018:ffffc9000177ba60 EFLAGS: 00010286 [ 761.992998] RAX: 0000000000000000 RBX: ffff88810d2f1540 RCX: 0000000000000000 [ 761.992999] RDX: 0000000000000001 RSI: ffffffff82368a25 RDI: 00000000ffffffff [ 761.993000] RBP: ffff888142299d80 R08: ffff8884adbfdfe8 R09: 00000000ffefffff [ 761.993001] R10: ffff8884a6bfe000 R11: ffff8884ac443c30 R12: ffff888102972f90 [ 761.993002] R13: ffff8881163e2cf0 R14: 00000000000003ac R15: ffff88810c501000 [ 761.993003] FS: 00007f81e4c459c0(0000) GS:ffff888496500000(0000) knlGS:0000000000000000 [ 761.993004] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 761.993005] CR2: 0000555dac962a98 CR3: 0000000123a34006 CR4: 0000000000770ee0 [ 761.993006] PKRU: 55555554 [ 761.993007] Call Trace: [ 761.993009] <TASK> [ 761.993012] intel_dp_mst_compute_config+0x19a/0x350 [i915] [ 761.993090] intel_atomic_check+0xf37/0x3180 [i915] [ 761.993168] drm_atomic_check_only+0x5d3/0xa60 [drm] [ 761.993182] drm_atomic_commit+0x56/0xc0 [drm] [ 761.993192] ? drm_plane_get_damage_clips.cold+0x1c/0x1c [drm] [ 761.993204] drm_atomic_helper_set_config+0x78/0xc0 [drm_kms_helper] [ 761.993214] drm_mode_setcrtc+0x1ed/0x750 [drm] [ 761.993232] ? drm_mode_getcrtc+0x180/0x180 [drm] [ 761.993241] drm_ioctl_kernel+0xb5/0x150 [drm] [ 761.993252] drm_ioctl+0x203/0x3d0 [drm] [ 761.993261] ? drm_mode_getcrtc+0x180/0x180 [drm] [ 761.993276] __x64_sys_ioctl+0x8a/0xb0 [ 761.993281] do_syscall_64+0x38/0x90 [ 761.993285] entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 761.993287] RIP: 0033:0x7f81e551aaff [ 761.993288] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <41> 89 c0 3d 00 f0 ff ff 77 1f 48 8b 44 24 18 64 48 2b 04 25 28 00 [ 761.993290] RSP: 002b:00007fff4304af10 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ 761.993292] RAX: ffffffffffffffda RBX: 00007fff4304afa0 RCX: 00007f81e551aaff [ 761.993293] RDX: 00007fff4304afa0 RSI: 00000000c06864a2 RDI: 0000000000000004 [ 761.993294] RBP: 00000000c06864a2 R08: 0000000000000000 R09: 0000555dac8a9c68 [ 761.993294] R10: 0000000000000000 R11: 0000000000000246 R12: 00000000000008c4 [ 761.993295] R13: 0000000000000004 R14: 0000555dac8a9c68 R15: 00007fff4304b098 [ 761.993301] </TASK> Fixes: 083351e ("drm/display/dp_mst: Fix modeset tracking in drm_dp_atomic_release_vcpi_slots()") Testcase: igt@testdisplay Cc: Lyude Paul <lyude@redhat.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220907142542.1681994-1-imre.deak@intel.com
When an MST connector stays enabled during a commit the connector's MST state needs to be added to the atomic state, but the corresponding MST payload allocation shouldn't be set for deletion; fix such modesets by ensuring the above even if the connector was already enabled before the modeset. The issue led to the following: [ 761.992923] i915 0000:00:02.0: drm_WARN_ON(payload->delete) [ 761.992949] WARNING: CPU: 6 PID: 1401 at drivers/gpu/drm/display/drm_dp_mst_topology.c:4221 drm_dp_atomic_find_time_slots+0x236/0x280 [drm_display_helper] [ 761.992955] Modules linked in: snd_hda_intel i915 drm_buddy drm_display_helper drm_kms_helper ttm drm snd_hda_codec_hdmi snd_intel_dspcfg snd_hda_codec snd_hwdep snd_hda_core snd_pcm prime_numbers i2c_algo_bit syscopyarea sysfillrect sysimgblt fb_sys_fops x86_pkg_temp_thermal cdc_ether coretemp crct10dif_pclmul usbnet crc32_pclmul mii ghash_clmulni_intel e1000e mei_me ptp i2c_i801 pps_core mei i2c_smbus intel_lpss_pci fuse [last unloaded: drm] [ 761.992986] CPU: 6 PID: 1401 Comm: testdisplay Tainted: G U 6.0.0-rc4-imre+ torvalds#565 [ 761.992989] Hardware name: Intel Corporation Alder Lake Client Platform/AlderLake-P DDR5 RVP, BIOS ADLPFWI1.R00.3135.A00.2203251419 03/25/2022 [ 761.992990] RIP: 0010:drm_dp_atomic_find_time_slots+0x236/0x280 [drm_display_helper] [ 761.992994] Code: 4c 8b 67 50 4d 85 e4 75 03 4c 8b 27 e8 03 28 4e e1 48 c7 c1 8b 26 2c a0 4c 89 e2 48 c7 c7 a8 26 2c a0 48 89 c6 e8 31 d5 88 e1 <0f> 0b 49 8b 85 d0 00 00 00 4c 89 fa 48 c7 c6 a0 41 2c a0 48 8b 78 [ 761.992995] RSP: 0018:ffffc9000177ba60 EFLAGS: 00010286 [ 761.992998] RAX: 0000000000000000 RBX: ffff88810d2f1540 RCX: 0000000000000000 [ 761.992999] RDX: 0000000000000001 RSI: ffffffff82368a25 RDI: 00000000ffffffff [ 761.993000] RBP: ffff888142299d80 R08: ffff8884adbfdfe8 R09: 00000000ffefffff [ 761.993001] R10: ffff8884a6bfe000 R11: ffff8884ac443c30 R12: ffff888102972f90 [ 761.993002] R13: ffff8881163e2cf0 R14: 00000000000003ac R15: ffff88810c501000 [ 761.993003] FS: 00007f81e4c459c0(0000) GS:ffff888496500000(0000) knlGS:0000000000000000 [ 761.993004] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 761.993005] CR2: 0000555dac962a98 CR3: 0000000123a34006 CR4: 0000000000770ee0 [ 761.993006] PKRU: 55555554 [ 761.993007] Call Trace: [ 761.993009] <TASK> [ 761.993012] intel_dp_mst_compute_config+0x19a/0x350 [i915] [ 761.993090] intel_atomic_check+0xf37/0x3180 [i915] [ 761.993168] drm_atomic_check_only+0x5d3/0xa60 [drm] [ 761.993182] drm_atomic_commit+0x56/0xc0 [drm] [ 761.993192] ? drm_plane_get_damage_clips.cold+0x1c/0x1c [drm] [ 761.993204] drm_atomic_helper_set_config+0x78/0xc0 [drm_kms_helper] [ 761.993214] drm_mode_setcrtc+0x1ed/0x750 [drm] [ 761.993232] ? drm_mode_getcrtc+0x180/0x180 [drm] [ 761.993241] drm_ioctl_kernel+0xb5/0x150 [drm] [ 761.993252] drm_ioctl+0x203/0x3d0 [drm] [ 761.993261] ? drm_mode_getcrtc+0x180/0x180 [drm] [ 761.993276] __x64_sys_ioctl+0x8a/0xb0 [ 761.993281] do_syscall_64+0x38/0x90 [ 761.993285] entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 761.993287] RIP: 0033:0x7f81e551aaff [ 761.993288] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <41> 89 c0 3d 00 f0 ff ff 77 1f 48 8b 44 24 18 64 48 2b 04 25 28 00 [ 761.993290] RSP: 002b:00007fff4304af10 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ 761.993292] RAX: ffffffffffffffda RBX: 00007fff4304afa0 RCX: 00007f81e551aaff [ 761.993293] RDX: 00007fff4304afa0 RSI: 00000000c06864a2 RDI: 0000000000000004 [ 761.993294] RBP: 00000000c06864a2 R08: 0000000000000000 R09: 0000555dac8a9c68 [ 761.993294] R10: 0000000000000000 R11: 0000000000000246 R12: 00000000000008c4 [ 761.993295] R13: 0000000000000004 R14: 0000555dac8a9c68 R15: 00007fff4304b098 [ 761.993301] </TASK> Fixes: 083351e ("drm/display/dp_mst: Fix modeset tracking in drm_dp_atomic_release_vcpi_slots()") Testcase: igt@testdisplay Cc: Lyude Paul <lyude@redhat.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220907142542.1681994-1-imre.deak@intel.com (cherry picked from commit 5d832b6)
Move GMU mutex initialization earlier to make sure that it is always initialized. a6xx_destroy can be called from ther failure path before GMU initialization. This fixes the following backtrace: ------------[ cut here ]------------ DEBUG_LOCKS_WARN_ON(lock->magic != lock) WARNING: CPU: 0 PID: 58 at kernel/locking/mutex.c:582 __mutex_lock+0x1ec/0x3d0 Modules linked in: CPU: 0 PID: 58 Comm: kworker/u16:1 Not tainted 6.3.0-rc5-00155-g187c06436519 torvalds#565 Hardware name: Qualcomm Technologies, Inc. SM8350 HDK (DT) Workqueue: events_unbound deferred_probe_work_func pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : __mutex_lock+0x1ec/0x3d0 lr : __mutex_lock+0x1ec/0x3d0 sp : ffff800008993620 x29: ffff800008993620 x28: 0000000000000002 x27: ffff47b253c52800 x26: 0000000001000606 x25: ffff47b240bb2810 x24: fffffffffffffff4 x23: 0000000000000000 x22: ffffc38bba15ac14 x21: 0000000000000002 x20: ffff800008993690 x19: ffff47b2430cc668 x18: fffffffffffe98f0 x17: 6f74616c75676572 x16: 20796d6d75642067 x15: 0000000000000038 x14: 0000000000000000 x13: ffffc38bbba050b8 x12: 0000000000000666 x11: 0000000000000222 x10: ffffc38bbba603e8 x9 : ffffc38bbba050b8 x8 : 00000000ffffefff x7 : ffffc38bbba5d0b8 x6 : 0000000000000222 x5 : 000000000000bff4 x4 : 40000000fffff222 x3 : 0000000000000000 x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff47b240cb1880 Call trace: __mutex_lock+0x1ec/0x3d0 mutex_lock_nested+0x2c/0x38 a6xx_destroy+0xa0/0x138 a6xx_gpu_init+0x41c/0x618 adreno_bind+0x188/0x290 component_bind_all+0x118/0x248 msm_drm_bind+0x1c0/0x670 try_to_bring_up_aggregate_device+0x164/0x1d0 __component_add+0xa8/0x16c component_add+0x14/0x20 dsi_dev_attach+0x20/0x2c dsi_host_attach+0x9c/0x144 devm_mipi_dsi_attach+0x34/0xac lt9611uxc_attach_dsi.isra.0+0x84/0xfc lt9611uxc_probe+0x5b8/0x67c i2c_device_probe+0x1ac/0x358 really_probe+0x148/0x2ac __driver_probe_device+0x78/0xe0 driver_probe_device+0x3c/0x160 __device_attach_driver+0xb8/0x138 bus_for_each_drv+0x84/0xe0 __device_attach+0x9c/0x188 device_initial_probe+0x14/0x20 bus_probe_device+0xac/0xb0 deferred_probe_work_func+0x8c/0xc8 process_one_work+0x2bc/0x594 worker_thread+0x228/0x438 kthread+0x108/0x10c ret_from_fork+0x10/0x20 irq event stamp: 299345 hardirqs last enabled at (299345): [<ffffc38bb9ba61e4>] put_cpu_partial+0x1c8/0x22c hardirqs last disabled at (299344): [<ffffc38bb9ba61dc>] put_cpu_partial+0x1c0/0x22c softirqs last enabled at (296752): [<ffffc38bb9890434>] _stext+0x434/0x4e8 softirqs last disabled at (296741): [<ffffc38bb989669c>] ____do_softirq+0x10/0x1c ---[ end trace 0000000000000000 ]--- Fixes: 4cd15a3 ("drm/msm/a6xx: Make GPU destroy a bit safer") Cc: Douglas Anderson <dianders@chromium.org> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Move GMU mutex initialization earlier to make sure that it is always initialized. a6xx_destroy can be called from ther failure path before GMU initialization. This fixes the following backtrace: ------------[ cut here ]------------ DEBUG_LOCKS_WARN_ON(lock->magic != lock) WARNING: CPU: 0 PID: 58 at kernel/locking/mutex.c:582 __mutex_lock+0x1ec/0x3d0 Modules linked in: CPU: 0 PID: 58 Comm: kworker/u16:1 Not tainted 6.3.0-rc5-00155-g187c06436519 #565 Hardware name: Qualcomm Technologies, Inc. SM8350 HDK (DT) Workqueue: events_unbound deferred_probe_work_func pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : __mutex_lock+0x1ec/0x3d0 lr : __mutex_lock+0x1ec/0x3d0 sp : ffff800008993620 x29: ffff800008993620 x28: 0000000000000002 x27: ffff47b253c52800 x26: 0000000001000606 x25: ffff47b240bb2810 x24: fffffffffffffff4 x23: 0000000000000000 x22: ffffc38bba15ac14 x21: 0000000000000002 x20: ffff800008993690 x19: ffff47b2430cc668 x18: fffffffffffe98f0 x17: 6f74616c75676572 x16: 20796d6d75642067 x15: 0000000000000038 x14: 0000000000000000 x13: ffffc38bbba050b8 x12: 0000000000000666 x11: 0000000000000222 x10: ffffc38bbba603e8 x9 : ffffc38bbba050b8 x8 : 00000000ffffefff x7 : ffffc38bbba5d0b8 x6 : 0000000000000222 x5 : 000000000000bff4 x4 : 40000000fffff222 x3 : 0000000000000000 x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff47b240cb1880 Call trace: __mutex_lock+0x1ec/0x3d0 mutex_lock_nested+0x2c/0x38 a6xx_destroy+0xa0/0x138 a6xx_gpu_init+0x41c/0x618 adreno_bind+0x188/0x290 component_bind_all+0x118/0x248 msm_drm_bind+0x1c0/0x670 try_to_bring_up_aggregate_device+0x164/0x1d0 __component_add+0xa8/0x16c component_add+0x14/0x20 dsi_dev_attach+0x20/0x2c dsi_host_attach+0x9c/0x144 devm_mipi_dsi_attach+0x34/0xac lt9611uxc_attach_dsi.isra.0+0x84/0xfc lt9611uxc_probe+0x5b8/0x67c i2c_device_probe+0x1ac/0x358 really_probe+0x148/0x2ac __driver_probe_device+0x78/0xe0 driver_probe_device+0x3c/0x160 __device_attach_driver+0xb8/0x138 bus_for_each_drv+0x84/0xe0 __device_attach+0x9c/0x188 device_initial_probe+0x14/0x20 bus_probe_device+0xac/0xb0 deferred_probe_work_func+0x8c/0xc8 process_one_work+0x2bc/0x594 worker_thread+0x228/0x438 kthread+0x108/0x10c ret_from_fork+0x10/0x20 irq event stamp: 299345 hardirqs last enabled at (299345): [<ffffc38bb9ba61e4>] put_cpu_partial+0x1c8/0x22c hardirqs last disabled at (299344): [<ffffc38bb9ba61dc>] put_cpu_partial+0x1c0/0x22c softirqs last enabled at (296752): [<ffffc38bb9890434>] _stext+0x434/0x4e8 softirqs last disabled at (296741): [<ffffc38bb989669c>] ____do_softirq+0x10/0x1c ---[ end trace 0000000000000000 ]--- Fixes: 4cd15a3 ("drm/msm/a6xx: Make GPU destroy a bit safer") Cc: Douglas Anderson <dianders@chromium.org> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Douglas Anderson <dianders@chromium.org> Patchwork: https://patchwork.freedesktop.org/patch/531540/ Signed-off-by: Rob Clark <robdclark@chromium.org>
[ Upstream commit 12abd73 ] Move GMU mutex initialization earlier to make sure that it is always initialized. a6xx_destroy can be called from ther failure path before GMU initialization. This fixes the following backtrace: ------------[ cut here ]------------ DEBUG_LOCKS_WARN_ON(lock->magic != lock) WARNING: CPU: 0 PID: 58 at kernel/locking/mutex.c:582 __mutex_lock+0x1ec/0x3d0 Modules linked in: CPU: 0 PID: 58 Comm: kworker/u16:1 Not tainted 6.3.0-rc5-00155-g187c06436519 torvalds#565 Hardware name: Qualcomm Technologies, Inc. SM8350 HDK (DT) Workqueue: events_unbound deferred_probe_work_func pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : __mutex_lock+0x1ec/0x3d0 lr : __mutex_lock+0x1ec/0x3d0 sp : ffff800008993620 x29: ffff800008993620 x28: 0000000000000002 x27: ffff47b253c52800 x26: 0000000001000606 x25: ffff47b240bb2810 x24: fffffffffffffff4 x23: 0000000000000000 x22: ffffc38bba15ac14 x21: 0000000000000002 x20: ffff800008993690 x19: ffff47b2430cc668 x18: fffffffffffe98f0 x17: 6f74616c75676572 x16: 20796d6d75642067 x15: 0000000000000038 x14: 0000000000000000 x13: ffffc38bbba050b8 x12: 0000000000000666 x11: 0000000000000222 x10: ffffc38bbba603e8 x9 : ffffc38bbba050b8 x8 : 00000000ffffefff x7 : ffffc38bbba5d0b8 x6 : 0000000000000222 x5 : 000000000000bff4 x4 : 40000000fffff222 x3 : 0000000000000000 x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff47b240cb1880 Call trace: __mutex_lock+0x1ec/0x3d0 mutex_lock_nested+0x2c/0x38 a6xx_destroy+0xa0/0x138 a6xx_gpu_init+0x41c/0x618 adreno_bind+0x188/0x290 component_bind_all+0x118/0x248 msm_drm_bind+0x1c0/0x670 try_to_bring_up_aggregate_device+0x164/0x1d0 __component_add+0xa8/0x16c component_add+0x14/0x20 dsi_dev_attach+0x20/0x2c dsi_host_attach+0x9c/0x144 devm_mipi_dsi_attach+0x34/0xac lt9611uxc_attach_dsi.isra.0+0x84/0xfc lt9611uxc_probe+0x5b8/0x67c i2c_device_probe+0x1ac/0x358 really_probe+0x148/0x2ac __driver_probe_device+0x78/0xe0 driver_probe_device+0x3c/0x160 __device_attach_driver+0xb8/0x138 bus_for_each_drv+0x84/0xe0 __device_attach+0x9c/0x188 device_initial_probe+0x14/0x20 bus_probe_device+0xac/0xb0 deferred_probe_work_func+0x8c/0xc8 process_one_work+0x2bc/0x594 worker_thread+0x228/0x438 kthread+0x108/0x10c ret_from_fork+0x10/0x20 irq event stamp: 299345 hardirqs last enabled at (299345): [<ffffc38bb9ba61e4>] put_cpu_partial+0x1c8/0x22c hardirqs last disabled at (299344): [<ffffc38bb9ba61dc>] put_cpu_partial+0x1c0/0x22c softirqs last enabled at (296752): [<ffffc38bb9890434>] _stext+0x434/0x4e8 softirqs last disabled at (296741): [<ffffc38bb989669c>] ____do_softirq+0x10/0x1c ---[ end trace 0000000000000000 ]--- Fixes: 4cd15a3 ("drm/msm/a6xx: Make GPU destroy a bit safer") Cc: Douglas Anderson <dianders@chromium.org> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Douglas Anderson <dianders@chromium.org> Patchwork: https://patchwork.freedesktop.org/patch/531540/ Signed-off-by: Rob Clark <robdclark@chromium.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Geert reports that the following NULL pointer dereference happens for him after commit 49d7d58 ("drm/ssd130x: Don't allocate buffers on each plane update"): [drm] Initialized ssd130x 1.0.0 20220131 for 0-003c on minor 0 ssd130x-i2c 0-003c: [drm] surface width(128), height(32), bpp(1) and format(R1 little-endian (0x20203152)) Unable to handle kernel NULL pointer dereference at virtual address 00000000 Oops [#1] CPU: 0 PID: 1 Comm: swapper Not tainted 6.5.0-rc1-orangecrab-02219-g0a529a1e4bf4 torvalds#565 epc : ssd130x_update_rect.isra.0+0x13c/0x340 ra : ssd130x_update_rect.isra.0+0x2bc/0x340 ... status: 00000120 badaddr: 00000000 cause: 0000000f [<c0303d90>] ssd130x_update_rect.isra.0+0x13c/0x340 [<c0304200>] ssd130x_primary_plane_helper_atomic_update+0x26c/0x284 [<c02f8d54>] drm_atomic_helper_commit_planes+0xfc/0x27c [<c02f9314>] drm_atomic_helper_commit_tail+0x5c/0xb4 [<c02f94fc>] commit_tail+0x190/0x1b8 [<c02f99fc>] drm_atomic_helper_commit+0x194/0x1c0 [<c02c5d00>] drm_atomic_commit+0xa4/0xe4 [<c02cce40>] drm_client_modeset_commit_atomic+0x244/0x278 [<c02ccef0>] drm_client_modeset_commit_locked+0x7c/0x1bc [<c02cd064>] drm_client_modeset_commit+0x34/0x64 [<c0301a78>] __drm_fb_helper_restore_fbdev_mode_unlocked+0xc4/0xe8 [<c0303424>] drm_fb_helper_set_par+0x38/0x58 [<c027c410>] fbcon_init+0x294/0x534 ... The problem is that fbcon calls fbcon_init() which triggers a DRM modeset and this leads to drm_atomic_helper_commit_planes() attempting to commit the atomic state for all planes, even the ones whose CRTC is not enabled. Since the primary plane buffer is allocated in the encoder .atomic_enable callback, this happens after that initial modeset commit and leads to the mentioned NULL pointer dereference. Fix this by allocating the buffers in the struct drm_plane_helper_funcs .begin_fb_access callback and free them in to .end_fb_access callback, instead of doing it when the encoder is enabled. Fixes: 49d7d58 ("drm/ssd130x: Don't allocate buffers on each plane update") Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Javier Martinez Canillas <javierm@redhat.com> writes: > Geert Uytterhoeven <geert@linux-m68k.org> writes: > > Hello Geert and Thomas, > >> Hi Thomas, >> >> On Mon, Jul 17, 2023 at 10:48 AM Thomas Zimmermann <tzimmermann@suse.de> wrote: > > [...] > >>> >>> After some discussion on IRC, I'd suggest to allocate the buffer >>> somewhere within probe. So it will always be there when the plane code runs. >>> >>> A full fix would be to allocate the buffer memory as part of the plane >>> state and/or the plane's atomic_check. That's a bit more complicated if >>> you want to shared the buffer memory across plane updates. >> >> Note that actually two buffers are involved: data_array (monochrome, >> needed for each update), and buffer (R8, only needed when converting >> from XR24 to R1). >> >> For the former, I agree, as it's always needed. >> For the latter, I'm afraid it would set a bad example: while allocating >> a possibly-unused buffer doesn't hurt for small displays, it would >> mean wasting 1 MiB in e.g. the repaper driver (once it has gained >> support for R1 ;^). >> > > Maybe another option is to allocate on the struct drm_mode_config_funcs > .fb_create callback? That way, we can get the mode_cmd->pixel_format and > determine if only "data_array" buffer must be allocated or also "buffer". > Something like the following patch: From 307bf062c9a999693a4a68c6740ec55b81f796b8 Mon Sep 17 00:00:00 2001 From: Javier Martinez Canillas <javierm@redhat.com> Date: Mon, 17 Jul 2023 12:30:35 +0200 Subject: [PATCH v2] drm/ssd130x: Fix an oops when attempting to update a disabled plane Geert reports that the following NULL pointer dereference happens for him after commit 49d7d58 ("drm/ssd130x: Don't allocate buffers on each plane update"): [drm] Initialized ssd130x 1.0.0 20220131 for 0-003c on minor 0 ssd130x-i2c 0-003c: [drm] surface width(128), height(32), bpp(1) and format(R1 little-endian (0x20203152)) Unable to handle kernel NULL pointer dereference at virtual address 00000000 Oops [#1] CPU: 0 PID: 1 Comm: swapper Not tainted 6.5.0-rc1-orangecrab-02219-g0a529a1e4bf4 torvalds#565 epc : ssd130x_update_rect.isra.0+0x13c/0x340 ra : ssd130x_update_rect.isra.0+0x2bc/0x340 ... status: 00000120 badaddr: 00000000 cause: 0000000f [<c0303d90>] ssd130x_update_rect.isra.0+0x13c/0x340 [<c0304200>] ssd130x_primary_plane_helper_atomic_update+0x26c/0x284 [<c02f8d54>] drm_atomic_helper_commit_planes+0xfc/0x27c [<c02f9314>] drm_atomic_helper_commit_tail+0x5c/0xb4 [<c02f94fc>] commit_tail+0x190/0x1b8 [<c02f99fc>] drm_atomic_helper_commit+0x194/0x1c0 [<c02c5d00>] drm_atomic_commit+0xa4/0xe4 [<c02cce40>] drm_client_modeset_commit_atomic+0x244/0x278 [<c02ccef0>] drm_client_modeset_commit_locked+0x7c/0x1bc [<c02cd064>] drm_client_modeset_commit+0x34/0x64 [<c0301a78>] __drm_fb_helper_restore_fbdev_mode_unlocked+0xc4/0xe8 [<c0303424>] drm_fb_helper_set_par+0x38/0x58 [<c027c410>] fbcon_init+0x294/0x534 ... The problem is that fbcon calls fbcon_init() which triggers a DRM modeset and this leads to drm_atomic_helper_commit_planes() attempting to commit the atomic state for all planes, even the ones whose CRTC is not enabled. Since the primary plane buffer is allocated in the encoder .atomic_enable callback, this happens after that initial modeset commit and leads to the mentioned NULL pointer dereference. Fix this by allocating the buffers in the struct drm_mode_config_funcs .fb_create, instead of doing it when the encoder is enabled. Also, make a couple of improvements to the ssd130x_buf_alloc() function: * Make the allocation smarter to only allocate the intermediate buffer if the native R1 format is not used. Otherwise is not needed, since there is no XRGB8888 to R1 format conversion during plane updates. * Allocate the buffers as device managed resources, this is enough and simplifies the logic since there is no need to explicitly free them. Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Geert Uytterhoeven <geert@linux-m68k.org> writes: Hello Geert, Thanks for your review! > Hi Javier, [...] >> - ssd130x->buffer = kcalloc(pitch, ssd130x->height, GFP_KERNEL); >> - if (!ssd130x->buffer) >> - return -ENOMEM; >> + ssd130x->buffer = devm_kcalloc(ssd130x->dev, pitch, ssd130x->height, >> + GFP_KERNEL); > > This should check if the buffer was allocated before. > Else an application creating two or more frame buffers will keep > on allocating new buffers. The same is true for fbdev emulation vs. > a userspace application. > Yes, you are correct. >> + if (!ssd130x->buffer) >> + return -ENOMEM; >> + } >> >> - ssd130x->data_array = kcalloc(ssd130x->width, pages, GFP_KERNEL); >> - if (!ssd130x->data_array) { >> - kfree(ssd130x->buffer); >> + ssd130x->data_array = devm_kcalloc(ssd130x->dev, ssd130x->width, pages, >> + GFP_KERNEL); > > Same here. > >> + if (!ssd130x->data_array) >> return -ENOMEM; >> - } > > And you still need the data_array buffer for clearing the screen, > which is not tied to the creation of a frame buffer, I think? > It is indeed. I forgot about that. So what I would do for now is just to allocate the two unconditionally and then we can optimize as a follow-up. But also, this v2 is not correct because as I mentioned the device and drm_device lifecycles are not the same and the kernel crashes on poweroff: [ 568.351783] ssd130x_update_rect.isra.0+0xe0/0x308 [ 568.356881] ssd130x_primary_plane_helper_atomic_disable+0x54/0x78 [ 568.363475] drm_atomic_helper_commit_planes+0x1ec/0x288 [ 568.369128] drm_atomic_helper_commit_tail+0x5c/0xb0 [ 568.374422] commit_tail+0x168/0x1a0 [ 568.378240] drm_atomic_helper_commit+0x1c8/0x1e8 [ 568.383265] drm_atomic_commit+0xa0/0xc8 [ 568.387439] drm_atomic_helper_disable_all+0x204/0x220 [ 568.392914] drm_atomic_helper_shutdown+0x98/0x138 [ 568.398033] ssd130x_shutdown+0x18/0x30 [ 568.402117] ssd130x_i2c_shutdown+0x1c/0x30 [ 568.406558] i2c_device_shutdown+0x48/0x78 [ 568.410913] device_shutdown+0x130/0x238 [ 568.415087] kernel_restart+0x48/0xd0 [ 568.418996] __do_sys_reboot+0x14c/0x230 [ 568.423167] __arm64_sys_reboot+0x2c/0x40 [ 568.427426] invoke_syscall+0x78/0x100 [ 568.431420] el0_svc_common.constprop.0+0x68/0x130 [ 568.436531] do_el0_svc+0x34/0x50 [ 568.440080] el0_svc+0x48/0x150 [ 568.443455] el0t_64_sync_handler+0x114/0x120 [ 568.448072] el0t_64_sync+0x194/0x198 [ 568.451985] Code: 12000949 52800001 52800002 0b830f63 (38634b20) [ 568.458502] ---[ end trace 0000000000000000 ]--- Sima also suggested to make the allocation in the plane .prepare_fb, I think that the better place is .begin_fb_access for allocation and the .end_fb_access for free. Updated patch below and this is the one that I'm more comfortable now as a solution: From b4d078dd65a04ab61dd9264e982b37321f7a75d9 Mon Sep 17 00:00:00 2001 From: Javier Martinez Canillas <javierm@redhat.com> Date: Mon, 17 Jul 2023 12:30:35 +0200 Subject: [PATCH v3] drm/ssd130x: Fix an oops when attempting to update a disabled plane Geert reports that the following NULL pointer dereference happens for him after commit 49d7d58 ("drm/ssd130x: Don't allocate buffers on each plane update"): [drm] Initialized ssd130x 1.0.0 20220131 for 0-003c on minor 0 ssd130x-i2c 0-003c: [drm] surface width(128), height(32), bpp(1) and format(R1 little-endian (0x20203152)) Unable to handle kernel NULL pointer dereference at virtual address 00000000 Oops [#1] CPU: 0 PID: 1 Comm: swapper Not tainted 6.5.0-rc1-orangecrab-02219-g0a529a1e4bf4 torvalds#565 epc : ssd130x_update_rect.isra.0+0x13c/0x340 ra : ssd130x_update_rect.isra.0+0x2bc/0x340 ... status: 00000120 badaddr: 00000000 cause: 0000000f [<c0303d90>] ssd130x_update_rect.isra.0+0x13c/0x340 [<c0304200>] ssd130x_primary_plane_helper_atomic_update+0x26c/0x284 [<c02f8d54>] drm_atomic_helper_commit_planes+0xfc/0x27c [<c02f9314>] drm_atomic_helper_commit_tail+0x5c/0xb4 [<c02f94fc>] commit_tail+0x190/0x1b8 [<c02f99fc>] drm_atomic_helper_commit+0x194/0x1c0 [<c02c5d00>] drm_atomic_commit+0xa4/0xe4 [<c02cce40>] drm_client_modeset_commit_atomic+0x244/0x278 [<c02ccef0>] drm_client_modeset_commit_locked+0x7c/0x1bc [<c02cd064>] drm_client_modeset_commit+0x34/0x64 [<c0301a78>] __drm_fb_helper_restore_fbdev_mode_unlocked+0xc4/0xe8 [<c0303424>] drm_fb_helper_set_par+0x38/0x58 [<c027c410>] fbcon_init+0x294/0x534 ... The problem is that fbcon calls fbcon_init() which triggers a DRM modeset and this leads to drm_atomic_helper_commit_planes() attempting to commit the atomic state for all planes, even the ones whose CRTC is not enabled. Since the primary plane buffer is allocated in the encoder .atomic_enable callback, this happens after that initial modeset commit and leads to the mentioned NULL pointer dereference. Fix this by allocating the buffers in the struct drm_plane_helper_funcs .begin_fb_access callback and free them in to .end_fb_access callback, instead of doing it when the encoder is enabled. Fixes: commit 49d7d58 ("drm/ssd130x: Don't allocate buffers on each plane update") Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Geert reports that the following NULL pointer dereference happens for him after commit 49d7d58 ("drm/ssd130x: Don't allocate buffers on each plane update"): [drm] Initialized ssd130x 1.0.0 20220131 for 0-003c on minor 0 ssd130x-i2c 0-003c: [drm] surface width(128), height(32), bpp(1) and format(R1 little-endian (0x20203152)) Unable to handle kernel NULL pointer dereference at virtual address 00000000 Oops [#1] CPU: 0 PID: 1 Comm: swapper Not tainted 6.5.0-rc1-orangecrab-02219-g0a529a1e4bf4 torvalds#565 epc : ssd130x_update_rect.isra.0+0x13c/0x340 ra : ssd130x_update_rect.isra.0+0x2bc/0x340 ... status: 00000120 badaddr: 00000000 cause: 0000000f [<c0303d90>] ssd130x_update_rect.isra.0+0x13c/0x340 [<c0304200>] ssd130x_primary_plane_helper_atomic_update+0x26c/0x284 [<c02f8d54>] drm_atomic_helper_commit_planes+0xfc/0x27c [<c02f9314>] drm_atomic_helper_commit_tail+0x5c/0xb4 [<c02f94fc>] commit_tail+0x190/0x1b8 [<c02f99fc>] drm_atomic_helper_commit+0x194/0x1c0 [<c02c5d00>] drm_atomic_commit+0xa4/0xe4 [<c02cce40>] drm_client_modeset_commit_atomic+0x244/0x278 [<c02ccef0>] drm_client_modeset_commit_locked+0x7c/0x1bc [<c02cd064>] drm_client_modeset_commit+0x34/0x64 [<c0301a78>] __drm_fb_helper_restore_fbdev_mode_unlocked+0xc4/0xe8 [<c0303424>] drm_fb_helper_set_par+0x38/0x58 [<c027c410>] fbcon_init+0x294/0x534 ... The problem is that fbcon calls fbcon_init() which triggers a DRM modeset and this leads to drm_atomic_helper_commit_planes() attempting to commit the atomic state for all planes, even the ones whose CRTC is not enabled. Since the primary plane buffer is allocated in the encoder .atomic_enable callback, this happens after that initial modeset commit and leads to the mentioned NULL pointer dereference. Fix this by having custom helpers to allocate, duplicate and destroy the plane state, that will take care of allocating and freeing these buffers. Instead of doing it in the encoder atomic enabled and disabled callbacks, since allocations must not be done in an .atomic_enable handler. Because drivers are not allowed to fail after drm_atomic_helper_swap_state() has been called and the new atomic state is stored into the current sw state. Fixes: 49d7d58 ("drm/ssd130x: Don't allocate buffers on each plane update") Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Suggested-by: Maxime Ripard <mripard@kernel.org> Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
We got the following issue in our fault injection stress test: ================================================================== BUG: KASAN: slab-use-after-free in fscache_withdraw_volume+0x2e1/0x370 Read of size 4 at addr ffff88810680be08 by task ondemand-04-dae/5798 CPU: 0 PID: 5798 Comm: ondemand-04-dae Not tainted 6.8.0-dirty torvalds#565 Call Trace: kasan_check_range+0xf6/0x1b0 fscache_withdraw_volume+0x2e1/0x370 cachefiles_withdraw_volume+0x31/0x50 cachefiles_withdraw_cache+0x3ad/0x900 cachefiles_put_unbind_pincount+0x1f6/0x250 cachefiles_daemon_release+0x13b/0x290 __fput+0x204/0xa00 task_work_run+0x139/0x230 Allocated by task 5820: __kmalloc+0x1df/0x4b0 fscache_alloc_volume+0x70/0x600 __fscache_acquire_volume+0x1c/0x610 erofs_fscache_register_volume+0x96/0x1a0 erofs_fscache_register_fs+0x49a/0x690 erofs_fc_fill_super+0x6c0/0xcc0 vfs_get_super+0xa9/0x140 vfs_get_tree+0x8e/0x300 do_new_mount+0x28c/0x580 [...] Freed by task 5820: kfree+0xf1/0x2c0 fscache_put_volume.part.0+0x5cb/0x9e0 erofs_fscache_unregister_fs+0x157/0x1b0 erofs_kill_sb+0xd9/0x1c0 deactivate_locked_super+0xa3/0x100 vfs_get_super+0x105/0x140 vfs_get_tree+0x8e/0x300 do_new_mount+0x28c/0x580 [...] ================================================================== Following is the process that triggers the issue: mount failed | daemon exit ------------------------------------------------------------ deactivate_locked_super cachefiles_daemon_release erofs_kill_sb erofs_fscache_unregister_fs fscache_relinquish_volume __fscache_relinquish_volume fscache_put_volume(fscache_volume, fscache_volume_put_relinquish) zero = __refcount_dec_and_test(&fscache_volume->ref, &ref); cachefiles_put_unbind_pincount cachefiles_daemon_unbind cachefiles_withdraw_cache cachefiles_withdraw_volumes list_del_init(&volume->cache_link) fscache_free_volume(fscache_volume) cache->ops->free_volume cachefiles_free_volume list_del_init(&cachefiles_volume->cache_link); kfree(fscache_volume) cachefiles_withdraw_volume fscache_withdraw_volume fscache_volume->n_accesses // fscache_volume UAF !!! The fscache_volume in cache->volumes must not have been freed yet, but its reference count may be 0. So use the new fscache_try_get_volume() helper function try to get its reference count. If the reference count of fscache_volume is 0, fscache_put_volume() is freeing it, so wait for it to be removed from cache->volumes. If its reference count is not 0, call cachefiles_withdraw_volume() with reference count protection to avoid the above issue. Fixes: fe2140e ("cachefiles: Implement volume support") Signed-off-by: Baokun Li <libaokun1@huawei.com>
We got the following issue in our fault injection stress test: ================================================================== BUG: KASAN: slab-use-after-free in fscache_withdraw_volume+0x2e1/0x370 Read of size 4 at addr ffff88810680be08 by task ondemand-04-dae/5798 CPU: 0 PID: 5798 Comm: ondemand-04-dae Not tainted 6.8.0-dirty torvalds#565 Call Trace: kasan_check_range+0xf6/0x1b0 fscache_withdraw_volume+0x2e1/0x370 cachefiles_withdraw_volume+0x31/0x50 cachefiles_withdraw_cache+0x3ad/0x900 cachefiles_put_unbind_pincount+0x1f6/0x250 cachefiles_daemon_release+0x13b/0x290 __fput+0x204/0xa00 task_work_run+0x139/0x230 Allocated by task 5820: __kmalloc+0x1df/0x4b0 fscache_alloc_volume+0x70/0x600 __fscache_acquire_volume+0x1c/0x610 erofs_fscache_register_volume+0x96/0x1a0 erofs_fscache_register_fs+0x49a/0x690 erofs_fc_fill_super+0x6c0/0xcc0 vfs_get_super+0xa9/0x140 vfs_get_tree+0x8e/0x300 do_new_mount+0x28c/0x580 [...] Freed by task 5820: kfree+0xf1/0x2c0 fscache_put_volume.part.0+0x5cb/0x9e0 erofs_fscache_unregister_fs+0x157/0x1b0 erofs_kill_sb+0xd9/0x1c0 deactivate_locked_super+0xa3/0x100 vfs_get_super+0x105/0x140 vfs_get_tree+0x8e/0x300 do_new_mount+0x28c/0x580 [...] ================================================================== Following is the process that triggers the issue: mount failed | daemon exit ------------------------------------------------------------ deactivate_locked_super cachefiles_daemon_release erofs_kill_sb erofs_fscache_unregister_fs fscache_relinquish_volume __fscache_relinquish_volume fscache_put_volume(fscache_volume, fscache_volume_put_relinquish) zero = __refcount_dec_and_test(&fscache_volume->ref, &ref); cachefiles_put_unbind_pincount cachefiles_daemon_unbind cachefiles_withdraw_cache cachefiles_withdraw_volumes list_del_init(&volume->cache_link) fscache_free_volume(fscache_volume) cache->ops->free_volume cachefiles_free_volume list_del_init(&cachefiles_volume->cache_link); kfree(fscache_volume) cachefiles_withdraw_volume fscache_withdraw_volume fscache_volume->n_accesses // fscache_volume UAF !!! The fscache_volume in cache->volumes must not have been freed yet, but its reference count may be 0. So use the new fscache_try_get_volume() helper function try to get its reference count. If the reference count of fscache_volume is 0, fscache_put_volume() is freeing it, so wait for it to be removed from cache->volumes. If its reference count is not 0, call cachefiles_withdraw_volume() with reference count protection to avoid the above issue. Fixes: fe2140e ("cachefiles: Implement volume support") Signed-off-by: Baokun Li <libaokun1@huawei.com>
We got the following issue in our fault injection stress test: ================================================================== BUG: KASAN: slab-use-after-free in fscache_withdraw_volume+0x2e1/0x370 Read of size 4 at addr ffff88810680be08 by task ondemand-04-dae/5798 CPU: 0 PID: 5798 Comm: ondemand-04-dae Not tainted 6.8.0-dirty torvalds#565 Call Trace: kasan_check_range+0xf6/0x1b0 fscache_withdraw_volume+0x2e1/0x370 cachefiles_withdraw_volume+0x31/0x50 cachefiles_withdraw_cache+0x3ad/0x900 cachefiles_put_unbind_pincount+0x1f6/0x250 cachefiles_daemon_release+0x13b/0x290 __fput+0x204/0xa00 task_work_run+0x139/0x230 Allocated by task 5820: __kmalloc+0x1df/0x4b0 fscache_alloc_volume+0x70/0x600 __fscache_acquire_volume+0x1c/0x610 erofs_fscache_register_volume+0x96/0x1a0 erofs_fscache_register_fs+0x49a/0x690 erofs_fc_fill_super+0x6c0/0xcc0 vfs_get_super+0xa9/0x140 vfs_get_tree+0x8e/0x300 do_new_mount+0x28c/0x580 [...] Freed by task 5820: kfree+0xf1/0x2c0 fscache_put_volume.part.0+0x5cb/0x9e0 erofs_fscache_unregister_fs+0x157/0x1b0 erofs_kill_sb+0xd9/0x1c0 deactivate_locked_super+0xa3/0x100 vfs_get_super+0x105/0x140 vfs_get_tree+0x8e/0x300 do_new_mount+0x28c/0x580 [...] ================================================================== Following is the process that triggers the issue: mount failed | daemon exit ------------------------------------------------------------ deactivate_locked_super cachefiles_daemon_release erofs_kill_sb erofs_fscache_unregister_fs fscache_relinquish_volume __fscache_relinquish_volume fscache_put_volume(fscache_volume, fscache_volume_put_relinquish) zero = __refcount_dec_and_test(&fscache_volume->ref, &ref); cachefiles_put_unbind_pincount cachefiles_daemon_unbind cachefiles_withdraw_cache cachefiles_withdraw_volumes list_del_init(&volume->cache_link) fscache_free_volume(fscache_volume) cache->ops->free_volume cachefiles_free_volume list_del_init(&cachefiles_volume->cache_link); kfree(fscache_volume) cachefiles_withdraw_volume fscache_withdraw_volume fscache_volume->n_accesses // fscache_volume UAF !!! The fscache_volume in cache->volumes must not have been freed yet, but its reference count may be 0. So use the new fscache_try_get_volume() helper function try to get its reference count. If the reference count of fscache_volume is 0, fscache_put_volume() is freeing it, so wait for it to be removed from cache->volumes. If its reference count is not 0, call cachefiles_withdraw_volume() with reference count protection to avoid the above issue. Fixes: fe2140e ("cachefiles: Implement volume support") Signed-off-by: Baokun Li <libaokun1@huawei.com> Link: https://lore.kernel.org/r/20240628062930.2467993-3-libaokun@huaweicloud.com Signed-off-by: Christian Brauner <brauner@kernel.org>
[ Upstream commit 522018a ] We got the following issue in our fault injection stress test: ================================================================== BUG: KASAN: slab-use-after-free in fscache_withdraw_volume+0x2e1/0x370 Read of size 4 at addr ffff88810680be08 by task ondemand-04-dae/5798 CPU: 0 PID: 5798 Comm: ondemand-04-dae Not tainted 6.8.0-dirty torvalds#565 Call Trace: kasan_check_range+0xf6/0x1b0 fscache_withdraw_volume+0x2e1/0x370 cachefiles_withdraw_volume+0x31/0x50 cachefiles_withdraw_cache+0x3ad/0x900 cachefiles_put_unbind_pincount+0x1f6/0x250 cachefiles_daemon_release+0x13b/0x290 __fput+0x204/0xa00 task_work_run+0x139/0x230 Allocated by task 5820: __kmalloc+0x1df/0x4b0 fscache_alloc_volume+0x70/0x600 __fscache_acquire_volume+0x1c/0x610 erofs_fscache_register_volume+0x96/0x1a0 erofs_fscache_register_fs+0x49a/0x690 erofs_fc_fill_super+0x6c0/0xcc0 vfs_get_super+0xa9/0x140 vfs_get_tree+0x8e/0x300 do_new_mount+0x28c/0x580 [...] Freed by task 5820: kfree+0xf1/0x2c0 fscache_put_volume.part.0+0x5cb/0x9e0 erofs_fscache_unregister_fs+0x157/0x1b0 erofs_kill_sb+0xd9/0x1c0 deactivate_locked_super+0xa3/0x100 vfs_get_super+0x105/0x140 vfs_get_tree+0x8e/0x300 do_new_mount+0x28c/0x580 [...] ================================================================== Following is the process that triggers the issue: mount failed | daemon exit ------------------------------------------------------------ deactivate_locked_super cachefiles_daemon_release erofs_kill_sb erofs_fscache_unregister_fs fscache_relinquish_volume __fscache_relinquish_volume fscache_put_volume(fscache_volume, fscache_volume_put_relinquish) zero = __refcount_dec_and_test(&fscache_volume->ref, &ref); cachefiles_put_unbind_pincount cachefiles_daemon_unbind cachefiles_withdraw_cache cachefiles_withdraw_volumes list_del_init(&volume->cache_link) fscache_free_volume(fscache_volume) cache->ops->free_volume cachefiles_free_volume list_del_init(&cachefiles_volume->cache_link); kfree(fscache_volume) cachefiles_withdraw_volume fscache_withdraw_volume fscache_volume->n_accesses // fscache_volume UAF !!! The fscache_volume in cache->volumes must not have been freed yet, but its reference count may be 0. So use the new fscache_try_get_volume() helper function try to get its reference count. If the reference count of fscache_volume is 0, fscache_put_volume() is freeing it, so wait for it to be removed from cache->volumes. If its reference count is not 0, call cachefiles_withdraw_volume() with reference count protection to avoid the above issue. Fixes: fe2140e ("cachefiles: Implement volume support") Signed-off-by: Baokun Li <libaokun1@huawei.com> Link: https://lore.kernel.org/r/20240628062930.2467993-3-libaokun@huaweicloud.com Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Baokun Li <libaokun1@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 522018a ] We got the following issue in our fault injection stress test: ================================================================== BUG: KASAN: slab-use-after-free in fscache_withdraw_volume+0x2e1/0x370 Read of size 4 at addr ffff88810680be08 by task ondemand-04-dae/5798 CPU: 0 PID: 5798 Comm: ondemand-04-dae Not tainted 6.8.0-dirty torvalds#565 Call Trace: kasan_check_range+0xf6/0x1b0 fscache_withdraw_volume+0x2e1/0x370 cachefiles_withdraw_volume+0x31/0x50 cachefiles_withdraw_cache+0x3ad/0x900 cachefiles_put_unbind_pincount+0x1f6/0x250 cachefiles_daemon_release+0x13b/0x290 __fput+0x204/0xa00 task_work_run+0x139/0x230 Allocated by task 5820: __kmalloc+0x1df/0x4b0 fscache_alloc_volume+0x70/0x600 __fscache_acquire_volume+0x1c/0x610 erofs_fscache_register_volume+0x96/0x1a0 erofs_fscache_register_fs+0x49a/0x690 erofs_fc_fill_super+0x6c0/0xcc0 vfs_get_super+0xa9/0x140 vfs_get_tree+0x8e/0x300 do_new_mount+0x28c/0x580 [...] Freed by task 5820: kfree+0xf1/0x2c0 fscache_put_volume.part.0+0x5cb/0x9e0 erofs_fscache_unregister_fs+0x157/0x1b0 erofs_kill_sb+0xd9/0x1c0 deactivate_locked_super+0xa3/0x100 vfs_get_super+0x105/0x140 vfs_get_tree+0x8e/0x300 do_new_mount+0x28c/0x580 [...] ================================================================== Following is the process that triggers the issue: mount failed | daemon exit ------------------------------------------------------------ deactivate_locked_super cachefiles_daemon_release erofs_kill_sb erofs_fscache_unregister_fs fscache_relinquish_volume __fscache_relinquish_volume fscache_put_volume(fscache_volume, fscache_volume_put_relinquish) zero = __refcount_dec_and_test(&fscache_volume->ref, &ref); cachefiles_put_unbind_pincount cachefiles_daemon_unbind cachefiles_withdraw_cache cachefiles_withdraw_volumes list_del_init(&volume->cache_link) fscache_free_volume(fscache_volume) cache->ops->free_volume cachefiles_free_volume list_del_init(&cachefiles_volume->cache_link); kfree(fscache_volume) cachefiles_withdraw_volume fscache_withdraw_volume fscache_volume->n_accesses // fscache_volume UAF !!! The fscache_volume in cache->volumes must not have been freed yet, but its reference count may be 0. So use the new fscache_try_get_volume() helper function try to get its reference count. If the reference count of fscache_volume is 0, fscache_put_volume() is freeing it, so wait for it to be removed from cache->volumes. If its reference count is not 0, call cachefiles_withdraw_volume() with reference count protection to avoid the above issue. Fixes: fe2140e ("cachefiles: Implement volume support") Signed-off-by: Baokun Li <libaokun1@huawei.com> Link: https://lore.kernel.org/r/20240628062930.2467993-3-libaokun@huaweicloud.com Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Baokun Li <libaokun1@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 522018a ] We got the following issue in our fault injection stress test: ================================================================== BUG: KASAN: slab-use-after-free in fscache_withdraw_volume+0x2e1/0x370 Read of size 4 at addr ffff88810680be08 by task ondemand-04-dae/5798 CPU: 0 PID: 5798 Comm: ondemand-04-dae Not tainted 6.8.0-dirty torvalds#565 Call Trace: kasan_check_range+0xf6/0x1b0 fscache_withdraw_volume+0x2e1/0x370 cachefiles_withdraw_volume+0x31/0x50 cachefiles_withdraw_cache+0x3ad/0x900 cachefiles_put_unbind_pincount+0x1f6/0x250 cachefiles_daemon_release+0x13b/0x290 __fput+0x204/0xa00 task_work_run+0x139/0x230 Allocated by task 5820: __kmalloc+0x1df/0x4b0 fscache_alloc_volume+0x70/0x600 __fscache_acquire_volume+0x1c/0x610 erofs_fscache_register_volume+0x96/0x1a0 erofs_fscache_register_fs+0x49a/0x690 erofs_fc_fill_super+0x6c0/0xcc0 vfs_get_super+0xa9/0x140 vfs_get_tree+0x8e/0x300 do_new_mount+0x28c/0x580 [...] Freed by task 5820: kfree+0xf1/0x2c0 fscache_put_volume.part.0+0x5cb/0x9e0 erofs_fscache_unregister_fs+0x157/0x1b0 erofs_kill_sb+0xd9/0x1c0 deactivate_locked_super+0xa3/0x100 vfs_get_super+0x105/0x140 vfs_get_tree+0x8e/0x300 do_new_mount+0x28c/0x580 [...] ================================================================== Following is the process that triggers the issue: mount failed | daemon exit ------------------------------------------------------------ deactivate_locked_super cachefiles_daemon_release erofs_kill_sb erofs_fscache_unregister_fs fscache_relinquish_volume __fscache_relinquish_volume fscache_put_volume(fscache_volume, fscache_volume_put_relinquish) zero = __refcount_dec_and_test(&fscache_volume->ref, &ref); cachefiles_put_unbind_pincount cachefiles_daemon_unbind cachefiles_withdraw_cache cachefiles_withdraw_volumes list_del_init(&volume->cache_link) fscache_free_volume(fscache_volume) cache->ops->free_volume cachefiles_free_volume list_del_init(&cachefiles_volume->cache_link); kfree(fscache_volume) cachefiles_withdraw_volume fscache_withdraw_volume fscache_volume->n_accesses // fscache_volume UAF !!! The fscache_volume in cache->volumes must not have been freed yet, but its reference count may be 0. So use the new fscache_try_get_volume() helper function try to get its reference count. If the reference count of fscache_volume is 0, fscache_put_volume() is freeing it, so wait for it to be removed from cache->volumes. If its reference count is not 0, call cachefiles_withdraw_volume() with reference count protection to avoid the above issue. Fixes: fe2140e ("cachefiles: Implement volume support") Signed-off-by: Baokun Li <libaokun1@huawei.com> Link: https://lore.kernel.org/r/20240628062930.2467993-3-libaokun@huaweicloud.com Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Baokun Li <libaokun1@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 522018a ] We got the following issue in our fault injection stress test: ================================================================== BUG: KASAN: slab-use-after-free in fscache_withdraw_volume+0x2e1/0x370 Read of size 4 at addr ffff88810680be08 by task ondemand-04-dae/5798 CPU: 0 PID: 5798 Comm: ondemand-04-dae Not tainted 6.8.0-dirty torvalds#565 Call Trace: kasan_check_range+0xf6/0x1b0 fscache_withdraw_volume+0x2e1/0x370 cachefiles_withdraw_volume+0x31/0x50 cachefiles_withdraw_cache+0x3ad/0x900 cachefiles_put_unbind_pincount+0x1f6/0x250 cachefiles_daemon_release+0x13b/0x290 __fput+0x204/0xa00 task_work_run+0x139/0x230 Allocated by task 5820: __kmalloc+0x1df/0x4b0 fscache_alloc_volume+0x70/0x600 __fscache_acquire_volume+0x1c/0x610 erofs_fscache_register_volume+0x96/0x1a0 erofs_fscache_register_fs+0x49a/0x690 erofs_fc_fill_super+0x6c0/0xcc0 vfs_get_super+0xa9/0x140 vfs_get_tree+0x8e/0x300 do_new_mount+0x28c/0x580 [...] Freed by task 5820: kfree+0xf1/0x2c0 fscache_put_volume.part.0+0x5cb/0x9e0 erofs_fscache_unregister_fs+0x157/0x1b0 erofs_kill_sb+0xd9/0x1c0 deactivate_locked_super+0xa3/0x100 vfs_get_super+0x105/0x140 vfs_get_tree+0x8e/0x300 do_new_mount+0x28c/0x580 [...] ================================================================== Following is the process that triggers the issue: mount failed | daemon exit ------------------------------------------------------------ deactivate_locked_super cachefiles_daemon_release erofs_kill_sb erofs_fscache_unregister_fs fscache_relinquish_volume __fscache_relinquish_volume fscache_put_volume(fscache_volume, fscache_volume_put_relinquish) zero = __refcount_dec_and_test(&fscache_volume->ref, &ref); cachefiles_put_unbind_pincount cachefiles_daemon_unbind cachefiles_withdraw_cache cachefiles_withdraw_volumes list_del_init(&volume->cache_link) fscache_free_volume(fscache_volume) cache->ops->free_volume cachefiles_free_volume list_del_init(&cachefiles_volume->cache_link); kfree(fscache_volume) cachefiles_withdraw_volume fscache_withdraw_volume fscache_volume->n_accesses // fscache_volume UAF !!! The fscache_volume in cache->volumes must not have been freed yet, but its reference count may be 0. So use the new fscache_try_get_volume() helper function try to get its reference count. If the reference count of fscache_volume is 0, fscache_put_volume() is freeing it, so wait for it to be removed from cache->volumes. If its reference count is not 0, call cachefiles_withdraw_volume() with reference count protection to avoid the above issue. Fixes: fe2140e ("cachefiles: Implement volume support") Signed-off-by: Baokun Li <libaokun1@huawei.com> Link: https://lore.kernel.org/r/20240628062930.2467993-3-libaokun@huaweicloud.com Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Baokun Li <libaokun1@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 522018a ] We got the following issue in our fault injection stress test: ================================================================== BUG: KASAN: slab-use-after-free in fscache_withdraw_volume+0x2e1/0x370 Read of size 4 at addr ffff88810680be08 by task ondemand-04-dae/5798 CPU: 0 PID: 5798 Comm: ondemand-04-dae Not tainted 6.8.0-dirty torvalds#565 Call Trace: kasan_check_range+0xf6/0x1b0 fscache_withdraw_volume+0x2e1/0x370 cachefiles_withdraw_volume+0x31/0x50 cachefiles_withdraw_cache+0x3ad/0x900 cachefiles_put_unbind_pincount+0x1f6/0x250 cachefiles_daemon_release+0x13b/0x290 __fput+0x204/0xa00 task_work_run+0x139/0x230 Allocated by task 5820: __kmalloc+0x1df/0x4b0 fscache_alloc_volume+0x70/0x600 __fscache_acquire_volume+0x1c/0x610 erofs_fscache_register_volume+0x96/0x1a0 erofs_fscache_register_fs+0x49a/0x690 erofs_fc_fill_super+0x6c0/0xcc0 vfs_get_super+0xa9/0x140 vfs_get_tree+0x8e/0x300 do_new_mount+0x28c/0x580 [...] Freed by task 5820: kfree+0xf1/0x2c0 fscache_put_volume.part.0+0x5cb/0x9e0 erofs_fscache_unregister_fs+0x157/0x1b0 erofs_kill_sb+0xd9/0x1c0 deactivate_locked_super+0xa3/0x100 vfs_get_super+0x105/0x140 vfs_get_tree+0x8e/0x300 do_new_mount+0x28c/0x580 [...] ================================================================== Following is the process that triggers the issue: mount failed | daemon exit ------------------------------------------------------------ deactivate_locked_super cachefiles_daemon_release erofs_kill_sb erofs_fscache_unregister_fs fscache_relinquish_volume __fscache_relinquish_volume fscache_put_volume(fscache_volume, fscache_volume_put_relinquish) zero = __refcount_dec_and_test(&fscache_volume->ref, &ref); cachefiles_put_unbind_pincount cachefiles_daemon_unbind cachefiles_withdraw_cache cachefiles_withdraw_volumes list_del_init(&volume->cache_link) fscache_free_volume(fscache_volume) cache->ops->free_volume cachefiles_free_volume list_del_init(&cachefiles_volume->cache_link); kfree(fscache_volume) cachefiles_withdraw_volume fscache_withdraw_volume fscache_volume->n_accesses // fscache_volume UAF !!! The fscache_volume in cache->volumes must not have been freed yet, but its reference count may be 0. So use the new fscache_try_get_volume() helper function try to get its reference count. If the reference count of fscache_volume is 0, fscache_put_volume() is freeing it, so wait for it to be removed from cache->volumes. If its reference count is not 0, call cachefiles_withdraw_volume() with reference count protection to avoid the above issue. Fixes: fe2140e ("cachefiles: Implement volume support") Signed-off-by: Baokun Li <libaokun1@huawei.com> Link: https://lore.kernel.org/r/20240628062930.2467993-3-libaokun@huaweicloud.com Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Baokun Li <libaokun1@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We got the following issue in our fault injection stress test: ================================================================== BUG: KASAN: slab-use-after-free in fscache_withdraw_volume+0x2e1/0x370 Read of size 4 at addr ffff88810680be08 by task ondemand-04-dae/5798 CPU: 0 PID: 5798 Comm: ondemand-04-dae Not tainted 6.8.0-dirty torvalds#565 Call Trace: kasan_check_range+0xf6/0x1b0 fscache_withdraw_volume+0x2e1/0x370 cachefiles_withdraw_volume+0x31/0x50 cachefiles_withdraw_cache+0x3ad/0x900 cachefiles_put_unbind_pincount+0x1f6/0x250 cachefiles_daemon_release+0x13b/0x290 __fput+0x204/0xa00 task_work_run+0x139/0x230 Allocated by task 5820: __kmalloc+0x1df/0x4b0 fscache_alloc_volume+0x70/0x600 __fscache_acquire_volume+0x1c/0x610 erofs_fscache_register_volume+0x96/0x1a0 erofs_fscache_register_fs+0x49a/0x690 erofs_fc_fill_super+0x6c0/0xcc0 vfs_get_super+0xa9/0x140 vfs_get_tree+0x8e/0x300 do_new_mount+0x28c/0x580 [...] Freed by task 5820: kfree+0xf1/0x2c0 fscache_put_volume.part.0+0x5cb/0x9e0 erofs_fscache_unregister_fs+0x157/0x1b0 erofs_kill_sb+0xd9/0x1c0 deactivate_locked_super+0xa3/0x100 vfs_get_super+0x105/0x140 vfs_get_tree+0x8e/0x300 do_new_mount+0x28c/0x580 [...] ================================================================== Following is the process that triggers the issue: mount failed | daemon exit ------------------------------------------------------------ deactivate_locked_super cachefiles_daemon_release erofs_kill_sb erofs_fscache_unregister_fs fscache_relinquish_volume __fscache_relinquish_volume fscache_put_volume(fscache_volume, fscache_volume_put_relinquish) zero = __refcount_dec_and_test(&fscache_volume->ref, &ref); cachefiles_put_unbind_pincount cachefiles_daemon_unbind cachefiles_withdraw_cache cachefiles_withdraw_volumes list_del_init(&volume->cache_link) fscache_free_volume(fscache_volume) cache->ops->free_volume cachefiles_free_volume list_del_init(&cachefiles_volume->cache_link); kfree(fscache_volume) cachefiles_withdraw_volume fscache_withdraw_volume fscache_volume->n_accesses // fscache_volume UAF !!! The fscache_volume in cache->volumes must not have been freed yet, but its reference count may be 0. So use the new fscache_try_get_volume() helper function try to get its reference count. If the reference count of fscache_volume is 0, fscache_put_volume() is freeing it, so wait for it to be removed from cache->volumes. If its reference count is not 0, call cachefiles_withdraw_volume() with reference count protection to avoid the above issue. Fixes: fe2140e ("cachefiles: Implement volume support") Signed-off-by: Baokun Li <libaokun1@huawei.com> Link: https://lore.kernel.org/r/20240628062930.2467993-3-libaokun@huaweicloud.com Signed-off-by: Christian Brauner <brauner@kernel.org>
contents need to be zeroed otherwise cryptographic data will be leaked