Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Can't use WPA3 #1

Closed
backslashxx opened this issue Jan 16, 2023 · 1 comment
Closed

Can't use WPA3 #1

backslashxx opened this issue Jan 16, 2023 · 1 comment

Comments

@backslashxx
Copy link

Hi, how can I enable WPA3 in this kernel?

@TogoFire
Copy link
Owner

TogoFire commented Jan 16, 2023

TogoFire pushed a commit that referenced this issue Jan 16, 2023
fsck.f2fs forces a filesystem fix on boot if it detects that the current
kernel version differs from the one saved in the superblock, which results in
fsck blocking boot for a long time (~35 seconds). This commit provides a
way to report a constant fake kernel version to fsck to avoid triggering
the version check, which is useful if you boot new kernel builds
frequently.

Signed-off-by: Danny Lin <danny@kdrag0n.dev>
Change-Id: I3ddcf3fb570fc914d513d7ae6708b3fb0ff48fd7

Signed-off-by: Carlos Jimenez (JavaShin-X) <javashin1986@gmail.com>

ocean stock-kernel =

CONFIG_F2FS_REPORT_FAKE_KERNEL_VERSION=y
CONFIG_F2FS_FAKE_KERNEL_RELEASE="4.9.206-perf+"
CONFIG_F2FS_FAKE_KERNEL_VERSION="#1 SMP PREEMPT Wed Sep 30 23:50:55 CDT 2020 aarch64"

Change-Id: e52c74f996b3d85015c557f8bb26afc33f2b4cc9
Signed-off-by: Carlos Jimenez (JavaShin-X) <javashin1986@gmail.com>
Signed-off-by: TogoFire <italomellopereira@gmail.com>
TogoFire pushed a commit that referenced this issue Jan 16, 2023
Patch series "lib/sort & lib/list_sort: faster and smaller", v2.

Because CONFIG_RETPOLINE has made indirect calls much more expensive, I
thought I'd try to reduce the number made by the library sort functions.

The first three patches apply to lib/sort.c.

Patch #1 is a simple optimization.  The built-in swap has special cases
for aligned 4- and 8-byte objects.  But those are almost never used;
most calls to sort() work on larger structures, which fall back to the
byte-at-a-time loop.  This generalizes them to aligned *multiples* of 4
and 8 bytes.  (If nothing else, it saves an awful lot of energy by not
thrashing the store buffers as much.)

Patch #2 grabs a juicy piece of low-hanging fruit.  I agree that nice
simple solid heapsort is preferable to more complex algorithms (sorry,
Andrey), but it's possible to implement heapsort with far fewer
comparisons (50% asymptotically, 25-40% reduction for realistic sizes)
than the way it's been done up to now.  And with some care, the code
ends up smaller, as well.  This is the "big win" patch.

Patch #3 adds the same sort of indirect call bypass that has been added
to the net code of late.  The great majority of the callers use the
builtin swap functions, so replace the indirect call to sort_func with a
(highly preditable) series of if() statements.  Rather surprisingly,
this decreased code size, as the swap functions were inlined and their
prologue & epilogue code eliminated.

lib/list_sort.c is a bit trickier, as merge sort is already close to
optimal, and we don't want to introduce triumphs of theory over
practicality like the Ford-Johnson merge-insertion sort.

Patch #4, without changing the algorithm, chops 32% off the code size
and removes the part[MAX_LIST_LENGTH+1] pointer array (and the
corresponding upper limit on efficiently sortable input size).

Patch #5 improves the algorithm.  The previous code is already optimal
for power-of-two (or slightly smaller) size inputs, but when the input
size is just over a power of 2, there's a very unbalanced final merge.

There are, in the literature, several algorithms which solve this, but
they all depend on the "breadth-first" merge order which was replaced by
commit 835cc0c with a more cache-friendly "depth-first" order.
Some hard thinking came up with a depth-first algorithm which defers
merges as little as possible while avoiding bad merges.  This saves
0.2*n compares, averaged over all sizes.

The code size increase is minimal (64 bytes on x86-64, reducing the net
savings to 26%), but the comments expanded significantly to document the
clever algorithm.

TESTING NOTES: I have some ugly user-space benchmarking code which I
used for testing before moving this code into the kernel.  Shout if you
want a copy.

I'm running this code right now, with CONFIG_TEST_SORT and
CONFIG_TEST_LIST_SORT, but I confess I haven't rebooted since the last
round of minor edits to quell checkpatch.  I figure there will be at
least one round of comments and final testing.

This patch (of 5):

Rather than having special-case swap functions for 4- and 8-byte
objects, special-case aligned multiples of 4 or 8 bytes.  This speeds up
most users of sort() by avoiding fallback to the byte copy loop.

Despite what ca96ab8 ("lib/sort: Add 64 bit swap function") claims,
very few users of sort() sort pointers (or pointer-sized objects); most
sort structures containing at least two words.  (E.g.
drivers/acpi/fan.c:acpi_fan_get_fps() sorts an array of 40-byte struct
acpi_fan_fps.)

The functions also got renamed to reflect the fact that they support
multiple words.  In the great tradition of bikeshedding, the names were
by far the most contentious issue during review of this patch series.

x86-64 code size 872 -> 886 bytes (+14)

With feedback from Andy Shevchenko, Rasmus Villemoes and Geert
Uytterhoeven.

Link: http://lkml.kernel.org/r/f24f932df3a7fa1973c1084154f1cea596bcf341.1552704200.git.lkml@sdf.org
Signed-off-by: George Spelvin <lkml@sdf.org>
Acked-by: Andrey Abramov <st5pub@yandex.ru>
Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Daniel Wagner <daniel.wagner@siemens.com>
Cc: Don Mullis <don.mullis@gmail.com>
Cc: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
TogoFire pushed a commit that referenced this issue Jan 16, 2023
commit ba5a4fdd63ae0c575707030db0b634b160baddd7 upstream.

syzbot complained about a recent change in TCP stack,
hitting a NULL pointer [1]

tcp request sockets have an af_specific pointer, which
was used before the blamed change only for SYNACK generation
in non SYNCOOKIE mode.

tcp requests sockets momentarily created when third packet
coming from client in SYNCOOKIE mode were not using
treq->af_specific.

Make sure this field is populated, in the same way normal
TCP requests sockets do in tcp_conn_request().

[1]
TCP: request_sock_TCPv6: Possible SYN flooding on port 20002. Sending cookies.  Check SNMP counters.
general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f]
CPU: 1 PID: 3695 Comm: syz-executor864 Not tainted 5.18.0-rc3-syzkaller-00224-g5fd1fe4807f9 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:tcp_create_openreq_child+0xe16/0x16b0 net/ipv4/tcp_minisocks.c:534
Code: 48 c1 ea 03 80 3c 02 00 0f 85 e5 07 00 00 4c 8b b3 28 01 00 00 48 b8 00 00 00 00 00 fc ff df 49 8d 7e 08 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 c9 07 00 00 48 8b 3c 24 48 89 de 41 ff 56 08 48
RSP: 0018:ffffc90000de0588 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: ffff888076490330 RCX: 0000000000000100
RDX: 0000000000000001 RSI: ffffffff87d67ff0 RDI: 0000000000000008
RBP: ffff88806ee1c7f8 R08: 0000000000000000 R09: 0000000000000000
R10: ffffffff87d67f00 R11: 0000000000000000 R12: ffff88806ee1bfc0
R13: ffff88801b0e0368 R14: 0000000000000000 R15: 0000000000000000
FS:  00007f517fe58700(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffcead76960 CR3: 000000006f97b000 CR4: 00000000003506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <IRQ>
 tcp_v6_syn_recv_sock+0x199/0x23b0 net/ipv6/tcp_ipv6.c:1267
 tcp_get_cookie_sock+0xc9/0x850 net/ipv4/syncookies.c:207
 cookie_v6_check+0x15c3/0x2340 net/ipv6/syncookies.c:258
 tcp_v6_cookie_check net/ipv6/tcp_ipv6.c:1131 [inline]
 tcp_v6_do_rcv+0x1148/0x13b0 net/ipv6/tcp_ipv6.c:1486
 tcp_v6_rcv+0x3305/0x3840 net/ipv6/tcp_ipv6.c:1725
 ip6_protocol_deliver_rcu+0x2e9/0x1900 net/ipv6/ip6_input.c:422
 ip6_input_finish+0x14c/0x2c0 net/ipv6/ip6_input.c:464
 NF_HOOK include/linux/netfilter.h:307 [inline]
 NF_HOOK include/linux/netfilter.h:301 [inline]
 ip6_input+0x9c/0xd0 net/ipv6/ip6_input.c:473
 dst_input include/net/dst.h:461 [inline]
 ip6_rcv_finish net/ipv6/ip6_input.c:76 [inline]
 NF_HOOK include/linux/netfilter.h:307 [inline]
 NF_HOOK include/linux/netfilter.h:301 [inline]
 ipv6_rcv+0x27f/0x3b0 net/ipv6/ip6_input.c:297
 __netif_receive_skb_one_core+0x114/0x180 net/core/dev.c:5405
 __netif_receive_skb+0x24/0x1b0 net/core/dev.c:5519
 process_backlog+0x3a0/0x7c0 net/core/dev.c:5847
 __napi_poll+0xb3/0x6e0 net/core/dev.c:6413
 napi_poll net/core/dev.c:6480 [inline]
 net_rx_action+0x8ec/0xc60 net/core/dev.c:6567
 __do_softirq+0x29b/0x9c2 kernel/softirq.c:558
 invoke_softirq kernel/softirq.c:432 [inline]
 __irq_exit_rcu+0x123/0x180 kernel/softirq.c:637
 irq_exit_rcu+0x5/0x20 kernel/softirq.c:649
 sysvec_apic_timer_interrupt+0x93/0xc0 arch/x86/kernel/apic/apic.c:1097

Fixes: 5b0b9e4c2c89 ("tcp: md5: incorrect tcp_header_len for incoming connections")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Francesco Ruggeri <fruggeri@arista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[fruggeri: Account for backport conflicts from 35b2c3211609 and 6fc8c827dd4f]
Signed-off-by: Francesco Ruggeri <fruggeri@arista.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
TogoFire pushed a commit that referenced this issue Jan 16, 2023
During boot we observed several warnings that can be avoided

[    6.725935] pmic_analog_codec 800f000.qcom,spmi:qcom,pm660l@3:analog-codec@f000: Error: regulator not found:cdc-vdd-spkdrv
[    6.726304] ------------[ cut here ]------------
[    6.726364] WARNING: CPU: 1 PID: 455 at regcache_cache_only+0xf0/0x100
[    6.726452] Modules linked in:
[    6.726460] CPU: 1 PID: 455 Comm: kworker/1:3 Not tainted 4.9.170-g00294e841ec8-dirty #1
[    6.726462] Hardware name: Qualcomm Technologies, Inc. SDM 630 PM660 + PM660L MTP (DT)
[    6.726471] Workqueue: events deferred_probe_work_func
[    6.726473] task: ffffffedb575b600 task.stack: ffffffedb5770000
[    6.726476] PC is at regcache_cache_only+0xf0/0x100
[    6.726478] LR is at regcache_cache_only+0x2c/0x100

Change-Id: 5b85423b078ce856772a45a9ffe9870f522518e8
Signed-off-by: Alin Jerpelea <alin.jerpelea@sony.com>
Signed-off-by: TogoFire <italomellopereira@gmail.com>
TogoFire pushed a commit that referenced this issue Jan 16, 2023
log:
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0. Program arguments: /home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld -EL -maarch64elf --lto-O3 -z noexecstack -r -o vmlinux.o --whole-archive built -in.o
 #0 0x000055d50c1017a3 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld+0x25e87a3)
 #1 0x000055d50c1013fa (/home/togo/Documents/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld+0x25e83fa)
 #2 0x00007f52034b9b50 __restore_rt (/lib64/libc.so.6+0x3cb50)
 #3 0x000055d50c38a111 void lld::elf::writeResult<llvm::object::ELFType<(llvm::support::endianness)1, true>>() (/home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang- neutron/bin/ld.lld+0x2871111)
 #4 0x000055d50c1e539e lld::elf::LinkerDriver::link(llvm::opt::InputArgList&) (/home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld+0x26cc39e)
 #5 0x000055d50c1ce651 lld::elf::LinkerDriver::linkerMain(llvm::ArrayRef<char const*>) (/home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld+0x26b5651)
 #6 0x000055d50c1cbc96 lld::elf::link(llvm::ArrayRef<char const*>, llvm::raw_ostream&, llvm::raw_ostream&, bool, bool) (/home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang-neutron /bin/ld.lld+0x26b2c96)
 #7 0x000055d50c0aad03 (/home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld+0x2591d03)
 #8 0x000055d50bc2bc3e lld_main(int, char**) (/home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld+0x2112c3e)
 #9 0x00007f52034a4510 __libc_start_call_main (/lib64/libc.so.6+0x27510)
#10 0x00007f52034a45c9 __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x275c9)
#11 0x000055d50c09e9f5 _start (/home/togo/Documents/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld+0x25859f5)
../scripts/link-vmlinux.sh, line 94: 60322 Segmentation fault (core image recorded)${LDFINAL} ${LDFLAGS} -r -o ${1} $(lto_lds) ${objects}
make[1]: *** [/home/togo/Documents/GitHub/kernel_xiaomi_panda/Makefile:1272: vmlinux] Error 139
make[1]: Leaving directory '/home/togo/Documentos/GitHub/kernel_xiaomi_panda/out'

make: *** [Makefile:154: sub-make] Error 2

Change-Id: 00fc97669415f7b9f5816ebac37fc85c5da82a53
Signed-off-by: GeoPD <geoemmanuelpd2001@gmail.com>
Signed-off-by: TogoFire <italomellopereira@gmail.com>
TogoFire pushed a commit that referenced this issue Jan 16, 2023
log:
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0. Program arguments: /home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld -EL -maarch64elf --lto-O3 -z noexecstack -r -o vmlinux.o --whole-archive built -in.o
 #0 0x000055d50c1017a3 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld+0x25e87a3)
 #1 0x000055d50c1013fa (/home/togo/Documents/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld+0x25e83fa)
 #2 0x00007f52034b9b50 __restore_rt (/lib64/libc.so.6+0x3cb50)
 #3 0x000055d50c38a111 void lld::elf::writeResult<llvm::object::ELFType<(llvm::support::endianness)1, true>>() (/home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang- neutron/bin/ld.lld+0x2871111)
 #4 0x000055d50c1e539e lld::elf::LinkerDriver::link(llvm::opt::InputArgList&) (/home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld+0x26cc39e)
 #5 0x000055d50c1ce651 lld::elf::LinkerDriver::linkerMain(llvm::ArrayRef<char const*>) (/home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld+0x26b5651)
 #6 0x000055d50c1cbc96 lld::elf::link(llvm::ArrayRef<char const*>, llvm::raw_ostream&, llvm::raw_ostream&, bool, bool) (/home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang-neutron /bin/ld.lld+0x26b2c96)
 #7 0x000055d50c0aad03 (/home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld+0x2591d03)
 #8 0x000055d50bc2bc3e lld_main(int, char**) (/home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld+0x2112c3e)
 #9 0x00007f52034a4510 __libc_start_call_main (/lib64/libc.so.6+0x27510)
#10 0x00007f52034a45c9 __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x275c9)
#11 0x000055d50c09e9f5 _start (/home/togo/Documents/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld+0x25859f5)
../scripts/link-vmlinux.sh, line 94: 60322 Segmentation fault (core image recorded)${LDFINAL} ${LDFLAGS} -r -o ${1} $(lto_lds) ${objects}
make[1]: *** [/home/togo/Documents/GitHub/kernel_xiaomi_panda/Makefile:1272: vmlinux] Error 139
make[1]: Leaving directory '/home/togo/Documentos/GitHub/kernel_xiaomi_panda/out'

make: *** [Makefile:154: sub-make] Error 2

Change-Id: 00fc97669415f7b9f5816ebac37fc85c5da82a53
Signed-off-by: GeoPD <geoemmanuelpd2001@gmail.com>
Signed-off-by: TogoFire <italomellopereira@gmail.com>
TogoFire pushed a commit that referenced this issue Feb 21, 2023
During boot we observed several warnings that can be avoided

[    6.725935] pmic_analog_codec 800f000.qcom,spmi:qcom,pm660l@3:analog-codec@f000: Error: regulator not found:cdc-vdd-spkdrv
[    6.726304] ------------[ cut here ]------------
[    6.726364] WARNING: CPU: 1 PID: 455 at regcache_cache_only+0xf0/0x100
[    6.726452] Modules linked in:
[    6.726460] CPU: 1 PID: 455 Comm: kworker/1:3 Not tainted 4.9.170-g00294e841ec8-dirty #1
[    6.726462] Hardware name: Qualcomm Technologies, Inc. SDM 630 PM660 + PM660L MTP (DT)
[    6.726471] Workqueue: events deferred_probe_work_func
[    6.726473] task: ffffffedb575b600 task.stack: ffffffedb5770000
[    6.726476] PC is at regcache_cache_only+0xf0/0x100
[    6.726478] LR is at regcache_cache_only+0x2c/0x100

Change-Id: 5b85423b078ce856772a45a9ffe9870f522518e8
Signed-off-by: Alin Jerpelea <alin.jerpelea@sony.com>
Signed-off-by: TogoFire <italomellopereira@gmail.com>
TogoFire pushed a commit that referenced this issue Feb 21, 2023
log:
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0. Program arguments: /home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld -EL -maarch64elf --lto-O3 -z noexecstack -r -o vmlinux.o --whole-archive built -in.o
 #0 0x000055d50c1017a3 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld+0x25e87a3)
 #1 0x000055d50c1013fa (/home/togo/Documents/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld+0x25e83fa)
 #2 0x00007f52034b9b50 __restore_rt (/lib64/libc.so.6+0x3cb50)
 #3 0x000055d50c38a111 void lld::elf::writeResult<llvm::object::ELFType<(llvm::support::endianness)1, true>>() (/home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang- neutron/bin/ld.lld+0x2871111)
 #4 0x000055d50c1e539e lld::elf::LinkerDriver::link(llvm::opt::InputArgList&) (/home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld+0x26cc39e)
 #5 0x000055d50c1ce651 lld::elf::LinkerDriver::linkerMain(llvm::ArrayRef<char const*>) (/home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld+0x26b5651)
 #6 0x000055d50c1cbc96 lld::elf::link(llvm::ArrayRef<char const*>, llvm::raw_ostream&, llvm::raw_ostream&, bool, bool) (/home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang-neutron /bin/ld.lld+0x26b2c96)
 #7 0x000055d50c0aad03 (/home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld+0x2591d03)
 #8 0x000055d50bc2bc3e lld_main(int, char**) (/home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld+0x2112c3e)
 #9 0x00007f52034a4510 __libc_start_call_main (/lib64/libc.so.6+0x27510)
#10 0x00007f52034a45c9 __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x275c9)
#11 0x000055d50c09e9f5 _start (/home/togo/Documents/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld+0x25859f5)
../scripts/link-vmlinux.sh, line 94: 60322 Segmentation fault (core image recorded)${LDFINAL} ${LDFLAGS} -r -o ${1} $(lto_lds) ${objects}
make[1]: *** [/home/togo/Documents/GitHub/kernel_xiaomi_panda/Makefile:1272: vmlinux] Error 139
make[1]: Leaving directory '/home/togo/Documentos/GitHub/kernel_xiaomi_panda/out'

make: *** [Makefile:154: sub-make] Error 2

Change-Id: 00fc97669415f7b9f5816ebac37fc85c5da82a53
Signed-off-by: GeoPD <geoemmanuelpd2001@gmail.com>
Signed-off-by: TogoFire <italomellopereira@gmail.com>
TogoFire pushed a commit that referenced this issue Mar 14, 2023
With RT_FULL we get the below wreckage:

[  126.060484] =======================================================
[  126.060486] [ INFO: possible circular locking dependency detected ]
[  126.060489] 3.0.1-rt10+ #30
[  126.060490] -------------------------------------------------------
[  126.060492] irq/24-eth0/1235 is trying to acquire lock:
[  126.060495]  (&(lock)->wait_lock#2){+.+...}, at: [<ffffffff81501c81>] rt_mutex_slowunlock+0x16/0x55
[  126.060503]
[  126.060504] but task is already holding lock:
[  126.060506]  (&p->pi_lock){-...-.}, at: [<ffffffff81074fdc>] try_to_wake_up+0x35/0x429
[  126.060511]
[  126.060511] which lock already depends on the new lock.
[  126.060513]
[  126.060514]
[  126.060514] the existing dependency chain (in reverse order) is:
[  126.060516]
[  126.060516] -> #1 (&p->pi_lock){-...-.}:
[  126.060519]        [<ffffffff810afe9e>] lock_acquire+0x145/0x18a
[  126.060524]        [<ffffffff8150291e>] _raw_spin_lock_irqsave+0x4b/0x85
[  126.060527]        [<ffffffff810b5aa4>] task_blocks_on_rt_mutex+0x36/0x20f
[  126.060531]        [<ffffffff815019bb>] rt_mutex_slowlock+0xd1/0x15a
[  126.060534]        [<ffffffff81501ae3>] rt_mutex_lock+0x2d/0x2f
[  126.060537]        [<ffffffff810d9020>] rcu_boost+0xad/0xde
[  126.060541]        [<ffffffff810d90ce>] rcu_boost_kthread+0x7d/0x9b
[  126.060544]        [<ffffffff8109a760>] kthread+0x99/0xa1
[  126.060547]        [<ffffffff81509b14>] kernel_thread_helper+0x4/0x10
[  126.060551]
[  126.060552] -> #0 (&(lock)->wait_lock#2){+.+...}:
[  126.060555]        [<ffffffff810af1b8>] __lock_acquire+0x1157/0x1816
[  126.060558]        [<ffffffff810afe9e>] lock_acquire+0x145/0x18a
[  126.060561]        [<ffffffff8150279e>] _raw_spin_lock+0x40/0x73
[  126.060564]        [<ffffffff81501c81>] rt_mutex_slowunlock+0x16/0x55
[  126.060566]        [<ffffffff81501ce7>] rt_mutex_unlock+0x27/0x29
[  126.060569]        [<ffffffff810d9f86>] rcu_read_unlock_special+0x17e/0x1c4
[  126.060573]        [<ffffffff810da014>] __rcu_read_unlock+0x48/0x89
[  126.060576]        [<ffffffff8106847a>] select_task_rq_rt+0xc7/0xd5
[  126.060580]        [<ffffffff8107511c>] try_to_wake_up+0x175/0x429
[  126.060583]        [<ffffffff81075425>] wake_up_process+0x15/0x17
[  126.060585]        [<ffffffff81080a51>] wakeup_softirqd+0x24/0x26
[  126.060590]        [<ffffffff81081df9>] irq_exit+0x49/0x55
[  126.060593]        [<ffffffff8150a3bd>] smp_apic_timer_interrupt+0x8a/0x98
[  126.060597]        [<ffffffff81509793>] apic_timer_interrupt+0x13/0x20
[  126.060600]        [<ffffffff810d5952>] irq_forced_thread_fn+0x1b/0x44
[  126.060603]        [<ffffffff810d582c>] irq_thread+0xde/0x1af
[  126.060606]        [<ffffffff8109a760>] kthread+0x99/0xa1
[  126.060608]        [<ffffffff81509b14>] kernel_thread_helper+0x4/0x10
[  126.060611]
[  126.060612] other info that might help us debug this:
[  126.060614]
[  126.060615]  Possible unsafe locking scenario:
[  126.060616]
[  126.060617]        CPU0                    CPU1
[  126.060619]        ----                    ----
[  126.060620]   lock(&p->pi_lock);
[  126.060623]                                lock(&(lock)->wait_lock);
[  126.060625]                                lock(&p->pi_lock);
[  126.060627]   lock(&(lock)->wait_lock);
[  126.060629]
[  126.060629]  *** DEADLOCK ***
[  126.060630]
[  126.060632] 1 lock held by irq/24-eth0/1235:
[  126.060633]  #0:  (&p->pi_lock){-...-.}, at: [<ffffffff81074fdc>] try_to_wake_up+0x35/0x429
[  126.060638]
[  126.060638] stack backtrace:
[  126.060641] Pid: 1235, comm: irq/24-eth0 Not tainted 3.0.1-rt10+ #30
[  126.060643] Call Trace:
[  126.060644]  <IRQ>  [<ffffffff810acbde>] print_circular_bug+0x289/0x29a
[  126.060651]  [<ffffffff810af1b8>] __lock_acquire+0x1157/0x1816
[  126.060655]  [<ffffffff810ab3aa>] ? trace_hardirqs_off_caller+0x1f/0x99
[  126.060658]  [<ffffffff81501c81>] ? rt_mutex_slowunlock+0x16/0x55
[  126.060661]  [<ffffffff810afe9e>] lock_acquire+0x145/0x18a
[  126.060664]  [<ffffffff81501c81>] ? rt_mutex_slowunlock+0x16/0x55
[  126.060668]  [<ffffffff8150279e>] _raw_spin_lock+0x40/0x73
[  126.060671]  [<ffffffff81501c81>] ? rt_mutex_slowunlock+0x16/0x55
[  126.060674]  [<ffffffff810d9655>] ? rcu_report_qs_rsp+0x87/0x8c
[  126.060677]  [<ffffffff81501c81>] rt_mutex_slowunlock+0x16/0x55
[  126.060680]  [<ffffffff810d9ea3>] ? rcu_read_unlock_special+0x9b/0x1c4
[  126.060683]  [<ffffffff81501ce7>] rt_mutex_unlock+0x27/0x29
[  126.060687]  [<ffffffff810d9f86>] rcu_read_unlock_special+0x17e/0x1c4
[  126.060690]  [<ffffffff810da014>] __rcu_read_unlock+0x48/0x89
[  126.060693]  [<ffffffff8106847a>] select_task_rq_rt+0xc7/0xd5
[  126.060696]  [<ffffffff810683da>] ? select_task_rq_rt+0x27/0xd5
[  126.060701]  [<ffffffff810a852a>] ? clockevents_program_event+0x8e/0x90
[  126.060704]  [<ffffffff8107511c>] try_to_wake_up+0x175/0x429
[  126.060708]  [<ffffffff810a95dc>] ? tick_program_event+0x1f/0x21
[  126.060711]  [<ffffffff81075425>] wake_up_process+0x15/0x17
[  126.060715]  [<ffffffff81080a51>] wakeup_softirqd+0x24/0x26
[  126.060718]  [<ffffffff81081df9>] irq_exit+0x49/0x55
[  126.060721]  [<ffffffff8150a3bd>] smp_apic_timer_interrupt+0x8a/0x98
[  126.060724]  [<ffffffff81509793>] apic_timer_interrupt+0x13/0x20
[  126.060726]  <EOI>  [<ffffffff81072855>] ? migrate_disable+0x75/0x12d
[  126.060733]  [<ffffffff81080a61>] ? local_bh_disable+0xe/0x1f
[  126.060736]  [<ffffffff81080a70>] ? local_bh_disable+0x1d/0x1f
[  126.060739]  [<ffffffff810d5952>] irq_forced_thread_fn+0x1b/0x44
[  126.060742]  [<ffffffff81502ac0>] ? _raw_spin_unlock_irq+0x3b/0x59
[  126.060745]  [<ffffffff810d582c>] irq_thread+0xde/0x1af
[  126.060748]  [<ffffffff810d5937>] ? irq_thread_fn+0x3a/0x3a
[  126.060751]  [<ffffffff810d574e>] ? irq_finalize_oneshot+0xd1/0xd1
[  126.060754]  [<ffffffff810d574e>] ? irq_finalize_oneshot+0xd1/0xd1
[  126.060757]  [<ffffffff8109a760>] kthread+0x99/0xa1
[  126.060761]  [<ffffffff81509b14>] kernel_thread_helper+0x4/0x10
[  126.060764]  [<ffffffff81069ed7>] ? finish_task_switch+0x87/0x10a
[  126.060768]  [<ffffffff81502ec4>] ? retint_restore_args+0xe/0xe
[  126.060771]  [<ffffffff8109a6c7>] ? __init_kthread_worker+0x8c/0x8c
[  126.060774]  [<ffffffff81509b10>] ? gs_change+0xb/0xb

Because irq_exit() does:

void irq_exit(void)
{
	account_system_vtime(current);
	trace_hardirq_exit();
	sub_preempt_count(IRQ_EXIT_OFFSET);
	if (!in_interrupt() && local_softirq_pending())
		invoke_softirq();

	...
}

Which triggers a wakeup, which uses RCU, now if the interrupted task has
t->rcu_read_unlock_special set, the rcu usage from the wakeup will end
up in rcu_read_unlock_special(). rcu_read_unlock_special() will test
for in_irq(), which will fail as we just decremented preempt_count
with IRQ_EXIT_OFFSET, and in_sering_softirq(), which for
PREEMPT_RT_FULL reads:

int in_serving_softirq(void)
{
	int res;

	preempt_disable();
	res = __get_cpu_var(local_softirq_runner) == current;
	preempt_enable();
	return res;
}

Which will thus also fail, resulting in the above wreckage.

The 'somewhat' ugly solution is to open-code the preempt_count() test
in rcu_read_unlock_special().

Also, we're not at all sure how ->rcu_read_unlock_special gets set
here... so this is very likely a bandaid and more thought is required.

Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Change-Id: 004a13ffe770a78cc15db3688e8d19f6df300dcd
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Tashfin Shakeer Rhythm <tashfinshakeerrhythm@gmail.com>
Signed-off-by: TogoFire <togofire@mailfence.com>
TogoFire pushed a commit that referenced this issue Mar 14, 2023
We don't need to hold the local pinctrl lock here to set irq wake on the
summary irq line. Doing so only leads to lockdep warnings instead of
protecting us from anything. Remove the locking.

 WARNING: possible circular locking dependency detected
 5.4.11 #2 Tainted: G        W
 ------------------------------------------------------
 cat/3083 is trying to acquire lock:
 ffffff81f4fa58c0 (&irq_desc_lock_class){-.-.}, at: __irq_get_desc_lock+0x64/0x94

 but task is already holding lock:
 ffffff81f4880c18 (&pctrl->lock){-.-.}, at: msm_gpio_irq_set_wake+0x48/0x7c

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> #1 (&pctrl->lock){-.-.}:
        _raw_spin_lock_irqsave+0x64/0x80
        msm_gpio_irq_ack+0x68/0xf4
        __irq_do_set_handler+0xe0/0x180
        __irq_set_handler+0x60/0x9c
        irq_domain_set_info+0x90/0xb4
        gpiochip_hierarchy_irq_domain_alloc+0x110/0x200
        __irq_domain_alloc_irqs+0x130/0x29c
        irq_create_fwspec_mapping+0x1f0/0x300
        irq_create_of_mapping+0x70/0x98
        of_irq_get+0xa4/0xd4
        spi_drv_probe+0x4c/0xb0
        really_probe+0x138/0x3f0
        driver_probe_device+0x70/0x140
        __device_attach_driver+0x9c/0x110
        bus_for_each_drv+0x88/0xd0
        __device_attach+0xb0/0x160
        device_initial_probe+0x20/0x2c
        bus_probe_device+0x34/0x94
        device_add+0x35c/0x3f0
        spi_add_device+0xbc/0x194
        of_register_spi_devices+0x2c8/0x408
        spi_register_controller+0x57c/0x6fc
        spi_geni_probe+0x260/0x328
        platform_drv_probe+0x90/0xb0
        really_probe+0x138/0x3f0
        driver_probe_device+0x70/0x140
        device_driver_attach+0x4c/0x6c
        __driver_attach+0xcc/0x154
        bus_for_each_dev+0x84/0xcc
        driver_attach+0x2c/0x38
        bus_add_driver+0x108/0x1fc
        driver_register+0x64/0xf8
        __platform_driver_register+0x4c/0x58
        spi_geni_driver_init+0x1c/0x24
        do_one_initcall+0x1a4/0x3e8
        do_initcall_level+0xb4/0xcc
        do_basic_setup+0x30/0x48
        kernel_init_freeable+0x124/0x1a8
        kernel_init+0x14/0x100
        ret_from_fork+0x10/0x18

 -> #0 (&irq_desc_lock_class){-.-.}:
        __lock_acquire+0xeb4/0x2388
        lock_acquire+0x1cc/0x210
        _raw_spin_lock_irqsave+0x64/0x80
        __irq_get_desc_lock+0x64/0x94
        irq_set_irq_wake+0x40/0x144
        msm_gpio_irq_set_wake+0x5c/0x7c
        set_irq_wake_real+0x40/0x5c
        irq_set_irq_wake+0x70/0x144
        cros_ec_rtc_suspend+0x38/0x4c
        platform_pm_suspend+0x34/0x60
        dpm_run_callback+0x64/0xcc
        __device_suspend+0x310/0x41c
        dpm_suspend+0xf8/0x298
        dpm_suspend_start+0x84/0xb4
        suspend_devices_and_enter+0xbc/0x620
        pm_suspend+0x210/0x348
        state_store+0xb0/0x108
        kobj_attr_store+0x14/0x24
        sysfs_kf_write+0x4c/0x64
        kernfs_fop_write+0x15c/0x1fc
        __vfs_write+0x54/0x18c
        vfs_write+0xe4/0x1a4
        ksys_write+0x7c/0xe4
        __arm64_sys_write+0x20/0x2c
        el0_svc_common+0xa8/0x160
        el0_svc_handler+0x7c/0x98
        el0_svc+0x8/0xc

 other info that might help us debug this:

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(&pctrl->lock);
                                lock(&irq_desc_lock_class);
                                lock(&pctrl->lock);
   lock(&irq_desc_lock_class);

  *** DEADLOCK ***

 7 locks held by cat/3083:
  #0: ffffff81f06d1420 (sb_writers#7){.+.+}, at: vfs_write+0xd0/0x1a4
  #1: ffffff81c8935680 (&of->mutex){+.+.}, at: kernfs_fop_write+0x12c/0x1fc
  #2: ffffff81f4c322f0 (kn->count#337){.+.+}, at: kernfs_fop_write+0x134/0x1fc
  #3: ffffffe89a641d60 (system_transition_mutex){+.+.}, at: pm_suspend+0x108/0x348
  #4: ffffff81f190e970 (&dev->mutex){....}, at: __device_suspend+0x168/0x41c
  #5: ffffff81f183d8c0 (lock_class){-.-.}, at: __irq_get_desc_lock+0x64/0x94
  #6: ffffff81f4880c18 (&pctrl->lock){-.-.}, at: msm_gpio_irq_set_wake+0x48/0x7c

 stack backtrace:
 CPU: 4 PID: 3083 Comm: cat Tainted: G        W         5.4.11 #2
 Hardware name: Google Cheza (rev3+) (DT)
 Call trace:
  dump_backtrace+0x0/0x174
  show_stack+0x20/0x2c
  dump_stack+0xc8/0x124
  print_circular_bug+0x2ac/0x2c4
  check_noncircular+0x1a0/0x1a8
  __lock_acquire+0xeb4/0x2388
  lock_acquire+0x1cc/0x210
  _raw_spin_lock_irqsave+0x64/0x80
  __irq_get_desc_lock+0x64/0x94
  irq_set_irq_wake+0x40/0x144
  msm_gpio_irq_set_wake+0x5c/0x7c
  set_irq_wake_real+0x40/0x5c
  irq_set_irq_wake+0x70/0x144
  cros_ec_rtc_suspend+0x38/0x4c
  platform_pm_suspend+0x34/0x60
  dpm_run_callback+0x64/0xcc
  __device_suspend+0x310/0x41c
  dpm_suspend+0xf8/0x298
  dpm_suspend_start+0x84/0xb4
  suspend_devices_and_enter+0xbc/0x620
  pm_suspend+0x210/0x348
  state_store+0xb0/0x108
  kobj_attr_store+0x14/0x24
  sysfs_kf_write+0x4c/0x64
  kernfs_fop_write+0x15c/0x1fc
  __vfs_write+0x54/0x18c
  vfs_write+0xe4/0x1a4
  ksys_write+0x7c/0xe4
  __arm64_sys_write+0x20/0x2c
  el0_svc_common+0xa8/0x160
  el0_svc_handler+0x7c/0x98
  el0_svc+0x8/0xc

Fixes: 6aced33 ("pinctrl: msm: drop wake_irqs bitmap")
Change-Id: ef2b6a1e1fb3fa770292185f167c9b2dda592ca2
Cc: Douglas Anderson <dianders@chromium.org>
Cc: Brian Masney <masneyb@onstation.org>
Cc: Lina Iyer <ilina@codeaurora.org>
Cc: Maulik Shah <mkshah@codeaurora.org>
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Link: https://lore.kernel.org/r/20200121180950.36959-1-swboyd@chromium.org
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ahmad Thoriq Najahi <najahi@chips-projects.xyz>
Signed-off-by: Tashfin Shakeer Rhythm <tashfinshakeerrhythm@gmail.com>
Signed-off-by: TogoFire <togofire@mailfence.com>
TogoFire pushed a commit that referenced this issue Mar 14, 2023
On a system that often re-configures a USB gadget device, the kernel log
is filled with:

  configfs-gadget gadget: high-speed config #1: c

Reduce the verbosity of this print to debug.

Change-Id: 16244bd7c13974c50c35860cdd0bbbc74ca18889
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: TogoFire <togofire@mailfence.com>
TogoFire pushed a commit that referenced this issue Mar 15, 2023
With RT_FULL we get the below wreckage:

[  126.060484] =======================================================
[  126.060486] [ INFO: possible circular locking dependency detected ]
[  126.060489] 3.0.1-rt10+ #30
[  126.060490] -------------------------------------------------------
[  126.060492] irq/24-eth0/1235 is trying to acquire lock:
[  126.060495]  (&(lock)->wait_lock#2){+.+...}, at: [<ffffffff81501c81>] rt_mutex_slowunlock+0x16/0x55
[  126.060503]
[  126.060504] but task is already holding lock:
[  126.060506]  (&p->pi_lock){-...-.}, at: [<ffffffff81074fdc>] try_to_wake_up+0x35/0x429
[  126.060511]
[  126.060511] which lock already depends on the new lock.
[  126.060513]
[  126.060514]
[  126.060514] the existing dependency chain (in reverse order) is:
[  126.060516]
[  126.060516] -> #1 (&p->pi_lock){-...-.}:
[  126.060519]        [<ffffffff810afe9e>] lock_acquire+0x145/0x18a
[  126.060524]        [<ffffffff8150291e>] _raw_spin_lock_irqsave+0x4b/0x85
[  126.060527]        [<ffffffff810b5aa4>] task_blocks_on_rt_mutex+0x36/0x20f
[  126.060531]        [<ffffffff815019bb>] rt_mutex_slowlock+0xd1/0x15a
[  126.060534]        [<ffffffff81501ae3>] rt_mutex_lock+0x2d/0x2f
[  126.060537]        [<ffffffff810d9020>] rcu_boost+0xad/0xde
[  126.060541]        [<ffffffff810d90ce>] rcu_boost_kthread+0x7d/0x9b
[  126.060544]        [<ffffffff8109a760>] kthread+0x99/0xa1
[  126.060547]        [<ffffffff81509b14>] kernel_thread_helper+0x4/0x10
[  126.060551]
[  126.060552] -> #0 (&(lock)->wait_lock#2){+.+...}:
[  126.060555]        [<ffffffff810af1b8>] __lock_acquire+0x1157/0x1816
[  126.060558]        [<ffffffff810afe9e>] lock_acquire+0x145/0x18a
[  126.060561]        [<ffffffff8150279e>] _raw_spin_lock+0x40/0x73
[  126.060564]        [<ffffffff81501c81>] rt_mutex_slowunlock+0x16/0x55
[  126.060566]        [<ffffffff81501ce7>] rt_mutex_unlock+0x27/0x29
[  126.060569]        [<ffffffff810d9f86>] rcu_read_unlock_special+0x17e/0x1c4
[  126.060573]        [<ffffffff810da014>] __rcu_read_unlock+0x48/0x89
[  126.060576]        [<ffffffff8106847a>] select_task_rq_rt+0xc7/0xd5
[  126.060580]        [<ffffffff8107511c>] try_to_wake_up+0x175/0x429
[  126.060583]        [<ffffffff81075425>] wake_up_process+0x15/0x17
[  126.060585]        [<ffffffff81080a51>] wakeup_softirqd+0x24/0x26
[  126.060590]        [<ffffffff81081df9>] irq_exit+0x49/0x55
[  126.060593]        [<ffffffff8150a3bd>] smp_apic_timer_interrupt+0x8a/0x98
[  126.060597]        [<ffffffff81509793>] apic_timer_interrupt+0x13/0x20
[  126.060600]        [<ffffffff810d5952>] irq_forced_thread_fn+0x1b/0x44
[  126.060603]        [<ffffffff810d582c>] irq_thread+0xde/0x1af
[  126.060606]        [<ffffffff8109a760>] kthread+0x99/0xa1
[  126.060608]        [<ffffffff81509b14>] kernel_thread_helper+0x4/0x10
[  126.060611]
[  126.060612] other info that might help us debug this:
[  126.060614]
[  126.060615]  Possible unsafe locking scenario:
[  126.060616]
[  126.060617]        CPU0                    CPU1
[  126.060619]        ----                    ----
[  126.060620]   lock(&p->pi_lock);
[  126.060623]                                lock(&(lock)->wait_lock);
[  126.060625]                                lock(&p->pi_lock);
[  126.060627]   lock(&(lock)->wait_lock);
[  126.060629]
[  126.060629]  *** DEADLOCK ***
[  126.060630]
[  126.060632] 1 lock held by irq/24-eth0/1235:
[  126.060633]  #0:  (&p->pi_lock){-...-.}, at: [<ffffffff81074fdc>] try_to_wake_up+0x35/0x429
[  126.060638]
[  126.060638] stack backtrace:
[  126.060641] Pid: 1235, comm: irq/24-eth0 Not tainted 3.0.1-rt10+ #30
[  126.060643] Call Trace:
[  126.060644]  <IRQ>  [<ffffffff810acbde>] print_circular_bug+0x289/0x29a
[  126.060651]  [<ffffffff810af1b8>] __lock_acquire+0x1157/0x1816
[  126.060655]  [<ffffffff810ab3aa>] ? trace_hardirqs_off_caller+0x1f/0x99
[  126.060658]  [<ffffffff81501c81>] ? rt_mutex_slowunlock+0x16/0x55
[  126.060661]  [<ffffffff810afe9e>] lock_acquire+0x145/0x18a
[  126.060664]  [<ffffffff81501c81>] ? rt_mutex_slowunlock+0x16/0x55
[  126.060668]  [<ffffffff8150279e>] _raw_spin_lock+0x40/0x73
[  126.060671]  [<ffffffff81501c81>] ? rt_mutex_slowunlock+0x16/0x55
[  126.060674]  [<ffffffff810d9655>] ? rcu_report_qs_rsp+0x87/0x8c
[  126.060677]  [<ffffffff81501c81>] rt_mutex_slowunlock+0x16/0x55
[  126.060680]  [<ffffffff810d9ea3>] ? rcu_read_unlock_special+0x9b/0x1c4
[  126.060683]  [<ffffffff81501ce7>] rt_mutex_unlock+0x27/0x29
[  126.060687]  [<ffffffff810d9f86>] rcu_read_unlock_special+0x17e/0x1c4
[  126.060690]  [<ffffffff810da014>] __rcu_read_unlock+0x48/0x89
[  126.060693]  [<ffffffff8106847a>] select_task_rq_rt+0xc7/0xd5
[  126.060696]  [<ffffffff810683da>] ? select_task_rq_rt+0x27/0xd5
[  126.060701]  [<ffffffff810a852a>] ? clockevents_program_event+0x8e/0x90
[  126.060704]  [<ffffffff8107511c>] try_to_wake_up+0x175/0x429
[  126.060708]  [<ffffffff810a95dc>] ? tick_program_event+0x1f/0x21
[  126.060711]  [<ffffffff81075425>] wake_up_process+0x15/0x17
[  126.060715]  [<ffffffff81080a51>] wakeup_softirqd+0x24/0x26
[  126.060718]  [<ffffffff81081df9>] irq_exit+0x49/0x55
[  126.060721]  [<ffffffff8150a3bd>] smp_apic_timer_interrupt+0x8a/0x98
[  126.060724]  [<ffffffff81509793>] apic_timer_interrupt+0x13/0x20
[  126.060726]  <EOI>  [<ffffffff81072855>] ? migrate_disable+0x75/0x12d
[  126.060733]  [<ffffffff81080a61>] ? local_bh_disable+0xe/0x1f
[  126.060736]  [<ffffffff81080a70>] ? local_bh_disable+0x1d/0x1f
[  126.060739]  [<ffffffff810d5952>] irq_forced_thread_fn+0x1b/0x44
[  126.060742]  [<ffffffff81502ac0>] ? _raw_spin_unlock_irq+0x3b/0x59
[  126.060745]  [<ffffffff810d582c>] irq_thread+0xde/0x1af
[  126.060748]  [<ffffffff810d5937>] ? irq_thread_fn+0x3a/0x3a
[  126.060751]  [<ffffffff810d574e>] ? irq_finalize_oneshot+0xd1/0xd1
[  126.060754]  [<ffffffff810d574e>] ? irq_finalize_oneshot+0xd1/0xd1
[  126.060757]  [<ffffffff8109a760>] kthread+0x99/0xa1
[  126.060761]  [<ffffffff81509b14>] kernel_thread_helper+0x4/0x10
[  126.060764]  [<ffffffff81069ed7>] ? finish_task_switch+0x87/0x10a
[  126.060768]  [<ffffffff81502ec4>] ? retint_restore_args+0xe/0xe
[  126.060771]  [<ffffffff8109a6c7>] ? __init_kthread_worker+0x8c/0x8c
[  126.060774]  [<ffffffff81509b10>] ? gs_change+0xb/0xb

Because irq_exit() does:

void irq_exit(void)
{
	account_system_vtime(current);
	trace_hardirq_exit();
	sub_preempt_count(IRQ_EXIT_OFFSET);
	if (!in_interrupt() && local_softirq_pending())
		invoke_softirq();

	...
}

Which triggers a wakeup, which uses RCU, now if the interrupted task has
t->rcu_read_unlock_special set, the rcu usage from the wakeup will end
up in rcu_read_unlock_special(). rcu_read_unlock_special() will test
for in_irq(), which will fail as we just decremented preempt_count
with IRQ_EXIT_OFFSET, and in_sering_softirq(), which for
PREEMPT_RT_FULL reads:

int in_serving_softirq(void)
{
	int res;

	preempt_disable();
	res = __get_cpu_var(local_softirq_runner) == current;
	preempt_enable();
	return res;
}

Which will thus also fail, resulting in the above wreckage.

The 'somewhat' ugly solution is to open-code the preempt_count() test
in rcu_read_unlock_special().

Also, we're not at all sure how ->rcu_read_unlock_special gets set
here... so this is very likely a bandaid and more thought is required.

Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Change-Id: 004a13ffe770a78cc15db3688e8d19f6df300dcd
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Tashfin Shakeer Rhythm <tashfinshakeerrhythm@gmail.com>
Signed-off-by: TogoFire <togofire@mailfence.com>
TogoFire pushed a commit that referenced this issue Mar 15, 2023
We don't need to hold the local pinctrl lock here to set irq wake on the
summary irq line. Doing so only leads to lockdep warnings instead of
protecting us from anything. Remove the locking.

 WARNING: possible circular locking dependency detected
 5.4.11 #2 Tainted: G        W
 ------------------------------------------------------
 cat/3083 is trying to acquire lock:
 ffffff81f4fa58c0 (&irq_desc_lock_class){-.-.}, at: __irq_get_desc_lock+0x64/0x94

 but task is already holding lock:
 ffffff81f4880c18 (&pctrl->lock){-.-.}, at: msm_gpio_irq_set_wake+0x48/0x7c

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> #1 (&pctrl->lock){-.-.}:
        _raw_spin_lock_irqsave+0x64/0x80
        msm_gpio_irq_ack+0x68/0xf4
        __irq_do_set_handler+0xe0/0x180
        __irq_set_handler+0x60/0x9c
        irq_domain_set_info+0x90/0xb4
        gpiochip_hierarchy_irq_domain_alloc+0x110/0x200
        __irq_domain_alloc_irqs+0x130/0x29c
        irq_create_fwspec_mapping+0x1f0/0x300
        irq_create_of_mapping+0x70/0x98
        of_irq_get+0xa4/0xd4
        spi_drv_probe+0x4c/0xb0
        really_probe+0x138/0x3f0
        driver_probe_device+0x70/0x140
        __device_attach_driver+0x9c/0x110
        bus_for_each_drv+0x88/0xd0
        __device_attach+0xb0/0x160
        device_initial_probe+0x20/0x2c
        bus_probe_device+0x34/0x94
        device_add+0x35c/0x3f0
        spi_add_device+0xbc/0x194
        of_register_spi_devices+0x2c8/0x408
        spi_register_controller+0x57c/0x6fc
        spi_geni_probe+0x260/0x328
        platform_drv_probe+0x90/0xb0
        really_probe+0x138/0x3f0
        driver_probe_device+0x70/0x140
        device_driver_attach+0x4c/0x6c
        __driver_attach+0xcc/0x154
        bus_for_each_dev+0x84/0xcc
        driver_attach+0x2c/0x38
        bus_add_driver+0x108/0x1fc
        driver_register+0x64/0xf8
        __platform_driver_register+0x4c/0x58
        spi_geni_driver_init+0x1c/0x24
        do_one_initcall+0x1a4/0x3e8
        do_initcall_level+0xb4/0xcc
        do_basic_setup+0x30/0x48
        kernel_init_freeable+0x124/0x1a8
        kernel_init+0x14/0x100
        ret_from_fork+0x10/0x18

 -> #0 (&irq_desc_lock_class){-.-.}:
        __lock_acquire+0xeb4/0x2388
        lock_acquire+0x1cc/0x210
        _raw_spin_lock_irqsave+0x64/0x80
        __irq_get_desc_lock+0x64/0x94
        irq_set_irq_wake+0x40/0x144
        msm_gpio_irq_set_wake+0x5c/0x7c
        set_irq_wake_real+0x40/0x5c
        irq_set_irq_wake+0x70/0x144
        cros_ec_rtc_suspend+0x38/0x4c
        platform_pm_suspend+0x34/0x60
        dpm_run_callback+0x64/0xcc
        __device_suspend+0x310/0x41c
        dpm_suspend+0xf8/0x298
        dpm_suspend_start+0x84/0xb4
        suspend_devices_and_enter+0xbc/0x620
        pm_suspend+0x210/0x348
        state_store+0xb0/0x108
        kobj_attr_store+0x14/0x24
        sysfs_kf_write+0x4c/0x64
        kernfs_fop_write+0x15c/0x1fc
        __vfs_write+0x54/0x18c
        vfs_write+0xe4/0x1a4
        ksys_write+0x7c/0xe4
        __arm64_sys_write+0x20/0x2c
        el0_svc_common+0xa8/0x160
        el0_svc_handler+0x7c/0x98
        el0_svc+0x8/0xc

 other info that might help us debug this:

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(&pctrl->lock);
                                lock(&irq_desc_lock_class);
                                lock(&pctrl->lock);
   lock(&irq_desc_lock_class);

  *** DEADLOCK ***

 7 locks held by cat/3083:
  #0: ffffff81f06d1420 (sb_writers#7){.+.+}, at: vfs_write+0xd0/0x1a4
  #1: ffffff81c8935680 (&of->mutex){+.+.}, at: kernfs_fop_write+0x12c/0x1fc
  #2: ffffff81f4c322f0 (kn->count#337){.+.+}, at: kernfs_fop_write+0x134/0x1fc
  #3: ffffffe89a641d60 (system_transition_mutex){+.+.}, at: pm_suspend+0x108/0x348
  #4: ffffff81f190e970 (&dev->mutex){....}, at: __device_suspend+0x168/0x41c
  #5: ffffff81f183d8c0 (lock_class){-.-.}, at: __irq_get_desc_lock+0x64/0x94
  #6: ffffff81f4880c18 (&pctrl->lock){-.-.}, at: msm_gpio_irq_set_wake+0x48/0x7c

 stack backtrace:
 CPU: 4 PID: 3083 Comm: cat Tainted: G        W         5.4.11 #2
 Hardware name: Google Cheza (rev3+) (DT)
 Call trace:
  dump_backtrace+0x0/0x174
  show_stack+0x20/0x2c
  dump_stack+0xc8/0x124
  print_circular_bug+0x2ac/0x2c4
  check_noncircular+0x1a0/0x1a8
  __lock_acquire+0xeb4/0x2388
  lock_acquire+0x1cc/0x210
  _raw_spin_lock_irqsave+0x64/0x80
  __irq_get_desc_lock+0x64/0x94
  irq_set_irq_wake+0x40/0x144
  msm_gpio_irq_set_wake+0x5c/0x7c
  set_irq_wake_real+0x40/0x5c
  irq_set_irq_wake+0x70/0x144
  cros_ec_rtc_suspend+0x38/0x4c
  platform_pm_suspend+0x34/0x60
  dpm_run_callback+0x64/0xcc
  __device_suspend+0x310/0x41c
  dpm_suspend+0xf8/0x298
  dpm_suspend_start+0x84/0xb4
  suspend_devices_and_enter+0xbc/0x620
  pm_suspend+0x210/0x348
  state_store+0xb0/0x108
  kobj_attr_store+0x14/0x24
  sysfs_kf_write+0x4c/0x64
  kernfs_fop_write+0x15c/0x1fc
  __vfs_write+0x54/0x18c
  vfs_write+0xe4/0x1a4
  ksys_write+0x7c/0xe4
  __arm64_sys_write+0x20/0x2c
  el0_svc_common+0xa8/0x160
  el0_svc_handler+0x7c/0x98
  el0_svc+0x8/0xc

Fixes: 6aced33 ("pinctrl: msm: drop wake_irqs bitmap")
Change-Id: ef2b6a1e1fb3fa770292185f167c9b2dda592ca2
Cc: Douglas Anderson <dianders@chromium.org>
Cc: Brian Masney <masneyb@onstation.org>
Cc: Lina Iyer <ilina@codeaurora.org>
Cc: Maulik Shah <mkshah@codeaurora.org>
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Link: https://lore.kernel.org/r/20200121180950.36959-1-swboyd@chromium.org
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ahmad Thoriq Najahi <najahi@chips-projects.xyz>
Signed-off-by: Tashfin Shakeer Rhythm <tashfinshakeerrhythm@gmail.com>
Signed-off-by: TogoFire <togofire@mailfence.com>
TogoFire pushed a commit that referenced this issue Mar 15, 2023
On a system that often re-configures a USB gadget device, the kernel log
is filled with:

  configfs-gadget gadget: high-speed config #1: c

Reduce the verbosity of this print to debug.

Change-Id: 16244bd7c13974c50c35860cdd0bbbc74ca18889
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: TogoFire <togofire@mailfence.com>
TogoFire pushed a commit that referenced this issue Mar 26, 2023
A recent clang 16 update has introduced segfault when compiling kernel
using the `-polly-invariant-load-hoisting` flag and breaks compilation
when kernel is compiled with full LTO.
This is due to the flag being ignorant about the errors during SCoP
verification and does not take the errors into account as per a recent
issue opened at [LLVM], causing polly to do segfault and the compiler
to print the following backtrace:

PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.	Program arguments: clang -Wp,-MD,drivers/md/.md.o.d -nostdinc -isystem /tmp/cirrus-ci-build/toolchains/clang/lib/clang/16.0.0/include -I../arch/arm64/include -I./arch/arm64/include/generated -I../include -I./include -I../arch/arm64/include/uapi -I./arch/arm64/include/generated/uapi -I../include/uapi -I./include/generated/uapi -include ../include/linux/kconfig.h -include ../include/linux/compiler_types.h -I../drivers/md -Idrivers/md -D__KERNEL__ -mlittle-endian -DKASAN_SHADOW_SCALE_SHIFT=3 -Qunused-arguments -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -fshort-wchar -Werror-implicit-function-declaration -Werror=return-type -Wno-format-security -std=gnu89 --target=aarch64-linux-gnu --prefix=/tmp/cirrus-ci-build/Kernel/../toolchains/clang/bin/aarch64-linux-gnu- --gcc-toolchain=/tmp/cirrus-ci-build/toolchains/clang -Wno-misleading-indentation -Wno-bool-operation -Werror=unknown-warning-option -Wno-unsequenced -opaque-pointers -fno-PIE -mgeneral-regs-only -DCONFIG_AS_LSE=1 -fno-asynchronous-unwind-tables -Wno-psabi -DKASAN_SHADOW_SCALE_SHIFT=3 -fno-delete-null-pointer-checks -Wno-frame-address -Wno-int-in-bool-context -Wno-address-of-packed-member -O3 -march=armv8.1-a+crypto+fp16+rcpc -mtune=cortex-a53 -mllvm -polly -mllvm -polly-ast-use-context -mllvm -polly-detect-keep-going -mllvm -polly-invariant-load-hoisting -mllvm -polly-run-inliner -mllvm -polly-vectorizer=stripmine -mllvm -polly-loopfusion-greedy=1 -mllvm -polly-reschedule=1 -mllvm -polly-postopts=1 -fstack-protector-strong --target=aarch64-linux-gnu --gcc-toolchain=/tmp/cirrus-ci-build/toolchains/clang -meabi gnu -Wno-format-invalid-specifier -Wno-gnu -Wno-duplicate-decl-specifier -Wno-asm-operand-widths -Wno-initializer-overrides -Wno-tautological-constant-out-of-range-compare -Wno-tautological-compare -mno-global-merge -Wno-void-ptr-dereference -Wno-unused-but-set-variable -Wno-unused-const-variable -fno-omit-frame-pointer -fno-optimize-sibling-calls -Wvla -flto -fwhole-program-vtables -fvisibility=hidden -Wdeclaration-after-statement -Wno-pointer-sign -Wno-array-bounds -fno-strict-overflow -fno-stack-check -Werror=implicit-int -Werror=strict-prototypes -Werror=date-time -Werror=incompatible-pointer-types -fmacro-prefix-map=../= -Wno-initializer-overrides -Wno-unused-value -Wno-format -Wno-sign-compare -Wno-format-zero-length -Wno-uninitialized -Wno-pointer-to-enum-cast -Wno-unaligned-access -DKBUILD_BASENAME=\"md\" -DKBUILD_MODNAME=\"md_mod\" -D__KBUILD_MODNAME=kmod_md_mod -c -o drivers/md/md.o ../drivers/md/md.c
1.	<eof> parser at end of file
2.	Optimizer
  CC      drivers/media/platform/msm/camera_v2/camera/camera.o
  AR      drivers/media/pci/intel/ipu3/built-in.a
  CC      drivers/md/dm-linear.o
 #0 0x0000559d3527073f (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x3a7073f)
 #1 0x0000559d352705bf (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x3a705bf)
 #2 0x0000559d3523b198 (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x3a3b198)
 #3 0x0000559d3523b33e (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x3a3b33e)
 #4 0x00007f339dc3ea00 (/usr/lib/libc.so.6+0x38a00)
 #5 0x0000559d35affccf (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x42ffccf)
 #6 0x0000559d35b01710 (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x4301710)
 #7 0x0000559d35b01a12 (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x4301a12)
 #8 0x0000559d35b09a9e (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x4309a9e)
 #9 0x0000559d35b14707 (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x4314707)
clang-16: error: clang frontend command failed with exit code 139 (use -v to see invocation)
Neutron clang version 16.0.0 (https://github.com/llvm/llvm-project.git 598f5275c16049b1e1b5bc934cbde447a82d485e)
Target: aarch64-unknown-linux-gnu
Thread model: posix
InstalledDir: /tmp/cirrus-ci-build/Kernel/../toolchains/clang/bin

From the very nature of `-polly-detect-keep-going`, it can be concluded
that this is a potentially unsafe flag and instead of using it conditionally
for CLANG < 16, just remove it all together for all. This should allow polly
to run SCoP verifications as intended.

Issue [LLVM]: llvm/llvm-project#58484 (comment)

Change-Id: 9a2d8205cf5e0259d6b261e2d7753f3d939759bf
Signed-off-by: Tashfin Shakeer Rhythm <tashfinshakeerrhythm@gmail.com>
Signed-off-by: TogoFire <togofire@mailfence.com>
TogoFire pushed a commit that referenced this issue Mar 26, 2023
A recent clang 16 update has introduced segfault when compiling kernel
using the `-polly-invariant-load-hoisting` flag and breaks compilation
when kernel is compiled with full LTO.
This is due to the flag being ignorant about the errors during SCoP
verification and does not take the errors into account as per a recent
issue opened at [LLVM], causing polly to do segfault and the compiler
to print the following backtrace:

PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.	Program arguments: clang -Wp,-MD,drivers/md/.md.o.d -nostdinc -isystem /tmp/cirrus-ci-build/toolchains/clang/lib/clang/16.0.0/include -I../arch/arm64/include -I./arch/arm64/include/generated -I../include -I./include -I../arch/arm64/include/uapi -I./arch/arm64/include/generated/uapi -I../include/uapi -I./include/generated/uapi -include ../include/linux/kconfig.h -include ../include/linux/compiler_types.h -I../drivers/md -Idrivers/md -D__KERNEL__ -mlittle-endian -DKASAN_SHADOW_SCALE_SHIFT=3 -Qunused-arguments -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -fshort-wchar -Werror-implicit-function-declaration -Werror=return-type -Wno-format-security -std=gnu89 --target=aarch64-linux-gnu --prefix=/tmp/cirrus-ci-build/Kernel/../toolchains/clang/bin/aarch64-linux-gnu- --gcc-toolchain=/tmp/cirrus-ci-build/toolchains/clang -Wno-misleading-indentation -Wno-bool-operation -Werror=unknown-warning-option -Wno-unsequenced -opaque-pointers -fno-PIE -mgeneral-regs-only -DCONFIG_AS_LSE=1 -fno-asynchronous-unwind-tables -Wno-psabi -DKASAN_SHADOW_SCALE_SHIFT=3 -fno-delete-null-pointer-checks -Wno-frame-address -Wno-int-in-bool-context -Wno-address-of-packed-member -O3 -march=armv8.1-a+crypto+fp16+rcpc -mtune=cortex-a53 -mllvm -polly -mllvm -polly-ast-use-context -mllvm -polly-detect-keep-going -mllvm -polly-invariant-load-hoisting -mllvm -polly-run-inliner -mllvm -polly-vectorizer=stripmine -mllvm -polly-loopfusion-greedy=1 -mllvm -polly-reschedule=1 -mllvm -polly-postopts=1 -fstack-protector-strong --target=aarch64-linux-gnu --gcc-toolchain=/tmp/cirrus-ci-build/toolchains/clang -meabi gnu -Wno-format-invalid-specifier -Wno-gnu -Wno-duplicate-decl-specifier -Wno-asm-operand-widths -Wno-initializer-overrides -Wno-tautological-constant-out-of-range-compare -Wno-tautological-compare -mno-global-merge -Wno-void-ptr-dereference -Wno-unused-but-set-variable -Wno-unused-const-variable -fno-omit-frame-pointer -fno-optimize-sibling-calls -Wvla -flto -fwhole-program-vtables -fvisibility=hidden -Wdeclaration-after-statement -Wno-pointer-sign -Wno-array-bounds -fno-strict-overflow -fno-stack-check -Werror=implicit-int -Werror=strict-prototypes -Werror=date-time -Werror=incompatible-pointer-types -fmacro-prefix-map=../= -Wno-initializer-overrides -Wno-unused-value -Wno-format -Wno-sign-compare -Wno-format-zero-length -Wno-uninitialized -Wno-pointer-to-enum-cast -Wno-unaligned-access -DKBUILD_BASENAME=\"md\" -DKBUILD_MODNAME=\"md_mod\" -D__KBUILD_MODNAME=kmod_md_mod -c -o drivers/md/md.o ../drivers/md/md.c
1.	<eof> parser at end of file
2.	Optimizer
  CC      drivers/media/platform/msm/camera_v2/camera/camera.o
  AR      drivers/media/pci/intel/ipu3/built-in.a
  CC      drivers/md/dm-linear.o
 #0 0x0000559d3527073f (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x3a7073f)
 #1 0x0000559d352705bf (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x3a705bf)
 #2 0x0000559d3523b198 (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x3a3b198)
 #3 0x0000559d3523b33e (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x3a3b33e)
 #4 0x00007f339dc3ea00 (/usr/lib/libc.so.6+0x38a00)
 #5 0x0000559d35affccf (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x42ffccf)
 #6 0x0000559d35b01710 (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x4301710)
 #7 0x0000559d35b01a12 (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x4301a12)
 #8 0x0000559d35b09a9e (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x4309a9e)
 #9 0x0000559d35b14707 (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x4314707)
clang-16: error: clang frontend command failed with exit code 139 (use -v to see invocation)
Neutron clang version 16.0.0 (https://github.com/llvm/llvm-project.git 598f5275c16049b1e1b5bc934cbde447a82d485e)
Target: aarch64-unknown-linux-gnu
Thread model: posix
InstalledDir: /tmp/cirrus-ci-build/Kernel/../toolchains/clang/bin

From the very nature of `-polly-detect-keep-going`, it can be concluded
that this is a potentially unsafe flag and instead of using it conditionally
for CLANG < 16, just remove it all together for all. This should allow polly
to run SCoP verifications as intended.

Issue [LLVM]: llvm/llvm-project#58484 (comment)

Change-Id: 9a2d8205cf5e0259d6b261e2d7753f3d939759bf
Signed-off-by: Tashfin Shakeer Rhythm <tashfinshakeerrhythm@gmail.com>
Signed-off-by: TogoFire <togofire@mailfence.com>
TogoFire pushed a commit that referenced this issue Mar 26, 2023
A recent clang 16 update has introduced segfault when compiling kernel
using the `-polly-invariant-load-hoisting` flag and breaks compilation
when kernel is compiled with full LTO.
This is due to the flag being ignorant about the errors during SCoP
verification and does not take the errors into account as per a recent
issue opened at [LLVM], causing polly to do segfault and the compiler
to print the following backtrace:

PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.	Program arguments: clang -Wp,-MD,drivers/md/.md.o.d -nostdinc -isystem /tmp/cirrus-ci-build/toolchains/clang/lib/clang/16.0.0/include -I../arch/arm64/include -I./arch/arm64/include/generated -I../include -I./include -I../arch/arm64/include/uapi -I./arch/arm64/include/generated/uapi -I../include/uapi -I./include/generated/uapi -include ../include/linux/kconfig.h -include ../include/linux/compiler_types.h -I../drivers/md -Idrivers/md -D__KERNEL__ -mlittle-endian -DKASAN_SHADOW_SCALE_SHIFT=3 -Qunused-arguments -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -fshort-wchar -Werror-implicit-function-declaration -Werror=return-type -Wno-format-security -std=gnu89 --target=aarch64-linux-gnu --prefix=/tmp/cirrus-ci-build/Kernel/../toolchains/clang/bin/aarch64-linux-gnu- --gcc-toolchain=/tmp/cirrus-ci-build/toolchains/clang -Wno-misleading-indentation -Wno-bool-operation -Werror=unknown-warning-option -Wno-unsequenced -opaque-pointers -fno-PIE -mgeneral-regs-only -DCONFIG_AS_LSE=1 -fno-asynchronous-unwind-tables -Wno-psabi -DKASAN_SHADOW_SCALE_SHIFT=3 -fno-delete-null-pointer-checks -Wno-frame-address -Wno-int-in-bool-context -Wno-address-of-packed-member -O3 -march=armv8.1-a+crypto+fp16+rcpc -mtune=cortex-a53 -mllvm -polly -mllvm -polly-ast-use-context -mllvm -polly-detect-keep-going -mllvm -polly-invariant-load-hoisting -mllvm -polly-run-inliner -mllvm -polly-vectorizer=stripmine -mllvm -polly-loopfusion-greedy=1 -mllvm -polly-reschedule=1 -mllvm -polly-postopts=1 -fstack-protector-strong --target=aarch64-linux-gnu --gcc-toolchain=/tmp/cirrus-ci-build/toolchains/clang -meabi gnu -Wno-format-invalid-specifier -Wno-gnu -Wno-duplicate-decl-specifier -Wno-asm-operand-widths -Wno-initializer-overrides -Wno-tautological-constant-out-of-range-compare -Wno-tautological-compare -mno-global-merge -Wno-void-ptr-dereference -Wno-unused-but-set-variable -Wno-unused-const-variable -fno-omit-frame-pointer -fno-optimize-sibling-calls -Wvla -flto -fwhole-program-vtables -fvisibility=hidden -Wdeclaration-after-statement -Wno-pointer-sign -Wno-array-bounds -fno-strict-overflow -fno-stack-check -Werror=implicit-int -Werror=strict-prototypes -Werror=date-time -Werror=incompatible-pointer-types -fmacro-prefix-map=../= -Wno-initializer-overrides -Wno-unused-value -Wno-format -Wno-sign-compare -Wno-format-zero-length -Wno-uninitialized -Wno-pointer-to-enum-cast -Wno-unaligned-access -DKBUILD_BASENAME=\"md\" -DKBUILD_MODNAME=\"md_mod\" -D__KBUILD_MODNAME=kmod_md_mod -c -o drivers/md/md.o ../drivers/md/md.c
1.	<eof> parser at end of file
2.	Optimizer
  CC      drivers/media/platform/msm/camera_v2/camera/camera.o
  AR      drivers/media/pci/intel/ipu3/built-in.a
  CC      drivers/md/dm-linear.o
 #0 0x0000559d3527073f (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x3a7073f)
 #1 0x0000559d352705bf (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x3a705bf)
 #2 0x0000559d3523b198 (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x3a3b198)
 #3 0x0000559d3523b33e (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x3a3b33e)
 #4 0x00007f339dc3ea00 (/usr/lib/libc.so.6+0x38a00)
 #5 0x0000559d35affccf (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x42ffccf)
 #6 0x0000559d35b01710 (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x4301710)
 #7 0x0000559d35b01a12 (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x4301a12)
 #8 0x0000559d35b09a9e (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x4309a9e)
 #9 0x0000559d35b14707 (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x4314707)
clang-16: error: clang frontend command failed with exit code 139 (use -v to see invocation)
Neutron clang version 16.0.0 (https://github.com/llvm/llvm-project.git 598f5275c16049b1e1b5bc934cbde447a82d485e)
Target: aarch64-unknown-linux-gnu
Thread model: posix
InstalledDir: /tmp/cirrus-ci-build/Kernel/../toolchains/clang/bin

From the very nature of `-polly-detect-keep-going`, it can be concluded
that this is a potentially unsafe flag and instead of using it conditionally
for CLANG < 16, just remove it all together for all. This should allow polly
to run SCoP verifications as intended.

Issue [LLVM]: llvm/llvm-project#58484 (comment)

Change-Id: 9a2d8205cf5e0259d6b261e2d7753f3d939759bf
Signed-off-by: Tashfin Shakeer Rhythm <tashfinshakeerrhythm@gmail.com>
Signed-off-by: TogoFire <togofire@mailfence.com>
TogoFire pushed a commit that referenced this issue May 31, 2023
fsck.f2fs forces a filesystem fix on boot if it detects that the current
kernel version differs from the one saved in the superblock, which results in
fsck blocking boot for a long time (~35 seconds). This commit provides a
way to report a constant fake kernel version to fsck to avoid triggering
the version check, which is useful if you boot new kernel builds
frequently.

Signed-off-by: Danny Lin <danny@kdrag0n.dev>
Change-Id: Ibd3fc6305fe49a322d9b34884b6e45979a056906

Signed-off-by: Carlos Jimenez (JavaShin-X) <javashin1986@gmail.com>

ocean stock-kernel =

CONFIG_F2FS_REPORT_FAKE_KERNEL_VERSION=y
CONFIG_F2FS_FAKE_KERNEL_RELEASE="4.9.206-perf+"
CONFIG_F2FS_FAKE_KERNEL_VERSION="#1 SMP PREEMPT Wed Sep 30 23:50:55 CDT 2020 aarch64"

Signed-off-by: Carlos Jimenez (JavaShin-X) <javashin1986@gmail.com>
Signed-off-by: TogoFire <italomellopereira@gmail.com>
TogoFire pushed a commit that referenced this issue May 31, 2023
Patch series "lib/sort & lib/list_sort: faster and smaller", v2.

Because CONFIG_RETPOLINE has made indirect calls much more expensive, I
thought I'd try to reduce the number made by the library sort functions.

The first three patches apply to lib/sort.c.

Patch #1 is a simple optimization.  The built-in swap has special cases
for aligned 4- and 8-byte objects.  But those are almost never used;
most calls to sort() work on larger structures, which fall back to the
byte-at-a-time loop.  This generalizes them to aligned *multiples* of 4
and 8 bytes.  (If nothing else, it saves an awful lot of energy by not
thrashing the store buffers as much.)

Patch #2 grabs a juicy piece of low-hanging fruit.  I agree that nice
simple solid heapsort is preferable to more complex algorithms (sorry,
Andrey), but it's possible to implement heapsort with far fewer
comparisons (50% asymptotically, 25-40% reduction for realistic sizes)
than the way it's been done up to now.  And with some care, the code
ends up smaller, as well.  This is the "big win" patch.

Patch #3 adds the same sort of indirect call bypass that has been added
to the net code of late.  The great majority of the callers use the
builtin swap functions, so replace the indirect call to sort_func with a
(highly preditable) series of if() statements.  Rather surprisingly,
this decreased code size, as the swap functions were inlined and their
prologue & epilogue code eliminated.

lib/list_sort.c is a bit trickier, as merge sort is already close to
optimal, and we don't want to introduce triumphs of theory over
practicality like the Ford-Johnson merge-insertion sort.

Patch #4, without changing the algorithm, chops 32% off the code size
and removes the part[MAX_LIST_LENGTH+1] pointer array (and the
corresponding upper limit on efficiently sortable input size).

Patch #5 improves the algorithm.  The previous code is already optimal
for power-of-two (or slightly smaller) size inputs, but when the input
size is just over a power of 2, there's a very unbalanced final merge.

There are, in the literature, several algorithms which solve this, but
they all depend on the "breadth-first" merge order which was replaced by
commit 835cc0c with a more cache-friendly "depth-first" order.
Some hard thinking came up with a depth-first algorithm which defers
merges as little as possible while avoiding bad merges.  This saves
0.2*n compares, averaged over all sizes.

The code size increase is minimal (64 bytes on x86-64, reducing the net
savings to 26%), but the comments expanded significantly to document the
clever algorithm.

TESTING NOTES: I have some ugly user-space benchmarking code which I
used for testing before moving this code into the kernel.  Shout if you
want a copy.

I'm running this code right now, with CONFIG_TEST_SORT and
CONFIG_TEST_LIST_SORT, but I confess I haven't rebooted since the last
round of minor edits to quell checkpatch.  I figure there will be at
least one round of comments and final testing.

This patch (of 5):

Rather than having special-case swap functions for 4- and 8-byte
objects, special-case aligned multiples of 4 or 8 bytes.  This speeds up
most users of sort() by avoiding fallback to the byte copy loop.

Despite what ca96ab8 ("lib/sort: Add 64 bit swap function") claims,
very few users of sort() sort pointers (or pointer-sized objects); most
sort structures containing at least two words.  (E.g.
drivers/acpi/fan.c:acpi_fan_get_fps() sorts an array of 40-byte struct
acpi_fan_fps.)

The functions also got renamed to reflect the fact that they support
multiple words.  In the great tradition of bikeshedding, the names were
by far the most contentious issue during review of this patch series.

x86-64 code size 872 -> 886 bytes (+14)

With feedback from Andy Shevchenko, Rasmus Villemoes and Geert
Uytterhoeven.

Link: http://lkml.kernel.org/r/f24f932df3a7fa1973c1084154f1cea596bcf341.1552704200.git.lkml@sdf.org
Signed-off-by: George Spelvin <lkml@sdf.org>
Change-Id: Ifc0bc97f8c9fc4c8ffbd122c0b44ec879ca7e369
Acked-by: Andrey Abramov <st5pub@yandex.ru>
Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Daniel Wagner <daniel.wagner@siemens.com>
Cc: Don Mullis <don.mullis@gmail.com>
Cc: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
TogoFire pushed a commit that referenced this issue May 31, 2023
commit ba5a4fdd63ae0c575707030db0b634b160baddd7 upstream.

syzbot complained about a recent change in TCP stack,
hitting a NULL pointer [1]

tcp request sockets have an af_specific pointer, which
was used before the blamed change only for SYNACK generation
in non SYNCOOKIE mode.

tcp requests sockets momentarily created when third packet
coming from client in SYNCOOKIE mode were not using
treq->af_specific.

Make sure this field is populated, in the same way normal
TCP requests sockets do in tcp_conn_request().

[1]
TCP: request_sock_TCPv6: Possible SYN flooding on port 20002. Sending cookies.  Check SNMP counters.
general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f]
CPU: 1 PID: 3695 Comm: syz-executor864 Not tainted 5.18.0-rc3-syzkaller-00224-g5fd1fe4807f9 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:tcp_create_openreq_child+0xe16/0x16b0 net/ipv4/tcp_minisocks.c:534
Code: 48 c1 ea 03 80 3c 02 00 0f 85 e5 07 00 00 4c 8b b3 28 01 00 00 48 b8 00 00 00 00 00 fc ff df 49 8d 7e 08 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 c9 07 00 00 48 8b 3c 24 48 89 de 41 ff 56 08 48
RSP: 0018:ffffc90000de0588 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: ffff888076490330 RCX: 0000000000000100
RDX: 0000000000000001 RSI: ffffffff87d67ff0 RDI: 0000000000000008
RBP: ffff88806ee1c7f8 R08: 0000000000000000 R09: 0000000000000000
R10: ffffffff87d67f00 R11: 0000000000000000 R12: ffff88806ee1bfc0
R13: ffff88801b0e0368 R14: 0000000000000000 R15: 0000000000000000
FS:  00007f517fe58700(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffcead76960 CR3: 000000006f97b000 CR4: 00000000003506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <IRQ>
 tcp_v6_syn_recv_sock+0x199/0x23b0 net/ipv6/tcp_ipv6.c:1267
 tcp_get_cookie_sock+0xc9/0x850 net/ipv4/syncookies.c:207
 cookie_v6_check+0x15c3/0x2340 net/ipv6/syncookies.c:258
 tcp_v6_cookie_check net/ipv6/tcp_ipv6.c:1131 [inline]
 tcp_v6_do_rcv+0x1148/0x13b0 net/ipv6/tcp_ipv6.c:1486
 tcp_v6_rcv+0x3305/0x3840 net/ipv6/tcp_ipv6.c:1725
 ip6_protocol_deliver_rcu+0x2e9/0x1900 net/ipv6/ip6_input.c:422
 ip6_input_finish+0x14c/0x2c0 net/ipv6/ip6_input.c:464
 NF_HOOK include/linux/netfilter.h:307 [inline]
 NF_HOOK include/linux/netfilter.h:301 [inline]
 ip6_input+0x9c/0xd0 net/ipv6/ip6_input.c:473
 dst_input include/net/dst.h:461 [inline]
 ip6_rcv_finish net/ipv6/ip6_input.c:76 [inline]
 NF_HOOK include/linux/netfilter.h:307 [inline]
 NF_HOOK include/linux/netfilter.h:301 [inline]
 ipv6_rcv+0x27f/0x3b0 net/ipv6/ip6_input.c:297
 __netif_receive_skb_one_core+0x114/0x180 net/core/dev.c:5405
 __netif_receive_skb+0x24/0x1b0 net/core/dev.c:5519
 process_backlog+0x3a0/0x7c0 net/core/dev.c:5847
 __napi_poll+0xb3/0x6e0 net/core/dev.c:6413
 napi_poll net/core/dev.c:6480 [inline]
 net_rx_action+0x8ec/0xc60 net/core/dev.c:6567
 __do_softirq+0x29b/0x9c2 kernel/softirq.c:558
 invoke_softirq kernel/softirq.c:432 [inline]
 __irq_exit_rcu+0x123/0x180 kernel/softirq.c:637
 irq_exit_rcu+0x5/0x20 kernel/softirq.c:649
 sysvec_apic_timer_interrupt+0x93/0xc0 arch/x86/kernel/apic/apic.c:1097

Fixes: 5b0b9e4c2c89 ("tcp: md5: incorrect tcp_header_len for incoming connections")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Change-Id: Ic4ffbecd0373d649a67a5b2c464b99f4db265de3
Cc: Francesco Ruggeri <fruggeri@arista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[fruggeri: Account for backport conflicts from 35b2c3211609 and 6fc8c827dd4f]
Signed-off-by: Francesco Ruggeri <fruggeri@arista.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
TogoFire pushed a commit that referenced this issue May 31, 2023
During boot we observed several warnings that can be avoided

[    6.725935] pmic_analog_codec 800f000.qcom,spmi:qcom,pm660l@3:analog-codec@f000: Error: regulator not found:cdc-vdd-spkdrv
[    6.726304] ------------[ cut here ]------------
[    6.726364] WARNING: CPU: 1 PID: 455 at regcache_cache_only+0xf0/0x100
[    6.726452] Modules linked in:
[    6.726460] CPU: 1 PID: 455 Comm: kworker/1:3 Not tainted 4.9.170-g00294e841ec8-dirty #1
[    6.726462] Hardware name: Qualcomm Technologies, Inc. SDM 630 PM660 + PM660L MTP (DT)
[    6.726471] Workqueue: events deferred_probe_work_func
[    6.726473] task: ffffffedb575b600 task.stack: ffffffedb5770000
[    6.726476] PC is at regcache_cache_only+0xf0/0x100
[    6.726478] LR is at regcache_cache_only+0x2c/0x100

Signed-off-by: Alin Jerpelea <alin.jerpelea@sony.com>
Change-Id: I7146b0de1f3e96016810574d9ca3b9e115ea38a3
Signed-off-by: TogoFire <italomellopereira@gmail.com>
TogoFire pushed a commit that referenced this issue May 31, 2023
log:
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0. Program arguments: /home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld -EL -maarch64elf --lto-O3 -z noexecstack -r -o vmlinux.o --whole-archive built -in.o
 #0 0x000055d50c1017a3 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld+0x25e87a3)
 #1 0x000055d50c1013fa (/home/togo/Documents/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld+0x25e83fa)
 #2 0x00007f52034b9b50 __restore_rt (/lib64/libc.so.6+0x3cb50)
 #3 0x000055d50c38a111 void lld::elf::writeResult<llvm::object::ELFType<(llvm::support::endianness)1, true>>() (/home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang- neutron/bin/ld.lld+0x2871111)
 #4 0x000055d50c1e539e lld::elf::LinkerDriver::link(llvm::opt::InputArgList&) (/home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld+0x26cc39e)
 #5 0x000055d50c1ce651 lld::elf::LinkerDriver::linkerMain(llvm::ArrayRef<char const*>) (/home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld+0x26b5651)
 #6 0x000055d50c1cbc96 lld::elf::link(llvm::ArrayRef<char const*>, llvm::raw_ostream&, llvm::raw_ostream&, bool, bool) (/home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang-neutron /bin/ld.lld+0x26b2c96)
 #7 0x000055d50c0aad03 (/home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld+0x2591d03)
 #8 0x000055d50bc2bc3e lld_main(int, char**) (/home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld+0x2112c3e)
 #9 0x00007f52034a4510 __libc_start_call_main (/lib64/libc.so.6+0x27510)
#10 0x00007f52034a45c9 __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x275c9)
#11 0x000055d50c09e9f5 _start (/home/togo/Documents/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld+0x25859f5)
../scripts/link-vmlinux.sh, line 94: 60322 Segmentation fault (core image recorded)${LDFINAL} ${LDFLAGS} -r -o ${1} $(lto_lds) ${objects}
make[1]: *** [/home/togo/Documents/GitHub/kernel_xiaomi_panda/Makefile:1272: vmlinux] Error 139
make[1]: Leaving directory '/home/togo/Documentos/GitHub/kernel_xiaomi_panda/out'

make: *** [Makefile:154: sub-make] Error 2

Signed-off-by: GeoPD <geoemmanuelpd2001@gmail.com>
Change-Id: I7a032246e6fc6d7c373206ac6736bf3ea96a37f2
Signed-off-by: TogoFire <italomellopereira@gmail.com>
TogoFire pushed a commit that referenced this issue May 31, 2023
With RT_FULL we get the below wreckage:

[  126.060484] =======================================================
[  126.060486] [ INFO: possible circular locking dependency detected ]
[  126.060489] 3.0.1-rt10+ #30
[  126.060490] -------------------------------------------------------
[  126.060492] irq/24-eth0/1235 is trying to acquire lock:
[  126.060495]  (&(lock)->wait_lock#2){+.+...}, at: [<ffffffff81501c81>] rt_mutex_slowunlock+0x16/0x55
[  126.060503]
[  126.060504] but task is already holding lock:
[  126.060506]  (&p->pi_lock){-...-.}, at: [<ffffffff81074fdc>] try_to_wake_up+0x35/0x429
[  126.060511]
[  126.060511] which lock already depends on the new lock.
[  126.060513]
[  126.060514]
[  126.060514] the existing dependency chain (in reverse order) is:
[  126.060516]
[  126.060516] -> #1 (&p->pi_lock){-...-.}:
[  126.060519]        [<ffffffff810afe9e>] lock_acquire+0x145/0x18a
[  126.060524]        [<ffffffff8150291e>] _raw_spin_lock_irqsave+0x4b/0x85
[  126.060527]        [<ffffffff810b5aa4>] task_blocks_on_rt_mutex+0x36/0x20f
[  126.060531]        [<ffffffff815019bb>] rt_mutex_slowlock+0xd1/0x15a
[  126.060534]        [<ffffffff81501ae3>] rt_mutex_lock+0x2d/0x2f
[  126.060537]        [<ffffffff810d9020>] rcu_boost+0xad/0xde
[  126.060541]        [<ffffffff810d90ce>] rcu_boost_kthread+0x7d/0x9b
[  126.060544]        [<ffffffff8109a760>] kthread+0x99/0xa1
[  126.060547]        [<ffffffff81509b14>] kernel_thread_helper+0x4/0x10
[  126.060551]
[  126.060552] -> #0 (&(lock)->wait_lock#2){+.+...}:
[  126.060555]        [<ffffffff810af1b8>] __lock_acquire+0x1157/0x1816
[  126.060558]        [<ffffffff810afe9e>] lock_acquire+0x145/0x18a
[  126.060561]        [<ffffffff8150279e>] _raw_spin_lock+0x40/0x73
[  126.060564]        [<ffffffff81501c81>] rt_mutex_slowunlock+0x16/0x55
[  126.060566]        [<ffffffff81501ce7>] rt_mutex_unlock+0x27/0x29
[  126.060569]        [<ffffffff810d9f86>] rcu_read_unlock_special+0x17e/0x1c4
[  126.060573]        [<ffffffff810da014>] __rcu_read_unlock+0x48/0x89
[  126.060576]        [<ffffffff8106847a>] select_task_rq_rt+0xc7/0xd5
[  126.060580]        [<ffffffff8107511c>] try_to_wake_up+0x175/0x429
[  126.060583]        [<ffffffff81075425>] wake_up_process+0x15/0x17
[  126.060585]        [<ffffffff81080a51>] wakeup_softirqd+0x24/0x26
[  126.060590]        [<ffffffff81081df9>] irq_exit+0x49/0x55
[  126.060593]        [<ffffffff8150a3bd>] smp_apic_timer_interrupt+0x8a/0x98
[  126.060597]        [<ffffffff81509793>] apic_timer_interrupt+0x13/0x20
[  126.060600]        [<ffffffff810d5952>] irq_forced_thread_fn+0x1b/0x44
[  126.060603]        [<ffffffff810d582c>] irq_thread+0xde/0x1af
[  126.060606]        [<ffffffff8109a760>] kthread+0x99/0xa1
[  126.060608]        [<ffffffff81509b14>] kernel_thread_helper+0x4/0x10
[  126.060611]
[  126.060612] other info that might help us debug this:
[  126.060614]
[  126.060615]  Possible unsafe locking scenario:
[  126.060616]
[  126.060617]        CPU0                    CPU1
[  126.060619]        ----                    ----
[  126.060620]   lock(&p->pi_lock);
[  126.060623]                                lock(&(lock)->wait_lock);
[  126.060625]                                lock(&p->pi_lock);
[  126.060627]   lock(&(lock)->wait_lock);
[  126.060629]
[  126.060629]  *** DEADLOCK ***
[  126.060630]
[  126.060632] 1 lock held by irq/24-eth0/1235:
[  126.060633]  #0:  (&p->pi_lock){-...-.}, at: [<ffffffff81074fdc>] try_to_wake_up+0x35/0x429
[  126.060638]
[  126.060638] stack backtrace:
[  126.060641] Pid: 1235, comm: irq/24-eth0 Not tainted 3.0.1-rt10+ #30
[  126.060643] Call Trace:
[  126.060644]  <IRQ>  [<ffffffff810acbde>] print_circular_bug+0x289/0x29a
[  126.060651]  [<ffffffff810af1b8>] __lock_acquire+0x1157/0x1816
[  126.060655]  [<ffffffff810ab3aa>] ? trace_hardirqs_off_caller+0x1f/0x99
[  126.060658]  [<ffffffff81501c81>] ? rt_mutex_slowunlock+0x16/0x55
[  126.060661]  [<ffffffff810afe9e>] lock_acquire+0x145/0x18a
[  126.060664]  [<ffffffff81501c81>] ? rt_mutex_slowunlock+0x16/0x55
[  126.060668]  [<ffffffff8150279e>] _raw_spin_lock+0x40/0x73
[  126.060671]  [<ffffffff81501c81>] ? rt_mutex_slowunlock+0x16/0x55
[  126.060674]  [<ffffffff810d9655>] ? rcu_report_qs_rsp+0x87/0x8c
[  126.060677]  [<ffffffff81501c81>] rt_mutex_slowunlock+0x16/0x55
[  126.060680]  [<ffffffff810d9ea3>] ? rcu_read_unlock_special+0x9b/0x1c4
[  126.060683]  [<ffffffff81501ce7>] rt_mutex_unlock+0x27/0x29
[  126.060687]  [<ffffffff810d9f86>] rcu_read_unlock_special+0x17e/0x1c4
[  126.060690]  [<ffffffff810da014>] __rcu_read_unlock+0x48/0x89
[  126.060693]  [<ffffffff8106847a>] select_task_rq_rt+0xc7/0xd5
[  126.060696]  [<ffffffff810683da>] ? select_task_rq_rt+0x27/0xd5
[  126.060701]  [<ffffffff810a852a>] ? clockevents_program_event+0x8e/0x90
[  126.060704]  [<ffffffff8107511c>] try_to_wake_up+0x175/0x429
[  126.060708]  [<ffffffff810a95dc>] ? tick_program_event+0x1f/0x21
[  126.060711]  [<ffffffff81075425>] wake_up_process+0x15/0x17
[  126.060715]  [<ffffffff81080a51>] wakeup_softirqd+0x24/0x26
[  126.060718]  [<ffffffff81081df9>] irq_exit+0x49/0x55
[  126.060721]  [<ffffffff8150a3bd>] smp_apic_timer_interrupt+0x8a/0x98
[  126.060724]  [<ffffffff81509793>] apic_timer_interrupt+0x13/0x20
[  126.060726]  <EOI>  [<ffffffff81072855>] ? migrate_disable+0x75/0x12d
[  126.060733]  [<ffffffff81080a61>] ? local_bh_disable+0xe/0x1f
[  126.060736]  [<ffffffff81080a70>] ? local_bh_disable+0x1d/0x1f
[  126.060739]  [<ffffffff810d5952>] irq_forced_thread_fn+0x1b/0x44
[  126.060742]  [<ffffffff81502ac0>] ? _raw_spin_unlock_irq+0x3b/0x59
[  126.060745]  [<ffffffff810d582c>] irq_thread+0xde/0x1af
[  126.060748]  [<ffffffff810d5937>] ? irq_thread_fn+0x3a/0x3a
[  126.060751]  [<ffffffff810d574e>] ? irq_finalize_oneshot+0xd1/0xd1
[  126.060754]  [<ffffffff810d574e>] ? irq_finalize_oneshot+0xd1/0xd1
[  126.060757]  [<ffffffff8109a760>] kthread+0x99/0xa1
[  126.060761]  [<ffffffff81509b14>] kernel_thread_helper+0x4/0x10
[  126.060764]  [<ffffffff81069ed7>] ? finish_task_switch+0x87/0x10a
[  126.060768]  [<ffffffff81502ec4>] ? retint_restore_args+0xe/0xe
[  126.060771]  [<ffffffff8109a6c7>] ? __init_kthread_worker+0x8c/0x8c
[  126.060774]  [<ffffffff81509b10>] ? gs_change+0xb/0xb

Because irq_exit() does:

void irq_exit(void)
{
	account_system_vtime(current);
	trace_hardirq_exit();
	sub_preempt_count(IRQ_EXIT_OFFSET);
	if (!in_interrupt() && local_softirq_pending())
		invoke_softirq();

	...
}

Which triggers a wakeup, which uses RCU, now if the interrupted task has
t->rcu_read_unlock_special set, the rcu usage from the wakeup will end
up in rcu_read_unlock_special(). rcu_read_unlock_special() will test
for in_irq(), which will fail as we just decremented preempt_count
with IRQ_EXIT_OFFSET, and in_sering_softirq(), which for
PREEMPT_RT_FULL reads:

int in_serving_softirq(void)
{
	int res;

	preempt_disable();
	res = __get_cpu_var(local_softirq_runner) == current;
	preempt_enable();
	return res;
}

Which will thus also fail, resulting in the above wreckage.

The 'somewhat' ugly solution is to open-code the preempt_count() test
in rcu_read_unlock_special().

Also, we're not at all sure how ->rcu_read_unlock_special gets set
here... so this is very likely a bandaid and more thought is required.

Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Change-Id: I35c03fa69f134baaf5a6dcf75c6ac096d706f3fa
Signed-off-by: Tashfin Shakeer Rhythm <tashfinshakeerrhythm@gmail.com>
Signed-off-by: TogoFire <togofire@mailfence.com>
TogoFire pushed a commit that referenced this issue May 31, 2023
We don't need to hold the local pinctrl lock here to set irq wake on the
summary irq line. Doing so only leads to lockdep warnings instead of
protecting us from anything. Remove the locking.

 WARNING: possible circular locking dependency detected
 5.4.11 #2 Tainted: G        W
 ------------------------------------------------------
 cat/3083 is trying to acquire lock:
 ffffff81f4fa58c0 (&irq_desc_lock_class){-.-.}, at: __irq_get_desc_lock+0x64/0x94

 but task is already holding lock:
 ffffff81f4880c18 (&pctrl->lock){-.-.}, at: msm_gpio_irq_set_wake+0x48/0x7c

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> #1 (&pctrl->lock){-.-.}:
        _raw_spin_lock_irqsave+0x64/0x80
        msm_gpio_irq_ack+0x68/0xf4
        __irq_do_set_handler+0xe0/0x180
        __irq_set_handler+0x60/0x9c
        irq_domain_set_info+0x90/0xb4
        gpiochip_hierarchy_irq_domain_alloc+0x110/0x200
        __irq_domain_alloc_irqs+0x130/0x29c
        irq_create_fwspec_mapping+0x1f0/0x300
        irq_create_of_mapping+0x70/0x98
        of_irq_get+0xa4/0xd4
        spi_drv_probe+0x4c/0xb0
        really_probe+0x138/0x3f0
        driver_probe_device+0x70/0x140
        __device_attach_driver+0x9c/0x110
        bus_for_each_drv+0x88/0xd0
        __device_attach+0xb0/0x160
        device_initial_probe+0x20/0x2c
        bus_probe_device+0x34/0x94
        device_add+0x35c/0x3f0
        spi_add_device+0xbc/0x194
        of_register_spi_devices+0x2c8/0x408
        spi_register_controller+0x57c/0x6fc
        spi_geni_probe+0x260/0x328
        platform_drv_probe+0x90/0xb0
        really_probe+0x138/0x3f0
        driver_probe_device+0x70/0x140
        device_driver_attach+0x4c/0x6c
        __driver_attach+0xcc/0x154
        bus_for_each_dev+0x84/0xcc
        driver_attach+0x2c/0x38
        bus_add_driver+0x108/0x1fc
        driver_register+0x64/0xf8
        __platform_driver_register+0x4c/0x58
        spi_geni_driver_init+0x1c/0x24
        do_one_initcall+0x1a4/0x3e8
        do_initcall_level+0xb4/0xcc
        do_basic_setup+0x30/0x48
        kernel_init_freeable+0x124/0x1a8
        kernel_init+0x14/0x100
        ret_from_fork+0x10/0x18

 -> #0 (&irq_desc_lock_class){-.-.}:
        __lock_acquire+0xeb4/0x2388
        lock_acquire+0x1cc/0x210
        _raw_spin_lock_irqsave+0x64/0x80
        __irq_get_desc_lock+0x64/0x94
        irq_set_irq_wake+0x40/0x144
        msm_gpio_irq_set_wake+0x5c/0x7c
        set_irq_wake_real+0x40/0x5c
        irq_set_irq_wake+0x70/0x144
        cros_ec_rtc_suspend+0x38/0x4c
        platform_pm_suspend+0x34/0x60
        dpm_run_callback+0x64/0xcc
        __device_suspend+0x310/0x41c
        dpm_suspend+0xf8/0x298
        dpm_suspend_start+0x84/0xb4
        suspend_devices_and_enter+0xbc/0x620
        pm_suspend+0x210/0x348
        state_store+0xb0/0x108
        kobj_attr_store+0x14/0x24
        sysfs_kf_write+0x4c/0x64
        kernfs_fop_write+0x15c/0x1fc
        __vfs_write+0x54/0x18c
        vfs_write+0xe4/0x1a4
        ksys_write+0x7c/0xe4
        __arm64_sys_write+0x20/0x2c
        el0_svc_common+0xa8/0x160
        el0_svc_handler+0x7c/0x98
        el0_svc+0x8/0xc

 other info that might help us debug this:

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(&pctrl->lock);
                                lock(&irq_desc_lock_class);
                                lock(&pctrl->lock);
   lock(&irq_desc_lock_class);

  *** DEADLOCK ***

 7 locks held by cat/3083:
  #0: ffffff81f06d1420 (sb_writers#7){.+.+}, at: vfs_write+0xd0/0x1a4
  #1: ffffff81c8935680 (&of->mutex){+.+.}, at: kernfs_fop_write+0x12c/0x1fc
  #2: ffffff81f4c322f0 (kn->count#337){.+.+}, at: kernfs_fop_write+0x134/0x1fc
  #3: ffffffe89a641d60 (system_transition_mutex){+.+.}, at: pm_suspend+0x108/0x348
  #4: ffffff81f190e970 (&dev->mutex){....}, at: __device_suspend+0x168/0x41c
  #5: ffffff81f183d8c0 (lock_class){-.-.}, at: __irq_get_desc_lock+0x64/0x94
  #6: ffffff81f4880c18 (&pctrl->lock){-.-.}, at: msm_gpio_irq_set_wake+0x48/0x7c

 stack backtrace:
 CPU: 4 PID: 3083 Comm: cat Tainted: G        W         5.4.11 #2
 Hardware name: Google Cheza (rev3+) (DT)
 Call trace:
  dump_backtrace+0x0/0x174
  show_stack+0x20/0x2c
  dump_stack+0xc8/0x124
  print_circular_bug+0x2ac/0x2c4
  check_noncircular+0x1a0/0x1a8
  __lock_acquire+0xeb4/0x2388
  lock_acquire+0x1cc/0x210
  _raw_spin_lock_irqsave+0x64/0x80
  __irq_get_desc_lock+0x64/0x94
  irq_set_irq_wake+0x40/0x144
  msm_gpio_irq_set_wake+0x5c/0x7c
  set_irq_wake_real+0x40/0x5c
  irq_set_irq_wake+0x70/0x144
  cros_ec_rtc_suspend+0x38/0x4c
  platform_pm_suspend+0x34/0x60
  dpm_run_callback+0x64/0xcc
  __device_suspend+0x310/0x41c
  dpm_suspend+0xf8/0x298
  dpm_suspend_start+0x84/0xb4
  suspend_devices_and_enter+0xbc/0x620
  pm_suspend+0x210/0x348
  state_store+0xb0/0x108
  kobj_attr_store+0x14/0x24
  sysfs_kf_write+0x4c/0x64
  kernfs_fop_write+0x15c/0x1fc
  __vfs_write+0x54/0x18c
  vfs_write+0xe4/0x1a4
  ksys_write+0x7c/0xe4
  __arm64_sys_write+0x20/0x2c
  el0_svc_common+0xa8/0x160
  el0_svc_handler+0x7c/0x98
  el0_svc+0x8/0xc

Fixes: 6aced33 ("pinctrl: msm: drop wake_irqs bitmap")
Cc: Douglas Anderson <dianders@chromium.org>
Cc: Brian Masney <masneyb@onstation.org>
Cc: Lina Iyer <ilina@codeaurora.org>
Cc: Maulik Shah <mkshah@codeaurora.org>
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Change-Id: Icffb76346e708f27a490e6b76cc848101a551b24
Link: https://lore.kernel.org/r/20200121180950.36959-1-swboyd@chromium.org
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ahmad Thoriq Najahi <najahi@chips-projects.xyz>
Signed-off-by: Tashfin Shakeer Rhythm <tashfinshakeerrhythm@gmail.com>
Signed-off-by: TogoFire <togofire@mailfence.com>
TogoFire pushed a commit that referenced this issue May 31, 2023
On a system that often re-configures a USB gadget device, the kernel log
is filled with:

  configfs-gadget gadget: high-speed config #1: c

Reduce the verbosity of this print to debug.

Signed-off-by: Joel Stanley <joel@jms.id.au>
Change-Id: I51a0521aab677f20e2e573aef687b47c84f354dc
Signed-off-by: TogoFire <togofire@mailfence.com>
TogoFire pushed a commit that referenced this issue May 31, 2023
A recent clang 16 update has introduced segfault when compiling kernel
using the `-polly-invariant-load-hoisting` flag and breaks compilation
when kernel is compiled with full LTO.
This is due to the flag being ignorant about the errors during SCoP
verification and does not take the errors into account as per a recent
issue opened at [LLVM], causing polly to do segfault and the compiler
to print the following backtrace:

PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.	Program arguments: clang -Wp,-MD,drivers/md/.md.o.d -nostdinc -isystem /tmp/cirrus-ci-build/toolchains/clang/lib/clang/16.0.0/include -I../arch/arm64/include -I./arch/arm64/include/generated -I../include -I./include -I../arch/arm64/include/uapi -I./arch/arm64/include/generated/uapi -I../include/uapi -I./include/generated/uapi -include ../include/linux/kconfig.h -include ../include/linux/compiler_types.h -I../drivers/md -Idrivers/md -D__KERNEL__ -mlittle-endian -DKASAN_SHADOW_SCALE_SHIFT=3 -Qunused-arguments -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -fshort-wchar -Werror-implicit-function-declaration -Werror=return-type -Wno-format-security -std=gnu89 --target=aarch64-linux-gnu --prefix=/tmp/cirrus-ci-build/Kernel/../toolchains/clang/bin/aarch64-linux-gnu- --gcc-toolchain=/tmp/cirrus-ci-build/toolchains/clang -Wno-misleading-indentation -Wno-bool-operation -Werror=unknown-warning-option -Wno-unsequenced -opaque-pointers -fno-PIE -mgeneral-regs-only -DCONFIG_AS_LSE=1 -fno-asynchronous-unwind-tables -Wno-psabi -DKASAN_SHADOW_SCALE_SHIFT=3 -fno-delete-null-pointer-checks -Wno-frame-address -Wno-int-in-bool-context -Wno-address-of-packed-member -O3 -march=armv8.1-a+crypto+fp16+rcpc -mtune=cortex-a53 -mllvm -polly -mllvm -polly-ast-use-context -mllvm -polly-detect-keep-going -mllvm -polly-invariant-load-hoisting -mllvm -polly-run-inliner -mllvm -polly-vectorizer=stripmine -mllvm -polly-loopfusion-greedy=1 -mllvm -polly-reschedule=1 -mllvm -polly-postopts=1 -fstack-protector-strong --target=aarch64-linux-gnu --gcc-toolchain=/tmp/cirrus-ci-build/toolchains/clang -meabi gnu -Wno-format-invalid-specifier -Wno-gnu -Wno-duplicate-decl-specifier -Wno-asm-operand-widths -Wno-initializer-overrides -Wno-tautological-constant-out-of-range-compare -Wno-tautological-compare -mno-global-merge -Wno-void-ptr-dereference -Wno-unused-but-set-variable -Wno-unused-const-variable -fno-omit-frame-pointer -fno-optimize-sibling-calls -Wvla -flto -fwhole-program-vtables -fvisibility=hidden -Wdeclaration-after-statement -Wno-pointer-sign -Wno-array-bounds -fno-strict-overflow -fno-stack-check -Werror=implicit-int -Werror=strict-prototypes -Werror=date-time -Werror=incompatible-pointer-types -fmacro-prefix-map=../= -Wno-initializer-overrides -Wno-unused-value -Wno-format -Wno-sign-compare -Wno-format-zero-length -Wno-uninitialized -Wno-pointer-to-enum-cast -Wno-unaligned-access -DKBUILD_BASENAME=\"md\" -DKBUILD_MODNAME=\"md_mod\" -D__KBUILD_MODNAME=kmod_md_mod -c -o drivers/md/md.o ../drivers/md/md.c
1.	<eof> parser at end of file
2.	Optimizer
  CC      drivers/media/platform/msm/camera_v2/camera/camera.o
  AR      drivers/media/pci/intel/ipu3/built-in.a
  CC      drivers/md/dm-linear.o
 #0 0x0000559d3527073f (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x3a7073f)
 #1 0x0000559d352705bf (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x3a705bf)
 #2 0x0000559d3523b198 (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x3a3b198)
 #3 0x0000559d3523b33e (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x3a3b33e)
 #4 0x00007f339dc3ea00 (/usr/lib/libc.so.6+0x38a00)
 #5 0x0000559d35affccf (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x42ffccf)
 #6 0x0000559d35b01710 (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x4301710)
 #7 0x0000559d35b01a12 (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x4301a12)
 #8 0x0000559d35b09a9e (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x4309a9e)
 #9 0x0000559d35b14707 (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x4314707)
clang-16: error: clang frontend command failed with exit code 139 (use -v to see invocation)
Neutron clang version 16.0.0 (https://github.com/llvm/llvm-project.git 598f5275c16049b1e1b5bc934cbde447a82d485e)
Target: aarch64-unknown-linux-gnu
Thread model: posix
InstalledDir: /tmp/cirrus-ci-build/Kernel/../toolchains/clang/bin

From the very nature of `-polly-detect-keep-going`, it can be concluded
that this is a potentially unsafe flag and instead of using it conditionally
for CLANG < 16, just remove it all together for all. This should allow polly
to run SCoP verifications as intended.

Issue [LLVM]: llvm/llvm-project#58484 (comment)

Signed-off-by: Tashfin Shakeer Rhythm <tashfinshakeerrhythm@gmail.com>
Change-Id: I18aa6d58835ac2c79c4e5c070731e01a5fce9ad0
Signed-off-by: TogoFire <togofire@mailfence.com>
TogoFire pushed a commit that referenced this issue May 31, 2023
Qos requirement is not initialized when update is called.

This fixes the following call trace:
[...]
[   27.206894] msm_pm_qos_update_request: update request 100
[   27.206900] msm_pm_qos_add_request: add request
[   27.206904] pm_qos_update_request() called for unknown object
[   27.206915] ------------[ cut here ]------------
[   27.206928] WARNING: CPU: 5 PID: 1107 at kernel/power/qos.c:696 pm_qos_update_request+0x4c/0x60
[   27.206935] CPU: 5 PID: 1107 Comm: qvrservice Tainted: G S              4.14.185-Chips #1
[   27.206937] Hardware name: Qualcomm Technologies, Inc. trinket pm6125 + pmi632 IDP (DT)
[   27.206939] task: 00000000ffaf085c task.stack: 000000007a0f23be
[   27.206942] pc : pm_qos_update_request+0x4c/0x60
[   27.206943] lr : pm_qos_update_request+0x4c/0x60
[...]

Signed-off-by: Ahmad Thoriq Najahi <najahi@chips-projects.xyz>
Change-Id: I0b7c11953462aacdc85e8e1320782f46b5bdb87b
Signed-off-by: TogoFire <togofire@mailfence.com>
TogoFire pushed a commit that referenced this issue Jun 2, 2023
commit 26feb9bc3c5322d331d0b713106ac7631cbab938
Author: Ritesh Harjani <riteshh@codeaurora.org>
Date:   Wed Aug 9 18:28:32 2017 +0530

    cfq: Give a chance for arming slice idle timer in case of group_idle

    In below scenario blkio cgroup does not work as per their assigned
    weights :-
    1. When the underlying device is nonrotational with a single HW queue
    with depth of >= CFQ_HW_QUEUE_MIN
    2. When the use case is forming two blkio cgroups cg1(weight 1000) &
    cg2(wight 100) and two processes(file1 and file2) doing sync IO in
    their respective blkio cgroups.

    For above usecase result of fio (without this patch):-
    file1: (groupid=0, jobs=1): err= 0: pid=685: Thu Jan  1 19:41:49 1970
      write: IOPS=1315, BW=41.1MiB/s (43.1MB/s)(1024MiB/24906msec)
    <...>
    file2: (groupid=0, jobs=1): err= 0: pid=686: Thu Jan  1 19:41:49 1970
      write: IOPS=1295, BW=40.5MiB/s (42.5MB/s)(1024MiB/25293msec)
    <...>
    // both the process BW is equal even though they belong to diff.
    cgroups with weight of 1000(cg1) and 100(cg2)

    In above case (for non rotational NCQ devices),
    as soon as the request from cg1 is completed and even
    though it is provided with higher set_slice=10, because of CFQ
    algorithm when the driver tries to fetch the request, CFQ expires
    this group without providing any idle time nor weight priority
    and schedules another cfq group (in this case cg2).
    And thus both cfq groups(cg1 & cg2) keep alternating to get the
    disk time and hence loses the cgroup weight based scheduling.

    Below patch gives a chance to cfq algorithm (cfq_arm_slice_timer)
    to arm the slice timer in case group_idle is enabled.
    In case if group_idle is also not required (including for nonrotational
    NCQ drives), we need to explicitly set group_idle = 0 from sysfs for
    such cases.

    With this patch result of fio(for above usecase) :-
    file1: (groupid=0, jobs=1): err= 0: pid=690: Thu Jan  1 00:06:08 1970
      write: IOPS=1706, BW=53.3MiB/s (55.9MB/s)(1024MiB/19197msec)
    <..>
    file2: (groupid=0, jobs=1): err= 0: pid=691: Thu Jan  1 00:06:08 1970
      write: IOPS=1043, BW=32.6MiB/s (34.2MB/s)(1024MiB/31401msec)
    <..>
    // In this processes BW is as per their respective cgroups weight.

    Change-Id: Ia68ba16e76fdcc60d66e256f94290c871c5b2a73
    Signed-off-by: Ritesh Harjani <riteshh@codeaurora.org>
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
    Signed-off-by: mydongistiny <jaysonedson@gmail.com>
Change-Id: I13c34b62c5d006a2f079b66d4c8392b49e35a5a8

commit 2133aba5a779fa206289ea3e1ebbc5ef67be7d2c
Author: Hou Tao <houtao1@huawei.com>
Date:   Wed Mar 1 09:02:33 2017 +0800

    cfq-iosched: fix the delay of cfq_group's vdisktime under iops mode

    When adding a cfq_group into the cfq service tree, we use CFQ_IDLE_DELAY
    as the delay of cfq_group's vdisktime if there have been other cfq_groups
    already.

    When cfq is under iops mode, commit 9a7f38c ("cfq-iosched: Convert
    from jiffies to nanoseconds") could result in a large iops delay and
    lead to an abnormal io schedule delay for the added cfq_group. To fix
    it, we just need to revert to the old CFQ_IDLE_DELAY value: HZ / 5
    when iops mode is enabled.

    Despite having the same value, the delay of a cfq_queue in idle class
    and the delay of cfq_group are different things, so I define two new
    macros for the delay of a cfq_group under time-slice mode and iops mode.

    Change-Id: I6d3a048e063b2826136e5416c28c1a7b471f5d18
    Fixes: 9a7f38c ("cfq-iosched: Convert from jiffies to nanoseconds")
    Cc: <stable@vger.kernel.org> # 4.8+
    Signed-off-by: Hou Tao <houtao1@huawei.com>
    Acked-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Jens Axboe <axboe@fb.com>
    Signed-off-by: mydongistiny <jaysonedson@gmail.com>
Change-Id: I13c34b62c5d006a2f079b66d4c8392b49e35a5a8

commit 9b5738156b8f9da812966499f8982b6cfd25e144
Author: Matthias Kaehlcke <mka@chromium.org>
Date:   Fri May 26 14:22:37 2017 -0700

    cfq-iosched: Delete unused function min_vdisktime()

    This fixes the following warning when building with clang:

        block/cfq-iosched.c:970:19: error: unused function 'min_vdisktime'
            [-Werror,-Wunused-function]

    Change-Id: I74bcf4a5407496a627e6ac41b274c7a3cc887cc3
    Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
    Signed-off-by: Jens Axboe <axboe@fb.com>
    Signed-off-by: mydongistiny <jaysonedson@gmail.com>
Change-Id: I13c34b62c5d006a2f079b66d4c8392b49e35a5a8

commit 962bbb5e92abc53a11042a8d992057b6f1e7505a
Author: Markus Elfring <elfring@users.sourceforge.net>
Date:   Sat Jan 21 22:44:07 2017 +0100

    cfq-iosched: Adjust one function call together with a variable assignment

    The script "checkpatch.pl" pointed information out like the following.

    ERROR: do not use assignment in if condition

    Thus fix the affected source code place.

    Change-Id: I54338d5e9de5d3b8b593b63a535c98ec9a4a0f05
    Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
    Signed-off-by: Jens Axboe <axboe@fb.com>
    Signed-off-by: mydongistiny <jaysonedson@gmail.com>
Change-Id: I13c34b62c5d006a2f079b66d4c8392b49e35a5a8

commit a846d0ed19f1b51314a27a01810922643b457ec8
Author: Jens Axboe <axboe@fb.com>
Date:   Thu Jun 9 15:47:29 2016 -0600

    cfq-iosched: temporarily boost queue priority for idle classes

    If we're queuing REQ_PRIO IO and the task is running at an idle IO
    class, then temporarily boost the priority. This prevents livelocks
    due to priority inversion, when a low priority task is holding file
    system resources while attempting to do IO.

    An example of that is shown below. An ioniced idle task is holding
    the directory mutex, while a normal priority task is trying to do
    a directory lookup.

    [478381.198925] ------------[ cut here ]------------
    [478381.200315] INFO: task ionice:1168369 blocked for more than 120 seconds.
    [478381.201324]       Not tainted 4.0.9-38_fbk5_hotfix1_2936_g85409c6 #1
    [478381.202278] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    [478381.203462] ionice          D ffff8803692736a8     0 1168369      1 0x00000080
    [478381.203466]  ffff8803692736a8 ffff880399c21300 ffff880276adcc00 ffff880369273698
    [478381.204589]  ffff880369273fd8 0000000000000000 7fffffffffffffff 0000000000000002
    [478381.205752]  ffffffff8177d5e0 ffff8803692736c8 ffffffff8177cea7 0000000000000000
    [478381.206874] Call Trace:
    [478381.207253]  [<ffffffff8177d5e0>] ? bit_wait_io_timeout+0x80/0x80
    [478381.208175]  [<ffffffff8177cea7>] schedule+0x37/0x90
    [478381.208932]  [<ffffffff8177f5fc>] schedule_timeout+0x1dc/0x250
    [478381.209805]  [<ffffffff81421c17>] ? __blk_run_queue+0x37/0x50
    [478381.210706]  [<ffffffff810ca1c5>] ? ktime_get+0x45/0xb0
    [478381.211489]  [<ffffffff8177c407>] io_schedule_timeout+0xa7/0x110
    [478381.212402]  [<ffffffff810a8c2b>] ? prepare_to_wait+0x5b/0x90
    [478381.213280]  [<ffffffff8177d616>] bit_wait_io+0x36/0x50
    [478381.214063]  [<ffffffff8177d325>] __wait_on_bit+0x65/0x90
    [478381.214961]  [<ffffffff8177d5e0>] ? bit_wait_io_timeout+0x80/0x80
    [478381.215872]  [<ffffffff8177d47c>] out_of_line_wait_on_bit+0x7c/0x90
    [478381.216806]  [<ffffffff810a89f0>] ? wake_atomic_t_function+0x40/0x40
    [478381.217773]  [<ffffffff811f03aa>] __wait_on_buffer+0x2a/0x30
    [478381.218641]  [<ffffffff8123c557>] ext4_bread+0x57/0x70
    [478381.219425]  [<ffffffff8124498c>] __ext4_read_dirblock+0x3c/0x380
    [478381.220467]  [<ffffffff8124665d>] ext4_dx_find_entry+0x7d/0x170
    [478381.221357]  [<ffffffff8114c49e>] ? find_get_entry+0x1e/0xa0
    [478381.222208]  [<ffffffff81246bd4>] ext4_find_entry+0x484/0x510
    [478381.223090]  [<ffffffff812471a2>] ext4_lookup+0x52/0x160
    [478381.223882]  [<ffffffff811c401d>] lookup_real+0x1d/0x60
    [478381.224675]  [<ffffffff811c4698>] __lookup_hash+0x38/0x50
    [478381.225697]  [<ffffffff817745bd>] lookup_slow+0x45/0xab
    [478381.226941]  [<ffffffff811c690e>] link_path_walk+0x7ae/0x820
    [478381.227880]  [<ffffffff811c6a42>] path_init+0xc2/0x430
    [478381.228677]  [<ffffffff813e6e26>] ? security_file_alloc+0x16/0x20
    [478381.229776]  [<ffffffff811c8c57>] path_openat+0x77/0x620
    [478381.230767]  [<ffffffff81185c6e>] ? page_add_file_rmap+0x2e/0x70
    [478381.232019]  [<ffffffff811cb253>] do_filp_open+0x43/0xa0
    [478381.233016]  [<ffffffff8108c4a9>] ? creds_are_invalid+0x29/0x70
    [478381.234072]  [<ffffffff811c0cb0>] do_open_execat+0x70/0x170
    [478381.235039]  [<ffffffff811c1bf8>] do_execveat_common.isra.36+0x1b8/0x6e0
    [478381.236051]  [<ffffffff811c214c>] do_execve+0x2c/0x30
    [478381.236809]  [<ffffffff811ca392>] ? getname+0x12/0x20
    [478381.237564]  [<ffffffff811c23be>] SyS_execve+0x2e/0x40
    [478381.238338]  [<ffffffff81780a1d>] stub_execve+0x6d/0xa0
    [478381.239126] ------------[ cut here ]------------
    [478381.239915] ------------[ cut here ]------------
    [478381.240606] INFO: task python2.7:1168375 blocked for more than 120 seconds.
    [478381.242673]       Not tainted 4.0.9-38_fbk5_hotfix1_2936_g85409c6 #1
    [478381.243653] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    [478381.244902] python2.7       D ffff88005cf8fb98     0 1168375 1168248 0x00000080
    [478381.244904]  ffff88005cf8fb98 ffff88016c1f0980 ffffffff81c134c0 ffff88016c1f11a0
    [478381.246023]  ffff88005cf8ffd8 ffff880466cd0cbc ffff88016c1f0980 00000000ffffffff
    [478381.247138]  ffff880466cd0cc0 ffff88005cf8fbb8 ffffffff8177cea7 ffff88005cf8fcc8
    [478381.248252] Call Trace:
    [478381.248630]  [<ffffffff8177cea7>] schedule+0x37/0x90
    [478381.249382]  [<ffffffff8177d08e>] schedule_preempt_disabled+0xe/0x10
    [478381.250465]  [<ffffffff8177e892>] __mutex_lock_slowpath+0x92/0x100
    [478381.251409]  [<ffffffff8177e91b>] mutex_lock+0x1b/0x2f
    [478381.252199]  [<ffffffff817745ae>] lookup_slow+0x36/0xab
    [478381.253023]  [<ffffffff811c690e>] link_path_walk+0x7ae/0x820
    [478381.253877]  [<ffffffff811aeb41>] ? try_charge+0xc1/0x700
    [478381.254690]  [<ffffffff811c6a42>] path_init+0xc2/0x430
    [478381.255525]  [<ffffffff813e6e26>] ? security_file_alloc+0x16/0x20
    [478381.256450]  [<ffffffff811c8c57>] path_openat+0x77/0x620
    [478381.257256]  [<ffffffff8115b2fb>] ? lru_cache_add_active_or_unevictable+0x2b/0xa0
    [478381.258390]  [<ffffffff8117b623>] ? handle_mm_fault+0x13f3/0x1720
    [478381.259309]  [<ffffffff811cb253>] do_filp_open+0x43/0xa0
    [478381.260139]  [<ffffffff811d7ae2>] ? __alloc_fd+0x42/0x120
    [478381.260962]  [<ffffffff811b95ac>] do_sys_open+0x13c/0x230
    [478381.261779]  [<ffffffff81011393>] ? syscall_trace_enter_phase1+0x113/0x170
    [478381.262851]  [<ffffffff811b96c2>] SyS_open+0x22/0x30
    [478381.263598]  [<ffffffff81780532>] system_call_fastpath+0x12/0x17
    [478381.264551] ------------[ cut here ]------------
    [478381.265377] ------------[ cut here ]------------

    Change-Id: I8e7e1dd2f9bbdf4b42954260ecde1a4ea53ad1dd
    Signed-off-by: Jens Axboe <axboe@fb.com>
    Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
    Signed-off-by: Park Ju Hyung <qkrwngud825@gmail.com>
    Signed-off-by: mydongistiny <jaysonedson@gmail.com>
Change-Id: I13c34b62c5d006a2f079b66d4c8392b49e35a5a8

commit 05190a47bb024b6a03ce480054b27d1ea6fe315a
Author: Jens Axboe <axboe@fb.com>
Date:   Wed Jun 10 08:01:20 2015 -0600

    cfq-iosched: fix the setting of IOPS mode on SSDs

    A previous commit wanted to make CFQ default to IOPS mode on
    non-rotational storage, however it did so when the queue was
    initialized and the non-rotational flag is only set later on
    in the probe.

    Add an elevator hook that gets called off the add_disk() path,
    at that point we know that feature probing has finished, and
    we can reliably check for the various flags that drivers can
    set.

    Change-Id: I854ce21ffae67cb7d9b4164dc77ad4bc57731ebb
    Fixes: 41c0126 ("block: Make CFQ default to IOPS mode on SSDs")
    Tested-by: Romain Francoise <romain@orebokech.com>
    Signed-off-by: Jens Axboe <axboe@fb.com>
    Signed-off-by: mydongistiny <jaysonedson@gmail.com>
Change-Id: I13c34b62c5d006a2f079b66d4c8392b49e35a5a8

commit 6bd1f2f55e64640caf1a2def7dea38cab3dc3c36
Author: Jan Kara <jack@suse.com>
Date:   Tue Jan 12 16:24:15 2016 +0100

    cfq-iosched: Don't group_idle if cfqq has big thinktime

    There is no point in idling on a cfq group if the only cfq queue that is
    there has too big thinktime.

    Change-Id: I988a6f6681410cdcc9b1f95e0e0a40ed60f60c31
    Signed-off-by: Jan Kara <jack@suse.com>
    Acked-by: Tejun Heo <tj@kernel.org>
    Signed-off-by: Jens Axboe <axboe@fb.com>
    Signed-off-by: mydongistiny <jaysonedson@gmail.com>
Change-Id: I13c34b62c5d006a2f079b66d4c8392b49e35a5a8

Signed-off-by: mydongistiny <jaysonedson@gmail.com>
Change-Id: I13c34b62c5d006a2f079b66d4c8392b49e35a5a8
Signed-off-by: kdrag0n <dragon@khronodragon.com>
Signed-off-by: Lau <laststandrighthere@gmail.com>
TogoFire pushed a commit that referenced this issue Jun 14, 2023
When we try to transmit an skb with md_dst attached through wireguard
we hit a null pointer dereference in wg_xmit() due to the use of
dst_mtu() which calls into dst_blackhole_mtu() which in turn tries to
dereference dst->dev.

Since wireguard doesn't use md_dsts we should use skb_valid_dst(), which
checks for DST_METADATA flag, and if it's set, then falls back to
wireguard's device mtu. That gives us the best chance of transmitting
the packet; otherwise if the blackhole netdev is used we'd get
ETH_MIN_MTU.

 [  263.693506] BUG: kernel NULL pointer dereference, address: 00000000000000e0
 [  263.693908] #PF: supervisor read access in kernel mode
 [  263.694174] #PF: error_code(0x0000) - not-present page
 [  263.694424] PGD 0 P4D 0
 [  263.694653] Oops: 0000 [#1] PREEMPT SMP NOPTI
 [  263.694876] CPU: 5 PID: 951 Comm: mausezahn Kdump: loaded Not tainted 5.18.0-rc1+ #522
 [  263.695190] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1.fc35 04/01/2014
 [  263.695529] RIP: 0010:dst_blackhole_mtu+0x17/0x20
 [  263.695770] Code: 00 00 00 0f 1f 44 00 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 8b 47 10 48 83 e0 fc 8b 40 04 85 c0 75 09 48 8b 07 <8b> 80 e0 00 00 00 c3 66 90 0f 1f 44 00 00 48 89 d7 be 01 00 00 00
 [  263.696339] RSP: 0018:ffffa4a4422fbb28 EFLAGS: 00010246
 [  263.696600] RAX: 0000000000000000 RBX: ffff8ac9c3553000 RCX: 0000000000000000
 [  263.696891] RDX: 0000000000000401 RSI: 00000000fffffe01 RDI: ffffc4a43fb48900
 [  263.697178] RBP: ffffa4a4422fbb90 R08: ffffffff9622635e R09: 0000000000000002
 [  263.697469] R10: ffffffff9b69a6c0 R11: ffffa4a4422fbd0c R12: ffff8ac9d18b1a00
 [  263.697766] R13: ffff8ac9d0ce1840 R14: ffff8ac9d18b1a00 R15: ffff8ac9c3553000
 [  263.698054] FS:  00007f3704c337c0(0000) GS:ffff8acaebf40000(0000) knlGS:0000000000000000
 [  263.698470] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 [  263.698826] CR2: 00000000000000e0 CR3: 0000000117a5c000 CR4: 00000000000006e0
 [  263.699214] Call Trace:
 [  263.699505]  <TASK>
 [  263.699759]  wg_xmit+0x411/0x450
 [  263.700059]  ? bpf_skb_set_tunnel_key+0x46/0x2d0
 [   263.700382]  ? dev_queue_xmit_nit+0x31/0x2b0
 [  263.700719]  dev_hard_start_xmit+0xd9/0x220
 [  263.701047]  __dev_queue_xmit+0x8b9/0xd30
 [  263.701344]  __bpf_redirect+0x1a4/0x380
 [  263.701664]  __dev_queue_xmit+0x83b/0xd30
 [  263.701961]  ? packet_parse_headers+0xb4/0xf0
 [  263.702275]  packet_sendmsg+0x9a8/0x16a0
 [  263.702596]  ? _raw_spin_unlock_irqrestore+0x23/0x40
 [  263.702933]  sock_sendmsg+0x5e/0x60
 [  263.703239]  __sys_sendto+0xf0/0x160
 [  263.703549]  __x64_sys_sendto+0x20/0x30
 [  263.703853]  do_syscall_64+0x3b/0x90
 [  263.704162]  entry_SYSCALL_64_after_hwframe+0x44/0xae
 [  263.704494] RIP: 0033:0x7f3704d50506
 [  263.704789] Code: 48 c7 c0 ff ff ff ff eb b7 66 2e 0f 1f 84 00 00 00 00 00 90 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 11 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 72 c3 90 55 48 83 ec 30 44 89 4c 24 2c 4c 89
 [  263.705652] RSP: 002b:00007ffe954b0b88 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
 [  263.706141] RAX: ffffffffffffffda RBX: 0000558bb259b490 RCX: 00007f3704d50506
 [  263.706544] RDX: 000000000000004a RSI: 0000558bb259b7b2 RDI: 0000000000000003
 [  263.706952] RBP: 0000000000000000 R08: 00007ffe954b0b90 R09: 0000000000000014
 [  263.707339] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffe954b0b90
 [  263.707735] R13: 000000000000004a R14: 0000558bb259b7b2 R15: 0000000000000001
 [  263.708132]  </TASK>
 [  263.708398] Modules linked in: bridge netconsole bonding [last unloaded: bridge]
 [  263.708942] CR2: 00000000000000e0

Link: cilium/cilium#19428
Reported-by: Martynas Pumputis <m@lambda.lt>
Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
Change-Id: I6c4399f88af964c3f2cc6f934dbe7cbd785c9dd0
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
[Jason: polyfilled for < 4.3]
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: TogoFire <togofire@mailfence.com>
TogoFire pushed a commit that referenced this issue Jul 27, 2023
log:
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0. Program arguments: /home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld -EL -maarch64elf --lto-O3 -z noexecstack -r -o vmlinux.o --whole-archive built -in.o
 #0 0x000055d50c1017a3 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld+0x25e87a3)
 #1 0x000055d50c1013fa (/home/togo/Documents/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld+0x25e83fa)
 #2 0x00007f52034b9b50 __restore_rt (/lib64/libc.so.6+0x3cb50)
 #3 0x000055d50c38a111 void lld::elf::writeResult<llvm::object::ELFType<(llvm::support::endianness)1, true>>() (/home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang- neutron/bin/ld.lld+0x2871111)
 #4 0x000055d50c1e539e lld::elf::LinkerDriver::link(llvm::opt::InputArgList&) (/home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld+0x26cc39e)
 #5 0x000055d50c1ce651 lld::elf::LinkerDriver::linkerMain(llvm::ArrayRef<char const*>) (/home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld+0x26b5651)
 #6 0x000055d50c1cbc96 lld::elf::link(llvm::ArrayRef<char const*>, llvm::raw_ostream&, llvm::raw_ostream&, bool, bool) (/home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang-neutron /bin/ld.lld+0x26b2c96)
 #7 0x000055d50c0aad03 (/home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld+0x2591d03)
 #8 0x000055d50bc2bc3e lld_main(int, char**) (/home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld+0x2112c3e)
 #9 0x00007f52034a4510 __libc_start_call_main (/lib64/libc.so.6+0x27510)
#10 0x00007f52034a45c9 __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x275c9)
#11 0x000055d50c09e9f5 _start (/home/togo/Documents/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld+0x25859f5)
../scripts/link-vmlinux.sh, line 94: 60322 Segmentation fault (core image recorded)${LDFINAL} ${LDFLAGS} -r -o ${1} $(lto_lds) ${objects}
make[1]: *** [/home/togo/Documents/GitHub/kernel_xiaomi_panda/Makefile:1272: vmlinux] Error 139
make[1]: Leaving directory '/home/togo/Documentos/GitHub/kernel_xiaomi_panda/out'

make: *** [Makefile:154: sub-make] Error 2

Signed-off-by: GeoPD <geoemmanuelpd2001@gmail.com>
Change-Id: Id572d40ff4fff30544ef3165b75bf760505d8e0c
Signed-off-by: TogoFire <italomellopereira@gmail.com>
TogoFire pushed a commit that referenced this issue Jul 27, 2023
With RT_FULL we get the below wreckage:

[  126.060484] =======================================================
[  126.060486] [ INFO: possible circular locking dependency detected ]
[  126.060489] 3.0.1-rt10+ #30
[  126.060490] -------------------------------------------------------
[  126.060492] irq/24-eth0/1235 is trying to acquire lock:
[  126.060495]  (&(lock)->wait_lock#2){+.+...}, at: [<ffffffff81501c81>] rt_mutex_slowunlock+0x16/0x55
[  126.060503]
[  126.060504] but task is already holding lock:
[  126.060506]  (&p->pi_lock){-...-.}, at: [<ffffffff81074fdc>] try_to_wake_up+0x35/0x429
[  126.060511]
[  126.060511] which lock already depends on the new lock.
[  126.060513]
[  126.060514]
[  126.060514] the existing dependency chain (in reverse order) is:
[  126.060516]
[  126.060516] -> #1 (&p->pi_lock){-...-.}:
[  126.060519]        [<ffffffff810afe9e>] lock_acquire+0x145/0x18a
[  126.060524]        [<ffffffff8150291e>] _raw_spin_lock_irqsave+0x4b/0x85
[  126.060527]        [<ffffffff810b5aa4>] task_blocks_on_rt_mutex+0x36/0x20f
[  126.060531]        [<ffffffff815019bb>] rt_mutex_slowlock+0xd1/0x15a
[  126.060534]        [<ffffffff81501ae3>] rt_mutex_lock+0x2d/0x2f
[  126.060537]        [<ffffffff810d9020>] rcu_boost+0xad/0xde
[  126.060541]        [<ffffffff810d90ce>] rcu_boost_kthread+0x7d/0x9b
[  126.060544]        [<ffffffff8109a760>] kthread+0x99/0xa1
[  126.060547]        [<ffffffff81509b14>] kernel_thread_helper+0x4/0x10
[  126.060551]
[  126.060552] -> #0 (&(lock)->wait_lock#2){+.+...}:
[  126.060555]        [<ffffffff810af1b8>] __lock_acquire+0x1157/0x1816
[  126.060558]        [<ffffffff810afe9e>] lock_acquire+0x145/0x18a
[  126.060561]        [<ffffffff8150279e>] _raw_spin_lock+0x40/0x73
[  126.060564]        [<ffffffff81501c81>] rt_mutex_slowunlock+0x16/0x55
[  126.060566]        [<ffffffff81501ce7>] rt_mutex_unlock+0x27/0x29
[  126.060569]        [<ffffffff810d9f86>] rcu_read_unlock_special+0x17e/0x1c4
[  126.060573]        [<ffffffff810da014>] __rcu_read_unlock+0x48/0x89
[  126.060576]        [<ffffffff8106847a>] select_task_rq_rt+0xc7/0xd5
[  126.060580]        [<ffffffff8107511c>] try_to_wake_up+0x175/0x429
[  126.060583]        [<ffffffff81075425>] wake_up_process+0x15/0x17
[  126.060585]        [<ffffffff81080a51>] wakeup_softirqd+0x24/0x26
[  126.060590]        [<ffffffff81081df9>] irq_exit+0x49/0x55
[  126.060593]        [<ffffffff8150a3bd>] smp_apic_timer_interrupt+0x8a/0x98
[  126.060597]        [<ffffffff81509793>] apic_timer_interrupt+0x13/0x20
[  126.060600]        [<ffffffff810d5952>] irq_forced_thread_fn+0x1b/0x44
[  126.060603]        [<ffffffff810d582c>] irq_thread+0xde/0x1af
[  126.060606]        [<ffffffff8109a760>] kthread+0x99/0xa1
[  126.060608]        [<ffffffff81509b14>] kernel_thread_helper+0x4/0x10
[  126.060611]
[  126.060612] other info that might help us debug this:
[  126.060614]
[  126.060615]  Possible unsafe locking scenario:
[  126.060616]
[  126.060617]        CPU0                    CPU1
[  126.060619]        ----                    ----
[  126.060620]   lock(&p->pi_lock);
[  126.060623]                                lock(&(lock)->wait_lock);
[  126.060625]                                lock(&p->pi_lock);
[  126.060627]   lock(&(lock)->wait_lock);
[  126.060629]
[  126.060629]  *** DEADLOCK ***
[  126.060630]
[  126.060632] 1 lock held by irq/24-eth0/1235:
[  126.060633]  #0:  (&p->pi_lock){-...-.}, at: [<ffffffff81074fdc>] try_to_wake_up+0x35/0x429
[  126.060638]
[  126.060638] stack backtrace:
[  126.060641] Pid: 1235, comm: irq/24-eth0 Not tainted 3.0.1-rt10+ #30
[  126.060643] Call Trace:
[  126.060644]  <IRQ>  [<ffffffff810acbde>] print_circular_bug+0x289/0x29a
[  126.060651]  [<ffffffff810af1b8>] __lock_acquire+0x1157/0x1816
[  126.060655]  [<ffffffff810ab3aa>] ? trace_hardirqs_off_caller+0x1f/0x99
[  126.060658]  [<ffffffff81501c81>] ? rt_mutex_slowunlock+0x16/0x55
[  126.060661]  [<ffffffff810afe9e>] lock_acquire+0x145/0x18a
[  126.060664]  [<ffffffff81501c81>] ? rt_mutex_slowunlock+0x16/0x55
[  126.060668]  [<ffffffff8150279e>] _raw_spin_lock+0x40/0x73
[  126.060671]  [<ffffffff81501c81>] ? rt_mutex_slowunlock+0x16/0x55
[  126.060674]  [<ffffffff810d9655>] ? rcu_report_qs_rsp+0x87/0x8c
[  126.060677]  [<ffffffff81501c81>] rt_mutex_slowunlock+0x16/0x55
[  126.060680]  [<ffffffff810d9ea3>] ? rcu_read_unlock_special+0x9b/0x1c4
[  126.060683]  [<ffffffff81501ce7>] rt_mutex_unlock+0x27/0x29
[  126.060687]  [<ffffffff810d9f86>] rcu_read_unlock_special+0x17e/0x1c4
[  126.060690]  [<ffffffff810da014>] __rcu_read_unlock+0x48/0x89
[  126.060693]  [<ffffffff8106847a>] select_task_rq_rt+0xc7/0xd5
[  126.060696]  [<ffffffff810683da>] ? select_task_rq_rt+0x27/0xd5
[  126.060701]  [<ffffffff810a852a>] ? clockevents_program_event+0x8e/0x90
[  126.060704]  [<ffffffff8107511c>] try_to_wake_up+0x175/0x429
[  126.060708]  [<ffffffff810a95dc>] ? tick_program_event+0x1f/0x21
[  126.060711]  [<ffffffff81075425>] wake_up_process+0x15/0x17
[  126.060715]  [<ffffffff81080a51>] wakeup_softirqd+0x24/0x26
[  126.060718]  [<ffffffff81081df9>] irq_exit+0x49/0x55
[  126.060721]  [<ffffffff8150a3bd>] smp_apic_timer_interrupt+0x8a/0x98
[  126.060724]  [<ffffffff81509793>] apic_timer_interrupt+0x13/0x20
[  126.060726]  <EOI>  [<ffffffff81072855>] ? migrate_disable+0x75/0x12d
[  126.060733]  [<ffffffff81080a61>] ? local_bh_disable+0xe/0x1f
[  126.060736]  [<ffffffff81080a70>] ? local_bh_disable+0x1d/0x1f
[  126.060739]  [<ffffffff810d5952>] irq_forced_thread_fn+0x1b/0x44
[  126.060742]  [<ffffffff81502ac0>] ? _raw_spin_unlock_irq+0x3b/0x59
[  126.060745]  [<ffffffff810d582c>] irq_thread+0xde/0x1af
[  126.060748]  [<ffffffff810d5937>] ? irq_thread_fn+0x3a/0x3a
[  126.060751]  [<ffffffff810d574e>] ? irq_finalize_oneshot+0xd1/0xd1
[  126.060754]  [<ffffffff810d574e>] ? irq_finalize_oneshot+0xd1/0xd1
[  126.060757]  [<ffffffff8109a760>] kthread+0x99/0xa1
[  126.060761]  [<ffffffff81509b14>] kernel_thread_helper+0x4/0x10
[  126.060764]  [<ffffffff81069ed7>] ? finish_task_switch+0x87/0x10a
[  126.060768]  [<ffffffff81502ec4>] ? retint_restore_args+0xe/0xe
[  126.060771]  [<ffffffff8109a6c7>] ? __init_kthread_worker+0x8c/0x8c
[  126.060774]  [<ffffffff81509b10>] ? gs_change+0xb/0xb

Because irq_exit() does:

void irq_exit(void)
{
	account_system_vtime(current);
	trace_hardirq_exit();
	sub_preempt_count(IRQ_EXIT_OFFSET);
	if (!in_interrupt() && local_softirq_pending())
		invoke_softirq();

	...
}

Which triggers a wakeup, which uses RCU, now if the interrupted task has
t->rcu_read_unlock_special set, the rcu usage from the wakeup will end
up in rcu_read_unlock_special(). rcu_read_unlock_special() will test
for in_irq(), which will fail as we just decremented preempt_count
with IRQ_EXIT_OFFSET, and in_sering_softirq(), which for
PREEMPT_RT_FULL reads:

int in_serving_softirq(void)
{
	int res;

	preempt_disable();
	res = __get_cpu_var(local_softirq_runner) == current;
	preempt_enable();
	return res;
}

Which will thus also fail, resulting in the above wreckage.

The 'somewhat' ugly solution is to open-code the preempt_count() test
in rcu_read_unlock_special().

Also, we're not at all sure how ->rcu_read_unlock_special gets set
here... so this is very likely a bandaid and more thought is required.

Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Change-Id: I767aac9e23c63bdc126a803860b59ef08c8ffb39
Signed-off-by: Tashfin Shakeer Rhythm <tashfinshakeerrhythm@gmail.com>
Signed-off-by: TogoFire <togofire@mailfence.com>
TogoFire pushed a commit that referenced this issue Jul 27, 2023
We don't need to hold the local pinctrl lock here to set irq wake on the
summary irq line. Doing so only leads to lockdep warnings instead of
protecting us from anything. Remove the locking.

 WARNING: possible circular locking dependency detected
 5.4.11 #2 Tainted: G        W
 ------------------------------------------------------
 cat/3083 is trying to acquire lock:
 ffffff81f4fa58c0 (&irq_desc_lock_class){-.-.}, at: __irq_get_desc_lock+0x64/0x94

 but task is already holding lock:
 ffffff81f4880c18 (&pctrl->lock){-.-.}, at: msm_gpio_irq_set_wake+0x48/0x7c

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> #1 (&pctrl->lock){-.-.}:
        _raw_spin_lock_irqsave+0x64/0x80
        msm_gpio_irq_ack+0x68/0xf4
        __irq_do_set_handler+0xe0/0x180
        __irq_set_handler+0x60/0x9c
        irq_domain_set_info+0x90/0xb4
        gpiochip_hierarchy_irq_domain_alloc+0x110/0x200
        __irq_domain_alloc_irqs+0x130/0x29c
        irq_create_fwspec_mapping+0x1f0/0x300
        irq_create_of_mapping+0x70/0x98
        of_irq_get+0xa4/0xd4
        spi_drv_probe+0x4c/0xb0
        really_probe+0x138/0x3f0
        driver_probe_device+0x70/0x140
        __device_attach_driver+0x9c/0x110
        bus_for_each_drv+0x88/0xd0
        __device_attach+0xb0/0x160
        device_initial_probe+0x20/0x2c
        bus_probe_device+0x34/0x94
        device_add+0x35c/0x3f0
        spi_add_device+0xbc/0x194
        of_register_spi_devices+0x2c8/0x408
        spi_register_controller+0x57c/0x6fc
        spi_geni_probe+0x260/0x328
        platform_drv_probe+0x90/0xb0
        really_probe+0x138/0x3f0
        driver_probe_device+0x70/0x140
        device_driver_attach+0x4c/0x6c
        __driver_attach+0xcc/0x154
        bus_for_each_dev+0x84/0xcc
        driver_attach+0x2c/0x38
        bus_add_driver+0x108/0x1fc
        driver_register+0x64/0xf8
        __platform_driver_register+0x4c/0x58
        spi_geni_driver_init+0x1c/0x24
        do_one_initcall+0x1a4/0x3e8
        do_initcall_level+0xb4/0xcc
        do_basic_setup+0x30/0x48
        kernel_init_freeable+0x124/0x1a8
        kernel_init+0x14/0x100
        ret_from_fork+0x10/0x18

 -> #0 (&irq_desc_lock_class){-.-.}:
        __lock_acquire+0xeb4/0x2388
        lock_acquire+0x1cc/0x210
        _raw_spin_lock_irqsave+0x64/0x80
        __irq_get_desc_lock+0x64/0x94
        irq_set_irq_wake+0x40/0x144
        msm_gpio_irq_set_wake+0x5c/0x7c
        set_irq_wake_real+0x40/0x5c
        irq_set_irq_wake+0x70/0x144
        cros_ec_rtc_suspend+0x38/0x4c
        platform_pm_suspend+0x34/0x60
        dpm_run_callback+0x64/0xcc
        __device_suspend+0x310/0x41c
        dpm_suspend+0xf8/0x298
        dpm_suspend_start+0x84/0xb4
        suspend_devices_and_enter+0xbc/0x620
        pm_suspend+0x210/0x348
        state_store+0xb0/0x108
        kobj_attr_store+0x14/0x24
        sysfs_kf_write+0x4c/0x64
        kernfs_fop_write+0x15c/0x1fc
        __vfs_write+0x54/0x18c
        vfs_write+0xe4/0x1a4
        ksys_write+0x7c/0xe4
        __arm64_sys_write+0x20/0x2c
        el0_svc_common+0xa8/0x160
        el0_svc_handler+0x7c/0x98
        el0_svc+0x8/0xc

 other info that might help us debug this:

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(&pctrl->lock);
                                lock(&irq_desc_lock_class);
                                lock(&pctrl->lock);
   lock(&irq_desc_lock_class);

  *** DEADLOCK ***

 7 locks held by cat/3083:
  #0: ffffff81f06d1420 (sb_writers#7){.+.+}, at: vfs_write+0xd0/0x1a4
  #1: ffffff81c8935680 (&of->mutex){+.+.}, at: kernfs_fop_write+0x12c/0x1fc
  #2: ffffff81f4c322f0 (kn->count#337){.+.+}, at: kernfs_fop_write+0x134/0x1fc
  #3: ffffffe89a641d60 (system_transition_mutex){+.+.}, at: pm_suspend+0x108/0x348
  #4: ffffff81f190e970 (&dev->mutex){....}, at: __device_suspend+0x168/0x41c
  #5: ffffff81f183d8c0 (lock_class){-.-.}, at: __irq_get_desc_lock+0x64/0x94
  #6: ffffff81f4880c18 (&pctrl->lock){-.-.}, at: msm_gpio_irq_set_wake+0x48/0x7c

 stack backtrace:
 CPU: 4 PID: 3083 Comm: cat Tainted: G        W         5.4.11 #2
 Hardware name: Google Cheza (rev3+) (DT)
 Call trace:
  dump_backtrace+0x0/0x174
  show_stack+0x20/0x2c
  dump_stack+0xc8/0x124
  print_circular_bug+0x2ac/0x2c4
  check_noncircular+0x1a0/0x1a8
  __lock_acquire+0xeb4/0x2388
  lock_acquire+0x1cc/0x210
  _raw_spin_lock_irqsave+0x64/0x80
  __irq_get_desc_lock+0x64/0x94
  irq_set_irq_wake+0x40/0x144
  msm_gpio_irq_set_wake+0x5c/0x7c
  set_irq_wake_real+0x40/0x5c
  irq_set_irq_wake+0x70/0x144
  cros_ec_rtc_suspend+0x38/0x4c
  platform_pm_suspend+0x34/0x60
  dpm_run_callback+0x64/0xcc
  __device_suspend+0x310/0x41c
  dpm_suspend+0xf8/0x298
  dpm_suspend_start+0x84/0xb4
  suspend_devices_and_enter+0xbc/0x620
  pm_suspend+0x210/0x348
  state_store+0xb0/0x108
  kobj_attr_store+0x14/0x24
  sysfs_kf_write+0x4c/0x64
  kernfs_fop_write+0x15c/0x1fc
  __vfs_write+0x54/0x18c
  vfs_write+0xe4/0x1a4
  ksys_write+0x7c/0xe4
  __arm64_sys_write+0x20/0x2c
  el0_svc_common+0xa8/0x160
  el0_svc_handler+0x7c/0x98
  el0_svc+0x8/0xc

Fixes: 6aced33 ("pinctrl: msm: drop wake_irqs bitmap")
Cc: Douglas Anderson <dianders@chromium.org>
Cc: Brian Masney <masneyb@onstation.org>
Cc: Lina Iyer <ilina@codeaurora.org>
Cc: Maulik Shah <mkshah@codeaurora.org>
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Change-Id: Ib7c6166e0c24266937a1fce91af98e23cd67fb73
Link: https://lore.kernel.org/r/20200121180950.36959-1-swboyd@chromium.org
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ahmad Thoriq Najahi <najahi@chips-projects.xyz>
Signed-off-by: Tashfin Shakeer Rhythm <tashfinshakeerrhythm@gmail.com>
Signed-off-by: TogoFire <togofire@mailfence.com>
TogoFire pushed a commit that referenced this issue Jul 27, 2023
On a system that often re-configures a USB gadget device, the kernel log
is filled with:

  configfs-gadget gadget: high-speed config #1: c

Reduce the verbosity of this print to debug.

Signed-off-by: Joel Stanley <joel@jms.id.au>
Change-Id: Ic326559435fb120194369a645af3ff8f68f72ac8
Signed-off-by: TogoFire <togofire@mailfence.com>
TogoFire pushed a commit that referenced this issue Jul 27, 2023
A recent clang 16 update has introduced segfault when compiling kernel
using the `-polly-invariant-load-hoisting` flag and breaks compilation
when kernel is compiled with full LTO.
This is due to the flag being ignorant about the errors during SCoP
verification and does not take the errors into account as per a recent
issue opened at [LLVM], causing polly to do segfault and the compiler
to print the following backtrace:

PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.	Program arguments: clang -Wp,-MD,drivers/md/.md.o.d -nostdinc -isystem /tmp/cirrus-ci-build/toolchains/clang/lib/clang/16.0.0/include -I../arch/arm64/include -I./arch/arm64/include/generated -I../include -I./include -I../arch/arm64/include/uapi -I./arch/arm64/include/generated/uapi -I../include/uapi -I./include/generated/uapi -include ../include/linux/kconfig.h -include ../include/linux/compiler_types.h -I../drivers/md -Idrivers/md -D__KERNEL__ -mlittle-endian -DKASAN_SHADOW_SCALE_SHIFT=3 -Qunused-arguments -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -fshort-wchar -Werror-implicit-function-declaration -Werror=return-type -Wno-format-security -std=gnu89 --target=aarch64-linux-gnu --prefix=/tmp/cirrus-ci-build/Kernel/../toolchains/clang/bin/aarch64-linux-gnu- --gcc-toolchain=/tmp/cirrus-ci-build/toolchains/clang -Wno-misleading-indentation -Wno-bool-operation -Werror=unknown-warning-option -Wno-unsequenced -opaque-pointers -fno-PIE -mgeneral-regs-only -DCONFIG_AS_LSE=1 -fno-asynchronous-unwind-tables -Wno-psabi -DKASAN_SHADOW_SCALE_SHIFT=3 -fno-delete-null-pointer-checks -Wno-frame-address -Wno-int-in-bool-context -Wno-address-of-packed-member -O3 -march=armv8.1-a+crypto+fp16+rcpc -mtune=cortex-a53 -mllvm -polly -mllvm -polly-ast-use-context -mllvm -polly-detect-keep-going -mllvm -polly-invariant-load-hoisting -mllvm -polly-run-inliner -mllvm -polly-vectorizer=stripmine -mllvm -polly-loopfusion-greedy=1 -mllvm -polly-reschedule=1 -mllvm -polly-postopts=1 -fstack-protector-strong --target=aarch64-linux-gnu --gcc-toolchain=/tmp/cirrus-ci-build/toolchains/clang -meabi gnu -Wno-format-invalid-specifier -Wno-gnu -Wno-duplicate-decl-specifier -Wno-asm-operand-widths -Wno-initializer-overrides -Wno-tautological-constant-out-of-range-compare -Wno-tautological-compare -mno-global-merge -Wno-void-ptr-dereference -Wno-unused-but-set-variable -Wno-unused-const-variable -fno-omit-frame-pointer -fno-optimize-sibling-calls -Wvla -flto -fwhole-program-vtables -fvisibility=hidden -Wdeclaration-after-statement -Wno-pointer-sign -Wno-array-bounds -fno-strict-overflow -fno-stack-check -Werror=implicit-int -Werror=strict-prototypes -Werror=date-time -Werror=incompatible-pointer-types -fmacro-prefix-map=../= -Wno-initializer-overrides -Wno-unused-value -Wno-format -Wno-sign-compare -Wno-format-zero-length -Wno-uninitialized -Wno-pointer-to-enum-cast -Wno-unaligned-access -DKBUILD_BASENAME=\"md\" -DKBUILD_MODNAME=\"md_mod\" -D__KBUILD_MODNAME=kmod_md_mod -c -o drivers/md/md.o ../drivers/md/md.c
1.	<eof> parser at end of file
2.	Optimizer
  CC      drivers/media/platform/msm/camera_v2/camera/camera.o
  AR      drivers/media/pci/intel/ipu3/built-in.a
  CC      drivers/md/dm-linear.o
 #0 0x0000559d3527073f (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x3a7073f)
 #1 0x0000559d352705bf (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x3a705bf)
 #2 0x0000559d3523b198 (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x3a3b198)
 #3 0x0000559d3523b33e (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x3a3b33e)
 #4 0x00007f339dc3ea00 (/usr/lib/libc.so.6+0x38a00)
 #5 0x0000559d35affccf (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x42ffccf)
 #6 0x0000559d35b01710 (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x4301710)
 #7 0x0000559d35b01a12 (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x4301a12)
 #8 0x0000559d35b09a9e (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x4309a9e)
 #9 0x0000559d35b14707 (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x4314707)
clang-16: error: clang frontend command failed with exit code 139 (use -v to see invocation)
Neutron clang version 16.0.0 (https://github.com/llvm/llvm-project.git 598f5275c16049b1e1b5bc934cbde447a82d485e)
Target: aarch64-unknown-linux-gnu
Thread model: posix
InstalledDir: /tmp/cirrus-ci-build/Kernel/../toolchains/clang/bin

From the very nature of `-polly-detect-keep-going`, it can be concluded
that this is a potentially unsafe flag and instead of using it conditionally
for CLANG < 16, just remove it all together for all. This should allow polly
to run SCoP verifications as intended.

Issue [LLVM]: llvm/llvm-project#58484 (comment)

Signed-off-by: Tashfin Shakeer Rhythm <tashfinshakeerrhythm@gmail.com>
Change-Id: I9bf13d78ebe3988d17e494dcef00f0c3db633df4
Signed-off-by: TogoFire <togofire@mailfence.com>
TogoFire pushed a commit that referenced this issue Jul 27, 2023
Qos requirement is not initialized when update is called.

This fixes the following call trace:
[...]
[   27.206894] msm_pm_qos_update_request: update request 100
[   27.206900] msm_pm_qos_add_request: add request
[   27.206904] pm_qos_update_request() called for unknown object
[   27.206915] ------------[ cut here ]------------
[   27.206928] WARNING: CPU: 5 PID: 1107 at kernel/power/qos.c:696 pm_qos_update_request+0x4c/0x60
[   27.206935] CPU: 5 PID: 1107 Comm: qvrservice Tainted: G S              4.14.185-Chips #1
[   27.206937] Hardware name: Qualcomm Technologies, Inc. trinket pm6125 + pmi632 IDP (DT)
[   27.206939] task: 00000000ffaf085c task.stack: 000000007a0f23be
[   27.206942] pc : pm_qos_update_request+0x4c/0x60
[   27.206943] lr : pm_qos_update_request+0x4c/0x60
[...]

Signed-off-by: Ahmad Thoriq Najahi <najahi@chips-projects.xyz>
Change-Id: Ib645307e32294ddf5db397f32099b799a3c2ab4d
Signed-off-by: TogoFire <togofire@mailfence.com>
TogoFire pushed a commit that referenced this issue Aug 23, 2023
We can't allow spam in CI.

Upate 26th June 2018: This is still an issue:

[  224.739686] ------------[ cut here ]------------
[  224.739712] WARNING: CPU: 3 PID: 2982 at net/sched/sch_generic.c:461 dev_watchdog+0x1fd/0x210
[  224.739714] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_pcm i915 asix usbnet mii mei_me mei prime_numbers i2c_hid pinctrl_sunrisepoint pinctrl_intel btusb btrtl btbcm btintel bluetooth ecdh_generic
[  224.739775] CPU: 3 PID: 2982 Comm: gem_exec_suspen Tainted: G     U  W         4.18.0-rc2-CI-Patchwork_9414+ #1
[  224.739777] Hardware name: Dell Inc. XPS 13 9350/, BIOS 1.4.12 11/30/2016
[  224.739780] RIP: 0010:dev_watchdog+0x1fd/0x210
[  224.739781] Code: 49 63 4c 24 f0 eb 92 4c 89 ef c6 05 21 46 ad 00 01 e8 77 ee fc ff 89 d9 48 89 c2 4c 89 ee 48 c7 c7 88 4c 14 82 e8 a3 fe 84 ff <0f> 0b eb be 0f 1f 44 00 00 66 2e 0f 1f 84 00 00 00 00 00 48 c7 47
[  224.739866] RSP: 0018:ffff88027dd83e40 EFLAGS: 00010286
[  224.739869] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000102
[  224.739871] RDX: 0000000080000102 RSI: ffffffff820c8c6c RDI: 00000000ffffffff
[  224.739873] RBP: ffff8802644c1540 R08: 0000000071be9b33 R09: 0000000000000000
[  224.739874] R10: ffff88027dd83dc0 R11: 0000000000000000 R12: ffff8802644c1588
[  224.739876] R13: ffff8802644c1160 R14: 0000000000000001 R15: ffff88026a5dc728
[  224.739878] FS:  00007f18f4887980(0000) GS:ffff88027dd80000(0000) knlGS:0000000000000000
[  224.739880] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  224.739881] CR2: 00007f4c627ae548 CR3: 000000022ca1a002 CR4: 00000000003606e0
[  224.739883] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  224.739885] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  224.739886] Call Trace:
[  224.739888]  <IRQ>
[  224.739892]  ? qdisc_reset+0xe0/0xe0
[  224.739894]  ? qdisc_reset+0xe0/0xe0
[  224.739897]  call_timer_fn+0x93/0x360
[  224.739903]  expire_timers+0xc1/0x1d0
[  224.739908]  run_timer_softirq+0xc7/0x170
[  224.739916]  __do_softirq+0xd9/0x505
[  224.739923]  irq_exit+0xa9/0xc0
[  224.739926]  smp_apic_timer_interrupt+0x9c/0x2d0
[  224.739929]  apic_timer_interrupt+0xf/0x20
[  224.739931]  </IRQ>
[  224.739934] RIP: 0010:delay_tsc+0x2e/0xb0
[  224.739936] Code: 49 89 fc 55 53 bf 01 00 00 00 e8 6d 2c 78 ff e8 88 9d b6 ff 41 89 c5 0f ae e8 0f 31 48 c1 e2 20 48 09 c2 48 89 d5 eb 16 f3 90 <bf> 01 00 00 00 e8 48 2c 78 ff e8 63 9d b6 ff 44 39 e8 75 36 0f ae
[  224.740021] RSP: 0018:ffffc900002f7d48 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff13
[  224.740024] RAX: 0000000080000000 RBX: 0000000649565ca9 RCX: 0000000000000001
[  224.740026] RDX: 0000000080000001 RSI: ffffffff820c8c6c RDI: 00000000ffffffff
[  224.740027] RBP: 00000006493ea9ce R08: 000000005e81e2ee R09: 0000000000000000
[  224.740029] R10: 0000000000000120 R11: 0000000000000000 R12: 00000000002ad8d6
[  224.740030] R13: 0000000000000003 R14: 0000000000000004 R15: ffff88025caf5408
[  224.740040]  ? delay_tsc+0x66/0xb0
[  224.740045]  hibernation_debug_sleep+0x1c/0x30
[  224.740048]  hibernation_snapshot+0x2c1/0x690
[  224.740053]  hibernate+0x142/0x2a4
[  224.740057]  state_store+0xd0/0xe0
[  224.740063]  kernfs_fop_write+0x104/0x190
[  224.740068]  __vfs_write+0x31/0x180
[  224.740072]  ? rcu_read_lock_sched_held+0x6f/0x80
[  224.740075]  ? rcu_sync_lockdep_assert+0x29/0x50
[  224.740078]  ? __sb_start_write+0x152/0x1f0
[  224.740080]  ? __sb_start_write+0x168/0x1f0
[  224.740084]  vfs_write+0xbd/0x1a0
[  224.740088]  ksys_write+0x50/0xc0
[  224.740094]  do_syscall_64+0x55/0x190
[  224.740097]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  224.740099] RIP: 0033:0x7f18f400a281
[  224.740100] Code: c3 0f 1f 84 00 00 00 00 00 48 8b 05 59 8d 20 00 c3 0f 1f 84 00 00 00 00 00 8b 05 8a d1 20 00 85 c0 75 16 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 57 f3 c3 0f 1f 44 00 00 41 54 55 49 89 d4 53
[  224.740186] RSP: 002b:00007fffd1f4fec8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[  224.740189] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f18f400a281
[  224.740190] RDX: 0000000000000004 RSI: 00007f18f448069a RDI: 0000000000000006
[  224.740192] RBP: 00007fffd1f4fef0 R08: 0000000000000000 R09: 0000000000000000
[  224.740194] R10: 0000000000000000 R11: 0000000000000246 R12: 000055e795d03400
[  224.740195] R13: 00007fffd1f50500 R14: 0000000000000000 R15: 0000000000000000
[  224.740205] irq event stamp: 1582591
[  224.740207] hardirqs last  enabled at (1582590): [<ffffffff810f9f9c>] vprintk_emit+0x4bc/0x4d0
[  224.740210] hardirqs last disabled at (1582591): [<ffffffff81a0111c>] error_entry+0x7c/0x100
[  224.740212] softirqs last  enabled at (1582568): [<ffffffff81c0034f>] __do_softirq+0x34f/0x505
[  224.740215] softirqs last disabled at (1582571): [<ffffffff8108c959>] irq_exit+0xa9/0xc0
[  224.740218] WARNING: CPU: 3 PID: 2982 at net/sched/sch_generic.c:461 dev_watchdog+0x1fd/0x210
[  224.740219] ---[ end trace 6e41d690e611c338 ]---

References: https://bugzilla.kernel.org/show_bug.cgi?id=196399
Acked-by: Martin Peres <martin.peres@linux.intel.com>
Cc: Martin Peres <martin.peres@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Change-Id: I235230ba822fa0bf058ee86f68b09f19a6aea8cb
Link: https://patchwork.freedesktop.org/patch/msgid/20170718082110.12524-1-daniel.vetter@ffwll.ch
Signed-off-by: celtare21 <celtare21@gmail.com>
TogoFire pushed a commit that referenced this issue Aug 23, 2023
fsck.f2fs forces a filesystem fix on boot if it detects that the current
kernel version differs from the one saved in the superblock, which results in
fsck blocking boot for a long time (~35 seconds). This commit provides a
way to report a constant fake kernel version to fsck to avoid triggering
the version check, which is useful if you boot new kernel builds
frequently.

Signed-off-by: Danny Lin <danny@kdrag0n.dev>
Change-Id: I65974f63d81171947399ff31b22fb9e95d14a6f3

Signed-off-by: Carlos Jimenez (JavaShin-X) <javashin1986@gmail.com>

ocean stock-kernel =

CONFIG_F2FS_REPORT_FAKE_KERNEL_VERSION=y
CONFIG_F2FS_FAKE_KERNEL_RELEASE="4.9.206-perf+"
CONFIG_F2FS_FAKE_KERNEL_VERSION="#1 SMP PREEMPT Wed Sep 30 23:50:55 CDT 2020 aarch64"

Signed-off-by: Carlos Jimenez (JavaShin-X) <javashin1986@gmail.com>
Signed-off-by: TogoFire <italomellopereira@gmail.com>
TogoFire pushed a commit that referenced this issue Aug 23, 2023
Patch series "lib/sort & lib/list_sort: faster and smaller", v2.

Because CONFIG_RETPOLINE has made indirect calls much more expensive, I
thought I'd try to reduce the number made by the library sort functions.

The first three patches apply to lib/sort.c.

Patch #1 is a simple optimization.  The built-in swap has special cases
for aligned 4- and 8-byte objects.  But those are almost never used;
most calls to sort() work on larger structures, which fall back to the
byte-at-a-time loop.  This generalizes them to aligned *multiples* of 4
and 8 bytes.  (If nothing else, it saves an awful lot of energy by not
thrashing the store buffers as much.)

Patch #2 grabs a juicy piece of low-hanging fruit.  I agree that nice
simple solid heapsort is preferable to more complex algorithms (sorry,
Andrey), but it's possible to implement heapsort with far fewer
comparisons (50% asymptotically, 25-40% reduction for realistic sizes)
than the way it's been done up to now.  And with some care, the code
ends up smaller, as well.  This is the "big win" patch.

Patch #3 adds the same sort of indirect call bypass that has been added
to the net code of late.  The great majority of the callers use the
builtin swap functions, so replace the indirect call to sort_func with a
(highly preditable) series of if() statements.  Rather surprisingly,
this decreased code size, as the swap functions were inlined and their
prologue & epilogue code eliminated.

lib/list_sort.c is a bit trickier, as merge sort is already close to
optimal, and we don't want to introduce triumphs of theory over
practicality like the Ford-Johnson merge-insertion sort.

Patch #4, without changing the algorithm, chops 32% off the code size
and removes the part[MAX_LIST_LENGTH+1] pointer array (and the
corresponding upper limit on efficiently sortable input size).

Patch #5 improves the algorithm.  The previous code is already optimal
for power-of-two (or slightly smaller) size inputs, but when the input
size is just over a power of 2, there's a very unbalanced final merge.

There are, in the literature, several algorithms which solve this, but
they all depend on the "breadth-first" merge order which was replaced by
commit 835cc0c with a more cache-friendly "depth-first" order.
Some hard thinking came up with a depth-first algorithm which defers
merges as little as possible while avoiding bad merges.  This saves
0.2*n compares, averaged over all sizes.

The code size increase is minimal (64 bytes on x86-64, reducing the net
savings to 26%), but the comments expanded significantly to document the
clever algorithm.

TESTING NOTES: I have some ugly user-space benchmarking code which I
used for testing before moving this code into the kernel.  Shout if you
want a copy.

I'm running this code right now, with CONFIG_TEST_SORT and
CONFIG_TEST_LIST_SORT, but I confess I haven't rebooted since the last
round of minor edits to quell checkpatch.  I figure there will be at
least one round of comments and final testing.

This patch (of 5):

Rather than having special-case swap functions for 4- and 8-byte
objects, special-case aligned multiples of 4 or 8 bytes.  This speeds up
most users of sort() by avoiding fallback to the byte copy loop.

Despite what ca96ab8 ("lib/sort: Add 64 bit swap function") claims,
very few users of sort() sort pointers (or pointer-sized objects); most
sort structures containing at least two words.  (E.g.
drivers/acpi/fan.c:acpi_fan_get_fps() sorts an array of 40-byte struct
acpi_fan_fps.)

The functions also got renamed to reflect the fact that they support
multiple words.  In the great tradition of bikeshedding, the names were
by far the most contentious issue during review of this patch series.

x86-64 code size 872 -> 886 bytes (+14)

With feedback from Andy Shevchenko, Rasmus Villemoes and Geert
Uytterhoeven.

Link: http://lkml.kernel.org/r/f24f932df3a7fa1973c1084154f1cea596bcf341.1552704200.git.lkml@sdf.org
Signed-off-by: George Spelvin <lkml@sdf.org>
Change-Id: I87555b81a881a6a2521be3fb7e2fffbba4a13afd
Acked-by: Andrey Abramov <st5pub@yandex.ru>
Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Daniel Wagner <daniel.wagner@siemens.com>
Cc: Don Mullis <don.mullis@gmail.com>
Cc: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
TogoFire pushed a commit that referenced this issue Aug 23, 2023
commit ba5a4fdd63ae0c575707030db0b634b160baddd7 upstream.

syzbot complained about a recent change in TCP stack,
hitting a NULL pointer [1]

tcp request sockets have an af_specific pointer, which
was used before the blamed change only for SYNACK generation
in non SYNCOOKIE mode.

tcp requests sockets momentarily created when third packet
coming from client in SYNCOOKIE mode were not using
treq->af_specific.

Make sure this field is populated, in the same way normal
TCP requests sockets do in tcp_conn_request().

[1]
TCP: request_sock_TCPv6: Possible SYN flooding on port 20002. Sending cookies.  Check SNMP counters.
general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f]
CPU: 1 PID: 3695 Comm: syz-executor864 Not tainted 5.18.0-rc3-syzkaller-00224-g5fd1fe4807f9 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:tcp_create_openreq_child+0xe16/0x16b0 net/ipv4/tcp_minisocks.c:534
Code: 48 c1 ea 03 80 3c 02 00 0f 85 e5 07 00 00 4c 8b b3 28 01 00 00 48 b8 00 00 00 00 00 fc ff df 49 8d 7e 08 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 c9 07 00 00 48 8b 3c 24 48 89 de 41 ff 56 08 48
RSP: 0018:ffffc90000de0588 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: ffff888076490330 RCX: 0000000000000100
RDX: 0000000000000001 RSI: ffffffff87d67ff0 RDI: 0000000000000008
RBP: ffff88806ee1c7f8 R08: 0000000000000000 R09: 0000000000000000
R10: ffffffff87d67f00 R11: 0000000000000000 R12: ffff88806ee1bfc0
R13: ffff88801b0e0368 R14: 0000000000000000 R15: 0000000000000000
FS:  00007f517fe58700(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffcead76960 CR3: 000000006f97b000 CR4: 00000000003506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <IRQ>
 tcp_v6_syn_recv_sock+0x199/0x23b0 net/ipv6/tcp_ipv6.c:1267
 tcp_get_cookie_sock+0xc9/0x850 net/ipv4/syncookies.c:207
 cookie_v6_check+0x15c3/0x2340 net/ipv6/syncookies.c:258
 tcp_v6_cookie_check net/ipv6/tcp_ipv6.c:1131 [inline]
 tcp_v6_do_rcv+0x1148/0x13b0 net/ipv6/tcp_ipv6.c:1486
 tcp_v6_rcv+0x3305/0x3840 net/ipv6/tcp_ipv6.c:1725
 ip6_protocol_deliver_rcu+0x2e9/0x1900 net/ipv6/ip6_input.c:422
 ip6_input_finish+0x14c/0x2c0 net/ipv6/ip6_input.c:464
 NF_HOOK include/linux/netfilter.h:307 [inline]
 NF_HOOK include/linux/netfilter.h:301 [inline]
 ip6_input+0x9c/0xd0 net/ipv6/ip6_input.c:473
 dst_input include/net/dst.h:461 [inline]
 ip6_rcv_finish net/ipv6/ip6_input.c:76 [inline]
 NF_HOOK include/linux/netfilter.h:307 [inline]
 NF_HOOK include/linux/netfilter.h:301 [inline]
 ipv6_rcv+0x27f/0x3b0 net/ipv6/ip6_input.c:297
 __netif_receive_skb_one_core+0x114/0x180 net/core/dev.c:5405
 __netif_receive_skb+0x24/0x1b0 net/core/dev.c:5519
 process_backlog+0x3a0/0x7c0 net/core/dev.c:5847
 __napi_poll+0xb3/0x6e0 net/core/dev.c:6413
 napi_poll net/core/dev.c:6480 [inline]
 net_rx_action+0x8ec/0xc60 net/core/dev.c:6567
 __do_softirq+0x29b/0x9c2 kernel/softirq.c:558
 invoke_softirq kernel/softirq.c:432 [inline]
 __irq_exit_rcu+0x123/0x180 kernel/softirq.c:637
 irq_exit_rcu+0x5/0x20 kernel/softirq.c:649
 sysvec_apic_timer_interrupt+0x93/0xc0 arch/x86/kernel/apic/apic.c:1097

Fixes: 5b0b9e4c2c89 ("tcp: md5: incorrect tcp_header_len for incoming connections")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Change-Id: Iacb67b36439beb695473efb891e26a4f948b3ce4
Cc: Francesco Ruggeri <fruggeri@arista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[fruggeri: Account for backport conflicts from 35b2c3211609 and 6fc8c827dd4f]
Signed-off-by: Francesco Ruggeri <fruggeri@arista.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
TogoFire pushed a commit that referenced this issue Aug 23, 2023
During boot we observed several warnings that can be avoided

[    6.725935] pmic_analog_codec 800f000.qcom,spmi:qcom,pm660l@3:analog-codec@f000: Error: regulator not found:cdc-vdd-spkdrv
[    6.726304] ------------[ cut here ]------------
[    6.726364] WARNING: CPU: 1 PID: 455 at regcache_cache_only+0xf0/0x100
[    6.726452] Modules linked in:
[    6.726460] CPU: 1 PID: 455 Comm: kworker/1:3 Not tainted 4.9.170-g00294e841ec8-dirty #1
[    6.726462] Hardware name: Qualcomm Technologies, Inc. SDM 630 PM660 + PM660L MTP (DT)
[    6.726471] Workqueue: events deferred_probe_work_func
[    6.726473] task: ffffffedb575b600 task.stack: ffffffedb5770000
[    6.726476] PC is at regcache_cache_only+0xf0/0x100
[    6.726478] LR is at regcache_cache_only+0x2c/0x100

Signed-off-by: Alin Jerpelea <alin.jerpelea@sony.com>
Change-Id: I9672248471b4dc9736987ed233c12e6d4b03abd5
Signed-off-by: TogoFire <italomellopereira@gmail.com>
TogoFire pushed a commit that referenced this issue Aug 23, 2023
log:
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0. Program arguments: /home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld -EL -maarch64elf --lto-O3 -z noexecstack -r -o vmlinux.o --whole-archive built -in.o
 #0 0x000055d50c1017a3 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld+0x25e87a3)
 #1 0x000055d50c1013fa (/home/togo/Documents/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld+0x25e83fa)
 #2 0x00007f52034b9b50 __restore_rt (/lib64/libc.so.6+0x3cb50)
 #3 0x000055d50c38a111 void lld::elf::writeResult<llvm::object::ELFType<(llvm::support::endianness)1, true>>() (/home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang- neutron/bin/ld.lld+0x2871111)
 #4 0x000055d50c1e539e lld::elf::LinkerDriver::link(llvm::opt::InputArgList&) (/home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld+0x26cc39e)
 #5 0x000055d50c1ce651 lld::elf::LinkerDriver::linkerMain(llvm::ArrayRef<char const*>) (/home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld+0x26b5651)
 #6 0x000055d50c1cbc96 lld::elf::link(llvm::ArrayRef<char const*>, llvm::raw_ostream&, llvm::raw_ostream&, bool, bool) (/home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang-neutron /bin/ld.lld+0x26b2c96)
 #7 0x000055d50c0aad03 (/home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld+0x2591d03)
 #8 0x000055d50bc2bc3e lld_main(int, char**) (/home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld+0x2112c3e)
 #9 0x00007f52034a4510 __libc_start_call_main (/lib64/libc.so.6+0x27510)
#10 0x00007f52034a45c9 __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x275c9)
#11 0x000055d50c09e9f5 _start (/home/togo/Documents/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld+0x25859f5)
../scripts/link-vmlinux.sh, line 94: 60322 Segmentation fault (core image recorded)${LDFINAL} ${LDFLAGS} -r -o ${1} $(lto_lds) ${objects}
make[1]: *** [/home/togo/Documents/GitHub/kernel_xiaomi_panda/Makefile:1272: vmlinux] Error 139
make[1]: Leaving directory '/home/togo/Documentos/GitHub/kernel_xiaomi_panda/out'

make: *** [Makefile:154: sub-make] Error 2

Signed-off-by: GeoPD <geoemmanuelpd2001@gmail.com>
Change-Id: I97f30f227ddac3db21835564352cf17a193212bb
Signed-off-by: TogoFire <italomellopereira@gmail.com>
TogoFire pushed a commit that referenced this issue Aug 23, 2023
With RT_FULL we get the below wreckage:

[  126.060484] =======================================================
[  126.060486] [ INFO: possible circular locking dependency detected ]
[  126.060489] 3.0.1-rt10+ #30
[  126.060490] -------------------------------------------------------
[  126.060492] irq/24-eth0/1235 is trying to acquire lock:
[  126.060495]  (&(lock)->wait_lock#2){+.+...}, at: [<ffffffff81501c81>] rt_mutex_slowunlock+0x16/0x55
[  126.060503]
[  126.060504] but task is already holding lock:
[  126.060506]  (&p->pi_lock){-...-.}, at: [<ffffffff81074fdc>] try_to_wake_up+0x35/0x429
[  126.060511]
[  126.060511] which lock already depends on the new lock.
[  126.060513]
[  126.060514]
[  126.060514] the existing dependency chain (in reverse order) is:
[  126.060516]
[  126.060516] -> #1 (&p->pi_lock){-...-.}:
[  126.060519]        [<ffffffff810afe9e>] lock_acquire+0x145/0x18a
[  126.060524]        [<ffffffff8150291e>] _raw_spin_lock_irqsave+0x4b/0x85
[  126.060527]        [<ffffffff810b5aa4>] task_blocks_on_rt_mutex+0x36/0x20f
[  126.060531]        [<ffffffff815019bb>] rt_mutex_slowlock+0xd1/0x15a
[  126.060534]        [<ffffffff81501ae3>] rt_mutex_lock+0x2d/0x2f
[  126.060537]        [<ffffffff810d9020>] rcu_boost+0xad/0xde
[  126.060541]        [<ffffffff810d90ce>] rcu_boost_kthread+0x7d/0x9b
[  126.060544]        [<ffffffff8109a760>] kthread+0x99/0xa1
[  126.060547]        [<ffffffff81509b14>] kernel_thread_helper+0x4/0x10
[  126.060551]
[  126.060552] -> #0 (&(lock)->wait_lock#2){+.+...}:
[  126.060555]        [<ffffffff810af1b8>] __lock_acquire+0x1157/0x1816
[  126.060558]        [<ffffffff810afe9e>] lock_acquire+0x145/0x18a
[  126.060561]        [<ffffffff8150279e>] _raw_spin_lock+0x40/0x73
[  126.060564]        [<ffffffff81501c81>] rt_mutex_slowunlock+0x16/0x55
[  126.060566]        [<ffffffff81501ce7>] rt_mutex_unlock+0x27/0x29
[  126.060569]        [<ffffffff810d9f86>] rcu_read_unlock_special+0x17e/0x1c4
[  126.060573]        [<ffffffff810da014>] __rcu_read_unlock+0x48/0x89
[  126.060576]        [<ffffffff8106847a>] select_task_rq_rt+0xc7/0xd5
[  126.060580]        [<ffffffff8107511c>] try_to_wake_up+0x175/0x429
[  126.060583]        [<ffffffff81075425>] wake_up_process+0x15/0x17
[  126.060585]        [<ffffffff81080a51>] wakeup_softirqd+0x24/0x26
[  126.060590]        [<ffffffff81081df9>] irq_exit+0x49/0x55
[  126.060593]        [<ffffffff8150a3bd>] smp_apic_timer_interrupt+0x8a/0x98
[  126.060597]        [<ffffffff81509793>] apic_timer_interrupt+0x13/0x20
[  126.060600]        [<ffffffff810d5952>] irq_forced_thread_fn+0x1b/0x44
[  126.060603]        [<ffffffff810d582c>] irq_thread+0xde/0x1af
[  126.060606]        [<ffffffff8109a760>] kthread+0x99/0xa1
[  126.060608]        [<ffffffff81509b14>] kernel_thread_helper+0x4/0x10
[  126.060611]
[  126.060612] other info that might help us debug this:
[  126.060614]
[  126.060615]  Possible unsafe locking scenario:
[  126.060616]
[  126.060617]        CPU0                    CPU1
[  126.060619]        ----                    ----
[  126.060620]   lock(&p->pi_lock);
[  126.060623]                                lock(&(lock)->wait_lock);
[  126.060625]                                lock(&p->pi_lock);
[  126.060627]   lock(&(lock)->wait_lock);
[  126.060629]
[  126.060629]  *** DEADLOCK ***
[  126.060630]
[  126.060632] 1 lock held by irq/24-eth0/1235:
[  126.060633]  #0:  (&p->pi_lock){-...-.}, at: [<ffffffff81074fdc>] try_to_wake_up+0x35/0x429
[  126.060638]
[  126.060638] stack backtrace:
[  126.060641] Pid: 1235, comm: irq/24-eth0 Not tainted 3.0.1-rt10+ #30
[  126.060643] Call Trace:
[  126.060644]  <IRQ>  [<ffffffff810acbde>] print_circular_bug+0x289/0x29a
[  126.060651]  [<ffffffff810af1b8>] __lock_acquire+0x1157/0x1816
[  126.060655]  [<ffffffff810ab3aa>] ? trace_hardirqs_off_caller+0x1f/0x99
[  126.060658]  [<ffffffff81501c81>] ? rt_mutex_slowunlock+0x16/0x55
[  126.060661]  [<ffffffff810afe9e>] lock_acquire+0x145/0x18a
[  126.060664]  [<ffffffff81501c81>] ? rt_mutex_slowunlock+0x16/0x55
[  126.060668]  [<ffffffff8150279e>] _raw_spin_lock+0x40/0x73
[  126.060671]  [<ffffffff81501c81>] ? rt_mutex_slowunlock+0x16/0x55
[  126.060674]  [<ffffffff810d9655>] ? rcu_report_qs_rsp+0x87/0x8c
[  126.060677]  [<ffffffff81501c81>] rt_mutex_slowunlock+0x16/0x55
[  126.060680]  [<ffffffff810d9ea3>] ? rcu_read_unlock_special+0x9b/0x1c4
[  126.060683]  [<ffffffff81501ce7>] rt_mutex_unlock+0x27/0x29
[  126.060687]  [<ffffffff810d9f86>] rcu_read_unlock_special+0x17e/0x1c4
[  126.060690]  [<ffffffff810da014>] __rcu_read_unlock+0x48/0x89
[  126.060693]  [<ffffffff8106847a>] select_task_rq_rt+0xc7/0xd5
[  126.060696]  [<ffffffff810683da>] ? select_task_rq_rt+0x27/0xd5
[  126.060701]  [<ffffffff810a852a>] ? clockevents_program_event+0x8e/0x90
[  126.060704]  [<ffffffff8107511c>] try_to_wake_up+0x175/0x429
[  126.060708]  [<ffffffff810a95dc>] ? tick_program_event+0x1f/0x21
[  126.060711]  [<ffffffff81075425>] wake_up_process+0x15/0x17
[  126.060715]  [<ffffffff81080a51>] wakeup_softirqd+0x24/0x26
[  126.060718]  [<ffffffff81081df9>] irq_exit+0x49/0x55
[  126.060721]  [<ffffffff8150a3bd>] smp_apic_timer_interrupt+0x8a/0x98
[  126.060724]  [<ffffffff81509793>] apic_timer_interrupt+0x13/0x20
[  126.060726]  <EOI>  [<ffffffff81072855>] ? migrate_disable+0x75/0x12d
[  126.060733]  [<ffffffff81080a61>] ? local_bh_disable+0xe/0x1f
[  126.060736]  [<ffffffff81080a70>] ? local_bh_disable+0x1d/0x1f
[  126.060739]  [<ffffffff810d5952>] irq_forced_thread_fn+0x1b/0x44
[  126.060742]  [<ffffffff81502ac0>] ? _raw_spin_unlock_irq+0x3b/0x59
[  126.060745]  [<ffffffff810d582c>] irq_thread+0xde/0x1af
[  126.060748]  [<ffffffff810d5937>] ? irq_thread_fn+0x3a/0x3a
[  126.060751]  [<ffffffff810d574e>] ? irq_finalize_oneshot+0xd1/0xd1
[  126.060754]  [<ffffffff810d574e>] ? irq_finalize_oneshot+0xd1/0xd1
[  126.060757]  [<ffffffff8109a760>] kthread+0x99/0xa1
[  126.060761]  [<ffffffff81509b14>] kernel_thread_helper+0x4/0x10
[  126.060764]  [<ffffffff81069ed7>] ? finish_task_switch+0x87/0x10a
[  126.060768]  [<ffffffff81502ec4>] ? retint_restore_args+0xe/0xe
[  126.060771]  [<ffffffff8109a6c7>] ? __init_kthread_worker+0x8c/0x8c
[  126.060774]  [<ffffffff81509b10>] ? gs_change+0xb/0xb

Because irq_exit() does:

void irq_exit(void)
{
	account_system_vtime(current);
	trace_hardirq_exit();
	sub_preempt_count(IRQ_EXIT_OFFSET);
	if (!in_interrupt() && local_softirq_pending())
		invoke_softirq();

	...
}

Which triggers a wakeup, which uses RCU, now if the interrupted task has
t->rcu_read_unlock_special set, the rcu usage from the wakeup will end
up in rcu_read_unlock_special(). rcu_read_unlock_special() will test
for in_irq(), which will fail as we just decremented preempt_count
with IRQ_EXIT_OFFSET, and in_sering_softirq(), which for
PREEMPT_RT_FULL reads:

int in_serving_softirq(void)
{
	int res;

	preempt_disable();
	res = __get_cpu_var(local_softirq_runner) == current;
	preempt_enable();
	return res;
}

Which will thus also fail, resulting in the above wreckage.

The 'somewhat' ugly solution is to open-code the preempt_count() test
in rcu_read_unlock_special().

Also, we're not at all sure how ->rcu_read_unlock_special gets set
here... so this is very likely a bandaid and more thought is required.

Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Change-Id: I00ff34992eff1b902ee7d197f3ed2862e4f70850
Signed-off-by: Tashfin Shakeer Rhythm <tashfinshakeerrhythm@gmail.com>
Signed-off-by: TogoFire <togofire@mailfence.com>
TogoFire pushed a commit that referenced this issue Aug 23, 2023
We don't need to hold the local pinctrl lock here to set irq wake on the
summary irq line. Doing so only leads to lockdep warnings instead of
protecting us from anything. Remove the locking.

 WARNING: possible circular locking dependency detected
 5.4.11 #2 Tainted: G        W
 ------------------------------------------------------
 cat/3083 is trying to acquire lock:
 ffffff81f4fa58c0 (&irq_desc_lock_class){-.-.}, at: __irq_get_desc_lock+0x64/0x94

 but task is already holding lock:
 ffffff81f4880c18 (&pctrl->lock){-.-.}, at: msm_gpio_irq_set_wake+0x48/0x7c

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> #1 (&pctrl->lock){-.-.}:
        _raw_spin_lock_irqsave+0x64/0x80
        msm_gpio_irq_ack+0x68/0xf4
        __irq_do_set_handler+0xe0/0x180
        __irq_set_handler+0x60/0x9c
        irq_domain_set_info+0x90/0xb4
        gpiochip_hierarchy_irq_domain_alloc+0x110/0x200
        __irq_domain_alloc_irqs+0x130/0x29c
        irq_create_fwspec_mapping+0x1f0/0x300
        irq_create_of_mapping+0x70/0x98
        of_irq_get+0xa4/0xd4
        spi_drv_probe+0x4c/0xb0
        really_probe+0x138/0x3f0
        driver_probe_device+0x70/0x140
        __device_attach_driver+0x9c/0x110
        bus_for_each_drv+0x88/0xd0
        __device_attach+0xb0/0x160
        device_initial_probe+0x20/0x2c
        bus_probe_device+0x34/0x94
        device_add+0x35c/0x3f0
        spi_add_device+0xbc/0x194
        of_register_spi_devices+0x2c8/0x408
        spi_register_controller+0x57c/0x6fc
        spi_geni_probe+0x260/0x328
        platform_drv_probe+0x90/0xb0
        really_probe+0x138/0x3f0
        driver_probe_device+0x70/0x140
        device_driver_attach+0x4c/0x6c
        __driver_attach+0xcc/0x154
        bus_for_each_dev+0x84/0xcc
        driver_attach+0x2c/0x38
        bus_add_driver+0x108/0x1fc
        driver_register+0x64/0xf8
        __platform_driver_register+0x4c/0x58
        spi_geni_driver_init+0x1c/0x24
        do_one_initcall+0x1a4/0x3e8
        do_initcall_level+0xb4/0xcc
        do_basic_setup+0x30/0x48
        kernel_init_freeable+0x124/0x1a8
        kernel_init+0x14/0x100
        ret_from_fork+0x10/0x18

 -> #0 (&irq_desc_lock_class){-.-.}:
        __lock_acquire+0xeb4/0x2388
        lock_acquire+0x1cc/0x210
        _raw_spin_lock_irqsave+0x64/0x80
        __irq_get_desc_lock+0x64/0x94
        irq_set_irq_wake+0x40/0x144
        msm_gpio_irq_set_wake+0x5c/0x7c
        set_irq_wake_real+0x40/0x5c
        irq_set_irq_wake+0x70/0x144
        cros_ec_rtc_suspend+0x38/0x4c
        platform_pm_suspend+0x34/0x60
        dpm_run_callback+0x64/0xcc
        __device_suspend+0x310/0x41c
        dpm_suspend+0xf8/0x298
        dpm_suspend_start+0x84/0xb4
        suspend_devices_and_enter+0xbc/0x620
        pm_suspend+0x210/0x348
        state_store+0xb0/0x108
        kobj_attr_store+0x14/0x24
        sysfs_kf_write+0x4c/0x64
        kernfs_fop_write+0x15c/0x1fc
        __vfs_write+0x54/0x18c
        vfs_write+0xe4/0x1a4
        ksys_write+0x7c/0xe4
        __arm64_sys_write+0x20/0x2c
        el0_svc_common+0xa8/0x160
        el0_svc_handler+0x7c/0x98
        el0_svc+0x8/0xc

 other info that might help us debug this:

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(&pctrl->lock);
                                lock(&irq_desc_lock_class);
                                lock(&pctrl->lock);
   lock(&irq_desc_lock_class);

  *** DEADLOCK ***

 7 locks held by cat/3083:
  #0: ffffff81f06d1420 (sb_writers#7){.+.+}, at: vfs_write+0xd0/0x1a4
  #1: ffffff81c8935680 (&of->mutex){+.+.}, at: kernfs_fop_write+0x12c/0x1fc
  #2: ffffff81f4c322f0 (kn->count#337){.+.+}, at: kernfs_fop_write+0x134/0x1fc
  #3: ffffffe89a641d60 (system_transition_mutex){+.+.}, at: pm_suspend+0x108/0x348
  #4: ffffff81f190e970 (&dev->mutex){....}, at: __device_suspend+0x168/0x41c
  #5: ffffff81f183d8c0 (lock_class){-.-.}, at: __irq_get_desc_lock+0x64/0x94
  #6: ffffff81f4880c18 (&pctrl->lock){-.-.}, at: msm_gpio_irq_set_wake+0x48/0x7c

 stack backtrace:
 CPU: 4 PID: 3083 Comm: cat Tainted: G        W         5.4.11 #2
 Hardware name: Google Cheza (rev3+) (DT)
 Call trace:
  dump_backtrace+0x0/0x174
  show_stack+0x20/0x2c
  dump_stack+0xc8/0x124
  print_circular_bug+0x2ac/0x2c4
  check_noncircular+0x1a0/0x1a8
  __lock_acquire+0xeb4/0x2388
  lock_acquire+0x1cc/0x210
  _raw_spin_lock_irqsave+0x64/0x80
  __irq_get_desc_lock+0x64/0x94
  irq_set_irq_wake+0x40/0x144
  msm_gpio_irq_set_wake+0x5c/0x7c
  set_irq_wake_real+0x40/0x5c
  irq_set_irq_wake+0x70/0x144
  cros_ec_rtc_suspend+0x38/0x4c
  platform_pm_suspend+0x34/0x60
  dpm_run_callback+0x64/0xcc
  __device_suspend+0x310/0x41c
  dpm_suspend+0xf8/0x298
  dpm_suspend_start+0x84/0xb4
  suspend_devices_and_enter+0xbc/0x620
  pm_suspend+0x210/0x348
  state_store+0xb0/0x108
  kobj_attr_store+0x14/0x24
  sysfs_kf_write+0x4c/0x64
  kernfs_fop_write+0x15c/0x1fc
  __vfs_write+0x54/0x18c
  vfs_write+0xe4/0x1a4
  ksys_write+0x7c/0xe4
  __arm64_sys_write+0x20/0x2c
  el0_svc_common+0xa8/0x160
  el0_svc_handler+0x7c/0x98
  el0_svc+0x8/0xc

Fixes: 6aced33 ("pinctrl: msm: drop wake_irqs bitmap")
Cc: Douglas Anderson <dianders@chromium.org>
Cc: Brian Masney <masneyb@onstation.org>
Cc: Lina Iyer <ilina@codeaurora.org>
Cc: Maulik Shah <mkshah@codeaurora.org>
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Change-Id: I5c8227280df4234a9d831751aea1f364e3b04664
Link: https://lore.kernel.org/r/20200121180950.36959-1-swboyd@chromium.org
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ahmad Thoriq Najahi <najahi@chips-projects.xyz>
Signed-off-by: Tashfin Shakeer Rhythm <tashfinshakeerrhythm@gmail.com>
Signed-off-by: TogoFire <togofire@mailfence.com>
TogoFire pushed a commit that referenced this issue Aug 23, 2023
On a system that often re-configures a USB gadget device, the kernel log
is filled with:

  configfs-gadget gadget: high-speed config #1: c

Reduce the verbosity of this print to debug.

Signed-off-by: Joel Stanley <joel@jms.id.au>
Change-Id: I60813bb808ee0cf4837703aaadbf47cf67818fcc
Signed-off-by: TogoFire <togofire@mailfence.com>
TogoFire pushed a commit that referenced this issue Aug 23, 2023
A recent clang 16 update has introduced segfault when compiling kernel
using the `-polly-invariant-load-hoisting` flag and breaks compilation
when kernel is compiled with full LTO.
This is due to the flag being ignorant about the errors during SCoP
verification and does not take the errors into account as per a recent
issue opened at [LLVM], causing polly to do segfault and the compiler
to print the following backtrace:

PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.	Program arguments: clang -Wp,-MD,drivers/md/.md.o.d -nostdinc -isystem /tmp/cirrus-ci-build/toolchains/clang/lib/clang/16.0.0/include -I../arch/arm64/include -I./arch/arm64/include/generated -I../include -I./include -I../arch/arm64/include/uapi -I./arch/arm64/include/generated/uapi -I../include/uapi -I./include/generated/uapi -include ../include/linux/kconfig.h -include ../include/linux/compiler_types.h -I../drivers/md -Idrivers/md -D__KERNEL__ -mlittle-endian -DKASAN_SHADOW_SCALE_SHIFT=3 -Qunused-arguments -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -fshort-wchar -Werror-implicit-function-declaration -Werror=return-type -Wno-format-security -std=gnu89 --target=aarch64-linux-gnu --prefix=/tmp/cirrus-ci-build/Kernel/../toolchains/clang/bin/aarch64-linux-gnu- --gcc-toolchain=/tmp/cirrus-ci-build/toolchains/clang -Wno-misleading-indentation -Wno-bool-operation -Werror=unknown-warning-option -Wno-unsequenced -opaque-pointers -fno-PIE -mgeneral-regs-only -DCONFIG_AS_LSE=1 -fno-asynchronous-unwind-tables -Wno-psabi -DKASAN_SHADOW_SCALE_SHIFT=3 -fno-delete-null-pointer-checks -Wno-frame-address -Wno-int-in-bool-context -Wno-address-of-packed-member -O3 -march=armv8.1-a+crypto+fp16+rcpc -mtune=cortex-a53 -mllvm -polly -mllvm -polly-ast-use-context -mllvm -polly-detect-keep-going -mllvm -polly-invariant-load-hoisting -mllvm -polly-run-inliner -mllvm -polly-vectorizer=stripmine -mllvm -polly-loopfusion-greedy=1 -mllvm -polly-reschedule=1 -mllvm -polly-postopts=1 -fstack-protector-strong --target=aarch64-linux-gnu --gcc-toolchain=/tmp/cirrus-ci-build/toolchains/clang -meabi gnu -Wno-format-invalid-specifier -Wno-gnu -Wno-duplicate-decl-specifier -Wno-asm-operand-widths -Wno-initializer-overrides -Wno-tautological-constant-out-of-range-compare -Wno-tautological-compare -mno-global-merge -Wno-void-ptr-dereference -Wno-unused-but-set-variable -Wno-unused-const-variable -fno-omit-frame-pointer -fno-optimize-sibling-calls -Wvla -flto -fwhole-program-vtables -fvisibility=hidden -Wdeclaration-after-statement -Wno-pointer-sign -Wno-array-bounds -fno-strict-overflow -fno-stack-check -Werror=implicit-int -Werror=strict-prototypes -Werror=date-time -Werror=incompatible-pointer-types -fmacro-prefix-map=../= -Wno-initializer-overrides -Wno-unused-value -Wno-format -Wno-sign-compare -Wno-format-zero-length -Wno-uninitialized -Wno-pointer-to-enum-cast -Wno-unaligned-access -DKBUILD_BASENAME=\"md\" -DKBUILD_MODNAME=\"md_mod\" -D__KBUILD_MODNAME=kmod_md_mod -c -o drivers/md/md.o ../drivers/md/md.c
1.	<eof> parser at end of file
2.	Optimizer
  CC      drivers/media/platform/msm/camera_v2/camera/camera.o
  AR      drivers/media/pci/intel/ipu3/built-in.a
  CC      drivers/md/dm-linear.o
 #0 0x0000559d3527073f (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x3a7073f)
 #1 0x0000559d352705bf (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x3a705bf)
 #2 0x0000559d3523b198 (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x3a3b198)
 #3 0x0000559d3523b33e (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x3a3b33e)
 #4 0x00007f339dc3ea00 (/usr/lib/libc.so.6+0x38a00)
 #5 0x0000559d35affccf (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x42ffccf)
 #6 0x0000559d35b01710 (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x4301710)
 #7 0x0000559d35b01a12 (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x4301a12)
 #8 0x0000559d35b09a9e (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x4309a9e)
 #9 0x0000559d35b14707 (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x4314707)
clang-16: error: clang frontend command failed with exit code 139 (use -v to see invocation)
Neutron clang version 16.0.0 (https://github.com/llvm/llvm-project.git 598f5275c16049b1e1b5bc934cbde447a82d485e)
Target: aarch64-unknown-linux-gnu
Thread model: posix
InstalledDir: /tmp/cirrus-ci-build/Kernel/../toolchains/clang/bin

From the very nature of `-polly-detect-keep-going`, it can be concluded
that this is a potentially unsafe flag and instead of using it conditionally
for CLANG < 16, just remove it all together for all. This should allow polly
to run SCoP verifications as intended.

Issue [LLVM]: llvm/llvm-project#58484 (comment)

Signed-off-by: Tashfin Shakeer Rhythm <tashfinshakeerrhythm@gmail.com>
Change-Id: Iac31845d5dd453d0f2fc887c5d5f8b89bb9d289c
Signed-off-by: TogoFire <togofire@mailfence.com>
TogoFire pushed a commit that referenced this issue Aug 23, 2023
Qos requirement is not initialized when update is called.

This fixes the following call trace:
[...]
[   27.206894] msm_pm_qos_update_request: update request 100
[   27.206900] msm_pm_qos_add_request: add request
[   27.206904] pm_qos_update_request() called for unknown object
[   27.206915] ------------[ cut here ]------------
[   27.206928] WARNING: CPU: 5 PID: 1107 at kernel/power/qos.c:696 pm_qos_update_request+0x4c/0x60
[   27.206935] CPU: 5 PID: 1107 Comm: qvrservice Tainted: G S              4.14.185-Chips #1
[   27.206937] Hardware name: Qualcomm Technologies, Inc. trinket pm6125 + pmi632 IDP (DT)
[   27.206939] task: 00000000ffaf085c task.stack: 000000007a0f23be
[   27.206942] pc : pm_qos_update_request+0x4c/0x60
[   27.206943] lr : pm_qos_update_request+0x4c/0x60
[...]

Signed-off-by: Ahmad Thoriq Najahi <najahi@chips-projects.xyz>
Change-Id: I196040640618ded2f3f4ea8e1be92856d877e415
Signed-off-by: TogoFire <togofire@mailfence.com>
TogoFire pushed a commit that referenced this issue Aug 24, 2023
Qos requirement is not initialized when update is called.

This fixes the following call trace:
[...]
[   27.206894] msm_pm_qos_update_request: update request 100
[   27.206900] msm_pm_qos_add_request: add request
[   27.206904] pm_qos_update_request() called for unknown object
[   27.206915] ------------[ cut here ]------------
[   27.206928] WARNING: CPU: 5 PID: 1107 at kernel/power/qos.c:696 pm_qos_update_request+0x4c/0x60
[   27.206935] CPU: 5 PID: 1107 Comm: qvrservice Tainted: G S              4.14.185-Chips #1
[   27.206937] Hardware name: Qualcomm Technologies, Inc. trinket pm6125 + pmi632 IDP (DT)
[   27.206939] task: 00000000ffaf085c task.stack: 000000007a0f23be
[   27.206942] pc : pm_qos_update_request+0x4c/0x60
[   27.206943] lr : pm_qos_update_request+0x4c/0x60
[...]

Signed-off-by: Ahmad Thoriq Najahi <najahi@chips-projects.xyz>
Change-Id: I196040640618ded2f3f4ea8e1be92856d877e415
Signed-off-by: TogoFire <togofire@mailfence.com>
TogoFire pushed a commit that referenced this issue Sep 5, 2023
fsck.f2fs forces a filesystem fix on boot if it detects that the current
kernel version differs from the one saved in the superblock, which results in
fsck blocking boot for a long time (~35 seconds). This commit provides a
way to report a constant fake kernel version to fsck to avoid triggering
the version check, which is useful if you boot new kernel builds
frequently.

Signed-off-by: Danny Lin <danny@kdrag0n.dev>
Change-Id: I8490fa153177c5e9456f0202990c491a8e64a608

Signed-off-by: Carlos Jimenez (JavaShin-X) <javashin1986@gmail.com>

ocean stock-kernel =

CONFIG_F2FS_REPORT_FAKE_KERNEL_VERSION=y
CONFIG_F2FS_FAKE_KERNEL_RELEASE="4.9.206-perf+"
CONFIG_F2FS_FAKE_KERNEL_VERSION="#1 SMP PREEMPT Wed Sep 30 23:50:55 CDT 2020 aarch64"

Signed-off-by: Carlos Jimenez (JavaShin-X) <javashin1986@gmail.com>
Signed-off-by: TogoFire <italomellopereira@gmail.com>
TogoFire pushed a commit that referenced this issue Sep 5, 2023
Patch series "lib/sort & lib/list_sort: faster and smaller", v2.

Because CONFIG_RETPOLINE has made indirect calls much more expensive, I
thought I'd try to reduce the number made by the library sort functions.

The first three patches apply to lib/sort.c.

Patch #1 is a simple optimization.  The built-in swap has special cases
for aligned 4- and 8-byte objects.  But those are almost never used;
most calls to sort() work on larger structures, which fall back to the
byte-at-a-time loop.  This generalizes them to aligned *multiples* of 4
and 8 bytes.  (If nothing else, it saves an awful lot of energy by not
thrashing the store buffers as much.)

Patch #2 grabs a juicy piece of low-hanging fruit.  I agree that nice
simple solid heapsort is preferable to more complex algorithms (sorry,
Andrey), but it's possible to implement heapsort with far fewer
comparisons (50% asymptotically, 25-40% reduction for realistic sizes)
than the way it's been done up to now.  And with some care, the code
ends up smaller, as well.  This is the "big win" patch.

Patch #3 adds the same sort of indirect call bypass that has been added
to the net code of late.  The great majority of the callers use the
builtin swap functions, so replace the indirect call to sort_func with a
(highly preditable) series of if() statements.  Rather surprisingly,
this decreased code size, as the swap functions were inlined and their
prologue & epilogue code eliminated.

lib/list_sort.c is a bit trickier, as merge sort is already close to
optimal, and we don't want to introduce triumphs of theory over
practicality like the Ford-Johnson merge-insertion sort.

Patch #4, without changing the algorithm, chops 32% off the code size
and removes the part[MAX_LIST_LENGTH+1] pointer array (and the
corresponding upper limit on efficiently sortable input size).

Patch #5 improves the algorithm.  The previous code is already optimal
for power-of-two (or slightly smaller) size inputs, but when the input
size is just over a power of 2, there's a very unbalanced final merge.

There are, in the literature, several algorithms which solve this, but
they all depend on the "breadth-first" merge order which was replaced by
commit 835cc0c with a more cache-friendly "depth-first" order.
Some hard thinking came up with a depth-first algorithm which defers
merges as little as possible while avoiding bad merges.  This saves
0.2*n compares, averaged over all sizes.

The code size increase is minimal (64 bytes on x86-64, reducing the net
savings to 26%), but the comments expanded significantly to document the
clever algorithm.

TESTING NOTES: I have some ugly user-space benchmarking code which I
used for testing before moving this code into the kernel.  Shout if you
want a copy.

I'm running this code right now, with CONFIG_TEST_SORT and
CONFIG_TEST_LIST_SORT, but I confess I haven't rebooted since the last
round of minor edits to quell checkpatch.  I figure there will be at
least one round of comments and final testing.

This patch (of 5):

Rather than having special-case swap functions for 4- and 8-byte
objects, special-case aligned multiples of 4 or 8 bytes.  This speeds up
most users of sort() by avoiding fallback to the byte copy loop.

Despite what ca96ab8 ("lib/sort: Add 64 bit swap function") claims,
very few users of sort() sort pointers (or pointer-sized objects); most
sort structures containing at least two words.  (E.g.
drivers/acpi/fan.c:acpi_fan_get_fps() sorts an array of 40-byte struct
acpi_fan_fps.)

The functions also got renamed to reflect the fact that they support
multiple words.  In the great tradition of bikeshedding, the names were
by far the most contentious issue during review of this patch series.

x86-64 code size 872 -> 886 bytes (+14)

With feedback from Andy Shevchenko, Rasmus Villemoes and Geert
Uytterhoeven.

Link: http://lkml.kernel.org/r/f24f932df3a7fa1973c1084154f1cea596bcf341.1552704200.git.lkml@sdf.org
Signed-off-by: George Spelvin <lkml@sdf.org>
Change-Id: I3028c6346b6d8bd00e7de9375aeded9f005ac5d2
Acked-by: Andrey Abramov <st5pub@yandex.ru>
Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Daniel Wagner <daniel.wagner@siemens.com>
Cc: Don Mullis <don.mullis@gmail.com>
Cc: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
TogoFire pushed a commit that referenced this issue Sep 5, 2023
commit ba5a4fdd63ae0c575707030db0b634b160baddd7 upstream.

syzbot complained about a recent change in TCP stack,
hitting a NULL pointer [1]

tcp request sockets have an af_specific pointer, which
was used before the blamed change only for SYNACK generation
in non SYNCOOKIE mode.

tcp requests sockets momentarily created when third packet
coming from client in SYNCOOKIE mode were not using
treq->af_specific.

Make sure this field is populated, in the same way normal
TCP requests sockets do in tcp_conn_request().

[1]
TCP: request_sock_TCPv6: Possible SYN flooding on port 20002. Sending cookies.  Check SNMP counters.
general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f]
CPU: 1 PID: 3695 Comm: syz-executor864 Not tainted 5.18.0-rc3-syzkaller-00224-g5fd1fe4807f9 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:tcp_create_openreq_child+0xe16/0x16b0 net/ipv4/tcp_minisocks.c:534
Code: 48 c1 ea 03 80 3c 02 00 0f 85 e5 07 00 00 4c 8b b3 28 01 00 00 48 b8 00 00 00 00 00 fc ff df 49 8d 7e 08 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 c9 07 00 00 48 8b 3c 24 48 89 de 41 ff 56 08 48
RSP: 0018:ffffc90000de0588 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: ffff888076490330 RCX: 0000000000000100
RDX: 0000000000000001 RSI: ffffffff87d67ff0 RDI: 0000000000000008
RBP: ffff88806ee1c7f8 R08: 0000000000000000 R09: 0000000000000000
R10: ffffffff87d67f00 R11: 0000000000000000 R12: ffff88806ee1bfc0
R13: ffff88801b0e0368 R14: 0000000000000000 R15: 0000000000000000
FS:  00007f517fe58700(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffcead76960 CR3: 000000006f97b000 CR4: 00000000003506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <IRQ>
 tcp_v6_syn_recv_sock+0x199/0x23b0 net/ipv6/tcp_ipv6.c:1267
 tcp_get_cookie_sock+0xc9/0x850 net/ipv4/syncookies.c:207
 cookie_v6_check+0x15c3/0x2340 net/ipv6/syncookies.c:258
 tcp_v6_cookie_check net/ipv6/tcp_ipv6.c:1131 [inline]
 tcp_v6_do_rcv+0x1148/0x13b0 net/ipv6/tcp_ipv6.c:1486
 tcp_v6_rcv+0x3305/0x3840 net/ipv6/tcp_ipv6.c:1725
 ip6_protocol_deliver_rcu+0x2e9/0x1900 net/ipv6/ip6_input.c:422
 ip6_input_finish+0x14c/0x2c0 net/ipv6/ip6_input.c:464
 NF_HOOK include/linux/netfilter.h:307 [inline]
 NF_HOOK include/linux/netfilter.h:301 [inline]
 ip6_input+0x9c/0xd0 net/ipv6/ip6_input.c:473
 dst_input include/net/dst.h:461 [inline]
 ip6_rcv_finish net/ipv6/ip6_input.c:76 [inline]
 NF_HOOK include/linux/netfilter.h:307 [inline]
 NF_HOOK include/linux/netfilter.h:301 [inline]
 ipv6_rcv+0x27f/0x3b0 net/ipv6/ip6_input.c:297
 __netif_receive_skb_one_core+0x114/0x180 net/core/dev.c:5405
 __netif_receive_skb+0x24/0x1b0 net/core/dev.c:5519
 process_backlog+0x3a0/0x7c0 net/core/dev.c:5847
 __napi_poll+0xb3/0x6e0 net/core/dev.c:6413
 napi_poll net/core/dev.c:6480 [inline]
 net_rx_action+0x8ec/0xc60 net/core/dev.c:6567
 __do_softirq+0x29b/0x9c2 kernel/softirq.c:558
 invoke_softirq kernel/softirq.c:432 [inline]
 __irq_exit_rcu+0x123/0x180 kernel/softirq.c:637
 irq_exit_rcu+0x5/0x20 kernel/softirq.c:649
 sysvec_apic_timer_interrupt+0x93/0xc0 arch/x86/kernel/apic/apic.c:1097

Fixes: 5b0b9e4c2c89 ("tcp: md5: incorrect tcp_header_len for incoming connections")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Change-Id: I0b07f85b9d8329e0045a3e5baa417fcd21df3989
Cc: Francesco Ruggeri <fruggeri@arista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[fruggeri: Account for backport conflicts from 35b2c3211609 and 6fc8c827dd4f]
Signed-off-by: Francesco Ruggeri <fruggeri@arista.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
TogoFire pushed a commit that referenced this issue Sep 5, 2023
During boot we observed several warnings that can be avoided

[    6.725935] pmic_analog_codec 800f000.qcom,spmi:qcom,pm660l@3:analog-codec@f000: Error: regulator not found:cdc-vdd-spkdrv
[    6.726304] ------------[ cut here ]------------
[    6.726364] WARNING: CPU: 1 PID: 455 at regcache_cache_only+0xf0/0x100
[    6.726452] Modules linked in:
[    6.726460] CPU: 1 PID: 455 Comm: kworker/1:3 Not tainted 4.9.170-g00294e841ec8-dirty #1
[    6.726462] Hardware name: Qualcomm Technologies, Inc. SDM 630 PM660 + PM660L MTP (DT)
[    6.726471] Workqueue: events deferred_probe_work_func
[    6.726473] task: ffffffedb575b600 task.stack: ffffffedb5770000
[    6.726476] PC is at regcache_cache_only+0xf0/0x100
[    6.726478] LR is at regcache_cache_only+0x2c/0x100

Signed-off-by: Alin Jerpelea <alin.jerpelea@sony.com>
Change-Id: Ie1976dd08c3bd4d402b85b892cb24b8c46922d31
Signed-off-by: TogoFire <italomellopereira@gmail.com>
TogoFire pushed a commit that referenced this issue Sep 5, 2023
log:
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0. Program arguments: /home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld -EL -maarch64elf --lto-O3 -z noexecstack -r -o vmlinux.o --whole-archive built -in.o
 #0 0x000055d50c1017a3 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld+0x25e87a3)
 #1 0x000055d50c1013fa (/home/togo/Documents/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld+0x25e83fa)
 #2 0x00007f52034b9b50 __restore_rt (/lib64/libc.so.6+0x3cb50)
 #3 0x000055d50c38a111 void lld::elf::writeResult<llvm::object::ELFType<(llvm::support::endianness)1, true>>() (/home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang- neutron/bin/ld.lld+0x2871111)
 #4 0x000055d50c1e539e lld::elf::LinkerDriver::link(llvm::opt::InputArgList&) (/home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld+0x26cc39e)
 #5 0x000055d50c1ce651 lld::elf::LinkerDriver::linkerMain(llvm::ArrayRef<char const*>) (/home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld+0x26b5651)
 #6 0x000055d50c1cbc96 lld::elf::link(llvm::ArrayRef<char const*>, llvm::raw_ostream&, llvm::raw_ostream&, bool, bool) (/home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang-neutron /bin/ld.lld+0x26b2c96)
 #7 0x000055d50c0aad03 (/home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld+0x2591d03)
 #8 0x000055d50bc2bc3e lld_main(int, char**) (/home/togo/Documentos/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld+0x2112c3e)
 #9 0x00007f52034a4510 __libc_start_call_main (/lib64/libc.so.6+0x27510)
#10 0x00007f52034a45c9 __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x275c9)
#11 0x000055d50c09e9f5 _start (/home/togo/Documents/GitHub/kernel_xiaomi_panda/clang-neutron/bin/ld.lld+0x25859f5)
../scripts/link-vmlinux.sh, line 94: 60322 Segmentation fault (core image recorded)${LDFINAL} ${LDFLAGS} -r -o ${1} $(lto_lds) ${objects}
make[1]: *** [/home/togo/Documents/GitHub/kernel_xiaomi_panda/Makefile:1272: vmlinux] Error 139
make[1]: Leaving directory '/home/togo/Documentos/GitHub/kernel_xiaomi_panda/out'

make: *** [Makefile:154: sub-make] Error 2

Signed-off-by: GeoPD <geoemmanuelpd2001@gmail.com>
Change-Id: I7f562d937da3da19192a1c338039f15795cf1bb5
Signed-off-by: TogoFire <italomellopereira@gmail.com>
TogoFire pushed a commit that referenced this issue Sep 5, 2023
With RT_FULL we get the below wreckage:

[  126.060484] =======================================================
[  126.060486] [ INFO: possible circular locking dependency detected ]
[  126.060489] 3.0.1-rt10+ #30
[  126.060490] -------------------------------------------------------
[  126.060492] irq/24-eth0/1235 is trying to acquire lock:
[  126.060495]  (&(lock)->wait_lock#2){+.+...}, at: [<ffffffff81501c81>] rt_mutex_slowunlock+0x16/0x55
[  126.060503]
[  126.060504] but task is already holding lock:
[  126.060506]  (&p->pi_lock){-...-.}, at: [<ffffffff81074fdc>] try_to_wake_up+0x35/0x429
[  126.060511]
[  126.060511] which lock already depends on the new lock.
[  126.060513]
[  126.060514]
[  126.060514] the existing dependency chain (in reverse order) is:
[  126.060516]
[  126.060516] -> #1 (&p->pi_lock){-...-.}:
[  126.060519]        [<ffffffff810afe9e>] lock_acquire+0x145/0x18a
[  126.060524]        [<ffffffff8150291e>] _raw_spin_lock_irqsave+0x4b/0x85
[  126.060527]        [<ffffffff810b5aa4>] task_blocks_on_rt_mutex+0x36/0x20f
[  126.060531]        [<ffffffff815019bb>] rt_mutex_slowlock+0xd1/0x15a
[  126.060534]        [<ffffffff81501ae3>] rt_mutex_lock+0x2d/0x2f
[  126.060537]        [<ffffffff810d9020>] rcu_boost+0xad/0xde
[  126.060541]        [<ffffffff810d90ce>] rcu_boost_kthread+0x7d/0x9b
[  126.060544]        [<ffffffff8109a760>] kthread+0x99/0xa1
[  126.060547]        [<ffffffff81509b14>] kernel_thread_helper+0x4/0x10
[  126.060551]
[  126.060552] -> #0 (&(lock)->wait_lock#2){+.+...}:
[  126.060555]        [<ffffffff810af1b8>] __lock_acquire+0x1157/0x1816
[  126.060558]        [<ffffffff810afe9e>] lock_acquire+0x145/0x18a
[  126.060561]        [<ffffffff8150279e>] _raw_spin_lock+0x40/0x73
[  126.060564]        [<ffffffff81501c81>] rt_mutex_slowunlock+0x16/0x55
[  126.060566]        [<ffffffff81501ce7>] rt_mutex_unlock+0x27/0x29
[  126.060569]        [<ffffffff810d9f86>] rcu_read_unlock_special+0x17e/0x1c4
[  126.060573]        [<ffffffff810da014>] __rcu_read_unlock+0x48/0x89
[  126.060576]        [<ffffffff8106847a>] select_task_rq_rt+0xc7/0xd5
[  126.060580]        [<ffffffff8107511c>] try_to_wake_up+0x175/0x429
[  126.060583]        [<ffffffff81075425>] wake_up_process+0x15/0x17
[  126.060585]        [<ffffffff81080a51>] wakeup_softirqd+0x24/0x26
[  126.060590]        [<ffffffff81081df9>] irq_exit+0x49/0x55
[  126.060593]        [<ffffffff8150a3bd>] smp_apic_timer_interrupt+0x8a/0x98
[  126.060597]        [<ffffffff81509793>] apic_timer_interrupt+0x13/0x20
[  126.060600]        [<ffffffff810d5952>] irq_forced_thread_fn+0x1b/0x44
[  126.060603]        [<ffffffff810d582c>] irq_thread+0xde/0x1af
[  126.060606]        [<ffffffff8109a760>] kthread+0x99/0xa1
[  126.060608]        [<ffffffff81509b14>] kernel_thread_helper+0x4/0x10
[  126.060611]
[  126.060612] other info that might help us debug this:
[  126.060614]
[  126.060615]  Possible unsafe locking scenario:
[  126.060616]
[  126.060617]        CPU0                    CPU1
[  126.060619]        ----                    ----
[  126.060620]   lock(&p->pi_lock);
[  126.060623]                                lock(&(lock)->wait_lock);
[  126.060625]                                lock(&p->pi_lock);
[  126.060627]   lock(&(lock)->wait_lock);
[  126.060629]
[  126.060629]  *** DEADLOCK ***
[  126.060630]
[  126.060632] 1 lock held by irq/24-eth0/1235:
[  126.060633]  #0:  (&p->pi_lock){-...-.}, at: [<ffffffff81074fdc>] try_to_wake_up+0x35/0x429
[  126.060638]
[  126.060638] stack backtrace:
[  126.060641] Pid: 1235, comm: irq/24-eth0 Not tainted 3.0.1-rt10+ #30
[  126.060643] Call Trace:
[  126.060644]  <IRQ>  [<ffffffff810acbde>] print_circular_bug+0x289/0x29a
[  126.060651]  [<ffffffff810af1b8>] __lock_acquire+0x1157/0x1816
[  126.060655]  [<ffffffff810ab3aa>] ? trace_hardirqs_off_caller+0x1f/0x99
[  126.060658]  [<ffffffff81501c81>] ? rt_mutex_slowunlock+0x16/0x55
[  126.060661]  [<ffffffff810afe9e>] lock_acquire+0x145/0x18a
[  126.060664]  [<ffffffff81501c81>] ? rt_mutex_slowunlock+0x16/0x55
[  126.060668]  [<ffffffff8150279e>] _raw_spin_lock+0x40/0x73
[  126.060671]  [<ffffffff81501c81>] ? rt_mutex_slowunlock+0x16/0x55
[  126.060674]  [<ffffffff810d9655>] ? rcu_report_qs_rsp+0x87/0x8c
[  126.060677]  [<ffffffff81501c81>] rt_mutex_slowunlock+0x16/0x55
[  126.060680]  [<ffffffff810d9ea3>] ? rcu_read_unlock_special+0x9b/0x1c4
[  126.060683]  [<ffffffff81501ce7>] rt_mutex_unlock+0x27/0x29
[  126.060687]  [<ffffffff810d9f86>] rcu_read_unlock_special+0x17e/0x1c4
[  126.060690]  [<ffffffff810da014>] __rcu_read_unlock+0x48/0x89
[  126.060693]  [<ffffffff8106847a>] select_task_rq_rt+0xc7/0xd5
[  126.060696]  [<ffffffff810683da>] ? select_task_rq_rt+0x27/0xd5
[  126.060701]  [<ffffffff810a852a>] ? clockevents_program_event+0x8e/0x90
[  126.060704]  [<ffffffff8107511c>] try_to_wake_up+0x175/0x429
[  126.060708]  [<ffffffff810a95dc>] ? tick_program_event+0x1f/0x21
[  126.060711]  [<ffffffff81075425>] wake_up_process+0x15/0x17
[  126.060715]  [<ffffffff81080a51>] wakeup_softirqd+0x24/0x26
[  126.060718]  [<ffffffff81081df9>] irq_exit+0x49/0x55
[  126.060721]  [<ffffffff8150a3bd>] smp_apic_timer_interrupt+0x8a/0x98
[  126.060724]  [<ffffffff81509793>] apic_timer_interrupt+0x13/0x20
[  126.060726]  <EOI>  [<ffffffff81072855>] ? migrate_disable+0x75/0x12d
[  126.060733]  [<ffffffff81080a61>] ? local_bh_disable+0xe/0x1f
[  126.060736]  [<ffffffff81080a70>] ? local_bh_disable+0x1d/0x1f
[  126.060739]  [<ffffffff810d5952>] irq_forced_thread_fn+0x1b/0x44
[  126.060742]  [<ffffffff81502ac0>] ? _raw_spin_unlock_irq+0x3b/0x59
[  126.060745]  [<ffffffff810d582c>] irq_thread+0xde/0x1af
[  126.060748]  [<ffffffff810d5937>] ? irq_thread_fn+0x3a/0x3a
[  126.060751]  [<ffffffff810d574e>] ? irq_finalize_oneshot+0xd1/0xd1
[  126.060754]  [<ffffffff810d574e>] ? irq_finalize_oneshot+0xd1/0xd1
[  126.060757]  [<ffffffff8109a760>] kthread+0x99/0xa1
[  126.060761]  [<ffffffff81509b14>] kernel_thread_helper+0x4/0x10
[  126.060764]  [<ffffffff81069ed7>] ? finish_task_switch+0x87/0x10a
[  126.060768]  [<ffffffff81502ec4>] ? retint_restore_args+0xe/0xe
[  126.060771]  [<ffffffff8109a6c7>] ? __init_kthread_worker+0x8c/0x8c
[  126.060774]  [<ffffffff81509b10>] ? gs_change+0xb/0xb

Because irq_exit() does:

void irq_exit(void)
{
	account_system_vtime(current);
	trace_hardirq_exit();
	sub_preempt_count(IRQ_EXIT_OFFSET);
	if (!in_interrupt() && local_softirq_pending())
		invoke_softirq();

	...
}

Which triggers a wakeup, which uses RCU, now if the interrupted task has
t->rcu_read_unlock_special set, the rcu usage from the wakeup will end
up in rcu_read_unlock_special(). rcu_read_unlock_special() will test
for in_irq(), which will fail as we just decremented preempt_count
with IRQ_EXIT_OFFSET, and in_sering_softirq(), which for
PREEMPT_RT_FULL reads:

int in_serving_softirq(void)
{
	int res;

	preempt_disable();
	res = __get_cpu_var(local_softirq_runner) == current;
	preempt_enable();
	return res;
}

Which will thus also fail, resulting in the above wreckage.

The 'somewhat' ugly solution is to open-code the preempt_count() test
in rcu_read_unlock_special().

Also, we're not at all sure how ->rcu_read_unlock_special gets set
here... so this is very likely a bandaid and more thought is required.

Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Change-Id: Ia38183c89fe4e830276c3f873f86c1deba3c03a9
Signed-off-by: Tashfin Shakeer Rhythm <tashfinshakeerrhythm@gmail.com>
Signed-off-by: TogoFire <togofire@mailfence.com>
TogoFire pushed a commit that referenced this issue Sep 5, 2023
We don't need to hold the local pinctrl lock here to set irq wake on the
summary irq line. Doing so only leads to lockdep warnings instead of
protecting us from anything. Remove the locking.

 WARNING: possible circular locking dependency detected
 5.4.11 #2 Tainted: G        W
 ------------------------------------------------------
 cat/3083 is trying to acquire lock:
 ffffff81f4fa58c0 (&irq_desc_lock_class){-.-.}, at: __irq_get_desc_lock+0x64/0x94

 but task is already holding lock:
 ffffff81f4880c18 (&pctrl->lock){-.-.}, at: msm_gpio_irq_set_wake+0x48/0x7c

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> #1 (&pctrl->lock){-.-.}:
        _raw_spin_lock_irqsave+0x64/0x80
        msm_gpio_irq_ack+0x68/0xf4
        __irq_do_set_handler+0xe0/0x180
        __irq_set_handler+0x60/0x9c
        irq_domain_set_info+0x90/0xb4
        gpiochip_hierarchy_irq_domain_alloc+0x110/0x200
        __irq_domain_alloc_irqs+0x130/0x29c
        irq_create_fwspec_mapping+0x1f0/0x300
        irq_create_of_mapping+0x70/0x98
        of_irq_get+0xa4/0xd4
        spi_drv_probe+0x4c/0xb0
        really_probe+0x138/0x3f0
        driver_probe_device+0x70/0x140
        __device_attach_driver+0x9c/0x110
        bus_for_each_drv+0x88/0xd0
        __device_attach+0xb0/0x160
        device_initial_probe+0x20/0x2c
        bus_probe_device+0x34/0x94
        device_add+0x35c/0x3f0
        spi_add_device+0xbc/0x194
        of_register_spi_devices+0x2c8/0x408
        spi_register_controller+0x57c/0x6fc
        spi_geni_probe+0x260/0x328
        platform_drv_probe+0x90/0xb0
        really_probe+0x138/0x3f0
        driver_probe_device+0x70/0x140
        device_driver_attach+0x4c/0x6c
        __driver_attach+0xcc/0x154
        bus_for_each_dev+0x84/0xcc
        driver_attach+0x2c/0x38
        bus_add_driver+0x108/0x1fc
        driver_register+0x64/0xf8
        __platform_driver_register+0x4c/0x58
        spi_geni_driver_init+0x1c/0x24
        do_one_initcall+0x1a4/0x3e8
        do_initcall_level+0xb4/0xcc
        do_basic_setup+0x30/0x48
        kernel_init_freeable+0x124/0x1a8
        kernel_init+0x14/0x100
        ret_from_fork+0x10/0x18

 -> #0 (&irq_desc_lock_class){-.-.}:
        __lock_acquire+0xeb4/0x2388
        lock_acquire+0x1cc/0x210
        _raw_spin_lock_irqsave+0x64/0x80
        __irq_get_desc_lock+0x64/0x94
        irq_set_irq_wake+0x40/0x144
        msm_gpio_irq_set_wake+0x5c/0x7c
        set_irq_wake_real+0x40/0x5c
        irq_set_irq_wake+0x70/0x144
        cros_ec_rtc_suspend+0x38/0x4c
        platform_pm_suspend+0x34/0x60
        dpm_run_callback+0x64/0xcc
        __device_suspend+0x310/0x41c
        dpm_suspend+0xf8/0x298
        dpm_suspend_start+0x84/0xb4
        suspend_devices_and_enter+0xbc/0x620
        pm_suspend+0x210/0x348
        state_store+0xb0/0x108
        kobj_attr_store+0x14/0x24
        sysfs_kf_write+0x4c/0x64
        kernfs_fop_write+0x15c/0x1fc
        __vfs_write+0x54/0x18c
        vfs_write+0xe4/0x1a4
        ksys_write+0x7c/0xe4
        __arm64_sys_write+0x20/0x2c
        el0_svc_common+0xa8/0x160
        el0_svc_handler+0x7c/0x98
        el0_svc+0x8/0xc

 other info that might help us debug this:

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(&pctrl->lock);
                                lock(&irq_desc_lock_class);
                                lock(&pctrl->lock);
   lock(&irq_desc_lock_class);

  *** DEADLOCK ***

 7 locks held by cat/3083:
  #0: ffffff81f06d1420 (sb_writers#7){.+.+}, at: vfs_write+0xd0/0x1a4
  #1: ffffff81c8935680 (&of->mutex){+.+.}, at: kernfs_fop_write+0x12c/0x1fc
  #2: ffffff81f4c322f0 (kn->count#337){.+.+}, at: kernfs_fop_write+0x134/0x1fc
  #3: ffffffe89a641d60 (system_transition_mutex){+.+.}, at: pm_suspend+0x108/0x348
  #4: ffffff81f190e970 (&dev->mutex){....}, at: __device_suspend+0x168/0x41c
  #5: ffffff81f183d8c0 (lock_class){-.-.}, at: __irq_get_desc_lock+0x64/0x94
  #6: ffffff81f4880c18 (&pctrl->lock){-.-.}, at: msm_gpio_irq_set_wake+0x48/0x7c

 stack backtrace:
 CPU: 4 PID: 3083 Comm: cat Tainted: G        W         5.4.11 #2
 Hardware name: Google Cheza (rev3+) (DT)
 Call trace:
  dump_backtrace+0x0/0x174
  show_stack+0x20/0x2c
  dump_stack+0xc8/0x124
  print_circular_bug+0x2ac/0x2c4
  check_noncircular+0x1a0/0x1a8
  __lock_acquire+0xeb4/0x2388
  lock_acquire+0x1cc/0x210
  _raw_spin_lock_irqsave+0x64/0x80
  __irq_get_desc_lock+0x64/0x94
  irq_set_irq_wake+0x40/0x144
  msm_gpio_irq_set_wake+0x5c/0x7c
  set_irq_wake_real+0x40/0x5c
  irq_set_irq_wake+0x70/0x144
  cros_ec_rtc_suspend+0x38/0x4c
  platform_pm_suspend+0x34/0x60
  dpm_run_callback+0x64/0xcc
  __device_suspend+0x310/0x41c
  dpm_suspend+0xf8/0x298
  dpm_suspend_start+0x84/0xb4
  suspend_devices_and_enter+0xbc/0x620
  pm_suspend+0x210/0x348
  state_store+0xb0/0x108
  kobj_attr_store+0x14/0x24
  sysfs_kf_write+0x4c/0x64
  kernfs_fop_write+0x15c/0x1fc
  __vfs_write+0x54/0x18c
  vfs_write+0xe4/0x1a4
  ksys_write+0x7c/0xe4
  __arm64_sys_write+0x20/0x2c
  el0_svc_common+0xa8/0x160
  el0_svc_handler+0x7c/0x98
  el0_svc+0x8/0xc

Fixes: 6aced33 ("pinctrl: msm: drop wake_irqs bitmap")
Cc: Douglas Anderson <dianders@chromium.org>
Cc: Brian Masney <masneyb@onstation.org>
Cc: Lina Iyer <ilina@codeaurora.org>
Cc: Maulik Shah <mkshah@codeaurora.org>
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Change-Id: I5813f9d3dadb55d22aa2d571654ff61f2785d82b
Link: https://lore.kernel.org/r/20200121180950.36959-1-swboyd@chromium.org
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ahmad Thoriq Najahi <najahi@chips-projects.xyz>
Signed-off-by: Tashfin Shakeer Rhythm <tashfinshakeerrhythm@gmail.com>
Signed-off-by: TogoFire <togofire@mailfence.com>
TogoFire pushed a commit that referenced this issue Sep 5, 2023
On a system that often re-configures a USB gadget device, the kernel log
is filled with:

  configfs-gadget gadget: high-speed config #1: c

Reduce the verbosity of this print to debug.

Signed-off-by: Joel Stanley <joel@jms.id.au>
Change-Id: I26184a31a6728f9395853ff84e5732a764b87906
Signed-off-by: TogoFire <togofire@mailfence.com>
TogoFire pushed a commit that referenced this issue Sep 5, 2023
A recent clang 16 update has introduced segfault when compiling kernel
using the `-polly-invariant-load-hoisting` flag and breaks compilation
when kernel is compiled with full LTO.
This is due to the flag being ignorant about the errors during SCoP
verification and does not take the errors into account as per a recent
issue opened at [LLVM], causing polly to do segfault and the compiler
to print the following backtrace:

PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.	Program arguments: clang -Wp,-MD,drivers/md/.md.o.d -nostdinc -isystem /tmp/cirrus-ci-build/toolchains/clang/lib/clang/16.0.0/include -I../arch/arm64/include -I./arch/arm64/include/generated -I../include -I./include -I../arch/arm64/include/uapi -I./arch/arm64/include/generated/uapi -I../include/uapi -I./include/generated/uapi -include ../include/linux/kconfig.h -include ../include/linux/compiler_types.h -I../drivers/md -Idrivers/md -D__KERNEL__ -mlittle-endian -DKASAN_SHADOW_SCALE_SHIFT=3 -Qunused-arguments -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -fshort-wchar -Werror-implicit-function-declaration -Werror=return-type -Wno-format-security -std=gnu89 --target=aarch64-linux-gnu --prefix=/tmp/cirrus-ci-build/Kernel/../toolchains/clang/bin/aarch64-linux-gnu- --gcc-toolchain=/tmp/cirrus-ci-build/toolchains/clang -Wno-misleading-indentation -Wno-bool-operation -Werror=unknown-warning-option -Wno-unsequenced -opaque-pointers -fno-PIE -mgeneral-regs-only -DCONFIG_AS_LSE=1 -fno-asynchronous-unwind-tables -Wno-psabi -DKASAN_SHADOW_SCALE_SHIFT=3 -fno-delete-null-pointer-checks -Wno-frame-address -Wno-int-in-bool-context -Wno-address-of-packed-member -O3 -march=armv8.1-a+crypto+fp16+rcpc -mtune=cortex-a53 -mllvm -polly -mllvm -polly-ast-use-context -mllvm -polly-detect-keep-going -mllvm -polly-invariant-load-hoisting -mllvm -polly-run-inliner -mllvm -polly-vectorizer=stripmine -mllvm -polly-loopfusion-greedy=1 -mllvm -polly-reschedule=1 -mllvm -polly-postopts=1 -fstack-protector-strong --target=aarch64-linux-gnu --gcc-toolchain=/tmp/cirrus-ci-build/toolchains/clang -meabi gnu -Wno-format-invalid-specifier -Wno-gnu -Wno-duplicate-decl-specifier -Wno-asm-operand-widths -Wno-initializer-overrides -Wno-tautological-constant-out-of-range-compare -Wno-tautological-compare -mno-global-merge -Wno-void-ptr-dereference -Wno-unused-but-set-variable -Wno-unused-const-variable -fno-omit-frame-pointer -fno-optimize-sibling-calls -Wvla -flto -fwhole-program-vtables -fvisibility=hidden -Wdeclaration-after-statement -Wno-pointer-sign -Wno-array-bounds -fno-strict-overflow -fno-stack-check -Werror=implicit-int -Werror=strict-prototypes -Werror=date-time -Werror=incompatible-pointer-types -fmacro-prefix-map=../= -Wno-initializer-overrides -Wno-unused-value -Wno-format -Wno-sign-compare -Wno-format-zero-length -Wno-uninitialized -Wno-pointer-to-enum-cast -Wno-unaligned-access -DKBUILD_BASENAME=\"md\" -DKBUILD_MODNAME=\"md_mod\" -D__KBUILD_MODNAME=kmod_md_mod -c -o drivers/md/md.o ../drivers/md/md.c
1.	<eof> parser at end of file
2.	Optimizer
  CC      drivers/media/platform/msm/camera_v2/camera/camera.o
  AR      drivers/media/pci/intel/ipu3/built-in.a
  CC      drivers/md/dm-linear.o
 #0 0x0000559d3527073f (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x3a7073f)
 #1 0x0000559d352705bf (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x3a705bf)
 #2 0x0000559d3523b198 (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x3a3b198)
 #3 0x0000559d3523b33e (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x3a3b33e)
 #4 0x00007f339dc3ea00 (/usr/lib/libc.so.6+0x38a00)
 #5 0x0000559d35affccf (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x42ffccf)
 #6 0x0000559d35b01710 (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x4301710)
 #7 0x0000559d35b01a12 (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x4301a12)
 #8 0x0000559d35b09a9e (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x4309a9e)
 #9 0x0000559d35b14707 (/tmp/cirrus-ci-build/toolchains/clang/bin/clang-16+0x4314707)
clang-16: error: clang frontend command failed with exit code 139 (use -v to see invocation)
Neutron clang version 16.0.0 (https://github.com/llvm/llvm-project.git 598f5275c16049b1e1b5bc934cbde447a82d485e)
Target: aarch64-unknown-linux-gnu
Thread model: posix
InstalledDir: /tmp/cirrus-ci-build/Kernel/../toolchains/clang/bin

From the very nature of `-polly-detect-keep-going`, it can be concluded
that this is a potentially unsafe flag and instead of using it conditionally
for CLANG < 16, just remove it all together for all. This should allow polly
to run SCoP verifications as intended.

Issue [LLVM]: llvm/llvm-project#58484 (comment)

Signed-off-by: Tashfin Shakeer Rhythm <tashfinshakeerrhythm@gmail.com>
Change-Id: I8fcd5fa757794a627fd8148935b3426a0d1882ea
Signed-off-by: TogoFire <togofire@mailfence.com>
TogoFire pushed a commit that referenced this issue Sep 5, 2023
Qos requirement is not initialized when update is called.

This fixes the following call trace:
[...]
[   27.206894] msm_pm_qos_update_request: update request 100
[   27.206900] msm_pm_qos_add_request: add request
[   27.206904] pm_qos_update_request() called for unknown object
[   27.206915] ------------[ cut here ]------------
[   27.206928] WARNING: CPU: 5 PID: 1107 at kernel/power/qos.c:696 pm_qos_update_request+0x4c/0x60
[   27.206935] CPU: 5 PID: 1107 Comm: qvrservice Tainted: G S              4.14.185-Chips #1
[   27.206937] Hardware name: Qualcomm Technologies, Inc. trinket pm6125 + pmi632 IDP (DT)
[   27.206939] task: 00000000ffaf085c task.stack: 000000007a0f23be
[   27.206942] pc : pm_qos_update_request+0x4c/0x60
[   27.206943] lr : pm_qos_update_request+0x4c/0x60
[...]

Signed-off-by: Ahmad Thoriq Najahi <najahi@chips-projects.xyz>
Change-Id: If27fd02e9d0cfc2c20f8c3d094d0f02757de8e0b
Signed-off-by: TogoFire <togofire@mailfence.com>
TogoFire pushed a commit that referenced this issue Sep 5, 2023
Qos requirement is not initialized when update is called.

This fixes the following call trace:
[...]
[   27.206894] msm_pm_qos_update_request: update request 100
[   27.206900] msm_pm_qos_add_request: add request
[   27.206904] pm_qos_update_request() called for unknown object
[   27.206915] ------------[ cut here ]------------
[   27.206928] WARNING: CPU: 5 PID: 1107 at kernel/power/qos.c:696 pm_qos_update_request+0x4c/0x60
[   27.206935] CPU: 5 PID: 1107 Comm: qvrservice Tainted: G S              4.14.185-Chips #1
[   27.206937] Hardware name: Qualcomm Technologies, Inc. trinket pm6125 + pmi632 IDP (DT)
[   27.206939] task: 00000000ffaf085c task.stack: 000000007a0f23be
[   27.206942] pc : pm_qos_update_request+0x4c/0x60
[   27.206943] lr : pm_qos_update_request+0x4c/0x60
[...]

Signed-off-by: Ahmad Thoriq Najahi <najahi@chips-projects.xyz>
Change-Id: Ie0e0efbfa97d4aacd7d1b184a106b278878fc41e
Signed-off-by: TogoFire <togofire@mailfence.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants