Unaligned access by llockd in ext4_delete_entry() #7

Closed
abrodkin opened this issue Feb 7, 2019 · 8 comments

@abrodkin

abrodkin commented Feb 7, 2019

Boot log:

[    0.000000] Linux version 4.19.14-yocto-standard (oe-user@oe-host) (gcc version 8.2.1 20180814 (GCC)) #1 SMP PREEMPT Thu Feb 7 12:43:04 UTC 2019

...

[    4.151331] Misaligned Access
[    4.155771] Path: /bin/busybox.nosuid
[    4.159419] CPU: 3 PID: 174 Comm: rm Not tainted 4.19.14-yocto-standard #1
[    4.166274]
[    4.166274] [ECR   ]: 0x000d0000 => Check Programmer's Manual
[    4.173551] [EFA   ]: 0xbeaec3fc
[    4.173551] [BLINK ]: ext4_delete_entry+0xce/0x224
[    4.173551] [ERET  ]: ext4_delete_entry+0x176/0x224
[    4.186363] [STAT32]: 0x80080002 : IE K
[    4.190614] BTA: 0x9024795a   SP: 0xbe375ec4  FP: 0x00000000
[    4.196194] LPS: 0x9074b214  LPE: 0x9074b218 LPC: 0x00000000
[    4.201759] r00: 0x00000000  r01: 0x0000090d r02: 0x00000001
[    4.201759] r03: 0x00000000  r04: 0x00000000 r05: 0xbea8ecb0
[    4.201759] r06: 0xbeaec3fc  r07: 0x00000400 r08: 0x00000002
[    4.201759] r09: 0x00000000  r10: 0x000002b4 r11: 0xbeaec32c
[    4.201759] r12: 0x9024795a  r13: 0x9004e574 r14: 0x0008e150
[    4.201759] r15: 0x00098a68  r16: 0x0008cbec r17: 0x00097fe4
[    4.201759] r18: 0x00097fe4  r19: 0x0008e150 r20: 0x0008f0f8
[    4.201759] r21: 0x000000ae  r22: 0x0008f0f8 r23: 0x00000000
[    4.201759] r24: 0x00000000  r25: 0x00000000

Disassembly of problematic code:

90247a02:       222f 1192               llockd  r10,[r6]
90247a06:       0a13 1081               brne.nt r10,r2,18       ;90247a16 <ext4_delete_entry+0x18a>
90247a0a:       0b0f 10c1               brne.nt r11,r3,14       ;90247a16 <ext4_delete_entry+0x18a>

Note that this kernel version (v4.19.14, as well as v4.19.20, the latest in the 4.19.y series) doesn't include my patch that fixes the Etnaviv GPU, see torvalds@a66d972.

@abrodkin

abrodkin commented Feb 7, 2019

Hm, with vanilla Linux v4.19.19 and an initramfs I cannot reproduce that problem.

@abrodkin

abrodkin commented Feb 7, 2019

Applying the mentioned torvalds@a66d972 made no difference, which means this is not a statically allocated atomic64_t, and we need to look into it now.

@vineetgarc

Do note that LLOCKD by default needs data to be 64-bit aligned. I'm checking with the hw folks whether that restriction holds true even when AD is enabled.

@vineetgarc

Both LLOCK and EX transactions need to be aligned regardless of the AD bit, i.e.:
LLOCK: 32-bit aligned
LLOCKD: 64-bit aligned

@vineetgarc

In your case above, r6 is 0xbeaec3fc, so it is not 64-bit aligned!
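The check can be done by hand on the low address bits. A minimal sketch in C, assuming nothing beyond the EFA value from the log above; the helper name `is_aligned` and the macro `EFA` are made up for illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Faulting address (EFA, same value as r6) taken from the log above. */
#define EFA 0xbeaec3fcu

/* True when addr is a multiple of align; align must be a power of two. */
static inline bool is_aligned(uint32_t addr, uint32_t align)
{
    return (addr & (align - 1)) == 0;
}

/* Illustration only: the address would pass for LLOCK but not for LLOCKD. */
static inline void check_efa(void)
{
    assert(is_aligned(EFA, 4));   /* 32-bit aligned: low two bits are 0 */
    assert(!is_aligned(EFA, 8));  /* not 64-bit aligned: 0xfc & 7 == 4 */
}
```

The low byte 0xfc is 0b11111100, so the address is a multiple of 4 but leaves a remainder of 4 modulo 8.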

@abrodkin

abrodkin commented Feb 8, 2019

@vineetgarc we have known all that since [1], which ended up with torvalds@a66d972.

So the reason for the "Misaligned Access" is clear; what's not clear is:
1. Which atomic64_t causes this new failure
2. How to solve this once (1) above is done

Unfortunately my patch for devm_xxx() doesn't help here.

[1] http://lists.infradead.org/pipermail/linux-snps-arc/2018-July/004009.html

@abrodkin

abrodkin commented Feb 8, 2019

So the problematic atomic is inode->i_version, see https://elixir.bootlin.com/linux/v4.19.14/source/include/linux/fs.h#L656

And the failure happens in atomic64_cmpxchg(), see https://elixir.bootlin.com/linux/v4.19.14/source/include/linux/iversion.h#L198

Stack Trace:
  atomic64_cmpxchg
  inode_maybe_inc_iversion
  inode_inc_iversion
  ext4_generic_delete_entry
  ext4_delete_entry
  ext4_rmdir
  vfs_rmdir
  do_rmdir
  EV_Trap
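For context, the cmpxchg at the top of that trace sits in a retry loop. Below is a user-space sketch in C11 atomics, loosely modelled on my reading of inode_maybe_inc_iversion() in v4.19; it is not the kernel code. The names `QUERIED_FLAG`, `VERSION_STEP` and `maybe_inc_version` are hypothetical stand-ins. The 64-bit compare-and-exchange in the loop is the operation that ends up as the llockd seen in the disassembly, which is why the counter has to be 64-bit aligned.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define QUERIED_FLAG 1ull  /* hypothetical: lowest bit marks "value was queried" */
#define VERSION_STEP 2ull  /* hypothetical: step by 2 so the flag bit is skipped */

/* Bump the counter if it was queried (or if forced); returns true when bumped. */
static inline bool maybe_inc_version(_Atomic uint64_t *v, bool force)
{
    uint64_t cur = atomic_load(v);

    for (;;) {
        /* Nobody looked at the current value, so there is nothing to do. */
        if (!force && !(cur & QUERIED_FLAG))
            return false;

        uint64_t next = (cur & ~QUERIED_FLAG) + VERSION_STEP;

        /* 64-bit compare-and-exchange; on failure cur is reloaded for us. */
        if (atomic_compare_exchange_weak(v, &cur, next))
            return true;
    }
}
```

Starting from 5 (flag set), one call stores 6; a second unforced call leaves 6 untouched because the flag is now clear.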

What's worse, the obvious "fix" doesn't help:

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 7b6084854bfe..d1daa09c3bc6 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -653,7 +653,7 @@ struct inode {
                struct hlist_head       i_dentry;
                struct rcu_head         i_rcu;
        };
-       atomic64_t              i_version;
+       atomic64_t              i_version __aligned(sizeof(atomic64_t));
        atomic_t                i_count;
        atomic_t                i_dio_count;
        atomic_t                i_writecount;

We still get:

[    4.015732] EXT4-fs (mmcblk0p2): re-mounted. Opts: (null)
[    4.167881]
[    4.167881] Misaligned Access
[    4.172356] Path: /bin/busybox.nosuid
[    4.176004] CPU: 2 PID: 171 Comm: rm Not tainted 4.19.14-yocto-standard #1
[    4.182851]
[    4.182851] [ECR   ]: 0x000d0000 => Check Programmer's Manual
[    4.190061] [EFA   ]: 0xbeaec3fc
[    4.190061] [BLINK ]: ext4_delete_entry+0x210/0x234
[    4.190061] [ERET  ]: ext4_delete_entry+0x13e/0x234
[    4.202985] [STAT32]: 0x80080002 : IE K
[    4.207236] BTA: 0x9009329c   SP: 0xbe5b1ec4  FP: 0x00000000
[    4.212790] LPS: 0x9074b118  LPE: 0x9074b120 LPC: 0x00000000
[    4.218348] r00: 0x00000040  r01: 0x00000021 r02: 0x00000001
[    4.218348] r03: 0x00000000  r04: 0x00000002 r05: 0x00000000
[    4.218348] r06: 0x000000c6  r07: 0x00000000 r08: 0x9050f140
[    4.218348] r09: 0x000000c6  r10: 0x0000000a r11: 0x00000000
[    4.218348] r12: 0x90247a9c  r13: 0x9004e574 r14: 0x0008e150
[    4.218348] r15: 0x000989b8  r16: 0x0008cbec r17: 0x0009806c
[    4.218348] r18: 0x0009806c  r19: 0x0008e150 r20: 0x0008f0f8
[    4.218348] r21: 0x000000ab  r22: 0x0008f0f8 r23: 0x00000000
[    4.218348] r24: 0x00000000  r25: 0x00000000
[    4.218348]
[    4.218348]
[    4.270510]
[    4.270510] Stack Trace:
[    4.274510]   ext4_delete_entry+0x13e/0x234
[    4.278695]   ext4_rmdir+0xe0/0x238
[    4.282187]   vfs_rmdir+0x50/0xf0
[    4.285492]   do_rmdir+0x9e/0x154
[    4.288802]   EV_Trap+0x110/0x114

@abrodkin

abrodkin commented Feb 8, 2019

The culprit was the slab allocator used for inodes.
Even though the default ARCH_SLAB_MINALIGN is set in quite a sensible way:

#ifndef ARCH_SLAB_MINALIGN
#define ARCH_SLAB_MINALIGN __alignof__(unsigned long long)
#endif

see https://elixir.bootlin.com/linux/latest/source/include/linux/slab.h#L213

The problem is that on ARC __alignof__(unsigned long long) = 4!

The solution is then as simple as defining ARCH_SLAB_MINALIGN = 8 for ARC, see http://lists.infradead.org/pipermail/linux-snps-arc/2019-February/005423.html
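This also suggests why the member-level __aligned() attempt could not work on its own: the attribute only fixes the member's offset inside the struct, while the absolute address also depends on where the allocator places the object. A rough sketch in plain C, assuming a made-up `struct demo_inode` (not the real struct inode) and made-up helpers `round_up_to` and `version_addr`:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an inode: a 64-bit counter after a 32-bit field. */
struct demo_inode {
    uint32_t flags;
    uint64_t version __attribute__((aligned(8)));
};

/* Round base up to the next multiple of minalign (a power of two). */
static inline uintptr_t round_up_to(uintptr_t base, uintptr_t minalign)
{
    return (base + minalign - 1) & ~(minalign - 1);
}

/* Absolute address of the 64-bit member for an object placed at base. */
static inline uintptr_t version_addr(uintptr_t base)
{
    return base + offsetof(struct demo_inode, version);
}
```

With an object base of 0x1004 (only 4-byte aligned) the member lands at 0x100c, which is still misaligned for a 64-bit access even though its offset is a multiple of 8. Once the base is rounded up to 8, as a minimum slab alignment of 8 would guarantee, the member lands at 0x1010.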

@abrodkin abrodkin closed this as completed Feb 8, 2019
shahab-vahedi pushed a commit that referenced this issue Nov 23, 2022
As guest_irq is coming from KVM_IRQFD API call, it may trigger
crash in svm_update_pi_irte() due to out-of-bounds:

crash> bt
PID: 22218  TASK: ffff951a6ad74980  CPU: 73  COMMAND: "vcpu8"
 #0 [ffffb1ba6707fa40] machine_kexec at ffffffff8565b397
 #1 [ffffb1ba6707fa90] __crash_kexec at ffffffff85788a6d
 #2 [ffffb1ba6707fb58] crash_kexec at ffffffff8578995d
 #3 [ffffb1ba6707fb70] oops_end at ffffffff85623c0d
 #4 [ffffb1ba6707fb90] no_context at ffffffff856692c9
 #5 [ffffb1ba6707fbf8] exc_page_fault at ffffffff85f95b51
 #6 [ffffb1ba6707fc50] asm_exc_page_fault at ffffffff86000ace
    [exception RIP: svm_update_pi_irte+227]
    RIP: ffffffffc0761b53  RSP: ffffb1ba6707fd08  RFLAGS: 00010086
    RAX: ffffb1ba6707fd78  RBX: ffffb1ba66d91000  RCX: 0000000000000001
    RDX: 00003c803f63f1c0  RSI: 000000000000019a  RDI: ffffb1ba66db2ab8
    RBP: 000000000000019a   R8: 0000000000000040   R9: ffff94ca41b82200
    R10: ffffffffffffffcf  R11: 0000000000000001  R12: 0000000000000001
    R13: 0000000000000001  R14: ffffffffffffffcf  R15: 000000000000005f
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #7 [ffffb1ba6707fdb8] kvm_irq_routing_update at ffffffffc09f19a1 [kvm]
 #8 [ffffb1ba6707fde0] kvm_set_irq_routing at ffffffffc09f2133 [kvm]
 #9 [ffffb1ba6707fe18] kvm_vm_ioctl at ffffffffc09ef544 [kvm]
    RIP: 00007f143c36488b  RSP: 00007f143a4e04b8  RFLAGS: 00000246
    RAX: ffffffffffffffda  RBX: 00007f05780041d0  RCX: 00007f143c36488b
    RDX: 00007f05780041d0  RSI: 000000004008ae6a  RDI: 0000000000000020
    RBP: 00000000000004e8   R8: 0000000000000008   R9: 00007f05780041e0
    R10: 00007f0578004560  R11: 0000000000000246  R12: 00000000000004e0
    R13: 000000000000001a  R14: 00007f1424001c60  R15: 00007f0578003bc0
    ORIG_RAX: 0000000000000010  CS: 0033  SS: 002b

VMX has already fixed this in commit 3a8b067 (KVM: VMX: Do not BUG() on
out-of-bounds guest IRQ), so we can just copy the source from that to fix
this.

Co-developed-by: Yi Liu <liu.yi24@zte.com.cn>
Signed-off-by: Yi Liu <liu.yi24@zte.com.cn>
Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
Message-Id: <20220309113025.44469-1-wang.yi59@zte.com.cn>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>