Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Device removal of indirect vdev panics the kernel #9327

Merged
merged 1 commit into from
Sep 16, 2019

Conversation

loli10K
Copy link
Contributor

@loli10K loli10K commented Sep 15, 2019

Motivation and Context

I was trying to remove a vdev by its guid, picked wrong value from zdb -P | grep guid output (indirect vdev instead of file vdev):

root@linux:~# zdb -P
testpool:
    version: 5000
    name: 'testpool'
    state: 0
    txg: 32
    pool_guid: 17643256605659644812
    errata: 0
    hostname: 'linux'
    com.delphix:has_per_vdev_zaps
    vdev_children: 3
    vdev_tree:
        type: 'root'
        id: 0
        guid: 17643256605659644812
        create_txg: 4
        children[0]:
            type: 'indirect'
            id: 0
            guid: 15157881756569472318
            whole_disk: 0
            metaslab_array: 0
            metaslab_shift: 26
            ashift: 9
            asize: 1069023232
            is_log: 0
            com.delphix:indirect_object: 389
            com.delphix:indirect_births: 391
            create_txg: 4
            com.delphix:vdev_zap_top: 130
        children[1]:
            type: 'file'
            id: 1
            guid: 7707596351290300199
            path: '/var/tmp/2'
            metaslab_array: 143
            metaslab_shift: 26
            ashift: 9
            asize: 1069023232
            is_log: 0
            create_txg: 4
            com.delphix:vdev_zap_leaf: 131
            com.delphix:vdev_zap_top: 132
        children[2]:
            type: 'file'
            id: 2
            guid: 1989233471868155580
            path: '/var/tmp/3'
            metaslab_array: 135
            metaslab_shift: 26
            ashift: 9
            asize: 1069023232
            is_log: 0
            create_txg: 4
            com.delphix:vdev_zap_leaf: 133
            com.delphix:vdev_zap_top: 134
    features_for_read:
        com.delphix:hole_birth
        com.delphix:embedded_data
        com.delphix:device_removal
root@linux:~# zdb -P | grep guid
    pool_guid: 17643256605659644812
        guid: 17643256605659644812
            guid: 15157881756569472318
            guid: 7707596351290300199
            guid: 1989233471868155580
root@linux:~# 
root@linux:~# zpool remove testpool 15157881756569472318
[30933.313207] BUG: unable to handle kernel NULL pointer dereference at 00000000000000a8
[30933.317686] IP: [<ffffffffc030e004>] spa_vdev_remove_top_check+0xc4/0x440 [zfs]
[30933.321679] PGD 3653e067 PUD b822c067 PMD 0 
[30933.352991] Oops: 0000 [#1] SMP 
[30933.355106] Dumping ftrace buffer:
[30933.357215]    (ftrace buffer empty)
Thread 180 received signal SIGSEGV, Segmentation fault.
0xffffffffc030e004 in spa_vdev_remove_top_check (vd=0xffff880139ca4000) at /usr/src/zfs/module/zfs/vdev_removal.c:1972
1972		metaslab_class_t *mc = vd->vdev_mg->mg_class;
(gdb) bt
#0  0xffffffffc030e004 in spa_vdev_remove_top_check (vd=0xffff880139ca4000) at /usr/src/zfs/module/zfs/vdev_removal.c:1972
#1  0xffffffffc030e3ad in spa_vdev_remove_top (vd=0xffff880139ca4000, txg=0xffff8800b8993d88) at /usr/src/zfs/module/zfs/vdev_removal.c:2073
#2  0xffffffffc030e6c5 in spa_vdev_remove (spa=0xffff8800b8630000, guid=15157881756569472318, unspare=<optimized out>) at /usr/src/zfs/module/zfs/vdev_removal.c:2226
#3  0xffffffffc0331a38 in zfs_ioc_vdev_remove (zc=0xffff8800b8e90000) at /usr/src/zfs/module/zfs/zfs_ioctl.c:1939
#4  0xffffffffc033d6a5 in zfsdev_ioctl (filp=<optimized out>, cmd=<optimized out>, arg=<optimized out>) at /usr/src/zfs/module/zfs/zfs_ioctl.c:7502
#5  0xffffffff81202d74 in vfs_ioctl (arg=<optimized out>, cmd=<optimized out>, filp=<optimized out>) at fs/ioctl.c:43
#6  do_vfs_ioctl (filp=0xffff8800ba86f100, fd=<optimized out>, cmd=<optimized out>, arg=140725645557792) at fs/ioctl.c:607
#7  0xffffffff81202fc9 in SYSC_ioctl (arg=<optimized out>, cmd=<optimized out>, fd=<optimized out>) at fs/ioctl.c:622
#8  SyS_ioctl (fd=3, cmd=23052, arg=140725645557792) at fs/ioctl.c:613
#9  0xffffffff818096f6 in entry_SYSCALL_64 () at arch/x86/entry/entry_64.S:185
#10 0x00007f4c82c5a678 in ?? ()
#11 0x0000000000040000 in ?? ()
#12 0x0000000000041000 in ?? ()
#13 0x00007f4c840cb010 in ?? ()
#14 0x0000000000040010 in ?? ()
#15 0x00007f4c82c5a620 in ?? ()
#16 0x0000000000000246 in irq_stack_union ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(gdb) p *vd->vdev_ops
$2 = {vdev_op_open = 0xffffffffc02f0d00 <vdev_indirect_open>, vdev_op_close = 0xffffffffc02f0cf0 <vdev_indirect_close>, vdev_op_asize = 0xffffffffc02e3690 <vdev_default_asize>, 
  vdev_op_io_start = 0xffffffffc02f4610 <vdev_indirect_io_start>, vdev_op_io_done = 0xffffffffc02f3450 <vdev_indirect_io_done>, vdev_op_state_change = 0x0 <irq_stack_union>, 
  vdev_op_need_resilver = 0x0 <irq_stack_union>, vdev_op_hold = 0x0 <irq_stack_union>, vdev_op_rele = 0x0 <irq_stack_union>, vdev_op_remap = 0xffffffffc02f3fa0 <vdev_indirect_remap>, 
  vdev_op_xlate = 0x0 <irq_stack_union>, vdev_op_type = "indirect\000\000\000\000\000\000\000", vdev_op_leaf = B_FALSE}
(gdb) p vd->vdev_guid
$3 = 15157881756569472318
(gdb) p vd->vdev_mg
$4 = (metaslab_group_t *) 0x0 <irq_stack_union>
(gdb) p vd->vdev_mg->mg_class
Cannot access memory at address 0xa8
(gdb) 

Description

This change prevents zpool removal for indirect (and other non-concrete) vdevs.

How Has This Been Tested?

Reproducer added as ZTS test case.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (a change to man pages or other documentation)

Checklist:

This commit fixes a NULL pointer dereference triggered in
spa_vdev_remove_top_check() by trying to "zpool remove" an indirect
vdev.

Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
@loli10K loli10K added the Status: Code Review Needed Ready for review and testing label Sep 15, 2019
@codecov
Copy link

codecov bot commented Sep 16, 2019

Codecov Report

Merging #9327 into master will increase coverage by 0.04%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #9327      +/-   ##
==========================================
+ Coverage   79.02%   79.06%   +0.04%     
==========================================
  Files         401      401              
  Lines      122469   122471       +2     
==========================================
+ Hits        96777    96829      +52     
+ Misses      25692    25642      -50
Flag Coverage Δ
#kernel 79.84% <100%> (+0.05%) ⬆️
#user 66.91% <50%> (+0.4%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2a0d418...1ad67d5. Read the comment docs.

@ahrens
Copy link
Member

ahrens commented Sep 16, 2019

Totally forgot that you could specify a vdev by its GUID. Good catch and thanks for fixing this!

@behlendorf behlendorf added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Code Review Needed Ready for review and testing labels Sep 16, 2019
@behlendorf behlendorf merged commit fcd37b6 into openzfs:master Sep 16, 2019
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Dec 24, 2019
This commit fixes a NULL pointer dereference triggered in
spa_vdev_remove_top_check() by trying to "zpool remove" an indirect
vdev.

Reviewed-by: Matt Ahrens <matt@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes openzfs#9327
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Dec 27, 2019
This commit fixes a NULL pointer dereference triggered in
spa_vdev_remove_top_check() by trying to "zpool remove" an indirect
vdev.

Reviewed-by: Matt Ahrens <matt@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes openzfs#9327
tonyhutter pushed a commit that referenced this pull request Jan 23, 2020
This commit fixes a NULL pointer dereference triggered in
spa_vdev_remove_top_check() by trying to "zpool remove" an indirect
vdev.

Reviewed-by: Matt Ahrens <matt@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #9327
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Accepted Ready to integrate (reviewed, tested)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants