zpool import not working. Hence unable to make zdb work. #14
Comments
I actually thought it was working, unless you found a corner case. One thing that could be worth checking is the "offline" state of the disks; there was an issue where we left disks offline after export. Flipping them online in Disk Management would fix it. |
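For reference, a minimal sketch of flipping a disk back online from an elevated prompt with diskpart, assuming a placeholder disk number 2 (Disk Management does the same thing through the GUI):

C:\>diskpart
DISKPART> list disk
DISKPART> select disk 2
DISKPART> online disk
DISKPART> exit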
Thanks for the prompt response! FYI
Output:
For zdb properties like:
It has the same output:
|
Sorry about the delay, the cstyle commit was a lot of work. I don't have any issues with export and import itself:
But we are tripping over a silly assert for random there at the end. |
Fixed a couple of the asserts. It fails to find the vdev in userland mode; probably needs a bit of patching.
|
It seems you are using WSL2. I tried zpool export then import on a physical and a virtual machine (Windows) and it never worked for me. Eventually I tried the same on a WSL2 machine, and for me it also worked with the same binaries.
If you are using WSL2, you may try the same commands outside WSL2. |
I have WSL2 installed but I'm not using it. I suppose that could be changing things. I am also in "Git for Windows bash", which is built on MinGW, which could also be changing things. I could try from a vanilla CMD shell. |
@datacore-skumar Could you attach the diff of zdb changes? It may help @lundman use/approve/review those changes and help us root cause the issues faster. |
The attached file is the diff of the zdb changes. |
Hi @lundman, were you able to repro the issue on your end that @datacore-skumar is running into? |
I can export the pool just fine in CMD as well, unfortunately. I do not use the pool much, just create and export it, to have something to test zdb against. I noticed the work I did for zdb in here is part of OpenZFS2_git_diff_zdb.txt, and there are a few more things in there too, so really it would be better if OpenZFS2_git_diff_zdb.txt was made into a PR against this repo, so we can discuss it. For example, all the changes to … But otherwise, all the other changes are right on the money, exactly how I would/did change them. |
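For context, a minimal sketch of that create/export/zdb test loop; the pool name and disk are placeholders, and it assumes zdb's standard -e flag for examining an exported pool behaves the same on Windows:

Z:>zpool create test PHYSICALDRIVE2
Z:>zpool export test
Z:>zdb -e test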
Hi @lundman, I'm also not able to import a pool using the 'zpool import' command. I'm using the latest source code.

Z:>zpool.exe create pool-1 PHYSICALDRIVE2 PHYSICALDRIVE1
Z:>zpool.exe status
errors: No known data errors
Z:>zpool.exe export pool-1
Z:>zpool.exe status
Z:>zpool.exe import
  pool: pool-1
Z:> |
Thank you, that is very interesting. It knows there is a pool, but fails to find the vdev. I will try to replicate this here. Do you know if it happens especially for mirrors? |
@lundman We are seeing failures with zpool import even when there is only one vdev in the zpool. |
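For reference, a sketch of the two layouts being compared; pool names and disk numbers are placeholders, and the pools would be created one at a time rather than together:

Z:>zpool create single-test PHYSICALDRIVE1
Z:>zpool export single-test
Z:>zpool import single-test

Z:>zpool create mirror-test mirror PHYSICALDRIVE1 PHYSICALDRIVE2
Z:>zpool export mirror-test
Z:>zpool import mirror-test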
Hi @lundman, Please let us know if you need any specific / further info to help us narrow this down. Happy to collect and share. Hopefully, once we get past this issue, we will be able to get the zdb to work and resubmit the PR for your review. |
Having a hard time replicating the issue; could it be device names? I created two VHDs to see if those would have issues:
So no issues there. |
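lundman's exact steps for creating the VHDs aren't shown; one common way to create and attach test VHDs on Windows is diskpart's vdisk commands (paths and sizes below are placeholders), after which they appear as new PHYSICALDRIVE devices that zpool can use:

C:\>diskpart
DISKPART> create vdisk file="C:\zfs-test1.vhd" maximum=1024 type=expandable
DISKPART> attach vdisk
DISKPART> create vdisk file="C:\zfs-test2.vhd" maximum=1024 type=expandable
DISKPART> attach vdisk
DISKPART> exit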
Interesting. Thanks for trying @lundman. Could you upload a zip file with all your binaries and driver? We will see if we can repro the issue on our machines with your binaries. |
https://www.lundman.net/OpenZFSOnWindows-debug-2.0.0-31-g3ee1b5689-dirty.exe Not signed, so you need TestMode on. |
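For reference, a minimal sketch of turning TestMode (test-signing) on from an elevated prompt; a reboot is required afterwards, and Secure Boot typically has to be disabled for it to take effect:

C:\>bcdedit /set testsigning on
C:\>shutdown /r /t 0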
Thanks @lundman. We were able to repro the issues with your binaries too. Now that the zdb is fixed, we will focus on investigating the zpool import issue. |
Fixed in commit log: |
Under certain loads, the following panic is hit:

panic: page fault
KDB: stack backtrace:
#0 0xffffffff805db025 at kdb_backtrace+0x65
#1 0xffffffff8058e86f at vpanic+0x17f
#2 0xffffffff8058e6e3 at panic+0x43
#3 0xffffffff808adc15 at trap_fatal+0x385
#4 0xffffffff808adc6f at trap_pfault+0x4f
#5 0xffffffff80886da8 at calltrap+0x8
#6 0xffffffff80669186 at vgonel+0x186
#7 0xffffffff80669841 at vgone+0x31
#8 0xffffffff8065806d at vfs_hash_insert+0x26d
#9 0xffffffff81a39069 at sfs_vgetx+0x149
#10 0xffffffff81a39c54 at zfsctl_snapdir_lookup+0x1e4
#11 0xffffffff8065a28c at lookup+0x45c
#12 0xffffffff806594b9 at namei+0x259
#13 0xffffffff80676a33 at kern_statat+0xf3
#14 0xffffffff8067712f at sys_fstatat+0x2f
#15 0xffffffff808ae50c at amd64_syscall+0x10c
#16 0xffffffff808876bb at fast_syscall_common+0xf8

The page fault occurs because vgonel() will call VOP_CLOSE() for active vnodes. For this reason, define vop_close for zfsctl_ops_snapshot. While here, define vop_open for consistency.

After adding the necessary vop, the bug progresses to the following panic:

panic: VERIFY3(vrecycle(vp) == 1) failed (0 == 1)
cpuid = 17
KDB: stack backtrace:
#0 0xffffffff805e29c5 at kdb_backtrace+0x65
#1 0xffffffff8059620f at vpanic+0x17f
#2 0xffffffff81a27f4a at spl_panic+0x3a
#3 0xffffffff81a3a4d0 at zfsctl_snapshot_inactive+0x40
#4 0xffffffff8066fdee at vinactivef+0xde
#5 0xffffffff80670b8a at vgonel+0x1ea
#6 0xffffffff806711e1 at vgone+0x31
#7 0xffffffff8065fa0d at vfs_hash_insert+0x26d
#8 0xffffffff81a39069 at sfs_vgetx+0x149
#9 0xffffffff81a39c54 at zfsctl_snapdir_lookup+0x1e4
#10 0xffffffff80661c2c at lookup+0x45c
#11 0xffffffff80660e59 at namei+0x259
#12 0xffffffff8067e3d3 at kern_statat+0xf3
#13 0xffffffff8067eacf at sys_fstatat+0x2f
#14 0xffffffff808b5ecc at amd64_syscall+0x10c
#15 0xffffffff8088f07b at fast_syscall_common+0xf8

This is caused by a race condition that can occur when allocating a new vnode and adding that vnode to the vfs hash. If the newly created vnode loses the race when being inserted into the vfs hash, it will not be recycled as its usecount is greater than zero, hitting the above assertion. Fix this by dropping the assertion.

FreeBSD-issue: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=252700
Reviewed-by: Andriy Gapon <avg@FreeBSD.org>
Reviewed-by: Mateusz Guzik <mjguzik@gmail.com>
Reviewed-by: Alek Pinchuk <apinchuk@axcient.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Rob Wing <rob.wing@klarasystems.com>
Co-authored-by: Rob Wing <rob.wing@klarasystems.com>
Submitted-by: Klara, Inc.
Sponsored-by: rsync.net
Closes openzfs#14501
I am trying to make zdb work by porting changes from ZFSin.
Looks like zpool import is broken on Windows.
@lundman Is this a known issue, and are you working on it?