WIP: zvol: Fix zvol_misc VERIFY(zh->zh_claim_txg == 0) #14879
Darn it... looks like it is still failing on buildbot...
It's probably a good idea to drop this optimization either way since it seems like it's assuming a little too much.
5fc370c
to
964ed30
Compare
@behlendorf yea, there may actually be two issues afoot here. The crashes that were fixed by the commit, and the So don't pull this in yet.
964ed30
to
e5beb37
Compare
We have recently been seeing a lot of zvol_misc test failures when blk-mq was enabled on F38 and CentOS 9 (openzfs#14872). The failures look to be caused by kernel memory corruption.

This fix removes a slightly dubious optimization in zfs_uiomove_bvec_rq() that saved the iterator contents of a rq_for_each_segment(). This optimization allowed restoring the "saved state" from a previous rq_for_each_segment() call on the same uio so that you wouldn't need to iterate through each bvec on every zfs_uiomove_bvec_rq() call. However, if the kernel is manipulating the requests/bios/bvecs under the covers between zfs_uiomove_bvec_rq() calls, then it could result in corruption from using the "saved state".

Fixes: openzfs#14872
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Requires-builders: fedora38
e5beb37
to
85f2e36
Compare
This reverts commit e197bb2.
f911bbf
to
6f432ec
Compare
@@ -1134,6 +1170,7 @@
if (zh->zh_claim_txg == 0 && !BP_IS_HOLE(&zh->zh_log)) {
(void) zil_parse(zilog, zil_claim_log_block,
zil_claim_log_record, tx, first_txg, B_FALSE);
zil_log(zilog, "%s: setting %p to %d\n", __func__, zh, first_txg);
Check failure
Code scanning / CodeQL
Wrong type of arguments to formatting function
This should not have been dismissed. It is an actual bug that will cause us to print the upper 32 bits on big-endian machines. Also, high txg numbers (2^31 and up) will not be printed correctly on any machine.
We should use %llu and typecast to (u_longlong_t) here like we do elsewhere in the code.
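As a rough userspace illustration of the suggestion above (not the actual patch), the sketch below formats a 64-bit txg with %llu and an explicit cast. The names demo_u_longlong_t and demo_format_txg() are hypothetical; the typedef stands in for the u_longlong_t type mentioned in the comment and is assumed to be unsigned long long.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for u_longlong_t (assumed: unsigned long long). */
typedef unsigned long long demo_u_longlong_t;

/*
 * Format a 64-bit txg into buf using %llu plus an explicit cast, so the
 * argument type matches the conversion specifier on every platform.
 * Returns whatever snprintf() returns.
 */
int
demo_format_txg(char *buf, size_t len, uint64_t txg)
{
	return (snprintf(buf, len, "txg %llu", (demo_u_longlong_t)txg));
}
```

With this pattern a value such as 2^31 is formatted in full, whereas passing a uint64_t to %d is undefined behavior and in practice tends to truncate or pick up the wrong half of the value.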
@ryao I'm just doing some "debug via buildbot" testing right now. This is just instrumentation that will not be checked in. I'll mark this PR as a WIP for now.
Carry on then. :)
1d21963
to
1a65401
Compare
8b1f67d
to
41866b7
Compare
41866b7
to
100ffed
Compare
#ifdef _KERNEL
zil_log(os->os_zil, "%s: beginn %p txg %llu, txg os_zil_header %llu, os->os_phys->os_zil_header %p", __func__, os->os_zil->zl_header, os->os_zil->zl_header->zh_claim_txg,
os->os_phys->os_zil_header.zh_claim_txg, os->os_phys->os_zil_header);
Check failure
Code scanning / CodeQL
Wrong type of arguments to formatting function
I have a little more information now. The issue is that the zvol's zil header has zh->zh_claim_txg == 103 when it should be 0. This often happens in the zvol_misc_trim test, which does the following: < setup zvol_misc_trim pool >
So it appears the 103 txg was written out to disk first, and we just read it back on import. I'm still digging...
f0beef6
to
032c449
Compare
Signed-off-by: Tony Hutter <hutter2@llnl.gov> Requires-builders: fedora38
032c449
to
14f3491
Compare
I'm not super familiar with this code, but I added in some debug to print the block pointers:
My zil_log_blkptr() function prints the values for os->os_rootbp (below). In this particular test run, I saw "bad" block pointers with
@tonyhutter - what is the state of this PR? It is both marked as
@sempervictus long story short - this PR doesn't fix the problem as I had originally thought. The issue seems to affect the non-blk-mq case as well. I'm not sure this is necessarily data corruption. I started going down the rabbit hole investigating it in #14879 (comment) but it fell off my radar when some higher-priority stuff from work came in. I do see there's been a bunch of zil changes since I put out this PR, so I'd be curious to see if this is still an issue: eda3fcd ZIL: Second attempt to reduce scope of zl_issuer_lock.
This seems outdated. This PR is reported not to fix the issue, while the issue has already been closed by another PR.
Motivation and Context
Fix #14872
Description
We have recently been seeing a lot of zvol_misc test failures when blk-mq was enabled on F38 and CentOS 9 (#14872). The failures look to be caused by kernel memory corruption.
This fix removes a slightly dubious optimization in zfs_uiomove_bvec_rq() that saved the iterator contents of a rq_for_each_segment(). This optimization allowed restoring the "saved state" from a previous rq_for_each_segment() call on the same uio so that you wouldn't need to iterate through each bvec on every zfs_uiomove_bvec_rq() call. However, if the kernel is manipulating the requests/bios/bvecs under the covers between zfs_uiomove_bvec_rq() calls, then it could result in corruption from using the "saved state".
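To make the trade-off concrete, here is a minimal userspace sketch of the "always walk from the start" approach, under the assumption that a request can be modeled as a plain array of segments. struct demo_seg and demo_copy_from() are hypothetical names for illustration; this is not the zfs_uiomove_bvec_rq() code and it does not use the kernel's rq_for_each_segment() iterator.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one segment (bvec) of a request. */
struct demo_seg {
	const char *data;
	size_t len;
};

/*
 * Copy up to n bytes into dst, starting `skip` bytes into the logical
 * concatenation of segs[0..nsegs). Every call walks from the first
 * segment and skips forward, instead of trusting iterator state saved
 * by an earlier call. Returns the number of bytes copied.
 */
size_t
demo_copy_from(const struct demo_seg *segs, size_t nsegs, size_t skip,
    char *dst, size_t n)
{
	size_t copied = 0;

	for (size_t i = 0; i < nsegs && copied < n; i++) {
		/* Segment lies entirely before the requested offset. */
		if (skip >= segs[i].len) {
			skip -= segs[i].len;
			continue;
		}

		size_t off = skip;
		size_t chunk = segs[i].len - off;

		skip = 0;
		if (chunk > n - copied)
			chunk = n - copied;
		memcpy(dst + copied, segs[i].data + off, chunk);
		copied += chunk;
	}
	return (copied);
}
```

The cost is a linear skip over earlier segments on each call; the benefit is that nothing cached from a previous call can go stale if the segment list changes in between.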
How Has This Been Tested?
Ran zvol_misc with debug (asserts enabled) on Fedora 38 and reproduced the crashes in the first 1-3 runs. Re-ran zvol_misc with the fix 10 times and didn't see the issue.