Fix issues with raw sends and receive_write_byref() #7701
module/zfs/dmu_send.c
@@ -2846,6 +2845,9 @@ receive_write_byref(struct receive_writer_arg *rwa,
    }
    if (dmu_objset_from_ds(gmep->gme_ds, &ref_os))
        return (SET_ERROR(EINVAL));

    if (gmep->raw)
        ref_os->os_raw_receive = B_TRUE;
Why are we setting this here, rather than when we create the guid_map_entry_t? At a minimum, this deserves a comment explaining why we need to set this flag.
You're right, we can move it there. This used to be a long hold, so the objset was technically eligible for eviction.
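The move agreed on above can be sketched as follows. This is a minimal illustration, not the actual dmu_send.c code: every `sketch_` type and function is a hypothetical stand-in (for `guid_map_entry_t` and the objset), and the only point is that the raw flag is set once, when the entry is created, so a later by-reference lookup does not have to set it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the objset; only the flag under discussion. */
typedef struct sketch_objset {
	bool os_raw_receive;
} sketch_objset_t;

/* Hypothetical stand-in for guid_map_entry_t. */
typedef struct sketch_gme {
	uint64_t guid;
	bool raw;
	sketch_objset_t *os;
} sketch_gme_t;

/*
 * Initialize a map entry. The raw flag is propagated to the objset here,
 * at creation time, rather than on every by-reference write.
 */
static void
sketch_gme_init(sketch_gme_t *gmep, uint64_t guid, bool raw,
    sketch_objset_t *os)
{
	gmep->guid = guid;
	gmep->raw = raw;
	gmep->os = os;
	if (raw)
		os->os_raw_receive = true;
}

/* A later by-reference lookup only reads the flag. */
static bool
sketch_byref_source_is_raw(const sketch_gme_t *gmep)
{
	return (gmep->os->os_raw_receive);
}
```

With this shape, the lookup path stays read-only with respect to the flag, which is what makes a separate explanatory comment at the lookup site unnecessary.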
module/zfs/arc.c
@@ -1574,15 +1577,16 @@ arc_cksum_verify(arc_buf_t *buf)
    if (!(zfs_flags & ZFS_DEBUG_MODIFY))
        return;

    mutex_enter(&hdr->b_l1hdr.b_freeze_lock);
    if (ARC_BUF_COMPRESSED(buf)) {
        ASSERT(hdr->b_l1hdr.b_freeze_cksum == NULL ||
            arc_hdr_has_uncompressed_buf(hdr));
Perhaps we should add an assertion to arc_cksum_verify() to check that we already have the HDR_LOCK, since we aren't grabbing it but arc_hdr_has_uncompressed_buf() requires it.
I can add that. I see one spot where that isn't true that I will correct (arc_write_done()).
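The suggested assertion can be sketched in plain C. This is a simplified, single-threaded model under stated assumptions: the `sketch_` mutex only records whether it is held, and the header and buffer types are hypothetical stand-ins, not the real `arc_buf_hdr_t` or `arc_buf_t`. It shows the precondition (hash lock already held by the caller) being asserted before the freeze lock is taken and the buffer list is walked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mutex that only tracks whether it is currently held. */
typedef struct sketch_mutex {
	bool held;
} sketch_mutex_t;

static void
sketch_mutex_enter(sketch_mutex_t *m)
{
	assert(!m->held);
	m->held = true;
}

static void
sketch_mutex_exit(sketch_mutex_t *m)
{
	assert(m->held);
	m->held = false;
}

static bool
sketch_mutex_held(const sketch_mutex_t *m)
{
	return (m->held);
}

typedef struct sketch_buf {
	bool compressed;
	struct sketch_buf *next;
} sketch_buf_t;

typedef struct sketch_hdr {
	sketch_mutex_t hash_lock;	/* protects the buffer list */
	sketch_mutex_t freeze_lock;	/* protects freeze_cksum */
	sketch_buf_t *bufs;
	const void *freeze_cksum;
} sketch_hdr_t;

/* Walks the buffer list, so the caller must hold the hash lock. */
static bool
sketch_hdr_has_uncompressed_buf(const sketch_hdr_t *hdr)
{
	assert(sketch_mutex_held(&hdr->hash_lock));
	for (const sketch_buf_t *b = hdr->bufs; b != NULL; b = b->next) {
		if (!b->compressed)
			return (true);
	}
	return (false);
}

/* Returns whether the invariant checked by the original ASSERT holds. */
static bool
sketch_cksum_state_ok(sketch_hdr_t *hdr, const sketch_buf_t *buf)
{
	bool ok = true;

	/* The proposed precondition: do not take the lock, require it. */
	assert(sketch_mutex_held(&hdr->hash_lock));

	sketch_mutex_enter(&hdr->freeze_lock);
	if (buf->compressed) {
		ok = (hdr->freeze_cksum == NULL ||
		    sketch_hdr_has_uncompressed_buf(hdr));
	}
	sketch_mutex_exit(&hdr->freeze_lock);
	return (ok);
}
```

A caller that reaches the check without the hash lock trips the first assertion, which is the kind of call site the reply above says needs correcting.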
module/zfs/dmu_send.c
     * os_raw_receive flag now.
     */
    if (raw) {
        if (dmu_objset_from_ds(snapds, &os)) {
I'd suggest preserving and returning the real error. It's currently unused by the caller so this doesn't really matter, but I don't see a compelling reason to convert it to EINVAL.
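The difference between the two error-handling styles can be sketched as below. This is illustrative only: `sketch_objset_lookup()` is a hypothetical stand-in for a lookup that can fail with an arbitrary errno value, and neither function is the code from the patch.

```c
#include <errno.h>

/* Hypothetical lookup: returns 0 on success or the simulated errno. */
static int
sketch_objset_lookup(int simulated_err)
{
	return (simulated_err);
}

/* Style under review: any failure is collapsed into EINVAL. */
static int
sketch_mark_raw_converting(int simulated_err)
{
	if (sketch_objset_lookup(simulated_err))
		return (EINVAL);
	return (0);
}

/* Suggested style: keep the error the lookup actually reported. */
static int
sketch_mark_raw_preserving(int simulated_err)
{
	int err = sketch_objset_lookup(simulated_err);

	if (err != 0)
		return (err);
	return (0);
}
```

Both variants behave the same on success; they differ only in how much information about the failure reaches the caller.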
module/zfs/dmu_send.c
    if (raw) {
        if (dmu_objset_from_ds(snapds, &os)) {
            dsl_pool_rele(dp, FTAG);
            return (SET_ERROR(EINVAL));
It looks to me like gmep is leaked here and the dataset is never disowned, since it's not added (correctly) to the guid_map on error.
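One way to picture the cleanup this comment asks for is the sketch below. It is a simplified model, not the patch itself: the `sketch_` types are hypothetical, dataset ownership is modeled as a plain counter, and the failing objset lookup is simulated by an error argument. The point is that the error path must undo both the allocation and the ownership taken earlier in the function.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical dataset; ownership is modeled as a simple counter. */
typedef struct sketch_ds {
	int owners;
} sketch_ds_t;

/* Hypothetical stand-in for guid_map_entry_t. */
typedef struct sketch_gme {
	sketch_ds_t *ds;
	bool raw;
} sketch_gme_t;

/*
 * Create an entry for ds. objset_err simulates the result of the objset
 * lookup that is only needed for raw streams.
 */
static int
sketch_add_entry(sketch_ds_t *ds, bool raw, int objset_err,
    sketch_gme_t **out)
{
	sketch_gme_t *gmep;

	*out = NULL;
	gmep = calloc(1, sizeof (*gmep));
	if (gmep == NULL)
		return (ENOMEM);

	ds->owners++;		/* stands in for owning the dataset */
	gmep->ds = ds;

	if (raw && objset_err != 0) {
		/* Undo everything done so far before returning. */
		ds->owners--;	/* stands in for disowning the dataset */
		free(gmep);
		return (objset_err);
	}

	gmep->raw = raw;
	*out = gmep;		/* the caller now owns the entry */
	return (0);
}
```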
0e39b32 to 709efcf
165c523 to 7c44f91
Codecov Report
@@            Coverage Diff             @@
##           master    #7701      +/-   ##
==========================================
+ Coverage   78.26%   78.34%   +0.07%
==========================================
  Files         373      373
  Lines      112791   112791
==========================================
+ Hits        88279    88361      +82
+ Misses      24512    24430      -82
Looks good.
datasetexists $TESTPOOL/recv && \
    log_must zfs destroy -r $TESTPOOL/recv
datasetexists $TESTPOOL/$TESTVOL && \
    log_must zfs destroy -r $TESTPOOL/$TESTVOL
I'd suggest using the destroy_dataset helper here, since it handles the case where the dataset is open, which can happen with volumes.
destroy_dataset $TESTPOOL/recv "-r"
destroy_dataset $TESTPOOL/$TESTVOL "-r"
module/zfs/arc.c
    }

    hash_lock = HDR_LOCK(hdr);
    mutex_enter(hash_lock);

    ASSERT(HDR_HAS_L1HDR(hdr));
For non-debug builds, let's drop the local variable, which is now only used in the ASSERT. Otherwise the build warns:
../../module/zfs/arc.c: In function 'arc_buf_freeze':
../../module/zfs/arc.c:1759:17: warning: unused variable 'hdr' [-Wunused-variable]
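The warning arises because the assertion macro compiles to nothing in non-debug builds, leaving a local that nothing reads. A minimal sketch of the fix, using a hypothetical `SKETCH_ASSERT` macro and stand-in types rather than the real ZFS definitions, is to reference the field directly inside the assertion:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical assert that disappears unless SKETCH_DEBUG is defined. */
#ifdef SKETCH_DEBUG
#define	SKETCH_ASSERT(x)	assert(x)
#else
#define	SKETCH_ASSERT(x)	((void)0)
#endif

typedef struct sketch_hdr {
	bool has_l1hdr;
} sketch_hdr_t;

typedef struct sketch_buf {
	sketch_hdr_t *b_hdr;
	bool frozen;
} sketch_buf_t;

static void
sketch_buf_freeze(sketch_buf_t *buf)
{
	/*
	 * No "sketch_hdr_t *hdr = buf->b_hdr;" local here: in a non-debug
	 * build it would be unused once the assert is compiled out.
	 */
	SKETCH_ASSERT(buf->b_hdr->has_l1hdr);
	buf->frozen = true;
}
```

The function compiles without an unused-variable warning whether or not `SKETCH_DEBUG` is defined.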
This patch fixes 2 issues with raw, deduplicated send streams. The first is that datasets that had been completely received earlier in the stream were no longer marked as raw receives. This caused problems when newly received datasets attempted to fetch raw data from these datasets without this flag set.
The second problem was that the arc freeze checksum code was not consistent about which locks needed to be held while performing its asserts. The proper locking needed to run these asserts is actually fairly nuanced, since the asserts touch the linked list of buffers (requiring the header lock), the arc_state (requiring the b_evict_lock), and the b_freeze_cksum (requiring the b_freeze_lock). This seems like a large performance sacrifice and a lot of unneeded complexity to verify that this relatively small debug feature is working as intended, so this patch simply removes these asserts instead.
Signed-off-by: Tom Caputi <tcaputi@datto.com>
7c44f91 to d747d86
This patch fixes 2 issues with raw, deduplicated send streams. The
first is that datasets that had been completely received earlier in
the stream were no longer marked as raw receives. This caused
problems when newly received datasets attempted to fetch raw data
from these datasets without this flag set.
The second problem was that the arc freeze checksum code was not
consistent about which locks needed to be held while performing
its asserts. The code now guarantees that the hdr lock is held
when iterating through the linked list of buffers and the
b_freeze_lock is held when attempting to read or modify the
b_freeze_cksum. This is not strictly a problem with the write_byref
code, but it seems to be the only consistent code path to trigger
the issue.
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Motivation and Context
Without this patch raw, deduplicated send streams would cause crashes.
How Has This Been Tested?
send-wDR_encrypted_zvol.ksh has been added.