Wait for txg sync if the last DRR_FREEOBJECTS might result in a hole #14358
If you can construct a small stream that triggers this, it would be nice to add a test case for it.
I have added a test with the potential of triggering the issue, which can be verified by running it without the fix applied and with the following hack in its stead:
Given the nature of the issue I'm not sure if there's a way to write a test that reliably (or perhaps even likely) triggers it under real world conditions, but this will at least exercise the related code paths. Does this seem OK?
If we receive a DRR_FREEOBJECTS as the first entry in an object range, this might end up producing a hole if the freed objects were the only existing objects in the block.

If the txg starts syncing before we've processed any following DRR_OBJECT records, this leads to a possible race where the backing arc_buf_t gets its psize set to 0 in the arc_write_ready() callback while still being referenced from a dirty record in the open txg.

To prevent this, we insert a txg_wait_synced call if the first record in the range was a DRR_FREEOBJECTS that actually resulted in one or more freed objects.

Signed-off-by: David Hedberg <david.hedberg@findity.com>
Sponsored by: Findity AB
Closes: openzfs#11893
I think exercising the relevant code paths should be sufficient. It'd be nice to be able to hit it every time, but I agree it's probably not worth adding additional code to the kmod to be able to trigger exactly this issue. This looks good to me. Thanks.
done
if [[ $i -eq $tries ]]; then
	log_fail "Failed to create object with number $num"
fi
I'm a little concerned this won't be entirely reliable since it depends on some assumptions about how object numbers are allocated. But as you pointed out, we already do this same trick in send_freeobjects.ksh, so I'm okay with doing the same thing here.
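For readers unfamiliar with the trick discussed here, the following is a minimal sketch of the idea, not the code from this PR or from send_freeobjects.ksh. The helper name `create_obj_with_num` and its arguments are hypothetical. It assumes that a file's inode number, as printed by `ls -i`, matches its ZFS object number, and that the allocator eventually hands out the wanted number, which is exactly the assumption the review comment flags as not entirely reliable.

```shell
# Hypothetical helper (not from the PR): keep creating files in $dir until
# one of them is assigned object (inode) number $num, giving up after $tries
# attempts. Prints the path of the matching file on success.
create_obj_with_num() {
	typeset dir=$1 num=$2 tries=$3
	typeset i f
	for ((i = 0; i < tries; i++)); do
		f="$dir/obj.$i"
		touch "$f" || return 1
		# "ls -i" prints the inode number as the first field.
		if [[ $(ls -i "$f" | awk '{print $1}') -eq $num ]]; then
			echo "$f"
			return 0
		fi
		# Wrong number: remove the file so no stray objects are left behind.
		rm -f "$f"
	done
	return 1
}
```

A caller would treat a non-zero return the same way the fragment above does, by failing the test with a message naming the object number that could not be created.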
If we receive a DRR_FREEOBJECTS as the first entry in an object range, this might end up producing a hole if the freed objects were the only existing objects in the block.

If the txg starts syncing before we've processed any following DRR_OBJECT records, this leads to a possible race where the backing arc_buf_t gets its psize set to 0 in the arc_write_ready() callback while still being referenced from a dirty record in the open txg.

To prevent this, we insert a txg_wait_synced call if the first record in the range was a DRR_FREEOBJECTS that actually resulted in one or more freed objects.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: David Hedberg <david.hedberg@findity.com>
Sponsored by: Findity AB
Closes openzfs#11893
Closes openzfs#14358
Motivation, Context and Description
If we receive a DRR_FREEOBJECTS as the first entry in an object range, this might end up producing a hole if the freed objects were the only existing objects in the block.
If the txg starts syncing before we've processed any following DRR_OBJECT records, this leads to a possible race where the backing arc_buf_t gets its psize set to 0 in the arc_write_ready() callback while still being referenced from a dirty record in the open txg.
To prevent this, we insert a txg_wait_synced call if the first record in the range was a DRR_FREEOBJECTS that actually resulted in one or more freed objects.
Sponsored by: Findity AB
Closes: #11893
How Has This Been Tested?
The patch has been tested in a VM running a cloud image of Ubuntu 22.04 with kernel 5.15.0-56. The issue can be triggered either simply by starting a few zfs receive/zfs rollback invocations in a loop with an affected incremental stream and waiting, or by inserting a busy loop before dmu_tx_create() in object_receive():
The patch has been tested both by leaving the hacky trigger in place, and also by just letting the final patch run with a few receive threads looping for a while without detecting any issues.
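The receive/rollback loop mentioned above could look roughly like the sketch below. It is an illustration only, not the script used for testing. The function name `recv_rollback_loop`, the `ZFS` override variable, and the dataset, snapshot, and stream names are all placeholders. It assumes the target dataset already holds the base snapshot of the incremental stream.

```shell
# Command to run; overridable so the loop can be dry-run (e.g. ZFS=echo).
ZFS=${ZFS:-zfs}

# Hypothetical helper: repeatedly apply an incremental stream to dataset $1
# and roll back to base snapshot $2, so the same stream can be received again.
recv_rollback_loop() {
	typeset ds=$1 base=$2 stream=$3 iterations=$4
	typeset i
	for ((i = 0; i < iterations; i++)); do
		# Apply the incremental stream on top of the base snapshot.
		$ZFS receive "$ds" < "$stream" || return 1
		# -r destroys the snapshot that was just received.
		$ZFS rollback -r "$ds@$base" || return 1
	done
}
```

Running a few of these concurrently, each against its own dataset (for example `recv_rollback_loop tank/recv1 base /tmp/incr.zstream 1000 &`), would approximate "a few receive threads looping"; those names and the iteration count are made up for the example.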