-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix error handling in receive_writer_thread() #10320
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
ahrens
added
Status: Code Review Needed
Ready for review and testing
Type: Defect
Incorrect behavior (e.g. crash, hang)
labels
May 12, 2020
behlendorf
approved these changes
May 13, 2020
If `receive_writer_thread()` gets an error from `receive_process_record()`, it should be saved in `rwa->err` so that we will stop processing records, and the main thread will notice that the receive has failed. When an error is first encountered, this happens correctly. However, if there are more records to dequeue, the next time through the loop we will reset `rwa->err` to zero, allowing us to try to process the following record (2 after the failed record). Depending on what types of records remain, we may incorrectly complete the receive "successfully", but without actually having processed all the records. The fix is to only set `rwa->err` if we got a *non-zero* error. This bug was introduced by openzfs#10099 "Improve zfs receive performance by batching writes". Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
pcd1193182
approved these changes
May 14, 2020
behlendorf
added
Status: Accepted
Ready to integrate (reviewed, tested)
and removed
Status: Code Review Needed
Ready for review and testing
labels
May 15, 2020
ahrens
added a commit
to ahrens/zfs
that referenced
this pull request
May 15, 2020
If `receive_writer_thread()` gets an error from `receive_process_record()`, it should be saved in `rwa->err` so that we will stop processing records, and the main thread will notice that the receive has failed. When an error is first encountered, this happens correctly. However, if there are more records to dequeue, the next time through the loop we will reset `rwa->err` to zero, allowing us to try to process the following record (2 after the failed record). Depending on what types of records remain, we may incorrectly complete the receive "successfully", but without actually having processed all the records. The fix is to only set `rwa->err` if we got a *non-zero* error. This bug was introduced by openzfs#10099 "Improve zfs receive performance by batching writes". Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Paul Dagnelie <pcd@delphix.com> Signed-off-by: Matthew Ahrens <mahrens@delphix.com> Closes openzfs#10320
as-com
pushed a commit
to as-com/zfs
that referenced
this pull request
Jun 20, 2020
If `receive_writer_thread()` gets an error from `receive_process_record()`, it should be saved in `rwa->err` so that we will stop processing records, and the main thread will notice that the receive has failed. When an error is first encountered, this happens correctly. However, if there are more records to dequeue, the next time through the loop we will reset `rwa->err` to zero, allowing us to try to process the following record (2 after the failed record). Depending on what types of records remain, we may incorrectly complete the receive "successfully", but without actually having processed all the records. The fix is to only set `rwa->err` if we got a *non-zero* error. This bug was introduced by openzfs#10099 "Improve zfs receive performance by batching writes". Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Paul Dagnelie <pcd@delphix.com> Signed-off-by: Matthew Ahrens <mahrens@delphix.com> Closes openzfs#10320 (cherry picked from commit 1b9cd1a)
as-com
pushed a commit
to as-com/zfs
that referenced
this pull request
Jun 20, 2020
If `receive_writer_thread()` gets an error from `receive_process_record()`, it should be saved in `rwa->err` so that we will stop processing records, and the main thread will notice that the receive has failed. When an error is first encountered, this happens correctly. However, if there are more records to dequeue, the next time through the loop we will reset `rwa->err` to zero, allowing us to try to process the following record (2 after the failed record). Depending on what types of records remain, we may incorrectly complete the receive "successfully", but without actually having processed all the records. The fix is to only set `rwa->err` if we got a *non-zero* error. This bug was introduced by openzfs#10099 "Improve zfs receive performance by batching writes". Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Paul Dagnelie <pcd@delphix.com> Signed-off-by: Matthew Ahrens <mahrens@delphix.com> Closes openzfs#10320 (cherry picked from commit 1b9cd1a)
jsai20
pushed a commit
to jsai20/zfs
that referenced
this pull request
Mar 30, 2021
If `receive_writer_thread()` gets an error from `receive_process_record()`, it should be saved in `rwa->err` so that we will stop processing records, and the main thread will notice that the receive has failed. When an error is first encountered, this happens correctly. However, if there are more records to dequeue, the next time through the loop we will reset `rwa->err` to zero, allowing us to try to process the following record (2 after the failed record). Depending on what types of records remain, we may incorrectly complete the receive "successfully", but without actually having processed all the records. The fix is to only set `rwa->err` if we got a *non-zero* error. This bug was introduced by openzfs#10099 "Improve zfs receive performance by batching writes". Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Paul Dagnelie <pcd@delphix.com> Signed-off-by: Matthew Ahrens <mahrens@delphix.com> Closes openzfs#10320
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
Status: Accepted
Ready to integrate (reviewed, tested)
Type: Defect
Incorrect behavior (e.g. crash, hang)
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Motivation and Context
If
receive_writer_thread()
gets an error fromreceive_process_record()
,it should be saved in
rwa->err
so that we will stop processing records,and the main thread will notice that the receive has failed.
When an error is first encountered, this happens correctly. However, if
there are more records to dequeue, the next time through the loop we
will reset
rwa->err
to zero, allowing us to try to process thefollowing record (2 after the failed record). Depending on what types
of records remain, we may incorrectly complete the receive
"successfully", but without actually having processed all the records.
Description
The fix is to only set
rwa->err
if we got a non-zero error.This bug was introduced by #10099 "Improve zfs receive performance by
batching writes".
How Has This Been Tested?
Do a receive that should fail due to an incorrectly-constructed send stream.
Types of changes
Checklist:
Signed-off-by
.