Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes #1053: Use SET_ATOMIC_BOOL to set receive_complete flag on mess… #1054

Merged
merged 1 commit into from
Apr 11, 2023

Conversation

ganeshmurthy
Copy link
Contributor

…age content

@codecov
Copy link

codecov bot commented Apr 7, 2023

Codecov Report

Patch coverage has no change and project coverage change: -0.01 ⚠️

Comparison is base (b56395a) 30.96% compared to head (e80c292) 30.95%.

❗ Current head e80c292 differs from pull request most recent head 1551c46. Consider uploading reports for the commit 1551c46 to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1054      +/-   ##
==========================================
- Coverage   30.96%   30.95%   -0.01%     
==========================================
  Files         174      174              
  Lines       38181    38191      +10     
  Branches     5342     5345       +3     
==========================================
  Hits        11823    11823              
- Misses      25204    25214      +10     
  Partials     1154     1154              
Flag Coverage Δ
pyunittests 54.50% <ø> (ø)
unittests 30.95% <0.00%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
src/message.c 45.03% <0.00%> (-0.07%) ⬇️

... and 1 file with indirect coverage changes

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

content->receive_complete = true;
SET_ATOMIC_BOOL(&content->receive_complete, true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this clear any tsan.supp entry, by any chance?

I noticed

# DISPATCH-2127
race:qd_message_receive

but looking at https://issues.apache.org/jira/browse/DISPATCH-2127, it does appear that atomically setting this flag will allow getting rid of the suppression.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jiridanek , I will take a look

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The removal of the suppression is dependent upon these three flags being accessed safely -

content->receive_complete
content->q2_input_holdoff
content->aborted

I have added additional code to this PR so that access to content->q2_input_holdoff is secured after holding content->lock

Let's see what the CI thinks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After removing the suppression, I see no new test failures on this PR. I am going to kick off the tests one more time to see what happens.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing the suppression did not have any side effects. @jiridanek @kgiusti please take a look at this PR one more time and approve/comment.

Copy link
Contributor

@kgiusti kgiusti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Provisional approval once Ganesh checks if tsan.supp update is needed.

src/message.c Outdated
@@ -2392,11 +2392,13 @@ qd_message_t *qd_message_compose(qd_composed_field_t *f1,
qd_compose_free(fields[idx]);
}

LOCK(&content->lock);
// initialize the Q2 flag. Note that we do not need to hold the content
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment is out-of-date, now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this comment is out-of-date. I agree with the comments and am inclined to remove the lock I added. @kgiusti can you please confirm ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this lock shouldn't be necessary IIUC because the message hasn't yet been shared with any other thread - it was just created in this function and we're just setting the q2_input_holdoff initial value.

Fixing the tsan warning was not the purpose of the original patch, which is to simply corrected a missing atomic operation. I'd recommend committing the original one-line change only without removing the tsan suppression since IIUC the suppression is still necessary after the one-line change is made.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that the suppression can be removed with this change. I will take out the lock that I added (since it is not necessary) and take out the suppression as well and see if this passed CI. if it does, the suppression can be safely removed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CI has passed. I am merging this PR to main, thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

In qd_message_receive(), the content->receive_complete flag needs to be set with atomic function
3 participants