
Make Bytes & BytesMut compatible with ThreadSanitizer #405

Closed
wants to merge 1 commit into from

Conversation

tmiasko (Contributor) commented Jul 2, 2020

Replace atomic fences with atomic loads for compatibility with ThreadSanitizer.

@@ -1046,7 +1046,7 @@ unsafe fn release_shared(ptr: *mut Shared) {
 // > "acquire" operation before deleting the object.
 //
 // [1]: (www.boost.org/doc/libs/1_55_0/doc/html/atomic/usage_examples.html)
-atomic::fence(Ordering::Acquire);
+(*ptr).ref_cnt.load(Ordering::Acquire);
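To see what the one-line change does in isolation, here is a minimal, self-contained sketch of a release path like this one. The names (`Shared`, `ref_cnt`, `release_shared`) mirror the diff, but the code is a hypothetical stand-in, not the actual bytes internals:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical stand-in for bytes' `Shared` with its `ref_cnt`.
struct Shared {
    ref_cnt: AtomicUsize,
}

// Returns true if the caller was the last owner and may deallocate.
fn release_shared(shared: &Shared) -> bool {
    // Release ordering on the decrement publishes this thread's writes
    // to whichever thread ends up freeing the object.
    if shared.ref_cnt.fetch_sub(1, Ordering::Release) != 1 {
        return false; // other handles still exist
    }
    // The last owner needs an "acquire" operation before deleting.
    // The old code used a fence, which ThreadSanitizer cannot model:
    //     std::sync::atomic::fence(Ordering::Acquire);
    // The PR replaces it with an acquire load on the same atomic,
    // which ThreadSanitizer does understand:
    shared.ref_cnt.load(Ordering::Acquire);
    true // caller may now deallocate
}

fn main() {
    let shared = Shared { ref_cnt: AtomicUsize::new(2) };
    assert!(!release_shared(&shared)); // first handle dropped: not last
    assert!(release_shared(&shared)); // second handle dropped: last owner
    println!("ok");
}
```

The two forms give the same release/acquire pairing on real hardware; the only behavioral difference is the extra load on the already-cold deallocation path.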
Member commented:

This will force a second load of the ref_cnt, won't it?

Also, if we diverge from how std::sync::Arc does things, we should probably have a good reason and explain why.

tmiasko (Contributor, Author) commented:

Yes, there will be an additional load when the reference count reaches zero.

I added a comment explaining why there is a load instead of a fence, so there is no need to dig through git history to understand it.

To be clear, the only reason for doing this is compatibility with ThreadSanitizer. std::sync::Arc is currently implemented as proposed here, although the load-based impl is used conditionally. Across the whole tokio ecosystem, this is the last remaining false positive I have seen reported by ThreadSanitizer.

Member commented:

Oh I see, they've added this conditional compilation: https://github.com/rust-lang/rust/blob/f844ea1e561475e6023282ef167e76bc973773ef/src/liballoc/sync.rs#L43-L58

Could we do something similar?
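For context, the linked libstd code selects between the two mechanisms at compile time with a macro defined twice under a cfg. A sketch of that pattern follows; it uses a hypothetical user cfg named `tsan` in place of std's unstable `cfg(sanitize = "thread")`, so it compiles on stable:

```rust
use std::sync::atomic::{fence, AtomicUsize, Ordering};

// Mirrors the libstd pattern from the linked sync.rs: define `acquire!`
// twice, gated on a cfg. `tsan` is a hypothetical stand-in for std's
// unstable `cfg(sanitize = "thread")`.
#[cfg(not(tsan))]
macro_rules! acquire {
    ($x:expr) => {
        fence(Ordering::Acquire)
    };
}

// ThreadSanitizer does not model fences, so under tsan perform an
// acquire load on the same atomic instead.
#[cfg(tsan)]
macro_rules! acquire {
    ($x:expr) => {
        $x.load(Ordering::Acquire)
    };
}

// Returns true if the caller was the last owner.
fn release(cnt: &AtomicUsize) -> bool {
    if cnt.fetch_sub(1, Ordering::Release) != 1 {
        return false;
    }
    acquire!(cnt); // fence normally, acquire load under tsan
    true
}

fn main() {
    let cnt = AtomicUsize::new(1);
    assert!(release(&cnt));
    println!("ok");
}
```

The appeal of the macro is that the hot path is identical either way; only the cold "last owner" step differs between the two builds.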

tmiasko (Contributor, Author) commented Jul 3, 2020:

cfg(sanitize = "thread") is unstable, so probably not. I don't think it is worth the complexity anyway.

On x86 the fence-based approach avoids a single mov from a location that was written a few instructions earlier, on a cold path that already has to do a bunch of work to deallocate the memory. Last time I looked, this was essentially unmeasurable for any real-world application. On weak memory models, where an acquire fence leads to actual codegen, the situation is even more ambiguous.

Member commented:

We could use cfg(bytes_ci_tsan) or something, set via a CARGO_CFG_BYTES_CI_TSAN=1 environment variable.

It apparently matters enough that the standard library does it. Are they wrong?
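The suggestion above would be wired up through a build script. A sketch, assuming a hypothetical `BYTES_CI_TSAN` environment variable and `bytes_ci_tsan` cfg name taken from the comment (neither exists in the crate):

```rust
// build.rs (sketch): emit a custom cfg when CI sets an env var.
use std::env;

// Decide which cargo directives to emit. Split out from main so the
// logic is easy to inspect separately from the I/O.
fn directives(tsan_requested: bool) -> Vec<&'static str> {
    // Ask Cargo to re-run this script if the variable changes.
    let mut out = vec!["cargo:rerun-if-env-changed=BYTES_CI_TSAN"];
    if tsan_requested {
        // Makes `#[cfg(bytes_ci_tsan)]` match in the crate's sources.
        out.push("cargo:rustc-cfg=bytes_ci_tsan");
    }
    out
}

fn main() {
    let tsan = env::var_os("BYTES_CI_TSAN").is_some();
    for d in directives(tsan) {
        println!("{}", d);
    }
}
```

A CI job would then run something like `BYTES_CI_TSAN=1 cargo test` under the sanitizer, while ordinary builds keep the fence.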

tmiasko (Contributor, Author) commented:

I don't think the approach from std is the best idea here. Requiring a custom cfg would be impractical.

Member commented:

After reading a bit about the change in libstd (rust-lang/rust#65097), I think we should care about performance here, and only enable a load instead of a fence via a conditional config, to be used with tsan.

tmiasko (Contributor, Author) commented Jul 4, 2020:

I don't think this is measurable outside microbenchmarks. Note that most of that discussion is about a different implementation.

If doing this conditionally is the only acceptable implementation, then I suggest closing this. Doing it conditionally serves no purpose: if it doesn't work out of the box, it doesn't work, period. The situation in std is different, because the cfg is enabled automatically when tsan is used during compilation.

tmiasko (Contributor, Author) commented:

To clarify one point: there is no measurable impact on any of the benchmarks here, and they do exercise this code. If you noticed anything that raised your concern, I can take another look, but honestly, this change does not make any difference.


@tmiasko tmiasko closed this Sep 20, 2020
@tmiasko tmiasko deleted the tsan branch September 20, 2020 17:45