Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ASan][libc++] Optimize __annotate_delete for the default allocator #76176

Closed

Conversation

AdvenamTacet
Copy link
Member

This commit optimizes the ASan helper functions __annotate_delete(), in std::basic_string, std::vector and std::deque, by adding if statements to prevent unpoisoning of memory for the default allocator. Unpoisoning is not required by the default allocator, and since it is widely used, this optimization should yield a meaningful performance improvement.

The optimization was suggested by @EricWF.

@AdvenamTacet AdvenamTacet added libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. performance labels Dec 21, 2023
@AdvenamTacet AdvenamTacet requested a review from a team as a code owner December 21, 2023 19:39
@llvmbot
Copy link
Collaborator

llvmbot commented Dec 21, 2023

@llvm/pr-subscribers-libcxx

Author: Tacet (AdvenamTacet)

Changes

This commit optimizes the ASan helper functions __annotate_delete(), in std::basic_string, std::vector and std::deque, by adding if statements to prevent unpoisoning of memory for the default allocator. Unpoisoning is not required by the default allocator, and since it is widely used, this optimization should yield a meaningful performance improvement.

The optimization was suggested by @EricWF.


Full diff: https://github.com/llvm/llvm-project/pull/76176.diff

3 Files Affected:

  • (modified) libcxx/include/deque (+3)
  • (modified) libcxx/include/string (+4-2)
  • (modified) libcxx/include/vector (+3-1)
diff --git a/libcxx/include/deque b/libcxx/include/deque
index d0520b635bcc8f..91e2b73d83e2f5 100644
--- a/libcxx/include/deque
+++ b/libcxx/include/deque
@@ -1007,6 +1007,9 @@ private:
   }
 
   _LIBCPP_HIDE_FROM_ABI void __annotate_delete() const _NOEXCEPT {
+    // The default allocator does not require unpoisoning before returning memory.
+    if _LIBCPP_CONSTEXPR (is_same<allocator_type, allocator<_Tp>>::value)
+      return;
     if (empty()) {
       for (size_t __i = 0; __i < __map_.size(); ++__i) {
         __annotate_whole_block(__i, __asan_unposion);
diff --git a/libcxx/include/string b/libcxx/include/string
index fdffca5aed18be..08180ffb130d8a 100644
--- a/libcxx/include/string
+++ b/libcxx/include/string
@@ -1908,8 +1908,10 @@ private:
   }
 
   _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 void __annotate_delete() const _NOEXCEPT {
-    if (!__libcpp_is_constant_evaluated() && (__asan_short_string_is_annotated() || __is_long()))
-      __annotate_contiguous_container(data() + size() + 1, data() + capacity() + 1);
+    // The default allocator does not require unpoisoning before returning memory.
+    if _LIBCPP_CONSTEXPR (!is_same<allocator_type, allocator<__default_allocator_type>>::value)
+      if (!__libcpp_is_constant_evaluated() && (__asan_short_string_is_annotated() || __is_long()))
+        __annotate_contiguous_container(data() + size() + 1, data() + capacity() + 1);
   }
 
   _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 void __annotate_increase(size_type __n) const _NOEXCEPT {
diff --git a/libcxx/include/vector b/libcxx/include/vector
index 3abc917f5c0e18..1b95e1e5c88b75 100644
--- a/libcxx/include/vector
+++ b/libcxx/include/vector
@@ -849,7 +849,9 @@ private:
   }
 
   _LIBCPP_CONSTEXPR_SINCE_CXX20 _LIBCPP_HIDE_FROM_ABI void __annotate_delete() const _NOEXCEPT {
-    __annotate_contiguous_container(data(), data() + capacity(), data() + size(), data() + capacity());
+    // The default allocator does not require unpoisoning before returning memory.
+    if _LIBCPP_CONSTEXPR (!is_same<allocator_type, __default_allocator_type>::value)
+      __annotate_contiguous_container(data(), data() + capacity(), data() + size(), data() + capacity());
   }
 
   _LIBCPP_CONSTEXPR_SINCE_CXX20 _LIBCPP_HIDE_FROM_ABI void __annotate_increase(size_type __n) const _NOEXCEPT {

Advenam Tacet added 3 commits December 22, 2023 03:41
This commit optimizes the ASan helper functions `__annotate_delete()`, in `std::basic_string`, `std::vector` and `std::deque`, by adding `if` statements to prevent unpoisoning of memory for the default allocator. Unpoisoning is not required by the default allocator, and since it is widely used, this optimization should yield a meaningful performance improvement.

The optimization was suggested by @EricWF.
Copy link
Contributor

@philnik777 philnik777 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How much of a performance improvement is that in reality?

@AdvenamTacet
Copy link
Member Author

When a container's buffer contains N bytes, a pessimistic estimation is that N/8 bytes will be written into shadow memory. In the worst-case scenario, this can double the run-time of ASan operations during deallocation. If the buffer is large and there are many deallocations, this performance impact could be noticeable.

While the performance gain is small in the grand scheme of things, it is still a gain that we are leaving on the table. However, if you believe that adding an additional line with an if statement in an ASan helper functions is too high a price to pay for that performance gain, I understand, but disagree.

Real values depend on the container and its state.
For vector, that may be accurate estimation instead of a pessimistic one, due to a clear just before destruction.

@philnik777
Copy link
Contributor

@AdvenamTacet I don't necessarily think that this isn't worth the cost. I'd simply like to see some numbers. If this is a 1% perf improvement in a micro-benchmark, I don't think this is worth it. If it's a 10x improvement, it probably is worth the cost.

@AdvenamTacet
Copy link
Member Author

AdvenamTacet commented Jan 10, 2024

I just ran a small benchmark with flags -stdlib=libc++ -fsanitize=address -O3.

#ifdef VECTOR_FAST
#include "vector-optimized.hpp"
#endif

#ifdef VECTOR_MAIN
#include "vector-main.hpp"
#endif

int main() {
    uint32_t sum = 0;
    for(uint32_t i = 2; i < 100'000; ++i) {
        std::vector<uint64_t> v(1'000'000, i);
        sum += v[i/2];
        v.clear();
    }

    return sum != 0 ? 0 : -1;
}

Example results (quite representative):

# Optimized times:
real    3m6.008s
user    0m25.815s
sys     2m39.491s


# Times from upstream mian:
real    3m32.434s
user    0m27.063s
sys     3m4.495s

It gives around 13% of speedup on that benchmark.

In [1]: 1 - (3*60+3.008) / (3*60+32.434)
Out[1]: 0.13851831627705535

I don't see many potential issues, maybe problems after replacing default allocator with some problematic allocator?
It only affects strictly ASan related functions, so readability is arguably untouched.
Performance is consistently better.

@philnik777 What do you think?

@philnik777
Copy link
Contributor

While the numbers looks nice, I'm not sure it's good enough in the current state. I think I'd be much happier with this if we could canonicalize the interface for all the containers, at least to some degree. That way we should be able to apply these kinds of optimizations in a single place instead of spreading them across the code base.

@AdvenamTacet
Copy link
Member Author

AdvenamTacet commented Jan 10, 2024

I see your point, but I'm not sure an interface is feasible option.
We would end up adding a lot of code in another place, but removing very little of code here. basic_string and vector have very similar implementations of annotations, but inside basic_string we have to additionally take care of a null terminator (not included in size() or capacity()). Therefore, the only function we can unify in the interface is __annotate_contiguous_container (but then __beg and __end have to be arguments to it) and possibly we can move some checks (like the one included in this PR). But the rest has to be handled separately. deque is so different, that we need a separate interface (I'm not even sure, if it's possible as in deque we need access to private variables and functions).

Possibly we can just move those functions to another class/file (but keep separate implementations for every class).
We would have to do it in a way that optimizations are not affected in any way to keep current performance.
I don't see a way to do it, without increasing complexity.

I don't really like an idea of an interface for ASan annotations.

@EricWF @ldionne what do you think?

@ldionne
Copy link
Member

ldionne commented Jan 16, 2024

I'm a bit skeptical of the benefits of doing this optimization in a real code base, keeping in mind that this is under sanitized mode. My threshold of acceptable complexity for adding sanitizer support to our containers is right at its limit TBH and I would rather explore refactorings to make it simpler and less intrusive than anything that requires adding more complexity.

@AdvenamTacet
Copy link
Member Author

Ok, as a first step, I just created a PR simplifying vector helper functions (to make them analogical to string annotation functions) and requested your @philnik777 review: #78322

I'm happy to experiment with a refactor after we finish work on string annotations. One PR left: #75882

But probably we should focus on the refactor after branching, to not introduce potential issues now.

@AdvenamTacet AdvenamTacet deleted the asan-annotate-delete-faster branch January 22, 2024 22:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. performance
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants