Skip to content

Commit

Permalink
[ASan][libc++] std::basic_string annotations (#72677)
Browse files Browse the repository at this point in the history
This commit introduces basic annotations for `std::basic_string`,
mirroring the approach used in `std::vector` and `std::deque`.
Initially, only long strings with the default allocator will be
annotated. Short strings (_SSO - short string optimization_) and strings
with non-default allocators will be annotated in the near future, with
separate commits dedicated to enabling them. The process will be similar
to the workflow employed for enabling annotations in `std::deque`.

**Please note**: these annotations function effectively only when libc++
and libc++abi dylibs are instrumented (with ASan). This aligns with the
prevailing behavior of Memory Sanitizer.

To avoid breaking everything, this commit also appends
`_LIBCPP_INSTRUMENTED_WITH_ASAN` to `__config_site` whenever libc++ is
compiled with ASan. If this macro is not defined, string annotations are
not enabled. However, linking a binary that does **not** annotate
strings with a dynamic library that annotates strings, is not permitted.

Originally proposed here: https://reviews.llvm.org/D132769

Related patches on Phabricator:
- Turning on annotations for short strings:
https://reviews.llvm.org/D147680
- Turning on annotations for all allocators:
https://reviews.llvm.org/D146214

This PR is a part of a series of patches extending AddressSanitizer C++
container overflow detection capabilities by adding annotations, similar
to those existing in `std::vector` and `std::deque` collections. These
enhancements empower ASan to effectively detect instances where the
instrumented program attempts to access memory within a collection's
internal allocation that remains unused. This includes cases where
access occurs before or after the stored elements in `std::deque`, or
between the `std::basic_string`'s size (including the null terminator)
and capacity bounds.

The introduction of these annotations was spurred by a real-world
software bug discovered by Trail of Bits, involving an out-of-bounds
memory access during the comparison of two strings using the
`std::equals` function. This function was taking iterators
(`iter1_begin`, `iter1_end`, `iter2_begin`) to perform the comparison,
using a custom comparison function. When the `iter1` object exceeded the
length of `iter2`, an out-of-bounds read could occur on the `iter2`
object. Container sanitization, upon enabling these annotations, would
effectively identify and flag this potential vulnerability.

This Pull Request introduces basic annotations for `std::basic_string`.
Long strings exhibit structural similarities to `std::vector` and will
be annotated accordingly. Short strings are already implemented, but
will be turned on separately in a forthcoming commit. Look at [a
comment](llvm/llvm-project#72677 (comment))
below to read about SSO issues at current moment.

Due to the functionality introduced in
[D132522](llvm/llvm-project@dd1b7b7),
the `__sanitizer_annotate_contiguous_container` function now offers
compatibility with all allocators. However, enabling this support will
be done in a subsequent commit. For the time being, only strings with
the default allocator will be annotated.

If you have any questions, please email:
- advenam.tacet@trailofbits.com
- disconnect3d@trailofbits.com

NOKEYCHECK=True
GitOrigin-RevId: 9ed20568e7de53dce85f1631d7d8c1415e7930ae
  • Loading branch information
Tacet authored and copybara-github committed Dec 13, 2023
1 parent 1f70899 commit e7db482
Show file tree
Hide file tree
Showing 92 changed files with 862 additions and 64 deletions.
13 changes: 13 additions & 0 deletions CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -651,6 +651,19 @@ get_sanitizer_flags(SANITIZER_FLAGS "${LLVM_USE_SANITIZER}")
add_library(cxx-sanitizer-flags INTERFACE)
target_compile_options(cxx-sanitizer-flags INTERFACE ${SANITIZER_FLAGS})

# _LIBCPP_INSTRUMENTED_WITH_ASAN informs that library was built with ASan.
# Defining _LIBCPP_INSTRUMENTED_WITH_ASAN while building the library with ASan is required.
# Normally, the _LIBCPP_INSTRUMENTED_WITH_ASAN flag is used to keep information whether
# dylibs are built with AddressSanitizer. However, when building libc++,
# this flag needs to be defined so that the resulting dylib has all ASan functionalities guarded by this flag.
# If the _LIBCPP_INSTRUMENTED_WITH_ASAN flag is not defined, then parts of the ASan instrumentation code in libc++
# will not be compiled into it, resulting in false positives.
# For context, read: https://github.com/llvm/llvm-project/pull/72677#pullrequestreview-1765402800
string(FIND "${LLVM_USE_SANITIZER}" "Address" building_with_asan)
if (NOT "${building_with_asan}" STREQUAL "-1")
config_define(ON _LIBCPP_INSTRUMENTED_WITH_ASAN)
endif()

# Link system libraries =======================================================
function(cxx_link_system_libraries target)
if (NOT MSVC)
Expand Down
1 change: 1 addition & 0 deletions include/__config_site.in
Original file line number Diff line number Diff line change
Expand Up @@ -29,6 +29,7 @@
#cmakedefine _LIBCPP_HAS_NO_WIDE_CHARACTERS
#cmakedefine _LIBCPP_HAS_NO_STD_MODULES
#cmakedefine _LIBCPP_HAS_NO_TIME_ZONE_DATABASE
#cmakedefine _LIBCPP_INSTRUMENTED_WITH_ASAN

// PSTL backends
#cmakedefine _LIBCPP_PSTL_CPU_BACKEND_SERIAL
Expand Down
282 changes: 230 additions & 52 deletions include/string

Large diffs are not rendered by default.

Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,7 @@

#include "test_allocator.h"
#include "min_allocator.h"
#include "asan_testing.h"

#include "test_macros.h"

Expand All @@ -28,6 +29,7 @@ TEST_CONSTEXPR_CXX20 void test_invariant(S s, test_allocator_statistics& alloc_s
while (s.size() < s.capacity())
s.push_back(typename S::value_type());
assert(s.size() == s.capacity());
LIBCPP_ASSERT(is_string_asan_correct(s));
}
#ifndef TEST_HAS_NO_EXCEPTIONS
catch (...) {
Expand All @@ -43,17 +45,20 @@ TEST_CONSTEXPR_CXX20 void test_string(const Alloc& a) {
{
S const s((Alloc(a)));
assert(s.capacity() >= 0);
LIBCPP_ASSERT(is_string_asan_correct(s));
}
{
S const s(3, 'x', Alloc(a));
assert(s.capacity() >= 3);
LIBCPP_ASSERT(is_string_asan_correct(s));
}
#if TEST_STD_VER >= 11
// Check that we perform SSO
{
S const s;
assert(s.capacity() > 0);
ASSERT_NOEXCEPT(s.capacity());
LIBCPP_ASSERT(is_string_asan_correct(s));
}
#endif
}
Expand All @@ -63,18 +68,22 @@ TEST_CONSTEXPR_CXX20 bool test() {
test_string(test_allocator<char>());
test_string(test_allocator<char>(3));
test_string(min_allocator<char>());
test_string(safe_allocator<char>());

{
test_allocator_statistics alloc_stats;
typedef std::basic_string<char, std::char_traits<char>, test_allocator<char> > S;
S s((test_allocator<char>(&alloc_stats)));
test_invariant(s, alloc_stats);
LIBCPP_ASSERT(is_string_asan_correct(s));
s.assign(10, 'a');
s.erase(5);
test_invariant(s, alloc_stats);
LIBCPP_ASSERT(is_string_asan_correct(s));
s.assign(100, 'a');
s.erase(50);
test_invariant(s, alloc_stats);
LIBCPP_ASSERT(is_string_asan_correct(s));
}

return true;
Expand Down
8 changes: 8 additions & 0 deletions test/std/strings/basic.string/string.capacity/clear.pass.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -15,31 +15,39 @@

#include "test_macros.h"
#include "min_allocator.h"
#include "asan_testing.h"

template <class S>
TEST_CONSTEXPR_CXX20 void test(S s) {
s.clear();
assert(s.size() == 0);
LIBCPP_ASSERT(is_string_asan_correct(s));
}

template <class S>
TEST_CONSTEXPR_CXX20 void test_string() {
S s;
test(s);
LIBCPP_ASSERT(is_string_asan_correct(s));

s.assign(10, 'a');
s.erase(5);
LIBCPP_ASSERT(is_string_asan_correct(s));
test(s);
LIBCPP_ASSERT(is_string_asan_correct(s));

s.assign(100, 'a');
s.erase(50);
LIBCPP_ASSERT(is_string_asan_correct(s));
test(s);
LIBCPP_ASSERT(is_string_asan_correct(s));
}

TEST_CONSTEXPR_CXX20 bool test() {
test_string<std::string>();
#if TEST_STD_VER >= 11
test_string<std::basic_string<char, std::char_traits<char>, min_allocator<char>>>();
test_string<std::basic_string<char, std::char_traits<char>, safe_allocator<char>>>();
#endif

return true;
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,7 @@

#include "test_macros.h"
#include "min_allocator.h"
#include "asan_testing.h"

template <class S>
void test(typename S::size_type min_cap, typename S::size_type erased_index) {
Expand All @@ -33,6 +34,7 @@ void test(typename S::size_type min_cap, typename S::size_type erased_index) {
assert(s == s0);
assert(s.capacity() <= old_cap);
assert(s.capacity() >= s.size());
LIBCPP_ASSERT(is_string_asan_correct(s));
}

template <class S>
Expand All @@ -47,6 +49,7 @@ bool test() {
test_string<std::string>();
#if TEST_STD_VER >= 11
test_string<std::basic_string<char, std::char_traits<char>, min_allocator<char>>>();
test_string<std::basic_string<char, std::char_traits<char>, safe_allocator<char>>>();
#endif

return true;
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,63 @@
//===----------------------------------------------------------------------===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

// <string>

// This test verifies that the ASan annotations for basic_string objects remain accurate
// after invoking basic_string::reserve(size_type __requested_capacity).
// Different types are used to confirm that ASan works correctly with types of different sizes.
#include <string>
#include <cassert>

#include "test_macros.h"
#include "asan_testing.h"

template <class S>
void test() {
S short_s1(3, 'a'), long_s1(100, 'c');
short_s1.reserve(0x1337);
long_s1.reserve(0x1337);

LIBCPP_ASSERT(is_string_asan_correct(short_s1));
LIBCPP_ASSERT(is_string_asan_correct(long_s1));

short_s1.clear();
long_s1.clear();

LIBCPP_ASSERT(is_string_asan_correct(short_s1));
LIBCPP_ASSERT(is_string_asan_correct(long_s1));

short_s1.reserve(0x1);
long_s1.reserve(0x1);

LIBCPP_ASSERT(is_string_asan_correct(short_s1));
LIBCPP_ASSERT(is_string_asan_correct(long_s1));

S short_s2(3, 'a'), long_s2(100, 'c');
short_s2.reserve(0x1);
long_s2.reserve(0x1);

LIBCPP_ASSERT(is_string_asan_correct(short_s2));
LIBCPP_ASSERT(is_string_asan_correct(long_s2));
}

int main(int, char**) {
test<std::string>();
#ifndef TEST_HAS_NO_WIDE_CHARACTERS
test<std::wstring>();
#endif
#if TEST_STD_VER >= 11
test<std::u16string>();
test<std::u32string>();
#endif
#if TEST_STD_VER >= 20
test<std::u8string>();
#endif

return 0;
}
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,7 @@

#include "test_macros.h"
#include "min_allocator.h"
#include "asan_testing.h"

template <class S>
TEST_CONSTEXPR_CXX20 void
Expand All @@ -28,6 +29,7 @@ test(typename S::size_type min_cap, typename S::size_type erased_index, typename
s.erase(erased_index);
assert(s.size() == erased_index);
assert(s.capacity() >= min_cap); // Check that we really have at least this capacity.
LIBCPP_ASSERT(is_string_asan_correct(s));

#if TEST_STD_VER > 17
typename S::size_type old_cap = s.capacity();
Expand All @@ -39,6 +41,7 @@ test(typename S::size_type min_cap, typename S::size_type erased_index, typename
assert(s == s0);
assert(s.capacity() >= res_arg);
assert(s.capacity() >= s.size());
LIBCPP_ASSERT(is_string_asan_correct(s));
#if TEST_STD_VER > 17
assert(s.capacity() >= old_cap); // reserve never shrinks as of P0966 (C++20)
#endif
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,7 @@

#include "make_string.h"
#include "test_macros.h"
#include "asan_testing.h"

template <class S>
constexpr void test_appending(std::size_t k, size_t N, size_t new_capacity) {
Expand All @@ -37,6 +38,7 @@ constexpr void test_appending(std::size_t k, size_t N, size_t new_capacity) {
const S expected = S(k, 'a') + S(N - k, 'b');
assert(s == expected);
assert(s.c_str()[N] == '\0');
LIBCPP_ASSERT(is_string_asan_correct(s));
}

template <class S>
Expand All @@ -55,6 +57,7 @@ constexpr void test_truncating(std::size_t o, size_t N) {
const S expected = S(N - 1, 'a') + S(1, 'b');
assert(s == expected);
assert(s.c_str()[N] == '\0');
LIBCPP_ASSERT(is_string_asan_correct(s));
}

template <class String>
Expand All @@ -76,11 +79,14 @@ constexpr bool test() {
void test_value_categories() {
std::string s;
s.resize_and_overwrite(10, [](char*&&, std::size_t&&) { return 0; });
LIBCPP_ASSERT(is_string_asan_correct(s));
s.resize_and_overwrite(10, [](char* const&, const std::size_t&) { return 0; });
LIBCPP_ASSERT(is_string_asan_correct(s));
struct RefQualified {
int operator()(char*, std::size_t) && { return 0; }
};
s.resize_and_overwrite(10, RefQualified{});
LIBCPP_ASSERT(is_string_asan_correct(s));
}

int main(int, char**) {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -16,13 +16,15 @@

#include "test_macros.h"
#include "min_allocator.h"
#include "asan_testing.h"

template <class S>
TEST_CONSTEXPR_CXX20 void test(S s, typename S::size_type n, S expected) {
if (n <= s.max_size()) {
s.resize(n);
LIBCPP_ASSERT(s.__invariants());
assert(s == expected);
LIBCPP_ASSERT(is_string_asan_correct(s));
}
#ifndef TEST_HAS_NO_EXCEPTIONS
else if (!TEST_IS_CONSTANT_EVALUATED) {
Expand Down Expand Up @@ -61,6 +63,7 @@ TEST_CONSTEXPR_CXX20 bool test() {
test_string<std::string>();
#if TEST_STD_VER >= 11
test_string<std::basic_string<char, std::char_traits<char>, min_allocator<char>>>();
test_string<std::basic_string<char, std::char_traits<char>, safe_allocator<char>>>();
#endif

return true;
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -16,13 +16,15 @@

#include "test_macros.h"
#include "min_allocator.h"
#include "asan_testing.h"

template <class S>
TEST_CONSTEXPR_CXX20 void test(S s, typename S::size_type n, typename S::value_type c, S expected) {
if (n <= s.max_size()) {
s.resize(n, c);
LIBCPP_ASSERT(s.__invariants());
assert(s == expected);
LIBCPP_ASSERT(is_string_asan_correct(s));
}
#ifndef TEST_HAS_NO_EXCEPTIONS
else if (!TEST_IS_CONSTANT_EVALUATED) {
Expand Down Expand Up @@ -57,12 +59,23 @@ TEST_CONSTEXPR_CXX20 void test_string() {
'a',
S("12345678901234567890123456789012345678901234567890aaaaaaaaaa"));
test(S(), S::npos, 'a', S("not going to happen"));
//ASan:
test(S(), 21, 'a', S("aaaaaaaaaaaaaaaaaaaaa"));
test(S(), 22, 'a', S("aaaaaaaaaaaaaaaaaaaaaa"));
test(S(), 23, 'a', S("aaaaaaaaaaaaaaaaaaaaaaa"));
test(S(), 24, 'a', S("aaaaaaaaaaaaaaaaaaaaaaaa"));
test(S(), 29, 'a', S("aaaaaaaaaaaaaaaaaaaaaaaaaaaaa"));
test(S(), 30, 'a', S("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"));
test(S(), 31, 'a', S("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"));
test(S(), 32, 'a', S("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"));
test(S(), 33, 'a', S("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"));
}

TEST_CONSTEXPR_CXX20 bool test() {
test_string<std::string>();
#if TEST_STD_VER >= 11
test_string<std::basic_string<char, std::char_traits<char>, min_allocator<char>>>();
test_string<std::basic_string<char, std::char_traits<char>, safe_allocator<char>>>();
#endif

return true;
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,7 @@

#include "test_macros.h"
#include "min_allocator.h"
#include "asan_testing.h"

template <class S>
TEST_CONSTEXPR_CXX20 void test(S s) {
Expand All @@ -25,6 +26,7 @@ TEST_CONSTEXPR_CXX20 void test(S s) {
assert(s == s0);
assert(s.capacity() <= old_cap);
assert(s.capacity() >= s.size());
LIBCPP_ASSERT(is_string_asan_correct(s));
}

template <class S>
Expand All @@ -43,12 +45,19 @@ TEST_CONSTEXPR_CXX20 void test_string() {
s.assign(100, 'a');
s.erase(50);
test(s);

s.assign(100, 'a');
for (int i = 0; i <= 9; ++i) {
s.erase(90 - 10 * i);
test(s);
}
}

TEST_CONSTEXPR_CXX20 bool test() {
test_string<std::string>();
#if TEST_STD_VER >= 11
test_string<std::basic_string<char, std::char_traits<char>, min_allocator<char>>>();
test_string<std::basic_string<char, std::char_traits<char>, safe_allocator<char>>>();
#endif

return true;
Expand Down
Loading

0 comments on commit e7db482

Please sign in to comment.