Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make two texts static in ReplayInlineAdvisor #79489

Merged
merged 1 commit into from
Jan 26, 2024

Conversation

AdvenamTacet
Copy link
Member

@AdvenamTacet AdvenamTacet commented Jan 25, 2024

Edit after merge: reason of failure below may be incorrect, based on #79522.


This commit makes two variables static.
That makes two buildbot tests pass with short string annotations.
I suspect that there may be use after end of life bug and it's fixed by this change, but it requires confirmation.

Short string annotations PR (reverted):

Tests fixed with this PR:

  LLVM :: Transforms/Inline/cgscc-inline-replay.ll 
  LLVM :: Transforms/SampleProfile/inline-replay.ll

Buildbot output: https://lab.llvm.org/buildbot/#/builders/5/builds/40364/steps/9/logs/stdio

This PR does not resolve a problem with Clang :: SemaCXX/builtins.cpp, related PR is:

This commit makes two variables static.
That makes two buildbot tests pass with short string annotations.

Short string annotations PR (reverted):
- llvm#79049

Tests fixed with this PR:
``
  LLVM :: Transforms/Inline/cgscc-inline-replay.ll
  LLVM :: Transforms/SampleProfile/inline-replay.ll
```
Buildbot output: https://lab.llvm.org/buildbot/#/builders/5/builds/40364/steps/9/logs/stdio

This PR does not resolve a problem with `Clang :: SemaCXX/builtins.cpp`.

I suspect that there may be use after end of life bug and it's fixed by this change.
@llvmbot
Copy link
Collaborator

llvmbot commented Jan 25, 2024

@llvm/pr-subscribers-llvm-analysis

Author: Tacet (AdvenamTacet)

Changes

This commit makes two variables static.
That makes two buildbot tests pass with short string annotations.
I suspect that there may be use after end of life bug and it's fixed by this change, but it requires confirmation.

Short string annotations PR (reverted):

Tests fixed with this PR:

  LLVM :: Transforms/Inline/cgscc-inline-replay.ll 
  LLVM :: Transforms/SampleProfile/inline-replay.ll

Buildbot output: https://lab.llvm.org/buildbot/#/builders/5/builds/40364/steps/9/logs/stdio

This PR does not resolve a problem with Clang :: SemaCXX/builtins.cpp.


Full diff: https://github.com/llvm/llvm-project/pull/79489.diff

1 Files Affected:

  • (modified) llvm/lib/Analysis/ReplayInlineAdvisor.cpp (+2-2)
diff --git a/llvm/lib/Analysis/ReplayInlineAdvisor.cpp b/llvm/lib/Analysis/ReplayInlineAdvisor.cpp
index 2ca02eb1741712b..0814483db343cee 100644
--- a/llvm/lib/Analysis/ReplayInlineAdvisor.cpp
+++ b/llvm/lib/Analysis/ReplayInlineAdvisor.cpp
@@ -43,8 +43,8 @@ ReplayInlineAdvisor::ReplayInlineAdvisor(
   //   main:3:1.1;
   // We use the callsite string after `at callsite` to replay inlining.
   line_iterator LineIt(*BufferOrErr.get(), /*SkipBlanks=*/true);
-  const std::string PositiveRemark = "' inlined into '";
-  const std::string NegativeRemark = "' will not be inlined into '";
+  static const std::string PositiveRemark = "' inlined into '";
+  static const std::string NegativeRemark = "' will not be inlined into '";
 
   for (; !LineIt.is_at_eof(); ++LineIt) {
     StringRef Line = *LineIt;

@modiking
Copy link
Contributor

LGTM, thanks!

AdvenamTacet pushed a commit to trail-of-forks/llvm-project that referenced this pull request Jan 25, 2024
This commit makes two variables static extending their life span.
This patch is designed to address the issue of buildbots failing when AddressSanitizer's (ASan) short string annotations are enabled.
It's esentially same as:
- llvm#79489
however, it's less likely to solve the real problem as those strings change (aren't `const`).
I suspect that there may be use after end of life bug (in StringRef), but it requires confirmation.
In that case, one alternative solution, which unfortunately results in memory leaks, is to always allocate new strings instead of overwriting existing (static) ones. This approach would prevent potential data corruption, but I don't suggest it in this PR.

This patch makes `Clang :: SemaCXX/builtins.cpp` test pass with short string annotations (ASan).
With llvm#79489 it fixes known problems with buildbots, while running with short string annotations.
However, the potential issue still requires more investigation therefore FIXME comment is added in that patch.

Short string annotations PR (reverted):
- llvm#79049

Buildbots (failure) output:
- https://lab.llvm.org/buildbot/#/builders/5/builds/40364/steps/9/logs/stdio

While buildbots should not fail with proposed changes, we still should investigate why buildbots were failing with ASan short string annotations turned on.
StringRef objects (made from those strings) can potentially change their contents unexpectedly or even (potentially) use of freed memory may happen.
That interpretation is only my educated guess, I still didn't understand exactly why those buildbots are failing.
@AdvenamTacet AdvenamTacet merged commit 67f9c35 into llvm:main Jan 26, 2024
5 checks passed
@AdvenamTacet AdvenamTacet deleted the fix-ReplayInlineAdvisor branch January 26, 2024 01:04
AdvenamTacet pushed a commit to trail-of-forks/llvm-project that referenced this pull request Jan 26, 2024
This is 3rd attempt to upstream short string annotations, it's the same as the previous one, but other PRs fixed issues withing LLVM:
- llvm#79489
- llvm#79522

Additionaly annotations were updated (but it shouldn't have any impact on anything):
- llvm#79292

Now, as far as I know, all buildbots should work without problems. Both previous reverts were not related to issues with string annotations, but with issues in LLVM/clang.
Read PRs above and below for details.

---

Previous description:

Originally merged here: llvm#75882
Reverted here: llvm#78627

Reverted due to failing buildbots. The problem was not caused by the annotations code, but by code in the `UniqueFunctionBase` class and in the `JSON.h` file. That code caused the program to write to memory that was already being used by string objects, which resulted in an ASan error.

Fixes are implemented in:
- llvm#79065
- llvm#79066

Problematic code from `UniqueFunctionBase` for example:
```cpp
    // In debug builds, we also scribble across the rest of the storage.
    memset(RHS.getInlineStorage(), 0xAD, InlineStorageSize);
```

---

Original description:

This commit turns on ASan annotations in `std::basic_string` for short stings (SSO case).

Originally suggested here: https://reviews.llvm.org/D147680

String annotations added here: llvm#72677

Requires to pass CI without fails:
- llvm#75845
- llvm#75858

Annotating `std::basic_string` with default allocator is implemented in llvm#72677 but annotations for short strings (SSO - Short String Optimization) are turned off there. This commit turns them on. This also removes `_LIBCPP_SHORT_STRING_ANNOTATIONS_ALLOWED`, because we do not plan to support turning on and off short string annotations.

Support in ASan API exists since llvm@dd1b7b7. You can turn off annotations for a specific allocator based on changes from llvm@2fa1bec.

This PR is a part of a series of patches extending AddressSanitizer C++ container overflow detection capabilities by adding annotations, similar to those existing in `std::vector` and `std::deque` collections. These enhancements empower ASan to effectively detect instances where the instrumented program attempts to access memory within a collection's internal allocation that remains unused. This includes cases where access occurs before or after the stored elements in `std::deque`, or between the `std::basic_string`'s size (including the null terminator) and capacity bounds.

The introduction of these annotations was spurred by a real-world software bug discovered by Trail of Bits, involving an out-of-bounds memory access during the comparison of two strings using the `std::equals` function. This function was taking iterators (`iter1_begin`, `iter1_end`, `iter2_begin`) to perform the comparison, using a custom comparison function. When the `iter1` object exceeded the length of `iter2`, an out-of-bounds read could occur on the `iter2` object. Container sanitization, upon enabling these annotations, would effectively identify and flag this potential vulnerability.

If you have any questions, please email:

    advenam.tacet@trailofbits.com
    disconnect3d@trailofbits.com
AdvenamTacet pushed a commit that referenced this pull request Jan 26, 2024
This commit makes two variables static extending their life span.
This patch is designed to address the issue of buildbots failing when AddressSanitizer's (ASan) short string annotations are enabled.
It's esentially same as:
- #79489
however, it's less likely to solve the real problem as those strings change (aren't `const`).
I suspect that there may be use after end of life bug (in StringRef), but it requires confirmation.
In that case, one alternative solution, which unfortunately results in memory leaks, is to always allocate new strings instead of overwriting existing (static) ones. This approach would prevent potential data corruption, but I don't suggest it in this PR.

This patch makes `Clang :: SemaCXX/builtins.cpp` test pass with short string annotations (ASan).
With #79489 it fixes known problems with buildbots, while running with short string annotations.
However, the potential issue still requires more investigation therefore FIXME comment is added in that patch.

Short string annotations PR (reverted):
- #79049

Buildbots (failure) output:
- https://lab.llvm.org/buildbot/#/builders/5/builds/40364/steps/9/logs/stdio

While buildbots should not fail with proposed changes, we still should investigate why buildbots were failing with ASan short string annotations turned on.
StringRef objects (made from those strings) can potentially change their contents unexpectedly or even (potentially) use of freed memory may happen.
That interpretation is only my educated guess, I still didn't understand exactly why those buildbots are failing.
@@ -43,8 +43,8 @@ ReplayInlineAdvisor::ReplayInlineAdvisor(
// main:3:1.1;
// We use the callsite string after `at callsite` to replay inlining.
line_iterator LineIt(*BufferOrErr.get(), /*SkipBlanks=*/true);
const std::string PositiveRemark = "' inlined into '";
const std::string NegativeRemark = "' will not be inlined into '";
static const std::string PositiveRemark = "' inlined into '";
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

StringRef PositiveRemark?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the question.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use string ref here

StringRef PositiveRemark = "' inlined into '";

However I don't understand why you can report in code as is

vitalybuka added a commit that referenced this pull request Feb 16, 2024
vitalybuka added a commit that referenced this pull request Feb 22, 2024
vitalybuka added a commit that referenced this pull request Apr 1, 2024
Reverts #79489

We know that the issues was with asan/annotations. We can revert it.
AdvenamTacet pushed a commit to trail-of-forks/llvm-project that referenced this pull request May 5, 2024
This is 3rd attempt to upstream short string annotations, it's the same as the previous one, but other PRs fixed issues withing LLVM:
- llvm#79489
- llvm#79522

Additionaly annotations were updated (but it shouldn't have any impact on anything):
- llvm#79292

Now, as far as I know, all buildbots should work without problems. Both previous reverts were not related to issues with string annotations, but with issues in LLVM/clang.
Read PRs above and below for details.

---

Previous description:

Originally merged here: llvm#75882
Reverted here: llvm#78627

Reverted due to failing buildbots. The problem was not caused by the annotations code, but by code in the `UniqueFunctionBase` class and in the `JSON.h` file. That code caused the program to write to memory that was already being used by string objects, which resulted in an ASan error.

Fixes are implemented in:
- llvm#79065
- llvm#79066

Problematic code from `UniqueFunctionBase` for example:
```cpp
    // In debug builds, we also scribble across the rest of the storage.
    memset(RHS.getInlineStorage(), 0xAD, InlineStorageSize);
```

---

Original description:

This commit turns on ASan annotations in `std::basic_string` for short stings (SSO case).

Originally suggested here: https://reviews.llvm.org/D147680

String annotations added here: llvm#72677

Requires to pass CI without fails:
- llvm#75845
- llvm#75858

Annotating `std::basic_string` with default allocator is implemented in llvm#72677 but annotations for short strings (SSO - Short String Optimization) are turned off there. This commit turns them on. This also removes `_LIBCPP_SHORT_STRING_ANNOTATIONS_ALLOWED`, because we do not plan to support turning on and off short string annotations.

Support in ASan API exists since llvm@dd1b7b7. You can turn off annotations for a specific allocator based on changes from llvm@2fa1bec.

This PR is a part of a series of patches extending AddressSanitizer C++ container overflow detection capabilities by adding annotations, similar to those existing in `std::vector` and `std::deque` collections. These enhancements empower ASan to effectively detect instances where the instrumented program attempts to access memory within a collection's internal allocation that remains unused. This includes cases where access occurs before or after the stored elements in `std::deque`, or between the `std::basic_string`'s size (including the null terminator) and capacity bounds.

The introduction of these annotations was spurred by a real-world software bug discovered by Trail of Bits, involving an out-of-bounds memory access during the comparison of two strings using the `std::equals` function. This function was taking iterators (`iter1_begin`, `iter1_end`, `iter2_begin`) to perform the comparison, using a custom comparison function. When the `iter1` object exceeded the length of `iter2`, an out-of-bounds read could occur on the `iter2` object. Container sanitization, upon enabling these annotations, would effectively identify and flag this potential vulnerability.

If you have any questions, please email:

    advenam.tacet@trailofbits.com
    disconnect3d@trailofbits.com
AdvenamTacet pushed a commit to trail-of-forks/llvm-project that referenced this pull request May 5, 2024
This is 3rd attempt to upstream short string annotations, it's the same as the previous one, but other PRs fixed issues withing LLVM:
- llvm#79489
- llvm#79522

Additionaly annotations were updated (but it shouldn't have any impact on anything):
- llvm#79292

Now, as far as I know, all buildbots should work without problems. Both previous reverts were not related to issues with string annotations, but with issues in LLVM/clang.
Read PRs above and below for details.

---

Previous description:

Originally merged here: llvm#75882
Reverted here: llvm#78627

Reverted due to failing buildbots. The problem was not caused by the annotations code, but by code in the `UniqueFunctionBase` class and in the `JSON.h` file. That code caused the program to write to memory that was already being used by string objects, which resulted in an ASan error.

Fixes are implemented in:
- llvm#79065
- llvm#79066

Problematic code from `UniqueFunctionBase` for example:
```cpp
    // In debug builds, we also scribble across the rest of the storage.
    memset(RHS.getInlineStorage(), 0xAD, InlineStorageSize);
```

---

Original description:

This commit turns on ASan annotations in `std::basic_string` for short stings (SSO case).

Originally suggested here: https://reviews.llvm.org/D147680

String annotations added here: llvm#72677

Requires to pass CI without fails:
- llvm#75845
- llvm#75858

Annotating `std::basic_string` with default allocator is implemented in llvm#72677 but annotations for short strings (SSO - Short String Optimization) are turned off there. This commit turns them on. This also removes `_LIBCPP_SHORT_STRING_ANNOTATIONS_ALLOWED`, because we do not plan to support turning on and off short string annotations.

Support in ASan API exists since llvm@dd1b7b7. You can turn off annotations for a specific allocator based on changes from llvm@2fa1bec.

This PR is a part of a series of patches extending AddressSanitizer C++ container overflow detection capabilities by adding annotations, similar to those existing in `std::vector` and `std::deque` collections. These enhancements empower ASan to effectively detect instances where the instrumented program attempts to access memory within a collection's internal allocation that remains unused. This includes cases where access occurs before or after the stored elements in `std::deque`, or between the `std::basic_string`'s size (including the null terminator) and capacity bounds.

The introduction of these annotations was spurred by a real-world software bug discovered by Trail of Bits, involving an out-of-bounds memory access during the comparison of two strings using the `std::equals` function. This function was taking iterators (`iter1_begin`, `iter1_end`, `iter2_begin`) to perform the comparison, using a custom comparison function. When the `iter1` object exceeded the length of `iter2`, an out-of-bounds read could occur on the `iter2` object. Container sanitization, upon enabling these annotations, would effectively identify and flag this potential vulnerability.

If you have any questions, please email:

    advenam.tacet@trailofbits.com
    disconnect3d@trailofbits.com
AdvenamTacet pushed a commit to trail-of-forks/llvm-project that referenced this pull request May 5, 2024
This is 3rd attempt to upstream short string annotations, it's the same as the previous one, but other PRs fixed issues withing LLVM:
- llvm#79489
- llvm#79522

Additionaly annotations were updated (but it shouldn't have any impact on anything):
- llvm#79292

Now, as far as I know, all buildbots should work without problems. Both previous reverts were not related to issues with string annotations, but with issues in LLVM/clang.
Read PRs above and below for details.

---

Previous description:

Originally merged here: llvm#75882
Reverted here: llvm#78627

Reverted due to failing buildbots. The problem was not caused by the annotations code, but by code in the `UniqueFunctionBase` class and in the `JSON.h` file. That code caused the program to write to memory that was already being used by string objects, which resulted in an ASan error.

Fixes are implemented in:
- llvm#79065
- llvm#79066

Problematic code from `UniqueFunctionBase` for example:
```cpp
    // In debug builds, we also scribble across the rest of the storage.
    memset(RHS.getInlineStorage(), 0xAD, InlineStorageSize);
```

---

Original description:

This commit turns on ASan annotations in `std::basic_string` for short stings (SSO case).

Originally suggested here: https://reviews.llvm.org/D147680

String annotations added here: llvm#72677

Requires to pass CI without fails:
- llvm#75845
- llvm#75858

Annotating `std::basic_string` with default allocator is implemented in llvm#72677 but annotations for short strings (SSO - Short String Optimization) are turned off there. This commit turns them on. This also removes `_LIBCPP_SHORT_STRING_ANNOTATIONS_ALLOWED`, because we do not plan to support turning on and off short string annotations.

Support in ASan API exists since llvm@dd1b7b7. You can turn off annotations for a specific allocator based on changes from llvm@2fa1bec.

This PR is a part of a series of patches extending AddressSanitizer C++ container overflow detection capabilities by adding annotations, similar to those existing in `std::vector` and `std::deque` collections. These enhancements empower ASan to effectively detect instances where the instrumented program attempts to access memory within a collection's internal allocation that remains unused. This includes cases where access occurs before or after the stored elements in `std::deque`, or between the `std::basic_string`'s size (including the null terminator) and capacity bounds.

The introduction of these annotations was spurred by a real-world software bug discovered by Trail of Bits, involving an out-of-bounds memory access during the comparison of two strings using the `std::equals` function. This function was taking iterators (`iter1_begin`, `iter1_end`, `iter2_begin`) to perform the comparison, using a custom comparison function. When the `iter1` object exceeded the length of `iter2`, an out-of-bounds read could occur on the `iter2` object. Container sanitization, upon enabling these annotations, would effectively identify and flag this potential vulnerability.

If you have any questions, please email:

    advenam.tacet@trailofbits.com
    disconnect3d@trailofbits.com
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants