Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[clang-tidy] Improve sizeof(pointer) handling in bugprone-sizeof-expression #94356

Merged
merged 7 commits into from
Jun 11, 2024

Conversation

NagyDonat
Copy link
Contributor

@NagyDonat NagyDonat commented Jun 4, 2024

This commit reimplements the functionality of the Clang Static Analyzer checker alpha.core.SizeofPointer within clang-tidy by adding a new (off-by-default) option to bugprone-sizeof-expression which activates reporting all the sizeof(ptr) expressions (where ptr is an expression that produces a pointer).

The main motivation for this change is that alpha.core.SizeofPointer was an AST-based checker, which did not rely on the path sensitive capabilities of the Static Analyzer, so there was no reason to keep it in the Static Analyzer instead of the more lightweight clang-tidy.

After this commit I'm planning to create a separate commit that deletes alpha.core.SizeofPointer from Clang Static Analyzer.

It was natural to place this moved logic in bugprone-sizeof-expression, because that check already provided several heuristics that reported various especially suspicious classes of sizeof(ptr) expressions.

The new mode WarnOnSizeOfPointer is off-by-default, so it won't surprise the existing users; but it can provide a more through coverage for the vulnerability CWE-467 ("Use of sizeof() on a Pointer Type") than the existing partial heuristics.

Previously this checker had an exception that the RHS of a sizeof(array) / sizeof(array[0]) expression is not reported; I generalized this to an exception that the check doesn't report sizeof(expr[0]) and sizeof(*expr). This idea is taken from the Static Analyzer checker alpha.core.SizeofPointer (which had an exception for *expr), but analysis of open source projects confirmed that this indeed eliminates lots of unwanted results.

Note that the suppression of sizeof(expr[0]) and sizeof(*expr) reports also affects the "old" mode WarnOnSizeOfPointerToAggregate which is enabled by default.

This commit also replaces the old message "suspicious usage of 'sizeof(A*)'; pointer to aggregate" with two more concrete messages; but I feel that this tidy check would deserve a through cleanup of all the diagnostic messages that it can produce. (I added a FIXME to mark one outright misleading message.)

This commit reimplements the functionality of the Clang Static Analyzer
checker `alpha.core.SizeofPointer` within clang-tidy by adding a new
(off-by-default) option to bugprone-sizeof-expression which activates
reporting all the `sizeof(ptr)` expressions (where ptr is an expression
that produces a pointer).

The main motivation for this change is that `alpha.core.SizeofPointer`
was an AST-based checker, which did not rely on the path sensitive
capabilities of the Static Analyzer, so there was no reason to keep it
in the Static Analyzer instead of the more lightweight clang-tidy.

After this commit I'm planning to create a separate commit that deletes
`alpha.core.SizeofPointer` from Clang Static Analyzer.

It was natural to place this moved logic in bugprone-sizeof-expression,
because that check already provided several heuristics that reported
various especially suspicious classes of `sizeof(ptr)` expressions.

The new mode `WarnOnSizeOfPointer` is off-by-default, so it won't
surprise the existing users; but it can provide a more through coverage
for the vulnerability CWE-467 ("Use of sizeof() on a Pointer Type") than
the existing partial heuristics.

I preserved the exception that the RHS of an expression that looks like
`sizeof(array) / sizeof(array[0])` is not reported; and I added another
exception which ensures that `sizeof(*pp)` is not reported when `pp` is
a pointer-to-pointer expression.

This second exception (which I also apply in the "old" on-by-default mode
`WarnOnSizeOfPointerToAggregate`) was present in the CSA checker
`alpha.core.SizeofPoionter` and I decided to copy it because it helped
to avoid several false positives on open-source code.

This commit also replaces the old message "suspicious usage of
'sizeof(A*)'; pointer to aggregate" with two more concrete messages; but
I feel that this tidy check would deserve a through cleanup of all the
diagnostic messages that it can produce. (I added a FIXME to mark one
outright misleading message.)
@llvmbot
Copy link
Member

llvmbot commented Jun 4, 2024

@llvm/pr-subscribers-clang-tidy

@llvm/pr-subscribers-clang-tools-extra

Author: Donát Nagy (NagyDonat)

Changes

This commit reimplements the functionality of the Clang Static Analyzer checker alpha.core.SizeofPointer within clang-tidy by adding a new (off-by-default) option to bugprone-sizeof-expression which activates reporting all the sizeof(ptr) expressions (where ptr is an expression that produces a pointer).

The main motivation for this change is that alpha.core.SizeofPointer was an AST-based checker, which did not rely on the path sensitive capabilities of the Static Analyzer, so there was no reason to keep it in the Static Analyzer instead of the more lightweight clang-tidy.

After this commit I'm planning to create a separate commit that deletes alpha.core.SizeofPointer from Clang Static Analyzer.

It was natural to place this moved logic in bugprone-sizeof-expression, because that check already provided several heuristics that reported various especially suspicious classes of sizeof(ptr) expressions.

The new mode WarnOnSizeOfPointer is off-by-default, so it won't surprise the existing users; but it can provide a more through coverage for the vulnerability CWE-467 ("Use of sizeof() on a Pointer Type") than the existing partial heuristics.

I preserved the exception that the RHS of an expression that looks like sizeof(array) / sizeof(array[0]) is not reported; and I added another exception which ensures that sizeof(*pp) is not reported when pp is a pointer-to-pointer expression.

This second exception (which I also apply in the "old" on-by-default mode WarnOnSizeOfPointerToAggregate) was present in the CSA checker alpha.core.SizeofPoionter and I decided to copy it because it helped to avoid several false positives on open-source code.

This commit also replaces the old message "suspicious usage of 'sizeof(A*)'; pointer to aggregate" with two more concrete messages; but I feel that this tidy check would deserve a through cleanup of all the diagnostic messages that it can produce. (I added a FIXME to mark one outright misleading message.)


Patch is 30.99 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/94356.diff

6 Files Affected:

  • (modified) clang-tools-extra/clang-tidy/bugprone/SizeofExpressionCheck.cpp (+66-29)
  • (modified) clang-tools-extra/clang-tidy/bugprone/SizeofExpressionCheck.h (+1)
  • (modified) clang-tools-extra/docs/clang-tidy/checks/bugprone/sizeof-expression.rst (+5)
  • (modified) clang-tools-extra/test/clang-tidy/checkers/bugprone/sizeof-expression-2.c (+6-6)
  • (added) clang-tools-extra/test/clang-tidy/checkers/bugprone/sizeof-expression-any-pointer.cpp (+231)
  • (modified) clang-tools-extra/test/clang-tidy/checkers/bugprone/sizeof-expression.cpp (+42-21)
diff --git a/clang-tools-extra/clang-tidy/bugprone/SizeofExpressionCheck.cpp b/clang-tools-extra/clang-tidy/bugprone/SizeofExpressionCheck.cpp
index 5e64d23874ec1..05ef666d4f01e 100644
--- a/clang-tools-extra/clang-tidy/bugprone/SizeofExpressionCheck.cpp
+++ b/clang-tools-extra/clang-tidy/bugprone/SizeofExpressionCheck.cpp
@@ -67,7 +67,8 @@ SizeofExpressionCheck::SizeofExpressionCheck(StringRef Name,
       WarnOnSizeOfCompareToConstant(
           Options.get("WarnOnSizeOfCompareToConstant", true)),
       WarnOnSizeOfPointerToAggregate(
-          Options.get("WarnOnSizeOfPointerToAggregate", true)) {}
+          Options.get("WarnOnSizeOfPointerToAggregate", true)),
+      WarnOnSizeOfPointer(Options.get("WarnOnSizeOfPointer", false)) {}
 
 void SizeofExpressionCheck::storeOptions(ClangTidyOptions::OptionMap &Opts) {
   Options.store(Opts, "WarnOnSizeOfConstant", WarnOnSizeOfConstant);
@@ -78,6 +79,7 @@ void SizeofExpressionCheck::storeOptions(ClangTidyOptions::OptionMap &Opts) {
                 WarnOnSizeOfCompareToConstant);
   Options.store(Opts, "WarnOnSizeOfPointerToAggregate",
                 WarnOnSizeOfPointerToAggregate);
+  Options.store(Opts, "WarnOnSizeOfPointer", WarnOnSizeOfPointer);
 }
 
 void SizeofExpressionCheck::registerMatchers(MatchFinder *Finder) {
@@ -127,17 +129,30 @@ void SizeofExpressionCheck::registerMatchers(MatchFinder *Finder) {
   const auto ConstStrLiteralDecl =
       varDecl(isDefinition(), hasType(hasCanonicalType(CharPtrType)),
               hasInitializer(ignoringParenImpCasts(stringLiteral())));
+  const auto VarWithConstStrLiteralDecl = expr(
+      hasType(hasCanonicalType(CharPtrType)),
+      ignoringParenImpCasts(declRefExpr(hasDeclaration(ConstStrLiteralDecl))));
   Finder->addMatcher(
-      sizeOfExpr(has(ignoringParenImpCasts(
-                     expr(hasType(hasCanonicalType(CharPtrType)),
-                          ignoringParenImpCasts(declRefExpr(
-                              hasDeclaration(ConstStrLiteralDecl)))))))
+      sizeOfExpr(has(ignoringParenImpCasts(VarWithConstStrLiteralDecl)))
           .bind("sizeof-charp"),
       this);
 
-  // Detect sizeof(ptr) where ptr points to an aggregate (i.e. sizeof(&S)).
-  // Do not find it if RHS of a 'sizeof(arr) / sizeof(arr[0])' expression.
-  if (WarnOnSizeOfPointerToAggregate) {
+  // Detect sizeof(ptr) where ptr is a pointer (CWE-467).
+  //
+  // In WarnOnSizeOfPointerToAggregate mode only report cases when ptr points
+  // to an aggregate type or ptr is an expression that (implicitly or
+  // explicitly) casts an array to a pointer type. (These are more suspicious
+  // than other sizeof(ptr) expressions because they can appear as distorted
+  // forms of the common sizeof(aggregate) expressions.)
+  //
+  // To avoid false positives, some idiomatic constructs are accepted:
+  //  + the RHS of a 'sizeof(arr) / sizeof(arr[0])' expression;
+  //  + 'sizeof(*pp)' where 'pp' a pointer-to-pointer value, because this is
+  //    a natural solution when dynamical typing is emulated by passing
+  //    arguments as `generic_function(..., (void *)pp, sizeof(*pp))`.
+  // Moreover this generic message is suppressed in cases that are also matched
+  // by the more concrete matchers 'sizeof-this' and 'sizeof-charp'.
+  if (WarnOnSizeOfPointerToAggregate || WarnOnSizeOfPointer) {
     const auto ArrayExpr =
         ignoringParenImpCasts(hasType(hasCanonicalType(arrayType())));
     const auto ArrayCastExpr = expr(anyOf(
@@ -149,8 +164,16 @@ void SizeofExpressionCheck::registerMatchers(MatchFinder *Finder) {
 
     const auto PointerToStructType =
         hasUnqualifiedDesugaredType(pointerType(pointee(recordType())));
-    const auto PointerToStructExpr = expr(
-        hasType(hasCanonicalType(PointerToStructType)), unless(cxxThisExpr()));
+    const auto PointerToStructTypeWithBinding =
+        type(PointerToStructType).bind("struct-type");
+    const auto PointerToStructExpr =
+        expr(hasType(hasCanonicalType(PointerToStructType)));
+
+    const auto PointerToDetectedExpr =
+        WarnOnSizeOfPointer
+            ? expr(hasType(hasUnqualifiedDesugaredType(pointerType())))
+            : expr(anyOf(ArrayCastExpr, PointerToArrayExpr,
+                         PointerToStructExpr));
 
     const auto ArrayOfPointersExpr = ignoringParenImpCasts(
         hasType(hasCanonicalType(arrayType(hasElementType(pointerType()))
@@ -167,14 +190,17 @@ void SizeofExpressionCheck::registerMatchers(MatchFinder *Finder) {
                                       hasLHS(ignoringParenImpCasts(sizeOfExpr(
                                           has(ArrayOfPointersExpr)))))),
              sizeOfExpr(has(ArrayOfSamePointersZeroSubscriptExpr)));
+    const auto DerefExpr =
+        ignoringParenImpCasts(unaryOperator(hasOperatorName("*")));
 
     Finder->addMatcher(
-        expr(sizeOfExpr(anyOf(
-                 has(ignoringParenImpCasts(anyOf(
-                     ArrayCastExpr, PointerToArrayExpr, PointerToStructExpr))),
-                 has(PointerToStructType))),
+        expr(sizeOfExpr(anyOf(has(ignoringParenImpCasts(
+                                  expr(PointerToDetectedExpr, unless(DerefExpr),
+                                       unless(VarWithConstStrLiteralDecl),
+                                       unless(cxxThisExpr())))),
+                              has(PointerToStructTypeWithBinding))),
              unless(ArrayLengthExprDenom))
-            .bind("sizeof-pointer-to-aggregate"),
+            .bind("sizeof-pointer"),
         this);
   }
 
@@ -292,11 +318,17 @@ void SizeofExpressionCheck::check(const MatchFinder::MatchResult &Result) {
     diag(E->getBeginLoc(),
          "suspicious usage of 'sizeof(char*)'; do you mean 'strlen'?")
         << E->getSourceRange();
-  } else if (const auto *E =
-                 Result.Nodes.getNodeAs<Expr>("sizeof-pointer-to-aggregate")) {
-    diag(E->getBeginLoc(),
-         "suspicious usage of 'sizeof(A*)'; pointer to aggregate")
-        << E->getSourceRange();
+  } else if (const auto *E = Result.Nodes.getNodeAs<Expr>("sizeof-pointer")) {
+    if (Result.Nodes.getNodeAs<Type>("struct-type")) {
+      diag(E->getBeginLoc(),
+           "suspicious usage of 'sizeof(A*)' on pointer-to-aggregate type; did "
+           "you mean 'sizeof(A)'?")
+          << E->getSourceRange();
+    } else {
+      diag(E->getBeginLoc(), "suspicious usage of 'sizeof()' on an expression "
+                             "that results in a pointer")
+          << E->getSourceRange();
+    }
   } else if (const auto *E = Result.Nodes.getNodeAs<BinaryOperator>(
                  "sizeof-compare-constant")) {
     diag(E->getOperatorLoc(),
@@ -332,18 +364,23 @@ void SizeofExpressionCheck::check(const MatchFinder::MatchResult &Result) {
                                 " numerator is not a multiple of denominator")
           << E->getLHS()->getSourceRange() << E->getRHS()->getSourceRange();
     } else if (NumTy && DenomTy && NumTy == DenomTy) {
+      // FIXME: This message is wrong, it should not refer to sizeof "pointer"
+      // usage (and by the way, it would be to clarify all the messages).
       diag(E->getOperatorLoc(),
            "suspicious usage of sizeof pointer 'sizeof(T)/sizeof(T)'")
           << E->getLHS()->getSourceRange() << E->getRHS()->getSourceRange();
-    } else if (PointedTy && DenomTy && PointedTy == DenomTy) {
-      diag(E->getOperatorLoc(),
-           "suspicious usage of sizeof pointer 'sizeof(T*)/sizeof(T)'")
-          << E->getLHS()->getSourceRange() << E->getRHS()->getSourceRange();
-    } else if (NumTy && DenomTy && NumTy->isPointerType() &&
-               DenomTy->isPointerType()) {
-      diag(E->getOperatorLoc(),
-           "suspicious usage of sizeof pointer 'sizeof(P*)/sizeof(Q*)'")
-          << E->getLHS()->getSourceRange() << E->getRHS()->getSourceRange();
+    } else if (!WarnOnSizeOfPointer) {
+      // When 'WarnOnSizeOfPointer' is enabled, these messages become redundant:
+      if (PointedTy && DenomTy && PointedTy == DenomTy) {
+        diag(E->getOperatorLoc(),
+             "suspicious usage of sizeof pointer 'sizeof(T*)/sizeof(T)'")
+            << E->getLHS()->getSourceRange() << E->getRHS()->getSourceRange();
+      } else if (NumTy && DenomTy && NumTy->isPointerType() &&
+                 DenomTy->isPointerType()) {
+        diag(E->getOperatorLoc(),
+             "suspicious usage of sizeof pointer 'sizeof(P*)/sizeof(Q*)'")
+            << E->getLHS()->getSourceRange() << E->getRHS()->getSourceRange();
+      }
     }
   } else if (const auto *E =
                  Result.Nodes.getNodeAs<Expr>("sizeof-sizeof-expr")) {
diff --git a/clang-tools-extra/clang-tidy/bugprone/SizeofExpressionCheck.h b/clang-tools-extra/clang-tidy/bugprone/SizeofExpressionCheck.h
index 55becdd4ecdba..9ca17bc9e6f12 100644
--- a/clang-tools-extra/clang-tidy/bugprone/SizeofExpressionCheck.h
+++ b/clang-tools-extra/clang-tidy/bugprone/SizeofExpressionCheck.h
@@ -30,6 +30,7 @@ class SizeofExpressionCheck : public ClangTidyCheck {
   const bool WarnOnSizeOfThis;
   const bool WarnOnSizeOfCompareToConstant;
   const bool WarnOnSizeOfPointerToAggregate;
+  const bool WarnOnSizeOfPointer;
 };
 
 } // namespace clang::tidy::bugprone
diff --git a/clang-tools-extra/docs/clang-tidy/checks/bugprone/sizeof-expression.rst b/clang-tools-extra/docs/clang-tidy/checks/bugprone/sizeof-expression.rst
index c37df1706eb4e..8ddee3254c090 100644
--- a/clang-tools-extra/docs/clang-tidy/checks/bugprone/sizeof-expression.rst
+++ b/clang-tools-extra/docs/clang-tidy/checks/bugprone/sizeof-expression.rst
@@ -193,3 +193,8 @@ Options
    When `true`, the check will warn on an expression like
    ``sizeof(expr)`` where the expression is a pointer
    to aggregate. Default is `true`.
+
+.. option:: WarnOnSizeOfPointer
+
+   When `true`, the check will warn on an expression like ``sizeof(expr)``
+   where the expression is a pointer. Default is `false`.
diff --git a/clang-tools-extra/test/clang-tidy/checkers/bugprone/sizeof-expression-2.c b/clang-tools-extra/test/clang-tidy/checkers/bugprone/sizeof-expression-2.c
index 8c4feb8f86169..aef930f2c8fda 100644
--- a/clang-tools-extra/test/clang-tidy/checkers/bugprone/sizeof-expression-2.c
+++ b/clang-tools-extra/test/clang-tidy/checkers/bugprone/sizeof-expression-2.c
@@ -34,24 +34,24 @@ int Test5() {
 
   int sum = 0;
   sum += sizeof(&S);
-  // CHECK-MESSAGES: :[[@LINE-1]]:10: warning: suspicious usage of 'sizeof(A*)'; pointer to aggregate
+  // CHECK-MESSAGES: :[[@LINE-1]]:10: warning: suspicious usage of 'sizeof()' on an expression that results in a pointer
   sum += sizeof(__typeof(&S));
   sum += sizeof(&TS);
-  // CHECK-MESSAGES: :[[@LINE-1]]:10: warning: suspicious usage of 'sizeof(A*)'; pointer to aggregate
+  // CHECK-MESSAGES: :[[@LINE-1]]:10: warning: suspicious usage of 'sizeof()' on an expression that results in a pointer
   sum += sizeof(__typeof(&TS));
   sum += sizeof(STRKWD MyStruct*);
   sum += sizeof(__typeof(STRKWD MyStruct*));
   sum += sizeof(TypedefStruct*);
   sum += sizeof(__typeof(TypedefStruct*));
   sum += sizeof(PTTS);
-  // CHECK-MESSAGES: :[[@LINE-1]]:10: warning: suspicious usage of 'sizeof(A*)'; pointer to aggregate
+  // CHECK-MESSAGES: :[[@LINE-1]]:10: warning: suspicious usage of 'sizeof()' on an expression that results in a pointer
   sum += sizeof(PMyStruct);
   sum += sizeof(PS);
-  // CHECK-MESSAGES: :[[@LINE-1]]:10: warning: suspicious usage of 'sizeof(A*)'; pointer to aggregate
+  // CHECK-MESSAGES: :[[@LINE-1]]:10: warning: suspicious usage of 'sizeof()' on an expression that results in a pointer
   sum += sizeof(PS2);
-  // CHECK-MESSAGES: :[[@LINE-1]]:10: warning: suspicious usage of 'sizeof(A*)'; pointer to aggregate
+  // CHECK-MESSAGES: :[[@LINE-1]]:10: warning: suspicious usage of 'sizeof()' on an expression that results in a pointer
   sum += sizeof(&A10);
-  // CHECK-MESSAGES: :[[@LINE-1]]:10: warning: suspicious usage of 'sizeof(A*)'; pointer to aggregate
+  // CHECK-MESSAGES: :[[@LINE-1]]:10: warning: suspicious usage of 'sizeof()' on an expression that results in a pointer
 
 #ifdef __cplusplus
   MyStruct &rS = S;
diff --git a/clang-tools-extra/test/clang-tidy/checkers/bugprone/sizeof-expression-any-pointer.cpp b/clang-tools-extra/test/clang-tidy/checkers/bugprone/sizeof-expression-any-pointer.cpp
new file mode 100644
index 0000000000000..98ad2b385de7d
--- /dev/null
+++ b/clang-tools-extra/test/clang-tidy/checkers/bugprone/sizeof-expression-any-pointer.cpp
@@ -0,0 +1,231 @@
+// RUN: %check_clang_tidy %s bugprone-sizeof-expression %t -- -config="{CheckOptions: {bugprone-sizeof-expression.WarnOnSizeOfIntegerExpression: true, bugprone-sizeof-expression.WarnOnSizeOfPointer: true}}" --
+
+class C {
+  int size() { return sizeof(this); }
+  // CHECK-MESSAGES: :[[@LINE-1]]:23: warning: suspicious usage of 'sizeof(this)'
+};
+
+#define LEN 8
+
+int X;
+extern int A[10];
+extern short B[10];
+
+#pragma pack(1)
+struct  S { char a, b, c; };
+
+enum E { E_VALUE = 0 };
+enum class EC { VALUE = 0 };
+
+bool AsBool() { return false; }
+int AsInt() { return 0; }
+E AsEnum() { return E_VALUE; }
+EC AsEnumClass() { return EC::VALUE; }
+S AsStruct() { return {}; }
+
+struct M {
+  int AsInt() { return 0; }
+  E AsEnum() { return E_VALUE; }
+  S AsStruct() { return {}; }
+};
+
+int Test1(const char* ptr) {
+  int sum = 0;
+  sum += sizeof(LEN);
+  // CHECK-MESSAGES: :[[@LINE-1]]:10: warning: suspicious usage of 'sizeof(K)'
+  sum += sizeof(LEN + 1);
+  // CHECK-MESSAGES: :[[@LINE-1]]:10: warning: suspicious usage of 'sizeof(K)'
+  sum += sizeof(sum, LEN);
+  // CHECK-MESSAGES: :[[@LINE-1]]:20: warning: suspicious usage of 'sizeof(..., ...)'
+  sum += sizeof(AsBool());
+  // CHECK-MESSAGES: :[[@LINE-1]]:10: warning: suspicious usage of 'sizeof()' on an expression that results in an integer
+  sum += sizeof(AsInt());
+  // CHECK-MESSAGES: :[[@LINE-1]]:10: warning: suspicious usage of 'sizeof()' on an expression that results in an integer
+  sum += sizeof(AsEnum());
+  // CHECK-MESSAGES: :[[@LINE-1]]:10: warning: suspicious usage of 'sizeof()' on an expression that results in an integer
+  sum += sizeof(AsEnumClass());
+  // CHECK-MESSAGES: :[[@LINE-1]]:10: warning: suspicious usage of 'sizeof()' on an expression that results in an integer
+  sum += sizeof(M{}.AsInt());
+  // CHECK-MESSAGES: :[[@LINE-1]]:10: warning: suspicious usage of 'sizeof()' on an expression that results in an integer
+  sum += sizeof(M{}.AsEnum());
+  // CHECK-MESSAGES: :[[@LINE-1]]:10: warning: suspicious usage of 'sizeof()' on an expression that results in an integer
+  sum += sizeof(sizeof(X));
+  // CHECK-MESSAGES: :[[@LINE-1]]:10: warning: suspicious usage of 'sizeof(sizeof(...))'
+  sum += sizeof(LEN + sizeof(X));
+  // CHECK-MESSAGES: :[[@LINE-1]]:10: warning: suspicious usage of 'sizeof(sizeof(...))'
+  sum += sizeof(LEN + LEN + sizeof(X));
+  // CHECK-MESSAGES: :[[@LINE-1]]:10: warning: suspicious usage of 'sizeof(sizeof(...))'
+  sum += sizeof(LEN + (LEN + sizeof(X)));
+  // CHECK-MESSAGES: :[[@LINE-1]]:10: warning: suspicious usage of 'sizeof(sizeof(...))'
+  sum += sizeof(LEN + -sizeof(X));
+  // CHECK-MESSAGES: :[[@LINE-1]]:10: warning: suspicious usage of 'sizeof(sizeof(...))'
+  sum += sizeof(LEN + - + -sizeof(X));
+  // CHECK-MESSAGES: :[[@LINE-1]]:10: warning: suspicious usage of 'sizeof(sizeof(...))'
+  sum += sizeof(char) / sizeof(char);
+  // CHECK-MESSAGES: :[[@LINE-1]]:23: warning: suspicious usage of sizeof pointer 'sizeof(T)/sizeof(T)'
+  sum += sizeof(A) / sizeof(S);
+  // CHECK-MESSAGES: :[[@LINE-1]]:20: warning: suspicious usage of 'sizeof(...)/sizeof(...)'; numerator is not a multiple of denominator
+  sum += sizeof(char) / sizeof(int);
+  // CHECK-MESSAGES: :[[@LINE-1]]:23: warning: suspicious usage of 'sizeof(...)/sizeof(...)'; numerator is not a multiple of denominator
+  sum += sizeof(char) / sizeof(A);
+  // CHECK-MESSAGES: :[[@LINE-1]]:23: warning: suspicious usage of 'sizeof(...)/sizeof(...)'; numerator is not a multiple of denominator
+  sum += sizeof(B[0]) / sizeof(A);
+  // CHECK-MESSAGES: :[[@LINE-1]]:23: warning: suspicious usage of 'sizeof(...)/sizeof(...)'; numerator is not a multiple of denominator
+  sum += sizeof(ptr) / sizeof(char);
+  // CHECK-MESSAGES: :[[@LINE-1]]:10: warning: suspicious usage of 'sizeof()' on an expression that results in a pointer
+  sum += sizeof(ptr) / sizeof(ptr[0]);
+  // CHECK-MESSAGES: :[[@LINE-1]]:10: warning: suspicious usage of 'sizeof()' on an expression that results in a pointer
+  sum += sizeof(ptr) / sizeof(char*);
+  // CHECK-MESSAGES: :[[@LINE-1]]:10: warning: suspicious usage of 'sizeof()' on an expression that results in a pointer
+  sum += sizeof(ptr) / sizeof(void*);
+  // CHECK-MESSAGES: :[[@LINE-1]]:10: warning: suspicious usage of 'sizeof()' on an expression that results in a pointer
+  sum += sizeof(ptr) / sizeof(const void volatile*);
+  // CHECK-MESSAGES: :[[@LINE-1]]:10: warning: suspicious usage of 'sizeof()' on an expression that results in a pointer
+  sum += sizeof(ptr) / sizeof(char);
+  // CHECK-MESSAGES: :[[@LINE-1]]:10: warning: suspicious usage of 'sizeof()' on an expression that results in a pointer
+  sum += sizeof(int) * sizeof(char);
+  // CHECK-MESSAGES: :[[@LINE-1]]:22: warning: suspicious 'sizeof' by 'sizeof' multiplication
+  sum += sizeof(ptr) * sizeof(ptr[0]);
+  // CHECK-MESSAGES: :[[@LINE-1]]:10: warning: suspicious usage of 'sizeof()' on an expression that results in a pointer
+  // CHECK-MESSAGES: :[[@LINE-2]]:22: warning: suspicious 'sizeof' by 'sizeof' multiplication
+  sum += sizeof(int) * (2 * sizeof(char));
+  // CHECK-MESSAGES: :[[@LINE-1]]:22: warning: suspicious 'sizeof' by 'sizeof' multiplication
+  sum += (2 * sizeof(char)) * sizeof(int);
+  // CHECK-MESSAGES: :[[@LINE-1]]:29: warning: suspicious 'sizeof' by 'sizeof' multiplication
+  if (sizeof(A) < 0x100000) sum += 42;
+  // CHECK-MESSAGES: :[[@LINE-1]]:17: warning: suspicious comparison of 'sizeof(expr)' to a constant
+  if (sizeof(A) <= 0xFFFFFFFEU) sum += 42;
+  // CHECK-MESSAGES: :[[@LINE-1]]:17: warning: suspicious comparison of 'sizeof(expr)' to a constant
+  return sum;
+}
+
+int Test5() {
+  typedef int Array10[10];
+  typedef C ArrayC[10];
+
+  struct MyStruct {
+    Array10 arr;
+    Array10* ptr;
+  };
+  typedef const MyStruct TMyStruct;
+  typedef const MyStruct *PMyStruct;
+  typedef TMyStruct *PMyStruct2;
+
+  static TMyStruct kGlocalMyStruct = {};
+  static TMyStruct volatile * kGlocalMyStructPtr = &kGlocalMyStruct;
+
+  MyStruct S;
+  PMyStruct PS;
+  PMyStruct2 PS2;
+  Array10 A10;
+  C *PtrArray[10];
+  C *PC;
+
+  char *PChar;
+  int *PInt, **PPInt;
+  MyStruct **PPMyStruct;
+
+  int sum = 0;
+  sum += sizeof(&S.arr);
+  // CHECK-MESSAGES: :[[@LINE-1]]:10: warning: suspicious usage of 'sizeof()' on an expression that results in a pointer
+  sum += sizeof(&kGlocalMyStruct.arr);
+  // CHECK-MESSAGES: :[[@LINE-1]]:10: warning: suspicious usage of 'sizeof()' on an expression that results in a pointer
+  sum += sizeof(&kGlocalMyStructPtr->arr);
+  // CHECK-MESSAGES: :[[@LINE-1]]:10: warning: suspicious usage of 'sizeof()' on an expression that results in a pointer
+  sum += sizeof(S.arr + 0);
+  // CHECK-MESSAGES: :[[@LINE-1]]:10: warning: suspicious usage of 'sizeof()' on an expression that results in a pointer
+  sum += sizeof(+ S.arr);
+  // CHECK-MESSAGES: :[[@LINE-1]]:10: warning: suspicious usage of 'sizeof()' on an expression that results in a pointer
+  sum += sizeof((int*)S.arr);
+  // CHECK-MESSAGES: :[[@LINE-1]]:10: warning: suspicious usage of 'sizeof()' on an expression that results in a pointer
+
+  sum += sizeof(S.ptr);
+  // CHECK-MESSAGES: :[[@LINE-1]]:10: warning: suspicious usage of 'sizeof()' on an expression that results in a pointer
+  sum += sizeof(kGlocalMyStruct.ptr);
+  // CHECK-MESSAGES: :[[@LINE-1]]:10: warning: suspicious usage of 'sizeof()' on an expression that results in a pointer
+  sum += sizeof(kGlocalMyStructPtr->ptr);
+  // CHECK-MESSAGES: :[[@LINE-1]]:10: warning: suspicious usage of 'sizeof()' on an expression that results in a pointer
+
+  sum += sizeof(&kGlocalMyStruct);
+  // CHECK-MESSAGES: :[[@LINE-1]]:10: warning: suspicious usage of 'sizeof()' on an expression that results in a pointer
+  sum += sizeof(&S);
+  // CHECK-MESSAGES: :[[@LINE-1]]:10: warning: suspicious usage of 'sizeof()' on an expression that results in a pointer
+  sum += sizeof(MyStruct*);
+  sum...
[truncated]

Comment on lines -127 to -128
sum += sizeof(ptr) / sizeof(ptr[0]);
// CHECK-MESSAGES: :[[@LINE-1]]:22: warning: suspicious usage of sizeof pointer 'sizeof(T*)/sizeof(T)'
Copy link
Contributor Author

@NagyDonat NagyDonat Jun 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm deleting this test line because the same expression also appears on line 117.

Copy link
Contributor

@EugeneZelenko EugeneZelenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please mention changes in Release Notes.

@NagyDonat
Copy link
Contributor Author

NagyDonat commented Jun 5, 2024

@EugeneZelenko Thanks for inviting some additional reviewers (and @ reviewers thanks in advance for any feedback). I'll add an entry in the release notes.

Copy link
Member

@PiotrZSL PiotrZSL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+- LGTM, except nits.

@NagyDonat
Copy link
Contributor Author

NagyDonat commented Jun 5, 2024

I analyzed several open source project with this check to observe the effects of my commit and I was (pleasantly?) surprised to see that it detected some ugly errors (despite the fact that the inputs are stable open source projects...).

EDIT: the original contents of this comment became obsolete, see the full evaluation in my more recent comment.

Co-authored-by: EugeneZelenko <eugene.zelenko@gmail.com>
Comment on lines +193 to +197
When `true`, the check will warn when the argument of ``sizeof`` is either a
pointer-to-aggregate type, an expression returning a pointer-to-aggregate
value or an expression that returns a pointer from an array-to-pointer
conversion (that may be implicit or explicit, for example ``array + 2`` or
``(int *)array``). Default is `true`.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also added a proper description for WarnOnSizeOfPointerToAggregate, because the old one was very vague and didn't explain the (somewhat complex) effects that are handled here.

@NagyDonat
Copy link
Contributor Author

NagyDonat commented Jun 6, 2024

Currently the mode WarnOnSizeOfPointer = true is implemented to be a superset of WarnOnSizeOfPointerToAggregate, so it will produce warnings when the argument of sizeof is a pointer-to-aggregate type (e.g. sizeof(struct foo *)). However, this "argument is a type" area is not (yet) generalized for arbitrary pointers, so e.g. sizeof(int *) is not reported. Here I can imagine several potential solutions:

  1. Keep this behavior and document that WarnOnSizeOfPointer is a superset of WarnOnSizeOfPointerToAggregate.
  2. Limit reports from WarnOnSizeOfPointer to sizeof(expression) expressions (ruling out cases where the argument of sizeof is a type) and perhaps document this by changing the option name to WarnOnSizeOfPointerExpr.
  3. Extend WarnOnSizeOfPointer to all sizeof(type *) expressions (and document this).
  4. Something else, e.g. separating the sizeof(pointer expression) and sizeof(pointer type) cases into two separate options...

What do you think about these? Which would be the best way forward?

@NagyDonat
Copy link
Contributor Author

NagyDonat commented Jun 7, 2024

I re-ran the open source evaluation; see below the clean diff that I promised (italicized notes are just copied from the old table). This compares the commit proposed in this PR and a minimal change which only applies the message modifications. „New Reports” are those that will be visible when the off-by-default flag WarnOnSizeOfPointer is enabled; Resolved Reports are reports that will be discarded after this commit (even when WarnOnSizeOfPointer is off).

Project New Reports Resolved Reports Notes
memcached No reports No reports
tmux No reports 23 resolved reports reports seem to be FPs, including several ones that use qsort in a clear and straightforward way
curl 3 new reports 1 resolved reports new reports are TPs (all reporting incorrect use of the same data structure), resolved one is FP
twin No reports No reports
vim 1 new reports No reports true positive
openssl 23 new reports 22 resolved reports resolved reports are FPs, new reports are mostly TPs or "works, but ugly and dodgy" code with a few FPs that look like generic_function(&arg, sizeof(arg)) or get_memory(length*sizeof(array[0]))
sqlite 11 new reports No reports among the new results there are many FPs ((1), (2)) that do things like char **mem; realloc(mem, numElements*sizeof(mem[0]))
ffmpeg 22 new reports 109 resolved reports new reports are FPs; most of them look like memcpy(&val, ptr, sizeof(val)); but there are some sizeof(array[0]) ones as well; among the resolved reports 10/10 randomly chosen ones were FPs
postgres No reports 5 resolved reports resolved reports are FPs
tinyxml2 No reports No reports
libwebm No reports No reports
xerces 1 new reports No reports true positive, seems to be an ugly bug
bitcoin 1 new reports No reports false positive hasher.Write((const unsigned char*)&ptr, sizeof(ptr));
protobuf 5 new reports 1 resolved reports resolved report is FP, new reports are mostly sizeof(array[0]) FPs + one confusing trickery
qtbase 10 new reports No reports most of them are FPs, including a few sizeof(array[0]) issues
contour No reports No reports
openrct2 1 new reports No reports arguably a false positive, but the highlighted code is suspicious
llvm-project No reports 1 resolved reports false positive

NagyDonat added 4 commits June 7, 2024 20:20
...instead of just the RHS of the `sizeof(array) / sizeof(array[0])`
expression. This simplifies the code *and* will suppress many false
positives in various open source projects.
…llvm-project into SizeofPointer-move-to-tidy
@NagyDonat
Copy link
Contributor Author

NagyDonat commented Jun 10, 2024

I evaluated the effects of the additional change Generalize the suppression to all sizeof(array[0]) expressions compared to the earlier version of this PR, and it seems that this tweak eliminates a significant amount of false positives:

Project New Reports Resolved Reports
memcached No reports No reports
tmux No reports No reports
curl No reports 3 resolved reports
twin No reports No reports
vim No reports No reports
openssl No reports 8 resolved reports
sqlite No reports 12 resolved reports
ffmpeg No reports 9 resolved reports
postgres No reports 1 resolved reports
tinyxml2 No reports No reports
libwebm No reports No reports
xerces No reports 10 resolved reports
bitcoin No reports 1 resolved reports
protobuf No reports 4 resolved reports
qtbase No reports 7 resolved reports
contour No reports No reports
openrct2 No reports No reports
llvm-project No reports 6 resolved reports

I randomly checked ~20% of the resolved reports, and they all were false positives (that is, situations where sizeof(pointer) is the idiomatic and correct solution), so I'm satisfied with this additional change.

At this point I'm satisfied with the behavior of this check on the real-world code and I'm not planning to add more features within this PR. (There is one known issue which I highlighted in a FIXME in commit Extend GenericFunctionTest, document a class of FPs with a FIXME -- but that's significantly less severe than the issues before this PR, so I think it's enough to handle that later.)

I also answered all the review suggestions, so I'd be grateful for another round of reviews from @PiotrZSL @EugeneZelenko or anybody else who has time for it.

Copy link
Member

@PiotrZSL PiotrZSL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM.
It doesn't look to break (by default) previous behavior, so it's fine.

@NagyDonat NagyDonat changed the title [clang-tidy] Add WarnOnSizeOfPointer mode to bugprone-sizeof-expression [clang-tidy] Improve sizeof(pointer) handling in bugprone-sizeof-expression Jun 11, 2024
@NagyDonat NagyDonat merged commit 546c816 into llvm:main Jun 11, 2024
8 checks passed
@NagyDonat
Copy link
Contributor Author

Minor correction: in the commit message I accidentally wrote alpha.core.SizeofPointer instead of alpha.core.SizeofPtr. (Fortunately this difference is mostly academical, because my followup commit will remove that checker.)

Lukacma pushed a commit to Lukacma/llvm-project that referenced this pull request Jun 12, 2024
…ession (llvm#94356)

This commit reimplements the functionality of the Clang Static Analyzer
checker `alpha.core.SizeofPointer` within clang-tidy by adding a new
(off-by-default) option to bugprone-sizeof-expression which activates
reporting all the `sizeof(ptr)` expressions (where ptr is an expression
that produces a pointer).

The main motivation for this change is that `alpha.core.SizeofPointer`
was an AST-based checker, which did not rely on the path sensitive
capabilities of the Static Analyzer, so there was no reason to keep it
in the Static Analyzer instead of the more lightweight clang-tidy.

After this commit I'm planning to create a separate commit that deletes
`alpha.core.SizeofPointer` from Clang Static Analyzer.

It was natural to place this moved logic in bugprone-sizeof-expression,
because that check already provided several heuristics that reported
various especially suspicious classes of `sizeof(ptr)` expressions.

The new mode `WarnOnSizeOfPointer` is off-by-default, so it won't
surprise the existing users; but it can provide a more through coverage
for the vulnerability CWE-467 ("Use of sizeof() on a Pointer Type") than
the existing partial heuristics.

Previously this checker had an exception that the RHS of a
`sizeof(array) / sizeof(array[0])` expression is not reported; I
generalized this to an exception that the check doesn't report
`sizeof(expr[0])` and `sizeof(*expr)`. This idea is taken from the
Static Analyzer checker `alpha.core.SizeofPointer` (which had an
exception for `*expr`), but analysis of open source projects confirmed
that this indeed eliminates lots of unwanted results.

Note that the suppression of `sizeof(expr[0])` and `sizeof(*expr)`
reports also affects the "old" mode `WarnOnSizeOfPointerToAggregate`
which is enabled by default.

This commit also replaces the old message "suspicious usage of
'sizeof(A*)'; pointer to aggregate" with two more concrete messages; but
I feel that this tidy check would deserve a through cleanup of all the
diagnostic messages that it can produce. (I added a FIXME to mark one
outright misleading message.)
@NagyDonat NagyDonat deleted the SizeofPointer-move-to-tidy branch June 12, 2024 12:27
@HerrCai0907 HerrCai0907 mentioned this pull request Jun 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants