Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nonblocking/nonallocating attributes: 2nd pass caller/callee analysis #99656

Merged
merged 71 commits into from
Oct 3, 2024

Conversation

dougsonos
Copy link
Contributor

  • In Sema, when encountering Decls with function effects needing verification, add them to a vector, DeclsWithEffectsToVerify.
  • Update AST serialization to include DeclsWithEffectsToVerify.
  • In AnalysisBasedWarnings, use DeclsWithEffectsToVerify as a work queue, verifying functions with declared effects, and inferring (when permitted and necessary) whether their callees have effects.

@dougsonos dougsonos requested a review from Endilll as a code owner July 19, 2024 14:47
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:modules C++20 modules and Clang Header Modules labels Jul 19, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Jul 19, 2024

@llvm/pr-subscribers-clang-modules

@llvm/pr-subscribers-clang

Author: Doug Wyatt (dougsonos)

Changes
  • In Sema, when encountering Decls with function effects needing verification, add them to a vector, DeclsWithEffectsToVerify.
  • Update AST serialization to include DeclsWithEffectsToVerify.
  • In AnalysisBasedWarnings, use DeclsWithEffectsToVerify as a work queue, verifying functions with declared effects, and inferring (when permitted and necessary) whether their callees have effects.

Patch is 68.21 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/99656.diff

16 Files Affected:

  • (modified) clang/include/clang/AST/Type.h (+1-1)
  • (modified) clang/include/clang/Basic/DiagnosticGroups.td (+1)
  • (modified) clang/include/clang/Basic/DiagnosticSemaKinds.td (+49)
  • (modified) clang/include/clang/Sema/Sema.h (+13)
  • (modified) clang/include/clang/Serialization/ASTBitCodes.h (+4)
  • (modified) clang/include/clang/Serialization/ASTReader.h (+3)
  • (modified) clang/include/clang/Serialization/ASTWriter.h (+1)
  • (modified) clang/lib/Sema/AnalysisBasedWarnings.cpp (+1259)
  • (modified) clang/lib/Sema/SemaDecl.cpp (+56)
  • (modified) clang/lib/Sema/SemaExpr.cpp (+4)
  • (modified) clang/lib/Sema/SemaLambda.cpp (+5)
  • (modified) clang/lib/Serialization/ASTReader.cpp (+19)
  • (modified) clang/lib/Serialization/ASTWriter.cpp (+12)
  • (added) clang/test/Sema/attr-nonblocking-constraints.cpp (+194)
  • (modified) clang/test/Sema/attr-nonblocking-syntax.cpp (+1)
  • (added) clang/test/SemaObjCXX/attr-nonblocking-constraints.mm (+23)
diff --git a/clang/include/clang/AST/Type.h b/clang/include/clang/AST/Type.h
index 25defea58c2dc..08141f75de8db 100644
--- a/clang/include/clang/AST/Type.h
+++ b/clang/include/clang/AST/Type.h
@@ -4699,7 +4699,7 @@ class FunctionEffect {
 
 private:
   LLVM_PREFERRED_TYPE(Kind)
-  unsigned FKind : 3;
+  uint8_t FKind : 3;
 
   // Expansion: for hypothetical TCB+types, there could be one Kind for TCB,
   // then ~16(?) bits "SubKind" to map to a specific named TCB. SubKind would
diff --git a/clang/include/clang/Basic/DiagnosticGroups.td b/clang/include/clang/Basic/DiagnosticGroups.td
index 19c3f1e043349..55d9442a939da 100644
--- a/clang/include/clang/Basic/DiagnosticGroups.td
+++ b/clang/include/clang/Basic/DiagnosticGroups.td
@@ -1557,6 +1557,7 @@ def UnsafeBufferUsage : DiagGroup<"unsafe-buffer-usage", [UnsafeBufferUsageInCon
 // Warnings and notes related to the function effects system underlying
 // the nonblocking and nonallocating attributes.
 def FunctionEffects : DiagGroup<"function-effects">;
+def PerfConstraintImpliesNoexcept : DiagGroup<"perf-constraint-implies-noexcept">;
 
 // Warnings and notes InstallAPI verification.
 def InstallAPIViolation : DiagGroup<"installapi-violation">;
diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td
index d60f32674ca3a..ec02c02d158c8 100644
--- a/clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -10928,6 +10928,55 @@ def warn_imp_cast_drops_unaligned : Warning<
   InGroup<DiagGroup<"unaligned-qualifier-implicit-cast">>;
 
 // Function effects
+def warn_func_effect_allocates : Warning<
+  "'%0' function must not allocate or deallocate memory">,
+  InGroup<FunctionEffects>;
+def note_func_effect_allocates : Note<
+  "function cannot be inferred '%0' because it allocates/deallocates memory">;
+def warn_func_effect_throws_or_catches : Warning<
+  "'%0' function must not throw or catch exceptions">,
+  InGroup<FunctionEffects>;
+def note_func_effect_throws_or_catches : Note<
+  "function cannot be inferred '%0' because it throws or catches exceptions">;
+def warn_func_effect_has_static_local : Warning<
+  "'%0' function must not have static locals">,
+  InGroup<FunctionEffects>;
+def note_func_effect_has_static_local : Note<
+  "function cannot be inferred '%0' because it has a static local">;
+def warn_func_effect_uses_thread_local : Warning<
+  "'%0' function must not use thread-local variables">,
+  InGroup<FunctionEffects>;
+def note_func_effect_uses_thread_local : Note<
+  "function cannot be inferred '%0' because it uses a thread-local variable">;
+def warn_func_effect_calls_objc : Warning<
+  "'%0' function must not access an ObjC method or property">,
+  InGroup<FunctionEffects>;
+def note_func_effect_calls_objc : Note<
+  "function cannot be inferred '%0' because it accesses an ObjC method or property">;
+def warn_func_effect_calls_func_without_effect : Warning<
+  "'%0' function must not call non-'%0' function '%1'">,
+  InGroup<FunctionEffects>;
+def warn_func_effect_calls_expr_without_effect : Warning<
+  "'%0' function must not call non-'%0' expression">,
+  InGroup<FunctionEffects>;
+def note_func_effect_calls_func_without_effect : Note<
+  "function cannot be inferred '%0' because it calls non-'%0' function '%1'">;
+def note_func_effect_call_extern : Note<
+  "function cannot be inferred '%0' because it has no definition in this translation unit">;
+def note_func_effect_call_disallows_inference : Note<
+  "function does not permit inference of '%0'">;
+def note_func_effect_call_virtual : Note<
+  "virtual method cannot be inferred '%0'">;
+def note_func_effect_call_func_ptr : Note<
+  "function pointer cannot be inferred '%0'">;
+def warn_perf_constraint_implies_noexcept : Warning<
+  "'%0' function should be declared noexcept">,
+  InGroup<PerfConstraintImpliesNoexcept>;
+
+// FIXME: It would be nice if we could provide fuller template expansion notes.
+def note_func_effect_from_template : Note<
+  "in template expansion here">;
+
 // spoofing nonblocking/nonallocating
 def warn_invalid_add_func_effects : Warning<
   "attribute '%0' should not be added via type conversion">,
diff --git a/clang/include/clang/Sema/Sema.h b/clang/include/clang/Sema/Sema.h
index d638d31e050dc..e1867348497da 100644
--- a/clang/include/clang/Sema/Sema.h
+++ b/clang/include/clang/Sema/Sema.h
@@ -875,6 +875,13 @@ class Sema final : public SemaBase {
 
   // ----- function effects ---
 
+  /// All functions/lambdas/blocks which have bodies and which have a non-empty
+  /// FunctionEffectsRef to be verified.
+  SmallVector<const Decl *> DeclsWithEffectsToVerify;
+  /// The union of all effects present on DeclsWithEffectsToVerify. Conditions
+  /// are all null.
+  FunctionEffectSet AllEffectsToVerify;
+
   /// Warn when implicitly changing function effects.
   void diagnoseFunctionEffectConversion(QualType DstType, QualType SrcType,
                                         SourceLocation Loc);
@@ -891,6 +898,12 @@ class Sema final : public SemaBase {
                                        SourceLocation NewLoc,
                                        SourceLocation OldLoc);
 
+  /// Potentially add a FunctionDecl or BlockDecl to DeclsWithEffectsToVerify.
+  void maybeAddDeclWithEffects(const Decl *D, const FunctionEffectsRef &FX);
+
+  /// Unconditionally add a Decl to DeclsWithEfffectsToVerify.
+  void addDeclWithEffects(const Decl *D, const FunctionEffectsRef &FX);
+
   /// Try to parse the conditional expression attached to an effect attribute
   /// (e.g. 'nonblocking'). (c.f. Sema::ActOnNoexceptSpec). Return an empty
   /// optional on error.
diff --git a/clang/include/clang/Serialization/ASTBitCodes.h b/clang/include/clang/Serialization/ASTBitCodes.h
index 5dd0ba33f8a9c..b975db88dbaae 100644
--- a/clang/include/clang/Serialization/ASTBitCodes.h
+++ b/clang/include/clang/Serialization/ASTBitCodes.h
@@ -721,6 +721,10 @@ enum ASTRecordTypes {
 
   /// Record code for \#pragma clang unsafe_buffer_usage begin/end
   PP_UNSAFE_BUFFER_USAGE = 69,
+
+  /// Record code for Sema's vector of functions/blocks with effects to
+  /// be verified.
+  DECLS_WITH_EFFECTS_TO_VERIFY = 70,
 };
 
 /// Record types used within a source manager block.
diff --git a/clang/include/clang/Serialization/ASTReader.h b/clang/include/clang/Serialization/ASTReader.h
index 76e51ac7ab979..1d8985602146b 100644
--- a/clang/include/clang/Serialization/ASTReader.h
+++ b/clang/include/clang/Serialization/ASTReader.h
@@ -948,6 +948,9 @@ class ASTReader
   /// Sema tracks these to emit deferred diags.
   llvm::SmallSetVector<GlobalDeclID, 4> DeclsToCheckForDeferredDiags;
 
+  /// The IDs of all decls with function effects to be checked.
+  SmallVector<GlobalDeclID, 0> DeclsWithEffectsToVerify;
+
 private:
   struct ImportedSubmodule {
     serialization::SubmoduleID ID;
diff --git a/clang/include/clang/Serialization/ASTWriter.h b/clang/include/clang/Serialization/ASTWriter.h
index a0e475ec9f862..4eaf77e8cb8d9 100644
--- a/clang/include/clang/Serialization/ASTWriter.h
+++ b/clang/include/clang/Serialization/ASTWriter.h
@@ -592,6 +592,7 @@ class ASTWriter : public ASTDeserializationListener,
   void WriteMSPointersToMembersPragmaOptions(Sema &SemaRef);
   void WritePackPragmaOptions(Sema &SemaRef);
   void WriteFloatControlPragmaOptions(Sema &SemaRef);
+  void WriteDeclsWithEffectsToVerify(Sema &SemaRef);
   void WriteModuleFileExtension(Sema &SemaRef,
                                 ModuleFileExtensionWriter &Writer);
 
diff --git a/clang/lib/Sema/AnalysisBasedWarnings.cpp b/clang/lib/Sema/AnalysisBasedWarnings.cpp
index 0f604c61fa3af..3909d5b44a32e 100644
--- a/clang/lib/Sema/AnalysisBasedWarnings.cpp
+++ b/clang/lib/Sema/AnalysisBasedWarnings.cpp
@@ -2397,6 +2397,1262 @@ class UnsafeBufferUsageReporter : public UnsafeBufferUsageHandler {
 };
 } // namespace
 
+// =============================================================================
+
+namespace FXAnalysis {
+
+enum class DiagnosticID : uint8_t {
+  None = 0, // sentinel for an empty Diagnostic
+  Throws,
+  Catches,
+  CallsObjC,
+  AllocatesMemory,
+  HasStaticLocal,
+  AccessesThreadLocal,
+
+  // These only apply to callees, where the analysis stops at the Decl
+  DeclDisallowsInference,
+
+  CallsDeclWithoutEffect,
+  CallsExprWithoutEffect,
+};
+
+// Holds an effect diagnosis, potentially for the entire duration of the
+// analysis phase, in order to refer to it when explaining why a caller has been
+// made unsafe by a callee.
+struct Diagnostic {
+  FunctionEffect Effect;
+  DiagnosticID ID = DiagnosticID::None;
+  SourceLocation Loc;
+  const Decl *Callee = nullptr; // only valid for Calls*
+
+  Diagnostic() = default;
+
+  Diagnostic(const FunctionEffect &Effect, DiagnosticID ID, SourceLocation Loc,
+             const Decl *Callee = nullptr)
+      : Effect(Effect), ID(ID), Loc(Loc), Callee(Callee) {}
+};
+
+enum class SpecialFuncType : uint8_t { None, OperatorNew, OperatorDelete };
+enum class CallType {
+  // unknown: probably function pointer
+  Unknown,
+  Function,
+  Virtual,
+  Block
+};
+
+// Return whether a function's effects CAN be verified.
+// The question of whether it SHOULD be verified is independent.
+static bool functionIsVerifiable(const FunctionDecl *FD) {
+  if (!(FD->hasBody() || FD->isInlined())) {
+    // externally defined; we couldn't verify if we wanted to.
+    return false;
+  }
+  if (FD->isTrivial()) {
+    // Otherwise `struct x { int a; };` would have an unverifiable default
+    // constructor.
+    return true;
+  }
+  return true;
+}
+
+/// A mutable set of FunctionEffect, for use in places where any conditions
+/// have been resolved or can be ignored.
+class EffectSet {
+  // This implementation optimizes footprint, since we hold one of these for
+  // every function visited, which, due to inference, can be many more functions
+  // than have declared effects.
+
+  template <typename T, typename SizeT, SizeT Capacity> struct FixedVector {
+    SizeT Count = 0;
+    T Items[Capacity] = {};
+
+    using value_type = T;
+
+    using iterator = T *;
+    using const_iterator = const T *;
+    iterator begin() { return &Items[0]; }
+    iterator end() { return &Items[Count]; }
+    const_iterator begin() const { return &Items[0]; }
+    const_iterator end() const { return &Items[Count]; }
+    const_iterator cbegin() const { return &Items[0]; }
+    const_iterator cend() const { return &Items[Count]; }
+
+    void insert(iterator I, const T &Value) {
+      assert(Count < Capacity);
+      iterator E = end();
+      if (I != E)
+        std::copy_backward(I, E, E + 1);
+      *I = Value;
+      ++Count;
+    }
+
+    void push_back(const T &Value) {
+      assert(Count < Capacity);
+      Items[Count++] = Value;
+    }
+  };
+
+  // As long as FunctionEffect is only 1 byte, and there are only 2 verifiable
+  // effects, this fixed-size vector with a capacity of 7 is more than
+  // sufficient and is only 8 bytes.
+  FixedVector<FunctionEffect, uint8_t, 7> Impl;
+
+public:
+  EffectSet() = default;
+  explicit EffectSet(FunctionEffectsRef FX) { insert(FX); }
+
+  operator ArrayRef<FunctionEffect>() const {
+    return ArrayRef(Impl.cbegin(), Impl.cend());
+  }
+
+  using iterator = const FunctionEffect *;
+  iterator begin() const { return Impl.cbegin(); }
+  iterator end() const { return Impl.cend(); }
+
+  void insert(const FunctionEffect &Effect) {
+    FunctionEffect *Iter = Impl.begin();
+    FunctionEffect *End = Impl.end();
+    // linear search; lower_bound is overkill for a tiny vector like this
+    for (; Iter != End; ++Iter) {
+      if (*Iter == Effect)
+        return;
+      if (Effect < *Iter)
+        break;
+    }
+    Impl.insert(Iter, Effect);
+  }
+  void insert(const EffectSet &Set) {
+    for (const FunctionEffect &Item : Set) {
+      // push_back because set is already sorted
+      Impl.push_back(Item);
+    }
+  }
+  void insert(FunctionEffectsRef FX) {
+    for (const FunctionEffectWithCondition &EC : FX) {
+      assert(EC.Cond.getCondition() ==
+             nullptr); // should be resolved by now, right?
+      // push_back because set is already sorted
+      Impl.push_back(EC.Effect);
+    }
+  }
+  bool contains(const FunctionEffect::Kind EK) const {
+    for (const FunctionEffect &E : Impl)
+      if (E.kind() == EK)
+        return true;
+    return false;
+  }
+
+  void dump(llvm::raw_ostream &OS) const;
+
+  static EffectSet difference(ArrayRef<FunctionEffect> LHS,
+                              ArrayRef<FunctionEffect> RHS) {
+    EffectSet Result;
+    std::set_difference(LHS.begin(), LHS.end(), RHS.begin(), RHS.end(),
+                        std::back_inserter(Result.Impl));
+    return Result;
+  }
+};
+
+LLVM_DUMP_METHOD void EffectSet::dump(llvm::raw_ostream &OS) const {
+  OS << "Effects{";
+  bool First = true;
+  for (const FunctionEffect &Effect : *this) {
+    if (!First)
+      OS << ", ";
+    else
+      First = false;
+    OS << Effect.name();
+  }
+  OS << "}";
+}
+
+// Transitory, more extended information about a callable, which can be a
+// function, block, function pointer, etc.
+struct CallableInfo {
+  // CDecl holds the function's definition, if any.
+  // FunctionDecl if CallType::Function or Virtual
+  // BlockDecl if CallType::Block
+  const Decl *CDecl;
+  mutable std::optional<std::string> MaybeName;
+  SpecialFuncType FuncType = SpecialFuncType::None;
+  EffectSet Effects;
+  CallType CType = CallType::Unknown;
+
+  CallableInfo(Sema &SemaRef, const Decl &CD,
+               SpecialFuncType FT = SpecialFuncType::None)
+      : CDecl(&CD), FuncType(FT) {
+    FunctionEffectsRef FXRef;
+
+    if (auto *FD = dyn_cast<FunctionDecl>(CDecl)) {
+      // Use the function's definition, if any.
+      if (const FunctionDecl *Def = FD->getDefinition())
+        CDecl = FD = Def;
+      CType = CallType::Function;
+      if (auto *Method = dyn_cast<CXXMethodDecl>(FD);
+          Method && Method->isVirtual())
+        CType = CallType::Virtual;
+      FXRef = FD->getFunctionEffects();
+    } else if (auto *BD = dyn_cast<BlockDecl>(CDecl)) {
+      CType = CallType::Block;
+      FXRef = BD->getFunctionEffects();
+    } else if (auto *VD = dyn_cast<ValueDecl>(CDecl)) {
+      // ValueDecl is function, enum, or variable, so just look at its type.
+      FXRef = FunctionEffectsRef::get(VD->getType());
+    }
+    Effects = EffectSet(FXRef);
+  }
+
+  bool isDirectCall() const {
+    return CType == CallType::Function || CType == CallType::Block;
+  }
+
+  bool isVerifiable() const {
+    switch (CType) {
+    case CallType::Unknown:
+    case CallType::Virtual:
+      break;
+    case CallType::Block:
+      return true;
+    case CallType::Function:
+      return functionIsVerifiable(dyn_cast<FunctionDecl>(CDecl));
+    }
+    return false;
+  }
+
+  /// Generate a name for logging and diagnostics.
+  std::string name(Sema &Sem) const {
+    if (!MaybeName) {
+      std::string Name;
+      llvm::raw_string_ostream OS(Name);
+
+      if (auto *FD = dyn_cast<FunctionDecl>(CDecl))
+        FD->getNameForDiagnostic(OS, Sem.getPrintingPolicy(),
+                                 /*Qualified=*/true);
+      else if (auto *BD = dyn_cast<BlockDecl>(CDecl))
+        OS << "(block " << BD->getBlockManglingNumber() << ")";
+      else if (auto *VD = dyn_cast<NamedDecl>(CDecl))
+        VD->printQualifiedName(OS);
+      MaybeName = Name;
+    }
+    return *MaybeName;
+  }
+};
+
+// ----------
+// Map effects to single diagnostics, to hold the first (of potentially many)
+// diagnostics pertaining to an effect, per function.
+class EffectToDiagnosticMap {
+  // Since we currently only have a tiny number of effects (typically no more
+  // than 1), use a sorted SmallVector with an inline capacity of 1. Since it
+  // is often empty, use a unique_ptr to the SmallVector.
+  // Note that Diagnostic itself contains a FunctionEffect which is the key.
+  using ImplVec = llvm::SmallVector<Diagnostic, 1>;
+  std::unique_ptr<ImplVec> Impl;
+
+public:
+  // Insert a new diagnostic if we do not already have one for its effect.
+  void maybeInsert(const Diagnostic &Diag) {
+    if (Impl == nullptr)
+      Impl = std::make_unique<ImplVec>();
+    auto *Iter = _find(Diag.Effect);
+    if (Iter != Impl->end() && Iter->Effect == Diag.Effect)
+      return;
+
+    Impl->insert(Iter, Diag);
+  }
+
+  const Diagnostic *lookup(FunctionEffect Key) {
+    if (Impl == nullptr)
+      return nullptr;
+
+    auto *Iter = _find(Key);
+    if (Iter != Impl->end() && Iter->Effect == Key)
+      return &*Iter;
+
+    return nullptr;
+  }
+
+  size_t size() const { return Impl ? Impl->size() : 0; }
+
+private:
+  ImplVec::iterator _find(const FunctionEffect &key) {
+    // A linear search suffices for a tiny number of possible effects.
+    auto *End = Impl->end();
+    for (auto *Iter = Impl->begin(); Iter != End; ++Iter)
+      if (!(Iter->Effect < key))
+        return Iter;
+    return End;
+  }
+};
+
+// ----------
+// State pertaining to a function whose AST is walked and whose effect analysis
+// is dependent on a subsequent analysis of other functions.
+class PendingFunctionAnalysis {
+  friend class CompleteFunctionAnalysis;
+
+public:
+  struct DirectCall {
+    const Decl *Callee;
+    SourceLocation CallLoc;
+    // Not all recursive calls are detected, just enough
+    // to break cycles.
+    bool Recursed = false;
+
+    DirectCall(const Decl *D, SourceLocation CallLoc)
+        : Callee(D), CallLoc(CallLoc) {}
+  };
+
+  // We always have two disjoint sets of effects to verify:
+  // 1. Effects declared explicitly by this function.
+  // 2. All other inferrable effects needing verification.
+  EffectSet DeclaredVerifiableEffects;
+  EffectSet FXToInfer;
+
+private:
+  // Diagnostics pertaining to the function's explicit effects.
+  SmallVector<Diagnostic, 0> DiagnosticsForExplicitFX;
+
+  // Diagnostics pertaining to other, non-explicit, inferrable effects.
+  EffectToDiagnosticMap InferrableEffectToFirstDiagnostic;
+
+  // These unverified direct calls are what keeps the analysis "pending",
+  // until the callees can be verified.
+  SmallVector<DirectCall, 0> UnverifiedDirectCalls;
+
+public:
+  PendingFunctionAnalysis(
+      Sema &Sem, const CallableInfo &CInfo,
+      ArrayRef<FunctionEffect> AllInferrableEffectsToVerify) {
+    DeclaredVerifiableEffects = CInfo.Effects;
+
+    // Check for effects we are not allowed to infer
+    EffectSet InferrableFX;
+
+    for (const FunctionEffect &effect : AllInferrableEffectsToVerify) {
+      if (effect.canInferOnFunction(*CInfo.CDecl))
+        InferrableFX.insert(effect);
+      else {
+        // Add a diagnostic for this effect if a caller were to
+        // try to infer it.
+        InferrableEffectToFirstDiagnostic.maybeInsert(
+            Diagnostic(effect, DiagnosticID::DeclDisallowsInference,
+                       CInfo.CDecl->getLocation()));
+      }
+    }
+    // InferrableFX is now the set of inferrable effects which are not
+    // prohibited
+    FXToInfer = EffectSet::difference(InferrableFX, DeclaredVerifiableEffects);
+  }
+
+  // Hide the way that diagnostics for explicitly required effects vs. inferred
+  // ones are handled differently.
+  void checkAddDiagnostic(bool Inferring, const Diagnostic &NewDiag) {
+    if (!Inferring)
+      DiagnosticsForExplicitFX.push_back(NewDiag);
+    else
+      InferrableEffectToFirstDiagnostic.maybeInsert(NewDiag);
+  }
+
+  void addUnverifiedDirectCall(const Decl *D, SourceLocation CallLoc) {
+    UnverifiedDirectCalls.emplace_back(D, CallLoc);
+  }
+
+  // Analysis is complete when there are no unverified direct calls.
+  bool isComplete() const { return UnverifiedDirectCalls.empty(); }
+
+  const Diagnostic *diagnosticForInferrableEffect(FunctionEffect effect) {
+    return InferrableEffectToFirstDiagnostic.lookup(effect);
+  }
+
+  SmallVector<DirectCall, 0> &unverifiedCalls() {
+    assert(!isComplete());
+    return UnverifiedDirectCalls;
+  }
+
+  SmallVector<Diagnostic, 0> &getDiagnosticsForExplicitFX() {
+    return DiagnosticsForExplicitFX;
+  }
+
+  void dump(Sema &SemaRef, llvm::raw_ostream &OS) const {
+    OS << "Pending: Declared ";
+    DeclaredVerifiableEffects.dump(OS);
+    OS << ", " << DiagnosticsForExplicitFX.size() << " diags; ";
+    OS << " Infer ";
+    FXToInfer.dump(OS);
+    OS << ", " << InferrableEffectToFi...
[truncated]

…ctions.

- Fix ObjC++ test which was using the attribute's old name.
Copy link

github-actions bot commented Jul 20, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Member

@Sirraide Sirraide left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few things I’ve noticed so far. I definitely need to take a closer look at all of this, because it is a lot of code, but before I do that, I have a few questions/remarks about the overall design of this:

  1. I think there already is a comment about that, but this is a lot of code. All the code added to AnalysisBasedWarnings.cpp should probably just be in a separate file to make this a bit easier to review (and also because it’s probably only going to grow in size as time goes on...)

  2. This pr feels like it’s spending a lot of time defining data structures (often just for optimisation purposes), if that makes sense; I’ve seen at least three separate classes that define an insert function. I’m wondering if we can’t just reuse existing facilities for a lot of this (e.g. PartialDiagnostic), and whether we really need to be this up-front about optimising certain things, e.g. a std::unique_ptr<SmallVector<...>> for optimising for the case of having 0 elements feels like it’s a bit much, at least to me.

clang/lib/Sema/AnalysisBasedWarnings.cpp Outdated Show resolved Hide resolved
clang/lib/Sema/AnalysisBasedWarnings.cpp Outdated Show resolved Hide resolved
clang/lib/Sema/AnalysisBasedWarnings.cpp Outdated Show resolved Hide resolved
clang/lib/Sema/AnalysisBasedWarnings.cpp Outdated Show resolved Hide resolved
clang/lib/Sema/AnalysisBasedWarnings.cpp Outdated Show resolved Hide resolved
clang/lib/Sema/AnalysisBasedWarnings.cpp Outdated Show resolved Hide resolved
clang/lib/Sema/AnalysisBasedWarnings.cpp Outdated Show resolved Hide resolved
clang/lib/Sema/SemaDecl.cpp Outdated Show resolved Hide resolved
clang/lib/Sema/SemaDecl.cpp Outdated Show resolved Hide resolved
Doug Wyatt added 2 commits July 26, 2024 07:47
Make FunctionEffect just have a Kind instead of a bitfield

(from PR llvm#100753)
Sirraide pushed a commit that referenced this pull request Jul 26, 2024
As a preliminary to PR #99656 ,
reduce the size of `FunctionEffect` to 1 byte.

---------

Co-authored-by: Doug Wyatt <dwyatt@apple.com>
Doug Wyatt added 5 commits July 26, 2024 12:48
Use LLVM_DEBUG for debug logging.
Remove unneeded isInline() check in functionIsVerifiable
Clean up traversal of constructor initializers and implicit destructors (with test).
… effect preventing inference.

- EffectAnalysis: Rename Diagnostic to Violation to clarify that it is an abstraction which can be reported as either a warning or a note depending on context.
Doug Wyatt and others added 2 commits September 25, 2024 17:15
Copy link
Member

@Sirraide Sirraide left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that was everything, so lgtm. Since this is a rather big pr tho, I’d like for someone else to approve it too before we merge it.

Copy link
Collaborator

@erichkeane erichkeane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First, thank you so much @Sirraide for doing such a thorough review here! I've been buried lately, so knowing you were taking care of this was really appreciated!

Second, this patch looks really close! I have a couple of quick/small comments. The ReadDeclID changing the ID seems odd, but otherwise its just documentation/comment requests.

@@ -571,6 +571,9 @@ New features
if class of allocation and deallocation function mismatches.
`Documentation <https://clang.llvm.org/docs/analyzer/checkers.html#unix-mismatcheddeallocator-c-c>`__.

- Function effects (the ``nonblocking`` and ``nonallocating`` "performance constraint" attributes)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like some elaboration here, reading this after coming in 'cold' I have no idea what this means. Also, why double-tilde's? I think we only use 1 for these? Or is .rst different enough again...?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RST needs double backticks for code, yeah. And the plan here is that we’ll add a link to the documentation (#109855) in a later pr (a bit like the release note above). Explaining what all of this is about is too much for a release note imo.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the help on the backticks. I think we could probably have the release notes still have a bit more detail. I don't need a full document, but at least a good sentence or of "here is what this is".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can take this sentence from the documentation:

Functions with these attributes are verified as not containing any language constructs or calls to other functions which violate the constraint.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think even that is a little 'requires existing knowledge'? Basically, the audience for this is 'random redditor trying to figure out what features to use'. So we need to make sure anything like this is sufficiently titillating while not being so long as to lose interest.

For most bugs in release notes, we can assume that if the person cares about it, they know the details about the bug, but for features this is sort of our first chance to 'advertise' them. So I'd like something less vague/that gives some good 'hook' as to why the person wants to use it.

Perhaps I'll make @shafik come in, read the documentation enough to understand it, and propose some text :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took a shot at something more "hook-y".

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still want @shafik to review it (as someone who hasn't been mentally 'poisoned' here by knowing too much about the feature), but perhaps his work is greatly reduced.

NonAllocating = 1,
Blocking = 2,
Allocating = 3,

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would there be a value to an EndList sort of element, so you could use that to auto-generate KindCount?

Also, is there some meaning to the values of the enums besides just wanting the default behavior of enums?

Copy link
Contributor Author

@dougsonos dougsonos Sep 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I'll add and use EndList. Being explicit with enum values is an ancient habit from the days when debugging sometimes involved looking at memory in hex. I'd agree they're superfluous here.

Copy link
Collaborator

@erichkeane erichkeane Sep 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we get around that in a few places I think... perhaps you can find what we do in other enums? Not at a computer at the moment, else I'd help search.

EDIT TO ADD: It might be "EndList = LastEntry" ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Last seems to be the most common name for that last value...

enum class ComparisonCategoryType : unsigned char {
  PartialOrdering,
  WeakOrdering,
  StrongOrdering,
  First = PartialOrdering,
  Last = StrongOrdering
};

enum class Lang { C = 0, CXX, LastValue = CXX };

enum class SyncScope {
  SystemScope,
  DeviceScope,
  WorkgroupScope,
  WavefrontScope,
  SingleScope,
  HIPSingleThread,
  HIPWavefront,
  HIPWorkgroup,
  HIPAgent,
  HIPSystem,
  OpenCLWorkGroup,
  OpenCLDevice,
  OpenCLAllSVMDevices,
  OpenCLSubGroup,
  Last = OpenCLSubGroup

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yes, we do use it quite a bit, you can also see the = , so just do that and use it kinda thing IMO.

clang/lib/Serialization/ASTReader.cpp Show resolved Hide resolved
SemaObj->addDeclWithEffects(FD, FD->getFunctionEffects());
else if (auto *BD = dyn_cast<BlockDecl>(D))
SemaObj->addDeclWithEffects(BD, BD->getFunctionEffects());
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we expecting anything else to make it into this list? Should there be an 'else assert' here?

Copy link
Contributor Author

@dougsonos dougsonos Sep 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll make it llvm_unreachable :

/// Use this instead of assert(0). It conveys intent more clearly, suppresses
/// diagnostics for unreachable code paths, and allows compilers to omit
/// unnecessary code.


enum class ViolationID : uint8_t {
None = 0, // Sentinel for an empty Violation.
// These first few map to a %select{} in a diagnostic.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of 'first few' can you clarify "first 6" or whatever? And also mention the diagnostic/diagnostics? I'd hate to have someone come along and add something here in the wrong place and get surprising/difficult to debug behavior.

class ViolationSite {
public:
enum class Kind : uint8_t {
Default = 0, // Function body.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question here with the values set, we typically don't set these to the default values unless it matches some sort of 'meaning' (that is, it acts as documentation).

@Sirraide
Copy link
Member

First, thank you so much @Sirraide for doing such a thorough review here! I've been buried lately, so knowing you were taking care of this was really appreciated!

Np. I also reviewed the first effects pr before this one, so I figured it’d make sense for me to review this because I was already familiar w/ the rest of the implementation.

Doug Wyatt added 3 commits September 26, 2024 08:21
… for ViolationSite::Kind.

- Update comment in ViolationID.
- Add an llvm_unreachable() in ASTReader.
- Tweak release notes.
…cept" now says "function/lambda/block" etc.
Copy link
Collaborator

@erichkeane erichkeane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no further comments here, this looks good to me.

@Sirraide
Copy link
Member

Sirraide commented Oct 2, 2024

@erichkeane Since you’ve approved the pr, shall we merge it once CI is done or do you still want to wait for Shafik to review the release note?

(I don’t have anything else to comment on either)

@erichkeane
Copy link
Collaborator

@erichkeane Since you’ve approved the pr, shall we merge it once CI is done or do you still want to wait for Shafik to review the release note?

(I don’t have anything else to comment on either)

I'm OK just committing. Shafik did a run at the release note that wasn't better enough to warrant any changes here, and I think it is clear enough for now. If someone has an improvement to make the release note better, patches welcome :)

@dougsonos
Copy link
Contributor Author

Thank you @Sirraide @erichkeane and all, for your patient, careful and thorough reviews!

@Sirraide
Copy link
Member

Sirraide commented Oct 3, 2024

Thank you @Sirraide @erichkeane and all, for your patient, careful and thorough reviews!

No problem, and thank you for contributing!

@Sirraide Sirraide merged commit 7fe43ad into llvm:main Oct 3, 2024
9 checks passed
@llvm-ci
Copy link
Collaborator

llvm-ci commented Oct 3, 2024

LLVM Buildbot has detected a new failure on builder lldb-arm-ubuntu running on linaro-lldb-arm-ubuntu while building clang at step 6 "test".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/18/builds/4846

Here is the relevant piece of the build log for the reference
Step 6 (test) failure: build (failure)
...
PASS: lldb-api :: lang/cpp/incomplete-stl-types/TestStlIncompleteTypes.py (794 of 2814)
PASS: lldb-api :: lang/cpp/keywords_enabled/TestCppKeywordsEnabled.py (795 of 2814)
PASS: lldb-api :: lang/cpp/incomplete-types/TestCppIncompleteTypes.py (796 of 2814)
PASS: lldb-api :: lang/cpp/inlines/TestInlines.py (797 of 2814)
PASS: lldb-api :: lang/cpp/lambdas/TestLambdas.py (798 of 2814)
PASS: lldb-api :: lang/cpp/llvm-style/TestLLVMStyle.py (799 of 2814)
UNSUPPORTED: lldb-api :: lang/cpp/modules-import/TestCXXModulesImport.py (800 of 2814)
PASS: lldb-api :: lang/cpp/limit-debug-info/TestWithLimitDebugInfo.py (801 of 2814)
PASS: lldb-api :: lang/cpp/member-and-local-vars-with-same-name/TestMembersAndLocalsWithSameName.py (802 of 2814)
PASS: lldb-api :: lang/cpp/multiple-inheritance/TestCppMultipleInheritance.py (803 of 2814)
FAIL: lldb-api :: lang/c/shared_lib_stripped_symbols/TestSharedLibStrippedSymbols.py (804 of 2814)
******************** TEST 'lldb-api :: lang/c/shared_lib_stripped_symbols/TestSharedLibStrippedSymbols.py' FAILED ********************
Script:
--
/usr/bin/python3.10 /home/tcwg-buildbot/worker/lldb-arm-ubuntu/llvm-project/lldb/test/API/dotest.py -u CXXFLAGS -u CFLAGS --env ARCHIVER=/usr/local/bin/llvm-ar --env OBJCOPY=/usr/bin/llvm-objcopy --env LLVM_LIBS_DIR=/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/./lib --env LLVM_INCLUDE_DIR=/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/include --env LLVM_TOOLS_DIR=/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/./bin --arch armv8l --build-dir /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/lldb-test-build.noindex --lldb-module-cache-dir /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/lldb-test-build.noindex/module-cache-lldb/lldb-api --clang-module-cache-dir /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/lldb-test-build.noindex/module-cache-clang/lldb-api --executable /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/./bin/lldb --compiler /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/./bin/clang --dsymutil /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/./bin/dsymutil --llvm-tools-dir /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/./bin --lldb-obj-root /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/tools/lldb --lldb-libs-dir /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/./lib /home/tcwg-buildbot/worker/lldb-arm-ubuntu/llvm-project/lldb/test/API/lang/c/shared_lib_stripped_symbols -p TestSharedLibStrippedSymbols.py
--
Exit Code: 1

Command Output (stdout):
--
lldb version 20.0.0git (https://github.com/llvm/llvm-project.git revision 7fe43ada28c31a9e9d82a76650d987a8b209755e)
  clang revision 7fe43ada28c31a9e9d82a76650d987a8b209755e
  llvm revision 7fe43ada28c31a9e9d82a76650d987a8b209755e
Skipping the following test categories: ['libc++', 'dsym', 'gmodules', 'debugserver', 'objc']

--
Command Output (stderr):
--
UNSUPPORTED: LLDB (/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/bin/clang-arm) :: test_expr_dsym (TestSharedLibStrippedSymbols.SharedLibStrippedTestCase) (test case does not fall in any category of interest for this run) 
FAIL: LLDB (/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/bin/clang-arm) :: test_expr_dwarf (TestSharedLibStrippedSymbols.SharedLibStrippedTestCase)
PASS: LLDB (/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/bin/clang-arm) :: test_expr_dwo (TestSharedLibStrippedSymbols.SharedLibStrippedTestCase)
UNSUPPORTED: LLDB (/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/bin/clang-arm) :: test_frame_variable_dsym (TestSharedLibStrippedSymbols.SharedLibStrippedTestCase) (test case does not fall in any category of interest for this run) 
XFAIL: LLDB (/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/bin/clang-arm) :: test_frame_variable_dwarf (TestSharedLibStrippedSymbols.SharedLibStrippedTestCase)
XFAIL: LLDB (/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/bin/clang-arm) :: test_frame_variable_dwo (TestSharedLibStrippedSymbols.SharedLibStrippedTestCase)
======================================================================
FAIL: test_expr_dwarf (TestSharedLibStrippedSymbols.SharedLibStrippedTestCase)
   Test that types work when defined in a shared library and forwa/d-declared in the main executable
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/tcwg-buildbot/worker/lldb-arm-ubuntu/llvm-project/lldb/packages/Python/lldbsuite/test/lldbtest.py", line 1769, in test_method
    return attrvalue(self)
  File "/home/tcwg-buildbot/worker/lldb-arm-ubuntu/llvm-project/lldb/test/API/lang/c/shared_lib_stripped_symbols/TestSharedLibStrippedSymbols.py", line 24, in test_expr
    self.expect(
  File "/home/tcwg-buildbot/worker/lldb-arm-ubuntu/llvm-project/lldb/packages/Python/lldbsuite/test/lldbtest.py", line 2370, in expect
    self.runCmd(
  File "/home/tcwg-buildbot/worker/lldb-arm-ubuntu/llvm-project/lldb/packages/Python/lldbsuite/test/lldbtest.py", line 1000, in runCmd
    self.assertTrue(self.res.Succeeded(), msg + output)
AssertionError: False is not true : Variable(s) displayed correctly
Error output:

@Sirraide
Copy link
Member

Sirraide commented Oct 3, 2024

LLVM Buildbot has detected a new failure on builder lldb-arm-ubuntu

Looks like the failure is somewhere in an LLDB test, so I think we can ignore this because that’s completely unrelated.

xgupta pushed a commit to xgupta/llvm-project that referenced this pull request Oct 4, 2024
…analysis (llvm#99656)

- In Sema, when encountering Decls with function effects needing
verification, add them to a vector, DeclsWithEffectsToVerify.
- Update AST serialization to include DeclsWithEffectsToVerify.
- In AnalysisBasedWarnings, use DeclsWithEffectsToVerify as a work
queue, verifying functions with declared effects, and inferring (when
permitted and necessary) whether their callees have effects.

---------

Co-authored-by: Doug Wyatt <dwyatt@apple.com>
Co-authored-by: Sirraide <aeternalmail@gmail.com>
Co-authored-by: Erich Keane <ekeane@nvidia.com>
ldionne pushed a commit that referenced this pull request Oct 17, 2024
…#109151)

Make __libcpp_verbose_abort() noexcept (it is already noreturn), to
match std::terminate(). Clang's function effect analysis can use this to
ignore such functions as being beyond its scope. (See
#99656).
jroelofs pushed a commit to swiftlang/llvm-project that referenced this pull request Oct 17, 2024
As a preliminary to PR llvm#99656 ,
reduce the size of `FunctionEffect` to 1 byte.

---------

Co-authored-by: Doug Wyatt <dwyatt@apple.com>
jroelofs pushed a commit to swiftlang/llvm-project that referenced this pull request Oct 17, 2024
…aller/callee analysis (llvm#99656)

- In Sema, when encountering Decls with function effects needing
verification, add them to a vector, DeclsWithEffectsToVerify.
- Update AST serialization to include DeclsWithEffectsToVerify.
- In AnalysisBasedWarnings, use DeclsWithEffectsToVerify as a work
queue, verifying functions with declared effects, and inferring (when
permitted and necessary) whether their callees have effects.

---------

Co-authored-by: Doug Wyatt <dwyatt@apple.com>
Co-authored-by: Sirraide <aeternalmail@gmail.com>
Co-authored-by: Erich Keane <ekeane@nvidia.com>
EricWF pushed a commit to efcs/llvm-project that referenced this pull request Oct 22, 2024
…llvm#109151)

Make __libcpp_verbose_abort() noexcept (it is already noreturn), to
match std::terminate(). Clang's function effect analysis can use this to
ignore such functions as being beyond its scope. (See
llvm#99656).
cjappl pushed a commit that referenced this pull request Oct 28, 2024
Follow-on from #99656, which introduces 2nd pass caller/callee analysis
for function effects.

Wrote a new documentation page, derived directly from the RFC posted to
LLVM Discourse earlier this year.

---------

Co-authored-by: Doug Wyatt <dwyatt@apple.com>
Co-authored-by: Sirraide <aeternalmail@gmail.com>
NoumanAmir657 pushed a commit to NoumanAmir657/llvm-project that referenced this pull request Nov 4, 2024
Follow-on from llvm#99656, which introduces 2nd pass caller/callee analysis
for function effects.

Wrote a new documentation page, derived directly from the RFC posted to
LLVM Discourse earlier this year.

---------

Co-authored-by: Doug Wyatt <dwyatt@apple.com>
Co-authored-by: Sirraide <aeternalmail@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:modules C++20 modules and Clang Header Modules clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants