Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[clang][deps] Fix __has_include behavior with umbrella headers #70144

Merged
merged 2 commits into from
Oct 25, 2023

Conversation

jansvoboda11
Copy link
Contributor

@jansvoboda11 jansvoboda11 commented Oct 24, 2023

Previously, Clang wouldn't try to resolve the module for the header being checked via __has_include. This meant that such header was considered textual (a.k.a. part of the module Clang is currently compiling).

rdar://116926060

Previously, Clang wouldn't try to resolve the module for the header being checked via `__has_include`. This meant that such header was considered textual (a.k.a. part of the module Clang is currently compiling).
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Oct 24, 2023
@llvmbot
Copy link
Collaborator

llvmbot commented Oct 24, 2023

@llvm/pr-subscribers-clang

Author: Jan Svoboda (jansvoboda11)

Changes

Previously, Clang wouldn't try to resolve the module for the header being checked via __has_include. This meant that such header was considered textual (a.k.a. part of the module Clang is currently compiling).


Full diff: https://github.com/llvm/llvm-project/pull/70144.diff

2 Files Affected:

  • (modified) clang/lib/Lex/PPMacroExpansion.cpp (+6-1)
  • (added) clang/test/ClangScanDeps/modules-has-include-umbrella-header.c (+99)
diff --git a/clang/lib/Lex/PPMacroExpansion.cpp b/clang/lib/Lex/PPMacroExpansion.cpp
index b371f8cf7a9c072..30c4abdbad8aa44 100644
--- a/clang/lib/Lex/PPMacroExpansion.cpp
+++ b/clang/lib/Lex/PPMacroExpansion.cpp
@@ -1248,10 +1248,15 @@ static bool EvaluateHasIncludeCommon(Token &Tok, IdentifierInfo *II,
   if (Filename.empty())
     return false;
 
+  // Passing this to LookupFile forces header search to check whether the found
+  // file belongs to a module. Skipping that check could incorrectly mark
+  // modular header as textual, causing issues down the line.
+  ModuleMap::KnownHeader KH;
+
   // Search include directories.
   OptionalFileEntryRef File =
       PP.LookupFile(FilenameLoc, Filename, isAngled, LookupFrom, LookupFromFile,
-                    nullptr, nullptr, nullptr, nullptr, nullptr, nullptr);
+                    nullptr, nullptr, nullptr, &KH, nullptr, nullptr);
 
   if (PPCallbacks *Callbacks = PP.getPPCallbacks()) {
     SrcMgr::CharacteristicKind FileType = SrcMgr::C_User;
diff --git a/clang/test/ClangScanDeps/modules-has-include-umbrella-header.c b/clang/test/ClangScanDeps/modules-has-include-umbrella-header.c
new file mode 100644
index 000000000000000..45ff2bd3b9cd24e
--- /dev/null
+++ b/clang/test/ClangScanDeps/modules-has-include-umbrella-header.c
@@ -0,0 +1,99 @@
+// This test checks that __has_include(<FW/PrivateHeader.h>) in a module does
+// not clobber #include <FW/PrivateHeader.h> in importers of said module.
+
+// RUN: rm -rf %t
+// RUN: split-file %s %t
+
+//--- cdb.json.template
+[{
+  "file": "DIR/tu.c",
+  "directory": "DIR",
+  "command": "clang DIR/tu.c -fmodules -fmodules-cache-path=DIR/cache -I DIR/modules -F DIR/frameworks -o DIR/tu.o"
+}]
+
+//--- frameworks/FW.framework/Modules/module.private.modulemap
+framework module FW_Private {
+  umbrella header "A.h"
+  module * { export * }
+}
+//--- frameworks/FW.framework/PrivateHeaders/A.h
+#include <FW/B.h>
+//--- frameworks/FW.framework/PrivateHeaders/B.h
+struct B {};
+
+//--- modules/module.modulemap
+module Foo { header "foo.h" }
+//--- modules/foo.h
+#if __has_include(<FW/B.h>)
+#define HAS_B 1
+#else
+#define HAS_B 0
+#endif
+
+//--- tu.c
+#include "foo.h"
+#include <FW/B.h>
+
+// RUN: sed -e "s|DIR|%/t|g" %t/cdb.json.template > %t/cdb.json
+// RUN: clang-scan-deps -compilation-database %t/cdb.json -format experimental-full > %t/deps.json
+// RUN: cat %t/deps.json | sed 's:\\\\\?:/:g' | FileCheck %s -DPREFIX=%t
+
+// CHECK:      {
+// CHECK-NEXT:   "modules": [
+// CHECK-NEXT:     {
+// CHECK-NEXT:       "clang-module-deps": [],
+// CHECK-NEXT:       "clang-modulemap-file": "[[PREFIX]]/frameworks/FW.framework/Modules/module.private.modulemap",
+// CHECK-NEXT:       "command-line": [
+// CHECK:            ],
+// CHECK-NEXT:       "context-hash": "{{.*}}",
+// CHECK-NEXT:       "file-deps": [
+// CHECK-NEXT:         "[[PREFIX]]/frameworks/FW.framework/Modules/module.private.modulemap",
+// CHECK-NEXT:         "[[PREFIX]]/frameworks/FW.framework/PrivateHeaders/A.h",
+// CHECK-NEXT:         "[[PREFIX]]/frameworks/FW.framework/PrivateHeaders/B.h"
+// CHECK-NEXT:       ],
+// CHECK-NEXT:       "name": "FW_Private"
+// CHECK-NEXT:     },
+// CHECK-NEXT:     {
+// CHECK-NEXT:       "clang-module-deps": [],
+// CHECK-NEXT:       "clang-modulemap-file": "[[PREFIX]]/modules/module.modulemap",
+// CHECK-NEXT:       "command-line": [
+// CHECK:              "-fmodule-map-file=[[PREFIX]]/frameworks/FW.framework/Modules/module.private.modulemap",
+// CHECK:            ],
+// CHECK-NEXT:       "context-hash": "{{.*}}",
+// CHECK-NEXT:       "file-deps": [
+// FIXME: The frameworks/FW.framework/PrivateHeaders/B.h header never makes it into SourceManager,
+//        so we don't track it as a file dependency (even though we should).
+// CHECK-NEXT:         "[[PREFIX]]/frameworks/FW.framework/Modules/module.private.modulemap",
+// CHECK-NEXT:         "[[PREFIX]]/modules/foo.h",
+// CHECK-NEXT:         "[[PREFIX]]/modules/module.modulemap"
+// CHECK-NEXT:       ],
+// CHECK-NEXT:       "name": "Foo"
+// CHECK-NEXT:     }
+// CHECK-NEXT:   ],
+// CHECK-NEXT:   "translation-units": [
+// CHECK-NEXT:     {
+// CHECK-NEXT:       "commands": [
+// CHECK-NEXT:         {
+// CHECK-NEXT:           "clang-context-hash": "{{.*}}",
+// CHECK-NEXT:           "clang-module-deps": [
+// CHECK-NEXT:             {
+// CHECK-NEXT:               "context-hash": "{{.*}}",
+// CHECK-NEXT:               "module-name": "FW_Private"
+// CHECK-NEXT:             },
+// CHECK-NEXT:             {
+// CHECK-NEXT:               "context-hash": "{{.*}}",
+// CHECK-NEXT:               "module-name": "Foo"
+// CHECK-NEXT:             }
+// CHECK-NEXT:           ],
+// CHECK-NEXT:           "command-line": [
+// CHECK:                ],
+// CHECK-NEXT:           "executable": "clang",
+// CHECK-NEXT:           "file-deps": [
+// CHECK-NEXT:             "[[PREFIX]]/tu.c"
+// CHECK-NEXT:           ],
+// CHECK-NEXT:           "input-file": "[[PREFIX]]/tu.c"
+// CHECK:              }
+// CHECK:            ]
+// CHECK:          }
+// CHECK:        ]
+// CHECK:      }

Copy link
Collaborator

@benlangmuir benlangmuir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you clarify how this bug occurs in terms of what information about the header is stored that causes us to get the wrong module? It seems bad that passing nullptr to LookupFile could cause incorrect behaviour and makes me wonder if we need to always trigger module lookup or if there is another way to fix this that would make it safe to not do module lookup here.

@jansvoboda11
Copy link
Contributor Author

Can you clarify how this bug occurs in terms of what information about the header is stored that causes us to get the wrong module?

I updated the test case to be more in line with how the bug manifested in the wild. Let me annotate the test case here to hopefully clarify things:

//--- tu.c
#include "poison.h" // This imports Poison.pcm. In its HEADER_SEARCH_TABLE record, this PCM stores
                    // information that "FW/B.h" is textual. This happens because the compiler never
                    // looked for the module map when it encountered `__has_include(<FW/B.h>)`, and
                    // therefore set `HeaderFileInfo::{isModuleHeader,isCompilingModuleHeader}` to
                    // `false` for that header. `ASTWriter` then decided to serialize info on that header:
                    // https://github.com/llvm/llvm-project/blob/740582fa4c9512b34128dc97b2eff56820984421/clang/lib/Serialization/ASTWriter.cpp#L2023

#if __has_include(<FW/B.h>) // This looks into Poison.pcm, finds out that "FW/B.h" is textual,
                            // and caches that in HeaderSearch::FileInfo.
#endif

#include "import.h" // This transitively imports FW_Private.pcm.
                    // This does not parse FW_Private module map, that would actually tell us that "FW/B.h"
                    // is most likely part of FW_Private (via the PrivateHeaders umbrella header).
                    // FW_Private.pcm does contain the information that "FW/B.h" is part of FW_Private, but...

#include <FW/B.h> // ... this does not deserialize the info on "FW/B.h" from FW_Private.pcm
                  // due to the unimplemented FIXME here:
                  // https://github.com/llvm/llvm-project/blob/740582fa4c9512b34128dc97b2eff56820984421/clang/lib/Lex/HeaderSearch.cpp#L1320
                  // This header is therefore considered textual.

So an alternative solution would be to implement that fixme and ensure we do deserialize HeaderFileInfo from newly loaded PCM files. That would tell us the FW_Private.pcm considers "FW/B.h" modular. I'd rather fix what we actually serialize into PCM files first.

It seems bad that passing nullptr to LookupFile could cause incorrect behavior and makes me wonder if we need to always trigger module lookup or if there is another way to fix this that would make it safe to not do module lookup here.

Yeah. I did try to fix up all calls to LookupFile to perform module map lookup, but a bunch of tests started failing (mostly standard C++ modules tests IIRC), so there's probably more nuance required there.

Copy link
Collaborator

@benlangmuir benlangmuir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. I did try to fix up all calls to LookupFile to perform module map lookup, but a bunch of tests started failing (mostly standard C++ modules tests IIRC), so there's probably more nuance required there.

Okay, I do think this is worth fixing long term, but I don't want to block on it. Your change LGTM in the meantime.

@jansvoboda11 jansvoboda11 merged commit a1b4238 into llvm:main Oct 25, 2023
2 of 3 checks passed
@jansvoboda11 jansvoboda11 deleted the has-include-umbrella-header branch October 25, 2023 21:58
jansvoboda11 added a commit to swiftlang/llvm-project that referenced this pull request Oct 26, 2023
…m#70144)

Previously, Clang wouldn't try to resolve the module for the header
being checked via `__has_include`. This meant that such header was
considered textual (a.k.a. part of the module Clang is currently
compiling).

rdar://116926060
(cherry picked from commit a1b4238)
zahiraam pushed a commit to zahiraam/llvm-project that referenced this pull request Oct 26, 2023
…m#70144)

Previously, Clang wouldn't try to resolve the module for the header
being checked via `__has_include`. This meant that such header was
considered textual (a.k.a. part of the module Clang is currently
compiling).

rdar://116926060
jansvoboda11 added a commit that referenced this pull request Nov 7, 2023
)

After #70144 Clang started resolving module maps even for
`__has_include()` expressions. This had the unintended consequence of
emitting diagnostics around header misuse. These don't make sense if you
actually don't bring contents of the header into the importer, so should
be skipped for `__has_include()`. This patch moves emission of these
diagnostics out of `Preprocessor::LookupFile()` up into
`Preprocessor::LookupHeaderIncludeOrImport()`.
jansvoboda11 added a commit to swiftlang/llvm-project that referenced this pull request Nov 7, 2023
…m#71450)

After llvm#70144 Clang started resolving module maps even for
`__has_include()` expressions. This had the unintended consequence of
emitting diagnostics around header misuse. These don't make sense if you
actually don't bring contents of the header into the importer, so should
be skipped for `__has_include()`. This patch moves emission of these
diagnostics out of `Preprocessor::LookupFile()` up into
`Preprocessor::LookupHeaderIncludeOrImport()`.

(cherry picked from commit bdb309c)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants