Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MC] Make UseAssemblerInfoForParsing mostly true #91082

Merged
merged 2 commits into from
May 15, 2024

Conversation

MaskRay
Copy link
Member

@MaskRay MaskRay commented May 4, 2024

Commit 6c0665e
(https://reviews.llvm.org/D45164) enabled certain constant expression
evaluation for MCObjectStreamer at parse time (e.g. .if directives,
see llvm/test/MC/AsmParser/assembler-expressions.s).

getUseAssemblerInfoForParsing was added to make clang -c handling
inline assembly similar to MCAsmStreamer (e.g. llvm-mc -filetype=asm),
where such expression folding (related to
AttemptToFoldSymbolOffsetDifference) is unavailable.

I believe this is overly conservative. We can make some parse-time
expression folding work for clang -c even if clang -S would still
report an error, a MCAsmStreamer issue (we cannot print .if
directives) that should not restrict the functionality of
MCObjectStreamer.

% cat b.cc
asm(R"(
.pushsection .text,"ax"
.globl _start; _start: ret
.if . -_start == 1
  ret
.endif
.popsection
)");
% gcc -S b.cc && gcc -c b.cc
% clang -S -fno-integrated-as b.cc     # succeeded

% clang -c b.cc     # succeeded with this patch
% clang -S b.cc     # still failed
<inline asm>:4:5: error: expected absolute expression
    4 | .if . -_start == 1
      |     ^
1 error generated.

Close #62520
Link: https://discourse.llvm.org/t/rfc-clang-assembly-object-equivalence-for-files-with-inline-assembly/78841

Created using spr 1.3.5-bogner
@MaskRay MaskRay requested a review from arsenm May 4, 2024 20:10
@llvmbot llvmbot added clang Clang issues not falling into any other category backend:AMDGPU mc Machine (object) code backend:SPIR-V labels May 4, 2024
@MaskRay MaskRay requested review from smithp35 and jayfoad May 4, 2024 20:10
@llvmbot
Copy link
Member

llvmbot commented May 4, 2024

@llvm/pr-subscribers-clang
@llvm/pr-subscribers-backend-amdgpu
@llvm/pr-subscribers-mc

@llvm/pr-subscribers-backend-spir-v

Author: Fangrui Song (MaskRay)

Changes

Commit 6c0665e
(https://reviews.llvm.org/D45164) enabled certain constant expression
evaluation for MCObjectStreamer at parse time (e.g. .if directives,
see llvm/test/MC/AsmParser/assembler-expressions.s).

getUseAssemblerInfoForParsing was added to make clang -c handling
inline assembly similar to MCAsmStreamer (e.g. llvm-mc -filetype=asm),
where such expression folding (related to
AttemptToFoldSymbolOffsetDifference) is unavailable.

I believe this is overly conservative. We can make some parse-time
expression folding work for clang -c even if clang -S would still
report an error, a MCAsmStreamer issue (we cannot print .if
directives) that should not restrict the functionality of
MCObjectStreamer.

% cat b.cc
asm(R"(
.pushsection .text,"ax"
.globl _start; _start: ret
.if . -_start == 1
  ret
.endif
.popsection
)");
% gcc -S b.cc &amp;&amp; gcc -c b.cc
% clang -S -fno-integrated-as b.cc     # succeeded

% clang -c b.cc     # succeeded with this patch
% clang -S b.cc     # still failed
&lt;inline asm&gt;:4:5: error: expected absolute expression
    4 | .if . -_start == 1
      |     ^
1 error generated.

Close #62520


Full diff: https://github.com/llvm/llvm-project/pull/91082.diff

10 Files Affected:

  • (modified) clang/tools/driver/cc1as_main.cpp (-3)
  • (modified) llvm/include/llvm/MC/MCStreamer.h (+2-5)
  • (modified) llvm/lib/CodeGen/AsmPrinter/AsmPrinterInlineAsm.cpp (-3)
  • (modified) llvm/lib/MC/MCObjectStreamer.cpp (+1-8)
  • (modified) llvm/lib/MC/MCStreamer.cpp (+1-1)
  • (modified) llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp (+2-5)
  • (modified) llvm/lib/Target/SPIRV/SPIRVAsmPrinter.cpp (-3)
  • (modified) llvm/test/MC/AsmParser/assembler-expressions-inlineasm.ll (+10-6)
  • (modified) llvm/tools/llvm-mc/llvm-mc.cpp (-3)
  • (modified) llvm/tools/llvm-ml/llvm-ml.cpp (-3)
diff --git a/clang/tools/driver/cc1as_main.cpp b/clang/tools/driver/cc1as_main.cpp
index 86afe22fac24cc..4eb753a7297a92 100644
--- a/clang/tools/driver/cc1as_main.cpp
+++ b/clang/tools/driver/cc1as_main.cpp
@@ -576,9 +576,6 @@ static bool ExecuteAssemblerImpl(AssemblerInvocation &Opts,
     Str.get()->emitZeros(1);
   }
 
-  // Assembly to object compilation should leverage assembly info.
-  Str->setUseAssemblerInfoForParsing(true);
-
   bool Failed = false;
 
   std::unique_ptr<MCAsmParser> Parser(
diff --git a/llvm/include/llvm/MC/MCStreamer.h b/llvm/include/llvm/MC/MCStreamer.h
index 69867620e1bf8a..50986e6bde8867 100644
--- a/llvm/include/llvm/MC/MCStreamer.h
+++ b/llvm/include/llvm/MC/MCStreamer.h
@@ -245,8 +245,6 @@ class MCStreamer {
   /// requires.
   unsigned NextWinCFIID = 0;
 
-  bool UseAssemblerInfoForParsing;
-
   /// Is the assembler allowed to insert padding automatically?  For
   /// correctness reasons, we sometimes need to ensure instructions aren't
   /// separated in unexpected ways.  At the moment, this feature is only
@@ -296,11 +294,10 @@ class MCStreamer {
 
   MCContext &getContext() const { return Context; }
 
+  // MCObjectStreamer has an MCAssembler and allows more expression folding at
+  // parse time.
   virtual MCAssembler *getAssemblerPtr() { return nullptr; }
 
-  void setUseAssemblerInfoForParsing(bool v) { UseAssemblerInfoForParsing = v; }
-  bool getUseAssemblerInfoForParsing() { return UseAssemblerInfoForParsing; }
-
   MCTargetStreamer *getTargetStreamer() {
     return TargetStreamer.get();
   }
diff --git a/llvm/lib/CodeGen/AsmPrinter/AsmPrinterInlineAsm.cpp b/llvm/lib/CodeGen/AsmPrinter/AsmPrinterInlineAsm.cpp
index d0ef3e5a19391c..08e3c208ba4d38 100644
--- a/llvm/lib/CodeGen/AsmPrinter/AsmPrinterInlineAsm.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/AsmPrinterInlineAsm.cpp
@@ -102,9 +102,6 @@ void AsmPrinter::emitInlineAsm(StringRef Str, const MCSubtargetInfo &STI,
   std::unique_ptr<MCAsmParser> Parser(
       createMCAsmParser(SrcMgr, OutContext, *OutStreamer, *MAI, BufNum));
 
-  // Do not use assembler-level information for parsing inline assembly.
-  OutStreamer->setUseAssemblerInfoForParsing(false);
-
   // We create a new MCInstrInfo here since we might be at the module level
   // and not have a MachineFunction to initialize the TargetInstrInfo from and
   // we only need MCInstrInfo for asm parsing. We create one unconditionally
diff --git a/llvm/lib/MC/MCObjectStreamer.cpp b/llvm/lib/MC/MCObjectStreamer.cpp
index d2da5d0d3f90f2..a9003a164b306d 100644
--- a/llvm/lib/MC/MCObjectStreamer.cpp
+++ b/llvm/lib/MC/MCObjectStreamer.cpp
@@ -40,14 +40,7 @@ MCObjectStreamer::MCObjectStreamer(MCContext &Context,
 
 MCObjectStreamer::~MCObjectStreamer() = default;
 
-// AssemblerPtr is used for evaluation of expressions and causes
-// difference between asm and object outputs. Return nullptr to in
-// inline asm mode to limit divergence to assembly inputs.
-MCAssembler *MCObjectStreamer::getAssemblerPtr() {
-  if (getUseAssemblerInfoForParsing())
-    return Assembler.get();
-  return nullptr;
-}
+MCAssembler *MCObjectStreamer::getAssemblerPtr() { return Assembler.get(); }
 
 void MCObjectStreamer::addPendingLabel(MCSymbol* S) {
   MCSection *CurSection = getCurrentSectionOnly();
diff --git a/llvm/lib/MC/MCStreamer.cpp b/llvm/lib/MC/MCStreamer.cpp
index 176d55aa890bed..199d865ea3496d 100644
--- a/llvm/lib/MC/MCStreamer.cpp
+++ b/llvm/lib/MC/MCStreamer.cpp
@@ -93,7 +93,7 @@ void MCTargetStreamer::emitAssignment(MCSymbol *Symbol, const MCExpr *Value) {}
 
 MCStreamer::MCStreamer(MCContext &Ctx)
     : Context(Ctx), CurrentWinFrameInfo(nullptr),
-      CurrentProcWinFrameInfoStartIndex(0), UseAssemblerInfoForParsing(false) {
+      CurrentProcWinFrameInfoStartIndex(0) {
   SectionStack.push_back(std::pair<MCSectionSubPair, MCSectionSubPair>());
 }
 
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp b/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp
index c090d6133dbad4..f891eb97f358b8 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp
@@ -511,12 +511,9 @@ bool AMDGPUAsmPrinter::runOnMachineFunction(MachineFunction &MF) {
 
   DumpCodeInstEmitter = nullptr;
   if (STM.dumpCode()) {
-    // For -dumpcode, get the assembler out of the streamer, even if it does
-    // not really want to let us have it. This only works with -filetype=obj.
-    bool SaveFlag = OutStreamer->getUseAssemblerInfoForParsing();
-    OutStreamer->setUseAssemblerInfoForParsing(true);
+    // For -dumpcode, get the assembler out of the streamer. This only works
+    // with -filetype=obj.
     MCAssembler *Assembler = OutStreamer->getAssemblerPtr();
-    OutStreamer->setUseAssemblerInfoForParsing(SaveFlag);
     if (Assembler)
       DumpCodeInstEmitter = Assembler->getEmitterPtr();
   }
diff --git a/llvm/lib/Target/SPIRV/SPIRVAsmPrinter.cpp b/llvm/lib/Target/SPIRV/SPIRVAsmPrinter.cpp
index 2ebe5bdc47715b..ad015808604487 100644
--- a/llvm/lib/Target/SPIRV/SPIRVAsmPrinter.cpp
+++ b/llvm/lib/Target/SPIRV/SPIRVAsmPrinter.cpp
@@ -114,12 +114,9 @@ void SPIRVAsmPrinter::emitEndOfAsmFile(Module &M) {
   // Bound is an approximation that accounts for the maximum used register
   // number and number of generated OpLabels
   unsigned Bound = 2 * (ST->getBound() + 1) + NLabels;
-  bool FlagToRestore = OutStreamer->getUseAssemblerInfoForParsing();
-  OutStreamer->setUseAssemblerInfoForParsing(true);
   if (MCAssembler *Asm = OutStreamer->getAssemblerPtr())
     Asm->setBuildVersion(static_cast<MachO::PlatformType>(0), Major, Minor,
                          Bound, VersionTuple(Major, Minor, 0, Bound));
-  OutStreamer->setUseAssemblerInfoForParsing(FlagToRestore);
 }
 
 void SPIRVAsmPrinter::emitFunctionHeader() {
diff --git a/llvm/test/MC/AsmParser/assembler-expressions-inlineasm.ll b/llvm/test/MC/AsmParser/assembler-expressions-inlineasm.ll
index 35f110f37e2fb6..9d9a38f5b5a54b 100644
--- a/llvm/test/MC/AsmParser/assembler-expressions-inlineasm.ll
+++ b/llvm/test/MC/AsmParser/assembler-expressions-inlineasm.ll
@@ -1,13 +1,17 @@
-; RUN: not llc -mtriple x86_64-unknown-linux-gnu -o %t.s -filetype=asm %s 2>&1 | FileCheck %s
-; RUN: not llc -mtriple x86_64-unknown-linux-gnu -o %t.o -filetype=obj %s 2>&1 | FileCheck %s
-
-; Assembler-aware expression evaluation should be disabled in inline
-; assembly to prevent differences in behavior between object and
-; assembly output.
+; RUN: not llc -mtriple=x86_64 %s -o /dev/null 2>&1 | FileCheck %s
+; RUN: llc -mtriple=x86_64 -no-integrated-as < %s | FileCheck %s --check-prefix=GAS
+; RUN: llc -mtriple=x86_64 -filetype=obj %s -o - | llvm-objdump -d - |  FileCheck %s --check-prefix=DISASM
 
+; GAS: nop;  .if . - foo==1; nop;.endif
 
 ; CHECK: <inline asm>:1:17: error: expected absolute expression
 
+; DISASM:      <main>:
+; DISASM-NEXT:   nop
+; DISASM-NEXT:   nop
+; DISASM-NEXT:   xorl %eax, %eax
+; DISASM-NEXT:   retq
+
 define i32 @main() local_unnamed_addr {
   tail call void asm sideeffect "foo: nop;  .if . - foo==1;  nop;.endif", "~{dirflag},~{fpsr},~{flags}"()
   ret i32 0
diff --git a/llvm/tools/llvm-mc/llvm-mc.cpp b/llvm/tools/llvm-mc/llvm-mc.cpp
index 807071a7b9a16a..506e4f22ef8f54 100644
--- a/llvm/tools/llvm-mc/llvm-mc.cpp
+++ b/llvm/tools/llvm-mc/llvm-mc.cpp
@@ -569,9 +569,6 @@ int main(int argc, char **argv) {
       Str->initSections(true, *STI);
   }
 
-  // Use Assembler information for parsing.
-  Str->setUseAssemblerInfoForParsing(true);
-
   int Res = 1;
   bool disassemble = false;
   switch (Action) {
diff --git a/llvm/tools/llvm-ml/llvm-ml.cpp b/llvm/tools/llvm-ml/llvm-ml.cpp
index 1cac576f54e77f..f1f39af059aa49 100644
--- a/llvm/tools/llvm-ml/llvm-ml.cpp
+++ b/llvm/tools/llvm-ml/llvm-ml.cpp
@@ -428,9 +428,6 @@ int llvm_ml_main(int Argc, char **Argv, const llvm::ToolContext &) {
     Str->emitAssignment(Feat00Sym, MCConstantExpr::create(Feat00Flags, Ctx));
   }
 
-  // Use Assembler information for parsing.
-  Str->setUseAssemblerInfoForParsing(true);
-
   int Res = 1;
   if (InputArgs.hasArg(OPT_as_lex)) {
     // -as-lex; Lex only, and output a stream of tokens

Copy link
Collaborator

@smithp35 smithp35 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code-changes look good, what I'm less sure about is whether this is the right thing to do unconditionally.

A few months ago I would likely to have said this wouldn't matter that much as most people don't use -S in there build systems so any failures are likely to have been of minor incovenience. However we recently encountered a customer using -save-temps on their main build, which produces assembly and then re-assembles it; in our case there was a problem with describing the target to the assembler so that step failed.

With this in mind I'm a bit reluctant to say that this wouldn't affect anyone's use case.

This could be worth a RFC on Discourse to see if there is wider support or just apathy?

@MaskRay
Copy link
Member Author

MaskRay commented May 7, 2024

This PR will address a Linux kernel need.
https://lore.kernel.org/all/CAFP8O3JkgQsH-4Lmr2W_teuvLjOCPi1htr9r3CO1O0yLyw=Azg@mail.gmail.com/

The code already assembles with clang -S -fno-integrated-as and gcc -c,
and this patch is to make clang -c work by removeing an unnecessary restriction in MCObjectStreamer.

I plan on making separate, more nuanced changes to MCAsmStreamer in the future,
(#62520 (comment)),
but I believe it is not worth holding back the MCObjectStreamer improvement.

Do you think this description is clear? I can create a Discourse post, but given the specific nature and subtlety of integrated assemblers, there might be limited general interest.

BTW: clang -c -save-temps using an .s immediate file and MCAsmStreamer
inherently has some differences with clang -c using MCObjectStreamer.
For example, I've fixed a debug info issue #75022
and some clangDriver stuff I forget now.

@smithp35
Copy link
Collaborator

smithp35 commented May 7, 2024

Thanks for the additional context. My main concern is that we're undoing the consensus of https://reviews.llvm.org/D45164 which if I've understood the comments properly was "There is a reasonable expectation that compiled (not assembled) code should be identical, or at least as close to the assembly output.

I'm not hugely concerned about that personally as I don't think there are any written guarantees and I come from a background of a toolchain that didn't come close to that (assembler output was disassembled from object file), however there were some strong opinions on the original change.

Do we have any strong opinions from the other reviewers?

If there is a RFC I suggest that it would be entitled something like "[RFC] Clang assembly/object equivalence for files with inline assembly". If it is worded in such a way that this is needed for the kernel and we want to check for community input then if there is no response then we can go ahead.

@MaskRay
Copy link
Member Author

MaskRay commented May 7, 2024

Thanks for the additional context. My main concern is that we're undoing the consensus of reviews.llvm.org/D45164 which if I've understood the comments properly was "There is a reasonable expectation that compiled (not assembled) code should be identical, or at least as close to the assembly output.

I'm not hugely concerned about that personally as I don't think there are any written guarantees and I come from a background of a toolchain that didn't come close to that (assembler output was disassembled from object file), however there were some strong opinions on the original change.

Do we have any strong opinions from the other reviewers?

If there is a RFC I suggest that it would be entitled something like "[RFC] Clang assembly/object equivalence for files with inline assembly". If it is worded in such a way that this is needed for the kernel and we want to check for community input then if there is no response then we can go ahead.

Thanks for the suggestion. I created https://discourse.llvm.org/t/rfc-clang-assembly-object-equivalence-for-files-with-inline-assembly/78841

@MaskRay
Copy link
Member Author

MaskRay commented May 14, 2024

Thanks for the additional context. My main concern is that we're undoing the consensus of reviews.llvm.org/D45164 which if I've understood the comments properly was "There is a reasonable expectation that compiled (not assembled) code should be identical, or at least as close to the assembly output.
I'm not hugely concerned about that personally as I don't think there are any written guarantees and I come from a background of a toolchain that didn't come close to that (assembler output was disassembled from object file), however there were some strong opinions on the original change.
Do we have any strong opinions from the other reviewers?
If there is a RFC I suggest that it would be entitled something like "[RFC] Clang assembly/object equivalence for files with inline assembly". If it is worded in such a way that this is needed for the kernel and we want to check for community input then if there is no response then we can go ahead.

Thanks for the suggestion. I created discourse.llvm.org/t/rfc-clang-assembly-object-equivalence-for-files-with-inline-assembly/78841

So far there has only been one reply from @efriedma-quic . Shall we go ahead?

I care less about error messages than outright miscompiles.

We don't have miscompiles, just the difference in whether a .if construct can be assembled.

Copy link
Collaborator

@smithp35 smithp35 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is reasonable to proceed given the RFC and the response.

Created using spr 1.3.5-bogner
@MaskRay MaskRay merged commit 03c53c6 into main May 15, 2024
4 of 5 checks passed
@MaskRay MaskRay deleted the users/MaskRay/spr/mc-remove-useassemblerinfoforparsing branch May 15, 2024 16:18
@kparzysz
Copy link
Contributor

It's still used:

/work/kparzysz/git/llvm.org/mlir/lib/Target/LLVM/ROCDL/Target.cpp: In member function ‘std::optional<llvm::SmallVector<char, 0> > mlir::ROCDL::SerializeGPUModuleBase::assembleIsa(llvm::StringRef)’:
/work/kparzysz/git/llvm.org/mlir/lib/Target/LLVM/ROCDL/Target.cpp:302:15: error:
 ‘class llvm::MCStreamer’ has no member named ‘setUseAssemblerInfoForParsing’
  302 |   mcStreamer->setUseAssemblerInfoForParsing(true);
      |               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~

@arsenm
Copy link
Contributor

arsenm commented May 15, 2024

It's still used:

/work/kparzysz/git/llvm.org/mlir/lib/Target/LLVM/ROCDL/Target.cpp: In member function ‘std::optional<llvm::SmallVector<char, 0> > mlir::ROCDL::SerializeGPUModuleBase::assembleIsa(llvm::StringRef)’:
/work/kparzysz/git/llvm.org/mlir/lib/Target/LLVM/ROCDL/Target.cpp:302:15: error:
 ‘class llvm::MCStreamer’ has no member named ‘setUseAssemblerInfoForParsing’
  302 |   mcStreamer->setUseAssemblerInfoForParsing(true);
      |               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~

But why? I don't know what business MLIR could possibly have touching this, for AMDGPU of all things

@kparzysz
Copy link
Contributor

But why? I don't know what business MLIR could possibly have touching this, for AMDGPU of all things

I don't know why, but my build is failing now.

@kparzysz
Copy link
Contributor

Problem solved in 29c2475.

@nikic
Copy link
Contributor

nikic commented May 16, 2024

Reverted in fa750f0 due to large compile-time regressions in some cases, see https://llvm-compile-time-tracker.com/compare.php?from=9bbefb7f600019c9d7025281132dd160729bfff2&to=03c53c69a367008da689f0d2940e2197eb4a955c&stat=instructions:u. sqlite3 regresses by 5% in unoptimized builds.

@MaskRay MaskRay changed the title [MC] Remove UseAssemblerInfoForParsing [MC] Make UseAssemblerInfoForParsing mostly true May 20, 2024
@MaskRay
Copy link
Member Author

MaskRay commented May 20, 2024

Removing getUseAssemblerInfoForParsing would make MCDwarfFrameEmitter::Emit (for .eh_frame FDE) slow (~4% compile time regression for sqlite3.c amalgamation) due to expensive AttemptToFoldSymbolOffsetDifference. (See https://reviews.llvm.org/D153096)
For now, make UseAssemblerInfoForParsing false in MCDwarfFrameEmitter::Emit.

MaskRay added a commit that referenced this pull request May 20, 2024
Commit 6c0665e
(https://reviews.llvm.org/D45164) enabled certain constant expression
evaluation for `MCObjectStreamer` at parse time (e.g. `.if` directives,
see llvm/test/MC/AsmParser/assembler-expressions.s).

`getUseAssemblerInfoForParsing` was added to make `clang -c` handling
inline assembly similar to `MCAsmStreamer` (e.g. `llvm-mc -filetype=asm`),
where such expression folding (related to
`AttemptToFoldSymbolOffsetDifference`) is unavailable.

I believe this is overly conservative. We can make some parse-time
expression folding work for `clang -c` even if `clang -S` would still
report an error, a MCAsmStreamer issue (we cannot print `.if`
directives) that should not restrict the functionality of
MCObjectStreamer.

```
% cat b.cc
asm(R"(
.pushsection .text,"ax"
.globl _start; _start: ret
.if . -_start == 1
  ret
.endif
.popsection
)");
% gcc -S b.cc && gcc -c b.cc
% clang -S -fno-integrated-as b.cc     # succeeded

% clang -c b.cc     # succeeded with this patch
% clang -S b.cc     # still failed
<inline asm>:4:5: error: expected absolute expression
    4 | .if . -_start == 1
      |     ^
1 error generated.
```

However, removing `getUseAssemblerInfoForParsing` would make
MCDwarfFrameEmitter::Emit (for .eh_frame FDE) slow (~4% compile time
regression for sqlite3.c amalgamation) due to expensive
`AttemptToFoldSymbolOffsetDifference`. For now, make
`UseAssemblerInfoForParsing` false in MCDwarfFrameEmitter::Emit.

Close #62520
Link: https://discourse.llvm.org/t/rfc-clang-assembly-object-equivalence-for-files-with-inline-assembly/78841

Pull Request: #91082
@nikic
Copy link
Contributor

nikic commented May 20, 2024

@MaskRay It looks like the new version still causes large compile-time regressions for sqlite3 debug builds: http://llvm-compile-time-tracker.com/compare.php?from=7529fe2e92e79eef22a528a7168e4dd777d6e9bd&to=9500a5d02e23f9b43294e5f662ac099f8989c0e4&stat=instructions:u It's smaller than before (2% instead of 4%) but still there.

MaskRay added a commit that referenced this pull request May 20, 2024
Related to the poor performance of MCAssembler based constant folding
(see `bool MCExpr::evaluateAsAbsolute(int64_t &Res, const MCAssembler *Asm) const` and
`AttemptToFoldSymbolOffsetDifference`),
commit 9500a5d (#91082) caused -O0 -g
compile time regression.

9500a5d special cased .eh_frame FDE
emitting. This patch adds a special case to .debug_* emitting as well to
mitigate the rest regression.

The MCAssembler based constant folding strategy should be improved to
remove the two special cases.
@MaskRay
Copy link
Member Author

MaskRay commented May 20, 2024

@MaskRay It looks like the new version still causes large compile-time regressions for sqlite3 debug builds: http://llvm-compile-time-tracker.com/compare.php?from=7529fe2e92e79eef22a528a7168e4dd777d6e9bd&to=9500a5d02e23f9b43294e5f662ac099f8989c0e4&stat=instructions:u It's smaller than before (2% instead of 4%) but still there.

Thanks for the report. 245491a should fix the regression.


Related to the poor performance of MCAssembler based constant folding
(see bool MCExpr::evaluateAsAbsolute(int64_t &Res, const MCAssembler *Asm) const and
AttemptToFoldSymbolOffsetDifference),
commit 9500a5d caused -O0 -g compile time regression.

9500a5d special cased .eh_frame FDE emitting. We need a special case for .debug_* emitting as well to mitigate the rest regression.


The MCAssembler based constant folding strategy should be improved to remove the two special cases.

@nikic
Copy link
Contributor

nikic commented May 20, 2024

@MaskRay Thanks! That does fix the regression.

@gulfemsavrun
Copy link
Contributor

gulfemsavrun commented May 30, 2024

We started seeing a build time regression in our Clang toolchain builders on Mac, and I bisected to this commit:

  1. Build step took 1.7 hours in the last build before (clang -> build)
    https://luci-milo.appspot.com/ui/p/fuchsia/builders/prod/clang-mac-x64/b8747439072851399889/overview

  2. Build step took 2.5 hours in the first build after (clang -> build)
    https://luci-milo.appspot.com/ui/p/fuchsia/builders/prod/clang-mac-x64/b8747425464531858929/overview

@MaskRay
Copy link
Member Author

MaskRay commented May 30, 2024

We started seeing a build time regression in our Clang toolchain builders on Mac, and I bisected to this commit:

  1. Build step took 1.7 hours in the last build before (clang -> build)
    luci-milo.appspot.com/ui/p/fuchsia/builders/prod/clang-mac-x64/b8747439072851399889/overview
  2. Build step took 2.5 hours in the first build after (clang -> build)
    luci-milo.appspot.com/ui/p/fuchsia/builders/prod/clang-mac-x64/b8747425464531858929/overview

@gulfemsavrun
Commit 245491a ([MC] Disable MCAssembler based constant folding for DwarfDebug; independent of this patch) should have fixed the compile time regression. Might be worth testing the commit before and at 245491a to see the time decrease.

@gulfemsavrun
Copy link
Contributor

gulfemsavrun commented May 30, 2024

@gulfemsavrun Commit 245491a ([MC] Disable MCAssembler based constant folding for DwarfDebug; independent of this patch) should have fixed the compile time regression. Might be worth testing the commit before and at 245491a to see the time decrease.

@MaskRay I already checked with that patch, and we still have the same issue. We see this issue in llvm ToT. For example, this is a run from today:
https://luci-milo.appspot.com/ui/p/fuchsia/builders/prod/clang-mac-x64/b8746512510687347537/overview

@MaskRay
Copy link
Member Author

MaskRay commented May 31, 2024

@gulfemsavrun Commit 245491a ([MC] Disable MCAssembler based constant folding for DwarfDebug; independent of this patch) should have fixed the compile time regression. Might be worth testing the commit before and at 245491a to see the time decrease.

@MaskRay I already checked with that patch, and we still have the same issue. We see this issue in llvm ToT. For example, this is a run from today: luci-milo.appspot.com/ui/p/fuchsia/builders/prod/clang-mac-x64/b8746512510687347537/overview

This change exposed a latent performance issue (primarily due to -O1 or above + -g) in https://reviews.llvm.org/D153096 , which should be properly fixed by b06e736 (just landed). Can you verify?

245491a was a workaround, but I believe it effectively addressed the performance issue, and I am unable to detect compile time regression. I am curious if you can find a source file that demonstrates a slowdown with 245491a applied. The AttemptToFoldSymbolOffsetDifference function is likely related.

(The aforementioned commits modify different files and are independent.)

@gulfemsavrun
Copy link
Contributor

gulfemsavrun commented May 31, 2024

This change exposed a latent performance issue (primarily due to -O1 or above + -g) in https://reviews.llvm.org/D153096 , which should be properly fixed by b06e736 (just landed). Can you verify?

245491a was a workaround, but I believe it effectively addressed the performance issue, and I am unable to detect compile time regression. I am curious if you can find a source file that demonstrates a slowdown with 245491a applied. The AttemptToFoldSymbolOffsetDifference function is likely related.

(The aforementioned commits modify different files and are independent.)

We build with Full LTO, and linking takes much longer with this patch. For example, building bin/lldb-server takes almost 2.5x longer time now.

  1. Before
bin/lldb-server: 9 min 18 seconds
  1. After
bin/lldb-server: 22 min 24 seconds

Is there a specific source file or binary that might be interesting to look at so I can provide more information?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:AMDGPU backend:SPIR-V clang Clang issues not falling into any other category mc Machine (object) code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

.if constant expression folding works for .s but not for inline assembly due to UseAssemblerInfoForParsing
7 participants