Allow 128-bit discriminants in DWARF variants #125578
@llvm/pr-subscribers-debuginfo
Author: Tom Tromey (tromey)
Changes
If a variant part has a 128-bit discriminator, then DwarfUnit::constructTypeDIE will assert. This patch fixes the problem by allowing any size of integer to be used here. This is mostly implemented by copying DwarfUnit::addConstantValue. However, I did not reimplement that method in terms of the new addInt because that would introduce the need for unrelated test case changes.
Fixes #119655
Full diff: https://github.com/llvm/llvm-project/pull/125578.diff
3 Files Affected:
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp b/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp
index 0a8a1ad38c959f..2d97d8d483ff70 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp
@@ -232,6 +232,38 @@ void DwarfUnit::addUInt(DIEValueList &Block, dwarf::Form Form,
   addUInt(Block, (dwarf::Attribute)0, Form, Integer);
 }

+void DwarfUnit::addInt(DIE &Die, dwarf::Attribute Attribute,
+                       const APInt &Val, bool Unsigned) {
+  unsigned CIBitWidth = Val.getBitWidth();
+  if (CIBitWidth <= 64) {
+    if (Unsigned)
+      addUInt(Die, Attribute, std::nullopt, Val.getZExtValue());
+    else
+      addSInt(Die, Attribute, std::nullopt, Val.getSExtValue());
+    return;
+  }
+
+  DIEBlock *Block = new (DIEValueAllocator) DIEBlock;
+
+  // Get the raw data form of the large APInt.
+  const uint64_t *Ptr64 = Val.getRawData();
+
+  int NumBytes = Val.getBitWidth() / 8; // 8 bits per byte.
+  bool LittleEndian = Asm->getDataLayout().isLittleEndian();
+
+  // Output the constant to DWARF one byte at a time.
+  for (int i = 0; i < NumBytes; i++) {
+    uint8_t c;
+    if (LittleEndian)
+      c = Ptr64[i / 8] >> (8 * (i & 7));
+    else
+      c = Ptr64[(NumBytes - 1 - i) / 8] >> (8 * ((NumBytes - 1 - i) & 7));
+    addUInt(*Block, dwarf::DW_FORM_data1, c);
+  }
+
+  addBlock(Die, Attribute, Block);
+}
+
 void DwarfUnit::addSInt(DIEValueList &Die, dwarf::Attribute Attribute,
                         std::optional<dwarf::Form> Form, int64_t Integer) {
   if (!Form)
@@ -972,12 +1004,8 @@ void DwarfUnit::constructTypeDIE(DIE &Buffer, const DICompositeType *CTy) {
DIE &Variant = createAndAddDIE(dwarf::DW_TAG_variant, Buffer);
if (const ConstantInt *CI =
dyn_cast_or_null<ConstantInt>(DDTy->getDiscriminantValue())) {
- if (DD->isUnsignedDIType(Discriminator->getBaseType()))
- addUInt(Variant, dwarf::DW_AT_discr_value, std::nullopt,
- CI->getZExtValue());
- else
- addSInt(Variant, dwarf::DW_AT_discr_value, std::nullopt,
- CI->getSExtValue());
+ addInt(Variant, dwarf::DW_AT_discr_value, CI->getValue(),
+ DD->isUnsignedDIType(Discriminator->getBaseType()));
}
constructMemberDIE(Variant, DDTy);
} else {
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.h b/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.h
index 163205378fb4b6..51eabad6b3c8c5 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.h
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.h
@@ -167,6 +167,10 @@ class DwarfUnit : public DIEUnit {
void addSInt(DIELoc &Die, std::optional<dwarf::Form> Form, int64_t Integer);
+ /// Add an integer attribute data and value; value may be any width.
+ void addInt(DIE &Die, dwarf::Attribute Attribute, const APInt &Integer,
+ bool Unsigned);
+
/// Add a string attribute data and value.
///
/// We always emit a reference to the string pool instead of immediate
diff --git a/llvm/test/DebugInfo/Generic/discriminated-union.ll b/llvm/test/DebugInfo/Generic/discriminated-union.ll
index d267d9b029e950..592f2152ae6820 100644
--- a/llvm/test/DebugInfo/Generic/discriminated-union.ll
+++ b/llvm/test/DebugInfo/Generic/discriminated-union.ll
@@ -22,7 +22,7 @@
; CHECK: DW_AT_alignment
; CHECK: DW_AT_data_member_location [DW_FORM_data1] (0x00)
; CHECK: DW_TAG_variant
-; CHECK: DW_AT_discr_value [DW_FORM_data1] (0x00)
+; CHECK: DW_AT_discr_value [DW_FORM_block1] (<0x10> 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 )
; CHECK: DW_TAG_member
; CHECK: DW_AT_type
; CHECK: DW_AT_alignment
@@ -71,7 +71,7 @@ attributes #0 = { nounwind uwtable }
!21 = !DIBasicType(name: "u8", size: 8, encoding: DW_ATE_unsigned)
!22 = !DIDerivedType(tag: DW_TAG_member, name: "__1", scope: !18, file: !7, baseType: !23, size: 64, align: 64)
!23 = !DIDerivedType(tag: DW_TAG_pointer_type, name: "&u8", baseType: !21, size: 64, align: 64)
-!24 = !DIDerivedType(tag: DW_TAG_member, scope: !14, file: !7, baseType: !25, size: 128, align: 64, extraData: i64 0)
+!24 = !DIDerivedType(tag: DW_TAG_member, scope: !14, file: !7, baseType: !25, size: 128, align: 64, extraData: i128 18446744073709551616)
!25 = !DICompositeType(tag: DW_TAG_structure_type, name: "Nope", scope: !12, file: !7, size: 128, align: 64, elements: !4, identifier: "7ce1efff6b82281ab9ceb730566e7e20::Nope")
!27 = !DIBasicType(name: "u64", size: 64, encoding: DW_ATE_unsigned)
!28 = !DIExpression()
|
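The byte-by-byte loop in the new addInt can be tried outside LLVM. The following is a minimal standalone sketch, not the patch itself: `serializeWideInt` is a hypothetical name, and the `std::vector<uint64_t>` argument is an assumed stand-in for the word array that `APInt::getRawData()` returns (least significant word first). It only illustrates how the little-endian and big-endian branches pick bytes out of the 64-bit words.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper mirroring the loop in the patch. Words[0] holds the
// least significant 64 bits of the value, as assumed for APInt's raw data.
std::vector<uint8_t> serializeWideInt(const std::vector<uint64_t> &Words,
                                      unsigned BitWidth, bool LittleEndian) {
  // The sketch only handles whole bytes that fit in the supplied words.
  assert(BitWidth % 8 == 0 && BitWidth <= Words.size() * 64);
  const int NumBytes = static_cast<int>(BitWidth / 8);

  std::vector<uint8_t> Bytes;
  Bytes.reserve(NumBytes);
  for (int i = 0; i < NumBytes; ++i) {
    // Position of the source byte, counted from the least significant byte.
    // Little-endian output walks upward; big-endian output walks downward.
    const int Src = LittleEndian ? i : NumBytes - 1 - i;
    Bytes.push_back(static_cast<uint8_t>(Words[Src / 8] >> (8 * (Src & 7))));
  }
  return Bytes;
}
```

With the words `{0, 1}` (the value 18446744073709551616 used in the updated test) and a width of 128, the little-endian result has its single `01` byte at index 8, which matches the `DW_FORM_block1` bytes in the CHECK line above.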
I think that's probably still the right path - what unrelated test changes did you discover down this path? Potentially doing this in two steps, even - one, an API change to Then, after that, we could reuse the function from this new location. |
If
I tend to think this FIXME can be removed, since DWARF recommends what LLVM is doing here. See 7.5.5 Classes and Forms subheading "constant":
My first take on the patch did this, but in the end I chose to add a new method because There's also By "two steps" do you mean opening two separate pull requests? I'm new to LLVM and my impression is that patch series aren't really done, but I wanted to confirm. Anyway I'm happy to proceed whatever way you like, just let me know, thanks. |
Ah, thanks for walking me through it... Yeah, thinking about alternatives (ways to pass in the form, etc) they all seem awkward. Perhaps only refactoring out the block code would be workable?
& reuse that from the new
Mixed feelings about that - it'd impact a lot of pretty simple form uses, like DW_AT_decl_file/line/column, and probably other places where it's obviously unsigned and may benefit from fixed-size DIEs (though I'm increasingly suspicious of the fixed-size DIE value, I think with several LEB forms added in DWARFv5 there aren't that many fixed-size DIEs anyway)
Yeah, I wouldn't feel too badly about generalizing that, possibly renaming - but given the above issues about forms, we'll leave that for another time.
Yeah, seems harder to argue that signedness is valuable when it's just bits to be converted into an fp value anyway.
Think one patch'll do. There are some ways to do stacked PRs in LLVM, but I haven't dabbled with them - I think I had in mind either separate (stacked) PRs, or separate commits in one PR, just to make it easier to isolate and review the changes, even if they would get committed as a single merged commit. We do prefer isolated changes, but small amounts of refactoring in a functional commit are fine. Oh, out of curiosity: What's the motivation for this support? |
It is twofold. First this particular patch enables 128-bit discriminants in Rust. I'm not sure how crucial this is, but anyway I stumbled across this because I'm improving the DWARF generation for Ada and it also needs the ability to add large constant attributes to a DIE. |
If a variant part has a 128-bit discriminator, then DwarfUnit::constructTypeDIE will assert. This patch fixes the problem by allowing any size of integer to be used here. This is largely accomplished by moving part of DwarfUnit::addConstantValue to a new method. Fixes llvm#119655
This version of the patch breaks out |
My memory from a previous discussion with @clayborg is that around half of DIEs were fixed-size. It's the main reason we have multiple fixed-size forms for indexes into .debug_addr, so those references didn't all have to be variable-size. And forms are cheap. |
... I also updated the intro comment here in the PR. |
Awesome, thanks!
@tromey Congratulations on having your first Pull Request (PR) merged into the LLVM Project! Your changes will be combined with recent changes from other authors, then tested by our build bots. If there is a problem with a build, you may receive a report in an email or a comment on this PR. Please check whether problems have been caused by your change specifically, as the builds can include changes from many authors. It is not uncommon for your change to be included in a build that fails due to someone else's changes, or infrastructure issues. How to do this, and the rest of the post-merge process, is covered in detail here. If your change does cause a problem, it may be reverted, or you can revert it yourself. This is a normal part of LLVM development. You can fix your changes and open a new PR to merge them again. If you don't get any reports, no action is required from you. Your changes are working as expected, well done! |
Yeah... funnily enough, LLVM doesn't use the fixed-size forms for addr indexes, we just use addrx... (though we do use strxN \o/): https://godbolt.org/z/Kv6b4dcaP (been thinking about this lately, because of size problems due to recent proposed LLVM changes to add DW_AT_object_pointer to member function declarations (LLVM had been emitting them only into the definitions, making it hard to tell member from non-member function declarations) which regressed size substantially - we discussed/might use an index into parameters rather than a CU-relative DIE offset as an extension for LLDB, but I wonder if an SLEB128 DIE-relative offset could be a significant space savings (reducing ref4 down to often 1 byte) - I suppose if we were willing to do the relaxation at the DWARF creation level, rather than pushing it off on the assembler, we could use ref1/2/4/8 as needed, potentially) |
Would it be possible for this to be backported to LLVM 20? I've checked the LLVM GitHub User Guide for how to request a backport but it says the PR needs to be added to a release milestone first. |
This patch broke the Solaris/sparcv9 buildbot. |
…n targets (#125849) Fixes the failure of the [Solaris/sparcv9 buildbot](https://lab.llvm.org/buildbot/#/builders/13/builds/5103) caused by #125578. cc @rorth @tromey @dwblaikie
I don't know the answer, I just wanted to say that if there's something I should do here, feel free to just ping me. Thanks. |
I don't think there's anything you need to do here: according to the guide, all that needs to happen to request a backport is for someone who can edit PR milestones to add this PR (or the issue) to the LLVM 20.X Release milestone (maybe @dwblaikie is able to do that? I don't know which permissions are required) and then I can comment "/cherry-pick 3c28076 3492985" which will make @llvmbot automatically create a backport pull request. |
Error: Command failed due to missing milestone. |
/pull-request #126029 |
If a variant part has a 128-bit discriminator, then DwarfUnit::constructTypeDIE will assert. This patch fixes the problem by allowing any size of integer to be used here. This is largely accomplished by moving part of DwarfUnit::addConstantValue to a new method. Fixes llvm#119655 (cherry picked from commit 3c28076)
…n targets (llvm#125849) Fixes the failure of the [Solaris/sparcv9 buildbot](https://lab.llvm.org/buildbot/#/builders/13/builds/5103) caused by llvm#125578. cc @rorth @tromey @dwblaikie (cherry picked from commit 3492985)
Previously, we unconditionally set the bitwidth to 128-bits, the largest an enum would possibly be. Then, LLVM would cut down the constant by chopping off leading zeroes before emitting the DWARF. LLVM only supported 64-bit enumerators, so this would also have occasionally resulted in truncated data. LLVM added support for 128-bit enumerators in llvm/llvm-project#125578 That patchset also trusts the constant to describe how wide the variant tag is. As a result, we went from emitting tags that looked like: DW_AT_discr_value (0xfe) (`form1`) to emitting tags that looked like: DW_AT_discr_value (<0x10> fe ff ff ff 00 00 00 00 00 00 00 00 00 00 00 00 ) This makes the `DW_AT_discr_value` encode at the bitwidth of the tag, which: 1. Is probably closer to our intentions in terms of describing the data. 2. Doesn't invoke the 128-bit support which may not be supported by all debuggers / downstream tools. 3. Will result in smaller debug information.
Previously, we unconditionally set the bitwidth to 128-bits, the largest a discriminator would possibly be. Then, LLVM would cut down the constant by chopping off leading zeroes before emitting the DWARF. LLVM only supported 64-bit discriminators, so this would also have occasionally resulted in truncated data (or an assert) if more than 64-bits were used. LLVM added support for 128-bit enumerators in llvm/llvm-project#125578 That patchset also trusts the constant to describe how wide the variant tag is. As a result, we went from emitting tags that looked like: DW_AT_discr_value (0xfe) (`form1`) to emitting tags that looked like: DW_AT_discr_value (<0x10> fe ff ff ff 00 00 00 00 00 00 00 00 00 00 00 00 ) This makes the `DW_AT_discr_value` encode at the bitwidth of the tag, which: 1. Is probably closer to our intentions in terms of describing the data. 2. Doesn't invoke the 128-bit support which may not be supported by all debuggers / downstream tools. 3. Will result in smaller debug information.
debuginfo: Set bitwidth appropriately in enum variant tags

Previously, we unconditionally set the bitwidth to 128-bits, the largest an enum would possibly be. Then, LLVM would cut down the constant by chopping off leading zeroes before emitting the DWARF. LLVM only supported 64-bit enumerators, so this would also have occasionally resulted in truncated data.

LLVM added support for 128-bit enumerators in llvm/llvm-project#125578

That patchset trusts the constant to describe how wide the variant tag is, so the high 64-bits of zeros are considered potentially load-bearing.

As a result, we went from emitting tags that looked like:

DW_AT_discr_value (0xfe)

(because `dwarf::BestForm` selected `data1`) to emitting tags that looked like:

DW_AT_discr_value (<0x10> fe ff ff ff 00 00 00 00 00 00 00 00 00 00 00 00 )

This makes the `DW_AT_discr_value` encode at the bitwidth of the tag, which:

1. Is probably closer to our intentions in terms of describing the data.
2. Doesn't invoke the 128-bit support which may not be supported by all debuggers / downstream tools.
3. Will result in smaller debug information.
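The width dependence described in this commit message can be sketched in isolation. This is a rough illustration under stated assumptions, not LLVM's code: `DiscrEncoding` and `chooseDiscrEncoding` are hypothetical names, only unsigned values are handled, and the smallest-fit rule for constants of 64 bits or fewer is an assumed simplification of the form selection the message attributes to `dwarf::BestForm`. The point is that once the constant's declared width exceeds 64 bits, the payload size follows the width and not the value.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical result type: a form label and the number of payload bytes.
struct DiscrEncoding {
  std::string Form;
  unsigned PayloadBytes;
};

// BitWidth is the declared width of the discriminant constant. Low64 holds
// its low 64 bits, which is all the <= 64-bit path needs (unsigned only).
DiscrEncoding chooseDiscrEncoding(unsigned BitWidth, uint64_t Low64) {
  assert(BitWidth > 0 && BitWidth % 8 == 0);
  if (BitWidth > 64)
    return {"block", BitWidth / 8}; // every byte of the width is emitted
  if (Low64 <= 0xffu)
    return {"data1", 1u};
  if (Low64 <= 0xffffu)
    return {"data2", 2u};
  if (Low64 <= 0xffffffffu)
    return {"data4", 4u};
  return {"data8", 8u};
}
```

Under these assumptions, `chooseDiscrEncoding(128, 0xfe)` reports a 16-byte block while `chooseDiscrEncoding(8, 0xfe)` reports a one-byte `data1`, which is why passing the tag's real width from the frontend keeps the debug information small.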
Rollup merge of rust-lang#136895 - maurer:fix-enum-discr, r=nikic debuginfo: Set bitwidth appropriately in enum variant tags Previously, we unconditionally set the bitwidth to 128-bits, the largest an enum would possibly be. Then, LLVM would cut down the constant by chopping off leading zeroes before emitting the DWARF. LLVM only supported 64-bit enumerators, so this would also have occasionally resulted in truncated data. LLVM added support for 128-bit enumerators in llvm/llvm-project#125578 That patchset trusts the constant to describe how wide the variant tag is, so the high 64-bits of zeros are considered potentially load-bearing. As a result, we went from emitting tags that looked like: DW_AT_discr_value (0xfe) (because `dwarf::BestForm` selected `data1`) to emitting tags that looked like: DW_AT_discr_value (<0x10> fe ff ff ff 00 00 00 00 00 00 00 00 00 00 00 00 ) This makes the `DW_AT_discr_value` encode at the bitwidth of the tag, which: 1. Is probably closer to our intentions in terms of describing the data. 2. Doesn't invoke the 128-bit support which may not be supported by all debuggers / downstream tools. 3. Will result in smaller debug information.