Allow 128-bit discriminants in DWARF variants #125578

Merged on Feb 4, 2025 (1 commit)

tromey
Contributor

@tromey tromey commented Feb 3, 2025

If a variant part has a 128-bit discriminator, then
DwarfUnit::constructTypeDIE will assert. This patch fixes the problem
by allowing any size of integer to be used here. This is largely
accomplished by moving part of DwarfUnit::addConstantValue to a new
method.

Fixes #119655

@llvmbot
Member

llvmbot commented Feb 3, 2025

@llvm/pr-subscribers-debuginfo

Author: Tom Tromey (tromey)

Changes

If a variant part has a 128-bit discriminator, then DwarfUnit::constructTypeDIE will assert. This patch fixes the problem by allowing any size of integer to be used here.

This is mostly implemented by copying DwarfUnit::addConstantValue. However, I did not reimplement that method in terms of the new addInt because that would introduce the need for unrelated test case changes.

Fixes #119655


Full diff: https://github.com/llvm/llvm-project/pull/125578.diff

3 Files Affected:

  • (modified) llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp (+34-6)
  • (modified) llvm/lib/CodeGen/AsmPrinter/DwarfUnit.h (+4)
  • (modified) llvm/test/DebugInfo/Generic/discriminated-union.ll (+2-2)
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp b/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp
index 0a8a1ad38c959f..2d97d8d483ff70 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp
@@ -232,6 +232,38 @@ void DwarfUnit::addUInt(DIEValueList &Block, dwarf::Form Form,
   addUInt(Block, (dwarf::Attribute)0, Form, Integer);
 }
 
+void DwarfUnit::addInt(DIE &Die, dwarf::Attribute Attribute,
+		       const APInt &Val, bool Unsigned) {
+  unsigned CIBitWidth = Val.getBitWidth();
+  if (CIBitWidth <= 64) {
+    if (Unsigned)
+      addUInt(Die, Attribute, std::nullopt, Val.getZExtValue());
+    else
+      addSInt(Die, Attribute, std::nullopt, Val.getSExtValue());
+    return;
+  }
+
+  DIEBlock *Block = new (DIEValueAllocator) DIEBlock;
+
+  // Get the raw data form of the large APInt.
+  const uint64_t *Ptr64 = Val.getRawData();
+
+  int NumBytes = Val.getBitWidth() / 8; // 8 bits per byte.
+  bool LittleEndian = Asm->getDataLayout().isLittleEndian();
+
+  // Output the constant to DWARF one byte at a time.
+  for (int i = 0; i < NumBytes; i++) {
+    uint8_t c;
+    if (LittleEndian)
+      c = Ptr64[i / 8] >> (8 * (i & 7));
+    else
+      c = Ptr64[(NumBytes - 1 - i) / 8] >> (8 * ((NumBytes - 1 - i) & 7));
+    addUInt(*Block, dwarf::DW_FORM_data1, c);
+  }
+
+  addBlock(Die, Attribute, Block);
+}
+
 void DwarfUnit::addSInt(DIEValueList &Die, dwarf::Attribute Attribute,
                         std::optional<dwarf::Form> Form, int64_t Integer) {
   if (!Form)
@@ -972,12 +1004,8 @@ void DwarfUnit::constructTypeDIE(DIE &Buffer, const DICompositeType *CTy) {
           DIE &Variant = createAndAddDIE(dwarf::DW_TAG_variant, Buffer);
           if (const ConstantInt *CI =
               dyn_cast_or_null<ConstantInt>(DDTy->getDiscriminantValue())) {
-            if (DD->isUnsignedDIType(Discriminator->getBaseType()))
-              addUInt(Variant, dwarf::DW_AT_discr_value, std::nullopt,
-                      CI->getZExtValue());
-            else
-              addSInt(Variant, dwarf::DW_AT_discr_value, std::nullopt,
-                      CI->getSExtValue());
+	    addInt(Variant, dwarf::DW_AT_discr_value, CI->getValue(),
+		   DD->isUnsignedDIType(Discriminator->getBaseType()));
           }
           constructMemberDIE(Variant, DDTy);
         } else {
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.h b/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.h
index 163205378fb4b6..51eabad6b3c8c5 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.h
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.h
@@ -167,6 +167,10 @@ class DwarfUnit : public DIEUnit {
 
   void addSInt(DIELoc &Die, std::optional<dwarf::Form> Form, int64_t Integer);
 
+  /// Add an integer attribute data and value; value may be any width.
+  void addInt(DIE &Die, dwarf::Attribute Attribute, const APInt &Integer,
+	      bool Unsigned);
+
   /// Add a string attribute data and value.
   ///
   /// We always emit a reference to the string pool instead of immediate
diff --git a/llvm/test/DebugInfo/Generic/discriminated-union.ll b/llvm/test/DebugInfo/Generic/discriminated-union.ll
index d267d9b029e950..592f2152ae6820 100644
--- a/llvm/test/DebugInfo/Generic/discriminated-union.ll
+++ b/llvm/test/DebugInfo/Generic/discriminated-union.ll
@@ -22,7 +22,7 @@
 ;         CHECK: DW_AT_alignment
 ;         CHECK: DW_AT_data_member_location [DW_FORM_data1]	(0x00)
 ;     CHECK: DW_TAG_variant
-;       CHECK: DW_AT_discr_value [DW_FORM_data1]	(0x00)
+;       CHECK: DW_AT_discr_value [DW_FORM_block1]	(<0x10> 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 )
 ;       CHECK: DW_TAG_member
 ;         CHECK: DW_AT_type
 ;         CHECK: DW_AT_alignment
@@ -71,7 +71,7 @@ attributes #0 = { nounwind uwtable }
 !21 = !DIBasicType(name: "u8", size: 8, encoding: DW_ATE_unsigned)
 !22 = !DIDerivedType(tag: DW_TAG_member, name: "__1", scope: !18, file: !7, baseType: !23, size: 64, align: 64)
 !23 = !DIDerivedType(tag: DW_TAG_pointer_type, name: "&u8", baseType: !21, size: 64, align: 64)
-!24 = !DIDerivedType(tag: DW_TAG_member, scope: !14, file: !7, baseType: !25, size: 128, align: 64, extraData: i64 0)
+!24 = !DIDerivedType(tag: DW_TAG_member, scope: !14, file: !7, baseType: !25, size: 128, align: 64, extraData: i128 18446744073709551616)
 !25 = !DICompositeType(tag: DW_TAG_structure_type, name: "Nope", scope: !12, file: !7, size: 128, align: 64, elements: !4, identifier: "7ce1efff6b82281ab9ceb730566e7e20::Nope")
 !27 = !DIBasicType(name: "u64", size: 64, encoding: DW_ATE_unsigned)
 !28 = !DIExpression()
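
The byte-emission loop in the new `addInt` can be tried outside LLVM. The following is a minimal sketch, not the patch's code: `toBlockBytes` is a hypothetical name, and a `std::vector<uint64_t>` (least significant word first) stands in for what `APInt::getRawData()` exposes. For the test value `i128 18446744073709551616` (2^64) on a little-endian target it yields the sixteen bytes shown in the updated CHECK line.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the loop in DwarfUnit::addInt. Words holds the
// value as 64-bit words, least significant word first; BitWidth is assumed
// to be a multiple of 8.
std::vector<uint8_t> toBlockBytes(const std::vector<uint64_t> &Words,
                                  unsigned BitWidth, bool LittleEndian) {
  const int NumBytes = static_cast<int>(BitWidth / 8);
  std::vector<uint8_t> Bytes;
  Bytes.reserve(static_cast<std::size_t>(NumBytes));
  for (int I = 0; I < NumBytes; ++I) {
    // Little-endian targets emit the least significant byte first;
    // big-endian targets emit the most significant byte first.
    const int Src = LittleEndian ? I : NumBytes - 1 - I;
    const uint64_t Word = Words[static_cast<std::size_t>(Src) / 8];
    Bytes.push_back(static_cast<uint8_t>(Word >> (8 * (Src & 7))));
  }
  return Bytes;
}
```

Calling `toBlockBytes({0, 1}, 128, true)` places the single `01` byte at index 8; with `LittleEndian` set to `false` the same value places it at index 7.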

@dwblaikie
Collaborator

This is mostly implemented by copying DwarfUnit::addConstantValue. However, I did not reimplement that method in terms of the new addInt because that would introduce the need for unrelated test case changes.

I think that's probably still the right path - what unrelated test changes did you discover down this path?

Potentially doing this in two steps, even - one, an API change to make DwarfUnit::addConstantValue take the DW_AT_ to use as a parameter. That change shouldn't cause any change to LLVM's behavior.

Then, after that, we could reuse the function from this new location.

@tromey
Contributor Author

tromey commented Feb 4, 2025

I think that's probably still the right path - what unrelated test changes did you discover down this path?

If addConstantValue calls the new addInt, then different forms can be emitted in some tests. This is because addUInt and addSInt use DIEInteger::BestForm, while the uint64_t overload of addConstantValue has this:

  // FIXME: This is a bit conservative/simple - it emits negative values always
  // sign extended to 64 bits rather than minimizing the number of bytes.
  addUInt(Die, dwarf::DW_AT_const_value,
          Unsigned ? dwarf::DW_FORM_udata : dwarf::DW_FORM_sdata, Val);
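
To make the difference in emitted forms concrete, here is a minimal sketch of "pick the narrowest fixed-size form that holds the value". It is written in the spirit of what the comment attributes to `DIEInteger::BestForm`, but `FixedForm` and `smallestFixedForm` are hypothetical names and LLVM's actual logic may differ in detail. The point is only that this style of selection gives a size-dependent form, whereas the `addConstantValue` overload above always asks for `DW_FORM_udata` or `DW_FORM_sdata`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical labels for the fixed-size DWARF constant forms.
enum class FixedForm { Data1, Data2, Data4, Data8 };

// Choose the narrowest fixed-size form whose range contains the value.
// Bits carries the raw 64-bit pattern; IsSigned says how to interpret it.
FixedForm smallestFixedForm(bool IsSigned, uint64_t Bits) {
  if (IsSigned) {
    const auto V = static_cast<int64_t>(Bits);
    if (V >= INT8_MIN && V <= INT8_MAX) return FixedForm::Data1;
    if (V >= INT16_MIN && V <= INT16_MAX) return FixedForm::Data2;
    if (V >= INT32_MIN && V <= INT32_MAX) return FixedForm::Data4;
    return FixedForm::Data8;
  }
  if (Bits <= UINT8_MAX) return FixedForm::Data1;
  if (Bits <= UINT16_MAX) return FixedForm::Data2;
  if (Bits <= UINT32_MAX) return FixedForm::Data4;
  return FixedForm::Data8;
}
```

Under this sketch, 254 fits in one byte when unsigned but needs two bytes when signed, so the chosen form depends on both the value and its signedness.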

I tend to think this FIXME can be removed, since DWARF recommends what LLVM is doing here. See 7.5.5 Classes and Forms subheading "constant":

If one of the DW_FORM_data<n> forms is used to represent a signed or unsigned
integer, it can be hard for a consumer to discover the context necessary to
determine which interpretation is intended. Producers are therefore strongly
encouraged to use DW_FORM_sdata or DW_FORM_udata for signed and
unsigned integers respectively, rather than DW_FORM_data<n>.
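
The ambiguity the specification describes can be seen with a single byte: `0xfe` stored as a one-byte fixed-size value could mean 254 or -2. The LEB128 encodings behind `DW_FORM_udata` and `DW_FORM_sdata` avoid this because the form itself fixes the interpretation. Below is a minimal sketch of the two standard encodings; the function names are illustrative and are not LLVM's.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Unsigned LEB128: seven payload bits per byte, high bit set on every byte
// except the last.
std::vector<uint8_t> encodeULEB128(uint64_t V) {
  std::vector<uint8_t> Out;
  do {
    uint8_t B = static_cast<uint8_t>(V & 0x7f);
    V >>= 7;
    if (V != 0)
      B |= 0x80;
    Out.push_back(B);
  } while (V != 0);
  return Out;
}

// Signed LEB128: stop once the remaining value is pure sign extension of
// bit 6 of the byte just produced.
std::vector<uint8_t> encodeSLEB128(int64_t V) {
  std::vector<uint8_t> Out;
  bool More = true;
  while (More) {
    uint8_t B = static_cast<uint8_t>(V & 0x7f);
    V >>= 7; // Arithmetic shift (guaranteed since C++20).
    const bool SignBit = (B & 0x40) != 0;
    if ((V == 0 && !SignBit) || (V == -1 && SignBit))
      More = false;
    else
      B |= 0x80;
    Out.push_back(B);
  }
  return Out;
}
```

With these, 254 encodes as `fe 01` in either encoding, while -2 encodes as the single byte `7e` in SLEB128, so a consumer that knows the form never has to guess the sign.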

Potentially doing this in two steps, even - one, an API change to make DwarfUnit::addConstantValue take the DW_AT_ to use as a parameter. That change shouldn't cause any change to LLVM's behavior.

Then, after that, we could reuse the function from this new location.

My first take on the patch did this, but in the end I chose to add a new method because addConstantValue by its name seemed like it is intended to be specific to DW_AT_const_value.

There's also addConstantFPValue, where the aforementioned DWARF section seems to imply that DW_FORM_udata should not be used -- rather DW_FORM_data8 -- see the second paragraph. I feel this probably doesn't matter to readers in practice.

By "two steps" do you mean opening two separate pull requests? I'm new to LLVM and my impression is that patch series aren't really done, but I wanted to confirm.

Anyway I'm happy to proceed whatever way you like, just let me know, thanks.

@dwblaikie
Collaborator

I think that's probably still the right path - what unrelated test changes did you discover down this path?

If addConstantValue calls the new addInt, then different forms can be emitted in some tests. This is because addUInt and addSInt use DIEInteger::BestForm, while the uint64_t overload of addConstantValue has this:

  // FIXME: This is a bit conservative/simple - it emits negative values always
  // sign extended to 64 bits rather than minimizing the number of bytes.
  addUInt(Die, dwarf::DW_AT_const_value,
          Unsigned ? dwarf::DW_FORM_udata : dwarf::DW_FORM_sdata, Val);

Ah, thanks for walking me through it... Yeah, thinking about alternatives (ways to pass in the form, etc) they all seem awkward. Perhaps only refactoring out the block code would be workable?

  void DwarfUnit::addConstantValue(DIE &Die, const APInt &Val, bool Unsigned) {
    unsigned CIBitWidth = Val.getBitWidth();
    if (CIBitWidth <= 64) {
      addConstantValue(Die, Unsigned,
                       Unsigned ? Val.getZExtValue() : Val.getSExtValue());
      return;
    }

    addIntAsBlock(Die, Val);
  }
  void DwarfUnit::addIntAsBlock(DIE &Die, const APInt &Val) {
    DIEBlock *Block = new (DIEValueAllocator) DIEBlock;

    // Get the raw data form of the large APInt.
    const uint64_t *Ptr64 = Val.getRawData();

    int NumBytes = Val.getBitWidth() / 8; // 8 bits per byte.
    bool LittleEndian = Asm->getDataLayout().isLittleEndian();

    // Output the constant to DWARF one byte at a time.
    for (int i = 0; i < NumBytes; i++) {
      uint8_t c;
      if (LittleEndian)
        c = Ptr64[i / 8] >> (8 * (i & 7));
      else
        c = Ptr64[(NumBytes - 1 - i) / 8] >> (8 * ((NumBytes - 1 - i) & 7));
      addUInt(*Block, dwarf::DW_FORM_data1, c);
    }

    addBlock(Die, dwarf::DW_AT_const_value, Block);
  }

& reuse that from the new addInt?

I tend to think this FIXME can be removed, since DWARF recommends what LLVM is doing here. See 7.5.5 Classes and Forms subheading "constant":

If one of the DW_FORM_data<n> forms is used to represent a signed or unsigned
integer, it can be hard for a consumer to discover the context necessary to
determine which interpretation is intended. Producers are therefore strongly
encouraged to use DW_FORM_sdata or DW_FORM_udata for signed and
unsigned integers respectively, rather than DW_FORM_data<n>.

Mixed feelings about that - it'd impact a lot of pretty simple form uses, like DW_AT_decl_file/line/column, and probably other places where it's obviously unsigned and may benefit from fixed-size DIEs (though I'm increasingly suspicious of the fixed-size DIE value, I think with several LEB forms added in DWARFv5 there aren't that many fixed-size DIEs anyway)

Potentially doing this in two steps, even - one, an API change to make DwarfUnit::addConstantValue take the DW_AT_ to use as a parameter. That change shouldn't cause any change to LLVM's behavior.
Then, after that, we could reuse the function from this new location.

My first take on the patch did this, but in the end I chose to add a new method because addConstantValue by its name seemed like it is intended to be specific to DW_AT_const_value.

Yeah, I wouldn't feel too badly about generalizing that, possibly renaming - but given the above issues about forms, we'll leave that for another time.

There's also addConstantFPValue, where the aforementioned DWARF section seems to imply that DW_FORM_udata should not be used -- rather DW_FORM_data8 -- see the second paragraph. I feel this probably doesn't matter to readers in practice.

Yeah, seems harder to argue that signedness is valuable when it's just bits to be converted into an fp value anyway.

By "two steps" do you mean opening two separate pull requests? I'm new to LLVM and my impression is that patch series aren't really done, but I wanted to confirm.

Anyway I'm happy to proceed whatever way you like, just let me know, thanks.

Think one patch'll do. There are some ways to do stacked PRs in LLVM, but I haven't dabbled with them - I think I had in mind either separate (stacked) PRs, or separate commits in one PR, just to make it easier to isolate and review the changes, even if they would get committed as a single merged commit. We do prefer isolated changes, but small amounts of refactoring in a functional commit are fine.

Oh, out of curiosity: What's the motivation for this support?

@tromey
Contributor Author

tromey commented Feb 4, 2025

Oh, out of curiosity: What's the motivation for this support?

It is twofold. First this particular patch enables 128-bit discriminants in Rust. I'm not sure how crucial this is, but anyway I stumbled across this because I'm improving the DWARF generation for Ada and it also needs the ability to add large constant attributes to a DIE.

@tromey
Contributor Author

tromey commented Feb 4, 2025

This version of the patch breaks out addIntAsBlock.

@pogo59
Collaborator

pogo59 commented Feb 4, 2025

I'm increasingly suspicious of the fixed-size DIE value, I think with several LEB forms added in DWARFv5 there aren't that many fixed-size DIEs anyway

My memory from a previous discussion with @clayborg is that around half of DIEs were fixed-size. It's the main reason we have multiple fixed-size forms for indexes into .debug_addr, so those references didn't all have to be variable-size. And forms are cheap.

@tromey
Contributor Author

tromey commented Feb 4, 2025

... I also updated the intro comment here in the PR.

Collaborator

@dwblaikie dwblaikie left a comment

Awesome, thanks!

@dwblaikie dwblaikie merged commit 3c28076 into llvm:main Feb 4, 2025
4 of 6 checks passed

@dwblaikie
Collaborator

I'm increasingly suspicious of the fixed-size DIE value, I think with several LEB forms added in DWARFv5 there aren't that many fixed-size DIEs anyway

My memory from a previous discussion with @clayborg is that around half of DIEs were fixed-size. It's the main reason we have multiple fixed-size forms for indexes into .debug_addr, so those references didn't all have to be variable-size. And forms are cheap.

Yeah... funnily enough, LLVM doesn't use the fixed-size forms for addr indexes, we just use addrx... (though we do use strxN \o/): https://godbolt.org/z/Kv6b4dcaP

(I've been thinking about this lately because of size problems caused by recently proposed LLVM changes to add DW_AT_object_pointer to member function declarations. LLVM had been emitting the attribute only on definitions, which made it hard to tell member from non-member function declarations, and the change regressed size substantially. We discussed possibly using an index into the parameters rather than a CU-relative DIE offset as an extension for LLDB, but I wonder if an SLEB128 DIE-relative offset could be a significant space saving, reducing ref4 down to often 1 byte. I suppose if we were willing to do the relaxation at the DWARF creation level, rather than pushing it off on the assembler, we could potentially use ref1/2/4/8 as needed.)
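
The size argument can be checked with a small calculation. The sketch below computes how many bytes a signed LEB128 value occupies; `sizeSLEB128` is a hypothetical helper, and the DIE-relative reference it would serve is only an idea floated in this comment, not an existing DWARF form. `DW_FORM_ref4`, by contrast, always costs 4 bytes.

```cpp
#include <cassert>
#include <cstdint>

// Number of bytes needed to encode V as signed LEB128.
unsigned sizeSLEB128(int64_t V) {
  unsigned Size = 0;
  bool More = true;
  while (More) {
    // Bit 6 of the current 7-bit group becomes the sign bit if we stop here.
    const bool SignBit = (V & 0x40) != 0;
    V >>= 7; // Arithmetic shift (guaranteed since C++20).
    More = !((V == 0 && !SignBit) || (V == -1 && SignBit));
    ++Size;
  }
  return Size;
}
```

Offsets in [-64, 63] take one byte and offsets in [-8192, 8191] take at most two, so a reference to a nearby DIE would be shorter than the fixed 4 bytes of `ref4` whenever the target is within those distances.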

@tromey tromey deleted the apint-discrim branch February 4, 2025 22:25
@beetrees
Contributor

beetrees commented Feb 4, 2025

Would it be possible for this to be backported to LLVM 20? I've checked the LLVM GitHub User Guide for how to request a backport but it says the PR needs to be added to a release milestone first.

@rorth
Collaborator

rorth commented Feb 5, 2025

This patch broke the Solaris/sparcv9 buildbot.

@tromey
Contributor Author

tromey commented Feb 5, 2025

Would it be possible for this to be backported to LLVM 20? I've checked the LLVM GitHub User Guide for how to request a backport but it says the PR needs to be added to a release milestone first.

I don't know the answer, I just wanted to say that if there's something I should do here, feel free to just ping me. Thanks.

@beetrees
Contributor

beetrees commented Feb 5, 2025

I don't think there's anything you need to do here: according to the guide all that needs to happen to request a backport is for someone who can edit PR milestones to add this PR (or the issue) to the LLVM 20.X Release milestone (maybe @dwblaikie is able to do that? I don't know which permissions are required) and then I can comment "/cherry-pick 3c28076 3492985" which will make @llvmbot automatically create a backport pull request.

@llvmbot
Member

llvmbot commented Feb 5, 2025

I don't think there's anything you need to do here: according to the guide all that needs to happen to request a backport is for someone who can edit PR milestones to add this PR (or the issue) to the LLVM 20.X Release milestone (maybe @dwblaikie is able to do that? I don't know which permissions are required) and then I can comment "/cherry-pick 3c28076 3492985" which will make @llvmbot automatically create a backport pull request.

Error: Command failed due to missing milestone.

@dwblaikie dwblaikie added this to the LLVM 20.X Release milestone Feb 5, 2025
@beetrees
Contributor

beetrees commented Feb 6, 2025

/cherry-pick 3c28076 3492985

@llvmbot
Member

llvmbot commented Feb 6, 2025

/pull-request #126029

swift-ci pushed a commit to swiftlang/llvm-project that referenced this pull request Feb 8, 2025
(cherry picked from commit 3c28076)
swift-ci pushed a commit to swiftlang/llvm-project that referenced this pull request Feb 8, 2025
Icohedron pushed a commit to Icohedron/llvm-project that referenced this pull request Feb 11, 2025
Icohedron pushed a commit to Icohedron/llvm-project that referenced this pull request Feb 11, 2025
maurer added a commit to maurer/rust that referenced this pull request Feb 12, 2025
Previously, we unconditionally set the bitwidth to 128-bits, the largest
an enum would possibly be. Then, LLVM would cut down the constant by
chopping off leading zeroes before emitting the DWARF. LLVM only
supported 64-bit enumerators, so this would also have occasionally
resulted in truncated data.

LLVM added support for 128-bit enumerators in llvm/llvm-project#125578

That patchset also trusts the constant to describe how wide the variant tag is.
As a result, we went from emitting tags that looked like:
DW_AT_discr_value     (0xfe)

(`form1`)

to emitting tags that looked like:
DW_AT_discr_value	(<0x10> fe ff ff ff 00 00 00 00 00 00 00 00 00 00 00 00 )

This makes the `DW_AT_discr_value` encode at the bitwidth of the tag,
which:
1. Is probably closer to our intentions in terms of describing the data.
2. Doesn't invoke the 128-bit support which may not be supported by all
   debuggers / downstream tools.
3. Will result in smaller debug information.
maurer added a commit to maurer/rust that referenced this pull request Feb 12, 2025
Previously, we unconditionally set the bitwidth to 128-bits, the largest
a discriminator would possibly be. Then, LLVM would cut down the constant by
chopping off leading zeroes before emitting the DWARF. LLVM only
supported 64-bit discriminators, so this would also have occasionally
resulted in truncated data (or an assert) if more than 64-bits were
used.

LLVM added support for 128-bit enumerators in llvm/llvm-project#125578

That patchset also trusts the constant to describe how wide the variant tag is.
As a result, we went from emitting tags that looked like:
DW_AT_discr_value     (0xfe)

(`form1`)

to emitting tags that looked like:
DW_AT_discr_value	(<0x10> fe ff ff ff 00 00 00 00 00 00 00 00 00 00 00 00 )

This makes the `DW_AT_discr_value` encode at the bitwidth of the tag,
which:
1. Is probably closer to our intentions in terms of describing the data.
2. Doesn't invoke the 128-bit support which may not be supported by all
   debuggers / downstream tools.
3. Will result in smaller debug information.
workingjubilee added a commit to workingjubilee/rustc that referenced this pull request Feb 14, 2025
debuginfo: Set bitwidth appropriately in enum variant tags

Previously, we unconditionally set the bitwidth to 128-bits, the largest an enum would possibly be. Then, LLVM would cut down the constant by chopping off leading zeroes before emitting the DWARF. LLVM only supported 64-bit enumerators, so this would also have occasionally resulted in truncated data.

LLVM added support for 128-bit enumerators in llvm/llvm-project#125578

That patchset trusts the constant to describe how wide the variant tag is, so the high 64-bits of zeros are considered potentially load-bearing.

As a result, we went from emitting tags that looked like:
DW_AT_discr_value     (0xfe)

(because `dwarf::BestForm` selected `data1`)

to emitting tags that looked like:
DW_AT_discr_value	(<0x10> fe ff ff ff 00 00 00 00 00 00 00 00 00 00 00 00 )

This makes the `DW_AT_discr_value` encode at the bitwidth of the tag, which:
1. Is probably closer to our intentions in terms of describing the data.
2. Doesn't invoke the 128-bit support which may not be supported by all debuggers / downstream tools.
3. Will result in smaller debug information.
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Feb 14, 2025
Rollup merge of rust-lang#136895 - maurer:fix-enum-discr, r=nikic

debuginfo: Set bitwidth appropriately in enum variant tags
