Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[lldb] Add format eFormatEnumWithValues to ensure raw enum value is always shown #90059

Closed
wants to merge 1 commit into from

Conversation

DavidSpickett
Copy link
Collaborator

@DavidSpickett DavidSpickett commented Apr 25, 2024

When an enum is used to represent certain data it can be useful to know
its name and the value of it. For instance, register fields are
often set in source code as numbers, but in the debugger you'd like
to see the meaning as well.

(lldb) register read fpcr
fpcr = 0x00000000
= (... RMode = RN (0), ...)

Often you do just want the meaning but the value saves you having
to manually decode it if you want to confirm what your source code
has done, or try to replicate the current state in your source code.

This also works for bitfield like enums, with the added change
that if a bitfield like enum has the value 0, we will print 0 if
asked to always show a value. Normally we don't print a 0 there
because 0 means no flags are set.

I did not intend to make this new format available to the user,
but it ended up being so. So you can do:

expression --format "enumeration with values" -- foo

So testing is mainly from c++ but I have added a couple to the
Python tests.

@llvmbot
Copy link
Member

llvmbot commented Apr 25, 2024

@llvm/pr-subscribers-lldb

Author: David Spickett (DavidSpickett)

Changes

When an enum is used to represent certain data it can be useful to know its name and the value of it. For instance, register fields are often set in source code as numbers, but in the debugger you'd like to see the meaning as well.

(lldb) register read fpcr
fpcr = 0x00000000
= (... RMode = RN (0), ...)

Often you do just want the meaning but the value saves you having to manually decode it if you want to confirm what your source code has done, or try to replicate the current state in your source code.

I have not added tests for this because right now the use case for this is registers and those will have their own test cases to cover this. If we decide to expose this to formatters then this will need more testing.


Full diff: https://github.com/llvm/llvm-project/pull/90059.diff

7 Files Affected:

  • (modified) lldb/include/lldb/Core/ValueObject.h (+7)
  • (modified) lldb/include/lldb/Symbol/CompilerType.h (+2-1)
  • (modified) lldb/include/lldb/Symbol/TypeSystem.h (+2-1)
  • (modified) lldb/source/DataFormatters/TypeFormat.cpp (+1-1)
  • (modified) lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp (+14-7)
  • (modified) lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.h (+2-1)
  • (modified) lldb/source/Symbol/CompilerType.cpp (+3-2)
diff --git a/lldb/include/lldb/Core/ValueObject.h b/lldb/include/lldb/Core/ValueObject.h
index e7e35e2b2bffc0..36a6321428ec05 100644
--- a/lldb/include/lldb/Core/ValueObject.h
+++ b/lldb/include/lldb/Core/ValueObject.h
@@ -757,6 +757,12 @@ class ValueObject {
 
   AddressType GetAddressTypeOfChildren();
 
+  void SetEnumsAlwaysShowValue(bool always) {
+    m_enums_always_show_value = always;
+  }
+
+  bool GetEnumsAlwaysShowValue() { return m_enums_always_show_value; }
+
   void SetHasCompleteType() {
     m_flags.m_did_calculate_complete_objc_class_type = true;
   }
@@ -889,6 +895,7 @@ class ValueObject {
   lldb::SyntheticChildrenSP m_synthetic_children_sp;
   ProcessModID m_user_id_of_forced_summary;
   AddressType m_address_type_of_ptr_or_ref_children = eAddressTypeInvalid;
+  bool m_enums_always_show_value = false;
 
   llvm::SmallVector<uint8_t, 16> m_value_checksum;
 
diff --git a/lldb/include/lldb/Symbol/CompilerType.h b/lldb/include/lldb/Symbol/CompilerType.h
index b71c531f21633a..9e26e4c5f7b93d 100644
--- a/lldb/include/lldb/Symbol/CompilerType.h
+++ b/lldb/include/lldb/Symbol/CompilerType.h
@@ -490,7 +490,8 @@ class CompilerType {
   bool DumpTypeValue(Stream *s, lldb::Format format, const DataExtractor &data,
                      lldb::offset_t data_offset, size_t data_byte_size,
                      uint32_t bitfield_bit_size, uint32_t bitfield_bit_offset,
-                     ExecutionContextScope *exe_scope);
+                     ExecutionContextScope *exe_scope,
+                     bool enums_always_show_value = false);
 
   /// Dump to stdout.
   void DumpTypeDescription(lldb::DescriptionLevel level =
diff --git a/lldb/include/lldb/Symbol/TypeSystem.h b/lldb/include/lldb/Symbol/TypeSystem.h
index 3a927d313b823d..3395442f5b3f56 100644
--- a/lldb/include/lldb/Symbol/TypeSystem.h
+++ b/lldb/include/lldb/Symbol/TypeSystem.h
@@ -398,7 +398,8 @@ class TypeSystem : public PluginInterface,
                              lldb::offset_t data_offset, size_t data_byte_size,
                              uint32_t bitfield_bit_size,
                              uint32_t bitfield_bit_offset,
-                             ExecutionContextScope *exe_scope) = 0;
+                             ExecutionContextScope *exe_scope,
+                             bool enums_always_show_value = false) = 0;
 
   /// Dump the type to stdout.
   virtual void DumpTypeDescription(
diff --git a/lldb/source/DataFormatters/TypeFormat.cpp b/lldb/source/DataFormatters/TypeFormat.cpp
index 409c452110bddc..2898b4394891c3 100644
--- a/lldb/source/DataFormatters/TypeFormat.cpp
+++ b/lldb/source/DataFormatters/TypeFormat.cpp
@@ -108,7 +108,7 @@ bool TypeFormatImpl_Format::FormatObject(ValueObject *valobj,
             *size,                          // Byte size of item in "m_data"
             valobj->GetBitfieldBitSize(),   // Bitfield bit size
             valobj->GetBitfieldBitOffset(), // Bitfield bit offset
-            exe_scope);
+            exe_scope, valobj->GetEnumsAlwaysShowValue());
         // Given that we do not want to set the ValueObject's m_error for a
         // formatting error (or else we wouldn't be able to reformat until a
         // next update), an empty string is treated as a "false" return from
diff --git a/lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp b/lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp
index 2621f682011b41..b1dbf7fbca1b92 100644
--- a/lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp
+++ b/lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp
@@ -8474,7 +8474,7 @@ void TypeSystemClang::DumpFromSymbolFile(Stream &s,
 static bool DumpEnumValue(const clang::QualType &qual_type, Stream &s,
                           const DataExtractor &data, lldb::offset_t byte_offset,
                           size_t byte_size, uint32_t bitfield_bit_offset,
-                          uint32_t bitfield_bit_size) {
+                          uint32_t bitfield_bit_size, bool always_show_value) {
   const clang::EnumType *enutype =
       llvm::cast<clang::EnumType>(qual_type.getTypePtr());
   const clang::EnumDecl *enum_decl = enutype->getDecl();
@@ -8501,7 +8501,11 @@ static bool DumpEnumValue(const clang::QualType &qual_type, Stream &s,
     ++num_enumerators;
     if (val == enum_svalue) {
       // Found an exact match, that's all we need to do.
-      s.PutCString(enumerator->getNameAsString());
+      if (always_show_value)
+        s.Printf("%s (%" PRIi64 ")", enumerator->getNameAsString().c_str(),
+                 enum_svalue);
+      else
+        s.PutCString(enumerator->getNameAsString());
       return true;
     }
   }
@@ -8556,7 +8560,7 @@ bool TypeSystemClang::DumpTypeValue(
     lldb::opaque_compiler_type_t type, Stream &s, lldb::Format format,
     const lldb_private::DataExtractor &data, lldb::offset_t byte_offset,
     size_t byte_size, uint32_t bitfield_bit_size, uint32_t bitfield_bit_offset,
-    ExecutionContextScope *exe_scope) {
+    ExecutionContextScope *exe_scope, bool enums_always_show_value) {
   if (!type)
     return false;
   if (IsAggregateType(type)) {
@@ -8568,8 +8572,10 @@ bool TypeSystemClang::DumpTypeValue(
 
     if (type_class == clang::Type::Elaborated) {
       qual_type = llvm::cast<clang::ElaboratedType>(qual_type)->getNamedType();
-      return DumpTypeValue(qual_type.getAsOpaquePtr(), s, format, data, byte_offset, byte_size,
-                           bitfield_bit_size, bitfield_bit_offset, exe_scope);
+      return DumpTypeValue(qual_type.getAsOpaquePtr(), s, format, data,
+                           byte_offset, byte_size, bitfield_bit_size,
+                           bitfield_bit_offset, exe_scope,
+                           enums_always_show_value);
     }
 
     switch (type_class) {
@@ -8595,7 +8601,7 @@ bool TypeSystemClang::DumpTypeValue(
                              // treat as a bitfield
           bitfield_bit_offset, // Offset in bits of a bitfield value if
                                // bitfield_bit_size != 0
-          exe_scope);
+          exe_scope, enums_always_show_value);
     } break;
 
     case clang::Type::Enum:
@@ -8604,7 +8610,8 @@ bool TypeSystemClang::DumpTypeValue(
       if ((format == eFormatEnum || format == eFormatDefault) &&
           GetCompleteType(type))
         return DumpEnumValue(qual_type, s, data, byte_offset, byte_size,
-                             bitfield_bit_offset, bitfield_bit_size);
+                             bitfield_bit_offset, bitfield_bit_size,
+                             enums_always_show_value);
       // format was not enum, just fall through and dump the value as
       // requested....
       [[fallthrough]];
diff --git a/lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.h b/lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.h
index 68b82e9688f12b..c3a4abf39016ca 100644
--- a/lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.h
+++ b/lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.h
@@ -1047,7 +1047,8 @@ class TypeSystemClang : public TypeSystem {
                      lldb::Format format, const DataExtractor &data,
                      lldb::offset_t data_offset, size_t data_byte_size,
                      uint32_t bitfield_bit_size, uint32_t bitfield_bit_offset,
-                     ExecutionContextScope *exe_scope) override;
+                     ExecutionContextScope *exe_scope,
+                     bool enums_always_show_value = false) override;
 
   void DumpTypeDescription(
       lldb::opaque_compiler_type_t type,
diff --git a/lldb/source/Symbol/CompilerType.cpp b/lldb/source/Symbol/CompilerType.cpp
index 96e74b890d2d90..2d9b7aad275a63 100644
--- a/lldb/source/Symbol/CompilerType.cpp
+++ b/lldb/source/Symbol/CompilerType.cpp
@@ -1016,12 +1016,13 @@ bool CompilerType::DumpTypeValue(Stream *s, lldb::Format format,
                                  lldb::offset_t byte_offset, size_t byte_size,
                                  uint32_t bitfield_bit_size,
                                  uint32_t bitfield_bit_offset,
-                                 ExecutionContextScope *exe_scope) {
+                                 ExecutionContextScope *exe_scope,
+                                 bool enums_always_show_value) {
   if (IsValid())
     if (auto type_system_sp = GetTypeSystem())
       return type_system_sp->DumpTypeValue(
           m_type, *s, format, data, byte_offset, byte_size, bitfield_bit_size,
-          bitfield_bit_offset, exe_scope);
+          bitfield_bit_offset, exe_scope, enums_always_show_value);
   return false;
 }
 

@DavidSpickett
Copy link
Collaborator Author

This is the first change for https://discourse.llvm.org/t/rfc-adding-register-field-enums-to-lldb/77275, though there has been a change of plans and now I'm implementing existing GDB features to get the same result.

It is a subset of #69815, only for use by register printing (ping @Endilll).

I have not added tests for this because right now the use case for this is registers

Also I wasn't sure if I could test this given only C++ can set this right now, but I'll add tests if someone knows where they might go.

@Endilll
Copy link
Contributor

Endilll commented Apr 25, 2024

I'm excited so see changes in this area!

It is a subset of #69815, only for use by register printing (ping @Endilll).

Yeah, that effort is stalled because I don't see a clear path forward there, and I'm lacking energy to find and push for one. I guess I can try to piggyback on this PR if it goes in the direction I need (I hope you don't mind :).

Seriously, though, I'm not sure how your design with a flag orthogonal to format fares against the feedback I was given back then. It's also not clear how this new "presentation mode" is going to be controlled. While it's not in the scope of this PR, it would be nice if you can share a bigger picture.

Another thing this PR doesn't seem to touch is flag-like enums. Even if you don't want to add a numerical value there, because it'd be too much information to present, it still would be nice to see it explicitly considered.

@DavidSpickett
Copy link
Collaborator Author

It's also not clear how this new "presentation mode" is going to be controlled. While it's not in the scope of this PR, it would be nice if you can share a bigger picture.

For me the picture is limited to register printing. I'll be setting this new option on the register type when printing (DavidSpickett@24dbefa#diff-18135f619417bbaa1ab0c181ce55b2c55681323c06e90fa1a3e16099c7176e21). So only from C++ for now.

Another thing this PR doesn't seem to touch is flag-like enums. Even if you don't want to add a numerical value there, because it'd be too much information to present, it still would be nice to see it explicitly considered.

What is a flag like enum?

@Endilll
Copy link
Contributor

Endilll commented Apr 25, 2024

For me the picture is limited to register printing. I'll be setting this new option on the register type when printing (DavidSpickett@24dbefa#diff-18135f619417bbaa1ab0c181ce55b2c55681323c06e90fa1a3e16099c7176e21). So only from C++ for now.

So it's hard-coded, I see.

What is a flag like enum?

enum { a = 1, b = 2, c = 4, d = 8} — this sort of enumeration. Sorry if my terminology was confusing. They are handled further in DumpEnumValue(), after the changes you've made here, and called bit-fields in the comments.

@DavidSpickett
Copy link
Collaborator Author

I understand, so the value is made from a | b | c for example. I didn't think of whether register fields could hit this path, but it's equivalent to a C enum they'll be using, so I think it could happen.

I'll test this, thanks for bringing it up.

@DavidSpickett
Copy link
Collaborator Author

I found a few things:

  • Yes you can get these flags like enums with a register field as expected.
  • Only C++ code can manipulate a DumpValueObjectOptions object, there's no Python binding at all.
  • The only Python tests for it are things that call commands that then use the object e.g. watchpoints or evaluating enum expressions. They do not have the low level control like I am adding in this PR.

I'm not sure of the utility of flag like enums as register fields yet but whatever I come up with, the lack of tests bothers me. So I'm trying to find a way to test DumpValueObjectOptions in relative isolation.

Copy link
Collaborator

@clayborg clayborg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So each ValueObject has the ability to show its value as an enumeration if its format is set to eFormatEnum. If the format is set to eFormatHex, eFormatUnsigned, or eFormatSigned, then we show the numeric value.

Can you show some sample output for this? I would like the const char *SBValue::GetValue() function to obey the format that is selected and not return something that has both inside of it. It would be ok for the const char *SBValue::GetSummary() to return the enumeration if the format isn't set to eFormatEnum.

This makes me question the need for the ValueObject::SetEnumsAlwaysShowValue() and ValueObject::GetEnumsAlwaysShowValue() methods, the ivars they use and many of the other changes here.

@clayborg
Copy link
Collaborator

If your register has fields, you can set its format to be eFormatEnum by default. Each register defines how it gets displayed when we query the dynamic register information from the lldb-server or debugserver

@DavidSpickett
Copy link
Collaborator Author

So each ValueObject has the ability to show its value as an enumeration if its format is set to eFormatEnum. If the format is set to eFormatHex, eFormatUnsigned, or eFormatSigned, then we show the numeric value.

Sure, the problem I have is that often with registers you'd want to see not just the name but the value. It's not 100% crucial for the feature but I found it useful in a previous debugger I used.

Can you show some sample output for this?

I only have examples for registers right now, for example AArch64's mte_ctrl will look like (where TCF is an enum):

= (TAGS = 65535, TCF = TCF_SYNC (1), TAGGED_ADDR_ENABLE = 1)

So that users know what a value means (no looking in the manuals) and the value (in case they need to add to or confirm the number they see in their source code).

When I've added tests for the existing options, I'll update this PR with test cases that are not register specific.

…lways shown

When an enum is used to represent certain data it can be useful to know
its name and the value of it. For instance, register fields are
often set in source code as numbers, but in the debugger you'd like
to see the meaning as well.

(lldb) register read fpcr
    fpcr = 0x00000000
         = (... RMode = RN (0), ...)

Often you do just want the meaning but the value saves you having
to manually decode it if you want to confirm what your source code
has done, or try to replicate the current state in your source code.

This also works for bitfield like enums, with the added change
that if a bitfield like enum has the value 0, we will print 0 if
asked to always show a value. Normally we don't print a 0 there
because 0 means no flags are set.

I did not intend to make this new format avaialable to the user,
but it ended up being so. So you can do:

expression --format "enumeration with values" -- foo

So testing is mainly from c++ but I have added a couple to the
Python tests.
@DavidSpickett DavidSpickett changed the title [lldb] Add ability to show enum as name and value at the same time [lldb] Add format eFormatEnumWithValues to ensure raw enum value is always shown May 28, 2024
@DavidSpickett
Copy link
Collaborator Author

@clayborg I have updated this to instead use a format, inspired by eFormatBytesWithASCII. Which is also one value printed twice. There are examples in the added c++ tests.

eFormatEnumWithValues, ///< Format as an enum but if the value matches one or
///< more enumerators, print the enumerator name and
///< value of those enumerators. For example "foo (1)"
///< instead of "foo".
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This enum is used in the API so I've added to the end of it.

@DavidSpickett
Copy link
Collaborator Author

@Endilll This being a format type now might mean you can use it from the formatters. I'm not familiar with what parts of the API you're using there.

@DavidSpickett
Copy link
Collaborator Author

@clayborg ping!

@DavidSpickett
Copy link
Collaborator Author

ping!

@DavidSpickett
Copy link
Collaborator Author

To be clear, this is not blocking register field enums work anymore. I decided to push on with that because folks can register info to see the values if they really need them.

Having this would be nice, but there's plenty of time to consider whether it's worth it.

@clayborg
Copy link
Collaborator

Sorry for the delay, I am still wondering if we just want to have the value objects for enums return the text name as the value, and have the summary show the value as an appropriate integer if the format is eFormatEnum. When you display registers bitfields do you have a ValueObject? If so we can modify the ValueObject::GetSummary() to return the integer value, but only if we have the format set to eFormatEnum. The output for variables would be similar to what your tests expect, but we would just modify the summary string that we return and then we don't need the new format.

One reason I like this way of doing things is if you call "valobj.GetValue()", and the format is set the enum with value, we will get "foo (1)" as the value back from the value object. I would rather call valobj.GetValue() and get "foo" and then call valojb.GetSummary() and get "1", or of course you can always just call valobj.GetValueAsUnsigned(...) or valobj.GetValueAsSigned(...). If we have special display code for the register bitfields, they can call these APIs when displaying register by default?

@DavidSpickett
Copy link
Collaborator Author

DavidSpickett commented Jul 26, 2024

When you display registers bitfields do you have a ValueObject?

Yes:

lldb::ValueObjectSP vobj_sp = lldb_private::ValueObjectConstResult::Create(

I'm just calling dump, there's no register specific code there, ideally I wouldn't need any.

If so we can modify the ValueObject::GetSummary() to return the integer value, but only if we have the format set to eFormatEnum. The output for variables would be similar to what your tests expect, but we would just modify the summary string that we return and then we don't need the new format.

I am not sure if a call to dump will end up using GetSummary or whether I'd need to loop over the children of the object myself, I'll look into it.

One reason I like this way of doing things is if you call "valobj.GetValue()", and the format is set the enum with value, we will get "foo (1)" as the value back from the value object. I would rather call valobj.GetValue() and get "foo" and then call valojb.GetSummary() and get "1"

When you put it like API style calls like that, I agree. foo is a "value" in the sense you could write it in your C source, foo (1) is not, it is literally a summary, an overview of the value, not usable in code.

If we have special display code for the register bitfields, they can call these APIs when displaying register by default?

By using the existing type infrastructure I avoided having to write printer code, so there isn't any right now.

You could sort of do this by having options that only C++ code can set and are only used by the register dump code, this is what the original version of this PR did (which you can see here to compare).

It's not great because it is this one off use case, but at least it's hidden from the command line and API users. Or I could add some kind of callback that only register dumping sets, that asks how to format the enum, but this is the same thing with different steps.

Could wait and see if registers need anything else unique, and once enough of that exists, invest in custom code. After all, users can register info to see the values for an enum, and only a minority will need them anyway.

...and an off the wall idea I just thought of. These register types are only for display, and we could generate different ones for expressions (https://discourse.llvm.org/t/rfc-register-fields-in-expressions/79424) if needed. So instead of using enum for the field why not a union of enum and unsigned.

The printed format might not be obvious to the reader but I'll give it a try, as it would sidestep the problems of the other solutions.

I'm going to try this.

@DavidSpickett
Copy link
Collaborator Author

So instead of using enum for the field why not a union of enum and unsigned.

The union idea sort of works, but it's clunky enough I'm not going to pursue it. The most verbose version:

    fpcr = 0x00000000
         = {
<...>
             RMode = (meaning = RN, value = 0)

Then you can set a decl printing helper and match on the generated type names to remove
some of it:

    fpcr = 0x00000000
         = {
<...>
             RMode = {
                RN
               RMode = 0
             }

With a typedef I could remove the final "RMode", right now it's "unsigned int:2" so the
helper doesn't kick in.

Seems to force vertical layout, and the two values will always be on separate lines.
This looks awkward enough that I'd rather just keep the existing format.

Now I'll see if I can use the existing dump code for all but the enums, as you suggested.

@DavidSpickett
Copy link
Collaborator Author

Custom printing code for just enums for registers would be possible but it means copying a lot of existing code. A callback reduces that to a copy of DumpEnumValue only. Or you can iterate the value object yourself but then you have to provide the surrounding format too for example the {} around a struct and so on.

I found I can modify the summary by changing TypeFormatImpl_Format::FormatObject, but this has the problem that it does not know if the enum value will match an enumerator ahead of time. So it works well for <value> <unconditional extra stuff>, but not if you only sometimes want the extra stuff. So another lookahead or callback needed here.

None of the options are great and this is not a required feature so I'm going to drop this for now. Maybe one of them will stand out when I look again later.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants