Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[lldb] [ObjectFileMachO] LLVM_COV is not mapped into firmware memory #86359

Merged

Conversation

jasonmolenda
Copy link
Collaborator

It is possible to gather code coverage in a firmware environment, where the __LLVM_COV segment will not be mapped in memory but does exist in the binary, see
https://llvm.org/devmtg/2020-09/slides/PhippsAlan_EmbeddedCodeCoverage_LLVM_Conf_Talk_final.pdf

The __LLVM_COV segment in the binary happens to be at the same address as the __DATA segment, so if lldb treats this segment as loaded, it shadows the __DATA segment and address->symbol resolution can fail.

For these non-userland code cases, we need to mark __LLVM_COV as not a loadable segment.

rdar://124475661

It is possible to gather code coverage in a firmware environment,
where the __LLVM_COV segment will not be mapped in memory but does
exist in the binary, see
https://llvm.org/devmtg/2020-09/slides/PhippsAlan_EmbeddedCodeCoverage_LLVM_Conf_Talk_final.pdf

The __LLVM_COV segment in the binary happens to be at the same
address as the __DATA segment, so if lldb treats this segment as
loaded, it shadows the __DATA segment and address->symbol resolution
can fail.

For these non-userland code cases, we need to mark __LLVM_COV as
not a loadable segment.

rdar://124475661
@llvmbot
Copy link
Member

llvmbot commented Mar 22, 2024

@llvm/pr-subscribers-lldb

Author: Jason Molenda (jasonmolenda)

Changes

It is possible to gather code coverage in a firmware environment, where the __LLVM_COV segment will not be mapped in memory but does exist in the binary, see
https://llvm.org/devmtg/2020-09/slides/PhippsAlan_EmbeddedCodeCoverage_LLVM_Conf_Talk_final.pdf

The __LLVM_COV segment in the binary happens to be at the same address as the __DATA segment, so if lldb treats this segment as loaded, it shadows the __DATA segment and address->symbol resolution can fail.

For these non-userland code cases, we need to mark __LLVM_COV as not a loadable segment.

rdar://124475661


Full diff: https://github.com/llvm/llvm-project/pull/86359.diff

2 Files Affected:

  • (modified) lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp (+12)
  • (modified) lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.h (+1)
diff --git a/lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp b/lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp
index bcf3a3274cf3a0..1caf93659956b4 100644
--- a/lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp
+++ b/lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp
@@ -905,6 +905,11 @@ ConstString ObjectFileMachO::GetSegmentNameDWARF() {
   return g_section_name;
 }
 
+ConstString ObjectFileMachO::GetSegmentNameLLVM_COV() {
+  static ConstString g_section_name("__LLVM_COV");
+  return g_section_name;
+}
+
 ConstString ObjectFileMachO::GetSectionNameEHFrame() {
   static ConstString g_section_name_eh_frame("__eh_frame");
   return g_section_name_eh_frame;
@@ -6145,6 +6150,13 @@ bool ObjectFileMachO::SectionIsLoadable(const Section *section) {
     return false;
   if (GetModule().get() != section->GetModule().get())
     return false;
+  // firmware style binaries with llvm gcov segment do
+  // not have that segment mapped into memory.
+  if (section->GetName() == GetSegmentNameLLVM_COV()) {
+    const Strata strata = GetStrata();
+    if (strata == eStrataKernel || strata == eStrataRawImage)
+      return false;
+  }
   // Be careful with __LINKEDIT and __DWARF segments
   if (section->GetName() == GetSegmentNameLINKEDIT() ||
       section->GetName() == GetSegmentNameDWARF()) {
diff --git a/lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.h b/lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.h
index 0a47f3a7dd1861..55bc688126eb36 100644
--- a/lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.h
+++ b/lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.h
@@ -271,6 +271,7 @@ class ObjectFileMachO : public lldb_private::ObjectFile {
   static lldb_private::ConstString GetSegmentNameOBJC();
   static lldb_private::ConstString GetSegmentNameLINKEDIT();
   static lldb_private::ConstString GetSegmentNameDWARF();
+  static lldb_private::ConstString GetSegmentNameLLVM_COV();
   static lldb_private::ConstString GetSectionNameEHFrame();
 
   llvm::MachO::dysymtab_command m_dysymtab;

Copy link

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link

✅ With the latest revision this PR passed the Python code formatter.

Copy link
Member

@JDevlieghere JDevlieghere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ship it!

@jasonmolenda jasonmolenda merged commit 765d4c4 into llvm:main Mar 25, 2024
4 checks passed
@jasonmolenda jasonmolenda deleted the llvm-cov-not-loaded-in-memory-for-firmware branch March 25, 2024 22:00
jasonmolenda added a commit to jasonmolenda/llvm-project that referenced this pull request Mar 26, 2024
…lvm#86359)

It is possible to gather code coverage in a firmware environment, where
the __LLVM_COV segment will not be mapped in memory but does exist in
the binary, see

https://llvm.org/devmtg/2020-09/slides/PhippsAlan_EmbeddedCodeCoverage_LLVM_Conf_Talk_final.pdf

The __LLVM_COV segment in the binary happens to be at the same address
as the __DATA segment, so if lldb treats this segment as loaded, it
shadows the __DATA segment and address->symbol resolution can fail.

For these non-userland code cases, we need to mark __LLVM_COV as not a
loadable segment.

rdar://124475661
(cherry picked from commit 765d4c4)
jasonmolenda added a commit to jasonmolenda/llvm-project that referenced this pull request Mar 26, 2024
…lvm#86359)

It is possible to gather code coverage in a firmware environment, where
the __LLVM_COV segment will not be mapped in memory but does exist in
the binary, see

https://llvm.org/devmtg/2020-09/slides/PhippsAlan_EmbeddedCodeCoverage_LLVM_Conf_Talk_final.pdf

The __LLVM_COV segment in the binary happens to be at the same address
as the __DATA segment, so if lldb treats this segment as loaded, it
shadows the __DATA segment and address->symbol resolution can fail.

For these non-userland code cases, we need to mark __LLVM_COV as not a
loadable segment.

rdar://124475661
(cherry picked from commit 765d4c4)
jasonmolenda added a commit to swiftlang/llvm-project that referenced this pull request Mar 26, 2024
…memory-20230725

[lldb] [ObjectFileMachO] LLVM_COV is not mapped into firmware memory (llvm#86359)
JDevlieghere added a commit to swiftlang/llvm-project that referenced this pull request Mar 26, 2024
…memory-6.0

[lldb] [ObjectFileMachO] LLVM_COV is not mapped into firmware memory (llvm#86359)
jasonmolenda added a commit to jasonmolenda/llvm-project that referenced this pull request Apr 3, 2024
This is a followup to
llvm#86359
"[lldb] [ObjectFileMachO] LLVM_COV is not mapped into firmware memory
(llvm#86359)"

where I treat LLVM_COV segments in a Mach-O binary as non-loadable.
There is another codepath in
`DynamicLoaderStatic::LoadAllImagesAtFileAddresses` which is called
to set the load addresses for a Module to the file addresses.  It
has no logic to detect a segment that is not loaded in virtual
memory (ObjectFileMachO::SectionIsLoadable), so it would set the
load address for this LLVM_COV segment to the file address and
shadow actual code, breaking lldb behavior.

This method currently sets the load address for any section that
doesn't have a load address set already.  This presumes that a
Module was added to the Target, some mechanism set the correct load
address for SOME segments, and then this method is going to set the
other segments to a no-slide value, assuming they were forgotten.

ObjectFile base class doesn't, today, vend a SectionIsLoadable
method, but we do have ObjectFile::SetLoadAddress and at a higher
level, Module::SetLoadAddress, when we're setting the same slide
to all segments.

That's the behavior we want in this method.  If any section has a
load address, we don't touch this Module.  Otherwise we set all
sections to have a load address that is the same as the file address.

I also audited the other parts of lldb that are calling
SectionList::SectionLoadAddress and looked if they should be more
correctly using Module::SetLoadAddress for the entire binary.  But in
most cases, we have the potential for different slides for different
sections so this section-by-section approach must be taken.

rdar://125800290
jasonmolenda added a commit that referenced this pull request Apr 3, 2024
This is a followup to
#86359
"[lldb] [ObjectFileMachO] LLVM_COV is not mapped into firmware memory
(#86359)"

where I treat LLVM_COV segments in a Mach-O binary as non-loadable.
There is another codepath in
`DynamicLoaderStatic::LoadAllImagesAtFileAddresses` which is called to
set the load addresses for a Module to the file addresses. It has no
logic to detect a segment that is not loaded in virtual memory
(ObjectFileMachO::SectionIsLoadable), so it would set the load address
for this LLVM_COV segment to the file address and shadow actual code,
breaking lldb behavior.

This method currently sets the load address for any section that doesn't
have a load address set already. This presumes that a Module was added
to the Target, some mechanism set the correct load address for SOME
segments, and then this method is going to set the other segments to a
no-slide value, assuming they were forgotten.

ObjectFile base class doesn't, today, vend a SectionIsLoadable method,
but we do have ObjectFile::SetLoadAddress and at a higher level,
Module::SetLoadAddress, when we're setting the same slide to all
segments.

That's the behavior we want in this method. If any section has a load
address, we don't touch this Module. Otherwise we set all sections to
have a load address that is the same as the file address.

I also audited the other parts of lldb that are calling
SectionList::SectionLoadAddress and looked if they should be more
correctly using Module::SetLoadAddress for the entire binary. But in
most cases, we have the potential for different slides for different
sections so this section-by-section approach must be taken.

rdar://125800290
jasonmolenda added a commit to jasonmolenda/llvm-project that referenced this pull request Apr 4, 2024
This is a followup to
llvm#86359
"[lldb] [ObjectFileMachO] LLVM_COV is not mapped into firmware memory
(llvm#86359)"

where I treat LLVM_COV segments in a Mach-O binary as non-loadable.
There is another codepath in
`DynamicLoaderStatic::LoadAllImagesAtFileAddresses` which is called to
set the load addresses for a Module to the file addresses. It has no
logic to detect a segment that is not loaded in virtual memory
(ObjectFileMachO::SectionIsLoadable), so it would set the load address
for this LLVM_COV segment to the file address and shadow actual code,
breaking lldb behavior.

This method currently sets the load address for any section that doesn't
have a load address set already. This presumes that a Module was added
to the Target, some mechanism set the correct load address for SOME
segments, and then this method is going to set the other segments to a
no-slide value, assuming they were forgotten.

ObjectFile base class doesn't, today, vend a SectionIsLoadable method,
but we do have ObjectFile::SetLoadAddress and at a higher level,
Module::SetLoadAddress, when we're setting the same slide to all
segments.

That's the behavior we want in this method. If any section has a load
address, we don't touch this Module. Otherwise we set all sections to
have a load address that is the same as the file address.

I also audited the other parts of lldb that are calling
SectionList::SectionLoadAddress and looked if they should be more
correctly using Module::SetLoadAddress for the entire binary. But in
most cases, we have the potential for different slides for different
sections so this section-by-section approach must be taken.

rdar://125800290
(cherry picked from commit 622851a)
jasonmolenda added a commit to jasonmolenda/llvm-project that referenced this pull request Apr 4, 2024
This is a followup to
llvm#86359
"[lldb] [ObjectFileMachO] LLVM_COV is not mapped into firmware memory
(llvm#86359)"

where I treat LLVM_COV segments in a Mach-O binary as non-loadable.
There is another codepath in
`DynamicLoaderStatic::LoadAllImagesAtFileAddresses` which is called to
set the load addresses for a Module to the file addresses. It has no
logic to detect a segment that is not loaded in virtual memory
(ObjectFileMachO::SectionIsLoadable), so it would set the load address
for this LLVM_COV segment to the file address and shadow actual code,
breaking lldb behavior.

This method currently sets the load address for any section that doesn't
have a load address set already. This presumes that a Module was added
to the Target, some mechanism set the correct load address for SOME
segments, and then this method is going to set the other segments to a
no-slide value, assuming they were forgotten.

ObjectFile base class doesn't, today, vend a SectionIsLoadable method,
but we do have ObjectFile::SetLoadAddress and at a higher level,
Module::SetLoadAddress, when we're setting the same slide to all
segments.

That's the behavior we want in this method. If any section has a load
address, we don't touch this Module. Otherwise we set all sections to
have a load address that is the same as the file address.

I also audited the other parts of lldb that are calling
SectionList::SectionLoadAddress and looked if they should be more
correctly using Module::SetLoadAddress for the entire binary. But in
most cases, we have the potential for different slides for different
sections so this section-by-section approach must be taken.

rdar://125800290
(cherry picked from commit 622851a)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants