implement support for R_X86_64_GLOB_DAT relocations #1135

Closed
ivg opened this issue Jun 17, 2020 · 1 comment · Fixed by #1209

ivg commented Jun 17, 2020

Binaries on Manjaro do not employ R_X86_64_JUMP_SLOT relocations but call external functions via R_X86_64_GLOB_DAT. In general we support these relocations through the relocatable plugin, but either the plugin is not triggered (I think it doesn't come into play for non-relocatable files, which is probably wrong) or it is not detecting them correctly.

Additionally, the objdump symbols provider (and the future r2 provider) could handle them too. While objdump itself is confused by these relocations, we can still get them using the -DRz option, which outputs them as

 9f78: R_X86_64_GLOB_DAT malloc@GLIBC_2.2.5

echo.zip

Originally posted by @Enkelmann in #1131 (comment)
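
As an illustration of the objdump route described above, here is a minimal Python sketch that pulls GLOB_DAT entries out of text shaped like the line quoted in the comment. It is not the code of the BAP objdump provider; the function name `parse_glob_dat`, the regular expression, and the second line of the sample are assumptions made for this example, and real objdump output may need a more tolerant pattern.

```python
import re

# Matches lines shaped like " 9f78: R_X86_64_GLOB_DAT malloc@GLIBC_2.2.5".
# The symbol version after '@' is optional and is not kept.
RELOC_LINE = re.compile(
    r"^\s*([0-9a-fA-F]+):\s+R_X86_64_GLOB_DAT\s+([^\s@+]+)(?:@\S+)?"
)


def parse_glob_dat(text):
    """Return {slot_address: symbol_name} for every GLOB_DAT line in text."""
    slots = {}
    for line in text.splitlines():
        match = RELOC_LINE.match(line)
        if match:
            slots[int(match.group(1), 16)] = match.group(2)
    return slots


# The first line is the one quoted in the issue; the second is a made-up
# disassembly line, included only to show that other lines are skipped.
sample = """\
 9f78: R_X86_64_GLOB_DAT malloc@GLIBC_2.2.5
    1040: ff 15 32 8f 00 00  call *0x8f32(%rip)
"""

if __name__ == "__main__":
    for addr, name in parse_glob_dat(sample).items():
        print(hex(addr), name)
```

The resulting map from slot address to symbol name is the information a symbolizer would need in order to name a call that goes through that slot.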

ivg commented Jun 17, 2020

@gitoleg, once you have finished your current tasks and #1119 is merged, I want you to look into your relocatable symbolizer and see (1) whether it handles R_X86_64_GLOB_DAT relocations at all (I remember it was originally designed for direct jump relocations, but it may handle these as well) and (2) whether we can enable it for all binaries. Don't bother with the objdump symbolizer; I will probably look into it during #1119 or implement it later.

ivg added a commit to ivg/bap that referenced this issue Aug 3, 2020
Dropping support of old LLVM and legacy backends
-----------------------------------------------

We drop a lot of old code (minus 3k lines of code) thus removing the support
burden and making it easier to maintain, fix, and upgrade the code.

Fixes BinaryAnalysisPlatform#1166

Simplifies the implementation
-----------------------------

The remaining code base is significantly simplified. We dropped the
separation between relocatable and non-relocatable files, removed any
transformations of addresses from the LLVM backend (we now emit
absolute virtual addresses). The whole logic of transforming from the
llvm view to the bap image view now fits into a hundred lines of
code (instead of hundreds of lines spread across 16 files as it was
before).

Fixes BinaryAnalysisPlatform#1183
Fixes BinaryAnalysisPlatform#1189

Produces more information
-------------------------------

The relocation information is now emitted for all files (not only for
relocatable). Also, removes tons of checks that were preventing our
backends from emitting valuable symbolic information.

Paves the road to BinaryAnalysisPlatform#1135 and BinaryAnalysisPlatform#1161
ivg added a commit that referenced this issue Aug 5, 2020
* fixes the base calculation

1. For ELF files we compute base as the difference between the address of
any loadable code segment and its offset. If there are no loadable code
segments, then we find a section with minimal offset value and
subtract its offset from its address.

2. For MachO, when the file is relocatable, i.e., it doesn't have addresses we
compute base as $vaddr - offset$, the same as we do in ELF. This
gives us results that match objdump (but do not match radare2, however
radare2 is not seeing any symbols, so it doesn't really matter)

3. For COFF nothing is done, and I am not sure that we need
to do anything.

4. Removed special computation of the base
address (Base.from_sections_offset) from ELF, MachO, and COFF.

It is not tested on LLVM versions below 6, but I believe it should
work up to 3.4.

resolves #1183
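
The ELF rule from item 1 can be sketched in Python as follows, taking the base as address minus offset in both cases, consistent with the `vaddr - offset` formula in item 2. This is only an illustration, not the LLVM backend code; the `Region` type and the `compute_base` name are hypothetical, and the addresses in the usage lines are made up.

```python
from typing import NamedTuple, Sequence


class Region(NamedTuple):
    """A segment or a section, reduced to the fields the rule needs."""
    addr: int                    # virtual address
    offset: int                  # offset in the file
    loadable_code: bool = False  # only meaningful for segments


def compute_base(segments: Sequence[Region], sections: Sequence[Region]) -> int:
    """Base = address - offset of any loadable code segment.

    If there is no such segment, fall back to the section with the
    minimal file offset.
    """
    for seg in segments:
        if seg.loadable_code:
            return seg.addr - seg.offset
    if not sections:
        raise ValueError("no loadable code segments and no sections")
    first = min(sections, key=lambda sec: sec.offset)
    return first.addr - first.offset


if __name__ == "__main__":
    # A code segment mapped at 0x401000 from file offset 0x1000.
    print(hex(compute_base([Region(0x401000, 0x1000, True)], [])))
    # No loadable code segment: use the section with the smallest offset.
    print(hex(compute_base([], [Region(0x2000, 0x40), Region(0x1000, 0x200)])))
```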

Co-authored-by: gitoleg <forown@yandex.ru>

* re-enables the failing test again

Hope we will pass it now.

* updates paths to artifacts

* renovates the LLVM backend

Dropping support of old LLVM and legacy backends
-----------------------------------------------

We drop a lot of old code (minus 3k lines of code) thus removing the support
burden and making it easier to maintain, fix, and upgrade the code.

Fixes #1166

Simplifies the implementation
-----------------------------

The remaining code base is significantly simplified. We dropped the
separation between relocatable and non-relocatable files, removed any
transformations of addresses from the LLVM backend (we now emit
absolute virtual addresses). The whole logic of transforming from the
llvm view to the bap image view now fits into a hundred lines of
code (instead of hundreds of lines spread across 16 files as it was
before).

Fixes #1183
Fixes #1189

Produces more information
-------------------------------

The relocation information is now emitted for all files (not only for
relocatable). Also, removes tons of checks that were preventing our
backends from emitting valuable symbolic information.

Paves the road to #1135 and #1161

Co-authored-by: gitoleg <forown@yandex.ru>
ivg added a commit to ivg/bap that referenced this issue Aug 17, 2020
Implements support for various relocations and improves the existing
support, which enables us to pass all tests without relying on external
symbols or tools such as objdump or radare2.

This branch supports PLT-like relocations, as well as direct calls with
GLOB_DAT relocations (fixes BinaryAnalysisPlatform#1135). The PLT entries are constant
folded and memory references are then analyzed. We also extended the
analysis that detects stub functions to support various ABIs and file
formats. For PowerPC MachO, which stores stubs directly in the text
section, we implemented a signature matching procedure to reliably
detect the stubs. We also significantly improved support for MIPS,
which was suffering from missing function starts that correspond to
the stubbed functions, as byteweight is unable to detect these stubs.
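
To make the "direct calls with GLOB_DAT relocations" case concrete, the following Python sketch resolves a RIP-relative indirect call to the name of the relocation that fills the referenced memory slot. It is a simplified illustration under stated assumptions, not the analysis implemented in this branch: the function name `resolve_indirect_call` is hypothetical, and the instruction address, length, and displacement are invented so that the slot matches the `9f78` entry quoted earlier in this issue.

```python
def resolve_indirect_call(insn_addr, insn_len, disp, glob_dat):
    """Name the callee of `call *disp(%rip)`, or return None if unknown.

    On x86-64, RIP-relative addressing is relative to the address of the
    next instruction, so the referenced slot is insn_addr + insn_len + disp.
    `glob_dat` maps slot addresses to the symbols of GLOB_DAT relocations.
    """
    slot = insn_addr + insn_len + disp
    return glob_dat.get(slot)


if __name__ == "__main__":
    # Slot address and symbol taken from the relocation quoted in the issue.
    glob_dat = {0x9F78: "malloc"}
    # Hypothetical 6-byte call instruction at 0x1040 with disp = 0x8f32.
    print(resolve_indirect_call(0x1040, 6, 0x8F32, glob_dat))
```

A call whose slot is not covered by any relocation yields `None`, in which case the call would have to stay unnamed or be handled by other means.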

In addition, this PR brings a new library called Bap_relation that is
a bidirectional mapping useful for storing addr <-> name mappings and
ensuring their bijection. This library is now used explicitly or
implicitly (via the old symbolizer interface) by all our providers of
symbolic information. This change prevents symbolizers from providing
conflicting information, which may later lead to knowledge base
conflicts.
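
The idea of a bidirectional addr <-> name mapping that only hands out one-to-one pairs can be sketched in Python as below. This does not reproduce the Bap_relation interface; the `Relation` class, its method names, and the sample addresses are hypothetical and serve only to show how conflicting pairs can be kept out of the result.

```python
class Relation:
    """Stores (addr, name) pairs and indexes them in both directions."""

    def __init__(self):
        self._names_of = {}  # addr -> set of names
        self._addrs_of = {}  # name -> set of addrs

    def add(self, addr, name):
        self._names_of.setdefault(addr, set()).add(name)
        self._addrs_of.setdefault(name, set()).add(addr)

    def bijection(self):
        """Return only the pairs that are one-to-one in both directions."""
        result = {}
        for addr, names in self._names_of.items():
            if len(names) != 1:
                continue  # one address, several names: conflicting
            (name,) = names
            if len(self._addrs_of[name]) == 1:
                result[addr] = name
        return result


if __name__ == "__main__":
    rel = Relation()
    rel.add(0x4000, "free")
    rel.add(0x2000, "malloc")
    rel.add(0x2000, "__libc_malloc")  # second name for the same address
    rel.add(0x1000, "main")
    rel.add(0x3000, "main")           # second address for the same name
    print(rel.bijection())            # only the 0x4000 <-> free pair is left
```

Filtering at the point where information is provided means that two providers disagreeing about a name or an address never reach the consumer as two competing facts.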

For now, we also removed the name-to-address translation service that we
recently introduced in BinaryAnalysisPlatform#1119. We are not ready for this service yet (our
knowledge base does not have enough rules stored in it) and without
it we can disassemble 25% faster.

There are also a couple of minor fixes and quality of life
improvements:
- fixes Insn.dests domain functions
- a better default for the KB.Domain.Powerset inspect parameter
- makes glibc-runtime heuristic more aggressive
@ivg ivg closed this as completed in #1209 Aug 21, 2020
ivg added a commit that referenced this issue Aug 21, 2020