renovates the LLVM backend #1187
Relocatable files are probably PI[C|E] ELFs?
Speaking about relocatable files, I didn't mean just files with relocations, but files that contain code that will be relocated, i.e., code that will be linked at some different address. Such files can contain symbols with no address assigned. For example, in the https://docs.oracle.com/cd/E19683-01/816-1386/6m7qcoblj/index.html#chapter6-35166 Also, in Speaking about
The main misconception that you brought into the implementation is that relocations can only happen in relocatable files. The only difference is that relocatable files do not have any fixed virtual addresses, so we have to give them some base, and this base must be more or less in sync with the state-of-the-art tools. As this PR shows, we can easily handle relocatable files by making the base calculation more robust and amending it for relocatable files.

The issue that I have with the current implementation is that we are not providing a lot of valuable information for non-relocatable files, e.g., the relocations themselves, indirect symbols, external symbols, etc. Again, all these features commonly occur in regular binaries. But this is a topic for another PR.

This should be considered done. We have fixed MachO, cleaned up ELF, and removed lots of unnecessary code. Everything else will be tracked in #1189. See also #1188, where I also significantly refactored the loader part and introduced proper namespaces for properties. I will rebase #1188 once this PR is merged.
1. For ELF files we compute the base as the difference between the address of any loadable code segment and its offset. If there are no loadable code segments, we find the section with the minimal offset and subtract its offset from its address.
2. For MachO, when the file is relocatable, i.e., it doesn't have addresses, we compute the base as $vaddr - offset$, the same as we do in ELF. This gives us results that match objdump (they do not match radare2, but radare2 is not seeing any symbols, so it doesn't really matter).
3. For COFF nothing is done, and I am not sure that we need to do anything.
4. Removed the special computation of the base address (Base.from_sections_offset) from ELF, MachO, and COFF.

It is not tested on LLVM versions below 6, but I believe it should work down to 3.4.

resolves BinaryAnalysisPlatform#1183

Co-authored-by: gitoleg <forown@yandex.ru>
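The base computation described in this commit message can be illustrated with a minimal sketch. It is not the code from the PR: the `Region` record and the `compute_base` function are hypothetical names introduced here, and the sketch only assumes what the message states, namely that the base is the address minus the file offset of a loadable code segment, with a fallback to the section that has the smallest offset.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Region:
    """A segment or section. Hypothetical record, not a BAP or LLVM type."""
    addr: int                 # virtual address recorded in the file
    offset: int               # file offset of the region's data
    loadable: bool = False
    executable: bool = False


def compute_base(segments: Iterable[Region],
                 sections: Iterable[Region]) -> Optional[int]:
    """Return addr - offset for a loadable code segment if one exists,
    otherwise for the section with the smallest file offset."""
    for seg in segments:
        if seg.loadable and seg.executable:
            return seg.addr - seg.offset
    candidates = list(sections)
    if not candidates:
        return None  # nothing to derive a base from
    first = min(candidates, key=lambda s: s.offset)
    return first.addr - first.offset


# A loadable code segment is preferred when present.
print(hex(compute_base([Region(0x400000, 0x1000, True, True)], [])))
```

When a section's address is smaller than its offset, as can happen in an object file whose sections all have address zero, the difference is negative. The sketch leaves that case untouched, since the message does not say how it is handled.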
Hope we will pass it now.
Dropping support of old LLVM and legacy backends
------------------------------------------------

We drop a lot of old code (minus 3k lines of code), thus removing the support burden and making it easier to maintain, fix, and upgrade the code.

Fixes BinaryAnalysisPlatform#1166

Simplifies the implementation
-----------------------------

The remaining code base is significantly simplified. We dropped the separation between relocatable and non-relocatable files and removed any transformations of addresses from the LLVM backend (we now emit absolute virtual addresses). The whole logic of transforming from the LLVM view to the BAP image view now fits into a hundred lines of code (instead of hundreds of lines spread across 16 files, as it was before).

Fixes BinaryAnalysisPlatform#1183
Fixes BinaryAnalysisPlatform#1189

Produces more information
-------------------------

The relocation information is now emitted for all files (not only for relocatable ones). We also remove tons of checks that were preventing our backends from emitting valuable symbolic information.

Paves the road to BinaryAnalysisPlatform#1135 and BinaryAnalysisPlatform#1161