Add support for UEFI datatype libraries #501
…nstead of using Windows datatype library
@ryanmkurtz I would like to use Ghidra for UEFI reverse engineering, can you please merge this PR/give some feedback? |
Evaluating this PR is on our TODO list, but unfortunately our TODO list is massive, so we have to prioritize accordingly. In the meantime, you have the option of building Ghidra on your own, incorporating these changes. |
@wrffrz Thank you for the work you put into this. This PR is very useful for reversing UEFI applications. Is there some way I can contribute to the datatypes without editing the .gdt files (they are binary, and won't diff well)? For example the EFI_CONFIGURATION_TABLE struct is empty (this shouldn't be the case according to https://sites.google.com/site/uefiforth/bios/uefi/uefi/4-efi-system-table/4-6-efi-configuration-table). How did you generate these .gdt files? |
@redfast00 Thanks for the feedback and for catching the error with the EFI_CONFIGURATION_TABLE structure. I originally created the datatype libraries by using gcc -E to output preprocessed headers, then manually edited them until Ghidra's Parse C Source function could parse them and generate a .gdt file without errors. I've since figured out a less painful method that produces repeatable output without manually editing the header files. Basically, you need to check out the tianocore EDK2 repo, set up an EDK2 build environment, and then get gcc to generate preprocessed header files like this:
Once you have the header for the particular datatype library you want to generate, use File->Parse C Source in Ghidra to load that individual header with no parse options and select "Parse To File" to generate the .gdt file. I'll update the PR to include these newer datatype libraries. |
The above lines to generate the combined headers to load into Ghidra are accidentally stripping the #pragma pack lines, so these ones should be used instead to get the structure packing correct:
|
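A minimal sketch of the packing issue described above, assuming the preprocessed headers are post-processed to remove GCC linemarkers. Whether the earlier commands worked this way is an assumption; the thread only states that the `#pragma pack` lines were being stripped. The function names and the sample header are hypothetical.

```python
import re

# Matches GCC linemarkers such as:  # 1 "MdePkg/Include/Uefi.h" 1
LINEMARKER = re.compile(r'^#\s+\d+\s+"')


def strip_linemarkers(preprocessed: str) -> str:
    """Drop GCC linemarkers but keep every other line, including #pragma."""
    kept = [line for line in preprocessed.splitlines()
            if not LINEMARKER.match(line)]
    return "\n".join(kept) + "\n"


def strip_all_hash_lines(preprocessed: str) -> str:
    """Naive filter: removes every '#' line, so #pragma pack is lost too."""
    kept = [line for line in preprocessed.splitlines()
            if not line.lstrip().startswith("#")]
    return "\n".join(kept) + "\n"


# Hypothetical preprocessed fragment, for illustration only.
SAMPLE = '''# 1 "Example.h"
#pragma pack(1)
typedef struct { unsigned char A; unsigned int B; } EXAMPLE;
#pragma pack()
'''

if __name__ == "__main__":
    print(strip_linemarkers(SAMPLE))
```

With the naive filter the structure would be parsed with default alignment, so field offsets in the resulting datatype library could differ from the packed layout the headers intend.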
I originally had issues getting the Parse C Source dialog box to work without errors, but you can also successfully generate the datatype libraries by setting the parse options to the full list of -I flags for each include directory (with absolute paths, including the processor-specific one) and adding the headers you want to the list of source files to parse. This has the additional benefit of the structures showing up under the proper include file names in the datatype library browse hierarchy. For example, to generate the X64 version of the datatypes, you can do this to get the list of include directory options to include in the Parse Options field:
Take the contents of the file and use that as the Parse Options and manually browse through the edk2 source tree to add the header files to parse:
You can add other include files in addition to the ones that are pulled in by this set, but I ran into parse errors when I tried adding some others, and it'll probably need some experimentation. Once it's successfully building the datatype library, you can save the parse configuration profile as something like uefi_x64.prf and either generate a .gdt to be used for all UEFI binaries or use it later to parse the headers into each individual program's datatype library. |
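As a rough illustration of building the -I list described above (not the original command from this comment), the following sketch emits absolute include flags for an EDK2 checkout, with the processor-specific directory first. The function name, the default package list, and the checkout path are assumptions; only `MdePkg/Include` and the architecture directory names come from this thread.

```python
from pathlib import Path


def include_flags(edk2_root, arch="X64", packages=("MdePkg",)):
    """Return absolute -I flags, architecture-specific directory first.

    `packages` is a hypothetical default; add other EDK2 packages as needed.
    Directories that do not exist in the checkout are skipped.
    """
    root = Path(edk2_root).resolve()
    flags = []
    for pkg in packages:
        common = root / pkg / "Include"
        for directory in (common / arch, common):
            if directory.is_dir():
                flags.append(f"-I{directory}")
    return flags


if __name__ == "__main__":
    # Hypothetical checkout location; prints nothing if it does not exist.
    print("\n".join(include_flags("/path/to/edk2", arch="X64")))
```

The printed lines can be pasted into the Parse Options field; switching `arch` to another directory name changes only the processor-specific entry.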
As Ryan mentioned, we have been a bit swamped as of late. This has been on our list, in particular asking for the .prf files and suggesting that the parsing be done without C preprocessing outside of Ghidra, since information is lost that way. Getting the CParser to successfully parse header files can be painful, almost like getting the compile options correct when porting to another platform. Parsing inside Ghidra has other advantages besides keeping the original header file name in the .gdt. Any #defines that evaluate to a constant will be added as an enum, so error codes or calculated values from your parse are included. Can you post the two .prf files that worked as well? We can regenerate them, but I'd like to include them so the exact parse options are recorded for re-generation. You can strip off the full paths for the individual header files once you've added all the files that should be parsed. Then use the relative -I as you have done. We put them in as paths from root, but a better option might be to use a placeholder, or we should allow variable expansion from a base entry or environment variable, but this was meant to be easy. Also, if you want to parse an entire directory from the root, you can just add the directory. When the parse is chosen, an attempt is made to find the root header files to include. This may not be a great option if you have a lot of header files, or a mixture with many files that won't parse (C++). From your directions I think uefi_x64.prf should look like:
You can add it to the directory with the other .prf files. Do the header files include bitfields? |
@wrffrz is it possible with ghidra to automatically mark the types of the entry function in uefi binaries (EFI_HANDLE and EFI_SYSTEM_TABLE) and to propagate them in other functions? |
Do you mean applying the function signatures to functions that are already labeled with their correct names? Once the functions have good applied types, make sure your binary is clean (no red bookmarks, bad flow, etc.). |
@emteere yes, that certainly helps, the only thing left is then to automatically add the types to the entry function (the uefi spec specifies these types). |
@emteere After further experimentation, this prf is working well for me for X64:
There are a lot of other include directories with useful protocol definitions spread throughout the UEFI edk2 source tree, but this pulls in the core Pei/Dxe/Smm definitions and the most common UEFI entry point definition. You can successfully generate Ia32, Arm, and AArch64 datatype libraries by replacing the architecture directory name in the first include definition, so uefi_ia32.prf is identical except for:
uefi_arm.prf has:
and uefi_aarch64.prf has:
Arm UEFI systems are definitely less common, but I have a couple that I'm working with currently. The existing patch doesn't include support for differentiating x86 vs. Arm and selecting the correct datatype library, but that's probably something that should be added as well. Not sure what the best way to make architecture-specific decisions from DataTypeArchiveUtility.java would be. |
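One possible way to make the architecture decision mentioned above is to key off the Machine field in the COFF file header. The sketch below is plain Python for brevity, not the Java of DataTypeArchiveUtility.java, and it is not part of the patch. The function names are hypothetical; the profile names are the ones used in this thread, and the Machine values are those listed in the PE/COFF and UEFI specifications.

```python
import struct

# Machine values a UEFI image may carry, mapped to the profiles named above.
MACHINE_TO_PROFILE = {
    0x014C: "uefi_ia32.prf",     # IMAGE_FILE_MACHINE_I386
    0x8664: "uefi_x64.prf",      # IMAGE_FILE_MACHINE_AMD64
    0x01C2: "uefi_arm.prf",      # IMAGE_FILE_MACHINE_ARMTHUMB_MIXED
    0xAA64: "uefi_aarch64.prf",  # IMAGE_FILE_MACHINE_ARM64
}


def pe_machine(image: bytes) -> int:
    """Return the COFF Machine field of a PE image held in memory."""
    if image[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)  # e_lfanew
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    (machine,) = struct.unpack_from("<H", image, pe_offset + 4)
    return machine


def profile_for(image: bytes):
    """Return the matching profile name, or None for other machine types."""
    return MACHINE_TO_PROFILE.get(pe_machine(image))
```

A table lookup keeps the policy in one place, so adding another architecture means adding one entry and one datatype library.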
@redfast00 There's probably a way to automatically mark the entry point variables with specific types, but I haven't figured that out yet. One thing to keep in mind is that there are multiple different types of entry point within UEFI and we'll need to differentiate between DXE and PEI phase binaries, but we can probably do some of that based on the specific subsystem value in the NT Optional Header. If you look in MdePkg/Include/Library, there are different include files for each of the types of entry points that exist like:
Each of these is wrapped in "#ifndef MODULE_ENTRY_POINT_H" and defines _ModuleEntryPoint() and EfiMain() function definitions, but although Uefi Applications and Drivers from the Dxe phase look like this:
... we also have Pei phase modules that look like this:
And there's also the Sec -> Pei handoff and the Pei -> Dxe handoff entry points which are also different. We might want to have a core UEFI datatype with most of the definitions and then separate smaller datatypes for each of the different types of UEFI executables that are available. |
Follow NationalSecurityAgency/ghidra#501 and set a boolean property in the program info options to indicate if the TE binary is a UEFI binary. Also add additional getters to expose various header fields for future use (by the UEFI helper script).
@wrffrz @ryanmkurtz @emteere what are the steps still needed to get (a basic version of this) merged? Parts of this PR have been used in https://github.com/al3xtjames/ghidra-firmware-utils (a coreboot project for GSoC 2019). |
Closing since we cannot accept binary files. |
It's unfortunate that this PR is closed because it contains new .gdt files. While I understand that accepting PRs that add or change binary files poses a risk that might not be acceptable, I think it'd be a shame if that would hinder Ghidra support for UEFI binaries. Maybe a solution would be for the Ghidra team to regenerate the .gdt files that are part of this PR, and merge those, together with the "textual" changes of this PR. |
I think the Ghidra Firmware Utilities extension includes this functionality. |
Ghidra currently assumes that any PE files are for Windows and automatically loads those datatype libraries, but the PE file format is also used for UEFI executables.
UEFI PE files are identified by one of several alternate subsystem values within the NT Optional Header.
This PR adds support for identifying UEFI PE executables and an initial pair of 32-bit and 64-bit UEFI datatype libraries.
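The subsystem check described here can be sketched outside Ghidra as follows. This is an illustration in Python under stated assumptions, not the Java code from the PR: the function names are hypothetical, while the subsystem values (10 through 13) and the field offset are those given in the PE/COFF specification.

```python
import struct

# EFI subsystem values from the PE/COFF specification.
EFI_SUBSYSTEMS = {
    10: "EFI_APPLICATION",
    11: "EFI_BOOT_SERVICE_DRIVER",
    12: "EFI_RUNTIME_DRIVER",
    13: "EFI_ROM",
}


def pe_subsystem(image: bytes) -> int:
    """Return the Subsystem field from the NT Optional Header."""
    if image[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)  # e_lfanew
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    optional = pe_offset + 4 + 20  # skip signature and COFF file header
    (magic,) = struct.unpack_from("<H", image, optional)
    if magic not in (0x10B, 0x20B):  # PE32 or PE32+
        raise ValueError("unexpected optional header magic")
    # Subsystem sits at offset 68 in both PE32 and PE32+ optional headers.
    (subsystem,) = struct.unpack_from("<H", image, optional + 68)
    return subsystem


def is_uefi_image(image: bytes) -> bool:
    """True when the image declares one of the EFI subsystems."""
    return pe_subsystem(image) in EFI_SUBSYSTEMS
```

For a file on disk, `is_uefi_image(open(path, "rb").read())` would indicate whether the UEFI datatype libraries are a better fit than the Windows ones.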