Add support for UEFI datatype libraries #501

Closed
wrffrz wants to merge 3 commits

@wrffrz wrffrz commented Apr 24, 2019

Ghidra currently assumes that any PE files are for Windows and automatically loads those datatype libraries, but the PE file format is also used for UEFI executables.

UEFI PE files are identified by one of several alternate subsystem values within the NT Optional Header.

This PR adds support for identifying UEFI PE executables and an initial pair of 32-bit and 64-bit UEFI datatype libraries.
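As an illustration of the subsystem check described above, here is a minimal Python sketch. It is not the Java code from this PR. The function names `pe_subsystem` and `is_uefi_pe` are hypothetical; the offsets and the four EFI subsystem values (10 to 13) come from the PE/COFF specification.

```python
import struct

# EFI subsystem values defined by the PE/COFF specification.
EFI_SUBSYSTEMS = {
    10: "EFI_APPLICATION",
    11: "EFI_BOOT_SERVICE_DRIVER",
    12: "EFI_RUNTIME_DRIVER",
    13: "EFI_ROM",
}


def pe_subsystem(data: bytes) -> int:
    """Return the Subsystem field of a PE image held in memory."""
    if data[:2] != b"MZ":
        raise ValueError("missing DOS 'MZ' signature")
    # e_lfanew (offset 0x3C in the DOS header) points at the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # Skip the 4-byte signature and the 20-byte COFF file header. Subsystem
    # sits at offset 68 of the optional header in both PE32 and PE32+.
    (subsystem,) = struct.unpack_from("<H", data, pe_offset + 4 + 20 + 68)
    return subsystem


def is_uefi_pe(data: bytes) -> bool:
    """True when the image declares one of the EFI subsystems."""
    return pe_subsystem(data) in EFI_SUBSYSTEMS
```

A loader could use a check like this to choose between the Windows and UEFI datatype archives, which is the decision this PR moves into Ghidra.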

@redfast00
Contributor

@ryanmkurtz I would like to use Ghidra for UEFI reverse engineering, can you please merge this PR/give some feedback?

@ryanmkurtz
Collaborator

Evaluating this PR is on our TODO list, but unfortunately our TODO list is massive, so we have to prioritize accordingly. In the meantime, you have the option of building Ghidra on your own with these changes incorporated.

@redfast00
Contributor

@wrffrz Thank you for the work you put into this. This PR is very useful for reversing UEFI applications.

Is there some way I can contribute to the datatypes without editing the .gdt files (they are binary, and won't diff well)? For example the EFI_CONFIGURATION_TABLE struct is empty (this shouldn't be the case according to https://sites.google.com/site/uefiforth/bios/uefi/uefi/4-efi-system-table/4-6-efi-configuration-table). How did you generate these .gdt files?
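For reference, the UEFI specification defines EFI_CONFIGURATION_TABLE with two members, a VendorGuid and a VendorTable pointer. The sketch below models the X64 layout with Python's ctypes, purely as an illustration of what the datatype should contain. The class name `EFI_CONFIGURATION_TABLE_X64` is hypothetical, and the pointer is modeled as a fixed 64-bit integer so the layout does not depend on the host machine.

```python
import ctypes


class EFI_GUID(ctypes.Structure):
    # 128-bit GUID as laid out by the UEFI specification.
    _fields_ = [
        ("Data1", ctypes.c_uint32),
        ("Data2", ctypes.c_uint16),
        ("Data3", ctypes.c_uint16),
        ("Data4", ctypes.c_uint8 * 8),
    ]


class EFI_CONFIGURATION_TABLE_X64(ctypes.Structure):
    _fields_ = [
        ("VendorGuid", EFI_GUID),
        # VOID *VendorTable in the spec; a fixed 64-bit value stands in
        # for the pointer here (assumption: X64 target).
        ("VendorTable", ctypes.c_uint64),
    ]
```

An empty structure in the archive would therefore be missing 24 bytes of layout on X64.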

@wrffrz
Author

wrffrz commented May 29, 2019

@redfast00 Thanks for the feedback and catching the error with the EFI_CONFIGURATION_TABLE structure.

I originally created the datatype libraries by using gcc -E to output preprocessed headers, then manually editing them until Ghidra's Parse C Source function could parse them and generate a .gdt file without errors. I've since figured out a less painful method that produces repeatable output without manual edits to the header files.

Basically, you need to check out the tianocore EDK2 repo, set up an EDK2 build environment, and then get gcc to generate preprocessed header files like this:

git clone https://github.com/tianocore/edk2

cd edk2
make -C BaseTools
. edksetup.sh

INCDIRS=$(find . -type d -name Include | awk '{print "-I" $1}')

cat > uefitypes.c <<EOF
#include <ProcessorBind.h>
#include <Base.h>
#include <Uefi/UefiBaseType.h>
#include <Uefi/UefiSpec.h>
EOF

gcc $INCDIRS -I./MdePkg/Include/Ia32 -E uefitypes.c | grep -v builtin | grep -v ^# | grep -v ^$ > uefi_ia32.h
gcc $INCDIRS -I./MdePkg/Include/X64 -E uefitypes.c | grep -v builtin | grep -v ^# | grep -v ^$ > uefi_x64.h
gcc $INCDIRS -I./MdePkg/Include/Arm -E uefitypes.c | grep -v builtin | grep -v ^# | grep -v ^$ > uefi_arm.h
gcc $INCDIRS -I./MdePkg/Include/AArch64 -E uefitypes.c | grep -v builtin | grep -v ^# | grep -v ^$ > uefi_aarch64.h
gcc $INCDIRS -I./MdePkg/Include/Ebc -E uefitypes.c | grep -v builtin | grep -v ^# | grep -v ^$ > uefi_ebc.h

Once you have the header for the particular datatype library you want to generate, use File->Parse C Source in Ghidra to load that individual header with no parse options and select "Parse To File" to generate the .gdt file.

I'll update the PR to include these newer datatype libraries.

@wrffrz
Author

wrffrz commented May 29, 2019

The commands above that generate the combined headers accidentally strip the #pragma pack lines, because grep -v ^# removes every line that starts with #. Use these instead to get the structure packing correct:

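for arch in Ia32 X64 Arm AArch64 Ebc
do
    # Drop only the "# <line> <file>" linemarkers so that #pragma pack survives.
    # ${arch,,} lowercases the name and requires bash 4 or later.
    gcc $INCDIRS -I./MdePkg/Include/$arch -E uefitypes.c | grep -v builtin | grep -E -v '^# [0-9]' | grep -v '^$' > uefi_${arch,,}.h
done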
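To make the difference between the two filters concrete, here is a small Python sketch of the corrected pipeline. It mirrors the three grep stages, and the function name `filter_preprocessed` is hypothetical. The sample input in the usage note is made up for illustration and is not real EDK2 output.

```python
import re

# gcc -E linemarkers look like: # 12 "Base.h" 1
LINEMARKER = re.compile(r"^# [0-9]")


def filter_preprocessed(lines):
    """Mimic: grep -v builtin | grep -E -v '^# [0-9]' | grep -v '^$'."""
    kept = []
    for line in lines:
        line = line.rstrip("\n")
        if "builtin" in line:
            continue  # compiler builtins that the Ghidra parser does not need
        if LINEMARKER.match(line):
            continue  # linemarkers only; "#pragma pack" does not match
        if line == "":
            continue  # blank lines
        kept.append(line)
    return kept
```

Given the lines `# 1 "uefitypes.c"`, `#pragma pack(1)`, and `typedef unsigned char UINT8;`, only the linemarker is dropped, whereas the earlier `grep -v ^#` filter would also have removed the pragma.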

@wrffrz
Author

wrffrz commented May 30, 2019

I originally had trouble getting the Parse C Source dialog box to work without errors, but you can also generate the datatype libraries that way. Set the parse options to the full list of -I flags, one per include directory, using absolute paths and including the processor-specific directory, then add the headers you want to the list of source files to parse.

This has the additional benefit that the structures show up under the proper include file names in the datatype library browse hierarchy.

For example, to generate the X64 version of the datatypes, you can do this to get the list of include directory options to include in the Parse Options field:

git clone https://github.com/tianocore/edk2
cd edk2

echo "-I$(pwd)/MdePkg/Include/X64" > list_of_parse_options.txt
find $(pwd) -name Include | awk '{print "-I" $1}' >> list_of_parse_options.txt

Use the contents of that file as the Parse Options, then manually browse through the edk2 source tree to add the header files to parse:

MdePkg/Include/X64/ProcessorBind.h
MdePkg/Include/Base.h
MdePkg/Include/Uefi/UefiBaseType.h
MdePkg/Include/Uefi/UefiSpec.h

You can add other include files in addition to the ones that are pulled in by this set, but I ran into parse errors when I tried adding some others, so it will probably take some experimentation.

Once it's successfully building the datatype library, you can save the parse configuration profile as something like uefi_x64.prf and either generate a .gdt to be used for all UEFI binaries or use it later to parse the headers into each individual program's datatype library.
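The Parse Options list described above can also be produced without the shell. The following Python sketch is an equivalent of the echo/find commands under one assumption: only directories named Include are collected, while the find command above would also match a regular file with that name. The function name `parse_options` is hypothetical.

```python
import os


def parse_options(edk2_root, arch="X64"):
    """Build the -I list: processor-specific directory first, then every Include directory."""
    root = os.path.abspath(edk2_root)
    options = ["-I" + os.path.join(root, "MdePkg", "Include", arch)]
    for dirpath, dirnames, _filenames in os.walk(root):
        dirnames.sort()  # keep the output order stable between runs
        for name in dirnames:
            if name == "Include":
                options.append("-I" + os.path.join(dirpath, name))
    return options
```

Printing `"\n".join(parse_options("edk2"))` gives text that can be pasted into the Parse Options field.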

@emteere
Contributor

emteere commented May 31, 2019

As Ryan mentioned, we have been a bit swamped lately. This PR has been on our list; in particular, we wanted to ask for the .prf files and to suggest that the parsing be done without C preprocessing outside of Ghidra, since information is lost that way.

Getting the CParser to parse header files successfully can be painful, much like getting the compile options right when porting to another platform. Parsing inside Ghidra has other advantages besides keeping the original header file name in the .gdt. Any #define that evaluates to a constant is added as an enum, so error codes and calculated values from the parse are included.

Can you post the two .prf files that worked as well? We can regenerate the .gdt, but I'd like to include the .prf files so the exact parse options are recorded for regeneration.

You can strip off the full paths for the individual header files once you've added all the files that should be parsed. Then use relative -I paths as you have done. We put them in as paths from root, but a better option might be to use a placeholder, or we should allow variable expansion from a base entry or environment variable; this was meant to be easy.

Also, if you want to parse an entire directory from the root, you can just add the directory. When the parse is run, an attempt is made to find the root header files to include. This may not be a great option if you have a lot of header files, or a mixture that includes many files that won't parse (C++).

From your directions I think uefi_x64.prf should look like:

X64/ProcessorBind.h
Base.h
Uefi/UefiBaseType.h
Uefi/UefiSpec.h

-I$(UEFI_BASE_PATH_HERE)/MdePkg/Include

You can add it to the directory with the other .prf files.

Do the header files include bitfields?

@redfast00
Contributor

@wrffrz is it possible with ghidra to automatically mark the types of the entry function in uefi binaries (EFI_HANDLE and EFI_SYSTEM_TABLE) and to propagate them in other functions?

@emteere
Contributor

emteere commented Jun 1, 2019

Do you mean applying the function signatures to functions that are already labeled with their correct names?
You can apply FunctionDataTypes from the popup menu for the open UEFI archive.

Once the functions have good types applied, make sure your binary is clean (no red bookmarks, bad flow, etc).
Then run the DecompilerParameterID analyzer. It is off by default since it can propagate bad data types if your program is not clean.

@redfast00
Contributor

@emteere yes, that certainly helps, the only thing left is then to automatically add the types to the entry function (the uefi spec specifies these types).

@wrffrz
Author

wrffrz commented Jun 3, 2019

@emteere After further experimentation, this prf is working well for me for X64:

Uefi/UefiBaseType.h
Uefi/UefiSpec.h
PiDxe.h
PiMm.h
PiPei.h
PiSmm.h
Library/UefiApplicationEntryPoint.h

-I$(UEFI_BASE_PATH_HERE)/MdePkg/Include/X64
-I$(UEFI_BASE_PATH_HERE)/MdePkg/Include

There are a lot of other include directories with useful protocol definitions spread throughout the UEFI edk2 source tree, but this pulls in the core Pei/Dxe/Smm definitions and the most common UEFI entry point definition.

You can successfully generate Ia32, Arm, and AArch64 datatype libraries by replacing the architecture directory name in the first include definition, so uefi_ia32.prf is identical except for:

-I$(UEFI_BASE_PATH_HERE)/MdePkg/Include/Ia32

uefi_arm.prf has:

-I$(UEFI_BASE_PATH_HERE)/MdePkg/Include/Arm

and uefi_aarch64.prf has:

-I$(UEFI_BASE_PATH_HERE)/MdePkg/Include/AArch64
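Since the four profiles differ only in the processor-specific include directory, they can be generated from one template. The Python sketch below does that and substitutes a real path for the $(UEFI_BASE_PATH_HERE) placeholder. It assumes the .prf layout is exactly what is shown in this thread (header list, blank line, then options); actual Ghidra .prf files may contain more than that. The names `render_prf` and `render_all` are hypothetical.

```python
# Header list taken from the X64 profile posted above.
HEADERS = [
    "Uefi/UefiBaseType.h",
    "Uefi/UefiSpec.h",
    "PiDxe.h",
    "PiMm.h",
    "PiPei.h",
    "PiSmm.h",
    "Library/UefiApplicationEntryPoint.h",
]

# Profile suffix -> processor-specific directory under MdePkg/Include.
ARCH_DIRS = {"x64": "X64", "ia32": "Ia32", "arm": "Arm", "aarch64": "AArch64"}


def render_prf(base_path, arch_dir):
    """Return the text of one profile: headers, a blank line, then -I options."""
    include = base_path.rstrip("/") + "/MdePkg/Include"
    lines = list(HEADERS)
    lines.append("")
    lines.append("-I" + include + "/" + arch_dir)  # processor-specific first
    lines.append("-I" + include)
    return "\n".join(lines) + "\n"


def render_all(base_path):
    """Map each profile file name to its rendered text."""
    return {
        "uefi_%s.prf" % suffix: render_prf(base_path, arch_dir)
        for suffix, arch_dir in ARCH_DIRS.items()
    }
```

Writing each entry of `render_all("/path/to/edk2")` to disk yields the four profiles described in this comment.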

Arm UEFI systems are definitely less common, but I have a couple that I'm working with currently.

The existing patch doesn't include support for differentiating x86 from Arm and selecting the correct datatype library, but that's probably something that should be added as well. I'm not sure what the best way is to make architecture-specific decisions from DataTypeArchiveUtility.java.

@wrffrz
Author

wrffrz commented Jun 3, 2019

@redfast00 There's probably a way to automatically mark the entry point variables with specific types, but I haven't figured that out yet.

One thing to keep in mind is that there are multiple different types of entry point within UEFI and we'll need to differentiate between DXE and PEI phase binaries, but we can probably do some of that based on the specific subsystem value in the NT Optional Header.

If you look in MdePkg/Include/Library, there are different include files for each of the types of entry points that exist like:

MdePkg/Include/Library/DxeCoreEntryPoint.h
MdePkg/Include/Library/PeiCoreEntryPoint.h
MdePkg/Include/Library/PeimEntryPoint.h
MdePkg/Include/Library/StandaloneMmDriverEntryPoint.h
MdePkg/Include/Library/UefiApplicationEntryPoint.h
MdePkg/Include/Library/UefiDriverEntryPoint.h

Each of these is wrapped in "#ifndef MODULE_ENTRY_POINT_H" and declares the _ModuleEntryPoint() and EfiMain() functions. However, while UEFI applications and drivers from the Dxe phase look like this:

EFI_STATUS
EFIAPI
_ModuleEntryPoint (
  IN EFI_HANDLE        ImageHandle,
  IN EFI_SYSTEM_TABLE  *SystemTable
  );

... we also have Pei phase modules that look like this:

EFI_STATUS
EFIAPI
_ModuleEntryPoint (
  IN EFI_PEI_FILE_HANDLE     FileHandle,
  IN CONST EFI_PEI_SERVICES  **PeiServices
  );

And there's also the Sec -> Pei handoff and the Pei -> Dxe handoff entry points which are also different.

We might want to have a core UEFI datatype library with most of the definitions, and then separate smaller datatype libraries for each of the different types of UEFI executables.
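One way to picture the per-module-type split is a lookup from entry-point header to parameter list. The Python sketch below covers only the two signatures quoted in this comment. The table `ENTRY_POINT_PARAMS` and the function `entry_point_prototype` are hypothetical names, and the core and handoff entry points are left out on purpose because their signatures are not shown here.

```python
# Parameters as quoted above: UEFI applications/drivers vs. PEI modules.
UEFI_PARAMS = [("EFI_HANDLE", "ImageHandle"), ("EFI_SYSTEM_TABLE *", "SystemTable")]
PEIM_PARAMS = [
    ("EFI_PEI_FILE_HANDLE", "FileHandle"),
    ("CONST EFI_PEI_SERVICES **", "PeiServices"),
]

ENTRY_POINT_PARAMS = {
    "UefiApplicationEntryPoint.h": UEFI_PARAMS,
    "UefiDriverEntryPoint.h": UEFI_PARAMS,
    "PeimEntryPoint.h": PEIM_PARAMS,
}


def _declare(ctype, name):
    # Pointer types already end in "*", so no extra space is needed.
    separator = "" if ctype.endswith("*") else " "
    return ctype + separator + name


def entry_point_prototype(header):
    """Return a one-line C prototype for the module type named by its header."""
    params = ENTRY_POINT_PARAMS[header]  # KeyError for module types not listed
    args = ", ".join(_declare(ctype, name) for ctype, name in params)
    return "EFI_STATUS EFIAPI _ModuleEntryPoint(%s)" % args
```

A script that has already classified a binary could apply the matching prototype to the entry function and then let type propagation do the rest.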

al3xtjames added a commit to al3xtjames/ghidra-firmware-utils that referenced this pull request Jul 31, 2019
Follow NationalSecurityAgency/ghidra#501 and set a boolean property in
the program info options to indicate if the TE binary is a UEFI binary.
Also add additional getters to expose various header fields for future
use (by the UEFI helper script).
@redfast00
Contributor

@wrffrz @ryanmkurtz @emteere what are the steps still needed to get (a basic version of this) merged? Parts of this PR have been used in https://github.com/al3xtjames/ghidra-firmware-utils (a coreboot project for GSoC 2019).

@ryanmkurtz
Collaborator

Closing since we cannot accept binary files.

@ryanmkurtz ryanmkurtz closed this Apr 13, 2023
@LukeSerne
Contributor

It's unfortunate that this PR is closed because it contains new .gdt files. While I understand that accepting PRs that add or change binary files poses a risk that might not be acceptable, I think it'd be a shame if that would hinder Ghidra support for UEFI binaries.

Maybe a solution would be for the Ghidra team to regenerate the .gdt files that are part of this PR, and merge those, together with the "textual" changes of this PR.

@ryanmkurtz
Collaborator

I think the Ghidra Firmware Utilities extension includes this functionality.


5 participants