Option to append PT_LOAD segments to ELFs #1148
Comments
We have the |
Yes! That's absolutely correct. The difference in this case is the generation of program headers instead of sections. A The program header type has a reserved OS-specific range:
Could types in this range be used to temporarily mark the sections so they could be targeted by post-processing tools? |
PT_NULL is defined to be the type that is ignored, so we should probably use that type. |
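For reference, the type ranges under discussion can be sketched as below. The constant values are the ones the ELF specification assigns (they match `<elf.h>`); the `classify_p_type` helper is a hypothetical name used only for illustration.

```python
# Program header type constants as defined by the ELF specification.
PT_NULL = 0                                   # unused entry, ignored by loaders
PT_LOAD = 1                                   # loadable segment
PT_LOOS, PT_HIOS = 0x60000000, 0x6FFFFFFF     # OS-specific range
PT_LOPROC, PT_HIPROC = 0x70000000, 0x7FFFFFFF # processor-specific range


def classify_p_type(p_type: int) -> str:
    """Hypothetical helper: say which reserved range a p_type value falls in."""
    if p_type == PT_NULL:
        return "null"
    if PT_LOOS <= p_type <= PT_HIOS:
        return "os-specific"
    if PT_LOPROC <= p_type <= PT_HIPROC:
        return "processor-specific"
    return "other"
```

A placeholder marked with `PT_NULL` is skipped by loaders, whereas a value in the OS-specific range relies on the loader ignoring types it does not recognize.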
Why don't you just mmap the program at runtime and read the data? |
@brenoguim That is my last resort in case all other alternatives fail. My rationale for the program header solution:
|
Fair enough. Even though I disagree with some of the arguments, others make up for it. |
No, it's a magic link. Opening it doesn't dereference the filename; it always leads to the relevant file, even if it's been moved or deleted. /proc/self/fd/1 is also a magic link. That said, there are possible problems in opening /proc/self/exe. /proc isn't always mounted (afaik this happens mostly in containers). |
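A minimal sketch of the fallback this implies, assuming Linux: try the magic link and handle the case where `/proc` is not mounted. The function name `open_own_executable` is hypothetical. Note that when run under Python, `/proc/self/exe` refers to the interpreter binary, not to the script.

```python
def open_own_executable():
    """Open the running executable via the /proc magic link.

    Returns a binary file object, or None when /proc is unavailable
    (for example in a container or before /proc has been mounted).
    """
    try:
        return open("/proc/self/exe", "rb")
    except OSError:
        return None


exe = open_own_executable()
if exe is None:
    print("/proc/self/exe is not available; another mechanism is needed")
else:
    with exe:
        # On Linux the first four bytes of the executable are the ELF magic.
        print(exe.read(4))
```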
Certainly. Correct me if I'm wrong but it's my understanding that appending program headers would require re-linking the whole executable. Since it's located at the beginning of the file, increasing the size of the program header table would result in all addresses having to be adjusted. I assume that's the reason
I see. I must have misunderstood things when I read the documentation:
I thought it was a normal link that pointed to the file. Reading the manual again, I noticed a couple more edge cases:
So there might be multithreading and permission problems.
You're completely correct. My program will definitely be running in those conditions. My goal is to be able to boot Linux directly into it and bring up the system myself, so I'll have to mount |
Patchelf is able to change things that objcopy doesn't. Specifically, patchelf is designed to operate on linked programs/libraries. |
Technically, you can add a new program header without re-linking the entire executable by copying the existing PHDR to the end of the file with some spare slots. You can then edit the PHDRs as needed and rewrite the ELF header to refer to the new PHDR instead of the old one. That said, improving the linker to add a few spare slots at the end of the PHDRs is quite straightforward, and the feature request is convincing, so I'll try to do that. |
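The relocation described above can be sketched as follows. This is an illustration under stated assumptions, not the code of any tool mentioned in this thread: it handles only little-endian ELF64 input, and `move_phdrs_to_end` and `spare` are hypothetical names. It only copies the table and updates the ELF header; a real tool would still have to make sure the relocated table is covered by a `PT_LOAD` segment and update any `PT_PHDR` entry.

```python
import struct

EHDR = struct.Struct("<16sHHIQQQIHHHHHH")  # Elf64_Ehdr, little-endian
PHDR_SIZE = 56                             # sizeof(Elf64_Phdr)


def move_phdrs_to_end(elf: bytes, spare: int = 2) -> bytes:
    """Copy the program header table to the end of the file with spare slots."""
    (ident, e_type, e_machine, e_version, e_entry, e_phoff, e_shoff, e_flags,
     e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum,
     e_shstrndx) = EHDR.unpack_from(elf)
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("expected a little-endian ELF64 file")
    if e_phentsize != PHDR_SIZE:
        raise ValueError("unexpected program header entry size")

    table = elf[e_phoff:e_phoff + e_phnum * e_phentsize]
    out = bytearray(elf)
    out += b"\0" * (-len(out) % 8)            # keep the new table 8-byte aligned
    new_phoff = len(out)
    out += table                              # copy of the existing entries
    out += b"\0" * (spare * e_phentsize)      # zeroed entries have type PT_NULL

    # Point the ELF header at the relocated, larger table.
    EHDR.pack_into(out, 0, ident, e_type, e_machine, e_version, e_entry,
                   new_phoff, e_shoff, e_flags, e_ehsize, e_phentsize,
                   e_phnum + spare, e_shentsize, e_shnum, e_shstrndx)
    return bytes(out)
```

The original table is left in place as dead bytes, so no other offset in the file has to change.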
That's awesome, I didn't think of that. It looks like a patchelf feature is certainly possible. I'll try to contribute a patch.
Thank you so much. I will integrate |
The purpose of this tool is to create an easily customizable segment in the ELF executable that the Linux kernel will map into memory when it loads the interpreter. A pointer to it can be easily obtained via the program headers table pointer passed via the auxiliary vector. Through this program, it should be possible to embed lone lisp code into the interpreter itself and create self-contained applications.

This tool does the following:

- Parses and validates an ELF executable
- Finds its program headers table
- Moves it to the end of the file
- Appends two new entries to the table
- Adjusts the ELF header's pointer to it
- Adjusts the PHDR segment dimensions
- Sets the first one to a PT_LOAD segment covering the table
- Sets the second one to a custom lone lisp loadable segment
- Outputs the result to a new file

This leaves an unused hole in the file where the original table was. There's no getting around that without re-linking the entire executable. Fortunately, mold has just added support for placeholder segments which will efficiently and completely solve this problem without any waste. This tool works though and should be useful if mold is unavailable.

Thanks: Rui Ueyama <ruiu@cs.stanford.edu>
GitHub-Issue: rui314/mold#1148
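To illustrate how a program can find its program header table through the auxiliary vector, here is a small sketch. A native program receives the vector directly from the kernel at startup; this example instead decodes the same data in the format Linux exposes through `/proc/self/auxv`. It assumes a 64-bit little-endian system, and `parse_auxv` is a hypothetical helper name. The `AT_*` values are the ones defined in `<elf.h>`.

```python
import struct

AT_NULL, AT_PHDR, AT_PHENT, AT_PHNUM = 0, 3, 4, 5


def parse_auxv(raw: bytes) -> dict:
    """Decode an auxiliary vector dump into {type: value}, stopping at AT_NULL.

    Assumes 64-bit little-endian (type, value) pairs.
    """
    entries = {}
    for a_type, a_val in struct.iter_unpack("<QQ", raw):
        if a_type == AT_NULL:
            break
        entries[a_type] = a_val
    return entries
```

On Linux this could be used as `parse_auxv(open("/proc/self/auxv", "rb").read())`. `AT_PHDR` then holds the in-memory address of the table, `AT_PHENT` the size of one entry, and `AT_PHNUM` the number of entries.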
There's a better and simpler way to implement this feature: take advantage of the fact that the PT_PHDR program header is optional. Instead of moving the whole program header table to the end of the file and then wrangling pointers, I can just overwrite the PT_PHDR entry.

There seem to be no downsides to this. Since the kernel passes programs pointers to their program header tables, as well as the entry count and even the size of each entry, the PT_PHDR segment does not appear to be necessary at all for accessing the program header table.

This enables even simpler integration with mold and its new features: look for spare PT_NULL segments to use for the lone segment before using the PT_PHDR segment. This allows keeping the PT_PHDR segment and avoids disturbing the linker's output.

The ELF documentation says, emphasis mine:

> PT_PHDR
>
> The array element, _if present_ ...
> _If it is present_ ...

So it might not be present and is not required to be present.

> Moreover, it _may_ occur only if the program header table
> is part of the memory image of the program.

A lack of the PT_PHDR segment does not imply the program header table is not part of the program's memory image. It could still be covered by a PT_LOAD segment and be in memory anyway. All linkers I've tried generate PT_LOAD segments that cover the ELF header and the program headers and so accessing the table at runtime should work.

GitHub-Issue: rui314/mold#1148
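The slot selection described in this commit message (prefer a spare PT_NULL entry, fall back to PT_PHDR) could be sketched like this. It is not the actual code of the tool; `pick_slot` is a hypothetical name, and the sketch assumes the table is given as raw little-endian `Elf64_Phdr` entries.

```python
import struct

PHDR = struct.Struct("<IIQQQQQQ")  # Elf64_Phdr, little-endian
PT_NULL, PT_PHDR = 0, 6


def pick_slot(table: bytes):
    """Return the index of the entry to overwrite, or None if none qualifies."""
    types = [fields[0] for fields in PHDR.iter_unpack(table)]
    # Prefer a spare PT_NULL entry so the linker's output stays untouched;
    # fall back to the optional PT_PHDR entry otherwise.
    for wanted in (PT_NULL, PT_PHDR):
        if wanted in types:
            return types.index(wanted)
    return None
```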
I've already integrated this feature into my tools and will be using it as soon as it is released. Thanks again! @brenoguim I've just created a dedicated ELF patching tool for my own project and it's already working nicely. I'll make a pull request to I've also requested that this feature be added to |
This is a new experimental flag to make room at the end of PHDR so that post-processing tools can add more entries as needed. Fixes rui314#1148
It would be useful to have a command line option or plugin for the linker that appends an empty PT_LOAD program header table entry to ELF executables. This will greatly facilitate patching executables with new data after linking.

The Linux kernel automatically loads those segments into memory and passes a pointer to the program header table via the auxiliary vector. This would be the perfect mechanism to allow executables to easily and efficiently access data embedded into the executable itself, even data patched in after the binary has been compiled.

Current solutions are insufficient. objcopy can add new sections, but they do not get loaded by the kernel without a PT_LOAD segment, and those can only be created at link time since adding new program headers would change all offsets in the file. Linker scripts support a PHDRS command, but using that disables the linker's default behavior and forces users to specify all the segments and map all the sections to them instead of letting the linker do it.

A simple --append-program-header that just adds an empty program header to the end of the table would be ideal. With that feature in place, custom tools can be written to copy arbitrary data into the ELF and then edit the placeholder's offset and size to match.

Links:
Related StackOverflow question
Binutils mailing list discussion
Equivalent LLVM linker issue
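The post-processing step proposed in this issue (copy data into the ELF, then edit the placeholder's offset and size) might look roughly like the sketch below. Everything here is an assumption made for illustration: the input is a little-endian ELF64 file, the caller already knows the index of the placeholder entry, and the caller supplies a page-aligned `vaddr` that does not overlap any existing segment. `fill_placeholder` is a hypothetical name.

```python
import struct

EHDR = struct.Struct("<16sHHIQQQIHHHHHH")  # Elf64_Ehdr, little-endian
PHDR = struct.Struct("<IIQQQQQQ")          # Elf64_Phdr, little-endian
PT_LOAD, PF_R, PAGE = 1, 4, 0x1000


def fill_placeholder(elf: bytes, index: int, payload: bytes, vaddr: int) -> bytes:
    """Append payload to the file and point placeholder entry `index` at it."""
    header = EHDR.unpack_from(elf)
    e_phoff, e_phentsize, e_phnum = header[5], header[9], header[10]
    if e_phentsize != PHDR.size or not 0 <= index < e_phnum:
        raise ValueError("no such ELF64 program header entry")
    if vaddr % PAGE:
        raise ValueError("vaddr must be page-aligned")

    out = bytearray(elf)
    out += b"\0" * (-len(out) % PAGE)  # p_offset must match p_vaddr modulo p_align
    offset = len(out)
    out += payload

    # Rewrite the placeholder as a read-only loadable segment over the payload.
    PHDR.pack_into(out, e_phoff + index * e_phentsize,
                   PT_LOAD, PF_R, offset, vaddr, vaddr,
                   len(payload), len(payload), PAGE)
    return bytes(out)
```

Because the data is appended and only one existing table entry is rewritten, no other offset in the file changes. The ELF specification also expects `PT_LOAD` entries to appear in ascending `p_vaddr` order, which this sketch leaves to the caller.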