[WebAssembly] llvm-strip can invalidate linking sections #102002
@llvm/issue-subscribers-backend-webassembly Author: Sam Clegg (sbc100)
The WebAssembly version of llvm-strip currently just blindly copies all sections.
However, when stripping sections we really need to reconstruct the linking section and the relocation sections, since symbol information can change. I believe this is what the ELF version of llvm-strip does.
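To make the failure mode concrete, here is a minimal toy model in Python, not the llvm-objcopy code. It assumes only that relocation data refers to its target section by index, as Wasm object files do; the `Section` and `Reloc` types, the helper names, and the three-section layout are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Section:
    name: str


@dataclass
class Reloc:
    target_section: int  # index into the section list (toy model)


def is_debug(section):
    """Hypothetical predicate: treat names starting with .debug as debug info."""
    return section.name.startswith(".debug")


def strip_blindly(sections, relocs, is_removed):
    """Drop matching sections but copy relocations unchanged (the reported bug)."""
    kept = [s for s in sections if not is_removed(s)]
    return kept, list(relocs)


def strip_and_remap(sections, relocs, is_removed):
    """Drop matching sections, renumber the survivors, and rewrite relocations.

    Relocations whose target section was removed are dropped as well.
    """
    remap = {}
    kept = []
    for old_index, section in enumerate(sections):
        if not is_removed(section):
            remap[old_index] = len(kept)
            kept.append(section)
    new_relocs = [
        Reloc(remap[r.target_section])
        for r in relocs
        if r.target_section in remap
    ]
    return kept, new_relocs


# Made-up layout: a debug section sits between two sections that are kept.
sections = [Section("code"), Section(".debug_info"), Section("data")]
relocs = [Reloc(0), Reloc(1), Reloc(2)]

kept, stale = strip_blindly(sections, relocs, is_debug)
# Index 1 now silently refers to "data" and index 2 is out of range.

kept, fixed = strip_and_remap(sections, relocs, is_debug)
# The surviving relocations point at the renumbered sections.
```

The same reasoning applies to any symbol table entry that refers to a section by index, which is why a complete fix has to regenerate that information rather than copy it.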
hello, I've started working on this a bit and I'm a bit puzzled (about the wasm spec): do all symbols appear in the linking section's symtab? if so, why are exports added to Symbols as well? wouldn't that be a duplicate entry? (see https://github.com/llvm/llvm-project/blob/main/llvm/lib/Object/WasmObjectFile.cpp#L1431)
@sbc100 I updated the description to make it clearer that this only applies to stripping object files (although that is of course implied by the fact that only object files have linking sections), and also that it primarily applies to the "strip-debug" use case, since strip-all would of course also remove the linking section. Doing the work as described here would also open up the possibility of supporting some of the symbol-manipulation features of objcopy.

@ghostway0 Yes, for the linking use case, all symbols appear in the linking section's symtab. If a linking section is present, its contents override any information from the export section. Symbol information from the export or name section isn't used for linking, but it's useful for running tools like …
I also took a look at starting this yesterday. It does look like it will be quite a large amount of work to make it work in the general case. For example, look at our current Wasm objcopy compared to the ELF version:
Yeah, I'm not really surprised by this. Our objcopy implementation is extremely simple and minimal.
understood. didn't parse the clear there, for some reason. I am of course not sure, and I'll work on it more tomorrow, if that's okay with you all. note that, as @dschuff mentioned, ELF objcopy does so much more than the wasm version does, so I don't think this is gonna reach that size.
sat on this a bit today/yesterday. this is very rough and (currently) non-working, but I think it does get us closer :)
Wow, thanks for working on this @ghostway0. @dschuff and I discussed a stop-gap solution but it looks like you have the real thing almost working. I worry that testing this stuff is going to be fairly involved too.
worked a bit more today and... I guess you're right. do you have any tools to decompile these segments and show problems, other than (printf) debugging? currently debugging an …
I've created a smaller, stopgap PR which fixes this bug without implementing full symbol table regeneration: #102978. @ghostway0 this is not intended as a replacement for your work, just something that we can land now while you work on the larger change.
One possible way to test this kind of thing would be to leverage obj2yaml/yaml2obj to create and dump object files at the lowest level. You could imagine a test written as a YAML file that runs yaml2obj to "assemble" it, then llvm-strip, and then obj2yaml to check the result, with the assertions written against the output YAML. This has the advantage that YAML bypasses the object writer and reader and lets you create files that would not otherwise be valid, but it does involve a fair bit of manual work. Using the assembler and objdump is also an option (and is useful for bootstrapping: you could write an assembly file, assemble it to an object file, and then dump that with obj2yaml to create test input that you might later check in as YAML). Another good option for debugging would be to use wabt and/or wasm-tools to inspect your binaries at the lowest level, which can give you different insight into the binary encodings.
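The round trip described above could be driven from a small script. The following is a rough sketch, assuming `yaml2obj`, `llvm-strip`, and `obj2yaml` are on `PATH`; the helper names `roundtrip_commands` and `run_roundtrip` and the intermediate file names are hypothetical, and the assertions against the returned YAML are left to the caller.

```python
import subprocess
from pathlib import Path


def roundtrip_commands(yaml_in, workdir):
    """Build the three command lines: assemble, strip debug info, dump."""
    workdir = Path(workdir)
    obj = workdir / "input.o"          # hypothetical intermediate file name
    stripped = workdir / "stripped.o"  # hypothetical intermediate file name
    return [
        ["yaml2obj", str(yaml_in), "-o", str(obj)],
        ["llvm-strip", "--strip-debug", str(obj), "-o", str(stripped)],
        ["obj2yaml", str(stripped)],
    ]


def run_roundtrip(yaml_in, workdir):
    """Run the pipeline and return obj2yaml's dump of the stripped object.

    Raises subprocess.CalledProcessError if any tool exits with an error.
    """
    result = None
    for cmd in roundtrip_commands(yaml_in, workdir):
        result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout
```

In the LLVM tree itself, a test like this would more likely be written as a lit test, with the same three tools invoked from `RUN:` lines and the dump checked with FileCheck; the script form is mainly convenient for experimenting locally.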
…table files. This change is enough to allow `--strip-debug` to work on object files, without breaking the relocation information or symbol table. A more complete version of this change would instead reconstruct the symbol table and relocation sections, but that is a much larger change. Fixes: llvm#102002
using obj2yaml is a great idea!
…able files. (#102978) This change is enough to allow `--strip-debug` to work on object files, without breaking the relocation information or symbol table. A more complete version of this change would instead reconstruct the symbol table and relocation sections, but that is a much larger change. Bug: #102002
This issue has now been partially addressed by #102978, but I'll leave it open in case @ghostway0 is able to finish the more complete solution.
hi! sorry for the delay. this is probably a small unintended 'feature' I overlooked. will continue to debug it when I have time, and when that works, write the no-hack version :)
The WebAssembly version of llvm-strip currently just blindly copies all sections.
However, when stripping e.g. debug info sections from object files, we really need to reconstruct the linking section and the relocation sections, since symbol information can change. I believe this is what the ELF version of llvm-strip does.
See emscripten-core/emscripten#22291