Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce new ELF parser #348

Merged
merged 34 commits into from
Aug 5, 2022
Merged

Introduce new ELF parser #348

merged 34 commits into from
Aug 5, 2022

Conversation

alessandrod
Copy link

@alessandrod alessandrod commented Jun 10, 2022

When config.new_elf_parser=true, parse ELF files using a new dependency free ELF parser instead of the old goblin based one.

@alessandrod alessandrod force-pushed the new-elf branch 11 times, most recently from a9d1081 to db3236e Compare June 17, 2022 08:31
@alessandrod alessandrod changed the title WIP: new elf parser Introduce new ELF parser Jun 17, 2022
@alessandrod alessandrod marked this pull request as ready for review June 17, 2022 08:32
@alessandrod
Copy link
Author

alessandrod commented Jun 17, 2022

I still want to add some structured fuzzing, but this is now ready for review!

As it is now, the new parser is supposed to be a drop in replacement for the old goblin parser. It does not enforce stricter checks yet. I'm happy to add those as part of this PR or do them as a follow up, but my priority for now is to be 100% compatible with the old parser so we can drop goblin sooner rather than later.

@codecov-commenter
Copy link

codecov-commenter commented Jun 21, 2022

Codecov Report

Merging #348 (ff1005f) into main (e8243ec) will decrease coverage by 0.33%.
The diff coverage is 80.53%.

@@            Coverage Diff             @@
##             main     #348      +/-   ##
==========================================
- Coverage   90.21%   89.88%   -0.34%     
==========================================
  Files          21       24       +3     
  Lines        8312     9000     +688     
==========================================
+ Hits         7499     8090     +591     
- Misses        813      910      +97     
Impacted Files Coverage Δ
src/elf_parser/types.rs 57.14% <57.14%> (ø)
src/elf_parser_glue.rs 66.02% <66.02%> (ø)
src/lib.rs 76.92% <75.00%> (-23.08%) ⬇️
src/elf_parser/mod.rs 82.75% <82.75%> (ø)
src/elf.rs 91.96% <92.14%> (+3.16%) ⬆️
src/aligned_memory.rs 87.59% <100.00%> (+12.59%) ⬆️
src/call_frames.rs 97.80% <100.00%> (-0.02%) ⬇️
src/vm.rs 80.06% <100.00%> (+0.06%) ⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us.

src/aligned_memory.rs Outdated Show resolved Hide resolved
src/aligned_memory.rs Outdated Show resolved Hide resolved
src/call_frames.rs Outdated Show resolved Hide resolved
src/elf_parser_glue.rs Outdated Show resolved Hide resolved
let mut offset = 0usize;
for section_header in section_header_table.iter() {
if section_header.sh_type == SHT_NOBITS {
continue;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think we can restrict this further?

Copy link

@Lichtso Lichtso Jun 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, it is the other way around. The continue statement here excludes sections which have no data in the file (SHT_NOBITS) from the restrictive checks below.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, but why do we enforce section header checks on sections we don't care about. Not saying we shouldn't but I suspect that there might be deployed elfs that don't adhere to these restrictions

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm "+0" on doing those checks - but keep in mind that it's checking that the sections don't overlap with the file header, program headers and the section table. Those are all things not allowed in ELF, and I can't think of anything that would generate such files (barring corruption or intentionally malicious inputs)

src/elf_parser/mod.rs Outdated Show resolved Hide resolved
@Lichtso
Copy link

Lichtso commented Jun 22, 2022

As it is now, the new parser is supposed to be a drop in replacement for the old goblin parser. It does not enforce stricter checks yet. I'm happy to add those as part of this PR or do them as a follow up, but my priority for now is to be 100% compatible with the old parser so we can drop goblin sooner rather than later.

The new parser is a lot stricter (e.g. sections may not overlap, string length limits, etc.) do you plan on putting these restrictions behind a feature flag? Or just have them come with the general switch of the parser implementation feature gate?

Also, additionally to fuzzing, can we have benchmarks for parsing speed and memory consumption?

Check for alignment before casting pointers to references or reference slices.
When config.new_elf_parser=true, use the new dependency free ELF parser instead
of the legacy goblin based one.
The dynamic table can be padded with invalid entries, and processing
must stop at the first DT_NULL.
…ents

At least until rust-bpf-sysroot v0.13, we used to generate invalid
dynamic sections where the address of DT_REL was not contained in any
program segment. When loading such files, fallback to relying on section
headers to find out the offset of the relocations table.

One notable program that exhibits this bug is
https://github.com/solana-labs/solana-program-library/releases/tag/memo-v3.0.0
unwrap() was actually safe since in a path only exercised from the cli, but
just in case the code gets moved around in the future.
From the reference docs:

"If it is not possible to align the pointer, the implementation returns
usize::MAX. It is permissible for the implementation to always return
usize::MAX. Only your algorithm’s performance can depend on getting a usable
offset here, not its correctness."
… SHT_NULL

Other data in ELF can link back to entry #0 to mean "undefined section".
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants