Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fail to parse PE files compiled by Rust #424

Closed
Berrysoft opened this issue Oct 21, 2024 · 13 comments
Closed

Fail to parse PE files compiled by Rust #424

Berrysoft opened this issue Oct 21, 2024 · 13 comments

Comments

@Berrysoft
Copy link

The error is:

Malformed("cannot map tls address_of_index rva (0xb5abc) into offset")

This error only occurs with goblin 0.9.0. Goblin 0.8.2 works fine.

The error seems only applies to exe create by Rust, for example, sudo.exe.

@m4b
Copy link
Owner

m4b commented Oct 22, 2024

@kkent030315 this seems directly related to the new tls parsing in #404 could you take a look? if a patch isn't available soon to fix this regression, I will likely disable the tls parsing in a minor patch and push that until it can be resolved better later on.

@m4b
Copy link
Owner

m4b commented Oct 22, 2024

@Berrysoft would you mind uploading a binary exhibiting the behavior here, so it's easier to repro/triage, thanks

@Berrysoft
Copy link
Author

sudo.zip

It contains the sudo.exe from Windows 11.

@m4b
Copy link
Owner

m4b commented Oct 22, 2024

thank you, and yes i can repro on top of tree:

cargo run --example rdr -- sudo.exe

exhibits the failure.

@m4b
Copy link
Owner

m4b commented Oct 22, 2024

this is odd, doesn't seem to find the rva in the .data section:

DEBUG - Checking .data for 0xb5abc ∈ 0xb5000..0xb6228

this is returning false for whatever reason:

fn is_in_section(rva: usize, section: &section_table::SectionTable, file_alignment: u32) -> bool {
    let section_rva = section.virtual_address as usize;
    is_in_range(
        rva,
        section_rva,
        section_rva + section_read_size(section, file_alignment),
    )
}

something seems very wrong with some of the section end values (end < begin)?

DEBUG - Checking .text for 0xb5abc ∈ 0x1000..0x98190
DEBUG - CHECKING: rva=0xb5abc align=512, section=0x1000 section_end=0x97200
DEBUG - Checking .rdata for 0xb5abc ∈ 0x99000..0xb4534
DEBUG - CHECKING: rva=0xb5abc align=512, section=0x99000 section_end=0x1b600
DEBUG - Checking .data for 0xb5abc ∈ 0xb5000..0xb6228
DEBUG - CHECKING: rva=0xb5abc align=512, section=0xb5000 section_end=0xa00
DEBUG - Checking .pdata for 0xb5abc ∈ 0xb7000..0xba138
DEBUG - CHECKING: rva=0xb5abc align=512, section=0xb7000 section_end=0x3200
DEBUG - Checking _RDATA for 0xb5abc ∈ 0xbb000..0xbb1f4
DEBUG - CHECKING: rva=0xb5abc align=512, section=0xbb000 section_end=0x200
DEBUG - Checking .rsrc for 0xb5abc ∈ 0xbc000..0xe6880
DEBUG - CHECKING: rva=0xb5abc align=512, section=0xbc000 section_end=0x2aa00
DEBUG - Checking .reloc for 0xb5abc ∈ 0xe7000..0xe7cf0
DEBUG - CHECKING: rva=0xb5abc align=512, section=0xe7000 section_end=0xe00

probably there is some subtle bug in section_read_size

not sure i haven't debugged this in a while; anyway that's probably all for me tonight :)

EDIT: nope, I forgot to add the end_size to the start :) that's why don't debug late at night

@m4b
Copy link
Owner

m4b commented Oct 22, 2024

With updated prints, it does indeed seem like that value is not in any sections; it's quite odd

DEBUG - Checking .data for 0xb5abc ∈ 0xb5000..0xb6228
DEBUG - CHECKING: rva=0xb5abc align=512, section=0xb5000 section_end=0xb5a00
DEBUG - Checking .pdata for 0xb5abc ∈ 0xb7000..0xba138
DEBUG - CHECKING: rva=0xb5abc align=512, section=0xb7000 section_end=0xba200

it's still possible there's something wrong with end calculation, but I'm not sure. anyway...

@philipc
Copy link
Collaborator

philipc commented Oct 22, 2024

There's nothing wrong with is_in_section for parsing files. The TLS address of index is in uninitialized data in the .data section, so it's impossible to give a file offset for the data. According to the PE documentation, the TLS address of index is where the loader stores the TLS index, so it also doesn't make sense to be trying to read the TLS index from the file in the first place.

If this code is used to parse a loaded file then it probably makes sense to read the TLS index from memory, but in that case is_in_section is wrong because it shouldn't restrict the size to the raw data (that only applies to the data in the file).

@kkent030315
Copy link
Contributor

@m4b I'm back! and yes. I recognize this issue. Lemme take a look and I will put fixes for this.

@kkent030315
Copy link
Contributor

kkent030315 commented Oct 23, 2024

I can repro this behaviour by cargo run --example rdr -- sudo.exe as well.

The address in question is the IMAGE_TLS_DIRECTORY::AddressOfIndex.
(Static) Image base is 0x140000000 and the VA to RVA translates (simply subtract the VA by the image base) it to the virtual address 0xb5abc in question.
TLS directory (IMAGE_TLS_DIRECTORY) in PE is kinda special. It handles virtual address, instead relative virtual address (mainly because the fields are covered by base relocation entries and compatible with ASLR). So likely something is insuffient in the source I am making assumption, in the first place.
image

The binary sudo.exe is compiled with MSVC compiler and linker. Some components within it compiled by C/C++ though. The Rust std feature in sudo.exe inserts TLS for its runtime.
Remember that Rust team had recently changed the TLS logic within their std and the binaries compiled may behave differ than the sudo.exe in this issue.

Applicable codebase of Rust std (before significant changes applied) is following:

Given that, we can reproduce the same behaviour with sudo.exe in question with following code and the nightly Rust:

rustc 1.83.0-nightly (3ae715c8c 2024-10-07)

Tip

rustup install nightly-2024-10-07

or

# proj-root/rust-toolchain.toml
[toolchain]
channel = "nightly-2024-10-07"
# then rustup do everything needed for you when you triggered builds
#[link_section = ".CRT$XLB"]
#[used]
pub static TLS_CALLBACK: unsafe extern "system" fn(*mut i8, u32, *mut i8) = tls_callback;

#[inline(never)]
pub extern "system" fn tls_callback(_: *mut i8, reason: u32, _: *mut i8) {}

fn main() {
    println!("Hello, world!");
}
cargo run --example rdr -- "path\to\target\release\goblintlstest.exe"

Malformed entity: cannot map tls address_of_index rva (0x21270) into offset

The goblintlstest.exe is compiled with Clang and LLD, the same is reproduced, so it's generic issue across MSVC(link.exe) and Clang(LLD).

The IMAGE_TLS_DIRECTORY::AddressOfIndex of binaries compiled with Rust never be correct it seems. So the current version of goblin raises that "malformed" error for any binaries compiled with Rust.

I'd like to use goblintlstest.exe with correct and complete PDB, that's superior than exploring sudo.exe with no PDBs and complexity.

The VA 0x21270 is indeed exists in the .tls section, when mapped in memory. the section doesn't cover the range when it isn't mapped into the memory.

This sounds a bit weird but is correct and behaves as expected. TLS indexes are literally the unique indexes for each threads when the executable is mapped being and prepared for execution by Windows Loader, so it shouldn't have to be in the raw data but should be exists in virtual address only when mapped into the memory.

1, .text, 0x400, 0x18000, 95 kB, 0x1000, 0x18ad6, 94.71 kB, Code, Executable, Readable (0x60000020), 5ed7a26677f471dcbeec0ecc139b0ee8, 6.39, 1536:xmO7M/bnFzUm8bmBcZ1+3jXau5xZSvBy4A77Wb85xY3Ooez3dlM4:4OkFkbmGZ1+3euvZ4UEeJT, T110932A327691A1BCD94FC0B883658966B972B0CE173178FF16A585303F69EF22B3C654
2, .rdata, 0x18000, 0x1fc00, 31 kB, 0x19000, 0x20ba4, 30.91 kB, Initialized data, Readable (0x40000040), 95e86e493cc408b9dd9377f4f0cbf808, 5.19, 768:RT2X0WlnlFMSezAwVyCO4EzyOj0Qz4MN:knMw7Y5Oj0Q, T12EE2F70B3765FAEAC0115A3A445213E52372ED78CF828347749D733ECD7E25A9E316A4
3, .data, 0x1fc00, 0x1fe00, 512 B, 0x21000, 0x21308, 776 B, Initialized data, Readable, Writeable (0xc0000040), 4c7c2bae4b6704a149a2a4a8f0782926, 1.63, 3:Qj6Vv2telslYlel5RtnOj6Vv2tGpW/LXP6aRQtllltl1l+lYtDKt/t:60S6EYEzt00SGpGjxEqkKt/, T1B7F08C83763090C5E04552323849CF81D339FCE00C513B1B3180732EAD301A3EC02C64
4, .pdata, 0x1fe00, 0x21200, 5 kB, 0x22000, 0x23248, 4.57 kB, Initialized data, Readable (0x40000040), 96490417e5f2c6885fffc6c3ed947135, 4.77, 96:20LhCp4/CIinmtEOvV6chUFl1JgWWR0u2EntskaAXliME7JDUx/bEJ:Hwp4CIzt9vIchWl1JgWW6EtsKlFE7SyJ, T1D9B1E89830AD0DD8D685D53D01D090770D288087A3FD2FEE4765E77B06A316D4EA6CBA
5, .tls, 0x21200, 0x21400, 512 B, 0x24000, 0x24059, 89 B, Initialized data, Readable, Writeable (0xc0000040), 023b3a392975053cc12e06f90d16a2b3, 0.02, 3:1l:, 
6, .reloc, 0x21400, 0x21800, 1 kB, 0x25000, 0x2533c, 828 B, Initialized data, Discardable, Readable (0x42000040), 618721927221131cadd91d9762b741b7, 4.85, 24:0f+FBr/qdE3+nDWKsk3sk8HX+nwi24CHkP:0f+FBr/qdUqW9HHHX+6M, T14911818A4083094AC71DC232C8BB06C494226C72C0F0EB9CF387233930E500063E98F8

So, possible fix is to be:

  • Not raising explicit error but ignores parsing when the VA does not covered by raw data. This is easy and I think is the most ideal solution for this case as the slot is already Option<T>. As explained above, assuming the field does not exists in raw data is totally safe by PE architecture.
    • goblin/src/pe/tls.rs

      Lines 182 to 202 in b2505ec

      // Parse the index if any
      if itd.address_of_index != 0 {
      if (itd.address_of_index as usize) < image_base {
      return Err(error::Error::Malformed(format!(
      "tls address_of_index ({:#x}) is less than image base ({:#x})",
      itd.address_of_index, image_base
      )));
      }
      // VA to RVA
      let rva = itd.address_of_index as usize - image_base;
      let offset =
      utils::find_offset(rva, sections, file_alignment, opts).ok_or_else(|| {
      error::Error::Malformed(format!(
      "cannot map tls address_of_index rva ({:#x}) into offset",
      rva
      ))
      })?;
      slot = Some(bytes.pread_with::<u32>(offset, scroll::LE)?);
      }

@kkent030315
Copy link
Contributor

Somewhat similarly related, I found a human error:

goblin/src/pe/tls.rs

Lines 239 to 241 in b2505ec

if callback == 0 {
break;
}

It should check if callback == 0 right after it reads data from binary. I'll include this fix for PR for this issue as well.

@kkent030315
Copy link
Contributor

@m4b Sorry for verbose ping, I submitted PR for fixing this issue. :)

@m4b
Copy link
Owner

m4b commented Oct 25, 2024

thanks everyone for quick feedback, @kkent030315 fixed this for now in #425 please re-open if see issue in 0.9.1, which is published with this fix

@m4b m4b closed this as completed Oct 25, 2024
@kkent030315
Copy link
Contributor

kkent030315 commented Oct 29, 2024

Note: This issue is technically problem with MSVC. Clang/LLD does not create AddressOfIndex that is only available in virtual address. MSVC linker is smart enough to reduce binary size by emitting it from the raw data. So this issue is correctly: "Fail to parse PE files compiled with MSVC and TLS data".

To replicate this behaviour:

# .cargo/config.toml
[target.x86_64-pc-windows-msvc]
linker    = "rust-lld.exe" # LLD
# linker = "link.exe" for MSVC

Also, non-mangled _tls_index is required to tell linker; this is C/C++ code but something like this:

EXTERN_C unsigned int _tls_index{};

I'll add tests for those cases to mitigate further regression.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants