Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MSI string pool handling doesn't account for large strings #72

Closed
iansltx opened this issue Jan 1, 2025 · 3 comments
Closed

MSI string pool handling doesn't account for large strings #72

iansltx opened this issue Jan 1, 2025 · 3 comments
Assignees
Labels
bug Something isn't working

Comments

@iansltx
Copy link

iansltx commented Jan 1, 2025

Description

Parsing properties from OpenVPN Connect v3's MSI doesn't grab the right data.

To Reproduce

emit -> xtmsi on the above installer, then look at the properties output. There's a license string blob that gets partially pulled into one interned string stream, with the string overflowing into the remainder of the string pool, while a bunch of the interned strings are missing (e.g. ProductName, ProductCode, ProductVersion).

Environment

  • macOS 15
  • Python 3.13.0
  • Refinery 0.8.0

Additional Context

I found a clue in the source for the msi-props-reader JS package: for interned strings too long to fit into a single length value, eight bytes are used rather than four for the string pool record, with the first two bytes as zeroes to indicate a longer record, the next two with the top two bytes of the length, the next two with the bottom two bytes of the length, then the last two with the reference count.

What's weird is that in my testing with the installer above (and the x86 version) I needed a 17-bit shift rather than 16, which makes me think I'm doing something wrong here, but for the above 17 bits works and 16 doesn't. My fix implementation is on another package, in another language (fleetdm/fleet#25079), but should be easy enough to port over if the code passes the smell test. Really uneasy about the 17-bit shift though, even though we're talking about an undocumented format that...well, I don't want to make guesses about why I'm seeing what I'm seeing.

@iansltx iansltx added the bug Something isn't working label Jan 1, 2025
@jhhcs
Copy link
Contributor

jhhcs commented Jan 1, 2025

Thanks for the excellent bug report, I'll look into it ASAP. I'll also try to find some more samples with the same phenomenon, maybe additional data can help understand the mystery about the shift length.

@jhhcs
Copy link
Contributor

jhhcs commented Jan 2, 2025

This is the pool data for your provided sample that triggers the problem:

00 00  size
01 00  refcount
bd b7  value1
02 00  value2

Now shifting refcount by 17 left and adding value1 is suspiciously the same as taking the little-endian concatenation of value1 and value2. Furthermore, when using your suggested method, we get a very interesting reference count inconsistency for string 732, which is the one with the large size:

(19:11:16) verbose in xtmsi: string 0000 reference count computed=021 provided=011
(19:11:16) verbose in xtmsi: string 0732 reference count computed=001 provided=002

So when instead we assume that value1 and value2 together are just the little endian encoding of the actual size, this reference count inconsistency vanishes and there are no unexplained magic numbers.

@iansltx
Copy link
Author

iansltx commented Jan 2, 2025

Tahnks for this! That makes a lot more sense. I'll fix the implementation that I built.

iansltx added a commit to fleetdm/fleet that referenced this issue Jan 2, 2025
h/t binref/refinery#72

Also added output for version in the custom package parser tool.
iansltx added a commit to fleetdm/fleet that referenced this issue Jan 2, 2025
h/t binref/refinery#72, for #24720. No changes
file as this is an unreleased bug.

Also added output for version in the custom package parser tool.

# Checklist for submitter

If some of the following don't apply, delete the relevant line.

<!-- Note that API documentation changes are now addressed by the
product design team. -->

- [x] Changes file added for user-visible changes in `changes/`,
`orbit/changes/` or `ee/fleetd-chrome/changes`.
See [Changes
files](https://github.com/fleetdm/fleet/blob/main/docs/Contributing/Committing-Changes.md#changes-files)
for more information.
- [x] Manual QA for all new/changed functionality
iansltx added a commit to fleetdm/fleet that referenced this issue Jan 2, 2025
h/t binref/refinery#72, for #24720. No changes
file as this is an unreleased bug.

Also added output for version in the custom package parser tool.

# Checklist for submitter

If some of the following don't apply, delete the relevant line.

<!-- Note that API documentation changes are now addressed by the
product design team. -->

- [x] Changes file added for user-visible changes in `changes/`,
`orbit/changes/` or `ee/fleetd-chrome/changes`.
See [Changes
files](https://github.com/fleetdm/fleet/blob/main/docs/Contributing/Committing-Changes.md#changes-files)
for more information.
- [x] Manual QA for all new/changed functionality
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants