Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TLS Runtime Issues for GitHub Artifacts built on Windows #126

Closed
coltonhurst opened this issue Aug 22, 2024 · 11 comments · Fixed by #127
Closed

TLS Runtime Issues for GitHub Artifacts built on Windows #126

coltonhurst opened this issue Aug 22, 2024 · 11 comments · Fixed by #127
Labels
bug Something isn't working O-Windows Work related to the Windows verifier implementation

Comments

@coltonhurst
Copy link

Hello 😊 We have discovered a weird runtime TLS bug happening for artifacts built on the GitHub Windows runners. We have an example repository showcasing a simplified example here:

https://github.com/Thomas-Avery/test-rustls-platform-verifier

You can find the built artifact here:

https://github.com/Thomas-Avery/test-rustls-platform-verifier/actions/runs/10378079748

The error when running this artifact:

PS C:\Users\Colton\Desktop> .\test-rustls-platform-verifier.exe
[2024-08-22T17:19:28Z ERROR rustls_platform_verifier::verification::windows] failed to verify TLS certificate: invalid peer certificate: InvalidPurpose
Error: reqwest::Error { kind: Request, url: Url { scheme: "https", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("httpbin.org")), port: None, path: "/ip", query: None, fragment: None }, source: hyper_util::client::legacy::Error(Connect, Custom { kind: Other, error: Custom { kind: InvalidData, error: InvalidCertificate(InvalidPurpose) } }) }
error
@ctz
Copy link
Member

ctz commented Aug 22, 2024

So apparently this is CERT_E_WRONG_USAGE. Can you get a packet capture of this going wrong (in case there are any middleboxes between you and httpbin.org)?

@complexspaces
Copy link
Collaborator

I'd like to second what ctz is saying. A packet capture here would be great if possible since I can't reproduce your error on my local Windows system and network, further implying its from a specific network environment config.

coltonhurst added a commit to bitwarden/sdk-sm that referenced this issue Aug 22, 2024
## 🎟️ Tracking

https://bitwarden.atlassian.net/browse/SM-1407

## 📔 Objective

We are having TLS runtime issues for GitHub artifacts built on Windows
when using `rustls-platform-verifier`. There is a repository with a
simplified example showcasing this bug here:

https://github.com/Thomas-Avery/test-rustls-platform-verifier

The goal of this PR is to use a work-around on Windows until the bug is
fixed.

A GitHub issue for this has been created:
rustls/rustls-platform-verifier#126

## ⏰ Reminders before review

- Contributor guidelines followed
- All formatters and local linters executed and passed
- Written new unit and / or integration tests where applicable
- Protected functional changes with optionality (feature flags)
- Used internationalization (i18n) for all UI strings
- CI builds passed
- Communicated to DevOps any deployment requirements
- Updated any necessary documentation (Confluence, contributing docs) or
informed the documentation
  team

## 🦮 Reviewer guidelines

<!-- Suggested interactions but feel free to use (or not) as you desire!
-->

- 👍 (`:+1:`) or similar for great changes
- 📝 (`:memo:`) or ℹ️ (`:information_source:`) for notes or general info
- ❓ (`:question:`) for questions
- 🤔 (`:thinking:`) or 💭 (`:thought_balloon:`) for more open inquiry
that's not quite a confirmed
  issue and could potentially benefit from discussion
- 🎨 (`:art:`) for suggestions / improvements
- ❌ (`:x:`) or ⚠️ (`:warning:`) for more significant problems or
concerns needing attention
- 🌱 (`:seedling:`) or ♻️ (`:recycle:`) for future improvements or
indications of technical debt
- ⛏ (`:pick:`) for minor or nitpick changes
@coltonhurst
Copy link
Author

coltonhurst commented Aug 22, 2024

@ctz @complexspaces I've captured two requests with Wireshark (specifically, the Wireshark build for Windows ARM). One is a successful request with cURL (the cURL on Windows PowerShell), and the other is the failed request sent with the simplified example artifact here.

Within the Wireshark file:

The packets for the successful request are row #'s: ~22 - 44
The packets for the failed request are row #'s: ~66 - 81

Both requests are GET requests to: https://httpbin.org/

In case there is sensitive info within the file, I've uploaded the Wireshark file here: https://send.bitwarden.com/#MyCGYTnaekmYl7HUAQ4OhQ/Ah16xdCywkQdMkVuxW1fQg

This was run on Windows 11 Pro on ARM.

Microsoft Windows
Version 23H2 (OS Build 22631.4037)

@ctz
Copy link
Member

ctz commented Aug 22, 2024

Thanks, that looks pretty normal to me (real certificate from a real CA, in CT).

Looking more closely at the involved code I think we might be missing zero termination on the rgpszUsageIdentifier?

@complexspaces
Copy link
Collaborator

I'm not immediately sure if that's it. The CTL_USAGE structure takes a pointer to an array and the length of the array. It should not be looking for a NUL element to mark the end of the valid EKU array. As for the string itself, the docs don't specify but I would have expected this to crash/fail on any input if this was a case of Windows reading a garbage EKU value due to a missing terminator:

The LPSTR type and its alias PSTR specify a pointer to an array of 8-bit characters, which MAY be terminated by a null character.

I can run your test on an ARM64 Windows 11 VM later today and see if that makes a difference. If it doesn't I'll extract the certificates from the capture you provided and see if I can narrow it down by providing the exact chain 👍

@ctz
Copy link
Member

ctz commented Aug 22, 2024

My parsing of rgpsz is rg array of p pointer to sz nul terminated string, though I'm not sure if the hungarian notation is more or less trustworthy than anything else. If I'm right, and the matching gets done by strcmp, the behaviour will vary from compilation-to-compilation, platform-to-platform, in a highly unpredictable way.

@complexspaces
Copy link
Collaborator

I spent a while looking at this tonight and I manged to reproduce it. This was caused UB from a missing NULL terminator like @ctz suspected, but in a fun non-obvious way. With a fresh look it makes sense: how would Windows know where the string ends?

I was interested in the actual cause though, so I dug further.

The CERT_USAGE_MATCH we pass into the system is determined at compile time, as are all the pointer offsets (we only make pointers into const structures here):

 let usage = CERT_USAGE_MATCH {
            dwType: USAGE_MATCH_TYPE_AND,
            Usage: CTL_USAGE {
                cUsageIdentifier: ALLOWED_EKUS.len() as u32,
                rgpszUsageIdentifier: ALLOWED_EKUS.as_ptr() as *mut LPSTR,
            },
        };

The first problem, but that actually wasn't breaking things, was the way the string pointer was getting passed in. &str is a fat pointer so we were actually passing the equivalent of *const u128 into windows, slice metadata included. This didn't break anything though since the array only had one element and Windows wouldn't read past the actual string's data.

The second problem was more interesting. To finally reproduce the error, I actually had to set $CARGO_HOME="C:/Users/runneradmin/.cargo" because the broken binary would only be compiled that way when the path string of windows.rs was a specific length. My default username is shorter then runneradmin, hence failing to reproduce it at first.

I opened up both the binary from CI and my local build in Binary Ninja to look at the callsite for differences. Right away something jumped out. Here's what a correct initialization of our CTL_USAGE structure looks like:
image
image

Now the binary compiled by GitHub actions:
image
image

You can see that its getting a .reloc offset pointer that is just completely wrong in the GHA test binary. Its reading past the end of the &str const and into other random, merged strings. The reason this previously worked fine is because, AFAICT, the string would always* merge differently and result in those garbage bytes not being there, and instead a NUL terminator from the string's section layout being right after the EKU string data. I even checked our production 1Password builds and it accidentally lines up there too 😅. That is the "fun" of UB though: entirely absurd chances and results.

Running the bad binary with syscall tracing confirms the hypothesis, where you can see Windows is receiving an invalid eku:
image

As a result of that invalid EKU, the valid one in the certificate isn't in the allow-list anymore and it then fails. Small mystery solved 🎉

As an aside: This stopped working unconditionally at codegen time in Rust 1.78, for unknown reasons. Prior to that it worked with older rustc versions despite having the runneradmin path. The incorrect code also seems to work again in the latest nighty but this is just a random datapoint because it doesn't matter, and needs fixed regardless.

@complexspaces
Copy link
Collaborator

Please take a look at #127 which contains all of the fixes.

@complexspaces complexspaces added bug Something isn't working O-Windows Work related to the Windows verifier implementation labels Aug 23, 2024
@tangowithfoxtrot
Copy link

tangowithfoxtrot commented Aug 23, 2024

Thanks so much for the quick turnaround with a fix for this!

Just chiming in to confirm that the fix in #127 works for artifacts built in GitHub runners:

image

@coltonhurst
Copy link
Author

This is excellent, thank you @complexspaces and @ctz for the help, the fix, and the detailed information!

🎉

@ctz ctz closed this as completed in #127 Aug 23, 2024
@complexspaces
Copy link
Collaborator

This is now released on crates.io as versions 0.3.4 and 0.1.2.

Thank you again for the really helpful bug report! This issue would have probably continued to lurk for much longer otherwise.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working O-Windows Work related to the Windows verifier implementation
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants