Avoid union reinterpret #16817
617b236 to 14bc0f9
So, somewhat disappointingly, it's not this either. Graphics are still broken in GTA on M1, at least. I actually do think that unions should be safe enough, since they're quite widely used for this purpose, although that will probably only last until the next C++ standard or so. So I'm a little undecided on merging this.
If this doesn't fix Intel either, we should probably check whether we can get consistent Windows/Intel Mac behavior with a "self-made" sine and cosine function. A Taylor series should do just fine, but there are surely lots of implementations on the internet. Can't really think of anything else on that PR...
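For illustration, the "self-made" sine and cosine idea could be sketched as below. This is a minimal sketch, not code from the PR or from PPSSPP: the names `taylor_sin` and `taylor_cos` are hypothetical, and it assumes radians as input and double precision internally. It makes no attempt to match what the PSP hardware produces.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: sine from a fixed-length Taylor series.
// The polynomial itself uses only basic arithmetic, so it does not
// depend on the platform's sin()/cos() implementation.
inline double taylor_sin(double x) {
    assert(std::isfinite(x));
    const double kPi = 3.14159265358979323846;
    // std::remainder is an exact IEEE operation; result lies in [-pi, pi].
    x = std::remainder(x, 2.0 * kPi);
    // Fold into [-pi/2, pi/2] using sin(pi - x) == sin(x).
    if (x > kPi / 2) x = kPi - x;
    else if (x < -kPi / 2) x = -kPi - x;
    const double x2 = x * x;
    double term = x;  // current term, starts at x^1 / 1!
    double sum = x;
    for (int n = 1; n <= 10; ++n) {
        // Next term: multiply by -x^2 / ((2n)(2n+1)).
        term *= -x2 / ((2.0 * n) * (2.0 * n + 1.0));
        sum += term;
    }
    return sum;
}

// Hypothetical helper: cosine via the identity cos(x) == sin(x + pi/2).
inline double taylor_cos(double x) {
    return taylor_sin(x + 3.14159265358979323846 / 2);
}
```

Compiler settings can still change the results of code like this across platforms, for example if one compiler contracts multiply-add pairs into fused operations and another does not, so it would only be a diagnostic aid.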
It would be best to try to match what the PSP actually generates, see #2990. I don't know if sin/cos would specifically solve that, but we already had issues that using the double version of sin/cos helped. It's important that the results line up with the PSP results.
Darn, it would've explained the strange NaNs. That said, the screenshots showed it working on softgpu for Intel, which is (if I understand correctly) not the case for the M1. So they're different problems, apparently. Though, these unions are used in both cases anyway, I'd think... -[Unknown]
Sadly, it also does not help on Intel.
If you're referring to the screenshots I took with software rendering: each screenshot had the Windows version running in Wine on top and the native version underneath. The native macOS version has the same rendering issues even with software rendering.
Clarification:
On Intel x86_64 hardware, it works in Wine with Windows builds but not with native macOS builds, and bisect points to 0ccc63b, where the issue first appeared in macOS x86_64 builds.
With macOS x86_64 builds, the issue is observed both with and without software rendering enabled.
Since the Windows x86_64 builds are issue-free in Wine even on macOS, while softgpu has the same issue in macOS x86_64 builds, my guess is that the issue lies in the different compilers used.
Well, when I get around to it, I'll start by reproducing on my old Intel Mac, and start taking that commit apart...
Don't get me wrong, I just wanted to propose a quick test to see whether Windows and Mac behave the same this way. Whether both are broken or not doesn't really matter, as it would give us a good hint about what the problem is. Different implementations could give zero or a different sign, leading to NaN in some edge cases.
Well, if it doesn't help either, I suppose we should close this. I wish unions were definitively correct for these types of conversions. I guess a quick check would be trying to switch back to sinf/cosf/sincosf. That would cause problems, but if it undoes the breakage it would answer questions. -[Unknown]
Yeah, will try some stuff like that soon.
Is there a table somewhere of known 'input->output' pairs for PSP FPU functions?
Addendum (maybe it's offtopic here?). So, as I understand it, I think the current implementation, i.e.
with
|
I'm sure it doesn't. I did create a table of inputs/outputs to do some simulations against. Here's an example:
I only saved the full table as a CSV and it's quite large, but maybe I can convert it to something smaller and attach it somewhere. As I recall, it was clear that the result was also only correct to a certain number of bits in "infinite bits" space, before normalizing the exponent. Specifically, exp 0x68 (-24) always had zeroes in the lowest 18 bits, until 0x7A (-6) which could have all bits set in the result. That's just what I had in my notes that it was looking like. At the time I was thinking maybe it used CORDIC because of the vrot instruction, but knowing that it definitely happens in two steps internally now, I guess a Taylor expansion is more likely, possibly just at a fixed precision. -[Unknown]
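An observation like "the lowest 18 bits are always zero at a given exponent" can be checked mechanically against a table of results. The following is a minimal sketch of such a check, assuming IEEE 754 binary32 floats; `FloatBits`, `split_float`, `mantissa_trailing_zeros` and `low_bits_clear` are hypothetical names, not part of the emulator. The exponent field is reported still biased by 127, as stored.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Raw fields of an IEEE 754 binary32 value.
struct FloatBits {
    uint32_t sign;      // 1 bit
    uint32_t exponent;  // 8 bits, still biased by 127
    uint32_t mantissa;  // 23 stored bits, without the implicit leading 1
};

// Split a float into its raw fields (copying bits with memcpy).
inline FloatBits split_float(float f) {
    static_assert(sizeof(float) == sizeof(uint32_t), "expects a 32-bit float");
    uint32_t u;
    std::memcpy(&u, &f, sizeof(u));
    return FloatBits{u >> 31, (u >> 23) & 0xFFu, u & 0x007FFFFFu};
}

// Count consecutive zero bits at the low end of the 23-bit mantissa field.
inline int mantissa_trailing_zeros(float f) {
    uint32_t m = split_float(f).mantissa;
    if (m == 0) return 23;
    int n = 0;
    while ((m & 1u) == 0) {
        m >>= 1;
        ++n;
    }
    return n;
}

// True if the lowest `bits` bits of the mantissa field are all zero.
inline bool low_bits_clear(float f, int bits) {
    assert(bits >= 0 && bits <= 23);
    const uint32_t mask = (1u << bits) - 1u;
    return (split_float(f).mantissa & mask) == 0;
}
```

Running `low_bits_clear(result, 18)` over every row with a given exponent field would confirm or refute a pattern like the one described in the notes.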
@unknownbrackets Just curious, the error column seems a little... off? Or what does it represent?
I don't remember, it might've just been bits of mantissa error (i.e. error as an integer adjusted to a certain exp or something.) -[Unknown]
It matches Yes, seeing these 0's in LSB is interesting. It's clearly far from correctly rounded. |
So, at least in this table:
|
Here's a quick upload of the CSV (compressed using zstd): -[Unknown]
Downloaded, thanks!
I'm gonna go ahead and close this, as whatever the standard may say, unions are well supported and convenient for this. It turned out to be something completely different in #15149.
Fair enough, I'd definitely prefer to keep the unions as long as they're valid everywhere and stay valid... If we figure out the actual model of sin/cos, might be nice to use it (even optionally.) I'm also now very curious if the wrong driving in #2990 is differently wrong on macOS. -[Unknown]
It seems like the consensus is that this is undefined behavior in C++, though not newer C. Let's just avoid it, whether or not it fixes the M1 thing. It does seem like it would explain it, though. It's unfortunate, this may make debug mode a bit slower for IR/interp...
Might've missed some but I think I got most of them.
-[Unknown]
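For reference, the usual way to avoid reading an inactive union member in C++ is to copy the bytes with `std::memcpy`. The following is a minimal sketch of that pattern, not the code from this PR: `bit_cast_copy`, `float_to_bits` and `bits_to_float` are hypothetical names. C++20 offers `std::bit_cast` in `<bit>` for the same purpose.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Reinterpret the bytes of `from` as a value of type To without a union.
template <typename To, typename From>
inline To bit_cast_copy(const From &from) {
    static_assert(sizeof(To) == sizeof(From), "sizes must match");
    static_assert(std::is_trivially_copyable<To>::value &&
                      std::is_trivially_copyable<From>::value,
                  "both types must be trivially copyable");
    To to;
    std::memcpy(&to, &from, sizeof(To));
    return to;
}

inline uint32_t float_to_bits(float f) { return bit_cast_copy<uint32_t>(f); }
inline float bits_to_float(uint32_t u) { return bit_cast_copy<float>(u); }

// Small usage example: 1.0f is 0x3F800000 in IEEE 754 binary32,
// and converting there and back preserves the value.
inline void bit_cast_self_check() {
    assert(float_to_bits(1.0f) == 0x3F800000u);
    assert(bits_to_float(float_to_bits(0.15625f)) == 0.15625f);
}
```

Optimizing compilers typically reduce the fixed-size `memcpy` to a register move, but unoptimized builds may emit a real call, which fits the concern above about debug mode getting slower.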