Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix x86 SSE4.1 ptestnzc #3216

Merged
merged 1 commit into from
Dec 9, 2023
Merged

Fix x86 SSE4.1 ptestnzc #3216

merged 1 commit into from
Dec 9, 2023

Conversation

eduardosm
Copy link
Contributor

Fixes ptestnzc by bringing back the original implementation of #3214.

(op & mask) != 0 && (op & mask) == !ask need to be calculated for the whole vector. It cannot be calculated for each element and then folded.

For example, given

  • op = [0b100, 0b010]
  • mask = [0b100, 0b110]

The correct result would be:

  • op & mask = [0b100, 0b010]
    Comparisons are done on the vector as a whole:
  • all_zero = (op & mask) == [0, 0] = false
  • masked_set = (op & mask) == mask = false
  • !all_zero && !masked_set = true correct result

The previous method:

  • op & mask = [0b100, 0b010]
    Comparisons are done element-wise:
  • all_zero = (op & mask) == [0, 0] = [true, true]
  • masked_set = (op & mask) == mask = [true, false]
  • !all_zero && !masked_set = [true, false]
    After folding with AND, the final result would be false, which is incorrect.

`(op & mask) == 0` and `(op & mask) == mask` need each to be calculated for the whole vector.

For example, given
* `op = [0b100, 0b010]`
* `mask = [0b100, 0b110]`

The correct result would be:
* `op & mask = [0b100, 0b010]`
Comparisons are done on the vector as a whole:
* `all_zero = (op & mask) == [0, 0] = false`
* `masked_set = (op & mask) == mask = false`
* `!all_zero && !masked_set = true`

The previous method:
`op & mask = [0b100, 0b010]`
Comparisons are done element-wise:
* `all_zero = (op & mask) == [0, 0] = [true, true]`
* `masked_set = (op & mask) == mask = [true, false]`
* `!all_zero && !masked_set = [true, false]`
After folding with AND, the final result would be `false`, which is incorrect.
@RalfJung
Copy link
Member

RalfJung commented Dec 9, 2023

Ah... unfortunate. I guess we could still express this as a fold if we wanted (by folding over a pair of booleans), but it doesn't seem worth it.

@bors r+

@bors
Copy link
Collaborator

bors commented Dec 9, 2023

📌 Commit 092eb11 has been approved by RalfJung

It is now in the queue for this repository.

@bors
Copy link
Collaborator

bors commented Dec 9, 2023

⌛ Testing commit 092eb11 with merge 12c868a...

@bors
Copy link
Collaborator

bors commented Dec 9, 2023

☀️ Test successful - checks-actions
Approved by: RalfJung
Pushing 12c868a to master...

1 similar comment
@bors
Copy link
Collaborator

bors commented Dec 9, 2023

☀️ Test successful - checks-actions
Approved by: RalfJung
Pushing 12c868a to master...

@bors bors merged commit 12c868a into rust-lang:master Dec 9, 2023
8 checks passed
@eduardosm eduardosm deleted the fix-ptestnzc branch December 9, 2023 11:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants