Narrow utf16 to ascii #39509

Closed
pgovind wants to merge 1 commit

pgovind commented Jul 17, 2020

This is ready for review now.

NOTE: There is a bug somewhere in this change that I haven't been able to find, so I'd appreciate another pair of eyes on the code. When I run the tests on WSL2, I get an error saying "Can't find file xunit.console.deps.json", even though that file does exist. The real error appears to be that the xunit infrastructure itself calls into the method being modified here and something goes wrong (maybe marshalling the path from a string to native code also uses this code? I'm not sure). As far as I can tell, the changes are right.

ghost commented Jul 23, 2020

Tagging subscribers to this area: @tannergooding
See info in area-owners.md if you want to be subscribed.

}
else if (AdvSimd.Arm64.IsSupported)
{
    if (ContainsNonAsciiValue(AdvSimd.AddSaturate(utf16VectorFirst.AsUInt16(), asciiMaskForAddSaturate)))
Contributor

If I understand the original Sse2 algorithm correctly, this tests whether any of the characters in utf16VectorFirst falls outside the [0, 0x7f] value range, right?

I don't think the Arm64 algorithm should copy exactly what Sse2 does.

I believe the following should be enough:

(AdvSimd.Arm64.MaxPairwise(utf16VectorFirst.AsUInt16(), utf16VectorFirst.AsUInt16()).AsUInt64().ToScalar() & 0xFF80FF80FF80FF80UL) != 0
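
For readers following along, here is a minimal sketch of why the pairwise-max check is sufficient. It is not code from this PR: the class `AsciiCheckSketch` and both method names are hypothetical, and the scalar model exists only to make the lane arithmetic visible. A UTF-16 code unit is non-ASCII exactly when any bit in 0xFF80 is set, and the unsigned maximum of two code units exceeds 0x7F exactly when at least one of them does, so reducing eight lanes to four pairwise maxima loses no information for this test.

```csharp
using System;
using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.Arm;

public static class AsciiCheckSketch
{
    // Scalar model (hypothetical helper): mimics what a pairwise max of (v, v)
    // leaves in the low 64 bits, i.e. max(v[0],v[1]), max(v[2],v[3]),
    // max(v[4],v[5]), max(v[6],v[7]) packed as four 16-bit lanes.
    public static bool ContainsNonAsciiScalarModel(ReadOnlySpan<ushort> eightChars)
    {
        ulong low64 = 0;
        for (int i = 0; i < 4; i++)
        {
            ushort pairMax = Math.Max(eightChars[2 * i], eightChars[2 * i + 1]);
            low64 |= (ulong)pairMax << (16 * i);
        }
        // Any bit above 0x7F in any lane means a non-ASCII code unit was present.
        return (low64 & 0xFF80FF80FF80FF80UL) != 0;
    }

    // Intrinsic version, with the scalar model as a fallback on other hardware.
    public static bool ContainsNonAscii(Vector128<ushort> chars)
    {
        if (AdvSimd.Arm64.IsSupported)
        {
            // The comparison must be unsigned: with signed lanes, 0x8000 and above
            // would compare as negative and could lose to an ASCII neighbor.
            Vector128<ushort> maxChars = AdvSimd.Arm64.MaxPairwise(chars, chars);
            return (maxChars.AsUInt64().ToScalar() & 0xFF80FF80FF80FF80UL) != 0;
        }

        Span<ushort> lanes = stackalloc ushort[8];
        for (int i = 0; i < 8; i++)
        {
            lanes[i] = chars.GetElement(i);
        }
        return ContainsNonAsciiScalarModel(lanes);
    }
}
```

Note the parentheses around the `&` expression: in C#, `!=` binds more tightly than `&`, so without them the mask would be combined with the result of the comparison instead of the scalar.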

Member

I agree.

}
else if (AdvSimd.Arm64.IsSupported)
{
    utf16VectorFirst = AdvSimd.LoadVector128((short*)pUtf16Buffer);
Member

minor: add a comment // unaligned load

}
else if (AdvSimd.Arm64.IsSupported)
{
    if (ContainsNonAsciiValue(AdvSimd.AddSaturate(utf16VectorFirst.AsUInt16(), asciiMaskForAddSaturate)))
Member

I agree.

@jeffhandley
Member

@pgovind I'm going to close this PR since this work is on hold. When we get back around to this work, we can create a fresh PR.

ghost locked this conversation as resolved and limited it to collaborators on Feb 22, 2021