Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Broaden use of SearchValues in TryFindNextPossibleStartingPosition in Regex #89205

Merged
merged 1 commit into from
Jul 20, 2023

Conversation

stephentoub
Copy link
Member

Replaces #89140. That PR changed how we emit custom IndexOf helpers, such that if we could fully enumerate the set and it contained a small enough number of characters, we would have the whole thing vectorized by first doing an ASCII-based search and then falling back to a probabilistic map search. But we then decided to move that approach down into SearchValues itself, via #89155. This means we can simplify TryFindNextPossibleStartingPosition in Regex to not track AsciiSet specially and instead just increase the number of characters we query the set for (from 5 to 128). That way, we'll use SearchValues rather than emitting our own helper up until a (semi-arbitrary) point where we deem it impossible or infeasible to enumerate all the chars that make up the set.

… Regex

SearchValues has been updated to have an ASCII fast-path for inputs that are not only ASCII.  This means we can simplify TryFindNextPossibleStartingPosition in Regex to not track AsciiSet specially and instead just increase the number of characters we query the set for (from 5 to 128).  That way, we'll use SearchValues rather than emitting our own helper up until a (semi-arbitrary) point where we deem it impossible or infeasible to enumerate all the chars that make up the set.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants