Expand small value sets to all case permutations in SearchValues<string> #98902

Merged

MihaZupan
Member

@MihaZupan MihaZupan commented Feb 25, 2024

Implements #98791 (comment)

If we have a set of values like ["ab", "c!"], we can expand it to ["ab", "Ab", "aB", "AB", "c!", "C!"] and switch to case-sensitive searching.
As long as we're making use of buckets that would otherwise have been empty, this is going to be an improvement as the prefix search loop is a bit simpler due to not needing to deal with casing. In the below benchmark, it means eliminating this step from the loop.

This optimization is a bit niche (unlikely to be applicable often), but it's really cheap to check whether it could apply, and can help cases with many non-letter characters in their prefixes.

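using System;
using System.Buffers;
using BenchmarkDotNet.Attributes;

public class IgnoreCaseToOrdinal
{
    // Small case-insensitive set: only 6 case permutations in total.
    private static readonly SearchValues<string> s_values = SearchValues.Create(["ab", "c!"], StringComparison.OrdinalIgnoreCase);
    // Input with no matches, so the benchmark measures only the prefix search loop.
    private readonly string _text = new('\n', 1000);

    [Benchmark]
    public int IndexOfAny() => _text.AsSpan().IndexOfAny(s_values);
}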
| Method | Toolchain | Mean | Error | Ratio |
|---|---|---|---|---|
| IndexOfAny | \main\corerun.exe | 82.42 ns | 0.596 ns | 1.00 |
| IndexOfAny | \pr\corerun.exe | 67.83 ns | 0.369 ns | 0.82 |
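
For illustration, the expansion described above could be sketched roughly as follows. This is not the code from this PR: `CasePermutations`, `TryExpand`, and the `maxExpandedValues` budget are hypothetical names, and the sketch assumes ASCII-only values. It first does the cheap applicability check (each ASCII letter doubles the number of permutations for a value) and only then generates the permutations.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not the runtime's implementation.
public static class CasePermutations
{
    // Returns every ASCII case permutation of the values, or null when the
    // values contain non-ASCII characters or the expanded set would exceed
    // maxExpandedValues (an assumed budget, not a number taken from the PR).
    // Duplicates (e.g. from passing both "ab" and "AB") are not removed.
    public static string[]? TryExpand(IReadOnlyList<string> values, int maxExpandedValues)
    {
        // Cheap check: count permutations without allocating anything.
        long total = 0;
        foreach (string value in values)
        {
            long count = 1;
            foreach (char c in value)
            {
                if (!char.IsAscii(c)) return null;
                if (char.IsAsciiLetter(c)) count *= 2;
                if (count > maxExpandedValues) return null;
            }

            total += count;
            if (total > maxExpandedValues) return null;
        }

        var result = new List<string>((int)total);
        foreach (string value in values)
        {
            // Start from the lowercase form so each letter has exactly two variants.
            Expand(value.ToLowerInvariant().ToCharArray(), 0, result);
        }

        return result.ToArray();
    }

    private static void Expand(char[] chars, int index, List<string> result)
    {
        if (index == chars.Length)
        {
            result.Add(new string(chars));
            return;
        }

        char c = chars[index];

        // Variant 1: keep the character as it is.
        Expand(chars, index + 1, result);

        // Variant 2: for letters, also try the uppercase form, then restore.
        if (char.IsAsciiLetterLower(c))
        {
            chars[index] = char.ToUpperInvariant(c);
            Expand(chars, index + 1, result);
            chars[index] = c;
        }
    }
}
```

With a budget of 8, `CasePermutations.TryExpand(["ab", "c!"], 8)` returns the six values listed above, while `["abcd"]` is rejected because it alone has 16 permutations. The result could then be passed to `SearchValues.Create` with `StringComparison.Ordinal`.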

@MihaZupan MihaZupan added this to the 9.0.0 milestone Feb 25, 2024
@MihaZupan MihaZupan self-assigned this Feb 25, 2024
@ghost

ghost commented Feb 25, 2024

Tagging subscribers to this area: @dotnet/area-system-buffers
See info in area-owners.md if you want to be subscribed.

Issue Details

Implements #98791 (comment)

If we have a set of values like ["ab", "c!"], we can expand it to ["ab", "Ab", "aB", "AB", "c!", "C!"] and switch to case-sensitive searching.
As long as we're making use of buckets that would otherwise have been empty, this is going to be an improvement as both the prefix search loop and the verification steps are a bit simpler due to not needing to deal with casing.

This optimization is a bit niche (unlikely to be applicable often), but it's really cheap to check whether it could apply, and can help cases with many non-letter characters in their prefixes.

Author: MihaZupan
Assignees: MihaZupan
Labels: area-System.Buffers

Milestone: 9.0.0

@MihaZupan MihaZupan force-pushed the searchvalues-string-shortCasePermutations branch from 54e8e3d to ac5966c Compare March 2, 2024 03:13
@MihaZupan MihaZupan merged commit 8aff565 into dotnet:main Mar 2, 2024
178 checks passed
@github-actions github-actions bot locked and limited conversation to collaborators Apr 2, 2024