-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Limit PNSE by using default comparer in HybridGlobalization
#96541
Conversation
Tagging subscribers to this area: @dotnet/area-system-globalization Issue DetailsImproved version of #96354. Changes:
|
@@ -415,6 +416,17 @@ private uint InitHash(object key, int hashsize, out uint seed, out uint incr) | |||
// | |||
public virtual void Add(object key, object? value) | |||
{ | |||
#if TARGET_BROWSER | |||
if (_keycomparer == null) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to move this logic to constructor, because there are other methods which are able to modify the collection. like public virtual object? this[object key]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also please link the issue which explains motivations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How can we move it to constructor when we don't know what type will be added to Hashtable
? If we add int
then it will throw because we set comparer to StringComparer
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could somebody add "1"
string and then 2
int ? That would fail anyway, right ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, the problem is that for Hashtable
it's perfectly allowed, even when not recommended.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even if we assume that users don't mix types, constructor does not know what types will be put into the non-generic collection.
public void IListedKeysPropertyCanUseCustomEqualityComparer() | ||
{ | ||
var orderedDictionary = new OrderedDictionary(StringComparer.InvariantCultureIgnoreCase); | ||
var orderedDictionary = PlatformDetection.IsHybridGlobalizationOnBrowser ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please link the issue which explains this
#if TARGET_BROWSER | ||
if (GlobalizationMode.Hybrid) | ||
{ | ||
_comparer = comparer ?? (IEqualityComparer<T>)(IEqualityComparer<string>)StringComparer.Ordinal; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also please link the issue which explains motivations.
#if TARGET_BROWSER | ||
if (GlobalizationMode.Hybrid) | ||
{ | ||
_comparer = comparer ?? (IEqualityComparer<TKey>)(IEqualityComparer<string>)StringComparer.Ordinal; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also please link the issue which explains motivations.
yield return new object[] { 10, null }; | ||
if (PlatformDetection.IsNotHybridGlobalizationOnBrowser) | ||
{ | ||
yield return new object[] { 0, null }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why this doesn't work ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because comparer is null, so the default one will be used. We don't have any means of checking if the run is HybridGlobalization or not in System.Collections.Specialized, so all the responsibility lies on the user to use the correct constructor. Here in the test the incorrect one is used.
@@ -72,7 +73,18 @@ public Dictionary(int capacity, IEqualityComparer<TKey>? comparer) | |||
if (typeof(TKey) == typeof(string) && | |||
NonRandomizedStringEqualityComparer.GetStringComparer(_comparer!) is IEqualityComparer<string> stringComparer) | |||
{ | |||
#if TARGET_BROWSER |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It sounds pretty broken to be changing behavior of generic collections when hybrid globalization is enabled. Can this be fixed in the string comparers instead, so that anybody using the string comparers gets the right behavior?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the idea but so far the implementation of it that I have in my head is also messy. So we could go to
public abstract class StringComparer : IComparer, IEqualityComparer, IComparer<string?>, IEqualityComparer<string?> |
and for each comparer option (
InvariantCulture
, InvariantCultureIgnoreCase
, etc) make a preprocessor directive #if TARGET_BROWSER
then use Ordinal
or OrdinalIgnoreCase
instead. However, it is also problematic:
- All the usages of these
StringComparer
types, even in the context of collections that are currently not affected byPNSE
will be changed. - User that calls
StringComparer.CurrentCulture
expects current culture, notOrdinal
(this is public member). This can be documented but does not look good.
Or maybe I missed the point and you had a different idea, @jkotas
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding we are playing tradeoffs here. Hybrid is partial between ICU and InvariantCulture.
StringComparer.CurrentCulture
expects locale, but we can't do it because we don't have hybrid hashcode, right ?
So we could keep PNSE or we could do InvariantCulture
, that's our tradeoff.
Or is Ordinal
better ? Why ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because it does not use ICU so we don't need native SortKey
implementation to keep it working as expected. The idea was: only Ordinal
and OrdinalIgnoreCase
are able to work in Hybrid Globalization
, so use them always for collections that wanted to use other options and thanks to that, keep these collections supported. If we "keep PNSE or we could do InvariantCulture" (InvariantCulture
throws the same way as any other culture, because it uses CompareInfo
implementation) then we would just stick to the current behavior. And we don't want it, we would like to make collections listed in the description of this PR available to the users.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe you're mixing InvariantCulture
and InvariantGlobalization
. "Hybrid is partial between ICU and InvariantCulture." -> "Hybrid is partial between ICU and InvariantGlobalization."
When we use InvariantCulture
we still use the same paths in the code as for ICU
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are hundreds of different types of custom collections out there in the ecosystem, and many of them are affected the same way as the few collections that you are changing in this PR. It is not ok to break all these custom collections out there with hybrid globalization. I would be a major violation of our code portability and compatibility promise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jkotas do you mean the other solution that I proposed is also invalid? If so, I am really out of ideas
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we go back to the drawing board with the hybrid globalization then? Given the feature gaps in the hybrid globalization, it may make more sense to think about it as invariant globalization with a few extra culture aware APIs working.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Due to limitations of Web API the Hybrid Globalization feature in the Browser in fact turned out to be less promising than e.g. iOS's implementation where we were able to really maintain ICU-like experience with Invariant-like size. However, the unsupported scope is not that huge, we are listing all the limitations in a doc (https://github.com/dotnet/runtime/blob/928ff3015d1936ca9985a0123754e13cbf47b237/docs/design/features/globalization-hybrid-mode.md). I do not believe a re-design would help in the issue we have with SortKey
. We know we cannot support it nativity (and loading it from ICU does not pay off in this point, we would loose all the benefits of collations removal). Now we have a brainstorm how to solve it some other way.
Improved version of #96354. Fixes #96400.
Changes:
comparer=StringComparer.Ordinal
for string-keyed collections that have access toGlobalizationMode
(= are inSystem.Private.CoreLib
):System.Collections.Generic.Dictionary
,System.Collections.Generic.HashSet
,System.Collections.Hashtable
Dictionary
andHashSet
are generic collections and we know their key type in the compile time. We can decide if the collection requires the default comparer (has keys of string type) or not, in collection constructor.Hashtable
is non-generic, types of keys are not known in the compile time. Because of this, logic of comparer initialization was moved toAdd()
function. If the first element is string, we assume all the other will be strings.Hashtable
does not prevent users from adding keys of different types and it's only a good practice to keep them in the same type. InHybridGlobalization
keys of different types are not supported, adding non-string key to collection that hascomparer=StringComparer
will throwArgumentException(Arg_MustBeString)
.System.Private.CoreLib
assembly (that is:System.Collections.Specialized.OrderedDictionary
,System.Collections.Specialized.NameValueCollection
,System.Collections.Specialized.NameValueCollection
,System.Collections.Specialized.NameObjectCollectionBase
) we cannot detect if they are run inHybrid Globalization
mode. It is still possible to use these collections ifStringComparer.Ordinal
orStringComparer.OrdinalIgnoreCase
will be passed to their constructors. We change the tests to use the constructor with parameter for HG.StringComparer.Ordinal
orStringComparer.OrdinalIgnoreCase
parameter.System.Data.DataColumnCollection
,Microsoft.VisualBasic.Collection
,System.Net.Mail.MailAddress
,System.Collections.CaseInsensitiveHashCodeProvider
(deprecated) have no constructor with parameter and are not inSystem.Private.CoreLib
, so none of above apply to them. They will stay unsupported.