[API Proposal]: Add a generic OrderedDictionary class #24826
Comments
📝 Edit: (And removing my related comments that followed) I was not aware of the semantics of the non-generic |
@sharwell |
Related to dotnet/corefx#26638 |
Would the implementation optimize for retrieval (presumably array + hashtable) or for space (presumably a tree/heap)? Are there scenarios for both? What complexity do you hope for in lookup, Add, Remove, and ContainsKey (and ContainsValue if we have that)? [snipped example table in favor of @TylerBrinkley's below] |
Thanks for putting that chart together. The implementation will be nearly identical to Below is the chart filled out
|
💭 I almost prefer the indexer for this type be based on the "bias" implied by the type name. To me, |
If we only have one I'd agree that the |
FYI, if this gets approved I'd happily implement it. |
I'd also be interested in seeing generic variant of Thank you in advance for considering this. |
Hello, I too would like to see a generic version of this, and have it moved to the and so on.. So it would be nice to have this type implemented in the framework. |
I still think it would be beneficial to have Immutable Parent/Child Collection, but understand it is not specifically related to this thread. |
Just had need for this today. Any news if/when this will be implemented? |
Moved to corefxlab#2456 as part of the specialized collections initiative. |
Reopening in place of #29570. For context this has already been included in corefxlab via dotnet/corefxlab#2525 |
Bump |
Any news? |
We have no plans on adding such a type in our immediate roadmap. We will post an update on this thread as soon as anything changes. |
This has been around in various forms and other issues for years now. Is there really no priority in closing a glaring gap in Microsoft Collections? Especially given that it has been done time and time again, and with a general messiness in its placement: moved from one repo to another, without a stable NuGet package available. I dislike having to resort to third-party implementations for stuff like this. |
You're right to point out that the primary bottleneck is getting the proposal lined up for API review. This does require some prep-work, including evaluating prototypes and ensuring that the API shape is on par with other designs that have already been shipped. Not all API proposals are created equal from a complexity standpoint however, and championing brand new collection types can be a long-running and expensive process. It's unlikely we could get around to such a proposal unless it registers high in our prioritization. |
Thanks for the reply! I totally understand and agree with most of what you bring up, save for this point:
This is the part I was really getting at. What would make it register high in your prioritization? As I mentioned, it's unclear to me how e.g. frozen collections could have registered higher going by the indicators I posted. (I don't say that to knock them, by the way - I use them myself! - they're just an example to illustrate the point.) If the answer is "we simply decided we really cared about maximizing read performance for collections in that release cycle" then, hey, fair enough. Like I said, I'm just looking for some insight into the process here. I think a lot of folks have the impression that prioritization of API proposals is to a large extent driven by the volume of demand. If that isn't the case, I think it would be good to just clarify what the different factors are, and maybe their relative importance. |
While our planning does take upvotes into consideration, it is not the only driving factor. In the interest of transparency, frozen collections were added because they were a first-party team requirement at the time. There are other factors as well: as you mention an implementation is already available via a NuGet package which plays a role as well. Not everything needs to be part of the BCL, or at least it doesn't urgently need to be part of the BCL. |
It goes without saying that our resources aren't infinite and our backlog is substantial. Oftentimes we might not invest in collections at all in a particular release cycle, simply because the team is pursuing different opportunities. |
To this particular point:
This is true of course, but with some caveats: Microsoft.Experimental.Collections is pre-release, so you will get NU5104 if you use package validation. There's also the fact that, with corefxlab archived, the package is deprecated and unmaintained, and even finding the source code requires a fair bit of digging. To be fair, nothing stops anyone from taking that code and publishing a new package. But I suspect part of the problem here is that, rightly or wrongly, .NET doesn't really have a culture of publishing small utility packages that are narrowly focused on one specific thing, as you'd see in e.g. Node.js and Rust. And on top of that, maintainers of libraries don't seem to like taking dependencies on such small utility packages. So, many who aren't using Microsoft.Experimental.Collections just end up copying an implementation into their project. I think several of the links I provided earlier at least partially substantiate this line of thinking. |
I wonder why this is. But that is off topic .. |
We should just do this. As has been noted, there are a plethora of implementations floating around, including very close to home in System.IO.Packaging, EF Core, WCF, MAUI, and WPF, and, as also noted, a multitude of implementations in a myriad of other projects. We can do it once in the core libraries and avoid all that duplication, for something where we already have a non-generic implementation and just need a generic one. We can also start with a more minimal surface area and add to it in the future if we're missing anything. Some notes on the original proposal:
I've updated the top proposal and marked it ready for review. |
When |
The ambiguity is that there are then two overloads with the exact same arguments that do two completely different things. For example, this will successfully augment a histogram:
public static void AddToHistogram(OrderedDictionary<string, int> counts, IEnumerable<string> source)
{
    foreach (var item in source) counts[item] = counts.TryGetValue(item, out int count) ? count + 1 : 1;
}
but this, with the exact same method body, will likely either blow up or produce meaningless results:
public static void AddToHistogram(OrderedDictionary<int, int> counts, IEnumerable<int> source)
{
    foreach (var item in source) counts[item] = counts.TryGetValue(item, out int count) ? count + 1 : 1;
} |
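A minimal sketch of the binding behavior discussed above, using a hypothetical TwoIndexerMap<TKey> type (invented for illustration, not part of the proposal) that declares both a key indexer and a positional indexer. It assumes the usual C# overload resolution tie-break, where a non-generic int parameter is preferred over a type parameter, so with int keys the expression map[10] is treated as a position rather than a key.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type for illustration only; it is not the proposed API.
public sealed class TwoIndexerMap<TKey> where TKey : notnull
{
    private readonly List<TKey> _keys = new();              // keys in insertion order
    private readonly Dictionary<TKey, int> _values = new(); // key -> value

    public void Add(TKey key, int value)
    {
        _values.Add(key, value); // throws on a duplicate key before the order list is touched
        _keys.Add(key);
    }

    // Key-based indexer.
    public int this[TKey key] => _values[key];

    // Position-based indexer. When TKey is int, calls with an int argument bind here.
    public int this[int index] => _values[_keys[index]];
}
```

With string keys the two indexers never collide. With int keys, a map holding the keys 10 and 20 answers map[0] with the value stored under 10, and map[10] fails with an out-of-range error instead of looking up the key 10, which is the failure mode the histogram example runs into.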
Thanks, yeah I agree it would likely cause issues for some users and using the |
namespace System.Collections.Generic;
public class OrderedDictionary<TKey, TValue> :
    IDictionary<TKey, TValue>, IReadOnlyDictionary<TKey, TValue>, IDictionary,
    IList<KeyValuePair<TKey, TValue>>, IReadOnlyList<KeyValuePair<TKey, TValue>>, IList
    where TKey : notnull
{
    public OrderedDictionary();
    public OrderedDictionary(int capacity);
    public OrderedDictionary(IEqualityComparer<TKey>? comparer);
    public OrderedDictionary(int capacity, IEqualityComparer<TKey>? comparer);
    public OrderedDictionary(IDictionary<TKey, TValue> dictionary);
    public OrderedDictionary(IDictionary<TKey, TValue> dictionary, IEqualityComparer<TKey>? comparer);
    public OrderedDictionary(IEnumerable<KeyValuePair<TKey, TValue>> collection);
    public OrderedDictionary(IEnumerable<KeyValuePair<TKey, TValue>> collection, IEqualityComparer<TKey>? comparer);
    public IEqualityComparer<TKey> Comparer { get; }
    public OrderedDictionary<TKey, TValue>.KeyCollection Keys { get; }
    public OrderedDictionary<TKey, TValue>.ValueCollection Values { get; }
    public int Count { get; }
    public TValue this[TKey key] { get; set; }
    public void Add(TKey key, TValue value);
    public void Clear();
    public bool ContainsKey(TKey key);
    public bool ContainsValue(TValue value);
    public KeyValuePair<TKey, TValue> GetAt(int index);
    public OrderedDictionary<TKey, TValue>.Enumerator GetEnumerator();
    public int IndexOf(TKey key);
    public void Insert(int index, TKey key, TValue value);
    public bool Remove(TKey key);
    public bool Remove(TKey key, [MaybeNullWhen(false)] out TValue value);
    public void RemoveAt(int index);
    public void SetAt(int index, TValue value);
    public void SetAt(int index, TKey key, TValue value);
    public void TrimExcess();
    public bool TryGetValue(TKey key, [MaybeNullWhen(false)] out TValue value);
    public struct Enumerator : IEnumerator<KeyValuePair<TKey, TValue>>
    {
        public KeyValuePair<TKey, TValue> Current { get; }
        public void Dispose();
        public bool MoveNext();
    }
    public sealed class KeyCollection : IList<TKey>, IReadOnlyList<TKey>, IList
    {
        public int Count { get; }
        public bool Contains(TKey key);
        public void CopyTo(TKey[] array, int arrayIndex);
        public OrderedDictionary<TKey, TValue>.KeyCollection.Enumerator GetEnumerator();
        public struct Enumerator : IEnumerator<TKey>
        {
            public TKey Current { get; }
            public bool MoveNext();
            public void Dispose();
        }
    }
    public sealed class ValueCollection : IList<TValue>, IReadOnlyList<TValue>, IList
    {
        public int Count { get; }
        public void CopyTo(TValue[] array, int arrayIndex);
        public OrderedDictionary<TKey, TValue>.ValueCollection.Enumerator GetEnumerator();
        public struct Enumerator : IEnumerator<TValue>
        {
            public TValue Current { get; }
            public bool MoveNext();
            public void Dispose();
        }
    }
} |
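To make the index-based members in the listing above concrete, here is a rough sketch of their semantics as the proposal's API details describe them: Insert accepts an index equal to Count, SetAt(int, TValue) requires an index below Count, and positional edits are O(n). The OrderedMapSketch type and its naive List + Dictionary backing are assumptions made for illustration. This is the roll-your-own approach the proposal criticizes, not the proposed implementation, and it covers only a subset of the members.

```csharp
using System;
using System.Collections.Generic;

// Illustrative only: a naive List + Dictionary backing, not the proposed implementation.
public sealed class OrderedMapSketch<TKey, TValue> where TKey : notnull
{
    private readonly List<TKey> _order = new();                 // keys in insertion order
    private readonly Dictionary<TKey, TValue> _values = new();  // key -> value

    public int Count => _order.Count;

    public TValue this[TKey key]
    {
        get => _values[key];
        set
        {
            if (!_values.ContainsKey(key)) _order.Add(key);     // new keys go to the end
            _values[key] = value;
        }
    }

    public void Add(TKey key, TValue value)
    {
        _values.Add(key, value);   // throws ArgumentException on a duplicate key
        _order.Add(key);
    }

    // index may equal Count, in which case the entry is appended.
    public void Insert(int index, TKey key, TValue value)
    {
        if ((uint)index > (uint)_order.Count) throw new ArgumentOutOfRangeException(nameof(index));
        _values.Add(key, value);
        _order.Insert(index, key); // O(n): shifts the keys after index
    }

    public KeyValuePair<TKey, TValue> GetAt(int index)
    {
        TKey key = _order[index];  // throws if index >= Count
        return new KeyValuePair<TKey, TValue>(key, _values[key]);
    }

    // index must be less than Count; the key at that position is kept.
    public void SetAt(int index, TValue value) => _values[_order[index]] = value;

    // O(n) in this sketch because the position is found by scanning the list.
    public int IndexOf(TKey key) => _values.ContainsKey(key) ? _order.IndexOf(key) : -1;

    public bool Remove(TKey key)
    {
        if (!_values.Remove(key)) return false;
        _order.Remove(key);        // O(n): scans and shifts
        return true;
    }

    public void RemoveAt(int index)
    {
        TKey key = _order[index];
        _order.RemoveAt(index);    // O(n): shifts the keys after index
        _values.Remove(key);
    }
}
```

Keeping the keys twice, once in the list and once in the dictionary, is the memory overhead the proposal mentions. A purpose-built type can keep a single entries array and have the hash buckets point into it.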
@stephentoub @terrajobst
public void Move(int oldIndex, int newIndex);
public void MoveRange(int fromIndex, int toIndex, int count);
public void Sort(IComparer<TValue> comparer); |
Also consider these methods, that can be more efficient if implemented in the type itself:
int EnsureCapacity(int capacity)
void TrimExcess(int capacity)
TValue GetOrAdd(TKey key, TValue value)
TValue GetOrAdd(TKey key, Func<TValue> valueFactory)
bool TryAdd(TKey key, TValue value) |
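One way to see why a built-in GetOrAdd could be cheaper: written from outside the type, it has to hash and probe the key twice on a miss, once in TryGetValue and again in Add. The sketch below is a hypothetical extension method over IDictionary<TKey, TValue> (the class and method names are invented here), shown only to illustrate that double lookup.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of the proposal or the BCL.
public static class GetOrAddSketch
{
    public static TValue GetOrAdd<TKey, TValue>(
        this IDictionary<TKey, TValue> dictionary, TKey key, Func<TValue> valueFactory)
    {
        // First lookup: hashes the key and probes the table.
        if (dictionary.TryGetValue(key, out var existing))
        {
            return existing;
        }

        TValue created = valueFactory();

        // Second lookup: Add hashes and probes again to find the insertion slot.
        dictionary.Add(key, created);
        return created;
    }
}
```

A member implemented inside the collection can find the slot once and either return the stored value or fill the slot, which is the efficiency the comment above refers to.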
EDITED on 4/10/2024 by @stephentoub to update proposal
I've often needed a Dictionary where the insertion order of the elements matters. Unfortunately, .NET does not currently have a generic OrderedDictionary class. We've had a non-generic OrderedDictionary class since .NET Framework 2.0, which oddly enough was when generics were added, but no generic equivalent. This has forced many to roll their own solution, typically by combining a List and a Dictionary field, which gives the worst of both worlds in terms of performance and uses more memory. Even worse, some users instead rely on implementation details of Dictionary for ordering, which is quite dangerous.
Proposed API
Perhaps one reason no generic OrderedDictionary was added initially is the problem of having both a key indexer and an index indexer when the key is an int: a call to the indexer would be ambiguous. Roslyn prefers the non-generic parameter, so in this case the index indexer will be called.
API Details
- Insert allows index to be equal to Count to insert the element at the end.
- SetAt(int index, TValue value) requires index to be less than Count, but SetAt(int index, TKey key, TValue value) allows index to be equal to Count, similar to Insert.
- Complexity is the same as Dictionary for all operations except Remove, which will necessarily be O(n). Insert and RemoveAt, which aren't members of Dictionary, will also be O(n).
Open Questions
- Should this be in System.Collections.Generic when it could easily be in System.Collections.Specialized, where the non-generic version is located? I just felt this collection is far too useful to be relegated to that namespace.
- Should ICollection, IList, and IOrderedDictionary be implemented?
Updates
- IEnumerable<KeyValuePair<TKey, TValue>>.
- ContainsValue method due to being needed for the ValueCollection.Contains method.