Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move off of ImmutableDictionary #47637

Open
333fred opened this issue Sep 11, 2020 · 9 comments
Open

Move off of ImmutableDictionary #47637

333fred opened this issue Sep 11, 2020 · 9 comments

Comments

@333fred
Copy link
Member

333fred commented Sep 11, 2020

Our use of ImmutableDictionary in roslyn does not match up well with the design goals of ImmutableDictionary, and causes excessive overhead because of it. ImmutableDictionary is designed mostly for the ability to minimize copies when elements are added or removed, whereas the roslyn use case is mostly around adding a bunch of elements to a view and then freezing it so that it can be safely shared between threads.

I know @agocke was investigating this area before moving teams. Andy, can you comment on your investigations in this area?

/cc @CyrusNajmabadi @benaadams

@sharwell
Copy link
Member

I'm working on a SegmentedDictionary<TKey, TValue> based on Dictionary<TKey, TValue> but backed by SegmentedArray<T>. From there, I plan to implement an ImmutableSegmentedDictionary<TKey, TValue> wrapper that we can use instead of ImmutableDictionary<TKey, TValue>.

@mjsabby
Copy link
Contributor

mjsabby commented Sep 11, 2020

@danmosemsft Maybe this should be a theme for .NET 6 ... not have data structures that perform poorly with LOH (which I think is the reason for SegmentedArray).

@danmoseley
Copy link
Member

I'd be interested in @jkotas thoughts on that since we talked about that in the past.

@agocke
Copy link
Member

agocke commented Sep 11, 2020

The results of my investigation where:

Roslyn has very simple requirements. It need a dictionary which is something close to ImmutableArray (create once, modify never again) and not ImmutableList (ability to add items and produce a new list afterwards). In addition, the number of necessary APIs was very small, basically just TryGetValue. The most important metrics were size and speed, size being the overhead of storing elements, and speed being the time to retrieve (or fail to retrieve) an element.

I ended up creating a new data structure to fulfill these requirements, which I ended up calling FrozenDictionary. I'll try to find a copy and create a PR to Roslyn. It manages to be slightly faster than Dictionary, which is much faster than ImmutableDictionary, and much smaller than both of them.

In terms of general guidance, I think the framework could consider a set of "frozen" data structures. I'd also put ImmutableArray in corelib. 😄

@sharwell
Copy link
Member

I submitted a drop-in replacement as #47641

@danmoseley
Copy link
Member

It manages to be slightly faster than Dictionary

That's interesting I"d like to see how it does that.

@jkotas
Copy link
Member

jkotas commented Sep 12, 2020

not have data structures that perform poorly with LOH (which I think is the reason for SegmentedArray).
I'd be interested in @jkotas thoughts on that since we talked about that in the past.

Related discussion: #43464 (comment)

I believe that avoiding LOH allocations everywhere would have very similar effect (higher allocation rate in Gen0) as setting the LOH threshold to a high number. Workloads that have issues with LOH allocations can configure the LOH threshold today, without changing any code.

Instead of chasing LOH allocations, I think it is better to focus on predictable performance. Predictable performance was an epic for .NET 5 and I expect it will continue to be our focus going forward.

Dictionary and other similar hashtable-based collections are optimized for best average performance, but they do not have predictable performance due to rehash stalls. I think it would be interesting to have a set of collections that are optimized for best worst case per-operation performance. https://github.com/tunnelvisionlabs/dotnet-trees has some good ideas.

@svick
Copy link
Contributor

svick commented Jan 9, 2024

FrozenDictionary is now part of System.Collections.Immutable 8.0. Does it make sense to update the package and switch to using that type?

@sharwell
Copy link
Member

sharwell commented Jan 9, 2024

@svick Roslyn doesn't control the timing of when we update System.Collections.Immutable. With that said, I'm not sure it would outperform the ImmutableSegmentedDictionary<TKey, TValue> which we already have, especially if we start adding something like #62193. Anything for which FrozenDictionary<TKey, TValue> is a benefit could be switched to the segmented form today (and for many cases already has). The copy-on-write semantics of these collections means it's easy to switch when the collection is created once and never subsequently modified, but as soon as potential mutations are involved we have to be careful.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

8 participants