Set the exact initial capacity of a List literal #99073

jbevain · 2024-02-28T19:41:06Z

In .NET 7:

List<string> s = ["bonjour"];
Console.WriteLine(s.Capacity);

Will print 1, as it will compile to a call to the List(int) constructor which sets the exact initial capacity, then to a List.Add.

After .NET 8,

This code prints 4 as CollectionsMarshal.SetCount(List,int) calls List.Grow.

It seems more optimal to create a List with an exact initial capacity. This PR is not be ready as there's a hundred scenarios I didn't consider and no test at the moment :)

Alternatively, this PR could be scrubbed and the C# compiler could also call the List(int) ctor.

ghost · 2024-02-28T19:41:16Z

Tagging subscribers to this area: @dotnet/interop-contrib
See info in area-owners.md if you want to be subscribed.

Issue Details

In .NET 7:

List<string> s = ["bonjour"];
Console.WriteLine(s.Capacity);

Will print 1, as it will compile to a call to the List(int) constructor which sets the exact initial capacity, then to a List.Add.

After .NET 8,

This code prints 4 as CollectionsMarshal.SetCount(List,int) calls List.Grow.

This PR restores the previous behavior for this scenario.

Author:	jbevain
Assignees:	-
Labels:	`area-System.Runtime.InteropServices`
Milestone:	-

AaronRobinsonMSFT · 2024-02-28T21:01:08Z

Thanks @jbevain. This seems like a reasonable discussion point. Let me loop in @stephentoub who likely will have a more informed opinion about the intents of the langauge designers and is always looking for excess memory usage.

jbevain · 2024-02-28T21:51:31Z

Related: dotnet/roslyn#72318

stephentoub · 2024-02-28T22:22:11Z

For reference, here's what I wrote in an email earlier today when asked in a related conversation why SetCount behaves the way it does:

Because count != capacity, and for the same reason that new List() { item } yields a count of 1 and a capacity of 4. No one has specified a minimum capacity the list must be grow to initially, and in the absence of that, it follows a normal doubling scheme, starting at 4. It has no idea if you'll be adding more after this.

We discussed whether collection literals should set the capacity, and it comes down to how likely it will be that someone will add more after construction: if no, set capacity, if yes, don't. But the compiler can't always predict that (though it might be able to "see" it in some cases).

AaronRobinsonMSFT · 2024-02-29T01:17:06Z

here's what I wrote in an email earlier today

@stephentoub Based on that snippet, I interepret your opinion to mean the compiler should emit the lower allocation pattern if it can be detected rather than changing the API. Is that fair?

stephentoub · 2024-02-29T14:49:24Z

here's what I wrote in an email earlier today

@stephentoub Based on that snippet, I interepret your opinion to mean the compiler should emit the lower allocation pattern if it can be detected rather than changing the API. Is that fair?

It's more that the behavior for both SetCount and collection literals is as was previously discussed and designed. We could of course choose to revisit.

I still think the behavior for SetCount is correct. The knobs exist on the API surface to separately control the count and capacity, so I don't love the idea of SetCount employing a different heuristic for growth than the rest of the API surface area. If you want the capacity to exactly match the count, you can do either:

var list = new List<T>(count);
CollectionsMarshal.SetCount(list, count);

or:

list.Capacity = count;
CollectionsMarshal.SetCount(list, count);

If you don't choose a specific capacity, then I think having SetCount employ the same growth scheme as the rest of the List<T> surface area is right from the perspective of consistency.

That's separate from how the language/compiler choose to lower collection literals. Because the compiler wouldn't always be able to predict whether a developer was going to add more to the list after it was initialized, we'd decided to just do the simple thing and use list's default growth scheme by not having the compiler set the capacity. If we now believe the 99% case is that someone using a collection literal to construct a List<T> won't ever append to the list after it's initialized, then it would likely make sense to change the lowering to set a capacity equal to the count. If not, what's currently there may make the most sense. And if the latter, and extra care wanted to be paid to analysis to determine whether the list was subsequently updated (e.g. the compiler can see it's a local being initialized, that local never escapes, and no methods are ever called on that local that could increase its size), then the compiler could also choose to special-case that. In any case, that's a question for Roslyn and dotnet/roslyn#72318.

Just for completeness why this matters, if you do:

var list = new List<T>(1) { item };
list.Add(item);
list.Add(item);

that's going to allocate a T[] of length 1, a T[] of length 2, and a T[] of length 4. If you instead do:

var list = new List<T>() { item };
list.Add(item);
list.Add(item);

that's only going to allocate a T[] of length 4. So from a microoptimization perspective, whether additional adds will be performed is relevant to whether setting the capacity is an optimization or deoptimization.

AaronRobinsonMSFT · 2024-02-29T16:57:39Z

@jbevain Given Steve's commentary above and the discussion in dotnet/roslyn#72318, I'm going to convert this PR to Draft for now.

Set the exact initial capacity of a List literal

bcb7a22

dotnet-issue-labeler bot added the area-System.Runtime.InteropServices label Feb 28, 2024

AaronRobinsonMSFT modified the milestones: 9.0.0, Future Feb 28, 2024

This was referenced Feb 28, 2024

CI error: System.Net.Quic.QuicException: The connection timed out from inactivity #91757

Closed

CI test failure: Connection handshake was canceled due to the configured timeout #99074

Closed

AaronRobinsonMSFT marked this pull request as draft February 29, 2024 16:57

jbevain closed this Mar 7, 2024

jbevain deleted the list-literal-initial-capacity branch March 7, 2024 23:55

github-actions bot locked and limited conversation to collaborators Apr 7, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Set the exact initial capacity of a List literal #99073

Set the exact initial capacity of a List literal #99073

jbevain commented Feb 28, 2024 •

edited

Loading

ghost commented Feb 28, 2024

AaronRobinsonMSFT commented Feb 28, 2024

jbevain commented Feb 28, 2024

stephentoub commented Feb 28, 2024 •

edited

Loading

AaronRobinsonMSFT commented Feb 29, 2024

stephentoub commented Feb 29, 2024 •

edited

Loading

AaronRobinsonMSFT commented Feb 29, 2024

Set the exact initial capacity of a List literal #99073

Set the exact initial capacity of a List literal #99073

Conversation

jbevain commented Feb 28, 2024 • edited Loading

ghost commented Feb 28, 2024

AaronRobinsonMSFT commented Feb 28, 2024

jbevain commented Feb 28, 2024

stephentoub commented Feb 28, 2024 • edited Loading

AaronRobinsonMSFT commented Feb 29, 2024

stephentoub commented Feb 29, 2024 • edited Loading

AaronRobinsonMSFT commented Feb 29, 2024

jbevain commented Feb 28, 2024 •

edited

Loading

stephentoub commented Feb 28, 2024 •

edited

Loading

stephentoub commented Feb 29, 2024 •

edited

Loading