Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Collection literals: type inference #68703

Merged
merged 11 commits into from
Jun 23, 2023

Conversation

cston
Copy link
Member

@cston cston commented Jun 20, 2023

@dotnet-issue-labeler dotnet-issue-labeler bot added Area-Compilers untriaged Issues and PRs which have not yet been triaged by a lead labels Jun 20, 2023
@cston cston force-pushed the collections-inference branch 2 times, most recently from a375464 to bf36b25 Compare June 21, 2023 19:16
@cston cston marked this pull request as ready for review June 21, 2023 20:22
@cston cston requested a review from a team as a code owner June 21, 2023 20:22
Copy link
Member

@333fred 333fred left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done review pass (commit 1)

{
var builder = new ForEachEnumeratorInfo.Builder();
BoundExpression collectionExpr = new BoundValuePlaceholder(syntax, collectionType);
return binder.GetEnumeratorInfoAndInferCollectionElementType(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes me thing that we should do this up front, when we do the initial unconverted collection literal info calculation, and not here. If we're in some generic overload scenario where we need to perform a number of type inferences, this is going to become extremely expensive extremely quickly. Needing pass a binder into the MethodTypeInferrer is, to me, a signal that the bound node is missing a critical piece of information.

Copy link
Member Author

@cston cston Jun 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The collection type in this case is the parameter type rather than the collection literal argument type, so we don't have a bound node in the tree that is directly associated with the parameter where we could reasonably cache the iteration type.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels like caching GetEnumeratorInfo in the bound tree for the call site doesn't work because we're doing a type inference for each overload just once and not really revisiting them (not getting cache hits).

We might be able to keep a GetEnumeratorInfo cache on the parameter symbol or the parameter type symbol. But it would only be for non-extension cases. We'd still have to do some call-site-specific work in the extension case because it's dependent on the calling context. Maybe it would still be a good "fast path"?

I feel like it would be good to include an end-to-end test which stresses this code path a bit, to make sure it doesn't fall over at a moderate scale. If we can confirm that, I'd personally be OK with perf work on this being pushed to after feature merge or perhaps even to the "debt payoff" milestone.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do we solve this for tuple literals? Presumably they have a similar problem when you have a signature that takes a (T, T), but they were able to solve it without a binder.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the collection case, where we're inferring from a collection literal passed as an argument to a parameter that is a collection type, we need to determine the iteration type, and that requires a binder in particular for the foreach-able pattern cases. By doing that here, in MethodTypeInferrer, we essentially calculating the iteration type lazily, when we know we need it.

@@ -7228,9 +7228,11 @@ private static NullableAnnotation GetNullableAnnotation(BoundExpression expr)
case BoundKind.UnboundLambda:
case BoundKind.UnconvertedObjectCreationExpression:
case BoundKind.ConvertedTupleLiteral:
case BoundKind.UnconvertedCollectionLiteralExpression:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this will crash IOperation

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It wasn't clear to me why this is OK for UnconvertedObjectCreationExpression, but not for UnconvertedCollectionLiteralExpression.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not. That codepath is either never hit, or only hit during initial nullable analysis of something like an unbound lambda (and if so, we really should have left a comment). If CSharpOperationFactory sees an UnconvertedObjectCreationExpression, it will crash.

@cston cston requested a review from RikkiGibson June 22, 2023 02:45
}
""";
var comp = CreateCompilation(source);
// PROTOTYPE: Should inference succeed?
Copy link
Member Author

@cston cston Jun 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like MethodTypeInferrer.MakeOutputTypeInferences() should walk into collection literals, similar to the behavior for tuples.

See dotnet/csharplang#7293.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is precisely what I was hoping to discover when asking for these tests, so I agree 😄.

@RikkiGibson RikkiGibson self-assigned this Jun 22, 2023
{
var builder = new ForEachEnumeratorInfo.Builder();
BoundExpression collectionExpr = new BoundValuePlaceholder(syntax, collectionType);
return binder.GetEnumeratorInfoAndInferCollectionElementType(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels like caching GetEnumeratorInfo in the bound tree for the call site doesn't work because we're doing a type inference for each overload just once and not really revisiting them (not getting cache hits).

We might be able to keep a GetEnumeratorInfo cache on the parameter symbol or the parameter type symbol. But it would only be for non-extension cases. We'd still have to do some call-site-specific work in the extension case because it's dependent on the calling context. Maybe it would still be a good "fast path"?

I feel like it would be good to include an end-to-end test which stresses this code path a bit, to make sure it doesn't fall over at a moderate scale. If we can confirm that, I'd personally be OK with perf work on this being pushed to after feature merge or perhaps even to the "debt payoff" milestone.

@@ -7228,9 +7228,11 @@ private static NullableAnnotation GetNullableAnnotation(BoundExpression expr)
case BoundKind.UnboundLambda:
case BoundKind.UnconvertedObjectCreationExpression:
case BoundKind.ConvertedTupleLiteral:
case BoundKind.UnconvertedCollectionLiteralExpression:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It wasn't clear to me why this is OK for UnconvertedObjectCreationExpression, but not for UnconvertedCollectionLiteralExpression.

}
}
""";
CompileAndVerify(new[] { source, s_collectionExtensions }, expectedOutput: "System.Int32[][][[], [1, 2, 3]], ");
CompileAndVerify(new[] { source, s_collectionExtensions }, expectedOutput: "System.Int32[][][[], [1, 2, 3]], System.Int32[][][][[[]], [[1, 2, 3]]], ");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sure it would be a pain to adjust all the baselines but man it would be nice in these cases if there were a space, or parens, or something between the type and the expression. Maybe parens on the type to make it look like a cast.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added parentheses around the type name and a space following.

{
return t;
var x = new[] { [ulong.MaxValue], [1, 2, 3] };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be good to demonstrate the new[] {...} and [...] behaviors side-by-side to show that collection literals actually can do this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the test is useful after all, since neither element in the implicitly-type array has a type. Removed.

{
var x = args.Length > 0 ? new int[0] : [1, 2, 3];
x.Report(includeType: true);
var y = args.Length == 0 ? [[4, 5]] : new[] { new byte[0] };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did this scenario work before this PR?

Copy link
Member Author

@cston cston Jun 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this did work previously. The behavior of the BestCommonType_* tests didn't change in this PR. Originally, BestTypeInferrer was modified, and the tests seemed useful, so I left them in after reverting BestTypeInferrer.

// 0.cs(9,17): error CS0411: The type arguments for method 'Program.AsArray<T>(params T[])' cannot be inferred from the usage. Try specifying the type arguments explicitly.
// var a = AsArray([1, 2, 3]);
Diagnostic(ErrorCode.ERR_CantInferMethTypeArgs, "AsArray").WithArguments("Program.AsArray<T>(params T[])").WithLocation(9, 17));
CompileAndVerify(new[] { source, s_collectionExtensions }, expectedOutput: "[1, 2, 3], ");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is params expected to test something different about the scenario here? It feels like it wouldn't have an effect outside of expanded form, and in expanded form we're using existing rules, not the new collection literal inference rules?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are correct, params is not testing anything different here, and this test is not particularly interesting. I've merged this case with _03 above.

string source = """
class Program
{
static T[] F1<T>(T[] x, T[] y) => y;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be good to include a method T F0(T[] x, T y) => y; and call F0(new byte[0], 1), to emphasize the analogy with existing scenarios.

static List<T[]> AsListOfArray<T>(List<T[]> arg) => arg;
static void Main()
{
var x = AsListOfArray([[4, 5], []]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's really nice that we can just figure this one out.

Comment on lines +1716 to +1717
// F([Main()]);
Diagnostic(ErrorCode.ERR_CantInferMethTypeArgs, "F").WithArguments("Program.F<T>(T[])").WithLocation(8, 9));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feels like an additional error for Main() outside an expression statement would be nice, since it's always going to be broken in this context, but I wouldn't spend much time on it.

@@ -609,8 +611,12 @@ private void MakeExplicitParameterTypeInferences(BoundExpression argument, TypeW
ExplicitParameterTypeInference(argument, target, ref useSiteInfo);
ExplicitReturnTypeInference(argument, target, ref useSiteInfo);
}
else if (argument.Kind == BoundKind.UnconvertedCollectionLiteralExpression)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once we have natural typing, this (and the corresponding elements in output type inference) are going to need to change, as we are not making any type inferences from the entire collection type to T at the moment. Future work though.

@cston cston merged commit 4feb306 into dotnet:features/CollectionLiterals Jun 23, 2023
@cston cston deleted the collections-inference branch June 23, 2023 19:14
@333fred
Copy link
Member

333fred commented Jun 23, 2023

A followup will also need to include lambda tests like () => [1, 2] when target-typed to Func<T[]>

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-Compilers untriaged Issues and PRs which have not yet been triaged by a lead
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants