Add improved interpolated strings spec #4486

Merged
merged 13 commits into from
Mar 25, 2021
39 changes: 36 additions & 3 deletions proposals/improved-interpolated-strings.md
@@ -154,7 +154,7 @@ A function member is said to be an ***applicable function member*** with respect
For a function member that includes a parameter array, if the function member is applicable by the above rules, it is said to be applicable in its ***normal form***. If a function member that includes a parameter array is not applicable in its normal form, the function member may instead be applicable in its ***expanded form***:
* The expanded form is constructed by replacing the parameter array in the function member declaration with zero or more value parameters of the element type of the parameter array such that the number of arguments in the argument list `A` matches the total number of parameters. If `A` has fewer arguments than the number of fixed parameters in the function member declaration, the expanded form of the function member cannot be constructed and is thus not applicable.
* Otherwise, the expanded form is applicable if for each argument in `A` the parameter passing mode of the argument is identical to the parameter passing mode of the corresponding parameter, and
- * **for an interpolated string argument to a fixed value parameter or a value parameter created by the expansion when `A` is an instance method or static extension method invoked in reduced form, the type of the corresponding parameter is an _applicable\_interpolated\_string\_builder\_type_ `Ai`, and overload resolution on `Ai` with the identifier `GetInterpolatedStringBuilder` and a parameter list of 2 int parameters, the receiver type of `A`, and an out parameter of type `Ai` succeeds with 1 invocable member. An interpolated string argument applicable in this way is said to be immediately converted to the corresponding parameter type with an implicit _interpolated\_string\_builder\_conversion_. Or,**
+ * **for an interpolated string argument to a fixed value parameter or a value parameter created by the expansion when `A` is an instance method or static extension method invoked in reduced form, the type of the corresponding parameter is an _applicable\_interpolated\_string\_builder\_type_ `Ai`, and overload resolution on `Ai` with the identifier `GetInterpolatedStringBuilder` and a parameter list of 2 int parameters, the receiver type of `A`, and an out parameter of type `Ai` succeeds with 1 invocable member. An interpolated string argument applicable in this way is said to be immediately converted to the corresponding parameter type with an _implicit\_string\_builder\_conversion_. Or,**
* for a fixed value parameter or a value parameter created by the expansion, an implicit conversion ([Implicit conversions](conversions.md#implicit-conversions)) exists from the type of the argument to the type of the corresponding parameter, or
* for a `ref` or `out` parameter, the type of the argument is identical to the type of the corresponding parameter.

@@ -171,7 +171,7 @@ We change the [better conversion from expression](https://github.com/dotnet/csha
following:

Given an implicit conversion `C1` that converts from an expression `E` to a type `T1`, and an implicit conversion `C2` that converts from an expression `E` to a type `T2`, `C1` is a ***better conversion*** than `C2` if:
- 1. `E` is a non-constant _interpolated\_string\_expression_, `C1` is an _interpolated\_string\_builder\_conversion_, `T1` is an _applicable\_interpolated\_string\_builder\_type_, and `C2` is not an _interpolated\_string\_builder\_conversion_, or
+ 1. `E` is a non-constant _interpolated\_string\_expression_, `C1` is an _implicit\_string\_builder\_conversion_, `T1` is an _applicable\_interpolated\_string\_builder\_type_, and `C2` is not an _implicit\_string\_builder\_conversion_, or
2. `E` does not exactly match `T2` and at least one of the following holds:
* `E` exactly matches `T1` ([Exactly matching Expression](expressions.md#exactly-matching-expression))
* `T1` is a better conversion target than `T2` ([Better conversion target](expressions.md#better-conversion-target))
@@ -242,7 +242,7 @@ If the type of an interpolated string is `System.IFormattable` or `System.Format
### Lowering

Both the general pattern and the specific changes for interpolated strings directly converted to `string`s follow the same lowering pattern. The `GetInterpolatedStringBuilder` method is
- invoked on the receiver (whether that's the temporary method receiver for an _interpolated\_string\_builder\_conversion_ derived from the applicable function member algorithm, or a
+ invoked on the receiver (whether that's the temporary method receiver for an _implicit\_string\_builder\_conversion_ derived from the applicable function member algorithm, or a
standard conversion derived from the target type), and stored into a temp local. `TryFormat` is then repeatedly invoked on that temp, with each part of the interpolated string, in order,
stopping subsequent calls if a `TryFormat` call returns `false`. The temp is then evaluated as the result of the expression.
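
The lowering described above can be sketched by hand. This is only an illustration under stated assumptions: `SketchBuilder` and `LoweringSketch` are hypothetical names invented for the example, the builder is backed by a `StringBuilder` for simplicity, and the `GetInterpolatedStringBuilder` shape shown (two `int` parameters plus an `out` parameter, no receiver) is assumed for the target-typed case rather than taken from a shipped API.

```csharp
using System;
using System.Text;

// Hypothetical builder type written only for this sketch; it is not a framework type.
public struct SketchBuilder
{
    private StringBuilder _sb;

    // Assumed shape: two int parameters and an out parameter of the builder type.
    public static void GetInterpolatedStringBuilder(int baseLength, int formatHoleCount, out SketchBuilder builder)
    {
        builder = new SketchBuilder { _sb = new StringBuilder(baseLength) };
    }

    // Literal parts of the interpolated string.
    public bool TryFormat(string s) { _sb.Append(s); return true; }

    // Interpolation holes of any other type.
    public bool TryFormat<T>(T value) { _sb.Append(value); return true; }

    public override string ToString() => _sb.ToString();
}

public static class LoweringSketch
{
    public static string Lowered(string name, int count)
    {
        // Hand-written equivalent of converting $"{name} has {count} items" to SketchBuilder.
        // 11 is the combined length of the literal parts " has " and " items"; 2 is the hole count.
        SketchBuilder.GetInterpolatedStringBuilder(11, 2, out SketchBuilder temp);

        // Each part is passed in order; && stops subsequent calls once a call returns false.
        _ = temp.TryFormat(name)
            && temp.TryFormat(" has ")
            && temp.TryFormat(count)
            && temp.TryFormat(" items");

        // The temp is the result of the expression; ToString is called here only to observe it.
        return temp.ToString();
    }

    public static void Main() => Console.WriteLine(LoweringSketch.Lowered("cart", 3)); // prints "cart has 3 items"
}
```

The short-circuiting `&&` chain is one way to express "stop after the first `false`"; a compiler could emit equivalent branches instead.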

@@ -291,6 +291,39 @@ This was done to support partial formatting scenarios where the user wants to st
introduce a bunch of unnecessary branches in standard interpolated string usage. We could consider an addendum where we use just `Format` methods if no `TryFormat` method is present, but
it does present questions about what we do if there's a mix of both TryFormat and Format calls.
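
For comparison, here is a hand-written sketch of what the possible addendum might lower to when a builder exposes only `void`-returning `Format` methods. Everything here is an assumption made for illustration: `FormatOnlyBuilder` and `FormatOnlySketch` are hypothetical names, and the proposal does not specify this lowering.

```csharp
using System;
using System.Text;

// Hypothetical builder exposing only void-returning Format methods; not a framework type.
public struct FormatOnlyBuilder
{
    private StringBuilder _sb;

    // Assumed shape: two int parameters and an out parameter of the builder type.
    public static void GetInterpolatedStringBuilder(int baseLength, int formatHoleCount, out FormatOnlyBuilder builder)
        => builder = new FormatOnlyBuilder { _sb = new StringBuilder(baseLength) };

    public void Format(string s) => _sb.Append(s);

    public void Format<T>(T value) => _sb.Append(value);

    public override string ToString() => _sb.ToString();
}

public static class FormatOnlySketch
{
    public static string Lowered(int a, int b)
    {
        // Hand-written equivalent of $"{a} = {b}" under the assumed Format-only pattern.
        // 3 is the length of the literal part " = "; 2 is the hole count.
        FormatOnlyBuilder.GetInterpolatedStringBuilder(3, 2, out FormatOnlyBuilder temp);

        // No return values to test, so the calls are straight-line with no branches.
        temp.Format(a);
        temp.Format(" = ");
        temp.Format(b);

        return temp.ToString();
    }

    public static void Main() => Console.WriteLine(FormatOnlySketch.Lowered(1, 2)); // prints "1 = 2"
}
```

The open question in the text remains: this sketch does not say what should happen when a builder offers a mix of `TryFormat` and `Format` methods.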

Review comment (Member):

The more I think about it, the more I think we need to allow this. I don't have a strong preference for whether it's two different TryFormat/Format names, or whether we just allow TryFormat to be void-returning in addition to bool-returning, but I suspect just saying TryFormat can return void will be a little simpler as it avoids the ambiguity of what to do if there's both Format and TryFormat methods with the same arguments (since in C# you can't overload on return type alone).

The most common use of the builder will be InterpolatedStringBuilder, and if we don't do this, we'll be building an unnecessary inefficiency into the pattern from the get-go, with every call site being larger and requiring an unnecessary (though easily predictable) jump instruction.


### Passing previous arguments to the builder

There is an unfortunate lack of symmetry in the proposal as it currently exists: invoking an extension method in reduced form produces different semantics than invoking the extension method in
normal form. This is different from most other locations in the language, where reduced form is just sugar. We have a couple of potential options for resolving this:

* Special case extension methods called in normal form. This feels pretty icky: why are extensions special here?
* Allow other previous parameters to be passed to the builder. This gets complicated quickly: how do we determine what to pass to the builder? What if the builder has a `GetInterpolatedString`
method that accepts the first parameter, but not the receiver, of an instance method?
* Pass parameters to the builder marked with a specific attribute, a la `EnumeratorCancellation` support. This would need rules about whether we pass the receiver (maybe if the method is marked
we pass the receiver, and we don't in the general case?), and what we do if parameters _after_ the string parameter are annotated, but it seems like a potential option.

Some compromise is likely needed here, but either direction has complications. Scenarios that would be affected by this include the `Utf8Formatter` below and existing API patterns that have
an `IFormatProvider` as the first argument.

Review comment (Member):

Thank you for adding this section in response to my pestering you about it. ;-)

I would like us to be able to add:

String.Format([InterpolatedStringArgument] IFormatProvider? provider, InterpolatedStringBuilder builder);

and have the provider passed to the GetInterpolatedStringBuilder method.

I would like us to be able to add:

Utf8Formatter.TryWrite([InterpolatedStringArgument] Span<byte> destination, Utf8InterpolatedStringBuilder);

and have the span passed to the builder as the destination.

This also provides a solution for using stack scratch space. We can add an overload:

InterpolatedStringBuilder.GetInterpolatedStringBuilder(int baseLength, int formatHoleCount, Span<char> scratchSpace, out InterpolatedStringBuilder);

Then we could either expose a public API or, if not, use it internally (and tell devs who want the interim solution to copy/paste it):

public static string Format([InterpolatedStringArgument] Span<char> scratchSpace, InterpolatedStringBuilder builder);

which we can then use like:

string result = Format(stackalloc char[256], $"{a} = {b}");


### `await` usage in interpolation holes

Because `$"{await A()}"` is a valid expression today, we need to rationalize how interpolation holes that contain `await` are handled. We could solve this with a few rules:

1. If an interpolated string used as a `string`, `IFormattable`, or `FormattableString` has an `await` in an interpolation hole, fall back to the old-style formatter.
2. If an interpolated string is subject to an _implicit\_string\_builder\_conversion_ and _applicable\_interpolated\_string\_builder\_type_ is a `ref struct`, `await` is not allowed to be used
in the format holes.

Fundamentally, this desugaring could use a ref struct in an async method as long as we guarantee that the `ref struct` will not need to be saved to the heap, which should be possible if we forbid
`await`s in the interpolation holes.

Alternatively, we could simply make all builder types non-ref structs, including the framework builder for interpolated strings. This would, however, preclude us from someday recognizing a `Span`
version that does not need to allocate any scratch space at all.

Review comment (Member):

I think we should say:

  • InterpolatedStringBuilder is a ref struct.
  • If you're targeting strings, the compiler chooses between its various ways of implementing string interpolation: InterpolatedStringBuilder, string.Concat, and string.Format. If there's something that prevents it from using InterpolatedStringBuilder, well then it uses one of its other mechanisms.
  • If you're targeting a custom builder, then you're subject to the constraints of that builder. If the builder is a struct or class, great, you can use await in holes; have at it. If the builder is a ref struct, then you can't use await in holes, just as you can't use anything the builder doesn't have a TryFormat overload for.


### Builders as ref parameters

Some builders might want to be passed as ref parameters (either `in` or `ref`). Should we allow either? And if so, what will a `ref` builder look like? `ref $""` is confusing, as you're not actually
passing the string by ref; you're passing the builder that is created from the string by ref, and this has similar potential issues with async methods.

## Other use cases

### `TryFormat` on `Span` receivers