Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Efficient params and string formatting #2293

Merged
merged 15 commits into from
Mar 5, 2019
Merged

Conversation

jaredpar
Copy link
Member

@jaredpar jaredpar commented Mar 4, 2019

This proposal is being discussed at LDM on 3/4/2019.

Changes
- Remove `VariantCollection`
- Move the `IEnumerable<T>` variant into the main section
- Simplify the language around `Span<T>` and `ReadOnlySpan<T>`
proposals/format.md Outdated Show resolved Hide resolved
JaredPar can't spell.

Co-Authored-By: jaredpar <jaredpparsons@gmail.com>
are available then it will give us this type of flexibility.

### Why not varargs?
!!! Will flesh this out in a bit. Have to track down my notes!!!
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI: i did not forget about this. I'm actually trying to track down my notes. Verdict though is it's not a good solution for C#.

Copy link
Contributor

@bdukes bdukes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Proposal looks wonderful, just a couple of typos, etc.

proposals/format.md Outdated Show resolved Hide resolved
proposals/format.md Outdated Show resolved Hide resolved
proposals/format.md Outdated Show resolved Hide resolved
proposals/format.md Outdated Show resolved Hide resolved
proposals/format.md Outdated Show resolved Hide resolved
proposals/format.md Outdated Show resolved Hide resolved
Copy link
Member Author

@jaredpar jaredpar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JaredPar can't spell part 2

This order is the most to the least effecient for the general case.

### Variant
The CoreFX is introducing a new managed type `Variant`. This type is meant to be used in APIs which expect hetrogeneous
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: The CoreFX reads weird. Maybe just use CoreFX

storage but avoids the boxing allocation for the most commonly used types. Using this type in APIs like `string.Format`
can eliminate the boxing overhead in the majority of cases.

This type itself is not necessarily special to the language. It is being introduced in this document separately though
Copy link
Member

@tannergooding tannergooding Mar 4, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you going to be putting the Variant proposal up on CoreFX as well?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JeremyKuhne owns that. Pretty sure it's one corefxlab already.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can't stop laughing about the typos they just keep coming

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't worry they won't stop.

proposals/format.md Outdated Show resolved Hide resolved
proposals/format.md Outdated Show resolved Hide resolved

### Extending params
The language will allow for `params` in a method signature to have the types `Span<T>`, `ReadOnlySpan<T>` and
`IEnumerable<T>`. The same rules for invocation will apply to these new types that apply to `params T[]`:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we're going to go to this, wouldit be oke to also allow going to ICollection/IList/IReadOnlyList at teh same time (effectively, all the types that T[] already implements)? It feels like it would be a prime time to just roll this in since we're working in this area.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would you need to support the others, since they all implement IEnumerable<T>? Generally speaking, types that implement things like ICollection or IList can also efficiently implement their iterators (and do in the case of most CoreFX types).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that the currently proposed types cover most of the scenarios. That is, Span<T>/ROSpan<T> make sense since they provide allocation-less calls (and declare the immutability of the input data) and IEnumerable<T> allows easier interop with LINQ (it avoids the need to realize a query).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would you need to support the others, since they all implement IEnumerable?

Because i want to expose this api to users:

public static void Foo(params IReadOnlyList<T> whatever) { }

Allowing people to just pass their existing list to me. Or, they can pass raw values, and it will be an array, an that doesn't need to be converted into an IReadOnlyList since it already is one.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add it to the open issues. We didn't cover it in LDM.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks so much! I would understand if this didn't happen. However, it's been one of those language niggles for ages. If it's cheap, this would be nice to have :) It would def clean up a few of my apis.

Note: @tannergooding and i discussed this on Gitter. It is definitely a strange thing that the language has baked in understanding that an array has an implicit reference conversion to IList and its ilk. And, yet, for 'params' i must specify only a T[] type. It seems like it would be pretty simple to just relax that to "anything an array has reference implicit conversion to".

```

This can siginficantly reduce the number of arrays allocated in an application. Allocations can be even further
reduced if the runtime provides utilities for smarter stack allocation of arrays.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you expand on this further?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The later sections go into a few of the possible scenarios here around stack allocation.

The CoreFX is introducing a new managed type `Variant`. This type is meant to be used in APIs which expect hetrogeneous
values but don't want the overhead brought on by using `object` as the parmeter. The `Variant` type provides universal
storage but avoids the boxing allocation for the most commonly used types. Using this type in APIs like `string.Format`
can eliminate the boxing overhead in the majority of cases.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there more info on Variant and how it works? A link, perhaps, to the CoreFx side of tings?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a link to the PR that has the initial Variant code spit and that has a bit of conversation. I don't think there is an official design doc though. Will let @JeremyKuhne comment on that.

dotnet/corefxlab#2595

readonly struct ValueFormattableString {
public static ValueFormattableString Create(Variant v) { ... }
public static ValueFormattableString Create(string s) { ... }
public static ValueFormattableString Create(string s, params ReadOnlySpan<Variant> collection) { ... }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would you need a ReadOnlySpan of hte string portions of hte interpolated string, along with the ReadOnlySPan of interpolation values? Or am i misunderstanding how this would be used? For example, if you have:

$"foo{bar}baz"

how is the ValueFormattableString created? I would prefer a form that gave me "foo" and "bar" in a ReadOnlySpan, and something that gave me the Variant for bar in a ReadOnlySpan.

ConsoleEx.Write(ValueFormattableString.Create((Variant)42));
ConsoleEx.Write(ValueFormattableString.Create(
"hello {0}",
VariantCollection.Create((Variant)DateTime.UtcNow));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this approach seems unfortunate... because won't i now have to figure out how to break "hello {0}" up, even thouh the language already did that for me? I'm coming from this from a JS/TS perspective where this is the natural way this data is packaged up and passed along to template processors.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the approach that interpolated strings use today. The goal of this proposal is to make the existing process more efficient vs. opening up the door for text templating. That was brought up though in LDM and we're going to look into it a bit though. If you have some pointers to the JS / TS similar features I'd like to read up on them.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the approach that interpolated strings use today

That's fair. Though it would be nice if both forms could be supported. Having to have clients manually parse strings is a PITA. JS/TS info on template strings is here: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Template_literals

Importantly:

function template(strings, ...keys) {
  return (function(...values) {
    var dict = values[values.length - 1] || {};
    var result = [strings[0]];
    keys.forEach(function(key, i) {
      var value = Number.isInteger(key) ? values[key] : dict[key];
      result.push(value, strings[i + 1]);
    });
    return result.join('');
  });
}

This is how one could write a function that accepts an interpolated string. From TypeScript's perspective, the signature is:

function template(literals: TemplateStringsArray, ...placeholders: string[]) {

TmplateStringsArray is also super interesting. It is typed as:

interface TemplateStringsArray extends ReadonlyArray<string> {
    readonly raw: ReadonlyArray<string>;
}

Meaning, it actually represents two arrays of strings. For example if i had:

whatever `this is a \t { template } \t string literal`;

Then the first arg passed to 'whatever' would look like this:

{
     [ "this is a \t ", " \t string literal"]
     raw: [ "this is a \\t ", " \\t string literal" ]
}

So JS gives both the original raw strings and the interpreted language strings. I'm not asking for that myself, but at least having the separate strings makes it so much easier to do things.


The possibility of this seems reasonably low. The type would need the full name `System.ValueFormattableString` and it
would need to have `static` methods named `Create`. Given that developers are strongly discouraged from defining any
type in the `System` namespace this break seems like a reasonable compromise.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be great if the Language/Compiler codified this (similer to how some languages carve out __ names as being reserved for them). If the language drew a line and said "we may change behavior for new types introduced in the System... space, and you should not be adding anything there yourself" then this gives both a reasonable path to add a lot of stuff in the future, while also being able to point users at some struct guidelines in case they do bad stuff.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leaving System. namespaces available for general use can allow third party libraries to implement features on older versions of the framework that aren't officially supported for that particular feature. But having a compile warning may be appropriate.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've essentially drawn the line on types defined under System.. We're still pretty careful about introducing breaking changes here but if we feel we can improve things and it's a type under System. we'll accetp a level of breaking changes to do it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great. I wasn't certain if htat was something official, or more informal. I was basically saying that it would be nice if really just stated it officially.

## Considerations

### Variant2 and Variant3
The CoreFX team also has a non-allocating set of storage types for up to three `Variant` arguments. These are a single
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was the '3' brought about by an sort of analysis of existing codebases? i'd be really curious what the 'tail' looks like her.e i.e. if 3 gets us 95% of the way, but 4 gets us to 99.9%, that would be interesting to know.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JeremyKuhne can comment on that.

Copy link
Contributor

@bdukes bdukes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple more 😉

proposals/format.md Outdated Show resolved Hide resolved
proposals/format.md Outdated Show resolved Hide resolved
proposals/format.md Outdated Show resolved Hide resolved
proposals/format.md Outdated Show resolved Hide resolved
proposals/format.md Outdated Show resolved Hide resolved
proposals/format.md Outdated Show resolved Hide resolved
proposals/format.md Outdated Show resolved Hide resolved
proposals/format.md Outdated Show resolved Hide resolved
proposals/format.md Outdated Show resolved Hide resolved
proposals/format.md Outdated Show resolved Hide resolved
bdukes and others added 2 commits March 5, 2019 09:09
so many typos to fix. Thanks for all the help.

Co-Authored-By: jaredpar <jaredpparsons@gmail.com>
@yaakov-h
Copy link
Member

yaakov-h commented Mar 3, 2020

Sorry to bump this but is there a discussion thread that corresponds to this proposal? I can't find it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants