Force-pushed 9a09e47 to 019cdaf
                destination.Write(buffer, 0, read);
            }
        }
        finally
        {
            Array.Clear(buffer, 0, highwaterMark); // clear only the most we used
Any reason why you removed this?
Noted in the PR description:
"(While I was modifying this, I also removed some unnecessary array clearing that we'd added before later deciding it wasn't needed in general.)"
We'd initially gone through and added clearing everywhere we were using the array pool. Then we subsequently decided we didn't need to, and that we would only explicitly clear in very high-security code (like when using an array to hold a key in a crypto lib). We removed some of these and stopped adding more, but we missed some.
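A minimal sketch of the policy described above, assuming hypothetical helper methods (`SumBytes`, `WithKeyCopy` are illustrative names, not code from this PR): ordinary code returns pooled buffers without clearing, while security-sensitive code scrubs them first.

```csharp
using System;
using System.Buffers;

class PoolPolicy
{
    // Ordinary code path: no clearing on return.
    static int SumBytes(System.IO.Stream stream)
    {
        byte[] buffer = ArrayPool<byte>.Shared.Rent(4096);
        try
        {
            int sum = 0, read;
            while ((read = stream.Read(buffer, 0, buffer.Length)) != 0)
                for (int i = 0; i < read; i++) sum += buffer[i];
            return sum;
        }
        finally
        {
            // No Array.Clear: leftover data in a pooled buffer is acceptable here.
            ArrayPool<byte>.Shared.Return(buffer);
        }
    }

    // High-security path (e.g. key material): scrub before returning.
    static void WithKeyCopy(byte[] key)
    {
        byte[] buffer = ArrayPool<byte>.Shared.Rent(key.Length);
        try
        {
            Buffer.BlockCopy(key, 0, buffer, 0, key.Length);
            // ... use the copied key ...
        }
        finally
        {
            ArrayPool<byte>.Shared.Return(buffer, clearArray: true);
        }
    }
}
```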
Ah
@@ -1127,7 +1127,7 @@ public override ValueTask<int> ReadAsync(Memory<char> buffer, CancellationToken
 {
     Debug.Assert(_bytePos == 0, "_bytePos can be non zero only when we are trying to _checkPreamble. Are two threads using this StreamReader at the same time?");

-    _byteLen = await tmpStream.ReadAsync(tmpByteBuffer, 0, tmpByteBuffer.Length, cancellationToken).ConfigureAwait(false);
+    _byteLen = await tmpStream.ReadAsync(new Memory<byte>(tmpByteBuffer), cancellationToken).ConfigureAwait(false);
nit: tmpByteBuffer.AsMemory() is cleaner (generic-type inference)
here and elsewhere
tmpByteBuffer.AsMemory() is cleaner (generic-type inference)
I prefer the explicitness.
@@ -963,14 +963,14 @@ private bool HaveWrittenPreamble_Prop
 byte[] preamble = encoding.GetPreamble();
 if (preamble.Length > 0)
 {
-    await stream.WriteAsync(preamble, 0, preamble.Length, cancellationToken).ConfigureAwait(false);
+    await stream.WriteAsync(new ReadOnlyMemory<byte>(preamble), cancellationToken).ConfigureAwait(false);
Do we need to explicitly create a ROM here? Would an implicit cast work? I don't see an array-based stream.WriteAsync overload that has two parameters (the second being the cancellation token).
Do we need to explicitly create a ROM here?
Being explicit helps me when reading the code to know that I'm using the desired overload. Otherwise I need to start paying attention to how many arguments there are and whether I might be slipping into the wrong one.
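The overload-binding concern above can be sketched as follows; this is an illustrative example (the method and parameter names here are hypothetical), showing how the explicit wrapper, the implicit conversion, and the old array-based overload differ at the call site.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

static class OverloadChoice
{
    static async Task WritePreambleAsync(Stream stream, byte[] preamble, CancellationToken token)
    {
        // Array-based overload: (byte[], int, int, CancellationToken), returns Task.
        await stream.WriteAsync(preamble, 0, preamble.Length, token);

        // Memory-based overload: (ReadOnlyMemory<byte>, CancellationToken), returns ValueTask.
        // The explicit wrapper makes it obvious which overload is bound.
        await stream.WriteAsync(new ReadOnlyMemory<byte>(preamble), token);

        // Same overload, reached via the implicit Memory<byte> -> ReadOnlyMemory<byte>
        // conversion from AsMemory().
        await stream.WriteAsync(preamble.AsMemory(), token);
    }
}
```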
/// <summary>Schedules the state machine to proceed to the next action when the specified awaiter completes.</summary>
/// <typeparam name="TAwaiter">The type of the awaiter.</typeparam>
/// <typeparam name="TStateMachine">The type of the state machine.</typeparam>
/// <param name="awaiter">the awaiter</param>
nit: inconsistent capitalization/use of period in the xml comments.
Fixed
@@ -5,88 +5,212 @@
using System.Diagnostics;
using System.Runtime.InteropServices;
using System.Threading.Tasks;
#if !netstandard
nit: add a new line between the using System.* directives and the Internal.* ones
ok
/// <summary>Gets the result of the ValueTask.</summary>
[MethodImpl(MethodImplOptions.AggressiveInlining)]
[StackTraceHidden]
Question: Why is GetResult marked with the StackTraceHidden attribute?
An exception is thrown inside it, so without the attribute it would look like gunk in the stack trace?
LGTM
Open issues to address or decide against before merging:
// The .NET Foundation licenses this file to you under the MIT license.
// See the LICENSE file in the project root for more information.

namespace System.Threading.Tasks
I thought we agreed to put the source APIs in a subnamespace?
See #16618 (comment).
We discussed it. There was no conclusion. I'm happy for you to make a call on whether and what it should be. My default is to leave them where they are.
I also don't feel super strongly, so maybe @terrajobst has some input?
public interface IValueTaskSource
{
    /// <summary>Gets the status of the current operation.</summary>
    ValueTaskSourceStatus Status { get; }
Are we not doing the cookies?
See #16618 (comment).
For some reason when I click on the link it does not take me anywhere.
For some reason when I click on the link it does not take me anywhere.
Hmm, ok, well it's a top-level comment earlier in this PR thread that outlines a few remaining action items to follow up on and determine whether to address or ignore, one of which was cookies, another of which was namespace.
Don't feel strongly about whether they are added; but if it were, something like CorrelationToken would be better than Cookie
Don't feel strongly about whether they are added; but if it were, something like CorrelationToken would be better than Cookie

You mean just in terms of name? If we were to do something (I'm still investigating any perf ramifications), I was thinking it would just be something like short token.
I spent some time today investigating this possibility... The core idea here is to modify the constructor to public ValueTask(IValueTaskSource<T> source, short token); and to have each interface method accept that token:

ValueTaskSourceStatus GetStatus(short token);
void OnCompleted(Action<object> continuation, object state, short token, ValueTaskSourceOnCompletedFlags flags);
TResult GetResult(short token);

There were two reasons raised for considering something like this:

(1) doesn't seem to me to be a real issue. An implementation can already be used to represent any number of distinct operations, simply by itself maintaining the state that tells it what kind of operation the current one is; it doesn't need to feed that to the ValueTask.

(2) does have some value. The question is what it costs us and whether it’s worth the price.

Given that there is diagnostic value here (we won’t catch every misuse, but many of them) and there doesn’t seem to be a meaningful perf cost in a real scenario, I think we should go ahead with this, with the short token. Any concerns? I'm going to proceed with it and assume not.
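A rough sketch of the token mechanism being described, assuming a hypothetical pooled source (`ReusableSource` is an illustrative name): each reuse bumps the token, so a stale ValueTask from a previous operation fails loudly instead of observing the wrong operation's state. Completion and continuation handling are deliberately simplified.

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Sources;

// Hypothetical reusable source: the token version-stamps each operation.
sealed class ReusableSource : IValueTaskSource<int>
{
    private short _token;   // incremented every time the instance is reused
    private int _result;

    public ValueTask<int> AsValueTask() => new ValueTask<int>(this, _token);

    public void Reset(int nextResult)
    {
        _token++;           // invalidates any outstanding ValueTask
        _result = nextResult;
    }

    private void Validate(short token)
    {
        if (token != _token)
            throw new InvalidOperationException("ValueTask used after its operation completed.");
    }

    public ValueTaskSourceStatus GetStatus(short token)
    {
        Validate(token);
        return ValueTaskSourceStatus.Succeeded; // simplified: always complete
    }

    public int GetResult(short token)
    {
        Validate(token);
        return _result;
    }

    public void OnCompleted(Action<object> continuation, object state, short token,
                            ValueTaskSourceOnCompletedFlags flags)
    {
        Validate(token);
        continuation(state); // simplified: already complete, run inline
    }
}
```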
No concerns about the short token from me.
Force-pushed 019cdaf to 6676f53
I personally don't think the cookie is important and I'm hoping implementations can just ignore it.
It's up to the implementation what to do with it. I do think it adds value, in that it can help to catch misuse, e.g. now if you do this:

ValueTask<int> r = networkStream.ReadAsync(...);
await r;
await r;

you'll get an exception, whereas previously you would have just silently gotten bad behavior if another operation had already been started on the stream. But an implementation can choose to ignore it. After all, it's the implementation that provides it in the first place... it's ValueTask that ignores it, and just passes whatever value the API implementation gave it back to the API.
This commit adds support for extending `ValueTask<T>` with arbitrary backing sources. Prior to this change, `ValueTask<T>` could wrap a `T` or a `Task<T>`; now it can also wrap an `IValueTaskSource<T>`, which can be implemented by arbitrary objects to be represented by `ValueTask<T>`. These objects can then be pooled and reused to minimize allocation. The commit also adds a non-generic `ValueTask` that can represent void-returning operations, including a `default` synchronous success, `Task`, and `IValueTaskSource`. For the non-generic `ValueTask`, the commit also includes awaiters and async method builders, so it can be both awaited and used as the return type of an async method.

The rest of the changes fall into a few buckets, all related to enabling this support:
- Modifying `AsyncTaskMethodBuilder<TResult>.AwaitUnsafeOnCompleted` to specially recognize any `ValueTask` and utilize either the `Task` or `IValueTaskSource` that backs it to avoid allocating an Action MoveNext method. If every object awaited in an async method is either a `Task`/`Task<T>` or `ValueTask`/`ValueTask<T>`, regardless of what the `ValueTask`/`ValueTask<T>` wraps, we'll be able to avoid allocating the delegate and only allocate the single state machine object that also serves as the returned object.
- Changing `Stream.WriteAsync` to return `ValueTask` instead of `Task`. This enables interested overriding stream types to use a reusable/pooled object to avoid `WriteAsync` allocations.
- Modifying `Stream.CopyToAsync` to use the new `Memory`-based overloads of `ReadAsync` and `WriteAsync`. This enables the default `CopyToAsync` implementation to take advantage of any pooling done by derived streams, but even without pooling to take advantage of synchronously completing `ReadAsync`s returning `ValueTask<int>`s that contain an `int` rather than an allocated object. (While I was modifying this, I also removed some unnecessary array clearing that we'd added before later deciding it wasn't needed in general.)
- Modifying StreamReader/Writer to use the new `ReadAsync`/`WriteAsync` overloads.
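The `CopyToAsync` change described above can be approximated by a sketch like the following (an assumed illustration, not the actual code from this PR): the loop uses the `Memory`-based overloads, so a synchronously completing read costs no `Task<int>` allocation.

```csharp
using System;
using System.Buffers;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

static class StreamCopy
{
    // Hypothetical sketch of a CopyToAsync-style loop over the Memory overloads.
    public static async Task CopyToAsyncSketch(Stream source, Stream destination,
        int bufferSize, CancellationToken cancellationToken)
    {
        byte[] buffer = ArrayPool<byte>.Shared.Rent(bufferSize);
        try
        {
            int bytesRead;
            // A synchronously completing ReadAsync returns a ValueTask<int>
            // wrapping the int directly, with no Task<int> allocation.
            while ((bytesRead = await source.ReadAsync(
                       new Memory<byte>(buffer), cancellationToken).ConfigureAwait(false)) != 0)
            {
                await destination.WriteAsync(
                    new ReadOnlyMemory<byte>(buffer, 0, bytesRead),
                    cancellationToken).ConfigureAwait(false);
            }
        }
        finally
        {
            ArrayPool<byte>.Shared.Return(buffer);
        }
    }
}
```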
Group these more advanced types into a subnamespace.
Force-pushed 6676f53 to f9a312b
I believe I've addressed all of the remaining feedback, including adding in the cookie/token and moving the types to a subnamespace. I plan to merge this once CI is green. Thanks.
@dotnet-bot test Ubuntu x64 Checked Innerloop Build and Test

Previous failed CI legs:
Firstly, I will be one of the consumers of this API. Secondly, I love the concept. Thirdly, I'm not sure I love the specific implementation discussed above...

Interfaces are nice, but due to the higher dispatch cost, I'm thinking this could be a good case where the capability expressed by IValueTaskSource is better expressed as an abstract class BaseValueTaskSource (viz., not every point of extensibility needs to be exposed as an interface, especially the performance-sensitive cases). I understand that .NET Core is probably going to get smarter inlining before its big brother, the .NET Framework, and I know that interface inlining is one of the things on the drawing board... However, this is a case where even a comprehensive inlining effort isn't likely to succeed in getting these dispatches inlined. Moreover, for cases where this interface were implemented on a struct, we'd likely get the interface-boxing penalties as well. Since the TPL, async, etc., affect almost everything in .NET these days, I highly recommend against doing anything that adds processing overhead.

For these advanced cases, I would not mind at all creating a type that extends BaseValueTaskSource and does what I need it to do. Moreover, since I'm making a class anyway, I absolutely do not mind adding whatever tracking I want to have in my BaseValueTaskSource implementation. (If I want a state counter, I'll add it.) In fact, you could ship a BaseTokenValueTaskSource implementation that simply exposes a short Token read/write property and an abstract Execute(ref short token). In the implementation, have Execute() implemented as a sealed overload that grabs the current Token, calls Execute(ref token), and then stores the result back in Token. Etc., etc.: all of the relevant APIs on BaseValueTaskSource could be overridden to pass the current token along to the BaseTokenValueTaskSource abstract implementation. (Bonus: your current code would be pretty trivial to update on this route.)

I would, again, however, stress that as much as possible be implemented as non-virtual. Again, this flexibility is awesome, but the added costs of the inevitable failed branch predictions and the dispatching will add up when multiplied across large-scale apps where almost everything is async nowadays. However, I am less concerned about exposing virtual members on BaseValueTaskSource as appropriate, as those costs will only apply to cases where BaseValueTaskSource is actually used.
I would think the refactoring effort to move from the current IValueTaskSource approach to the suggested BaseValueTaskSource + BaseTokenValueTaskSource approach would be relatively minor. You'd get the functionality you want, and reduce much of the implied runtime costs.
After reviewing the actual interfaces, here's the code for what I had in mind:
This would allow removing the token from the interface methods. Feel free to use that code verbatim if it helps 😀

NOTE: even if there is good reason to keep the interfaces (I would actually not mind having a dual approach and supporting both internally), using a specific struct could still help if the performance were bad in specific cases using the suggested approach.
There is. We want to be able to use any object rather than one with a specific base type. For example, in Sockets we want to use an object derived from SocketAsyncEventArgs. And in pipelines, we want to use the same object to support multiple operations, all with different return types. An abstract base class would make such things impossible without more allocations, and avoiding allocations is a key aspect of this feature.
If there's a proven need, it can be considered for the future.
The base class approach can definitely be added later, and I'm positive it would help in some cases. In the framework I support, I've got a custom task class that I'm already using for something similar to this (thus I've already got the allocation cost), and I had to go through great lengths to avoid a double allocation. I like the idea of modifying my code (down the road) to use this new mechanism instead for those cases. It's cleaner. My bigger concern is adding fields to the core task-related structures like ValueTask.

I see your point about the SocketAsyncEventArgs quandary. I've got a lot of rather nasty code (callback hell) that hits the SocketAsyncEventArgs APIs... it would be wonderful to get async support down at that level of the .NET APIs as efficiently as possible. And I see what you're trying to do there: pass some state around via the token in ValueTask without any additional allocation costs. (We all want the sockets APIs to be as fast as possible.) I could see the token helping with a few edge cases, but I cannot think of much use for it offhand in the APIs that I support. I would have to do a deep dive through my code to see if the token would add any value or assist in eliminating any allocations...

Did you happen to test something similar to using a token "box" like I posted above, which uses a struct to store the token and has the struct implement the interface?

In my personal experience, I have repeatedly seen that the cost of allocating temporary purpose-built helper classes to avoid interface dispatch costs is often much lower than my very "clever" interface-based code. In fact, that's one of my go-to, tried and true optimization strategies. I've seen code that was as optimized as possible, using interfaces, get roughly a 30% speed bump by refactoring to a base class. (The payoff is going to be higher in cases where the code is hitting multiple interface members repeatedly in a tight loop.)
Implement ValueTask extensibility (Signed-off-by: dotnet-bot-corefx-mirror <dotnet-bot@microsoft.com>)
The primary purpose is to enable an IValueTaskSource implementation to catch misuse. If you're pooling / reusing these across multiple operations, which is the primary use case, then it's possible that a previous call that has a ValueTask could still be holding on to an IValueTaskSource instance that's already being used for another operation. If someone then tried to access the original ValueTask, they'd be accessing the wrong operation. The token allows for the IValueTaskSource to fail operations through the original ValueTask rather than silently getting corrupted or returning incorrect results. I'm sure inventive folks will come up with other uses for the token, but the above is the primary motivating use case.
I don't understand the suggestion. The whole point here is avoiding allocation; if you implement IValueTaskSource on a struct and pass that struct to the ValueTask, that's going to allocate the box. At that point there's no value to using a struct, or to even having the token: just allocate a class that implements the interface and pass that in, there's no potential for reuse as you're not pooling the object.
Then you can't have a
Sorry about my off-base comments above; I took the time to read all of the code and look at the implementation for ValueTask. The way it's implemented, with the flags enum stored as a byte, the space for the short token pretty much comes for free due to the packing rules. So, yeah, it definitely doesn't hurt to do something useful with that unused dataspace since it's allocated and moving anyway.

Moreover, that should actually leave an extra byte to work with... if you can think of something useful to do with it. Would the token be any more useful if it was a quasi-24-bit int instead of a 16-bit short? In that case, the flags could be stored in the upper byte and stripped off to get the int token. You'd get a range of 0 - 16,777,215 on 24 bits going that route instead of only 0 - 65,535 with the current 16 bits (or rather -32,768 - 32,767). I would personally find that additional range very useful; that's a pretty big number for runtime operation numbers, version numbers, etc. There's the question of what to do with negative numbers (uint?)... although there's a bit twiddle for that (http://graphics.stanford.edu/~seander/bithacks.html#FixedSignExtend). Likely it is just fine the way it is; I just thought I would mention the possibility of doing something useful with that extra byte while it's still hot on the press...

(1) Is there any chance that this will make it into 2.1? (I'm excited to play with this new feature in production code.)

(2) Any thoughts at all on when this feature might bubble up to the full .NET Framework? My main project is cross-platform... so the platform deltas can get a bit painful to straddle. Regardless, I'll take wins where I can. (I'm guessing a long time...)

(3) Is there any place you could point me towards to review the scenarios using the token on the ValueTask that you described? Like for SocketAsyncEventArgs, etc.? I didn't see any cases in this commit where the token is actually used, and I'd like to see some examples of what you described above.
It depends on the size of T. If T is a bool, for example, which will actually be very common, then on 32-bit we'd have a 64-bit type with 32-bits for the object and 32-bits for the bool+byte+short, with basically no space left over.
Yes
Actually built in, I'm not sure. But the plan is all of this ships as part of the System.Threading.Tasks.Extensions NuGet package as well, so you can just reference that from your .NET Framework application. There are some optimizations in the coreclr implementation that aren't in that out-of-band implementation, but the core pieces are there.
See the corresponding PR in corefx: |
Sidebar... While we're talking about reducing allocations for async and Task implementations, I thought I'd share a concrete example of how to avoid allocating a delegate instance for the continuations and/or the initial tasks. Have a look at my sample changes to the FileStream.WriteAsyncInternal() method to remove the allocation of the delegate entirely. This strategy is cumbersome, but works beautifully when the task's state is sufficient to feed the continuation (and as long as setting up the continuation state doesn't cause an additional allocation, like boxing; otherwise it ends up being the same difference as creating a capturing delegate). I've got two commits on that branch: the first shows the strategy implemented using a pure static method (good) and the second shows the static method deferring to the instance method (better).
That math makes sense for the bool case. My guess is that, much more often than not, we'd see larger Ts. This is my last comment on this topic. I just wanted to raise the possibility and let you think on it.
Let's avoid continuing an unrelated conversation on this PR. Thanks.
There's no per-operation delegate allocation there. I assume you're referring to this code: coreclr/src/mscorlib/shared/System/IO/FileStream.Unix.cs, lines 685 to 705 (at 12f978b).
The lambda explicitly avoids closing over any state, and as a result the C# compiler will cache the delegate for it and reuse that same delegate object over and over and over. All of the state is passed into the object as a state argument.
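The caching behavior described above can be sketched as follows (an illustrative example; the class and method names are hypothetical): the non-capturing lambda compiles to a lazily initialized hidden static field, while a hand-declared static readonly field achieves the same reuse explicitly.

```csharp
using System;
using System.Threading.Tasks;

static class DelegateCaching
{
    // Hand-cached delegate: created once, no per-call allocation or null check
    // at the call site.
    private static readonly Func<object, int> s_callback = state => ((string)state).Length;

    static Task<int> ViaField(string s) =>
        Task.Factory.StartNew(s_callback, s);

    // Non-capturing lambda: the C# compiler caches the delegate in a hidden
    // static field, creating it lazily on first use and reusing it afterward.
    static Task<int> ViaLambda(string s) =>
        Task.Factory.StartNew(state => ((string)state).Length, s);
}
```

In both cases all per-operation data flows through the `state` argument, which is what keeps the lambda non-capturing and therefore cacheable.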
Well, if the compiler has gotten smart enough to do that... Hurray! (It does make perfect sense to do that automatically though.) I really want to believe you on that 😁 I'm distrusting by nature, though, so I'll have to check some IL to confirm. The delegation to the instance method can have a bit of a perf kick, though, due to field locality. That might be the source of the bulk of the gains that I've measured by implementing this strategy. So perhaps not a total waste of time.
Not too unrelated as it pertained to specific code in the commit, but I do think we've hammered this nail enough. Thanks for the info about the C# compiler & delegate caching. I definitely need to read up on that.
@stephentoub, I hate to beat a horse to death here, but I wanted to clarify that while you're right about the C# compiler generating a field for us at compile time to store stateless delegates, it isn't exactly the same as doing it by hand (e.g. the strategy I outlined in my "sidebar" above). Consider these two C# statements. The first using my static readonly callback strategy:
And the second uses the stateless delegate strategy:
The IL for the first is very simple:
The IL for the stateless delegate method is more complex:
In the second method, where the compiler creates a delegate class and a backing field, it checks whether the field is null; if it is, it creates the delegate and stores it, and otherwise it executes the stored delegate. That branching does have a cost. The branch predictor should essentially always get it right after the field is populated, so on heavily used code paths this method is probably okay. But there is an implied, even if minor, cost to letting the compiler "do it for us" here. (I suppose the counterargument is that a future version of the compiler might "do it better", but it is what it is for now.)

I will tell you one benefit of this investigation for me: my Resharper configuration was complaining that
produces this horrid IL:
I know in the past I've always let Resharper make that substitution to the method group for me, and now it is littered all through my codebase. I had assumed (incorrectly) that it was always allocating anyway, and the method-group form reads better. I would argue that the compiler should handle the method group case using the same strategy as the stateless delegate. I think I'll try to find the place to file an issue for the C# compiler team. If they make this change, it would apply to old frameworks too!

(ps. I've pinged JetBrains on Twitter to see if they can help us poor saps who've made this blunder... https://twitter.com/marksmeltzer/status/969762966655582208)
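The allocation difference being discussed can be illustrated with a small sketch (hypothetical names; behavior described is that of the C# compilers contemporary with this thread, and newer compilers may cache method-group conversions too):

```csharp
using System;

static class MethodGroupVsLambda
{
    static int Measure(string s) => s.Length;

    static void Run()
    {
        // Method-group conversion: older C# compilers emit the equivalent of
        // `new Func<string, int>(Measure)` every time this line executes.
        Func<string, int> viaGroup = Measure;

        // Non-capturing lambda: the compiler caches the delegate in a hidden
        // static field, so only the first execution allocates.
        Func<string, int> viaLambda = s => Measure(s);

        Console.WriteLine(viaGroup("abc") + viaLambda("de"));
    }
}
```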
That's dotnet/roslyn#5835, which has a PR, dotnet/roslyn#6642, that has been open for several years, with last activity several months ago. I'm not sure what's currently blocking it.
@benaadams, this change significantly impacted the await code paths for value tasks, which you'd previously looked at improving. Might be worth another look if you have the time 😄
Contributes to https://github.com/dotnet/corefx/issues/27445