Async-streams: Consider optimizing return logic of MoveNextAsync() #31246

Closed
jcouv opened this issue Nov 19, 2018 · 4 comments
Labels: Area-Compilers, New Feature - Async Streams, Resolution-Fixed


@jcouv
Member

jcouv commented Nov 19, 2018

Reported by @stephentoub

Right now I see this for MoveNextAsync:

[DebuggerHidden]
ValueTask<bool> IAsyncEnumerator<int>.MoveNextAsync()
{
    if (this.<>1__state == -2)
    {
        return new ValueTask<bool>();
    }

    this.<>v__promiseOfValueOrEnd.Reset();
    Program.<DoStuffAsync>d__1 stateMachine = this;
    this.<>t__builder.Start<Program.<DoStuffAsync>d__1>(ref stateMachine);
    return new ValueTask<bool>(this, this.<>v__promiseOfValueOrEnd.Version);
}

That last line will force the resulting usage of the ValueTask<bool> to go through the underlying interface, even if there is already a yielded value available (which we expect to be common). That means that in such a case we’ll end up making two unnecessary interface calls: IsCompleted and GetResult. If we were to instead check the ManualResetValueTaskSourceCore to see if there’s already a value available, and if there is, return a new ValueTask<bool>(true), it would make that path faster. We should of course benchmark it, but I expect the savings will more than make up for the extra branch.
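
For illustration only, the suggested check can be sketched on a hand-written source. This is not the compiler-generated state machine: `CountdownSource` and its `_remaining` counter are hypothetical names, and the synchronous `SetResult` call stands in for running the iterator body until it yields.

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Sources;

// Hypothetical hand-written source (assumption: not what the compiler emits).
public sealed class CountdownSource : IValueTaskSource<bool>
{
    // Mutable struct, so it must stay a non-readonly field.
    private ManualResetValueTaskSourceCore<bool> _promise;
    private int _remaining;

    public CountdownSource(int count) => _remaining = count;

    public ValueTask<bool> MoveNextAsync()
    {
        _promise.Reset();

        // Stand-in for advancing the state machine: in this sketch the
        // result is always produced synchronously.
        _promise.SetResult(_remaining-- > 0);

        short version = _promise.Version;
        if (_promise.GetStatus(version) == ValueTaskSourceStatus.Succeeded)
        {
            // Fast path: wrap the value directly, so the consumer's
            // IsCompleted and GetResult calls never reach the interface.
            return new ValueTask<bool>(_promise.GetResult(version));
        }

        // Slow path: the consumer observes completion through IValueTaskSource<bool>.
        return new ValueTask<bool>(this, version);
    }

    public bool GetResult(short token) => _promise.GetResult(token);

    public ValueTaskSourceStatus GetStatus(short token) => _promise.GetStatus(token);

    public void OnCompleted(Action<object> continuation, object state, short token, ValueTaskSourceOnCompletedFlags flags)
        => _promise.OnCompleted(continuation, state, token, flags);
}
```

With `new CountdownSource(1)`, the first `MoveNextAsync()` call returns an already-completed `ValueTask<bool>` holding `true`, and the second returns one holding `false`; neither is backed by the source object.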

@jcouv
Member Author

jcouv commented Dec 9, 2018

FYI @stephentoub I tried a simple benchmark. Let me know what you think

// * Summary *

BenchmarkDotNet=v0.11.3, OS=Windows 10.0.17134.407 (1803/April2018Update/Redstone4)
Intel Core i7-6770HQ CPU 2.60GHz (Skylake), 1 CPU, 8 logical and 4 physical cores
Frequency=2531253 Hz, Resolution=395.0613 ns, Timer=TSC
.NET Core SDK=3.0.100-preview-009812
  [Host] : .NET Core 3.0.0-preview-27122-01 (CoreCLR 4.6.27121.03, CoreFX 4.7.18.57103), 64bit RyuJIT
  Core   : .NET Core 3.0.0-preview-27122-01 (CoreCLR 4.6.27121.03, CoreFX 4.7.18.57103), 64bit RyuJIT

Job=Core  Runtime=Core

| Method      |     Mean |     Error |    StdDev |
|-------------|---------:|----------:|----------:|
| Interface   | 24.76 ns | 0.3026 ns | 0.2830 ns |
| ManualCheck | 19.79 ns | 0.3713 ns | 0.3291 ns |

using System;
using System.Threading;
using System.Threading.Tasks;
using System.Threading.Tasks.Sources;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

namespace MyBenchmarks
{
    [CoreJob]
    public class ValueTaskBenchmarkCore
    {
        private MyValueTaskSource source = new MyValueTaskSource();

        [Benchmark]
        public void Interface()
        {
            var task = new ValueTask<bool>(source, source.token);
            task.GetAwaiter().GetResult();
        }

        [Benchmark]
        public void ManualCheck()
        {
            short token = source.token;
            var promise = source.promise;

            if (promise.GetStatus(token) == ValueTaskSourceStatus.Succeeded)
            {
                var task = new ValueTask<bool>(promise.GetResult(token));
                task.GetAwaiter().GetResult();
                return;
            }

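            // Not expected to be reached: the promise is completed in the
            // MyValueTaskSource constructor. Make a miss obvious in the results.
            Thread.Sleep(10000);
            throw null;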
        }
    }

    public class Program
    {
        public static void Main(string[] args)
        {
            var summary = BenchmarkRunner.Run<ValueTaskBenchmarkCore>();
        }
    }

    public class MyValueTaskSource : IValueTaskSource<bool>
    {
        public ManualResetValueTaskSourceCore<bool> promise;
        public short token;

        public MyValueTaskSource()
        {
            promise = new ManualResetValueTaskSourceCore<bool>();
            promise.Reset();
            token = promise.Version;
            promise.SetResult(true);
        }

        public bool GetResult(short token) => promise.GetResult(token);

        public ValueTaskSourceStatus GetStatus(short token) => promise.GetStatus(token);

        public void OnCompleted(Action<object> continuation, object state, short token, ValueTaskSourceOnCompletedFlags flags)
        {
        }
    }
}

@jcouv
Member Author

jcouv commented Dec 9, 2018

I have a PR for the compiler to generate:

		ValueTask<bool> IAsyncEnumerator<string>.MoveNextAsync()
		{
			if (<>1__state == -2)
			{
				return default(ValueTask<bool>);
			}
			<>v__promiseOfValueOrEnd.Reset();
			<Iter>d__1 stateMachine = this;
			<>t__builder.MoveNext(ref stateMachine);
			short version = <>v__promiseOfValueOrEnd.Version;
			if (<>v__promiseOfValueOrEnd.GetStatus(version) == ValueTaskSourceStatus.Succeeded)
			{
				return new ValueTask<bool>(<>v__promiseOfValueOrEnd.GetResult(version));
			}
			return new ValueTask<bool>(this, version);
		}

@stephentoub
Member

Great. Thanks, @jcouv. The benchmark looks good, but I expect real-world results will actually be even better for the new version, for two reasons:

  1. The benchmark doesn't check task.IsCompleted the way the generated await code would, so actual consumption makes two calls on the ValueTask (IsCompleted and GetResult). In the new version both are direct calls; in the old version each one involves interface dispatch.
  2. The benchmark only ever uses one implementation of the interface in ValueTask, so the JIT is going to end up using a monomorphic dispatch stub, which is the fastest way it can invoke an interface method (short of being able to devirtualize). In actual workloads, since ValueTask will end up wrapping many different IValueTaskSource implementations, I expect the tiered JIT-compiled code for ValueTask will end up having to use a slower polymorphic dispatch stub.
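
One way to probe the second point might be to consume through several distinct source types, so the interface call sites inside ValueTask<bool> see more than one receiver type. The sketch below is hypothetical: `CompletedSource`, `SourceA`/`SourceB`/`SourceC`, and `PolymorphicConsumer` are made-up names, and whether three types are enough to move the runtime onto a polymorphic stub is an assumption that would need measuring.

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Sources;

// Hypothetical base: an already-completed source, mirroring MyValueTaskSource.
public abstract class CompletedSource : IValueTaskSource<bool>
{
    protected ManualResetValueTaskSourceCore<bool> promise;
    public readonly short Token;

    protected CompletedSource()
    {
        promise.Reset();
        Token = promise.Version;
        promise.SetResult(true);
    }

    // Left abstract so that each derived type supplies its own implementation.
    public abstract bool GetResult(short token);
    public abstract ValueTaskSourceStatus GetStatus(short token);

    public void OnCompleted(Action<object> continuation, object state, short token, ValueTaskSourceOnCompletedFlags flags)
    {
    }
}

public sealed class SourceA : CompletedSource
{
    public override bool GetResult(short token) => promise.GetResult(token);
    public override ValueTaskSourceStatus GetStatus(short token) => promise.GetStatus(token);
}

public sealed class SourceB : CompletedSource
{
    public override bool GetResult(short token) => promise.GetResult(token);
    public override ValueTaskSourceStatus GetStatus(short token) => promise.GetStatus(token);
}

public sealed class SourceC : CompletedSource
{
    public override bool GetResult(short token) => promise.GetResult(token);
    public override ValueTaskSourceStatus GetStatus(short token) => promise.GetStatus(token);
}

public static class PolymorphicConsumer
{
    private static readonly CompletedSource[] sources =
    {
        new SourceA(), new SourceB(), new SourceC()
    };

    // Consumes one ValueTask<bool> per source type, the way await would:
    // IsCompleted first, then GetResult.
    public static bool ConsumeAll()
    {
        bool all = true;
        foreach (CompletedSource source in sources)
        {
            var awaiter = new ValueTask<bool>(source, source.Token).GetAwaiter();
            _ = awaiter.IsCompleted;
            all &= awaiter.GetResult();
        }
        return all;
    }
}
```

Calling `PolymorphicConsumer.ConsumeAll()` from a `[Benchmark]` method in place of the single-source `Interface()` body would be one way to compare; no numbers are claimed here.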

@jcouv
Member Author

jcouv commented Dec 10, 2018

Yup, the benefits of the optimization are even more stark when also calling IsCompleted.

| Method      |     Mean |     Error |    StdDev |
|-------------|---------:|----------:|----------:|
| Interface   | 29.37 ns | 0.4140 ns | 0.3232 ns |
| ManualCheck | 20.72 ns | 0.2668 ns | 0.2496 ns |

using System;
using System.Threading;
using System.Threading.Tasks;
using System.Threading.Tasks.Sources;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

// dotnet run -c Release
namespace MyBenchmarks
{
    [CoreJob]
    public class ValueTaskBenchmarkCore
    {
        private MyValueTaskSource source = new MyValueTaskSource();

        [Benchmark]
        public void Interface()
        {
            var task = new ValueTask<bool>(source, source.token);
            var awaiter = task.GetAwaiter();
            _ = awaiter.IsCompleted;
            awaiter.GetResult();
        }

        [Benchmark]
        public void ManualCheck()
        {
            short token = source.token;
            var promise = source.promise;

            if (promise.GetStatus(token) == ValueTaskSourceStatus.Succeeded)
            {
                var task = new ValueTask<bool>(promise.GetResult(token));
                var awaiter = task.GetAwaiter();
                _ = awaiter.IsCompleted;
                awaiter.GetResult();
                return;
            }

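            // Not expected to be reached: the promise is completed in the
            // MyValueTaskSource constructor. Make a miss obvious in the results.
            Thread.Sleep(10000);
            throw null;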
        }
    }

    public class Program
    {
        public static void Main(string[] args)
        {
            var summary = BenchmarkRunner.Run<ValueTaskBenchmarkCore>();
        }
    }

    public class MyValueTaskSource : IValueTaskSource<bool>
    {
        public ManualResetValueTaskSourceCore<bool> promise;
        public short token;

        public MyValueTaskSource()
        {
            promise = new ManualResetValueTaskSourceCore<bool>();
            promise.Reset();
            token = promise.Version;
            promise.SetResult(true);
        }

        public bool GetResult(short token) => promise.GetResult(token);

        public ValueTaskSourceStatus GetStatus(short token) => promise.GetStatus(token);

        public void OnCompleted(Action<object> continuation, object state, short token, ValueTaskSourceOnCompletedFlags flags)
        {
        }
    }
}

@jcouv jcouv added the Resolution-Fixed The bug has been fixed and/or the requested behavior has been implemented label Dec 18, 2018