Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Parallel.ForEachAsync #46943

Merged
merged 2 commits into from
Jan 14, 2021
Merged

Add Parallel.ForEachAsync #46943

merged 2 commits into from
Jan 14, 2021

Conversation

stephentoub
Copy link
Member

Fixes #1946
cc: @kouvel, @tarekgh, @tomesendam

@ghost
Copy link

ghost commented Jan 13, 2021

Tagging subscribers to this area: @tarekgh
See info in area-owners.md if you want to be subscribed.

Issue Details

Fixes #1946
cc: @kouvel, @tarekgh, @tomesendam

Author: stephentoub
Assignees: -
Labels:

area-System.Threading.Tasks

Milestone: 6.0.0

@dotnet-issue-labeler
Copy link

Note regarding the new-api-needs-documentation label:

This serves as a reminder for when your PR is modifying a ref *.cs file and adding/modifying public APIs, to please make sure the API implementation in the src *.cs file is documented with triple slash comments, so the PR reviewers can sign off that change.

Also clean up a few things in the src, adding a few more asserts and moving nullable suppressions closer to the point where they're relevant.
@stephentoub stephentoub merged commit 6936e44 into dotnet:master Jan 14, 2021
@stephentoub stephentoub deleted the foreachasync branch January 14, 2021 18:47
@theodorzoulias
Copy link
Contributor

theodorzoulias commented Jan 24, 2021

Hi all. I played a bit with the Parallel.ForEachAsync method, and I think I may have found a bug in the current implementation. It seems that in the presence of an ambient SynchronizationContext, it is possible for the suppled body to not run always on the specified ParallelOptions.TaskScheduler. I observed this behavior with a ForEachAsync configured with a CurrentThreadTaskScheduler (from the ParallelExtensionsExtras), running under a Nito.AsyncEx.AsyncContext, in a .NET 5 Concole app. Only the first elements are invoked on the specified scheduler, and the rest are invoked on the ThreadPoolTaskScheduler.

Something strange also happens with this specific scheduler (CurrentThreadTaskScheduler), irrespective of the presence of a SynchronizationContext. The first few items of the enumerable are processed in reverse order (#4, #3, #2, #1, #5...). AFAICS this doesn't happen when the existing synchronous Parallel.ForEach method is configured with the same scheduler.

Thanks for the good work on this very much needed API!

@stephentoub
Copy link
Member Author

Can you please open an issue with a repro? Thanks.

@stephentoub
Copy link
Member Author

stephentoub commented Jan 24, 2021

Oh, CurrentThreadTaskScheduler. If there's a sync ctx on the current thread, the awaits in the implementation will see it. That's expected, and there's little that can or should be done about that. That sample CurrentThreadTaskScheduler does not make sense to use with anything on Parallel, as "run synchronously on the current thread" and "parallel" are opposites ;-)

@theodorzoulias
Copy link
Contributor

What I find compelling with the idea of combining ForEachAsync and CurrentThreadTaskScheduler is that I don't have to worry about thread synchronization inside the asynchronous delegate (the body). This doesn't really need to be parallelized, because it just returns a Task. The time-consuming work happens while the task is awaited, which is why parallelism is needed. I guess that instead of using a CurrentThreadTaskScheduler I could use an ConcurrentExclusiveSchedulerPair.ExclusiveScheduler, right? As I see now, this one seems to not be affected by the presence of the AsyncContext.

@theodorzoulias
Copy link
Contributor

No, the ConcurrentExclusiveSchedulerPair.ExclusiveScheduler doesn't cut it because it doesn't preserve the SynchronizationContext. What I probably need is to configure the ForEachAsync with the TaskScheduler.FromCurrentSynchronizationContext(). As I see now this preserves the context, although curiously the TaskScheduler.Current property still shows ThreadPoolTaskScheduler for all but the first four elements.

About posting a repro, I am not familiar with how GitHub works, but I could post something on dotnetfiddle.net if you want.

@stephentoub
Copy link
Member Author

stephentoub commented Jan 24, 2021

What are you trying to achieve? I'm not understanding why you're using Parallel.ForEachAsync. It sounds like you just want a regular await foreach loop

(And fwiw, CurrentThreadTaskScheduler is definitely not what you want, regardless of sync ctx. All it does is run a queued work item synchronously instead of queueing it. It doesn't run a continuation somehow back on the original thread.)

@theodorzoulias
Copy link
Contributor

theodorzoulias commented Jan 24, 2021

This a hypothetical usage example of the Parallel.ForEachAsync method in a WinForms application:

private async void Button_Click(object sender, EventArgs e)
{
    List<string> urls = GetUrls();
    await Parallel.ForEachAsync(urls, new ParallelOptions()
    {
        TaskScheduler = TaskScheduler.FromCurrentSynchronizationContext()
    }, async (url, token) =>
    {
        Debug.WriteLine(TaskScheduler.Current);
        var document = await _httpClient.GetStringAsync(url, token);
        ListBox1.Items.Add($"{url}: {document.Length} chars");
    });
}

It seems to work correctly with both a CurrentThreadTaskScheduler and a TaskScheduler.FromCurrentSynchronizationContext(). The ListBox1 is accessed from the UI thread, as it should. It's just a bit confuzing that the TaskScheduler.Current property inside the asynchronous lambda returns an object of type ThreadPoolTaskScheduler for all but the first few urls.

@stephentoub
Copy link
Member Author

Ah, you're trying to overlap the awaits but run all work on the UI thread.

TaskScheduler and SychronizationContext are both effectively schedulers, so when there's work to be queued, really only one can be used. If what you want is for all scheduling to be back to the original SynchronizationContext, try using TaskScheduler.FromCurrentSynchronizationContext.

@theodorzoulias
Copy link
Contributor

Yes, that's the idea. From my experience in StackOverflow the last couple of years, most folks who are searching for an async-friendly version of Parallel.ForEach are trying to do things like that. From what I've seen most of them are not going to bother with configuring the TaskScheduler. They're just going to Invoke their way to the UI thread, using the Dispatcher/Control.InvokeRequired from inside the asynchronous lambda. 😃

Thanks for the insight about the "conflict" between the TaskScheduler and the SychronizationContext. It has not caused me any problems in the past, so I guess it's not something serious to worry about.

@ghost ghost locked as resolved and limited conversation to collaborators Feb 27, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Async parallel foreach
4 participants