Add concurrency safe iterator wrapper ChannelLike #121

Merged
merged 1 commit into from
Sep 27, 2024

Conversation

fredrikekre
Contributor

Just to show that the idea from JuliaFolds2/ChunkSplitters.jl#49 can be used to improve the (chunked) GreedyScheduler.
The advantages are that the iterator doesn't have to be copied into a Channel, and that the AtomicChunk-thingy is more efficient (but not as general, of course) than a Channel. See some benchmarks in the test.jl file.
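To make the idea concrete, here is a minimal sketch of a wrapper that lets several tasks iterate one indexable collection without copying it, by claiming indices through an atomic counter. The names `SharedItems` and `parallel_sum` are hypothetical and chosen for illustration; this is not the code added in this PR, only an assumption about how such a wrapper could look.

```julia
# Hypothetical sketch (not the PR's implementation): each call to `iterate`
# atomically claims the next index, so every element is handed out exactly once
# even when many tasks loop over the same wrapper.
struct SharedItems{T}
    items::T
    next::Threads.Atomic{Int}   # index of the next unclaimed element
end
SharedItems(items) = SharedItems(items, Threads.Atomic{Int}(firstindex(items)))

function Base.iterate(s::SharedItems, ::Nothing = nothing)
    i = Threads.atomic_add!(s.next, 1)   # returns the value before the increment
    return i <= lastindex(s.items) ? (s.items[i], nothing) : nothing
end

# Sum f(x) over `items`, with `ntasks` tasks pulling from the shared wrapper.
function parallel_sum(f, items; ntasks = Threads.nthreads())
    shared = SharedItems(items)
    tasks = map(1:ntasks) do _
        Threads.@spawn begin
            acc = 0
            for x in shared      # safe: indices are claimed atomically
                acc += f(x)
            end
            acc
        end
    end
    return sum(fetch, tasks)
end

parallel_sum(abs2, 1:1000)
```

The wrapper is consumed once, like a `Channel`: after all indices are claimed, further iteration yields nothing. It also requires integer indexing with unit stride, which is the "not as general" trade-off mentioned above.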

@MasonProtter
Member

Very nice idea, I had wanted something like this before.

@fredrikekre fredrikekre changed the title WIP: Use AtomicChunk Channel-like iterator for GreedyScheduler Add concurrency safe iterator wrapper ChannelLike Sep 26, 2024
@fredrikekre
Contributor Author

fredrikekre commented Sep 26, 2024

Ok, cleaned it up and added some docs. If you prefer to keep it internal we can do that, but it isn't really much code to maintain. On the other hand, perhaps this isn't a package of "generally useful things". Let me know what you think.

Posting the benchmarks here for posterity since I removed the file with them from the commit:

Benchmark:

using OhMyThreads: tmapreduce, GreedyScheduler
using BenchmarkTools: @btime

function mc_parallel(N; ntasks, chunksize)
    scheduler = GreedyScheduler(; ntasks, chunksize, split = :consecutive)
    M = tmapreduce(+, 1:N; scheduler) do i
        rand()^2 + rand()^2 < 1.0
    end
    pi = 4 * M / N
    return pi
end

const N = 100_000_000

for chunksize in (10, 1_000, 100_000), ntasks in (1, 2, 4)
    @btime mc_parallel(N; ntasks = $ntasks, chunksize = $chunksize)
end

PR:

chunksize = 10, ntasks = 1, 2, 4:
950.227 ms (15 allocations: 944 bytes)
656.752 ms (21 allocations: 1.48 KiB)
562.387 ms (33 allocations: 2.61 KiB)

chunksize = 1_000, ntasks = 1, 2, 4:
506.860 ms (15 allocations: 944 bytes)
252.174 ms (21 allocations: 1.48 KiB)
129.447 ms (33 allocations: 2.61 KiB)

chunksize = 100_000, ntasks = 1, 2, 4:
499.351 ms (15 allocations: 944 bytes)
249.004 ms (21 allocations: 1.48 KiB)
127.825 ms (33 allocations: 2.61 KiB)

master:

chunksize = 10, ntasks = 1, 2, 4:
2.575 s (53 allocations: 439.07 MiB)
3.593 s (59 allocations: 219.72 MiB)
3.346 s (70 allocations: 110.09 MiB)

chunksize = 1_000, ntasks = 1, 2, 4:
518.885 ms (46 allocations: 4.06 MiB)
263.877 ms (52 allocations: 4.06 MiB)
141.339 ms (60 allocations: 69.31 KiB)

chunksize = 100_000, ntasks = 1, 2, 4:
500.587 ms (41 allocations: 67.19 KiB)
252.641 ms (47 allocations: 67.75 KiB)
127.858 ms (59 allocations: 26.23 KiB)
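For contrast, a `Channel`-backed work queue has to hold every chunk before workers can take them, which is one plausible reading of the larger memory footprint on master at small chunk sizes. The sketch below uses only Base Julia; the function name `channel_sum` is hypothetical, and this is an assumption about the general pattern rather than the scheduler's actual code.

```julia
# Hypothetical sketch of a Channel-based shared work queue (Base Julia only).
# All chunks are materialized and put into a buffered Channel up front.
function channel_sum(f, items; ntasks = Threads.nthreads(), chunksize = 10)
    chunks = collect(Iterators.partition(items, chunksize))
    queue = Channel{eltype(chunks)}(length(chunks)) do ch
        foreach(chunk -> put!(ch, chunk), chunks)   # copy every chunk into the queue
    end
    tasks = map(1:ntasks) do _
        Threads.@spawn begin
            acc = 0
            for chunk in queue       # each take from the Channel is synchronized
                acc += sum(f, chunk)
            end
            acc
        end
    end
    return sum(fetch, tasks)
end

channel_sum(abs2, 1:1000)
```

With a chunk size of 10 and 100 million items, a queue like this would hold ten million chunks, whereas an atomic counter over the original iterator stores only one integer.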

@fredrikekre fredrikekre marked this pull request as ready for review September 26, 2024 23:21
Member

@carstenbauer carstenbauer left a comment

LGTM and I'm fine with documenting this as API. However, I think we need to modify the GreedyScheduler docstring which currently mentions Channel explicitly. Probably we shouldn't mention the specific implementation of the "storage" at all.

@codecov-commenter

codecov-commenter commented Sep 27, 2024

⚠️ Please install the Codecov app to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

Attention: Patch coverage is 80.00000% with 2 lines in your changes missing coverage. Please review.

Project coverage is 85.81%. Comparing base (cba9127) to head (1776729).
Report is 127 commits behind head on master.

Files with missing lines    Patch %    Lines
src/types.jl                77.77%     2 Missing ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #121      +/-   ##
==========================================
- Coverage   90.24%   85.81%   -4.44%     
==========================================
  Files           3        7       +4     
  Lines          82      578     +496     
==========================================
+ Hits           74      496     +422     
- Misses          8       82      +74     


@carstenbauer
Member

BTW, it could perhaps make sense to mention this in the docs under https://juliafolds2.github.io/OhMyThreads.jl/stable/literate/tls/tls/#Another-safe-way-based-on-Channel.

@fredrikekre
Contributor Author

Added to changelog and updated the GreedyScheduler docstring.

@carstenbauer
Member

Unfortunately, we mention "channel" more times in the docstring in the keyword arguments section. Would be great if you could "fix" this.

@fredrikekre
Contributor Author

> BTW, could perhaps make sense to mention this in the docs under

Done.

> Unfortunately, we mention "channel" more times in the docstring

Fixed, replaced with "shared workqueue".

`ChannelLike` wraps an indexable object such that it can be iterated by
concurrent tasks in a safe manner similar to a `Channel`. This is used
instead of `Channel` in the chunked `GreedyScheduler`.
@carstenbauer carstenbauer merged commit b6fa541 into JuliaFolds2:master Sep 27, 2024
9 checks passed
@fredrikekre fredrikekre deleted the fe/atomicchunk branch September 27, 2024 08:54
4 participants