Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mark parallel algorithms as experimental #10

Closed
ViralBShah opened this issue Oct 18, 2021 · 12 comments
Closed

Mark parallel algorithms as experimental #10

ViralBShah opened this issue Oct 18, 2021 · 12 comments
Labels
bug Something isn't working

Comments

@ViralBShah
Copy link
Contributor

ViralBShah commented Oct 18, 2021

A parallel BFS non-deterministic issue in #9
https://github.com/JuliaGraphs/Graphs.jl/pull/9/checks?check_run_id=3932016414#step:6:227

Trace says this is where the test fails: https://github.com/JuliaGraphs/Graphs.jl/blob/master/src/Parallel/traversals/gdistances.jl#L67

The test failed in the parallel gdistances test.

@ViralBShah ViralBShah added the bug Something isn't working label Oct 18, 2021
@ViralBShah
Copy link
Contributor Author

ViralBShah commented Oct 18, 2021

Could it be because writes in cases where data races may occur are not atomic?

local_front = cur_front_t[t] # Data race, but first read always succeeds

# Data race, but first read on visited[i] always succeeds

@ViralBShah
Copy link
Contributor Author

I'm unable to recreate this test failure outside of CI.

@ViralBShah ViralBShah changed the title Non-deterministic test in BFS Non-deterministic test failure in BFS Oct 19, 2021
@Keno
Copy link
Contributor

Keno commented Nov 8, 2021

This entire function is making assumptions about Julia's threaded memory model that do not hold true. I'm not sure as to the exact cause of the CI issue, but this function is entirely unsound as is.

@ViralBShah
Copy link
Contributor Author

I was wondering if the parallel graph algorithms should be split out of Graphs.jl into a separate more experimental package.

@ViralBShah ViralBShah changed the title Non-deterministic test failure in BFS Split parallel algorithms into a separate package Nov 8, 2021
@ViralBShah
Copy link
Contributor Author

Discussion around threaded gdistance: https://discourse.julialang.org/t/optimistic-concurrency-control/12402

As @Keno pointed out on slack, this could be correctly implemented with 1.7 atomics. But for now, it would be best to split these into a separate package to make Graphs.jl more reliable and a nice side effect would be that it will have fewer dependencies.

@jpfairbanks
Copy link
Member

Yes I think this is appropriate unless anyone wants to take on the mantle of getting this updated to correctly use the 1.7 atomics in a hurry. We might find out if anyone is using the parallel stuff when we remove it.

@tkf
Copy link

tkf commented Nov 13, 2021

FYI there's https://github.com/JuliaConcurrent/UnsafeAtomics.jl (registered) and a wrapper https://github.com/JuliaConcurrent/AtomicArrays.jl (not registered). You'd need something like this for concurrently accessed arrays like visited.

Also, Julia 1.7 does not help much here since we still don't have atomically accessible arrays JuliaLang/julia#32455. However, if we only need to access small isbits types, we don't need direct support in the Julia core. I think I can provide pre-1.7-compatible API from AtomicArrays.jl. if people need it. (Or you can just directly write llvm IR)

@simonschoelly
Copy link
Member

The parallel functions are already in a Parallel submodule, that could be considered experimental. I haven't looked at the issues here yet, but I would rather keep these function in this package for now - maybe disable the tests and add some warnings to the documentation.

@ViralBShah ViralBShah changed the title Split parallel algorithms into a separate package Mark parallel algorithms as experimental Nov 22, 2021
@ViralBShah
Copy link
Contributor Author

If we mark the parallel algorithms experimental, then we can disable the tests for the ones that are failing, and make it easier to get new functionality in.

@jpfairbanks
Copy link
Member

I guess that just takes someone to update the readme saying that the Parallel module is no longer supported on Julia 1.7 and tag a minor release with that code disabled. I think we should leave it in the repo, but delete the include calls, so that it is easier for someone to say "why doesn't this parallel code work" and try to fix it. Deleting the code makes it less likely that someone would pick ip from where we left off.

@ViralBShah
Copy link
Contributor Author

Agree - that's exactly what I was thinking. I have seen only gdistance and bfs failing, and it seems easy to just disable those two

https://github.com/JuliaGraphs/Graphs.jl/blob/master/test/parallel/runtests.jl

But do the other algorithms use these? Or maybe we just keep disabling things until we reach a happy point?

@simonschoelly
Copy link
Member

Done in #94

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

5 participants