Fix warnings #193

Merged · 20 commits · Oct 16, 2024
Conversation

@N-Maas (Collaborator) commented Oct 2, 2024

Fixes a bunch of warnings from newer compiler, CMake, and dependency versions, as well as a spuriously failing assertion in the coarsening code that was caused by a redundant write (discovered in #188).

mt-kahypar/partition/coarsening/multilevel_coarsener.h
  success = joinCluster<has_fixed_vertices>(hypergraph,
    u, rep, cluster_ids, contracted_nodes, fixed_vertices);
- } else if ( _matching_state[v].compare_exchange_strong(unmatched, match_in_progress) ) {
+ } else if ( _matching_state[v].compare_exchange_strong(expect_unmatched_v, match_in_progress) ) {
Member:
Specify a memory ordering here. Acquire should be right here.

Collaborator (Author):

I added memory orderings for the whole file. For this specific line, even relaxed suffices, if I'm not mistaken.
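For reference, here is a minimal standalone sketch (hypothetical state encoding, not the mt-kahypar code) of how an explicit ordering is passed to compare_exchange_strong, in both the single-ordering form used in this diff and the separate success/failure form:

#include <atomic>
#include <cstdint>

int main() {
  // Hypothetical state encoding, just for illustration.
  const uint8_t unmatched = 0, match_in_progress = 1, matched = 2;
  std::atomic<uint8_t> state{unmatched};

  // Single-ordering form: the same ordering applies to success and failure.
  uint8_t expected = unmatched;
  bool claimed = state.compare_exchange_strong(expected, match_in_progress,
                                               std::memory_order_relaxed);

  // Two-ordering form: acquire on success, relaxed on the failing load.
  expected = match_in_progress;
  state.compare_exchange_strong(expected, matched,
                                std::memory_order_acquire,
                                std::memory_order_relaxed);

  return claimed ? 0 : 1;
}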

@@ -33,12 +33,23 @@
  #include <tbb/enumerable_thread_specific.h>

  #ifdef __linux__
+ #ifdef __clang__
+ #pragma clang diagnostic push
+ #pragma clang diagnostic ignored "-Wunused-parameter"
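For context, the usual shape of this suppression pattern is sketched below (hypothetical include, the actual growt block in the PR may differ): the push is paired with a matching pop after the offending third-party headers, so the warning stays enabled for the project's own code.

// Hypothetical illustration of the suppress-around-include pattern.
#ifdef __clang__
#pragma clang diagnostic push
#pragma clang diagnostic ignored "-Wunused-parameter"
#endif

// #include <growt/...>   // third-party headers that trigger -Wunused-parameter

#ifdef __clang__
#pragma clang diagnostic pop   // restore the previous warning configuration
#endif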
Member:

Can we fix the unused parameter warning?

Collaborator (Author):

Well, if anyone wants to open a PR at growt, go ahead 😄

Member:

Ah :) No thanks. I'm actually in favor of removing growt and just using tbb_unordered_map.

Collaborator (Author):

Yeah, removing it might make sense, but perhaps in another PR.

Member:

The tbb::unordered_map was 2-3 orders of magnitude slower for caching the Steiner tree calculations -> do not remove :D

tests/CMakeLists.txt
  uint8_t expect_unmatched_u = STATE(MatchingState::UNMATCHED);
- if ( _matching_state[u].compare_exchange_strong(expect_unmatched_u, match_in_progress) ) {
+ if ( _matching_state[u].compare_exchange_strong(expect_unmatched_u, match_in_progress, std::memory_order_relaxed) ) {
    _matching_partner[u] = v;
Member:

In a compare-exchange, the memory ordering should be acquire

Collaborator (Author):

Uhm, for what reason? This is absolutely not generally true. You can't just map operations to orderings; that is not how memory ordering semantics work.

Member:

Wouldn't some of the loads and stores in the locked state be allowed to be moved before the CAS? Is that why the load on line 389 has acquire ordering? Does that prevent this?

Collaborator (Author):

So, the accesses we need to be careful about are loads and stores of cluster_ids. New cluster IDs are written by the "owning" thread of the corresponding node in the joinCluster operation. Cluster IDs are read either during conflict resolution (line 440) or when the state of v is already MATCHED (line 393). So in these two cases, we need to establish a happens-before relationship between the reading thread (let's say t_u) and the writing thread (t_v).

Line 393: Since u is owned by t_u, no thread other than t_u has performed a write on matching_state[u]. Thus, using acquire semantics for the compare_exchange wouldn't do anything here, since there is nothing to establish a happens-before relationship with. However, t_v has updated matching_state[v] to MATCHED after updating the cluster ID. What we need is therefore a happens-before relationship with regard to matching_state[v]. This is achieved by the release semantics in line 448 (as well as 403/430, though I think the release semantics aren't even strictly necessary there, since the cluster ID does not change) and the load on line 389 with acquire semantics.

Line 440: Same principle, in this case the acquire semantics on matching_state[v] are provided by the load in line 432.

Now, of course this does not provide any ordering guarantees for the accesses to _matching_partner (e.g. line 384). However, as discussed before, acquire semantics on the compare_exchange wouldn't help with this either, since there is nothing to synchronize against (I guess if we want that ordering we could use release semantics for the write in line 384). Luckily, the conflict resolution code does not need any ordering guarantees, since it just loops until a consistent state is detected. Only when the conflict is resolved and writes happen do we need the acquire/release semantics.

So the summarized argument for the correctness is that (a) all accesses to cluster_ids are synchronized via acquire/release semantics on the read/write when cluster_ids[v] changes state to MATCHED and (b) we don't need ordering guarantees for accesses to _matching_partner.
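A minimal standalone sketch of the acquire/release pairing described above (hypothetical names and values, not the actual coarsener code): the release store that publishes MATCHED synchronizes with the acquire load, which makes the earlier plain write to the cluster ID visible to the reading thread.

#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>
#include <vector>

enum : uint8_t { UNMATCHED = 0, MATCHED = 2 };    // hypothetical state encoding

std::vector<uint64_t> cluster_ids(2, 0);          // plain (non-atomic) data
std::atomic<uint8_t> matching_state_v{UNMATCHED};

void owner_of_v() {                 // plays the role of t_v
  cluster_ids[1] = 42;              // write the new cluster ID of v ...
  matching_state_v.store(MATCHED, std::memory_order_release);   // ... then publish
}

void owner_of_u() {                 // plays the role of t_u
  if (matching_state_v.load(std::memory_order_acquire) == MATCHED) {
    // The acquire load synchronizes with the release store, so this read
    // is guaranteed to see the cluster ID written before that store.
    assert(cluster_ids[1] == 42);
  }
}

int main() {
  std::thread tv(owner_of_v), tu(owner_of_u);
  tv.join();
  tu.join();
}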

Of course there is also _cluster_weight, but here we just accept that a cluster might get too heavy if a race condition happens. To fix this we would need a compare_exchange loop in joinCluster, which we probably don't want.
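Purely as an illustration of that alternative (which the PR does not adopt), a weight-bounded join via a compare_exchange loop could look roughly like this sketch; the function name and types are assumptions, not the actual joinCluster code.

#include <atomic>
#include <cstdint>

using HypernodeWeight = int64_t;   // assumed weight type

bool try_add_to_cluster(std::atomic<HypernodeWeight>& cluster_weight,
                        HypernodeWeight node_weight,
                        HypernodeWeight max_cluster_weight) {
  HypernodeWeight current = cluster_weight.load(std::memory_order_relaxed);
  while (current + node_weight <= max_cluster_weight) {
    // Commit the addition only if no other thread changed the weight in between;
    // on failure, compare_exchange_weak reloads `current` and the bound is re-checked.
    if (cluster_weight.compare_exchange_weak(current, current + node_weight,
                                             std::memory_order_relaxed)) {
      return true;    // joined without exceeding the limit
    }
  }
  return false;       // joining would exceed the maximum cluster weight
}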

@N-Maas merged commit 0ef674a into master on Oct 16, 2024
16 checks passed