Fix warnings #193
Conversation
These writes were redundant. Also, the state of u would be updated to MATCHED before v, while the code expects that v is updated before u (see the assertion in line 388).
Note: removing the explicit constructor, since it interferes with the implicitly generated constructor/assignment operator; otherwise clang triggers unused warnings.
```diff
   success = joinCluster<has_fixed_vertices>(hypergraph,
                                             u, rep, cluster_ids, contracted_nodes, fixed_vertices);
-} else if ( _matching_state[v].compare_exchange_strong(unmatched, match_in_progress) ) {
+} else if ( _matching_state[v].compare_exchange_strong(expect_unmatched_v, match_in_progress) ) {
```
Specify a memory ordering here. Acquire should be the right choice here.
I added memory orderings for the whole file. For this specific line, even relaxed suffices, if I'm not mistaken.
```diff
@@ -33,12 +33,23 @@
 #include <tbb/enumerable_thread_specific.h>

 #ifdef __linux__
+#ifdef __clang__
+#pragma clang diagnostic push
+#pragma clang diagnostic ignored "-Wunused-parameter"
```
Can we fix the unused parameter warning?
Well, if anyone wants to open a PR at growt, go ahead 😄
Ah :) No thanks. I'm actually in favor of removing growt and just using tbb_unordered_map.
Yeah, removing might make sense, but perhaps in another PR.
The tbb::unordered_map was 2-3 orders of magnitude slower for caching the Steiner tree calculations -> do not remove :D
```diff
 uint8_t expect_unmatched_u = STATE(MatchingState::UNMATCHED);
-if ( _matching_state[u].compare_exchange_strong(expect_unmatched_u, match_in_progress) ) {
+if ( _matching_state[u].compare_exchange_strong(expect_unmatched_u, match_in_progress, std::memory_order_relaxed) ) {
   _matching_partner[u] = v;
```
In a compare-exchange, the memory ordering should be acquire
Uhm, for what reason? This is absolutely not generally true. You can't just map operations to orderings; that is not how memory-ordering semantics work.
Wouldn't some of the loads and stores in the locked state be allowed to move before the CAS? Is that why the load on line 389 has acquire ordering? Does that prevent this?
So, the accesses we need to be careful about are loads and stores of `cluster_ids`. New cluster IDs are written by the "owning" thread of the according node in the `joinCluster` operation. Cluster IDs are read either during conflict resolution (line 440) or when the state of `v` is already MATCHED (line 393). So in these two cases, we need to establish a happens-before relationship between the reading thread (let's say `t_u`) and the writing thread (`t_v`).

Line 393: Since `u` is owned by `t_u`, no thread other than `t_u` has performed a write on `_matching_state[u]`. Thus, using acquire semantics for the `compare_exchange` wouldn't do anything here, since there is nothing to establish a happens-before relationship with. However, `t_v` has updated `_matching_state[v]` to MATCHED after updating the cluster ID. What we need is therefore a happens-before relationship with regards to `_matching_state[v]`. This is achieved by the release semantics in line 448 (as well as 403/430, though I think the release semantics aren't even strictly necessary there, since the cluster ID does not change) and the load in line 389 with acquire semantics.

Line 440: Same principle; in this case the acquire semantics on `_matching_state[v]` are provided by the load in line 432.

Now, of course this does not provide any ordering guarantees for the accesses to `_matching_partner` (e.g. line 384). However, as discussed before, acquire semantics on the `compare_exchange` wouldn't help with this either, since there is nothing to synchronize against (I guess if we wanted that ordering we could use release semantics for the write in line 384). Luckily, the conflict resolution code does not need any ordering guarantees, since it just loops until a consistent state is detected. Only when the conflict is resolved and writes happen do we need the acquire/release semantics.

So the summarized argument for correctness is that (a) all accesses to `cluster_ids` are synchronized via acquire/release semantics on the read/write when the state of `v` changes to MATCHED, and (b) we don't need ordering guarantees for accesses to `_matching_partner`.
Of course there is also `_cluster_weight`, but here we just accept that a cluster might get too heavy if a race condition happens. To fix this we would need a `compare_exchange` loop in `joinCluster`, which we probably don't want.
Fixes a bunch of warnings from newer compiler, CMake, and dependency versions, as well as a spuriously failing assertion in the coarsening code, which was due to a redundant write (discovered in #188).