Fix CI failure due to AZP multiple valid MST #406
Codecov Report
@@           Coverage Diff           @@
##            main    #406    +/-   ##
=======================================
+ Coverage   77.6%   77.7%   +0.1%
=======================================
  Files         27      27
  Lines       2634    2636      +2
=======================================
+ Hits        2045    2048      +3
+ Misses       589     588      -1
@gegen07 This is great work in finding the exact cause of the discrepancy here. While the solution here is probably not the most efficient, it certainly gets the job done and makes CI green again (which is a huge accomplishment). Here's a couple of points to start off the discussion:
@knaaptime @martinfleis What are y'all's thoughts on this?
I'm also wondering if functions like |
New failures due to pysal/libpysal#605, pysal/esda#271. Fix is on the way --> pysal/esda#272
spopt/region/util.py
@@ -747,6 +743,14 @@ def _randomly_divide_connected_graph(adj, n_regions):
f"equal to the number of nodes which is {n_areas}."
)
mst = csg.minimum_spanning_tree(adj)
mst_copy = mst.copy()
I think you can use a classic trick here:
symmetric_mst = (mst + mst.T) > 0
@ljwolf How does this look?
This is so smooth!
@gegen07 I know, right? Always some tricks to learn that I have never seen!
I think this is ready to go, but we'll wait for @ljwolf's approval to see if there are any more easy improvements.
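For readers following the thread, here is a minimal sketch of the symmetrization trick suggested above. The 4-node adjacency matrix is made-up illustration data, not anything from spopt; only `csg.minimum_spanning_tree` and the `(mst + mst.T) > 0` expression come from the discussion.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse import csgraph as csg

# Toy undirected graph with distinct edge weights (illustrative data only).
adj = csr_matrix(
    np.array(
        [
            [0, 1, 4, 0],
            [1, 0, 2, 0],
            [4, 2, 0, 3],
            [0, 0, 3, 0],
        ],
        dtype=float,
    )
)

# scipy stores each tree edge once, in a single direction.
mst = csg.minimum_spanning_tree(adj)

# Adding the transpose mirrors every edge; "> 0" turns the result into a
# boolean, symmetric (undirected) view of the same tree.
symmetric_mst = (mst + mst.T) > 0

print(mst.nnz, symmetric_mst.nnz)
```

The result no longer depends on which direction scipy happened to store each edge in, which is presumably why an explicit copy such as `mst_copy` is not needed for that purpose.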
Summary
Read the issue comment here. TL;DR: the MST computed from scipy.sparse graphs fluctuates between runs of the test suite. The proposed fix is in util.py: working with an undirected graph gives the same results across Python versions.

In a connected unweighted graph, every spanning tree is a minimum spanning tree, so the scipy algorithm can return more than one valid MST. Thus, I computed the distances between polygon centroids, based on Queen contiguity weights, to generate a unique valid MST for the Mexico shapefile in test_azp.py.

I changed the expected labels in the test to reflect all the work done in this PR.
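
The idea of replacing unit weights with centroid distances can be sketched as follows. This is only an illustration under stated assumptions: the four centroids and the 4-cycle contiguity are hypothetical stand-ins for the Mexico polygons and their Queen weights, and the code does not reproduce what test_azp.py actually does.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse import csgraph as csg

# Hypothetical centroids for four areas (NOT the Mexico shapefile).
centroids = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.5], [0.0, 1.2]])

# Binary contiguity forming the cycle 0-1-2-3-0. With unit weights, dropping
# any one of the four edges yields an equally valid MST.
rows = np.array([0, 1, 2, 3])
cols = np.array([1, 2, 3, 0])
binary = csr_matrix((np.ones(4), (rows, cols)), shape=(4, 4))
binary = binary + binary.T

# Replace each unit weight with the Euclidean distance between the
# centroids of the two neighbouring areas.
coo = binary.tocoo()
dist = np.linalg.norm(centroids[coo.row] - centroids[coo.col], axis=1)
weighted = csr_matrix((dist, (coo.row, coo.col)), shape=binary.shape)

# All four distances differ, so only one spanning tree has minimum weight:
# the one that leaves out the longest edge (between areas 1 and 2).
mst = csg.minimum_spanning_tree(weighted)
print(mst.toarray().round(3))
```

Distinct edge weights guarantee a unique MST. Centroid distances are usually distinct for irregular polygons, but ties remain possible (for example on a regular grid), so this is a practical fix rather than a general guarantee.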