-
Notifications
You must be signed in to change notification settings - Fork 593
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implemented watershed algorithm for finding local minima in 1D data based on topological persistence. #3515
Conversation
b25c8ea
to
2688531
Compare
@asmirnov239 Please review! |
daf3b49
to
fe8ba43
Compare
Codecov Report
@@ Coverage Diff @@
## master #3515 +/- ##
===============================================
+ Coverage 79.905% 79.917% +0.012%
- Complexity 17918 17945 +27
===============================================
Files 1199 1200 +1
Lines 65102 65195 +93
Branches 10142 10160 +18
===============================================
+ Hits 52020 52102 +82
- Misses 9042 9049 +7
- Partials 4040 4044 +4
|
fe8ba43
to
a77e859
Compare
.mapToDouble(i -> (double) i / numPoints - 0.5) | ||
.map(x -> (x - 0.75) * (x - 0.5) * (x - 0.25) * (x - 0.1) * (x + 0.25) * (x + 0.5) + 0.0001 * rng.nextGaussian()) | ||
.toArray(); | ||
System.out.println(Arrays.stream(dataPolynomial).boxed().map(Object::toString).collect(Collectors.joining(", "))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops! Reminder to self to remove this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done with the review. Looks great, just a few minor comments!
/** | ||
* Identifies the local minima of {@code data} based on topological persistence upon construction. | ||
*/ | ||
public PersistenceOptimizer(final double[] data) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you leave a description here of the what the data parameter represents, i.e. that it's ordered values of a 1D function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
|
||
//find extrema pairs via a watershed algorithm | ||
private static List<ExtremaPair> findExtremaPairs(final double[] data, | ||
final List<Integer> sortedIndices) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe that sortedIndices
has to be stably sorted (which it was in the constructor) for the algorithm to work correctly (since if we have a local constant interval in the provided function, we will could get a lot of spurious components). Could you still mention this requirement here for the future devs, and also add an extra test for this test case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Expanded the comment. I have a test case where the entire function is constant, but I'll add one like you describe.
extremaPairs.add(new ExtremaPair(data, components.get(rightColor).minIndex, index)); | ||
} | ||
mergeComponents(components, colors, leftColor, rightColor); | ||
colors[index] = colors[index - 1]; //local maxmimum takes color of component to left by convention |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maxmimum -> maximum
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
extremaPairs.add(new ExtremaPair(data, components.get(rightColor).minIndex, index)); | ||
} | ||
mergeComponents(components, colors, leftColor, rightColor); | ||
colors[index] = colors[index - 1]; //local maxmimum takes color of component to left by convention |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little confused here. Isn't the local maximum a part of a larger component at this point in the code, color of which is determined by which of the merged components has the lowest minimum? Also can't we just skip recoloring of the local maximum as we skip recoloring of non-boundary component values in the mergeComponents
method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the maximum has been assigned a color yet, since we are traversing the points in ascending order, and both createComponent
and extendComponent
only assign colors to the point under consideration.
You are right in that assigning a color is not necessary just to find and order the local minima, but it's nice to have all points assigned a color at the end (and that's probably what would reasonably be expected from anyone looking to reuse this code in the future).
But perhaps it would be more consistent to also assign the color within mergeComponents
; i.e., I'll change the signature of mergeComponents
so that we call mergeComponents(components, colors, leftColor, rightColor, index)
and will move the assignment of the color into the method accordingly.
colors[components.get(indexToMerge).leftIndex] = indexToKeep; | ||
colors[components.get(indexToMerge).rightIndex] = indexToKeep; | ||
if (components.get(indexToKeep).minIndex > components.get(indexToMerge).minIndex) { | ||
components.get(indexToKeep).leftIndex = components.get(indexToMerge).leftIndex; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not removing the components.get(indexToMerge)
component may introduce some space overhead, which could be linear in most degenerate examples, but I don't see a quick work-around..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically true, but space considerations are probably not an issue in most cases (and I think actually removing "merged" components also complicates the bookkeeping a bit).
.toArray(); | ||
final List<Integer> minimaIndicesExpectedIdentityNegative = Collections.singletonList(numPoints - 1); | ||
|
||
final double[] dataFlat = new double[numPoints]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add an extra test case, which has a flat part, but also has a single maxima and minima?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good suggestion---the result of that test is actually a little non-intuitive! Added some extra cases to boot.
2c5d6b5
to
5b039fd
Compare
Addressed review comments, expanded documentation, added some test cases, and did some minor renaming/cleanup. Thanks, back to you @asmirnov239! |
…ased on topological persistence.
5b039fd
to
990d05c
Compare
The changes look good @samuelklee! |
This bit of code will be used for the new segmentation method, but could be more generally useful as well. I put it in a utils package within the copynumber package for now, but we can put it somewhere else if more appropriate.