-
Notifications
You must be signed in to change notification settings - Fork 593
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use ReadClipper in BaseQualityClipReadTransformer #3388
Conversation
Codecov Report
@@ Coverage Diff @@
## master #3388 +/- ##
===============================================
+ Coverage 80.467% 80.549% +0.081%
- Complexity 17507 17639 +132
===============================================
Files 1173 1175 +2
Lines 63376 63749 +373
Branches 9878 9963 +85
===============================================
+ Hits 50997 51349 +352
- Misses 8431 8443 +12
- Partials 3948 3957 +9
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor comments on how to reduce code duplication. Merge at your will. 👍
public GATKRead apply(GATKRead read) { | ||
GATKRead readClippedRightEnd = clipReadRightEnd(read); | ||
public GATKRead apply(final GATKRead read) { | ||
final GATKRead readClippedRightEnd = clipReadRightEnd(read); | ||
return clipReadLeftEnd(readClippedRightEnd); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know this exists before this PR, but while you are at it, could you add a documentation to getRightClipPoint(byte[])
and getLeftClipPoint(byte[])
explicitly saying that a negative return value means no clipping necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
public GATKRead apply(GATKRead read) { | ||
GATKRead readClippedRightEnd = clipReadRightEnd(read); | ||
public GATKRead apply(final GATKRead read) { | ||
final GATKRead readClippedRightEnd = clipReadRightEnd(read); | ||
return clipReadLeftEnd(readClippedRightEnd); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And, while looking at these two functions, they are very similar.
Considering that even though it is public but seemingly only this class itself is using it, I suggest the following change, but it is totally up to you:
@override
public GATKRead apply(final GATKRead read) {
return clipRead(clipRead(read, false), true);
}
private int getClipPoint( final byte[] quals, final boolean fromLeft ) {
final int readLength = quals.length;
final int start, end, step;
if (fromLeft) {
start = 0; end = readLength; step = 1;
} else {
start = readLength - 1; end = -1; step = -1;
}
int clipSum = 0, lastMax = -1, clipPoint = -1; // -1 means no clip
for (int i = start; i != end; i+= step) {
clipSum += (qTrimmingThreshold - quals[i]);
if (clipSum >= 0 && (clipSum >= lastMax)) {
lastMax = clipSum;
clipPoint = i;
}
}
return clipPoint;
}
private GATKRead clipRead(final GATKRead read, final boolean fromLeft) {
final byte[] quals = read.getBaseQualities();
final int clipPoint = getClipPoint(quals, fromLeft);
if (clipPoint != -1) {
final ReadClipper readClipper = new ReadClipper(read);
readClipper.addOp(fromLeft ? new ClippingOp(0, clipPoint) : new ClippingOp(clipPoint, read.getLength()));
return readClipper.clipRead(ClippingRepresentation.HARDCLIP_BASES);
} else {
return read;
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was my original approach, but I think it's easier to have two simple functions instead of one complicated function in this case.
import org.broadinstitute.hellbender.utils.read.GATKRead; | ||
|
||
import java.util.Arrays; | ||
|
||
/** | ||
* Clips reads on both ends using base quality scores | ||
*/ | ||
public class BaseQualityClipReadTransformer implements ReadTransformer { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I forgot to mention that you probably intended this class to be final
too, and its serialVersionUID
to be private
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes - done.
The earlier version of BaseQualityClipReadTransformer clipped bases by just removing them from the bases and base qualities arrays, but did not adjust the cigar, coordinate, etc. (thanks to @SHuang-Broad for catching this in #3354).
This commit fixes this behavior by invoking the ReadClipper class and adds cigar checking to the unit test. It also makes some minor style changes to the transformer and test code.