Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix for events not in minimal representation #8567

Merged
merged 2 commits into from
Nov 3, 2023
Merged

Conversation

meganshand
Copy link
Contributor

A non-minimally represented allele in dbsnp was causing GenotypeGVCFs to throw an error. This PR fixes this and removes a TODO.

@meganshand
Copy link
Contributor Author

meganshand commented Oct 27, 2023

@davidbenjamin Do you have time for a review here? Or can you please suggest someone? Thanks!

Copy link
Contributor

@davidbenjamin davidbenjamin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just minor changes @meganshand.

* @return pair of alleles in minimal representation
*/
private static Pair<Allele, Allele> makeMinimalRepresentation(Allele ref, Allele alt) {
Utils.validateArg(!Arrays.equals(ref.getBases(), alt.getBases()), "ref and alt alleles are identical");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use Allele::baseMatch

refAllele = ref;
altAllele = alt;
} else {
Pair<Allele, Allele> minimalAlleles = makeMinimalRepresentation(ref, alt);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

final

byte[] newRefBases = ref.getBases();
byte[] newAltBases = alt.getBases();
while (newRefBases.length != 1 && newAltBases.length != 1 && !differentLastBase(newRefBases, newAltBases)) {
newRefBases = Arrays.copyOf(ref.getBases(), ref.length() - 1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The array operations shouldn't be inside the loop. First calculate how many bases are identical at the end, then trim the base arrays.

* @param alt allele not in minimal representation
* @return pair of alleles in minimal representation
*/
private static Pair<Allele, Allele> makeMinimalRepresentation(Allele ref, Allele alt) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because it's much more common that the allele representation is already minimal, it's worth checking for this and just returning the original ref and alt if possible without needing to call Allele.create.

@meganshand
Copy link
Contributor Author

Thanks @davidbenjamin, back to you!

Copy link
Contributor

@davidbenjamin davidbenjamin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@meganshand meganshand merged commit 2900e01 into master Nov 3, 2023
20 checks passed
@meganshand meganshand deleted the ms_minimal_event branch November 3, 2023 14:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants