In this PR, I add the Garud's G statistics (G1, G12, G123, G2/G1) to scikit-allel.
We use the same hashing function as the Garud's H functions, rather than the optimised (DJB33XA) hash function used in malariagen_data. Currently I can't actually install scikit-allel and I'm not sure what's going on; I have asked @alimanfoo for help.
We also add a function diplotype_frequencies(), which computes the frequencies of distinct diplotypes. This is not quite analogous to the garuds_h function, because garuds_h takes an allel.HaplotypeArray, which has the method distinct_frequencies(). garuds_g instead expects an array of alternate allele counts (biallelic, from GenotypeArray.to_n_alt()). That is a plain NumPy array rather than a scikit-allel class, so we cannot add methods to it.
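To illustrate the idea, here is a rough sketch of the calculation. It is not the code in this PR. The function names `diplotype_frequencies_sketch` and `garud_g_sketch` are hypothetical, the input is assumed to have shape (n_variants, n_samples), and Python's built-in `hash()` on each column's bytes stands in for whatever hashing the H functions use. The G formulas are assumed to mirror the usual H1, H12, H123 and H2/H1 definitions, applied to diplotype frequencies.

```python
from collections import Counter

import numpy as np


def diplotype_frequencies_sketch(gn):
    """Frequencies of distinct diplotypes, sorted in descending order.

    gn: array of alternate allele counts, assumed shape (n_variants, n_samples).
    """
    gn = np.asarray(gn)
    n_samples = gn.shape[1]
    # Each column is one sample's diplotype; hash its raw bytes
    # (assumption: a stand-in for the hashing used by the H functions).
    keys = [hash(gn[:, i].tobytes()) for i in range(n_samples)]
    counts = Counter(keys)
    return np.array(sorted(counts.values(), reverse=True), dtype=float) / n_samples


def garud_g_sketch(gn):
    """Return (G1, G12, G123, G2/G1) from an array of alternate allele counts."""
    f = diplotype_frequencies_sketch(gn)
    g1 = np.sum(f ** 2)
    # Pool the two (or three) most common diplotypes before squaring.
    g12 = np.sum(f[:2]) ** 2 + np.sum(f[2:] ** 2)
    g123 = np.sum(f[:3]) ** 2 + np.sum(f[3:] ** 2)
    # G2 excludes the most common diplotype.
    g2 = g1 - f[0] ** 2
    return g1, g12, g123, g2 / g1


# Toy input: 2 variants, 4 samples; two samples share a diplotype.
gn = np.array([[0, 0, 1, 2],
               [1, 1, 0, 2]], dtype="i1")
g1, g12, g123, g2_g1 = garud_g_sketch(gn)
```

In real use the input would come from GenotypeArray.to_n_alt() rather than a hand-written array.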
As far as I can see, H12 does not have tests, so I haven't added any yet, but will do.