add 2022 JGI Petabyte Scale Sequence Search workshop slideshow into docs #2252

Open
ctb opened this issue Sep 2, 2022 · 0 comments
Labels
doc documentation content or issues

ctb commented Sep 2, 2022

https://hackmd.io/_h-BqlNWS4uqb9LkiOeALg

Fractional sketches

or, some details of how sourmash and FracMinHash work!


Consider overlaps between k-mers extracted from three genomes - two that share sequence, and one that does not.

From these overlaps we can calculate k-mer-based similarity measures (Jaccard similarity and containment).
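
As a minimal sketch of these two measures, the toy example below builds k-mer sets from three short made-up sequences (two overlapping, one unrelated) and compares them. The sequences, the value of `k`, and the function names are illustrative assumptions, not part of sourmash; reverse complements are ignored here for simplicity, whereas sourmash works with canonical k-mers for DNA.

```python
def kmers(seq, k):
    """Return the set of all k-length substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity: |A & B| / |A | B|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def containment(a, b):
    """Containment of A in B: |A & B| / |A|."""
    return len(a & b) / len(a) if a else 0.0


# Hypothetical toy "genomes": A and B share sequence, C does not.
genome_a = "ATGGCGTACGTTAGC"
genome_b = "GGCGTACGTTAGCAA"
genome_c = "TTTTTTTTCCCCCCC"
k = 5

ka, kb, kc = kmers(genome_a, k), kmers(genome_b, k), kmers(genome_c, k)

print("Jaccard(A, B)     =", jaccard(ka, kb))
print("Containment(A, B) =", containment(ka, kb))
print("Jaccard(A, C)     =", jaccard(ka, kc))
```

Jaccard is symmetric, while containment is not: it asks how much of the first set is found in the second, which is useful when the two collections differ greatly in size.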


FracMinHash sketching compresses k-mer collections while retaining set relationships

This is implemented in the software sourmash.
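
The idea can be sketched as follows, under stated assumptions: hash every k-mer to a 64-bit integer and keep only the hashes that fall below `2**64 / scaled`, so that roughly 1 in `scaled` distinct k-mers is retained. This is not sourmash's implementation: sourmash uses MurmurHash3 and canonical k-mers, while this sketch substitutes a truncated SHA-256 from the standard library, and the names `hash_kmer` and `frac_minhash` are hypothetical.

```python
import hashlib
import random

MAX_HASH = 2**64


def hash_kmer(kmer):
    """Map a k-mer to a 64-bit integer (stand-in for MurmurHash3)."""
    digest = hashlib.sha256(kmer.encode()).digest()
    return int.from_bytes(digest[:8], "big")


def frac_minhash(seq, k, scaled):
    """Keep the hashes of k-mers that fall below MAX_HASH // scaled."""
    threshold = MAX_HASH // scaled
    hashes = (hash_kmer(seq[i:i + k]) for i in range(len(seq) - k + 1))
    return {h for h in hashes if h < threshold}


# Hypothetical random "genome", seeded so the run is reproducible.
rng = random.Random(0)
genome = "".join(rng.choice("ACGT") for _ in range(5000))

full = frac_minhash(genome, k=21, scaled=1)     # every distinct k-mer hash
sketch = frac_minhash(genome, k=21, scaled=10)  # roughly 1/10 of them

print(len(full), len(sketch))
```

Because the same threshold is applied to every dataset, a k-mer is either kept in all sketches or dropped from all of them. That is why intersections and unions of sketches still estimate Jaccard similarity and containment of the underlying k-mer sets.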


These Jaccard (k-mer) measures can be translated to ANI

Credits: Dr. Tessa Pierce-Ward et al.; more info.
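
A rough illustration of the translation, assuming a simple model in which each base mutates independently: a k-mer survives unchanged with probability ANI^k, so containment C gives the point estimate ANI ≈ C^(1/k). For Jaccard, the sketch below first converts to containment with C = 2J / (1 + J), which holds only when the two k-mer sets are the same size. The function names are hypothetical, and the debiasing and confidence intervals described by Hera et al. are not reproduced here.

```python
def containment_to_ani(containment, k):
    """Point estimate of ANI from containment: C ** (1 / k)."""
    if not 0.0 <= containment <= 1.0:
        raise ValueError("containment must be between 0 and 1")
    return containment ** (1.0 / k)


def jaccard_to_containment(jaccard):
    """Convert Jaccard to containment, ASSUMING equal-sized k-mer sets."""
    if not 0.0 <= jaccard <= 1.0:
        raise ValueError("jaccard must be between 0 and 1")
    return 2.0 * jaccard / (1.0 + jaccard)


def jaccard_to_ani(jaccard, k):
    """Point estimate of ANI from Jaccard under the assumptions above."""
    return containment_to_ani(jaccard_to_containment(jaccard), k)


k = 21
# If 95% identity held exactly, the expected containment would be 0.95 ** k.
expected_containment = 0.95 ** k
print(containment_to_ani(expected_containment, k))
print(jaccard_to_ani(0.5, k))
```

Because the k-th root is taken, even fairly low containment values correspond to high ANI: small per-base differences compound quickly across a whole k-mer.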


References:

FracMinHash and sourmash - Lightweight compositional analysis of metagenomes with FracMinHash and minimum metagenome covers, Irber et al., 2022

ANI calculations - Debiasing FracMinHash and deriving confidence intervals for mutation rates across a wide range of evolutionary distances, Hera et al., 2022

@ctb added the "doc documentation content or issues" label on Sep 2, 2022