More complicated use case that would be really neat to enable: run prefetch against, e.g., a genus-level representative database. Then run gather and use the prefetch output CSV as a picklist, but select all signatures in the same genus (or family, etc.) as any match.
Actually, even if you were to keep tax grep as just a picklist utility, being able to scale up from matches to all members of the taxonomic group could be pretty neat.
I responded:
Right... is this kind of an inverse operation?
You would want to take a list of lineages (perhaps from a prefetch or gather file - note that sourmash tax only deals with gather files for now) and then build a taxonomy? or a picklist? that expands those matches to another level.
For example, you might:
* run gather
* annotate gather results with taxonomy using `sourmash tax annotate` => strain level
* 🪄 somehow 🪄 go from the lineages in the annotated gather file to a more general set of lineages at (say) the genus level
This strikes me as a pretty useful taxonomic utility, and points at functionality that is lacking:
* we don't really have anything that parses the annotated gather file, other than the metacoder example in #2041
* we don't have any way to manipulate a taxonomy file in bulk ways, e.g. "give me all of the lineages from taxfile1 that match at the genus level to the genomes/lineages in taxfile2."
... elided ...
maybe in addition to tax grep which works on a single match, we want a bulk matching function that takes in some format that links identifiers and taxonomies (annotated gather file? and/or taxonomy file?) as well as a taxonomy database, and then outputs picklists. "Promote these matches from strain to genus level" is one specific example here.
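A rough sketch of what "promote these matches from strain to genus level" could look like, in plain Python. This is not sourmash code: `promote_matches` and `write_picklist` are hypothetical helpers, the taxonomy is assumed to be rows with an `ident` column plus one column per rank, and the identifiers and lineages below are made up.

```python
import csv
import io

# Assumed rank columns of the taxonomy file, most general first.
RANKS = ["superkingdom", "phylum", "class", "order", "family", "genus", "species"]


def promote_matches(matched_idents, taxonomy_rows, rank="genus"):
    """Return idents of all rows that share a lineage down to `rank` with any match.

    Hypothetical helper. Comparing the full lineage prefix (not just the
    name at `rank`) avoids confusing same-named groups in different parents.
    """
    depth = RANKS.index(rank) + 1

    def prefix(row):
        return tuple(row[r] for r in RANKS[:depth])

    by_ident = {row["ident"]: row for row in taxonomy_rows}
    wanted = {prefix(by_ident[i]) for i in matched_idents if i in by_ident}
    return [row["ident"] for row in taxonomy_rows if prefix(row) in wanted]


def write_picklist(idents, fp):
    """Write a one-column CSV (header `ident`) usable as a picklist."""
    writer = csv.writer(fp)
    writer.writerow(["ident"])
    for ident in idents:
        writer.writerow([ident])


def _row(ident, family, genus, species):
    # Toy lineage; only family/genus/species vary in this example.
    return {"ident": ident, "superkingdom": "d__X", "phylum": "p__X",
            "class": "c__X", "order": "o__X", "family": family,
            "genus": genus, "species": species}


# Made-up taxonomy: two genomes in g__A, one in g__B (same family), one elsewhere.
TAXONOMY = [
    _row("G1", "f__F1", "g__A", "s__A one"),
    _row("G2", "f__F1", "g__A", "s__A two"),
    _row("G3", "f__F1", "g__B", "s__B one"),
    _row("G4", "f__F2", "g__C", "s__C one"),
]

# Pretend prefetch/gather matched only G1, then expand to its genus.
genus_level = promote_matches({"G1"}, TAXONOMY, rank="genus")
out = io.StringIO()
write_picklist(genus_level, out)
```

Passing `rank="family"` instead would also pull in `G3`. The matched identifiers could come from an annotated gather file or from a prefetch CSV joined against the taxonomy, which is the "do the annotate step internally" variant.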
I think tax extract would be very useful and get us to the second use case!!!
1. to select all members of specific family: `tax grep family_name` --> picklist
2. to promote prefetch matches to genus level: `tax annotate` --> `tax extract` --> picklist
Note -- If we're providing the taxonomy file to tax extract, we could even just do the tax annotate step internally to avoid needing to run an extra step.
Additional use case: using these picklists with exclude lets us easily exclude entire taxonomic groups from search, e.g. for testing taxonomic classification.
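To make the exclusion idea concrete, here is a minimal sketch in plain Python, again not sourmash code. `idents_in_group` and `split_by_group` are hypothetical names, the row layout (an `ident` column plus one column per rank) is an assumption, and the data is made up. It selects every identifier in a named group and shows what would remain if that list were used as an exclude picklist.

```python
# Assumed rank columns of the taxonomy file, most general first.
RANKS = ["superkingdom", "phylum", "class", "order", "family", "genus", "species"]


def idents_in_group(taxonomy_rows, name, rank=None):
    """Idents whose lineage has `name` at `rank` (or at any rank if None)."""
    ranks = [rank] if rank else RANKS
    return [row["ident"] for row in taxonomy_rows
            if any(row.get(r) == name for r in ranks)]


def split_by_group(taxonomy_rows, name, rank=None):
    """Return (excluded, kept) ident lists for the named taxonomic group."""
    excluded = idents_in_group(taxonomy_rows, name, rank)
    excluded_set = set(excluded)
    kept = [row["ident"] for row in taxonomy_rows
            if row["ident"] not in excluded_set]
    return excluded, kept


# Made-up rows; only the ranks needed for the example are filled in.
rows = [
    {"ident": "G1", "family": "f__F1", "genus": "g__A"},
    {"ident": "G2", "family": "f__F1", "genus": "g__B"},
    {"ident": "G3", "family": "f__F2", "genus": "g__C"},
]

# Hold out the whole family f__F1, e.g. to test classification without it.
excluded, kept = split_by_group(rows, "f__F1", rank="family")
```

Here `excluded` is what would go into the picklist, and `kept` is the set of signatures the search would still see.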
From #2178, a discussion with @bluegenes.