add tax summarization dataclasses for safety and flexibility #2439

Merged
merged 24 commits into from
Jan 16, 2023
24 commits
b1ed900
init new LineagePair and LineageInfo classes
bluegenes Jan 5, 2023
5aa5714
test new LineagePair,BaseLineageInfo,RankLineageInfo
bluegenes Jan 6, 2023
ea7a163
fix
bluegenes Jan 6, 2023
23e2480
add v4.4 columns that will be required
bluegenes Jan 6, 2023
331336b
Merge branch 'latest' into upd-lineage-utils
bluegenes Jan 6, 2023
f458dc1
make the newly added query_bp info in test1.gather.csv work with exis…
bluegenes Jan 6, 2023
592993d
rename test1.gather_ani.csv to test1.gather_old.csv to reflect its ne…
bluegenes Jan 6, 2023
06fb848
moar lineage tests
bluegenes Jan 6, 2023
ed055b6
test remaining codecov misses
bluegenes Jan 6, 2023
c15ff78
whoops, one last codecov miss
bluegenes Jan 6, 2023
766a8b9
add tax summarization classes; rename old NamedTuples to avoid breakage
bluegenes Jan 6, 2023
808e6be
finish renaming
bluegenes Jan 6, 2023
7c9fdb4
add tests for new classes
bluegenes Jan 6, 2023
8c2c546
upd
bluegenes Jan 7, 2023
00f9a64
more tests; move status checking into ClassificationResult
bluegenes Jan 9, 2023
cbb1e65
human summary dict tests
bluegenes Jan 9, 2023
d4c43fd
add value checks and tests for SGR,CR classes
bluegenes Jan 9, 2023
724bf74
test make_full_summary
bluegenes Jan 10, 2023
9dd9adf
use f_unique, not f_weighted to preserve current functionality
bluegenes Jan 10, 2023
c33c137
test no ranks
bluegenes Jan 10, 2023
15031ef
test make_kreport_results
bluegenes Jan 10, 2023
f03c2c3
Merge branch 'latest' into upd-tax-summarization
bluegenes Jan 10, 2023
826bc6b
fix lineagepair
bluegenes Jan 11, 2023
e449d01
Merge branch 'latest' into upd-tax-summarization
ctb Jan 16, 2023
8 changes: 4 additions & 4 deletions src/sourmash/tax/__main__.py
@@ -13,7 +13,7 @@
from sourmash.lca.lca_utils import display_lineage, zip_lineage

from . import tax_utils
-from .tax_utils import ClassificationResult, MultiLineageDB
+from .tax_utils import ClassInf, MultiLineageDB

usage='''
sourmash taxonomy <command> [<args>] - manipulate/work with taxonomy information.
@@ -222,7 +222,7 @@ def genome(args):
notify(f"WARNING: classifying query {sg.query_name} at desired rank {args.rank} does not meet containment threshold {args.containment_threshold}")
else:
status="match"
-classif = ClassificationResult(sg.query_name, status, sg.rank, sg.fraction, sg.lineage, sg.query_md5, sg.query_filename, sg.f_weighted_at_rank, sg.bp_match_at_rank, sg.query_ani_at_rank)
+classif = ClassInf(sg.query_name, status, sg.rank, sg.fraction, sg.lineage, sg.query_md5, sg.query_filename, sg.f_weighted_at_rank, sg.bp_match_at_rank, sg.query_ani_at_rank)
classifications[args.rank].append(classif)
matched_queries.add(sg.query_name)
if "krona" in args.output_format:
@@ -251,7 +251,7 @@ def genome(args):
elif sg.fraction >= args.containment_threshold:
status = "match"
if status == "match":
-classif = ClassificationResult(query_name=sg.query_name, status=status, rank=sg.rank,
+classif = ClassInf(query_name=sg.query_name, status=status, rank=sg.rank,
fraction=sg.fraction, lineage=sg.lineage,
query_md5=sg.query_md5, query_filename=sg.query_filename,
f_weighted_at_rank=sg.f_weighted_at_rank, bp_match_at_rank=sg.bp_match_at_rank,
@@ -261,7 +261,7 @@ def genome(args):
continue
elif rank == "superkingdom" and status == "nomatch":
status="below_threshold"
-classif = ClassificationResult(query_name=sg.query_name, status=status,
+classif = ClassInf(query_name=sg.query_name, status=status,
rank="", fraction=0, lineage="",
query_md5=sg.query_md5, query_filename=sg.query_filename,
f_weighted_at_rank=sg.f_weighted_at_rank, bp_match_at_rank=sg.bp_match_at_rank,
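
The calls in these hunks build `ClassInf` both positionally and by keyword. Below is a minimal sketch of that pattern. It assumes `ClassInf` is a `namedtuple` whose field order matches the positional call in the second hunk; the real definition lives in `tax_utils` and is not shown in this diff. `classify` is a hypothetical helper written for illustration, not a sourmash function.

```python
from collections import namedtuple

# Assumption: field names and order are inferred from the calls in the diff,
# not copied from sourmash's tax_utils.
ClassInf = namedtuple(
    "ClassInf",
    "query_name status rank fraction lineage query_md5 query_filename "
    "f_weighted_at_rank bp_match_at_rank query_ani_at_rank",
)


def classify(query_name, rank, fraction, lineage, containment_threshold,
             query_md5="", query_filename="", f_weighted_at_rank=0.0,
             bp_match_at_rank=0, query_ani_at_rank=None):
    """Hypothetical helper: derive a status from the containment threshold."""
    if fraction >= containment_threshold:
        # Positional construction, as in the hunk at line 222.
        return ClassInf(query_name, "match", rank, fraction, lineage,
                        query_md5, query_filename, f_weighted_at_rank,
                        bp_match_at_rank, query_ani_at_rank)
    # Keyword construction with blanked rank/fraction/lineage,
    # as in the hunk at line 261.
    return ClassInf(query_name=query_name, status="below_threshold",
                    rank="", fraction=0, lineage="",
                    query_md5=query_md5, query_filename=query_filename,
                    f_weighted_at_rank=f_weighted_at_rank,
                    bp_match_at_rank=bp_match_at_rank,
                    query_ani_at_rank=query_ani_at_rank)
```

Because a `namedtuple` accepts either call style, renaming the class only requires changing the constructor name at each call site, which is consistent with the four one-line changes in this file.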