[MRG] add sum_hashes to sourmash sig describe output. #1882

Merged
ctb merged 8 commits into latest from add/sum_hashes_describe
Mar 13, 2022

Conversation

ctb
Contributor

@ctb ctb commented Mar 12, 2022

This PR -

  • adds `sum hashes:` to the human-readable output, and `sum_hashes` to the CSV output, of `sig describe`;
  • adds `--include-db-pattern` and `--exclude-db-pattern` to `sig describe`;
  • fixes a bug in the CSV output of `sig describe` where `signature_file` was always empty;
  • removes a confusing and useless test file, and updates tests to use a less confusing file.

Fixes #1833 by making sum_hashes visible (the last remaining part of that issue).

@codecov

codecov bot commented Mar 12, 2022

Codecov Report

Merging #1882 (fd5ef0a) into latest (30ac877) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           latest    #1882   +/-   ##
=======================================
  Coverage   82.61%   82.62%           
=======================================
  Files         121      121           
  Lines       13105    13109    +4     
  Branches     1756     1756           
=======================================
+ Hits        10827    10831    +4     
  Misses       2015     2015           
  Partials      263      263           
| Flag | Coverage Δ |
|---|---|
| python | 90.55% <100.00%> (+<0.01%) ⬆️ |
| rust | 65.80% <ø> (ø) |

| Impacted Files | Coverage Δ |
|---|---|
| `src/sourmash/cli/sig/describe.py` | 100.00% <100.00%> (ø) |
| `src/sourmash/sig/__main__.py` | 91.84% <100.00%> (+0.03%) ⬆️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 30ac877...fd5ef0a.

@ctb ctb changed the title from "[WIP] add sum_hashes to sourmash sig describe output." to "[MRG] add sum_hashes to sourmash sig describe output." on Mar 12, 2022
@ctb
Contributor Author

ctb commented Mar 12, 2022

ready for review & merge @sourmash-bio/devs

Contributor

@bluegenes bluegenes left a comment

Lgtm!

Just so I'm clear - sum hashes is the sum of the abundances for all unique hashes?

@ctb
Contributor Author

ctb commented Mar 12, 2022

> Lgtm!
>
> Just so I'm clear - sum hashes is the sum of the abundances for all unique hashes?

Yep! Added some doc text in 065b05b

```
signature license: CC0
```

Here, the `size` is the number of distinct hashes in the sketch, and
`sum_hashes` is the total number of hashes in the sketch, with abundances.
When `track_abundance` is 0, `size` is always the same as `sum_hashes.
Contributor

is this missing a closing "`"?
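
To make the `size` versus `sum_hashes` distinction in the doc text above concrete, here is a minimal pure-Python sketch. It does not call the sourmash API: `describe_counts` and the sample hash values are hypothetical, and a plain `Counter` stands in for a sketch's mapping from hash to abundance.

```python
from collections import Counter


def describe_counts(observed_hashes, track_abundance):
    """Return (size, sum_hashes) for a toy sketch.

    Hypothetical helper for illustration only; not part of sourmash.
    """
    # Map each distinct hash to the number of times it was observed.
    abundances = Counter(observed_hashes)

    # size: number of distinct hashes in the sketch.
    size = len(abundances)

    if track_abundance:
        # sum_hashes: total of the abundances over all distinct hashes.
        sum_hashes = sum(abundances.values())
    else:
        # Without abundance tracking, each distinct hash counts once.
        sum_hashes = size

    return size, sum_hashes


observed = [11, 11, 42, 42, 42, 97]  # made-up hash values
with_abund = describe_counts(observed, track_abundance=True)
without_abund = describe_counts(observed, track_abundance=False)
print(with_abund, without_abund)
```

With this sample input there are three distinct hashes observed six times in total, so the two calls give `(3, 6)` and `(3, 3)`; the second pair shows why `size` and `sum_hashes` agree when `track_abundance` is 0.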

@ctb ctb merged commit dbda4ef into latest Mar 13, 2022
@ctb ctb deleted the add/sum_hashes_describe branch March 13, 2022 00:02
Development

Successfully merging this pull request may close these issues.

lca summarize abundance with outfile
2 participants