Fix filtering of COGs in annotation pipeline. #108

Merged on Jul 4, 2024 (3 commits)

@njohner njohner commented Jul 4, 2024

Filtering of COGs in the annotation pipeline was picking the hits with the lowest target sequence id, i.e. effectively arbitrary ones, instead of the best ones. We now pick the hits with the lowest E-value and the highest sequence identity.
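
To illustrate the selection rule described above, here is a minimal pandas sketch. It is not the code from this PR: the function name `best_hit_per_query` and the column names `query`, `target`, `evalue` and `identity` are assumptions made for the example, and using identity as a tie-breaker after E-value is one possible reading of the rule.

```python
import pandas as pd


def best_hit_per_query(hits: pd.DataFrame) -> pd.DataFrame:
    """Keep one hit per query: lowest E-value first, then highest identity.

    Column names are hypothetical and chosen for illustration only.
    """
    ordered = hits.sort_values(
        by=["query", "evalue", "identity"],
        ascending=[True, True, False],
    )
    # After sorting, the first row of each query is its best hit.
    return ordered.drop_duplicates(subset="query", keep="first").reset_index(drop=True)


# Toy data: for g1, the hit with the lowest target id (1) is the worst one.
hits = pd.DataFrame(
    {
        "query": ["g1", "g1", "g1", "g2", "g2"],
        "target": [1, 5, 9, 2, 3],
        "evalue": [1e-5, 1e-50, 1e-50, 1e-10, 1e-3],
        "identity": [40.0, 80.0, 95.0, 60.0, 99.0],
    }
)
best = best_hit_per_query(hits)
print(best)
```

In this toy data, sorting by target id would have kept target 1 for `g1`, whereas sorting by E-value and identity keeps target 9.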

We also fix a small bug that was introduced with #99 and we fix some conda environment issues.

Checklist

  • Changelog entry
  • Check that tests still pass
  • Add tests for new features and regression tests for bugfixes whenever possible.

Filtering of COGs in the annotation pipeline was actually picking
the hits with lowest target sequence id, i.e. random ones,
instead of the best ones. We now instead pick the hits with lowest
E-value and highest sequence identity.
Function was raising an error when to_return was not defined. That
bug was introduced recently when ensuring that all columns were
present in the returned DataFrame.
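
The kind of guard this second commit message describes might look like the sketch below. It is only an assumption about the shape of the fix: the function name `select_columns` is hypothetical, `to_return` is taken from the commit message, and the real function in the repository may differ.

```python
import pandas as pd


def select_columns(df: pd.DataFrame, to_return=None) -> pd.DataFrame:
    """Return the requested columns, creating any missing ones as NaN.

    Hypothetical helper: when to_return is not defined, the DataFrame is
    returned unchanged instead of raising an error.
    """
    if to_return is None:
        return df
    # reindex guarantees every requested column is present in the result.
    return df.reindex(columns=list(to_return))


df = pd.DataFrame({"a": [1], "b": [2]})
unchanged = select_columns(df)
subset = select_columns(df, ["a", "c"])
```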
@njohner njohner merged commit e2e459a into master Jul 4, 2024
@njohner njohner deleted the nj/fix_cogs branch July 4, 2024 07:15