What's new
Jose Manuel Martí edited this page Sep 25, 2024 · 1 revision
- Improve generic parser:
  - New automatic full-directory option to process all the files present in a directory
  - Compatible with .gz and .bz2 compressed files
- Improve resilience of the statistics module in extreme cases, such as empty samples
- Fix problems with GitHub Actions so that CI/CD works again
- Improve rextract to deal with some simulated reads
- Add new tool refafilt for processing databases contained in huge FASTA files (such as NCBI BLAST nt), in order to:
  - provide filters by minimum and/or maximum sequence length to separate those sequences,
  - correct the issue that arises when the header of a FASTA sequence is multiplexed, meaning that the sequence is redundant in the database
- Fix a bug and improve statistics and messages
- Fix a bug in scoring introduced in v1.13.0
- Update after NCBI Taxonomy change and upgrade to 2024
- Substantial reduction in the size of the generated HTML, now less than 2/3 of the previous size
- LOGLENGTH scoring: switch from geometric to arithmetic averaging
- retest: use pandas.testing.assert_frame_equal for DataFrame comparisons
- Relocate a few Krona-related constants
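
To make two of the items above concrete (reading .gz/.bz2 input and filtering FASTA sequences by minimum and/or maximum length), here is a minimal Python sketch. It is not Recentrifuge's or refafilt's actual code: the function names `open_maybe_compressed`, `read_fasta` and `filter_by_length` are hypothetical, and choosing the opener from the file extension is an assumption made only for illustration.

```python
import bz2
import gzip
from pathlib import Path
from typing import Iterable, Iterator, List, Optional, TextIO, Tuple

Record = Tuple[str, str]  # (header without '>', sequence)


def open_maybe_compressed(path: Path) -> TextIO:
    """Open a text file, picking the opener from the extension (assumption)."""
    if path.suffix == ".gz":
        return gzip.open(path, "rt")
    if path.suffix == ".bz2":
        return bz2.open(path, "rt")
    return open(path, "rt")


def read_fasta(path: Path) -> Iterator[Record]:
    """Yield (header, sequence) pairs, joining multi-line sequences."""
    header: Optional[str] = None
    chunks: List[str] = []
    with open_maybe_compressed(path) as handle:
        for raw in handle:
            line = raw.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif header is not None:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)  # flush the last record


def filter_by_length(
    records: Iterable[Record],
    min_len: Optional[int] = None,
    max_len: Optional[int] = None,
) -> Iterator[Record]:
    """Keep records whose length lies within the (inclusive) bounds given."""
    for header, seq in records:
        if min_len is not None and len(seq) < min_len:
            continue
        if max_len is not None and len(seq) > max_len:
            continue
        yield header, seq
```

For example, `filter_by_length(read_fasta(Path("db.fasta.gz")), min_len=100)` would lazily yield only the sequences of at least 100 bases (the file name here is a placeholder). Streaming records with generators keeps memory use low, which matters for very large FASTA files.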
If you use Recentrifuge in your research, please consider citing the paper. Thanks!
Martí JM (2019) Recentrifuge: Robust comparative analysis and contamination removal for metagenomics. PLOS Computational Biology 15(4): e1006967. https://doi.org/10.1371/journal.pcbi.1006967