CAT_DB_GENERATE issues #462

Closed
prototaxites opened this issue Jun 23, 2023 · 4 comments
Labels
bug Something isn't working

Comments

@prototaxites
Contributor

Description of the bug

CAT_DB_GENERATE fails on my system due to what seem to be network connection issues, despite nodes having internet access:

WARNING: Skipping mount /var/apptainer/mnt/session/etc/resolv.conf [files]: /etc/resolv.conf doesn't exist in container
# CAT v4.6.

CAT prepare is running, constructing a fresh database.
Rawr!

WARNING: preparing the database files may take a couple of hours.

Supplied command: /usr/local/bin/CAT prepare --fresh

Taxonomy folder: 2023-06-23_taxonomy/
Database folder: 2023-06-23_CAT_database/
Log file: 2023-06-23.CAT_prepare.fresh.log

-----------------

[2023-06-23 10:05:36.644592] DIAMOND found: diamond version 2.0.6.
[2023-06-23 10:05:36.705498] 2023-06-23_taxonomy is created.
[2023-06-23 10:05:36.753575] 2023-06-23_CAT_database is created.
[2023-06-23 10:05:36.760679] Downloading and extracting taxonomy files from ftp://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdump.tar.gz to 2023-06-23_taxonomy.
[2023-06-23 10:05:36.800651] ERROR: donwload of taxonomy files failed.

Digging into the error a bit, this appears to be a recurring problem with some older biocontainers (bioconda/bioconda-recipes#11583). The CAT mulled container was built on 15/03/2021, before this issue was fixed. The CAT image should probably be updated to use a newer base image, which would hopefully avoid this problem.
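One way to narrow this down is to check whether the FTP hostname from the log resolves at all from inside the container. The sketch below assumes the failure is DNS-related, which the `/etc/resolv.conf` warning hints at but the log does not confirm. The function name `host_resolves` and the injectable `resolver` parameter are illustrative only and are not part of CAT or the pipeline.

```python
import socket
from urllib.parse import urlparse

# URL copied from the CAT prepare log above.
TAXDUMP_URL = "ftp://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdump.tar.gz"


def host_resolves(url, resolver=socket.getaddrinfo):
    """Return True if the hostname in `url` resolves, False otherwise.

    `resolver` is a hypothetical hook so the lookup can be replaced
    when no network is available; by default it is the system resolver.
    """
    host = urlparse(url).hostname
    try:
        resolver(host, None)
    except OSError:
        # socket.gaierror (name resolution failure) is a subclass of OSError.
        return False
    return True
```

Calling `host_resolves(TAXDUMP_URL)` with the container's Python and again on the host would show whether name resolution is the difference between the two. A `False` result only inside the container would be consistent with the missing `/etc/resolv.conf`, though it would not rule out other causes.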

In debugging this, I also noticed that CAT_DB_GENERATE only gets 1 CPU by default, but there's a --nproc option which specifies the number of cores available when building the DIAMOND database. It might be good to bump that up to speed up database building.
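As a rough illustration of the change, the command from the log could be assembled with the core count passed through. This is only a sketch: `cat_prepare_command` is a hypothetical helper, and in the pipeline itself the value would presumably come from the task's CPU allocation rather than a Python argument.

```python
def cat_prepare_command(nproc=1):
    """Build the `CAT prepare --fresh` command line with an explicit core count.

    Hypothetical helper for illustration; only `--fresh` and `--nproc`
    are taken from the issue text above.
    """
    if nproc < 1:
        raise ValueError("nproc must be at least 1")
    return ["CAT", "prepare", "--fresh", "--nproc", str(nproc)]
```

For example, `cat_prepare_command(8)` yields the argument list for `CAT prepare --fresh --nproc 8`.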

Command used and terminal output

nextflow run nf-core/mag -profile test,singularity --cat_db_generate --outdir results

Relevant files

No response

System information

HPC, Slurm, Singularity, Linux, dev

@jfy133
Member

jfy133 commented Nov 3, 2023

@prototaxites would you be able to investigate this as a bug fix? I guess it could be quite a small fix, or slightly more work (but more robust) if it was converted to a proper nf-core module.

@prototaxites
Contributor Author

Can take a look but might take me some time to get to it, as my plate is quite full at the moment! I did build an updated version of the container back when I first made this issue, but I am not convinced I remember it fixing the issue.

@jfy133
Member

jfy133 commented Feb 9, 2024

Ultimately I think the best thing is to just replace the custom module with official nf-core ones... so hopefully it gets solved then (as we will be using the latest version etc.)

@jfy133
Member

jfy133 commented May 10, 2024

There are a bunch of issues which I think would be fixed by the wholesale update, so closing in favour of the most recent issue as it's all related: #611

@jfy133 jfy133 closed this as completed May 10, 2024