genbank_get_genomes_by_taxon.py seems to ignore "--format" argument #89

Closed
jvollme opened this issue Jul 27, 2017 · 3 comments
Labels
bug something isn't working how it should

Comments

@jvollme

jvollme commented Jul 27, 2017

Summary & Description:

I am currently using pyani v.2.3 as installed via bioconda. genbank_get_genomes_by_taxon.py generally seems to work for downloading contig FASTA files.
However, when I try to download GenBank files instead (which I need for some additional analyses) by setting the --format option to "gbk" (as stated in the help text), the argument seems to be ignored and I still get only the contig FASTA files.

Reproducible Steps:

genbank_get_genomes_by_taxon.py -t 1107 -l chloroflexuslog --email myemail@wherever.com -o chloroflexus --format gbk -v

Current Output:

.fna & .fna.gz for each genome (BTW: I guess only the "*.fna.gz" files were actually meant to remain as final output here?)

Expected Output:

.gbk[.gz] for each genome

pyani Version:

v2.3

Python Version:

v3.5.2

Operating System:

Linux Ubuntu 14.04

@widdowquinn widdowquinn self-assigned this Jul 28, 2017
@widdowquinn widdowquinn added the bug something isn't working how it should label Jul 28, 2017
@widdowquinn
Owner

Thanks John - I'll get on it.

widdowquinn added a commit that referenced this issue Jul 28, 2017
The issue reported was that the `--format` argument did not work for
downloading GenBank files. I had overlooked reimplementation of this
when adapting code for the new NCBI layouts. The lesson here is that
tests need to cover scripts (and this was already in process under
the `classify` branch).
widdowquinn added a commit that referenced this issue Jul 28, 2017
@widdowquinn
Owner

Hi John,

Thanks for the bug report - the root cause of the problem was that when I changed the internals to cope with the NCBI restructure, I forgot to reimplement the choice of format. I didn't notice this in part because the test suite doesn't currently cover the scripts (this will be fixed in the next iteration of pyani).

I've reimplemented and tested the download format choice in the latest commit: 43ba088
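
As a rough illustration of what honouring the format choice involves (a hypothetical sketch, not the code from the commit above): the `--format` value has to select which file is requested for each assembly. The names `FORMAT_SUFFIXES`, `build_parser` and `filename_for` are invented for this example, and the `fasta` value and its default are assumptions; only `gbk` is confirmed in this thread. The suffixes follow the usual naming in NCBI assembly directories.

```python
import argparse

# Assumed mapping from a --format value to the file suffix found in an
# NCBI assembly directory. Illustrative only; not pyani's own table.
FORMAT_SUFFIXES = {
    "fasta": "genomic.fna.gz",
    "gbk": "genomic.gbff.gz",
}


def build_parser():
    """Build a minimal parser exposing only the --format option."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--format",
        dest="format",
        default="fasta",  # assumed default for this sketch
        choices=sorted(FORMAT_SUFFIXES),
        help="download format",
    )
    return parser


def filename_for(assembly_stem, fmt):
    """Return the remote filename for an assembly in the chosen format."""
    return "{}_{}".format(assembly_stem, FORMAT_SUFFIXES[fmt])


if __name__ == "__main__":
    args = build_parser().parse_args()
    # Hypothetical assembly stem, used only to show the resulting name.
    print(filename_for("GCF_000000000.1_ASM0v1", args.format))
```

The bug described in this issue corresponds to the case where the parsed value never reaches the step that builds the filename, so the FASTA suffix is always used.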

The following command-line now works for me, as I would expect:

genbank_get_genomes_by_taxon.py -t 203804 -v -l C_blochmannia_dl.log  -o issue_89_dl --email my.email@my.domain --format=gbk -f

Re: the .fna vs .fna.gz files. At the FTP source site all files are gzipped. They're downloaded as .fna.gz and then uncompressed on the local disk.
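
The download-then-uncompress step described above is why both files end up on disk. A minimal sketch of that behaviour, assuming only the Python standard library (`extract_gz` is a hypothetical helper name, not a pyani function):

```python
import gzip
import shutil
from pathlib import Path


def extract_gz(gz_path):
    """Decompress gz_path alongside itself and keep the original .gz file."""
    gz_path = Path(gz_path)
    # with_suffix("") drops only the final ".gz", so "x.fna.gz" -> "x.fna"
    out_path = gz_path.with_suffix("")
    with gzip.open(str(gz_path), "rb") as src, open(str(out_path), "wb") as dst:
        # Stream the data in chunks so large genomes are not held in memory
        shutil.copyfileobj(src, dst)
    return out_path
```

Because the compressed file is left in place, a run produces both `.fna.gz` and `.fna` (or the GenBank equivalents) for each genome.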

Please let me know if this fix works for you (if so, I'll close the issue), or if you have any further comments/questions.

Cheers,

L.

widdowquinn added a commit that referenced this issue Jul 28, 2017
@jvollme
Author

jvollme commented Aug 7, 2017

Hi, sorry for the late reply. It seems to work nicely now. Thanks!

@jvollme jvollme closed this as completed Aug 7, 2017