Issue Running PVAmpliconFinder.sh #3

Open
cwarden45 opened this issue Jul 27, 2020 · 12 comments

@cwarden45

cwarden45 commented Jul 27, 2020

Hi,

I apologize for submitting 2 tickets close to the same time, but I wanted to list some details before I forgot about them.

Essentially, I think I may have successfully installed PVAmpliconFinder, except that I manually configured conda/bioconda and I used the regular channels (without the /label/cf201901 part).

I also re-named the public FASTQ files as follows (adding the "R" as described by an error that I got with the program otherwise) in a folder called test_files:

SRR9702898_R1.fastq.gz
SRR9702898_R2.fastq.gz
SRR9702899_R1.fastq.gz
SRR9702899_R2.fastq.gz
SRR9702900_R1.fastq.gz
SRR9702900_R2.fastq.gz
SRR9702901_R1.fastq.gz
SRR9702901_R2.fastq.gz
SRR9702902_R1.fastq.gz
SRR9702902_R2.fastq.gz
SRR9702903_R1.fastq.gz
SRR9702903_R2.fastq.gz
SRR9702904_R1.fastq.gz
SRR9702904_R2.fastq.gz
SRR9702905_R1.fastq.gz
SRR9702905_R2.fastq.gz

I am trying to test the program with minimal provided information, without the information file described in #2.

I am using the following commands (within the downloaded GitHub folder) to try and run PVAmpliconFinder:

#!/bin/bash

FASTQ=../test_files
OUT=../test_out

sh PVAmpliconFinder.sh -t 4 -i 98 -s .fastq.gz -d $FASTQ -o $OUT

I added -i 98 because I was previously getting an error message saying PVAmpliconFinder.sh: 92: PVAmpliconFinder.sh: [[: not found (making me think this might have been a required rather than optional parameter?).

However, I don't think that solved the problem because these are the error messages that I currently am seeing:

PVAmpliconFinder.sh: 92: PVAmpliconFinder.sh: [[: not found
-e PVAmpliconFinder.sh [-h] [-t threads] [-b "nt" database] [-f info_file] [-i identity thershold] -s fastq_files_suffix -d input_dir -o output_dir -- program to process amplicon-based NGS data
Version 1.0
The fastq filename to process must with the same suffix (option "-s").
The Read 1 filename must contain "R1" and the Read 2 filename must contain "R2" (the pair must otherwise have the same name).
See README for more information about PVAmpliconFinder.sh usage.

where:
    -h  show this help text
    -s	suffix of fastq filename (ex : "pool" or "sample")
    -d  PATH to input fastq directory (.fastq | .fq | .zip | .tar.gz | .gz)
    -o	PATH to output directory
    -f	file containing pool information
    -b	"nt" blast database name (default "nt")
    -i	threshold of percentage of identity for centroid clustering (default 98) - INT only
    -t	number of threads (default 2)
    
-e You must provide an integer as identity thershold for de-novo clustering
PVAmpliconFinder.sh: 104: PVAmpliconFinder.sh: [[: not found
PVAmpliconFinder.sh: 104: PVAmpliconFinder.sh: [[: not found
PVAmpliconFinder.sh: 108: PVAmpliconFinder.sh: [[: not found
PVAmpliconFinder.sh: 108: PVAmpliconFinder.sh: [[: not found
-e ##########################################
##	FastQC control of the raw reads	##
##########################################
-e FastQC control of raw fastq files already done
-e ##########################################
##	Remove adapter sequence		##
##########################################
Pair : bash bash
xargs: bash: exited with status 255; aborting
PVAmpliconFinder.sh: 203: PVAmpliconFinder.sh: cannot create ../test_out/logfile.txt: Directory nonexistent
ln: failed to create symbolic link '../test_out/multiQC_report_on_filtered.html': No such file or directory
PVAmpliconFinder.sh: 207: PVAmpliconFinder.sh: cannot create ../test_out/logfile.txt: Directory nonexistent
mv: cannot stat 'bash': No such file or directory
mv: cannot stat 'bash': No such file or directory
PVAmpliconFinder.sh: 217: cd: can't cd to ../test_out/fastq_filtered
mv: cannot stat 'bash': No such file or directory
mv: cannot stat 'bash': No such file or directory
Done
PVAmpliconFinder.sh: 237: PVAmpliconFinder.sh: cannot create ../test_out/logfile.txt: Directory nonexistent
-e ##########################################
##	Clustering step : VSEARCH	##
##########################################
PVAmpliconFinder.sh: 240: cd: can't cd to ../test_out
PVAmpliconFinder.sh: 247: PVAmpliconFinder.sh: cannot create ../test_out/logfile.txt: Directory nonexistent
PVAmpliconFinder.sh: 251: PVAmpliconFinder.sh: cannot create ../test_out/logfile.txt: Directory nonexistent
PVAmpliconFinder.sh: 253: cd: can't cd to ../test_out/fastq_filtered
-e ~~	MergePair	~~
Pair : bash bash - Label :
bash: ../test_out/logfile.txt: No such file or directory
PVAmpliconFinder.sh: 259: cd: can't cd to ../test_out/tmp
vsearch v2.15.0_linux_x86_64, 23.0GB RAM, 4 cores
https://github.com/torognes/vsearch



Fatal error: Unable to open file for reading (bash)
vsearch v2.15.0_linux_x86_64, 23.0GB RAM, 4 cores
https://github.com/torognes/vsearch



Fatal error: Unable to open file for reading (bash)
cat: bash.fasta: No such file or directory
cat: bash: No such file or directory
cat: bash: No such file or directory
-e ~~	Dereplicate	~~
bash: ../test_out/logfile.txt: No such file or directory
-e ~~	ChimericSeqRemoval	~~
bash: ../test_out/logfile.txt: No such file or directory
-e ~~	Clustering	~~
bash: ../test_out/logfile.txt: No such file or directory
PVAmpliconFinder.sh: 284: cd: can't cd to ../test_out
mv: cannot stat '../test_out/tmp': No such file or directory
Done
PVAmpliconFinder.sh: 298: PVAmpliconFinder.sh: cannot create ../test_out/logfile.txt: Directory nonexistent
-e ##########################################
##	Sequence identification : BLAST	##
##########################################
PVAmpliconFinder.sh: 306: PVAmpliconFinder.sh: cannot create ../test_out/logfile.txt: Directory nonexistent
PVAmpliconFinder.sh: 308: cd: can't cd to ../test_out/vsearch
bash
Command line argument error: Argument "query". File is not accessible:  `bash'
PVAmpliconFinder.sh: 313: cd: can't cd to ../test_out/blast_result
sed: can't read *.blast: No such file or directory
Done
PVAmpliconFinder.sh: 328: PVAmpliconFinder.sh: cannot create ../test_out/logfile.txt: Directory nonexistent
-e ##########################################
##	Advanced analysis		##
##########################################
-e Advanced analysis already done

Can you please help me troubleshoot these messages?

Thank you very much.

Sincerely,
Charles

@SixEl27
Collaborator

SixEl27 commented Jul 28, 2020

Dear Charles,

Indeed, "_R1" and "_R2" must be present in the fastq file names for the script to recognize the paired sequencing reads, as specified in the help.

I apologize, I made a mistake in the help text: the "-s" option is not the suffix but the prefix of the fastq file names to analyze. In your case, something like "SRR" or even "SRR9702" should work. I've already uploaded an updated version of the help.

If the program can find the input fastq files, this should solve the issue you have with the identity threshold option. If not, please let me know and I'll take a closer look at this issue.

Also, I did not think about the renaming of the fastq files by the SRA database. To match the info_file.txt that I just uploaded to the GitHub repo (#2), you may rename the fastq files or modify the info file accordingly. The correspondence between the file names used in info_file.txt and the SRA names can be found here (for pool8, as an example), in the "Data access" tab.
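As a quick sanity check before running the pipeline, the sketch below lists the Read 1 files that a given prefix would select and reports any that lack a Read 2 mate. It assumes the mate name is obtained by replacing "_R1" with "_R2"; the function name `check_pairs` is hypothetical and is not part of PVAmpliconFinder.

```shell
#!/bin/bash
# Hypothetical helper, not part of PVAmpliconFinder: list the R1 files that
# start with a given prefix and report whether each one has an R2 mate.
check_pairs() {
    local dir="$1"      # directory holding the fastq files
    local prefix="$2"   # value you intend to pass to -s
    local missing=0 r1 r2 name
    for r1 in "$dir"/"$prefix"*_R1*; do
        # If the glob matched nothing, the literal pattern is left in $r1.
        [ -e "$r1" ] || { echo "no R1 file starts with '$prefix' in $dir"; return 1; }
        name=${r1##*/}
        # Assumption: the mate differs only by _R1 -> _R2.
        r2="$dir/$(printf '%s' "$name" | sed 's/_R1/_R2/')"
        if [ -e "$r2" ]; then
            echo "pair: $name ${r2##*/}"
        else
            echo "missing mate for $name"
            missing=1
        fi
    done
    return "$missing"
}

# Example call: check_pairs ../test_files SRR
```

The function returns a non-zero status when no file matches the prefix or when any mate is missing, so it can gate the actual pipeline call.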

I hope this helps solve your issue,

Sincerely,
Alexis

@cwarden45
Author

Hi Alexis,

1) In terms of troubleshooting the installation:

The first problem was that conda was not successfully added to the path (which is why I installed anaconda separately).

The second problem was that vsearch was not successfully added to the path (which is why I used the regular bioconda instructions, rather than the commands in the installation file).

I am testing PVAmpliconFinder in an Ubuntu Docker image (version 16.04).

2) In terms of modifying the code to run PVAmpliconFinder, this is the mapping that I found for the SRA samples:

SRR9702898: pool5-skin-pathogen_S5_L001
SRR9702899: pool6-oral-pathogen_S6_L001
SRR9702900: pool7-oral-pathogen_S7_L001
SRR9702901: pool8-oral-pathogen_S8_L001
SRR9702902: pool1-skin-pathogen_S1_L001
SRR9702903: pool2-skin-pathogen_S2_L001
SRR9702904: pool3-skin-pathogen_S3_L001
SRR9702905: pool4-skin-pathogen_S4_L001
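For anyone repeating this step, the mapping above could be applied with a loop along these lines. This is only a sketch: it assumes the files are named as in the first post (`SRR…_R1.fastq.gz`), the function name `rename_sra_files` is made up for illustration, and it only prints the `mv` commands unless `DRY_RUN=0` is set.

```shell
#!/bin/bash
# Sketch only: rename SRR-named files to the submitted pool names.
# The mapping is the one listed in this comment.
# DRY_RUN defaults to 1, which prints the commands instead of running them.
rename_sra_files() {
    local dir="$1" srr name read_no src dest
    while read -r srr name; do
        for read_no in R1 R2; do
            src="$dir/${srr}_${read_no}.fastq.gz"
            dest="$dir/${name}_${read_no}.fastq.gz"
            [ -e "$src" ] || continue      # skip files that are not there
            if [ "${DRY_RUN:-1}" = 1 ]; then
                echo "mv $src $dest"
            else
                mv -- "$src" "$dest"
            fi
        done
    done <<'EOF'
SRR9702898 pool5-skin-pathogen_S5_L001
SRR9702899 pool6-oral-pathogen_S6_L001
SRR9702900 pool7-oral-pathogen_S7_L001
SRR9702901 pool8-oral-pathogen_S8_L001
SRR9702902 pool1-skin-pathogen_S1_L001
SRR9702903 pool2-skin-pathogen_S2_L001
SRR9702904 pool3-skin-pathogen_S3_L001
SRR9702905 pool4-skin-pathogen_S4_L001
EOF
}

# Preview:  rename_sra_files ../test_files
# Apply:    DRY_RUN=0 rename_sra_files ../test_files
```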

So, this is what the current FASTQ files look like:

pool1-skin-pathogen_S1_L001_R1.fastq.gz
pool1-skin-pathogen_S1_L001_R2.fastq.gz
pool2-skin-pathogen_S2_L001_R1.fastq.gz
pool2-skin-pathogen_S2_L001_R2.fastq.gz
pool3-skin-pathogen_S3_L001_R1.fastq.gz
pool3-skin-pathogen_S3_L001_R2.fastq.gz
pool4-skin-pathogen_S4_L001_R1.fastq.gz
pool4-skin-pathogen_S4_L001_R2.fastq.gz
pool5-skin-pathogen_S5_L001_R1.fastq.gz
pool5-skin-pathogen_S5_L001_R2.fastq.gz
pool6-oral-pathogen_S6_L001_R1.fastq.gz
pool6-oral-pathogen_S6_L001_R2.fastq.gz
pool7-oral-pathogen_S7_L001_R1.fastq.gz
pool7-oral-pathogen_S7_L001_R2.fastq.gz
pool8-oral-pathogen_S8_L001_R1.fastq.gz
pool8-oral-pathogen_S8_L001_R2.fastq.gz

This is the revised command that I am using:

#!/bin/bash

FASTQ=../test_files
OUT=../test_out
INFO="test_demo_file.txt"

sh PVAmpliconFinder.sh -t 4 -i 98 -s pool1-skin-pathogen -d $FASTQ -o $OUT -f $INFO

And this is the output that I see:

PVAmpliconFinder.sh: 92: PVAmpliconFinder.sh: [[: not found
-e PVAmpliconFinder.sh [-h] [-t threads] [-b "nt" database] [-f info_file] [-i identity thershold] -s fastq_files_suffix -d input_dir -o output_dir -- program to process amplicon-based NGS data
Version 1.0
The fastq filename to process must with the same suffix (option "-s").
The Read 1 filename must contain "R1" and the Read 2 filename must contain "R2" (the pair must otherwise have the same name).
See README for more information about PVAmpliconFinder.sh usage.

where:
    -h  show this help text
    -s	suffix of fastq filename (ex : "pool" or "sample")
    -d  PATH to input fastq directory (.fastq | .fq | .zip | .tar.gz | .gz)
    -o	PATH to output directory
    -f	file containing pool information
    -b	"nt" blast database name (default "nt")
    -i	threshold of percentage of identity for centroid clustering (default 98) - INT only
    -t	number of threads (default 2)
    
-e You must provide an integer as identity thershold for de-novo clustering
PVAmpliconFinder.sh: 104: PVAmpliconFinder.sh: [[: not found
PVAmpliconFinder.sh: 104: PVAmpliconFinder.sh: [[: not found
PVAmpliconFinder.sh: 108: PVAmpliconFinder.sh: [[: not found
PVAmpliconFinder.sh: 108: PVAmpliconFinder.sh: [[: not found
PVAmpliconFinder.sh: 113: PVAmpliconFinder.sh: [[: not found
PVAmpliconFinder.sh: 113: PVAmpliconFinder.sh: [[: not found
-e ##########################################
##	FastQC control of the raw reads	##
##########################################
PVAmpliconFinder.sh: 144: PVAmpliconFinder.sh: [[: not found
PVAmpliconFinder.sh: 148: PVAmpliconFinder.sh: [[: not found
PVAmpliconFinder.sh: 152: PVAmpliconFinder.sh: [[: not found
PVAmpliconFinder.sh: 167: PVAmpliconFinder.sh: cannot create ../test_out/logfile.txt: Directory nonexistent
ln: failed to create symbolic link '../test_out/multiQC_report_on_raw.html': No such file or directory
Done
PVAmpliconFinder.sh: 185: cd: can't cd to ../test_out
PVAmpliconFinder.sh: 187: PVAmpliconFinder.sh: cannot create ../test_out/logfile.txt: Directory nonexistent
-e ##########################################
##	Remove adapter sequence		##
##########################################
PVAmpliconFinder.sh: 193: PVAmpliconFinder.sh: cannot create ../test_out/logfile.txt: Directory nonexistent
PVAmpliconFinder.sh: 195: cd: can't cd to ../test_files
Pair : bash bash
bash: ../test_out/logfile.txt: No such file or directory
PVAmpliconFinder.sh: 200: cd: can't cd to ../test_out/trim_galore
PVAmpliconFinder.sh: 203: PVAmpliconFinder.sh: cannot create ../test_out/logfile.txt: Directory nonexistent
ln: failed to create symbolic link '../test_out/multiQC_report_on_filtered.html': No such file or directory
PVAmpliconFinder.sh: 207: PVAmpliconFinder.sh: cannot create ../test_out/logfile.txt: Directory nonexistent
mv: cannot stat 'bash': No such file or directory
mv: cannot stat 'bash': No such file or directory
PVAmpliconFinder.sh: 217: cd: can't cd to ../test_out/fastq_filtered
mv: cannot stat 'bash': No such file or directory
mv: cannot stat 'bash': No such file or directory
Done
PVAmpliconFinder.sh: 237: PVAmpliconFinder.sh: cannot create ../test_out/logfile.txt: Directory nonexistent
-e ##########################################
##	Clustering step : VSEARCH	##
##########################################
PVAmpliconFinder.sh: 240: cd: can't cd to ../test_out
PVAmpliconFinder.sh: 247: PVAmpliconFinder.sh: cannot create ../test_out/logfile.txt: Directory nonexistent
PVAmpliconFinder.sh: 251: PVAmpliconFinder.sh: cannot create ../test_out/logfile.txt: Directory nonexistent
PVAmpliconFinder.sh: 253: cd: can't cd to ../test_out/fastq_filtered
-e ~~	MergePair	~~
Pair : bash bash - Label :
bash: ../test_out/logfile.txt: No such file or directory
PVAmpliconFinder.sh: 259: cd: can't cd to ../test_out/tmp
vsearch v2.15.0_linux_x86_64, 23.0GB RAM, 4 cores
https://github.com/torognes/vsearch



Fatal error: Unable to open file for reading (bash)
vsearch v2.15.0_linux_x86_64, 23.0GB RAM, 4 cores
https://github.com/torognes/vsearch



Fatal error: Unable to open file for reading (bash)
cat: bash.fasta: No such file or directory
cat: bash: No such file or directory
cat: bash: No such file or directory
-e ~~	Dereplicate	~~
bash: ../test_out/logfile.txt: No such file or directory
-e ~~	ChimericSeqRemoval	~~
bash: ../test_out/logfile.txt: No such file or directory
-e ~~	Clustering	~~
bash: ../test_out/logfile.txt: No such file or directory
PVAmpliconFinder.sh: 284: cd: can't cd to ../test_out
mv: cannot stat '../test_out/tmp': No such file or directory
Done
PVAmpliconFinder.sh: 298: PVAmpliconFinder.sh: cannot create ../test_out/logfile.txt: Directory nonexistent
-e ##########################################
##	Sequence identification : BLAST	##
##########################################
PVAmpliconFinder.sh: 306: PVAmpliconFinder.sh: cannot create ../test_out/logfile.txt: Directory nonexistent
PVAmpliconFinder.sh: 308: cd: can't cd to ../test_out/vsearch
bash
Command line argument error: Argument "query". File is not accessible:  `bash'
PVAmpliconFinder.sh: 313: cd: can't cd to ../test_out/blast_result
sed: can't read *.blast: No such file or directory
Done
PVAmpliconFinder.sh: 328: PVAmpliconFinder.sh: cannot create ../test_out/logfile.txt: Directory nonexistent
-e ##########################################
##	Advanced analysis		##
##########################################
Possible precedence issue with control flow operator at /root/miniconda3/lib/site_perl/5.26.2/Bio/DB/IndexedBase.pm line 805.
ls: cannot access '/mnt/user_data/jogembo/Seq/170118/Code/Round19/PVAmpliconFinder/PVAmpliconFinder/../test_out/blast_result/pool1-skin-pathogen*.blast': No such file or directory
ls: cannot access '/mnt/user_data/jogembo/Seq/170118/Code/Round19/PVAmpliconFinder/PVAmpliconFinder/../test_out/vsearch/pool1-skin-pathogen*.fasta': No such file or directory
Error : Info file badly formatted
 Did you forget the header (ID	primer	pool)

So, I am sorry, but I still don't think I quite have this working for the demo files.

Can you please continue to help me troubleshoot?

Thank You,
Charles

@SixEl27
Collaborator

SixEl27 commented Jul 29, 2020

Dear Charles,

  1. I've changed the "automatic installation" text in the README. I had several issues when running the installation script on different operating systems, so I no longer recommend using the script "PVAmplicon_install.sh".

Because all the information for conda installation can be found here, I now only provide the conda channel and conda install commands needed for PVAmpliconFinder to run. It does not make the installation more complicated, and users are also free to create their own conda environment.

I don't know why vsearch was not successfully added to the path in your case, as I use the regular bioconda instructions to install vsearch:
conda install vsearch

  2. I re-downloaded the fastq files from the SRA database here, as the files kept their submission names during the download (I just had to change the extension from ".fastq.1" to ".fastq").
>ls -1 fastq
pool1-skin-pathogen_S1_L001_R1_001.fastq
pool1-skin-pathogen_S1_L001_R2_001.fastq
pool2-skin-pathogen_S2_L001_R1_001.fastq
pool2-skin-pathogen_S2_L001_R2_001.fastq
pool3-skin-pathogen_S3_L001_R1_001.fastq
pool3-skin-pathogen_S3_L001_R2_001.fastq
pool4-skin-pathogen_S4_L001_R1_001.fastq
pool4-skin-pathogen_S4_L001_R2_001.fastq
pool5-skin-pathogen_S5_L001_R1_001.fastq
pool5-skin-pathogen_S5_L001_R2_001.fastq
pool6-oral-pathogen_S6_L001_R1_001.fastq
pool6-oral-pathogen_S6_L001_R2_001.fastq
pool7-oral-pathogen_S7_L001_R1_001.fastq
pool7-oral-pathogen_S7_L001_R2_001.fastq
pool8-oral-pathogen_S8_L001_R1_001.fastq
pool8-oral-pathogen_S8_L001_R2_001.fastq
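
The extension change mentioned above can be done in one pass. A minimal sketch, assuming all downloaded files sit in one directory and end in ".fastq.1" (the function name `strip_dot_one` is only illustrative):

```shell
#!/bin/bash
# Sketch: strip the trailing ".1" from every *.fastq.1 file in a directory.
strip_dot_one() {
    local dir="$1" f
    for f in "$dir"/*.fastq.1; do
        [ -e "$f" ] || continue      # the glob matched nothing
        mv -- "$f" "${f%.1}"         # foo.fastq.1 -> foo.fastq
    done
}

# Example call: strip_dot_one fastq
```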

To take all the fastq files as input, the files need to share a common prefix, so I would advise using something like "pool".

I created a bash script containing the revised command that you are using (I tried with and without the "-i 98" option and it ran in both cases):

#!/bin/bash

FASTQ=../PVAmpliconFinder_test/fastq
OUT=../PVAmpliconFinder_test
INFO="info_file.txt"

sh PVAmpliconFinder.sh -t 8 -s pool -d $FASTQ -o $OUT -f $INFO

And everything runs correctly on the server I'm using (CentOS 7.0). I don't know what is causing the issue you're facing, but the errors suggest that the "if" conditions in the bash script are failing.

Maybe you should try running the script like this:
bash PVAmpliconFinder.sh -t 8 -s pool -d $FASTQ -o $OUT -f $INFO
or like this:
./PVAmpliconFinder.sh -t 8 -s pool -d $FASTQ -o $OUT -f $INFO
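
Some background on why the interpreter may matter, offered as a likely explanation rather than a confirmed diagnosis: `[[ ... ]]` is a bash keyword and is not part of POSIX sh. On a default Ubuntu install `sh` points to dash, which reports `[[: not found` and prints the `-e` of `echo -e` literally, whereas on CentOS 7 `sh` is bash. The sketch below shows which binary `sh` resolves to and whether each interpreter accepts `[[ ]]`; the function name `supports_double_bracket` is made up for illustration.

```shell
#!/bin/bash
# Print the real binary behind `sh` (dash on a default Ubuntu install,
# bash on CentOS 7).
readlink -f "$(command -v sh)"

# Return 0 if the interpreter named in $1 understands the [[ ]] keyword.
supports_double_bracket() {
    "$1" -c '[[ 98 =~ ^[0-9]+$ ]]' 2>/dev/null
}

for shell in sh bash; do
    if supports_double_bracket "$shell"; then
        echo "$shell: [[ ]] is supported"
    else
        echo "$shell: [[ ]] is NOT supported"
    fi
done
```

If `sh` turns out not to support `[[ ]]`, every test written that way fails when the script is started with `sh`, and the later errors in the log would follow from that. Starting it with `bash`, or with `./` provided the script's shebang line names bash (not verified here), avoids the problem.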

I hope this will solve your problem.
Best,

Alexis

@cwarden45
Author

cwarden45 commented Jul 29, 2020

Hi Alexis,

1) I think you have changed the part that caused problems for me in the installation file for vsearch.

I am guessing that this won't be an issue for other users. However, if somebody had an earlier download for some reason, then the change can be seen here.

2) I have made the changes that you described in the file names as well as testing running the script in different ways.

I am running some concurrent analysis on the computer where I have the Docker image installed. However, I did not get the same error messages within a few seconds of running the command (and it looked like it was successfully starting to decompress the FASTQ files).

So, I think it might be best if I waited a little bit before I continued testing the program, but I think this helped.

If I can confirm that the full set of analysis works on the demo data, then I will close this ticket.

Thank you very much!

Sincerely,
Charles

@cwarden45
Author

cwarden45 commented Jul 31, 2020

As mentioned in thread #4, I think there is some issue with BLAST, since the .blast files are empty and there are error messages at that step:

##########################################
##	FastQC control of the raw reads	##
##########################################
ln: failed to create symbolic link '../test_out/multiQC_report_on_raw.html': Read-only file system
Done
##########################################
##	Remove adapter sequence		##
##########################################
Pair : ./pool3-skin-pathogen_S3_L001_R1_001.fastq ./pool3-skin-pathogen_S3_L001_R2_001.fastq
Pair : ./pool7-oral-pathogen_S7_L001_R1_001.fastq ./pool7-oral-pathogen_S7_L001_R2_001.fastq
Pair : ./pool2-skin-pathogen_S2_L001_R1_001.fastq ./pool2-skin-pathogen_S2_L001_R2_001.fastq
Pair : ./pool6-oral-pathogen_S6_L001_R1_001.fastq ./pool6-oral-pathogen_S6_L001_R2_001.fastq
Pair : ./pool1-skin-pathogen_S1_L001_R1_001.fastq ./pool1-skin-pathogen_S1_L001_R2_001.fastq
Pair : ./pool4-skin-pathogen_S4_L001_R1_001.fastq ./pool4-skin-pathogen_S4_L001_R2_001.fastq
Pair : ./pool5-skin-pathogen_S5_L001_R1_001.fastq ./pool5-skin-pathogen_S5_L001_R2_001.fastq
Pair : ./pool8-oral-pathogen_S8_L001_R1_001.fastq ./pool8-oral-pathogen_S8_L001_R2_001.fastq
ln: failed to create symbolic link '../test_out/multiQC_report_on_filtered.html': Read-only file system
Done
##########################################
##	Clustering step : VSEARCH	##
##########################################
~~	MergePair	~~
Pair : ./pool5-skin-pathogen_S5_L001_R1_001.fq ./pool5-skin-pathogen_S5_L001_R2_001.fq - Label : pool5-skin-pathogen_S5_L001
Pair : ./pool1-skin-pathogen_S1_L001_R1_001.fq ./pool1-skin-pathogen_S1_L001_R2_001.fq - Label : pool1-skin-pathogen_S1_L001
Pair : ./pool7-oral-pathogen_S7_L001_R1_001.fq ./pool7-oral-pathogen_S7_L001_R2_001.fq - Label : pool7-oral-pathogen_S7_L001
Pair : ./pool6-oral-pathogen_S6_L001_R1_001.fq ./pool6-oral-pathogen_S6_L001_R2_001.fq - Label : pool6-oral-pathogen_S6_L001
Pair : ./pool2-skin-pathogen_S2_L001_R1_001.fq ./pool2-skin-pathogen_S2_L001_R2_001.fq - Label : pool2-skin-pathogen_S2_L001
Pair : ./pool8-oral-pathogen_S8_L001_R1_001.fq ./pool8-oral-pathogen_S8_L001_R2_001.fq - Label : pool8-oral-pathogen_S8_L001
Pair : ./pool4-skin-pathogen_S4_L001_R1_001.fq ./pool4-skin-pathogen_S4_L001_R2_001.fq - Label : pool4-skin-pathogen_S4_L001
Pair : ./pool3-skin-pathogen_S3_L001_R1_001.fq ./pool3-skin-pathogen_S3_L001_R2_001.fq - Label : pool3-skin-pathogen_S3_L001
vsearch v2.15.0_linux_x86_64, 23.0GB RAM, 4 cores
https://github.com/torognes/vsearch

Reading input file 100%
3246 sequences kept (of which 0 truncated), 0 sequences discarded.
vsearch v2.15.0_linux_x86_64, 23.0GB RAM, 4 cores
https://github.com/torognes/vsearch

Reading input file 100%
16417 sequences kept (of which 0 truncated), 0 sequences discarded.
vsearch v2.15.0_linux_x86_64, 23.0GB RAM, 4 cores
https://github.com/torognes/vsearch

Reading input file 100%
218718 sequences kept (of which 0 truncated), 0 sequences discarded.
vsearch v2.15.0_linux_x86_64, 23.0GB RAM, 4 cores
https://github.com/torognes/vsearch

Reading input file 100%
8283 sequences kept (of which 0 truncated), 0 sequences discarded.
vsearch v2.15.0_linux_x86_64, 23.0GB RAM, 4 cores
https://github.com/torognes/vsearch

Reading input file 100%
8310 sequences kept (of which 0 truncated), 0 sequences discarded.
vsearch v2.15.0_linux_x86_64, 23.0GB RAM, 4 cores
https://github.com/torognes/vsearch

Reading input file 100%
41716 sequences kept (of which 0 truncated), 0 sequences discarded.
vsearch v2.15.0_linux_x86_64, 23.0GB RAM, 4 cores
https://github.com/torognes/vsearch

Reading input file 100%
11960 sequences kept (of which 0 truncated), 0 sequences discarded.
vsearch v2.15.0_linux_x86_64, 23.0GB RAM, 4 cores
https://github.com/torognes/vsearch

Reading input file 100%
2809 sequences kept (of which 0 truncated), 0 sequences discarded.
vsearch v2.15.0_linux_x86_64, 23.0GB RAM, 4 cores
https://github.com/torognes/vsearch

Reading input file 100%
8310 sequences kept (of which 0 truncated), 0 sequences discarded.
vsearch v2.15.0_linux_x86_64, 23.0GB RAM, 4 cores
https://github.com/torognes/vsearch

Reading input file 100%
16417 sequences kept (of which 0 truncated), 0 sequences discarded.
vsearch v2.15.0_linux_x86_64, 23.0GB RAM, 4 cores
https://github.com/torognes/vsearch

Reading input file 100%
3246 sequences kept (of which 0 truncated), 0 sequences discarded.
vsearch v2.15.0_linux_x86_64, 23.0GB RAM, 4 cores
https://github.com/torognes/vsearch

Reading input file 100%
11960 sequences kept (of which 0 truncated), 0 sequences discarded.
vsearch v2.15.0_linux_x86_64, 23.0GB RAM, 4 cores
https://github.com/torognes/vsearch

Reading input file 100%
41716 sequences kept (of which 0 truncated), 0 sequences discarded.
vsearch v2.15.0_linux_x86_64, 23.0GB RAM, 4 cores
https://github.com/torognes/vsearch

Reading input file 100%
218718 sequences kept (of which 0 truncated), 0 sequences discarded.
vsearch v2.15.0_linux_x86_64, 23.0GB RAM, 4 cores
https://github.com/torognes/vsearch

Reading input file 100%
8283 sequences kept (of which 0 truncated), 0 sequences discarded.
vsearch v2.15.0_linux_x86_64, 23.0GB RAM, 4 cores
https://github.com/torognes/vsearch

Reading input file 100%
2809 sequences kept (of which 0 truncated), 0 sequences discarded.
~~	Dereplicate	~~
~~	ChimericSeqRemoval	~~
~~	Clustering	~~
Done
##########################################
##	Sequence identification : BLAST	##
##########################################
pool8-oral-pathogen_S8_L001
Warning: [blastn] Examining 5 or more matches is recommended
Indexed BLAST database error: NCBI C++ Exception:
    T0 "/opt/conda/conda-bld/blast_1595737360567/work/blast/c++/src/algo/blast/api/blast_dbindex.cpp", line 793: Error: (CDbIndex_Exception::bad index creation option) BLAST::ncbi::blast::CIndexedDb_New::CIndexedDb_New() - no database volume has an index

NCBI C++ Exception:
    T0 "/opt/conda/conda-bld/blast_1595737360567/work/blast/c++/src/algo/blast/api/blast_dbindex.cpp", line 1006: Error: (CDbIndex_Exception::bad index creation option) BLAST::ncbi::blast::CIndexedDb_Old::CIndexedDb_Old() - no index file specified or index 'nt*' not found.

pool6-oral-pathogen_S6_L001
Warning: [blastn] Examining 5 or more matches is recommended
Indexed BLAST database error: NCBI C++ Exception:
    T0 "/opt/conda/conda-bld/blast_1595737360567/work/blast/c++/src/algo/blast/api/blast_dbindex.cpp", line 793: Error: (CDbIndex_Exception::bad index creation option) BLAST::ncbi::blast::CIndexedDb_New::CIndexedDb_New() - no database volume has an index

NCBI C++ Exception:
    T0 "/opt/conda/conda-bld/blast_1595737360567/work/blast/c++/src/algo/blast/api/blast_dbindex.cpp", line 1006: Error: (CDbIndex_Exception::bad index creation option) BLAST::ncbi::blast::CIndexedDb_Old::CIndexedDb_Old() - no index file specified or index 'nt*' not found.

pool1-skin-pathogen_S1_L001
Warning: [blastn] Examining 5 or more matches is recommended
Indexed BLAST database error: NCBI C++ Exception:
    T0 "/opt/conda/conda-bld/blast_1595737360567/work/blast/c++/src/algo/blast/api/blast_dbindex.cpp", line 793: Error: (CDbIndex_Exception::bad index creation option) BLAST::ncbi::blast::CIndexedDb_New::CIndexedDb_New() - no database volume has an index

NCBI C++ Exception:
    T0 "/opt/conda/conda-bld/blast_1595737360567/work/blast/c++/src/algo/blast/api/blast_dbindex.cpp", line 1006: Error: (CDbIndex_Exception::bad index creation option) BLAST::ncbi::blast::CIndexedDb_Old::CIndexedDb_Old() - no index file specified or index 'nt*' not found.

pool4-skin-pathogen_S4_L001
Warning: [blastn] Examining 5 or more matches is recommended
Indexed BLAST database error: NCBI C++ Exception:
    T0 "/opt/conda/conda-bld/blast_1595737360567/work/blast/c++/src/algo/blast/api/blast_dbindex.cpp", line 793: Error: (CDbIndex_Exception::bad index creation option) BLAST::ncbi::blast::CIndexedDb_New::CIndexedDb_New() - no database volume has an index

NCBI C++ Exception:
    T0 "/opt/conda/conda-bld/blast_1595737360567/work/blast/c++/src/algo/blast/api/blast_dbindex.cpp", line 1006: Error: (CDbIndex_Exception::bad index creation option) BLAST::ncbi::blast::CIndexedDb_Old::CIndexedDb_Old() - no index file specified or index 'nt*' not found.

pool5-skin-pathogen_S5_L001
Warning: [blastn] Examining 5 or more matches is recommended
Indexed BLAST database error: NCBI C++ Exception:
    T0 "/opt/conda/conda-bld/blast_1595737360567/work/blast/c++/src/algo/blast/api/blast_dbindex.cpp", line 793: Error: (CDbIndex_Exception::bad index creation option) BLAST::ncbi::blast::CIndexedDb_New::CIndexedDb_New() - no database volume has an index

NCBI C++ Exception:
    T0 "/opt/conda/conda-bld/blast_1595737360567/work/blast/c++/src/algo/blast/api/blast_dbindex.cpp", line 1006: Error: (CDbIndex_Exception::bad index creation option) BLAST::ncbi::blast::CIndexedDb_Old::CIndexedDb_Old() - no index file specified or index 'nt*' not found.

pool7-oral-pathogen_S7_L001
Warning: [blastn] Examining 5 or more matches is recommended
Indexed BLAST database error: NCBI C++ Exception:
    T0 "/opt/conda/conda-bld/blast_1595737360567/work/blast/c++/src/algo/blast/api/blast_dbindex.cpp", line 793: Error: (CDbIndex_Exception::bad index creation option) BLAST::ncbi::blast::CIndexedDb_New::CIndexedDb_New() - no database volume has an index

NCBI C++ Exception:
    T0 "/opt/conda/conda-bld/blast_1595737360567/work/blast/c++/src/algo/blast/api/blast_dbindex.cpp", line 1006: Error: (CDbIndex_Exception::bad index creation option) BLAST::ncbi::blast::CIndexedDb_Old::CIndexedDb_Old() - no index file specified or index 'nt*' not found.

pool2-skin-pathogen_S2_L001
Warning: [blastn] Examining 5 or more matches is recommended
Indexed BLAST database error: NCBI C++ Exception:
    T0 "/opt/conda/conda-bld/blast_1595737360567/work/blast/c++/src/algo/blast/api/blast_dbindex.cpp", line 793: Error: (CDbIndex_Exception::bad index creation option) BLAST::ncbi::blast::CIndexedDb_New::CIndexedDb_New() - no database volume has an index

NCBI C++ Exception:
    T0 "/opt/conda/conda-bld/blast_1595737360567/work/blast/c++/src/algo/blast/api/blast_dbindex.cpp", line 1006: Error: (CDbIndex_Exception::bad index creation option) BLAST::ncbi::blast::CIndexedDb_Old::CIndexedDb_Old() - no index file specified or index 'nt*' not found.

pool3-skin-pathogen_S3_L001
Warning: [blastn] Examining 5 or more matches is recommended
Indexed BLAST database error: NCBI C++ Exception:
    T0 "/opt/conda/conda-bld/blast_1595737360567/work/blast/c++/src/algo/blast/api/blast_dbindex.cpp", line 793: Error: (CDbIndex_Exception::bad index creation option) BLAST::ncbi::blast::CIndexedDb_New::CIndexedDb_New() - no database volume has an index

NCBI C++ Exception:
    T0 "/opt/conda/conda-bld/blast_1595737360567/work/blast/c++/src/algo/blast/api/blast_dbindex.cpp", line 1006: Error: (CDbIndex_Exception::bad index creation option) BLAST::ncbi::blast::CIndexedDb_Old::CIndexedDb_Old() - no index file specified or index 'nt*' not found.

Done
##########################################
##	Advanced analysis		##
##########################################
Possible precedence issue with control flow operator at /root/miniconda3/lib/site_perl/5.26.2/Bio/DB/IndexedBase.pm line 805.
pool1-skin-pathogen_S1_L001 is empty and thus will not be proceed by the program
pool2-skin-pathogen_S2_L001 is empty and thus will not be proceed by the program
pool3-skin-pathogen_S3_L001 is empty and thus will not be proceed by the program
pool4-skin-pathogen_S4_L001 is empty and thus will not be proceed by the program
pool5-skin-pathogen_S5_L001 is empty and thus will not be proceed by the program
pool6-oral-pathogen_S6_L001 is empty and thus will not be proceed by the program
pool7-oral-pathogen_S7_L001 is empty and thus will not be proceed by the program
pool8-oral-pathogen_S8_L001 is empty and thus will not be proceed by the program
Use of uninitialized value in concatenation (.) or string at ./PVAmpliconFinder_step2.pl line 699.
Use of uninitialized value in concatenation (.) or string at ./PVAmpliconFinder_step2.pl line 699.
Use of uninitialized value in concatenation (.) or string at ./PVAmpliconFinder_step2.pl line 699.
Use of uninitialized value in concatenation (.) or string at ./PVAmpliconFinder_step2.pl line 699.
Use of uninitialized value in concatenation (.) or string at ./PVAmpliconFinder_step2.pl line 699.
Use of uninitialized value in concatenation (.) or string at ./PVAmpliconFinder_step2.pl line 699.
Use of uninitialized value in concatenation (.) or string at ./PVAmpliconFinder_step2.pl line 699.
Use of uninitialized value in concatenation (.) or string at ./PVAmpliconFinder_step2.pl line 699.
Use of uninitialized value in concatenation (.) or string at ./PVAmpliconFinder_step2.pl line 699.
Use of uninitialized value in concatenation (.) or string at ./PVAmpliconFinder_step2.pl line 699.
Use of uninitialized value in concatenation (.) or string at ./PVAmpliconFinder_step2.pl line 699.
Use of uninitialized value in concatenation (.) or string at ./PVAmpliconFinder_step2.pl line 699.
Use of uninitialized value in concatenation (.) or string at ./PVAmpliconFinder_step2.pl line 699.
Use of uninitialized value in concatenation (.) or string at ./PVAmpliconFinder_step2.pl line 699.
Use of uninitialized value in concatenation (.) or string at ./PVAmpliconFinder_step2.pl line 699.
Use of uninitialized value in concatenation (.) or string at ./PVAmpliconFinder_step2.pl line 699.
Use of uninitialized value in concatenation (.) or string at ./PVAmpliconFinder_step2.pl line 699.
Use of uninitialized value in concatenation (.) or string at ./PVAmpliconFinder_step2.pl line 699.
Can't use an undefined value as an ARRAY reference at ./PVAmpliconFinder_step2.pl line 702.

These are the commands that I am using to run the program:

FASTQ=../test_files
OUT=../test_out
INFO="test_demo_file.txt"

#I created the ~/.ncbirc file, but I am also seeing if this might help
export BLASTDB=/path/to/PVAmpliconFinder/databases

./PVAmpliconFinder.sh -t 1 -i 98 -s pool -d $FASTQ -o $OUT -f $INFO

Can you please continue to help me troubleshoot?

Thank you in advance!

@SixEl27
Collaborator

SixEl27 commented Aug 3, 2020

Dear Charles,

Sorry for the late reply.

I'm currently re-downloading the nt database to match your current installation procedure. This may take some time.

In the meantime, you can confirm that all the indexes are indeed present in the folder where the database is stored (this should be the case, as the script "update_blastdb.pl" downloads pre-formatted databases). I've updated the README with links to the NCBI pages that describe the correct configuration, as I think your issue comes from a path that is not set correctly.
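
One rough way to do that check from the shell is sketched below. The helper name `check_blast_volumes` is hypothetical, and the sketch assumes the usual layout of a pre-formatted nucleotide database (volumes named like `nt.00`, each with `.nhr`, `.nin` and `.nsq` files). It does not look at any other files a newer database version may ship, so treat a clean result as a hint rather than proof.

```shell
# Hypothetical helper (not part of PVAmpliconFinder): report volumes of a
# nucleotide BLAST database that lack one of the core files.
# Usage: check_blast_volumes DB_DIR [DB_NAME]   (DB_NAME defaults to "nt")
check_blast_volumes() {
    db_dir=$1
    db_name=${2:-nt}
    missing=0
    # Collect volume prefixes (e.g. nt.00) from whichever core files exist.
    volumes=$(ls "$db_dir" 2>/dev/null | sed -n \
        -e "s/^\($db_name\.[0-9][0-9]*\)\.nhr\$/\1/p" \
        -e "s/^\($db_name\.[0-9][0-9]*\)\.nin\$/\1/p" \
        -e "s/^\($db_name\.[0-9][0-9]*\)\.nsq\$/\1/p" | sort -u)
    if [ -z "$volumes" ]; then
        echo "no $db_name volumes found in $db_dir"
        return 1
    fi
    for vol in $volumes; do
        for ext in nhr nin nsq; do
            # -s: file exists and is not empty
            if [ ! -s "$db_dir/$vol.$ext" ]; then
                echo "missing or empty: $vol.$ext"
                missing=$((missing + 1))
            fi
        done
    done
    [ "$missing" -eq 0 ]
}
```

For example, `check_blast_volumes /path/to/PVAmpliconFinder/databases nt` prints nothing and returns success when every volume it finds has all three files. Running `blastdbcmd -db nt -info` afterwards is a complementary check that BLAST itself can locate the database through `BLASTDB` or `~/.ncbirc`.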

I won't be available in the coming weeks to answer any other questions you may have. I'll keep you updated later. Thanks for your patience.

Alexis

@cwarden45
Author

Hi Alexis,

As far as I can tell, I think all of the files are there (I can tell that there were 27 tar.gz files downloaded successfully).

I am working on downloading the original FASTA files and indexing them from scratch.

I realize that I could have tried to run blastdbcmd -entry all -db nt -out nr.fsa, but I am trying to be able to clearly tell which files are new and which files are old. However, I also think I misunderstood that I would have to run that for each of the 27 files (and combine the .fsa files to try and create a new index).

The estimated download time is long, which matches what you are describing.

Thank You,
Charles

@cwarden45
Author

I think I have the BLAST step fixed. I needed to use another computer to run that step, taking advantage of the fact that PVAmpliconFinder can pick up the analysis in the middle of the process (depending upon which subfolders are in the output folder).

However, I think I am having a problem at the "Advanced Analysis" step:

##########################################
##	FastQC control of the raw reads	##
##########################################
FastQC control of raw fastq files already done
##########################################
##	Remove adapter sequence		##
##########################################
Trim_galore already done
##########################################
##	Clustering step : VSEARCH	##
##########################################
Vsearch already done
##########################################
##	Sequence identification : BLAST	##
##########################################
Sequence identification already done
##########################################
##	Advanced analysis		##
##########################################
Possible precedence issue with control flow operator at /root/miniconda3/lib/site_perl/5.26.2/Bio/DB/IndexedBase.pm line 805.
pool1-skin-pathogen_S1_L001
Error in the format of the sequence ID : pool1-skin-pathogen_S1_L0011;clusterid=0;size=170654

There are pool1-skin-pathogen_S1_L001.csv files in the analysis_known and analysis_new output subfolders with a header but no data.
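
To make the ID layout in that error message easier to inspect, here is a minimal sketch that splits such an ID into its parts. The function name `parse_cluster_id` is hypothetical, and the assumed layout (`<label>;clusterid=<n>;size=<n>`) is inferred only from the IDs quoted in this thread. The actual check in PVAmpliconFinder_step2.pl is not shown here and may differ.

```shell
# Hypothetical helper: split an ID such as
#   pool1-skin-pathogen_S1_L0011;clusterid=0;size=170654
# into "label<TAB>clusterid<TAB>size". Returns non-zero if a field is absent.
parse_cluster_id() {
    id=$1
    case $id in
        *";clusterid="*";size="*) ;;
        *) echo "unexpected ID format: $id" >&2; return 1 ;;
    esac
    label=${id%%;*}              # everything before the first ';'
    rest=${id#*;clusterid=}      # text after ';clusterid='
    clusterid=${rest%%;*}        # up to the next ';'
    size=${id##*;size=}          # text after the last ';size='
    size=${size%%;*}             # drop anything trailing after another ';'
    printf '%s\t%s\t%s\n' "$label" "$clusterid" "$size"
}
```

Calling `parse_cluster_id 'pool1-skin-pathogen_S1_L0011;clusterid=0;size=170654'` prints the three fields separated by tabs, which makes it easy to see whether the annotations arrive in the order a downstream script expects.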

I have attached a .zip file with the BLAST results for all 8 samples (including the first sample where the error message is occurring).
blast_result.zip

I am not sure if you are available yet, but can you please help me troubleshoot?

I hope that I am close to having this working :)

@SixEl27
Collaborator

SixEl27 commented Aug 27, 2020

Dear Charles,

Thank you for your patience.

It seems that the vsearch output has changed slightly in its latest version; I've updated the script accordingly.

It seems that the latest version of the Perl module Bio::Tools::Run::StandAloneBlastPlus also produces an error. I'm working on solving this issue.

Alexis

@cwarden45
Author

Hi Alexis - great, thank you very much for your help!

@cwarden45
Author

Hi Alexis,

I remember that you said that you would be away for a while.

Are you back and able to continue to help me troubleshoot what I think is close to the last step?

Thank you very much!

Sincerely,
Charles

@cwarden45
Author

cwarden45 commented Dec 29, 2020

Hi Alexis,

I have downloaded a newer version of PVAmpliconFinder and I tested re-running the scripts on the demo dataset.

The program gets past the previous point where it stopped, but I am still getting warnings and the program crashes at one point.
I am wondering if it has essentially got far enough to provide the important results (or if the remaining error message causing the program to crash can be fixed).

There are now analysis_known, analysis_new, and KRONA_MegaBlast subfolders within the output folder.

There are also DiversityByTissu_MegaBlast.csv, diversityByTissu_oral_MegaBlast.csv, diversityByTissu_skin_MegaBlast.csv, and table_summary_MegaBlast_results output files.

The contents of the table_summary_MegaBlast_results file are down below:

MegaBlast results
Table 1 - Sequencing statistics per samples
Pool	Primer	Tissue	Allreads	VIRUS_reads	pVIRUS_reads	Other_reads	pOther_reads	VIRUSnew_reads	pVIRUSnew_onTot	pVIRUSnew_onVIRUSknown	Number of new VIRUS	Number of known VIRUS
pool1-skin-pathogen_S1_L001	Beta3_1	skin	337756	335246	99.257	2510	0.743	0	0	0	0	5
pool2-skin-pathogen_S2_L001	Beta3_2	skin	10543	9642	91.454	901	8.546	0	0	0	0	7
pool3-skin-pathogen_S3_L001	FAP	skin	84061	83384	99.195	677	0.805	2	0.002	0.002	1	20
pool4-skin-pathogen_S4_L001	FAPM1	skin	21648	21317	98.471	331	1.529	0	0	0	0	29
pool5-skin-pathogen_S5_L001	CUT	skin	66898	63811	95.386	3087	4.614	3	0.004	0.005	1	43
pool6-oral-pathogen_S6_L001	FAPM1	oral	155734	210	0.135	155524	99.865	0	0	0	0	18
pool7-oral-pathogen_S7_L001	FAPM2	oral	484031	46443	9.595	437588	90.405	0	0	0	0	17
pool8-oral-pathogen_S8_L001	CUT	oral	242304	167	0.069	242137	99.931	0	0	0	0	8


Table 2 - Sequencing statistics per primers
Primer	Allreads	VIRUS_reads	pVIRUS_reads	Other_reads	pOther_reads	VIRUSnew_reads	pVIRUSnew_onTot	pVIRUSnew_onVIRUSknown
Beta3_1	337756	335246	99.257	2510	0.743	0	0	0
Beta3_2	10543	9642	91.454	901	8.546	0	0	0
CUT	154602	31990	20.692	122612	79.308	2	0.001	0.006
FAP	84061	83384	99.195	677	0.805	2	0.002	0.002
FAPM1	88692	10764	12.136	77928	87.864	0	0	0
FAPM2	484031	46443	9.595	437588	90.405	0	0	0


Table 3 - Sequencing statistics per primers & tissue
Primer	Tissue	Other	VIRUSknown	VIRUSnew
Beta3_1	skin	17	5	0
Beta3_2	skin	60	7	0
FAP	skin	15	20	1
FAPM1	skin	15	29	0
CUT	skin	37	43	1
FAPM1	oral	265	18	0
FAPM2	oral	200	17	0
CUT	oral	137	8	0


Table 4 - Species identification per primers
Primer	Other	VIRUSknown	VIRUSnew
Beta3_1	17	5	0
Beta3_2	60	7	0
CUT	174	51	1
FAP	15	20	1
FAPM1	280	47	0
FAPM2	200	17	0
Total species	746	147	2
Total unique species	550	107	2


Table 5 - KNOWN virus family level
Pool	Papillomaviridae
pool1-skin-pathogen_S1_L001	335246
pool2-skin-pathogen_S2_L001	9642
pool3-skin-pathogen_S3_L001	83382
pool4-skin-pathogen_S4_L001	21317
pool5-skin-pathogen_S5_L001	63808
pool6-oral-pathogen_S6_L001	210
pool7-oral-pathogen_S7_L001	46443
pool8-oral-pathogen_S8_L001	167

Table 6 - KNOWN virus genus level
Family	Papillomaviridae					
Pool\Genus	Alphapapillomavirus	Betapapillomavirus	Gammapapillomavirus	Lambdapapillomavirus	Unclassified	
pool1-skin-pathogen_S1_L001	0	230838	0	16	104392
pool2-skin-pathogen_S2_L001	0	662	0	155	8825
pool3-skin-pathogen_S3_L001	0	2801	69390	0	11191
pool4-skin-pathogen_S4_L001	0	335	4676	0	16306
pool5-skin-pathogen_S5_L001	10205	13383	20001	0	20219
pool6-oral-pathogen_S6_L001	65	51	58	0	36
pool7-oral-pathogen_S7_L001	19	1869	42320	0	2235
pool8-oral-pathogen_S8_L001	153	4	0	0	10

Table 7 - NEW virus family level
Pool	Papillomaviridae
pool1-skin-pathogen_S1_L001	0
pool2-skin-pathogen_S2_L001	0
pool3-skin-pathogen_S3_L001	2
pool4-skin-pathogen_S4_L001	0
pool5-skin-pathogen_S5_L001	3
pool6-oral-pathogen_S6_L001	0
pool7-oral-pathogen_S7_L001	0
pool8-oral-pathogen_S8_L001	0

Table 8 - NEW virus genus level
Family	Papillomaviridae		
Pool\Genus	Gammapapillomavirus	Unclassified	
pool1-skin-pathogen_S1_L001	0	0
pool2-skin-pathogen_S2_L001	0	0
pool3-skin-pathogen_S3_L001	0	2
pool4-skin-pathogen_S4_L001	0	0
pool5-skin-pathogen_S5_L001	3	0
pool6-oral-pathogen_S6_L001	0	0
pool7-oral-pathogen_S7_L001	0	0
pool8-oral-pathogen_S8_L001	0	0
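
As a quick sanity check on Table 1, the percentage columns look like plain ratios of the read counts. The sketch below assumes pVIRUS_reads = 100 × VIRUS_reads / Allreads and pOther_reads = 100 × Other_reads / Allreads, rounded to three decimals. That is an inference from the numbers shown, not something taken from the PVAmpliconFinder source, and `pct` is a hypothetical helper name.

```shell
# pct PART TOTAL: print PART as a percentage of TOTAL with three decimals.
# Assumed formula (inferred from Table 1): 100 * PART / TOTAL
pct() {
    awk -v part="$1" -v total="$2" 'BEGIN {
        if (total == 0) { print "NA"; exit }   # avoid division by zero
        printf "%.3f\n", 100 * part / total
    }'
}

# pool1-skin-pathogen_S1_L001 from Table 1
pct 335246 337756   # VIRUS_reads vs. Allreads
pct 2510 337756     # Other_reads vs. Allreads
```

For pool1 this reproduces the 99.257 and 0.743 listed in the pVIRUS_reads and pOther_reads columns, which suggests the per-sample statistics are at least internally consistent.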

However, this is the output that I am seeing as those files are produced:

##########################################
##	FastQC control of the raw reads	##
##########################################
FastQC control of raw fastq files already done
##########################################
##	Remove adapter sequence		##
##########################################
Trim_galore already done
##########################################
##	Clustering step : VSEARCH	##
##########################################
Vsearch already done
##########################################
##	Sequence identification : BLAST	##
##########################################
Sequence identification already done
##########################################
##	Advanced analysis		##
##########################################
Possible precedence issue with control flow operator at /root/miniconda3/lib/site_perl/5.26.2/Bio/DB/IndexedBase.pm line 805.
pool1-skin-pathogen_S1_L001
pool2-skin-pathogen_S2_L001
pool3-skin-pathogen_S3_L001
pool4-skin-pathogen_S4_L001
pool5-skin-pathogen_S5_L001
pool6-oral-pathogen_S6_L001
pool7-oral-pathogen_S7_L001
pool8-oral-pathogen_S8_L001
Concat NEW VIRUS sequence
Odd number of elements in hash assignment at /root/miniconda3/lib/site_perl/5.26.2/Bio/DB/IndexedBase.pm line 326.
Odd number of elements in hash assignment at /root/miniconda3/lib/site_perl/5.26.2/Bio/DB/IndexedBase.pm line 326, <F1> line 1.
Odd number of elements in hash assignment at /root/miniconda3/lib/site_perl/5.26.2/Bio/DB/IndexedBase.pm line 326, <F1> line 2.
Odd number of elements in hash assignment at /root/miniconda3/lib/site_perl/5.26.2/Bio/DB/IndexedBase.pm line 326, <F1> line 4.
Odd number of elements in hash assignment at /root/miniconda3/lib/site_perl/5.26.2/Bio/DB/IndexedBase.pm line 326, <F1> line 5.
Odd number of elements in hash assignment at /root/miniconda3/lib/site_perl/5.26.2/Bio/DB/IndexedBase.pm line 326, <F1> line 7.
Odd number of elements in hash assignment at /root/miniconda3/lib/site_perl/5.26.2/Bio/DB/IndexedBase.pm line 326, <F1> line 8.
Odd number of elements in hash assignment at /root/miniconda3/lib/site_perl/5.26.2/Bio/DB/IndexedBase.pm line 326, <F1> line 9.
pool1-skin-pathogen_S1_L001
pool2-skin-pathogen_S2_L001
pool3-skin-pathogen_S3_L001
pool4-skin-pathogen_S4_L001
pool5-skin-pathogen_S5_L001
pool6-oral-pathogen_S6_L001
pool7-oral-pathogen_S7_L001
pool8-oral-pathogen_S8_L001
Concat KNOWN VIRUS sequence
Odd number of elements in hash assignment at /root/miniconda3/lib/site_perl/5.26.2/Bio/DB/IndexedBase.pm line 326, <F1> line 10.
Odd number of elements in hash assignment at /root/miniconda3/lib/site_perl/5.26.2/Bio/DB/IndexedBase.pm line 326, <F1> line 48.
Odd number of elements in hash assignment at /root/miniconda3/lib/site_perl/5.26.2/Bio/DB/IndexedBase.pm line 326, <F1> line 125.
Odd number of elements in hash assignment at /root/miniconda3/lib/site_perl/5.26.2/Bio/DB/IndexedBase.pm line 326, <F1> line 156.
Odd number of elements in hash assignment at /root/miniconda3/lib/site_perl/5.26.2/Bio/DB/IndexedBase.pm line 326, <F1> line 207.
Odd number of elements in hash assignment at /root/miniconda3/lib/site_perl/5.26.2/Bio/DB/IndexedBase.pm line 326, <F1> line 287.
Odd number of elements in hash assignment at /root/miniconda3/lib/site_perl/5.26.2/Bio/DB/IndexedBase.pm line 326, <F1> line 316.
Odd number of elements in hash assignment at /root/miniconda3/lib/site_perl/5.26.2/Bio/DB/IndexedBase.pm line 326, <F1> line 371.
pool1-skin-pathogen_S1_L001
pool2-skin-pathogen_S2_L001
pool3-skin-pathogen_S3_L001
pool4-skin-pathogen_S4_L001
pool5-skin-pathogen_S5_L001
pool6-oral-pathogen_S6_L001
pool7-oral-pathogen_S7_L001
pool8-oral-pathogen_S8_L001

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: /opt/ncbi-blast-2.10.0+/bin/makeblastdb call crashed: There was a problem running /opt/ncbi-blast-2.10.0+/bin/makeblastdb : Error: mdb_env_open: Invalid argument

STACK: Error::throw
STACK: Bio::Root::Root::throw /root/miniconda3/lib/site_perl/5.26.2/Bio/Root/Root.pm:447
STACK: Bio::Tools::Run::WrapperBase::_run /root/miniconda3/lib/site_perl/5.26.2/Bio/Tools/Run/WrapperBase/CommandExts.pm:1032
STACK: Bio::Tools::Run::StandAloneBlastPlus::make_db /root/miniconda3/lib/site_perl/5.26.2/Bio/Tools/Run/StandAloneBlastPlus.pm:778
STACK: ./PVAmpliconFinder_step2.pl:1498
-----------------------------------------------------------

Is there anything that can be done to avoid that last error message?

Is there anything else that PVAmpliconFinder is supposed to be providing? For example, do I have everything for known sequences and this would only be a problem for novel sequences?

The files within analysis_new mostly only contain headers without any data rows, with 2 exceptions:

pool3-skin-pathogen_S3_L001.csv:

QueryID	SubjectID	evalue	bitscore	length query	perc id	frames	taxid	kingdom	scientifique name	common name	blast name	title	seq query	startq	stopq
pool3-skin-pathogen_S3_L00133;clusterid=32;size=2	gi|1483237272|gb|MH972565.1|	1.06e-09	75.0	78	84.615	1/-1	10566	Viruses	Human papillomavirus	Human papillomavirus	viruses	Human papillomavirus isolate HPV-mSK_213, complete genome	TCTTTCTGGATTATATATAGATGGG-TCTATTAAAGCAAATTTGTTAGGATCTGGGAGTCTTAATCTGAAA-ACTCTA	3	78

pool5-skin-pathogen_S5_L001.csv:

QueryID	SubjectID	evalue	bitscore	length query	perc id	frames	taxid	kingdom	scientifique name	common name	blast name	title	seq query	startq	stopq
pool5-skin-pathogen_S5_L00169;clusterid=68;size=3	gi|1273499348|gb|MF588722.1|	1.13e-72	285	352	81.250	1/-1	1513258	Viruses	Gammapapillomavirus 13	Gammapapillomavirus 13	viruses	Gammapapillomavirus 13 isolate Gamma13_HIVGc158, complete genome	CCTGTATTTTCGTATGATTTAATGCTATAGGTGGACATTCGCCTGCATTCTGTTTATTACAAGGTGTTGTAACGTCCCAATGTTGTCCTATAGGAGGTGCACAACCAACAATAAGAATTTGAGTTTGTTTTGGTTCAAACGATAAATTAACTCTATTATCATCTTGTTCTGGAATGGGTCTTCCCAAAGGATTTTCTGTATCTCCATATTTATTTAATAAAGGATGACCTACAGCTCCAATACCTAGAGGACCACCTCTGTCAACCTCTAAGCCTTTAACTCTCCAAACTAATCTCTCCGTTTCTGGATTATAAATGTCCTGATCTATCAATGCAAACTTATTCGGATCCGG	1	352

Also, I am guessing it isn't a big deal, but you can see above that these are tab-delimited (tsv) rather than comma-separated (csv) files.
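
For anyone wanting to check which of these files actually contain hits, a minimal sketch follows. It assumes the first line of each file is the header and that fields are tab-separated, as in the excerpts above. The helper name `count_data_rows` is hypothetical and not part of PVAmpliconFinder.

```shell
# Hypothetical helper: print "file<TAB>data_rows" for every .csv in a
# directory, treating the first line as a header and ignoring blank lines.
count_data_rows() {
    dir=$1
    for f in "$dir"/*.csv; do
        [ -e "$f" ] || continue   # the glob matched nothing
        rows=$(awk 'NR > 1 && NF > 0 { n++ } END { print n + 0 }' "$f")
        printf '%s\t%s\n' "$(basename "$f")" "$rows"
    done
}
```

For example, `count_data_rows ../test_out/analysis_new` would list each sample file with its number of hit rows, assuming the output folder from the commands earlier in the thread. Because the files are tab-delimited despite the `.csv` extension, something like `awk -F'\t' 'NR > 1 { print $1, $6 }' pool3-skin-pathogen_S3_L001.csv` pulls out the QueryID and percent identity columns.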

Thank you very much.

Sincerely,
Charles
