Issue Running PVAmpliconFinder.sh #3
Comments
Dear Charles, Indeed, "_R1" and "_R2" must be present in the fastq file names for the script to recognize the paired sequencing reads, as specified in the help. I apologize, I made a mistake in the help text: the "-s" option should be the prefix, not the suffix, of the fastq file names to analyze. In your case, something like "SRR" or even "SRR9702" should work. I have already uploaded an updated version of the help. If the program can find the input fastq files, this should solve the issue you have with the identity threshold option. If not, please let me know and I'll have a closer look at this issue. Also, I did not think about the renaming of the fastq files by the SRR database. To match the current info_file.txt that I just uploaded to the GitHub repo (#2), you may rename the fastq files or modify the info_file accordingly. The correspondence between the info_file.txt file naming and the SRR database renaming can be found here, for pool8 as an example, in the "Data access" tab. I hope this helps solve your issue, Sincerely, |
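The naming rule described in this reply can be checked before launching the pipeline. The following is only a minimal sketch, not part of PVAmpliconFinder: `list_unpaired` is a hypothetical helper, and the directory layout is an assumption. It lists every file that starts with the chosen prefix and contains "_R1" but has no matching "_R2" mate in the same folder.

```shell
#!/bin/sh
# Hypothetical helper (not part of PVAmpliconFinder): print every *_R1* file
# in DIR that starts with PREFIX and has no matching *_R2* mate.
list_unpaired() {
    dir=$1
    prefix=$2
    for r1 in "$dir"/"$prefix"*_R1*; do
        # If the glob matched nothing, the pattern itself comes back; skip it.
        [ -e "$r1" ] || continue
        name=${r1##*/}
        # Build the mate name by swapping the first "_R1" for "_R2".
        mate="${name%%_R1*}_R2${name#*_R1}"
        [ -e "$dir/$mate" ] || printf '%s\n' "$name"
    done
}
```

For example, `list_unpaired test_files SRR` prints nothing when every "SRR..._R1" file in test_files has its "_R2" counterpart. Empty output is also what you get when no file matches the prefix at all, so it is worth listing the folder first.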
Hi Alexis, 1) In terms of troubleshooting the installation: The first problem was that The second problem was that I am testing PVAmpliconFinder in an Ubuntu Docker image (version 16.04). 2) In terms of modifying the code to run PVAmpliconFinder, this is the mapping that I found for the SRA samples: SRR9702898: pool5-skin-pathogen_S5_L001 So, this is what the current FASTQ files look like:
This is the revised command that I am using:
And this is the output that I see:
So, I am sorry, but I still don't think I quite have this working for the demo files. Can you please continue to help me troubleshoot? Thank You, |
Dear Charles,
Because all the information for conda installation can be found here, I now only provide the conda channel and conda install commands needed for PVAmpliconFinder to run. It does not make the installation more complicated, and users are also free to create their own conda environment. I don't know why vsearch was not successfully added to the path in your case, as I use the regular bioconda instructions to install vsearch.
To take all the fastq files as input, the files need to share a common prefix, so I would advise using something like "pool". I created a bash script containing the revised command that you are using (I tried with and without the "-i 98" option and it ran in both cases):
And everything runs correctly on the server I'm using (CentOS 7.0). I don't know what the cause of the issue you're facing could be, but the errors look like they come from the "if" conditions of the bash script. Maybe you should try running the script like this: I hope this will solve your problem. Alexis |
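The "[[: not found" message reported in this issue has the form that dash, the default /bin/sh on Ubuntu, prints when it meets the bash-only "[[ ... ]]" test, so one plausible explanation is that the script was interpreted by sh rather than bash. The thread itself does not confirm this. As a rough sketch under that assumption, the hypothetical helpers below (not part of PVAmpliconFinder) look for "[[" in a script and suggest which interpreter to call it with.

```shell
#!/bin/sh
# Hypothetical helper: succeed if SCRIPT contains the bash-only "[[" test.
# This is a plain text search, so a "[[ " inside a comment also counts.
needs_bash() {
    grep -qF '[[ ' "$1"
}

# Hypothetical helper: print the interpreter that SCRIPT should be run with.
pick_interpreter() {
    if needs_bash "$1"; then
        echo bash
    else
        echo sh
    fi
}
```

If `pick_interpreter PVAmpliconFinder.sh` prints "bash", the script should be started with bash explicitly, or run directly so that its own shebang line is used, rather than with `sh`.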
Hi Alexis, 1) I think you have changed the part that caused problems for me in the installation file for vsearch. I am guessing that this won't be an issue for other users. However, if somebody had an earlier download for some reason, then the change can be seen here. 2) I have made the changes that you described in the file names as well as testing running the script in different ways. I am running some concurrent analysis on the computer where I have the Docker image installed. However, I did not get the same error messages within a few seconds of running the command (and it looked like it was successfully starting to decompress the FASTQ files). So, I think it might be best if I waited a little bit before I continued testing the program, but I think this helped. If I can confirm that the full set of analysis works on the demo data, then I will close this ticket. Thank you very much! Sincerely, |
As mentioned in thread #4, I think there is some issue with BLAST, since the .blast files are empty and there are error messages at that step:
These are the commands that I am using to run the program:
Can you please continue to help me troubleshoot? Thank you in advance! |
Dear Charles, Sorry for the late reply. I'm currently re-downloading the nt database to match your current installation procedure. This may take some time. In the meantime, you may confirm that all the indexes are indeed present in the folder where the database is stored (this should be the case, as the script "update_blastdb.pl" downloads pre-formatted databases). I've updated the README with links to the NCBI website pages describing the correct configuration, as I think your issue comes from a path that is not correctly set. I will not be available in the coming weeks to answer other questions you may have. I'll keep you updated later. Thanks for your patience. Alexis |
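One way to confirm that the indexes are present is to check each database volume for its companion files. The sketch below is an illustration only, not part of PVAmpliconFinder: `missing_blast_indexes` is a hypothetical helper, and it assumes a nucleotide database split into volumes named like nt.00.nsq, each with a .nhr and a .nin file beside it. When BLAST+ is installed, `blastdbcmd -db nt -info` is a more direct check, since it fails if the database cannot be found or opened.

```shell
#!/bin/sh
# Hypothetical helper: for every sequence volume DB*.nsq in DIR, print the
# name of each header (.nhr) or index (.nin) file missing next to it.
missing_blast_indexes() {
    dir=$1
    db=$2
    for nsq in "$dir"/"$db"*.nsq; do
        # If the glob matched nothing, the pattern itself comes back; skip it.
        [ -e "$nsq" ] || continue
        base=${nsq%.nsq}
        for ext in nhr nin; do
            [ -e "$base.$ext" ] || printf '%s\n' "${base##*/}.$ext"
        done
    done
}
```

Calling `missing_blast_indexes /path/to/db nt` (the path is a placeholder) prints one line per missing file. Empty output means either that every volume is complete or that no .nsq volume was found in that folder, so an empty folder would also pass silently.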
Hi Alexis, As far as I can tell, I think all of the files are there (I can tell that there were 27 tar.gz files downloaded successfully). I am working on downloading the original FASTA files and indexing them from scratch. I realize that I could have tried to run The estimated download time is long, which matches what you are describing. Thank You, |
I think I have the BLAST step fixed. I needed to use another computer to run that step, taking advantage of the fact that PVAmpliconFinder can pick up analysis in the middle of the process (depending upon which subfolders are in the output folder). However, I think I am having a problem at the "Advanced Analysis" step:
There are I have attached a .zip file with the BLAST results for all 8 samples (including the first sample where the error message is occurring). I am not sure if you are available yet, but can you please help me troubleshoot? I hope that I am close to having this working :) |
Dear Charles, Thank you for your patience. It seems that the vsearch output has changed slightly in its latest version; I've updated the script accordingly. It seems that the latest version of the Perl module Bio::Tools::Run::StandAloneBlastPlus also produces an error. I'm working on solving this issue. Alexis |
Hi Alexis - great, thank you very much for your help! |
Hi Alexis, I remember that you said that you would be away for a while. Are you back and able to continue to help me troubleshoot what I think is close to the last step? Thank you very much! Sincerely, |
Hi Alexis, I have downloaded a newer version of PVAmpliconFinder and I tested re-running the scripts on the demo dataset. The program gets past the previous point where it stopped, but I am still getting warnings and the program crashes at one point. There are now There are also DiversityByTissu_MegaBlast.csv, diversityByTissu_oral_MegaBlast.csv, diversityByTissu_skin_MegaBlast.csv, and table_summary_MegaBlast_results output files. The contents of the table_summary_MegaBlast_results file are shown below:
However, this is the output that I am seeing as those files are produced:
Is there anything that can be done to avoid that last error message? Is there anything else that PVAmpliconFinder is supposed to be providing? For example, do I have everything for known sequences and this would only be a problem for novel sequences? The files within pool3-skin-pathogen_S3_L001.csv:
pool5-skin-pathogen_S5_L001.csv:
Also, I am guessing it isn't a big deal, but you can see above that these are tab-delimited (tsv) rather than comma-separated (csv) files. Thank you very much. Sincerely, |
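Whether a result file is tab-delimited or comma-separated can be checked from its header line. This is a minimal sketch, assuming the first line is representative of the whole file; `guess_delimiter` is a hypothetical helper, not part of PVAmpliconFinder.

```shell
#!/bin/sh
# Hypothetical helper: guess the field delimiter of FILE by counting tabs
# and commas in its first line.
guess_delimiter() {
    header=$(head -n 1 "$1")
    # tr -cd keeps only the named character; wc -c counts what is left.
    # The final tr strips the padding some wc implementations add.
    tabs=$(printf '%s' "$header" | tr -cd '\t' | wc -c | tr -d ' ')
    commas=$(printf '%s' "$header" | tr -cd ',' | wc -c | tr -d ' ')
    if [ "$tabs" -gt "$commas" ]; then
        echo tab
    elif [ "$commas" -gt "$tabs" ]; then
        echo comma
    else
        echo unknown
    fi
}
```

For the files described above, `guess_delimiter pool3-skin-pathogen_S3_L001.csv` would be expected to print "tab" if they are tab-delimited as reported.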
Hi,
I apologize for submitting 2 tickets close to the same time, but I wanted to list some details before I forgot about them.
Essentially, I think I may have successfully installed PVAmpliconFinder, except that I manually configured conda/bioconda and I used the regular channels (without the /label/cf201901 part).
I also renamed the public FASTQ files as follows (adding the "R" as described by an error that I got with the program otherwise) in a folder called test_files:
I am trying to test the program with minimal provided information, without the information file described in #2.
I am using the following commands (within the downloaded GitHub folder) to try and run PVAmpliconFinder:
I added -i 98 because I was previously getting an error message saying PVAmpliconFinder.sh: 92: PVAmpliconFinder.sh: [[: not found (making me think this might have been a required rather than optional parameter?). However, I don't think that solved the problem because these are the error messages that I am currently seeing:
Can you please help me troubleshoot these messages?
Thank you very much.
Sincerely,
Charles