The typingtools (NoV, EV, HAV) rules give an error (proxy) and crash. Explanation. #29

Closed
DennisSchmitz opened this issue May 3, 2019 · 3 comments
Assignees
Labels
faq Frequently asked question

Comments

@DennisSchmitz
Owner

DennisSchmitz commented May 3, 2019

Please see the updated post below. In newer versions of the pipeline you should no longer get the error described in this first post.

There is a bug in the Viral_typing part of the pipeline that causes Jovian to stop prematurely and not finish its analysis. As a result, the Jovian Report does not show some tables and renders improperly.

The error message:

RuleException:

CalledProcessError in line 576 of /PATH/Snakefile:

Command 'source /mnt/miniconda/bin/activate '/PATH/.snakemake/conda/ccb14a27'; set -euo pipefail;  awk -F "\t" '$6 == "Norwalk virus" {print ">" $2 "\n" $24}' < data/tables/NAME_taxClassified.tsv 2> logs/Viral_typing_NAME.log 1> data/virus_typing_tables/NAME_NoV.fa

if [ -s "data/virus_typing_tables/NAME.fa" ]

then

    curl -s --data-urlencode fasta-sequence@data/virus_typing_tables/NAME_NoV.fa https://www.rivm.nl/mpf/typingservice/norovirus 2>> logs/Viral_typing_NAME.log 1> data/virus_typing_tables/NAME_NoV.xml

    python bin/typingtool_NoV_XML_to_csv_parser.py NAME data/virus_typing_tables/NAME_NoV.xml data/virus_typing_tables/NAME_NoV.csv 2>> logs/Viral_typing_NAME.log

else

    echo -e "No scaffolds with species == Norwalk Virus in sample:      NAME." >> logs/Viral_typing_Undetermined_S0.log

    touch data/virus_typing_tables/NAME.xml

    touch data/virus_typing_tables/NAME.csv

fi



awk -F "\t" '$8 == "Picornaviridae" {print ">" $2 "\n" $24}' < data/tables/NAME.tsv 2>> logs/Viral_typing_NAME.log 1> data/virus_typing_tables/NAME_EV.fa

if [ -s "data/virus_typing_tables/NAME_EV.fa" ]

then

    curl -s --data-urlencode fasta-sequence@data/virus_typing_tables/NAME_EV.fa https://www.rivm.nl/mpf/typingservice/enterovirus 2>> logs/Viral_typing_NAME.log 1> data/virus_typing_tables/NAME_EV.xml

    python bin/typingtool_EV_XML_to_csv_parser.py NAME data/virus_typing_tables/NAME_EV.xml data/virus_typing_tables/NAME_EV.csv 2>> logs/Viral_typing_NAME.log

else

    echo -e "No scaffolds with family == Picornaviridae in sample:       NAME." >> logs/Viral_typing_NAME.log

    touch data/virus_typing_tables/NAME_EV.xml

    touch data/virus_typing_tables/NAME_EV.csv

fi' returned non-zero exit status 1.

  File "/PATH/Snakefile", line 576, in __rule_Viral_typing

  File "/PATH/.conda/envs/Jovian_master/lib/python3.6/concurrent/futures/thread.py", line 56, in run

Removing output files of failed job Viral_typing since they might be corrupted:

data/virus_typing_tables/NAME_NoV.fa, data/virus_typing_tables/NAME_EV.fa, data/virus_typing_tables/NAME_NoV.xml, data/virus_typing_tables/NAME_EV.xml, data/virus_typing_tables/NAME_NoV.csv

The reason:

We are currently using the publicly available, web-based Norovirus, Hepatitis A and Enterovirus typingtools of Kroneman et al. 2011, hosted by the RIVM. These typingtools were originally intended for Sanger sequences.

We've found that they cannot keep up with the number of queries sent by the pipeline, especially now that more people are using Jovian. As a consequence of this increased popularity, the typingtool web-server becomes overloaded and crashes, which in turn results in the Jovian error shown above.

We are aware of this problem, but it is not trivial to solve and will take some time. In the meantime we are working on a short-term work-around (described below). For now, should you encounter this problem, please go through the following troubleshooting steps and let us know the results:

Troubleshooting:

Check if the typing tools are available by clicking here. Either the website loads, or you'll get a time-out or 404 error.

If you get a time-out or 404 error this means the web-service has crashed. Please contact the developers either via a GitHub issue or via mail and we will reboot the server ASAP.

If the link does work, the web-services are available and the error was most likely caused by a sporadic connection problem. Please try to run the pipeline again in ~5 minutes. If it still doesn't work, it is not the same issue, so please open a separate GitHub issue and describe what you did and what error message you received.
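
The troubleshooting steps above can also be approximated in a script. The following is a minimal sketch and not part of Jovian: it probes the norovirus endpoint that appears in the error message and maps the outcome onto the two cases described above. The names `classify` and `probe`, the 30-second time-out, and the choice to treat 404 and any 5xx status (such as a 502 proxy error) as "down" are assumptions made for illustration. Whether a plain GET on this endpoint behaves like the page linked above is not confirmed here.

```python
import urllib.error
import urllib.request

# Endpoint copied from the error message above.
TYPING_URL = "https://www.rivm.nl/mpf/typingservice/norovirus"


def classify(status):
    """Map a probe result onto the troubleshooting cases.

    `status` is an HTTP status code, or None when the request timed out
    or no connection could be made.
    """
    if status is None or status == 404 or status >= 500:
        return "down"       # time-out, 404 or server/proxy error: contact the developers
    return "available"      # the server answered: retry the pipeline in ~5 minutes


def probe(url=TYPING_URL, timeout=30):
    """Return the HTTP status code for `url`, or None on time-out or connection failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code     # the server answered with an error status
    except OSError:
        return None         # URLError and socket time-outs are OSError subclasses


# Example (performs a real network request):
#     print(classify(probe()))
```

If the result is "available" but the pipeline keeps failing, that matches the "not the same issue" case and warrants a separate GitHub issue.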

Short term solution/work-around:

The Viral_typing rule will be removed from Jovian in v0.9.2 and included as a separate, on-demand script that can be started from within the report. The rationale is that not everyone wants the NoV, HAV and EV typing results anyway, so this reduces the load on the web-services. Those who do want the results can activate typing on demand via the Jovian report. The same problem can still occur there, but at least the rest of the pipeline is unaffected: it will finish correctly and the Jovian Report will render properly.
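
The idea behind this work-around is that a failing typing request should not take the rest of the analysis down with it. As a rough illustration only, and independent of how Jovian actually implements it, an optional step can be run as a child process whose failure is reported rather than propagated. `run_optional` is a hypothetical helper, not a Jovian function.

```python
import subprocess
import sys


def run_optional(command):
    """Run `command` (a list of arguments); return True on success, False on failure.

    A non-zero exit status or a missing executable is reported on stderr
    instead of raising, so the caller can carry on with the remaining steps.
    """
    try:
        subprocess.run(command, check=True)
    except (subprocess.CalledProcessError, OSError) as err:
        print(f"optional step failed, continuing: {err}", file=sys.stderr)
        return False
    return True
```

The caller can use the return value to decide whether to show the typing tables or a notice that typing was skipped.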

Long term solution:

  1. We are working on the efficiency of the typing tool services, their queuing capabilities and the hardware. This should stop the web-services from crashing.
  2. Further in the future we hope to be able to package the typing software within the pipeline, but this is currently not possible.
@DennisSchmitz DennisSchmitz added bug Something isn't working faq Frequently asked question labels May 3, 2019
@DennisSchmitz DennisSchmitz added this to the v0.9.2 milestone May 3, 2019
@DennisSchmitz DennisSchmitz self-assigned this May 3, 2019
@DennisSchmitz
Owner Author

DennisSchmitz commented May 9, 2019

Workaround is implemented in version V0.9.2 (1ac0e94).

The typingtool processes have been removed from the main Jovian analysis and can now be run with:
bash jovian --virus-typing [NoV|EV|HAV]
N.B. this only works if a Jovian analysis has already been performed, otherwise you'll get an error.
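
To avoid that error, you could check beforehand whether an analysis has produced output. The sketch below assumes that the `data/tables/NAME_taxClassified.tsv` files seen in the error message of the first post are a reasonable sign of a finished analysis. That is an assumption, not something Jovian itself checks this way, and `has_classified_tables` is a hypothetical helper.

```python
from pathlib import Path


def has_classified_tables(workdir="."):
    """Return True if at least one non-empty *_taxClassified.tsv exists under data/tables."""
    tables = Path(workdir, "data", "tables").glob("*_taxClassified.tsv")
    return any(table.stat().st_size > 0 for table in tables)
```

Run it from the Jovian working directory, or pass that directory as `workdir`, before starting the typing step.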

We kindly ask you to use these services sparingly lest the servers become overloaded and crash. These services are intended for clinical and public health applications that require sub-species level taxonomic classification, e.g. for outbreak tracing with accurate metadata.

We are working on long-term fixes that would allow automated virus typing, but we have no ETA yet.

@DennisSchmitz DennisSchmitz removed the bug Something isn't working label May 9, 2019
@DennisSchmitz DennisSchmitz removed this from the v0.9.2 milestone May 17, 2019
@DennisSchmitz
Owner Author

Linked to #51, small update.

Apparently someone used a bot to continuously send queries... Hence all the proxy errors and overloading. That should be fixed now.

In the meantime, an improved version of the typingtools is online and being tested. Once testing is successful, it will replace the old version, which should make the service much faster and more stable.

@DennisSchmitz
Owner Author

Just an update: the new typing tool web-services are being tested and hopefully will come online after the holiday season.
