Unexpected calls to www.ebi.ac.uk when using cram files #180
Comments
This is an unfortunate side effect of CRAM. The good news is that this lookup can be avoided through a "reference cache"; see the section on REF_PATH and REF_CACHE here: https://www.htslib.org/workflow/cram.html.
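For anyone wrapping the call from a script, here is a minimal sketch of the reference-cache idea, assuming the local cache directory has already been populated (for example with the `seq_cache_populate.pl` script that ships with samtools). The function name `offline_cram_env` and the cache path are hypothetical; only the `REF_PATH` and `REF_CACHE` variable names and the `%2s/%2s/%s` pattern come from the htslib page linked above. Leaving the EBI URL out of `REF_PATH` is meant to keep htslib from going to the network, but please verify that on your own setup.

```python
import os


def offline_cram_env(cache_root, base_env=None):
    """Return an environment dict whose REF_PATH/REF_CACHE use only a local cache.

    cache_root is a hypothetical directory that already holds MD5-named
    reference sequences in the %2s/%2s/%s layout described by htslib.
    """
    env = dict(os.environ if base_env is None else base_env)
    pattern = cache_root.rstrip("/") + "/%2s/%2s/%s"
    env["REF_CACHE"] = pattern
    # Assumption: with no EBI URL listed here, htslib has no remote source to try.
    env["REF_PATH"] = pattern
    return env
```

The returned dict can be passed as `env=` to `subprocess.run(...)` when launching the pipeline, so the setting applies to that run only and does not touch the shell profile.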
@bartcharbon Based on our local testing, it seems that setting the [...] Additionally, we noticed that errors connecting to EBI can increase the processing time but do not affect the final result. You can therefore use the results of completed jobs directly, without re-running them, to save time.
Indeed, it's just fetching the reference sequence from EBI using the checksum in the CRAM SQ lines.
Fixed in v1.0.6.
We use clair3 for our variant calling and provide it with our own reference FASTA file.
We are seeing some unexpected calls to https://www.ebi.ac.uk when we use CRAM files as input.
From the log:
Calling variants ...
Our hypothesis is that the samtools call used to read the CRAM files is not given the reference file (-T argument), and therefore falls back to www.ebi.ac.uk.
Besides the fact that we would like our pipeline to be self-contained and not dependent on external servers, this can also lead to quite severe performance losses if the connection to www.ebi.ac.uk is slow.
Currently we work around the issue by converting our CRAMs to BAMs and using those for clair3.
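As an illustration of that workaround, the conversion step can be sketched roughly as follows. The helper name `cram_to_bam_cmd` and the file names are hypothetical; `-b`, `-T` and `-o` are standard `samtools view` options. Passing the reference explicitly with `-T` is what should keep this step from needing a remote lookup, assuming the FASTA matches the one the CRAM was encoded against.

```python
import subprocess


def cram_to_bam_cmd(cram_path, bam_path, reference_fasta):
    """Build a samtools command that decodes a CRAM to BAM with a local reference."""
    return [
        "samtools", "view",
        "-b",                    # write BAM output
        "-T", reference_fasta,   # local reference FASTA used to decode the CRAM
        "-o", bam_path,
        cram_path,
    ]


def convert(cram_path, bam_path, reference_fasta):
    """Run the conversion; raises CalledProcessError if samtools fails."""
    subprocess.run(cram_to_bam_cmd(cram_path, bam_path, reference_fasta), check=True)
```

The resulting BAM still needs an index (for example via `samtools index`) before it is handed to clair3.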