Crash during vcf:annotate step on VEP plugins ncER and ReMM #658

Open
SergeWielhouwer opened this issue Dec 6, 2024 · 6 comments

@SergeWielhouwer

Describe the bug
Hi, thanks for developing VIP. I am currently experiencing issues running the VCF workflow on a HG002 test sample. During the annotation step, VEP crashes with errors for various invalid variant lines (some examples below).
log_dump.txt

WARNING: 3870278 : WARNING: Plugin 'ncER' went wrong: ERROR: Expecting no more than one score for a position.
WARNING: Plugin 'ncER' went wrong: ERROR: Expecting no more than one score for a position.
WARNING: Plugin 'ReMM' went wrong: ERROR: Expecting no more than one score for a position.
Using sed, I could not pinpoint an invalid VCF line. I wasn't quite sure what the input file was; I assume GIAB_HG002_chunk_0_normalized.vcf.gz in this case.

Any idea what could potentially be causing this? Duplicated variant lines? Pre-existing INFO fields?
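One way to test the duplicated-lines hypothesis is to count records that share the same CHROM, POS, REF and ALT. The sketch below is a generic diagnostic, not part of VIP or VEP; the function names are hypothetical, and it assumes a tab-separated VCF that fits in memory as a set of keys.

```python
import gzip
from collections import Counter
from typing import Dict, Iterable, TextIO, Tuple

VariantKey = Tuple[str, int, str, str]


def open_vcf(path: str) -> TextIO:
    """Open a plain-text or gzip/bgzip-compressed VCF in text mode."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "rt")


def duplicated_variants(lines: Iterable[str]) -> Dict[VariantKey, int]:
    """Return {(chrom, pos, ref, alt): count} for records seen more than once."""
    counts: Counter = Counter()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header lines and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # not a complete variant record
        chrom, pos, _id, ref, alt = fields[:5]
        counts[(chrom, int(pos), ref, alt)] += 1
    return {key: n for key, n in counts.items() if n > 1}
```

Usage would look like `with open_vcf("GIAB_HG002_chunk_0_normalized.vcf.gz") as fh: print(duplicated_variants(fh))`, assuming that file really is the input of the annotation step. An empty result would rule out exact duplicates in the input.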

To Reproduce
Run VIP v8.0.0 as ./vip -w vcf -i samplesheet2.tsv -o output -p slurm on a VCF generated with clair3 v1.0.10 (on GCA_000001405.15_GRCh38_no_alt_analysis_set).

input VCF lines beneath header
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT SAMPLE
chr1 10177 . A C 0.57 LowQual F GT:GQ:DP:AD:AF 1/1:0:60:1,29:0.4833
chr1 10247 . TA T 0.44 LowQual F GT:GQ:DP:AD:AF 0/1:0:60:36,23:0.3833
etc...

SampleSheet
project_id individual_id vcf sequencing_method assembly
GIAB_HG002 SAMPLE /mnt/flashblade01/scratch/s.wielhouwer/vip_testing/PAO89685_90Gbp.snp.vcf.gz WGS GRCh38

We use Debian GNU/Linux 11 (bullseye)

Should you need more info, please let me know. Also, can users create custom hg19/hg38 configs? Such as with alt contigs and decoys included? VIP seems to only accept GCA_000001405.15_GRCh38_no_alt_analysis_set.

@dennishendriksen
Contributor

Hello @SergeWielhouwer,

We are aware of the annotation warnings in your bug report; they are harmless. Your issue is likely related to this log message:

 -- Check '/mnt/flashblade01/scratch/s.wielhouwer/vip_testing/output/.nxf.log' file for details
ERROR ~ Unexpected error [InvocationTargetException]

 -- Check script '/mnt/shared/tools/vip-8.0.0/./modules/vcf/./utils.nf' at line: 91 or see '/mnt/flashblade01/scratch/s.wielhouwer/vip_testing/output/.nxf.log' file for more details
ERROR ~ Unexpected error [InvocationTargetException]

Can you share .nxf.log for further investigation?

> Also, can users create custom hg19/hg38 configs? Such as with alt contigs and decoys included? VIP seems to only accept GCA_000001405.15_GRCh38_no_alt_analysis_set.

Yes, you can provide additional configuration to VIP, including parameters to specify your reference genome, see here. Note that for GRCh37 and T2T a liftover to GRCh38 is performed, and that some annotations might not be included for other GRCh38 flavors.

Best regards,
@dennishendriksen

@SergeWielhouwer
Author

SergeWielhouwer commented Dec 9, 2024

Hi Dennis,

Thanks for looking into this.

Please find the log file attached below:
nxf.log

I will try out creating a custom hg38 reference later, thanks for providing me with the link.

Best regards,

Serge

@dennishendriksen
Contributor

From your log file: Caused by: java.lang.RuntimeException: error: expected group size '15' differs from actual group size '9'. this might indicate a bug in the software. To pinpoint the issue could you provide PAO89685_90Gbp.snp.vcf.gz as well? If you cannot share the data publicly then please provide (a link to) the data via d.hendriksen@umcg.nl.

Did the error happen to occur during a pipeline resume or on an initial run?
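For readers who want to find the root cause in their own .nxf.log before sharing it, a small filter for `Caused by:` lines can help. This is a minimal sketch that assumes the log uses the Java stack-trace convention seen in the message quoted above; `root_causes` is a hypothetical helper, not a Nextflow or VIP tool.

```python
import re
from typing import Iterable, List

# Matches Java stack-trace lines such as "Caused by: java.lang.RuntimeException: ..."
CAUSED_BY = re.compile(r"^\s*Caused by:\s*(.+)$")


def root_causes(lines: Iterable[str]) -> List[str]:
    """Return the distinct 'Caused by:' messages in order of first appearance."""
    causes: List[str] = []
    for line in lines:
        match = CAUSED_BY.match(line)
        if match:
            message = match.group(1).strip()
            if message not in causes:
                causes.append(message)
    return causes
```

It can be applied to a log file with `with open(".nxf.log") as fh: print("\n".join(root_causes(fh)))`.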

@SergeWielhouwer
Author

SergeWielhouwer commented Dec 9, 2024

Hi Dennis,

The error occurred both during an initial run and when using -resume.
I have sent you the link containing the samplesheet and vcf.gz/vcf.gzi.tbi by mail.

Serge

@dennishendriksen
Contributor

Hi Serge,

I wasn't able to reproduce your issue; the pipeline run finished successfully. I'll share the results by mail.
We've implemented the safeguard in the pipeline that resulted in your error message to ensure data is properly processed. I suspect you are hitting a resume/caching issue in the underlying Nextflow workflow manager that was triggered in the resume.
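To illustrate what a group-size safeguard of this kind does, here is a minimal Python sketch. It is not VIP's implementation (that lives in the Nextflow code under modules/vcf/utils.nf); the function name, the item layout and the `expected_sizes` mapping are all assumptions made for illustration. Only the wording of the error message is taken from the log quoted earlier in this thread.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def group_with_size_check(
    items: Iterable[Tuple[str, int, str]],
    expected_sizes: Dict[str, int],
) -> Dict[str, List[str]]:
    """Group (key, index, value) items by key and verify each group's size.

    Raises RuntimeError when a group has more or fewer members than expected,
    for example when some chunks of a sample never arrived.
    """
    groups: Dict[str, List[Tuple[int, str]]] = defaultdict(list)
    for key, index, value in items:
        groups[key].append((index, value))

    result: Dict[str, List[str]] = {}
    for key, expected in expected_sizes.items():
        members = groups.get(key, [])
        if len(members) != expected:
            raise RuntimeError(
                f"error: expected group size '{expected}' differs from "
                f"actual group size '{len(members)}'"
            )
        # Restore the original chunk order before handing the group on.
        result[key] = [value for _, value in sorted(members)]
    return result
```

In this sketch, a sample split into 15 chunks of which only 9 reach the grouping step would fail loudly instead of silently producing an incomplete merged result.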

Do you happen to still have the .nxf.log of your initial run? My hypothesis is that your initial Slurm run failed due to, e.g., a walltime-exceeded error.

Greetings,
@dennishendriksen

@SergeWielhouwer
Author

Hi Dennis,

I unfortunately do not have the initial .nxf.log file anymore. I will try to grant the SLURM job additional wall time and memory and run again from scratch. I'll also try to run it locally on one of our execution nodes to see whether that changes anything. I should mention that we're still using Singularity instead of Apptainer, so I modified the configurations (e.g., environment variables and exec commands) in the installation directory. This might have caused some unexpected behaviour, even though the initial rules ran successfully. We're working on the migration but need a bit more time to complete the transition.

Kind regards,

Serge
