Program always run #855

Closed
liukeweiaway opened this issue Jul 29, 2024 · 3 comments

liukeweiaway commented Jul 29, 2024

Have you checked the FAQ? https://github.com/google/deepvariant/blob/r1.6.1/docs/FAQ.md: YES

Describe the issue:

  1. When the input reads (FASTQ) are identical to the reference sequence, the program keeps running and never exits.
  2. The reads were generated with a simulation tool (dwgsim).

Setup

  • Operating system: Red Hat Enterprise Linux release 8.6 (Ootpa)
  • DeepVariant version: deepvariant1.6.0.sif
  • Installation method (Docker, built from source, etc.): singularity
  • Type of data: (sequencing instrument, reference genome, anything special that is unlike the case studies?)
    GRCh38; reads simulated with dwgsim (length 8-9 kb, 150 bp, paired-end); the read sequences may be identical to the reference sequence.

Steps to reproduce:

  • Command:
    time singularity run ~/singularity/deepvariant.simg \
      /opt/deepvariant/bin/run_deepvariant \
      --model_type WES \
      --ref "${ref}" \
      --reads "${bamSavePath}/${name}.sorted.bam" \
      --output_vcf "${vcf}" \
      --output_gvcf "${outputPath}/vcf/${name}/${name}.g.vcf.gz" \
      --num_shards "$(nproc)" \
      --regions "${BED}" \
      --sample_name "${name}" \
      --make_examples_extra_args="min_mapping_quality=1,keep_legacy_allele_counter_behavior=true,normalize_reads=true"

  • Error trace: (if applicable)
    I0729 14:44:37.339473 140223721211712 make_examples_core.py:301] Task 0/4: Preparing inputs
    I0729 14:44:37.339473 140478861559616 make_examples_core.py:301] Task 3/4: Preparing inputs
    I0729 14:44:37.350302 140710547908416 make_examples_core.py:301] Task 1/4: Preparing inputs
    I0729 14:44:37.339477 139779121772352 make_examples_core.py:301] Task 2/4: Preparing inputs
    I0729 14:44:37.476220 140223721211712 genomics_reader.py:222] Reading result/simulate_A/bam/chr6_CYP21A2_CYP21A1P__A_Q_8845_R_8585/chr6_CYP21A2_CYP21A1P__A_Q_8845_R_8585.sorted.bam with NativeSamReader
    I0729 14:44:37.485832 140223721211712 make_examples_core.py:301] Task 0/4: Common contigs are ['chr1', 'chr2', 'chr3', 'chr4', 'chr5', 'chr6', 'chr7', 'chr8', 'chr9', 'chr10', 'chr11', 'chr12', 'chr13', 'chr14', 'chr15', 'chr16', 'chr17', 'chr18', 'chr19', 'chr20', 'chr21', 'chr22', 'chrX', 'chrY', 'chrM']
    I0729 14:44:37.533100 140223721211712 genomics_reader.py:222] Reading /lustre/home/acct-medfzx/medfzx-lkw/project/CAH/data/BED/cah_noname.bed with NativeBedReader
    I0729 14:44:37.541654 140223721211712 make_examples_core.py:301] Task 0/4: Starting from v0.9.0, --use_ref_for_cram is default to true. If you are using CRAM input, note that we will decode CRAM using the reference you passed in with --ref
    I0729 14:44:37.543606 140223721211712 genomics_reader.py:222] Reading result/simulate_A/bam/chr6_CYP21A2_CYP21A1P__A_Q_8845_R_8585/chr6_CYP21A2_CYP21A1P__A_Q_8845_R_8585.sorted.bam with NativeSamReader
    I0729 14:44:37.620779 140223721211712 genomics_reader.py:222] Reading result/simulate_A/bam/chr6_CYP21A2_CYP21A1P__A_Q_8845_R_8585/chr6_CYP21A2_CYP21A1P__A_Q_8845_R_8585.sorted.bam with NativeSamReader
    I0729 14:44:37.620978 140223721211712 make_examples_core.py:301] Task 0/4: Writing gvcf records to /tmp/tmpkcjcf0p_/gvcf.tfrecord-00000-of-00004.gz
    I0729 14:44:37.621363 140223721211712 make_examples_core.py:301] Task 0/4: Writing examples to /tmp/tmpkcjcf0p_/make_examples.tfrecord-00000-of-00004.gz
    I0729 14:44:37.621420 140223721211712 make_examples_core.py:301] Task 0/4: Overhead for preparing inputs: 0 seconds
    I0729 14:44:37.652796 140223721211712 make_examples_core.py:301] Task 0/4: 0 candidates (0 examples) [0.03s elapsed]
    I0729 14:44:37.476214 140478861559616 genomics_reader.py:222] Reading result/simulate_A/bam/chr6_CYP21A2_CYP21A1P__A_Q_8845_R_8585/chr6_CYP21A2_CYP21A1P__A_Q_8845_R_8585.sorted.bam with NativeSamReader
    I0729 14:44:37.485823 140478861559616 make_examples_core.py:301] Task 3/4: Common contigs are ['chr1', 'chr2', 'chr3', 'chr4', 'chr5', 'chr6', 'chr7', 'chr8', 'chr9', 'chr10', 'chr11', 'chr12', 'chr13', 'chr14', 'chr15', 'chr16', 'chr17', 'chr18', 'chr19', 'chr20', 'chr21', 'chr22', 'chrX', 'chrY', 'chrM']
    I0729 14:44:37.533152 140478861559616 genomics_reader.py:222] Reading /lustre/home/acct-medfzx/medfzx-lkw/project/CAH/data/BED/cah_noname.bed with NativeBedReader
    I0729 14:44:37.541796 140478861559616 make_examples_core.py:301] Task 3/4: Starting from v0.9.0, --use_ref_for_cram is default to true. If you are using CRAM input, note that we will decode CRAM using the reference you passed in with --ref
    I0729 14:44:37.543756 140478861559616 genomics_reader.py:222] Reading result/simulate_A/bam/chr6_CYP21A2_CYP21A1P__A_Q_8845_R_8585/chr6_CYP21A2_CYP21A1P__A_Q_8845_R_8585.sorted.bam with NativeSamReader
    I0729 14:44:37.619259 140478861559616 genomics_reader.py:222] Reading result/simulate_A/bam/chr6_CYP21A2_CYP21A1P__A_Q_8845_R_8585/chr6_CYP21A2_CYP21A1P__A_Q_8845_R_8585.sorted.bam with NativeSamReader
    I0729 14:44:37.619429 140478861559616 make_examples_core.py:301] Task 3/4: Writing gvcf records to /tmp/tmpkcjcf0p_/gvcf.tfrecord-00003-of-00004.gz
    I0729 14:44:37.619812 140478861559616 make_examples_core.py:301] Task 3/4: Writing examples to /tmp/tmpkcjcf0p_/make_examples.tfrecord-00003-of-00004.gz
    I0729 14:44:37.619867 140478861559616 make_examples_core.py:301] Task 3/4: Overhead for preparing inputs: 0 seconds
    I0729 14:44:37.653389 140478861559616 make_examples_core.py:301] Task 3/4: 0 candidates (0 examples) [0.03s elapsed]
    I0729 14:44:37.476402 140710547908416 genomics_reader.py:222] Reading result/simulate_A/bam/chr6_CYP21A2_CYP21A1P__A_Q_8845_R_8585/chr6_CYP21A2_CYP21A1P__A_Q_8845_R_8585.sorted.bam with NativeSamReader
    I0729 14:44:37.486019 140710547908416 make_examples_core.py:301] Task 1/4: Common contigs are ['chr1', 'chr2', 'chr3', 'chr4', 'chr5', 'chr6', 'chr7', 'chr8', 'chr9', 'chr10', 'chr11', 'chr12', 'chr13', 'chr14', 'chr15', 'chr16', 'chr17', 'chr18', 'chr19', 'chr20', 'chr21', 'chr22', 'chrX', 'chrY', 'chrM']
    I0729 14:44:37.533221 140710547908416 genomics_reader.py:222] Reading /lustre/home/acct-medfzx/medfzx-lkw/project/CAH/data/BED/cah_noname.bed with NativeBedReader
    I0729 14:44:37.541869 140710547908416 make_examples_core.py:301] Task 1/4: Starting from v0.9.0, --use_ref_for_cram is default to true. If you are using CRAM input, note that we will decode CRAM using the reference you passed in with --ref
    I0729 14:44:37.543881 140710547908416 genomics_reader.py:222] Reading result/simulate_A/bam/chr6_CYP21A2_CYP21A1P__A_Q_8845_R_8585/chr6_CYP21A2_CYP21A1P__A_Q_8845_R_8585.sorted.bam with NativeSamReader
    I0729 14:44:37.619640 140710547908416 genomics_reader.py:222] Reading result/simulate_A/bam/chr6_CYP21A2_CYP21A1P__A_Q_8845_R_8585/chr6_CYP21A2_CYP21A1P__A_Q_8845_R_8585.sorted.bam with NativeSamReader
    I0729 14:44:37.619808 140710547908416 make_examples_core.py:301] Task 1/4: Writing gvcf records to /tmp/tmpkcjcf0p_/gvcf.tfrecord-00001-of-00004.gz
    I0729 14:44:37.620180 140710547908416 make_examples_core.py:301] Task 1/4: Writing examples to /tmp/tmpkcjcf0p_/make_examples.tfrecord-00001-of-00004.gz
    I0729 14:44:37.620236 140710547908416 make_examples_core.py:301] Task 1/4: Overhead for preparing inputs: 0 seconds
    I0729 14:44:37.652904 140710547908416 make_examples_core.py:301] Task 1/4: 0 candidates (0 examples) [0.03s elapsed]
    I0729 14:44:37.476840 139779121772352 genomics_reader.py:222] Reading result/simulate_A/bam/chr6_CYP21A2_CYP21A1P__A_Q_8845_R_8585/chr6_CYP21A2_CYP21A1P__A_Q_8845_R_8585.sorted.bam with NativeSamReader
    I0729 14:44:37.486504 139779121772352 make_examples_core.py:301] Task 2/4: Common contigs are ['chr1', 'chr2', 'chr3', 'chr4', 'chr5', 'chr6', 'chr7', 'chr8', 'chr9', 'chr10', 'chr11', 'chr12', 'chr13', 'chr14', 'chr15', 'chr16', 'chr17', 'chr18', 'chr19', 'chr20', 'chr21', 'chr22', 'chrX', 'chrY', 'chrM']
    I0729 14:44:37.533215 139779121772352 genomics_reader.py:222] Reading /lustre/home/acct-medfzx/medfzx-lkw/project/CAH/data/BED/cah_noname.bed with NativeBedReader
    I0729 14:44:37.541668 139779121772352 make_examples_core.py:301] Task 2/4: Starting from v0.9.0, --use_ref_for_cram is default to true. If you are using CRAM input, note that we will decode CRAM using the reference you passed in with --ref
    I0729 14:44:37.543754 139779121772352 genomics_reader.py:222] Reading result/simulate_A/bam/chr6_CYP21A2_CYP21A1P__A_Q_8845_R_8585/chr6_CYP21A2_CYP21A1P__A_Q_8845_R_8585.sorted.bam with NativeSamReader
    I0729 14:44:37.624524 139779121772352 genomics_reader.py:222] Reading result/simulate_A/bam/chr6_CYP21A2_CYP21A1P__A_Q_8845_R_8585/chr6_CYP21A2_CYP21A1P__A_Q_8845_R_8585.sorted.bam with NativeSamReader
    I0729 14:44:37.624701 139779121772352 make_examples_core.py:301] Task 2/4: Writing gvcf records to /tmp/tmpkcjcf0p_/gvcf.tfrecord-00002-of-00004.gz
    I0729 14:44:37.625109 139779121772352 make_examples_core.py:301] Task 2/4: Writing examples to /tmp/tmpkcjcf0p_/make_examples.tfrecord-00002-of-00004.gz
    I0729 14:44:37.625170 139779121772352 make_examples_core.py:301] Task 2/4: Overhead for preparing inputs: 0 seconds
    I0729 14:44:37.653617 139779121772352 make_examples_core.py:301] Task 2/4: 0 candidates (0 examples) [0.03s elapsed]
    I0729 14:44:37.898170 140223721211712 make_examples_core.py:301] Task 0/4: Writing example info to /tmp/tmpkcjcf0p_/make_examples.tfrecord-00000-of-00004.gz.example_info.json
    I0729 14:44:37.898274 140223721211712 make_examples_core.py:2958] example_shape = None
    I0729 14:44:37.898649 140223721211712 make_examples_core.py:2959] example_channels = [1, 2, 3, 4, 5, 6, 19]
    I0729 14:44:37.899018 140223721211712 make_examples_core.py:301] Task 0/4: Found 0 candidate variants
    I0729 14:44:37.899082 140223721211712 make_examples_core.py:301] Task 0/4: Created 0 examples
    I0729 14:44:37.898826 140478861559616 make_examples_core.py:301] Task 3/4: Writing example info to /tmp/tmpkcjcf0p_/make_examples.tfrecord-00003-of-00004.gz.example_info.json
    I0729 14:44:37.898957 140478861559616 make_examples_core.py:2958] example_shape = None
    I0729 14:44:37.899326 140478861559616 make_examples_core.py:2959] example_channels = [1, 2, 3, 4, 5, 6, 19]
    I0729 14:44:37.899663 140478861559616 make_examples_core.py:301] Task 3/4: Found 0 candidate variants
    I0729 14:44:37.899726 140478861559616 make_examples_core.py:301] Task 3/4: Created 0 examples
    I0729 14:44:37.898853 140710547908416 make_examples_core.py:301] Task 1/4: Writing example info to /tmp/tmpkcjcf0p_/make_examples.tfrecord-00001-of-00004.gz.example_info.json
    I0729 14:44:37.898977 140710547908416 make_examples_core.py:2958] example_shape = None
    I0729 14:44:37.899351 140710547908416 make_examples_core.py:2959] example_channels = [1, 2, 3, 4, 5, 6, 19]
    I0729 14:44:37.899687 140710547908416 make_examples_core.py:301] Task 1/4: Found 0 candidate variants
    I0729 14:44:37.899752 140710547908416 make_examples_core.py:301] Task 1/4: Created 0 examples
    I0729 14:44:37.893192 139779121772352 make_examples_core.py:301] Task 2/4: Writing example info to /tmp/tmpkcjcf0p_/make_examples.tfrecord-00002-of-00004.gz.example_info.json
    I0729 14:44:37.893293 139779121772352 make_examples_core.py:2958] example_shape = None
    I0729 14:44:37.893665 139779121772352 make_examples_core.py:2959] example_channels = [1, 2, 3, 4, 5, 6, 19]
    I0729 14:44:37.894033 139779121772352 make_examples_core.py:301] Task 2/4: Found 0 candidate variants
    I0729 14:44:37.894105 139779121772352 make_examples_core.py:301] Task 2/4: Created 0 examples

real 0m4.791s
user 0m11.503s
sys 0m2.085s

***** Running the command:*****
time /opt/deepvariant/bin/call_variants --outfile "/tmp/tmpkcjcf0p_/call_variants_output.tfrecord.gz" --examples "/tmp/tmpkcjcf0p_/make_examples.tfrecord@4.gz" --checkpoint "/opt/models/wes"

/usr/local/lib/python3.8/dist-packages/tensorflow_addons/utils/tfa_eol_msg.py:23: UserWarning:

TensorFlow Addons (TFA) has ended development and introduction of new features.
TFA has entered a minimal maintenance and release mode until a planned end of life in May 2024.
Please modify downstream libraries to take dependencies from other repositories in our TensorFlow community (e.g. Keras, Keras-CV, and Keras-NLP).

For more information see: tensorflow/addons#2807

warnings.warn(
I0729 14:44:41.088234 139722246891328 call_variants.py:471] Total 1 writing processes started.
W0729 14:44:41.090612 139722246891328 call_variants.py:482] Unable to read any records from /tmp/tmpkcjcf0p_/make_examples.tfrecord@4.gz. Output will contain zero records.
I0729 14:44:41.091079 139722246891328 call_variants.py:623] Complete: call_variants.

Does the quick start test work on your system?
yes

Any additional context:
Some samples work fine, but some very similar samples keep running.

@kishwarshafin
Collaborator

@liukeweiaway are these human samples? It looks like the program is running fine; it's just not finding any variants. Can you please explain a bit more about what exactly your data is?
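
One way to confirm this reading of the log is to total the per-task "Found N candidate variants" lines that make_examples prints. The following is only a sketch: the helper name `count_candidates` is hypothetical, and it assumes the log was saved to a file and keeps the line format shown in the trace above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: sum the N in every "Found N candidate variants" line
# of a saved make_examples log. Prints 0 when no such line carries a count.
count_candidates() {
  awk '/Found [0-9]+ candidate variants/ {
         # The count is the field right after the word "Found".
         for (i = 1; i <= NF; i++) if ($i == "Found") total += $(i + 1)
       }
       END { print total + 0 }' "$1"
}

# Example using two lines copied from the log in this issue.
log="$(mktemp)"
cat > "$log" <<'EOF'
I0729 14:44:37.899018 140223721211712 make_examples_core.py:301] Task 0/4: Found 0 candidate variants
I0729 14:44:37.899663 140478861559616 make_examples_core.py:301] Task 3/4: Found 0 candidate variants
EOF
count_candidates "$log"
```

A total of 0 across all tasks matches the later call_variants warning that the output will contain zero records.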

@liukeweiaway
Author

liukeweiaway commented Jul 30, 2024

> @liukeweiaway are these human samples? Looks like the program is running fine, it's just not finding any variants. Can you please explain a bit more to what exactly is your data?

It is a human sample, and the generated reads are identical to the reference genome. When no variant is detected, the program does not stop; it keeps running until it is stopped manually.
chr6_CYP21A2.bwa.read1.fastq.gz
chr6_CYP21A2.bwa.read2.fastq.gz

@kishwarshafin
Collaborator

@liukeweiaway,

I see. Can you please update to 1.6.1? I think you are hitting a bug in 1.6.0 that we fixed in 1.6.1.
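
For a Singularity setup like the one in this issue, the upgrade could look roughly like the sketch below. It assumes the image is pulled from Docker Hub as `google/deepvariant` with a `1.6.1` tag; the local file name and the `DO_PULL` switch are hypothetical and only there so the command can be inspected before anything is downloaded.

```shell
#!/usr/bin/env bash
# Assumptions: image published as docker://google/deepvariant:<version>;
# the local .sif file name is a hypothetical choice.
BIN_VERSION="1.6.1"
SIF="deepvariant_${BIN_VERSION}.sif"
PULL_CMD="singularity pull ${SIF} docker://google/deepvariant:${BIN_VERSION}"

# Print the command first; run it only when DO_PULL=1 is set.
echo "${PULL_CMD}"
if [ "${DO_PULL:-0}" = "1" ]; then
  ${PULL_CMD}
fi
```

After the pull, the `singularity run` line in the original command would point at the new image file instead of `~/singularity/deepvariant.simg`.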
