Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error due to undesirable use of an alternative contig of the reference genome #14

Closed
Tintest opened this issue Apr 8, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@Tintest
Copy link

Tintest commented Apr 8, 2024

Hello,

I encountered an error related to the presence of an alternative contig on which Spectre wanted to make the CNV call, whereas I only wanted to make the call on the canonical contigs.

Even though I had specified the contigs I wanted, with the --only-chr argument and the *.regions.bed.gz files were generated specifically on the canonical contigs, Spectre persists in wanting to make the call on the contig HSCHR16_RANDOM_CTG1 present in the reference genome that I am using.

singularity exec --bind /srv:/srv spectre.0.2.0.sif spectre CNVCaller \
--coverage mosdepth/Sample1.regions.bed.gz \
--sample-id Sample1 \
--output-dir results \
--blacklist grch38_blacklist_spectre.bed \
--only-chr chr1,chr2,chr3,chr4,chr5,chr6,chr7,chr8,chr9,chr10,chr11,chr12,chr13,chr14,chr15,chr16,chr17,chr18,chr19,chr20,chr21,chr22,chrX,chrY \
-0 \
--reference /srv/nfs/ngs-stockage/NGS_Bioinfo/Nanopore/HCL_SNP_ONT/REF/GRCh38.p13.primary_assembly.decoy.fa

testardqu@chu-lyon.fr@ge156378-vm1:~/testardqu/tickets/28452/Spectre$ singularity exec --bind /srv:/srv spectre.0.2.0.sif spectre CNVCaller \
> --coverage mosdepth/Sample1.regions.bed.gz \
> --sample-id Sample1 \
> --output-dir results \
> --blacklist grch38_blacklist_spectre.bed \
> --only-chr chr1,chr2,chr3,chr4,chr5,chr6,chr7,chr8,chr9,chr10,chr11,chr12,chr13,chr14,chr15,chr16,chr17,chr18,chr19,chr20,chr21,chr22,chrX,chrY \
> -0 \
> --reference /srv/nfs/ngs-stockage/NGS_Bioinfo/Nanopore/HCL_SNP_ONT/REF/GRCh38.p13.primary_assembly.decoy.fa
spectre::2024-04-04 17:30:22,922::INFO::spectre.spectre>  Spectre input: CNVCaller --coverage mosdepth/Sample1.regions.bed.gz --sample-id Sample1 --output-dir results --blacklist grch38_blacklist_spectre.bed --only-chr chr1,chr2,chr3,chr4,chr5,chr6,chr7,chr8,chr9,chr10,chr11,chr12,chr13,chr14,chr15,chr16,chr17,chr18,chr19,chr20,chr21,chr22,chrX,chrY -0 --reference /srv/nfs/ngs-stockage/NGS_Bioinfo/Nanopore/HCL_SNP_ONT/REF/GRCh38.p13.primary_assembly.decoy.fa
spectre::2024-04-04 17:30:22,922::DEBUG::spectre.spectre>  Debug output is enabled
spectre::2024-04-04 17:30:22,922::INFO::spectre.spectre>  Spectre version: 0.2.0
spectre::2024-04-04 17:30:22,922::INFO::spectre.spectre>  Starting spectre
spectre::2024-04-04 17:30:22,922::INFO::spectre.spectre>  Evaluate if a new .mdr file needs to be created
spectre::2024-04-04 17:30:22,923::INFO::spectre.spectre>  Looking for default metadata.mdr in output directory
spectre::2024-04-04 17:30:22,923::INFO::spectre.spectre>  Using existing metadata file
spectre::2024-04-04 17:30:22,929::DEBUG::spectre.spectre>  genome: ['chr1', 'chr2', 'chr3', 'chr4', 'chr5', 'chr6', 'chr7', 'chr8', 'chr9', 'chr10', 'chr11', 'chr12', 'chr13', 'chr14', 'chr15', 'chr16', 'chr17', 'chr18', 'chr19', 'chr20', 'chr21', 'chr22', 'chrX', 'chrY', 'HSCHR16_RANDOM_CTG1']
spectre::2024-04-04 17:30:22,929::INFO::spectre.spectreCNV>  Spectre calculating for: mosdepth/Sample1.regions.bed.gz
spectre::2024-04-04 17:30:22,929::INFO::spectre.spectreCNV>  Data normalization and outlier removal (right tail)
spectre::2024-04-04 17:30:22,930::DEBUG::spectre.analysis.analysis>  coverage file: mosdepth/Sample1.regions.bed.gz
[W::tbx_parse1] Coordinate <= 0 detected. Did you forget to use the -0 option?
spectre::2024-04-04 17:30:23,191::INFO::spectre.analysis.analysis>  Determined bin size from Mosdepth coverage: 1000
[W::tbx_parse1] Coordinate <= 0 detected. Did you forget to use the -0 option?
[W::tbx_parse1] Coordinate <= 0 detected. Did you forget to use the -0 option?
[W::tbx_parse1] Coordinate <= 0 detected. Did you forget to use the -0 option?
[W::tbx_parse1] Coordinate <= 0 detected. Did you forget to use the -0 option?
[W::tbx_parse1] Coordinate <= 0 detected. Did you forget to use the -0 option?
[W::tbx_parse1] Coordinate <= 0 detected. Did you forget to use the -0 option?
[W::tbx_parse1] Coordinate <= 0 detected. Did you forget to use the -0 option?
[W::tbx_parse1] Coordinate <= 0 detected. Did you forget to use the -0 option?
[W::tbx_parse1] Coordinate <= 0 detected. Did you forget to use the -0 option?
[W::tbx_parse1] Coordinate <= 0 detected. Did you forget to use the -0 option?
[W::tbx_parse1] Coordinate <= 0 detected. Did you forget to use the -0 option?
[W::tbx_parse1] Coordinate <= 0 detected. Did you forget to use the -0 option?
[W::tbx_parse1] Coordinate <= 0 detected. Did you forget to use the -0 option?
[W::tbx_parse1] Coordinate <= 0 detected. Did you forget to use the -0 option?
[W::tbx_parse1] Coordinate <= 0 detected. Did you forget to use the -0 option?
[W::tbx_parse1] Coordinate <= 0 detected. Did you forget to use the -0 option?
[W::tbx_parse1] Coordinate <= 0 detected. Did you forget to use the -0 option?
[W::tbx_parse1] Coordinate <= 0 detected. Did you forget to use the -0 option?
[W::tbx_parse1] Coordinate <= 0 detected. Did you forget to use the -0 option?
[W::tbx_parse1] Coordinate <= 0 detected. Did you forget to use the -0 option?
[W::tbx_parse1] Coordinate <= 0 detected. Did you forget to use the -0 option?
[W::tbx_parse1] Coordinate <= 0 detected. Did you forget to use the -0 option?
[W::tbx_parse1] Coordinate <= 0 detected. Did you forget to use the -0 option?
Traceback (most recent call last):
  File "/usr/local/bin/spectre", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/usr/local/lib/python3.11/site-packages/spectre/spectre.py", line 614, in main
    spectre_run.spectre_exe()
  File "/usr/local/lib/python3.11/site-packages/spectre/spectre.py", line 349, in spectre_exe
    spectre_main.cnv_call()
  File "/usr/local/lib/python3.11/site-packages/spectre/spectreCNV.py", line 87, in cnv_call
    self.cnv_analysis.data_normalization()
  File "/usr/local/lib/python3.11/site-packages/spectre/analysis/analysis.py", line 102, in data_normalization
    for tbx_line in coverage_file_tabix.fetch(reference_chromosome):
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "pysam/libctabix.pyx", line 507, in pysam.libctabix.TabixFile.fetch
ValueError: could not create iterator for region 'HSCHR16_RANDOM_CTG1'

I solved the problem by using the new reference genome produced by the following command :

samtools faidx /srv/nfs/ngs-stockage/NGS_Bioinfo/Nanopore/HCL_SNP_ONT/REF/GRCh38.p13.primary_assembly.decoy.fa chr1 chr2 chr3 chr4 chr5 chr6 chr7 chr8 chr9 chr10 chr11 chr12 chr13 chr14 chr15 chr16 chr17 chr18 chr19 chr20 chr21 chr22 chrX chrY | bgzip -c > GRCh38.p13.canonical.fa.gz
samtools faidx GRCh38.p13.canonical.fa.gz

I don't know if it's a bug or my fault, but I couldn't find any instructions on how to handle alternative contigs in the Spectre README.

Regards.

@philippesanio philippesanio added the bug Something isn't working label Apr 8, 2024
@philippesanio
Copy link
Collaborator

Hi @Tintest,

Thank you for raising this issue.
Spectre, was designed to only account for contigs which are present in both the reference sequence and the Mosdepth data.
Reason being, that you can use the included metadata file (.mdr) out of the box, which includes the same contigs you have specified.

However, I also see your point why it should have worked in the first place. I will look into that and come back to you.

philippesanio pushed a commit that referenced this issue Apr 12, 2024
# Fixed issues:
- #14 All contigs that are in the reference but now in coverage data are skipped, and a warning is issued. Furthermore, if no matching contigs are found, Spectre will terminate.
- #15 Due to obsolete code references, Spectre would crash if provided with the `--population` flag in the CNVCaller mode. This has been updated accordingly.
- #16  The flag `--ploidy-chr` required instructions on how to transform the chromosome coverage according to the global ploidy status. This has been changed only to provide the current ploidy state of the chromosome. Spectre will do the transformation calculation automatically now.
- #17 When providing multiple samples without the `--population` flag, results have only been properly outputted to the first sample. Providing multiple samples has only been attended to be used with the `--population` flag. Thus, only CNVs will only be calculated for the first provided sample if the `--population` flag is missing.

# Further updates:
- Fixed: SPC file generation removed all values that were default values according to its datatype.
- Fixed: Re-initialization of variant candidates after loading the SPC file.
- Fixed: Updated functions that used paths only used in the debug mode.
- Added: Two new flags for the population mode `--reciprocal-overlap` and `--disable-quality-filter`:
  - `--reciprocal-overlap` will allow you to change the reciprocal overlap value when looking for the supporting variants in other samples.
  - `--disable-quality-filter` will allow you to look for supporting variants in the variants that did not pass the final filter of Spectre in the CNVCaller mode. (Non-passing reasons of the variants could have been that the variant was too small to pass the minimum CNV length, etc.)
- Added: Pre-calculation checks, which ensure that the user provides proper values.
- Added: Instructions on how to install the latest changes of Spectre locally with pip.
- Updated: Search for supporting variants is primarily done on variants that passed the final filter criteria of the CNVCaller mode.
- Updated: The minimum required package versions.
- Updated: Help message.
- Updated: README.md
- Updated: Minor code changes

# Known issues:
- Population mode: If multiple samples contain the same variant, Spectre will try to find for all of them a supporting variant. This can lead to duplicated variant calls in the final output of the population mode. This will be subject to change in a future release.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants