This repository has been archived by the owner on Aug 23, 2024. It is now read-only.

ds-cgpwgs.pl -sp not carried over to cgpFlagCaVEMan.pl -s #50

Open
ym3 opened this issue Jan 21, 2020 · 2 comments

Comments

@ym3

ym3 commented Jan 21, 2020

dockstore-cgpwgs-2.1.0 falls over on mouse bam files mapped using dockstore-cgpmap-3.0.4 giving these errors:

No flagList found in flag.vcf.config.WGS.ini for section MUS MUSCULUS_WGS FLAGLIST. No flagging will be done. at /opt/wtsi-cgp/bin/cgpFlagCaVEMan.pl line 822.

No config found in flag.vcf.config.WGS.ini for section MUS MUSCULUS_WGS FLAGLIST at /opt/wtsi-cgp/bin/cgpFlagCaVEMan.pl line 829.

I am using mm10 reference bundle both for mapping and variant calling.

I am running it like this:
ds-cgpwgs.pl -sp mouse -as mm10
but as the BAM files have "SP:Mus musculus" in their @SQ header lines, this is carried over to
cgpFlagCaVEMan.pl -s 'Mus musculus'
which fails because the reference file flag.vcf.config.WGS.ini does not have the flagList or params for 'Mus musculus'.

The expected behaviour is that -sp from ds-cgpwgs.pl overrides SP: in the BAM headers, so the flagging would run like this:
cgpFlagCaVEMan.pl -s mouse

@keiranmraine
Contributor

Have you tried:

ds-cgpwgs.pl -sp 'Mus musculus' -as mm10...

I think this works, however I can't say for certain. The wrapper code rewrites the flag.vcf.config.WGS.ini based on what you provide on the wrapper script command line.

sub add_species_flag_ini {
  my ($species, $ini_in) = @_;
  $species =~ s/ /_/g;
  $species = uc $species;
  my $ini_out = $opts{'o'}.'/flag.vcf.config.WGS.ini';
  open my $IN, '<', $ini_in;
  open my $OUT,'>',$ini_out;
  while(my $line = <$IN>) {
    $line =~ s/^\[HUMAN_/[${species}_/;
    print $OUT $line;
  }
  close $OUT;
  close $IN;
  return $ini_out;
}

The error showing MUS MUSCULUS_WGS FLAGLIST rather than MUS_MUSCULUS_WGS FLAGLIST suggests this may still fail (as the above code shows the space being substituted).
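
To make the mismatch concrete, here is a small sketch comparing the section prefix the wrapper writes with the one the flagging step appears to look up. `wrapper_section_prefix` copies the two substitutions from `add_species_flag_ini` above. `flagging_section_prefix` is a hypothetical stand-in: it assumes cgpFlagCaVEMan.pl only upper-cases the species, which is inferred from the error text and not checked against its source.

```perl
use strict;
use warnings;

# Same two steps as add_species_flag_ini: spaces become underscores, then upper-case.
sub wrapper_section_prefix {
    my ($species) = @_;
    $species =~ s/ /_/g;
    return uc $species;
}

# ASSUMPTION (hypothetical helper): the flagging script upper-cases the species
# but keeps spaces, as the "MUS MUSCULUS_WGS FLAGLIST" error message suggests.
sub flagging_section_prefix {
    my ($species) = @_;
    return uc $species;
}

for my $species ('mouse', 'Mus musculus') {
    my $written = wrapper_section_prefix($species) . '_WGS FLAGLIST';
    my $wanted  = flagging_section_prefix($species) . '_WGS FLAGLIST';
    my $status  = $written eq $wanted ? 'match' : 'MISMATCH';
    print "${species}: ini has [$written], lookup wants [$wanted] -> $status\n";
}
```

Under that assumption a single-word species such as mouse lines up, while any species name containing a space does not.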

Theoretically, updating this function to always defer to the provided value (when specified) should solve this:

sub resolve_sp_as {
  my $options = shift;
  ## read species/assembly from bam headers
  my ($mt_species, $mt_assembly) = species_assembly_from_xam($options->{'t'});
  my ($wt_species, $wt_assembly) = species_assembly_from_xam($options->{'n'});
  if($mt_species ne $wt_species) {
    warn "WARN: Species mismatch between T/N [CR|B]AM headers\n";
    if(!defined $options->{'sp'} || $options->{'sp'} eq q{}) {
      die "ERROR: Please define species to handle this mismatch\n";
    }
  }
  elsif($mt_species ne q{}) {
    $options->{'sp'} = $mt_species;
  }
  if(!defined $options->{'sp'} || $options->{'sp'} eq q{}) {
    die "ERROR: Please define species, not found in [CR|B]AM headers.\n";
  }
  if($mt_assembly ne $wt_assembly) {
    warn "WARN: Assembly mismatch between T/N [CR|B]AM headers\n";
    if(!defined $options->{'as'} || $options->{'as'} eq q{}) {
      die "ERROR: Please define assembly to handle this mismatch\n";
    }
  }
  elsif($mt_assembly ne q{}) {
    $options->{'as'} = $mt_assembly;
  }
  if(!defined $options->{'as'} || $options->{'as'} eq q{}) {
    die "ERROR: Please define assembly, not found in [CR|B]AM headers.\n";
  }
}

Unfortunately, I don't know when we would be able to perform a full round of testing; feel free to raise a PR.
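
As a possible starting point for such a PR, the sketch below shows one way `resolve_sp_as` could defer to the command-line value whenever one is given. It is untested against the real pipeline. `resolve_value` is a hypothetical helper, and `species_assembly_from_xam` is replaced by a stub with made-up header values so the example runs on its own. Whether the flagging step then receives the overridden species depends on wrapper code not shown in this thread.

```perl
use strict;
use warnings;

# STUB (hypothetical): stands in for the real species_assembly_from_xam, which
# reads a BAM/CRAM header. The file names and header values here are invented.
my %FAKE_HEADERS = (
    'tumour.bam' => ['Mus musculus', 'GRCm38'],
    'normal.bam' => ['Mus musculus', 'GRCm38'],
);

sub species_assembly_from_xam {
    my ($file) = @_;
    return @{ $FAKE_HEADERS{$file} // [q{}, q{}] };
}

# Hypothetical helper: a non-empty command-line value always wins; otherwise
# fall back to the header value, provided tumour and normal agree on it.
sub resolve_value {
    my ($label, $cli, $mt, $wt) = @_;
    if (defined $cli && $cli ne q{}) {
        warn 'WARN: ' . ucfirst($label) . " mismatch between T/N [CR|B]AM headers\n"
            if $mt ne $wt;
        return $cli;
    }
    die "ERROR: Please define $label to handle this mismatch\n" if $mt ne $wt;
    die "ERROR: Please define $label, not found in [CR|B]AM headers.\n" if $mt eq q{};
    return $mt;
}

sub resolve_sp_as {
    my $options = shift;
    ## read species/assembly from bam headers
    my ($mt_species, $mt_assembly) = species_assembly_from_xam($options->{'t'});
    my ($wt_species, $wt_assembly) = species_assembly_from_xam($options->{'n'});
    $options->{'sp'} = resolve_value('species',  $options->{'sp'}, $mt_species,  $wt_species);
    $options->{'as'} = resolve_value('assembly', $options->{'as'}, $mt_assembly, $wt_assembly);
    return $options;
}

# With -sp mouse -as mm10 given, the header values no longer overwrite them.
my $opts = resolve_sp_as({ t => 'tumour.bam', n => 'normal.bam', sp => 'mouse', as => 'mm10' });
print "species=$opts->{'sp'} assembly=$opts->{'as'}\n";
```

The main difference from the current function is the order of the checks: the header values are only used as a fallback, so a user-supplied -sp or -as is never replaced.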

@ym3
Author

ym3 commented Feb 4, 2020

Thanks for your input.
This was easily solved by adding the corresponding parameter and flag sections
[MUS MUSCULUS_WGS PARAMS]
[MUS MUSCULUS_WGS FLAGLIST]
[MUS MUSCULUS_WGS BEDFILES]
to the flag.vcf.config.WGS.ini file, which is part of SNV_INDEL_mm10-fragment.tar.gz, the archive supplied to ds-cgpwgs.pl using the -si option.
