Error in function annotateCn #3

Open
nalcala opened this issue Mar 11, 2024 · 0 comments

nalcala commented Mar 11, 2024

Hi,

Thanks for the great tool! I am getting an error with a very segmented tumor (more CN segments than variants), and it is caused by this line:

vcfGR$cn[S4Vectors::to(overlaps)] <- cnaGR$cn[S4Vectors::from(overlaps)]

It seems to be due to a bug in the code, which should read
vcfGR$cn[S4Vectors::from(overlaps)] <- cnaGR$cn[S4Vectors::to(overlaps)]
instead of
vcfGR$cn[S4Vectors::to(overlaps)] <- cnaGR$cn[S4Vectors::from(overlaps)]

Indeed, overlaps is computed as overlaps <- GenomicRanges::findOverlaps(vcfGR, cnaGR), so "from" indexes vcfGR (which has 1827 rows in my case) and "to" indexes cnaGR (which has 2188 rows in my case). The code, however, does the opposite: it extracts the "from" values from cnaGR and assigns them to the "to" positions in vcfGR. This would not throw an error unless cnaGR has more rows than vcfGR, in which case to(overlaps) will likely contain indices outside the range of vcfGR, hence the crash. In any case, this bug leads to a wrong annotation of the VCF.
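To make the index direction concrete, here is a minimal sketch on toy data. The two variants and three segments below are made up for illustration and are not from my sample; only the object names vcfGR, cnaGR and overlaps follow the code quoted above. It assumes the GenomicRanges package is installed.

```r
library(GenomicRanges)

# Hypothetical toy data: 2 variants and 3 copy-number segments
vcfGR <- GRanges(seqnames = c("chr1", "chr2"),
                 ranges = IRanges(start = c(150, 50), width = 1))
cnaGR <- GRanges(seqnames = c("chr1", "chr1", "chr2"),
                 ranges = IRanges(start = c(1, 101, 1), end = c(100, 200, 100)),
                 cn = c(2, 3, 4))

# Query is vcfGR and subject is cnaGR, so from() indexes vcfGR and to() indexes cnaGR
overlaps <- findOverlaps(vcfGR, cnaGR)

# Proposed fix: assign at the variant (query) indices, read at the segment (subject) indices
vcfGR$cn <- rep(NA_real_, length(vcfGR))
vcfGR$cn[S4Vectors::from(overlaps)] <- cnaGR$cn[S4Vectors::to(overlaps)]
vcfGR$cn  # 3 4
```

Here the chr1 variant falls in the second segment (cn = 3) and the chr2 variant in the third (cn = 4), so each variant receives the copy number of the segment it actually overlaps.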

For example, in my case, the 1000th entry in overlaps leads to:

cnaGR[S4Vectors::from(overlaps)[1000],]
GRanges object with 1 range and 1 metadata column:
      seqnames            ranges strand |        cn
         <Rle>         <IRanges>  <Rle> | <numeric>
  [1]     chr8 94391001-95165000      |    4.9207
vcfGR[S4Vectors::to(overlaps)[1000],]
GRanges object with 1 range and 6 metadata columns:
                      seqnames    ranges strand | paramRangeID            REF                ALT      QUAL      FILTER        cn
                         <Rle> <IRanges>  <Rle> |     <factor> <DNAStringSet> <DNAStringSetList> <numeric> <character> <numeric>
  chr10:122588467_G/A    chr10 122588467       |           NA              G                  A        NA        PASS    7.9278

which are not even on the same chromosome, while the correct answer would be:

vcfGR[S4Vectors::from(overlaps)[1000],] 
GRanges object with 1 range and 6 metadata columns:
                     seqnames    ranges strand | paramRangeID            REF                ALT      QUAL      FILTER        cn
                        <Rle> <IRanges>  <Rle> |     <factor> <DNAStringSet> <DNAStringSetList> <numeric> <character> <numeric>
  chr9:129660438_G/C     chr9 129660438       |           NA              G                  C        NA        PASS    5.8201
cnaGR[S4Vectors::to(overlaps)[1000],]
GRanges object with 1 range and 1 metadata column:
      seqnames              ranges strand |        cn
         <Rle>           <IRanges>  <Rle> | <numeric>
  [1]     chr9 129060001-129755000       |    5.8201

Thanks!

Nicolas
