Coordinate system query #2

cbergman · 2015-06-04T14:49:41Z

In the RetroSeq VCF file, the position of a TE insertion relative to the reference is given in 1-based coordinates in the POS column. In addition, there is a pair of consecutive coordinates in the INFO field, the first of which corresponds to the POS column and the second to the next base in the genome. Does this imply that the predicted insertion would integrate between the first and second positions in the INFO field? In other words, to convert RetroSeq predictions to 0-based coordinates, do we (i) use the two coordinates in the INFO field, or (ii) subtract 1 from the POS column to make a new start position on 0-based coordinates?

tk2 · 2015-06-09T20:01:13Z

Yes, that is correct. But to be honest, I never consider the breakpoints to be accurate to the exact bp. Some mini local assembly and realignment could get them to bp accuracy; I just never got around to implementing that.

cbergman · 2015-07-02T09:45:41Z

Thanks and sorry for the slow reply.

We are assuming that "that is correct" refers to "Does this imply that the predicted insertion would integrate between the first and second positions in the INFO field?".

This means that RetroSeq is using the INFO field to represent the TE insertion location (which is in reality inter-base) on 1-based coordinates by annotating a consecutive span of 2 nucleotides, with the insertion site being between the first and second nucleotide. This 2-nucleotide span cannot be represented directly in the POS column of the VCF file, which only allows a 1-based single nucleotide feature to be annotated.

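To convert RetroSeq output to 0-based BED format in https://github.com/bergmanlab/mcclintock, we will maintain the 2-nucleotide framework, and thus annotate POS-1 as the start and POS+1 as the end of the 2-nucleotide interval. Because BED intervals are 0-based and half-open, this interval covers exactly the nucleotides at POS and POS+1 on 1-based coordinates.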
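
For anyone wanting to check the arithmetic, here is a minimal Python sketch of the conversion described above. It assumes only the standard VCF layout (tab-separated, CHROM in column 1, POS in column 2) and does not parse the RetroSeq INFO field. The function names are hypothetical and are not taken from mcclintock or RetroSeq.

```python
def vcf_pos_to_bed_interval(chrom, pos):
    """Map a 1-based VCF POS to a 0-based, half-open BED interval.

    The interval spans the two nucleotides flanking the insertion
    site (POS and POS+1 on 1-based coordinates).
    """
    if pos < 1:
        raise ValueError("VCF POS must be >= 1")
    start = pos - 1  # 0-based index of the nucleotide at POS
    end = pos + 1    # half-open end, so the nucleotide at POS+1 is included
    return chrom, start, end


def vcf_line_to_bed_line(vcf_line):
    """Convert one VCF data line to a 3-column BED line (hypothetical helper)."""
    fields = vcf_line.rstrip("\n").split("\t")
    chrom, pos = fields[0], int(fields[1])
    return "\t".join(str(x) for x in vcf_pos_to_bed_interval(chrom, pos))
```

With this sketch, a prediction at POS 1000 becomes the BED interval 999 to 1001, which has length 2 and places the inferred insertion site between its two bases.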
