SD--tsv #9

Open
duhuipeng opened this issue Oct 28, 2020 · 4 comments

@duhuipeng

Dear author,
I ran your code and it generated three files with the .tsv suffix, as shown below:
[image]
Which of these files should I mainly look at? Right now I am looking at final_decomposition.tsv. I want to predict structural variants (SVs), and I would like to know whether I can do that with this file.
Looking forward to your reply.

@duhuipeng
Author

Dear author,
Could you explain this sentence? I do not understand why (i+1, j) represents an insertion, (j+1, i) represents a deletion, and so on.
[image]

@seryrzu
Collaborator

seryrzu commented Oct 29, 2020

Hi,

Thank you for your interest in String Decomposer! The final output of the tool is in final_decomposition.tsv.

Regarding your second question: this is just how the graph is defined. It is closely analogous to the alignment matrix of two sequences (see, for example, https://en.wikipedia.org/wiki/Needleman%E2%80%93Wunsch_algorithm).
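
To make the analogy concrete, here is a minimal sketch of the textbook Needleman-Wunsch alignment matrix. It is not the String Decomposer graph itself; the function name, the scoring values, and the choice of which gap direction is labelled "D" (deletion) or "I" (insertion) are assumptions made for illustration. The point is only that each cell has three incoming moves: a diagonal step consumes one character from both sequences, while a vertical or horizontal step consumes a character from just one of them, and that is what a gap (insertion or deletion) means.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment of a and b; returns (score, operations).

    Operations, read left to right along the alignment:
      M = match, X = mismatch        (diagonal move, consumes a[i] and b[j])
      D = character of a has no partner in b  (vertical move, consumes a[i] only)
      I = character of b has no partner in a  (horizontal move, consumes b[j] only)
    The D/I naming is a convention chosen for this sketch.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i, j):
        # Substitution score for pairing a[i-1] with b[j-1]
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # diagonal
                score[i - 1][j] + gap,            # vertical
                score[i][j - 1] + gap,            # horizontal
            )

    # Trace back from the bottom-right corner to recover one optimal path.
    ops = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            ops.append("M" if a[i - 1] == b[j - 1] else "X")
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            ops.append("D")
            i -= 1
        else:
            ops.append("I")
            j -= 1
    return score[n][m], "".join(reversed(ops))


if __name__ == "__main__":
    print(needleman_wunsch("ACGT", "AGT"))
```

Swapping the two arguments turns every "D" into an "I", which is why the labels depend only on which sequence is treated as the reference.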

Thanks,
Andrey

@duhuipeng
Author

Dear author,
[image]
[image]
My question is: in final_decomposition.tsv, which column should I mainly look at to see structural variation?
Is it the third column, the one with letters?
Looking forward to your reply.
Best

@TanyaDvorkina
Collaborator

TanyaDvorkina commented Nov 11, 2020

Hi!

Thank you again for your interest in StringDecomposer.
The file final_decomposition.tsv has the following columns (from left to right):

  1. Sequence name (usually read or assembly)
  2. Best aligned monomer name (it ends with ' if the alignment is to the reverse complement)
  3. Alignment start position on sequence
  4. Alignment end position on sequence
  5. Alignment identity score
  6. Second best aligned monomer name
  7. Second best aligned monomer identity score
  8. Best aligned monomer name when homopolymers are collapsed (e.g. GGGG -> G) in both sequences
  9. Alignment score for the best monomer with collapsed homopolymers
  10. Second best aligned monomer name with collapsed homopolymers
  11. Alignment score for the second best monomer with collapsed homopolymers
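
As a rough sketch of how these columns could be read in a script, the snippet below maps each field to a name. The column names in `COLUMNS`, the function `read_decomposition`, and the sample row are all hypothetical and chosen here for illustration. The sketch also assumes that the file is tab-separated with no header line, that the columns appear in exactly the order listed above, and that the positions and the identity score are numeric.

```python
import csv
import io

# Hypothetical names for the columns, in the order described above.
COLUMNS = [
    "sequence", "monomer", "start", "end", "identity",
    "second_monomer", "second_identity",
    "hpc_monomer", "hpc_score",
    "hpc_second_monomer", "hpc_second_score",
]


def read_decomposition(handle):
    """Parse rows of a final_decomposition.tsv-like file into dictionaries."""
    rows = []
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for fields in reader:
        if len(fields) < len(COLUMNS):
            continue  # skip blank or incomplete lines
        row = dict(zip(COLUMNS, fields))  # extra trailing fields are ignored
        # A trailing apostrophe marks a reverse-complement alignment.
        row["reverse_complement"] = row["monomer"].endswith("'")
        row["start"] = int(row["start"])
        row["end"] = int(row["end"])
        row["identity"] = float(row["identity"])
        rows.append(row)
    return rows


# Made-up example row, not real tool output.
SAMPLE = "read1\tA'\t10\t180\t95.3\tB'\t80.1\tA'\t96.0\tB'\t81.2\n"

if __name__ == "__main__":
    for r in read_decomposition(io.StringIO(SAMPLE)):
        print(r["sequence"], r["monomer"], r["start"], r["end"], r["identity"])
```

For a real file, `open("final_decomposition.tsv", newline="")` could be passed in place of the `io.StringIO` object.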

Sorry for the late response!

Thank you,
Tanya
