/1 and /2 suffixes in paired-end reads #325
Comments
Hmm, yeah, there is an expectation that the read names will be exactly the same. We could easily add an option to either skip this check or strip the suffix before checking. Thoughts @IanSudbery? |
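To illustrate the second option mentioned above (strip the suffix before checking), here is a minimal Python sketch. It is not the UMI-tools implementation; the function names `strip_pair_suffix` and `names_match` and the `ignore_suffix` parameter are hypothetical, and the read names are assumed to be the identifier part of the FASTQ header only.

```python
def strip_pair_suffix(read_name, suffixes=("/1", "/2")):
    """Return read_name without a trailing /1 or /2, if one is present."""
    for suffix in suffixes:
        if read_name.endswith(suffix):
            return read_name[: -len(suffix)]
    return read_name


def names_match(read1_name, read2_name, ignore_suffix=False):
    """Compare the two mate names, optionally ignoring the /1 and /2 suffixes.

    ignore_suffix is a hypothetical switch, not a real UMI-tools option.
    """
    if ignore_suffix:
        read1_name = strip_pair_suffix(read1_name)
        read2_name = strip_pair_suffix(read2_name)
    return read1_name == read2_name
```

With the strict comparison, `names_match("read7/1", "read7/2")` is false; with `ignore_suffix=True` the two names compare equal.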
We can set to ignore in extract, or set a specific allowance for /1 and /2. |
OK. I'll set this up on a branch for @gcorre to test out |
Hi @gcorre - Could you try installing This branch also introduces |
Hi,
But (there is always one!) when mapping with STAR, the suffix is trimmed along with the UMI and cell barcode to its right, so the information is lost for the next steps: Would it be possible to add the UMI and cell barcode before the /1 and /2 suffix, like: best |
hi @gcorre. Thanks for testing this out. The STAR manual does indeed state that these suffixes are removed (copied below)
I'll update the branch today to add the UMI before the suffix. |
Or UMI-tools could just remove the suffixes. I don't think they are needed for anything, and I'm pretty sure all aligners remove them. |
Thanks, best |
Yeah, this would be my concern about removing the suffixes too. I don't know of any tool which actually uses the suffixes but I'm loath to remove them! @gcorre - The latest version of the branch should insert the UMIs in between the read name and suffix. |
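The behaviour described above, placing the UMI between the read name and the suffix, can be sketched as follows. This is an illustration only, not the code on the branch: `add_umi_before_suffix` is a hypothetical name, and the `_` separator is an assumption chosen for the example.

```python
def add_umi_before_suffix(read_name, umi, separator="_", suffixes=("/1", "/2")):
    """Insert the UMI before a trailing /1 or /2, or append it if there is none.

    The separator default is an assumption made for this sketch.
    """
    for suffix in suffixes:
        if read_name.endswith(suffix):
            stem = read_name[: -len(suffix)]
            return stem + separator + umi + suffix
    # No paired-end suffix: the UMI simply goes at the end of the name.
    return read_name + separator + umi
```

Because the suffix stays at the very end of the name, an aligner that trims `/1` and `/2` would remove only the suffix and leave the UMI attached to the read name.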
Hi, after a test on 4 single-cell libraries (140 million reads each), everything looks OK in terms of read names, and processing time is not significantly longer (around 2h each). |
Great. Let me know when you've run the alignment and deduplication and confirmed this all works OK. Then I'll merge this into the master and close the issue. |
I wonder if instead of having two options we should just have one, that
takes the delimiter and only uses it if one is provided. In fact I reckon
that we already have a default, which is space.
On Fri, 15 Mar 2019, 4:16 pm, Tom Smith wrote:
> Great. Let me know when you've run the alignment and deduplication and confirmed this all works OK. Then I'll merge this into the master and close the issue.
|
Yeah I figured you might prefer this route. Hadn't considered that we could just use space by default but this does seem like adding an unnecessary cell to Optimally, the check for stripping should occur once in the call to |
Sorry, my point wasn't that we could use |
Ah, right, got it. |
Indeed, a choice of delimiter may be the solution (space by default, or user-defined). Here is a log of the complete pipeline on a subset of my libraries (2.5M reads) with the leading rows of the outputs: best |
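One way to read the delimiter idea discussed above is sketched below in Python: the header is split at the first occurrence of a delimiter (space by default, user-defined otherwise) and the UMI is attached to the part before it. This is a hypothetical illustration under that reading, not the option that was eventually merged; `add_umi_at_delimiter`, `delimiter` and `separator` are names invented for the example.

```python
def add_umi_at_delimiter(header, umi, delimiter=" ", separator="_"):
    """Attach the UMI to the part of the header before the first delimiter.

    Everything from the delimiter onwards is kept unchanged. If the
    delimiter does not occur, the UMI is appended to the whole header.
    """
    name, found, rest = header.partition(delimiter)
    return name + separator + umi + found + rest
```

With the default space delimiter, a header such as `read7 1:N:0:ACGT` keeps its comment after the UMI; passing `delimiter="/"` handles `read7/1` the same way.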
This is now resolved in the master branch with option |
Hi,
Can umi-tools analyze paired-end reads with /1 and /2 suffixes instead of 1:N and 2:N?
I am trying with v0.5.5 and get the error message below from umi-tools extract:
Thanks,