methylation bias with tagmentation based WGBS library #564
Comments
Sorry, the graph labels did not show up in the post:
Hi @docatherine Thanks for sharing these details. I was not explicitly aware of biases arising from tagmentation experiments, but I'm not very surprised to learn that they do exist. We have seen such biases, at both the sequence-composition and the methylation-bias level, for a variety of applications, e.g. PBAT and single-cell applications: https://sequencing.qcfail.com/applications/pbat/. In our cases, it proved much better to get rid of the biased positions altogether by hard-clipping the affected residues before mapping (rather than just ignoring the methylation calls), as the alignment rates were often much worse due to additional errors and InDels in the biased positions. A command like:
should do the job; maybe you could compare mapping efficiencies? Regarding the methylation values themselves, did you use the Furthermore, I agree that it is puzzling to see the regions with high methylation levels disappear when aggregating all data. A sequence preference of Tn5 for T might indeed explain this phenomenon. To understand whether Tn5 might preferentially target such unmethylated cytosines, it would be important to know whether Tn5 is used prior to the bisulfite conversion process, i.e. is there a chance that unmethylated regions are converted to Ts, which then get cleaved? If the tagmentation takes place before the conversion (which tends to occur on a single-stranded fragment), the conversion might not explain the preference as straightforwardly.
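As an illustration only of what hard-clipping means at the read level (this is not the command referred to above, and in practice a dedicated trimming tool would do this), here is a minimal Python sketch. The function names `hard_clip` and `read_fastq` and the default clip length of 10 are assumptions made for the example.

```python
def hard_clip(record, clip_5prime=10):
    """Remove the first `clip_5prime` bases and qualities from one FASTQ record.

    `record` is a (header, sequence, quality) tuple. The clip length of 10
    is an assumed example value, not a recommendation.
    """
    header, seq, qual = record
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    return header, seq[clip_5prime:], qual[clip_5prime:]


def read_fastq(lines):
    """Yield (header, sequence, quality) tuples from an iterable of FASTQ lines.

    Assumes the simple four-line-per-record FASTQ layout.
    """
    it = iter(lines)
    for header in it:
        seq = next(it).rstrip("\n")
        next(it)  # skip the '+' separator line
        qual = next(it).rstrip("\n")
        yield header.rstrip("\n"), seq, qual
```

Clipping before alignment means the biased positions can no longer contribute mismatches or InDels to the mapping, which is the difference from merely ignoring the methylation calls at those positions afterwards.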
Hi Felix, Yes, in the Zymo WGBS library kit the bisulfite conversion occurs first, so unmethylated Cs should be converted to Ts before cleavage. We will sequence this NOME-seq DNA using Nanopore which will, I hope, tell us whether the issue was related to the library prep. Catherine
Thanks very much for these nice comments, they are very much appreciated! I shall go ahead and close this issue for the time being; you can always re-open it when you have additional information available. Best wishes, Felix
Hi Felix,
![image](https://user-images.githubusercontent.com/39420501/216281727-a6a60758-494c-476b-9887-36614167d095.png)
Have you ever seen methylation bias due to tagmentation based WGBS libraries?
We performed some WGBS using the EZ DNA Methylation Kit, which uses Tn5, and observed a strange M-bias profile on both reads, not only for the % methylation but also for the total CHG and CHH calls:
(Of note, we did a GpC methyltransferase treatment, which methylates the C at GpC sites in open chromatin regions (NOME-seq approach). This explains the higher level of non-CpG methylation, but I don't think it explains this bias, especially at position 5.
The FastQC report showed a bias within the first 10 bp, which seems to be consistent with the known bias due to Tn5 preferred cut sites. So at first I did not worry and ignored the methylation calls for those bases, although I am not sure why the Gs are also lower. I did not include read 2, but it looks exactly the same as read 1, with both lower G and C levels, which is also strange.)
![image](https://user-images.githubusercontent.com/39420501/216283100-dd6c95ab-5ac0-42c5-a7ad-100567358cb6.png)
![image](https://user-images.githubusercontent.com/39420501/216285626-2e2cec4d-0ada-4831-8f11-3970ef97a458.png)
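For readers who want to check the read-start composition outside FastQC, a per-position base composition can be sketched as below. This is a minimal illustration; the function name `per_position_composition` and the default of 10 positions are assumptions for the example.

```python
from collections import Counter


def per_position_composition(reads, n_positions=10):
    """Return, for each of the first `n_positions` read positions,
    the fraction of A, C, G and T observed at that position.

    `reads` is an iterable of sequence strings. Bases other than
    A/C/G/T (e.g. N) count towards the total but are not reported.
    """
    counts = [Counter() for _ in range(n_positions)]
    for seq in reads:
        for i, base in enumerate(seq[:n_positions]):
            counts[i][base] += 1

    fractions = []
    for c in counts:
        total = sum(c.values())
        fractions.append({b: (c[b] / total if total else 0.0) for b in "ACGT"})
    return fractions
```

In bisulfite-converted reads the C fraction is expected to be low overall, so the interesting signal is how the first positions deviate from the rest of the read rather than the absolute values.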
However, when comparing the CpG methylation levels (after excluding GCG sites, whose methylation calls can be ambiguous due to the GpC methylation, and excluding the methylation calls within the first 15 bp) with RRBS data generated from the same sample, I noticed that the highly methylated peak "disappeared".
I really don't think that this is due to the GpC methylation: non-specific methylation by the GpC methyltransferase would tend to artificially increase the methylation at CpGs, not the opposite. Low bisulfite conversion would also look like hypermethylation, not hypomethylation.
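The CpG filtering described above (dropping GCG sites and calls within the first 15 bp of a read) could be sketched as follows. This is a simplified illustration for forward-strand cytosines only; the function name `keep_cpg_call`, its arguments and the 0-based coordinates are assumptions for the example, not part of any existing tool.

```python
def keep_cpg_call(reference, pos, read_pos, min_read_pos=15):
    """Decide whether a forward-strand CpG methylation call is kept.

    reference    : reference sequence as a string
    pos          : 0-based index of the cytosine in `reference`
    read_pos     : 0-based position of the call within the read
    min_read_pos : calls at read positions below this value are dropped
                   (15 drops the first 15 bp)

    Reverse-strand calls would need the reverse-complemented context
    and are not handled in this sketch.
    """
    ref = reference.upper()
    # The site must be a C followed by a G to be a CpG.
    if pos < 0 or pos + 1 >= len(ref) or ref[pos] != "C" or ref[pos + 1] != "G":
        return False
    # GCG: the C is also in a GpC context, so the call is ambiguous in NOMe-seq.
    if pos > 0 and ref[pos - 1] == "G":
        return False
    # Drop calls that fall within the biased start of the read.
    if read_pos < min_read_pos:
        return False
    return True
```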
I was thinking that maybe the preference of Tn5 for a T at position 5 (even on genomic DNA) could explain a bias toward unmethylated CpGs at this position (and overall on the CpGs covered by the same read). Do you think this is possible? I looked in the literature and nobody mentions a potential methylation bias for tagmentation-based methyl-seq libraries. I saw a paper showing that methylation does not affect the Tn5 cut sites, but it does not say whether the Tn5 cut-site bias could affect the methylation calls.
Thank you for your input and expertise.