low alignment rate with MeRanGh #4

Vishvak2000 · 2023-12-18T06:30:29Z

Hello, I have been trying to use meRanTK to align my paired-end BS-RNA-Seq reads, but I am getting an extremely low alignment rate (around 0.05%). Here is my call:

~/software/meRanTK/meRanGh.pl align \
            -o test/meRanGhResult \
            -f RB_WT1_S1_L001_R1_001.fastq \
            -r RB_WT1_S1_L001_R2_001.fastq \
            -t 10 \
            -S test_RNA_BSseq.sam \
            -GTF gencode.v44.annotation.gtf \
            -id transcriptome_Gh_hisat_built \
            -dt

When I use hisat3n to align my reads, I get a much better alignment rate (~60%). If I run meRanCall on the BAM produced by hisat3n, it seems to get stuck at the processing stage:

processing 706 sequences: [KI270757.1 - 100.00%] [overall - 100.00%] done ...


riederd · 2023-12-18T06:56:13Z

Is your library reverse stranded?
You might try to reverse complement your fastq files first, e.g. using fastx_reverse_complement.
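
If fastx_reverse_complement is not available, the same operation can be sketched with plain awk. This is only a minimal sketch and not part of meRanTK or the FASTX-Toolkit: the function name `revcomp_fastq` is hypothetical, and it assumes uncompressed FASTQ with exactly four lines per record (no wrapped sequence lines). The sequence line is reversed and complemented, and the quality line is reversed so that each score stays with its base.

```shell
# Hypothetical helper (not part of meRanTK or FASTX-Toolkit):
# reverse complement every record of a 4-line FASTQ read from stdin.
revcomp_fastq() {
  awk '
    BEGIN {
      # Complement table; any other character is passed through unchanged.
      split("A C G T N a c g t n", from, " ")
      split("T G C A N t g c a n", to, " ")
      for (i = 1; i <= 10; i++) comp[from[i]] = to[i]
    }
    NR % 4 == 2 {            # sequence line: reverse and complement
      out = ""
      for (i = length($0); i > 0; i--) {
        c = substr($0, i, 1)
        out = out ((c in comp) ? comp[c] : c)
      }
      print out; next
    }
    NR % 4 == 0 {            # quality line: reverse only
      out = ""
      for (i = length($0); i > 0; i--) out = out substr($0, i, 1)
      print out; next
    }
    { print }                # header and "+" lines are unchanged
  '
}

# Example call (the output file name is an assumption):
# revcomp_fastq < RB_WT1_S1_L001_R1_001.fastq > RB_WT1_S1_L001_R1_001.rc.fastq
```

For full-size files a dedicated tool will be much faster; the sketch is mainly useful for checking a handful of reads by eye.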

Vishvak2000 · 2023-12-18T17:59:48Z

Reverse complementing works: I am now getting a much higher alignment rate. However, when I run meRanCall, I seem to be stuck at this step:

Working on: test_RNA_BSseq_sorted.bam
No region BED file specified: calling m5Cs on entire alignment file: test_RNA_BSseq_sorted.bam ...
Starting to process: 549 targets on 20 CPUs ...
processing 549 sequences: [KI270755.1 - 100.00%] [overall - 100.00%] done ...

Here is my function call:
-fasta ../../GRCh38.p14.genome.fa \
-bam test_RNA_BSseq_sorted.bam \
-o meRanCall_result.txt \
-genomeDBref \
-p 20

Using top, I can see that the processes are still running. However, it is taking a suspiciously long time, and I am wondering whether this is expected behavior.

Vishvak2000 · 2023-12-19T19:50:07Z

Hello! Just following up on this - it is still stuck on the above step.

riederd · 2023-12-19T22:10:37Z

You did not pass a GTF file to meRanCall, so it might take quite a long time.
Was the BAM test_RNA_BSseq_sorted.bam generated with meRanGs or meRanGh? I ask because you are passing -genomeDBref.

Vishvak2000 · 2023-12-19T22:47:50Z

It was generated with meRanGh, should I be passing -tref?

riederd · 2023-12-19T23:10:44Z

No, but a GTF should speed things up.

riederd · 2023-12-20T07:59:40Z

processing 549 sequences: [KI270755.1 - 100.00%] [overall - 100.00%] done ...

This line, however, means that the run has finished. Did you get any result files?

Vishvak2000 · 2023-12-20T18:35:34Z

I only get the header of the text file in my output. I also don’t get the regular summary output (number of Cs analyzed, etc.).

riederd · 2023-12-21T12:24:37Z

Can you try to run with -debug and see what you get?

Vishvak2000 · 2024-01-09T20:11:51Z

Thanks for the suggestion. Here is the end of the output with -debug:

chr21:9997359:+ QB:G Cs: 10 : 37

chr21:9997360:+ QB:G Cs: 10 : 37

chr21:9997361:+ QB:T Cs: 10 : 37
processing 549 sequences: [KI270755.1 - 100.00%] [overall - 100.00%] done ...
Done...

Analyzed: test/meRanGhResult/test_RNA_BSseq_sorted.bam

Summary:
Analysis of Cs with minimum coverage of 20

Total duplicate reads filtered: 0

Total analyzed Cs on reference: 0
Total analyzed methylated Cs (<= 80% C->T conversion) on reference: 0
Total analyzed unconverted Cs from queries: 0
Total analyzed unconverted Cs from mutation: 0
Total analyzed C to T conversion rate: 0

Summary over all Cs:

Total Cs on reference covered by seq data: 578988
Total Cs from queries unconverted: 1012141
Total C to T conversion rate estimated: 0.9299

Result file: meRanCall_result.txt

The result file is still empty.

riederd · 2024-01-09T21:00:58Z

Would it be possible for you to share the fastq files? Or, let's say, the first 1 M reads of each:

head -n 4000000 RB_WT1_S1_L001_R1_001.fastq | gzip -c > test_R1_sub.fastq.gz
head -n 4000000 RB_WT1_S1_L001_R2_001.fastq | gzip -c > test_R2_sub.fastq.gz
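
The 4,000,000-line cutoff follows from the FASTQ layout: each read occupies four lines, so 1,000,000 reads × 4 = 4,000,000 lines. As a quick sanity check before sharing, a small sketch like the following could confirm that both subsampled mate files hold the same number of reads. The helper name `count_reads` is hypothetical, and it assumes gzip-compressed FASTQ with exactly four lines per record.

```shell
# Hypothetical helper: number of reads in a gzipped 4-line FASTQ file.
count_reads() {
  gzip -dc "$1" | awk 'END { print NR / 4 }'
}

# Example check on the subsampled files named above:
# r1=$(count_reads test_R1_sub.fastq.gz)
# r2=$(count_reads test_R2_sub.fastq.gz)
# [ "$r1" = "$r2" ] && echo "mates in sync: $r1 reads each"
```

A non-integer result would indicate that the file was cut in the middle of a record or does not use four lines per read.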

Vishvak2000 · 2024-01-09T23:06:52Z

Here are the test files. These are already reverse complemented:

https://app.box.com/folder/243307876541?s=wz09z7wcbjcm5wmidajqq79uw1sxtabv

Vishvak2000 · 2024-01-18T21:19:24Z

Hello! Just following up to see if there are any issues with the fastqs themselves. Appreciate the help.
