Combine the results #72

JunmingH · 2019-11-18T21:13:58Z

Hi, I was trying to combine the results because I ran into an issue when processing all the BAM files together. Right now I have four files for each subject: CircCoordinates, CircRNACount, CircSkipJunctions, and LinearCount. How can I combine the subjects into one set of results, and which column should I use to match them?
Thanks!

tjakobi · 2019-11-24T19:21:49Z

Hi @JunmingH,

you would have to combine the different output files into one set of files with one column for each sample. However, the rows will differ too, since not every circRNA will be detected in every sample. I would recommend trying to process all samples with DCC in one run, possibly with a low thread count such as -T 2 to avoid creating too much CPU and memory load.

Cheers,
Tobias
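
For readers who still need to merge per-sample results by hand, here is a minimal sketch of the approach described above: an outer join on the coordinate columns, with 0 filled in wherever a circRNA was not detected in a sample. It assumes each per-sample CircRNACount table is tab-separated, starts with a header row, uses its first three columns (chromosome, start, end) as the key, and has a single count column after them. The function name `merge_count_tables` and its parameters are hypothetical and not part of DCC; check the column layout of your own files first, since some versions may include an additional strand column.

```python
import csv


def merge_count_tables(tables, key_names=("Chr", "Start", "End")):
    """Outer-join per-sample count tables on their leading key columns.

    tables: dict mapping a sample name to an iterable of tab-separated
            text lines (assumed layout: header row, key columns, one count).
    Returns a list of rows; the first row is the merged header.
    """
    n_keys = len(key_names)
    samples = list(tables)
    merged = {}

    for index, sample in enumerate(samples):
        reader = csv.reader(tables[sample], delimiter="\t")
        next(reader, None)  # skip the header row (assumed to be present)
        for row in reader:
            if len(row) <= n_keys:
                continue  # ignore blank or truncated lines
            key = tuple(row[:n_keys])
            # A circRNA missing from a sample keeps the default count of 0.
            counts = merged.setdefault(key, [0] * len(samples))
            counts[index] = int(row[n_keys])

    def sort_key(key):
        # Sort by chromosome name, then numerically by start and end.
        return (key[0], int(key[1]), int(key[2]), key[3:])

    result = [list(key_names) + samples]
    for key in sorted(merged, key=sort_key):
        result.append(list(key) + merged[key])
    return result
```

To use it with real files, pass open file handles as the dict values, for example `{"subject1": open(path1), "subject2": open(path2)}`, and write the returned rows with `csv.writer(..., delimiter="\t")`. Running all samples through DCC in one go, as recommended above, remains the simpler route.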

JunmingH · 2019-11-24T20:05:15Z

Hi @tjakobi Tobias,

I tried using a server to process the data, but it still gave me this error:
Traceback (most recent call last):
  File "/restricted/projectnb/casa/jmh/RNA-seq/circu_RNA/DCC-0.4.7/DCC/main.py", line 818, in <module>
    main()
  File "DCC-0.4.7/DCC/main.py", line 254, in main
    minL=options.min, strand=False, pairdendindependent=False, same=same), Input)
  File "/share/pkg.7/python2/2.7.16/install/lib/python2.7/multiprocessing/pool.py", line 253, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/share/pkg.7/python2/2.7.16/install/lib/python2.7/multiprocessing/pool.py", line 572, in get
    raise self._value
IndexError: list index out of range

tjakobi · 2019-11-25T10:35:06Z

Could you please attach the log file of that DCC run?

JunmingH · 2019-11-25T15:52:21Z

Attached are:
samplesheet.txt
bam_files.txt
DDC2_e.txt
DDC2_o.txt

JunmingH · 2019-11-25T17:53:18Z

@tjakobi Do you have any idea what might be causing this?

tjakobi · 2019-11-26T16:57:33Z

Hi @JunmingH,

I am relatively sure that your command line is not correct, see the following error:

DDC2_o.txt:     => locating circRNAs (unstranded mode) [/restricted/projectnb/casa/jmh/RNA-seq/circu_RNA/script/samplesheet]
DDC2_o.txt:WARNING: File /restricted/projectnb/casa/jmh/RNA-seq/circu_RNA/script/samplesheet, line 2 does not contain all features.
DDC2_o.txt:WARNING: /restricted/projectnb/casa/jmh/RNA-seq/circu_RNA/script/samplesheet is probably corrupt.

Here, the junction files should be scanned, not the samplesheet itself.

Can you please provide your complete command line?

Cheers,
Tobias

JunmingH · 2019-11-26T17:09:34Z

@tjakobi Hi Tobias,

Here it is:
python2 ${app_dir}/main.py @samplesheet
-D -N -R ${gtf_dir}/GRCh38_Repeats_simpleRepeats_RepeatMasker.gtf
-an ref/GRCh38/annotation/Homo_sapiens.GRCh38.95.gtf
-F -M -Nr 1 1 -fg -G -A ref/Homo_sapiens.GRCh38.dna.primary_assembly.fa
-T 2 -O /dcc_all_results/
-B @bam_files

tjakobi · 2019-11-26T17:20:02Z

The samplesheet and the command line look okay. However, DCC seems to think there is only one input file, called samplesheet.

The command line you posted is not the one that DCC itself prints out. Do you have the complete DCC log, i.e. DCC-2019***.log? That log file contains the actual command line DCC sees.

Cheers,
Tobias
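
As a small aid for the check suggested above, the following sketch pulls the recorded command line out of a DCC log. It assumes the log contains a line with the text `DCC command line:` followed by the arguments, as in the log excerpt posted later in this thread. The helper name `find_logged_command` is hypothetical and not part of DCC.

```python
def find_logged_command(log_lines, marker="DCC command line:"):
    """Return the logged command line as a list of tokens, or None.

    log_lines: any iterable of text lines, e.g. an open log file.
    marker:    text assumed to precede the command line in the log.
    """
    for line in log_lines:
        _, found, rest = line.partition(marker)
        if found:
            # Everything after the marker is the command line DCC saw.
            return rest.strip().split()
    return None
```

Comparing the returned tokens with the command in your job script shows quickly whether something, such as a leading `@`, was lost before DCC received its arguments.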

JunmingH · 2019-11-26T18:36:56Z

@tjakobi Sure,
Here it is:
2019-11-24 14:54:23,207 DCC 0.4.7 started
2019-11-24 14:54:23,207 DCC command line: /jmh/RNA-seq/circu_RNA/DCC-0.4.7/DCC/main.py /jmh/RNA-seq/circu_RNA/script/samplesheet -D -N -R jmh/RNA-seq/circu_RNA/script/ref/GRCh38_Repeats_simpleRepeats_RepeatMasker.gtf -an /jmh/ref/GRCh38/annotation/Homo_sapiens.GRCh38.95.gtf -F -M -Nr 1 1 -fg -G -A /bu_brain_rnaseq/hjm_test/step_by_step/ref_RSEM/ref/Homo_sapiens.GRCh38.dna.primary_assembly.fa -T 2 -O /jmh/RNA-seq/circu_RNA/dcc_all_results/ -B /jmh/RNA-seq/circu_RNA/script/bam_files
2019-11-24 14:54:23,422 Starting to detect circRNAs
2019-11-24 14:54:23,422 Non-stranded data, the strand of circRNAs guessed from the strand of host genes
2019-11-24 14:54:23,423 started circRNA detection from file /jmh/RNA-seq/circu_RNA/script/samplesheet

tjakobi · 2019-11-26T20:48:04Z

Hi @JunmingH,

from the log file you can see that the actual command line is

/jmh/RNA-seq/circu_RNA/DCC-0.4.7/DCC/main.py /jmh/RNA-seq/circu_RNA/script/samplesheet

While it should be

/jmh/RNA-seq/circu_RNA/DCC-0.4.7/DCC/main.py @/jmh/RNA-seq/circu_RNA/script/samplesheet

The @ for the input is missing.

Cheers,
Tobias
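
To illustrate why the missing @ matters, here is a minimal sketch using Python's standard `argparse` module and its `fromfile_prefix_chars` option. It assumes DCC expands `@file` arguments through this or a similar mechanism; the parser below is a hypothetical stand-in, not DCC's real one, and the junction file names are made up for the demonstration.

```python
import argparse
import os
import tempfile


def build_parser():
    # Hypothetical stand-in parser: "@path" is replaced by the lines of that file.
    parser = argparse.ArgumentParser(fromfile_prefix_chars="@")
    parser.add_argument("inputs", nargs="+")
    return parser


def parse_inputs(argv):
    return build_parser().parse_args(argv).inputs


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        sheet = os.path.join(tmp, "samplesheet")
        with open(sheet, "w") as handle:
            # Made-up file names, one per line, as in a samplesheet.
            handle.write("sampleA.junction\nsampleB.junction\n")
        with_at = parse_inputs(["@" + sheet])   # file is expanded line by line
        without_at = parse_inputs([sheet])      # path itself is the only input
    return sheet, with_at, without_at
```

With the @ prefix, the parser sees one input per line of the samplesheet. Without it, the samplesheet path is passed through as the single input file, which matches the behaviour seen in the log above, where circRNA detection started on the samplesheet itself.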

JunmingH · 2019-11-26T21:28:55Z

Hi @tjakobi Tobias

Thanks for your help! It's working now!

tjakobi mentioned this issue Nov 24, 2019

Combining individual circRNA read counts - error #68

Open

tjakobi added the question label Nov 24, 2019

tjakobi closed this as completed Nov 27, 2019
