4 Output files and directory structure

Nuno Fonseca edited this page Mar 13, 2018 · 4 revisions

Output files and directory structure

All output files produced by iRAP are placed in sub-folders under a folder named after the experiment, as defined in the configuration file. The directory tree has the following structure (where [X] denotes the value of the parameter X):

[name]/

  • data/
  • report/
  • irap_qc/
    • [mapper]/
      • [quant_method]/
        • [de_method]/
        • [fusion_method]/
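As a minimal sketch of how this tree maps onto paths, the Python snippet below builds the expected folder locations from the parameter values. The function name `irap_output_dirs` and the example values are hypothetical, and the nesting simply follows the tree as drawn above; it is not part of iRAP itself.

```python
from pathlib import PurePosixPath


def irap_output_dirs(name, mapper, quant_method, de_method=None, fusion_method=None):
    """Return the expected output folders, following the tree shown above.

    Hypothetical helper: every argument is the value of the configuration
    parameter with the same name.
    """
    root = PurePosixPath(name)
    qc_dir = root / "irap_qc"
    mapper_dir = qc_dir / mapper
    quant_dir = mapper_dir / quant_method

    dirs = {
        "data": root / "data",
        "report": root / "report",
        "qc": qc_dir,
        "mapper": mapper_dir,
        "quant": quant_dir,
    }
    # The DE and fusion folders only exist when the respective analysis is enabled.
    if de_method:
        dirs["de"] = quant_dir / de_method
    if fusion_method:
        dirs["fusion"] = quant_dir / fusion_method
    return dirs


# Example values below are illustrative assumptions, not defaults.
for label, path in irap_output_dirs("myexp", "tophat2", "htseq2", "deseq2").items():
    print(f"{label:8s} {path}")
```

Using `PurePosixPath` keeps the sketch independent of the local filesystem, so it can be run without an actual iRAP output folder.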

Contents of the different folders:

  • data/: auxiliary files created by iRAP.

  • report/: HTML report (mostly HTML files and plots).

  • irap_qc/: contains mainly the FASTQ files (filtered if the qc option is enabled) with the reads given as input to the mapper.

  • [mapper]/: contains all the output files generated by the mapper. It also includes the BAM files sorted by name and by chromosomal position, together with their respective indexes.

  • [quant_method]/: contains the output files generated by the quantification method [quant_method], together with TSV files containing a matrix with the number of reads per gene (genes.raw.<quant_method>.tsv), transcript (transcripts.raw.<quant_method>.tsv) and exon (exons.raw.<quant_method>.tsv) for each BAM file. When transcript quantification is enabled, two extra files are generated: one contains the relative transcript isoform usage (transcripts.riu.<quant_method>.irap.tsv) and the other the dominant transcripts for each gene (transcripts.dt.<quant_method>.irap.tsv).

  • [de_method]/: contains the output files generated by the differential expression method [de_method] and, for each contrast defined in the configuration file, a summary TSV file (.genes_de.tsv) with fold change, p-value, adjusted p-value, gene name, GO, and other information for each gene. Currently, iRAP only supports differential expression analysis at gene level.

If gene set enrichment analysis was enabled, the TSV files (.gse.<gse_tool>.<gse_method>.go.tsv and .gse.<gse_tool>.<gse_method>.kegg.tsv) with the gene sets and p-values will be kept in this folder.

  • [fusion_method]/: contains the output files generated by the fusion method selected for each library. It will also include summary files (.fusion.tsv and .fusion.sum.tsv) per library and a matrix with the reads per library supporting each fusion (<fusion_method>_readcounts.tsv).
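The quantification file names listed above follow a regular pattern, which the Python sketch below reproduces. The helper names `quant_file_names` and `read_count_matrix` are hypothetical. The reader also assumes a particular layout (a header row, feature identifiers in the first column, one column per library), which this page does not specify, so check it against a real file before relying on it.

```python
import csv
import io


def quant_file_names(quant_method, transcript_quant=False):
    """Build the TSV file names described for the [quant_method]/ folder."""
    names = {
        "genes": f"genes.raw.{quant_method}.tsv",
        "transcripts": f"transcripts.raw.{quant_method}.tsv",
        "exons": f"exons.raw.{quant_method}.tsv",
    }
    # Two extra files are produced when transcript quantification is enabled.
    if transcript_quant:
        names["riu"] = f"transcripts.riu.{quant_method}.irap.tsv"
        names["dominant"] = f"transcripts.dt.{quant_method}.irap.tsv"
    return names


def read_count_matrix(handle):
    """Read a tab-separated matrix into {feature: {library: value}}.

    Assumed layout (not confirmed by this page): header row, feature ID in
    the first column, one numeric column per library.
    """
    reader = csv.reader(handle, delimiter="\t")
    libraries = next(reader)[1:]
    matrix = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        matrix[row[0]] = dict(zip(libraries, (float(v) for v in row[1:])))
    return matrix


# Usage with made-up data; "htseq2" is only an illustrative method name.
print(quant_file_names("htseq2", transcript_quant=True))
example = io.StringIO("Gene\tlib1\tlib2\nG1\t10\t0\nG2\t3\t7\n")
print(read_count_matrix(example))
```

For a real run, pass an open file handle from the [quant_method]/ folder to `read_count_matrix` instead of the in-memory example.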