Snakemake pipeline for analysis of Hi-C data

abcsFrederick/fruitsc

FruitsC: A Snakemake Pipeline for Hi-C Data Analysis

This Snakemake pipeline was developed for genome-wide Hi-C data and is built around Juicer, a widely used Hi-C analysis tool. It was developed by the CCBR team for use on the NIH Biowulf HPC cluster and is currently under development.

More details about Juicer

The Juicer pipeline was installed and optimized for the NIH Biowulf HPC cluster by the cluster's HPC team. On top of that, the CCBR team has tuned some of the pipeline's memory settings so that it runs smoothly on very large samples. The largest input processed so far is about 135-140 GB of fastq.gz data in total (the forward and reverse fastq.gz files were each close to 70 GB).

About this FruitsC Snakemake Pipeline

This pipeline includes the following steps:

  • Check the quality of the raw fastq reads
  • Trim low-quality reads
  • Check the quality of the trimmed reads
  • Generate a QC HTML report
  • Run the Juicer Hi-C tool, which involves the following steps:
    • Generate Hi-C contact maps
    • Normalize the .hic files
    • Run the Arrowhead tool to identify TADs
    • Run the HiCCUPS tool to identify loops
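
The ordering of the steps above can be sketched as a small dependency graph. This is only an illustration, not the pipeline's actual Snakefile: the step names are hypothetical, and the dependencies between steps (for example, that Juicer runs on the trimmed reads, or that Arrowhead and HiCCUPS run after normalization) are assumptions inferred from the order of the list.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical step names. Each step maps to the set of steps it is
# ASSUMED to depend on; the real rules may be wired differently.
STEPS = {
    "raw_fastq_qc": set(),
    "trim_reads": set(),
    "trimmed_fastq_qc": {"trim_reads"},
    "qc_report": {"raw_fastq_qc", "trimmed_fastq_qc"},
    "juicer_contact_maps": {"trim_reads"},
    "juicer_normalization": {"juicer_contact_maps"},
    "arrowhead_tads": {"juicer_normalization"},
    "hiccups_loops": {"juicer_normalization"},
}


def execution_order(steps):
    """Return one valid run order in which every step follows its dependencies."""
    return list(TopologicalSorter(steps).static_order())


if __name__ == "__main__":
    for position, step in enumerate(execution_order(STEPS), start=1):
        print(position, step)
```

Snakemake resolves this kind of ordering on its own from the input and output files declared by each rule, so a sketch like this is mainly useful for reasoning about which steps can run in parallel (here, the two QC branches and the two Juicer feature callers).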

How to set up snakemake pipeline

Declarations

This work has been developed and tested solely on NIH HPC Biowulf.

Author contributions

The following members contributed to the development of this pipeline, including its source code and logic:
