Tutorial
This tutorial gives a step-by-step description of how to use wg-blimp.
We provide a small toy dataset based on a subsampled dataset containing blood and sperm methylomes. The following download contains read data in .fastq format and the reference sequence in .fasta format: https://uni-muenster.sciebo.de/s/7vpqRSEATYcvlnP. Please note that the results created by this test run are not meant for downstream analysis, and should only be seen as a feature demonstration.
After extraction you should see a folder fastq containing the read data and the file chr22.fasta containing the reference. There are 4 test samples in total: blood1, blood2, sperm1 and sperm2.
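Since every later command expects this layout, it can help to check it before continuing. The following is a minimal sketch, assuming the archive was extracted into the directory you pass in; check_inputs is a hypothetical helper and not part of wg-blimp.

```shell
# Hypothetical helper (not part of wg-blimp): check that the extracted
# tutorial data is where the later commands expect it.
check_inputs() {
    input_dir="${1:-.}"
    input_status=0
    if [ ! -d "$input_dir/fastq" ]; then
        echo "missing directory: $input_dir/fastq" >&2
        input_status=1
    fi
    if [ ! -f "$input_dir/chr22.fasta" ]; then
        echo "missing file: $input_dir/chr22.fasta" >&2
        input_status=1
    fi
    # Returns 0 only if both the read folder and the reference were found.
    return "$input_status"
}
```

Calling check_inputs without arguments checks the current working directory and prints a message for each missing item.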
The easiest way to install wg-blimp is to use Bioconda. If you do not already have a conda installation set up, please follow the instructions at https://docs.conda.io/projects/conda/en/latest/user-guide/install/.
Once conda is available, you need to add the Bioconda channels using the following commands:
conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge
For more details or help with troubleshooting the Bioconda setup, see https://bioconda.github.io/.
Once Bioconda is available, you may install wg-blimp using the following command:
conda create -n wg-blimp wg-blimp r-base==4.0.3
This will create a new conda environment containing the wg-blimp installation. The installation process requires ~3GB of space and might take a while because all tools used by wg-blimp need to be downloaded and installed. Creating a fresh environment is highly advised to prevent incompatibilities causing errors. Please note that pinning r-base to a specific version here will drastically speed up conda dependency solving.
Before you can use wg-blimp, you need to activate the conda environment that was created in the installation step above:
conda activate wg-blimp
After that, you can check the installation with the command:
wg-blimp --help
If everything was set up correctly, you should see a help message similar to:
Usage: wg-blimp [OPTIONS] COMMAND [ARGS]...
Options:
--help Show this message and exit.
Commands:
create-config Create a config YAML file for running the...
delete-all-output Remove all files generated by the pipeline.
run-shiny Start shiny GUI using configuration files for...
run-snakemake Run the Snakemake pipeline from command line.
run-snakemake-from-config Run the snakemake pipeline using a config file.
Whenever you don't know which command or parameters to use, you may use wg-blimp --help to look up the correct syntax. As displayed by the help message, there are multiple ways to run wg-blimp: you can either directly invoke the full workflow with a single command, or first create a configuration file and then run the workflow using that file.
You can use the command
wg-blimp run-snakemake --help
to get detailed information about the syntax for running the whole workflow. Make sure your current working directory contains the downloaded fastq/ directory and the chr22.fasta reference.
Before actually invoking a computationally heavy workflow, it is usually recommended to perform a dry run to see if everything is set up correctly:
wg-blimp run-snakemake --cores=8 fastq/ chr22.fasta blood1,blood2 sperm1,sperm2 results --dry-run
If you are satisfied with the steps executed, you may use
wg-blimp run-snakemake --cores=8 fastq/ chr22.fasta blood1,blood2 sperm1,sperm2 results
to actually run the whole pipeline. If everything runs without errors, a folder results containing all analysis data will show up. This folder contains the annotated DMR lists as well as methylation reports and QC data. Please note that a configuration file results/config.yaml is automatically generated so that you can see which parameters were used.
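The two-step pattern of a dry run followed by the real run can be wrapped in a small shell function. This is only a sketch: run_with_dry_run and the WG_BLIMP variable are hypothetical names introduced here, and the function assumes, as in the commands above, that --dry-run may be appended after the positional arguments.

```shell
# Hypothetical wrapper (not part of wg-blimp): perform a dry run first and
# start the real run only if the dry run succeeds.
# WG_BLIMP may be overridden to point at a different executable.
WG_BLIMP="${WG_BLIMP:-wg-blimp}"

run_with_dry_run() {
    # All arguments are passed to `wg-blimp run-snakemake` unchanged.
    "$WG_BLIMP" run-snakemake "$@" --dry-run || {
        echo "dry run failed, not starting the pipeline" >&2
        return 1
    }
    "$WG_BLIMP" run-snakemake "$@"
}
```

For the toy dataset this could be called as run_with_dry_run --cores=8 fastq/ chr22.fasta blood1,blood2 sperm1,sperm2 results. Inspecting the dry-run output by hand, as described above, remains the more careful option; the wrapper only guards against starting a run that is already known to fail.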
When dealing with analysis pipelines it is often useful to inspect and, if necessary, manually change analysis parameters. wg-blimp provides commands to first create a configuration file and then run the analysis workflow from it. To create a configuration file, you may use the command:
wg-blimp create-config --cores-per-job=4 fastq/ chr22.fasta blood1,blood2 sperm1,sperm2 results-from-config wg-blimp-config.yaml
This syntax is very similar to the wg-blimp run-snakemake command, but instead of running the whole workflow, only a file wg-blimp-config.yaml will be created. In this file, you may change parameters as you wish and run the workflow later on (see README for details on available parameters). Once you are satisfied with your configuration file, you can use
wg-blimp run-snakemake-from-config --cores=8 wg-blimp-config.yaml
to invoke the actual analysis.
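When editing the configuration by hand, it can be useful to keep the generated file for comparison. The sketch below relies only on the standard cp and diff tools; snapshot_config and show_config_changes are hypothetical helper names, and nothing here depends on which parameters the file contains.

```shell
# Hypothetical helpers (not part of wg-blimp): keep an untouched copy of the
# generated config so manual edits can be reviewed before starting the run.
snapshot_config() {
    cp "$1" "$1.orig"
}

show_config_changes() {
    changes_status=0
    diff -u "$1.orig" "$1" || changes_status=$?
    # diff exits with 1 when the files differ; only 2 or higher signals an error.
    [ "$changes_status" -le 1 ]
}
```

A possible workflow is to call snapshot_config wg-blimp-config.yaml directly after create-config, edit the file, and then run show_config_changes wg-blimp-config.yaml to review the edits before calling run-snakemake-from-config.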
Once Snakemake finishes execution, you may use wg-blimp's user interface to inspect the analysis results. To start a Shiny web server, you can use the command:
wg-blimp run-shiny results/config.yaml
Once the server is running, you can access the interface by opening http://localhost:9898 in any web browser that Shiny supports. Please note that the port Shiny listens on can be configured; for details, use wg-blimp run-shiny --help.
Once you have finished this tutorial, you may want to take a deeper look at the repository's README, which contains more in-depth explanations of pipeline parameters. If you encounter any errors or have feature requests, feel free to write a mail to mar.w@wwu.de or open an issue here on GitHub!