Skip to content

pwwang/immunopipe-example

Repository files navigation

Example for immunopipe

In this tutorial we will show you how to run the immunopipe pipeline on a small dataset of 6 patients from 3 groups: colitis (n=2), non-colitis(n=2) and control(n=2). The dataset is part of the data used in the publication below:

We are using a small subset of the data to make the tutorial run faster. The full dataset can be downloaded from Gene Expression Omnibus (GEO) GSE144469.

Download and prepare the data

The data can be downloaded and prepared by running the following commands:

# Clone the example repository
git clone https://github.com/pwwang/immunopipe-example.git

# Enter the example directory
cd immunopipe-example

# Download and prepare the data
bash prepare-data.sh
# The data from GSE144469 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE144469)
# will be downloaded and extracted into:
#
#   ./prepared-data/C1
#   ./prepared-data/C2
#   ...
#

You may also check other files in the data/ directory, especially the samples.txt file, which contains the sample information for the dataset we prepared above.

Prepare the configuration file

To run the pipeline, we need to prepare a configuration file (recommended) or pass the arguments directly via command line. Here we will use the configuration file. See also Configurations for more details.

As explained in the Configurations page, we can provide a configuration file with a minimal set of configuration items to get the pipeline running. The only required configuration item is the input file for the SampleInfo process. However, here we want to give the pipeline a different name and output directory to distinguish it from other runs with a different set of configurations.

The configuration file shall be in the TOML format. We can create a file named ImmunopipeMinimal.config.toml with the following content:

name = "ImmunopipeMinimal"
outdir = "minimal"

[SampleInfo.in]
infile = [ "data/samples.txt" ]

Run the pipeline

The easiest way to run the pipeline is to run it with docker. We can use the following command to run the pipeline with the configuration file we just created:

docker run \
    --rm -w /workdir -v .:/workdir \
    justold/immunopipe:master \
    @ImmunopipeMinimal.config.toml

or with singularity:

singularity run \
    --pwd /workdir -B .:/workdir,/tmp -c -e --writable-tmpfs \
    docker://justold/immunopipe:master \
    @ImmunopipeMinimal.config.toml

or with apptainer:

apptainer run \
    --pwd /workdir -B .:/workdir,/tmp -c -e --unsquash --writable-tmpfs \
    docker://justold/immunopipe:master \
    @ImmunopipeMinimal.config.toml

Tip

docker, singularity and apptainer commands map the current directory (.) to the /workdir directory in the container. To get the detailed directory structure in the container, please refer to the The directory structure in the container.

Tip

If you want to install and run the pipeline without docker, please refer to the Installation and Running the pipeline pages for more details.

Note

You need at least 16G memory to run the example with minimal configuration.

Check the results

With that "minimal" configuration file, only a subset of the processes will be run. See also Enabling/Disabling processes. The results will be saved in the minimal directory. You can also check the reports at minimal/REPORTS/index.html with a web browser.

You can also visit the following link to see the reports of the pipeline we just ran:

http://imp.pwwang.com/minimal/REPORTS/index.html

Next steps

You may read through the immunopipe documentation to learn more about the pipeline and how to configure it. There is also a configuration file, named Immunopipe.config.toml in the example repository, with more processes enabled. Check out the following link for the reports that you run the pipeline with the dataset prepared above. You can also add more samples to the data/samples.txt file and modify the configuration file to run the pipeline with more comphrehensive analyses.

http://imp.pwwang.com/output/REPORTS/index.html

Note

The results provided by this example configuration files are for demonstration purpose only. They are not intended to be used for any scientific analysis.

You may also want to try other routes of the pipeline with the prepared data. These routes are defined in:

  • ImmunopipeMinimalNoTCR.config.toml: The configuration for minimal analyses without scTCR-seq data.
  • ImmunopipeMinimalSupervised.config.toml: The configuration for minimal analyses with supervised clustering of T cells.
  • ImmunopipeNoTCR.config.toml: The configuration for full analyses without scTCR-seq data.
  • ImmunopipeSupervised.config.toml: The configuration for full analyses with supervised clustering of T cells.
  • ImmunopipeWSNoTCR.config.toml: The configuration for full analyses without scTCR-seq data, but with selection of T cells.

Also check out the gallery for more real-world examples.