
Contributing

Luke Zappia edited this page Sep 13, 2022 · 2 revisions

To contribute to the project please follow these steps:

  1. Set up the project as described on the Setup page (including any extra steps for developers)
  2. Create an issue for whatever you would like to contribute using the appropriate issue template
  3. Create a new branch for your contribution
  4. Make the changes required for your contribution to the new branch
  5. Submit a pull request and follow the steps on the PR template. Each PR should add one feature or fix one bug.
  6. Work with any reviewers to refine your PR
  7. Once your PR is merged, congratulations on your contribution 🎉!

The following sections give more detail about the kinds of files you may include in your contribution. Depending on what you are contributing, not all of them will be required.

Scripts

The bin/ directory contains the script files that perform the main functionality of the pipeline. There should be one script for each dataset, method or metric, named with the appropriate prefix. Scripts can be written in either Python or R (or potentially other languages). When adding a new script, please begin by copying the appropriate template (_template.py or _template.R) or a closely related existing script. Common functions are kept in _functions.py and _functions.R and should be reused whenever possible. Additional functions can be added to these files if they are required by multiple scripts.
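Copying the template can be done by hand, but as a minimal sketch it could also be scripted. The template file names below come from this page; the helper name `new_script_from_template` and the example script name `method-example.py` are hypothetical and only illustrate the idea.

```python
import shutil
from pathlib import Path

# Template names are taken from this page; the helper itself is hypothetical.
TEMPLATES = {"python": "_template.py", "r": "_template.R"}


def new_script_from_template(bin_dir, script_name, language="python"):
    """Copy the matching template in bin_dir to script_name and return the new path."""
    bin_dir = Path(bin_dir)
    template = bin_dir / TEMPLATES[language.lower()]
    target = bin_dir / script_name

    if not template.is_file():
        raise FileNotFoundError(f"Template not found: {template}")
    # The new script should keep the template's extension (.py or .R).
    if target.suffix != template.suffix:
        raise ValueError(f"Expected a '{template.suffix}' file, got '{target.name}'")
    # Never overwrite a script that already exists.
    if target.exists():
        raise FileExistsError(f"Refusing to overwrite existing script: {target}")

    shutil.copyfile(template, target)
    return target
```

For example, `new_script_from_template("bin", "method-example.py")` would create bin/method-example.py from bin/_template.py, assuming the script is run from the repository root. The prefix in the example name is an assumption; use the one that matches your dataset, method or metric.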

Script files can be styled by running ./style_bin.sh. Styling is optional, and the script requires additional dependencies that you may not have installed.

Environments

Each script is executed in a conda environment defined in the envs/ directory. Each environment contains the dependencies for a particular tool and may be reused by multiple scripts (for example, the scanpy.yml environment is used by several steps). If you require a tool that does not yet have an environment, create a new one by copying the _template.yml file. You can add dependencies or remove unneeded ones, but to maintain consistency across environments the versions given in the template should not be changed (with some exceptions).
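One way to check that a new environment keeps the template's versions is to compare the pinned dependencies in the two files. The sketch below is an assumption about how that could be done, not part of the project: it uses only the standard library, reads simple `- name=version` lines rather than parsing YAML fully, and the function names are hypothetical.

```python
import re

# Matches exact pins such as "python=3.9" or "scanpy==1.9"; ranges like ">=" are ignored.
PIN_PATTERN = re.compile(r"([A-Za-z0-9_.\-]+)\s*=+\s*(\S+)$")


def pinned_versions(env_text):
    """Return {package: version} for list entries of the form '- name=version'."""
    pins = {}
    for line in env_text.splitlines():
        entry = line.strip()
        if not entry.startswith("- "):
            continue
        # Drop any trailing comment before matching.
        spec = entry[2:].split("#", 1)[0].strip()
        match = PIN_PATTERN.match(spec)
        if match:
            pins[match.group(1)] = match.group(2)
    return pins


def changed_pins(template_text, env_text):
    """Return {package: (template_version, env_version)} where the two files differ."""
    template = pinned_versions(template_text)
    env = pinned_versions(env_text)
    return {
        name: (template[name], env[name])
        for name in template
        if name in env and env[name] != template[name]
    }
```

Pass the contents of envs/_template.yml and of the new environment file to `changed_pins`. Because some exceptions are allowed, any differences it reports are points to raise in review rather than automatic errors.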

Workflows

Any new steps will need to be added to the Nextflow pipeline. This requires more knowledge of how the pipeline works and can be done as part of the review process for the PR. If you are comfortable working with Nextflow, you can edit the appropriate file in workflows/.

Configuration

New datasets, methods or metrics will not be run unless they are included in conf/full-analysis.yml. This is particularly important for datasets, because this file is where the names of the columns containing batch and label information are recorded. If you are familiar with YAML, this file should be easy to edit; if not, it can be updated as part of the PR review.
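Before opening a PR, it can help to confirm that a new dataset entry records both column names. This page does not describe the layout of conf/full-analysis.yml, so the sketch below works on a plain dictionary, and the key names `batch_col` and `label_col` are hypothetical placeholders; substitute whatever keys the file actually uses.

```python
# Hypothetical key names; check conf/full-analysis.yml for the real ones.
REQUIRED_DATASET_KEYS = ("batch_col", "label_col")


def missing_dataset_fields(entry):
    """Return the required keys that are absent or empty in a dataset entry."""
    return [
        key
        for key in REQUIRED_DATASET_KEYS
        if not str(entry.get(key) or "").strip()
    ]
```

An empty list means both column names are present; any names returned are the fields still to be filled in.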
