Preprocessing of raw SARS-CoV-2 reads

The raw reads available so far were generated from bronchoalveolar lavage fluid (BALF) and are metagenomic in nature: they contain human reads, reads from potential bacterial co-infections, and true SARS-CoV-2 reads.

What's the point?

Assess read quality, remove adapters, and remove reads that map to the human genome.

The outline

Illumina and Oxford Nanopore reads are pulled from the NCBI SRA (links to SRA accessions are available here). The two read types are then processed separately, as described in the workflow section.

Inputs

Only SRA accessions are required for this analysis. The described analysis was performed with all SARS-CoV-2 SRA accessions available as of Feb 20, 2020:

Illumina reads
```
SRR10903401
SRR10903402
SRR10971381
```
Oxford Nanopore reads
```
SRR10948550
SRR10948474
SRR10902284
```
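
Because the accessions are the only input, it can be convenient to keep them in plain-text files, one per sequencing platform. The following is a minimal sketch, assuming the download step accepts a list with one accession per line; the function name `write_accession_lists` and the output file names are hypothetical and not part of the workflow.

```python
from pathlib import Path

# Accessions copied from the two lists above, grouped by platform.
ACCESSIONS = {
    "illumina": ["SRR10903401", "SRR10903402", "SRR10971381"],
    "nanopore": ["SRR10948550", "SRR10948474", "SRR10902284"],
}


def write_accession_lists(out_dir="."):
    """Write one text file per platform, one accession per line.

    Returns a dict mapping platform name to the path that was written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = {}
    for platform, accessions in ACCESSIONS.items():
        # Hypothetical file name; any name works for upload.
        path = out_dir / f"{platform}_accessions.txt"
        path.write_text("\n".join(accessions) + "\n")
        paths[platform] = path
    return paths


if __name__ == "__main__":
    for platform, path in write_accession_lists("accession_lists").items():
        print(platform, path)
```

Keeping the two platforms in separate files mirrors the way the Illumina and Oxford Nanopore reads are processed separately.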

Outputs

This workflow produces three outputs that are used in two subsequent analyses:

| # | Output | Used in |
|---|--------|---------|
| 1 | A combined set of adapter-free Illumina reads without human contamination | Assembly |
| 2 | A combined set of Oxford Nanopore reads without human contamination | Assembly |
| 3 | A collection of adapter-free Illumina reads from which human reads have not been removed | Variation detection |
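
Outputs 1 and 2 exclude human reads. One common way to do this, assumed here for illustration rather than taken from the workflow itself, is to map the reads against a human reference and keep only the records that the SAM FLAG field marks as unmapped. The sketch below shows that selection logic in plain Python; the function names are hypothetical, and only the FLAG bit values come from the SAM format.

```python
# SAM FLAG bits (defined by the SAM format).
PAIRED = 0x1         # read is part of a pair
UNMAPPED = 0x4       # this read did not map
MATE_UNMAPPED = 0x8  # the read's mate did not map


def is_non_human(flag):
    """Return True if a read mapped against a human reference should be kept.

    Single-end reads are kept when unmapped. Paired reads are kept only
    when both the read and its mate are unmapped.
    """
    if not flag & UNMAPPED:
        return False
    if flag & PAIRED:
        return bool(flag & MATE_UNMAPPED)
    return True


def filter_sam_lines(lines):
    """Yield SAM header lines plus alignment records that pass is_non_human."""
    for line in lines:
        if line.startswith("@"):
            yield line  # header lines are passed through unchanged
            continue
        flag = int(line.split("\t")[1])  # FLAG is the second SAM column
        if is_non_human(flag):
            yield line
```

Requiring both mates to be unmapped is the stricter choice for paired data: a pair is discarded as soon as either read maps to the human reference.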

The history and the workflow

A Galaxy workspace (history) containing the most current analysis can be imported from here.

The publicly accessible workflow can be downloaded and installed on any Galaxy instance. It contains version information for all tools used in this analysis.

The workflow performs the following steps: