From d7b19e3b6678821e8c1cd6f4aef00b9640daa31d Mon Sep 17 00:00:00 2001
From: Taylor Reiter
Date: Tue, 23 Aug 2022 07:26:13 -0400
Subject: [PATCH] column header typo

Co-authored-by: Adair Borges <68403591+borgesadair1@users.noreply.github.com>
---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 4b7837e..baac250 100644
--- a/README.md
+++ b/README.md
@@ -1,7 +1,7 @@
 # Identifying sequences that are in a metatranscriptome but not in a metagenome
 
 This repository curates a set of paired metagenomes and metatranscriptomes and provides a pipeline to rapidly identify the fraction of sequences in a metatranscriptome that are not in a metagenome.
-The pipeline is shaped around a metadata file, 'inputs/metadata-paired-mgx-mtx.tsv', that contains a sample name (`sample_name`), metagenome SRA run accession (`mgx_run_accession`; `SRR*`, `ERR*`, `DRR*`), metatranscriptome SRA run accession (`mtxx_run_accession`), and a sample type (`sample_type`).
+The pipeline is shaped around a metadata file, 'inputs/metadata-paired-mgx-mtx.tsv', that contains a sample name (`sample_name`), metagenome SRA run accession (`mgx_run_accession`; `SRR*`, `ERR*`, `DRR*`), metatranscriptome SRA run accession (`mtx_run_accession`), and a sample type (`sample_type`).
 Using the run accessions, it downloads the sequencing data from the SRA and generates a [FracMinHash sketch](https://www.biorxiv.org/content/10.1101/2022.01.11.475838v2.abstract) of each run.
 Then, it uses the paired information encoded in the metadata table to subtract the metagenome sketch from the metatranscriptome sketch.
 This produces an estimate of the fraction metatranscriptome sequences not found in the paired metagenome.