Commit

Merge branch 'make_metadata' of github.com:Arcadia-Science/2022-mtx-not-in-mgx-pairs into make_metadata
taylorreiter committed Aug 23, 2022
2 parents a7004d8 + d7b19e3 commit c0519af
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion README.md
@@ -1,7 +1,7 @@
# Identifying sequences that are in a metatranscriptome but not in a metagenome

This repository curates a set of paired metagenomes and metatranscriptomes and provides a pipeline to rapidly identify the fraction of sequences in a metatranscriptome that are not in a metagenome.
-The pipeline is shaped around a metadata file, 'inputs/metadata-paired-mgx-mtx.tsv', that contains a sample name (`sample_name`), metagenome SRA run accession (`mgx_run_accession`; `SRR*`, `ERR*`, `DRR*`), metatranscriptome SRA run accession (`mtxx_run_accession`), and a sample type (`sample_type`).
+The pipeline is shaped around a metadata file, 'inputs/metadata-paired-mgx-mtx.tsv', that contains a sample name (`sample_name`), metagenome SRA run accession (`mgx_run_accession`; `SRR*`, `ERR*`, `DRR*`), metatranscriptome SRA run accession (`mtx_run_accession`), and a sample type (`sample_type`).
Using the run accessions, it downloads the sequencing data from the SRA and generates a [FracMinHash sketch](https://www.biorxiv.org/content/10.1101/2022.01.11.475838v2.abstract) of each run.
Then, it uses the paired information encoded in the metadata table to subtract the metagenome sketch from the metatranscriptome sketch.
This produces an estimate of the fraction of metatranscriptome sequences not found in the paired metagenome.
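
The subtraction step described above can be sketched with plain Python sets standing in for FracMinHash sketches. This is a minimal illustration under stated assumptions, not the repository's pipeline code: the function name `fraction_not_in_mgx` and the toy hash values are hypothetical, and abundance weighting is ignored.

```python
def fraction_not_in_mgx(mtx_hashes, mgx_hashes):
    """Return the fraction of metatranscriptome hashes absent from the paired metagenome."""
    mtx = set(mtx_hashes)
    if not mtx:
        # An empty metatranscriptome sketch has nothing to subtract from.
        return 0.0
    # Set difference: hashes observed in the metatranscriptome only.
    unique_to_mtx = mtx - set(mgx_hashes)
    return len(unique_to_mtx) / len(mtx)


# Toy hash values (hypothetical); real sketches hold 64-bit hashes of k-mers.
mtx_sketch = {11, 23, 42, 57, 98}
mgx_sketch = {11, 42, 77, 98}

fraction = fraction_not_in_mgx(mtx_sketch, mgx_sketch)
print(f"{fraction:.2f}")
```

Here two of the five metatranscriptome hashes are missing from the metagenome, so the estimated fraction is 0.4. Because the estimate is computed on sketches rather than full read sets, it approximates the k-mer content unique to the metatranscriptome.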