From 32f7690a661aaf744575c40d69bd9eda8c8fbc14 Mon Sep 17 00:00:00 2001
From: Nick Popp
Date: Thu, 23 May 2024 12:03:22 -0700
Subject: [PATCH] update readme, add unzip script

---
 Readme.md            | 12 ++++++++++++
 unzip_input_files.sh | 14 ++++++++++++++
 2 files changed, 26 insertions(+)
 create mode 100644 unzip_input_files.sh

diff --git a/Readme.md b/Readme.md
index 5952123..8946019 100644
--- a/Readme.md
+++ b/Readme.md
@@ -5,3 +5,15 @@ R 4.0.0 or greater
 
 ## Purpose
 This repository houses all of the analysis and figures for the Multiplexed Assay of Variant Effect (MAVE) in this paper called Compartmentalized Self-Replication Deep Mutational Scanning (CSR-DMS). It includes an R script that takes processed sequencing data from both short- and long-read sequencing and calculates functional scores for nearly all missense, nonsense, synonymous, and single amino acid deletions in our designed TFO polymerase. It also contains scripts that were used remotely to process our raw Illumina and PacBio sequencing data. Figure panels and analysis products are available in the outputs folder.
+
+## Instructions for use
+
+1. Clone or fork this GitHub repository.
+
+2. Navigate to the downloaded folder. All input and output data should already be present. To run the analysis yourself, first run this shell script to unpack the input data:
+
+`sh unzip_input_files.sh`
+
+3. Open the 221224_TFO_pacbio_subassembly.Rmd document in R/RStudio. This will generate the figures and tables from the PacBio long-read sequencing data that was used to isolate and sequence DNA variants and their respective degenerate barcodes.
+
+4. Next, open the 230227_CSR_scoring.Rmd document in R/RStudio. This will generate the functional scores from the MAVE selection assay (CSR-DMS) and plot heatmaps and other analyses from these data.
diff --git a/unzip_input_files.sh b/unzip_input_files.sh
new file mode 100644
index 0000000..8d20ce8
--- /dev/null
+++ b/unzip_input_files.sh
@@ -0,0 +1,14 @@
+#!/bin/bash
+
+## this script takes input files that were compressed for storage and distribution and unzips them
+
+## requirements:
+## none
+
+## standard run command: sh unzip_input_files.sh
+
+## ensure errors stop the process instead of powering through
+set -e
+
+## unzip all files
+find . -name "*.gz" -exec gunzip -fv {} \;
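
Review note on unzip_input_files.sh: `gunzip -f` overwrites any existing output file and removes each `.gz` once it has been decompressed, so a truncated download is only noticed partway through the run. A minimal sketch of an optional pre-flight check follows, assuming POSIX `sh`, `find`, and `gzip` are available. The function name `check_gz_archives` is hypothetical and not part of this patch. It relies on `gzip -t`, which tests archive integrity without writing any files.

```shell
## hypothetical helper, not part of the commit above:
## report every .gz archive under a folder that fails an integrity test

check_gz_archives() {
    ## folder to scan; defaults to the current directory
    dir="${1:-.}"

    ## gzip -t exits non-zero for a damaged archive, so "! -exec ... -print"
    ## prints only the files that fail the test; nothing is decompressed
    corrupt=$(find "$dir" -name "*.gz" ! -exec gzip -t {} \; -print 2>/dev/null)

    if [ -n "$corrupt" ]; then
        echo "corrupt archives:"
        echo "$corrupt"
        return 1
    fi

    echo "all archives passed gzip -t"
}

## example use (assumes the function has been sourced into the shell):
##   check_gz_archives . && sh unzip_input_files.sh
```

Because the check only reads the archives, it can be run any number of times before the unzip step without changing the repository contents.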