update treetime docs with jupyter notebook

ktmeaton · Jul 20, 2020 · 9a63c9b
1 parent b6996a1
Showing 1 changed file with 17 additions and 58 deletions.
diff --git a/docs/exhibit/exhibit_main.rst b/docs/exhibit/exhibit_main.rst
@@ -9,15 +9,15 @@ Code Installation
 Clone Repository
 ^^^^^^^^^^^^^^^^
 
-::
+**Shell**::
 
   git clone https://github.com/ktmeaton/plague-phylogeography.git
   cd plague-phylogeography
   conda activate plague-phylogeography-0.1.4dev
 
 Install some accessory tools that are being tested.
 
-::
+**Shell**::
 
   conda install geopy
   conda install cutadapt
@@ -29,7 +29,7 @@ Database
 Create
 ^^^^^^
 
-::
+**Shell**::
 
   nextflow run ktmeaton/plague-phylogeography \
     --ncbimeta_create config/ncbimeta.yaml \
@@ -77,7 +77,7 @@ Curate metadata with a DB Browser (SQLite). Examples of modifying the BioSampleC
 Update, Annotate, Join
 ^^^^^^^^^^^^^^^^^^^^^^
 
-::
+**Shell**::
 
   nextflow run ktmeaton/plague-phylogeography \
    --ncbimeta_update config/ncbimeta.yaml \
@@ -95,7 +95,7 @@ Verify Samples
 
 Select records from the database that are marked as "KEEP: Assembly".
 
-::
+**Shell**::
 
   nextflow run ktmeaton/plague-phylogeography \
    --sqlite_select_command_asm "\"SELECT AssemblyFTPGenbank FROM Master WHERE (BioSampleComment LIKE '%KEEP%Assembly%')\"" \
@@ -109,15 +109,15 @@ Select records from the database that are marked as "KEEP: Assembly".
 
 Check that there are 475 assemblies to be downloaded.
 
-::
+**Shell**::
 
      wc -l results/sqlite_import/assembly_for_download.txt
 
 
 Run Pipeline (With Outgroup)
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
-::
+**Shell**::
 
   nextflow run ktmeaton/plague-phylogeography \
     --outdir Assembly_Modern_Outgroup \
@@ -130,7 +130,7 @@ Run Pipeline (With Outgroup)
 Run Pipeline (Without Outgroup)
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
-::
+**Shell**::
 
   nextflow run ktmeaton/plague-phylogeography \
     --outdir Assembly_Modern \
@@ -145,11 +145,11 @@ Run Pipeline (Without Outgroup)
    (latest resume id: 9112a035-a628-4f9d-8955-faa7732a1b73)
 
 Ancient Raw Data Analysis
-^^^^^^^^^^^^^^^^^^^^^^^^^
+-------------------------
 
 Prep tsv input from ktmeaton/plague-phylogeography, select only EAGER Ancient samples
 
-::
+**Shell**::
 
   nextflow run ktmeaton/plague-phylogeography \
     --outdir EAGER_Ancient \
@@ -162,7 +162,7 @@ Prep tsv input from ktmeaton/plague-phylogeography, select only EAGER Ancient sa
 
 Download all samples, run through EAGER
 
-::
+**Shell**::
 
   nextflow run ktmeaton/plague-phylogeography \
     --outdir EAGER_Ancient \
@@ -174,7 +174,7 @@ Download all samples, run through EAGER
 
 SAMN00715800: Split after base 75 into two separate files to maintain proper paired-end format.
 
-::
+**Shell**::
 
   mv EAGER_Ancient/sra_download/fastq/single/${runAcc}_1.fastq.gz \
     EAGER_Ancient/sra_download/fastq/single/${runAcc}_unsplit.fastq.gz;
@@ -195,55 +195,14 @@ SAMN00715800: Split after base 75 into two separate files to maintain proper pai
 
 Remove original unsplit file
 
-::
+**Shell**::
 
    rm EAGER_Ancient/sra_download/fastq/single/SRR341961_unsplit.fastq.gz
 
-| Fix the metadata in the EAGER tsv input file to now be paired end, (optional: mark full UDG!
+| Fix the metadata in the EAGER tsv input file to now be paired end, (optional: mark full UDG!)
 | Rerun EAGER pipeline
 
-Nextstrain
-----------
+Treetime
+------------
 
-Run the nextstrain and treetime section of the pipeline.
-
-::
-
-  nextflow run ktmeaton/plague-phylogeography \
-    --outdir Assembly_Modern \
-    --sqlite_select_command_asm "\"SELECT AssemblyFTPGenbank FROM Master WHERE (BioSampleComment LIKE '%KEEP%Assembly%')\"" \
-    --max_datasets_assembly 500 \
-    --skip_sra_download \
-    --skip_outgroup_download \
-    --iqtree_branch_support \
-    --iqtree_outgroup GCA_000323485.1_ASM32348v1_genomic,GCA_000323845.1_ASM32384v1_genomic \
-    --treetime \
-    -resume
-
-   (latest resume id: 9112a035-a628-4f9d-8955-faa7732a1b73)
-
-Regression Plot
-^^^^^^^^^^^^^^^
-
-**Python**::
-
-  from Bio import Phylo
-  outdir = "Assembly_Modern/nextstrain/treetime_clock/"
-  PY_88 = "GCA_000269405.1_ASM26940v1_genomic"
-  MG05_1020 = "GCA_000169635.1_ASM16963v1_genomic"
-  India195 = "GCA_000182505.1_ASM18250v1_genomic"
-
-  tree = Phylo.read(outdir + divergence_tree.nexus", "nexus")
-  ori_subtree = tree.common_ancestor(PY_88, MG05_1020, India195)
-  Phylo.write(ori_subtree, open(outdir + "ori_subtree.nwk", "w"), "newick")
-
-**Shell Script**::
-
-  treetime clock \
-    --tree $project/nextstrain/treetime_clock/ori_subtree.nwk \
-    --dates $project/nextstrain/metadata_nextstrain_geocode_state.tsv \
-    --date-column BioSampleCollectionDate \
-    --aln $project/snippy_multi/snippy-core.full_CHROM.filter0.fasta \
-    --clock-filter 3 \
-    --keep-root \
-    --outdir $project/nextstrain/treetime_clock/ori_subtree/
+Treetime scripts are in development as Jupyter Notebooks.
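
Note on the SAMN00715800 step in the diff above: the docs say the reads are split after base 75 into two files to restore paired-end format, but the commands that do the split fall in an elided part of the hunk. The following is only a minimal Python sketch of that idea, not the repository's method. It assumes each read in the unsplit file is the forward and reverse mates concatenated, and that the file is gzip-compressed FASTQ with four lines per record. The function names and the `split_at` parameter are hypothetical.

```python
import gzip
from itertools import islice


def split_fastq_record(record, split_at=75):
    """Split one 4-line FASTQ record (header, seq, plus, qual) at `split_at`.

    Returns (forward, reverse): forward keeps the first `split_at` bases and
    qualities, reverse keeps the remainder. Both keep the original header.
    """
    header, seq, plus, qual = record
    forward = (header, seq[:split_at], plus, qual[:split_at])
    reverse = (header, seq[split_at:], plus, qual[split_at:])
    return forward, reverse


def split_fastq(unsplit_path, forward_path, reverse_path, split_at=75):
    """Split every read of a gzipped FASTQ file into two gzipped FASTQ files."""
    with gzip.open(unsplit_path, "rt") as src, \
         gzip.open(forward_path, "wt") as fwd, \
         gzip.open(reverse_path, "wt") as rev:
        while True:
            # Read the next record: four consecutive lines.
            record = [line.rstrip("\n") for line in islice(src, 4)]
            if len(record) < 4:
                break  # end of file (or a truncated trailing record)
            forward, reverse = split_fastq_record(record, split_at)
            fwd.write("\n".join(forward) + "\n")
            rev.write("\n".join(reverse) + "\n")
```

With the paths used in the diff, a call might look like `split_fastq("..._unsplit.fastq.gz", "..._1.fastq.gz", "..._2.fastq.gz")`. The sketch reuses the same header for both mates; whether the downstream EAGER run needs `/1` and `/2` suffixes is not stated in the docs.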