V0.9.2 dev #36

Merged May 9, 2019
48 commits
78b9d04
Add initial reworked heatmap script
samnooij Apr 30, 2019
7642f96
Add dataframe creation and colour argument
samnooij Apr 30, 2019
da84ec0
Add functions to create useful information chunks
samnooij Apr 30, 2019
4a2646c
Add statistics output
samnooij May 1, 2019
31daf2e
active dev branch for v0.9.2
florianzwagemaker May 1, 2019
53645b6
flexible port from yaml file pipeline_parameters
florianzwagemaker May 1, 2019
7d8cb7d
First addition of working with unique run IDs
florianzwagemaker May 3, 2019
bd16b20
succesfully binding unique-identifier to Jupyter
florianzwagemaker May 6, 2019
548068e
Minor fix in wrapper and fix for issue #32
florianzwagemaker May 6, 2019
cbb7b92
executable files in bin/ (issue #31)
florianzwagemaker May 6, 2019
13d3392
added explanation comment for r-base in conda env
florianzwagemaker May 6, 2019
eabde9b
removed creation of symlink from start_nginx.sh
florianzwagemaker May 6, 2019
b24fa71
correct links in individual html files to igvjs
florianzwagemaker May 6, 2019
cf614a5
variables.yaml to bin/, files in bina are exec
florianzwagemaker May 7, 2019
d4299de
set version numbers with git tag
florianzwagemaker May 7, 2019
bbfa573
Add heatmap drawing function
samnooij May 7, 2019
53c8370
First step in typingtool work-around. Script made, still need to upda…
DennisSchmitz May 7, 2019
8a2f0e9
Added automatic IGVjs installation on user consent
florianzwagemaker May 7, 2019
d80547b
IGVjs installation on user confirm
florianzwagemaker May 7, 2019
4b4a7a8
Complete bugfix with missing taxa, #12
samnooij May 7, 2019
61d734b
Replace original script
samnooij May 7, 2019
28334ad
Adept Snakefile to new draw_heatmaps
samnooij May 7, 2019
f8ccf1a
Merge commit '28334ad202f7f537d62f7860a72e14457a2e6751' into v0.9.2-dev
samnooij May 7, 2019
1f7792f
Parsing of HAV fa complete
DennisSchmitz May 8, 2019
82dd712
HAV typingtool integrated, tested and approved
DennisSchmitz May 8, 2019
5e2ae50
solved merge conflicts
florianzwagemaker May 8, 2019
0c7aff2
Merge pull request #34 from DennisSchmitz/Nginx-igvjs
florianzwagemaker May 8, 2019
49ed611
solved minor changes because of merge conflict
florianzwagemaker May 8, 2019
197dbff
cleanup of wrapper
florianzwagemaker May 8, 2019
489d162
**untested** compatibility for draw_heatmaps
florianzwagemaker May 8, 2019
766f871
typingtool script complete, also included HEV
DennisSchmitz May 8, 2019
d340380
fix for snakemake and draw_heatmaps
florianzwagemaker May 8, 2019
21df976
Remove TT rule from snakefile, removed redundant config file parameters
DennisSchmitz May 8, 2019
cbe12b8
TT stabily integrated and tested
DennisSchmitz May 8, 2019
ed0a6d5
Compatibility of conda installation for Ubuntu
florianzwagemaker May 9, 2019
dc98233
Improved explanation of onsucces action
florianzwagemaker May 9, 2019
07cc9ac
cleaned up a bit
DennisSchmitz May 9, 2019
348cd29
clarified comments
DennisSchmitz May 9, 2019
75f5c67
added typing-tool to wrapper
DennisSchmitz May 9, 2019
21e92a5
removed redundant comment line
DennisSchmitz May 9, 2019
7e212e5
added checker for TT arguments
DennisSchmitz May 9, 2019
5a1ddfd
moving variables from bin to data folder
florianzwagemaker May 9, 2019
8828aea
Merge branch 'v0.9.2-dev' into TT-fix_ORF-info
DennisSchmitz May 9, 2019
c1b8691
Merge pull request #35 from DennisSchmitz/TT-fix_ORF-info
DennisSchmitz May 9, 2019
d43686f
stable, tested, ready for merge with master
DennisSchmitz May 9, 2019
d13c13f
:)
florianzwagemaker May 9, 2019
ed5e277
Merge branch 'v0.9.2-dev' of https://github.com/DennisSchmitz/Jovian …
florianzwagemaker May 9, 2019
da3874b
re-added executables in bin
florianzwagemaker May 9, 2019
3 changes: 2 additions & 1 deletion .gitignore
@@ -14,4 +14,5 @@ DELME*
LSF_Snakemake.*
private_*
.github/
.vscode/
.vscode/
data/variables.yaml
89 changes: 76 additions & 13 deletions Notebook_report.ipynb
@@ -273,10 +273,19 @@
"metadata": {},
"source": [
"___\n",
"## Virus typing tool output\n",
"## Virus typing results\n",
"___"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The virus typing outputs are currently not automatically generated via Jovian due to overloading and crashing the web-service (further information can be found [here](https://github.com/DennisSchmitz/Jovian/issues/29)).\n",
"\n",
"A long-term solution is being worked on, as a work-around we've included scripts to generate these data separately. However, we kindly ask you to <u>**use this sparingly**</u> as to not overload and break the web-service. Instructions on how to generate these data can be found [here](https://github.com/DennisSchmitz/Jovian/issues/29)."
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -295,15 +304,13 @@
},
"outputs": [],
"source": [
"if os.path.exists(\"results/all_NoV-TT.tsv\") and os.path.getsize(\"results/all_NoV-TT.tsv\") > 0:\n",
" NoV_TT_df = pd.read_csv(\"results/all_NoV-TT.tsv\" , sep = \",\")\n",
"elif os.path.exists(\"results/all_NoV-TT.tsv\") and os.path.getsize(\"results/all_NoV-TT.tsv\") == 0:\n",
"if os.path.exists(\"results/all_NoV-TT.csv\") and os.path.getsize(\"results/all_NoV-TT.csv\") > 0:\n",
" NoV_TT_df = pd.read_csv(\"results/all_NoV-TT.csv\" , sep = \",\")\n",
"elif os.path.exists(\"results/all_NoV-TT.csv\") and os.path.getsize(\"results/all_NoV-TT.csv\") == 0:\n",
" print(\"No viral scaffolds with species equal to \\\"Norwalk virus\\\" were found in this dataset.\")\n",
" NoV_TT_df = pd.DataFrame({'NA' : [\"No\", \"Norwalk virus\", \"species\", \"scaffolds\", \"found\"]})\n",
"else:\n",
" print(\"The file \\\"results/all_NoV-TT.tsv\\\" does not exist. Something went wrong. Please see the logfiles below:\")\n",
" print(\"\\t\\\"logs/Viral_typing_[sample_name].log\\\"\")\n",
" print(\"\\t\\\"logs/Concat_TT_files.log\\\"\")\n",
" print(\"The file \\\"results/all_NoV-TT.csv\\\" does not exist. If you want this information, please see https://github.com/DennisSchmitz/Jovian/issues/29 for instructions.\")\n",
" NoV_TT_df = pd.DataFrame({'Error' : [\"Please\", \"see\", \"error\", \"message\", \"above\"]})\n",
"\n",
"qgrid.show_grid(NoV_TT_df, show_toolbar=False, grid_options=grid_options)"
@@ -326,20 +333,76 @@
},
"outputs": [],
"source": [
"if os.path.exists(\"results/all_EV-TT.tsv\") and os.path.getsize(\"results/all_EV-TT.tsv\") > 0:\n",
" EV_TT_df = pd.read_csv(\"results/all_EV-TT.tsv\" , sep = \",\")\n",
"elif os.path.exists(\"results/all_EV-TT.tsv\") and os.path.getsize(\"results/all_EV-TT.tsv\") == 0:\n",
"if os.path.exists(\"results/all_EV-TT.csv\") and os.path.getsize(\"results/all_EV-TT.csv\") > 0:\n",
" EV_TT_df = pd.read_csv(\"results/all_EV-TT.csv\" , sep = \",\")\n",
"elif os.path.exists(\"results/all_EV-TT.csv\") and os.path.getsize(\"results/all_EV-TT.csv\") == 0:\n",
" print(\"No viral scaffolds with family equal to \\\"Picornaviridae\\\" were found in this dataset.\")\n",
" EV_TT_df = pd.DataFrame({'NA' : [\"No\", \"Picornaviridae\", \"family\", \"scaffolds\", \"found\"]})\n",
"else:\n",
" print(\"The file \\\"results/all_EV-TT.tsv\\\" does not exist. Something went wrong. Please see the logfiles below:\")\n",
" print(\"\\t\\\"logs/Viral_typing_[sample_name].log\\\"\")\n",
" print(\"\\t\\\"logs/Concat_TT_files.log\\\"\")\n",
" print(\"The file \\\"results/all_EV-TT.csv\\\" does not exist. If you want this information, please see https://github.com/DennisSchmitz/Jovian/issues/29 for instructions.\")\n",
" EV_TT_df = pd.DataFrame({'Error' : [\"Please\", \"see\", \"error\", \"message\", \"above\"]})\n",
"\n",
"qgrid.show_grid(EV_TT_df, show_toolbar=False, grid_options=grid_options)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Hepatitis A typing tool output: \n",
"[Link to the hepatatis A typing tool](https://www.rivm.nl/mpf/typingtool/hav/) "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"deletable": false,
"editable": false
},
"outputs": [],
"source": [
"if os.path.exists(\"results/all_HAV-TT.csv\") and os.path.getsize(\"results/all_HAV-TT.csv\") > 0:\n",
" HAV_TT_df = pd.read_csv(\"results/all_HAV-TT.csv\" , sep = \",\")\n",
"elif os.path.exists(\"results/all_HAV-TT.csv\") and os.path.getsize(\"results/all_HAV-TT.csv\") == 0:\n",
" print(\"No viral scaffolds with genus equal to \\\"Hepatovirus\\\" were found in this dataset.\")\n",
" HAV_TT_df = pd.DataFrame({'NA' : [\"No\", \"Hepatovirus\", \"genus\", \"scaffolds\", \"found\"]})\n",
"else:\n",
" print(\"The file \\\"results/all_HAV-TT.csv\\\" does not exist. If you want this information, please see https://github.com/DennisSchmitz/Jovian/issues/29 for instructions.\")\n",
" HAV_TT_df = pd.DataFrame({'Error' : [\"Please\", \"see\", \"error\", \"message\", \"above\"]})\n",
"\n",
"qgrid.show_grid(HAV_TT_df, show_toolbar=False, grid_options=grid_options)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Hepatitis E typing tool output: \n",
"[Link to the hepatatis E typing tool](https://www.rivm.nl/mpf/typingtool/hev/) "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"deletable": false,
"editable": false
},
"outputs": [],
"source": [
"if os.path.exists(\"results/all_HEV-TT.csv\") and os.path.getsize(\"results/all_HEV-TT.csv\") > 0:\n",
" HEV_TT_df = pd.read_csv(\"results/all_HEV-TT.csv\" , sep = \",\")\n",
"elif os.path.exists(\"results/all_HEV-TT.csv\") and os.path.getsize(\"results/all_HEV-TT.csv\") == 0:\n",
" print(\"No viral scaffolds with genus equal to \\\"Orthohepevirus\\\" were found in this dataset.\")\n",
" HEV_TT_df = pd.DataFrame({'NA' : [\"No\", \"Orthohepevirus\", \"genus\", \"scaffolds\", \"found\"]})\n",
"else:\n",
" print(\"The file \\\"results/all_HEV-TT.csv\\\" does not exist. If you want this information, please see https://github.com/DennisSchmitz/Jovian/issues/29 for instructions.\")\n",
" HEV_TT_df = pd.DataFrame({'Error' : [\"Please\", \"see\", \"error\", \"message\", \"above\"]})\n",
"\n",
"qgrid.show_grid(HEV_TT_df, show_toolbar=False, grid_options=grid_options)"
]
},
{
"cell_type": "markdown",
"metadata": {},
119 changes: 30 additions & 89 deletions Snakefile
@@ -1,5 +1,5 @@
"""
Authors: Dennis Schmitz, Sam Nooij, Robert Verhagen, Thierry Janssens, Jeroen Cremer, Mark Kroon
Authors: Dennis Schmitz, Sam Nooij, Robert Verhagen, Thierry Janssens, Jeroen Cremer, Florian Zwagemaker, Mark Kroon, Erwin van Wieringen, Annelies Kroneman, Harry Vennema, Marion Koopmans
Organisation: Rijksinstituut voor Volksgezondheid en Milieu (RIVM)
Department: Virology - Emerging and Endemic Viruses (EEV)
Date: 23-08-2018
@@ -14,6 +14,7 @@ Changelog, examples, installation guide and explanation on:
shell.executable("/bin/bash")

configfile: "profile/pipeline_parameters.yaml"
configfile: "data/variables.yaml"

import pprint
import os
@@ -32,7 +33,6 @@ with open(config["sample_sheet"]) as sample_sheet_file:
##### The `onstart` checker codeblock #####
#################################################################################

# Need to add checker for ~/.ncbirc --> , config["databases"]["BLAST_ncbirc"] --> doesn't work
onstart:
try:
print("Checking if all specified files are accessible...")
@@ -59,9 +59,7 @@ localrules:
Generate_index_html,
Generate_IGVjs_html_file,
quantify_output,
Viral_typing,
Concat_files,
Concat_TT_files,
Concat_filtered_SNPs,

rule all:
@@ -76,12 +74,11 @@ rule all:
expand("data/scaffolds_filtered/{sample}_IGVjs.html", sample = SAMPLES), # IGVjs html's
expand("data/taxonomic_classification/{sample}.blastn", sample = SAMPLES), # MegablastN output
expand("data/tables/{sample}_{extension}", sample = SAMPLES, extension = [ 'taxClassified.tsv', 'taxUnclassified.tsv', 'virusHost.tsv' ]), # Tab seperated tables with merged data
expand("data/virus_typing_tables/{sample}_{virus}.{extension}", sample = SAMPLES, virus = [ 'NoV', 'EV' ], extension = [ 'fa', 'xml', 'csv' ]), # Virus typingtool output tables
expand("results/{file}", file = [ 'all_taxClassified.tsv', 'all_taxUnclassified.tsv', 'all_virusHost.tsv', 'all_NoV-TT.tsv', 'all_EV-TT.tsv', 'all_filtered_SNPs.tsv' ]), # Concatenated classification, virus host and typing tool tables
expand("results/{file}", file = [ 'all_taxClassified.tsv', 'all_taxUnclassified.tsv', 'all_virusHost.tsv', 'all_filtered_SNPs.tsv' ]), # Concatenated classification, virus host and typing tool tables
expand("results/{file}", file = [ 'heatmaps/Superkingdoms_heatmap.html', 'Sample_composition_graph.html', 'Taxonomic_rank_statistics.tsv', 'Virus_rank_statistics.tsv', 'Phage_rank_statistics.tsv', 'Bacteria_rank_statistics.tsv' ]), # Taxonomic profile and heatmap output
expand("results/heatmaps/Virus_{rank}_heatmap.html", rank=[ "order", "family", "genus", "species" ]), # Virus (excl. phages) order|family|genus|species level heatmap for the entire run
expand("results/heatmaps/Phage_{rank}_heatmap.html", rank=[ "order", "family", "genus", "species" ]), # Phage order|family|genus|species heatmaps for the entire run (based on a selection of phage families)
expand("results/heatmaps/Bacteria_{rank}_heatmap.html", rank=[ "phylum", "class", "order", "family", "genus", "species" ]), # Bacteria phylum|class|order|family|genus|species level heatmap for the entire run
expand("results/heatmaps/Virus_heatmap-{rank}.html", rank=[ "order", "family", "genus", "species" ]), # Virus (excl. phages) order|family|genus|species level heatmap for the entire run
expand("results/heatmaps/Phage_heatmap-{rank}.html", rank=[ "order", "family", "genus", "species" ]), # Phage order|family|genus|species heatmaps for the entire run (based on a selection of phage families)
expand("results/heatmaps/Bacteria_heatmap-{rank}.html", rank=[ "phylum", "class", "order", "family", "genus", "species" ]), # Bacteria phylum|class|order|family|genus|species level heatmap for the entire run
expand("results/{file}.html", file = [ 'multiqc', 'krona', 'Heatmap_index', 'IGVjs_index' ]), # Reports and heatmap and IGVjs index.html

#################################################################################
@@ -92,8 +89,6 @@ rule all:
##### Data quality control and cleaning #####
#############################################################################



rule QC_raw_data:
input:
lambda wildcards: SAMPLES[wildcards.sample][wildcards.read]
@@ -550,52 +545,9 @@ blastn -task megablast \
##### Send scaffolds to their respective typingtools #####
#############################################################################

rule Viral_typing:
input:
"data/tables/{sample}_taxClassified.tsv",
output:
query_scaffolds_for_NoV_TT="data/virus_typing_tables/{sample}_NoV.fa",
query_scaffolds_for_EV_TT="data/virus_typing_tables/{sample}_EV.fa",
XML_result_NoV_TT="data/virus_typing_tables/{sample}_NoV.xml",
XML_result_EV_TT="data/virus_typing_tables/{sample}_EV.xml",
parsed_CSV_result_NoV_TT="data/virus_typing_tables/{sample}_NoV.csv",
parsed_CSV_result_EV_TT="data/virus_typing_tables/{sample}_EV.csv"
resources:
TT_connections=1
conda:
"envs/data_wrangling.yaml"
benchmark:
"logs/benchmark/Viral_typing_{sample}.txt"
threads: 1
params:
NoV_TT_http=config["typingtool_http"]["NoV_TT_http"],
EV_TT_http=config["typingtool_http"]["EV_TT_http"],
log:
"logs/Viral_typing_{sample}.log"
shell:
"""
awk -F "\\t" '$6 == "Norwalk virus" {{print ">" $2 "\\n" $24}}' < {input} 2> {log} 1> {output.query_scaffolds_for_NoV_TT}
if [ -s "{output.query_scaffolds_for_NoV_TT}" ]
then
curl -s --data-urlencode fasta-sequence@{output.query_scaffolds_for_NoV_TT} {params.NoV_TT_http} 2>> {log} 1> {output.XML_result_NoV_TT}
python bin/typingtool_NoV_XML_to_csv_parser.py {wildcards.sample} {output.XML_result_NoV_TT} {output.parsed_CSV_result_NoV_TT} 2>> {log}
else
echo -e "No scaffolds with species == Norwalk Virus in sample:\t{wildcards.sample}." >> {log}
touch {output.XML_result_NoV_TT}
touch {output.parsed_CSV_result_NoV_TT}
fi

awk -F "\\t" '$8 == "Picornaviridae" {{print ">" $2 "\\n" $24}}' < {input} 2>> {log} 1> {output.query_scaffolds_for_EV_TT}
if [ -s "{output.query_scaffolds_for_EV_TT}" ]
then
curl -s --data-urlencode fasta-sequence@{output.query_scaffolds_for_EV_TT} {params.EV_TT_http} 2>> {log} 1> {output.XML_result_EV_TT}
python bin/typingtool_EV_XML_to_csv_parser.py {wildcards.sample} {output.XML_result_EV_TT} {output.parsed_CSV_result_EV_TT} 2>> {log}
else
echo -e "No scaffolds with family == Picornaviridae in sample:\t {wildcards.sample}." >> {log}
touch {output.XML_result_EV_TT}
touch {output.parsed_CSV_result_EV_TT}
fi
"""
# See issue #29 (https://github.com/DennisSchmitz/Jovian/issues/29)
# Will be added to the pipeline again at a later date.
# For now, these analyses can be done via the scripts shipped with Jovian when needed. For instructions, see the link above.

#############################################################################
##### Generate interactive taxonomy plot. LCA, magnitudes and plot #####
@@ -689,10 +641,11 @@ rule draw_heatmaps:
classified = "results/all_taxClassified.tsv",
numbers = "results/multiqc_data/multiqc_trimmomatic.txt"
output:
super_quantities="results/Superkingdoms_quantities_per_sample.csv",
super="results/heatmaps/Superkingdoms_heatmap.html",
virus=expand("results/heatmaps/Virus_{rank}_heatmap.html", rank=[ "order", "family", "genus", "species" ]),
phage=expand("results/heatmaps/Phage_{rank}_heatmap.html", rank=[ "order", "family", "genus", "species" ]),
bact=expand("results/heatmaps/Bacteria_{rank}_heatmap.html", rank=[ "phylum", "class", "order", "family", "genus", "species" ]),
virus=expand("results/heatmaps/Virus_heatmap-{rank}.html", rank = [ "order", "family", "genus", "species" ]),
phage=expand("results/heatmaps/Phage_heatmap-{rank}.html", rank = [ "order", "family", "genus", "species" ]),
bact=expand("results/heatmaps/Bacteria_heatmap-{rank}.html", rank = [ "phylum", "class", "order", "family", "genus", "species" ]),
stats="results/Taxonomic_rank_statistics.tsv",
vir_stats="results/Virus_rank_statistics.tsv",
phage_stats="results/Phage_rank_statistics.tsv",
@@ -702,10 +655,16 @@ rule draw_heatmaps:
benchmark:
"logs/benchmark/draw_heatmaps.txt"
threads: 1
params:
virus_basename="results/heatmaps/Virus_heatmap.html",
phage_basename="results/heatmaps/Phage_heatmap.html",
bact_basename="results/heatmaps/Bacteria_heatmap.html"
log:
"logs/draw_heatmaps.log"
script:
"bin/draw_heatmaps.py"
shell:
"""
python bin/draw_heatmaps.py -c {input.classified} -n {input.numbers} -sq {output.super_quantities} -st {output.stats} -vs {output.vir_stats} -ps {output.phage_stats} -bs {output.bact_stats} -s {output.super} -v {params.virus_basename} -p {params.phage_basename} -b {params.bact_basename} > {log} 2>&1
"""

#############################################################################
##### Data wrangling #####
@@ -771,7 +730,9 @@ find {params.search_folder} -type f -name "{params.virusHost_glob}" -exec awk 'N
rule Generate_index_html:
input:
"results/heatmaps/Superkingdoms_heatmap.html",
expand("results/heatmaps/Virus_{rank}_heatmap.html", rank=[ "order", "family", "genus", "species" ]),
expand("results/heatmaps/Virus_heatmap-{rank}.html", rank = [ "order", "family", "genus", "species" ]),
expand("results/heatmaps/Phage_heatmap-{rank}.html", rank = [ "order", "family", "genus", "species" ]),
expand("results/heatmaps/Bacteria_heatmap-{rank}.html", rank = [ "phylum", "class", "order", "family", "genus", "species" ]),
expand("data/scaffolds_filtered/{sample}_IGVjs.html", sample = SAMPLES),
output:
heatmap_index="results/Heatmap_index.html",
@@ -784,33 +745,12 @@ rule Generate_index_html:
params:
heatmap_title=config["HTML_index_titles"]["heatmap_title"],
igvjs_title=config["HTML_index_titles"]["IGVjs_title"],
http_adress=config["server_info"]["http_adress"],
http_adress=config["Server_host"]["hostname"],
port=config["server_info"]["port"],
shell:
"""
tree -H "heatmaps" -L 1 -T "{params.heatmap_title}" --noreport --charset utf-8 -P "*.html" -o {output.heatmap_index} results/heatmaps/ > {log} 2>&1
tree -H "{params.http_adress}:{params.port}/igv/Jovian/data/scaffolds_filtered" -L 1 -T "{params.igvjs_title}" --noreport --charset utf-8 -P "*.html" -o {output.IGVjs_index} data/scaffolds_filtered/ >> {log} 2>&1
"""

rule Concat_TT_files:
input:
expand("data/virus_typing_tables/{sample}_{virus}.csv", sample = SAMPLES, virus = ['NoV','EV'])
output:
NoV="results/all_NoV-TT.tsv",
EV="results/all_EV-TT.tsv",
benchmark:
"logs/benchmark/Concat_TT_files.txt"
threads: 1
log:
"logs/Concat_TT_files.log"
params:
search_folder="data/virus_typing_tables/",
NoV_glob="*_NoV.csv",
EV_glob="*_EV.csv",
shell:
"""
find {params.search_folder} -type f -name "{params.NoV_glob}" -exec awk 'NR==1 || FNR!=1' {{}} + 2> {log} 1> {output.NoV}
find {params.search_folder} -type f -name "{params.EV_glob}" -exec awk 'NR==1 || FNR!=1' {{}} + 2>> {log} 1> {output.EV}
bin/create_igv_index.sh
"""

rule Concat_filtered_SNPs:
@@ -862,16 +802,17 @@ else
echo -e "\t\tYou chose to not remove temp files: the human genome alignment files are not removed."
fi

echo -e "\tRemoving empty typing tool files..."
find data/virus_typing_tables/ -type f -empty -delete

echo -e "\tGenerating methodological hash (fingerprint)..."
echo -e "This is the link to the code used for this analysis:\thttps://github.com/DennisSchmitz/Jovian/tree/$(git log -n 1 --pretty=format:"%H")" > results/git_log.txt
echo -e "This code with unique fingerprint $(git log -n1 --pretty=format:"%H") was committed by $(git log -n1 --pretty=format:"%an <%ae>") at $(git log -n1 --pretty=format:"%ad")" >> results/git_log.txt

echo -e "\tGenerating Snakemake report..."
snakemake --unlock --config sample_sheet=sample_sheet.yaml
snakemake --report results/snakemake_report.html --config sample_sheet=sample_sheet.yaml

echo -e "\tCreating symlinks for the interactive genome viewer..."
bin/set_symlink.sh
echo -e "Finished"
""")

# perORFcoverage output file van de bbtools scaffold metrics nog importeren in data wrangling part!
Empty file modified bin/adapter_detection_script.sh
100644 → 100755
Empty file.
Empty file modified bin/concat_filtered_vcf.py
100644 → 100755
Empty file.