feat: added Salsa2 wrapper #532

Merged
merged 3 commits into from
Aug 16, 2022
6 changes: 6 additions & 0 deletions bio/salsa2/environment.yaml
@@ -0,0 +1,6 @@
channels:
- conda-forge
- bioconda
- nodefaults
dependencies:
- salsa2 =2.3
15 changes: 15 additions & 0 deletions bio/salsa2/meta.yaml
@@ -0,0 +1,15 @@
name: Salsa2
description: |
  A tool to scaffold long read assemblies with Hi-C data
url: https://github.com/marbl/SALSA
authors:
- Filipe G. Vieira
input:
- BED file
- FASTA file
- FASTA index file
output:
- scaffolded assembly (FASTA format)
- scaffolded assembly (AGP format)
notes: |
  * The `extra` param allows for additional program arguments.
17 changes: 17 additions & 0 deletions bio/salsa2/test/Snakefile
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
rule salsa2:
    input:
        fas="{sample}.fasta",
        fai="{sample}.fasta.fai",
        bed="{sample}.bed",
    output:
        agp="out/{sample}.agp",
        fas="out/{sample}.fas",
    log:
        "logs/salsa2/{sample}.log",
    params:
        enzyme="CTTAAG",  # optional
        extra="--clean yes",  # optional
    resources:
        mem_mb=1024,
    wrapper:
        "master/bio/salsa2"
2 changes: 2 additions & 0 deletions bio/salsa2/test/a.bed
@@ -0,0 +1,2 @@
lambda_NEB3011_3 89 571 m130725_000747_00121_c100518582550000001823079209281362_s1_p0/15956/701_2749 60 - 89 571 255,0,0 1 482 0
lambda_NEB3011_3 639 987 m130725_000747_00121_c100518582550000001823079209281362_s1_p0/82740/1942_3951 60 - 639 987 255,0,0 1 348 0
14 changes: 14 additions & 0 deletions bio/salsa2/test/a.fasta
@@ -0,0 +1,14 @@
>lambda_NEB3011_3
AACGGTGTATTACCGGTTTGCTACCAGGGAAGAACGGGAAGGAAAGATGAGCACGAACCTGGTTTTTAAGGAGTGTCGCC
AGAGTGCCGCGATGAAACGGGTATTGGCGGTATATGGAGTTAAAAGATGACCATCTACATTACTGAGCTAATAACAGGCC
TGCTGGTAATCGCAGGCCTTTTTATTTGGGGGAGAGGGAAGTCATGAAAAAACTAACCTTTGAAATTCGATCTCCAGCAC
ATCAGCAAAACGCTATTCACGCAGTACAGCAAATCCTTCCAGACCCAACCAAACCAATCGTAGTAACCATTCAGGAACGC
AACCGCAGCTTAGACCAAAACAGGAAGCTATGGGCCTGCTTAGGTGACGTCTCTCGTCAGGTTGAATGGCATGGTCGCTG
GCTGGATGCAGAAAGCTGGAAGTGTGTGTTTACCGCAGCATTAAAGCAGCAGGATGTTGTTCCTAACCTTGCCGGGAATG
GCTTTGTGGTAATAGGCCAGTCAACCAGCAGGATGCGTGTAGGCGAATTTGCGGAGCTATTAGAGCTTATACAGGCATTC
GGTACAGAGCGTGGCGTTAAGTGGTCAGACGAAGCGAGACTGGCTCTGGAGTGGAAAGCGAGATGGGGAGACAGGGCTGC
ATGATAAATGTCGTTAGTTTCTCCGGTGGCAGGACGTCAGCATATTTGCTCTGGCTAATGGAGCAAAAGCGACGGGCAGG
TAAAGACGTGCATTACGTTTTCATGGATACAGGTTGTGAACATCCAATGACATATCGGTTTGTCAGGGAAGTTGTGAAGT
TCTGGGATATACCGCTCACCGTATTGCAGGTTGATATCAACCCGGAGCTTGGACAGCCAAATGGTTATACGGTATGGGAA
CCAAAGGATATTCAGACGCGAATGCCTGTTCTGAAGCCATTTATCGATATGGTAAAGAAATATGGCACTCCATACGTCGG
CGGCGCGTTCTGCACTGACAGATTAAAACTCGTTCCCTTC
1 change: 1 addition & 0 deletions bio/salsa2/test/a.fasta.fai
@@ -0,0 +1 @@
lambda_NEB3011_3 1000 18 80 81
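The five columns of the index line above follow the usual `.fai` layout: sequence name, sequence length, byte offset of the first base, bases per line, and bytes per line. As a rough sketch of how those values relate to `a.fasta`, the hypothetical helper `fai_record` below recomputes them for the first record of a FASTA string. It is illustrative only, not how `samtools faidx` is implemented, and the toy input uses placeholder `A` bases laid out like the test file (12 lines of 80 bases plus one line of 40).

```python
def fai_record(fasta_text: str) -> tuple[str, int, int, int, int]:
    """Compute the five .fai columns for the first record of a FASTA string."""
    lines = fasta_text.splitlines(keepends=True)
    header = lines[0]
    seq_lines = []
    for line in lines[1:]:
        if line.startswith(">"):  # stop at the next record
            break
        seq_lines.append(line)

    name = header[1:].split()[0]            # text after ">" up to first whitespace
    offset = len(header.encode())           # bytes before the first base
    length = sum(len(s.rstrip("\r\n")) for s in seq_lines)
    linebases = len(seq_lines[0].rstrip("\r\n"))  # bases on a full line
    linewidth = len(seq_lines[0].encode())        # bases plus line terminator
    return name, length, offset, linebases, linewidth


# Placeholder sequence (assumption): same line layout as test/a.fasta, not its bases.
toy = ">lambda_NEB3011_3\n" + ("A" * 80 + "\n") * 12 + "A" * 40 + "\n"
record = fai_record(toy)
print(record)
```

With this layout the result matches the committed index line: name `lambda_NEB3011_3`, length 1000, offset 18, 80 bases per line, 81 bytes per line.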
39 changes: 39 additions & 0 deletions bio/salsa2/wrapper.py
@@ -0,0 +1,39 @@
__author__ = "Filipe G. Vieira"
__copyright__ = "Copyright 2022, Filipe G. Vieira"
__license__ = "MIT"


import tempfile
from snakemake.shell import shell


extra = snakemake.params.get("extra", "")
log = snakemake.log_fmt_shell(stdout=True, stderr=True)


enzyme = snakemake.params.get("enzyme", "")
if enzyme:
    enzyme = f"--enzyme {enzyme}"

gfa = snakemake.input.get("gfa", "")
if gfa:
    gfa = f"--gfa {gfa}"


with tempfile.TemporaryDirectory() as tmpdir:
    shell(
        "run_pipeline.py"
        " --assembly {snakemake.input.fas}"
        " --length {snakemake.input.fai}"
        " --bed {snakemake.input.bed}"
        " {enzyme}"
        " {gfa}"
        " {extra}"
        " --output {tmpdir}"
        " {log}"
    )

    if snakemake.output.get("agp"):
        shell("cat {tmpdir}/scaffolds_FINAL.agp > {snakemake.output.agp}")
    if snakemake.output.get("fas"):
        shell("cat {tmpdir}/scaffolds_FINAL.fasta > {snakemake.output.fas}")
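For reviewers who want to check the command line the wrapper produces without running Snakemake, here is a minimal standalone sketch. The function `build_salsa2_command` and the example paths are hypothetical and not part of this PR. It only mirrors the flag handling visible in `wrapper.py`: `--enzyme` and `--gfa` are added only when a value is given, and `extra` is passed through as-is.

```python
import shlex


def build_salsa2_command(fas, fai, bed, outdir, enzyme="", gfa="", extra=""):
    """Return the run_pipeline.py invocation as a list of arguments."""
    cmd = ["run_pipeline.py", "--assembly", fas, "--length", fai, "--bed", bed]
    if enzyme:  # optional restriction enzyme site
        cmd += ["--enzyme", enzyme]
    if gfa:  # optional assembly graph input
        cmd += ["--gfa", gfa]
    cmd += shlex.split(extra)  # free-form additional arguments
    cmd += ["--output", outdir]
    return cmd


# Values taken from the test Snakefile; the output directory is a placeholder.
cmd = build_salsa2_command(
    "a.fasta",
    "a.fasta.fai",
    "a.bed",
    "/tmp/salsa2",
    enzyme="CTTAAG",
    extra="--clean yes",
)
print(shlex.join(cmd))
```

Building the command as a list rather than a formatted string is a design choice of this sketch, made so the optional flags can be asserted on directly. The wrapper itself uses `snakemake.shell` string formatting.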
8 changes: 8 additions & 0 deletions test.py
@@ -130,6 +130,14 @@ def run(wrapper, cmd, check_log=None):
os.chdir(origdir)


@skip_if_not_modified
def test_salsa2():
    run(
        "bio/salsa2",
        ["snakemake", "--cores", "1", "out/a.agp", "--use-conda", "-F"],
    )


@skip_if_not_modified
def test_mashmap():
    run(