feat: Mashmap (#485)
* [fix] (template): Missing code in wrappers' doc. Error #187

* mashmap initial commit

* refList parameters included in tests

* fix fasta missing extension

* passing tests

* cleaning

* comment about relist

* formatting

* miss-renaming files

* miss-renaming files

* cleaning

* meta corrections : #485 (comment)

* Gzipped fasta support + query list as text file

* Gzipped support

* Gzipped support

* Documentation update #485 (comment)

Co-authored-by: tdayris <tdayris@gustaveroussy.fr>
tdayris and tdayris authored May 13, 2022
1 parent f52e1d4 commit c05006d
Showing 9 changed files with 110 additions and 0 deletions.
8 changes: 8 additions & 0 deletions bio/mashmap/environment.yaml
@@ -0,0 +1,8 @@
channels:
  - bioconda
  - conda-forge
  - defaults
dependencies:
  - mashmap =2.0
  - gsl =2.7
  - gzip =1.11
15 changes: 15 additions & 0 deletions bio/mashmap/meta.yaml
@@ -0,0 +1,15 @@
name: MashMap
description: Compute local alignment boundaries between long DNA sequences with MashMap
url: https://github.com/marbl/MashMap
authors:
  - Thibault Dayris
input:
  - ref: Path to reference file
  - query: Path to query file (fasta, fastq)
output:
  - Path to the alignment file
params:
  - extra: Optional parameters for MashMap
notes: |
  * `input.ref` may be either a path to a fasta file or a text file containing a list of paths to several fasta files.
  * `input.query` may be either a path to a fasta or fastq file, or a text file containing a list of paths to several such files.
13 changes: 13 additions & 0 deletions bio/mashmap/test/Snakefile
@@ -0,0 +1,13 @@
rule test_mashmap:
    input:
        ref="reference.fasta.gz",  # This can also be a .txt file with one fasta path per line
        query="read.fasta.gz",
    output:
        "mashmap.out",
    threads: 2
    params:
        extra="-s 1000 --pi 99",
    log:
        "logs/mashmap.log",
    wrapper:
        "master/bio/mashmap"
12 changes: 12 additions & 0 deletions bio/mashmap/test/Snakefile_reflist.smk
@@ -0,0 +1,12 @@
rule test_mashmap_reflist:
    input:
        ref="reference.txt",
        query="read.fasta.gz",
    output:
        "mashmap.out",
    params:
        extra="-s 500 --pi 99",
    log:
        "logs/mashmap.log",
    wrapper:
        "master/bio/mashmap"
Binary file added bio/mashmap/test/read.fasta.gz
Binary file not shown.
Binary file added bio/mashmap/test/reference.fasta.gz
Binary file not shown.
1 change: 1 addition & 0 deletions bio/mashmap/test/reference.txt
@@ -0,0 +1 @@
reference.fasta.gz
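
The list form of `input.ref` is a plain text file with one fasta path per line, and the wrapper recognises it by its `.txt` extension. A minimal sketch of producing such a file, assuming a hypothetical helper named `write_path_list` that is not part of this commit:

```python
from pathlib import Path


def write_path_list(paths, list_file):
    """Write one path per line to list_file (hypothetical helper, not in the commit)."""
    list_file = Path(list_file)
    # The wrapper only treats an input as a list when its name ends with ".txt".
    if list_file.suffix != ".txt":
        raise ValueError("list file name must end with .txt")
    list_file.write_text("".join(f"{path}\n" for path in paths))
    return list_file
```

Calling `write_path_list(["reference.fasta.gz"], "reference.txt")` would write a file with the same single entry as the test fixture above.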
49 changes: 49 additions & 0 deletions bio/mashmap/wrapper.py
@@ -0,0 +1,49 @@
#!/usr/bin/python3.8
# coding: utf-8

""" Snakemake wrapper for MashMap """

__author__ = "Thibault Dayris"
__copyright__ = "Copyright 2022, Thibault Dayris"
__email__ = "thibault.dayris@gustaveroussy.fr"
__license__ = "MIT"

from snakemake.shell import shell

log = snakemake.log_fmt_shell(stdout=True, stderr=True)
extra = snakemake.params.get("extra", "")
max_threads = snakemake.threads

# Handling input file types (either a fasta file, or a text file with a list of paths to fasta files)
ref = snakemake.input["ref"]
if ref.endswith(".txt"):
    ref = f"--refList {ref}"
elif ref.endswith(".gz"):
    # Decompress on the fly through bash process substitution; gzip takes one thread
    ref = f"--ref <( gzip --decompress --stdout {ref} )"
    max_threads -= 1
else:
    ref = f"--ref {ref}"

if max_threads < 1:
    raise ValueError(
        "On-the-fly decompression of the reference fasta consumes one thread."
        f" Please increase the number of available threads by {1 - max_threads}."
    )


# Handling query file format (either a fasta/fastq file or a text file with a list of such files)
query = snakemake.input["query"]
if query.endswith(".txt"):
    query = f"--queryList {query}"
else:
    query = f"--query {query}"

shell(
    "mashmap "
    "{ref} "
    "{query} "
    "--output {snakemake.output} "
    # Use the thread count left after reserving one for gzip, if any
    "--threads {max_threads} "
    "{extra} "
    "{log}"
)
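
The branching above can be exercised without Snakemake by moving it into a pure function. The following is a sketch only: `build_mashmap_command` is a hypothetical name that does not appear in the commit, and passing the reduced thread count to `--threads` is an assumption about the intent of `max_threads`.

```python
def build_mashmap_command(ref, query, output, threads, extra=""):
    """Assemble the mashmap command line (hypothetical helper, not in the commit)."""
    max_threads = threads

    # Reference: a .txt path list, a gzipped fasta, or a plain fasta.
    if ref.endswith(".txt"):
        ref_arg = f"--refList {ref}"
    elif ref.endswith(".gz"):
        # Bash process substitution; gzip occupies one of the available threads.
        ref_arg = f"--ref <( gzip --decompress --stdout {ref} )"
        max_threads -= 1
    else:
        ref_arg = f"--ref {ref}"

    if max_threads < 1:
        raise ValueError(
            "On-the-fly decompression of the reference consumes one thread; "
            f"increase the number of available threads by {1 - max_threads}."
        )

    # Query: a .txt path list or a single sequence file.
    if query.endswith(".txt"):
        query_arg = f"--queryList {query}"
    else:
        query_arg = f"--query {query}"

    parts = [
        "mashmap",
        ref_arg,
        query_arg,
        f"--output {output}",
        f"--threads {max_threads}",  # assumption: mashmap gets the remaining threads
        extra,
    ]
    # Drop empty pieces (e.g. no extra parameters) before joining.
    return " ".join(part for part in parts if part)
```

With the inputs from the test Snakefile (`threads: 2`, gzipped reference), this sketch reserves one thread for gzip and passes the remaining one to mashmap. With a single thread and a gzipped reference it raises `ValueError`, matching the check in the wrapper.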
12 changes: 12 additions & 0 deletions test.py
@@ -130,6 +130,18 @@ def run(wrapper, cmd, check_log=None):
os.chdir(origdir)


@skip_if_not_modified
def test_mashmap():
    run(
        "bio/mashmap",
        ["snakemake", "--cores", "2", "mashmap.out", "--use-conda", "-F"]
    )

    run(
        "bio/mashmap",
        ["snakemake", "--cores", "2", "mashmap.out", "--use-conda", "-F", "-s", "Snakefile_reflist.smk"]
    )

@skip_if_not_modified
def test_rbt_csvreport():
    run(