Commit

Merge pull request #19 from yyoshiaki/developmentv1.2
Developmentv1.2
yyoshiaki authored Oct 7, 2020
2 parents 0d55522 + 6b373e9 commit 32e62a7
Showing 20 changed files with 226 additions and 365 deletions.
4 changes: 2 additions & 2 deletions README.md
@@ -1,4 +1,4 @@
# VIRTUS : VIRal Transcript Usage Sensor v1.1 <img src="https://github.com/yyoshiaki/VIRTUS/raw/master/img/VIRTUS.jpg" width="20%" align="right" />
# VIRTUS : VIRal Transcript Usage Sensor v1.2 <img src="https://github.com/yyoshiaki/VIRTUS/raw/master/img/VIRTUS.jpg" width="20%" align="right" />

Virus transcript detection and quantification using normal human RNAseq. VIRTUS is the first tool to detect viral transcripts by considering their splicing events rather than the viral genome copy number. VIRTUS can be applied to both bulk RNAseq and single-cell RNAseq. The virus reference covers 762 viruses including SARS-CoV-2 (cause of COVID-19). The workflow is implemented in [Common Workflow Language](https://www.commonwl.org/) and [Rabix](https://rabix.io/). You can specify each parameter individually or supply a `yaml` or `json` file that describes all the parameters. For details, see [the CWL User Guide](http://www.commonwl.org/user_guide/).

@@ -16,7 +16,7 @@ Yoshiaki Yasumizu ([yyasumizu@ifrec.osaka-u.ac.jp](yyasumizu@ifrec.osaka-u.ac.jp
## Citation

VIRTUS: a pipeline for comprehensive virus analysis from conventional RNA-seq data
Yoshiaki Yasumizu, Atsushi Hara, Shimon Sakaguchi, Naganari Ohkura. *bioRxiv* 2020.05.08.085308; doi: https://doi.org/10.1101/2020.05.08.085308
Yasumizu, Yoshiaki, Atsushi Hara, Shimon Sakaguchi, and Naganari Ohkura. 2020. “OUP Accepted Manuscript.” Edited by Jan Gorodkin. *Bioinformatics*, October. https://doi.org/10.1093/bioinformatics/btaa859.

## Acknowledgement

Binary file modified img/VIRTUS.PE.jpg
Binary file added img/VIRTUS.SE.jpg
4 changes: 3 additions & 1 deletion test/test.sh
@@ -45,4 +45,6 @@ if [[ ! -e ./SRR8315715_1.fastq.gz ]]; then
fi
cwltool --rm-tmpdir ../../workflow/VIRTUS.SE.cwl ../../workflow/VIRTUS.SE.job.yaml
cwltool --rm-tmpdir ../../workflow/VIRTUS.SE.singlevirus.cwl ../../workflow/VIRTUS.SE.singlevirus.job.yaml
cd ..
cd ..

echo Successfully completed test.sh!
36 changes: 36 additions & 0 deletions tool/fastq_pair/fastq_pair.cwl
@@ -0,0 +1,36 @@
class: CommandLineTool
cwlVersion: v1.0
$namespaces:
sbg: 'https://www.sevenbridges.com/'
id: fastq_pair
baseCommand:
- fastq_pair
inputs:
- id: fq1
type: File
inputBinding:
position: 0
- id: fq2
type: File
inputBinding:
position: 0
outputs:
- id: fq1_paired
type: File
outputBinding:
glob: $(inputs.fq1.basename).paired.fq
- id: fq2_paired
type: File
outputBinding:
glob: $(inputs.fq2.basename).paired.fq
label: fastq_pair
requirements:
- class: DockerRequirement
dockerPull: 'quay.io/biocontainers/fastq-pair:1.0--he1b5a44_1'
- class: InlineJavascriptRequirement
- class: InitialWorkDirRequirement
listing:
- entry: $(inputs.fq1)
writable: true
- entry: $(inputs.fq2)
writable: true
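In the paired-end workflow each mate file passes through `kz_filter` on its own, so a read can survive in one file and be dropped from the other; the `fastq_pair` step then restores matching pairs. The sketch below only illustrates that idea by listing the read IDs present in both files. It is not how `fastq_pair` is implemented, and the helper `fastq_ids`, the file contents, and the intermediate `ids_*.txt` files are hypothetical.

```shell
#!/bin/sh
workdir=$(mktemp -d)

# Two tiny made-up FASTQ files: r2 is missing from mate 1, r3 from mate 2.
printf '@r1\nACGT\n+\nIIII\n@r3\nGGCC\n+\nIIII\n' > "$workdir/kz_1.fq"
printf '@r1\nTTAA\n+\nIIII\n@r2\nCCGG\n+\nIIII\n' > "$workdir/kz_2.fq"

# Hypothetical helper: print the sorted read IDs of a FASTQ file
# (first line of each 4-line record, with the leading '@' stripped).
fastq_ids() {
    awk 'NR % 4 == 1 { print substr($1, 2) }' "$1" | sort
}

fastq_ids "$workdir/kz_1.fq" > "$workdir/ids_1.txt"
fastq_ids "$workdir/kz_2.fq" > "$workdir/ids_2.txt"

# IDs present in both files are the reads that still form a pair.
paired=$(comm -12 "$workdir/ids_1.txt" "$workdir/ids_2.txt")
echo "$paired"

rm -rf "$workdir"
```

Only `r1` appears in both files here, which is why the workflow feeds `fastq_pair/fq1_paired` and `fastq_pair/fq2_paired`, rather than the raw `kz_filter` outputs, into the paired-end virus mapping step.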
6 changes: 6 additions & 0 deletions tool/fastq_pair/fastq_pair.job.yaml
@@ -0,0 +1,6 @@
fq1:
class: File
path: /Users/yyasumizu/bioinformatics/tmp/kz_1.fq
fq2:
class: File
path: /Users/yyasumizu/bioinformatics/tmp/kz_2.fq
4 changes: 4 additions & 0 deletions tool/kz_filter/Dockerfile
@@ -0,0 +1,4 @@
FROM conda/miniconda3
USER root

RUN conda install -y -c eclarke komplexity
37 changes: 37 additions & 0 deletions tool/kz_filter/kz_filter.cwl
@@ -0,0 +1,37 @@
class: CommandLineTool
cwlVersion: v1.0
$namespaces:
sbg: 'https://www.sevenbridges.com/'
id: kz_filter
baseCommand:
- kz
inputs:
- 'sbg:toolDefaultValue': '0.1'
id: threshold
type: float?
inputBinding:
prefix: '--threshold'
shellQuote: false
position: 0
- id: input_fq
type: File
- id: output_fq
type: string
outputs:
- id: output
type: File?
outputBinding:
glob: $(inputs.output_fq)
label: kz-filter
arguments:
- prefix: ''
shellQuote: false
position: 0
valueFrom: '--filter'
requirements:
- class: ShellCommandRequirement
- class: DockerRequirement
dockerPull: 'yyasumizu/ko:0.1'
- class: InlineJavascriptRequirement
stdin: $(inputs.input_fq.path)
stdout: $(inputs.output_fq)
6 changes: 6 additions & 0 deletions tool/kz_filter/kz_filter.job.yaml
@@ -0,0 +1,6 @@
input_fq:
class: File
path: >-
/home/yyasumizu/media32TB/bioinformatics/practice/VIRTUS/test/ERR3240275/unmapped_1.fq
output_fq: unmapped_kz0.1.fq
threshold: 0.1
2 changes: 1 addition & 1 deletion tool/samtools/bam_filter_polyx.cwl
@@ -20,5 +20,5 @@ label: bam_filter_polyX
requirements:
- class: ShellCommandRequirement
- class: DockerRequirement
dockerPull: 'yyasumizu/bam_filter_polyx:1.1'
dockerPull: 'yyasumizu/bam_filter_polyx:1.2'
stdout: virusAligned.filtered.sortedByCoord.out.bam
2 changes: 1 addition & 1 deletion tool/samtools/bam_filter_polyx.sh
@@ -1,5 +1,5 @@
#!/bin/sh

samtools view -h ${1} | \
grep -v "AAAAAAAAAAAAAAAAAAAA" | grep -v "TTTTTTTTTTTTTTTTTTTT" | \
grep -v "AAAAAAAAAAAAAAAAAAAA" | grep -v "TTTTTTTTTTTTTTTTTTTT" | grep -v "TGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTG" | \
samtools view -bS -
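The added `grep -v` stage can be tried without a BAM file. The sketch below applies the same three patterns to plain text lines; the helper `filter_polyx` and the three records are hypothetical stand-ins, not part of the repository.

```shell
#!/bin/sh
# Patterns as they appear in bam_filter_polyx.sh (v1.2 adds the TG repeat).
POLY_A="AAAAAAAAAAAAAAAAAAAA"
POLY_T="TTTTTTTTTTTTTTTTTTTT"
TG_REPEAT="TGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTG"

# Hypothetical helper: drop every input line that contains one of the patterns.
filter_polyx() {
    grep -v "$POLY_A" | grep -v "$POLY_T" | grep -v "$TG_REPEAT"
}

# Made-up records (read name, then sequence), one per line.
records=$(printf '%s\n' \
    "read_ok ACGTTGCAACGTTGCA" \
    "read_polyA ACGT${POLY_A}ACGT" \
    "read_tg GG${TG_REPEAT}CC")

kept=$(printf '%s\n' "$records" | filter_polyx)
echo "$kept"
```

Only `read_ok` survives. Because the patterns are matched against whole lines, any line containing one of these strings anywhere is dropped, not only lines where it occurs in the sequence column.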
89 changes: 64 additions & 25 deletions workflow/VIRTUS.PE.cwl
@@ -3,19 +3,19 @@
class: Workflow
cwlVersion: v1.0
id: VIRTUS.PE
doc: VIRTUS v1.1
doc: VIRTUS v1.2
label: VIRTUS.PE
$namespaces:
sbg: 'https://www.sevenbridges.com/'
inputs:
- id: fastq2
type: File
'sbg:x': -532.9534301757812
'sbg:y': -424.9623107910156
'sbg:x': -446.9178771972656
'sbg:y': -426.26885986328125
- id: fastq1
type: File
'sbg:x': -534.9113159179688
'sbg:y': -269
'sbg:x': -445.51458740234375
'sbg:y': -274.6721496582031
- id: genomeDir_human
type: Directory
'sbg:x': -305
@@ -26,8 +26,8 @@ inputs:
'sbg:y': -750
- id: nthreads
type: int?
'sbg:x': -366.3414611816406
'sbg:y': 303.09088134765625
'sbg:x': -386.4197082519531
'sbg:y': 242.87530517578125
- id: genomeDir_virus
type: Directory
'sbg:x': 417.4296875
@@ -67,20 +67,20 @@ outputs:
outputSource:
- star_mapping_pe_human/aligned
type: File
'sbg:x': 209.65237426757812
'sbg:y': 379.7947998046875
'sbg:x': 190.84915161132812
'sbg:y': 381.26885986328125
- id: output_unmapped
outputSource:
- samtools_view/output
type: File
'sbg:x': 412
'sbg:y': 218
'sbg:x': 340.177001953125
'sbg:y': 225.9409942626953
- id: output_fq2
outputSource:
- bedtools_bamtofastq_pe/output_fq2
type: File?
'sbg:x': 549.3675537109375
'sbg:y': -334.5509948730469
'sbg:x': 558.2530517578125
'sbg:y': -337.0843505859375
- id: output_fq1
outputSource:
- bedtools_bamtofastq_pe/output_fq1
@@ -156,8 +156,8 @@ steps:
- id: out_fastq1
- id: out_fastq2
run: ../tool/fastp/fastp-pe.cwl
'sbg:x': -302
'sbg:y': -343
'sbg:x': -271.3704528808594
'sbg:y': -358.8819885253906
- id: star_mapping_pe_human
in:
- id: fq1
@@ -192,8 +192,6 @@ steps:
in:
- id: threads
source: nthreads
- id: b
default: true
- id: f
default: 4
- id: prefix
@@ -204,8 +202,8 @@ steps:
- id: output
run: ../tool/samtools/samtools-view.cwl
label: samtools-view
'sbg:x': 199.102294921875
'sbg:y': 52.59203338623047
'sbg:x': 179.7147216796875
'sbg:y': 53.134429931640625
- id: bedtools_bamtofastq_pe
in:
- id: input
@@ -219,14 +217,14 @@ steps:
- id: output_fq2
run: ../tool/bedtools/bedtools-bamtofastq-pe.cwl
label: bedtools-bamtofastq-pe
'sbg:x': 406.1430358886719
'sbg:y': 28.673505783081055
'sbg:x': 315.8917236328125
'sbg:y': -33.99026870727539
- id: star_mapping_pe_virus
in:
- id: fq1
source: bedtools_bamtofastq_pe/output_fq1
source: fastq_pair/fq1_paired
- id: fq2
source: bedtools_bamtofastq_pe/output_fq2
source: fastq_pair/fq2_paired
- id: genomeDir
source: genomeDir_virus
- id: nthreads
@@ -247,8 +245,8 @@ steps:
- id: unmapped
run: ../tool/star/star_mapping-pe/star_mapping-pe.cwl
label: 'STAR mapping: running mapping jobs.'
'sbg:x': 681.9188232421875
'sbg:y': 132.9390411376953
'sbg:x': 753.9961547851562
'sbg:y': 133.0281219482422
- id: mk_virus_count
in:
- id: virus_bam
@@ -310,6 +308,47 @@ steps:
label: bam_filter_polyX
'sbg:x': 825.17626953125
'sbg:y': 513.4646606445312
- id: kz_filter_fq2
in:
- id: threshold
default: 0.1
- id: input_fq
source: bedtools_bamtofastq_pe/output_fq2
- id: output_fq
default: kz_2.fq
out:
- id: output
run: ../tool/kz_filter/kz_filter.cwl
label: kz-filter_fq2
'sbg:x': 460.7408752441406
'sbg:y': 27.032846450805664
- id: kz_filter_fq1
in:
- id: threshold
default: 0.1
- id: input_fq
source: bedtools_bamtofastq_pe/output_fq1
- id: output_fq
default: kz_1.fq
out:
- id: output
run: ../tool/kz_filter/kz_filter.cwl
label: kz-filter_fq1
'sbg:x': 462.7408752441406
'sbg:y': 147
- id: fastq_pair
in:
- id: fq1
source: kz_filter_fq1/output
- id: fq2
source: kz_filter_fq2/output
out:
- id: fq1_paired
- id: fq2_paired
run: ../tool/fastq_pair/fastq_pair.cwl
label: fastq_pair
'sbg:x': 577.2360229492188
'sbg:y': 93.92456817626953
requirements: []
'sbg:license': CC BY-NC 4.0
'sbg:toolAuthor': Yoshiaki Yasumizu
30 changes: 22 additions & 8 deletions workflow/VIRTUS.SE.cwl
@@ -3,7 +3,7 @@
class: Workflow
cwlVersion: v1.0
id: VIRTUS.SE
doc: VIRTUS v1.1
doc: VIRTUS v1.2
label: VIRTUS.SE
$namespaces:
sbg: 'https://www.sevenbridges.com/'
@@ -124,8 +124,6 @@ steps:
in:
- id: threads
source: nthreads
- id: b
default: true
- id: f
default: 4
- id: prefix
@@ -236,12 +234,12 @@ steps:
- id: output_fq
run: ../tool/bedtools/bedtools-bamtofastq-se.cwl
label: bedtools-bamtofastq-pe
'sbg:x': 392.1208801269531
'sbg:y': -110.00155639648438
'sbg:x': 350.34454345703125
'sbg:y': -113.04795837402344
- id: star_mapping_se_virus
in:
- id: fq
source: bedtools_bamtofastq_se/output_fq
source: kz_filter/output
- id: genomeDir
source: genomeDir_virus
- id: nthreads
@@ -274,9 +272,25 @@ steps:
label: bam_filter_polyX
'sbg:x': 847.6024780273438
'sbg:y': 366.4100036621094
requirements: []
- id: kz_filter
in:
- id: threshold
default: 0.1
- id: input_fq
source: bedtools_bamtofastq_se/output_fq
- id: output_fq
default: kz.fq
out:
- id: output
run: ../tool/kz_filter/kz_filter.cwl
label: kz-filter
'sbg:x': 489.5152893066406
'sbg:y': 39.372711181640625
requirements:
- class: InlineJavascriptRequirement
- class: StepInputExpressionRequirement
'sbg:license': CC BY-NC 4.0
'sbg:links':
- id: 'https://github.com/yyoshiaki/VIRTUS'
label: ''
'sbg:toolAuthor': Yoshiaki Yasumizu
'sbg:license': CC BY-NC 4.0
10 changes: 10 additions & 0 deletions workflow/kz_filter_PE.sh
@@ -0,0 +1,10 @@
kz -k 2 < unmapped_1.fq > kz_1.txt
kz -k 2 < unmapped_2.fq > kz_2.txt

SCRIPT_DIR=$(cd $(dirname $0); pwd)
python3 $SCRIPT_DIR/kz_list_PE.py

list=(`cat kz_filter_list_1.txt` `cat kz_filter_list_2.txt`)

samtools view virusAligned.filtered.sortedByCoord.out.bam | egrep -v "`echo $(IFS="|"; echo "${list[*]}")`" | cut -f3 | sort | uniq -c > virus_counts_kz.txt
samtools view virusAligned.filtered.sortedByCoord.out.bam | egrep "`echo $(IFS="|"; echo "${list[*]}")`" > kz_removed.txt
8 changes: 8 additions & 0 deletions workflow/kz_filter_SE.sh
@@ -0,0 +1,8 @@
kz -k 2 < unmapped.fq > kz.txt

SCRIPT_DIR=$(cd $(dirname $0); pwd)
python3 $SCRIPT_DIR/kz_list_SE.py

list=(`cat kz_filter_list.txt`)
samtools view virusAligned.filtered.sortedByCoord.out.bam | egrep -v "`echo $(IFS="|"; echo "${list[*]}")`" | cut -f3 | sort | uniq -c > virus_counts_kz.txt
samtools view virusAligned.filtered.sortedByCoord.out.bam | egrep "`echo $(IFS="|"; echo "${list[*]}")`" > kz_removed.txt
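Both `kz_filter_PE.sh` and `kz_filter_SE.sh` join the listed read IDs into one `egrep` alternation. A minimal sketch of an alternative, assuming the list file holds one read ID per line (the format written by `kz_list_SE.py` is not shown in this commit): `grep -F -f` reads the IDs as fixed strings straight from the file. The records below are made up, and a plain text file stands in for the `samtools view` output.

```shell
#!/bin/sh
# Made-up stand-ins for kz_filter_list.txt and the `samtools view` output.
workdir=$(mktemp -d)
printf 'read2\n' > "$workdir/kz_filter_list.txt"
printf 'read1\t0\tvirusA\nread2\t0\tvirusB\nread3\t0\tvirusA\n' > "$workdir/reads.sam"

# Keep records matching none of the listed IDs, then count reads per
# reference name (column 3); awk normalises the padding that uniq -c adds.
kept=$(grep -v -F -f "$workdir/kz_filter_list.txt" "$workdir/reads.sam" \
    | cut -f3 | sort | uniq -c | awk '{ print $1, $2 }')

# Records matching a listed ID, i.e. the ones the filter removes.
removed=$(grep -F -f "$workdir/kz_filter_list.txt" "$workdir/reads.sam" | cut -f1)

echo "kept: $kept"
echo "removed: $removed"

rm -rf "$workdir"
```

As in the original scripts, matching is by substring anywhere in the record, so an ID such as `read1` would also match `read10`. One difference worth noting: with GNU grep an empty list file matches nothing, so every record is kept, whereas an empty `egrep` pattern matches every line and `egrep -v` would then discard all records.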