Getting Started for Clojure Beginners
- Java (1.8 or greater)
- Leiningen (2.0 or greater)
- Unix environment*
* Clojure works in any environment supported by Java, but this guide assumes a Unix-like OS such as Linux or macOS.
Install Java from the Oracle website, the OpenJDK website, or your package manager.
Leiningen is a popular build tool for Clojure, similar to Maven and Gradle. It automatically resolves library dependencies and the classpath according to the project configuration, so you do not have to install Clojure itself.
Install Leiningen according to the official instructions.
First, create a new Leiningen project with the lein new command.
$ lein new cljam-start
$ cd cljam-start
This command creates the cljam-start/ directory and generates the project files in it.
cljam-start/
├── CHANGELOG.md
├── LICENSE
├── README.md
├── doc/
│   └── intro.md
├── project.clj
├── resources/
├── src/
│   └── cljam_start/
│       └── core.clj
└── test/
    └── cljam_start/
        └── core_test.clj
Then add the cljam dependency to the project configuration in project.clj.
(defproject cljam-start "0.1.0-SNAPSHOT"
  :description "FIXME: write description"
  :url "http://example.com/FIXME"
  :license {:name "Eclipse Public License"
            :url "http://www.eclipse.org/legal/epl-v10.html"}
  :dependencies [[org.clojure/clojure "1.8.0"]
                 [cljam "0.8.5"]]) ; <- Add this line
Leiningen automatically downloads Clojure, cljam, and their dependencies when you execute lein deps. The version numbers in your output may differ from the sample below.
$ lein deps
Retrieving org/clojure/clojure/1.8.0/clojure-1.8.0.pom from central
Retrieving org/sonatype/oss/oss-parent/7/oss-parent-7.pom from central
Retrieving cljam/cljam/0.5.0/cljam-0.5.0.pom from clojars
...
Retrieving org/clojure/tools.logging/0.3.1/tools.logging-0.3.1.jar from central
Retrieving org/clojure/clojure/1.8.0/clojure-1.8.0.jar from central
Retrieving org/clojure/tools.cli/0.3.5/tools.cli-0.3.5.jar from central
...
Retrieving org/clojure/tools.nrepl/0.2.12/tools.nrepl-0.2.12.jar from central
Retrieving cljam/cljam/0.5.0/cljam-0.5.0.jar from clojars
Retrieving me/raynes/fs/1.4.6/fs-1.4.6.jar from clojars
...
Download the example SAM/BAM files (test.sam and test.bam) into the resources/ directory.
$ wget https://raw.githubusercontent.com/chrovis/cljam/master/test-resources/sam/test.sam -O resources/test.sam
$ wget https://raw.githubusercontent.com/chrovis/cljam/master/test-resources/bam/test.bam -O resources/test.bam
Clojure provides an interactive shell called the REPL, where you can try Clojure code quickly.
lein repl launches a REPL inside the Leiningen project, with the project's dependencies resolved.
$ lein repl
nREPL server started on port 49451 on host 127.0.0.1 - nrepl://127.0.0.1:49451
REPL-y 0.3.7, nREPL 0.2.12
Clojure 1.8.0
Java HotSpot(TM) 64-Bit Server VM 1.8.0_05-b13
Docs: (doc function-name-here)
(find-doc "part-of-name-here")
Source: (source function-name-here)
Javadoc: (javadoc java-object-or-class-here)
Exit: Control+D or (exit) or (quit)
Results: Stored in vars *1, *2, *3, an exception in *e
user=>
require loads the specified namespace. The :as option gives it a short alias (here, sam) in the current namespace.
user=> (require '[cljam.io.sam :as sam])
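The same pattern works for any namespace. As a minimal sketch that needs no extra dependency, the following loads clojure.string from the standard library; the alias string is an arbitrary choice made for this illustration.

```clojure
;; Load clojure.string and refer to it through the alias `string`.
(require '[clojure.string :as string])

;; Functions are then called as alias/function-name.
(string/upper-case "acgt")
;; => "ACGT"

(string/join "," ["ref" "ref2"])
;; => "ref,ref2"
```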
Open test.sam with reader and read its header information.
user=> (with-open [r (sam/reader "resources/test.sam")]
         (sam/read-header r))
{:SQ [{:SN "ref", :LN 45} {:SN "ref2", :LN 40}]}
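The header is an ordinary Clojure map, so core functions can be applied to it directly. The sketch below copies the map printed above into a var named header (a name chosen here for illustration) so that it runs without cljam.

```clojure
;; Header map copied from the output above.
(def header {:SQ [{:SN "ref", :LN 45} {:SN "ref2", :LN 40}]})

;; Names of the reference sequences, in header order.
(map :SN (:SQ header))
;; => ("ref" "ref2")

;; Map from reference name to its length.
(into {} (map (juxt :SN :LN) (:SQ header)))
;; => {"ref" 45, "ref2" 40}

;; Total length of all references.
(reduce + (map :LN (:SQ header)))
;; => 85
```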
To read alignments, use read-alignments. doall forces the sequence to be realized before with-open closes the reader.
user=> (with-open [r (sam/reader "resources/test.sam")]
         (doall (take 5 (sam/read-alignments r))))
(#cljam.io.protocols.SAMAlignment{:qname r003, :flag 16, :rname ref, :pos 29, :end 33, :mapq 30, :cigar 6H5M, :rnext *, :pnext 0, :tlen 0, :seq TAGGC, :qual *, :options []}
#cljam.io.protocols.SAMAlignment{:qname r001, :flag 163, :rname ref, :pos 7, :end 22, :mapq 30, :cigar 8M4I4M1D3M, :rnext =, :pnext 37, :tlen 39, :seq TTAGATAAAGAGGATACTG, :qual *, :options [{:XX {:type B, :value S,12561,2,20,112}}]}
#cljam.io.protocols.SAMAlignment{:qname r002, :flag 0, :rname ref, :pos 9, :end 18, :mapq 30, :cigar 1S2I6M1P1I1P1I4M2I, :rnext *, :pnext 0, :tlen 0, :seq AAAAGATAAGGGATAAA, :qual *, :options []}
#cljam.io.protocols.SAMAlignment{:qname r003, :flag 0, :rname ref, :pos 9, :end 14, :mapq 30, :cigar 5H6M, :rnext *, :pnext 0, :tlen 0, :seq AGCTAA, :qual *, :options []}
#cljam.io.protocols.SAMAlignment{:qname x3, :flag 0, :rname ref2, :pos 6, :end 27, :mapq 30, :cigar 9M4I13M, :rnext *, :pnext 0, :tlen 0, :seq TTATAAAACAAATAATTAAGTCTACA, :qual ??????????????????????????, :options []})
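Alignments support keyword lookup like maps, so the usual sequence functions apply. The following is a sketch only: it uses plain maps holding a subset of the fields printed above instead of real SAMAlignment records, and reverse-strand? is a hypothetical helper, not a cljam function.

```clojure
;; A subset of the fields printed above, written as plain maps so the
;; snippet runs without cljam.
(def alns
  [{:qname "r003", :flag 16,  :rname "ref",  :pos 29, :end 33}
   {:qname "r001", :flag 163, :rname "ref",  :pos 7,  :end 22}
   {:qname "r002", :flag 0,   :rname "ref",  :pos 9,  :end 18}
   {:qname "r003", :flag 0,   :rname "ref",  :pos 9,  :end 14}
   {:qname "x3",   :flag 0,   :rname "ref2", :pos 6,  :end 27}])

;; Number of alignments per reference.
(frequencies (map :rname alns))
;; => {"ref" 4, "ref2" 1}

;; Hypothetical helper: bit 0x10 of the SAM flag marks the reverse strand.
(defn reverse-strand? [aln]
  (bit-test (:flag aln) 4))

(map :qname (filter reverse-strand? alns))
;; => ("r003")

;; Reference span covered by each alignment (both ends inclusive).
(map (fn [{:keys [pos end]}] (inc (- end pos))) alns)
;; => (5 16 10 6 22)
```

In the project REPL, the literal vector can be replaced with the result of the with-open expression above.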
cljam.algo.sorter provides sorting functions.
user=> (require '[cljam.io.sam :as sam]
                '[cljam.algo.sorter :as sorter])
user=> (with-open [r (sam/reader "resources/test.bam")
                   w (sam/writer "resources/test.sorted.bam")]
         (sorter/sort-by-pos r w))
nil
The above code creates a sorted BAM file (test.sorted.bam).
$ ls resources
test.bam test.sam test.sorted.bam
cljam.algo.sorter/sort-by-pos accepts a reader and a writer as arguments. In this case, the reader is the source BAM and the writer is the sorted BAM that will be created.
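To illustrate what a position-sorted order looks like, here is a small in-memory sketch. It assumes the coordinate order defined by the SAM specification (reference in header order first, then leftmost position) and is not how cljam implements sorting on files; ref-order and coordinate-key are hypothetical names.

```clojure
;; Reference order as given by the @SQ lines of the header.
(def ref-order {"ref" 0, "ref2" 1})

;; First five alignments of test.sam, reduced to the fields used for sorting.
(def alns
  [{:qname "r003", :rname "ref",  :pos 29}
   {:qname "r001", :rname "ref",  :pos 7}
   {:qname "r002", :rname "ref",  :pos 9}
   {:qname "r003", :rname "ref",  :pos 9}
   {:qname "x3",   :rname "ref2", :pos 6}])

;; Sort key: [reference index, leftmost position]. Vectors compare
;; element by element, so the reference index takes priority.
(defn coordinate-key [aln]
  [(ref-order (:rname aln)) (:pos aln)])

(map (juxt :qname :rname :pos) (sort-by coordinate-key alns))
;; => (["r001" "ref" 7] ["r002" "ref" 9] ["r003" "ref" 9]
;;     ["r003" "ref" 29] ["x3" "ref2" 6])
```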
To create a BAM index (BAI),
user=> (require '[cljam.algo.bam-indexer :as bai])
user=> (bai/create-index "resources/test.sorted.bam"
                         "resources/test.sorted.bam.bai")
nil
The index file (test.sorted.bam.bai) has been generated.
$ ls resources
test.bam test.sam test.sorted.bam test.sorted.bam.bai
cljam.algo.depth/depth calculates a simple pileup and returns it as a lazy sequence.
user=> (require '[cljam.algo.depth :as depth])
user=> (with-open [r (sam/reader "resources/test.sorted.bam")]
         (depth/depth r {:chr "ref" :start 1 :end 30}))
(0 0 0 0 0 0 0 1 1 3 3 3 3 3 3 2 3 3 3 2 2 2 2 1 1 1 1 1 1 2)
The pileup calculation requires the BAI, so you need to create the index beforehand.
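The result is a sequence of per-position depths, which can be summarized with core functions. This sketch copies the values printed above into a vector named depths (a name chosen here) so that it runs without cljam.

```clojure
;; Depth values copied from the output above (positions 1 to 30 of "ref").
(def depths [0 0 0 0 0 0 0 1 1 3 3 3 3 3 3 2 3 3 3 2 2 2 2 1 1 1 1 1 1 2])

;; Maximum depth in the region.
(apply max depths)
;; => 3

;; Mean depth; integer division in Clojure yields an exact ratio.
(/ (reduce + depths) (count depths))
;; => 47/30

;; 1-based positions with no coverage.
(keep-indexed (fn [i d] (when (zero? d) (inc i))) depths)
;; => (1 2 3 4 5 6 7)
```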
cljam also provides a command-line interface for trying its features quickly.
$ wget https://github.com/chrovis/cljam/releases/download/0.8.5/cljam
$ chmod +x cljam
Place cljam in a directory on your $PATH (e.g. ~/bin) so that your shell can find it.
cljam has sub-commands similar to those of SAMtools. Use the --help option to print all sub-commands and their descriptions.
$ cljam --help
Usage: cljam {view,convert,normalize,sort,index,pileup,faidx,dict,level,version} ...
Options                Default  Desc
-------                -------  ----
-h, --no-help, --help  false    Show help
Command    Desc
-------    ----
view       Extract/print all or sub alignments in SAM or BAM format.
convert    Convert file format based on the file extension.
normalize  Normalize references of alignments.
sort       Sort alignments by leftmost coordinates.
index      Index sorted alignment for fast random access.
pileup     Generate pileup for the BAM file.
faidx      Index reference sequence in the FASTA format.
dict       Create a FASTA sequence dictionary file.
level      Add level of alignments.
version    Print version number.
cljam [sub-command] --help shows help for each sub-command.
$ cljam view --help
Extract/print all or sub alignments in SAM or BAM format.
Usage: cljam view [--header] [-f FORMAT][-r REGION] <in.bam|sam>
Options:
      --header               Include header
  -f, --format FORMAT  auto  Input file format <auto|sam|bam>
  -r, --region REGION        Only print in region (e.g. chr6:1000-2000)
  -h, --help                 Print help
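The REGION format shown in the help (chr6:1000-2000) carries the same information as the {:chr ... :start ... :end ...} map passed to depth/depth earlier. As a rough sketch of the correspondence, the hypothetical helper below (parse-region is not part of cljam) converts one form into the other.

```clojure
;; Hypothetical helper, not part of cljam: parse "chr:start-end"
;; into a map with :chr, :start and :end. Returns nil if the string
;; does not match that shape.
(defn parse-region [s]
  (when-let [[_ chr start end] (re-matches #"([^:]+):(\d+)-(\d+)" s)]
    {:chr chr
     :start (Long/parseLong start)
     :end (Long/parseLong end)}))

(parse-region "chr6:1000-2000")
;; => {:chr "chr6", :start 1000, :end 2000}

(parse-region "not-a-region")
;; => nil
```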
view prints the contents of a SAM/BAM file and is equivalent to samtools view.
$ cljam view --header resources/test.sam
@SQ SN:ref LN:45
@SQ SN:ref2 LN:40
r003 16 ref 29 30 6H5M * 0 0 TAGGC *
r001 163 ref 7 30 8M4I4M1D3M = 37 39 TTAGATAAAGAGGATACTG * XX:B:S,12561,2,20,112
r002 0 ref 9 30 1S2I6M1P1I1P1I4M2I * 0 0 AAAAGATAAGGGATAAA *
r003 0 ref 9 30 5H6M * 0 0 AGCTAA *
x3 0 ref2 6 30 9M4I13M * 0 0 TTATAAAACAAATAATTAAGTCTACA ??????????????????????????
r004 0 ref 16 30 6M14N1I5M * 0 0 ATAGCTCTCAGC *
r001 83 ref 37 30 9M = 7 -39 CAGCGCCAT *
x1 0 ref2 1 30 20M * 0 0 AGGTTTTATAAAACAAATAA ????????????????????
x2 0 ref2 2 30 21M * 0 0 GGTTTTATAAAACAAATAATT ?????????????????????
x4 0 ref2 10 30 25M * 0 0 CAAATAATTAAGTCTACAGAGCAAC ?????????????????????????
x6 0 ref2 14 30 23M * 0 0 TAATTAAGTCTACAGAGCAACTA ???????????????????????
x5 0 ref2 12 30 24M * 0 0 AATAATTAAGTCTACAGAGCAACT ????????????????????????
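Each alignment line consists of tab-separated columns, and the first 11 correspond to the keys seen on the alignment records in the REPL (:qname, :flag, :rname, and so on). The sketch below shows that mapping with a hypothetical helper, parse-sam-line, which is not part of cljam and keeps every value as a string.

```clojure
(require '[clojure.string :as string])

;; The 11 mandatory SAM fields, in column order.
(def sam-fields
  [:qname :flag :rname :pos :mapq :cigar :rnext :pnext :tlen :seq :qual])

;; Hypothetical helper, not part of cljam: split one tab-separated
;; alignment line and name the mandatory columns. Optional fields
;; after the 11th column are ignored.
(defn parse-sam-line [line]
  (zipmap sam-fields (string/split line #"\t")))

;; First alignment line from the output above, with explicit tabs.
(def line "r003\t16\tref\t29\t30\t6H5M\t*\t0\t0\tTAGGC\t*")

(select-keys (parse-sam-line line) [:qname :rname :pos :cigar])
;; => {:qname "r003", :rname "ref", :pos "29", :cigar "6H5M"}
```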
The convert command converts SAM into BAM or BAM into SAM.
$ cljam convert resources/test.sam /tmp/test.converted.bam
$ cljam convert resources/test.bam /tmp/test.converted.sam
See the command-line tool manual for more information.