`krust`: counts k-mers, written in rust

krust is a k-mer counter, a bioinformatics 101 tool that counts the frequency of every substring of length k in DNA sequence data. krust is written in Rust and runs from the command line. It takes a FASTA file of DNA sequences and outputs every canonical k-mer with its frequency across all records in the file. (Because DNA is double-stranded, each k-mer has a reverse complement; the canonical form is a single representative of that pair.) krust is tested for accuracy against jellyfish.
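
As a rough illustration of what counting canonical k-mers means, here is a minimal sketch using only the Rust standard library. It is not krust's actual implementation. The function names are hypothetical, and the sketch assumes the common convention that the canonical form is the lexicographically smaller of a k-mer and its reverse complement. It also skips any window containing a character other than uppercase A, C, G, or T.

```rust
use std::collections::HashMap;

/// Reverse complement of a DNA k-mer.
/// Returns None if the k-mer contains anything other than A, C, G, or T.
fn reverse_complement(kmer: &[u8]) -> Option<Vec<u8>> {
    kmer.iter()
        .rev()
        .map(|b| match *b {
            b'A' => Some(b'T'),
            b'C' => Some(b'G'),
            b'G' => Some(b'C'),
            b'T' => Some(b'A'),
            _ => None,
        })
        .collect()
}

/// Canonical form (assumed convention): the lexicographically smaller
/// of the k-mer and its reverse complement.
fn canonical(kmer: &[u8]) -> Option<Vec<u8>> {
    let rc = reverse_complement(kmer)?;
    Some(if rc.as_slice() < kmer { rc } else { kmer.to_vec() })
}

/// Count canonical k-mers in a single sequence.
fn count_canonical_kmers(seq: &[u8], k: usize) -> HashMap<Vec<u8>, u64> {
    let mut counts = HashMap::new();
    // `windows` panics on a size of 0, so guard against it.
    if k == 0 {
        return counts;
    }
    for window in seq.windows(k) {
        // Windows with non-ACGT characters are skipped in this sketch.
        if let Some(c) = canonical(window) {
            *counts.entry(c).or_insert(0) += 1;
        }
    }
    counts
}

fn main() {
    let counts = count_canonical_kmers(b"ATGCAT", 3);
    // Sort for stable output, since HashMap iteration order is unspecified.
    let mut sorted: Vec<_> = counts.iter().collect();
    sorted.sort();
    for (kmer, count) in sorted {
        println!("{}\t{}", String::from_utf8_lossy(kmer), count);
    }
}
```

For the input ATGCAT with k = 3, the windows ATG and CAT are reverse complements of each other, as are TGC and GCA, so the sketch reports two canonical 3-mers, ATG and GCA, each with a count of 2.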

krust: counts k-mers, written in rust

Usage: krust <k> <path>

Arguments:
  <k>     provides k length, e.g. 5
  <path>  path to a FASTA file, e.g. /home/lisa/bio/cerevisiae.pan.fa

Options:
  -h, --help     Print help information
  -V, --version  Print version information

krust supports either rust-bio or needletail for reading FASTA records. Use the --features flag to select one.

Run krust with rust-bio's FASTA reader to count 5-mers like this:

cargo run --release --features rust-bio -- 5 your/local/path/to/fasta_data.fa

or, to count 21-mers with needletail as the FASTA reader, like this:

cargo run --release --features needletail -- 21 your/local/path/to/fasta_data.fa

krust prints to stdout, alternating a line that holds a count (prefixed with >) with a line that holds the corresponding k-mer:

>114928
ATGCC
>289495
AATCA
...
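
The output layout above can be sketched as follows. This is an illustration, not krust's own code: `write_counts` is a hypothetical helper, and the two entries are the sample values shown above.

```rust
use std::io::{self, Write};

/// Write each (k-mer, count) pair as two lines:
/// a `>`-prefixed count, then the k-mer itself.
fn write_counts<W: Write>(counts: &[(&str, u64)], mut out: W) -> io::Result<()> {
    for (kmer, count) in counts {
        writeln!(out, ">{}", count)?;
        writeln!(out, "{}", kmer)?;
    }
    Ok(())
}

fn main() -> io::Result<()> {
    let counts = [("ATGCC", 114_928_u64), ("AATCA", 289_495_u64)];
    // Lock stdout once rather than on every write.
    let stdout = io::stdout();
    write_counts(&counts, stdout.lock())
}
```

Because the helper is generic over `Write`, the same function can target a file or an in-memory buffer instead of stdout.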

suchapalaver/krust
