librtd

This project aims to make DNA and RNA k-mer return time distribution analysis simple, fast, and generalizable.

What is a k-mer return time distribution?

Consider the DNA sequences AAAAAAAAAAAAAAAAATTTTTTTTTTTTTTTTT and ATATATATATATATATATATATATATATATATAT. Conventional k-mer frequency-based analysis methods would treat these sequences identically: for k=1, the number of A 1-mers is exactly equal to the number of T 1-mers in both sequences. K-mer return time methods instead ask the following question: for a given k-mer, how far (in bases) is it from the next occurrence of a k-mer (usually the same one)?

Going back to the example above, for k=1, the return times for A in the first sequence would be 1, 1, 1... since each A is one base away from the next A. For the second sequence, they would be 2, 2, 2... since it takes two bases to get from each A to the next A.

In librtd, we have generalized the concept of k-mer return time to include the distance of a k-mer not only to the next occurrence of itself but also to the next occurrence of another k-mer. In the first sequence, the return times from A to T are 17, 16, ..., 2, 1. This is useful for studying the relationship between the location of various pairs of k-mers, not just individual k-mers.
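The definition above can be sketched in a few lines of plain Python. This is an illustration of the concept only, not librtd's API: the function name `return_times` is hypothetical, the sequences are assumed to be 17 A's followed by 17 T's and 17 repeats of AT (consistent with the return times quoted above), and occurrences with no later target k-mer are simply skipped, which may differ from how librtd treats sequence ends.

```python
from bisect import bisect_right


def return_times(seq, source, target):
    """Distances from each occurrence of `source` to the next occurrence of `target`.

    Hypothetical helper for illustration; not part of librtd.
    Both k-mers must have the same length k.
    """
    k = len(source)
    if len(target) != k:
        raise ValueError("source and target k-mers must have the same length")

    # Start positions of every occurrence of the target k-mer, in ascending order.
    target_positions = [i for i in range(len(seq) - k + 1) if seq[i:i + k] == target]

    times = []
    for i in range(len(seq) - k + 1):
        if seq[i:i + k] != source:
            continue
        # Index of the first target position strictly greater than i.
        j = bisect_right(target_positions, i)
        if j < len(target_positions):
            times.append(target_positions[j] - i)
        # Assumption: occurrences with no later target are skipped.
    return times


first = "A" * 17 + "T" * 17   # assumed form of the first example sequence
second = "AT" * 17            # assumed form of the second example sequence

a_to_a_first = return_times(first, "A", "A")    # all 1s
a_to_a_second = return_times(second, "A", "A")  # all 2s
a_to_t_first = return_times(first, "A", "T")    # 17, 16, ..., 2, 1
```

Passing the same k-mer as both `source` and `target` gives the classic return time; passing two different k-mers gives the generalized pairwise version.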

Once the k-mer return times have been calculated, librtd can automatically compute the mean and standard deviation of the return times for each k-mer, allowing the distribution to be easily summarized. In the first example, the mean distance from A to T is 9 with a standard deviation of about 5, whereas in the second example, the mean distance is 1 with a standard deviation of 0.
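The summary figures quoted above can be checked with Python's standard `statistics` module. This is a sketch, not librtd's own code: the helper name `summarize` is hypothetical, the return times are written out by hand from the two examples, and the population standard deviation is used as an assumption, since the text does not say which estimator librtd applies.

```python
from statistics import mean, pstdev

# A-to-T return times from the two example sequences, written out by hand.
a_to_t_first = list(range(17, 0, -1))  # 17, 16, ..., 2, 1
a_to_t_second = [1] * 17               # every A is followed directly by a T


def summarize(times):
    """Return (mean, population standard deviation) of a list of return times.

    Hypothetical helper; assumes the population form of the standard deviation.
    """
    return mean(times), pstdev(times)


first_mean, first_sd = summarize(a_to_t_first)     # mean 9, sd about 4.9
second_mean, second_sd = summarize(a_to_t_second)  # mean 1, sd 0
```

With the sample standard deviation (`statistics.stdev`) the first value would be about 5.05 instead, so both conventions round to the 5 quoted above.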

This technique is useful in applications wherever alignment-free sequence analysis is used, from phylogeny to metagenomics. Give librtd a try!
