GitHub - omarwagih/rmotifx: Discovery of biological sequence motifs in R

Introduction

This package contains a usable implementation of the motif-x tool in the R programming language. motif-x (short for motif extractor) is a software tool designed to extract overrepresented patterns from any sequence data set. The algorithm is an iterative strategy that builds successive motifs through comparison to a dynamic statistical background. For more information, please refer to the original motif-x resource. Please note that the current implementation only supports sequences that have a fixed length (i.e. are pre-aligned) and a fixed central residue, such as phosphorylation sites.
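The input requirement above (equal-length sequences sharing one central residue) can be checked before running anything. The following is a minimal sketch in base R; `check_centered` is a hypothetical helper written for this illustration, not a function exported by rmotifx, and the two toy 15-mers are invented.

```r
# Hypothetical helper (not part of rmotifx): verifies that all sequences have
# the same odd width and that the middle character equals `central.res`.
check_centered <- function(seqs, central.res) {
  widths <- nchar(seqs)
  if (length(unique(widths)) != 1) {
    stop("Sequences must all have the same length")
  }
  width <- widths[1]
  if (width %% 2 == 0) {
    stop("Sequence length must be odd so that a central position exists")
  }
  centre <- (width + 1) / 2
  # TRUE only if every sequence carries the expected residue at the centre
  all(substr(seqs, centre, centre) == central.res)
}

# Toy 15-mers, invented for illustration
toy <- c("AAAAAAASDAEAAAA", "GGGGGGGSEEDGGGG")
check_centered(toy, "S")  # TRUE
```

Running a check like this first makes it obvious when an input file mixes widths or central residues, which the package, as described above, does not support.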

How to install?

The rmotifx R package can be installed directly from GitHub. First, ensure the devtools package is installed:

install.packages('devtools')

Then install rmotifx:

require(devtools)
install_github('omarwagih/rmotifx')

How to use?

To get started, load the rmotifx package:

require(rmotifx)

The package contains the function motifx which does everything. For a simple run, you will need a foreground and background set of sequences.

We can go ahead and use the sample data provided with the package:

# Get paths to sample files
fg.path = system.file("extdata", "fg-data-ck2.txt", package = "rmotifx")
bg.path = system.file("extdata", "bg-data-serine.txt", package = "rmotifx")

# Read in sequences
fg.seqs = readLines(fg.path)
bg.seqs = readLines(bg.path)

# You can take a look at the format of the sample data
head(fg.seqs)
head(bg.seqs)

Here, the foreground data represents phosphorylation binding sites of Casein Kinase 2. The background data represents 10,000 random serine-centered sites.
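To make the foreground/background comparison concrete, here is a small base R sketch of a one-sided binomial test for a single residue at a single position. It illustrates the general idea only and is not taken from the rmotifx source; the toy 7-mers and the variable names are invented for this example.

```r
# Toy data, invented for illustration (7-mers centred on S)
fg <- c("AADSEAA", "GGDSEGG", "KKDSDKK", "AAESEAA")
bg <- c("AAKSAAA", "GGDSGGG", "LLRSLLL", "PPTSPPP",
        "AAESKAA", "GGGSEGG", "KKDSRKK", "TTASTTT")

pos <- 3    # position to inspect (the residue just before the central S)
res <- "D"  # residue of interest

# Number of foreground sequences carrying `res` at `pos`
fg.hits <- sum(substr(fg, pos, pos) == res)

# Frequency of the same residue/position pair in the background
bg.freq <- mean(substr(bg, pos, pos) == res)

# Probability of seeing at least `fg.hits` hits among the foreground
# sequences if they were drawn at the background frequency
p.value <- pbinom(fg.hits - 1, size = length(fg), prob = bg.freq,
                  lower.tail = FALSE)
p.value
```

A small p-value means the residue is more frequent at that position in the foreground than the background would predict.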

To start the program, run the following:

mot = motifx(fg.seqs, bg.seqs, central.res = 'S', min.seqs = 20, pval.cutoff = 1e-6)
print(mot)

The results returned should have the following format:

| motif           | score            | fg.matches | fg.size | bg.matches | bg.size | fold.increase    |
|-----------------|------------------|------------|---------|------------|---------|------------------|
| .......SD.E.... | 615.305311137178 | 57         | 399     | 23         | 6039    | 37.5093167701863 |
| .......S..EE... | 318.377804126939 | 37         | 342     | 37         | 6016    | 17.5906432748538 |
| .......SD.D.... | 615.305311137178 | 39         | 305     | 12         | 5979    | 63.7106557377049 |
| .......SE.E.... | 314.760503514246 | 24         | 266     | 32         | 5967    | 16.8242481203008 |
| .......S..E.... | 307.652655568589 | 56         | 242     | 325        | 5935    | 4.22581055308328 |
| .......SE.D.... | 315.866504156853 | 21         | 186     | 26         | 5610    | 24.3610421836228 |
| .......S..D.... | 10.915342261675  | 30         | 165     | 233        | 5584    | 4.35739367928209 |
| .......SD...... | 9.3715112092424  | 25         | 135     | 224        | 5351    | 4.42377645502645 |
| .......S.E..... | 7.27014238663954 | 25         | 110     | 342        | 5127    | 3.40709728867624 |

It's that easy!
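The fold.increase column in the table above can be reproduced from the other columns as the ratio of the foreground match rate to the background match rate. The sketch below checks this against the first row. It also counts matches by treating a motif string as a regular expression, which is an assumption based on `.` appearing to stand for any residue; `count_matches` is a hypothetical helper, not part of rmotifx.

```r
# Values copied from the first row of the results table
fg.matches <- 57; fg.size <- 399
bg.matches <- 23; bg.size <- 6039

# Ratio of the match rate in the foreground to that in the background
fold.increase <- (fg.matches / fg.size) / (bg.matches / bg.size)
fold.increase  # 37.50932, agreeing with the table

# Hypothetical helper: count sequences matching a motif, treating the motif
# string as a regular expression in which "." matches any residue
count_matches <- function(motif, seqs) sum(grepl(motif, seqs))

# Toy 15-mers, invented for illustration
toy <- c("AAAAAAASDAEAAAA", "GGGGGGGSEEDGGGG")
count_matches(".......SD.E....", toy)  # 1
```

In the table, each row's fg.size and bg.size equal the previous row's size minus its matches (for example 399 - 57 = 342), which is consistent with matched sequences being set aside before the next motif is searched for.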

For detailed explanations of all parameters and output, check out the documentation by typing ?motifx. You can also refer to the original motif-x resource or paper.

Citation

If you use rmotifx, please cite the following paper:

Wagih O, Sugiyama N, Ishihama Y, Beltrao P. (2015) Uncovering phosphorylation-based specificities through functional interaction networks. Mol. Cell. Proteomics.

Todo

Add support for degenerate motifs
Add support for DNA sequences. Currently, only protein sequences are supported.
Allow motif discovery in non-centered k-mers

Feedback

If you have any feedback or suggestions, please drop me a line at wagih(at)ebi.ac.uk or open an issue on GitHub.
