This project aims to develop a method for classifying whether a given section of primate DNA contains a splice junction and, if it does, whether the junction is exon-intron or intron-exon. This is accomplished with a time series classification model based on Dynamic Time Warping (DTW).
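To make the idea concrete, here is a minimal sketch of DTW-based classification. It is not necessarily the approach used in the notebook: the numeric nucleotide encoding, the 1-nearest-neighbour rule, the function names, and the toy sequences are all assumptions made for illustration.

```python
# Hypothetical numeric encoding of nucleotides; the notebook may encode them differently.
ENCODING = {"A": 1.0, "C": 2.0, "G": 3.0, "T": 4.0}


def encode(sequence):
    """Map a DNA string to a list of numbers so it can be treated as a time series."""
    return [ENCODING[base] for base in sequence.upper()]


def dtw_distance(a, b):
    """Classic dynamic time warping distance with an absolute-difference step cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the cheapest alignment of a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def classify(query, labelled_examples):
    """Return the label of the example closest to `query` under DTW (1-nearest-neighbour)."""
    encoded_query = encode(query)
    best_label, _ = min(
        ((label, dtw_distance(encoded_query, encode(seq))) for seq, label in labelled_examples),
        key=lambda pair: pair[1],
    )
    return best_label


# Made-up toy sequences, purely for illustration (not real data).
examples = [
    ("AAGGTAAGT", "exon-intron"),
    ("TTTCAGGTC", "intron-exon"),
    ("ACGTACGTA", "neither"),
]
```

Calling `classify("AAGGTAAGT", examples)` returns the label of whichever toy sequence has the smallest DTW distance to the query. The pure-Python loop keeps the sketch dependency-free; a real run over many sequences would likely rely on the DTW package listed in the requirements.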
To run the notebook, make sure you have Python 3.7.2, Jupyter Notebook, and (optionally) Anaconda installed, along with the following packages:
- NumPy >= 1.16.2
- pandas >= 0.24.2
- scikit-learn >= 0.20.3
- Matplotlib >= 3.0.3
- DTW >= 1.0.4
All of these except DTW can be installed automatically using the requirements.txt file:
pip install -r requirements.txt
or if you're using Anaconda, create a virtual environment:
conda create --name splice_junction --file requirements.txt
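After installing, it can be useful to confirm that the environment meets the minimum versions listed above. The following is a rough sketch, not part of the project itself: the import names (for example `sklearn` for scikit-learn and `dtw` for DTW) and the reliance on a `__version__` attribute are assumptions, and the helper names are hypothetical.

```python
import importlib

# Minimum versions from the list above; the import names are assumptions.
MINIMUM_VERSIONS = {
    "numpy": "1.16.2",
    "pandas": "0.24.2",
    "sklearn": "0.20.3",
    "matplotlib": "3.0.3",
    "dtw": "1.0.4",
}


def version_tuple(version):
    """Turn '1.16.2' into (1, 16, 2), ignoring non-numeric suffixes such as 'rc1'."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    # Pad short versions such as '1.2' to three components.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def check_requirements(minimums=MINIMUM_VERSIONS):
    """Return a dict mapping each module name to 'ok', 'too old', 'missing', or 'unknown version'."""
    report = {}
    for name, minimum in minimums.items():
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = "missing"
            continue
        installed = getattr(module, "__version__", None)
        if installed is None:
            # Not every package exposes __version__, so this is not treated as a failure.
            report[name] = "unknown version"
        elif version_tuple(installed) >= version_tuple(minimum):
            report[name] = "ok"
        else:
            report[name] = "too old"
    return report
```

Running `print(check_requirements())` in the activated environment prints one status per package.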
Navigate to the directory containing the .ipynb file, then launch Jupyter:
jupyter notebook
Eli Halpern, Abolfazl Saghafi
This project is licensed under the GNU General Public License; see the LICENSE.md file for details.
- Dr. Abolfazl Saghafi, my research advisor who guided me through this project
- Dr. Zhijun Li, head of the Bioinformatics department at USciences