GCSENet

GCSENet is a novel learning-based framework for miRNA-disease association identification. It combines graph convolutional networks (GCN), convolutional neural networks (CNN) and Squeeze-and-Excitation Networks (SENet).

The repository contains two folders: ‘code’ and ‘data’.

The ‘code’ folder includes ‘1.Generate Feature by GCN’, ‘2.Feature Process’, ‘3.Train’ and ‘4.Test’, which hold the code to generate features, process features, train the model and test the model, respectively.
The ‘data’ folder contains ‘generate feature’, ‘process feature’, ‘CNN_SENet’ and ‘Test’, which store the raw data, the feature components (miRNA-gene, disease-gene), the miRNA-disease features and the test dataset, respectively.

#Dependencies

GCSENet was implemented with Python 3.6.4. To run GCSENet, you need these packages:
Matplotlib (3.1.1), (https://pypi.org/project/matplotlib/).
Networkx (2.5), (https://pypi.org/project/networkx/).
Tensorflow-gpu (1.4.0), (https://pypi.tuna.tsinghua.edu.cn/simple/tensorflow-gpu/).
Numpy (1.19.1), (https://pypi.org/project/numpy/).
Pandas (0.25.3), (https://pypi.org/project/pandas/).
Sklearn (0.20.3), (https://pypi.org/project/sklearn/).
Scipy (1.5.2), (https://pypi.org/project/scipy/).
In addition, CUDA 8.0 and cuDNN 6.0 were used.
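
Before running the pipeline it can help to confirm that the environment matches the pinned versions. The following is a minimal sketch, not part of the repository: `PINNED`, `find_mismatches` and `installed_versions` are hypothetical names, and it assumes that "Sklearn" in the list refers to the `scikit-learn` distribution.

```python
# Pinned versions copied from the dependency list above.
# ASSUMPTION: "Sklearn (0.20.3)" refers to the scikit-learn distribution.
PINNED = {
    "matplotlib": "3.1.1",
    "networkx": "2.5",
    "tensorflow-gpu": "1.4.0",
    "numpy": "1.19.1",
    "pandas": "0.25.3",
    "scikit-learn": "0.20.3",
    "scipy": "1.5.2",
}


def find_mismatches(pinned, installed):
    """Return {package: (wanted, found)} for every package whose installed
    version differs from the pin; found is None when it is not installed."""
    mismatches = {}
    for name, wanted in pinned.items():
        found = installed.get(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


def installed_versions(names):
    """Look up installed versions with pkg_resources (ships with setuptools)."""
    import pkg_resources

    found = {}
    for name in names:
        try:
            found[name] = pkg_resources.get_distribution(name).version
        except pkg_resources.DistributionNotFound:
            found[name] = None
    return found
```

Calling `find_mismatches(PINNED, installed_versions(PINNED))` returns an empty dictionary when every pinned package is installed at the listed version.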

#How to reproduce our results:

   Download the code package from https://github.com/Appleabc123/GCSENet.

Step 1. Get the feature vectors (disease-gene, miRNA-gene)

   Set data_path in main.py to the folder containing the original disease data (d-d.csv, g-g.csv, d-g.csv, disease_name.csv, gene_name.csv).
   
   Run main.py to obtain the disease-gene vectors. The output file, ‘disease-gene.csv’, is saved in the ‘../data/process_feature’ folder.
   
   Set data_path in main.py to the folder containing the original miRNA data (m-m.csv, g-g.csv, m-g.csv, miRNA_name.csv, gene_name.csv).
   
   Run main.py again to obtain the miRNA-gene vectors. The output file, ‘miRNA-gene.csv’, is saved in the ‘../data/process_feature’ folder.

Step 2. Get the weighted features (disease-miRNA) and labels

   Run process_feature.py to obtain the disease-miRNA vectors. The output consists of two files, ‘disease-miRNA.csv’ and ‘label.csv’, which are saved in the ‘../data/CNN_SENet’ folder.

Step 3. Train the network

   Run CS_train.py to train the model.

Step 4. Test on the benchmark2019 set to get the AUROC, AUPR, Precision, Recall and F1-score

   Run CS_test.py to test the model.

#How to use the framework on your own datasets (disease, gene, miRNA) for training or testing

Step 1. Generate the miRNA-gene and disease-gene feature vectors

(1). Put your disease data (d-d.csv, g-d.csv, g-g.csv, disease_name.csv, gene_name.csv) in the ‘../data/Generate feature/disease-gene’ folder and your miRNA data (g-g.csv, g-m.csv, m-m.csv, gene_name.csv, miRNA_name.csv) in the ‘../data/Generate feature/miRNA-gene’ folder.
(2). Set these parameters in main.py:
     data_path = '../data/Generate feature/disease-gene' #the directory where you saved the raw data (disease-gene or miRNA-gene).
     save_path = '../data/process_feature'  #the directory where the disease-gene and miRNA-gene feature vectors are saved; they are used in step 2.
(3). Run ‘main.py’ once for each data_path to get the disease-gene and miRNA-gene feature vectors, which are saved in save_path.
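
A missing input file is an easy mistake to make when preparing your own data. The sketch below checks a data directory before main.py is run. It is not part of the repository: `missing_inputs` and the two file lists are hypothetical helpers, and the file names are simply those listed in item (1) above.

```python
import os

# File names taken from item (1) above; adjust them if your files differ.
DISEASE_GENE_FILES = ["d-d.csv", "g-d.csv", "g-g.csv",
                      "disease_name.csv", "gene_name.csv"]
MIRNA_GENE_FILES = ["g-g.csv", "g-m.csv", "m-m.csv",
                    "gene_name.csv", "miRNA_name.csv"]


def missing_inputs(data_path, required_files):
    """Return the required file names that are not present in data_path."""
    return [name for name in required_files
            if not os.path.isfile(os.path.join(data_path, name))]
```

For example, `missing_inputs('../data/Generate feature/disease-gene', DISEASE_GENE_FILES)` returns an empty list when all five disease files are in place.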

Step 2. Process the feature vectors to get the miRNA-disease features and labels

(1). Put the positive samples (pos.txt) and negative samples (neg.txt) in ‘../data/process_feature’.
(2). Set these parameters in process_feature.py:
     input_postive = '../data/process_feature/pos.txt' #the path to the positive sample file from the test dataset. In GCSENet, the positive samples come from benchmark2019.
     input_negative = '../data/process_feature/neg.txt' #the path to the negative sample file. The negative samples in the test dataset do not appear in the dataset used to generate the positive samples.
     Output = '../data/CNN_SENet' #the directory where the output files (miRNA-disease features and labels) are saved.
(3). Run ‘process_feature.py’ to get the miRNA-disease features (disease-miRNA.csv) and labels (label.csv), which are used in step 3.
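
The requirement that negative samples must not overlap the positive ones can be checked before the features are built. The sketch below is illustrative only and does not reproduce process_feature.py. It assumes that pos.txt and neg.txt hold one tab-separated pair of names per line and that labels are 1 for positives and 0 for negatives; the real file layout and label encoding may differ. `parse_pairs` and `label_samples` are hypothetical names.

```python
def parse_pairs(lines):
    """Parse sample pairs from an iterable of text lines.

    ASSUMED format: two tab-separated names per line.
    Blank or malformed lines are skipped.
    """
    pairs = []
    for line in lines:
        fields = [f.strip() for f in line.rstrip("\n").split("\t")]
        if len(fields) >= 2 and fields[0] and fields[1]:
            pairs.append((fields[0], fields[1]))
    return pairs


def label_samples(pos_pairs, neg_pairs):
    """Return (pairs, labels) with label 1 for positives and 0 for negatives.

    Raises ValueError if a pair appears in both sets, because a negative
    sample must not also be a known association.
    """
    overlap = set(pos_pairs) & set(neg_pairs)
    if overlap:
        raise ValueError("pairs listed as both positive and negative: %s"
                         % sorted(overlap))
    pairs = list(pos_pairs) + list(neg_pairs)
    labels = [1] * len(pos_pairs) + [0] * len(neg_pairs)
    return pairs, labels
```

With real files, pass the open file objects directly, for example `parse_pairs(open('../data/process_feature/pos.txt'))`.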

Step 3. Train the model

Run CS_train.py to train the model.

Step 4. Test the model

Run CS_test.py to get the AUROC, AUPR, Precision, Recall and F1-score.
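
For readers who want to see what the reported metrics measure, here is a dependency-free sketch of their usual definitions. It is not the code in CS_test.py. AUPR is approximated here by average precision, which is one common estimate, and the repository may compute it differently. In practice `sklearn.metrics` provides `roc_auc_score`, `average_precision_score`, `precision_score`, `recall_score` and `f1_score` for the same purpose.

```python
def precision_recall_f1(y_true, y_pred):
    """Precision, recall and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def auroc(y_true, y_score):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. Quadratic in the number of samples, which
    is fine for a sketch but slow on large test sets.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(y_true, y_score):
    """Sum over thresholds of (recall increase) * (precision at threshold)."""
    n_pos = sum(1 for t in y_true if t == 1)
    if n_pos == 0:
        raise ValueError("average precision needs at least one positive")
    ap, prev_recall = 0.0, 0.0
    for thr in sorted(set(y_score), reverse=True):
        tp = sum(1 for t, s in zip(y_true, y_score) if s >= thr and t == 1)
        predicted = sum(1 for s in y_score if s >= thr)
        recall = tp / n_pos
        ap += (recall - prev_recall) * (tp / predicted)
        prev_recall = recall
    return ap
```

`auroc` and `average_precision` take the raw model scores, while `precision_recall_f1` takes hard predictions, so the scores must first be thresholded (0.5 is a common but arbitrary choice).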
