Mid_bootcamp_project

General information:

The mid-bootcamp project asks: can we predict cancer from gene expression profiles?

1. Select gene expression datasets for different types of cancer.
2. Check and clean the data, and transform it if needed.
3. Identify genes that are differentially expressed (DEGs) in cancer samples vs. normal samples, using a two-sample t-test at a significance level of p_sig = 0.05.
4. Check for highly correlated genes within the identified DEG subset and exclude them, using a correlation threshold of 0.95.
5. Split the working datasets and train the model:
   - using transformed data (quantile transformation) vs. non-transformed data
6. Validation:
   - on the whole (train + test) dataset
   - on a new dataset of the same cancer type
   - calculate all validation metrics: precision, accuracy, recall, F1, cohen_kappa_score
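
The DEG selection and correlation filtering steps above could look roughly like the sketch below. It is an illustration, not the project's actual code: the function names `select_degs` and `drop_correlated`, the samples-by-genes array layout, the greedy drop order, and the synthetic data are all assumptions. Only the t-test cutoff (0.05) and the correlation threshold (0.95) come from the description above.

```python
import numpy as np
from scipy import stats


def select_degs(expr, is_cancer, p_sig=0.05):
    """Return column indices of genes whose mean expression differs
    between cancer and normal samples (two-sample t-test, p < p_sig).

    expr: array of shape (n_samples, n_genes), an assumed layout.
    is_cancer: boolean array of shape (n_samples,).
    """
    _, p_values = stats.ttest_ind(expr[is_cancer], expr[~is_cancer], axis=0)
    return np.flatnonzero(p_values < p_sig)


def drop_correlated(expr, gene_idx, threshold=0.95):
    """Greedily keep genes in order, dropping any gene whose absolute
    Pearson correlation with an already kept gene exceeds the threshold."""
    if len(gene_idx) < 2:
        return gene_idx
    corr = np.abs(np.corrcoef(expr[:, gene_idx], rowvar=False))
    kept = []
    for i in range(len(gene_idx)):
        if all(corr[i, j] <= threshold for j in kept):
            kept.append(i)
    return gene_idx[kept]


# Synthetic example (made-up data): 20 cancer and 20 normal samples, 6 genes.
rng = np.random.default_rng(0)
is_cancer = np.arange(40) < 20
expr = rng.normal(size=(40, 6))
expr[is_cancer, 0] += 2.0                              # gene 0: higher in cancer
expr[is_cancer, 1] -= 2.0                              # gene 1: lower in cancer
expr[:, 2] = expr[:, 0] + 0.01 * rng.normal(size=40)   # gene 2: near-copy of gene 0

degs = select_degs(expr, is_cancer)
kept = drop_correlated(expr, degs)
print("DEGs:", degs)
print("After correlation filter:", kept)
```

Here gene 2 passes the t-test but is dropped by the correlation filter because it is nearly identical to gene 0. Which member of a correlated pair to keep is a design choice; this sketch simply keeps whichever comes first.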

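To make the validation metrics listed above concrete, here is a minimal plain-Python sketch that computes them from binary labels. The function name `binary_metrics`, the label encoding (1 = cancer, 0 = normal), and the example labels are assumptions for illustration. In practice a library such as scikit-learn, whose `sklearn.metrics` module provides `cohen_kappa_score`, would likely be used instead.

```python
def binary_metrics(y_true, y_pred):
    """Compute validation metrics for binary labels (assumed: 1 = cancer, 0 = normal)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    n = tp + tn + fp + fn

    accuracy = (tp + tn) / n
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)

    # Cohen's kappa: observed agreement corrected for chance agreement p_e.
    p_e = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (accuracy - p_e) / (1 - p_e) if p_e != 1 else 0.0

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "cohen_kappa": kappa,
    }


# Made-up example: 8 samples, one false negative and one false positive.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

With these labels, accuracy, precision, recall and F1 all come out at 0.75, while Cohen's kappa is 0.5 because half of the agreement would be expected by chance on a balanced dataset.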