GitHub - amritbhanu/EDM591_Hyperparameter: Hyper Parameter Optimization for Education Mining

EDM591_Hyperparameter

Data folder contains the raw data as well as the preprocessed data. You will need the raw folder and the datset2 folder inside. The dataset1, dataset2, and dataset3 CSVs will be generated automatically.
Dump folder contains the result dumps from running our scripts, so that results can be regenerated quickly.
Results folder contains all our graphs and results for the report and ppt. They will be generated automatically on running our scripts.
Src directory contains all our scripts.
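
To check what a dump holds before plotting, a file from the dump folder can be opened with Python's standard pickle module. The sketch below is only an illustration: `load_dump` and `describe` are hypothetical helpers, not part of this repository, the relative path assumes the script is run from the src folder, and the internal structure of the dumps is not documented here, so the code only reports the top-level type.

```python
import pickle


def load_dump(path):
    """Load one pickled results file and return the stored object."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


def describe(obj):
    """Return a one-line summary of the top-level object in a dump."""
    if isinstance(obj, dict):
        return "dict with keys: " + ", ".join(sorted(str(k) for k in obj))
    if isinstance(obj, (list, tuple)):
        return "%s of length %d" % (type(obj).__name__, len(obj))
    return type(obj).__name__


if __name__ == "__main__":
    # Assumed location: dump folder next to src, file name as generated by untuned.py.
    path = "../dump/dataset1_untuned.pickle"
    try:
        print(describe(load_dump(path)))
    except (IOError, OSError) as err:
        print("Could not open %s: %s" % (path, err))
```

Only unpickle files you trust, since loading a pickle can execute arbitrary code. If a dump was written by Python 2 and fails to load under Python 3, `pickle.load(handle, encoding="latin1")` is a common workaround.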

DE.py is the generalised code/class of our DE. We are planning to publish this DE as a python package.
demos.py runs our main script by calling its functions with the parameters passed as arguments.
main.py is the main code which runs our tuned results and generates dump.
ML.py is the generalised code of all our Machine learning implementations.
Preprocess_dataset1_3.py is the preprocess script for dataset1 and dataset3.
preprocessing_dataset2.py is the preprocess script for dataset2.
read_pickle.py reads all the result dumps from the dump folder and generates graphs in the results folder.
sk.py is the code for our statistical test, which is Scott-Knott.
untuned.py is another main code which runs our untuned results and generates dump.
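
For readers unfamiliar with the technique, here is a minimal sketch of the kind of search loop a DE-based tuner runs. It assumes that "DE" stands for differential evolution, which this README does not spell out, and it is not the code in DE.py. The function name, parameter names, and default values are all illustrative choices.

```python
import random


def differential_evolution(objective, bounds, pop_size=10, f=0.75, cr=0.3,
                           generations=20, seed=0):
    """Minimise objective over box bounds with a basic DE/rand/1/bin loop.

    Illustrative only: names and defaults are assumptions, not this repo's API.
    """
    rng = random.Random(seed)
    dims = len(bounds)
    # Start from candidates drawn uniformly inside the bounds.
    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(pop_size)]
    scores = [objective(ind) for ind in population]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct members other than the current one.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dims)  # this dimension is always mutated
            trial = []
            for d in range(dims):
                if d == forced or rng.random() < cr:
                    value = population[a][d] + f * (population[b][d] - population[c][d])
                    lo, hi = bounds[d]
                    value = min(max(value, lo), hi)  # clip back into the bounds
                else:
                    value = population[i][d]
                trial.append(value)
            trial_score = objective(trial)
            # Keep the trial only if it is at least as good as the current member.
            if trial_score <= scores[i]:
                population[i], scores[i] = trial, trial_score
    best = min(range(pop_size), key=lambda idx: scores[idx])
    return population[best], scores[best]


if __name__ == "__main__":
    # Toy objective standing in for a model's validation error.
    sphere = lambda x: sum(v * v for v in x)
    print(differential_evolution(sphere, [(-5.0, 5.0), (-5.0, 5.0)]))
```

In a tuning setting, `objective` would train a learner with the candidate hyperparameters and return an error measure to minimise. The toy sphere function is used here only to keep the example self-contained.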

Go into the src folder and run the commands in the order listed below.

1. 'python preprocessing_dataset2.py'
2. 'python Preprocess_dataset1_3.py'
3. 'python untuned.py _test dataset1' : this will generate dataset1_untuned.pickle in the dump folder
4. 'python untuned.py _test dataset2' : this will generate dataset2_untuned.pickle in the dump folder
5. 'python untuned.py _test dataset3' : this will generate dataset3_untuned.pickle in the dump folder
6. To run the following scripts you will need High Performance Computing (HPC) servers, since each script takes 4-8 hours to finish. If they can't be run, we have provided the dump of our results; jump directly to step 7.
   - 'python main.py _test dataset1' : this will generate dataset1.pickle and dataset1_late.pickle in the dump folder
   - 'python main.py _test dataset2' : this will generate dataset2.pickle and dataset2_late.pickle in the dump folder
   - 'python main.py _test dataset3' : this will generate dataset3.pickle and dataset3_late.pickle in the dump folder
7. 'python read_pickle.py' : this will generate graphs in the results folder.
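
The sequence above can also be driven from a small wrapper script, sketched below. The commands are copied from the steps listed here. The wrapper itself, including `build_commands`, `run_pipeline`, and the `include_tuned` and `dry_run` flags, is a hypothetical convenience and not part of the repository. It assumes it is run from inside the src folder and that `python` on the PATH is the interpreter the scripts expect.

```python
import subprocess

DATASETS = ["dataset1", "dataset2", "dataset3"]


def build_commands(include_tuned=False):
    """Return the documented commands, in order, as argument lists."""
    commands = [
        ["python", "preprocessing_dataset2.py"],
        ["python", "Preprocess_dataset1_3.py"],
    ]
    commands += [["python", "untuned.py", "_test", name] for name in DATASETS]
    if include_tuned:
        # The long-running tuned runs (4-8 hours each); skip to reuse the provided dumps.
        commands += [["python", "main.py", "_test", name] for name in DATASETS]
    commands.append(["python", "read_pickle.py"])
    return commands


def run_pipeline(include_tuned=False, dry_run=True):
    """Print each command and, unless dry_run is set, execute it."""
    for cmd in build_commands(include_tuned):
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the sequence as soon as one script fails.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run by default: only prints the commands in order.
    run_pipeline(include_tuned=False, dry_run=True)
```

Leaving `include_tuned` off mirrors the shortcut described above, where the provided dumps stand in for the long main.py runs. Set `dry_run=False` to actually execute the scripts.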
