This application builds a decision tree from entropy calculations and classifies the HASTIE file with a hit rate higher than 99%.
Requirements:
Spyder 4
The attached file Hastie10_2Corrected.txt, downloaded to the C: directory. It is the HASTIE dataset obtained with the facilities of sklearn, corrected so that every record used in training is correct and the real precision of the algorithm can be measured (see https://github.com/ablanco1950/HASTIE_Corrected_HitRate_vs_Sensitivity).
Usage:
Run from Spyder:
HASTIEDecisionTree_C4-5.py
The program uses the first 9,600 records of Hastie10_2Corrected.txt for training and the last 2,400 records for testing. Any other test file with the same HASTIE structure can be used by changing the file assignment on line 111 of the program and the values of the StartTest and EndTest parameters at the beginning of the program.
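The split and the hit rate can be illustrated with a minimal sketch. It is not the code of HASTIEDecisionTree_C4-5.py: the function names split_records and hit_rate are hypothetical, the records are dummy stand-ins for a 12,000-record file, and it assumes that StartTest and EndTest behave like a zero-based, end-exclusive range of record positions, which the program may define differently.

```python
def split_records(records, start_test, end_test):
    """Return (train, test); test is records[start_test:end_test], train is the rest.

    Assumption: start_test is zero-based and end_test is exclusive.
    """
    test = records[start_test:end_test]
    train = records[:start_test] + records[end_test:]
    return train, test


def hit_rate(predicted, actual):
    """Fraction of predictions that match the true labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot compute a hit rate on an empty test set")
    hits = sum(1 for p, a in zip(predicted, actual) if p == a)
    return hits / len(actual)


# Dummy (record_number, label) pairs standing in for the real file contents.
records = [(i, 1 if i % 2 else -1) for i in range(12000)]

# First 9,600 records for training, last 2,400 for testing.
train, test = split_records(records, 9600, 12000)
print(len(train), len(test))  # prints "9600 2400"
```

With the labels predicted for the test records in one list and the true labels in another, hit_rate returns the value that the program reports as a percentage.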
Also included are all the programs that were used to calculate each node of the tree and its decision value from the entropy calculation.
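As an illustration of how a node's decision value can be derived from entropy, the following sketch picks the threshold on one feature that maximizes information gain. It is an assumption about the general technique, not a copy of the included programs: the function names are hypothetical, the data is a toy example, and it uses plain information gain, whereas the programs may use a different criterion such as the gain ratio of C4.5.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(values, labels, threshold):
    """Entropy reduction obtained by splitting on value <= threshold."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - weighted


def best_threshold(values, labels):
    """Return (threshold, gain) for the midpoint between sorted values with the highest gain."""
    distinct = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    if not candidates:
        return None, 0.0
    scored = ((t, information_gain(values, labels, t)) for t in candidates)
    return max(scored, key=lambda pair: pair[1])


# Toy feature column with HASTIE-style labels (-1 / 1).
values = [0.5, 1.0, 1.5, 3.0, 3.5, 4.0]
labels = [-1, -1, -1, 1, 1, 1]
threshold, gain = best_threshold(values, labels)
print(threshold, gain)
```

Repeating this search over every feature and keeping the feature and threshold with the highest gain gives one node; applying it again to each resulting subset grows the tree.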
References:
Various practical-work materials from the Artificial Intelligence course at the Higher Polytechnic School of the Autonomous University of Madrid.
https://github.com/ablanco1950/HASTIE_Corrected_HitRate_vs_Sensitivity
https://github.com/ablanco1950/HASTIE_NAIVEBAYES
https://github.com/ablanco1950/SKLEARN_HitRate_vs_Sensitivity