Home
FastRandomForest is a re-implementation of the Random Forest classifier (RF) for the Weka machine learning environment. FastRF brings speed and memory use improvements over the original Weka RF, particularly for datasets with large numbers of features or instances.
The current version, FastRF 2.0 beta, employs a particular algorithmic trick to improve efficiency over the standard Random Forest algorithm (as implemented in the previous FastRF 0.99 or as in Weka RF), while retaining the accuracy of predictions. In particular, each tree is built from a subset of attributes from the entire dataset. In comparison, in the standard RF, individual nodes are constructed using subsets of attributes, but there are no tree-wise constraints.
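The difference between the two sampling schemes can be illustrated with a minimal sketch. This is not the FastRF source code: the class, the method, and the values below are hypothetical. The variable names `numFeatTree` and `kValue` merely echo the `m_numFeatTree` and `m_Kvalue` parameters mentioned on this wiki, and their mapping to the actual implementation is an assumption.

```java
import java.util.Arrays;
import java.util.Random;

// Illustrative sketch only; not part of FastRF or Weka.
public class TreeWiseFeatureSampling {

    /** Draws k distinct values from pool with a partial Fisher-Yates shuffle. */
    public static int[] sampleWithoutReplacement(int[] pool, int k, Random rng) {
        if (k < 0 || k > pool.length) {
            throw new IllegalArgumentException("k must be between 0 and pool.length");
        }
        int[] copy = Arrays.copyOf(pool, pool.length);
        for (int i = 0; i < k; i++) {
            // Pick a random element from the not-yet-chosen tail and swap it into slot i.
            int j = i + rng.nextInt(copy.length - i);
            int tmp = copy[i];
            copy[i] = copy[j];
            copy[j] = tmp;
        }
        return Arrays.copyOf(copy, k);
    }

    public static void main(String[] args) {
        int numAttributes = 100; // hypothetical dataset width
        int numFeatTree = 30;    // hypothetical per-tree attribute budget
        int kValue = 5;          // hypothetical number of attributes tried at a node
        Random rng = new Random(42);

        int[] allAttributes = new int[numAttributes];
        for (int i = 0; i < numAttributes; i++) {
            allAttributes[i] = i;
        }

        // Standard RF: each node samples its candidates from the full attribute set.
        int[] standardNode = sampleWithoutReplacement(allAttributes, kValue, rng);

        // Tree-wise constraint: the tree first fixes its own attribute subset,
        // and every node of that tree samples candidates only from that subset.
        int[] treeAttributes = sampleWithoutReplacement(allAttributes, numFeatTree, rng);
        int[] constrainedNode = sampleWithoutReplacement(treeAttributes, kValue, rng);

        System.out.println("Standard RF node candidates: " + Arrays.toString(standardNode));
        System.out.println("Attributes visible to tree:  " + Arrays.toString(treeAttributes));
        System.out.println("Tree-wise node candidates:   " + Arrays.toString(constrainedNode));
    }
}
```

Under this scheme a tree only ever needs to look at its own attribute subset, which is one plausible source of the speed and memory savings described above.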
FastRF 2.0b was developed by Jordi Piqué Sellés at the Genome Data Science lab of the IRB Barcelona. The code is a much-improved version of FastRF 0.99 (by Fran Supek), which is itself loosely based on the RF implementation in Weka 3.6.
-
An explanation of how the algorithm is organized into Java classes and how they interact, along with a list of the available options.
-
The list of optimizations and changes that have been added in each version of FastRandomForest up to FastRF 2.0 beta.
-
FastRF accuracy and speed benchmarks
A comparison of the accuracy and speed of FastRF 2.0 beta versus FastRF 0.99 and versus the Weka 3.8.1 Random Forest implementation.
-
A discussion on the choice of the ‘number of attributes per leaf’ m_Kvalue and ‘per tree’ m_numFeatTree.
-
Future work on FastRandomForest
The changes and improvements that we are planning for a future version. We welcome user feedback!
To visit the old repository of FastRandomForest 0.99, please follow this link: https://code.google.com/archive/p/fast-random-forest/