Home

FastRandomForest (FastRF) is a re-implementation of the Random Forest (RF) classifier for the Weka machine learning environment. FastRF improves on the speed and memory use of the original Weka RF, particularly for datasets with large numbers of features or instances.

The current version, FastRF 2.0 beta, uses an algorithmic optimization to improve efficiency over the standard Random Forest algorithm (as implemented in the previous FastRF 0.99 and in Weka RF) while retaining the accuracy of predictions. Specifically, each tree is built from its own subset of the attributes in the dataset. In the standard RF, by contrast, individual nodes are constructed using subsets of attributes, but there are no tree-wise constraints.
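The difference between node-wise and tree-wise attribute sampling can be illustrated with a minimal sketch. This is not the FastRF source code: the class name FeatureSubsetSketch, its methods, and the values of featuresPerTree and featuresPerNode are hypothetical and chosen only for illustration, and the sketch assumes uniform sampling without replacement.

```java
import java.util.Arrays;
import java.util.Random;

// Illustrative only: hypothetical names, not taken from the FastRF code base.
public class FeatureSubsetSketch {

    /** Returns the attribute indices 0..n-1. */
    public static int[] range(int n) {
        int[] r = new int[n];
        for (int i = 0; i < n; i++) {
            r[i] = i;
        }
        return r;
    }

    /** Draws k distinct entries from pool uniformly at random (partial Fisher-Yates shuffle). */
    public static int[] sample(int[] pool, int k, Random rng) {
        int[] copy = Arrays.copyOf(pool, pool.length);
        for (int i = 0; i < k; i++) {
            int j = i + rng.nextInt(copy.length - i);
            int tmp = copy[i];
            copy[i] = copy[j];
            copy[j] = tmp;
        }
        return Arrays.copyOf(copy, k);
    }

    public static void main(String[] args) {
        Random rng = new Random(42);
        int numAttributes = 100;   // attributes in the dataset (assumed value)
        int featuresPerTree = 20;  // tree-wise subset size (assumed value)
        int featuresPerNode = 5;   // node-wise subset size (assumed value)

        int[] allAttributes = range(numAttributes);

        // Standard RF: every node draws its candidates from all attributes.
        int[] standardNode = sample(allAttributes, featuresPerNode, rng);

        // Tree-wise constraint: the tree first fixes its own attribute subset,
        // and every node of that tree draws candidates only from that subset.
        int[] treeAttributes = sample(allAttributes, featuresPerTree, rng);
        int[] constrainedNode = sample(treeAttributes, featuresPerNode, rng);

        System.out.println("Standard RF node candidates: " + Arrays.toString(standardNode));
        System.out.println("Tree attribute subset:       " + Arrays.toString(treeAttributes));
        System.out.println("Constrained node candidates: " + Arrays.toString(constrainedNode));
    }
}
```

In this sketch, a tree that only ever looks at its own attribute subset would need to keep just those columns in memory while it is being built, which is one plausible source of the speed and memory savings described above.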

FastRF 2.0b was developed by Jordi Piqué Sellés at the Genome Data Science lab of the IRB Barcelona. The code is a much-improved version of FastRF 0.99 (by Fran Supek), which is itself loosely based on the RF implementation in Weka 3.6.

How is FastRF implemented?

An explanation of how the algorithm is organized into Java classes and how they interact. Also lists the options.

Algorithmic Optimizations

The list of optimizations and changes added in each version of FastRandomForest, up to FastRF 2.0 beta.

FastRF accuracy and speed benchmarks

A comparison of the accuracy of FastRF 2.0 beta versus FastRF 0.99 and versus the Weka 3.8.1 Random Forest implementation.

Important FastRF parameters

A discussion of the choice of the ‘number of attributes per leaf’ (m_Kvalue) and ‘per tree’ (m_numFeatTree) parameters.

Future work on FastRandomForest

The changes and improvements that we are planning for a future version. We welcome user feedback!

To visit the old repository of FastRandomForest 0.99, please follow this link: https://code.google.com/archive/p/fast-random-forest/
