These scripts perform random forest (RF) classification (RF_classification.R) and regression (RF_regression.R). Each script has three sections: 1) training the RF model, 2) testing the accuracy of the model, and 3) assessing variable importance.
Detailed descriptions of the scripts and their outputs are available at the following links:
- RF-classification: https://avishekdutta14.github.io/rfc/
- RF-regression: https://avishekdutta14.github.io/rfr/
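
The three sections mentioned above (training, accuracy testing, and variable importance) can be sketched for the classification case as follows. This is only an illustrative outline, not the code in RF_classification.R: it assumes the randomForest package is installed, uses R's built-in iris data as a stand-in for real input, and the 70/30 split, the seed, and all variable names are arbitrary choices made for this example.

```r
# Minimal sketch, NOT the repository's scripts.
# Assumptions: randomForest is installed; iris stands in for real input data.
library(randomForest)

set.seed(42)  # make the split and the forest reproducible

# 1) Train: hold out 30% of the rows for testing (split ratio is an assumption)
n <- nrow(iris)
train_idx <- sample(n, size = round(0.7 * n))
train <- iris[train_idx, ]
test <- iris[-train_idx, ]

# importance = TRUE also computes permutation-based importance
rf_model <- randomForest(Species ~ ., data = train,
                         ntree = 500, importance = TRUE)

# 2) Test: compare predictions on the held-out rows with the true labels
pred <- predict(rf_model, newdata = test)
conf_mat <- table(observed = test$Species, predicted = pred)
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)

# 3) Variable importance: one row per predictor
var_imp <- importance(rf_model)

print(conf_mat)
print(accuracy)
print(var_imp)
```

With a numeric response instead of a factor, the same `randomForest()` call fits a regression forest, and accuracy would then be judged with an error measure such as RMSE rather than a confusion matrix.
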
The paprica_to_rf.R script converts paprica outputs to RF inputs. See the script itself for details about which paprica files can be used as input.
If you use this code, please cite the following publications, as well as the other libraries used in the scripts:
- Dutta, A., Goldman, T., Keating, J., Burke, E., Williamson, N., Dirmeier, R., & Bowman, J. S. (2022). Machine Learning Predicts Biogeochemistry from Microbial Community Structure in a Complex Model System. Microbiology Spectrum, 10(1), e01909-21.
- Liaw, A., & Wiener, M. (2002). Classification and regression by randomForest. R News, 2(3), 18–22.