In this project we aim to create models that predict the area burned in forest fires. We work with a dataset from the UCI Machine Learning Repository, curated by Paulo Cortez and Anibal Morais of the University of Minho, Portugal.
The data includes meteorological measurements taken at the time each fire was reported, spatial and temporal measurements, and index metrics that take into account weather data from the recent past. The index metrics are part of the Canadian Fire Weather Index (FWI) System. The data was collected from January 2000 to December 2003 in the northeast region of Portugal and covers a total of 517 fires.
Many different ML models can be applied to this problem. In the paper this project is based on, the model that achieved the best performance was a Support Vector Machine (SVM). In this project, however, we test advanced regression techniques only.
- Data exploration and manipulation: reshape the data and remove outliers to improve the dataset.
- Create models that predict the burned area using different regression techniques.
- Compare the models' performance and choose a winner.
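The last two steps can be sketched in a few lines. The example below is only an illustration under stated assumptions: it uses synthetic data rather than the fires dataset, it compares a mean baseline against a one-predictor least-squares fit rather than the regression techniques used in this project, and it scores the models by RMSE on a held-out split, which is one possible comparison metric. All function names (`fit_mean`, `fit_simple_ols`, `compare_models`) are hypothetical.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def fit_mean(x, y):
    """Baseline model: always predict the training mean of y."""
    mean_y = sum(y) / len(y)
    return lambda x_new: [mean_y for _ in x_new]


def fit_simple_ols(x, y):
    """Least-squares line y = intercept + slope * x for a single predictor."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return lambda x_new: [intercept + slope * xi for xi in x_new]


def compare_models(candidates, x_train, y_train, x_test, y_test):
    """Fit each candidate on the training split and score it on the test split."""
    scores = {}
    for name, fit in candidates.items():
        predict = fit(x_train, y_train)
        scores[name] = rmse(y_test, predict(x_test))
    winner = min(scores, key=scores.get)  # lowest held-out RMSE wins
    return scores, winner


# Synthetic data (an assumption for illustration, NOT the fires dataset).
rng = random.Random(0)
x = [rng.uniform(0, 10) for _ in range(100)]
y = [2 + 3 * xi + rng.gauss(0, 1) for xi in x]
x_train, x_test = x[:80], x[80:]
y_train, y_test = y[:80], y[80:]

scores, winner = compare_models(
    {"mean_baseline": fit_mean, "simple_ols": fit_simple_ols},
    x_train, y_train, x_test, y_test,
)
print(scores, winner)
```

Keeping a trivial baseline among the candidates is a deliberate choice here: a model that cannot beat the training mean on held-out data is not worth selecting, whatever its in-sample fit.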
We make use of advanced techniques such as subset selection and the removal of influential points, among others.
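As an illustration of what removing influential points can involve, the sketch below computes Cook's distance for a one-predictor least-squares fit and flags observations above the common `4 / n` rule of thumb. Both the choice of Cook's distance and the threshold are assumptions made for this example; the text above does not specify which influence measure is used. The data is synthetic and the function name `cooks_distance` is hypothetical.

```python
import random


def cooks_distance(x, y):
    """Cook's distance of each observation for the fit y = intercept + slope * x."""
    n = len(x)
    p = 2  # number of fitted parameters: intercept and slope
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(e ** 2 for e in residuals) / (n - p)  # residual variance estimate

    distances = []
    for xi, e in zip(x, residuals):
        h = 1 / n + (xi - x_bar) ** 2 / sxx  # leverage of this observation
        distances.append(e ** 2 / (p * s2) * h / (1 - h) ** 2)
    return distances


# Synthetic data (an assumption for illustration): 20 points near a line,
# plus one high-leverage point far from that line at index 20.
rng = random.Random(1)
x = [float(i) for i in range(20)] + [40.0]
y = [1 + 2 * xi + rng.gauss(0, 0.5) for xi in x[:20]] + [0.0]

distances = cooks_distance(x, y)
threshold = 4 / len(x)  # rule-of-thumb cutoff, not a universal standard
flagged = [i for i, d in enumerate(distances) if d > threshold]
print(flagged)
```

Flagged points are candidates for inspection rather than automatic deletion: whether to drop them depends on whether they reflect measurement problems or genuine extreme events.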
P. Cortez and A. Morais. A Data Mining Approach to Predict Forest Fires using Meteorological Data. In J. Neves, M. F. Santos and J. Machado Eds., New Trends in Artificial Intelligence, Proceedings of the 13th EPIA 2007 - Portuguese Conference on Artificial Intelligence, December, Guimaraes, Portugal, pp. 512-523, 2007. APPIA, ISBN-13 978-989-95618-0-9. Available at: http://www.dsi.uminho.pt/~pcortez/fires.pdf