Evaluation of Factors and Strategies to Mitigate Colony Loss in US
Project for the course of NonParametric Statistics, 2022, Politecnico di Milano
Bees are among the most ecologically and commercially important insects in the world. In this project we:
- analyze the risk factors associated with the disappearance of colonies,
- estimate the economic impact of lost honey production (losses from missed pollination could be even more significant),
- show that these losses are not sustainable for beekeepers.
Finally, we use these results to provide suggestions that can help public authorities develop better resilience plans.
git clone https://github.com/eugeniovaretti/honeybeehealth
packages_list <- c("aplpack", "car", "DepthProc", "dplyr", "factoextra", "fda", "fdacluster", "fdaPDE", "fdatest", "forecast", "ggplot2", "gridExtra", "hexbin", "ISLR2", "latex2exp", "lattice", "maps", "MASS", "mgcv", "np", "pbapply", "plotly", "raster", "RColorBrewer", "readr", "rgl", "roahd", "robustbase", "sf", "shiny", "splines", "stats", "survival", "survminer", "svglite", "tidyverse", "tseries", "usmap", "viridis", "visdat", "weathermetrics")
install.packages(packages_list)
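Some entries in the list (for example stats and splines) ship with R itself, and others may already be present on your machine. The following is a minimal sketch, assuming you only want to install what is missing; the helper names `missing_packages` and `install_missing` are hypothetical and are not part of the repository.

```r
# Hypothetical helper (not part of the repository): return the packages in
# `wanted` that do not appear in `installed`. Duplicates are dropped.
missing_packages <- function(wanted,
                             installed = rownames(installed.packages())) {
  setdiff(unique(wanted), installed)
}

# Hypothetical wrapper: install only the missing packages and return their
# names invisibly. Nothing is installed when the vector is empty.
install_missing <- function(wanted) {
  to_install <- missing_packages(wanted)
  if (length(to_install) > 0) {
    install.packages(to_install)
  }
  invisible(to_install)
}

# Dry run on an illustrative subset: this only reports what is missing.
# "stats" is a base package, so it is never reported.
print(missing_packages(c("stats", "mgcv", "mgcv")))
```

With the list defined above, `install_missing(packages_list)` would then replace the plain `install.packages(packages_list)` call.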
In our analysis, we use several public data sets:
- factors that directly stress bee colonies, such as Varroa mites and pesticides (source: USDA)
- other potentially influential factors, in particular temperature, drought, and precipitation (source: National Centers for Environmental Information)
- annual honey production and price for each state (source: USDA)
The data required by each notebook is stored in the same folder as its code.
To facilitate further research, a cleaned version of the dataset is provided in the folder FinalDataSet.
The report and the presentation can be found in the Presentation and Report folder.
The folder Map_Visualization contains a Shiny tool that produces plots such as the average losses per state.
The folder DataConsistency contains Data_Consistency.ipynb, which explains some apparent inconsistencies in the data.
Analysis of potential explanatory factors (AnalysisOfFactors)
- Exploration_Plots.ipynb: first exploratory analysis of the distribution of colony losses across states and seasons, the effects of the stressors, and the creation of survival metrics.
- Functional_Depth_Measure.R: outlier detection in the available features using functional boxplots and outliergrams.
- Functional_Permutation_Tests.R: functional permutation tests to check for differences in the distributions of the features across different seasons, quarters...
- paired univariate permutation test.Rmd: paired two-population permutation test to check whether summer losses differ significantly from winter losses.
- anova_for_varroa.Rmd: permutational ANOVA between 3 clusters based on minimum temperature, to check whether the influence of Varroa differs significantly among the groups.
- Multivariate_Depth_Measure.R: outlier detection in the multivariate case.
- data_outputs/: contains the output (and input) datasets of the bayesmix analysis, used by bnp_clus_and_func.Rmd.
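The idea behind the paired permutation test listed above can be sketched in base R. This is an illustration under stated assumptions, not the code of the notebook: the test statistic (absolute mean of the within-pair differences), the function name `paired_perm_test`, and the synthetic data are all hypothetical.

```r
# Paired two-sample permutation test on the mean difference.
# Under H0 the two measurements of each pair are exchangeable, so the sign of
# every within-pair difference can be flipped at random.
paired_perm_test <- function(x, y, n_perm = 5000) {
  stopifnot(length(x) == length(y))
  d <- x - y
  t_obs <- abs(mean(d))
  t_perm <- replicate(n_perm, {
    signs <- sample(c(-1, 1), length(d), replace = TRUE)
    abs(mean(signs * d))
  })
  # Count the observed statistic too, so the p-value is never exactly zero.
  p_value <- (sum(t_perm >= t_obs) + 1) / (n_perm + 1)
  list(statistic = t_obs, p_value = p_value)
}

# Synthetic data (illustrative only, not taken from the repository):
# one winter and one summer loss percentage per unit, winter being higher.
set.seed(42)
winter <- rnorm(40, mean = 15, sd = 5)
summer <- winter - rnorm(40, mean = 4, sd = 3)

res <- paired_perm_test(summer, winter)
print(res)
```

Flipping signs within pairs, rather than shuffling all observations, preserves the pairing between the summer and winter measurements of the same unit.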
A model of the loss-stressors relation (LossModel)
- Gam_final.Rmd: code for the final GAM.
- bnp_clus_and_func.Rmd: tentative BNP clustering of the time series using bayesmix.
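For readers unfamiliar with GAMs, a minimal sketch with the mgcv package (already in the package list) may help. The data are synthetic and the column names `varroa`, `temp_min`, and `loss_pct` are hypothetical; the covariates, response family, and smooth terms actually used in Gam_final.Rmd may differ.

```r
library(mgcv)  # recommended package, normally shipped with R

# Synthetic data (illustrative only): a loss percentage driven by a nonlinear
# effect of a mite index and a linear effect of minimum temperature.
set.seed(1)
n <- 300
toy <- data.frame(
  varroa = runif(n, 0, 60),
  temp_min = runif(n, -10, 25)
)
toy$loss_pct <- 10 + 8 * sin(toy$varroa / 10) - 0.2 * toy$temp_min +
  rnorm(n, sd = 2)

# One smooth term per covariate; REML selects the smoothing parameters.
fit <- gam(loss_pct ~ s(varroa) + s(temp_min), data = toy, method = "REML")
print(summary(fit))

# Predictions for two hypothetical new conditions.
new_obs <- data.frame(varroa = c(5, 30), temp_min = c(0, 15))
pred <- predict(fit, newdata = new_obs)
print(pred)
```

Calling `plot(fit, pages = 1)` afterwards draws the estimated smooth effect of each covariate, which is usually the quickest way to read such a model.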
Quantification of the impact of losses on beekeepers and of their economic magnitude (ImpactModel)
- Survival.Rmd: survival analysis to quantify the impact of colony loss on beekeepers.
- SpatialRegression_ColonyLossPct_Price_Plots.R: penalized spatial spline regression models with the money loss or the colony loss percentage as target.
- TimeSpatialRegr_Price.R: penalized semiparametric regression model for spatial functional data, with the money loss in k$ per 100 colonies present in a given state in a given quarter as target, plus an Eigen Sign-flip score test on the beta coefficients of the parametric part of the model.
- TimeSpatialRegr_AbsColonyLoss.R: penalized semiparametric regression model for spatial functional data with the absolute colony loss as target.
- TimeSpatialRegr_ColonyLossPct.R: penalized semiparametric regression model for spatial functional data with the colony loss percentage as target.
- TimeSpatialRegr_ColonyLossPct_Inference.R: one-at-a-time and simultaneous Eigen Sign-flip score tests on the beta coefficients of the parametric part of the spatial functional regression model.
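As a rough illustration of the survival-analysis step, the sketch below fits Kaplan-Meier curves and runs a log-rank test with the survival package (already in the package list). Everything here is assumed for the example: the data are synthetic, the two groups and the censoring time are hypothetical, and the event is left generic because its definition in Survival.Rmd is not described here.

```r
library(survival)  # recommended package, normally shipped with R

# Synthetic data (illustrative only): time to a generic event, in quarters,
# for two hypothetical groups with different event rates.
set.seed(7)
n <- 100
toy <- data.frame(
  group = rep(c("low_stress", "high_stress"), each = n / 2)
)
rate <- ifelse(toy$group == "high_stress", 0.25, 0.10)
event_time <- rexp(n, rate = rate)

# Administrative censoring after 12 quarters of follow-up.
censor_time <- 12
toy$time <- pmin(event_time, censor_time)
toy$status <- as.integer(event_time <= censor_time)  # 1 = event observed

# Kaplan-Meier estimate of the survival function in each group.
km <- survfit(Surv(time, status) ~ group, data = toy)
print(km)

# Log-rank test for a difference between the two survival curves.
lr <- survdiff(Surv(time, status) ~ group, data = toy)
print(lr)
```

Both estimators are nonparametric, so no distributional form is imposed on the event times.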
Utilities (utils)
- utils/: contains utility functions for managing data and plots.
- Marta Cerri (@martacerri)
- Luca Mainini (@lucamainini)
- Lupo Marsigli (@LupoMarsigli)
- Eugenio Varetti (@eugeniovaretti)