Logistic regression analysis performed on a complex dataset of mushroom profiles with the goal of determining the characteristics which make a mushroom edible or inedible to humans.

cmdecesaris/MushroomClassification

Mushroom Classification

Project Introduction

Mushrooms are diverse, complex structures with a multitude of overlapping and contrasting characteristics between species. To better understand how mushroom characteristics affect edibility, a previously unanalyzed dataset containing 173 detailed mushroom profiles was processed and fit with a logistic regression model using edibility status as a binary outcome (Wagner et al., 2021). Model inference revealed that mushrooms with white stems, a winter growing season, or brown caps have a lower probability of being inedible, while mushrooms with bell-shaped caps or green caps have a higher probability of being inedible. The overall model fit was sufficient, but future studies would benefit from datasets with larger sample sizes across all predictors.
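As a rough illustration of the modeling approach described above, the following R sketch fits a logistic regression with a binary edibility outcome. It does not use the project data: the data frame is simulated, and the column names (cap_shape, stem_color, inedible), factor levels, and effect sizes are all assumptions made for this example, not the variables or results in Logistic_Modeling.R.

```r
# Minimal sketch only: simulated data, hypothetical column names and levels.
set.seed(1)
n <- 200
mushrooms <- data.frame(
  cap_shape  = factor(sample(c("bell", "convex", "flat"), n, replace = TRUE)),
  stem_color = factor(sample(c("white", "brown", "yellow"), n, replace = TRUE))
)

# Simulate a binary outcome: 1 = inedible, 0 = edible (effect sizes are made up)
linear_predictor <- -0.5 +
  1.2 * (mushrooms$cap_shape == "bell") -
  1.0 * (mushrooms$stem_color == "white")
mushrooms$inedible <- rbinom(n, size = 1, prob = plogis(linear_predictor))

# Logistic regression: binomial family with the default logit link
fit <- glm(inedible ~ cap_shape + stem_color, data = mushrooms, family = binomial)

# Exponentiated coefficients are odds ratios relative to each factor's
# reference level (the first level alphabetically by default)
odds_ratios <- exp(coef(fit))
print(round(odds_ratios, 2))
```

An odds ratio below 1 indicates lower odds of being inedible than the reference level, and a value above 1 indicates higher odds. See Report.pdf for the actual model and its interpretation.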

Data Description

This dataset includes 61069 hypothetical mushrooms with caps based on 173 species (353 mushrooms per species). Each mushroom is identified as definitely edible, definitely poisonous, or of unknown edibility and not recommended (the latter class was combined with the poisonous class).
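The class merge described above can be sketched in R as follows. The label strings ("edible", "poisonous", "unknown") are hypothetical placeholders for this example; the encoding actually used in primary_data.csv and Preprocessing.R may differ.

```r
# Hypothetical labels standing in for the three edibility classes
status <- c("edible", "poisonous", "unknown", "edible", "unknown")

# Combine "unknown" with "poisonous": 1 = inedible, 0 = edible
inedible <- ifelse(status == "edible", 0L, 1L)

# Cross-tabulate to confirm how each original class was mapped
print(table(status, inedible))

# Check the stated dataset size: 173 species x 353 mushrooms per species
stopifnot(173 * 353 == 61069)
```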

Original Owner and Donor: D. Wagner

For more about the data generation see: https://mushroom.mathematik.uni-marburg.de/files/

Running the Project

  1. Preprocessing.R is sourced to run before the Graphs_Tables.R and Logistic_Regression.R files
  2. The Graphs_Tables.R and Logistic_Regression.R files can be run independently
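One possible way to follow this order from an R session is sketched below, assuming the working directory is the repository root. The file names are copied from the steps above; the existence check is an addition for this example, so that a missing or differently named script is skipped with a message rather than stopping the run.

```r
# Run order from the steps above: preprocessing first, then the two
# analysis scripts (which are independent of each other).
scripts <- c("Preprocessing.R", "Graphs_Tables.R", "Logistic_Regression.R")

for (script in scripts) {
  if (file.exists(script)) {
    source(script)
  } else {
    message("Skipping missing file: ", script)
  }
}
```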

Files in this Repository

  1. primary_data.csv: data donated by D. Wagner used in this project
  2. Preprocessing.R: File which cleans and wrangles the data into a suitable form for visualization and modeling approaches
  3. Graphs_Tables.R: File which generates the majority of graphs and tables found in the Report.pdf and Summary_Poster.pdf
  4. Logistic_Modeling.R: File contains the modeling approach, diagnostics, and evaluation of models
  5. Report.pdf: A detailed report of the project and methods used in code files. Reference this file for model and data interpretations
  6. Summary_Poster.pdf: Summary of the report and key parts of the project

References

Blackwell, M. (2011). The fungi: 1, 2, 3 ... 5.1 million species? American journal of botany, 98(3), 426–438. https://doi.org/10.3732/ajb.1000298

Casadevall, A., Heitman, J., & Buckley, M. (2008). The Fungal Kingdom: Diverse and Essential Roles in Earth's Ecosystem.

Faraway, J. (2016). Extending the linear model with R. Second Edition, Chapman and Hall. ISBN 9781498720960

Harding, P. (2013). Mushrooms and Toadstools. Dorling Kindersley.

Wagner, D., Heider, D., & Hattab, G. (2021). Mushroom data creation, curation, and simulation to support classification tasks. Scientific reports, 11(1), 8134. https://doi.org/10.1038/s41598-021-87602-3

Wasser, S. P. (2017). Medicinal Mushrooms in Human Clinical Studies. Part I. Anticancer, Oncoimmunological, and Immunomodulatory Activities: A Review. International journal of medicinal mushrooms, 19(4), 279–317. https://doi.org/10.1615/IntJMedMushrooms.v19.i4.10
