This repository contains data, scripts, and documentation for analyzing and predicting crime rates in Chicago from weather data. The models include Linear Regression and Random Forest regression, with ARIMA for time series analysis, all implemented in R using RStudio.
- Data Collection and Cleaning: Scripts and methodologies for acquiring and cleaning the crime and weather datasets.
- Exploratory Data Analysis (EDA): Initial analysis to uncover patterns and insights within the data.
- Feature Engineering: Techniques employed to create predictive features from raw data.
- Data Modeling: Implementation of various statistical models including Linear Regression and Random Forest Regression.
- Time Series Analysis: Utilization of ARIMA models to understand and predict the temporal patterns of crime.
- Documentation: Detailed project reports and presentation slides explaining the methodologies, results, and implications of the findings.
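
As a rough illustration of the feature engineering and modeling steps listed above, the sketch below derives a calendar feature and fits a linear regression with base R. The data are synthetic, and the column names (`date`, `temp_c`, `precip_mm`, `crime_count`) are assumptions made for illustration, not the repository's actual schema.

```r
# Synthetic daily data standing in for a merged crime/weather table.
# Column names and the data-generating process are illustrative assumptions.
set.seed(42)
n <- 365
daily <- data.frame(
  date      = seq(as.Date("2020-01-01"), by = "day", length.out = n),
  temp_c    = 10 + 15 * sin(2 * pi * (seq_len(n) - 100) / 365) + rnorm(n, sd = 3),
  precip_mm = pmax(0, rnorm(n, mean = 2, sd = 4))
)
daily$crime_count <- round(
  200 + 3 * daily$temp_c - 1.5 * daily$precip_mm + rnorm(n, sd = 10)
)

# Feature engineering: flag weekends (wday is 0 for Sunday, 6 for Saturday).
daily$is_weekend <- as.POSIXlt(daily$date)$wday %in% c(0, 6)

# Chronological split: first 80% of days for training, the rest for evaluation.
split_at <- floor(0.8 * n)
train <- daily[seq_len(split_at), ]
test  <- daily[(split_at + 1):n, ]

# Linear regression of daily crime counts on weather and calendar features.
fit  <- lm(crime_count ~ temp_c + precip_mm + is_weekend, data = train)
pred <- predict(fit, newdata = test)

# Root mean squared error on the held-out days.
rmse <- sqrt(mean((test$crime_count - pred)^2))
print(summary(fit)$coefficients)
print(rmse)
```

A chronological split is used here rather than a random one so that the model is always evaluated on days later than those it was trained on, which is closer to how a forecast would be used.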
- `scripts/`: Contains R scripts for data preprocessing, exploratory analysis, and modeling.
- `data/`: Raw and processed datasets used in the analyses.
- `docs/`: Project reports and presentation materials.
- `figures/`: Generated plots and figures to visualize insights and model results.
- Predictive Modeling: The implementation of machine learning models provides forecasts of crime occurrences, aiding in proactive public safety planning.
- Temporal Analysis: Time series analysis highlights significant seasonal and temporal trends in crime rates, which are crucial for planning law enforcement interventions.
- Impact of Weather: Correlations between weather conditions and crime rates were explored, emphasizing the influence of environmental factors on crime.
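
The kind of seasonal pattern described above can be illustrated with a seasonal ARIMA fit. The project lists the `forecast` package; this sketch instead uses `arima()` from base R's `stats` package so that it runs without extra installs. The monthly series is synthetic, and the model order is chosen by hand for the example, not taken from the project.

```r
# Synthetic monthly crime counts with a yearly cycle
# (illustrative only; not the project's data).
set.seed(1)
n_months <- 72
t_idx    <- seq_len(n_months)
counts   <- 5000 + 600 * sin(2 * pi * t_idx / 12) + rnorm(n_months, sd = 150)
crime_ts <- ts(counts, start = c(2015, 1), frequency = 12)

# Seasonal ARIMA(1,0,0)(0,1,1)[12]: the seasonal difference removes the
# yearly cycle. The order is an assumption made for this sketch.
fit <- arima(
  crime_ts,
  order    = c(1, 0, 0),
  seasonal = list(order = c(0, 1, 1), period = 12),
  method   = "ML"
)

# Forecast the next 12 months, with standard errors for each horizon.
fc <- predict(fit, n.ahead = 12)
print(fc$pred)
print(fc$se)
```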
- R: For all data processing and analysis tasks.
- Libraries: `dplyr`, `tidyr`, `ggplot2`, `forecast`, `caret`, and `randomForest`.
- Software: RStudio for script execution and development.
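
Before running the scripts, it may help to check that the libraries listed above are installed. The snippet below is a convenience sketch, not part of the repository's scripts.

```r
# Packages named in the list above.
required <- c("dplyr", "tidyr", "ggplot2", "forecast", "caret", "randomForest")

# Identify which of them are not available in the current R library paths.
installed    <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- required[!installed]

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
  # Uncomment to install them from CRAN:
  # install.packages(missing_pkgs)
} else {
  message("All required packages are installed.")
}
```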
- Clone this repository.
- Navigate to the `scripts/` directory.
- Run the scripts in the order specified in the `README.md`.
- Rohan Sattarapu