
Comprehensive project on data warehousing and association rule mining applied to real crime datasets.


vyclu20/CrimeDataWarehousing-AssociationMining


CrimeDataWarehousing-AssociationMining

Project was last updated on: 12 June 2023

This was a university project for a data warehousing unit/module.

This project covered data cleaning/ETL scripts, SQL queries, and a Power BI report. It involved implementing a star schema in SQL Server Management Studio, building a multi-dimensional Analysis Services solution, and visualizing the results in Power BI. I received a grade of around 80% for this project.

Clarification

I didn't upload the CSV files because the roughly 8 files would add little when it comes to showcasing this project. They held the data from the crime dataset, linked by ID numbers across the dimension and fact tables: CSV files for the 2 location dimensions, 1 CSV file for the date dimension, the location IDs, the crime ID itself for identification, and so on. Essentially, one large CSV file containing everything you'd need about a single crime (one row per crime) was split into several separate but still interconnected files.
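To make the split described above concrete, here is a minimal Python sketch of turning one flat table into dimension tables and a fact table linked by IDs. It is not the project's actual ETL script: the column names (`crime_type`, `city`, `neighbourhood`), the sample rows, and the helpers `build_dimension` and `split_flat_table` are all hypothetical, and a single location dimension stands in for the two used in the project.

```python
import csv
import io

# Hypothetical flat extract: one row per crime. Column names and values
# are illustrative only and do not come from the real dataset.
FLAT_CSV = """crime_id,crime_type,date,city,neighbourhood
1,Burglary,2020-01-05,Springfield,Northside
2,Assault,2020-01-05,Springfield,Downtown
3,Burglary,2020-02-11,Shelbyville,Downtown
"""


def build_dimension(rows, columns):
    """Assign a surrogate ID to each distinct combination of `columns`."""
    ids = {}
    for row in rows:
        key = tuple(row[c] for c in columns)
        if key not in ids:
            ids[key] = len(ids) + 1  # IDs start at 1, in first-seen order
    return ids


def split_flat_table(flat_text):
    """Split the flat rows into a date dimension, a location dimension,
    and a fact table that references both by ID."""
    rows = list(csv.DictReader(io.StringIO(flat_text)))
    date_dim = build_dimension(rows, ["date"])
    location_dim = build_dimension(rows, ["city", "neighbourhood"])
    fact = [
        {
            "crime_id": int(row["crime_id"]),
            "date_id": date_dim[(row["date"],)],
            "location_id": location_dim[(row["city"], row["neighbourhood"])],
            "crime_type": row["crime_type"],
        }
        for row in rows
    ]
    return date_dim, location_dim, fact


date_dim, location_dim, fact = split_flat_table(FLAT_CSV)
```

Each dictionary could then be written out as its own CSV file, so every fact row stays connected to its dimensions through the shared IDs.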

Project Overview

This project applies data warehousing and association rule mining to real crime datasets.

⭒■━━━━━━ˁᱸᲲᱸˀ━━━━━━■⭒

Data Warehousing Design and Implementation:

Creation of a star schema using SQL Server Management Studio.

Population of tables through SQL commands from cleaned CSV files.

Development of a multi-dimensional analysis service solution in SQL Server Data Tools.
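As a rough illustration of the star schema idea, the sketch below builds a tiny fact table surrounded by two dimension tables and runs one roll-up query. It uses Python's built-in `sqlite3` module as a stand-in for SQL Server, so it is not the project's SQL script. The table names (`dim_date`, `dim_location`, `fact_crime`), the columns, and the sample rows are assumptions made for the example.

```python
import sqlite3


def build_star_schema(conn):
    """Create two dimension tables and one fact table (hypothetical names)."""
    conn.executescript(
        """
        CREATE TABLE dim_date (
            date_id   INTEGER PRIMARY KEY,
            full_date TEXT,
            year      INTEGER,
            month     INTEGER
        );
        CREATE TABLE dim_location (
            location_id   INTEGER PRIMARY KEY,
            city          TEXT,
            neighbourhood TEXT
        );
        CREATE TABLE fact_crime (
            crime_id    INTEGER PRIMARY KEY,
            date_id     INTEGER NOT NULL REFERENCES dim_date(date_id),
            location_id INTEGER NOT NULL REFERENCES dim_location(location_id),
            crime_type  TEXT NOT NULL
        );
        """
    )


def load_sample(conn):
    """Insert a few made-up rows; a real load would read the cleaned CSVs."""
    conn.executemany(
        "INSERT INTO dim_date VALUES (?, ?, ?, ?)",
        [(1, "2020-01-05", 2020, 1), (2, "2021-02-11", 2021, 2)],
    )
    conn.executemany(
        "INSERT INTO dim_location VALUES (?, ?, ?)",
        [(1, "Springfield", "Northside"), (2, "Shelbyville", "Downtown")],
    )
    conn.executemany(
        "INSERT INTO fact_crime VALUES (?, ?, ?, ?)",
        [(1, 1, 1, "Burglary"), (2, 1, 1, "Assault"),
         (3, 2, 2, "Burglary"), (4, 2, 1, "Theft")],
    )


def crimes_per_city_and_year(conn):
    """Roll the fact table up along the location and date dimensions."""
    return conn.execute(
        """
        SELECT l.city, d.year, COUNT(*) AS crimes
        FROM fact_crime AS f
        JOIN dim_date AS d ON d.date_id = f.date_id
        JOIN dim_location AS l ON l.location_id = f.location_id
        GROUP BY l.city, d.year
        ORDER BY l.city, d.year
        """
    ).fetchall()


conn = sqlite3.connect(":memory:")
build_star_schema(conn)
load_sample(conn)
summary = crimes_per_city_and_year(conn)
```

The query joins the fact table to each dimension on its ID and groups by dimension attributes, which is the same kind of slice a multi-dimensional cube exposes.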

Visualization in Power BI:

Utilization of Power BI for visualizing data and presenting insights.

Integration of StarNet diagrams and cube diagrams to enhance data interpretation.

Association Rule Mining:

Processing the crime dataset into a case table and a nested table.

Explanation and interpretation of the top k association rules, focusing on the "crime" type.
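The case table / nested table layout and the top k rules can be sketched in plain Python as follows. This is a brute-force illustration under stated assumptions, not the mining algorithm or tooling used in the project: the item labels (`crime=...`, `zone=...`, `time=...`), the sample rows, and the helpers `build_transactions` and `top_k_crime_rules` are hypothetical. Rules are restricted to a crime type on the right-hand side and ranked by confidence, then support.

```python
from itertools import combinations

# Case table: one row per crime. Nested table: one row per (crime, attribute).
# All IDs and labels below are made up for illustration.
CASE_TABLE = [1, 2, 3, 4]
NESTED_TABLE = [
    (1, "crime=Burglary"), (1, "zone=Downtown"), (1, "time=Night"),
    (2, "crime=Burglary"), (2, "zone=Downtown"), (2, "time=Day"),
    (3, "crime=Assault"), (3, "zone=Downtown"), (3, "time=Night"),
    (4, "crime=Burglary"), (4, "zone=Northside"), (4, "time=Night"),
]


def build_transactions(case_ids, nested_rows):
    """Group nested rows by case ID into one item set per crime."""
    transactions = {cid: set() for cid in case_ids}
    for cid, item in nested_rows:
        transactions[cid].add(item)
    return transactions


def top_k_crime_rules(transactions, k=3, min_support=0.25, max_size=2):
    """Return the k best rules of the form antecedent -> crime type,
    as (antecedent, crime, support, confidence) tuples."""
    itemsets = list(transactions.values())
    n = len(itemsets)

    def support(items):
        # Fraction of cases that contain every item in `items`.
        return sum(1 for t in itemsets if items <= t) / n

    all_items = sorted(set().union(*itemsets))
    crimes = [i for i in all_items if i.startswith("crime=")]
    others = [i for i in all_items if not i.startswith("crime=")]

    rules = []
    for size in range(1, max_size + 1):
        for antecedent in combinations(others, size):
            lhs = frozenset(antecedent)
            lhs_support = support(lhs)
            if lhs_support == 0:
                continue  # antecedent never occurs, so no rule is possible
            for crime in crimes:
                rule_support = support(lhs | {crime})
                if rule_support < min_support:
                    continue
                confidence = rule_support / lhs_support
                rules.append((antecedent, crime, rule_support, confidence))

    # Highest confidence first, then highest support; names break ties.
    rules.sort(key=lambda r: (-r[3], -r[2], r[0], r[1]))
    return rules[:k]


transactions = build_transactions(CASE_TABLE, NESTED_TABLE)
rules = top_k_crime_rules(transactions, k=3)
```

Here support is the share of all crimes matching both sides of a rule, and confidence is that share divided by the share matching the antecedent alone. With only four sample cases several rules tie, so the alphabetical tie-break decides the order.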

Deliverables:

Power BI file and PDF report for visualizations.

Data cleaning/ETL script and SQL script with cleaned CSV files.

Solution project file and folder for the SSDT analysis service multi-dimensional project.

PDF file detailing the association rule mining process and results.

⭒■━━━━━━━━━━━━━━━■⭒
