This project involves comprehensive data analysis and visualization of a retail sales dataset. The primary goal is to derive actionable insights from the data, which includes details about orders, customers, products, and regions. The project utilizes various tools and techniques, including Power BI, to create interactive dashboards that showcase key metrics and trends.
- Power BI: Used for creating interactive dashboards and visualizations.
- SQL: Employed for data extraction and manipulation.
- R: Utilized for data cleaning and preprocessing.
- Excel: Assisted in initial data exploration and verification.
The dataset used in this project is derived from a fictional retail store, providing a rich mix of data fields including:
- Order Details: Order ID, Order Date, Ship Date, Ship Mode
- Customer Information: Customer ID, Customer Name, Segment, Country, City, State, Postal Code, Region
- Product Details: Product ID, Category, Sub-Category, Product Name
- Financial Metrics: Sales, Quantity, Discount, Profit
- Total Sales: Sum of all sales transactions.
- Average Order Value: Average value per order, providing insight into customer spending habits.
- Customer Segmentation: Breakdown of customers by segment (e.g., Consumer, Corporate, Home Office).
- Top 5 Products by Sales: The five products with the highest total sales, identified by Product Name.
- Category Performance: Analysis of sales by product category and sub-category.
- Sales by Region: Comparison of sales performance across different geographic regions.
- Market Share: Analysis of the market share within various regions.
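As a rough illustration of how several of the metrics above could be computed outside Power BI, here is a minimal plain-Python sketch. It is not the DAX used in the dashboards. The sample rows are invented for the example, the helper names are hypothetical, and "market share" is assumed here to mean each region's share of total sales.

```python
from collections import defaultdict

# Hypothetical sample rows; column names follow the dataset fields listed above.
rows = [
    {"Order ID": "A-1", "Product Name": "Desk", "Region": "West", "Sales": 200.0},
    {"Order ID": "A-1", "Product Name": "Lamp", "Region": "West", "Sales": 50.0},
    {"Order ID": "B-2", "Product Name": "Desk", "Region": "East", "Sales": 150.0},
    {"Order ID": "C-3", "Product Name": "Chair", "Region": "South", "Sales": 100.0},
]


def sum_by(rows, key):
    """Sum the Sales column grouped by the given column."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row["Sales"]
    return dict(totals)


def total_sales(rows):
    """Total Sales: sum of all sales lines."""
    return sum(row["Sales"] for row in rows)


def average_order_value(rows):
    """Average Order Value: total sales divided by the number of distinct orders."""
    per_order = sum_by(rows, "Order ID")
    return sum(per_order.values()) / len(per_order) if per_order else 0.0


def top_products(rows, n=5):
    """Top N products by total sales, as (Product Name, sales) pairs."""
    per_product = sum_by(rows, "Product Name")
    return sorted(per_product.items(), key=lambda kv: kv[1], reverse=True)[:n]


def region_share(rows):
    """Assumed definition of market share: region sales / total sales."""
    total = total_sales(rows)
    if not total:
        return {}
    return {region: sales / total for region, sales in sum_by(rows, "Region").items()}


print(total_sales(rows))
print(average_order_value(rows))
print(top_products(rows))
print(region_share(rows))
```

Note that Average Order Value groups by Order ID first, since one order can span several rows (one per product).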
- Data Preparation: Running the R code produces a prepared dataset. You can clean and reformat it further wherever you consider it necessary.
- Preprocessing can be done with tools such as Excel, R, or Python (with pandas). In this project, R was used to prepare the dataset.
- Machine learning is also used in the project for statistical analysis and/or predictive modelling.
- Import Data: Load the dataset into Power BI or another preferred data visualization tool (e.g. Tableau).
- Dashboard Creation: Build the dashboards using the provided DAX formulas and visualization guidelines, or create your own.
- Analysis and Interpretation: Utilize the dashboards to derive insights and make data-driven decisions.
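If you want to apply extra cleaning before importing the data, the following plain-Python sketch shows one possible approach. It does not reproduce the project's R script. The function names, the `%m/%d/%Y` date format, and the sample records are assumptions made for illustration, so adjust them to match your copy of the dataset.

```python
from datetime import datetime


def clean_row(raw, date_format="%m/%d/%Y"):
    """Return a cleaned copy of one record, or None if it cannot be used."""
    # Trim stray whitespace from column names and string values.
    row = {
        key.strip(): (value.strip() if isinstance(value, str) else value)
        for key, value in raw.items()
    }
    if not row.get("Order ID"):
        return None  # a record without an order identifier is unusable
    try:
        # date_format is an assumption; change it to match the actual file.
        row["Order Date"] = datetime.strptime(row["Order Date"], date_format).date()
        row["Sales"] = float(row["Sales"])
    except (KeyError, TypeError, ValueError):
        return None  # missing or malformed date / sales value
    return row


def clean_rows(raw_rows, date_format="%m/%d/%Y"):
    """Clean every record and drop exact duplicates, keeping first occurrences."""
    cleaned, seen = [], set()
    for raw in raw_rows:
        row = clean_row(raw, date_format)
        if row is None:
            continue
        fingerprint = tuple(sorted(row.items()))
        if fingerprint in seen:
            continue
        seen.add(fingerprint)
        cleaned.append(row)
    return cleaned


# Hypothetical records: one valid, one duplicate, one bad date, one missing ID.
sample = [
    {"Order ID": " X-100 ", "Order Date": "11/08/2017", "Sales": "261.96"},
    {"Order ID": "X-100", "Order Date": "11/08/2017", "Sales": "261.96"},
    {"Order ID": "X-101", "Order Date": "not a date", "Sales": "10"},
    {"Order ID": "", "Order Date": "11/09/2017", "Sales": "5"},
]
cleaned = clean_rows(sample)
print(len(cleaned))
```

Rejected records are silently dropped here to keep the sketch short. In practice you may prefer to log them so that data-quality problems stay visible.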
- Data: The raw Excel dataset and any processed versions.
- Scripts: R and SQL scripts used for data cleaning and preprocessing.
- Power BI Files: Power BI project files containing the dashboards.
- Certain regions, such as 'West' and 'East', contribute significantly to overall sales, accounting for a substantial portion of the total market share. These regions are the primary revenue drivers and represent strongholds for the business.
- Regions like 'Central' have shown promising growth but still hold a smaller share of the market. These areas represent potential growth opportunities where targeted marketing and sales strategies could be implemented to increase market penetration.
- The 'South' region has a relatively low market share. It may require further investigation to understand the challenges and develop strategies to improve performance.
- This project demonstrates the practical application of data analytics and visualization techniques to understand business performance, customer behavior, and market trends.
- The insights gained can help businesses make informed decisions and strategize for future growth.
This project is licensed under the MIT License. See the LICENSE file for details.
Special thanks to Kaggle for providing the Superstore Sales Dataset used in this project. The dataset is located in the 'Dataset' directory of this project and is also available on the Kaggle website.