This repository contains the code and resources for the Snowflake Data Pipeline Project. The goal of this project is to create an automated data pipeline using Snowflake and Azure Data Factory, focusing on data ingestion, transformation, and automation.
## Overview

The Snowflake Data Pipeline Project demonstrates how to build a robust data pipeline with Snowflake as the data warehouse and Azure Data Factory (ADF) for data ingestion and orchestration. The pipeline loads raw sales data into Snowflake, transforms it, and automates the transformation with Snowflake tasks. It is aimed at data engineers who want to understand how Snowflake integrates with Azure services.
## Features

- Data Ingestion: Load raw sales data from a CSV file into Snowflake.
- Data Transformation: Transform the raw data to calculate key metrics (e.g., total sales, sales year).
- Automation: Automate the data transformation process using Snowflake tasks.
- Integration with Azure Data Factory: Use Azure Data Factory for seamless data ingestion and processing.
- Monitoring and Error Handling: Monitor task execution and handle any errors that arise.
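The transformation, automation, and monitoring steps above can be sketched in Snowflake SQL. All identifiers here (`SALES_RAW`, `SALES_METRICS`, `COMPUTE_WH`, `transform_sales_task`) and the raw-table columns are illustrative assumptions, not the project's actual names:

```sql
-- Target table for the derived metrics (assumed schema).
CREATE TABLE IF NOT EXISTS SALES_METRICS (
    sales_year  INTEGER,
    total_sales NUMBER(18, 2)
);

-- Task that recomputes the metrics on a schedule.
-- Assumes a raw table SALES_RAW(order_date, quantity, unit_price).
CREATE OR REPLACE TASK transform_sales_task
    WAREHOUSE = COMPUTE_WH
    SCHEDULE  = '60 MINUTE'
AS
    INSERT OVERWRITE INTO SALES_METRICS
    SELECT YEAR(order_date)           AS sales_year,
           SUM(quantity * unit_price) AS total_sales
    FROM SALES_RAW
    GROUP BY YEAR(order_date);

-- Tasks are created suspended; resume to start the schedule.
ALTER TASK transform_sales_task RESUME;

-- Monitoring: inspect recent runs of the task.
SELECT name, state, scheduled_time, error_message
FROM TABLE(INFORMATION_SCHEMA.TASK_HISTORY(
    TASK_NAME => 'TRANSFORM_SALES_TASK'))
ORDER BY scheduled_time DESC;
```

`INSERT OVERWRITE` truncates and reloads the metrics table on each run, which keeps the sketch idempotent; an incremental merge would be the alternative for large raw tables.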
## Technologies

- Snowflake: Cloud-based data warehousing platform.
- Azure Data Factory: Data integration and orchestration service.
- SQL: Data manipulation and transformation within Snowflake.
- GitHub: Version control and project hosting.
## Prerequisites

- A Snowflake account (sign up for free).
- An Azure account (sign up for free).
- Intermediate knowledge of SQL and data warehousing concepts.
## Project Structure

```
snowflake-data-pipeline/
│
├── data/
│   └── sales_data.csv           # Sample data file
│
├── scripts/
│   ├── snowflake_setup.sql      # SQL script to set up Snowflake database and schema
│   ├── snowflake_task.sql       # SQL script to create and manage Snowflake tasks
│   └── data_transformation.sql  # SQL script to transform raw data
│
└── README.md                    # Project README file
```
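A minimal sketch of what `scripts/snowflake_setup.sql` might contain to support the ingestion step; the database, schema, stage, file-format, and column names (`SALES_DB`, `RAW`, `SALES_STAGE`, `CSV_FORMAT`, `SALES_RAW`) are illustrative assumptions:

```sql
-- Database and schema for the pipeline (assumed names).
CREATE DATABASE IF NOT EXISTS SALES_DB;
CREATE SCHEMA IF NOT EXISTS SALES_DB.RAW;

-- Raw landing table for the sample CSV (assumed columns).
CREATE TABLE IF NOT EXISTS SALES_DB.RAW.SALES_RAW (
    order_id   INTEGER,
    order_date DATE,
    product    STRING,
    quantity   INTEGER,
    unit_price NUMBER(10, 2)
);

-- File format and internal stage for loading data/sales_data.csv.
CREATE FILE FORMAT IF NOT EXISTS SALES_DB.RAW.CSV_FORMAT
    TYPE = 'CSV'
    SKIP_HEADER = 1
    FIELD_OPTIONALLY_ENCLOSED_BY = '"';

CREATE STAGE IF NOT EXISTS SALES_DB.RAW.SALES_STAGE
    FILE_FORMAT = SALES_DB.RAW.CSV_FORMAT;

-- After the file is uploaded to the stage (e.g. via PUT or ADF),
-- load it into the raw table.
COPY INTO SALES_DB.RAW.SALES_RAW
FROM @SALES_DB.RAW.SALES_STAGE/sales_data.csv;
```

In the ADF-driven setup described above, Azure Data Factory would typically land the file and trigger this load rather than a manual `PUT`.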
Snowflake project folder: https://app.snowflake.com/stfofgt/do33111/#/sales-folder-fDw6k1nzG