This repository contains the results of a Data Engineering project aimed at modernizing the operational and analytical systems of the City Zoo. The project was conducted as part of a professional assignment to design and implement IT-supported processes for a long-established organization undergoing a comprehensive modernization effort.
The project addresses key tasks required for implementing integrated data systems in the zoo's operations, focusing on database design, implementation, and data quality assurance. The deliverables are structured into the following objectives:
- Operational Database Design (ERM)
- Relational Database Implementation and SQL Export
  - Implement the operational MS Access database based on the ERM.
  - Provide comprehensive documentation, including a detailed data dictionary.
  - Develop a Python-based algorithm that exports the MS Access data to SQL format.
- Data Warehouse Design and Architecture
- Data Quality Concept
  - Develop a strategy to maintain data quality above 97%.
  - Ensure sustainable processes for data maintenance and error handling.
- Project Presentation
  - Summarize project results in a structured presentation for stakeholders.
  - Highlight methodologies, decisions, and best practices.
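The export objective above can be illustrated with a minimal sketch of the SQL-generation step. It is not the project's actual export script: it assumes the rows have already been read from the Access database (for example through an ODBC connection), and the helper names `sql_literal` and `rows_to_inserts`, as well as the `animal` table and its columns, are hypothetical.

```python
from datetime import date


def sql_literal(value):
    """Render a Python value as a SQL literal (simplified, illustrative)."""
    if value is None:
        return "NULL"
    # bool must be checked before int, because bool is a subclass of int
    if isinstance(value, bool):
        return "1" if value else "0"
    if isinstance(value, (int, float)):
        return str(value)
    # covers datetime as well, since datetime is a subclass of date
    if isinstance(value, date):
        return "'" + value.isoformat() + "'"
    # escape single quotes by doubling them
    return "'" + str(value).replace("'", "''") + "'"


def rows_to_inserts(table, columns, rows):
    """Yield one INSERT statement per row."""
    col_list = ", ".join(columns)
    for row in rows:
        values = ", ".join(sql_literal(v) for v in row)
        yield f"INSERT INTO {table} ({col_list}) VALUES ({values});"


# Hypothetical sample data; the real table and column names may differ.
rows = [(1, "O'Malley", None), (2, "Nala", "Panthera leo")]
for stmt in rows_to_inserts("animal", ["animal_id", "name", "species"], rows):
    print(stmt)
```

Generating plain `INSERT` statements keeps the export readable and easy to replay in another database system, although the target system may need its own adjustments for data types and identifier quoting.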
The following artifacts are included in this repository:
- Operational Database Design
- Data Warehouse Components
- Data Quality Concept
- Presentation
The project utilized the following tools and methodologies:
- Database Management: Microsoft Access for database prototyping.
- Modeling: ERM and Data Warehouse schema design with tools supporting PNG and PDF exports.
- Data Quality: A structured approach to ensure accuracy and consistency in the dataset.
- Collaboration: Stakeholder interviews and iterative feedback loops to refine deliverables.
The project paves the way for further enhancements in the zoo's IT infrastructure:
- Deployment of the operational database and Data Warehouse in a production environment.
- Continuous data quality monitoring and improvement.
- Expansion of analytical capabilities with advanced BI tools.
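As an illustration of how the 97% data quality target could be monitored continuously, the following sketch measures the share of records that pass a set of validation rules. The metric definition, the rule set, and the field names (`name`, `weight_kg`) are assumptions made for this example and are not taken from the project's Data Quality Concept.

```python
QUALITY_THRESHOLD = 0.97  # target stated in the project objectives


def quality_score(records, rules):
    """Return the share of records that pass every rule (0.0 to 1.0)."""
    if not records:
        # An empty dataset contains no invalid records.
        return 1.0
    valid = sum(1 for record in records if all(rule(record) for rule in rules))
    return valid / len(records)


# Hypothetical validation rules for animal records.
rules = [
    lambda r: bool(r.get("name")),                               # name is required
    lambda r: r.get("weight_kg") is None or r["weight_kg"] > 0,  # weight, if given, is positive
]

# Hypothetical sample records.
records = [
    {"name": "Nala", "weight_kg": 126.0},
    {"name": "", "weight_kg": 4.2},       # fails: missing name
    {"name": "Kiko", "weight_kg": None},
    {"name": "Bubu", "weight_kg": -1.0},  # fails: negative weight
]

score = quality_score(records, rules)
if score < QUALITY_THRESHOLD:
    print(f"Data quality {score:.1%} is below the {QUALITY_THRESHOLD:.0%} target")
```

Running such a check on a schedule and logging the score over time would be one simple way to detect when data quality drifts below the target.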
This repository is made available for educational purposes and is shared under CC BY-NC-ND 4.0. For details, see the LICENSE file.