This project explores the patterns of disease inheritance to understand how genetic factors contribute to the prevalence of various diseases across populations. We used Python (Jupyter Notebook) for data exploration and Power BI for visualization. The analysis focuses on:
- Identifying which diseases are more likely to be inherited.
- Examining the role of gender in disease inheritance.
- Understanding the prevalence of inherited diseases across different locations and age groups.
- Analyzing inheritance types (Maternal, Paternal, Both, None) to identify which type is more common.
- Studying family history and its relation to disease transmission over generations.

An interactive Power BI dashboard provides a comprehensive visualization of these insights.

Tools used:
- Python (Pandas, Seaborn, Matplotlib): for data generation, cleaning, exploration, and analysis.
- Power BI: for creating interactive visualizations and reports.

The data cleaning and exploratory analysis were performed in Jupyter Notebook using Python libraries like Pandas, Seaborn, and Matplotlib.
- Data Generation:
  - Created a synthetic dataset containing information on individuals, their diseases, and the inheritance patterns.
  - Variables included gender, disease name, inheritance type (Maternal, Paternal, Both, None), age group, location, and comorbidities.
- Data Cleaning:
  - Addressed missing values and ensured the accuracy of inheritance percentages based on the inheritance type.
  - Grouped diseases by type and prevalence, and validated inheritance patterns.
- Disease Inheritance Analysis:
  - Analyzed the likelihood of disease inheritance.
  - Explored which diseases are more likely to be inherited.
- Gender-Based Inheritance:
  - Compared the inheritance rates for males and females across various diseases.
  - This analysis focused on determining whether males or females are more likely to inherit diseases.
- Family History:
  - Explored inheritance patterns across generations (1-4 generations back) and their significance for disease transmission.
- Location-Based Inheritance:
  - Analyzed the inheritance patterns based on geographic location to identify which regions have a higher occurrence of inherited diseases.
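
The generation, cleaning, and analysis steps above could look roughly like the following sketch. It is not the project's actual notebook code: the column names (`inheritance_pct`, `is_inherited`, `generations_back`), the category values, the 5% missing rate, and the cleaning rule (an inheritance type of `None` implies 0% inherited) are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Illustrative category values (assumptions, not the project's real lists).
DISEASES = ["Asthma", "Cystic Fibrosis", "Hypertension", "Heart Disease"]
INHERITANCE_TYPES = ["Maternal", "Paternal", "Both", "None"]


def make_synthetic(n_rows: int = 500, seed: int = 0) -> pd.DataFrame:
    """Build a random dataset with one row per individual."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "gender": rng.choice(["Male", "Female"], size=n_rows),
        "disease": rng.choice(DISEASES, size=n_rows),
        "inheritance_type": rng.choice(INHERITANCE_TYPES, size=n_rows),
        "age_group": rng.choice(["0-18", "19-40", "41-60", "60+"], size=n_rows),
        "location": rng.choice(["India", "London"], size=n_rows),
        "generations_back": rng.integers(1, 5, size=n_rows),  # values 1 to 4
    })
    df["inheritance_pct"] = rng.uniform(0, 100, size=n_rows).round(1)
    # Blank out about 5% of the percentages to mimic missing values.
    missing = rng.choice(n_rows, size=n_rows // 20, replace=False)
    df.loc[missing, "inheritance_pct"] = np.nan
    return df


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Make inheritance percentages consistent with the inheritance type."""
    out = df.copy()
    not_inherited = out["inheritance_type"] == "None"
    # Assumed rule: no inheritance means an inheritance percentage of 0.
    out.loc[not_inherited, "inheritance_pct"] = 0.0
    # Fill remaining gaps with the median of the same inheritance type.
    medians = out.groupby("inheritance_type")["inheritance_pct"].transform("median")
    out["inheritance_pct"] = out["inheritance_pct"].fillna(medians)
    out["is_inherited"] = ~not_inherited
    return out


df = clean(make_synthetic())

# Share of inherited cases per disease, highest first.
by_disease = df.groupby("disease")["is_inherited"].mean().sort_values(ascending=False)

# Same share split by gender, one column per gender.
by_gender = df.groupby(["disease", "gender"])["is_inherited"].mean().unstack("gender")

print(by_disease)
print(by_gender)
```

Because the values are drawn uniformly at random, the rates printed here carry no meaning; the sketch only shows the shape of the workflow. The same `groupby` pattern extends to `location` or `generations_back`.
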
The Power BI report includes various visualizations to make the insights accessible:
- Disease Inheritance Analysis: A bar chart showing the likelihood of inheritance for each disease.
- Gender-Based Inheritance: A stacked bar chart that highlights the inheritance distribution by gender.
- Inheritance Type Distribution: A pie chart visualizing the proportion of inheritance types (Maternal, Paternal, Both, None).
- Location vs. Inheritance: A visualization comparing how location influences disease inheritance patterns.
- Age Group Distribution: A filter to analyze how the prevalence of diseases changes across different age groups.
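
The summary tables behind charts like these can be prepared in pandas before loading them into Power BI. The sketch below uses a tiny hand-written table; the column names and rows are hypothetical and chosen only to show the aggregation pattern, not taken from the project's dataset.

```python
import pandas as pd

# Hypothetical records, for illustration only.
records = pd.DataFrame({
    "gender": ["Female", "Male", "Female", "Male", "Female", "Male"],
    "inheritance_type": ["Maternal", "Paternal", "Both", "None", "Maternal", "Maternal"],
    "location": ["India", "London", "India", "London", "India", "London"],
})

# Pie chart input: proportion of each inheritance type.
type_share = records["inheritance_type"].value_counts(normalize=True)

# Stacked bar input: counts per inheritance type, one column per gender.
gender_by_type = pd.crosstab(records["inheritance_type"], records["gender"])

# Location comparison: each row sums to 1 (share of types within a location).
location_by_type = pd.crosstab(
    records["location"], records["inheritance_type"], normalize="index"
)

print(type_share)
print(gender_by_type)
print(location_by_type)
```

Each resulting table can be exported with `DataFrame.to_csv` and used as a data source for the corresponding visual.
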

Key findings from the synthetic dataset:
- Disease Inheritance: Diseases like asthma and cystic fibrosis show higher inheritance rates than others.
- Gender-Based Patterns: Females are slightly more likely to inherit diseases, especially in the case of maternal inheritance.
- Location Trends: Locations such as India and London show a higher prevalence of inherited diseases, particularly for conditions like hypertension and heart disease.
- Inheritance Types: Maternal inheritance dominates for diseases such as asthma and cancer, while paternal inheritance is common for conditions like cystic fibrosis.
- Family History: Diseases inherited across generations tend to have higher rates of comorbidities, such as stroke or Alzheimer’s.

This project examines the inheritance patterns of diseases in a synthetic dataset and shows trends by gender, location, and family history. Combining Python for data analysis with Power BI for interactive visualization makes these findings easy to explore.