betyvelavi/data-portfolio

🚀 Welcome to My Data Science Journey!

Hello there! I'm Beatriz, a passionate data scientist fresh out of the academic realm with a shiny new master's degree in Applied Statistics and Data Science. 🎓 Armed with a blend of R and Python, I'm on a mission to extract meaningful insights from data and tell compelling stories through analytics.

About Me:

  • Master's in Applied Statistics and Data Science
  • Aspiring Data Scientist & Storyteller
  • Skilled in R, Python, and on the SQL learning curve
  • Passionate about astronomy

Certifications in Progress:

  • Google Data Analytics Professional Certificate
  • IBM Data Engineering Professional Certificate

What to Expect:

Delve into my repository to discover a varied array of data projects from my academic journey, including those completed in R and Python during my master's degree. I am currently reformatting and refining these projects so that they align with current techniques and best practices in the ever-evolving field of data science. Additionally, join me on my ongoing exploration of SQL as I weave this new skill into the fabric of my data science narrative.

Highlighted Projects:

R

  • Tinnitus Case Study
    • Description: This project addresses the psychological impact of tinnitus through the analysis of a dataset containing information from 142 subjects.
    • Tools employed: Various R libraries for data cleaning, descriptive statistics, and correlation analysis. The project fits linear regression models and, where the assumptions of traditional linear regression were not met, explores alternative models such as Generalized Additive Models (GAM) and K-Mean Regression.
    • Findings: The comparison revealed that K-Mean Regression outperformed the other models. The project concludes with potential future enhancements, including addressing the normality test failures and further exploring GAM models.
  • Marketing for Insurance Industry Case Study
    • Description: This project leverages an insurance industry dataset, aiming to optimize future marketing campaigns by refining target audience selection.
    • Tools employed: Information Value analysis, variable clustering, data cleaning, and modeling with Random Forests and Support Vector Machines.
    • Findings: The Random Forest achieved the highest Accuracy, while the Support Vector Machines, especially the Gaussian Radial Kernel SVM, achieved the highest AUC values, suggesting their efficacy in target audience selection for the insurance company's marketing campaign. The conclusion weighs the relative utility of the models and suggests avenues for improvement and further analysis.
  • Spotify Classification Case Study
    • Description: This project focuses on evaluating classification models for categorizing songs into musical genres using a Spotify dataset.
    • Tools employed: Various models, including k-Nearest Neighbors, Decision Trees, Random Forests, Support Vector Machines, and Linear and Quadratic Discriminant Analysis, were trained and compared on Accuracy, Sensitivity, and Specificity.
    • Findings: The Support Vector Machine (SVM) emerged as the best-performing model in terms of Accuracy, with future research directions discussed.

Python

  • Simplified DES
    • This project implements a Simplified Data Encryption Standard (DES) algorithm in Python. The objective is to create a streamlined version of the DES encryption process, incorporating key components such as substitution, permutation, and key expansion. The goal is not only to understand the fundamental principles of DES but also to gain practical experience in coding and executing this widely used encryption standard.
  • Feistel Cipher
    • This project centers on the implementation and exploration of a Feistel Structure, a fundamental component in the design of cryptographic algorithms. The goal is to comprehend the principles behind data transformation within this structure, investigate its cryptographic properties, and gain practical insights into the design and implementation of secure and efficient cryptographic algorithms.
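
To give a feel for the Feistel idea mentioned above, here is a minimal sketch. It is not the code from this repository: the 16-bit block size, the toy round function, and the example round keys are all assumptions chosen purely for illustration, and the result is not a secure cipher. It shows the key property of the structure, namely that decryption is the same network run with the round keys in reverse order, even though the round function itself is not invertible.

```python
from typing import Sequence


def round_function(half: int, key: int) -> int:
    # Hypothetical toy round function (an assumption, not a real cipher's F).
    # It does not need to be invertible for the Feistel structure to work.
    return ((half * 31) ^ key) & 0xFF


def feistel_encrypt(block: int, keys: Sequence[int]) -> int:
    # Split the 16-bit block into two 8-bit halves.
    left, right = (block >> 8) & 0xFF, block & 0xFF
    for key in keys:
        # One Feistel round: the new left half is the old right half,
        # and the new right half is the old left XOR F(old right, key).
        left, right = right, left ^ round_function(right, key)
    # Swap the halves at the end so that decryption can reuse this function.
    return (right << 8) | left


def feistel_decrypt(block: int, keys: Sequence[int]) -> int:
    # Same network, round keys applied in reverse order.
    return feistel_encrypt(block, keys[::-1])


if __name__ == "__main__":
    example_keys = [0x1F, 0xA3, 0x5C, 0x77]  # made-up round keys
    ciphertext = feistel_encrypt(0x1234, example_keys)
    assert feistel_decrypt(ciphertext, example_keys) == 0x1234
```

Using one function for both directions is a deliberate choice here: it mirrors the reason Feistel structures are popular, since only the key schedule changes between encryption and decryption.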

SQL Learning Journey

  • Access my SQL learning journey here!

Current Focus:

🔍 Currently honing my SQL skills to enrich my toolkit for effective data management and retrieval.

🔍 Reworking through "An Introduction to Statistical Learning with Applications in Python". You can access my notes, chapter problem solutions, and projects here.

Feel free to explore, provide feedback, or connect! Let's collaborate and make data-driven discoveries together. 🚀

Connect with me on LinkedIn.