seuwenfei/README.md

Hi there, I am Wen Fei 👋


👩‍💻 About Me:

  • 👩‍🎓 I'm a Statistics graduate from Malaysia.
  • 🔭 I'm seeking a job opportunity in data science, analytics, or a related field.
  • 🌱 I’m currently exploring Data Science.
  • ⚡ In my spare time, I work on data mining and machine learning projects.
  • 📫 How to reach me: LinkedIn

🛠️ Languages and Tools:

Python  MySQL  Pandas  Sklearn  NumPy  Math  Sea  SPSS  Kaggle  Jupyter  Tableau  Excel  Word


📑 Below are my projects in Python, Power BI, Tableau, SQL, SAS Studio, Google Data Studio, and Java:

  Note: The dates indicate the month and year when each project was completed.

📌 Python -

  • Identification of Disaster Related Tweets using NLP based Text Classification   |   May 2023   |   Show project
    • Developed an NLP-based text classification model in Python to predict whether a given tweet is disaster-related.
    • Utilized libraries such as Pandas, NumPy, Seaborn, Matplotlib, SciPy, Plotly, NLTK, re, collections, wordcloud, TensorFlow, scikit-learn.
    • Techniques employed: EDA, Text Preprocessing, Classification Model Comparison (Linear SVC, Multinomial NB, Neural Network).
    • Achieved an AUC score of 0.86 using a Linear SVC model, indicating good separability between disaster and non-disaster tweets.
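For readers unfamiliar with the metric, AUC can be read as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The following is a minimal, standard-library sketch of that definition, not the project's code (which lists scikit-learn); the function name `roc_auc` and the sample labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the chance that a random positive outranks a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Compare every positive/negative pair; a tie counts as half a win.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = disaster tweet) and classifier scores.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.1, 0.7]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

The pairwise loop is quadratic in the number of examples, which is fine for illustration; library implementations typically use a sort-based approach instead.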

  • Churn Customer Prediction using Machine Learning   |   Apr 2023   |   Show project
    • Visualized the IBM Community's Telco Churn Dataset to quickly gain insights using Python in Jupyter Notebook.
    • Developed a churn prediction model using machine learning algorithms in Python to identify whether a customer has churned.
    • Utilized libraries such as Pandas, NumPy, Seaborn, Matplotlib, Plotly, H3, Folium, TensorFlow, imblearn, scikit-learn, XGBoost.
    • Techniques employed: EDA, Data Visualization, Classification Model Comparison (Random Forest, Logistic Regression, AdaBoost, XGBoost).
    • Obtained an AUC score of 0.86 using an XGBoost classifier.

  • Web Scraping Booking.com   |   Apr 2023   |   Show project
    • Scraped hotel data for Kuala Lumpur, Malaysia from Booking.com using the Beautiful Soup library in Python.
    • Extracted information such as hotel names, locations, room types, scores, ratings, number of reviews, distance from the center, and prices.
    • Utilized libraries such as Pandas, Requests, Beautiful Soup (bs4), RegEx (re).
    • Techniques employed: Data Extraction.

  • Kaggle Titanic - Machine Learning from Disaster Competition   |   Mar 2023   |   Show project
    • Developed a machine learning model in Python to predict survival on the Titanic.
    • Utilized libraries such as Pandas, NumPy, Seaborn, Matplotlib, scikit-learn, TensorFlow.
    • Techniques employed: EDA, Feature Engineering, Data Visualization, Classification Model Comparison (Random Forest, Logistic Regression, Complement Naive Bayes).
    • Achieved a stratified k-fold cross-validation score of 0.85 using a Random Forest model.
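Stratified k-fold cross-validation splits the data so that each fold keeps roughly the same class proportions as the full dataset. The sketch below illustrates the idea with the standard library only; it is an assumption-laden illustration rather than the project's code, and the helper name `stratified_k_fold` and the toy labels are hypothetical.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k, seed=0):
    """Yield (train_idx, test_idx) pairs whose class mix mirrors the full data."""
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)

    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class out round-robin so every fold gets a proportional share.
        for j, idx in enumerate(indices):
            folds[j % k].append(idx)

    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, test


# Hypothetical labels: 4 survivors (1) and 6 non-survivors (0).
labels = [1] * 4 + [0] * 6
splits = list(stratified_k_fold(labels, k=2))
for train, test in splits:
    print(len(train), len(test), sum(labels[i] for i in test))
```

A cross-validation score is then the average of a model's score over the test folds.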

  • Feature Engineering - Convert UTC time to Local time   |   Mar 2023   |   Show project
    • Converted UTC time to Malaysia Standard Time.
    • Utilized libraries such as Pandas, DateTime, Dateutil, pytz.
    • Techniques employed: Feature Engineering.
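As a rough illustration of the conversion, here is a standard-library sketch. The project itself lists Pandas, DateTime, Dateutil, and pytz; this version instead assumes a fixed UTC+8 offset (Malaysia currently observes no daylight saving time) and a hypothetical `YYYY-MM-DD HH:MM:SS` input format. The function name `utc_to_myt` is also hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Malaysia Standard Time is a fixed UTC+8 offset with no DST,
# which holds for current dates but not for all historical ones.
MYT = timezone(timedelta(hours=8), name="MYT")


def utc_to_myt(timestamp: str) -> datetime:
    """Parse a 'YYYY-MM-DD HH:MM:SS' UTC string and return it in MYT."""
    naive = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
    # Mark the parsed value as UTC, then shift it to the target offset.
    return naive.replace(tzinfo=timezone.utc).astimezone(MYT)


local = utc_to_myt("2023-03-15 18:30:00")
print(local.isoformat())  # 2023-03-16T02:30:00+08:00
```

For historical timestamps or zones with daylight saving time, a named zone from a time zone database is the safer choice over a fixed offset.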

  • Data Visualization for Worldwide Movie Series   |   Jan 2023   |   Show project
    • Presented graphical visualizations using Python to highlight patterns and trends in movie series data.
    • Utilized libraries such as Pandas, NumPy, Seaborn, Matplotlib, wordcloud.
    • Techniques employed: EDA, Feature Engineering, Data Visualization.

  • Online Payment Fraud Detection using Machine Learning   |   Dec 2022   |   Show project
    • Trained machine learning models in Python to identify fraudulent and non-fraudulent payments.
    • Utilized libraries such as Pandas, NumPy, Seaborn, Matplotlib, Tabulate, scikit-learn.
    • Techniques employed: EDA, Data Visualization, Classification Model Comparison.
    • Obtained an F1 score of 0.79 using a Random Forest model.
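The F1 score is the harmonic mean of precision and recall, which makes it more informative than accuracy when one class (here, fraudulent payments) is rare. A minimal standard-library sketch of the calculation follows; it is not the project's code, and the function name `f1_score` and the toy predictions are hypothetical.

```python
def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels: 1 = fraudulent payment, 0 = legitimate payment.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
f1 = f1_score(y_true, y_pred)
print(round(f1, 3))
```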


📌 Power BI -

  • Cookies Sales Dashboard   |   May 2023   |   Show project
    • Prepared data for the cookies sales dashboard using Query Editor in Power BI.
    • Created a cookies sales dashboard using Power BI, showcasing cookies sales, cost, profit, lead time, flavour and customer.


📌 Tableau -

  • Flight Ticket Sales Analysis Dashboard   |   Jan 2023   |   Show project
    • Queried an airline flight ticket dataset from the Airlines Database using PostgreSQL (SQL).
    • Created a flight ticket sales dashboard using Tableau, showcasing ticket sales, fare conditions, booking period, aircraft, departure and arrival airports.

  • KPMG Data Analytics Consulting Virtual Internship   |   Nov 2022   |   Show project
    • Participated in the KPMG Virtual Experience Program on Forage to gain insight into working at KPMG and develop career skills.
    • Completed tasks including data quality assessment, data insights analysis using Python (Jupyter Notebook), and dashboard presentation using Tableau.


📌 SAS Studio -

  • Non-parametric Test for Patient Health Status   |   Mar 2022   |   Show project
    • Analyzed patient health status using the Shapiro-Wilk, Wilcoxon Rank Sum, Ansari-Bradley, Kolmogorov-Smirnov, Kruskal-Wallis, and Spearman's Correlation tests, because the data did not meet the distributional assumptions of parametric tests.


📌 Google Data Studio -

  • Ecommerce Dashboard  |   Jul 2021   |   Show project
    • Visualized ecommerce data in a dashboard using Google Data Studio, displaying sessions, transactions, revenue, product checkout, average order value, conversion rate, and more.

     


📌 Java (NetBeans) -

  • Java Application - Simple Student Information System   |   Nov 2019   |   Show project
    • Wrote a Java application that implements a simple Student Information System.

Pinned

  1. Portfolio-Project (Public)

    This repository contains my data analytics portfolio projects including Python, Tableau, SQL, Google Data Studio and Java