RGivisiez/Machine-Learning-Portfolio

Machine Learning Portfolio: A brief overview

Get in touch:

codeSTACKr | Instagram | Twitter | LinkedIn

Every day, pharmacy students at the Federal University of Minas Gerais (UFMG) use chicken eggs to study the effects of drugs on blood vessels. Segmenting the blood vessels in these eggs therefore needs to be automated, fast, and accurate, because better segmentation yields more reliable and reproducible results. This project uses machine learning algorithms to automate the segmentation while keeping it fast and improving its accuracy.

[Image: Chicken Egg Blood Vessel Segmentation]


Nubank, a Brazilian fintech, organized a competition to scout new talent. One objective of the competition was to build a model that predicts which customers will fail to meet their financial obligations, an outcome known as default. The predictions are made from a dataset of customer information. A major issue with this type of dataset is class imbalance: many customers pay their debts and very few do not, which makes prediction difficult.

What you will see in this notebook:

  1. Data cleaning and creation of new features.
  2. Use of Pipeline to simplify data manipulation across various models and minimize the risk of data leakage.
  3. How to handle an imbalanced dataset.
  4. Metric selection for imbalanced datasets.
  5. Conversion of features to ordered categories and to one-hot encoding (OHE).
  6. Reducing the skewness of features using a log transform or QuantileTransformer.
  7. Use of grid search to find the best parameters for the models.
  8. Models used: Decision Tree, Random Tree, Neural Network, Logistic Regression, Bagging, Random Patches, AdaBoost, and Voting Classifier.
  9. Discussion of the results obtained.
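
Several of the steps listed above (Pipeline, skewness reduction, imbalance handling, metric selection, and grid search) can be combined in a few lines. The following is only a minimal sketch with scikit-learn, not the notebook's actual code: the data is synthetic (generated with `make_classification`), and the choice of `class_weight="balanced"`, the `average_precision` metric, and the parameter grid are assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import QuantileTransformer

# Synthetic stand-in for the customer data (assumption): roughly 5% defaults.
X, y = make_classification(
    n_samples=2000, n_features=10, weights=[0.95, 0.05], random_state=0
)
# Stratify so the rare class keeps the same share in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, test_size=0.25, random_state=0
)

# The transformer lives inside the pipeline, so it is re-fitted on the
# training part of every cross-validation fold, which avoids data leakage.
pipeline = Pipeline([
    ("skew", QuantileTransformer(
        n_quantiles=100, output_distribution="normal", random_state=0)),
    ("clf", LogisticRegression(class_weight="balanced", max_iter=1000)),
])

# Average precision is used instead of accuracy, which is misleading
# when one class dominates the dataset.
search = GridSearchCV(
    pipeline,
    param_grid={"clf__C": [0.01, 0.1, 1.0, 10.0]},
    scoring="average_precision",
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
)
search.fit(X_train, y_train)

test_scores = search.predict_proba(X_test)[:, 1]
test_ap = average_precision_score(y_test, test_scores)
print("best parameters:", search.best_params_)
print("test average precision:", round(test_ap, 3))
```

Because the grid search wraps the whole pipeline, swapping in another model from the list only requires changing the `"clf"` step and its parameter grid.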

As seen in the results of the E-Commerce Seller notebook, low seller ratings are correlated with late deliveries. To improve seller ratings, and to leave consumers more satisfied, we will build an algorithm that indicates the likelihood of a delivery delay.


Companies that operate marketplaces bring together many kinds of sellers on their platforms, but not all of those sellers manage to keep their customers satisfied. It is therefore important to identify the characteristics of sellers who are highly rated by users and to encourage other sellers to behave the same way. With the amount of data available about sellers, these characteristics can be identified automatically, and changes can be proposed for both the sellers and the company.

[Image: Sellers_clusters]
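
One way to group sellers automatically is clustering. The sketch below is a hedged illustration only: the text above does not say which algorithm the notebook uses, so k-means is an assumption, and the two seller features (mean review score and mean delivery delay in days) and their values are hypothetical, randomly generated data.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical seller features: [mean review score, mean delivery delay (days)].
well_rated = rng.normal(loc=[4.5, 0.5], scale=[0.2, 0.5], size=(50, 2))
poorly_rated = rng.normal(loc=[2.5, 6.0], scale=[0.4, 1.5], size=(50, 2))
sellers = np.vstack([well_rated, poorly_rated])

# Scale the features so neither one dominates the distance computation.
scaled = StandardScaler().fit_transform(sellers)

# k-means with two clusters (the number of clusters is an assumption).
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(scaled)

# Describe each cluster by the mean of its original, unscaled features.
for label in range(2):
    group = sellers[kmeans.labels_ == label]
    print(f"cluster {label}: {len(group)} sellers, "
          f"mean features = {group.mean(axis=0).round(2)}")
```

Comparing the per-cluster feature means is what turns the clusters into actionable characteristics, for example which delivery behavior goes together with high ratings.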


Recommendation systems play an important role in filtering information before a user consumes it, whether by recommending movies to watch or items a consumer might want. In this project, we build two recommendation systems, one using Nearest Neighbor and one using Matrix Factorization. Moreover, using the t-SNE algorithm, we show that a naive implementation of these algorithms is biased: the models learn to distinguish blockbuster movies from other movies, because blockbusters are rated more often and receive higher rating scores. This can make an algorithm prone to recommending blockbuster movies more than any others.

[Image: t-SNE]
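
To make the Nearest Neighbor idea concrete, here is a minimal item-based sketch using NumPy. It is not the project's implementation: the rating matrix is a hypothetical toy example, and the function names `item_cosine_similarity` and `recommend` are introduced here for illustration. Each unrated movie is scored by summing its cosine similarity to the movies the user already rated, weighted by those ratings.

```python
import numpy as np

# Hypothetical toy ratings (rows = users, columns = movies); 0 means "not rated".
ratings = np.array([
    [5, 4, 0, 0],
    [4, 5, 1, 0],
    [5, 0, 0, 2],
    [0, 4, 2, 5],
], dtype=float)


def item_cosine_similarity(r):
    """Cosine similarity between the rating columns (movies) of r."""
    norms = np.linalg.norm(r, axis=0)
    norms[norms == 0] = 1.0  # avoid division by zero for unrated movies
    normalized = r / norms
    return normalized.T @ normalized


def recommend(r, user, k=2):
    """Return up to k movie indices the user has not rated, best first."""
    sim = item_cosine_similarity(r)
    np.fill_diagonal(sim, 0.0)  # a movie should not vote for itself
    # Score of movie i = sum over rated movies j of sim[i, j] * rating[j].
    scores = sim @ r[user]
    scores[r[user] > 0] = -np.inf  # exclude movies already rated
    ranked = np.argsort(-scores)
    return [int(i) for i in ranked[:k] if np.isfinite(scores[i])]


# How many users rated each movie; frequently rated movies have more
# non-zero overlaps with other movies, so they tend to collect larger scores.
rating_counts = (ratings > 0).sum(axis=0)

print("ratings per movie:", rating_counts)
print("recommendations for user 0:", recommend(ratings, user=0))
```

The unnormalized sum in `recommend` is exactly the kind of naive scoring that can favor frequently rated titles, which is the bias described above.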


A GAN is a machine learning model that can learn to replicate the properties of a dataset. For example, a GAN trained on a dataset of breast cancer images learns to generate new images similar to those in the dataset. A GAN can therefore be used to generate anonymized medical images: the generated images do not belong to any real person, but they still show the characteristics of the disease. Because GANs require a lot of computational resources and time, this project uses two simple models to generate images of numbers.

[Image: model2]


GitHub Template Photo by Hal Gatewood on Unsplash
