A full pipeline for fraud detection, from database implementation and feature engineering to deployment of the model. A take-home task given for a "Machine Learning Engineer" position.

halilbilgin/MachineLearningEngineerInterviewChallenge

Challenge

Explore the data for identified fraudsters and other users. What are your preliminary observations?

Databases

Write an ETL script in Python to load the data into the PostgreSQL database. The associated DDL should be executed through Python and not directly in SQL. You can find the desired schema in schema.yaml and some sample code for the ETL.
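One way to run the DDL from Python rather than from a SQL client is sketched below. It assumes the schema has already been parsed into a plain dictionary of the form `{table: {column: sql_type}}`; the actual layout of schema.yaml may differ, and the table and column names here are hypothetical. `run_ddl` only relies on the standard DB-API 2.0 interface (`cursor()`, `execute()`, `commit()`), which a PostgreSQL driver such as psycopg2 provides.

```python
from typing import Dict, List

# Hypothetical schema layout: {table_name: {column_name: sql_type}}.
# The real schema.yaml may be organised differently.
EXAMPLE_SCHEMA: Dict[str, Dict[str, str]] = {
    "users": {"id": "UUID PRIMARY KEY", "is_fraudster": "BOOLEAN NOT NULL"},
}


def build_ddl(schema: Dict[str, Dict[str, str]]) -> List[str]:
    """Turn the schema dictionary into CREATE TABLE statements."""
    statements = []
    for table, columns in schema.items():
        cols = ", ".join(f"{name} {sql_type}" for name, sql_type in columns.items())
        statements.append(f"CREATE TABLE IF NOT EXISTS {table} ({cols});")
    return statements


def run_ddl(conn, statements: List[str]) -> None:
    """Execute each statement on a DB-API 2.0 connection and commit."""
    cur = conn.cursor()
    try:
        for stmt in statements:
            cur.execute(stmt)
        conn.commit()
    finally:
        cur.close()
```

Keeping statement generation separate from execution means the generated SQL can be inspected or logged before it touches the database.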

Feature Engineering

Utilizing your findings from part a) and some creativity, create some features. Explain your reasoning behind the features. Make a features.py script which when executed will create these features and store them in the DB.
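As an illustration of the kind of per-user aggregation a features script might perform, the sketch below collapses raw transaction rows into one feature row per user. The input format (a list of dictionaries with `user_id` and `amount` keys) and the feature names are assumptions made for this example, not the features used in code/features.py.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List


def build_user_features(transactions: Iterable[dict]) -> Dict[str, dict]:
    """Aggregate transaction rows into one feature dictionary per user."""
    amounts: Dict[str, List[float]] = defaultdict(list)
    for row in transactions:
        amounts[row["user_id"]].append(row["amount"])
    return {
        user_id: {
            "txn_count": len(vals),     # how active the user is
            "mean_amount": mean(vals),  # typical transaction size
            "max_amount": max(vals),    # largest single transaction
        }
        for user_id, vals in amounts.items()
    }
```

The resulting dictionary can then be written back to a features table with ordinary INSERT statements.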

Model Selection/Validation

Create an ML model which identifies fraudsters. Assess the quality of your model and explain. Make a train.py file which generates the fitted model artifact (it should be stored under the artifacts sub-directory).
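When assessing a fraud model, accuracy alone can be misleading if fraudsters are a small minority of users, so precision and recall on the fraud class are common complements. The helper below is a minimal sketch of how those two numbers are computed from binary labels (1 = fraudster); it is an illustration, not the evaluation used in code/train.py.

```python
from typing import Sequence, Tuple


def precision_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (precision, recall) for the positive class, labelled 1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # fraudsters caught
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # honest users flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # fraudsters missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall
```

Low precision means many legitimate users would be flagged; low recall means many fraudsters slip through. Which one matters more depends on the action taken on a positive prediction.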

Operationalization

How will you utilize this model to catch fraudsters? If a fraudster is identified, what should be the resulting action: LOCK_USER, ALERT_AGENT, or BOTH? Explain. Make a patrol.py file and write a simple function which implements your logic from above. The function should accept a user_id and return the suggested action(s) (e.g. patrol(user_id) = ['LOCK_USER', 'ALERT_AGENT']).
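One possible decision rule is sketched below: alert an agent on a moderate fraud score, and additionally lock the account when the score is very high. The two thresholds and the extra `score_fn` argument (a callable that returns a fraud probability for a user) are assumptions introduced for this example; the task itself only specifies `patrol(user_id)`, and the logic in code/patrol.py may differ.

```python
from typing import Callable, List

# Hypothetical thresholds; in practice they would be tuned on validation data.
LOCK_THRESHOLD = 0.9
ALERT_THRESHOLD = 0.5


def patrol(user_id: str, score_fn: Callable[[str], float]) -> List[str]:
    """Map a user's fraud score to a list of suggested actions."""
    score = score_fn(user_id)
    if score >= LOCK_THRESHOLD:
        # Very likely fraud: stop further damage and request a human review.
        return ["LOCK_USER", "ALERT_AGENT"]
    if score >= ALERT_THRESHOLD:
        # Uncertain case: avoid locking a legitimate user, but flag for review.
        return ["ALERT_AGENT"]
    return []
```

For example, with `scores = {"a": 0.95, "b": 0.6}`, calling `patrol("a", scores.get)` suggests both actions, while `patrol("b", scores.get)` suggests only an agent alert.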

My solution

Database: code/etl.py

Feature engineering: code/features.py

Model selection and validation: code/train.py

Operationalization: code/patrol.py

The Jupyter notebook pipeline.ipynb presents the complete solution. Run it to go through the whole pipeline.
