This project defines a function that trains a model on train.csv to predict blight ticket compliance in Detroit. The function returns a series whose values are the predicted probabilities that each ticket in test.csv will be paid, and whose index is the ticket_id.
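As a rough illustration of that shape (fit on labeled tickets, return a probability series indexed by ticket_id), here is a minimal sketch. It is not the notebook's actual code: the function name `blight_model_sketch`, the `feature_cols` argument, the `compliance` label column name, and the choice of `RandomForestClassifier` are all assumptions made for the example.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier


def blight_model_sketch(train_df, test_df, feature_cols, target_col="compliance"):
    """Hypothetical sketch: fit a classifier, return P(paid) per test ticket."""
    # Assumption: training rows without a known outcome have NaN in target_col.
    labeled = train_df.dropna(subset=[target_col])

    # Assumption: feature_cols are numeric; missing values are filled with 0.
    X_train = labeled[feature_cols].fillna(0)
    y_train = labeled[target_col].astype(int)
    X_test = test_df[feature_cols].fillna(0)

    # Illustrative model choice, not necessarily the one used in the notebook.
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(X_train, y_train)

    # Pick the predict_proba column that corresponds to class 1 (ticket paid).
    paid_col = list(clf.classes_).index(1)
    proba = clf.predict_proba(X_test)[:, paid_col]

    # Values are probabilities; the index is the ticket_id of each test row.
    return pd.Series(proba, index=test_df["ticket_id"], name=target_col)
```

With the real data, `train_df` and `test_df` would come from reading train.csv and test.csv with pandas, and `feature_cols` would be whichever numeric columns you choose to train on.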
To run this code, download all files* in this directory into a directory on your local machine, then run Assignment4_complete.ipynb.
- You need two data sets for this project, test.csv and train.csv. Because they are large, I have provided two external links to download them. You can also find them at https://data.detroitmi.gov/
train data: https://s3.ca-central-1.amazonaws.com/traintestdatablightviolation/train.csv
test data: https://s3.ca-central-1.amazonaws.com/traintestdatablightviolation/test.csv
The model's score, measured as the area under the ROC curve (AUC), is 0.81 (81%).
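For readers unfamiliar with ROC-based scoring, the sketch below shows one way to compute the area under the ROC curve by hand: it is the fraction of (paid, unpaid) ticket pairs in which the paid ticket received the higher predicted probability, with ties counted as half. The function name `roc_auc` and the toy labels and scores are made up for illustration; in practice `sklearn.metrics.roc_auc_score` does the same job.

```python
def roc_auc(y_true, y_score):
    """AUC as the share of positive/negative pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Toy example (not project data): 3 of the 4 pairs are ordered correctly.
toy_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(toy_auc)  # 0.75
```

This pairwise version is quadratic in the number of tickets, so it is only meant to make the definition concrete, not to score the full data set.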
This is the final project for the course "Applied Machine Learning in Python" from the specialization "Applied Data Science with Python". Full marks for this project required a score of 75% on this metric; I achieved 81%.
If you have any questions about this project, please contact me by email: mehran.yazdizadeh@gmail.com