
This is a small ML project on the Titanic dataset. The aim is to perform exploratory data analysis on the data with the help of graphs, preprocess the data, and finally apply the Random Forest algorithm to predict survival from the features.


sarthakkmishraa/Titanic-Dataset-EDA-Random-Forest-Algorithm


Titanic-Dataset-EDA-Random-Forest-Algorithm

The aim of this project is to perform exploratory data analysis (EDA) and predict survival from features in the dataset such as Age, Sex, passenger class (Pclass), number of siblings or spouses aboard (SibSp), number of parents or children aboard (Parch), port of embarkation (Embarked), Fare, and so on.

The dataset comes in two parts: a training set (train.csv) and a testing set (test.csv).

The EDA is performed on the training data. The conclusions drawn from it guide the preprocessing, after which the machine learning algorithm (random forest) is trained on the preprocessed training data.
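The preprocess-then-train flow can be sketched roughly as follows. This is a minimal illustration, not the code in this repository: it assumes pandas and scikit-learn, and the helper names (`preprocess`, `train_forest`), the median and most-frequent-value imputation, and the integer encodings are all assumptions chosen for the example.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Feature columns named in this README (Titanic dataset column names).
FEATURES = ["Pclass", "Sex", "Age", "SibSp", "Parch", "Fare", "Embarked"]


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Return a numeric feature table built from the raw columns (hypothetical helper)."""
    X = df[FEATURES].copy()
    # Assumption: fill missing numeric values with the column median.
    X["Age"] = X["Age"].fillna(X["Age"].median())
    X["Fare"] = X["Fare"].fillna(X["Fare"].median())
    # Assumption: encode Sex as 0/1.
    X["Sex"] = X["Sex"].map({"male": 0, "female": 1})
    # Assumption: fill missing ports with the most frequent one,
    # then encode the three ports as integers.
    X["Embarked"] = X["Embarked"].fillna(X["Embarked"].mode()[0])
    X["Embarked"] = X["Embarked"].map({"S": 0, "C": 1, "Q": 2})
    return X


def train_forest(train_df: pd.DataFrame, n_estimators: int = 100,
                 random_state: int = 0) -> RandomForestClassifier:
    """Fit a random forest on the preprocessed training data (hypothetical helper)."""
    model = RandomForestClassifier(n_estimators=n_estimators,
                                   random_state=random_state)
    model.fit(preprocess(train_df), train_df["Survived"])
    return model
```

With the training file loaded, for example via `train_forest(pd.read_csv("train.csv"))`, this returns a fitted model. Note that this sketch imputes each table with its own statistics; reusing the training-set medians for the test set is a common alternative.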

Predictions are then made on the testing data, and the results are written to a CSV file as the output.
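Writing the predictions out could look something like the sketch below. It uses only the standard-library `csv` module. The function name `write_predictions` and the two-column layout (`PassengerId`, `Survived`) are assumptions for illustration; the actual output format is defined in the project code.

```python
import csv
from typing import Iterable


def write_predictions(passenger_ids: Iterable[int],
                      predictions: Iterable[int],
                      path: str) -> int:
    """Write one (PassengerId, Survived) row per passenger and return the row count."""
    ids = list(passenger_ids)
    preds = [int(p) for p in predictions]  # also accepts NumPy integer values
    if len(ids) != len(preds):
        raise ValueError("need exactly one prediction per passenger")
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["PassengerId", "Survived"])  # assumed header
        writer.writerows(zip(ids, preds))
    return len(ids)
```

A call such as `write_predictions(test_ids, model.predict(X_test), "predictions.csv")` would produce the file, where `test_ids`, `model`, `X_test`, and the file name are placeholders.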

I managed to achieve 80% accuracy in the predictions.
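Accuracy here means the fraction of passengers whose survival was predicted correctly. The README does not say how the figure was measured, so the helper below is only a hypothetical sketch of the arithmetic, given true labels and predicted labels for the same passengers.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and equally long")
    correct = sum(int(t == p) for t, p in zip(y_true, y_pred))
    return correct / len(y_true)
```

For instance, four correct predictions out of five give `accuracy([1, 0, 1, 1, 0], [1, 0, 0, 1, 0])`, which is 0.8.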

For more details, check the code. :)
