nandini-mishra/Insurance-charges

Insurance-charges

The dataset contains features such as age, sex, smoking habits, and region for each individual, along with their insurance charges. The model applies multiple linear regression to predict the charges, and its accuracy is measured with the R² score. Before building the model, we performed exploratory data analysis in Tableau to check for correlations.

[Figure: insurance charges plotted against age]

As we can see, charges increase with age, which indicates a positive correlation between the two.
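The strength of such a relationship can also be checked numerically. The following is a minimal sketch of the Pearson correlation coefficient in plain Python; the `ages` and `charges` values are made up for illustration and are not taken from the dataset, and `pearson_r` is a hypothetical helper name.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (unnormalised covariance)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Square roots of the sums of squared deviations
    dev_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (dev_x * dev_y)

# Made-up example values (NOT from the insurance dataset)
ages = [22, 30, 41, 50, 63]
charges = [2500.0, 4100.0, 6300.0, 9000.0, 13500.0]

r = pearson_r(ages, charges)
print("Pearson r between age and charges:", round(r, 3))
```

A value of `r` close to 1 indicates a strong positive linear relationship, a value close to -1 a strong negative one, and a value near 0 little linear relationship.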

We performed a similar analysis for region, sex, and smoking:

[Figures: insurance charges by region, by sex, and by smoking status]

Thus, we concluded that region and sex had little effect on the variation in charges, while smoking was one of the leading factors in explaining it.
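One simple way to make this kind of comparison outside Tableau is to compare the mean charge per category. The sketch below assumes records stored as dictionaries; the rows are made up to mirror the pattern described above and are not taken from the dataset, and `mean_by_group` is a hypothetical helper name.

```python
from collections import defaultdict
from statistics import mean

def mean_by_group(records, key, value="charges"):
    """Average the `value` field of the records for each distinct `key`."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key]].append(rec[value])
    return {group: mean(vals) for group, vals in groups.items()}

# Made-up example rows (NOT from the insurance dataset)
records = [
    {"sex": "female", "smoker": "yes", "charges": 21000.0},
    {"sex": "male",   "smoker": "no",  "charges": 4200.0},
    {"sex": "female", "smoker": "no",  "charges": 5100.0},
    {"sex": "male",   "smoker": "yes", "charges": 23500.0},
    {"sex": "female", "smoker": "no",  "charges": 3900.0},
    {"sex": "male",   "smoker": "no",  "charges": 4800.0},
]

by_smoker = mean_by_group(records, "smoker")
by_sex = mean_by_group(records, "sex")
print("mean charges by smoker:", by_smoker)
print("mean charges by sex:", by_sex)
```

A large gap between group means, as between smokers and non-smokers in these illustrative rows, suggests the feature is worth including in the model. A small gap suggests it explains little of the variation.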

The model was based on four features: age, children, bmi, and smoking. The R² value was found to be 0.78.
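For readers who want to see the mechanics, here is a minimal sketch of multiple linear regression and the R² score using only the Python standard library. It is not the repository's code: the rows are made up, smoking is assumed to be encoded as 0/1, and the helper names (`fit_linear_regression`, `predict`, `r2_score`) are hypothetical. In practice a library such as scikit-learn (`LinearRegression`, `r2_score`) would normally be used instead.

```python
def fit_linear_regression(X, y):
    """Least-squares fit of y ~ b0 + b1*x1 + ... via the normal equations."""
    A = [[1.0] + list(row) for row in X]  # prepend an intercept column
    n = len(A[0])
    # Augmented matrix [A^T A | A^T y]
    M = [[sum(r[i] * r[j] for r in A) for j in range(n)]
         + [sum(r[i] * t for r, t in zip(A, y))] for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]  # [intercept, b1, b2, ...]

def predict(coefs, X):
    """Apply the fitted coefficients to each feature row."""
    return [coefs[0] + sum(c * x for c, x in zip(coefs[1:], row)) for row in X]

def r2_score(y_true, y_pred):
    """R^2 = 1 - (residual sum of squares) / (total sum of squares)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Made-up rows (NOT from the dataset): (age, children, bmi, smoker as 0/1)
X = [
    (25, 0, 22.0, 0), (40, 2, 30.5, 1), (33, 1, 27.0, 0), (52, 3, 35.0, 1),
    (61, 0, 24.5, 0), (29, 2, 31.0, 1), (45, 1, 26.0, 0), (38, 4, 29.5, 0),
]
charges = [3100.0, 29500.0, 5200.0, 41000.0, 12800.0, 24900.0, 8100.0, 7600.0]

coefs = fit_linear_regression(X, charges)
r2 = r2_score(charges, predict(coefs, X))
print("coefficients (intercept, age, children, bmi, smoker):", coefs)
print("R^2 on the illustrative data:", round(r2, 3))
```

Note that this sketch scores the model on the same rows it was fitted on. Whether the 0.78 reported above was measured on training data or on a held-out split is not stated in the source.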
