Problem Statement: Predict the prices of used cars using data collected from various sources across multiple locations in India.
Features:
Name: The brand and model of the car.
Location: The location in which the car is being sold or is available for purchase.
Year: The year or edition of the model.
Kilometers_Driven: The total distance driven by the previous owner(s), in km.
Fuel_Type: The type of fuel used by the car.
Transmission: The type of transmission used by the car.
Owner_Type: Whether the ownership is first-hand, second-hand, or other.
Mileage: The standard mileage quoted by the car company, in kmpl (kilometres per litre) or km/kg.
Engine: The displacement volume of the engine in cc.
Power: The maximum power of the engine in bhp.
Seats: The number of seats in the car.
Price: The price of the used car in INR Lakhs.
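Because Mileage, Engine and Power are described with units, they may arrive as text rather than numbers. The sketch below shows one way to pull out the numeric part. It assumes raw values look like "19.67 kmpl", "1582 CC" or "126.2 bhp", and that a missing value may appear as a placeholder such as "null bhp"; neither format is confirmed by the description above. The helper names parse_numeric and split_name are hypothetical.

```python
import re
from typing import Optional, Tuple

# Assumption: raw values look like "19.67 kmpl", "1582 CC", "126.2 bhp".
# Placeholder strings without digits (e.g. "null bhp") are treated as missing.
_NUMBER = re.compile(r"[-+]?\d+(?:\.\d+)?")


def parse_numeric(raw: Optional[str]) -> Optional[float]:
    """Return the first number found in a unit-bearing string, or None."""
    if raw is None:
        return None
    match = _NUMBER.search(str(raw))
    return float(match.group()) if match else None


def split_name(name: str) -> Tuple[str, str]:
    """Split Name into (brand, model), assuming the brand is the first word.

    This assumption fails for two-word brands, so inspect the result.
    """
    parts = name.strip().split(maxsplit=1)
    brand = parts[0] if parts else ""
    model = parts[1] if len(parts) > 1 else ""
    return brand, model


# Hypothetical raw row, for illustration only.
row = {"Name": "Maruti Wagon R LXI CNG", "Mileage": "26.6 km/kg",
       "Engine": "998 CC", "Power": "null bhp"}
cleaned = {col: parse_numeric(row[col]) for col in ("Mileage", "Engine", "Power")}
cleaned["Brand"], cleaned["Model"] = split_name(row["Name"])
print(cleaned)
```

Values that come back as None can then be handled together with the other nulls. Note that kmpl and km/kg are different units, so after parsing, the Mileage numbers are not directly comparable across fuel types.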
Tasks:
1. Clean the data (null-value removal, outlier identification).
2. Null values: state whether you drop the affected rows/columns or impute them, and give the reason for dropping or the method of imputation.
3. EDA: repeat the analysis from the Minor Project to understand the relationships between the variables.
4. Handle categorical variables (using label encoding or one-hot encoding).
5. Apply data scaling to Kilometers_Driven.
6. Perform the train-test split.
7. Apply different ML regression algorithms.
8. Calculate the error metrics.
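As a rough illustration of tasks 5, 6 and 8, here is a minimal standard-library sketch of min-max scaling, a reproducible train-test split, and three common regression error metrics (MAE, RMSE, R^2). The sample rows are made up for illustration, the function names are hypothetical, and the "model" is only a mean-price baseline standing in for the regression algorithms of task 7. The brief does not prescribe a scaling method or specific metrics, so these are assumptions.

```python
import math
import random
from typing import List, Sequence, Tuple


def fit_min_max(values: Sequence[float]) -> Tuple[float, float]:
    """Learn the scaling range from the training values only."""
    return min(values), max(values)


def apply_min_max(values: Sequence[float], lo: float, hi: float) -> List[float]:
    """Rescale so that lo maps to 0 and hi maps to 1 (constant column -> 0.0)."""
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def split_rows(rows: Sequence, test_fraction: float = 0.2, seed: int = 42):
    """Shuffle reproducibly and hold out a fraction of the rows for testing."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = max(1, round(len(rows) * test_fraction))
    test = [rows[i] for i in indices[:n_test]]
    train = [rows[i] for i in indices[n_test:]]
    return train, test


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination; NaN if the true values have no variance."""
    mean_true = sum(y_true) / len(y_true)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return float("nan") if ss_tot == 0 else 1.0 - ss_res / ss_tot


# Made-up (Kilometers_Driven, Price in INR Lakhs) pairs, for illustration only.
rows = [(72000, 1.75), (41000, 12.5), (46000, 4.5), (87000, 6.0), (40670, 17.74),
        (75000, 2.35), (86999, 3.5), (36000, 17.5), (64430, 5.2), (65932, 1.95)]

train, test = split_rows(rows)

# Fit the scaler on the training split, then apply it to the test split;
# test values outside the training range will fall outside [0, 1].
lo, hi = fit_min_max([km for km, _ in train])
test_km_scaled = apply_min_max([km for km, _ in test], lo, hi)

# Baseline "model": always predict the mean training price.
mean_price = sum(price for _, price in train) / len(train)
y_true = [price for _, price in test]
y_pred = [mean_price] * len(test)
print(f"MAE={mae(y_true, y_pred):.2f}  RMSE={rmse(y_true, y_pred):.2f}  "
      f"R2={r2(y_true, y_pred):.2f}")
```

Splitting before fitting the scaler keeps information from the test rows out of the scaling range. In practice, scikit-learn offers equivalents such as MinMaxScaler, train_test_split, mean_absolute_error, mean_squared_error and r2_score.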