This project presents a vehicle price prediction system built with supervised machine learning. The system takes into account factors such as the vehicle's transmission, mileage, and distance travelled, and predicts the selling price of the car.
The project uses multiple linear regression as the prediction method, which offered 98% prediction precision. In multiple linear regression there are several independent variables but exactly one dependent variable, whose actual and predicted values are compared to measure the precision of the results. In the proposed system, price is the dependent variable to be predicted, and it is derived from factors such as the vehicle's model, make, version, transmission, mileage, ownership, and distance travelled.
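To make the idea of several independent variables and one dependent variable concrete, here is a minimal sketch of multiple linear regression fitted by ordinary least squares with NumPy. The data, the three feature columns, and the coefficients are made up for illustration and are not taken from the project's dataset. The R² score at the end is one common way to compare actual and predicted values; the source does not say which metric the 98% figure refers to.

```python
import numpy as np

# Synthetic, made-up data (an assumption for illustration): each row is one car
# described by [age in years, distance travelled in 1000 km, mileage in km/l].
X = np.array([
    [1.0, 10.0, 20.0],
    [3.0, 40.0, 18.0],
    [5.0, 60.0, 17.0],
    [7.0, 90.0, 15.0],
    [9.0, 120.0, 14.0],
    [2.0, 25.0, 22.0],
])

# Hypothetical coefficients, used only to generate noise-free example prices.
true_intercept = 10.0
true_coefs = np.array([-0.5, -0.02, 0.1])
y = true_intercept + X @ true_coefs  # the single dependent variable (price)

# Prepend a column of ones so the intercept is estimated together with the slopes.
X_design = np.column_stack([np.ones(len(X)), X])

# Ordinary least squares: find beta that minimises ||X_design @ beta - y||^2.
beta, *_ = np.linalg.lstsq(X_design, y, rcond=None)

# Compare actual and predicted prices with the coefficient of determination (R^2).
y_pred = X_design @ beta
ss_res = np.sum((y - y_pred) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot

print("intercept and coefficients:", beta)
print("R^2:", r2)
```

Because the example prices are generated without noise, the fit recovers the hypothetical coefficients almost exactly; real listings would give a lower score.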
The technique used in this project is the random forest algorithm. Random forest copes well with complex models that have a large number of variables and samples, which makes it a good fit for this project. There are several reasons for using it:
- It handles high-dimensional data well, since each tree works on random subsets of the samples and features.
- It requires very little preprocessing and is extremely versatile.
- Each decision tree on its own has low bias but high variance; averaging many trees reduces that variance, which helps avoid overfitting.
- It allows us to check for feature importance.
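The points above can be sketched with scikit-learn's `RandomForestRegressor`. This is an illustrative sketch on synthetic data, not the project's actual pipeline: the feature names, the price formula, and the hyperparameters are all assumptions chosen for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 500

# Synthetic features (assumptions for illustration only).
age = rng.uniform(0, 15, n)          # years since registration
km = rng.uniform(0, 200, n)          # distance travelled, in 1000 km
noise_feature = rng.normal(size=n)   # deliberately unrelated to price

# Hypothetical price formula with a little random noise.
price = 20.0 - 1.0 * age - 0.02 * km + rng.normal(scale=0.3, size=n)

X = np.column_stack([age, km, noise_feature])
feature_names = ["age", "km_driven", "noise"]

X_train, X_test, y_train, y_test = train_test_split(
    X, price, test_size=0.2, random_state=0
)

# Each tree is grown on a bootstrap sample of the rows and considers only
# max_features candidate features at each split; predictions are averaged.
model = RandomForestRegressor(n_estimators=100, max_features=2, random_state=0)
model.fit(X_train, y_train)

# R^2 on held-out data, and impurity-based feature importances (they sum to 1).
test_r2 = model.score(X_test, y_test)
importances = dict(zip(feature_names, model.feature_importances_))

print("test R^2:", round(test_r2, 3))
for name, value in importances.items():
    print(name, round(value, 3))
```

In this setup the unrelated `noise` column should receive a much smaller importance than `age`, which is the kind of check that feature importance makes possible.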
The dataset chosen for this project covers the sale and purchase of cars; the end goal is to predict the price of a car from its features in order to maximize profit. The dataset used is Kaggle's Used Car Dataset by CarDekho.com, chosen because it has a variety of categorical and numerical data and allows us to explore different ways of dealing with missing data.
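As a rough sketch of what dealing with mixed categorical and numerical data and missing values can look like, the following uses pandas on a tiny hand-made table. The column names and values are assumptions for illustration and are not taken from the actual dataset; median and mode imputation are just one of several possible strategies.

```python
import numpy as np
import pandas as pd

# A tiny hand-made table standing in for the real data (hypothetical columns).
df = pd.DataFrame({
    "transmission": ["Manual", "Automatic", "Manual", None],
    "fuel": ["Petrol", "Diesel", "Diesel", "Petrol"],
    "km_driven": [45000.0, np.nan, 120000.0, 30000.0],
    "selling_price": [350000, 800000, 250000, 500000],
})

# Numerical column: fill missing values with the column median.
df["km_driven"] = df["km_driven"].fillna(df["km_driven"].median())

# Categorical column: fill missing values with the most frequent category.
df["transmission"] = df["transmission"].fillna(df["transmission"].mode()[0])

# One-hot encode the categorical columns so a model receives numeric inputs.
features = pd.get_dummies(
    df.drop(columns="selling_price"), columns=["transmission", "fuel"]
)
target = df["selling_price"]

print(features)
```

Other options, such as dropping incomplete rows or using a model-based imputer, may suit the real data better.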
Because new cars have become more expensive and many customers lack the funds to buy them, used car sales are increasing globally. There is a need for a used car price prediction system that can effectively determine the worth of a car from a variety of features. In fact, leasing is now a common practice: a customer gets the use of a car by paying a fixed sum for an agreed number of months rather than buying it outright. Once the leasing period is over, the customer can buy the car by paying the residual value (the expected resale price). It is therefore in the interest of vendors to predict this value with a reasonable degree of accuracy, since if it is initially underestimated, the installment will be higher for the customer, who will most likely opt for another dealership.
The used car price prediction problem is worth studying because different studies show that the used car market is expected to keep growing in the short term. Price prediction for used cars clearly has high commercial value, especially in areas where leasing makes up a significant share of the economy. For customers, knowing the reasonable price of a car lets them buy or sell a used car without worry; for car rental companies, predicting the residual value is useful for pricing their rental service; for banks and other financial institutions, evaluating the price of a borrower's car can help them control his or her loan quota. It is important to know a car's actual market value when both buying and selling.
However, it is not easy to determine the price, as a car's value depends on many factors, including year of registration, manufacturer, distance travelled, model, mileage, horsepower, origin, and several other specifics such as type of fuel and braking system, condition of bodywork and interiors, interior materials (plastic or leather), safety index, transmission type (manual, assisted, automatic, semi-automatic), number of doors, number of previous owners, whether it was previously owned by a private individual or by a company, and the prestige of the manufacturer.