Car Evaluation Analysis

Authors: Danish Karlin Isa, Nicholas Varabioff, Ximin Xu, Zuer Zhong

This project is part of the coursework for DSCI 522 Data Science Workflows, a course of the Master of Data Science program at the University of British Columbia.

About

This project attempts to predict the level of acceptability of cars using machine learning methods. Using a 1997 dataset, the influence of various attributes of a new car on its acceptability among customers is analysed. These attributes include:

The buying price of the car
Maintenance costs
Number of doors
Passenger capacity
Boot size
Safety ratings

This analysis aims to identify the key factors that determine whether a car is considered acceptable, good, or unacceptable according to standardized criteria. To achieve this, several common machine learning models were explored. The SVM RBF classifier was identified as the best-performing model, with a reported test accuracy of 0.952 and a reported accuracy of 0.99 on unseen data. This makes the SVM RBF model a solid choice for this project.
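The attributes listed above are categorical, so they need a numeric representation before a model such as an SVM can use them. The following is a minimal sketch of one option, an ordinal encoding. It assumes the category labels documented for the UCI Car Evaluation dataset; the column names and the `encode_row` helper are illustrative and are not taken from this project's code, which may encode the features differently (for example with one-hot encoding).

```python
# Assumed category orderings, lowest to highest, based on the UCI dataset description.
# The column names and this helper are hypothetical, not this project's actual code.
LEVELS = {
    "buying": ["low", "med", "high", "vhigh"],
    "maint": ["low", "med", "high", "vhigh"],
    "doors": ["2", "3", "4", "5more"],
    "persons": ["2", "4", "more"],
    "lug_boot": ["small", "med", "big"],
    "safety": ["low", "med", "high"],
}


def encode_row(row):
    """Map each categorical value in `row` to its integer rank within LEVELS."""
    encoded = []
    for column, levels in LEVELS.items():
        value = row[column]
        if value not in levels:
            raise ValueError(f"unexpected value {value!r} for column {column!r}")
        encoded.append(levels.index(value))
    return encoded


example = {
    "buying": "vhigh",
    "maint": "med",
    "doors": "4",
    "persons": "more",
    "lug_boot": "small",
    "safety": "high",
}
print(encode_row(example))  # [3, 1, 2, 2, 0, 2]
```

An ordinal encoding keeps the natural order of levels such as low, med, and high, at the cost of implying equal spacing between them.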

The dataset used in this project is the Car Evaluation Database created by M. Bohanec and V. Rajkovic in the early 1990s. It was sourced from the UCI Machine Learning Repository, where it is publicly available for research.

Report

The final report can be found here.

Project Dependencies

Docker

Usage

Setup

If you are using Windows or Mac, make sure Docker Desktop is running.
Clone this GitHub repository.

Running the Analysis

Navigate to the root of this repository.
Run the following command:
```
docker compose up
```
In the terminal, look for a URL that starts with http://127.0.0.1:8888/lab?token=. Copy and paste that URL into your browser. This will launch Jupyter Lab.
To reset the project to a clean slate, open a terminal in the root of the project and run the following command.
```
make clean
```
To run the entire analysis, open a terminal in the root of the project and run the following command.
```
make all
```
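The Jupyter Lab URL can be hard to spot among the container's log lines. The following is a minimal sketch of how it could be picked out of captured terminal output. The `find_jupyter_url` helper and the sample log line are hypothetical, and the pattern assumes the token consists of hexadecimal characters, which may not hold for every Jupyter version.

```python
import re

# Matches the Jupyter Lab URL prefix mentioned above. The token is assumed to be
# made of hexadecimal characters; adjust the pattern if your token differs.
URL_PATTERN = re.compile(r"http://127\.0\.0\.1:8888/lab\?token=[0-9a-f]+")


def find_jupyter_url(log_text):
    """Return the first Jupyter Lab URL found in `log_text`, or None."""
    match = URL_PATTERN.search(log_text)
    return match.group(0) if match else None


# Made-up log line for illustration only; real output will differ.
sample_log = "jupyter-1  |     http://127.0.0.1:8888/lab?token=abc123def456\n"
print(find_jupyter_url(sample_log))
```

The function returns None when no matching URL is present, so a caller can tell that the server has not finished starting yet.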

Clean Up

To shut down the container and clean up the resources, press Ctrl + C in the terminal where you launched the container. Then, run docker compose rm.

Developer Notes

Developer Dependencies

conda (version 23.9.0 or higher)
conda-lock (version 2.5.7 or higher)
Python and packages listed in environment.yaml

Adding a New Dependency

Create a new branch.
Add the dependency to the environment.yaml file. Make sure to pin the desired package version.
Update the conda-linux-64.lock file by running the following command from the root of the project:
```
conda-lock -k explicit --file environment.yaml -p linux-64
```

Rebuild the Docker image locally to ensure it builds and runs properly.
```
docker build --tag env-test --platform=linux/amd64 .
docker run --rm -it --platform=linux/amd64 env-test /bin/bash
```

Push the changes to GitHub. A new Docker image will be built and pushed to DockerHub automatically. It will be tagged with the SHA for the commit that changed the file.
Update the docker-compose.yml file on your branch to use the new container image (make sure to update the tag specifically).
Send a Pull Request and merge your branch changes into the main branch.
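The pinning step above is easy to overlook. The following is a rough sketch of a check that lists dependency entries without an exact version pin. It parses the file's text with a simple regular expression rather than a YAML library, and the `unpinned_dependencies` helper and the sample file contents are hypothetical, not part of this repository.

```python
import re

# A pinned conda dependency looks like "name=1.2.3" and a pinned pip dependency
# like "name==1.2.3". Any other entry in the list is treated as unpinned.
PINNED = re.compile(r"^[A-Za-z0-9_.\-]+={1,2}[0-9][^\s<>]*$")


def unpinned_dependencies(environment_text):
    """Return dependency entries that lack an exact version pin."""
    unpinned = []
    in_dependencies = False
    for raw_line in environment_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and whitespace
        if line == "dependencies:":
            in_dependencies = True
            continue
        if not in_dependencies or not line.startswith("- "):
            continue
        entry = line[2:].strip()
        if entry.endswith(":"):  # e.g. the nested "- pip:" header
            continue
        if not PINNED.match(entry):
            unpinned.append(entry)
    return unpinned


# Hypothetical file contents for illustration; not this project's actual environment.
sample = """\
name: example
channels:
  - conda-forge
dependencies:
  - python=3.11.6
  - pandas
  - pip:
    - requests==2.31.0
"""
print(unpinned_dependencies(sample))  # ['pandas']
```

To use it on a real file, read the file's text (for example with `pathlib.Path("environment.yaml").read_text()`) and pass it to the function before regenerating the lock file.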

Running the Test Suite

Launch Jupyter Lab using the same docker compose up command in the Running the Analysis section.
To run the test suite, open a terminal in the root of the project and run the pytest command.

License

This dataset is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license. This allows for the sharing and adaptation of the dataset for any purpose, provided that appropriate credit is given. The software code contained within this repository is licensed under the MIT license. See the license file for more information.

References

Van Rossum, G., & Drake, F. L. (2009). Python 3 Reference Manual. CreateSpace, Scotts Valley, CA. ISBN: 1441412697.
McKinney, W. (2010). Data Structures for Statistical Computing in Python. In Proceedings of the 9th Python in Science Conference (pp. 56-61). Edited by S. van der Walt and J. Millman. DOI: 10.25080/Majora-92bf1922-00a.
Hunter, J. D. (2007). Matplotlib: A 2D graphics environment. Computing in Science & Engineering, 9(3), 90-95. DOI: 10.1109/MCSE.2007.55.
Pedregosa, F. et al. (2011). Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research, 12, 2825-2830.
Bohanec, M. (1988). Car Evaluation. UCI Machine Learning Repository. DOI: 10.24432/C5JP48.
Harris, C. R. et al. (2020). Array programming with NumPy. Nature, 585(7825), 357-362. DOI: 10.1038/s41586-020-2649-2.
Scikit-learn Developers. (2024). Grid Search: Scikit-learn Documentation. https://scikit-learn.org/stable/modules/grid_search.html.
