This project predicts carbon emissions with a machine learning model trained on historical data from the Carbon Majors Emissions dataset on Kaggle. The goal was to build an accurate predictive model, evaluate its performance, and serve predictions through a Django web application.
The dataset used in this project can be found on Kaggle: Carbon Majors Emissions Data.
- data/: Directory containing the dataset.
- static/: Directory containing static files, including plots.
- templates/: Directory containing HTML templates.
- myproject/ (in the master branch): Main project directory containing Django settings and configurations.
- myapp/: Django application containing views, models, and other app-specific files.
  - migrations/: Directory containing database migrations.
  - __init__.py: Initialization file for the app.
  - admin.py: Admin configurations.
  - apps.py: App configurations.
  - models.py: Database models.
  - views.py: Views for handling HTTP requests.
  - predict.py: Script containing the model prediction and evaluation logic.
  - urls.py: URL routing for the app.
- Data Preprocessing:
  - Handling missing values.
  - Feature scaling and normalization.
  - Encoding categorical variables.
- Modeling:
  - Random Forest Regressor.
  - Bayesian Optimization for hyperparameter tuning.
- Model Evaluation:
  - Mean Absolute Error (MAE).
  - Mean Squared Error (MSE).
  - R-squared (R²).
  - Root Mean Squared Error (RMSE).
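The preprocessing, modeling, and evaluation steps listed above can be sketched as a single scikit-learn pipeline. This is a minimal illustration, not the code in predict.py: the data is synthetic, and the column names (`year`, `commodity`, `emissions`) are hypothetical stand-ins rather than the dataset's actual schema.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import SimpleImputer
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in data; column names are assumptions for illustration only.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "year": rng.integers(1950, 2023, size=n).astype(float),
    "commodity": rng.choice(["coal", "oil", "gas"], size=n),
})
df["emissions"] = 0.05 * (df["year"] - 1950) + rng.normal(0, 0.5, size=n)
# Inject a few missing values so the imputation step has something to do.
df.loc[df.sample(frac=0.05, random_state=0).index, "year"] = np.nan

# Numeric columns: fill missing values with the median, then standardize.
numeric = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])
# Categorical columns are one-hot encoded; unseen categories are ignored.
preprocess = ColumnTransformer([
    ("num", numeric, ["year"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["commodity"]),
])
model = Pipeline([
    ("preprocess", preprocess),
    ("forest", RandomForestRegressor(n_estimators=100, random_state=0)),
])

X = df[["year", "commodity"]]
y = df["emissions"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# The four evaluation metrics listed above.
mae = mean_absolute_error(y_test, y_pred)
mse = mean_squared_error(y_test, y_pred)
rmse = float(np.sqrt(mse))
r2 = r2_score(y_test, y_pred)
print(f"MAE={mae:.3f} MSE={mse:.3f} RMSE={rmse:.3f} R2={r2:.3f}")
```

Keeping the preprocessing inside the pipeline means the imputer and scaler are fitted on the training split only, which avoids leaking test-set statistics into the model.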
- Data Preprocessing: Learned how to handle missing values, scale features, and encode categorical variables.
- Model Training: Gained experience in training a Random Forest Regressor and optimizing hyperparameters using Bayesian Optimization.
- Model Evaluation: Understood various evaluation metrics and their importance in assessing model performance.
- Django Integration: Integrated the machine learning model into a Django web application for real-time predictions.
The initial model had a Mean Absolute Error (MAE) of 3.43. After optimization and improvements, the MAE dropped to 0.57, a reduction of about 83%. MAE is expressed in the same units as the emissions target, so the optimized model's predictions are off by 0.57 units on average.
- pandas: For data manipulation and analysis.
- numpy: For numerical computations.
- scikit-learn: For machine learning algorithms and model evaluation.
- matplotlib: For creating plots and visualizations.
- bayesian-optimization: For hyperparameter tuning using Bayesian Optimization.
- Django: For building the web application.
- joblib: For saving and loading machine learning models.
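For context on how joblib fits in, here is a minimal sketch of saving a fitted model and loading it back, as a web app might do at startup. The file name `emissions_model.joblib` and the toy training data are hypothetical; the project's actual model file is not named in this README.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Fit a small model on toy data so there is something to persist.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(50, 2))
y = X[:, 0] + 2.0 * X[:, 1]
model = RandomForestRegressor(n_estimators=10, random_state=0).fit(X, y)

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file name, used only for this illustration.
    model_path = os.path.join(tmp_dir, "emissions_model.joblib")
    joblib.dump(model, model_path)      # serialize the fitted model to disk
    restored = joblib.load(model_path)  # load it back, e.g. in a web process

# The restored model should reproduce the original predictions.
same_predictions = bool(np.allclose(model.predict(X), restored.predict(X)))
print("Predictions match after reload:", same_predictions)
```

joblib files are pickle-based, so only load model files from sources you trust.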
To run this project locally, follow these steps:
- Clone the repository:
    git clone https://github.com/yourusername/carbon-emissions-prediction.git
    cd carbon-emissions-prediction
- Create a virtual environment:
    python -m venv env
    source env/bin/activate  # On Windows use `env\Scripts\activate`
- Install the required packages:
    pip install -r requirements.txt
- Run the Django server:
    python manage.py runserver
- Access the application: Open your web browser and navigate to http://localhost:8000.
Here are some screenshots of the Django web application in action:
This is the home page of the application where you can input the year for which you want to predict carbon emissions.
This page displays the predicted emissions along with a comparison plot of actual vs. predicted values and the evaluation metrics.
- Input Year: Enter the year for which you want to predict carbon emissions.
- View Prediction: The application will display the predicted emissions and provide a comparison plot of actual vs. predicted values.
- Evaluate Model: The application shows model evaluation metrics including MAE, MSE, R², and RMSE.
Contributions are welcome! Please fork the repository and submit a pull request for any improvements or bug fixes.