The quality of concrete is determined by its compressive strength, which is measured with a conventional crushing test on a concrete cylinder. Strength is also vital for achieving the required longevity. The standard strength test takes 28 days, which is a long wait. I addressed this problem with data science and machine learning: I developed a web application that predicts the "Concrete compressive strength" from the quantities of raw materials given as input.
Data source:
https://www.kaggle.com/elikplim/concrete-compressive-strength-data-set
- Loaded the dataset using Pandas and performed basic checks, such as the data type of each column and the presence of missing values.
- Performed exploratory data analysis:
- First viewed the distribution of the target feature, "Concrete compressive strength", which was approximately normal with a slight right skew.
- Plotted each predictor (independent feature) against the target feature and found that the target increases with cement and decreases with water.
- For further insight, plotted both Pearson and Spearman correlations, which showed the same results as above.
- Checked all columns for outliers and found that the 'age' column has the most. Removed outliers using the IQR technique: I built two dataframes, one treating the lower and upper limits as inclusive and one treating them as exclusive, and merged both into a single dataframe. This increased the data size, giving the machine learning model more data to train on.
- Experimented with various ML algorithms:
- First, tried linear regression models with feature selection using backward elimination, RFE and LassoCV. Stored the important features found by each approach in the "relevant_features_by_models.csv" file in the "results" directory. Calculated performance metrics for all three approaches and recorded them in the "Performance of algorithms.csv" file in the "results" directory. Although all three approaches delivered similar performance, I chose RFE because its test RMSE was slightly lower than the others. Then performed a residual analysis, and the model satisfied all the assumptions of linear regression. Its drawback is that the model showed slight underfitting.
- Next, tried various tree-based models, performed hyperparameter tuning using RandomizedSearchCV and found the best hyperparameters for each model. Then picked the top features according to each model's feature importances and recorded them in the "relevant_features_by_models.csv" file in the "results" directory. Built the models, evaluated them on both the training and testing data, and recorded the performance metrics in the "Performance of algorithms.csv" file in the "results" directory.
- Based on the performance metrics of both the linear and the tree-based models, the XGBoost regressor performed best, followed by the random forest regressor. Saved these two models in the "models" directory.
- Deployment: Deployed the XGBoost regressor model on Koyeb, where it serves as the backend; the frontend web page is written in HTML5.
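The modelling steps in the list above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the data is synthetic, the column names and hyperparameter values are assumptions, the IQR filter is the plain textbook version, and a random forest (the second-best model above) stands in for XGBoost so that the example only needs pandas, NumPy and scikit-learn.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import RFE
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import RandomizedSearchCV, train_test_split


def make_synthetic_mix_data(n_rows=300, seed=0):
    """Build a made-up table that stands in for the Kaggle CSV (assumption)."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "cement": rng.uniform(100, 540, n_rows),
        "water": rng.uniform(120, 250, n_rows),
        "superplasticizer": rng.uniform(0, 30, n_rows),
        "age": rng.integers(1, 365, n_rows).astype(float),
    })
    # Arbitrary linear relationship plus noise, for illustration only.
    df["strength"] = (40 + 0.1 * df["cement"] - 0.15 * df["water"]
                      + 0.02 * df["age"] + rng.normal(0, 3, n_rows))
    df.loc[:4, "age"] = 5000.0  # inject five obvious outliers in 'age'
    return df


def drop_iqr_outliers(df, columns, k=1.5):
    """Keep rows lying within [Q1 - k*IQR, Q3 + k*IQR] for every listed column."""
    keep = pd.Series(True, index=df.index)
    for col in columns:
        q1, q3 = df[col].quantile([0.25, 0.75])
        iqr = q3 - q1
        keep &= df[col].between(q1 - k * iqr, q3 + k * iqr)
    return df[keep]


def select_and_tune(df, target="strength", n_features=3, seed=0):
    """Select features with RFE, then tune a tree model with a randomized search."""
    X, y = df.drop(columns=target), df[target]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed)

    # Recursive feature elimination around a plain linear regression.
    rfe = RFE(estimator=LinearRegression(), n_features_to_select=n_features)
    rfe.fit(X_train, y_train)
    selected = list(X_train.columns[rfe.support_])

    # Randomized hyperparameter search; the candidate values are arbitrary.
    search = RandomizedSearchCV(
        RandomForestRegressor(random_state=seed),
        param_distributions={
            "n_estimators": [20, 50],
            "max_depth": [3, 5, None],
            "min_samples_leaf": [1, 2, 4],
        },
        n_iter=4, cv=3, scoring="neg_root_mean_squared_error", random_state=seed,
    )
    search.fit(X_train[selected], y_train)
    test_rmse = mean_squared_error(y_test, search.predict(X_test[selected])) ** 0.5
    return selected, search.best_params_, test_rmse


if __name__ == "__main__":
    raw = make_synthetic_mix_data()
    clean = drop_iqr_outliers(raw, ["cement", "water", "superplasticizer", "age"])
    features, params, rmse = select_and_tune(clean)
    print(len(raw), len(clean), features, params, round(rmse, 2))
```

Comparing the test RMSE across candidate models, as done here for a single model, mirrors how the best regressor was chosen above.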
At each step in both the development and deployment parts, logging is performed; the logs are stored in the development_logs.log and deployment_logs.log files, respectively.
So now the concrete compressive strength can be found quickly, just by passing the quantities of the raw materials as input to the web application 😊.
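One way to keep two separate log files like this is a small helper built on Python's standard `logging` module. The sketch below is an assumption about how such a setup might look; the helper name `get_file_logger`, the logger names and the message format are hypothetical, and only the two file names come from this project.

```python
import logging


def get_file_logger(name, log_path):
    """Return a logger that appends INFO-and-above records to log_path."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid stacking duplicate handlers on repeated calls
        handler = logging.FileHandler(log_path)
        handler.setFormatter(logging.Formatter(
            "%(asctime)s | %(name)s | %(levelname)s | %(message)s"))
        logger.addHandler(handler)
    logger.propagate = False  # keep records out of the root logger
    return logger


if __name__ == "__main__":
    dev_log = get_file_logger("development", "development_logs.log")
    deploy_log = get_file_logger("deployment", "deployment_logs.log")
    dev_log.info("Dataset loaded and basic checks completed")
    deploy_log.info("Prediction request received")
```

Giving each phase its own named logger keeps development and deployment records from mixing in one file.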
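As a rough illustration of how raw-material quantities could be passed to a Flask backend, here is a minimal sketch. It is not the deployed application: the `/predict` route, the JSON field names, the `strength_mpa` key and the `StandInModel` class are all hypothetical, and the stand-in returns an arbitrary number where the saved regressor would be loaded in practice.

```python
from flask import Flask, jsonify, request

# Hypothetical input names, modelled on the eight predictors of the dataset.
FEATURES = ["cement", "blast_furnace_slag", "fly_ash", "water",
            "superplasticizer", "coarse_aggregate", "fine_aggregate", "age"]


class StandInModel:
    """Placeholder with a predict() method; NOT the project's trained model."""

    def predict(self, rows):
        # Arbitrary formula so the example runs without a saved model file.
        return [0.1 * row[0] - 0.1 * row[3] + 20.0 for row in rows]


def create_app(model):
    app = Flask(__name__)

    @app.route("/predict", methods=["POST"])
    def predict():
        payload = request.get_json(silent=True)
        if not isinstance(payload, dict):
            payload = {}
        missing = [name for name in FEATURES if name not in payload]
        if missing:
            return jsonify(error=f"missing fields: {missing}"), 400
        try:
            row = [float(payload[name]) for name in FEATURES]
        except (TypeError, ValueError):
            return jsonify(error="all quantities must be numeric"), 400
        strength = float(model.predict([row])[0])
        return jsonify(strength_mpa=round(strength, 2))

    return app
```

Calling `create_app(StandInModel()).run(port=5000)` would serve this sketch locally; Flask's built-in test client (`app.test_client()`) can exercise the route without starting a server.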
URL: https://youtu.be/21mvK42_ubg?si=iUE7F5i9W3RVxO0p
Internship Experience Letter (CCSP).pdf
- Testing the compressive strength of concrete in the laboratory (https://www.youtube.com/watch?v=t4RDdn6rOwU&ab_channel=Anime_Edu-CivilEngineeringVideos)
- Concrete Basics: Essential Ingredients For A Concrete Mixture
- Applications of Fly ash
- Blast furnace slag cement
- Applications of Superplasticizer in concrete making
- Factors that affect strength of concrete
- Feature selection with sklearn and pandas
- sklearn's LassoCV
- Post pruning technique in Decision tree algorithm
- Hyper parameter tuning in XGBoost
- HTML, CSS tutorials
To set up and run this project locally, follow these steps:
- Clone the repository:
    git clone https://github.com/your-username/concrete-compressive-strength-prediction.git
    cd concrete-compressive-strength-prediction
- Create a virtual environment (optional but recommended):
    python -m venv venv
    source venv/bin/activate  # On Windows, use `venv\Scripts\activate`
- Install the required dependencies:
    pip install -r requirements.txt
- Run the Flask application:
    python app.py
- Open your web browser and navigate to http://localhost:5000 to access the application.
We welcome contributions to improve this project! Here are some ways you can contribute:
- Report bugs or suggest features by opening an issue.
- Improve documentation by submitting pull requests.
- Add new features or fix bugs by forking the repository and submitting a pull request.
Please ensure that your code adheres to the existing style and that all tests pass before submitting a pull request.
This project is licensed under the MIT License.
MIT License
Copyright (c) 2023 Sanket Jagtap
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.