This project is a predictive analysis that identifies the employees most likely to be promoted, based on factors such as training performance and KPI completion.

MDSoleh/HR-Analysis-promotions-

HR-Analysis

Objective:

Develop a predictive model that identifies the employees most likely to be promoted, based on factors such as training performance and KPI completion.

Key Skills:

- Data collection and preprocessing using Pandas
- Exploratory data analysis (EDA) using Matplotlib and Seaborn
- Feature engineering
- Model building and evaluation using Scikit-learn

Step-by-step guide:

1. Data collection & cleaning: Load the training and testing CSV files with Pandas 'read_csv'. [Screenshots]
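A minimal sketch of this step. The real file names and the cleaning rules appear only in the screenshots, so the inline CSV, the reduced column list, and the mode/median fill strategy below are all assumptions made for illustration.

```python
import io

import pandas as pd

# Hypothetical stand-in for the training CSV; the actual file name and
# full column list are not given in this README.
train_csv = io.StringIO(
    "employee_id,education,age,previous_year_rating,avg_training_score,is_promoted\n"
    "1,Bachelor's,35,5.0,49,0\n"
    "2,Master's & above,30,,60,1\n"
    "3,,34,3.0,50,0\n"
)

# read_csv accepts a file path or any file-like object.
train = pd.read_csv(train_csv)

# Count missing values per column before cleaning.
missing_before = train.isna().sum()

# Assumed cleaning strategy: most frequent value for the categorical
# column, median for the numeric one.
train["education"] = train["education"].fillna(train["education"].mode()[0])
train["previous_year_rating"] = train["previous_year_rating"].fillna(
    train["previous_year_rating"].median()
)

print(missing_before)
print(train)
```

With real files, the same call would take a path instead of the in-memory buffer.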

2. Descriptive statistics: Use .describe() to get summary statistics such as the maximum, minimum, and mean of each numeric column. [Screenshot]
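As a small illustration, assuming a toy frame with two of the numeric columns named in this README (the values are made up):

```python
import pandas as pd

# Toy data; the values are illustrative only.
df = pd.DataFrame(
    {
        "age": [25, 30, 35, 40],
        "avg_training_score": [50, 60, 70, 80],
    }
)

# describe() returns count, mean, std, min, quartiles, and max
# for every numeric column.
summary = df.describe()
print(summary.loc[["min", "mean", "max"]])
```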

3. Data exploration: Use countplot, displot, and histograms from the seaborn and matplotlib libraries to get graphical insights into the datasets.

Count of employees who got promoted: [Screenshot]

Count of employees who got promoted, by education: [Screenshot]

Count of employees who got promoted, by age: [Screenshot]

Count of employees who got promoted, by previous_year_rating: [Screenshot]

Count of employees who got promoted, by age and length of service: [Screenshots]

Scatter plots for dataset exploration: [Screenshots]

4. Label conversion: Encode the categorical attributes as integers using LabelEncoder from the preprocessing module. [Screenshots]
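A minimal sketch of the encoding step. The 'gender' column is a hypothetical second categorical attribute added for illustration; which columns the project actually encodes is only visible in the screenshots.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Toy data; 'gender' is a hypothetical column used for illustration.
df = pd.DataFrame(
    {
        "education": ["Bachelor's", "Master's", "Bachelor's"],
        "gender": ["m", "f", "f"],
    }
)

# Fit one encoder per column and keep it, so the same mapping can be
# applied to the test set or inverted later.
encoders = {}
for col in ["education", "gender"]:
    enc = LabelEncoder()
    df[col] = enc.fit_transform(df[col])
    encoders[col] = enc

print(df)
print(list(encoders["education"].classes_))
```

LabelEncoder assigns integers in sorted order of the class labels, so the numbers carry no meaningful ranking. Tree-based models usually tolerate this; for linear models, one-hot encoding is the more common choice.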

5. Correlation: Analyze the inter-dependency between attributes. Here the KPI, awards won, and avg_training_score attributes are positively correlated with the target variable ('is_promoted'), which suggests they are the most useful predictors. [Screenshots]
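One way to compute these correlations, sketched on made-up data. The column names 'KPIs_met' and 'awards_won' are assumed spellings of the attributes named above.

```python
import pandas as pd

# Made-up data; 'KPIs_met' and 'awards_won' are assumed column names.
df = pd.DataFrame(
    {
        "KPIs_met": [1, 0, 1, 1, 1, 0],
        "awards_won": [1, 0, 0, 0, 1, 0],
        "avg_training_score": [85, 50, 78, 55, 90, 60],
        "age": [30, 45, 28, 50, 35, 33],
        "is_promoted": [1, 0, 1, 0, 1, 0],
    }
)

# Pairwise Pearson correlation between all numeric columns.
corr = df.corr()

# Correlation of each feature with the target, strongest first.
target_corr = (
    corr["is_promoted"].drop("is_promoted").sort_values(ascending=False)
)
print(target_corr)
```

The full matrix can be drawn as a heatmap with seaborn's `sns.heatmap(corr, annot=True)`.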

6. Splitting the data: Separate the features from the target and split them into training and validation sets. [Screenshot]
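A sketch of the split using scikit-learn's train_test_split. The 80/20 ratio, the stratification, and the random seed are assumptions; the project's actual settings are shown only in the screenshot.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Toy data: 20 rows, 5 of them promoted.
df = pd.DataFrame(
    {
        "avg_training_score": range(50, 70),
        "KPIs_met": [0, 1] * 10,  # assumed column name
        "is_promoted": [0, 0, 0, 1] * 5,
    }
)

X = df.drop(columns="is_promoted")  # features
y = df["is_promoted"]               # target

# stratify=y keeps the promoted/not-promoted ratio the same in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

print(X_train.shape, X_test.shape)
print(y_train.mean(), y_test.mean())
```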

7. XGBoost classifier: [Screenshots]

8. RandomForest: [Screenshots]

Accuracy alone is not a reliable metric for classification models, particularly when the classes are imbalanced. The focus here is on recall and F1-score, which should be as close to 1.0 as possible. [Screenshot]
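The point about accuracy can be illustrated on synthetic data. This sketch assumes roughly 10% positives and is not the project's dataset. A baseline that never predicts a promotion still scores high accuracy while its recall is zero, which is why recall and F1 are reported for the random forest.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score, recall_score
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in for the HR data (about 10% positives).
X, y = make_classification(
    n_samples=1000, n_features=8, weights=[0.9, 0.1], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Baseline: predict "not promoted" for everyone.
baseline_pred = np.zeros_like(y_test)
baseline_accuracy = accuracy_score(y_test, baseline_pred)
baseline_recall = recall_score(y_test, baseline_pred)

# Random forest with assumed hyperparameters.
rf = RandomForestClassifier(n_estimators=100, random_state=0)
rf.fit(X_train, y_train)
rf_pred = rf.predict(X_test)

rf_accuracy = accuracy_score(y_test, rf_pred)
rf_recall = recall_score(y_test, rf_pred)
rf_f1 = f1_score(y_test, rf_pred)

print(f"baseline: accuracy={baseline_accuracy:.2f} recall={baseline_recall:.2f}")
print(f"forest:   accuracy={rf_accuracy:.2f} recall={rf_recall:.2f} f1={rf_f1:.2f}")
```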

9. Gradient boosting model (GBM): [Screenshot]

10. Predictions: Compare the target variable values predicted by the different models. [Screenshot]
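A sketch of how predictions from several models could be placed side by side. It uses scikit-learn's RandomForestClassifier and GradientBoostingClassifier on synthetic data; the XGBoost model is left out to keep the example dependent on scikit-learn only, and the hyperparameters are assumptions.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the encoded HR features and the promotion target.
X, y = make_classification(
    n_samples=500, n_features=6, weights=[0.85, 0.15], random_state=1
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=1
)

models = {
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=1),
    "gbm": GradientBoostingClassifier(random_state=1),
}

# One column per model, next to the true labels.
predictions = pd.DataFrame({"actual": y_test})
for name, model in models.items():
    model.fit(X_train, y_train)
    predictions[name] = model.predict(X_test)

# Share of rows on which the two models give the same answer.
agreement = (predictions["random_forest"] == predictions["gbm"]).mean()

print(predictions.head())
print(f"models agree on {agreement:.0%} of rows")
```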
