Welcome to the repository for my Cognizant virtual internship on the Forage platform, which I completed between July 2023 and August 2023. This repository contains the materials, code, and documentation for the internship tasks I undertook. Below is an overview of the tasks, the work I completed for each one, and the internship certification.
- Task 1: Exploratory Data Analysis
  In this task, I performed Exploratory Data Analysis (EDA) on customer data. The code I wrote for EDA is available as eda.ipynb. I summarized the findings in a concise and business-friendly manner within an email to the Data Science team leader.
- Task 2: Data Modeling
  For this task, I prepared a presentation that summarizes the data I planned to use and outlined a strategic plan of action. The presentation is available as Presentation.pdf.
- Task 3: Model Building and Interpretation
  I tackled data cleaning, feature engineering, and model building in this task. The code and documentation can be found as modelling_notebook.ipynb. I also created a strategic plan presentation outlining the model development process, available as strategic plan.pdf.
- Task 4: Machine Learning Production
  In this task, I split the data into training and testing sets, built a machine learning model, tuned its parameters, and performed cross-validation. The code is available as training_deployment.py.
- Task 5: Quality Assurance
  For the final task, I provided answers to frequently asked questions (FAQs) about the model, deployment, and suggestions for model improvement.
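
The Task 4 workflow described above (train/test split, model fitting, parameter tuning, and cross-validation) might look roughly like the following sketch. It is not the code from training_deployment.py: the synthetic data, the RandomForestRegressor, the parameter grid, and the mean-absolute-error scoring are all assumptions chosen for illustration.

```python
# Illustrative sketch only: synthetic data and a RandomForestRegressor stand in
# for the internship's dataset and model, which this README does not specify.
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, cross_val_score, train_test_split

# Synthetic features and target (assumption; replace with the real data).
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# Hold out 25% of the rows for final testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Tune two hyperparameters over a small grid, using 5-fold cross-validation.
param_grid = {"n_estimators": [25, 50], "max_depth": [3, None]}
search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    cv=5,
    scoring="neg_mean_absolute_error",
)
search.fit(X_train, y_train)

# Cross-validate the best configuration on the training data only.
cv_scores = cross_val_score(
    search.best_estimator_,
    X_train,
    y_train,
    cv=5,
    scoring="neg_mean_absolute_error",
)

# Score the refitted best model on the held-out test set (R^2 for regressors).
test_r2 = search.best_estimator_.score(X_test, y_test)

print("Best parameters:", search.best_params_)
print("Mean CV MAE:", -cv_scores.mean())
print("Test R^2:", test_r2)
```

Keeping the test set out of both the grid search and the cross-validation means the final score is measured on rows the tuning process never saw.
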
The internship allowed me to work on various stages of a data science project, from data exploration and preprocessing to model development, interpretation, and deployment. This holistic understanding of the entire project lifecycle has boosted my confidence in working on real-life ML problems.
The tasks I completed required hands-on work with real-world data and practical implementation of machine learning techniques. This helped me understand the challenges and nuances of working with real data and of putting what I had learned into practice.
Creating strategic plans and presentations (Tasks 2, 3, and 5) emphasized the need to think strategically about the goals of the project and the impact on the business. It highlighted the importance of considering long-term implications and planning ahead.
Tasks 4 and 5 required me to interpret model results and explain their implications to the business. This emphasized that building a model is only part of the process; understanding how it aligns with business objectives is equally important.
Working with actual customer data exposed me to the challenges of data quality, missing values, and other real-world data issues. This experience reinforced the need for data preprocessing and cleaning.
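
As a small illustration of the kind of cleaning this refers to, the sketch below counts missing values, drops duplicate rows, and fills the remaining gaps with pandas. The table, its column names, and the fill strategies (median for numeric values, an "unknown" label for categories) are hypothetical and are not taken from the internship data or notebooks.

```python
import numpy as np
import pandas as pd

# Hypothetical customer records; the columns are made up for illustration.
df = pd.DataFrame(
    {
        "customer_id": [1, 2, 2, 3, 4],
        "age": [34.0, 51.0, 51.0, np.nan, 29.0],
        "segment": ["basic", "premium", "premium", "basic", None],
    }
)

# Count missing values per column before cleaning.
missing_before = df.isna().sum()

# Drop exact duplicate rows (customer 2 appears twice).
clean = df.drop_duplicates().copy()

# Fill numeric gaps with the column median and categorical gaps with a label.
clean["age"] = clean["age"].fillna(clean["age"].median())
clean["segment"] = clean["segment"].fillna("unknown")

print(missing_before)
print(clean)
```

Whether to fill, drop, or flag missing values depends on the dataset and the model; the median and placeholder label here are only one reasonable default.
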
Client: Jupyter Notebook, Python, scikit-learn, NumPy, pandas, Matplotlib
This project was developed for the following companies:
- Cognizant
- Forage