emmanueliarussi/DataScienceCapstone

Data Science Capstone

Intro to DSC

A Python notebook exploring the Anscombe's quartet dataset.

A Python notebook performing sentiment analysis on tweets.
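
As a rough illustration of why Anscombe's quartet is a common starting point, the sketch below computes summary statistics for the four datasets using only the standard library. It is not taken from the notebook itself; the helper names `pearson` and `summarize` are made up for this example, and the values are the published quartet data.

```python
from statistics import mean, variance

# Anscombe's quartet: datasets I-III share the same x values.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson(xs, ys):
    """Pearson correlation coefficient (hypothetical helper for this sketch)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def summarize(xs, ys):
    """Collect the statistics that are nearly identical across the quartet."""
    return {
        "mean_x": mean(xs),
        "var_x": variance(xs),  # sample variance
        "mean_y": mean(ys),
        "var_y": variance(ys),
        "r": pearson(xs, ys),
    }


SUMMARY = {name: summarize(xs, ys) for name, (xs, ys) in QUARTET.items()}
for name, stats in SUMMARY.items():
    print(name, {key: round(value, 2) for key, value in stats.items()})
```

The four rows print nearly the same numbers even though the datasets look very different when plotted, which is the usual argument for visualizing data before modeling it.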

Data Extraction

A Python notebook showing how to work with raw text files from NASA.

A Python notebook showing how to work with Excel spreadsheet files containing California SAT test results.

A Python notebook showing how to work with HTML tables containing data from the Texas Department of Criminal Justice.

A Python notebook showing how to work with JSON tables containing data from Amazon Musical Instruments Reviews.
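
To give a flavor of the JSON case, here is a minimal sketch that reads one JSON record per line and averages ratings per product. It is an assumption that the review file is laid out this way; the field names `asin` and `overall`, the sample records, and the function `mean_rating_by_product` are illustrative and not taken from the notebook.

```python
import json
from collections import defaultdict

# Made-up sample records, one JSON object per line (assumed layout).
SAMPLE = """\
{"asin": "B0001", "overall": 5.0, "reviewText": "Great strings."}
{"asin": "B0001", "overall": 3.0, "reviewText": "Decent for the price."}
{"asin": "B0002", "overall": 4.0, "reviewText": "Solid cable."}
"""


def mean_rating_by_product(lines):
    """Average the 'overall' rating for each product id ('asin')."""
    totals = defaultdict(lambda: [0.0, 0])  # asin -> [rating sum, review count]
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        entry = totals[record["asin"]]
        entry[0] += record["overall"]
        entry[1] += 1
    return {asin: total / count for asin, (total, count) in totals.items()}


ratings = mean_rating_by_product(SAMPLE.splitlines())
print(ratings)
```

With a real file, the same function could be fed an open file handle instead of `SAMPLE.splitlines()`, since it only iterates over lines.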

Data Wrangling

A Python notebook demonstrating the pandas DataFrame.

A Python notebook demonstrating the pandas DataFrame.

A Python notebook demonstrating the pandas DataFrame pivot function.

A Python notebook demonstrating outlier detection.
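
Outlier detection can take many forms, and the method used in the notebook is not stated here. As one common possibility, the sketch below applies the interquartile range (IQR) rule with the standard library; the function name `iqr_outliers`, the multiplier `k`, and the sample readings are assumptions made for illustration.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = quantiles(values, n=4)  # first quartile, median, third quartile
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Made-up readings with one obviously extreme value.
readings = [10, 12, 11, 13, 12, 11, 10, 95]
print(iqr_outliers(readings))
```

The multiplier `k=1.5` is the conventional default for this rule; a larger value flags fewer points.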

Data Mining

A Python notebook demonstrating basic linear regression using Scikit-Learn.

A Python notebook demonstrating classification in Scikit-Learn.

A Python notebook demonstrating clustering in Scikit-Learn.
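
The notebooks above use Scikit-Learn. For readers without it installed, the following standard-library sketch shows what an ordinary least-squares fit with a single feature computes. The function `fit_line` and the sample data are hypothetical and are not the notebook's code.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept for one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and the x/y cross-deviation sum.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up points lying exactly on y = 2x + 1.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
slope, intercept = fit_line(xs, ys)
print(f"y = {slope:.2f} * x + {intercept:.2f}")
```

Because the sample points lie on a straight line, the fit recovers the slope and intercept exactly; real data would leave residuals to inspect.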

Data Analytics & Visualization

A Python notebook demonstrating the use of Matplotlib, Pandas Visualization and Seaborn.

A Python notebook demonstrating the use of Plotly to create interactive visualizations.

Midterm Projects

The goal of this project is to characterize song clusters through their musical attributes.

The goal of this project is to cluster and identify trends in beer preferences.

The goal of this project is to build a model that identifies covid-19 and pneumonia in chest X-Ray images.

The goal of this project is to build a model to identify fake news.

The goal of this project is to understand which of the variables under study drive the price of homes in Boston.

The goal of this project is to build a model that predicts NBA salaries.

The goal of this project is to build a model able to predict the diagnosis of breast cancer tissues as malignant or benign.

The goal of this project is to identify and describe the various customer segments hidden in the data.

The goal of this project is to predict flight delays in the month of January.

The goal of this project is to build a model that predicts the IMDB rating score based on movie attributes.

The goal of this project is to understand which features make a song popular.

The goal of this project is to identify hierarchies of clusters of US states according to violent crime rates data.

The goal of this project is to cluster and identify trends in wine preferences.

The goal of this project is to build a model to classify audio tracks by genre.

The goal of this project is to build a chatbot using predefined input patterns and responses.

The goal of this project is to build a model able to recognize fraudulent credit card transactions.

The goal of this project is to build a model able to predict color names.

The goal of this project is to build a model that identifies traffic sign images.

Final Projects

Your task is to bring law enforcement up to date on the current organization of the Protectors of Kronos and how that organization has changed over time, as well as to characterize the events surrounding the disappearance of several employees of GAStech.

As a data scientist expert assisting law enforcement, your mission is to make sense of geospatial data to identify suspicious patterns of behavior and to prioritize which of these may be related to the missing staff members of GAStech.

You are being asked to characterize the movement of groups and individuals at DinoFun World park, with a special emphasis on what might be relevant to better understanding the incident that occurred in June 2014.

You will have to dive into communication registers over time that took place among DinoFun World park visitors using the park app. Linkages between visitors and among park patrons and park staff could reveal behaviors of interest.

As an expert in data analytics, you have been hired to help GAStech understand its operations data. In this task, you are given two weeks of building and prox sensor data. Can you identify typical patterns and issues of concern?

Characterize the vehicle patterns of life at the Boonsong Lekagul Nature Preserve and find the link between the traffic going through the preserve and the decline in the nesting Rose-crested Blue Pipit.

Your task, as supported by data analytics that you apply, is to detangle air sampling data monitored by nine different sensors in the preserve, in order to determine where problems may be affecting the decline in the nesting Rose-crested Blue Pipit.

Your task is to identify features that change over time via multispectral image analysis, with focus on changes that are occurring that may provide clues to the problems with the Pipit bird.

Using a bird call collection and a map of the Wildlife Preserve, you are asked to characterize the patterns of all of the bird species in the Preserve over the time of the collection, and then to verify or refute Kasios’ Company claims.

Your task is to investigate the hydrology data from across the Preserve to see if it could make up for the soil evidence that was destroyed.

An unexpected source suggests the extent of the Kasios involvement in illicit activities may be much broader than just Mistford and the Wildlife Preserve. Data analysis of some challenging data may provide further insights on the scope of their activities.

Combining seismic readings of an earthquake, responses from an app, and background knowledge of the city, help the city of St. Himark triage their efforts for rescue and recovery.

Help the city’s government and emergency management officials to understand if there is a risk to the public while also responding to other emerging crises related to the earthquake as well as satisfying the public’s concern over radiation.

Using your data analytics skills, help the city of St. Himark to analyze Y*INT messages in order to determine the appropriate actions it must take in order to assist the community in this disaster.

CGCS research has resulted in the creation of profiles of typical white hat groups. One such profile has been identified by CGCS sociopsychologists as most likely to resemble the structure of the group involved in this accidental shutdown. You are asked to examine CGCS records and identify those groups that most closely resemble the identified profile.

Further analysis of the cyber event has given a strong indication that a subgroup of eight individuals were behind the bug. The CGCS has received a tip that this group rarely meets in person and instead they use a special item—a totem—as a secret signal of their affiliation. Investigators at the CGCS need your help analyzing the contents of those records to look for clues to the identity of the eight individuals.
