Data Wrangling Process / Data Gathering from Web Scraping, URL, API

SooyeonWon/Twitter_data_analysis

Twitter Data Wrangling

Data wrangling project based on Twitter data

by Sooyeon Won

Keywords

  • Gathering data from different sources: Flat file, URL, API
  • Handling Data Quality and Tidiness Issues
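
As a rough illustration of the three gathering routes named above, the following standard-library sketch reads a flat file, downloads a file from a URL, and parses tweet JSON of the kind an API returns. The function names, the sample rows, and the one-JSON-object-per-line layout are assumptions made for illustration; they are not taken from the project's notebook.

```python
import csv
import io
import json
import urllib.request


def read_flat_file(text):
    """Parse CSV text (the contents of a flat file) into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def download(url, timeout=30):
    """Fetch a file from a URL and return its decoded text.

    Not called below, so the sketch runs without network access.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def parse_tweet_json_lines(text):
    """Keep only the fields of interest from tweets stored one JSON object per line."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines between records
        tweet = json.loads(line)
        rows.append({
            "tweet_id": tweet["id"],
            "retweet_count": tweet["retweet_count"],
            "favorite_count": tweet["favorite_count"],
        })
    return rows


# Hypothetical in-memory samples standing in for the real files.
sample_csv = "tweet_id,name\n1,Rex\n2,Bo\n"
sample_json = (
    '{"id": 1, "retweet_count": 5, "favorite_count": 9}\n'
    '{"id": 2, "retweet_count": 0, "favorite_count": 3}\n'
)

archive_rows = read_flat_file(sample_csv)
api_rows = parse_tweet_json_lines(sample_json)
print(len(archive_rows), len(api_rows))
```

In practice the API step would go through an authenticated client library, and the three sources would then be joined on the tweet ID before cleaning.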

Summary of Findings

This project focuses mainly on how I, as a data analyst, obtain suitable data. I collected datasets from a Udacity URL, from directly downloaded flat files, and through the Twitter API. In the second part, I wrote a short report that answers the following questions based on the obtained datasets.

  • The popularity of each dog "stage" (i.e. doggo, floofer, pupper, and puppo)
  • The method used to access Twitter
  • Retweet and favorite counts, as an indicator of tweet popularity
  • Relationship between retweet_count and favorite_count
  • The proportion of image predictions whose first prediction is a dog
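
A minimal sketch of how some of these questions could be computed once the data is cleaned, using only the standard library. The field names `stage` and `p1_dog` and the sample records are hypothetical; `retweet_count` and `favorite_count` are the names used above. Reading the last question as "the top image prediction is a dog" is also an assumption.

```python
import math
from collections import Counter


def stage_counts(records):
    """Count tweets per dog stage, ignoring records with no stage."""
    return Counter(r["stage"] for r in records if r["stage"])


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def dog_first_share(records):
    """Share of records whose top image prediction is flagged as a dog."""
    return sum(1 for r in records if r["p1_dog"]) / len(records)


# Hypothetical cleaned records, for illustration only.
records = [
    {"stage": "pupper", "retweet_count": 10, "favorite_count": 35, "p1_dog": True},
    {"stage": "doggo", "retweet_count": 20, "favorite_count": 50, "p1_dog": True},
    {"stage": "pupper", "retweet_count": 30, "favorite_count": 95, "p1_dog": False},
    {"stage": None, "retweet_count": 40, "favorite_count": 110, "p1_dog": True},
]

counts = stage_counts(records)
r = pearson(
    [rec["retweet_count"] for rec in records],
    [rec["favorite_count"] for rec in records],
)
share = dog_first_share(records)
print(counts.most_common(), round(r, 3), share)
```

A coefficient near 1 would indicate that tweets retweeted more often also tend to be favorited more often; the sample values here are made up and say nothing about the real dataset.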

References

Getting Twitter Data in Python
Accessing the Twitter API with Python
Learn Python by analyzing Donald Trump’s tweets
