yintellect/data-wrangling-r

Data Wrangling with R

More tutorials can be found here

Data wrangling, sometimes referred to as data munging, is the process of transforming and mapping data from one "raw" data form into another format with the intent of making it more appropriate and valuable for a variety of downstream purposes such as analytics.

This repository presents four wrangling projects on numeric and text data that turn unstructured data into insights.

You can view the code and my write-up by clicking the link in each project title below.

  • Wrangle a data set posted on U.S. Chronic Disease Indicators (CDI)
  • Produce some summary statistics
  • Visualize the correlation between binge drinking prevalence and poverty in U.S. States.
  • Build a pipeline on a World Bank data set
  • Explore the relationship between infant mortality and GDP per capita over time
  • Group data by region/country to compare with the overall regression
  • Select words with basic rules
  • Select words with regular expression
  • Basic frequency summary
  • Extract multiple formats of data from strings (noun, time, number, etc.)
  • Manipulate natural language strings without preset tokenization.
  • Visualize and generate insights from text analysis.
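
As a rough illustration of the text-wrangling steps listed above (selecting words with a regular expression, a basic frequency summary, and extracting numbers from strings), here is a minimal base R sketch. The sample sentence and all variable names are assumptions made for this example; they are not taken from the projects' code or data.

```r
# Hypothetical sample text; not taken from the repository's data.
text <- "Data wrangling turns raw data into tidy data. Wrangling takes time: 3 steps, 2 tools."

# Split on runs of non-alphanumeric characters, then lower-case the tokens.
tokens <- tolower(unlist(strsplit(text, "[^A-Za-z0-9]+")))
tokens <- tokens[nzchar(tokens)]

# Select words with a regular expression: tokens starting with "w" or "d".
selected <- grep("^[wd]", tokens, value = TRUE)

# Extract the numbers that appear in the original string.
numbers <- as.numeric(regmatches(text, gregexpr("[0-9]+", text))[[1]])

# Basic frequency summary, most frequent token first.
freq <- sort(table(tokens), decreasing = TRUE)
print(head(freq, 3))
```

Base R is used here only to keep the sketch dependency-free; the projects themselves may rely on other packages.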

About

Data wrangling into real-life insights with R
