Repository used for Coursera Course Getting and Cleaning Data

MikkelPorse/CleaningData

## Getting and Cleaning Data - Course Project

This repository contains

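  • this README.MD
  • run_analysis.R, the R script that generates tidy_data.csv.txt
  • tidy_data.csv.txt, the tidy dataset produced by the script
  • codebook.MD, which describes the columns of the tidy dataset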

### run_analysis.R

The R script that generates the tidy_data.csv.txt file. It reads the raw data from the UCI HAR dataset, http://archive.ics.uci.edu/ml/machine-learning-databases/00240/UCI%20HAR%20Dataset.zip , and reduces the 561 measurements (and derivations) to 81 by selecting only the mean and standard deviation measurements. It merges the training and test parts into one set, groups the measurements by Subject and Activity, and computes the average of each measurement for each group. The result is stored in the variable "tidySet".
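The steps described above (merge, select, group, average) can be sketched in base R on a toy example. This is only an illustration under stated assumptions, not the contents of run_analysis.R: the data values are made up, the names `train`, `test`, `merged`, `selected` and `tidy_sketch` are hypothetical, and the pattern "mean|std" is an assumed way of picking the mean and standard deviation columns.

```r
# Toy stand-ins for the training and test parts (values are made up).
train <- data.frame(Subject = c(1, 1, 2),
                    Activity = c("WALKING", "WALKING", "SITTING"),
                    `tBodyAcc-mean()-X` = c(0.2, 0.4, 0.1),
                    `tBodyAcc-max()-X` = c(0.9, 0.8, 0.7),
                    check.names = FALSE)
test <- data.frame(Subject = c(2, 3),
                   Activity = c("SITTING", "WALKING"),
                   `tBodyAcc-mean()-X` = c(0.3, 0.5),
                   `tBodyAcc-max()-X` = c(0.6, 0.5),
                   check.names = FALSE)

# Merge the two parts into one set.
merged <- rbind(train, test)

# Keep only mean and standard deviation measurements (assumed pattern).
keep <- grepl("mean|std", names(merged))
selected <- merged[, c("Subject", "Activity", names(merged)[keep])]

# Average every remaining measurement per Subject/Activity group.
measure_cols <- setdiff(names(selected), c("Subject", "Activity"))
tidy_sketch <- aggregate(selected[, measure_cols, drop = FALSE],
                         by = list(Subject = selected$Subject,
                                   Activity = selected$Activity),
                         FUN = mean)
```

On the toy data this yields one row per Subject/Activity combination that actually occurs, with the non-mean, non-std column dropped.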

Data is read as-is from the UCI HAR dataset; averaging over subject and activity is the only data transformation applied. The features have been renamed, however: the 't' and 'f' prefixes have been expanded to 'TIME' and 'FREQUENCY' respectively, hyphens have been replaced with underscores, and parentheses have been removed.
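The renaming rules can be illustrated with a small helper. This is a minimal sketch, not the script's own code: `rename_feature` is a hypothetical name and the regular expressions are assumptions. Whether the real script puts a separator after the expanded prefix is not stated here; codebook.MD lists the actual column names.

```r
# Hypothetical helper illustrating the renaming rules described above.
rename_feature <- function(x) {
  x <- sub("^t", "TIME", x)             # expand a leading 't'
  x <- sub("^f", "FREQUENCY", x)        # expand a leading 'f'
  x <- gsub("-", "_", x, fixed = TRUE)  # hyphens become underscores
  x <- gsub("[()]", "", x)              # drop parentheses
  x
}

renamed <- rename_feature(c("tBodyAcc-mean()-X", "fBodyGyro-std()-Z"))
print(renamed)
# "TIMEBodyAcc_mean_X" "FREQUENCYBodyGyro_std_Z"
```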

Every variable is in its own column, each observation is in a separate row, and the data are all of the same kind, so the output conforms to our notion of tidiness.

Usage:

  • download the script
  • set the working directory to the "UCI HAR Dataset" folder
  • run the script
  • the output is a variable tidySet with 180 observations (30 subjects × 6 activities) and 81 variables

### tidy_data.csv.txt

The output of run_analysis.R saved to csv, using the command write.csv(tidySet, file="tidy_data.csv.txt", row.names=F)

This file contains a summarized, tidy dataset based on the Samsung UCI HAR dataset. The data is a comma-separated list of measurement averages, grouped by subject and activity.
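A file written this way can be read back with read.csv. The round trip below is a sketch on a made-up stand-in for tidySet, written to a temporary file; the names `toy`, `path` and `reloaded` are hypothetical, and for the real file the path would simply be "tidy_data.csv.txt".

```r
# Made-up stand-in for tidySet, small enough to inspect by eye.
toy <- data.frame(Subject = c(1, 2),
                  Activity = c("WALKING", "SITTING"),
                  TIMEBodyAcc_mean_X = c(0.3, 0.2))

# Write it the same way as described above, but to a temporary file.
path <- tempfile(fileext = ".csv.txt")
write.csv(toy, file = path, row.names = FALSE)

# Read it back; the first line of the file supplies the column names.
reloaded <- read.csv(path)
```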

### codebook.MD

This file describes the individual columns in the tidySet / tidy_data.csv.txt.
