Project created with Jupyter Notebook on the HAR dataset, focused on decision making.

KrzysiekJa/ML-project-on-HAR-Dataset

Analysis of the Human Activity Recognition Dataset

Author: Krzysztof Jarek

This project is based on the dataset prepared by Jorge L. Reyes-Ortiz, which contains samples from smartphone gyroscopes and accelerometers, together with labels corresponding to one of the user's states: WALKING, WALKING_UPSTAIRS, WALKING_DOWNSTAIRS, SITTING, STANDING, LAYING. Link to the dataset: UCI HAR Dataset

Project purpose

My goal was to use the data to find the best model for making decisions based on it. I therefore tested an SVM model, to check how well the dataset creator's choice of SVM works, and Logistic Regression, to see which of the two fits and performs better at determining the physical state of the device user from the input data stream.

Author's article: article

Dataset

Information about the dataset is in the UCI HAR Dataset folder.
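As a minimal sketch of reading the prepared feature set, the helper below loads one split with NumPy. It assumes the standard layout of the public UCI HAR Dataset archive (`train/X_train.txt`, `train/y_train.txt`, and the same under `test/`); the function name `load_split` and the `ACTIVITY_NAMES` mapping are illustrative and not taken from the notebook.

```python
from pathlib import Path

import numpy as np

# Assumed mapping from numeric label to activity name, following the order
# of the states listed above (hypothetical helper, not from the notebook).
ACTIVITY_NAMES = {
    1: "WALKING",
    2: "WALKING_UPSTAIRS",
    3: "WALKING_DOWNSTAIRS",
    4: "SITTING",
    5: "STANDING",
    6: "LAYING",
}


def load_split(root, split):
    """Load features and labels for one split ('train' or 'test').

    Assumes whitespace-separated text files named X_<split>.txt and
    y_<split>.txt inside <root>/<split>/.
    """
    root = Path(root)
    X = np.loadtxt(root / split / f"X_{split}.txt")
    y = np.loadtxt(root / split / f"y_{split}.txt", dtype=int)
    return X, y


# Example usage (requires the dataset folder to be present):
# X_train, y_train = load_split("UCI HAR Dataset", "train")
# X_test, y_test = load_split("UCI HAR Dataset", "test")
```

Keeping the loader separate from the modeling code makes it easy to swap the prepared features for another set later.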

Used tools

- SVC
- LogisticRegression
- PCA
- StandardScaler
- GridSearchCV, RandomizedSearchCV
- Pipeline
- OneVsRestClassifier
- roc_curve, label_binarize

All of these come from the sklearn library. I also used the NumPy and Matplotlib libraries.
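To illustrate how `OneVsRestClassifier`, `label_binarize`, and `roc_curve` from the list above can fit together for a six-class problem, here is a hedged sketch. It runs on synthetic data from `make_classification` rather than the HAR features, and the choice of `LogisticRegression` as the base estimator and all parameter values are assumptions, not the notebook's settings.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import auc, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import label_binarize

# Synthetic stand-in for the six activity classes (assumption: not HAR data).
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=8,
    n_classes=6, n_clusters_per_class=1, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

classes = np.unique(y)

# One binary classifier per class; each yields a per-class decision score.
clf = OneVsRestClassifier(LogisticRegression(max_iter=1000))
clf.fit(X_train, y_train)
scores = clf.decision_function(X_test)  # shape: (n_samples, n_classes)

# Binarize the true labels so every class gets its own 0/1 column.
y_test_bin = label_binarize(y_test, classes=classes)

# Compute one ROC curve (and its area) per class.
roc_auc = {}
for i, label in enumerate(classes):
    fpr, tpr, _ = roc_curve(y_test_bin[:, i], scores[:, i])
    roc_auc[int(label)] = auc(fpr, tpr)

for label, value in roc_auc.items():
    print(f"class {label}: AUC = {value:.3f}")
```

Each `fpr`/`tpr` pair can be passed to Matplotlib's `plt.plot` to draw the per-class curves.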

Modeling steps

  1. First, I loaded the sets and decomposed the set used for modeling with PCA.
  2. Then I plotted the data, decomposed to 24 dimensions, together with their labels.
  3. Then I trained the SVC model on the data in a few configurations using cross-validation.
  4. On the same data I trained a Logistic Regression model.
  5. Finally, I produced a classification report for the two best models.
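The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the notebook's code: it uses synthetic data instead of the HAR features, places `StandardScaler` before `PCA` (the order is an assumption), and the hyperparameter grids and the helper name `build_pipeline` are made up for the example. Only the 24 PCA components come from the description above.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic six-class data standing in for the prepared HAR features.
X, y = make_classification(
    n_samples=600, n_features=60, n_informative=20,
    n_classes=6, n_clusters_per_class=1, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)


def build_pipeline(model):
    """Scale, reduce to 24 dimensions with PCA, then classify."""
    return Pipeline([
        ("scale", StandardScaler()),
        ("pca", PCA(n_components=24)),
        ("model", model),
    ])


# Illustrative grids; the values actually searched in the notebook are not known.
searches = {
    "SVC": GridSearchCV(
        build_pipeline(SVC()),
        {"model__C": [1, 10], "model__kernel": ["linear", "rbf"]},
        cv=3,
    ),
    "LogisticRegression": GridSearchCV(
        build_pipeline(LogisticRegression(max_iter=1000)),
        {"model__C": [0.1, 1, 10]},
        cv=3,
    ),
}

# Cross-validated search for each model, then a report on held-out data.
reports = {}
for name, search in searches.items():
    search.fit(X_train, y_train)
    reports[name] = classification_report(y_test, search.predict(X_test))
    print(name, search.best_params_, round(search.best_score_, 3))
    print(reports[name])
```

Putting PCA inside the pipeline means it is refit on each cross-validation fold, which avoids leaking information from the validation fold into the decomposition.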

Conclusion

In the grid search, the SVC model achieved better mean results but was much slower to compute. In the final classification report, SVC also scored better.

Misses

At first I tried to use a decomposed set of raw instrument data as the training set, but without well-made signal preprocessing it gave low results (best: 0.67 accuracy for SVM), so I used the set already prepared by the researchers. I also tried a Hidden Markov Model using the hmmlearn library, but after a long struggle I concluded that the tools are not made for my case, in which I wanted to use the label set.
