Module 1 of Scalable Data Science and Distributed Machine Learning

Module 1 – Introduction to Data Science: Introduction to fault-tolerant distributed file systems and computing.

The whole data science process, illustrated with industrial case studies. A practical introduction to scalable data processing to ingest, extract, load, transform, and explore (un)structured datasets. Scalable machine learning pipelines to model, train/fit, validate, select, tune, test, and predict or estimate in unsupervised and supervised settings using nonparametric and partitioning methods such as random forests. Introduction to distributed vertex-programming.

Repository contents (29 commits):
day1
day2
README.md
_config.yml

JAGulin/module-1
