Skip to content

Based on the hugely popular Data Science Toolbox, but with Apache Spark included

Notifications You must be signed in to change notification settings

pamdla/DataScienceWithSpark

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 

Repository files navigation

DataScienceWithSpark

Based on the hugely popular Data Science Toolbox, but with Apache Spark included. To use just clone this repository to obtain the Vagrantfile.

git clone https://github.com/choerst/DataScienceWithSpark.git

For everything to work, you will need to install Vagrant (https://www.vagrantup.com/downloads.html) and Virtualbox (https://www.virtualbox.org/wiki/Downloads) on your machine.

Navigate to the directory containing the VagrantFile and execute

vagrant up

Afterwards, ssh into the machine and start the Ipython Notebook

vagrant ssh

Inside the vagrant box execute

/home/vagrant/startIpython.sh

Navigate your regular browser to: https://127.0.0.1:8888/

The default password is 'science', feel free to change it by invoking the command (Inside the vagrant box)

dst setup base

For more documentation, feel free to visit the page of the original creators: http://datasciencetoolbox.org/

About

Based on the hugely popular Data Science Toolbox, but with Apache Spark included

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published