Spark MapReduce Lab

Description:

This introductory lab uses PySpark to run rudimentary MapReduce jobs. It assumes prior knowledge of Python and of the MapReduce concept, and that Spark is already running on a Hadoop cluster; installation details are omitted.
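
As a quick refresher on the MapReduce concept this lab builds on, here is a minimal pure-Python sketch of the map, shuffle, and reduce phases. It is not taken from the lab and does not use Spark. The input records and the names `mapper`, `shuffle`, and `reducer` are hypothetical, chosen only for illustration. In PySpark, the map phase roughly corresponds to `RDD.map`, and the shuffle and reduce phases together to `RDD.reduceByKey`.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical input: one "city,temperature" reading per line.
lines = ["paris,21", "oslo,12", "paris,25", "oslo,9", "lima,19"]


def mapper(line):
    """Map phase: turn one raw line into a (key, value) pair."""
    city, temp = line.split(",")
    return city, int(temp)


def shuffle(pairs):
    """Shuffle phase: group all values that share the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reducer(a, b):
    """Reduce phase: combine two values for the same key (keep the larger)."""
    return max(a, b)


mapped = map(mapper, lines)
grouped = shuffle(mapped)
max_temps = {city: reduce(reducer, temps) for city, temps in grouped.items()}

print(sorted(max_temps.items()))
# [('lima', 19), ('oslo', 12), ('paris', 25)]
```

On a cluster, Spark distributes these phases across workers; the sketch above runs them sequentially in a single process purely to show the data flow.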

This was written for the class CSSE434 as a part of our research project.

Instructions:

To work through this lab, please clone the repo. To do so on the command line, execute the following:

  $ git clone https://github.com/lamdaV/SparkMapReduceLab.git

Once the project has been cloned, read through the introduction and work through its example. Then attempt the wordCountTask and friendsListTask.

What's Next?

If you would like to learn more about Spark and what it is capable of, try checking out the Spark Machine Learning Lab.
