SparkDataProject

This Spark Scala project analyzes climate data.
It is a beginner's project for running several algorithms on real datasets.

## Project Setup

Clone the repository from its URI using the "git clone" command.

## Getting the dataset

The project includes a dataset in the subfolder <Project_Home>/data/climate.

It contains climate data (precipitation, and average, maximum and minimum temperature) for a 115-year range (1901 - 2016)
for the city of Chennai, India (https://en.wikipedia.org/wiki/Chennai).
More climate datasets are available at https://www.ncdc.noaa.gov/cdo-web/datasets -> "Daily Summaries" -> "Search Tool".
Follow the instructions there to select the city, date range and metrics needed.
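
As a quick check that the dataset is readable, the following is a minimal sketch of loading it into a Spark DataFrame and printing its schema. It makes several assumptions that this README does not confirm: that the files under data/climate are CSV with a header row, that the program is run from <Project_Home>, and that Spark SQL is on the classpath. The object name `ClimatePreview` is hypothetical and is not part of the project.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical helper, not part of the project: previews the bundled dataset.
object ClimatePreview {
  def main(args: Array[String]): Unit = {
    // Assumption: the dataset path is relative to the project home;
    // pass a different path as the first argument if needed.
    val path = args.headOption.getOrElse("data/climate")

    // Run Spark locally using all available cores.
    val spark = SparkSession.builder()
      .appName("ClimatePreview")
      .master("local[*]")
      .getOrCreate()

    try {
      // Assumption: the files are CSV with a header row.
      // inferSchema asks Spark to guess column types from the data.
      val climate = spark.read
        .option("header", "true")
        .option("inferSchema", "true")
        .csv(path)

      climate.printSchema()   // column names and inferred types
      climate.show(10, false) // first 10 rows, without truncating values
    } finally {
      spark.stop()
    }
  }
}
```

If the files use a different format or have no header row, adjust the reader options accordingly. Data downloaded from the NOAA search tool can be previewed the same way by passing its path as the first argument.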