ssaporito/BigDataTime

 
 

BigDataTime

[College Project] Big Data project using Spark for the Big Data class at UFRJ

Dependencies

Install the following dependencies to build and run the project: Apache Spark (for spark-submit), sbt, Maven, and Node.js with npm.

Project

SOM

Self-organizing Map

  • Location: src/som
  • Language: Scala
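For readers unfamiliar with self-organizing maps, the core training step can be sketched as below. This is only an illustration of the standard algorithm on a 1-D map, written in plain JavaScript; it is not the project's Scala/Spark code, and the function names (squaredDistance, bestMatchingUnit, trainStep) and parameters are hypothetical.

```javascript
// Illustrative sketch only; the real implementation lives in src/som (Scala).

// Squared Euclidean distance between two equal-length vectors.
function squaredDistance(a, b) {
  return a.reduce((sum, ai, i) => sum + (ai - b[i]) ** 2, 0);
}

// Index of the unit whose weight vector is closest to the input.
function bestMatchingUnit(weights, input) {
  let best = 0;
  for (let i = 1; i < weights.length; i += 1) {
    if (squaredDistance(weights[i], input) < squaredDistance(weights[best], input)) {
      best = i;
    }
  }
  return best;
}

// One training step on a 1-D map: every unit moves toward the input,
// scaled by a Gaussian neighborhood centered on the best-matching unit.
function trainStep(weights, input, learningRate, radius) {
  const bmu = bestMatchingUnit(weights, input);
  return weights.map((w, i) => {
    const influence = Math.exp(-((i - bmu) ** 2) / (2 * radius ** 2));
    return w.map((wj, j) => wj + learningRate * influence * (input[j] - wj));
  });
}

const updated = trainStep([[0, 0], [1, 1]], [1, 1], 0.5, 1);
console.log(updated);
```

In practice the learning rate and radius are usually decreased over the course of training so the map settles.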

BTC Variation Calculator with DataFrame

This program reads the Bitcoin valuation CSV file and calculates the variation from one day to the next. It also uses a DataFrame, so the variation of the currency can be obtained by calling the method getVariation() with the date as a parameter.

  • Location: src/variation
  • Language: Java

Tools

news-finder

This program retrieves news from a few websites and saves it to disk.

  • Location: tools/news-finder
  • Language: JavaScript

bitcoin-market-price-downloader

This program retrieves the Bitcoin price history and saves it as a CSV file.

  • Location: tools/bitcoin-market-price-downloader
  • Language: JavaScript
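As a rough illustration of the "save it as a CSV file" step, the sketch below turns a list of price records into CSV text. The record shape (`timestamp` in seconds since the Unix epoch, `price` as a number), the `date,price` header, and the function name toCsv are all assumptions; the download itself is omitted because the README does not name the data source.

```javascript
// Illustrative sketch only; the real tool lives in tools/bitcoin-market-price-downloader.

// Convert hypothetical { timestamp, price } records into CSV text.
function toCsv(records) {
  const header = 'date,price';
  const lines = records.map(({ timestamp, price }) => {
    // toISOString() yields e.g. "1970-01-02T00:00:00.000Z"; keep the date part.
    const date = new Date(timestamp * 1000).toISOString().slice(0, 10);
    return `${date},${price}`;
  });
  return [header, ...lines].join('\n') + '\n';
}

const csv = toCsv([
  { timestamp: 0, price: 1.5 },
  { timestamp: 86400, price: 2 },
]);
console.log(csv);
// The resulting string could then be written with Node's fs.writeFileSync(path, csv).
```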

Build

SOM

cd src/som/
sbt package

BTC Variation Calculator with DataFrame

cd src/variation/
mvn package

To create a jar that bundles the application with its dependencies, add this plugin to your pom.xml file:

        <plugin>
            <artifactId>maven-assembly-plugin</artifactId>
            <configuration>
                <archive>
                    <manifest>
                        <mainClass>fully.qualified.MainClass</mainClass>
                    </manifest>
                </archive>
                <descriptorRefs>
                    <descriptorRef>jar-with-dependencies</descriptorRef>
                </descriptorRefs>
            </configuration>
        </plugin>

and call:

cd src/variation
mvn clean compile assembly:single

news-finder tool

cd tools/news-finder/
npm install

bitcoin-market-price-downloader

cd tools/bitcoin-market-price-downloader/
npm install

Run

SOM

cd src/som/
<your-spark-folder>/bin/spark-submit target/scala<version>/som_project_<version>.jar

BTC Variation Calculator with DataFrame

cd src/variation
<your-spark-folder>/bin/spark-submit VBBigData-1.0-SNAPSHOT.jar variation "date(yyyy-mm-dd)"

news-finder tool

Download all the news from all the available sites

cd tools/news-finder/
./run.sh

If you can't execute the script, make it executable (on Linux):

chmod u+x ./run.sh

Download specific sites using the command

node index.js -s <site>

You can also choose the keyword and the first and last pages:

node index.js -s <site> -k <keyword> -f <from-page> -t <to-page>

Need help? Use the -h parameter

node index.js -h
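To show how the options above fit together, here is a minimal sketch of parsing them by hand. The function parseArgs and the resulting property names (site, keyword, fromPage, toPage) are hypothetical; the actual index.js may well use a command-line library and different internals.

```javascript
// Illustrative sketch only; not the tool's actual argument handling.

// Flags that take a value, mapped to hypothetical option names.
const VALUE_FLAGS = new Map([
  ['-s', 'site'],
  ['-k', 'keyword'],
  ['-f', 'fromPage'],
  ['-t', 'toPage'],
]);

// Parse an argument list such as process.argv.slice(2) into an options object.
function parseArgs(argv) {
  const options = { help: false };
  for (let i = 0; i < argv.length; i += 1) {
    const flag = argv[i];
    if (flag === '-h') {
      options.help = true;
      continue;
    }
    if (!VALUE_FLAGS.has(flag)) {
      throw new Error(`Unknown option: ${flag}`);
    }
    const value = argv[i + 1];
    if (value === undefined) {
      throw new Error(`Missing value for ${flag}`);
    }
    const name = VALUE_FLAGS.get(flag);
    // Page bounds are numeric; site and keyword stay as strings.
    options[name] = name.endsWith('Page') ? Number(value) : value;
    i += 1; // skip the value that was just consumed
  }
  return options;
}

const options = parseArgs(['-s', 'example-site', '-k', 'bitcoin', '-f', '1', '-t', '5']);
console.log(options);
```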

bitcoin-market-price-downloader

Just run it using

cd tools/bitcoin-market-price-downloader/
node index.js
