Distributed cluster execution of Apache Hadoop YARN SLS and centralized processing of results. (Server-Client model)

msSarriman/iMS_Thesis_HadoopYARNslsAutomationOnCluster

Apache Hadoop YARN SLS, Server-Client model

The purposes of this Thesis project are the following:

  • Distributed cluster execution of YARN SLS
  • Centralized processing of the simulation results
  • Decentralized inspection of experiment progress

Architecture Features

  • The Server assigns experiments from a built-in queue, created from the values in the values.exp file.
  • An echo heartbeat over UDP lets the Server check on each client; the Server reissues an experiment if its client stops echoing for a period of time.
  • The Server keeps a log file so that the experiment queue can be recovered after a fatal shutdown.
  • The Server creates plots to support decision making on the experiment results.
  • TCP is used for message and file transfer between server and client.
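
The queue and heartbeat behaviour listed above can be sketched roughly as follows. This is a minimal illustration and not the code in server.py: the class name ExperimentDispatcher, its methods, and the 30-second timeout are all assumptions made for the example, and the UDP transport is left out so that only the bookkeeping is shown.

```python
import time
from collections import deque


class ExperimentDispatcher:
    """Hypothetical helper: hands out experiments and reissues silent ones."""

    def __init__(self, experiments, timeout_s=30.0, clock=time.monotonic):
        self.pending = deque(experiments)  # experiments not yet assigned
        self.running = {}                  # client id -> (experiment, last echo time)
        self.timeout_s = timeout_s         # assumed silence threshold, in seconds
        self.clock = clock                 # injectable so the logic can be tested

    def assign(self, client_id):
        """Give the next queued experiment to a client, or None if none is left."""
        if not self.pending:
            return None
        experiment = self.pending.popleft()
        self.running[client_id] = (experiment, self.clock())
        return experiment

    def echo(self, client_id):
        """Record a heartbeat echo from a client that is running an experiment."""
        if client_id in self.running:
            experiment, _ = self.running[client_id]
            self.running[client_id] = (experiment, self.clock())

    def complete(self, client_id):
        """Mark the client's experiment as finished."""
        self.running.pop(client_id, None)

    def reissue_stale(self):
        """Requeue experiments whose client has not echoed within timeout_s."""
        now = self.clock()
        stale = [cid for cid, (_, seen) in self.running.items()
                 if now - seen > self.timeout_s]
        for client_id in stale:
            experiment, _ = self.running.pop(client_id)
            self.pending.appendleft(experiment)  # retry before untouched work
        return stale
```

In this sketch the server would call echo() whenever a UDP echo arrives and call reissue_stale() on a timer. Putting a reissued experiment at the front of the queue is one possible policy; appending it to the back would work equally well.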

You can also:

  • Use built-in command line arguments on ./server.py to troubleshoot Hadoop installation and execution. See ./server.py --help for a detailed list.

Tech

Frameworks and tools:

Programming:

OS and Sandboxes:

  • Host: MS Windows 10 Pro
  • Guest: Ubuntu xfce 18.04.1 LTS, Bionic Beaver (One Server/Three Slaves)

Installation

Refer to Apache's Installation Guide to install and set up Apache Hadoop. Refer to matplotlib.org to install matplotlib for Python 3.

Start up the HDFS:

$ cd "$HADOOP_HOME"
$ ./sbin/start-dfs.sh
$ jps

Populate the necessary files (see the Extra Files section), then start the Server with ./server.py (run ./server.py --help for execution details). On each Client, run ./client.py (run ./client.py --help for execution details).

To open the Java GUI (an overview of experiment progress) on its own, run: prompt# java -jar ProcessTracking.jar [ServerIPv4] [port] (the default port is 30001).

For production environments (Apache Hadoop compiled from source), the sbin folder is under hadoop-3.1.1-src/hadoop-dist/target/hadoop-3.1.1/sbin.

UMLs

See UMLs for more information on the scripts.

Extra Files

  • real_time_tracking.dat: Log file for server recovery.
  • runtime_parameters.dat: Necessary parameters for server and client execution.
  • values.exp: The values for which the SLS will run.
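
To illustrate how a recovery log of this kind can rebuild the experiment queue, here is a small sketch. The actual layout of real_time_tracking.dat is not described in this README, so the JSON-lines format, the status names ("queued", "assigned", "done"), and the functions record and recover_queue are all hypothetical.

```python
import json


def record(log_path, experiment, status):
    """Append one status change to the log (hypothetical JSON-lines format)."""
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps({"experiment": experiment, "status": status}) + "\n")


def recover_queue(log_path):
    """Replay the log and return the experiments that never reached "done"."""
    last_status = {}  # dict keeps first-seen order, i.e. the original queue order
    try:
        with open(log_path, encoding="utf-8") as log:
            for line in log:
                line = line.strip()
                if not line:
                    continue
                try:
                    entry = json.loads(line)
                except json.JSONDecodeError:
                    continue  # skip a line cut short by the shutdown
                last_status[entry["experiment"]] = entry["status"]
    except FileNotFoundError:
        return []  # no log yet, nothing to recover
    return [exp for exp, status in last_status.items() if status != "done"]
```

An append-only log is used here because a crash can damage at most the final line, which the replay simply skips. Experiments that were assigned but not finished are returned along with those still queued, so they would be run again after a restart.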

License

MIT
