hansugu/hadoop-on-lustre2

Diskless Hadoop 2 (YARN) on Lustre

This repository contains the code changes that allow Hadoop 2 (YARN) to run on "diskless" Hadoop nodes that use Lustre for all storage, both temporary and permanent. Build this version of Hadoop exactly as you would build unpatched Hadoop.

Please see this PDF for a more detailed explanation of how this patch works. The PDF is also included in the repository.

The following parameters are the bare minimum that must be changed to run this software in "diskless" mode.

  • Filename | Configuration Parameter | Suggested value
  • core-site.xml | fs.defaultFS | file:///
  • core-site.xml | hadoop.tmp.dir | Any location on the Lustre filesystem
  • mapred-site.xml | yarn.app.mapreduce.am.staging-dir | {lustre_mount_point}/tmp/yarn-staging
  • mapred-site.xml | mapred.cluster.local.dir | Any path on {lustre_mount_point}
  • mapred-site.xml | lustre.dir | Any directory on Lustre
  • mapred-site.xml | hadoop.ln.cmd | Set to the full path for the system “ln” command
  • yarn-site.xml | yarn.nodemanager.local-dirs | Any directory on Lustre named "${host.name}"
  • yarn-env.sh | HADOOP_OPTS | Add "-Dhost.name=`hostname -s`" (shell command substitution, so each node passes its own short hostname)
  • yarn-env.sh | "-XX:ErrorFile" and "-Xloggc" | Modify HADOOP_NAMENODE_OPTS, HADOOP_JOBTRACKER_OPTS, and HADOOP_SECONDARYNAMENODE_OPTS to point to non-volatile storage
  • yarn-env.sh | HADOOP_LOG_DIR and HADOOP_SECURE_DN_LOG_DIR | Modify the paths to point to non-volatile storage
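
As an illustration of the core-site.xml and yarn-site.xml rows above, the fragments below are a minimal sketch rather than a tested configuration. The mount point /mnt/lustre and the subdirectories beneath it are assumptions chosen for the example; substitute the paths used on your own cluster.

```xml
<!-- core-site.xml: sketch only; /mnt/lustre is an assumed Lustre mount point -->
<configuration>
  <!-- Use the local (POSIX) filesystem view, which on these nodes is Lustre -->
  <property>
    <name>fs.defaultFS</name>
    <value>file:///</value>
  </property>
  <!-- Any location on the Lustre filesystem; this path is hypothetical -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/mnt/lustre/hadoop/tmp</value>
  </property>
</configuration>
```

```xml
<!-- yarn-site.xml: sketch only; the parent directory is hypothetical -->
<configuration>
  <!-- A per-node directory on Lustre named after the host.
       ${host.name} is expected to resolve from the -Dhost.name
       system property set through HADOOP_OPTS in yarn-env.sh -->
  <property>
    <name>yarn.nodemanager.local-dirs</name>
    <value>/mnt/lustre/hadoop/nm-local/${host.name}</value>
  </property>
</configuration>
```

Because every node mounts the same Lustre filesystem, the per-host directory name is what keeps one NodeManager's local files from colliding with another's.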

For questions and other inquiries, please contact us at hadoop.on.lustre@seagate.com

Seagate Technology, November 2014
