Welcome to the docs repository for Revature's 200413 Big Data/Spark cohort. Here you will find weekly topics, useful resources, and project requirements.
Each week, we will focus on a particular technology or theme to add to our repertoire of competencies. These topics will feature heavily in that week's assessments and QC meetings, so self-study and practical exploration will be necessary.
Each week may have a list of topic-based questions, which you should be prepared to study and answer in an assessment, whether in a meeting or a quiz. Associates are expected to answer at least five of them on the weekly discussion board, and to respond to other posts with suggestions to improve or clarify them.
- Week 1 - Java
- Week 2 - SQL
- Week 3 - HTTP
- Week 4 - Big Data
- Week 5 - Apache Spark
- Week 6 - Spark SQL
- Week 7 - Spark Streaming
Google Doc - Contains our standard schedule, QC assessments overview and links, and a list of important contacts.
This cohort will prioritize individual and group-based project work:
- Project 0: Begins Week 1, due Wednesday Week 3
- Project 1: Begins end of Week 3, due Friday Week 5
- Project 2: Begins Week 6, due Friday Week 7
- Project 3: Begins Week 8, due Thursday Week 10
Each project comes with a list of required features, whether functional or operational. We strongly suggest finishing your MVP (minimum viable product) as early as possible, then iterating on it with new features. Plan ahead, and be sure to reach out to everyone whenever you require guidance (or offer your own to those in need).
To maximize resources and minimize troubleshooting, please perform a clean install or refresh of your operating system. Update your system, enable VT-x in the BIOS if possible, and uninstall all unnecessary programs. Your development environment should be set up for Java, Git, and Maven as soon as possible. In later weeks we will also require PostgreSQL, Docker, SSH, curl, and of course Apache Spark. Refer to this README or the links provided in each week's topic and resources document to stay up to date on the tools and programs needed for project work. You will be responsible for maintaining your environment throughout the program.
Install Chocolatey:
- Open PowerShell as an administrator.
- Run:
Set-ExecutionPolicy AllSigned
- Agree to all changes
- Run:
Set-ExecutionPolicy Bypass -Scope Process -Force; iex ((New-Object System.Net.WebClient).DownloadString('https://chocolatey.org/install.ps1'))
- Open a new PowerShell window as an administrator and run the following commands:
- Install Git for Windows:
choco install git
- Install OpenJDK 8:
choco install adoptopenjdk8
- Install Apache Maven:
choco install maven
- Install an IDE of your choice:
  - Visual Studio Code:
    choco install vscode
  - Eclipse:
    choco install eclipse
  - IntelliJ IDEA Community:
    choco install intellijidea-community
To confirm all tools are properly installed and configured, be sure the following commands return no errors:
git --version
java -version
javac -version
mvn -v
java and javac should only reference Java 1.8.
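As an optional extra check of the Java 1.8 requirement above, a small program can be compiled and run with the freshly installed tools. This is only a minimal sketch: the file and class name EnvCheck and the method isJava8 are hypothetical and are not part of the cohort materials. It relies on the standard java.version system property, which reports a "1.8" prefix on Java 8.

```java
// EnvCheck.java: hypothetical helper, not part of the cohort materials.
public class EnvCheck {

    // Returns true when a java.version string belongs to Java 8.
    // Java 8 and earlier report versions as "1.x" (for example "1.8.0_252");
    // Java 9 and later dropped the "1." prefix (for example "11.0.7").
    public static boolean isJava8(String version) {
        return version != null
                && (version.equals("1.8") || version.startsWith("1.8."));
    }

    public static void main(String[] args) {
        // Version of the JVM that is actually running this class.
        String version = System.getProperty("java.version");
        System.out.println("Running on Java " + version);

        if (!isJava8(version)) {
            System.err.println("Expected Java 1.8; check JAVA_HOME and PATH.");
            System.exit(1);
        }
        System.out.println("OK: java resolves to Java 1.8.");
    }
}
```

Save the file as EnvCheck.java, then run javac EnvCheck.java followed by java EnvCheck. The first command exercises javac and the second exercises java, so an error from either step suggests that one of them is missing from the PATH or points at a different Java installation.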
For convenience, all of the above tools (with Visual Studio Code as the IDE) can be installed at once using the following command:
choco install -y git adoptopenjdk8 maven vscode