Terasort-on-Hadoop-Spark

This programming assignment implements the sort application using three different approaches: shared memory, Apache Hadoop, and Apache Spark.

The assignment directory contains the following documents and folders:

Source code of the Terasort programs for Hadoop, Spark, and shared memory - Source Code
Performance evaluation report - prog2_report.pdf
Snapshots of output from runs on Amazon AWS - Snapshots
Configuration files for Hadoop and Spark - Config Files

STEPS FOR EXECUTION:

SHARED MEMORY:

Compilation: javac SharedMemoryTera.java

Execution: java SharedMemoryTera

To run the module on AWS, perform the following steps:

APACHE HADOOP:

First, install Apache Hadoop by running the installation script.
Once Apache Hadoop is installed successfully, perform the following steps:

i) Execute "gensort".

ii) Execute "TeraByteSorting.java".

iii) Execute "valsort".
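The three steps above (generate, sort, validate) can be illustrated with a small in-memory sketch. It is not the code in TeraByteSorting.java. It only mimics the roles of gensort and valsort, assuming the standard gensort record layout: 100-byte records whose first 10 bytes are the key, ending in CRLF in ASCII mode. The function names `make_records`, `sort_records`, and `is_sorted` are hypothetical.

```python
import random
import string

RECORD_LEN = 100  # assumed gensort record size in bytes
KEY_LEN = 10      # assumed key size: the first 10 bytes of each record


def make_records(n, seed=0):
    """Stand-in for gensort: build n random ASCII records of RECORD_LEN bytes."""
    rng = random.Random(seed)
    alphabet = string.ascii_uppercase + string.digits
    records = []
    for _ in range(n):
        body = "".join(rng.choice(alphabet) for _ in range(RECORD_LEN - 2))
        records.append(body + "\r\n")  # ASCII-mode records end in CRLF
    return records


def sort_records(records):
    """Stand-in for the sort job: order records by their 10-byte key."""
    return sorted(records, key=lambda r: r[:KEY_LEN])


def is_sorted(records):
    """Stand-in for valsort's ordering check: keys must be non-decreasing."""
    return all(a[:KEY_LEN] <= b[:KEY_LEN] for a, b in zip(records, records[1:]))


if __name__ == "__main__":
    data = make_records(1000)
    print("sorted output valid:", is_sorted(sort_records(data)))
```

On a real cluster the same check is performed by running valsort over the job's output, which is why step iii) always follows the sort.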

APACHE SPARK:

First, install Apache Spark by running the Bash script.
The Bash script installs Apache Spark on the Amazon cluster.
Once Apache Spark is installed successfully, perform the following steps:

i) Execute "gensort" and take the "input" file.

ii) Transfer the file to the location where the gensort data is staged.

iii) Execute "pyTeraSort.py".

iv) Execute "valsort".
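For orientation, a key-based sort in PySpark might look like the sketch below. This is not the contents of pyTeraSort.py. It is a minimal sketch under stated assumptions: the input consists of ASCII gensort records with a 10-byte key, `textFile` strips the CRLF line terminator, and a trailing "\r" is re-added so that each output line plus the newline written by `saveAsTextFile` is again a full record. The names `split_record`, `join_record`, and `run_sort` are hypothetical.

```python
KEY_LEN = 10  # assumed gensort key size: the first 10 bytes of each record


def split_record(line):
    """Turn one input line into a (key, rest) pair for sortByKey."""
    return (line[:KEY_LEN], line[KEY_LEN:])


def join_record(pair):
    """Rebuild the record; re-add the '\r' assumed to be stripped on read."""
    key, rest = pair
    return key + rest + "\r"


def run_sort(input_path, output_path):
    """Sort the records at input_path by key and write them to output_path."""
    # Imported here so the helpers above can be used without Spark installed.
    from pyspark import SparkContext

    sc = SparkContext(appName="TeraSortSketch")
    try:
        (sc.textFile(input_path)
           .map(split_record)
           .sortByKey()
           .map(join_record)
           .saveAsTextFile(output_path))
    finally:
        sc.stop()
```

A script that calls `run_sort` with the input and output paths could be submitted with spark-submit. The output is written as several part files, which would then be passed to valsort in step iv).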
