# Spark Tutorial
This tutorial provides a quick introduction to using Spark. It demonstrates the basic functionality of the RDD and DataFrame APIs.

The first thing a Spark program must do is create a `SparkContext`, which tells Spark how to access a cluster. It is built from a `SparkConf` object that describes your application:
```scala
import org.apache.spark.{SparkConf, SparkContext}

// appName identifies your application on the cluster UI;
// master is a cluster URL, or "local[*]" to run locally.
val conf = new SparkConf().setAppName(appName).setMaster(master)
val sc = new SparkContext(conf)
```
**Note:** Only one `SparkContext` may be active per JVM. You must `stop()` the active `SparkContext` before creating a new one.
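For example, replacing the active context looks like this (a minimal sketch; `sc` is the context created above, and the new application settings are hypothetical):

```scala
// Stop the currently active context before building another one.
sc.stop()

// Hypothetical replacement context with new settings.
val newConf = new SparkConf().setAppName("another-app").setMaster("local[*]")
val sc2 = new SparkContext(newConf)
```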
This tutorial covers the basics of the Spark Core, SQL, Streaming, ML, and GraphX programming contexts. The Spark Core topics are:
- [Creations](https://github.com/rklick-solutions/spark-tutorial/wiki/Spark-Core#create-rdds)
- Operations (a short example follows this list)
  - [Transformations](https://github.com/rklick-solutions/spark-tutorial/wiki/Spark-Core#1transformations)
  - [Actions](https://github.com/rklick-solutions/spark-tutorial/wiki/Spark-Core#2actions)
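As a quick taste of these operations, here is a minimal sketch (using the `sc` context created above; the data is made up):

```scala
// Build an RDD from a local collection.
val numbers = sc.parallelize(1 to 10)

// Transformations are lazy: nothing executes yet.
val squares = numbers.map(n => n * n)
val evens   = squares.filter(_ % 2 == 0)

// Actions trigger execution and return results to the driver.
println(evens.count())                  // 5
println(evens.collect().mkString(", ")) // 4, 16, 36, 64, 100
```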
The Spark SQL topics are:

- Create SQL Context
- Creating DataFrames
- Creating Datasets
- Inferring the Schema using Reflection
- Programmatically Specifying the Schema
- DataFrame Operations in JSON file
- DataFrame Operations in Text file
- DataFrame Operations in CSV file
- DataFrame API
  - Action
  - Basic DataFrame functions
  - DataFrame Operations
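For example, reading a JSON file into a DataFrame and querying it might look like this (a minimal sketch; the `people.json` path and its `name`/`age` fields are hypothetical):

```scala
import org.apache.spark.sql.SQLContext

val sqlContext = new SQLContext(sc)

// Read a JSON file into a DataFrame (hypothetical path and schema).
val df = sqlContext.read.json("src/main/resources/people.json")

// Inspect the inferred schema and run a couple of DataFrame operations.
df.printSchema()
df.select("name").show()
df.filter(df("age") > 21).show()
```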
We use the `SparkCommon` object from the utils package to run the examples in this tutorial:
```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import org.apache.spark.streaming.{Seconds, StreamingContext}

object SparkCommon {
  // Run locally, using as many worker threads as there are cores.
  lazy val conf = {
    new SparkConf(false)
      .setMaster("local[*]")
      .setAppName("Spark Tutorial")
  }
  lazy val sparkContext = new SparkContext(conf)
  // Reuse an existing SQLContext for this JVM if one was already created.
  lazy val sparkSQLContext = SQLContext.getOrCreate(sparkContext)
  // Reuse the active StreamingContext, or create one with a 2-second batch interval.
  lazy val streamingContext = StreamingContext.getActive()
    .getOrElse(new StreamingContext(sparkContext, Seconds(2)))
}
```
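An example in this tutorial can then simply pull the shared context from `SparkCommon` (a minimal sketch; the `WordCount` object and its input data are hypothetical):

```scala
// Hypothetical example object showing how SparkCommon is used.
object WordCount extends App {
  val sc = SparkCommon.sparkContext

  // Count word occurrences in a small in-memory dataset.
  val counts = sc.parallelize(Seq("spark tutorial", "spark core"))
    .flatMap(_.split(" "))
    .map(word => (word, 1))
    .reduceByKey(_ + _)

  // Prints the three (word, count) pairs, e.g. (spark,2), (tutorial,1), (core,1).
  counts.collect().foreach(println)
}
```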