Training material for data science, including bash, Scala, and Spark

NUSTemple/training

training

This repo collects the Jupyter notebooks used for training. The target audience is data science beginners.

Current Jupyter notebooks:

  1. Unix: In data science we regularly need to transfer data between servers and run operations on them, so basic Unix command skills are necessary. They are also very useful when working with Hadoop HDFS commands, many of which are modeled on Unix commands.

  2. Scala: This is the course preparation for Spark. Because Spark is written in Scala, we need to know the basics of Scala before moving on to Spark.

  3. Spark: Basic Spark training, including reading files, using SQL to query Hive and HBase data, and DataFrame operations.

  4. Jupyter notebook: How to open a Jupyter notebook and how to write code in it.

  5. Hive Introduction @todo

  6. HBase Introduction @todo
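
As a rough illustration of the point in item 1 that many HDFS commands mirror Unix commands, the sketch below runs a few basic Unix commands locally and notes a matching `hdfs dfs` subcommand in a comment next to each. It is not taken from the notebooks. The directory and file names (`demo`, `data.txt`, `/tmp/demo`) are made up for the example, and the `hdfs` lines are comments only, since they assume a Hadoop client that may not be installed.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Work in a throwaway directory so the example leaves nothing behind.
workdir="$(mktemp -d)"

# Local: create a directory.      HDFS: hdfs dfs -mkdir -p /tmp/demo
mkdir -p "$workdir/demo"

# Local: write a small file. On HDFS, a local file would be uploaded with:
#   hdfs dfs -put data.txt /tmp/demo/
printf 'alpha\nbeta\n' > "$workdir/demo/data.txt"

# Local: list the directory.      HDFS: hdfs dfs -ls /tmp/demo
listing="$(ls "$workdir/demo")"

# Local: read the file.           HDFS: hdfs dfs -cat /tmp/demo/data.txt
contents="$(cat "$workdir/demo/data.txt")"

# Count the lines that were read (tr strips padding some wc versions add).
line_count="$(printf '%s\n' "$contents" | wc -l | tr -d ' ')"

echo "files: $listing"
echo "lines: $line_count"

# Local: remove recursively.      HDFS: hdfs dfs -rm -r /tmp/demo
rm -r "$workdir"
```

The local commands run on any Unix-like system. To try the HDFS variants, run the commented commands on a machine with a configured Hadoop client.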
