training

This repo collects the Jupyter notebooks used for training. The target audience is data science beginners.

Current Jupyter notebooks:

  1. Unix: Data science work often involves transferring data between servers and running operations on them, so basic Unix command skills are necessary. They are also useful for Hadoop HDFS commands, many of which are modeled on Unix commands.

  2. Scala: This is the course preparation for Spark. Because Spark is written in Scala, we need to know the basics of Scala before moving on to Spark.

  3. Spark: Basic Spark training, including reading files, using SQL to query Hive and HBase data, and DataFrame operations.

  4. Jupyter notebook: How to open a Jupyter notebook and how to write code in it.

  5. Hive Introduction @todo

  6. HBase Introduction @todo
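
As a rough illustration of the point made under the Unix notebook, that many HDFS commands mirror Unix ones, here is a minimal sketch. It is not taken from the notebooks themselves. The file name `sample.csv` and the HDFS path `/user/demo` are hypothetical, and the `hdfs dfs` lines are left as comments because they need a running Hadoop cluster.

```shell
# Create a throwaway working directory so the sketch leaves nothing behind.
workdir=$(mktemp -d)

# Plain Unix: make a directory, write a small file, list it, print it.
mkdir -p "$workdir/data"
printf 'id,name\n1,alice\n2,bob\n' > "$workdir/data/sample.csv"
ls "$workdir/data"
cat "$workdir/data/sample.csv"

# Count the lines in the file (header plus two rows).
lines=$(wc -l < "$workdir/data/sample.csv")
echo "line count: $lines"

# HDFS counterparts of the commands above.
# Commented out: they assume a running cluster and a hypothetical /user/demo path.
# hdfs dfs -mkdir -p /user/demo/data
# hdfs dfs -put "$workdir/data/sample.csv" /user/demo/data/
# hdfs dfs -ls /user/demo/data
# hdfs dfs -cat /user/demo/data/sample.csv

# Clean up the local scratch directory.
rm -rf "$workdir"
```

The flags used on the HDFS side (`-mkdir -p`, `-ls`, `-cat`) read the same as their Unix equivalents, which is why Unix basics come first in the list.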