mdivk/airflow-training

Introduction to data pipeline management with Airflow

As the modern data warehouse increases in complexity, it is necessary to have a dependable, scalable, intuitive, and simple scheduling and management program to monitor the flow of data and track how transformations are completed.

Apache Airflow, which helps companies manage the complexities of their Enterprise Data Warehouse, is being adopted by tech companies everywhere for its ease of management, scalability, and elegant design. Airflow is rapidly becoming the go-to technology for companies scaling out large data warehouses.

The Introduction to data pipeline management with Airflow training course is designed to familiarize participants with using Airflow to schedule and maintain numerous ETL processes running on a large-scale Enterprise Data Warehouse.

Table of contents:

  1. Introduction to Airflow
  2. Introduction to Airflow core concepts (DAGs, tasks, operators, sensors)
  3. Airflow UI
  4. Airflow Scheduler
  5. Airflow Operators & Sensors
  6. Advanced Airflow Concepts (Hooks, Connections, Variables, Templates, Macros, XCom)
  7. SLA, Monitoring & Alerting
  8. Code examples

Prerequisites

Participants should have a technology background and basic programming skills in Python, and should be open to sharing their thoughts and questions.

Participants need to bring their laptops. The examples were tested on macOS and Ubuntu machines. Participants can also use a hosted Airflow solution such as Google Cloud Composer or Astronomer.

Installation

  1. Install sqlite3.

  2. Run ./airflow scheduler to start the Airflow scheduler. The installation script will install all the dependencies.

  3. In another terminal, run ./airflow webserver

  4. In your browser, visit http://localhost:8080 to access the Airflow UI.
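
If the page does not load, it can help to check from Python whether anything is answering on that address. The following is a minimal sketch using only the standard library. The function name airflow_ui_is_reachable and the default timeout are illustrative assumptions and are not part of this repository or of Airflow. It only confirms that an HTTP server responds at the URL, not that the server is Airflow or that it is healthy.

```python
import urllib.error
import urllib.request


def airflow_ui_is_reachable(url: str = "http://localhost:8080", timeout: float = 3.0) -> bool:
    """Return True if an HTTP server answers at `url`, False otherwise.

    Hypothetical helper for this README; it is not an Airflow API.
    """
    # Ignore any proxy configured in the environment, since the target is local.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout):
            # A response arrived (redirects are followed automatically).
            return True
    except urllib.error.HTTPError:
        # The server answered, but with an error status such as 403 or 500.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timed out, or host not resolvable.
        return False


if __name__ == "__main__":
    if airflow_ui_is_reachable():
        print("Something is answering on http://localhost:8080")
    else:
        print("Nothing is answering yet; check that ./airflow webserver is still running")
```

A False result right after starting the webserver is not necessarily a failure, since the server may need a few seconds before it accepts connections.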

Contributing

Interested in contributing? Improving documentation? Adding more examples? Check out Contributing.md

License

As stated in the License file, all lecture slides are provided under Creative Commons BY-NC 4.0. The exercise code is released under an MIT license.

Author:

Credit

About

Airflow training for the crunch conf
