Python code from the videos is linked below.
If the commands in Kalise's videos are too small to read, transcripts with code are available for the second Prefect video and the fifth Prefect video.
- What is a Data Lake
- ELT vs. ETL
- Alternatives to components (S3/HDFS, Redshift, Snowflake, etc.)
- Video
- Slides
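
As a rough illustration of the ELT vs. ETL topic above, the sketch below runs both orderings against an in-memory SQLite database. It is not taken from the videos: the table names (`trips_etl`, `trips_raw`, `trips_elt`), the sample rows, and the `is_number` helper are all made up for illustration, and SQLite stands in for a real lake or warehouse.

```python
import sqlite3

# Hypothetical raw records: (day, amount as text); one amount is malformed.
raw_rows = [("2021-01-01", "12.50"), ("2021-01-02", "7.25"), ("2021-01-02", "bad")]


def is_number(text):
    """Return True if the string can be parsed as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


conn = sqlite3.connect(":memory:")

# ETL: transform in application code first, then load only the clean rows.
clean_rows = [(day, float(amount)) for day, amount in raw_rows if is_number(amount)]
conn.execute("CREATE TABLE trips_etl (day TEXT, amount REAL)")
conn.executemany("INSERT INTO trips_etl VALUES (?, ?)", clean_rows)

# ELT: load the raw rows untouched, then transform inside the store with SQL.
conn.execute("CREATE TABLE trips_raw (day TEXT, amount TEXT)")
conn.executemany("INSERT INTO trips_raw VALUES (?, ?)", raw_rows)
conn.execute(
    """
    CREATE TABLE trips_elt AS
    SELECT day, CAST(amount AS REAL) AS amount
    FROM trips_raw
    WHERE amount GLOB '[0-9]*'  -- keep only values that start with a digit
    """
)

etl_total = conn.execute("SELECT SUM(amount) FROM trips_etl").fetchone()[0]
elt_total = conn.execute("SELECT SUM(amount) FROM trips_elt").fetchone()[0]
raw_count = conn.execute("SELECT COUNT(*) FROM trips_raw").fetchone()[0]
```

Both paths end with the same cleaned table. The difference is that the ELT path keeps the raw rows (including the malformed one) in storage, so the transformation can be revised and re-run later.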
- What is orchestration?
- Workflow orchestrators vs. other types of orchestrators
- Core features of a workflow orchestration tool
- Different types of workflow orchestration tools that currently exist
🎥 Video
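
To make the "core features" item above more concrete, here is a minimal sketch of two things a workflow orchestrator typically handles: running tasks in dependency order and retrying transient failures. It uses only the Python standard library and is not how Prefect or any other tool is implemented; `run_workflow`, the task names, and the simulated failure are assumptions made up for this example.

```python
from graphlib import TopologicalSorter


def run_workflow(tasks, dependencies, max_retries=2):
    """Run tasks in dependency order, retrying each one on failure.

    tasks: dict mapping task name -> callable that receives upstream results
    dependencies: dict mapping task name -> set of upstream task names
    """
    results = {}
    # List every task so that tasks without dependencies are scheduled too.
    graph = {name: dependencies.get(name, set()) for name in tasks}
    for name in TopologicalSorter(graph).static_order():
        for attempt in range(1, max_retries + 2):
            try:
                results[name] = tasks[name](results)
                break
            except Exception:
                if attempt == max_retries + 1:
                    raise  # retries exhausted, surface the failure
    return results


attempts = {"extract": 0}


def extract(_results):
    # Fail on the first call to simulate a transient error.
    attempts["extract"] += 1
    if attempts["extract"] == 1:
        raise ConnectionError("transient failure")
    return [1, 2, 3]


def transform(results):
    return [value * 10 for value in results["extract"]]


def load(results):
    return f"loaded {len(results['transform'])} rows"


tasks = {"extract": extract, "transform": transform, "load": load}
dependencies = {"transform": {"extract"}, "load": {"transform"}}
results = run_workflow(tasks, dependencies)
```

Real orchestration tools add scheduling, state tracking, logging, and a UI on top of this basic idea.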
- What is Prefect?
- Installing Prefect
- Prefect flow
- Creating an ETL
- Prefect task
- Blocks and collections
- Orion UI
🎥 Video
- Flow 1: Putting data to Google Cloud Storage
🎥 Video
- Flow 2: From GCS to BigQuery
🎥 Video
- Parametrizing the script from your flow
- Parameter validation with Pydantic
- Creating a deployment locally
- Setting up Prefect Agent
- Running the flow
- Notifications
🎥 Video
- Scheduling a deployment
- Flow code storage
- Running tasks in Docker
🎥 Video
- Using Prefect Cloud instead of local Prefect
- Workspaces
- Running flows on GCP
🎥 Video
Code from videos (with a few minor enhancements)
Homework can be found here.
Did you take notes? You can share them here.
- Blog by Marcos Torregrosa (Prefect)
- Notes from Victor Padilha
- Notes by Alain Boisvert
- Notes by Candace Williams
- Notes from Xia He-Bleinagel
- Notes from froukje
- Notes from Balaji
- More on Pandas vs SQL, Prefect capabilities, and testing your data, by Vera
- Add your notes here (above this line)