This repository showcases the results of the labs completed during my first semester at Igor Sikorsky Kyiv Polytechnic Institute, where I pursued a Master's degree in Informatics and Software Engineering 🎓
The labs focus primarily on Apache Airflow and demonstrate data processing pipelines built using the ELT pattern. With this repository, I want to share what I learned in practice with others interested in data engineering and workflow automation using Apache Airflow.
- Before proceeding, ensure you have Apache Airflow installed on your PC. If you are using a Windows system, you can use the Ubuntu subsystem available at https://www.microsoft.com/en-us/p/ubuntu/9nblggh4msv6. Make sure to enable developer mode in Windows Developer Settings and activate the Windows Subsystem for Linux component in Windows Features.
- Install the required packages by running the following commands:
sudo apt-get update
sudo apt-get install libmysqlclient-dev
sudo apt-get install libkrb5-dev
sudo apt-get install libsasl2-dev
sudo apt-get install postgresql postgresql-contrib
sudo service postgresql start
sudo nano /etc/postgresql/*/main/pg_hba.conf
sudo service postgresql restart
sudo apt install python3-pip
pip install apache-airflow
airflow db init
sudo apt-get build-dep python3-psycopg2
pip install psycopg2-binary
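
After the installation commands above, it can help to confirm that the tools are actually reachable from the shell. A per-user pip install typically places the airflow executable in ~/.local/bin, which may not be on PATH yet. The following is a minimal sketch; missing_tools is a hypothetical helper name and not part of the labs.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the lab code): print the name of every
# command passed as an argument that cannot be found on PATH.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Usage (commented out so that sourcing this file prints nothing):
# missing_tools airflow psql pip3
```

If the helper prints nothing, all listed commands were found. If it prints airflow, adding ~/.local/bin to PATH is a likely fix.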
- Place your DAGs in the Airflow DAGs folder, which is ~/airflow/dags inside Ubuntu. From Windows, the same folder on my machine is: C:/Users/vicwa/AppData/Local/Packages/CanonicalGroupLimited.UbuntuonWindows_79rhkp1fndgsc/LocalState/rootfs/home/vic/airflow/dags
- Connect to the airflow database (replace vic with your own PostgreSQL user) using the following command:
psql -h 127.0.0.1 -d airflow -U vic
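
The psql command above only connects, so the role and the database must already exist. The sketch below shows one way this could be set up, assuming Airflow is meant to store its metadata in this PostgreSQL database (the README does not state this explicitly). The password change-me and the helper name pg_conn_uri are placeholders, not taken from the labs.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build a SQLAlchemy-style PostgreSQL connection URI.
# Arguments: user, password, host, database, optional port (default 5432).
pg_conn_uri() {
    printf 'postgresql+psycopg2://%s:%s@%s:%s/%s\n' "$1" "$2" "$3" "${5:-5432}" "$4"
}

# One-time setup, run manually (assumed role and database names):
# sudo -u postgres psql -c "CREATE USER vic WITH PASSWORD 'change-me';"
# sudo -u postgres psql -c "CREATE DATABASE airflow OWNER vic;"

# Point Airflow at that database before running "airflow db init".
# Airflow 2.3+ reads AIRFLOW__DATABASE__SQL_ALCHEMY_CONN; older 2.x
# releases read AIRFLOW__CORE__SQL_ALCHEMY_CONN instead.
# export AIRFLOW__DATABASE__SQL_ALCHEMY_CONN="$(pg_conn_uri vic change-me 127.0.0.1 airflow)"
```

The helper does no URL-encoding, so a password containing characters such as @ or / would need to be percent-encoded first.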
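- Run the following commands in the Ubuntu console. The webserver and the scheduler both keep running in the foreground, so start each of them in its own console window: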
sudo service postgresql restart
airflow db init
airflow webserver -p 8080
airflow scheduler
sudo service postgresql restart
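
Because the webserver and the scheduler each occupy a console, a single-console alternative is to start them in the background and write their output to log files. This is only a sketch under that assumption; start_bg and the log file locations are hypothetical and not part of the labs.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a long-lived command in the background,
# redirect its output to a log file, and print the PID of the new process.
start_bg() {
    local log_file="$1"
    shift
    nohup "$@" >"$log_file" 2>&1 &
    echo "$!"
}

# Usage (commented out so that sourcing this file starts nothing):
# web_pid="$(start_bg "$HOME/airflow/webserver.log" airflow webserver -p 8080)"
# sched_pid="$(start_bg "$HOME/airflow/scheduler.log" airflow scheduler)"
# kill "$web_pid" "$sched_pid"   # stop both processes when finished
```

Keeping the PIDs makes it possible to stop both processes later without searching for them.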
- Results for Lab1: