As we saw in the previous challenge, we can use GitHub Actions to automate workflows.
In this challenge, we'll use a cron schedule to run our dbt project once a day.
Create a new file named scheduled_daily_run_<fbalseiro>.yml
(replace fbalseiro with your dbt development schema name) in the workflows folder (inside the .github
folder) and copy the content below:
name: Scheduled daily run
on:
  schedule:
    - cron: '0 11 * * *' # This schedule runs the workflow every day at 11:00AM UTC
  workflow_dispatch:
jobs:
  build-and-deploy:
    runs-on: ubuntu-latest
    env:
      DBT_USER: ${{ secrets.DBT_USER }}
      DBT_PASSWORD: ${{ secrets.DBT_PASSWORD }}
      SNOWFLAKE_ACCOUNT: ${{ secrets.SNOWFLAKE_ACCOUNT }}
      SNOWFLAKE_WAREHOUSE: ${{ secrets.SNOWFLAKE_WAREHOUSE }}
      SNOWFLAKE_DATABASE: ${{ secrets.SNOWFLAKE_DATABASE }}
      SNOWFLAKE_ROLE: ${{ secrets.SNOWFLAKE_ROLE }}
    steps:
      - name: Checkout repository
        uses: actions/checkout@v2
        with:
          ref: feature_fbalseiro
      - name: Set up Python
        uses: actions/setup-python@v3
        with:
          python-version: 3.9
      - name: Install dbt
        run: pip3 install dbt-snowflake
      - name: Deploy & Test Models and Source freshness
        working-directory: ./dbt
        run: |
          dbt deps
          dbt source freshness --profiles-dir .
          dbt build -s +tag:daily+ --profiles-dir .
          dbt run-operation drop_old_relations
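The env block in the workflow exposes the repository secrets as environment variables, and --profiles-dir . tells dbt to read profiles.yml from the dbt folder. The project's actual profiles.yml is not shown here, so the following is only a minimal sketch of how such a file could pick those variables up with dbt's env_var function. The profile name, target name, schema and thread count are assumptions for illustration.

```yaml
# profiles.yml (sketch): profile name, target name, schema and threads are hypothetical
my_dbt_project:
  target: prod
  outputs:
    prod:
      type: snowflake
      # Each value is read from the environment variables set in the workflow's env block
      account: "{{ env_var('SNOWFLAKE_ACCOUNT') }}"
      user: "{{ env_var('DBT_USER') }}"
      password: "{{ env_var('DBT_PASSWORD') }}"
      role: "{{ env_var('SNOWFLAKE_ROLE') }}"
      warehouse: "{{ env_var('SNOWFLAKE_WAREHOUSE') }}"
      database: "{{ env_var('SNOWFLAKE_DATABASE') }}"
      schema: dbt_fbalseiro # hypothetical: use your own development schema
      threads: 4
```

Reading credentials through env_var keeps them out of the repository, which is why the workflow passes them in as secrets.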
We're using the schedule event with a cron expression to trigger the workflow once a day.
I've also added the workflow_dispatch key, which lets you trigger the workflow manually so you can test it yourself.
Finally, I've added the ref property under the Checkout repository step so you can specify the GitHub development branch that the workflow checks out and runs against.
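To make the trigger section easier to adapt, here is the same on block annotated field by field. GitHub Actions uses the five-field POSIX cron syntax and evaluates it in UTC. The commented-out second schedule is a hypothetical alternative, not part of this challenge.

```yaml
on:
  schedule:
    # Fields, left to right: minute, hour, day of month, month, day of week (UTC)
    - cron: '0 11 * * *' # minute 0, hour 11, every day: daily at 11:00 UTC
    # - cron: '30 6 * * 1-5' # hypothetical alternative: 06:30 UTC, Monday to Friday
  workflow_dispatch: # no inputs needed; this key alone enables manual runs
```

Keep the cron expression quoted, since an unquoted * has a special meaning in YAML.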
Let's take a closer look at each dbt command used in the workflow above:
dbt deps: Installs the required packages listed in your packages.yml file.
dbt source freshness --profiles-dir .: Checks the freshness of the sources listed in _boardgames__sources.yml, using the profiles.yml file in the current working directory.
dbt build -s +tag:daily+ --profiles-dir .: Builds and tests all the models with the daily tag, plus all of their parents and children, using the profiles.yml file in the current working directory.
dbt run-operation drop_old_relations: Invokes the drop_old_relations macro to drop relations in Snowflake that no longer have a dbt model associated with them.
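The +tag:daily+ selector only matches something if at least one model carries the daily tag. How the tag is assigned in this project is not shown, so the excerpt below is just one possible way to do it in dbt_project.yml. The project name my_dbt_project and the marts folder are hypothetical.

```yaml
# dbt_project.yml (excerpt): project and folder names are hypothetical
models:
  my_dbt_project:
    marts:
      # Every model under models/marts gets the daily tag
      +tags: ['daily']
```

With a configuration like this, the leading + in +tag:daily+ pulls in everything the tagged models depend on, and the trailing + pulls in everything that depends on them.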
First, update the remote development branch by running git push.
Then check out this video tutorial on how to trigger the workflow manually.