Add GitHub workflow to populate the persistent source schema.
Showing 3 changed files with 150 additions and 2 deletions.
115 changes: 115 additions & 0 deletions
.github/workflows/cd-sql-engine-populate-persistent-source-schema.yaml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,115 @@ | ||
# See [Persistent Source Schema](/GLOSSARY.md#persistent-source-schema) | ||
# Populating the source schema via this workflow ensures that it's done with the same settings as the tests. | ||
|
||
name: Reload Test Data in SQL Engines | ||
|
||
# We don't want multiple workflows trying to create the same table. | ||
concurrency: | ||
group: POPULATE_PERSISTENT_SOURCE_SCHEMA | ||
cancel-in-progress: true | ||
|
||
on: | ||
pull_request: | ||
types: [labeled] | ||
workflow_dispatch: | ||
|
||
env: | ||
# Unclear how to make 'Reload Test Data in SQL Engines' a constant, as defining it here does not work. | ||
PYTHON_VERSION: "3.8" | ||
ADDITIONAL_PYTEST_OPTIONS: "--use-persistent-source-schema" | ||
|
||
jobs: | ||
snowflake-populate: | ||
environment: DW_INTEGRATION_TESTS | ||
if: > | ||
github.event_name == 'workflow_dispatch' | ||
|| (github.event.action == 'labeled' && github.event.label.name == 'Reload Test Data in SQL Engines') | ||
name: Snowflake | ||
runs-on: ubuntu-latest | ||
steps: | ||
- name: Check-out the repo | ||
uses: actions/checkout@v3 | ||
|
||
- name: Populate w/Python ${{ env.PYTHON_VERSION }} | ||
uses: ./.github/actions/run-mf-tests | ||
with: | ||
python-version: ${{ env.PYTHON_VERSION }} | ||
mf_sql_engine_url: ${{ secrets.MF_SNOWFLAKE_URL }} | ||
mf_sql_engine_password: ${{ secrets.MF_SNOWFLAKE_PWD }} | ||
parallelism: 1 | ||
additional-pytest-options: ${{ env.ADDITIONAL_PYTEST_OPTIONS }} | ||
make-target: "populate-persistent-source-schema-snowflake" | ||
|
||
redshift-populate: | ||
environment: DW_INTEGRATION_TESTS | ||
name: Redshift | ||
if: > | ||
github.event_name == 'workflow_dispatch' | ||
|| (github.event.action == 'labeled' && github.event.label.name == 'Reload Test Data in SQL Engines') | ||
runs-on: ubuntu-latest | ||
steps: | ||
- name: Check-out the repo | ||
uses: actions/checkout@v3 | ||
|
||
- name: Populate w/Python ${{ env.PYTHON_VERSION }} | ||
uses: ./.github/actions/run-mf-tests | ||
with: | ||
python-version: ${{ env.PYTHON_VERSION }} | ||
mf_sql_engine_url: ${{ secrets.MF_REDSHIFT_URL }} | ||
mf_sql_engine_password: ${{ secrets.MF_REDSHIFT_PWD }} | ||
parallelism: 1 | ||
additional-pytest-options: ${{ env.ADDITIONAL_PYTEST_OPTIONS }} | ||
make-target: "populate-persistent-source-schema-redshift" | ||
|
||
bigquery-populate: | ||
environment: DW_INTEGRATION_TESTS | ||
name: BigQuery | ||
if: > | ||
github.event_name == 'workflow_dispatch' | ||
|| (github.event.action == 'labeled' && github.event.label.name == 'Reload Test Data in SQL Engines') | ||
runs-on: ubuntu-latest | ||
steps: | ||
- name: Check-out the repo | ||
uses: actions/checkout@v3 | ||
|
||
- name: Populate w/Python ${{ env.PYTHON_VERSION }} | ||
uses: ./.github/actions/run-mf-tests | ||
with: | ||
python-version: ${{ env.PYTHON_VERSION }} | ||
mf_sql_engine_url: ${{ secrets.MF_BIGQUERY_URL }} | ||
mf_sql_engine_password: ${{ secrets.MF_BIGQUERY_PWD }} | ||
parallelism: 1 | ||
additional-pytest-options: ${{ env.ADDITIONAL_PYTEST_OPTIONS }} | ||
make-target: "populate-persistent-source-schema-bigquery" | ||
|
||
databricks-populate: | ||
environment: DW_INTEGRATION_TESTS | ||
name: Databricks SQL Warehouse | ||
if: > | ||
github.event_name == 'workflow_dispatch' | ||
|| (github.event.action == 'labeled' && github.event.label.name == 'Reload Test Data in SQL Engines') | ||
runs-on: ubuntu-latest | ||
steps: | ||
- name: Check-out the repo | ||
uses: actions/checkout@v3 | ||
|
||
- name: Populate w/Python ${{ env.PYTHON_VERSION }} | ||
uses: ./.github/actions/run-mf-tests | ||
with: | ||
python-version: ${{ env.PYTHON_VERSION }} | ||
mf_sql_engine_url: ${{ secrets.MF_DATABRICKS_SQL_WAREHOUSE_URL }} | ||
mf_sql_engine_password: ${{ secrets.MF_DATABRICKS_PWD }} | ||
parallelism: 1 | ||
additional-pytest-options: ${{ env.ADDITIONAL_PYTEST_OPTIONS }} | ||
make-target: "populate-persistent-source-schema-databricks" | ||
|
||
remove-label: | ||
name: Remove Label After Populating Test Data | ||
runs-on: ubuntu-latest | ||
needs: [snowflake-populate, redshift-populate, bigquery-populate, databricks-populate] | ||
if: github.event.action == 'labeled' && github.event.label.name == 'Reload Test Data in SQL Engines' | ||
steps: | ||
- name: Remove Label | ||
uses: actions-ecosystem/action-remove-labels@v1 | ||
with: | ||
labels: 'Reload Test Data in SQL Engines' |
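
Each populate job above is gated on the same rule: run on a manual dispatch, or when the 'Reload Test Data in SQL Engines' label is added to a pull request. A minimal Python sketch of that rule follows, as an illustration only and not part of the commit. The function name `should_populate` and its parameters are hypothetical. The sketch assumes that a manual run is identified by the event name (`github.event_name` in GitHub Actions), while `action` and the label name come from the `pull_request` event payload.

```python
from typing import Optional

# Label name copied from the workflow's job conditions.
RELOAD_LABEL = "Reload Test Data in SQL Engines"


def should_populate(
    event_name: str,
    action: Optional[str] = None,
    label_name: Optional[str] = None,
) -> bool:
    """Hypothetical model of the populate jobs' `if:` condition.

    Assumption: manual runs carry the event name "workflow_dispatch" and
    have no `action` field; `action` and `label_name` are only set for
    pull_request events.
    """
    if event_name == "workflow_dispatch":
        return True
    return (
        event_name == "pull_request"
        and action == "labeled"
        and label_name == RELOAD_LABEL
    )
```

Modeled this way, a `labeled` event carrying any other label, or a pull request event with a different action, leaves the populate jobs skipped.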
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
# Glossary | ||
|
||
## Persistent source schema | ||
Many tests generate and execute SQL that depends on tables containing test data. By default, a
pytest fixture creates a temporary schema and populates it with the tables that are required by
the tests. This schema is referred to as the source schema. Creating the source schema (and
the associated tables) can be a slow process for some SQL engines. Since these tables generally
do not change often, functionality was added to use a source schema that is assumed to already
exist when running tests and persists between runs (a persistent source schema). In addition,
functionality was added to create the persistent source schema based on table definitions in the
repo. Because the name of the source schema is generated from a hash of the data that's
supposed to be in the schema, concurrent runs would target the same schema. Creating and populating
the persistent source schema should therefore not be done concurrently, as there are race
conditions when creating tables and inserting data. |
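
The glossary entry says the schema name is derived from a hash of the data the schema should contain. One way this could work is sketched below in Python. It is an assumption for illustration, not the repository's implementation: the function `persistent_source_schema_name`, the `mf_test_src` prefix, the SHA-256 digest, and the 12-character truncation are all hypothetical choices.

```python
import hashlib
import json
from typing import Dict, List, Sequence


def persistent_source_schema_name(
    tables: Dict[str, List[Sequence[object]]],
    prefix: str = "mf_test_src",  # hypothetical prefix, not from the repo
) -> str:
    """Derive a schema name from the table data (illustrative sketch only)."""
    # Serialize deterministically: table names are sorted by json.dumps,
    # rows keep their given order, and unknown types fall back to str().
    canonical = json.dumps(
        {name: [list(row) for row in rows] for name, rows in tables.items()},
        sort_keys=True,
        default=str,
    )
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    # Truncate so the result stays a short identifier (assumed length).
    return f"{prefix}_{digest[:12]}"
```

Because identical data always maps to the same name, two concurrent populate runs would write to the same schema, which is the situation the workflow's concurrency group guards against.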