Commit a9f1e14: Add and configure pre-commit (#174)

mikealfare authored Apr 19, 2024
1 parent 4774638 commit a9f1e14
Showing 24 changed files with 152 additions and 216 deletions.
6 changes: 6 additions & 0 deletions .changes/unreleased/Under the Hood-20240417-192843.yaml
@@ -0,0 +1,6 @@
+kind: Under the Hood
+body: Validate that dbt-core and dbt-adapters remain de-coupled
+time: 2024-04-17T19:28:43.400023-04:00
+custom:
+  Author: mikealfare
+  Issue: "144"
6 changes: 3 additions & 3 deletions .github/ISSUE_TEMPLATE/internal-epic.yml
@@ -30,7 +30,7 @@ body:
label: Objectives
description: |
What are the high level goals we are trying to achieve? Provide use cases if available.
Example:
- [ ] Allow adapter maintainers to support custom materializations
- [ ] Reduce maintenance burden for incremental users by offering materialized views
@@ -48,7 +48,7 @@ body:
Provide a list of GH issues that will build out this functionality.
This may start empty, or as a checklist of items.
However, it should eventually become a list of Feature Implementation tickets.
Example:
- [ ] Create new macro to select warehouse
- [ ] https://github.com/dbt-labs/dbt-adapters/issues/42
@@ -66,7 +66,7 @@ body:
Provide a list of relevant documentation. Is there a proof of concept?
Does this require any RFCs, ADRs, etc.?
If the documentation exists, link it; if it does not exist yet, reference it descriptively.
Example:
- [ ] RFC for updating connection interface to accept new parameters
- [ ] POC: https://github.com/dbt-labs/dbt-adapters/pull/42
6 changes: 3 additions & 3 deletions .github/ISSUE_TEMPLATE/internal-feature-implementation.yml
@@ -41,7 +41,7 @@ body:
label: Acceptance criteria
description: |
What is the definition of done for this feature? Include any relevant edge cases and/or test cases.
Example:
- [ ] If there are no config changes, don't alter the materialized view
- [ ] If the materialized view is scheduled to refresh, a manual refresh should not be issued
@@ -58,7 +58,7 @@
description: |
Provide scenarios to test. Include both positive and negative tests if possible.
Link to existing similar tests if appropriate.
Example:
- [ ] Test with no `materialized` field in the model config. Expect pass.
- [ ] Test with a `materialized` field in the model config that is not valid. Expect ConfigError.
@@ -68,7 +68,7 @@
```
validations:
required: true

- type: textarea
attributes:
label: Security
8 changes: 8 additions & 0 deletions .github/actions/setup-hatch/action.yml
@@ -18,5 +18,13 @@ runs:
       python-version: ${{ inputs.python-version }}

   - name: Install dev dependencies
+    shell: bash
     run: ${{ inputs.setup-command }}

+  - name: Add brew to the PATH
+    shell: bash
+    run: echo "/home/linuxbrew/.linuxbrew/bin:/home/linuxbrew/.linuxbrew/sbin" >> $GITHUB_PATH
+
+  - name: Install pre-commit
+    shell: bash
+    run: brew install pre-commit
14 changes: 4 additions & 10 deletions .github/workflows/code-quality.yml
@@ -10,17 +10,13 @@ on:

 permissions: read-all

-defaults:
-  run:
-    shell: bash
-
 # will cancel previous workflows triggered by the same event and for the same ref for PRs or same SHA otherwise
 concurrency:
   group: ${{ github.workflow }}-${{ github.event_name }}-${{ contains(github.event_name, 'pull_request') && github.event.pull_request.head.ref || github.sha }}
   cancel-in-progress: true

 jobs:
-  lint:
+  code-quality:
     name: Code Quality
     runs-on: ubuntu-latest

@@ -33,8 +29,6 @@ jobs:
     - name: Setup `hatch`
       uses: ./.github/actions/setup-hatch

-    - name: Run linters
-      run: hatch run lint:all
-
-    - name: Run typechecks
-      run: hatch run typecheck:all
+    - name: Run code quality
+      shell: bash
+      run: hatch run code-quality
2 changes: 1 addition & 1 deletion .github/workflows/github-release.yml
@@ -256,4 +256,4 @@ jobs:
       RELEASE_NOTES: ${{ inputs.changelog_path }}
       COMMIT: ${{ inputs.sha }}
       PRERELEASE: ${{ steps.release_type.outputs.prerelease }}
-      DRAFT: ${{ steps.draft.outputs.draft }}
\ No newline at end of file
+      DRAFT: ${{ steps.draft.outputs.draft }}
2 changes: 1 addition & 1 deletion .github/workflows/release_prep_hatch.yml
@@ -539,4 +539,4 @@ jobs:
     - name: "Remove Temp Branch - ${{ needs.create-temp-branch.outputs.branch_name }}"
       if: ${{ inputs.deploy_to == 'prod' && inputs.nightly_release == 'false' && needs.create-temp-branch.outputs.branch_name != '' }}
       run: |
-        git push origin -d ${{ needs.create-temp-branch.outputs.branch_name }}
\ No newline at end of file
+        git push origin -d ${{ needs.create-temp-branch.outputs.branch_name }}
2 changes: 1 addition & 1 deletion .gitignore
@@ -153,4 +153,4 @@ dmypy.json
 cython_debug/

 # PyCharm
-.idea/
\ No newline at end of file
+.idea/
112 changes: 53 additions & 59 deletions .pre-commit-config.yaml
@@ -1,63 +1,57 @@
 # For more on configuring pre-commit hooks (see https://pre-commit.com/)

 # Force all unspecified python hooks to run python 3.8
 default_language_version:
-    python: python3
+    python: python3

 repos:
-- repo: https://github.com/pre-commit/pre-commit-hooks
-  rev: v4.4.0
-  hooks:
-  - id: check-yaml
-    args: [--unsafe]
-  - id: check-json
-  - id: end-of-file-fixer
-  - id: trailing-whitespace
-  - id: check-case-conflict
-- repo: https://github.com/psf/black
-  rev: 23.1.0
-  hooks:
-  - id: black
-    additional_dependencies: ['click~=8.1']
-    args:
-    - "--line-length=99"
-    - "--target-version=py38"
-  - id: black
-    alias: black-check
-    stages: [manual]
-    additional_dependencies: ['click~=8.1']
-    args:
-    - "--line-length=99"
-    - "--target-version=py38"
-    - "--check"
-    - "--diff"
-- repo: https://github.com/pycqa/flake8
-  rev: 6.0.0
-  hooks:
-  - id: flake8
-  - id: flake8
-    alias: flake8-check
-    stages: [manual]
-- repo: https://github.com/pre-commit/mirrors-mypy
-  rev: v1.1.1
-  hooks:
-  - id: mypy
-    # N.B.: Mypy is... a bit fragile.
-    #
-    # By using `language: system` we run this hook in the local
-    # environment instead of a pre-commit isolated one. This is needed
-    # to ensure mypy correctly parses the project.
-
-    # It may cause trouble in that it adds environmental variables out
-    # of our control to the mix. Unfortunately, there's nothing we can
-    # do about per pre-commit's author.
-    # See https://github.com/pre-commit/pre-commit/issues/730 for details.
-    args: [--show-error-codes, --ignore-missing-imports, --explicit-package-bases]
-    files: ^dbt/adapters/.*
-    language: system
-  - id: mypy
-    alias: mypy-check
-    stages: [manual]
-    args: [--show-error-codes, --pretty, --ignore-missing-imports, --explicit-package-bases]
-    files: ^dbt/adapters
-    language: system
+- repo: https://github.com/pre-commit/pre-commit-hooks
+  rev: v4.6.0
+  hooks:
+  - id: check-yaml
+    args: [--unsafe]
+  - id: check-json
+  - id: end-of-file-fixer
+  - id: trailing-whitespace
+  - id: check-case-conflict
+
+- repo: https://github.com/dbt-labs/pre-commit-hooks
+  rev: v0.1.0a1
+  hooks:
+  - id: dbt-core-in-adapters-check
+
+- repo: https://github.com/psf/black
+  rev: 24.4.0
+  hooks:
+  - id: black
+    args:
+    - --line-length=99
+    - --target-version=py38
+    - --target-version=py39
+    - --target-version=py310
+    - --target-version=py311
+    - --force-exclude=dbt/adapters/events/adapter_types_pb2.py
+
+- repo: https://github.com/pycqa/flake8
+  rev: 7.0.0
+  hooks:
+  - id: flake8
+    exclude: dbt/adapters/events/adapter_types_pb2.py|tests/functional/
+    args:
+    - --max-line-length=99
+    - --select=E,F,W
+    - --ignore=E203,E501,E704,E741,W503,W504
+    - --per-file-ignores=*/__init__.py:F401
+
+- repo: https://github.com/pre-commit/mirrors-mypy
+  rev: v1.9.0
+  hooks:
+  - id: mypy
+    exclude: dbt/adapters/events/adapter_types_pb2.py|dbt-tests-adapter/dbt/__init__.py
+    args:
+    - --explicit-package-bases
+    - --ignore-missing-imports
+    - --pretty
+    - --show-error-codes
+    files: ^dbt/adapters/
+    additional_dependencies:
+    - types-PyYAML
+    - types-protobuf
+    - types-pytz
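With this configuration in place, a contributor can exercise the hooks locally. Below is a minimal sketch (not part of this commit; it assumes `pre-commit` is on the `PATH`, as the updated `setup-hatch` action arranges) of the two standard invocations, wrapped in Python for consistency with the rest of the codebase:

```python
# Sketch: wire up the git hook, then run every hook from
# .pre-commit-config.yaml against all tracked files once.
import subprocess

# `pre-commit install` installs the hooks into .git/hooks/pre-commit,
# so they run automatically on every `git commit`.
subprocess.run(["pre-commit", "install"], check=True)

# `pre-commit run --all-files` runs check-yaml, black, flake8, mypy, etc.
# against the whole repository rather than only staged files.
subprocess.run(["pre-commit", "run", "--all-files"], check=True)
```

The second command is presumably what `hatch run code-quality` wraps in the updated `code-quality` workflow above.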
2 changes: 1 addition & 1 deletion CONTRIBUTING.md
@@ -163,7 +163,7 @@ Remember to commit and push the file that's created.

 ### Signing the CLA

-> **_NOTE:_** All contributors to `dbt-adapter` must sign the 
+> **_NOTE:_** All contributors to `dbt-adapter` must sign the
 > [Contributor License Agreement](https://docs.getdbt.com/docs/contributor-license-agreements)(CLA).
 Maintainers will be unable to merge contributions until the contributor signs the CLA.
@@ -1,6 +1,7 @@
from pathlib import Path

from dbt_common.exceptions import CompilationError

# TODO: does this belong in dbt-tests-adapter?
from dbt.exceptions import ParsingError
import pytest
@@ -46,16 +46,22 @@ def models(self):
     def test_invalid_input(self, project):
         results = run_dbt(["run"])
         assert len(results) == 2

         _, out = run_dbt_and_capture(
             ["test", "--select", "test_name:test_invalid_input_column_name"], expect_pass=False
         )
-        assert "Invalid column name: 'invalid_column_name' in unit test fixture for 'my_upstream_model'." in out
+        assert (
+            "Invalid column name: 'invalid_column_name' in unit test fixture for 'my_upstream_model'."
+            in out
+        )

         _, out = run_dbt_and_capture(
             ["test", "--select", "test_name:test_invalid_expect_column_name"], expect_pass=False
         )
-        assert "Invalid column name: 'invalid_column_name' in unit test fixture for expected output." in out
+        assert (
+            "Invalid column name: 'invalid_column_name' in unit test fixture for expected output."
+            in out
+        )


 class TestPostgresUnitTestInvalidInput(BaseUnitTestInvalidInput):
1 change: 1 addition & 0 deletions dbt/adapters/__init__.py
@@ -2,6 +2,7 @@
 This adds all subdirectories of directories on `sys.path` to this package’s `__path__` .
 It effectively combines all adapters into a single namespace (dbt.adapter).
 """
+
 from pkgutil import extend_path

 __path__ = extend_path(__path__, __name__)
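As a rough, self-contained sketch of what this shim buys (hypothetical distribution and module names, not part of the commit; it also assumes no real `dbt` package is importable in the environment):

```python
# Two fake "distributions" each ship dbt/adapters/<module>.py plus the same
# extend_path shim shown above; once both are on sys.path, Python resolves
# them as a single dbt.adapters namespace.
import sys
import tempfile
from pathlib import Path

SHIM = "from pkgutil import extend_path\n__path__ = extend_path(__path__, __name__)\n"

root = Path(tempfile.mkdtemp())
for dist, module in [("dist_a", "postgres"), ("dist_b", "snowflake")]:
    adapters = root / dist / "dbt" / "adapters"
    adapters.mkdir(parents=True)
    (adapters.parent / "__init__.py").write_text(SHIM)  # dbt/__init__.py
    (adapters / "__init__.py").write_text(SHIM)  # dbt/adapters/__init__.py
    (adapters / f"{module}.py").write_text(f'name = "{module}"\n')
    sys.path.insert(0, str(root / dist))

# Both modules import even though they live in different directories.
from dbt.adapters import postgres, snowflake

print(postgres.name, snowflake.name)  # -> postgres snowflake
```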
4 changes: 3 additions & 1 deletion dbt/adapters/base/connections.py
@@ -165,7 +165,9 @@ def set_connection_name(self, name: Optional[str] = None) -> Connection:
             conn.handle = LazyHandle(self.open)
             # Add the connection to thread_connections for this thread
             self.set_thread_connection(conn)
-            fire_event(NewConnection(conn_name=conn_name, conn_type=self.TYPE, node_info=get_node_info()))
+            fire_event(
+                NewConnection(conn_name=conn_name, conn_type=self.TYPE, node_info=get_node_info())
+            )
         else:  # existing connection either wasn't open or didn't have the right name
             if conn.state != "open":
                 conn.handle = LazyHandle(self.open)
17 changes: 9 additions & 8 deletions dbt/adapters/base/impl.py
@@ -1328,14 +1328,16 @@ def calculate_freshness_from_metadata_batch(
         # Track schema, identifiers of sources for lookup from batch query
         schema_identifier_to_source = {
             (
-                source.path.get_lowered_part(ComponentName.Schema),
-                source.path.get_lowered_part(ComponentName.Identifier),
+                source.path.get_lowered_part(ComponentName.Schema),  # type: ignore
+                source.path.get_lowered_part(ComponentName.Identifier),  # type: ignore
             ): source
             for source in sources
         }

         # Group metadata sources by information schema -- one query per information schema will be necessary
-        sources_by_info_schema: Dict[InformationSchema, List[BaseRelation]] = self._get_catalog_relations_by_info_schema(sources)
+        sources_by_info_schema: Dict[InformationSchema, List[BaseRelation]] = (
+            self._get_catalog_relations_by_info_schema(sources)
+        )

         freshness_responses: Dict[BaseRelation, FreshnessResponse] = {}
         adapter_responses: List[Optional[AdapterResponse]] = []
@@ -1393,7 +1395,9 @@ def _create_freshness_response(

         return freshness

-    def _parse_freshness_row(self, row: "agate.Row", table: "agate.Table") -> Tuple[Any, FreshnessResponse]:
+    def _parse_freshness_row(
+        self, row: "agate.Row", table: "agate.Table"
+    ) -> Tuple[Any, FreshnessResponse]:
         from dbt_common.clients.agate_helper import get_column_value_uncased

         try:
@@ -1404,10 +1408,7 @@ def _parse_freshness_row(self, row: "agate.Row", table: "agate.Table") -> Tuple[
         except Exception:
             raise MacroResultError(GET_RELATION_LAST_MODIFIED_MACRO_NAME, table)

-        freshness_response = self._create_freshness_response(
-            last_modified_val,
-            snapshotted_at_val
-        )
+        freshness_response = self._create_freshness_response(last_modified_val, snapshotted_at_val)
         raw_relation = schema.lower().strip(), identifier.lower().strip()
         return raw_relation, freshness_response
6 changes: 2 additions & 4 deletions dbt/adapters/contracts/relation.py
@@ -40,11 +40,9 @@ class MaterializationConfig(Mapping, ABC):
     contract: MaterializationContract
     extra: Dict[str, Any]

-    def __contains__(self, item):
-        ...
+    def __contains__(self, item): ...

-    def __delitem__(self, key):
-        ...
+    def __delitem__(self, key): ...


 class RelationConfig(Protocol):
2 changes: 1 addition & 1 deletion dbt/adapters/events/README.md
@@ -14,7 +14,7 @@ When events are processed via `fire_event`, nearly everything is logged. Whether

 We have switched from using betterproto to using google protobuf, because of a lack of support for Struct fields in betterproto.

-The google protobuf interface is janky and very much non-Pythonic. The "generated" classes in types_pb2.py do not resemble regular Python classes. They do not have normal constructors; they can only be constructed empty. They can be "filled" by setting fields individually or using a json_format method like ParseDict. We have wrapped the logging events with a class (in types.py) which allows using a constructor -- keywords only, no positional parameters.
+The google protobuf interface is janky and very much non-Pythonic. The "generated" classes in types_pb2.py do not resemble regular Python classes. They do not have normal constructors; they can only be constructed empty. They can be "filled" by setting fields individually or using a json_format method like ParseDict. We have wrapped the logging events with a class (in types.py) which allows using a constructor -- keywords only, no positional parameters.

 ## Required for Every Event
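The "constructed empty, then filled" pattern that README paragraph describes looks roughly like this. It is an illustration only, not part of this diff, and uses the `Struct` type bundled with `protobuf` as a stand-in for the generated event classes in `types_pb2.py`:

```python
# Generated protobuf messages start out empty and are "filled" afterwards --
# here via the json_format helper ParseDict, as the README mentions.
from google.protobuf.json_format import MessageToDict, ParseDict
from google.protobuf.struct_pb2 import Struct

msg = Struct()  # start from an empty message
ParseDict({"conn_name": "master", "conn_type": "postgres"}, msg)  # fill from a dict
print(MessageToDict(msg))  # {'conn_name': 'master', 'conn_type': 'postgres'}
```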