feat: add pre-commit (#4)
* feat: add pre-commit
* fix: auto formatter
* feat: add ci workflow for tests
leomaurodesenv authored Nov 2, 2023
1 parent 0624be9 commit 38daa5f
Showing 6 changed files with 50 additions and 6 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/changelog.yml
@@ -9,7 +9,7 @@ jobs:
     runs-on: ubuntu-latest

     permissions:
-      # Give the default GITHUB_TOKEN write permission to commit and push the
+      # Give the default GITHUB_TOKEN write permission to commit and push the
       # updated CHANGELOG back to the repository.
       # https://github.blog/changelog/2023-02-02-github-actions-updating-the-default-github_token-permissions-to-read-only/
       contents: write
22 changes: 22 additions & 0 deletions .github/workflows/continuous-integration.yml
@@ -0,0 +1,22 @@
+name: "Continuous Integration"
+
+run-name: Running tests on "${{ github.ref }}" by "${{ github.actor }}"
+
+on:
+  push:
+    # Ignore following branches
+    branches-ignore:
+      - "dev/*"
+
+jobs:
+  # Run pre-commit hooks
+  pre-commit:
+    runs-on: ubuntu-22.04
+    steps:
+      - uses: actions/checkout@v3
+      - uses: actions/setup-python@v4
+        with:
+          python-version: "3.10"
+          cache: "pip" # caching pip dependencies
+      - run: pip install -r requirements.txt
+      - run: pre-commit run --all-files
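
As a rough illustration of how the `branches-ignore` filter in this workflow decides which pushes skip the run, the sketch below approximates GitHub's branch glob matching in Python. It assumes the documented rule that `*` does not match `/` while `**` does; the helper names `glob_to_regex` and `is_ignored` are hypothetical, and this is not GitHub's actual implementation.

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a simplified GitHub-style branch glob into a regex.

    Only `*` and `**` are handled; other special characters
    (`?`, `+`, `[]`, `!`) are treated literally in this sketch.
    """
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            parts.append(".*")  # `**` matches any characters, including `/`
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")  # `*` matches any characters except `/`
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def is_ignored(branch: str, patterns: list[str]) -> bool:
    """Return True if the branch name fully matches any ignore pattern."""
    return any(glob_to_regex(p).fullmatch(branch) for p in patterns)


IGNORED = ["dev/*"]  # same pattern as in the workflow
```

Under these assumptions, a push to `dev/experiment` would be skipped, while pushes to `main` or to a nested branch such as `dev/a/b` would still trigger the job.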
13 changes: 13 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,13 @@
+repos:
+  - repo: https://github.com/pre-commit/pre-commit-hooks
+    rev: v2.3.0
+    hooks:
+      - id: check-yaml
+      - id: check-added-large-files
+      - id: check-docstring-first
+      - id: end-of-file-fixer
+      - id: trailing-whitespace
+  - repo: https://github.com/psf/black
+    rev: 22.10.0
+    hooks:
+      - id: black
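
To give a feel for what the `trailing-whitespace` and `end-of-file-fixer` hooks do to the files in this commit, here is a simplified Python approximation. The function name `fix_whitespace` is hypothetical, and the real hooks handle more cases (binary files, Markdown line breaks, configurable characters), so treat this as a sketch rather than the hooks' implementation.

```python
def fix_whitespace(text: str) -> str:
    """Approximate trailing-whitespace plus end-of-file-fixer on a text file."""
    if not text:
        return text  # empty files are left alone
    # Strip trailing spaces and tabs from every line.
    # Note: splitlines() also normalizes line endings to "\n" in this sketch.
    lines = [line.rstrip(" \t") for line in text.splitlines()]
    # Drop blank lines at the end of the file.
    while lines and lines[-1] == "":
        lines.pop()
    # Terminate with exactly one newline, unless nothing is left.
    return "\n".join(lines) + "\n" if lines else ""
```

Applied to a file whose last line lacks a newline, or to a comment line with a stray trailing space, this produces the kind of whitespace-only changes seen elsewhere in this diff.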
1 change: 1 addition & 0 deletions README.md
@@ -2,5 +2,6 @@
 This is a learning repository about DVC Data Version Control and Luigi Pipelines

 - luigi, dvc, pre-commit
+- setup https://pre-commit.com/
 - setup https://github.com/Kaggle/kaggle-api
 - `kaggle competitions download -c sentiment-analysis-on-movie-reviews -p data`
3 changes: 2 additions & 1 deletion requirements.txt
@@ -1,3 +1,4 @@
+pre-commit==3.5.0
 kaggle==1.5.16
 dvc==3.28.0
-luigi==3.4.0
+luigi==3.4.0
15 changes: 11 additions & 4 deletions source/get_raw_data.py
@@ -2,19 +2,26 @@
 import luigi
 import zipfile

+
 class ExtractRawData(luigi.Task):
-    data_path = luigi.Parameter(default="../data/sentiment-analysis-on-movie-reviews.zip")
+    """
+    Extract raw data from zip file
+    """
+
+    data_path = luigi.Parameter(
+        default="../data/sentiment-analysis-on-movie-reviews.zip"
+    )

     def output(self):
         return {
-            "test": luigi.LocalTarget('../data/output/test.tsv.zip'),
-            "train": luigi.LocalTarget('../data/output/train.tsv.zip'),
+            "test": luigi.LocalTarget("../data/output/test.tsv.zip"),
+            "train": luigi.LocalTarget("../data/output/train.tsv.zip"),
         }

     def run(self):
         # Check if data file exists
         assert os.path.exists(self.data_path)

         # Unzip data file
-        with zipfile.ZipFile(self.data_path, 'r') as zip_ref:
+        with zipfile.ZipFile(self.data_path, "r") as zip_ref:
             zip_ref.extractall("../data/output/")
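
The check-then-unzip logic in `run` can be tried outside Luigi with a small standalone sketch like the one below. The names `extract_raw_data` and `demo` are hypothetical and not part of this commit, and the sketch raises `FileNotFoundError` instead of using `assert`, which is one possible variation rather than what the task itself does.

```python
import os
import tempfile
import zipfile


def extract_raw_data(zip_path: str, output_dir: str) -> list[str]:
    """Unzip zip_path into output_dir and return the archive member names."""
    # Raise explicitly: an assert would be skipped when Python runs with -O.
    if not os.path.exists(zip_path):
        raise FileNotFoundError(f"Data file not found: {zip_path}")
    with zipfile.ZipFile(zip_path, "r") as zip_ref:
        zip_ref.extractall(output_dir)
        return zip_ref.namelist()


def demo() -> list[str]:
    """Build a throwaway archive and extract it, so the sketch runs anywhere."""
    with tempfile.TemporaryDirectory() as tmp:
        archive = os.path.join(tmp, "raw.zip")
        # Placeholder members standing in for the real competition files.
        with zipfile.ZipFile(archive, "w") as zf:
            zf.writestr("train.tsv.zip", "placeholder")
            zf.writestr("test.tsv.zip", "placeholder")
        out_dir = os.path.join(tmp, "output")
        names = extract_raw_data(archive, out_dir)
        # Keep only members that actually exist on disk after extraction.
        return sorted(n for n in names if os.path.exists(os.path.join(out_dir, n)))
```

Calling `demo()` exercises the same two steps as the task (existence check, then `extractall`) against a temporary archive, without touching `../data/`.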
