623 ci with cml runtimes (#118)
* Build pbj-workbench-python3.9-standard.Dockerfile and test

* Check files

* Build full path instead of cd

* Test whole process with 3.8

* Update push branch for testing

* Navigate to project folder

* Add ls for debugging

* docker run -t cml_3.9

* Run docker in detached mode

* Run docker in -i mode

* Split process to many steps

* Add matrix with cml versions

* Typo matrix

* Typo matrix

* Typo matrix

* Use string type for versions

* continue-on-error: true

* Tidy up workflow

Break steps
Add 3.11 runtime
Change name tags

* Set wd in pytest

* Add pre commit hooks

* Use checkout v3

* Move files around with docker cp

* Fix typo

* Use checkouts

* Copy files to container

* Mount parent volume

* Create container after checkout

* Trigger on pull requests in main

* 586 sic sut mapping (#113)

* Creating mapping validation function

* Docstrings

* Update docstring and leave unmatched as set

* Update to raise warning and create wrapper to test multiple mapping files in one go

* Add test that passes when a warning is raised

* update docstring to ask that mapping file be a folder and not a file

* Correct design and calibration values (#108)

* Add reusable function
* Add test data
* Add unit test
* Add TODO to use this function in other parts of the pipeline

* 632 module restructure (#116)

* move files to correct location

* change relevant module imports

* Move files to appropriate folders, rename folders

* Move all data to equivalent test folder structure

* Update estimation test paths and imports

* Update imputation test paths and imports

* Update outlier detection test paths and imports

* Update outputs test paths and imports

* Update utilities test paths and imports

* Remove tests/imputation/test_pivot_imputation_value.py

* Run hooks

* Add tests tree into tests readme

* Run hooks

* Remove duplicated test data

* These were copied instead of moved, hence duplicated

* these files aren't needed and covered by other tests

* update tree

---------

Co-authored-by: Wil Roberts <[email protected]>

* Use list; passing a dict is not supported in pandas 2.1.4

* Enforce constrain marker str type in tests

* Create separate job for pre commit hooks

* Use legacy job for hooks

* Run hooks

* Pre-commit: migrate config, use python instead of python3

* Use python 3.9 for hooks

* Use 3.10 for hooks

* Use python 3.10.13

---------

Co-authored-by: Jordan-Day-ONS <[email protected]>
Co-authored-by: Wil Roberts <[email protected]>
3 people authored Oct 29, 2024
1 parent 576c5ae commit 9621804
Showing 10 changed files with 53 additions and 66 deletions.
69 changes: 25 additions & 44 deletions .github/workflows/main.yaml
@@ -1,61 +1,19 @@
name: Build and run tests
name: cml_runtimes

# Controls when the action will run.
on:
# Triggers the workflow on push events for the main branch
push:
branches: [ main ]

# Triggers the workflow on pull requests to main branch
pull_request:
branches: [ main ]

# Allows you to run this workflow manually from the Actions tab
workflow_dispatch:

jobs:
build:
runs-on: ubuntu-20.04
steps:
- uses: actions/checkout@v3

- name: Set up Python 3.6.8
uses: actions/setup-python@v3
with:
python-version: 3.6.8

- name: Check package build
run: |
python -m pip install --upgrade pip
test:
runs-on: ubuntu-20.04
steps:
# Checks-out your repository under $GITHUB_WORKSPACE
- uses: actions/checkout@v3

- uses: actions/setup-python@v3
with:
python-version: 3.6.8
cache: 'pip'

- name: Install Python dependencies
run: |
python -m pip install --upgrade pip
pip install .[dev]
- name: Run pytest
run: |
pytest -v
commit-hooks:
runs-on: ubuntu-20.04
steps:
- uses: actions/checkout@v3

- uses: actions/setup-python@v3
with:
python-version: 3.6.8
python-version: 3.10.13
cache: 'pip'

- name: Install Python dependencies
@@ -66,3 +24,26 @@ jobs:
 - name: Check commit hooks
 run: |
 pre-commit run --all-files
+testing-cml:
+runs-on: ubuntu-latest
+strategy:
+matrix:
+cml_version: ["3.8", "3.9", "3.10","3.11"]
+steps:
+- name: checkout ml-runtimes #https://github.com/cloudera/ml-runtimes
+uses: actions/checkout@master
+with:
+repository: cloudera/ml-runtimes
+- name: build runtime cml_${{matrix.cml_version}}
+run: docker build -t cml:${{matrix.cml_version}} -f 'pbj-workbench-python${{matrix.cml_version}}-standard.Dockerfile' .
+- name: checkout to repository
+uses: actions/checkout@v3
+- name: create container
+run: docker run -id --name container_${{matrix.cml_version}} -v"$(pwd)"://home/cdsw cml:${{matrix.cml_version}}
+- name: build in dev mode
+run: docker exec container_${{matrix.cml_version}} pip install ."[dev]"
+- name: check env
+run: docker exec container_${{matrix.cml_version}} pip list
+- name: test
+run: docker exec container_${{matrix.cml_version}} pytest
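
As a rough illustration of how the `cml_version` matrix in the `testing-cml` job expands, the Python sketch below rebuilds the shell commands of that job for one matrix entry. The helper name `cml_commands` is hypothetical and not part of the repository; the command strings are copied from the workflow steps. It also hints at why the versions are quoted (see the "Use string type for versions" commit): an unquoted `3.10` is read as a number by YAML, and the same collapse to `3.1` is easy to reproduce in Python.

```python
# Hypothetical helper (not in the repository): mirrors the `run:` lines of the
# testing-cml job for a single matrix entry.
def cml_commands(version):
    image = f"cml:{version}"
    container = f"container_{version}"
    dockerfile = f"pbj-workbench-python{version}-standard.Dockerfile"
    return [
        f"docker build -t {image} -f '{dockerfile}' .",
        f'docker run -id --name {container} -v"$(pwd)"://home/cdsw {image}',
        f'docker exec {container} pip install ."[dev]"',
        f"docker exec {container} pip list",
        f"docker exec {container} pytest",
    ]


# Versions are kept as strings, as in the workflow matrix.
CML_VERSIONS = ["3.8", "3.9", "3.10", "3.11"]

# A float loses the trailing zero, so 3.10 would silently become "3.1".
float_as_text = str(3.10)

if __name__ == "__main__":
    for version in CML_VERSIONS:
        print("\n".join(cml_commands(version)))
```

Each matrix entry gets its own image tag and container name, so the four runtimes do not collide even though they share the same step definitions.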
18 changes: 9 additions & 9 deletions .pre-commit-config.yaml
@@ -9,7 +9,7 @@ repos:
 entry: pre-commits/check_added_large_files.py
 name: Check for files larger than 5 MB
 language: script
-stages: [commit]
+stages: [pre-commit]
 args: [ "--maxkb=5120" ]

 #works
@@ -19,7 +19,7 @@ repos:
 entry: pre-commits/end_of_line_fixer.py
 name: Check for a blank line at the end of scripts (auto-fixes)
 language: script
-stages: [commit]
+stages: [pre-commit]

 #works
 - repo: local
@@ -28,7 +28,7 @@ repos:
 entry: pre-commits/remove_whitespace.py
 name: Check for trailing whitespaces (auto-fixes)
 language: script
-stages: [commit]
+stages: [pre-commit]

 #works
 - repo: local
@@ -37,7 +37,7 @@ repos:
 entry: pre-commits/mixed_line_endings.py
 name: Check for consistent end of line type LF to CRLF to CR (auto-fixes)
 language: script
-stages: [commit]
+stages: [pre-commit]

 #works
 #if using on different file types, it will need a seperate hook per file type
@@ -48,7 +48,7 @@ repos:
 name: isort - Sort Python imports (auto-fixes)
 language: system
 types: [python]
-stages: [commit]
+stages: [pre-commit]
 args: [ "--profile", "black", "--filter-files" ]

 #works
@@ -58,7 +58,7 @@ repos:
 entry: nbstripout
 name: nbstripout - Strip outputs from notebooks (auto-fixes)
 language: system
-stages: [commit]
+stages: [pre-commit]
 # args:
 # - --extra-keys
 # - "metadata.colab metadata.kernelspec cell.metadata.colab cell.metadata.executionInfo cell.metadata.id cell.metadata.outputId"
@@ -71,7 +71,7 @@ repos:
 name: black - consistent Python code formatting (auto-fixes)
 language: system
 types: [python]
-stages: [commit]
+stages: [pre-commit]
 args: ["--verbose"]
 exclude: ^playground/

@@ -83,7 +83,7 @@ repos:
 name: flake8 - Python linting
 language: system
 types: [python]
-stages: [commit]
+stages: [pre-commit]


 # works in testing
@@ -96,7 +96,7 @@ repos:
 #args: [scan, audit]
 language: system
 types: [python]
-stages: [commit]
+stages: [pre-commit]



12 changes: 7 additions & 5 deletions mbs_results/staging/data_cleaning.py
@@ -66,8 +66,10 @@ def clean_and_merge(
     responses = pd.DataFrame(snapshot["responses"])

     responses = filter_responses(responses, reference, period, "lastupdateddate")
-    responses = responses[responses_keep_cols].set_index([reference, period])
-    contributors = contributors[contributors_keep_cols].set_index([reference, period])
+    responses = responses[list(responses_keep_cols)].set_index([reference, period])
+    contributors = contributors[list(contributors_keep_cols)].set_index(
+        [reference, period]
+    )

     validate_indices(responses, contributors)
     return responses.merge(contributors, on=[reference, period])
@@ -439,8 +441,8 @@ def correct_values(
     # Update value only if columns exist
     if set(check_columns).issubset(df.columns):

-        df_temp.loc[
-            df[condition_column].isin(condition_values), columns_to_correct
-        ] = replace_with
+        df_temp.loc[df[condition_column].isin(condition_values), columns_to_correct] = (
+            replace_with
+        )

     return df_temp
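
The `list(...)` change in `clean_and_merge` can be illustrated with a minimal sketch. It assumes, going by the commit message "passing dict not support in pandas 2.1.4", that `responses_keep_cols` is a dict keyed by column name; the dict contents and the toy frame below are invented for illustration and do not come from the repository. Calling `list()` on a dict yields its keys in insertion order, which gives pandas a plain list indexer.

```python
import pandas as pd

# Hypothetical stand-in for responses_keep_cols: column name -> dtype label.
keep_cols = {"reference": "int", "period": "int", "value": "float"}

# Toy data; the "extra" column is deliberately not listed in keep_cols.
df = pd.DataFrame(
    {
        "reference": [1, 2],
        "period": [202401, 202401],
        "value": [10.0, 20.0],
        "extra": ["a", "b"],
    }
)

# list(keep_cols) -> ["reference", "period", "value"], a list indexer that
# column selection accepts, followed by the same set_index call as in the diff.
selected = df[list(keep_cols)].set_index(["reference", "period"])
```

Converting at the call site keeps the dict available elsewhere (for example for dtype handling) while only the key names are used to select columns.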
6 changes: 3 additions & 3 deletions mbs_results/utilities/constrains.py
@@ -375,9 +375,9 @@ def calculate_derived_outlier_weights(
     )

     updated_o_weight_bool = df_pre_winsorised[winsorised_target].isna()
-    df_pre_winsorised.loc[
-        updated_o_weight_bool, winsorised_target
-    ] = post_win_derived.loc[updated_o_weight_bool, winsorised_target]
+    df_pre_winsorised.loc[updated_o_weight_bool, winsorised_target] = (
+        post_win_derived.loc[updated_o_weight_bool, winsorised_target]
+    )
     df_pre_winsorised["post_wins_marker"] = updated_o_weight_bool

     df_pre_winsorised.reset_index(inplace=True)
2 changes: 1 addition & 1 deletion pre-commits/check_added_large_files.py
@@ -1,4 +1,4 @@
-#!/usr/bin/env python3
+#!/usr/bin/env python
 """Pre commit hook to ensure large files aren't added to repo."""
 import argparse
 import json
2 changes: 1 addition & 1 deletion pre-commits/check_merge_conflict.py
@@ -1,4 +1,4 @@
-#!/usr/bin/env python3
+#!/usr/bin/env python
 """Pre commit hook to check for merge conflict flags in file."""
 import argparse
 import os.path
2 changes: 1 addition & 1 deletion pre-commits/end_of_line_fixer.py
@@ -1,4 +1,4 @@
-#!/usr/bin/env python3
+#!/usr/bin/env python
 """Pre commit hook to ensure single blank line at end of python file."""
 import argparse
 import os
2 changes: 1 addition & 1 deletion pre-commits/mixed_line_endings.py
@@ -1,4 +1,4 @@
-#!/usr/bin/env python3
+#!/usr/bin/env python
 """Pre commit hook to ensure all EOL characters are the same."""
 import argparse
 import collections
2 changes: 1 addition & 1 deletion pre-commits/remove_whitespace.py
@@ -1,4 +1,4 @@
-#!/usr/bin/env python3
+#!/usr/bin/env python
 """Pre commit hook to remove any trailing whitespace."""
 import argparse
 import os
4 changes: 4 additions & 0 deletions tests/utilities/test_constrains.py
@@ -32,6 +32,10 @@ def test_replace_values_index_base(filepath):
     replace_values_index_based(df_in, "target", 49, ">", 40)
     replace_values_index_based(df_in, "target", 90, ">=", 40)

+    # Enforce dtypes, otherwise null==null fails
+    df_in["constrain_marker"] = df_in["constrain_marker"].astype(str)
+    df_expected["constrain_marker"] = df_expected["constrain_marker"].astype(str)
+
     assert_frame_equal(df_in, df_expected)
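
The cast added to this test can be sketched in isolation. The example assumes one plausible cause of the failure, namely that an all-null `constrain_marker` column ends up as `float64` in one frame and `object` in the other; the actual test data is not shown in this commit, so the frames below are invented. Casting both columns with `astype(str)` gives them the same dtype and the same representation of missing values, so `assert_frame_equal` can compare them.

```python
import numpy as np
import pandas as pd
from pandas.testing import assert_frame_equal

# Invented frames: the same nulls, stored under two different dtypes.
df_in = pd.DataFrame(
    {"target": [49.0, 90.0], "constrain_marker": [np.nan, np.nan]}
)
df_expected = pd.DataFrame(
    {
        "target": [49.0, 90.0],
        "constrain_marker": pd.Series([np.nan, np.nan], dtype="object"),
    }
)

# Recorded before the cast: float64 on one side, object on the other.
dtypes_differ_before = (
    df_in["constrain_marker"].dtype != df_expected["constrain_marker"].dtype
)

# Same cast as in the test: both columns become strings.
for frame in (df_in, df_expected):
    frame["constrain_marker"] = frame["constrain_marker"].astype(str)

# Passes once the dtypes agree.
assert_frame_equal(df_in, df_expected)
```

An alternative would be `assert_frame_equal(..., check_dtype=False)`, which relaxes the dtype check for every column; the explicit cast limits the relaxation to `constrain_marker`.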

