From 4fff2d1bea2e34aa71d08ca554b1f03b03133d3e Mon Sep 17 00:00:00 2001
From: Aldrian Harjati
Date: Wed, 4 Oct 2023 16:14:26 -0400
Subject: [PATCH] clean up markup warnings

---
 README.md | 54 +++++++++++++++++++++++++++++++-----------------------
 1 file changed, 31 insertions(+), 23 deletions(-)

diff --git a/README.md b/README.md
index 3a778ca5..1b3482b9 100644
--- a/README.md
+++ b/README.md
@@ -3,11 +3,13 @@
 This is a RegTech submission data parser and validator which makes use of Pandera. You can read about Pandera schemas [here](https://pandera.readthedocs.io/en/stable/dataframe_schemas.html).
 ## Pre-requisites
+
 Poetry is used as the package management tool. Once installed, just running poetry install in the root of the project should install all the dependencies needed by the app.
 Docker is used for local development where ancillary services will run.
 (Optional) Visual Studio Code for local development.
 ## Dependencies
+
 All packages and libraries used in this repository can be found in`pyproject.toml`
 ## Dev Container Setup
@@ -17,20 +19,23 @@ The code in this repository is developed and run inside of a dev container withi
 Open this repository within VS Code and press `COMMAND + SHIFT + p` on your keyboard. This will open the command bar at the top of your window. Enter `Dev Containers: Rebuild and Reopen in Container`. VS Code will open a new window and you'll see a status message towards the bottom right of your screen that the container is building and attaching. This will take a few minutes the first time because Docker needs to build the container without a build cache. You may receive a notification that VS Code wants to perform a reload because some extensions could not load. Sometimes this happens because extensions are loaded in conflicting orders and dependencies are not satisfied.
 ## Development
-There are few files in `src/validator` that will be of interest.
-- `checks.py` defines custom Pandera Check class called `SBLCheck`.
+
+There are few files in `src/validator` that will be of interest.
+
+- `checks.py` defines custom Pandera Check class called `SBLCheck`.
 - `global_data.py` defines functions to parse NAICS and GEOIDs.
 - `phase_validations.py` defines phase 1 and phase 2 Pandera schema/checks used for validating the SBLAR data.
 - `check_functions.py` contains a collection of functions to be run against the data that are a bit too complex to be implemented directly within the schema as Lambda functions.
 - Lastly, the file `main.py` pulls everything together and illustrates how the schema can catch the various validation errors present in our mock, invalid dataset and different LEI values.
 ## Test data
+
 - The repo includes unit tests that can be executed using `pytest`. These tests can be located under `src/tests`.
 - The repo also includes 2 test datasets for manual testing, one with all valid data, and one where each line represents a different failed validation, or different permutation of of the same failed validation.
   - [`sbl-validations-pass.csv`](src/tests/data/sbl-validations-pass.csv)
   - [`sbl-validations-fail.csv`](src/tests/data/sbl-validations-fail.csv)
-## Development
+## Development Process and Standard
 Development Process
 Below are the steps the development team follows to fix issues, develop new features, etc.
@@ -43,24 +48,26 @@ Below are the steps the development team follows to fix issues, develop new feat
 6. The PR is merged only by a separate member in the dev team
 Development standard practice
-* Check functions should focus on reuse.
-  * Most of the validations share logic with other validations.
-* Avoid using lambdas for Check functions.
-  * They do not promote reuse.
-  * They are harder to debug.
-  * They are harder to test.
-* Check function signatures should reflect the functionality.
-* Check functions should have corresponding unit tests.
-  * [Unit Test](./src/tests/test_check_functions.py)
-* Check definitions' name should be set to validation ID.
-  * Example: "denial_reasons. enum_value_conflict"
+
+- Check functions should focus on reuse.
+  - Most of the validations share logic with other validations.
+- Avoid using lambdas for Check functions.
+  - They do not promote reuse.
+  - They are harder to debug.
+  - They are harder to test.
+- Check function signatures should reflect the functionality.
+- Check functions should have corresponding unit tests.
+  - [Unit Test](./src/tests/test_check_functions.py)
+- Check definitions' name should be set to validation ID.
+  - Example: "denial_reasons. enum_value_conflict"
 ![Validation ID](validation_id.png)
 ## Running Validator
-`main.py` allows user to test csv file with and without LEI number.
-# Using VSCode
+`main.py` allows user to test csv file with and without LEI number.
+
 If using VS Code, validator can be executed by running `main.py` within a Dev Container. To run `main.py`, you can run these commands in VSCode terminal.
+
 ```sh
 # Test validating the "good" file
 # If passing lei value, pass lei as first arg and csv_path as second arg
@@ -74,8 +81,8 @@ python src/validator/main.py 000TESTFIUIDDONOTUSE src/tests/data/sbl-validations
 python src/validator/main.py src/tests/data/sbl-validations-fail.csv
 ```
-# Using terminal
 If using terminal, you can utilize `poetry` to run the code. To run `main.py`, you can use these commands.
+
 ```sh
 # Test validating the "good" file
 # If passing lei value, pass lei as first arg and csv_path as second arg
@@ -89,24 +96,25 @@ poetry run python src/validator/main.py 000TESTFIUIDDONOTUSE src/tests/data/sbl-
 poetry run python src/validator/main.py src/tests/data/sbl-validations-fail.csv
 ```
-
 ## Running Test
-This repository is using `pytest`.
-# Using VSCode
-If using VS Code, tests can be completed within a Dev Container.
+This repository is using `pytest`. If using VS Code, tests can be completed within a Dev Container.
If using local terminal or console, you can use this command `poetry run pytest` in the root directory
-# Using terminal
-To run `pytest`, you can use this command `poetry run pytest` in the root directory
+```sh
+poetry run pytest
+```
 ## Coverage
+
 [![Coverage badge](https://github.com/cfpb/regtech-data-validator/raw/python-coverage-comment-action-data/badge.svg)](https://github.com/cfpb/regtech-data-validator/tree/python-coverage-comment-action-data)
 Complete coverage details can be found under [`python-coverage-comment-action-data` branch](https://github.com/cfpb/regtech-data-validator/tree/python-coverage-comment-action-data)
 ## Contributing
+
 [CFPB](https://www.consumerfinance.gov/) is developing the `RegTech Data Validator` in the open to maximize transparency and encourage third party contributions. If you want to contribute, please read and abide by the terms of the [License](./LICENSE) for this project. Pull Requests are always welcome.
 ## Open source licensing info
+
 1. [TERMS](./TERMS.md)
 1. [LICENSE](./LICENSE)
 1. [CFPB Source Code Policy](https://github.com/cfpb/source-code-policy/)
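
The "Development standard practice" list in the README touched by this patch (reusable, named check functions instead of lambdas, each with a unit test, and a check name set to the validation ID) can be illustrated with a minimal, library-free sketch. Everything in it is hypothetical: `has_valid_enum_values`, `run_check`, the validation ID, and the sample codes are invented for illustration and are not taken from `check_functions.py`. A plain dict stands in for what would presumably be a Pandera check in the repository.

```python
from typing import Iterable


def has_valid_enum_values(
    value: str, accepted_values: Iterable[str], separator: str = ";"
) -> bool:
    """Return True if every separator-delimited item in `value` is accepted.

    Hypothetical helper: the name and signature are illustrative only.
    """
    accepted = set(accepted_values)
    items = [item.strip() for item in value.split(separator)]
    return all(item in accepted for item in items)


# A check definition pairs a validation ID (used as the check name) with the
# reusable function. The ID and the accepted codes below are made up.
example_check = {
    "name": "example_field.invalid_enum_value",
    "function": has_valid_enum_values,
    "kwargs": {"accepted_values": ["1", "2", "3", "977"]},
}


def run_check(check: dict, value: str) -> bool:
    """Apply a check definition to a single field value."""
    return check["function"](value, **check["kwargs"])


# Because the function is named rather than a lambda, it can be unit tested
# on its own and reused by other check definitions.
assert has_valid_enum_values("1;2", ["1", "2", "3"])
assert not has_valid_enum_values("1;9", ["1", "2", "3"])
assert run_check(example_check, "2;977")
```

The same function could back several check definitions by varying only `accepted_values`, which is the kind of reuse the list asks for and which an inline lambda would not offer.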