58: update readme #59

Merged
merged 11 commits into from
Oct 12, 2023

Conversation

aharjati
Contributor

update readme:

  • Change focus to use Poetry
  • Add more details on Poetry steps (installation, development, and tests)
  • Add more details on VS Code development
  • Add contact/help information

@aharjati aharjati linked an issue Oct 10, 2023 that may be closed by this pull request

github-actions bot commented Oct 10, 2023

Coverage report

The coverage rate went from 93.6% to 93.97% ⬆️
The branch rate is 82%.

0% of new lines are covered.

Diff Coverage details

src/validator/main.py

0% of new lines are covered (0% of the complete file).
Missing lines: 8, 25

@aharjati
Contributor Author

the README preview can be viewed on this branch: https://github.com/cfpb/regtech-data-validator/tree/features/update_README

Member

@hkeeler hkeeler left a comment

@aharjati, thanks for putting all of this together. I've taken a first pass, and put up my initial thoughts. Time permitting, I will try to contribute a bit as well. Some of the re-org bits are difficult to describe via PR review, and may just be easier if I tweak it.

README.md Outdated

## Running the Demo
All packages and libraries used in this repository can be found in `pyproject.toml`
Member

Let's make this a link to pyproject.toml.

Member

Also, I think the poetry install step should be included here.

README.md Outdated
Comment on lines 7 to 9
- Poetry is used as the package management tool.
- (Optional) Visual Studio Code for development.
- (Optional) Docker is needed when using Visual Studio Code / Dev Container.
Member

Suggested change
- Poetry is used as the package management tool.
- (Optional) Visual Studio Code for development.
- (Optional) Docker is needed when using Visual Studio Code / Dev Container.
The following software packages are pre-requisites to installing this software.
- [Python](https://www.python.org/downloads/) version 3.10 or greater.
- [Poetry](https://python-poetry.org/docs/#installation) for Python package management.

I think most users coming to this repo for the first time just want to see it run, not necessarily develop anything. To simplify things for them, I think we should move the following down into a separate Development section.

  • (Optional) Visual Studio Code for development.
  • (Optional) Docker is needed when using Visual Studio Code / Dev Container.

Now, it would be nice to have a Dockerfile dedicated to just running the CLI. That's an even easier setup for those who just want to see it run and don't want to have to know anything about Python, etc. The one in .devcontainer is close, but I don't think it's quite what we'd want. Perhaps that'd be a good follow-up PR.
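The Python pre-requisite in the suggested change above (version 3.10 or greater) can be checked from the interpreter itself. This is a minimal sketch, not part of the repo: `MIN_PYTHON` and `meets_minimum` are hypothetical names introduced only for illustration.

```python
import sys

# Minimum version taken from the suggested pre-requisites (3.10 or greater).
MIN_PYTHON = (3, 10)


def meets_minimum(version_info=sys.version_info, minimum=MIN_PYTHON) -> bool:
    """Return True when the (major, minor) pair is at least the minimum."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "ok" if meets_minimum() else "too old"
    print(f"Python {current}: {status}")
```

Comparing tuples avoids parsing version strings, so a release such as 3.9.18 is correctly treated as older than 3.10.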

README.md Outdated
Comment on lines 25 to 30
## Development Tests

- The repo includes unit tests that can be executed using `pytest` or in Visual Studio Code. These tests can be located under `src/tests`.
- The repo also includes 2 test datasets for manual testing, one with all valid data, and one where each line represents a different failed validation, or different permutation of the same failed validation.
- [`sbl-validations-pass.csv`](src/tests/data/sbl-validations-pass.csv)
- [`sbl-validations-fail.csv`](src/tests/data/sbl-validations-fail.csv)
Member

I think we should separate the test data from the tests themselves documentation-wise. When users first come here, they're probably going to just run the CLI and see what happens. Having the test data there is a nice thing for them to try first before they try their own data.

Test data

This repo includes 2 test datasets, one with all valid data, and one where each line represents a different failed validation, or different permutation of the same failed validation.

We use these test files for automated tests, but they can also be passed in via the CLI for manual testing.

Similarly, I think all detailed testing-related bits should be moved under Development.

  • The repo includes unit tests that can be executed using pytest or in Visual Studio Code. These tests can be located under src/tests.

README.md Outdated
Comment on lines 195 to 211
Performing validation on the following DataFrame.

uid app_date app_method app_recipient ... po_4_race_baa_ff po_4_race_pi_ff po_4_gender_flag po_4_gender_ff
0 20241201 1 1 ...
1 BXUIDXVID11XTC2 20241201 1 1 ...
2 BXUIDXVID11XTC31234567890123456789012345678901 20241201 1 1 ...
3 BXUIDXVID12XTC1abcdef 20241201 1 1 ...
4 000TESTFIUIDDONOTUSEXBXVID13XTC1 20241201 1 1 ...
.. ... ... ... ... ... ... ... ... ...
364 000TESTFIUIDDONOTUSEXBXVIDPODEMO4XTC5 20241201 1 1 ... 988
365 000TESTFIUIDDONOTUSEXBXVIDPODEMO4XTC6 20241201 1 1 ... 988
366 000TESTFIUIDDONOTUSEXBXVIDPODEMO4XTC7 20241201 1 1 ... 988
367 000TESTFIUIDDONOTUSEXBXVIDPODEMO4XTC8 20241201 1 1 ... 988
368 000TESTFIUIDDONOTUSEXBXVIDPODEMO4XTC9 20241201 1 1 ... 988

[369 rows x 81 columns]

Member

Hmmm. We should probably tweak the CLI so it doesn't print the DataFrame by default. I think that's going to be confusing to first-time users unfamiliar with Pandas.

Contributor Author

I changed main.py to remove the DF print and also use pprint to display json content better.

Member

👍 to removing the DF print.

Unfortunately, what we're currently printing isn't actually JSON. It's a string representation of a Python dict.

I think what'd work better would be to have run_validation_on_df return a list of objects that could be rendered how the user chooses. We can start by defaulting to JSON, but it could be useful to render it as some tabular form as well. That's what I was going for on:

Let's not worry about that for this PR, though. I have some other tweaks I'd like to make to the code, and we can do that then.
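To illustrate the distinction raised in this comment, here is a rough sketch of a findings list rendered either as real JSON or as a plain-text table. It is not the repo's implementation: the `Finding` dataclass and the `to_json` and `to_table` helpers are hypothetical names, and it does not show how `run_validation_on_df` actually behaves. The sample values are copied from the CLI output quoted in this review.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Finding:
    """Hypothetical shape for one failed record; not the repo's actual type."""

    validation_id: str
    name: str
    severity: str
    record_number: int
    field: str
    value: str


def to_json(findings: list[Finding]) -> str:
    """Render findings as real JSON (double-quoted keys and strings)."""
    return json.dumps([asdict(f) for f in findings], indent=2)


def to_table(findings: list[Finding]) -> str:
    """Render findings as a plain-text table with padded columns."""
    headers = ["id", "record", "field", "value"]
    rows = [[f.validation_id, str(f.record_number), f.field, f.value] for f in findings]
    # Column width = widest cell in that column, header included.
    widths = [max([len(h)] + [len(r[i]) for r in rows]) for i, h in enumerate(headers)]
    lines = ["  ".join(c.ljust(w) for c, w in zip(row, widths)) for row in [headers] + rows]
    return "\n".join(line.rstrip() for line in lines)


# Sample values taken from the CLI output quoted in this review.
SAMPLE = [
    Finding("E0001", "uid.invalid_text_length", "error", 2, "uid", "BXUIDXVID11XTC2"),
    Finding("E0002", "uid.invalid_text_pattern", "error", 4, "uid", "BXUIDXVID12XTC1abcdef"),
]

if __name__ == "__main__":
    print(str(asdict(SAMPLE[0])))  # Python dict repr: single quotes, not valid JSON
    print(to_json(SAMPLE))
    print(to_table(SAMPLE))
```

Keeping the findings as structured objects until the last step means the choice of JSON, table, or another format stays with the caller, whereas `print` or `pprint` on a dict only ever yields Python's own repr.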

README.md Outdated

[369 rows x 81 columns]

[{'validation': {'id': 'E3000', 'name': 'uid.duplicates_in_dataset', 'description': "Any 'unique identifier' may not be used in more than one record within a small business lending application register.", 'fields': ['uid'], 'severity': 'error'}, 'records': [{'number': 5, 'field_values': {'uid': '000TESTFIUIDDONOTUSEXBXVID13XTC1'}}, {'number': 6, 'field_values': {'uid': '000TESTFIUIDDONOTUSEXBXVID13XTC1'}}]}, {'validation': {'id': 'E0001', 'name': 'uid.invalid_text_length', 'description': "'Unique identifier' must be at least 21 characters in length and at most 45 characters in length.", 'fields': ['uid'], 'severity': 'error'}, 'records': [{'number': 1, 'field_values': {'uid': ''}}, {'number': 2, 'field_values': {'uid': 'BXUIDXVID11XTC2'}}, {'number': 3, 'field_values': {'uid': 'BXUIDXVID11XTC31234567890123456789012345678901'}}]}, {'validation': {'id': 'E0002', 'name': 'uid.invalid_text_pattern', 'description': "'Unique identifier' may contain any combination of numbers and/or uppercase letters (i.e., 0-9 and A-Z), and must not contain any other characters.", 'fields': ['uid'], 'severity': 'error'}, 'records': [{'number': 1, 'field_values': {'uid': ''}}, {'number': 4, 'field_values': {'uid': 'BXUIDXVID12XTC1abcdef'}}]}, {'validation': {'id': 'E0020', 'name': 'app_date.invalid_date_format', 'description': "'Application date' must be a real calendar date using YYYYMMDD format.", 'fields': ['app_date'], 'severity': 'error'}, 'records': [{'number': 8, 'field_values': {'app_date': ''}}, {'number': 9, 'field_values': {'app_date': '12012024'}}]}, {'validation': {'id': 'E0040', 'name': 'app_method.invalid_enum_value', 'description': "'Application method' must equal 1, 2, 3, or 4.", 'fields': ['app_method'], 'severity': 'error'}, 'records': [{'number': 10, 'field_values': {'app_method': ''}}, {'number': 11, 'field_values': {'app_method': '9001'}}]}, {'validation': {'id': 'E0060', 'name': 'app_recipient.invalid_enum_value', 'description': "'Application recipient' must equal 1 
or 2", 'fields': ['app_recipient'], 'severity': 'error'}, 'records': [{'number': 12, 'field_values': {'app_recipient': ''}}, {'number': 13, 'field_values': {'app_recipient': '9001'}}]}, {'validation': {'id': 'E0080', 'name': 'ct_credit_product.invalid_enum_value', 'description': "'Credit product' must equal 1, 2, 3, 4, 5, 6, 7, 8, 977, or 988.", 'fields': ['ct_credit_product'], 'severity': 'error'}, 'records': [{'number': 14, 'field_values': {'ct_credit_product': ''}}, {'number': 15, 'field_values': {'ct_credit_product': '9001'}}]}, {'validation': {'id': 'E0100', 'name': 'ct_credit_product_ff.invalid_text_length', 'description': "'Free-form text field for other credit products' must not exceed 300 characters in length.", 'fields': ['ct_credit_product_ff'], 'severity': 'error'}, 'records': [{'number': 16, 'field_values': {'ct_credit_product_ff': '123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 'E0120', 'name': 'ct_guarantee.invalid_enum_value', 'description': "Each value in 'type of guarantee' (separated by semicolons) must equal 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 977, or 999.", 'fields': ['ct_guarantee'], 'severity': 'error'}, 'records': [{'number': 19, 'field_values': {'ct_guarantee': '9001'}}, {'number': 20, 'field_values': {'ct_guarantee': ''}}]}, {'validation': {'id': 'E0140', 'name': 'ct_guarantee_ff.invalid_text_length', 'description': "'Free-form text field for other guarantee' must not exceed 300 characters in length", 'fields': ['ct_guarantee_ff'], 'severity': 'error'}, 'records': [{'number': 24, 'field_values': {'ct_guarantee_ff': 
'123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 'E0160', 'name': 'ct_loan_term_flag.invalid_enum_value', 'description': "Each value in 'Loan term: NA/NP flag' (separated by semicolons) must equal 900, 988, or 999.", 'fields': ['ct_loan_term_flag'], 'severity': 'error'}, 'records': [{'number': 29, 'field_values': {'ct_loan_term_flag': ''}}, {'number': 30, 'field_values': {'ct_loan_term_flag': '9001'}}, {'number': 33, 'field_values': {'ct_loan_term_flag': '1'}}]}, {'validation': {'id': 'E0180', 'name': 'ct_loan_term.invalid_numeric_format', 'description': "When present, 'loan term' must be a whole number.", 'fields': ['ct_loan_term'], 'severity': 'error'}, 'records': [{'number': 36, 'field_values': {'ct_loan_term': 'must be blank'}}]}, {'validation': {'id': 'E0200', 'name': 'credit_purpose.invalid_enum_value', 'description': "Each value in 'credit purpose' (separated by semicolons) must equal 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 977, 988, or 999.", 'fields': ['credit_purpose'], 'severity': 'error'}, 'records': [{'number': 39, 'field_values': {'credit_purpose': '1;2;9001'}}, {'number': 40, 'field_values': {'credit_purpose': ''}}]}, {'validation': {'id': 'E0220', 'name': 'credit_purpose_ff.invalid_text_length', 'description': "'Free-form text field for other credit purpose' must not exceed 300 characters in length", 'fields': ['credit_purpose_ff'], 'severity': 'error'}, 'records': [{'number': 45, 'field_values': {'credit_purpose_ff': 
'123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 'E0240', 'name': 'amount_applied_for_flag.invalid_enum_value', 'description': "'Amount applied For: NA/NP flag' must equal 900, 988, or 999.", 'fields': ['amount_applied_for_flag'], 'severity': 'error'}, 'records': [{'number': 50, 'field_values': {'amount_applied_for_flag': ''}}, {'number': 51, 'field_values': {'amount_applied_for_flag': '9001'}}]}, {'validation': {'id': 'E0260', 'name': 'amount_applied_for.invalid_numeric_format', 'description': "When present, 'amount applied for' must be a numeric value.", 'fields': ['amount_applied_for'], 'severity': 'error'}, 'records': [{'number': 52, 'field_values': {'amount_applied_for': 'nonNumericValue'}}, {'number': 55, 'field_values': {'amount_applied_for': 'must be blank'}}]}, {'validation': {'id': 'E0280', 'name': 'amount_approved.invalid_numeric_format', 'description': "When present, 'amount approved or originated' must be a numeric value.", 'fields': ['amount_approved'], 'severity': 'error'}, 'records': [{'number': 56, 'field_values': {'amount_approved': 'nonNumericValue'}}]}, {'validation': {'id': 'E0300', 'name': 'action_taken.invalid_enum_value', 'description': "'Action taken' must equal 1, 2, 3, 4, or 5.", 'fields': ['action_taken'], 'severity': 'error'}, 'records': [{'number': 63, 'field_values': {'action_taken': ''}}, {'number': 64, 'field_values': {'action_taken': '9001'}}]}, {'validation': {'id': 'E0320', 'name': 'action_taken_date.invalid_date_format', 'description': "'Action taken date' must be a real calendar date using YYYYMMDD format.", 'fields': ['action_taken_date'], 'severity': 'error'}, 'records': [{'number': 65, 'field_values': {'action_taken_date': '12312024'}}]}, 
{'validation': {'id': 'E0001', 'name': 'denial_reasons.invalid_enum_value', 'description': "Each value in 'denial reason(s)' (separated by semicolons)must equal 1, 2, 3, 4, 5, 6, 7, 8, 9, 977, or 999.", 'fields': ['denial_reasons'], 'severity': 'error'}, 'records': [{'number': 70, 'field_values': {'denial_reasons': '9001'}}, {'number': 71, 'field_values': {'denial_reasons': ''}}, {'number': 78, 'field_values': {'denial_reasons': '999;1; 2'}}]}, {'validation': {'id': 'E0360', 'name': 'denial_reasons_ff.invalid_text_length', 'description': "'Free-form text field for other denial reason(s)'must not exceed 300 characters in length.", 'fields': ['denial_reasons_ff'], 'severity': 'error'}, 'records': [{'number': 80, 'field_values': {'denial_reasons_ff': '123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 'E0380', 'name': 'pricing_interest_rate_type.invalid_enum_value', 'description': "Each value in 'Interest rate type' (separated by semicolons) Must equal 1, 2, 3, 4, 5, 6, or 999", 'fields': ['pricing_interest_rate_type'], 'severity': 'error'}, 'records': [{'number': 85, 'field_values': {'pricing_interest_rate_type': ''}}, {'number': 86, 'field_values': {'pricing_interest_rate_type': '9001'}}, {'number': 87, 'field_values': {'pricing_interest_rate_type': '900'}}, {'number': 94, 'field_values': {'pricing_interest_rate_type': '900'}}, {'number': 101, 'field_values': {'pricing_interest_rate_type': '900'}}]}, {'validation': {'id': 'E0400', 'name': 'pricing_init_rate_period.invalid_numeric_format', 'description': ("When present, 'initial rate period' must be a whole number.",), 'fields': ['pricing_init_rate_period'], 'severity': 'error'}, 'records': [{'number': 118, 'field_values': 
{'pricing_init_rate_period': 'nonNumericValue'}}]}, {'validation': {'id': 'E0420', 'name': 'pricing_fixed_rate.invalid_numeric_format', 'description': "When present, 'fixed rate: interest rate' must be a numeric value.", 'fields': ['pricing_fixed_rate'], 'severity': 'error'}, 'records': [{'number': 127, 'field_values': {'pricing_fixed_rate': 'nonNumericValue'}}]}, {'validation': {'id': 'E0460', 'name': 'pricing_adj_index_name.invalid_enum_value', 'description': "'Adjustable rate transaction: index name' must equal 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 977, or 999.", 'fields': ['pricing_adj_index_name'], 'severity': 'error'}, 'records': [{'number': 145, 'field_values': {'pricing_adj_index_name': ''}}, {'number': 146, 'field_values': {'pricing_adj_index_name': '9001'}}]}, {'validation': {'id': 'E0480', 'name': 'pricing_adj_index_name_ff.invalid_text_length', 'description': "'Adjustable rate transaction: index name: other' must not exceed 300 characters in length.", 'fields': ['pricing_adj_index_name_ff'], 'severity': 'error'}, 'records': [{'number': 154, 'field_values': {'pricing_adj_index_name_ff': '123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 'E0500', 'name': 'pricing_adj_index_value.invalid_numeric_format', 'description': "When present, 'adjustable rate transaction: index value' must be a numeric value.", 'fields': ['pricing_adj_index_value'], 'severity': 'error'}, 'records': [{'number': 157, 'field_values': {'pricing_adj_index_value': 'nonNumericValue'}}]}, {'validation': {'id': 'E0520', 'name': 'pricing_origination_charges.invalid_numeric_format', 'description': ("When present, 'total origination charges' must be a numeric", 'value.'), 'fields': ['pricing_origination_charges'], 
'severity': 'error'}, 'records': [{'number': 165, 'field_values': {'pricing_origination_charges': 'nonNumericValue'}}]}, {'validation': {'id': 'E0540', 'name': 'pricing_broker_fees.invalid_numeric_format', 'description': ("When present, 'amount of total broker fees' must be a", 'numeric value.'), 'fields': ['pricing_broker_fees'], 'severity': 'error'}, 'records': [{'number': 166, 'field_values': {'pricing_broker_fees': 'nonNumericValue'}}]}, {'validation': {'id': 'E0560', 'name': 'pricing_initial_charges.invalid_numeric_format', 'description': "When present, 'initial annual charges' must be anumeric value.", 'fields': ['pricing_initial_charges'], 'severity': 'error'}, 'records': [{'number': 167, 'field_values': {'pricing_initial_charges': 'nonNumericValue'}}]}, {'validation': {'id': 'E0580', 'name': 'pricing_mca_addcost_flag.invalid_enum_value', 'description': "'MCA/sales-based: additional cost for merchant cash advances or other sales-based financing: NA flag' must equal 900 or 999.", 'fields': ['pricing_mca_addcost_flag'], 'severity': 'error'}, 'records': [{'number': 168, 'field_values': {'pricing_mca_addcost_flag': ''}}, {'number': 169, 'field_values': {'pricing_mca_addcost_flag': '99009001'}}]}, {'validation': {'id': 'E0600', 'name': 'pricing_mca_addcost.invalid_numeric_format', 'description': "When present, 'MCA/sales-based: additional cost for merchant cash advances or other sales-based financing' must be a numeric value", 'fields': ['pricing_mca_addcost'], 'severity': 'error'}, 'records': [{'number': 171, 'field_values': {'pricing_mca_addcost': 'nonNumericValue'}}, {'number': 172, 'field_values': {'pricing_mca_addcost': 'must be blank'}}]}, {'validation': {'id': 'E0620', 'name': 'pricing_prepenalty_allowed.invalid_enum_value', 'description': "'Prepayment penalty could be imposed' must equal 1, 2, or 999.", 'fields': ['pricing_prepenalty_allowed'], 'severity': 'error'}, 'records': [{'number': 174, 'field_values': {'pricing_prepenalty_allowed': ''}}, 
{'number': 175, 'field_values': {'pricing_prepenalty_allowed': '9001'}}]}, {'validation': {'id': 'E0640', 'name': 'pricing_prepenalty_exists.invalid_enum_value', 'description': "'Prepayment penalty exists' must equal 1, 2, or 999.", 'fields': ['pricing_prepenalty_exists'], 'severity': 'error'}, 'records': [{'number': 176, 'field_values': {'pricing_prepenalty_exists': ''}}, {'number': 177, 'field_values': {'pricing_prepenalty_exists': '9001'}}]}, {'validation': {'id': 'E0640', 'name': 'census_tract_adr_type.invalid_enum_value', 'description': "'Census tract: type of address' must equal 1, 2, 3, or 988.", 'fields': ['census_tract_adr_type'], 'severity': 'error'}, 'records': [{'number': 178, 'field_values': {'census_tract_adr_type': ''}}, {'number': 179, 'field_values': {'census_tract_adr_type': '9001'}}]}, {'validation': {'id': 'E0680', 'name': 'census_tract_number.invalid_text_length', 'description': "When present, 'census tract: tract number' must be a GEOID with exactly 11 digits.", 'fields': ['census_tract_number'], 'severity': 'error'}, 'records': [{'number': 181, 'field_values': {'census_tract_number': '1234567890'}}, {'number': 182, 'field_values': {'census_tract_number': 'must be blank'}}]}, {'validation': {'id': 'E0700', 'name': 'gross_annual_revenue_flag.invalid_enum_value', 'description': "'Gross annual revenue: NP flag' must equal 900 or 988.", 'fields': ['gross_annual_revenue_flag'], 'severity': 'error'}, 'records': [{'number': 187, 'field_values': {'gross_annual_revenue_flag': ''}}, {'number': 188, 'field_values': {'gross_annual_revenue_flag': '99009001'}}]}, {'validation': {'id': 'E0720', 'name': 'gross_annual_revenue.invalid_numeric_format', 'description': "When present, 'gross annual revenue' must be a numeric value.", 'fields': ['gross_annual_revenue'], 'severity': 'error'}, 'records': [{'number': 189, 'field_values': {'gross_annual_revenue': 'nonNumericValue'}}, {'number': 190, 'field_values': {'gross_annual_revenue': 'must be blank'}}]}, 
{'validation': {'id': 'E0720', 'name': 'naics_code_flag.invalid_enum_value', 'description': "'North American Industry Classification System (NAICS) code: NP flag' must equal 900 or 988.", 'fields': ['naics_code_flag'], 'severity': 'error'}, 'records': [{'number': 192, 'field_values': {'naics_code_flag': ''}}, {'number': 193, 'field_values': {'naics_code_flag': '9001'}}]}, {'validation': {'id': 'E0761', 'name': 'naics_code.invalid_naics_format', 'description': "'North American Industry Classification System (NAICS) code' may only contain numeric characters.", 'fields': ['naics_code'], 'severity': 'error'}, 'records': [{'number': 196, 'field_values': {'naics_code': 'notDigits'}}]}, {'validation': {'id': 'E0780', 'name': 'number_of_workers.invalid_enum_value', 'description': "'Number of workers' must equal 1, 2, 3, 4, 5, 6, 7, 8, 9, or 988.", 'fields': ['number_of_workers'], 'severity': 'error'}, 'records': [{'number': 199, 'field_values': {'number_of_workers': ''}}, {'number': 200, 'field_values': {'number_of_workers': '9001'}}]}, {'validation': {'id': 'E0800', 'name': 'time_in_business_type.invalid_enum_value', 'description': "'Time in business: type of response' must equal 1, 2, 3, or 988.", 'fields': ['time_in_business_type'], 'severity': 'error'}, 'records': [{'number': 201, 'field_values': {'time_in_business_type': ''}}, {'number': 202, 'field_values': {'time_in_business_type': '9001'}}]}, {'validation': {'id': 'E0820', 'name': 'time_in_business.invalid_numeric_format', 'description': "When present, 'time in business' must be a whole number.", 'fields': ['time_in_business'], 'severity': 'error'}, 'records': [{'number': 205, 'field_values': {'time_in_business': 'must be blank'}}]}, {'validation': {'id': 'E0840', 'name': 'business_ownership_status.invalid_enum_value', 'description': "Each value in 'business ownership status' (separated by semicolons) must equal 1, 2, 3, 955, 966, or 988.", 'fields': ['business_ownership_status'], 'severity': 'error'}, 'records': 
[{'number': 207, 'field_values': {'business_ownership_status': '1;2; 9001'}}, {'number': 208, 'field_values': {'business_ownership_status': ''}}]}, {'validation': {'id': 'E0860', 'name': 'num_principal_owners_flag.invalid_enum_value', 'description': "'Number of principal owners: NP flag' must equal 900 or 988.", 'fields': ['num_principal_owners_flag'], 'severity': 'error'}, 'records': [{'number': 211, 'field_values': {'num_principal_owners_flag': ''}}, {'number': 212, 'field_values': {'num_principal_owners_flag': '9001'}}]}, {'validation': {'id': 'E0880', 'name': 'num_principal_owners.invalid_enum_value', 'description': "When present, 'number of principal owners' must equal 0, 1, 2, 3, or 4.", 'fields': ['num_principal_owners'], 'severity': 'error'}, 'records': [{'number': 213, 'field_values': {'num_principal_owners': '9001'}}, {'number': 214, 'field_values': {'num_principal_owners': 'must be blank'}}]}, {'validation': {'id': 'E0900', 'name': 'po_1_ethnicity.invalid_enum_value', 'description': "When present, each value in 'ethnicity of principal owner 1' (separated by semicolons) must equal 1, 11, 12, 13, 14, 2, 966, 977, or 988.", 'fields': ['po_1_ethnicity'], 'severity': 'error'}, 'records': [{'number': 216, 'field_values': {'po_1_ethnicity': '9001;1'}}]}, {'validation': {'id': 'E0920', 'name': 'po_1_ethnicity_ff.invalid_text_length', 'description': "'Ethnicity of principal owner 1: free-form text field for other Hispanic or Latino' must not exceed 300 characters in length.", 'fields': ['po_1_ethnicity_ff'], 'severity': 'error'}, 'records': [{'number': 228, 'field_values': {'po_1_ethnicity_ff': '123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 'E0940', 'name': 
'po_1_race.invalid_enum_value', 'description': "When present, each value in 'race of principal owner 1' (separated by semicolons) must equal 1, 2, 21, 22, 23, 24, 25, 26, 27, 3, 31, 32, 33, 34, 35, 36, 37, 4, 41, 42, 43, 44, 5, 966, 971, 972, 973, 974, or 988.", 'fields': ['po_1_race'], 'severity': 'error'}, 'records': [{'number': 240, 'field_values': {'po_1_race': '9001;1'}}]}, {'validation': {'id': 'E0960', 'name': 'po_1_race_anai_ff.invalid_text_length', 'description': "'Race of principal owner 1: free-form text field for American Indian or Alaska Native Enrolled or Principal Tribe' must not exceed 300 characters in length.", 'fields': ['po_1_race_anai_ff'], 'severity': 'error'}, 'records': [{'number': 252, 'field_values': {'po_1_race_anai_ff': '123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 'E0980', 'name': 'po_1_race_asian_ff.invalid_text_length', 'description': "'Race of principal owner 1: free-form text field for other Asian' must not exceed 300 characters in length.", 'fields': ['po_1_race_asian_ff'], 'severity': 'error'}, 'records': [{'number': 264, 'field_values': {'po_1_race_asian_ff': '123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 'E1000', 'name': 'po_1_race_baa_ff.invalid_text_length', 'description': "'Race of principal owner 1: free-form text field for other Black or African American' must not exceed 300 characters in length.", 'fields': ['po_1_race_baa_ff'], 'severity': 'error'}, 'records': 
[{'number': 276, 'field_values': {'po_1_race_baa_ff': '123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 'E1020', 'name': 'po_1_race_pi_ff.invalid_text_length', 'description': "'Race of principal owner 1: free-form text field for other Pacific Islander race' must not exceed 300 characters in length.", 'fields': ['po_1_race_pi_ff'], 'severity': 'error'}, 'records': [{'number': 288, 'field_values': {'po_1_race_pi_ff': '123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 'E1040', 'name': 'po_1_gender_flag.invalid_enum_value', 'description': "When present, 'sex/gender of principal owner 1: NP flag' must equal 1, 966, or 988.", 'fields': ['po_1_gender_flag'], 'severity': 'error'}, 'records': [{'number': 300, 'field_values': {'po_1_gender_flag': '9001'}}]}, {'validation': {'id': 'E1060', 'name': 'po_1_gender_ff.invalid_text_length', 'description': "'Sex/gender of principal owner 1: free-form text field for self-identified sex/gender' must not exceed 300 characters in length.", 'fields': ['po_1_gender_ff'], 'severity': 'error'}, 'records': [{'number': 304, 'field_values': {'po_1_gender_ff': '123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 
'E0900', 'name': 'po_2_ethnicity.invalid_enum_value', 'description': "When present, each value in 'ethnicity of principal owner 2' (separated by semicolons) must equal 1, 11, 12, 13, 14, 2, 966, 977, or 988.", 'fields': ['po_2_ethnicity'], 'severity': 'error'}, 'records': [{'number': 217, 'field_values': {'po_2_ethnicity': '9001;1'}}]}, {'validation': {'id': 'E0920', 'name': 'po_2_ethnicity_ff.invalid_text_length', 'description': "'Ethnicity of principal owner 2: free-form text field for other Hispanic or Latino' must not exceed 300 characters in length.", 'fields': ['po_2_ethnicity_ff'], 'severity': 'error'}, 'records': [{'number': 229, 'field_values': {'po_2_ethnicity_ff': '123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 'E0940', 'name': 'po_2_race.invalid_enum_value', 'description': "When present, each value in 'race of principal owner 2' (separated by semicolons) must equal 1, 2, 21, 22, 23, 24, 25, 26, 27, 3, 31, 32, 33, 34, 35, 36, 37, 4, 41, 42, 43, 44, 5, 966, 971, 972, 973, 974, or 988.", 'fields': ['po_2_race'], 'severity': 'error'}, 'records': [{'number': 241, 'field_values': {'po_2_race': '9001;1'}}]}, {'validation': {'id': 'E0960', 'name': 'po_2_race_anai_ff.invalid_text_length', 'description': "'Race of principal owner 2: free-form text field for American Indian or Alaska Native Enrolled or Principal Tribe' must not exceed 300 characters in length.", 'fields': ['po_2_race_anai_ff'], 'severity': 'error'}, 'records': [{'number': 253, 'field_values': {'po_2_race_anai_ff': 
'123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 'E0980', 'name': 'po_2_race_asian_ff.invalid_text_length', 'description': "'Race of principal owner 2: free-form text field for other Asian' must not exceed 300 characters in length.", 'fields': ['po_2_race_asian_ff'], 'severity': 'error'}, 'records': [{'number': 265, 'field_values': {'po_2_race_asian_ff': '123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 'E1000', 'name': 'po_2_race_baa_ff.invalid_text_length', 'description': "'Race of principal owner 2: free-form text field for other Black or African American' must not exceed 300 characters in length.", 'fields': ['po_2_race_baa_ff'], 'severity': 'error'}, 'records': [{'number': 277, 'field_values': {'po_2_race_baa_ff': '123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 'E1020', 'name': 'po_2_race_pi_ff.invalid_text_length', 'description': "'Race of principal owner 2: free-form text field for other Pacific Islander race' must not exceed 300 characters in length.", 'fields': ['po_2_race_pi_ff'], 'severity': 'error'}, 'records': [{'number': 289, 'field_values': {'po_2_race_pi_ff': 
'123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 'E1040', 'name': 'po_2_gender_flag.invalid_enum_value', 'description': "When present, 'sex/gender of principal owner 2: NP flag' must equal 1, 966, or 988.", 'fields': ['po_2_gender_flag'], 'severity': 'error'}, 'records': [{'number': 301, 'field_values': {'po_2_gender_flag': '9001'}}]}, {'validation': {'id': 'E1060', 'name': 'po_2_gender_ff.invalid_text_length', 'description': "'Sex/gender of principal owner 2: free-form text field for self-identified sex/gender' must not exceed 300 characters in length.", 'fields': ['po_2_gender_ff'], 'severity': 'error'}, 'records': [{'number': 305, 'field_values': {'po_2_gender_ff': '123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 'E0900', 'name': 'po_3_ethnicity.invalid_enum_value', 'description': "When present, each value in 'ethnicity of principal owner 3' (separated by semicolons) must equal 1, 11, 12, 13, 14, 2, 966, 977, or 988.", 'fields': ['po_3_ethnicity'], 'severity': 'error'}, 'records': [{'number': 218, 'field_values': {'po_3_ethnicity': '9001;1'}}]}, {'validation': {'id': 'E0920', 'name': 'po_3_ethnicity_ff.invalid_text_length', 'description': "'Ethnicity of principal owner 3: free-form text field for other Hispanic or Latino' must not exceed 300 characters in length.", 'fields': ['po_3_ethnicity_ff'], 'severity': 'error'}, 'records': [{'number': 230, 'field_values': {'po_3_ethnicity_ff': 
'123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 'E0940', 'name': 'po_3_race.invalid_enum_value', 'description': "When present, each value in 'race of principal owner 3' (separated by semicolons) must equal 1, 2, 21, 22, 23, 24, 25, 26, 27, 3, 31, 32, 33, 34, 35, 36, 37, 4, 41, 42, 43, 44, 5, 966, 971, 972, 973, 974, or 988.", 'fields': ['po_3_race'], 'severity': 'error'}, 'records': [{'number': 242, 'field_values': {'po_3_race': '9001;1'}}]}, {'validation': {'id': 'E0960', 'name': 'po_3_race_anai_ff.invalid_text_length', 'description': "'Race of principal owner 3: free-form text field for American Indian or Alaska Native Enrolled or Principal Tribe' must not exceed 300 characters in length.", 'fields': ['po_3_race_anai_ff'], 'severity': 'error'}, 'records': [{'number': 254, 'field_values': {'po_3_race_anai_ff': '123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 'E0980', 'name': 'po_3_race_asian_ff.invalid_text_length', 'description': "'Race of principal owner 3: free-form text field for other Asian' must not exceed 300 characters in length.", 'fields': ['po_3_race_asian_ff'], 'severity': 'error'}, 'records': [{'number': 266, 'field_values': {'po_3_race_asian_ff': 
'123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 'E1000', 'name': 'po_3_race_baa_ff.invalid_text_length', 'description': "'Race of principal owner 3: free-form text field for other Black or African American' must not exceed 300 characters in length.", 'fields': ['po_3_race_baa_ff'], 'severity': 'error'}, 'records': [{'number': 278, 'field_values': {'po_3_race_baa_ff': '123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 'E1020', 'name': 'po_3_race_pi_ff.invalid_text_length', 'description': "'Race of principal owner 3: free-form text field for other Pacific Islander race' must not exceed 300 characters in length.", 'fields': ['po_3_race_pi_ff'], 'severity': 'error'}, 'records': [{'number': 290, 'field_values': {'po_3_race_pi_ff': '123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 'E1040', 'name': 'po_3_gender_flag.invalid_enum_value', 'description': "When present, 'sex/gender of principal owner 3: NP flag' must equal 1, 966, or 988.", 'fields': ['po_3_gender_flag'], 'severity': 'error'}, 'records': [{'number': 302, 'field_values': {'po_3_gender_flag': '9001'}}]}, {'validation': {'id': 'E1060', 'name': 
'po_3_gender_ff.invalid_text_length', 'description': "'Sex/gender of principal owner 3: free-form text field for self-identified sex/gender' must not exceed 300 characters in length.", 'fields': ['po_3_gender_ff'], 'severity': 'error'}, 'records': [{'number': 306, 'field_values': {'po_3_gender_ff': '123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 'E0900', 'name': 'po_4_ethnicity.invalid_enum_value', 'description': "When present, each value in 'ethnicity of principal owner 4' (separated by semicolons) must equal 1, 11, 12, 13, 14, 2, 966, 977, or 988.", 'fields': ['po_4_ethnicity'], 'severity': 'error'}, 'records': [{'number': 219, 'field_values': {'po_4_ethnicity': '9001;1'}}]}, {'validation': {'id': 'E0920', 'name': 'po_4_ethnicity_ff.invalid_text_length', 'description': "'Ethnicity of principal owner 4: free-form text field for other Hispanic or Latino' must not exceed 300 characters in length.", 'fields': ['po_4_ethnicity_ff'], 'severity': 'error'}, 'records': [{'number': 231, 'field_values': {'po_4_ethnicity_ff': '123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 'E0940', 'name': 'po_4_race.invalid_enum_value', 'description': "When present, each value in 'race of principal owner 4' (separated by semicolons) must equal 1, 2, 21, 22, 23, 24, 25, 26, 27, 3, 31, 32, 33, 34, 35, 36, 37, 4, 41, 42, 43, 44, 5, 966, 971, 972, 973, 974, or 988.", 'fields': ['po_4_race'], 'severity': 'error'}, 'records': [{'number': 
243, 'field_values': {'po_4_race': '9001;1'}}]}, {'validation': {'id': 'E0960', 'name': 'po_4_race_anai_ff.invalid_text_length', 'description': "'Race of principal owner 4: free-form text field for American Indian or Alaska Native Enrolled or Principal Tribe' must not exceed 300 characters in length.", 'fields': ['po_4_race_anai_ff'], 'severity': 'error'}, 'records': [{'number': 255, 'field_values': {'po_4_race_anai_ff': '123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 'E0980', 'name': 'po_4_race_asian_ff.invalid_text_length', 'description': "'Race of principal owner 4: free-form text field for other Asian' must not exceed 300 characters in length.", 'fields': ['po_4_race_asian_ff'], 'severity': 'error'}, 'records': [{'number': 267, 'field_values': {'po_4_race_asian_ff': '123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 'E1000', 'name': 'po_4_race_baa_ff.invalid_text_length', 'description': "'Race of principal owner 4: free-form text field for other Black or African American' must not exceed 300 characters in length.", 'fields': ['po_4_race_baa_ff'], 'severity': 'error'}, 'records': [{'number': 279, 'field_values': {'po_4_race_baa_ff': 
'123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 'E1020', 'name': 'po_4_race_pi_ff.invalid_text_length', 'description': "'Race of principal owner 4: free-form text field for other Pacific Islander race' must not exceed 300 characters in length.", 'fields': ['po_4_race_pi_ff'], 'severity': 'error'}, 'records': [{'number': 291, 'field_values': {'po_4_race_pi_ff': '123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 'E1040', 'name': 'po_4_gender_flag.invalid_enum_value', 'description': "When present, 'sex/gender of principal owner 4: NP flag' must equal 1, 966, or 988.", 'fields': ['po_4_gender_flag'], 'severity': 'error'}, 'records': [{'number': 303, 'field_values': {'po_4_gender_flag': '9001'}}]}, {'validation': {'id': 'E1060', 'name': 'po_4_gender_ff.invalid_text_length', 'description': "'Sex/gender of principal owner 4: free-form text field for self-identified sex/gender' must not exceed 300 characters in length.", 'fields': ['po_4_gender_ff'], 'severity': 'error'}, 'records': [{'number': 307, 'field_values': {'po_4_gender_ff': '123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}]
Member

We should pretty-print the JSON. If it's too long, we can always truncate it with a `...`.

And there's also this sweet trick to allow expandable areas with GitHub Markdown:

Contributor Author

Thanks. I used HTML tags to put all terminal output examples into an expandable block.
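For illustration, the pretty-printing, truncation, and expandable-block ideas from this thread could be combined roughly as follows. This is only a sketch: it assumes the HTML tags in question are `<details>`/`<summary>`, the sample `findings` structure just mirrors the shape of the output quoted above, and the helper names `truncate_values` and `to_details_block` and the 40-character limit are hypothetical rather than part of the validator.

```python
import json


def truncate_values(obj, max_len=40):
    """Recursively shorten long strings so sample output stays readable."""
    if isinstance(obj, dict):
        return {key: truncate_values(value, max_len) for key, value in obj.items()}
    if isinstance(obj, list):
        return [truncate_values(value, max_len) for value in obj]
    if isinstance(obj, str) and len(obj) > max_len:
        return obj[:max_len] + "..."
    return obj


def to_details_block(findings, summary="Sample validation output"):
    """Wrap pretty-printed JSON in an HTML <details> element for a README."""
    body = json.dumps(truncate_values(findings), indent=2)
    fence = "`" * 3  # built at runtime to avoid nesting a literal fence here
    return (
        f"<details>\n<summary>{summary}</summary>\n\n"
        f"{fence}json\n{body}\n{fence}\n\n</details>"
    )


# Hypothetical sample data shaped like the validator output quoted in this PR.
findings = [
    {
        "validation": {"id": "E1060", "severity": "error"},
        "records": [
            {"number": 304, "field_values": {"po_1_gender_ff": "1234567890" * 30 + "XXX"}}
        ],
    }
]

block = to_details_block(findings)
```

Pasting the resulting `block` string into a Markdown file should give a collapsed section that expands to the indented, truncated JSON.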

README.md Outdated

## Development
- Current overall coverage: [![Coverage badge](https://github.com/cfpb/regtech-data-validator/raw/python-coverage-comment-action-data/badge.svg)](https://github.com/cfpb/regtech-data-validator/tree/python-coverage-comment-action-data)
Member

I think it'd be nice to add the badge up near the top of the README.

README.md Outdated
Comment on lines 74 to 429
## Contributing
[CFPB](https://www.consumerfinance.gov/) is developing the `RegTech Data Validator` in the open to maximize transparency and encourage third-party contributions. If you want to contribute, please read and abide by the terms of the [License](./LICENSE) for this project. Pull Requests are always welcome.
If you have an inquiry or suggestion for the validator or any SBL-related code, please reach out to us at <[email protected]>.
Member

I think this is buried too far down the page. I don't think most people are going to scroll this far. Perhaps we should move it to just before the Development section?


```sh
# example of unit tests output
$ poetry run pytest
Member

We should switch to the less verbose mode, or truncate the sample test output.


```

## Running Validator
Member

This section should go up above Development.

Member

@hkeeler hkeeler left a comment

I think this is a good first step. I have some follow-up ideas on both the docs and how to deal with the output of the CLI, but we don't need to do that as part of this PR.

@@ -1,80 +1,1240 @@
# RegTech Data Validator

This is a RegTech submission data parser and validator which makes use of Pandera. You can read about Pandera schemas [here](https://pandera.readthedocs.io/en/stable/dataframe_schemas.html).
Current overall coverage: [![Coverage badge](https://github.com/cfpb/regtech-data-validator/raw/python-coverage-comment-action-data/badge.svg)](https://github.com/cfpb/regtech-data-validator/tree/python-coverage-comment-action-data)
Member

We don't need to prefix it with anything. It's fairly standard for repos to just have a set of badges near the top of the README.

Suggested change
Current overall coverage: [![Coverage badge](https://github.com/cfpb/regtech-data-validator/raw/python-coverage-comment-action-data/badge.svg)](https://github.com/cfpb/regtech-data-validator/tree/python-coverage-comment-action-data)
[![Coverage badge](https://github.com/cfpb/regtech-data-validator/raw/python-coverage-comment-action-data/badge.svg)](https://github.com/cfpb/regtech-data-validator/tree/python-coverage-comment-action-data)

README.md Outdated
Comment on lines 195 to 211
Performing validation on the following DataFrame.

uid app_date app_method app_recipient ... po_4_race_baa_ff po_4_race_pi_ff po_4_gender_flag po_4_gender_ff
0 20241201 1 1 ...
1 BXUIDXVID11XTC2 20241201 1 1 ...
2 BXUIDXVID11XTC31234567890123456789012345678901 20241201 1 1 ...
3 BXUIDXVID12XTC1abcdef 20241201 1 1 ...
4 000TESTFIUIDDONOTUSEXBXVID13XTC1 20241201 1 1 ...
.. ... ... ... ... ... ... ... ... ...
364 000TESTFIUIDDONOTUSEXBXVIDPODEMO4XTC5 20241201 1 1 ... 988
365 000TESTFIUIDDONOTUSEXBXVIDPODEMO4XTC6 20241201 1 1 ... 988
366 000TESTFIUIDDONOTUSEXBXVIDPODEMO4XTC7 20241201 1 1 ... 988
367 000TESTFIUIDDONOTUSEXBXVIDPODEMO4XTC8 20241201 1 1 ... 988
368 000TESTFIUIDDONOTUSEXBXVIDPODEMO4XTC9 20241201 1 1 ... 988

[369 rows x 81 columns]

Member

👍 to removing the DF print.

Unfortunately, what we're currently printing isn't actually JSON. It's a string representation of a Python dict.
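To make the distinction concrete, here is a small sketch comparing the two representations. The `finding` dict is a made-up example (the `note` key in particular is hypothetical and only there to show how `None` is handled); it is not taken from the validator's code.

```python
import json

# Hypothetical finding; "note" exists only to show None vs. null.
finding = {"id": "E1040", "fields": ["po_1_gender_flag"], "severity": "error", "note": None}

as_repr = str(finding)         # Python dict repr: single quotes, None
as_json = json.dumps(finding)  # JSON text: double quotes, null

# The repr cannot be parsed as JSON, while the json.dumps output round-trips.
try:
    json.loads(as_repr)
    repr_is_json = True
except json.JSONDecodeError:
    repr_is_json = False

round_tripped = json.loads(as_json)
```

Any downstream consumer that expects JSON would reject the first form, which is why printing the dict directly is misleading even though it looks similar.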

I think what'd work better would be to have `run_validation_on_df` return a list of objects that could be rendered however the user chooses. We can start by defaulting to JSON, but it could be useful to render it in some tabular form as well. That's what I was going for on:

Let's not worry about that for this PR, though. I have some other tweaks I'd like to make to the code, and we can do that then.
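One possible shape for that idea, sketched under assumptions: the `Finding` dataclass, its field names, and the `render` helpers below are all hypothetical and do not reflect the actual return type of `run_validation_on_df`. The point is only that structured objects can be rendered as JSON by default and as a plain-text table on request.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Finding:
    """Hypothetical structured result for one failed record."""
    validation_id: str
    field: str
    record_number: int
    value: str
    severity: str = "error"


def render_json(findings):
    """Render findings as pretty-printed JSON."""
    return json.dumps([asdict(f) for f in findings], indent=2)


def render_table(findings):
    """Render findings as a simple fixed-width text table."""
    headers = ["validation_id", "field", "record_number", "value", "severity"]
    rows = [[str(getattr(f, h)) for h in headers] for f in findings]
    widths = [max([len(h)] + [len(row[i]) for row in rows]) for i, h in enumerate(headers)]

    def format_row(cells):
        return " | ".join(cell.ljust(width) for cell, width in zip(cells, widths))

    lines = [format_row(headers), "-+-".join("-" * width for width in widths)]
    lines += [format_row(row) for row in rows]
    return "\n".join(lines)


RENDERERS = {"json": render_json, "table": render_table}


def render(findings, output_format="json"):
    """Dispatch to the chosen renderer; JSON is the default."""
    return RENDERERS[output_format](findings)


findings = [
    Finding("E1040", "po_1_gender_flag", 300, "9001"),
    Finding("E0900", "po_2_ethnicity", 217, "9001;1"),
]
json_output = render(findings)
table_output = render(findings, "table")
```

Keeping the renderers in a dictionary means another format could be added later without changing the validation code itself.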

@aharjati aharjati merged commit c6585d2 into main Oct 12, 2023
3 checks passed
@aharjati aharjati deleted the features/update_README branch October 12, 2023 13:20
jcadam14 pushed a commit that referenced this pull request May 3, 2024
Co-authored-by: Aldrian Harjati <[email protected]>
Development

Successfully merging this pull request may close these issues.

Prepare README.md for outside audiences
2 participants