58: update readme #59
The README preview can be viewed on this branch: https://github.com/cfpb/regtech-data-validator/tree/features/update_README
@aharjati, thanks for putting all of this together. I've taken a first pass, and put up my initial thoughts. Time permitting, I will try to contribute a bit as well. Some of the re-org bits are difficult to describe via PR review, and may just be easier if I tweak it.
README.md
Outdated
## Running the Demo
All packages and libraries used in this repository can be found in `pyproject.toml`
Let's make this a link to `pyproject.toml`.
Also, I think the `poetry install` step should be included here.
README.md
Outdated
- Poetry is used as the package management tool.
- (Optional) Visual Studio Code for development.
- (Optional) Docker is needed when using Visual Studio Code / Dev Container.
Suggested change, from:
- Poetry is used as the package management tool.
- (Optional) Visual Studio Code for development.
- (Optional) Docker is needed when using Visual Studio Code / Dev Container.
to:
The following software packages are pre-requisites to installing this software.
- [Python](https://www.python.org/downloads/) version 3.10 or greater.
- [Poetry](https://python-poetry.org/docs/#installation) for Python package management.
I think most users coming to this repo for the first time just want to see it run, not necessarily develop anything. To simplify things for them, I think we should move the following down into a separate Development section.
- (Optional) Visual Studio Code for development.
- (Optional) Docker is needed when using Visual Studio Code / Dev Container.
Now, it would be nice to have a `Dockerfile` dedicated to just running the CLI. That's an even easier setup for those who just want to see it run and don't want to have to know anything about Python, etc. The one in `.devcontainer` is close, but I don't think it's quite what we'd want. Perhaps that'd be a good follow-up PR.
README.md
Outdated
## Development Tests

- The repo includes unit tests that can be executed using `pytest` or in Visual Studio Code. These tests can be located under `src/tests`.
- The repo also includes 2 test datasets for manual testing, one with all valid data, and one where each line represents a different failed validation, or different permutation of of the same failed validation.
- [`sbl-validations-pass.csv`](src/tests/data/sbl-validations-pass.csv)
- [`sbl-validations-fail.csv`](src/tests/data/sbl-validations-fail.csv)
I think we should separate the test data from the tests themselves documentation-wise. When users first come here, they're probably going to just run the CLI and see what happens. Having the test data there is a nice thing for them to try first before they try their own data.
Test data
This repo includes 2 test datasets, one with all valid data, and one where each line represents a different failed validation, or different permutation of the same failed validation.
We use these test files for automated tests, but they can also be passed in via the CLI for manual testing.
Similarly, I think all detailed testing-related bits should be moved under Development.
- The repo includes unit tests that can be executed using `pytest` or in Visual Studio Code. These tests can be located under `src/tests`.
README.md
Outdated
Performing validation on the following DataFrame.

uid app_date app_method app_recipient ... po_4_race_baa_ff po_4_race_pi_ff po_4_gender_flag po_4_gender_ff
0 20241201 1 1 ...
1 BXUIDXVID11XTC2 20241201 1 1 ...
2 BXUIDXVID11XTC31234567890123456789012345678901 20241201 1 1 ...
3 BXUIDXVID12XTC1abcdef 20241201 1 1 ...
4 000TESTFIUIDDONOTUSEXBXVID13XTC1 20241201 1 1 ...
.. ... ... ... ... ... ... ... ... ...
364 000TESTFIUIDDONOTUSEXBXVIDPODEMO4XTC5 20241201 1 1 ... 988
365 000TESTFIUIDDONOTUSEXBXVIDPODEMO4XTC6 20241201 1 1 ... 988
366 000TESTFIUIDDONOTUSEXBXVIDPODEMO4XTC7 20241201 1 1 ... 988
367 000TESTFIUIDDONOTUSEXBXVIDPODEMO4XTC8 20241201 1 1 ... 988
368 000TESTFIUIDDONOTUSEXBXVIDPODEMO4XTC9 20241201 1 1 ... 988

[369 rows x 81 columns]
Hmmm. We should probably tweak the CLI so it doesn't print the DataFrame by default. I think that's going to be confusing to first-time users unfamiliar with Pandas.
I changed `main.py` to remove the DF print and also use `pprint` to display JSON content better.
👍 to removing the DF print.
Unfortunately, what we're currently printing isn't actually JSON. It's a string representation of a Python `dict`.
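To illustrate the difference, here is a minimal sketch using only the standard library. The `finding` value is copied from one entry of the output quoted in this PR; the `is_valid_json` helper is a hypothetical name used only for this illustration and is not part of the validator.

```python
import json
from pprint import pformat

# One finding, shaped like an entry of the output quoted in this PR.
finding = {
    "validation": {
        "id": "E0020",
        "name": "app_date.invalid_date_format",
        "description": "'Application date' must be a real calendar date using YYYYMMDD format.",
        "fields": ["app_date"],
        "severity": "error",
    },
    "records": [{"number": 8, "field_values": {"app_date": ""}}],
}

# pformat (like print or str) emits Python literal syntax: keys are single-quoted.
as_repr = pformat([finding])

# json.dumps emits real JSON: every string is double-quoted.
as_json = json.dumps([finding], indent=2)


def is_valid_json(text: str) -> bool:
    """Hypothetical helper: report whether `text` parses as JSON."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


print(is_valid_json(as_repr))  # False
print(is_valid_json(as_json))  # True
```

Anything downstream that expects JSON (for example `jq` or a web client) would reject the first form and accept the second.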
I think what'd work better would be to have `run_validation_on_df` return a list of objects that could be rendered how the user chooses. We can start by defaulting to JSON, but it could be useful to render it as some tabular form as well. That's what I was going for on:
Let's not worry about that for this PR, though. I have some other tweaks I'd like to make to the code, and we can do that then.
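As a rough sketch of the "list of objects, rendered how the user chooses" idea, something like the following could work. Everything here is an assumption made for illustration: the `Finding` class, its fields, and the `render` functions are hypothetical names, not the actual return type of `run_validation_on_df`. The sample values come from the E0040 entries in the output quoted in this PR.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Finding:
    """Hypothetical result object; fields mirror keys in the quoted output."""
    validation_id: str
    name: str
    severity: str
    record_number: int
    field_values: dict[str, str]


def render_json(findings: list[Finding]) -> str:
    """Render findings as a JSON array."""
    return json.dumps([asdict(f) for f in findings], indent=2)


def render_table(findings: list[Finding]) -> str:
    """Render findings as a plain-text table, one row per record."""
    header = f"{'id':<6} {'record':>6}  name"
    rows = [f"{f.validation_id:<6} {f.record_number:>6}  {f.name}" for f in findings]
    return "\n".join([header, *rows])


# Map a format name to its renderer; JSON is the default.
RENDERERS = {"json": render_json, "table": render_table}


def render(findings: list[Finding], fmt: str = "json") -> str:
    return RENDERERS[fmt](findings)


findings = [
    Finding("E0040", "app_method.invalid_enum_value", "error", 10, {"app_method": ""}),
    Finding("E0040", "app_method.invalid_enum_value", "error", 11, {"app_method": "9001"}),
]

print(render(findings))               # JSON by default
print(render(findings, fmt="table"))  # tabular form on request
```

Keeping the validation step separate from rendering means the CLI only has to pick a renderer, and a new output format is one more entry in `RENDERERS`.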
README.md
Outdated
[369 rows x 81 columns]

[{'validation': {'id': 'E3000', 'name': 'uid.duplicates_in_dataset', 'description': "Any 'unique identifier' may not be used in more than one record within a small business lending application register.", 'fields': ['uid'], 'severity': 'error'}, 'records': [{'number': 5, 'field_values': {'uid': '000TESTFIUIDDONOTUSEXBXVID13XTC1'}}, {'number': 6, 'field_values': {'uid': '000TESTFIUIDDONOTUSEXBXVID13XTC1'}}]}, {'validation': {'id': 'E0001', 'name': 'uid.invalid_text_length', 'description': "'Unique identifier' must be at least 21 characters in length and at most 45 characters in length.", 'fields': ['uid'], 'severity': 'error'}, 'records': [{'number': 1, 'field_values': {'uid': ''}}, {'number': 2, 'field_values': {'uid': 'BXUIDXVID11XTC2'}}, {'number': 3, 'field_values': {'uid': 'BXUIDXVID11XTC31234567890123456789012345678901'}}]}, {'validation': {'id': 'E0002', 'name': 'uid.invalid_text_pattern', 'description': "'Unique identifier' may contain any combination of numbers and/or uppercase letters (i.e., 0-9 and A-Z), and must not contain any other characters.", 'fields': ['uid'], 'severity': 'error'}, 'records': [{'number': 1, 'field_values': {'uid': ''}}, {'number': 4, 'field_values': {'uid': 'BXUIDXVID12XTC1abcdef'}}]}, {'validation': {'id': 'E0020', 'name': 'app_date.invalid_date_format', 'description': "'Application date' must be a real calendar date using YYYYMMDD format.", 'fields': ['app_date'], 'severity': 'error'}, 'records': [{'number': 8, 'field_values': {'app_date': ''}}, {'number': 9, 'field_values': {'app_date': '12012024'}}]}, {'validation': {'id': 'E0040', 'name': 'app_method.invalid_enum_value', 'description': "'Application method' must equal 1, 2, 3, or 4.", 'fields': ['app_method'], 'severity': 'error'}, 'records': [{'number': 10, 'field_values': {'app_method': ''}}, {'number': 11, 'field_values': {'app_method': '9001'}}]}, {'validation': {'id': 'E0060', 'name': 'app_recipient.invalid_enum_value', 'description': "'Application recipient' must equal 1 
or 2", 'fields': ['app_recipient'], 'severity': 'error'}, 'records': [{'number': 12, 'field_values': {'app_recipient': ''}}, {'number': 13, 'field_values': {'app_recipient': '9001'}}]}, {'validation': {'id': 'E0080', 'name': 'ct_credit_product.invalid_enum_value', 'description': "'Credit product' must equal 1, 2, 3, 4, 5, 6, 7, 8, 977, or 988.", 'fields': ['ct_credit_product'], 'severity': 'error'}, 'records': [{'number': 14, 'field_values': {'ct_credit_product': ''}}, {'number': 15, 'field_values': {'ct_credit_product': '9001'}}]}, {'validation': {'id': 'E0100', 'name': 'ct_credit_product_ff.invalid_text_length', 'description': "'Free-form text field for other credit products' must not exceed 300 characters in length.", 'fields': ['ct_credit_product_ff'], 'severity': 'error'}, 'records': [{'number': 16, 'field_values': {'ct_credit_product_ff': '123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 'E0120', 'name': 'ct_guarantee.invalid_enum_value', 'description': "Each value in 'type of guarantee' (separated by semicolons) must equal 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 977, or 999.", 'fields': ['ct_guarantee'], 'severity': 'error'}, 'records': [{'number': 19, 'field_values': {'ct_guarantee': '9001'}}, {'number': 20, 'field_values': {'ct_guarantee': ''}}]}, {'validation': {'id': 'E0140', 'name': 'ct_guarantee_ff.invalid_text_length', 'description': "'Free-form text field for other guarantee' must not exceed 300 characters in length", 'fields': ['ct_guarantee_ff'], 'severity': 'error'}, 'records': [{'number': 24, 'field_values': {'ct_guarantee_ff': 
'123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 'E0160', 'name': 'ct_loan_term_flag.invalid_enum_value', 'description': "Each value in 'Loan term: NA/NP flag' (separated by semicolons) must equal 900, 988, or 999.", 'fields': ['ct_loan_term_flag'], 'severity': 'error'}, 'records': [{'number': 29, 'field_values': {'ct_loan_term_flag': ''}}, {'number': 30, 'field_values': {'ct_loan_term_flag': '9001'}}, {'number': 33, 'field_values': {'ct_loan_term_flag': '1'}}]}, {'validation': {'id': 'E0180', 'name': 'ct_loan_term.invalid_numeric_format', 'description': "When present, 'loan term' must be a whole number.", 'fields': ['ct_loan_term'], 'severity': 'error'}, 'records': [{'number': 36, 'field_values': {'ct_loan_term': 'must be blank'}}]}, {'validation': {'id': 'E0200', 'name': 'credit_purpose.invalid_enum_value', 'description': "Each value in 'credit purpose' (separated by semicolons) must equal 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 977, 988, or 999.", 'fields': ['credit_purpose'], 'severity': 'error'}, 'records': [{'number': 39, 'field_values': {'credit_purpose': '1;2;9001'}}, {'number': 40, 'field_values': {'credit_purpose': ''}}]}, {'validation': {'id': 'E0220', 'name': 'credit_purpose_ff.invalid_text_length', 'description': "'Free-form text field for other credit purpose' must not exceed 300 characters in length", 'fields': ['credit_purpose_ff'], 'severity': 'error'}, 'records': [{'number': 45, 'field_values': {'credit_purpose_ff': 
'123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 'E0240', 'name': 'amount_applied_for_flag.invalid_enum_value', 'description': "'Amount applied For: NA/NP flag' must equal 900, 988, or 999.", 'fields': ['amount_applied_for_flag'], 'severity': 'error'}, 'records': [{'number': 50, 'field_values': {'amount_applied_for_flag': ''}}, {'number': 51, 'field_values': {'amount_applied_for_flag': '9001'}}]}, {'validation': {'id': 'E0260', 'name': 'amount_applied_for.invalid_numeric_format', 'description': "When present, 'amount applied for' must be a numeric value.", 'fields': ['amount_applied_for'], 'severity': 'error'}, 'records': [{'number': 52, 'field_values': {'amount_applied_for': 'nonNumericValue'}}, {'number': 55, 'field_values': {'amount_applied_for': 'must be blank'}}]}, {'validation': {'id': 'E0280', 'name': 'amount_approved.invalid_numeric_format', 'description': "When present, 'amount approved or originated' must be a numeric value.", 'fields': ['amount_approved'], 'severity': 'error'}, 'records': [{'number': 56, 'field_values': {'amount_approved': 'nonNumericValue'}}]}, {'validation': {'id': 'E0300', 'name': 'action_taken.invalid_enum_value', 'description': "'Action taken' must equal 1, 2, 3, 4, or 5.", 'fields': ['action_taken'], 'severity': 'error'}, 'records': [{'number': 63, 'field_values': {'action_taken': ''}}, {'number': 64, 'field_values': {'action_taken': '9001'}}]}, {'validation': {'id': 'E0320', 'name': 'action_taken_date.invalid_date_format', 'description': "'Action taken date' must be a real calendar date using YYYYMMDD format.", 'fields': ['action_taken_date'], 'severity': 'error'}, 'records': [{'number': 65, 'field_values': {'action_taken_date': '12312024'}}]}, 
{'validation': {'id': 'E0001', 'name': 'denial_reasons.invalid_enum_value', 'description': "Each value in 'denial reason(s)' (separated by semicolons)must equal 1, 2, 3, 4, 5, 6, 7, 8, 9, 977, or 999.", 'fields': ['denial_reasons'], 'severity': 'error'}, 'records': [{'number': 70, 'field_values': {'denial_reasons': '9001'}}, {'number': 71, 'field_values': {'denial_reasons': ''}}, {'number': 78, 'field_values': {'denial_reasons': '999;1; 2'}}]}, {'validation': {'id': 'E0360', 'name': 'denial_reasons_ff.invalid_text_length', 'description': "'Free-form text field for other denial reason(s)'must not exceed 300 characters in length.", 'fields': ['denial_reasons_ff'], 'severity': 'error'}, 'records': [{'number': 80, 'field_values': {'denial_reasons_ff': '123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 'E0380', 'name': 'pricing_interest_rate_type.invalid_enum_value', 'description': "Each value in 'Interest rate type' (separated by semicolons) Must equal 1, 2, 3, 4, 5, 6, or 999", 'fields': ['pricing_interest_rate_type'], 'severity': 'error'}, 'records': [{'number': 85, 'field_values': {'pricing_interest_rate_type': ''}}, {'number': 86, 'field_values': {'pricing_interest_rate_type': '9001'}}, {'number': 87, 'field_values': {'pricing_interest_rate_type': '900'}}, {'number': 94, 'field_values': {'pricing_interest_rate_type': '900'}}, {'number': 101, 'field_values': {'pricing_interest_rate_type': '900'}}]}, {'validation': {'id': 'E0400', 'name': 'pricing_init_rate_period.invalid_numeric_format', 'description': ("When present, 'initial rate period' must be a whole number.",), 'fields': ['pricing_init_rate_period'], 'severity': 'error'}, 'records': [{'number': 118, 'field_values': 
{'pricing_init_rate_period': 'nonNumericValue'}}]}, {'validation': {'id': 'E0420', 'name': 'pricing_fixed_rate.invalid_numeric_format', 'description': "When present, 'fixed rate: interest rate' must be a numeric value.", 'fields': ['pricing_fixed_rate'], 'severity': 'error'}, 'records': [{'number': 127, 'field_values': {'pricing_fixed_rate': 'nonNumericValue'}}]}, {'validation': {'id': 'E0460', 'name': 'pricing_adj_index_name.invalid_enum_value', 'description': "'Adjustable rate transaction: index name' must equal 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 977, or 999.", 'fields': ['pricing_adj_index_name'], 'severity': 'error'}, 'records': [{'number': 145, 'field_values': {'pricing_adj_index_name': ''}}, {'number': 146, 'field_values': {'pricing_adj_index_name': '9001'}}]}, {'validation': {'id': 'E0480', 'name': 'pricing_adj_index_name_ff.invalid_text_length', 'description': "'Adjustable rate transaction: index name: other' must not exceed 300 characters in length.", 'fields': ['pricing_adj_index_name_ff'], 'severity': 'error'}, 'records': [{'number': 154, 'field_values': {'pricing_adj_index_name_ff': '123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 'E0500', 'name': 'pricing_adj_index_value.invalid_numeric_format', 'description': "When present, 'adjustable rate transaction: index value' must be a numeric value.", 'fields': ['pricing_adj_index_value'], 'severity': 'error'}, 'records': [{'number': 157, 'field_values': {'pricing_adj_index_value': 'nonNumericValue'}}]}, {'validation': {'id': 'E0520', 'name': 'pricing_origination_charges.invalid_numeric_format', 'description': ("When present, 'total origination charges' must be a numeric", 'value.'), 'fields': ['pricing_origination_charges'], 
'severity': 'error'}, 'records': [{'number': 165, 'field_values': {'pricing_origination_charges': 'nonNumericValue'}}]}, {'validation': {'id': 'E0540', 'name': 'pricing_broker_fees.invalid_numeric_format', 'description': ("When present, 'amount of total broker fees' must be a", 'numeric value.'), 'fields': ['pricing_broker_fees'], 'severity': 'error'}, 'records': [{'number': 166, 'field_values': {'pricing_broker_fees': 'nonNumericValue'}}]}, {'validation': {'id': 'E0560', 'name': 'pricing_initial_charges.invalid_numeric_format', 'description': "When present, 'initial annual charges' must be anumeric value.", 'fields': ['pricing_initial_charges'], 'severity': 'error'}, 'records': [{'number': 167, 'field_values': {'pricing_initial_charges': 'nonNumericValue'}}]}, {'validation': {'id': 'E0580', 'name': 'pricing_mca_addcost_flag.invalid_enum_value', 'description': "'MCA/sales-based: additional cost for merchant cash advances or other sales-based financing: NA flag' must equal 900 or 999.", 'fields': ['pricing_mca_addcost_flag'], 'severity': 'error'}, 'records': [{'number': 168, 'field_values': {'pricing_mca_addcost_flag': ''}}, {'number': 169, 'field_values': {'pricing_mca_addcost_flag': '99009001'}}]}, {'validation': {'id': 'E0600', 'name': 'pricing_mca_addcost.invalid_numeric_format', 'description': "When present, 'MCA/sales-based: additional cost for merchant cash advances or other sales-based financing' must be a numeric value", 'fields': ['pricing_mca_addcost'], 'severity': 'error'}, 'records': [{'number': 171, 'field_values': {'pricing_mca_addcost': 'nonNumericValue'}}, {'number': 172, 'field_values': {'pricing_mca_addcost': 'must be blank'}}]}, {'validation': {'id': 'E0620', 'name': 'pricing_prepenalty_allowed.invalid_enum_value', 'description': "'Prepayment penalty could be imposed' must equal 1, 2, or 999.", 'fields': ['pricing_prepenalty_allowed'], 'severity': 'error'}, 'records': [{'number': 174, 'field_values': {'pricing_prepenalty_allowed': ''}}, 
{'number': 175, 'field_values': {'pricing_prepenalty_allowed': '9001'}}]}, {'validation': {'id': 'E0640', 'name': 'pricing_prepenalty_exists.invalid_enum_value', 'description': "'Prepayment penalty exists' must equal 1, 2, or 999.", 'fields': ['pricing_prepenalty_exists'], 'severity': 'error'}, 'records': [{'number': 176, 'field_values': {'pricing_prepenalty_exists': ''}}, {'number': 177, 'field_values': {'pricing_prepenalty_exists': '9001'}}]}, {'validation': {'id': 'E0640', 'name': 'census_tract_adr_type.invalid_enum_value', 'description': "'Census tract: type of address' must equal 1, 2, 3, or 988.", 'fields': ['census_tract_adr_type'], 'severity': 'error'}, 'records': [{'number': 178, 'field_values': {'census_tract_adr_type': ''}}, {'number': 179, 'field_values': {'census_tract_adr_type': '9001'}}]}, {'validation': {'id': 'E0680', 'name': 'census_tract_number.invalid_text_length', 'description': "When present, 'census tract: tract number' must be a GEOID with exactly 11 digits.", 'fields': ['census_tract_number'], 'severity': 'error'}, 'records': [{'number': 181, 'field_values': {'census_tract_number': '1234567890'}}, {'number': 182, 'field_values': {'census_tract_number': 'must be blank'}}]}, {'validation': {'id': 'E0700', 'name': 'gross_annual_revenue_flag.invalid_enum_value', 'description': "'Gross annual revenue: NP flag' must equal 900 or 988.", 'fields': ['gross_annual_revenue_flag'], 'severity': 'error'}, 'records': [{'number': 187, 'field_values': {'gross_annual_revenue_flag': ''}}, {'number': 188, 'field_values': {'gross_annual_revenue_flag': '99009001'}}]}, {'validation': {'id': 'E0720', 'name': 'gross_annual_revenue.invalid_numeric_format', 'description': "When present, 'gross annual revenue' must be a numeric value.", 'fields': ['gross_annual_revenue'], 'severity': 'error'}, 'records': [{'number': 189, 'field_values': {'gross_annual_revenue': 'nonNumericValue'}}, {'number': 190, 'field_values': {'gross_annual_revenue': 'must be blank'}}]}, 
{'validation': {'id': 'E0720', 'name': 'naics_code_flag.invalid_enum_value', 'description': "'North American Industry Classification System (NAICS) code: NP flag' must equal 900 or 988.", 'fields': ['naics_code_flag'], 'severity': 'error'}, 'records': [{'number': 192, 'field_values': {'naics_code_flag': ''}}, {'number': 193, 'field_values': {'naics_code_flag': '9001'}}]}, {'validation': {'id': 'E0761', 'name': 'naics_code.invalid_naics_format', 'description': "'North American Industry Classification System (NAICS) code' may only contain numeric characters.", 'fields': ['naics_code'], 'severity': 'error'}, 'records': [{'number': 196, 'field_values': {'naics_code': 'notDigits'}}]}, {'validation': {'id': 'E0780', 'name': 'number_of_workers.invalid_enum_value', 'description': "'Number of workers' must equal 1, 2, 3, 4, 5, 6, 7, 8, 9, or 988.", 'fields': ['number_of_workers'], 'severity': 'error'}, 'records': [{'number': 199, 'field_values': {'number_of_workers': ''}}, {'number': 200, 'field_values': {'number_of_workers': '9001'}}]}, {'validation': {'id': 'E0800', 'name': 'time_in_business_type.invalid_enum_value', 'description': "'Time in business: type of response' must equal 1, 2, 3, or 988.", 'fields': ['time_in_business_type'], 'severity': 'error'}, 'records': [{'number': 201, 'field_values': {'time_in_business_type': ''}}, {'number': 202, 'field_values': {'time_in_business_type': '9001'}}]}, {'validation': {'id': 'E0820', 'name': 'time_in_business.invalid_numeric_format', 'description': "When present, 'time in business' must be a whole number.", 'fields': ['time_in_business'], 'severity': 'error'}, 'records': [{'number': 205, 'field_values': {'time_in_business': 'must be blank'}}]}, {'validation': {'id': 'E0840', 'name': 'business_ownership_status.invalid_enum_value', 'description': "Each value in 'business ownership status' (separated by semicolons) must equal 1, 2, 3, 955, 966, or 988.", 'fields': ['business_ownership_status'], 'severity': 'error'}, 'records': 
[{'number': 207, 'field_values': {'business_ownership_status': '1;2; 9001'}}, {'number': 208, 'field_values': {'business_ownership_status': ''}}]}, {'validation': {'id': 'E0860', 'name': 'num_principal_owners_flag.invalid_enum_value', 'description': "'Number of principal owners: NP flag' must equal 900 or 988.", 'fields': ['num_principal_owners_flag'], 'severity': 'error'}, 'records': [{'number': 211, 'field_values': {'num_principal_owners_flag': ''}}, {'number': 212, 'field_values': {'num_principal_owners_flag': '9001'}}]}, {'validation': {'id': 'E0880', 'name': 'num_principal_owners.invalid_enum_value', 'description': "When present, 'number of principal owners' must equal 0, 1, 2, 3, or 4.", 'fields': ['num_principal_owners'], 'severity': 'error'}, 'records': [{'number': 213, 'field_values': {'num_principal_owners': '9001'}}, {'number': 214, 'field_values': {'num_principal_owners': 'must be blank'}}]}, {'validation': {'id': 'E0900', 'name': 'po_1_ethnicity.invalid_enum_value', 'description': "When present, each value in 'ethnicity of principal owner 1' (separated by semicolons) must equal 1, 11, 12, 13, 14, 2, 966, 977, or 988.", 'fields': ['po_1_ethnicity'], 'severity': 'error'}, 'records': [{'number': 216, 'field_values': {'po_1_ethnicity': '9001;1'}}]}, {'validation': {'id': 'E0920', 'name': 'po_1_ethnicity_ff.invalid_text_length', 'description': "'Ethnicity of principal owner 1: free-form text field for other Hispanic or Latino' must not exceed 300 characters in length.", 'fields': ['po_1_ethnicity_ff'], 'severity': 'error'}, 'records': [{'number': 228, 'field_values': {'po_1_ethnicity_ff': '123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 'E0940', 'name': 
'po_1_race.invalid_enum_value', 'description': "When present, each value in 'race of principal owner 1' (separated by semicolons) must equal 1, 2, 21, 22, 23, 24, 25, 26, 27, 3, 31, 32, 33, 34, 35, 36, 37, 4, 41, 42, 43, 44, 5, 966, 971, 972, 973, 974, or 988.", 'fields': ['po_1_race'], 'severity': 'error'}, 'records': [{'number': 240, 'field_values': {'po_1_race': '9001;1'}}]}, {'validation': {'id': 'E0960', 'name': 'po_1_race_anai_ff.invalid_text_length', 'description': "'Race of principal owner 1: free-form text field for American Indian or Alaska Native Enrolled or Principal Tribe' must not exceed 300 characters in length.", 'fields': ['po_1_race_anai_ff'], 'severity': 'error'}, 'records': [{'number': 252, 'field_values': {'po_1_race_anai_ff': '123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 'E0980', 'name': 'po_1_race_asian_ff.invalid_text_length', 'description': "'Race of principal owner 1: free-form text field for other Asian' must not exceed 300 characters in length.", 'fields': ['po_1_race_asian_ff'], 'severity': 'error'}, 'records': [{'number': 264, 'field_values': {'po_1_race_asian_ff': '123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 'E1000', 'name': 'po_1_race_baa_ff.invalid_text_length', 'description': "'Race of principal owner 1: free-form text field for other Black or African American' must not exceed 300 characters in length.", 'fields': ['po_1_race_baa_ff'], 'severity': 'error'}, 'records': 
[{'number': 276, 'field_values': {'po_1_race_baa_ff': '123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 'E1020', 'name': 'po_1_race_pi_ff.invalid_text_length', 'description': "'Race of principal owner 1: free-form text field for other Pacific Islander race' must not exceed 300 characters in length.", 'fields': ['po_1_race_pi_ff'], 'severity': 'error'}, 'records': [{'number': 288, 'field_values': {'po_1_race_pi_ff': '123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 'E1040', 'name': 'po_1_gender_flag.invalid_enum_value', 'description': "When present, 'sex/gender of principal owner 1: NP flag' must equal 1, 966, or 988.", 'fields': ['po_1_gender_flag'], 'severity': 'error'}, 'records': [{'number': 300, 'field_values': {'po_1_gender_flag': '9001'}}]}, {'validation': {'id': 'E1060', 'name': 'po_1_gender_ff.invalid_text_length', 'description': "'Sex/gender of principal owner 1: free-form text field for self-identified sex/gender' must not exceed 300 characters in length.", 'fields': ['po_1_gender_ff'], 'severity': 'error'}, 'records': [{'number': 304, 'field_values': {'po_1_gender_ff': '123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 
'E0900', 'name': 'po_2_ethnicity.invalid_enum_value', 'description': "When present, each value in 'ethnicity of principal owner 2' (separated by semicolons) must equal 1, 11, 12, 13, 14, 2, 966, 977, or 988.", 'fields': ['po_2_ethnicity'], 'severity': 'error'}, 'records': [{'number': 217, 'field_values': {'po_2_ethnicity': '9001;1'}}]}, {'validation': {'id': 'E0920', 'name': 'po_2_ethnicity_ff.invalid_text_length', 'description': "'Ethnicity of principal owner 2: free-form text field for other Hispanic or Latino' must not exceed 300 characters in length.", 'fields': ['po_2_ethnicity_ff'], 'severity': 'error'}, 'records': [{'number': 229, 'field_values': {'po_2_ethnicity_ff': '123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 'E0940', 'name': 'po_2_race.invalid_enum_value', 'description': "When present, each value in 'race of principal owner 2' (separated by semicolons) must equal 1, 2, 21, 22, 23, 24, 25, 26, 27, 3, 31, 32, 33, 34, 35, 36, 37, 4, 41, 42, 43, 44, 5, 966, 971, 972, 973, 974, or 988.", 'fields': ['po_2_race'], 'severity': 'error'}, 'records': [{'number': 241, 'field_values': {'po_2_race': '9001;1'}}]}, {'validation': {'id': 'E0960', 'name': 'po_2_race_anai_ff.invalid_text_length', 'description': "'Race of principal owner 2: free-form text field for American Indian or Alaska Native Enrolled or Principal Tribe' must not exceed 300 characters in length.", 'fields': ['po_2_race_anai_ff'], 'severity': 'error'}, 'records': [{'number': 253, 'field_values': {'po_2_race_anai_ff': 
'123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 'E0980', 'name': 'po_2_race_asian_ff.invalid_text_length', 'description': "'Race of principal owner 2: free-form text field for other Asian' must not exceed 300 characters in length.", 'fields': ['po_2_race_asian_ff'], 'severity': 'error'}, 'records': [{'number': 265, 'field_values': {'po_2_race_asian_ff': '123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 'E1000', 'name': 'po_2_race_baa_ff.invalid_text_length', 'description': "'Race of principal owner 2: free-form text field for other Black or African American' must not exceed 300 characters in length.", 'fields': ['po_2_race_baa_ff'], 'severity': 'error'}, 'records': [{'number': 277, 'field_values': {'po_2_race_baa_ff': '123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 'E1020', 'name': 'po_2_race_pi_ff.invalid_text_length', 'description': "'Race of principal owner 2: free-form text field for other Pacific Islander race' must not exceed 300 characters in length.", 'fields': ['po_2_race_pi_ff'], 'severity': 'error'}, 'records': [{'number': 289, 'field_values': {'po_2_race_pi_ff': 
'123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 'E1040', 'name': 'po_2_gender_flag.invalid_enum_value', 'description': "When present, 'sex/gender of principal owner 2: NP flag' must equal 1, 966, or 988.", 'fields': ['po_2_gender_flag'], 'severity': 'error'}, 'records': [{'number': 301, 'field_values': {'po_2_gender_flag': '9001'}}]}, {'validation': {'id': 'E1060', 'name': 'po_2_gender_ff.invalid_text_length', 'description': "'Sex/gender of principal owner 2: free-form text field for self-identified sex/gender' must not exceed 300 characters in length.", 'fields': ['po_2_gender_ff'], 'severity': 'error'}, 'records': [{'number': 305, 'field_values': {'po_2_gender_ff': '123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 'E0900', 'name': 'po_3_ethnicity.invalid_enum_value', 'description': "When present, each value in 'ethnicity of principal owner 3' (separated by semicolons) must equal 1, 11, 12, 13, 14, 2, 966, 977, or 988.", 'fields': ['po_3_ethnicity'], 'severity': 'error'}, 'records': [{'number': 218, 'field_values': {'po_3_ethnicity': '9001;1'}}]}, {'validation': {'id': 'E0920', 'name': 'po_3_ethnicity_ff.invalid_text_length', 'description': "'Ethnicity of principal owner 3: free-form text field for other Hispanic or Latino' must not exceed 300 characters in length.", 'fields': ['po_3_ethnicity_ff'], 'severity': 'error'}, 'records': [{'number': 230, 'field_values': {'po_3_ethnicity_ff': 
'123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 'E0940', 'name': 'po_3_race.invalid_enum_value', 'description': "When present, each value in 'race of principal owner 3' (separated by semicolons) must equal 1, 2, 21, 22, 23, 24, 25, 26, 27, 3, 31, 32, 33, 34, 35, 36, 37, 4, 41, 42, 43, 44, 5, 966, 971, 972, 973, 974, or 988.", 'fields': ['po_3_race'], 'severity': 'error'}, 'records': [{'number': 242, 'field_values': {'po_3_race': '9001;1'}}]}, {'validation': {'id': 'E0960', 'name': 'po_3_race_anai_ff.invalid_text_length', 'description': "'Race of principal owner 3: free-form text field for American Indian or Alaska Native Enrolled or Principal Tribe' must not exceed 300 characters in length.", 'fields': ['po_3_race_anai_ff'], 'severity': 'error'}, 'records': [{'number': 254, 'field_values': {'po_3_race_anai_ff': '123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 'E0980', 'name': 'po_3_race_asian_ff.invalid_text_length', 'description': "'Race of principal owner 3: free-form text field for other Asian' must not exceed 300 characters in length.", 'fields': ['po_3_race_asian_ff'], 'severity': 'error'}, 'records': [{'number': 266, 'field_values': {'po_3_race_asian_ff': 
'123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 'E1000', 'name': 'po_3_race_baa_ff.invalid_text_length', 'description': "'Race of principal owner 3: free-form text field for other Black or African American' must not exceed 300 characters in length.", 'fields': ['po_3_race_baa_ff'], 'severity': 'error'}, 'records': [{'number': 278, 'field_values': {'po_3_race_baa_ff': '123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 'E1020', 'name': 'po_3_race_pi_ff.invalid_text_length', 'description': "'Race of principal owner 3: free-form text field for other Pacific Islander race' must not exceed 300 characters in length.", 'fields': ['po_3_race_pi_ff'], 'severity': 'error'}, 'records': [{'number': 290, 'field_values': {'po_3_race_pi_ff': '123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 'E1040', 'name': 'po_3_gender_flag.invalid_enum_value', 'description': "When present, 'sex/gender of principal owner 3: NP flag' must equal 1, 966, or 988.", 'fields': ['po_3_gender_flag'], 'severity': 'error'}, 'records': [{'number': 302, 'field_values': {'po_3_gender_flag': '9001'}}]}, {'validation': {'id': 'E1060', 'name': 
'po_3_gender_ff.invalid_text_length', 'description': "'Sex/gender of principal owner 3: free-form text field for self-identified sex/gender' must not exceed 300 characters in length.", 'fields': ['po_3_gender_ff'], 'severity': 'error'}, 'records': [{'number': 306, 'field_values': {'po_3_gender_ff': '123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 'E0900', 'name': 'po_4_ethnicity.invalid_enum_value', 'description': "When present, each value in 'ethnicity of principal owner 4' (separated by semicolons) must equal 1, 11, 12, 13, 14, 2, 966, 977, or 988.", 'fields': ['po_4_ethnicity'], 'severity': 'error'}, 'records': [{'number': 219, 'field_values': {'po_4_ethnicity': '9001;1'}}]}, {'validation': {'id': 'E0920', 'name': 'po_4_ethnicity_ff.invalid_text_length', 'description': "'Ethnicity of principal owner 4: free-form text field for other Hispanic or Latino' must not exceed 300 characters in length.", 'fields': ['po_4_ethnicity_ff'], 'severity': 'error'}, 'records': [{'number': 231, 'field_values': {'po_4_ethnicity_ff': '123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 'E0940', 'name': 'po_4_race.invalid_enum_value', 'description': "When present, each value in 'race of principal owner 4' (separated by semicolons) must equal 1, 2, 21, 22, 23, 24, 25, 26, 27, 3, 31, 32, 33, 34, 35, 36, 37, 4, 41, 42, 43, 44, 5, 966, 971, 972, 973, 974, or 988.", 'fields': ['po_4_race'], 'severity': 'error'}, 'records': [{'number': 
243, 'field_values': {'po_4_race': '9001;1'}}]}, {'validation': {'id': 'E0960', 'name': 'po_4_race_anai_ff.invalid_text_length', 'description': "'Race of principal owner 4: free-form text field for American Indian or Alaska Native Enrolled or Principal Tribe' must not exceed 300 characters in length.", 'fields': ['po_4_race_anai_ff'], 'severity': 'error'}, 'records': [{'number': 255, 'field_values': {'po_4_race_anai_ff': '123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 'E0980', 'name': 'po_4_race_asian_ff.invalid_text_length', 'description': "'Race of principal owner 4: free-form text field for other Asian' must not exceed 300 characters in length.", 'fields': ['po_4_race_asian_ff'], 'severity': 'error'}, 'records': [{'number': 267, 'field_values': {'po_4_race_asian_ff': '123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 'E1000', 'name': 'po_4_race_baa_ff.invalid_text_length', 'description': "'Race of principal owner 4: free-form text field for other Black or African American' must not exceed 300 characters in length.", 'fields': ['po_4_race_baa_ff'], 'severity': 'error'}, 'records': [{'number': 279, 'field_values': {'po_4_race_baa_ff': 
'123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 'E1020', 'name': 'po_4_race_pi_ff.invalid_text_length', 'description': "'Race of principal owner 4: free-form text field for other Pacific Islander race' must not exceed 300 characters in length.", 'fields': ['po_4_race_pi_ff'], 'severity': 'error'}, 'records': [{'number': 291, 'field_values': {'po_4_race_pi_ff': '123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}, {'validation': {'id': 'E1040', 'name': 'po_4_gender_flag.invalid_enum_value', 'description': "When present, 'sex/gender of principal owner 4: NP flag' must equal 1, 966, or 988.", 'fields': ['po_4_gender_flag'], 'severity': 'error'}, 'records': [{'number': 303, 'field_values': {'po_4_gender_flag': '9001'}}]}, {'validation': {'id': 'E1060', 'name': 'po_4_gender_ff.invalid_text_length', 'description': "'Sex/gender of principal owner 4: free-form text field for self-identified sex/gender' must not exceed 300 characters in length.", 'fields': ['po_4_gender_ff'], 'severity': 'error'}, 'records': [{'number': 307, 'field_values': {'po_4_gender_ff': '123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890XXX'}}]}] |
We should pretty-print the JSON. If it's too long, we can always truncate it with a `...`.
And there's also this sweet trick to allow expandable areas with GitHub Markdown:
Thanks. I used HTML tags to put all of the terminal output examples into expandable blocks.
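For reference, the two suggestions in this thread (pretty-printing with truncation, and an expandable block) could be combined roughly as follows. This is only a sketch: `to_details_block`, its parameters, and the abbreviated sample findings are hypothetical and are not part of the validator's code. It assumes the findings are plain Python dicts shaped like the sample output above.

```python
import json


def to_details_block(findings, summary="Validation output", max_items=2):
    """Pretty-print findings as JSON inside a collapsible <details> block.

    Hypothetical helper for README generation, not part of the validator.
    """
    shown = findings[:max_items]
    body = json.dumps(shown, indent=2)
    omitted = len(findings) - len(shown)
    if omitted > 0:
        # Mark the truncation explicitly so readers know output was cut.
        body += f"\n... ({omitted} more findings omitted)"
    fence = "`" * 3  # built at runtime to avoid nesting literal fences
    return (
        f"<details>\n<summary>{summary}</summary>\n\n"
        f"{fence}json\n{body}\n{fence}\n\n</details>"
    )


# Abbreviated findings modelled on the sample output in the README diff.
sample_findings = [
    {"validation": {"id": "E1040", "severity": "error"},
     "records": [{"number": 300, "field_values": {"po_1_gender_flag": "9001"}}]},
    {"validation": {"id": "E0900", "severity": "error"},
     "records": [{"number": 217, "field_values": {"po_2_ethnicity": "9001;1"}}]},
    {"validation": {"id": "E0940", "severity": "error"},
     "records": [{"number": 241, "field_values": {"po_2_race": "9001;1"}}]},
]

block = to_details_block(sample_findings)
```

The blank lines around the inner fence matter: GitHub generally needs them to render Markdown inside HTML tags.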
README.md
Outdated
|
||
## Development | ||
- Current overall coverage: [![Coverage badge](https://github.com/cfpb/regtech-data-validator/raw/python-coverage-comment-action-data/badge.svg)](https://github.com/cfpb/regtech-data-validator/tree/python-coverage-comment-action-data) |
I think it'd be nice to add the badge up near the top of the README.
README.md
Outdated
## Contributing | ||
[CFPB](https://www.consumerfinance.gov/) is developing the `RegTech Data Validator` in the open to maximize transparency and encourage third party contributions. If you want to contribute, please read and abide by the terms of the [License](./LICENSE) for this project. Pull Requests are always welcome. | ||
If you have an inquiry or suggestion for the validator or any SBL related code please reach out to us at <[email protected]> |
I think this is buried too far down the page. I don't think most people are going to scroll this far. Perhaps we should move it to just before the Development section?
|
||
```sh | ||
# example of unit tests output | ||
$ poetry run pytest |
We should switch to the less verbose mode, or truncate the sample test output.
|
||
``` | ||
|
||
## Running Validator |
This section should go up above Development.
I think this is a good first step. I have some follow-up ideas on both the docs and how to deal with the output of the CLI, but we don't need to do that as part of this PR.
@@ -1,80 +1,1240 @@ | |||
# RegTech Data Validator | |||
|
|||
This is a RegTech submission data parser and validator which makes use of Pandera. You can read about Pandera schemas [here](https://pandera.readthedocs.io/en/stable/dataframe_schemas.html). | |||
Current overall coverage: [![Coverage badge](https://github.com/cfpb/regtech-data-validator/raw/python-coverage-comment-action-data/badge.svg)](https://github.com/cfpb/regtech-data-validator/tree/python-coverage-comment-action-data) |
We don't need to prefix it with anything. It's fairly standard for repos to just have a set of badges near the top of the README.
Current overall coverage: [![Coverage badge](https://github.com/cfpb/regtech-data-validator/raw/python-coverage-comment-action-data/badge.svg)](https://github.com/cfpb/regtech-data-validator/tree/python-coverage-comment-action-data) | |
[![Coverage badge](https://github.com/cfpb/regtech-data-validator/raw/python-coverage-comment-action-data/badge.svg)](https://github.com/cfpb/regtech-data-validator/tree/python-coverage-comment-action-data) |
README.md
Outdated
Performing validation on the following DataFrame. | ||
|
||
uid app_date app_method app_recipient ... po_4_race_baa_ff po_4_race_pi_ff po_4_gender_flag po_4_gender_ff | ||
0 20241201 1 1 ... | ||
1 BXUIDXVID11XTC2 20241201 1 1 ... | ||
2 BXUIDXVID11XTC31234567890123456789012345678901 20241201 1 1 ... | ||
3 BXUIDXVID12XTC1abcdef 20241201 1 1 ... | ||
4 000TESTFIUIDDONOTUSEXBXVID13XTC1 20241201 1 1 ... | ||
.. ... ... ... ... ... ... ... ... ... | ||
364 000TESTFIUIDDONOTUSEXBXVIDPODEMO4XTC5 20241201 1 1 ... 988 | ||
365 000TESTFIUIDDONOTUSEXBXVIDPODEMO4XTC6 20241201 1 1 ... 988 | ||
366 000TESTFIUIDDONOTUSEXBXVIDPODEMO4XTC7 20241201 1 1 ... 988 | ||
367 000TESTFIUIDDONOTUSEXBXVIDPODEMO4XTC8 20241201 1 1 ... 988 | ||
368 000TESTFIUIDDONOTUSEXBXVIDPODEMO4XTC9 20241201 1 1 ... 988 | ||
|
||
[369 rows x 81 columns] | ||
|
👍 to removing the DF print.
Unfortunately, what we're currently printing isn't actually JSON. It's a string representation of a Python `dict`.
I think what'd work better would be to have `run_validation_on_df` return a list of objects that could be rendered however the user chooses. We can start by defaulting to JSON, but it could be useful to render it in some tabular form as well. That's what I was going for on:
Let's not worry about that for this PR, though. I have some other tweaks I'd like to make to the code, and we can do that then.
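To illustrate the point about `dict` output versus JSON, and what "a list of objects rendered how the user chooses" might look like, here is a rough sketch. Everything in it is an assumption for illustration: the `Finding` dataclass, its field names, and the two render functions are hypothetical and do not reflect what `run_validation_on_df` actually returns.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Finding:
    """Hypothetical result object; not the project's actual return type."""
    validation_id: str
    name: str
    severity: str
    record_number: int
    field_values: dict


def render_json(findings):
    """Render findings as valid, pretty-printed JSON."""
    return json.dumps([asdict(f) for f in findings], indent=2)


def render_table(findings):
    """Render findings as a simple fixed-width text table."""
    header = f"{'id':<6} {'severity':<8} {'record':>6}  name"
    rows = [
        f"{f.validation_id:<6} {f.severity:<8} {f.record_number:>6}  {f.name}"
        for f in findings
    ]
    return "\n".join([header, *rows])


findings = [
    Finding("E1040", "po_1_gender_flag.invalid_enum_value", "error", 300,
            {"po_1_gender_flag": "9001"}),
]

# str() on a dict uses single quotes, so the result is not valid JSON.
as_repr = str(asdict(findings[0]))

json_output = render_json(findings)
table_output = render_table(findings)
```

Keeping the findings as objects until the last step means the same data can feed a JSON renderer, a table, or anything else without re-parsing printed text.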
update readme:
- Change focus to use Poetry
- Add more details on Poetry steps (installation, development, and tests)
- Add more details on VS Code development
- Add contact/help information

Co-authored-by: Aldrian Harjati <[email protected]>
update readme: