-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
65 create test to make sure validations stay in sync with 2024 validationscsv #69
65 create test to make sure validations stay in sync with 2024 validationscsv #69
Conversation
Several changes and discrepancies existed between the python code and CSV. This corrects those issues.
…he python code and the csv located at https://raw.githubusercontent.com/cfpb/sbl-content/main/fig-files/validation-spec/2024-validations.csv There was no automated check to verify that the code and csv were in sync. Because changes to the csv had been made but not flowed into the code, several differences were found. This will make it easier to identify changes and implement them in the code or correct them in the csv. https://raw.githubusercontent.com/cfpb/sbl-content/main/fig-files/validation-spec/2024-validations.csv
Black linter didn't pass
Realized I could simplify my list of list comprehension Better linting Easier readability
To pass the Ruff linter
To pass the linter action
…e-test-to-make-sure-validations-stay-in-sync-with-2024-validationscsv
…2014 and E2015 so that just the strings can be compared (without spaces, new lines, bullets, etc) Because the CSV has formatting for E2014 and E2015 that is more complicated than just a couple of sentences, I added in the pytest to remove all formatting for those error codes so that just character data is compared.
(new line characters, 'bullets', etc) Just to keep as close to the csv as we can, even though eventually this formatting will change once discussions on error formatting occur
Updated to pass linters
To keep consistent with the CSV. This structure is stripped away though during the automated testing in #65
so the pytest passes and the code and csv are in sync
…e-test-to-make-sure-validations-stay-in-sync-with-2024-validationscsv
align with csv updates
…e-test-to-make-sure-validations-stay-in-sync-with-2024-validationscsv
To pass the linting
Coverage reportClick to see where and how coverage changed
This report was generated by python-coverage-comment-action |
…anges have been merged
For some reason the black linting doesn't format in VSCode, and branch 68, which has the same file, doesn't fail on the lines this linter does |
Added the errors.csv spit out by the unit test to .gitignore
maybe try running the linter outside vs code? I've found vs code formatting doesn't always conform with the cli setup. Try |
tests/test_csv_differences.py
Outdated
def test_csv_differences(self): | ||
vals = get_phase_1_and_2_validations_for_lei() | ||
code_descs = [ | ||
[s.title, s.severity, s.description] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doesn't make too big of a difference for only 3 fields, but for better readability I would rather this be a dictionary; so later on it becomes c["title"] instead of c[0]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh... nvm... the csv matching from github...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually... since the csv has headers, if we can, let's try using heading for that as well if possible; so match with keys, instead of indices; gives more context
tests/test_csv_differences.py
Outdated
|
||
with open("errors.csv", "w") as error_file: | ||
for c in code_descs: | ||
found_cd = [d for d in csv_descs if d[0] == c[0]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe next((d for d in csv_descs if d[0] == c[0]), None)
, so you won't have to do x[0][0]
all the time
tests/test_csv_differences.py
Outdated
with open("errors.csv", "w") as error_file: | ||
for c in code_descs: | ||
found_cd = [d for d in csv_descs if d[0] == c[0]] | ||
if c[0] in self.remove_formatting_codes and len(found_cd) != 0: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like quite a bit of len(found_cd)
checks; shortcut:
if found_cd := next((d for d in csv_descs if d[0] == c[0]), None):
// do all the things without needing to check found_cd again
… for lookups instead of indices.
…ow the comparisons are being done. This uses the dataframe headers from the csv, and creates objects from the phase_validations.py json to compare using column/field names instead of indices. Makes it much more readable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
oh yeah, please run linters before merging |
…tionscsv (#69) This adds a pytest (test_csv_differences.py) to validate our python code against the CSV located at https://raw.githubusercontent.com/cfpb/sbl-content/main/fig-files/validation-spec/2024-validations.csv This will compare error/warning codes (making sure neither the code nor csv have codes the other doesn't), the type (error or warning) and the description. Special note is taken of E2014 and E2015 due to formatting in the CSV. In the near future when the frontend is ready to start displaying error/warning descriptions, discussions will be had to figure out how we want to display the more complicated descriptions and what sort of formatting the backend should have. Right now, we preserve as much of the formatting as we can but the pytest will also strip all of this off for these two errors (or any others added to the remove_formatting list) and compare just character data. In general, we do NOT want to do this because several strings in our python code were missing spaces and other standard grammatical formatting, and stripping that off would have caused the test to improperly accept that description. This story is being worked in conjunction with #68 which is being used to update the phase_validations.py for other discrepancies found during testing. It is being routinely merged into this branch to properly run the pytest.
This adds a pytest (test_csv_differences.py) to validate our python code against the CSV located at https://raw.githubusercontent.com/cfpb/sbl-content/main/fig-files/validation-spec/2024-validations.csv
This will compare error/warning codes (making sure neither the code nor csv have codes the other doesn't), the type (error or warning) and the description.
Special note is taken of E2014 and E2015 due to formatting in the CSV. In the near future when the frontend is ready to start displaying error/warning descriptions, discussions will be had to figure out how we want to display the more complicated descriptions and what sort of formatting the backend should have. Right now, we preserve as much of the formatting as we can but the pytest will also strip all of this off for these two errors (or any others added to the remove_formatting list) and compare just character data. In general, we do NOT want to do this because several strings in our python code were missing spaces and other standard grammatical formatting, and stripping that off would have caused the test to improperly accept that description.
This story is being worked in conjunction with #68 which is being used to update the phase_validations.py for other discrepancies found during testing. It is being routinely merged into this branch to properly run the pytest.