65 create test to make sure validations stay in sync with 2024 validationscsv #69

jcadam14 · 2023-12-05T21:07:01Z

This adds a pytest (test_csv_differences.py) to validate our python code against the CSV located at https://raw.githubusercontent.com/cfpb/sbl-content/main/fig-files/validation-spec/2024-validations.csv

This will compare error/warning codes (making sure neither the code nor csv have codes the other doesn't), the type (error or warning) and the description.
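A minimal sketch of the two-way check described above. It uses made-up sample data rather than the real phase_validations.py output or the FIG CSV, and diff_checks and both dictionaries are hypothetical names, not what the test actually uses:

```python
# Hypothetical sample data keyed by validation code: (severity, description).
code_checks = {
    "E0001": ("error", "Description one."),
    "W0003": ("warning", "Description three."),
    "E9999": ("error", "Only in code."),
}
csv_checks = {
    "E0001": ("error", "Description one."),
    "W0003": ("error", "Description three."),
    "E0002": ("error", "Only in CSV."),
}


def diff_checks(code, csv_rows):
    """Return codes missing on either side plus per-field mismatches."""
    only_in_code = sorted(code.keys() - csv_rows.keys())
    only_in_csv = sorted(csv_rows.keys() - code.keys())
    mismatches = []
    # Only codes present on both sides can be compared field by field.
    for vid in sorted(code.keys() & csv_rows.keys()):
        code_sev, code_desc = code[vid]
        csv_sev, csv_desc = csv_rows[vid]
        if code_sev != csv_sev:
            mismatches.append((vid, "severity"))
        if code_desc != csv_desc:
            mismatches.append((vid, "description"))
    return only_in_code, only_in_csv, mismatches


only_in_code, only_in_csv, mismatches = diff_checks(code_checks, csv_checks)
```

Checking both set differences is what catches a code that exists only in the CSV as well as one that exists only in the Python code.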

E2014 and E2015 need special handling because of their formatting in the CSV. Once the frontend is ready to display error/warning descriptions, we will need to decide how to display the more complicated descriptions and what formatting the backend should provide. For now we preserve as much of the formatting as we can, but the pytest strips all of it for these two errors (and for any others added to the remove_formatting list) and compares only the character data. In general we do NOT want to do this: several strings in our Python code were missing spaces and other standard grammatical formatting, and stripping formatting everywhere would have caused the test to wrongly accept those descriptions.

This story is being worked in conjunction with #68, which updates phase_validations.py for other discrepancies found during testing. That branch is being routinely merged into this one so the pytest runs properly.
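To illustrate why the stripping is limited to a short list of codes, here is a rough sketch. The normalize helper and the regular expression are assumptions for illustration only; the actual test may strip formatting differently:

```python
import re

# Codes compared on character data only (E2014 and E2015 per the description).
remove_formatting_codes = {"E2014", "E2015"}


def normalize(code, description):
    """Drop every non-alphanumeric character, but only for the listed codes."""
    if code in remove_formatting_codes:
        return re.sub(r"[^A-Za-z0-9]", "", description)
    return description


# Bullets and newlines no longer matter for a listed code.
bulleted = normalize("E2014", "When 'x' is:\n• one\n• two")
flat = normalize("E2014", "When 'x' is: one two")

# For any other code a missing space is still reported as a difference.
typo = normalize("E0001", "must notbe blank")
correct = normalize("E0001", "must not be blank")
```

With stripping applied to every code, the "notbe" typo would compare equal to the correct text, which is the false acceptance the description warns about.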

Several changes and discrepancies existed between the python code and CSV. This corrects those issues.

…he python code and the csv located at https://raw.githubusercontent.com/cfpb/sbl-content/main/fig-files/validation-spec/2024-validations.csv
There was no automated check to verify that the code and csv were in sync. Because changes to the csv had been made but not flowed into the code, several differences were found. This will make it easier to identify changes and implement them in the code or correct them in the csv.

…2014 and E2015 so that just the strings can be compared (without spaces, new lines, bullets, etc.).
Because the CSV has formatting for E2014 and E2015 that is more complicated than just a couple of sentences, I added code in the pytest to remove all formatting for those error codes so that just character data is compared.

github-actions · 2023-12-06T23:24:07Z

Coverage report

File	Statements	Missing	Coverage	Coverage (new stmts)	Lines missing
regtech_data_validator
global_data.py
phase_validations.py
Project Total

This report was generated by python-coverage-comment-action

jcadam14 · 2023-12-21T16:47:48Z

For some reason Black doesn't format the file in VSCode, and branch 68, which has the same file, doesn't fail on the lines this linter flags.

lchen-2101 · 2023-12-28T18:09:02Z

maybe try running the linter outside VS Code? I've found VS Code formatting doesn't always match the CLI setup. Try poetry run black . in the code's root directory

lchen-2101 · 2023-12-28T18:40:12Z

tests/test_csv_differences.py

+    def test_csv_differences(self):
+        vals = get_phase_1_and_2_validations_for_lei()
+        code_descs = [
+            [s.title, s.severity, s.description]


doesn't make too big of a difference for only 3 fields, but for better readability I would rather this be a dictionary; so later on it becomes c["title"] instead of c[0]

oh... nvm... the csv matching from github...

actually... since the csv has headers, let's try using the headers for that as well if possible; match on keys instead of indices, which gives more context
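A small sketch of what key-based matching could look like, using the standard library's csv.DictReader. The column names (validation_id, severity, description) and the sample rows are assumptions; the real CSV headers may differ:

```python
import csv
import io

# Hypothetical CSV text standing in for the downloaded file.
csv_text = (
    "validation_id,severity,description\n"
    "E0001,error,Must not be blank.\n"
    "W0003,warning,Should be unique.\n"
)
# DictReader uses the header row as keys, so each row is addressed by name.
csv_rows = {row["validation_id"]: row for row in csv.DictReader(io.StringIO(csv_text))}

# Hypothetical code-side records built with the same keys.
code_rows = {
    "E0001": {"validation_id": "E0001", "severity": "error", "description": "Must not be blank."},
    "W0003": {"validation_id": "W0003", "severity": "warning", "description": "Should be unique"},
}

# For each code, list the named fields that differ between code and CSV.
differing_fields = {
    vid: [key for key in ("severity", "description") if code_rows[vid][key] != csv_rows[vid][key]]
    for vid in code_rows
}
```

A failure message can then name the field that differs instead of an index such as c[2].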

lchen-2101 · 2023-12-28T18:55:57Z

tests/test_csv_differences.py

+
+        with open("errors.csv", "w") as error_file:
+            for c in code_descs:
+                found_cd = [d for d in csv_descs if d[0] == c[0]]


maybe next((d for d in csv_descs if d[0] == c[0]), None), so you won't have to do x[0][0] all the time
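For comparison, a short sketch of the list-comprehension lookup next to the next(...) form, using made-up rows in the same [code, severity, description] layout:

```python
# Hypothetical rows; index 0 is the validation code.
csv_descs = [["E0001", "error", "desc a"], ["W0003", "warning", "desc b"]]
c = ["W0003", "warning", "desc b"]

# List comprehension: the match is wrapped in a list, hence the extra [0].
as_list = [d for d in csv_descs if d[0] == c[0]]

# next() with a default returns the first matching row itself, or None.
as_item = next((d for d in csv_descs if d[0] == c[0]), None)
missing = next((d for d in csv_descs if d[0] == "E9999"), None)
```

The default argument matters: without it, next raises StopIteration when nothing matches.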

lchen-2101 · 2023-12-28T19:05:08Z

tests/test_csv_differences.py

+        with open("errors.csv", "w") as error_file:
+            for c in code_descs:
+                found_cd = [d for d in csv_descs if d[0] == c[0]]
+                if c[0] in self.remove_formatting_codes and len(found_cd) != 0:


looks like quite a bit of len(found_cd) checks; shortcut:

if found_cd := next((d for d in csv_descs if d[0] == c[0]), None):  # do all the things without needing to check found_cd again
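A sketch of how that shortcut might shape the loop, assuming Python 3.8+ for the assignment expression. The compare function and the sample rows are hypothetical and only mirror the [code, severity, description] layout from the diff:

```python
def compare(code_descs, csv_descs):
    """Collect (code, problem) pairs without repeated length checks."""
    problems = []
    for c in code_descs:
        # Bind the first matching CSV row and test it in one expression.
        if found_cd := next((d for d in csv_descs if d[0] == c[0]), None):
            if found_cd[2] != c[2]:
                problems.append((c[0], "description differs"))
        else:
            problems.append((c[0], "missing from csv"))
    return problems


problems = compare(
    [["E1", "error", "a"], ["E2", "error", "b"], ["E3", "error", "c"]],
    [["E1", "error", "a"], ["E2", "error", "x"]],
)
```

Inside the if branch found_cd is guaranteed to be a matching row, so no further emptiness checks are needed.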

…ow the comparisons are being done. This uses the dataframe headers from the csv, and creates objects from the phase_validations.py json to compare using column/field names instead of indices. Makes it much more readable

lchen-2101

LGTM

lchen-2101 · 2024-01-02T18:07:02Z

oh yeah, please run linters before merging

jcadam14 added 11 commits December 4, 2023 13:04

Update the phase_validations.py to be consistent with the updated CSV

3a62c46

Several changes and discrepancies existed between the python code and CSV. This corrects those issues.

Fixing Black formatting issues

8fb6ef8

Black linter didn't pass

Ruff and Black formatting

4fe809c

Realized I could simplify my list of list comprehension Better linting Easier readability

Ruff linting

4fcc327

To pass the Ruff linter

Ruff and Black linting seem to be at odds. So trying to resolve that

30b9439

To pass the linter action

Merge branch '68_correct_validation_description_errors' into 65-creat…

6d3787d

…e-test-to-make-sure-validations-stay-in-sync-with-2024-validationscsv

Updated the descriptions for E2014 and E2015 to be similar to the CSV

6e788e9

(new line characters, 'bullets', etc) Just to keep as close to the csv as we can, even though eventually this formatting will change once discussions on error formatting occur

Missing the phase_validations.py in the commit somehow

e190b16

Missed black formatting

6ffe84b

Updated to pass linters

jcadam14 linked an issue Dec 5, 2023 that may be closed by this pull request

Create test to make sure validations stay in sync with 2024-validations.csv #65

Closed

jcadam14 requested review from lchen-2101, hkeeler, guffee23 and nargis-sultani December 5, 2023 21:07

jcadam14 added 6 commits December 5, 2023 16:30

Updated E2014 and 2015 to be similar in structure to the CSV

1f26af4

To keep consistent with the CSV. This structure is stripped away though during the automated testing in #65

Corrected capitalization that was corrected in the csv

464eb0b

so the pytest passes and the code and csv are in sync

Merge branch '68_correct_validation_description_errors' into 65-creat…

62c87f3

…e-test-to-make-sure-validations-stay-in-sync-with-2024-validationscsv

Uncapitalizes And in the middle of a description

daedb72

align with csv updates

Merge branch '68_correct_validation_description_errors' into 65-creat…

43b1314

…e-test-to-make-sure-validations-stay-in-sync-with-2024-validationscsv

Ruff and Black formatting

284109f

To pass the linting

Changed the github path to the csv to the main branch now that the ch…

d45a54b

…anges have been merged

jcadam14 self-assigned this Dec 21, 2023

jcadam14 added 2 commits December 28, 2023 10:51

Updated README for details on the FIG CSV comparison unit test.

cbaed4d

Added the errors.csv spit out by the unit test to .gitignore

black linting

6696af8

lchen-2101 reviewed Dec 28, 2023

View reviewed changes

jcadam14 added 2 commits December 29, 2023 16:21

Created a second csv pytest that uses object and the dataframe fields…

54dee6b

… for lookups instead of indices.

lchen-2101 approved these changes Jan 2, 2024

View reviewed changes

jcadam14 added 4 commits January 2, 2024 15:19

Ruff linting fixes before merge

fcaba8b

Linting fixes that are different in github than in our local poetry r…

d2869fc

…uns.

Fixing differences between local black/ruff linting and github's

35f3be9

Autolinting isn't a good thing unless VSCode and Github agree

bc3d1d6

jcadam14 merged commit 4311e8c into main Jan 2, 2024
3 checks passed
