As part of the data quality measures for this project, we have developed SQL code that performs row counts and other checks to confirm that each model contains the correct number of records and that the values in the models meet our data quality criteria. While a number of tests are implemented in the yml files, we have also developed SQL worksheets in Snowflake that perform checks not covered by those tests. Worksheets currently exist for the intermediate diagnostic, clearinghouse, imputation, and performance schemas.
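For illustration, here is a minimal sketch of the kind of row-count check these worksheets perform. The schema, table, and column names (`imputation.detector_imputed_agg_five_minutes`, `detector_id`, `sample_date`) are placeholders, not the project's actual model names, and the expected count of 288 assumes five-minute aggregation:

```sql
-- Hypothetical sketch: count rows per detector per day and flag days that
-- do not have the expected 288 five-minute samples (24 hours * 12 per hour).
-- Table and column names are illustrative placeholders.
with daily_counts as (
    select
        detector_id,
        sample_date,
        count(*) as row_count
    from imputation.detector_imputed_agg_five_minutes
    group by detector_id, sample_date
)

select
    detector_id,
    sample_date,
    row_count,
    288 - row_count as missing_rows
from daily_counts
where row_count <> 288
order by missing_rows desc;
```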
The results of these checks can be used to implement additional tests in the yml files and to compare the models against other data sources. Below are some tests that need to be implemented and verified; each will be tracked as its own issue with additional details as needed (a sketch of a per-detector data-hole check follows the list):
- Diagnostic models should contain the correct number of rows associated with active detectors and their associated stations on a daily basis (@kengodleskidot is the lead): Diagnostic Model QC #398
- Clearinghouse models should contain the correct number of rows per detector on a daily basis with observed data (@kengodleskidot is the lead): Clearinghouse Model QC #397
- Imputation models should contain the correct number of rows per detector on a daily basis and fill in all data holes with either observed or imputed data (@mmmiah is the lead): Imputation Data Holes #404
- Performance models should contain the correct number of rows per detector on a daily basis and should contain no data holes for the performance metric values VMT, VHT, Q, TTI, Delay, and Productivity (@kengodleskidot is the lead): Performance Model QC #465
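A data-hole check along these lines could be written as shown below. This is a sketch only: the model names (`imputation.active_detectors`, `performance.detector_metrics_agg_five_minutes`), the hard-coded date, and the assumption of 288 five-minute intervals per day are all illustrative rather than taken from the actual project:

```sql
-- Hypothetical sketch: detect data holes by generating the full set of
-- expected five-minute timestamps for one day per detector, then left-joining
-- the model so missing intervals surface as unmatched rows.
-- Names and the target date are illustrative placeholders.
with expected_intervals as (
    select
        d.detector_id,
        dateadd('minute', 5 * g.seq, to_timestamp('2024-01-01')) as sample_timestamp
    from imputation.active_detectors as d
    cross join (
        -- Generate 0..287 using row_number() so the sequence has no gaps.
        select row_number() over (order by seq4()) - 1 as seq
        from table(generator(rowcount => 288))
    ) as g
),

observed as (
    select detector_id, sample_timestamp
    from performance.detector_metrics_agg_five_minutes
    where sample_date = '2024-01-01'
)

select
    e.detector_id,
    e.sample_timestamp
from expected_intervals as e
left join observed as o
    on e.detector_id = o.detector_id
    and e.sample_timestamp = o.sample_timestamp
where o.detector_id is null
order by e.detector_id, e.sample_timestamp;
```

Any detectors returned by this query would indicate intervals that were neither observed nor imputed, which is the condition the imputation and performance checks above are meant to rule out.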