Feature idea: ETL characterizations #92

Open
katy-sadowski opened this issue Oct 24, 2024 · 4 comments
Labels
enhancement New feature or request

Comments

@katy-sadowski
Collaborator

Could we output summary stats like the delta in patient count between the source database and the CDM?

katy-sadowski added the enhancement label on Oct 24, 2024
@lawrenceadams
Collaborator

This is a great idea. Do you have a sense of how you'd want to do this in practice? I've seen it done in the past as dbt tests that check distinct counts in the source and compare them to the final table.
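
To make the count comparison concrete, here is a minimal sketch in plain Python with `sqlite3` standing in for the warehouse. It is not an actual dbt test. The `source_patients` table, its `patient_id` column, and both helper functions are hypothetical names made up for illustration; only `person` and `person_id` come from the OMOP CDM.

```python
import sqlite3


def distinct_count(conn, table, column):
    """Count distinct values of `column` in `table`.

    Identifiers are interpolated into the SQL, so pass trusted names only.
    """
    row = conn.execute(f"SELECT COUNT(DISTINCT {column}) FROM {table}").fetchone()
    return row[0]


def patient_count_delta(conn, source_table, source_key,
                        cdm_table="person", cdm_key="person_id"):
    """Compare distinct patients in the source with distinct persons in the CDM."""
    source_n = distinct_count(conn, source_table, source_key)
    cdm_n = distinct_count(conn, cdm_table, cdm_key)
    return {"source": source_n, "cdm": cdm_n, "delta": source_n - cdm_n}


# Toy in-memory data: 4 distinct source patients, 3 persons in the CDM.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE source_patients (patient_id TEXT);  -- hypothetical source table
    CREATE TABLE person (person_id INTEGER);         -- OMOP CDM person table
    INSERT INTO source_patients VALUES ('a'), ('b'), ('b'), ('c'), ('d');
    INSERT INTO person VALUES (1), (2), (3);
""")

summary = patient_count_delta(conn, "source_patients", "patient_id")
print(summary)  # {'source': 4, 'cdm': 3, 'delta': 1}
```

A positive `delta` means the transformation dropped patients. The same dictionary could be written into the generated documentation instead of being asserted on as a pass/fail test.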

@katy-sadowski
Collaborator Author

This was something @fdefalco asked about at the demo yesterday. I haven't given it much thought yet, but the idea is to supplement the documentation with stats that let users understand the impact the OMOP transformation had on the actual contents of the dataset.

@lawrenceadams
Collaborator

Ahhh I see! Things like column-level lineage are nice for this. Easier said than done, but a good thing to think about.

@lawrenceadams
Collaborator

image

Something like this?

2 participants