Cumulus Library 2.0.0 Release #26
dogversioning announced in Announcements
As part of this release, we have also updated the COVID Symptoms study to support the new core table structure. This can be installed from pypi via |
You can install the latest library release with `pip install -U cumulus-library`. The release notes are over on the project GitHub. Note that the minimum Python version is now 3.10.
This one makes a lot of changes! But the important bit is this: we have overhauled how we handle SQL generation, so that a lot more of it is dynamic and responsive to the data you've extracted from your EHR. This, combined with some updates to how the ETL handles schema generation when creating its tables in Athena, means that a Core study build is now expected to always succeed, although it may create empty tables if you are missing data. As part of this new contract, if it does fail for some reason, please let us know!
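For the install step and the new Python minimum mentioned above, a quick pre-upgrade check might look like the following. This is a minimal sketch using only the standard library; the helper names `meets_minimum` and `installed_version` are made up for this illustration and are not part of cumulus-library.

```python
import sys
from importlib import metadata

MINIMUM_PYTHON = (3, 10)  # minimum version stated in this announcement


def meets_minimum(version_info=sys.version_info, minimum=MINIMUM_PYTHON):
    """Return True if the given interpreter version is at least `minimum`."""
    # Compare only (major, minor); patch level does not matter here.
    return tuple(version_info[:2]) >= tuple(minimum)


def installed_version(dist_name="cumulus-library"):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python new enough for 2.0.0:", meets_minimum())
    print("cumulus-library installed:", installed_version())
```

If `meets_minimum()` returns `False`, upgrade the interpreter before running the `pip install` command.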
We've also changed the core FHIR resource tables structurally: they now more closely adhere to the US Core R4 specification, and there are no longer deeply nested structures in these tables. Instead, they are all flat tables (though they may have redundant records with, for example, different coding systems). You may see tables with a `*_dn_*` name in your database after a build. These contain the expansion of array fields into square representations. You can usually ignore them, but we keep them around in case working with them directly is easier for some use cases. We may revisit this decision in the future and remove them after a core build has completed.
New Database: DuckDB
We now support DuckDB as a database engine. While we use it primarily for unit testing, it also provides a non-cloud database option, if that is easier for your use case.
New Statistical Method: Propensity Score Matching (PSM)
We've added a configurable PSM implementation, which can help to quickly identify positive/negative groups for a study via population sampling. Take a look at the docs for more information on usage.
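For readers new to the technique, the matching step of PSM can be pictured with a small generic sketch. This is not the library's implementation or configuration format (see the docs for those). It assumes propensity scores have already been estimated elsewhere, and the function name `match_nearest`, the subject ids, and the scores are all hypothetical. It performs greedy 1:1 nearest-neighbour matching without replacement, using a caliper to reject poor matches.

```python
def match_nearest(treated, control, caliper=0.05):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    `treated` and `control` map a subject id to a precomputed propensity
    score in [0, 1]. Returns a list of (treated_id, control_id) pairs.
    """
    available = dict(control)  # copy so the caller's dict is untouched
    pairs = []
    # Process treated subjects in a deterministic order (by score, then id).
    for t_id, t_score in sorted(treated.items(), key=lambda kv: (kv[1], kv[0])):
        if not available:
            break
        # Closest remaining control by absolute score difference.
        c_id = min(available, key=lambda cid: (abs(available[cid] - t_score), cid))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # each control is used at most once
    return pairs


# Hypothetical scores, e.g. from a logistic regression fitted elsewhere.
positive = {"p1": 0.80, "p2": 0.35}
negative = {"n1": 0.78, "n2": 0.32, "n3": 0.10, "n4": 0.55}
print(match_nearest(positive, negative))  # [('p2', 'n2'), ('p1', 'n1')]
```

Subjects in the positive group with no control inside the caliper are left unmatched, which trades sample size for closer comparability between the two groups.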