Updated SNOMED codes, based on the exchange discussion with Dylan #1520

Merged
dehall merged 2 commits into synthetichealth:master
Oct 16, 2024

florim14

Pull Request Description: Updated SNOMED Codes

Overview

This pull request aims to update the SNOMED codes across various Synthea modules, ensuring they are current and active. The updates were made using a script specifically developed for this purpose, and from an exchange discussion with Dylan.

Changes Made

  1. Script Development: A custom script was created to review and update the SNOMED codes within the Synthea modules. The script performed the following tasks:

    • Identified inactive or outdated SNOMED codes.
    • Checked for active versions of these codes. If a code is still active but its display does not match the current one, we update it (we also keep a list of codes that we ignore for now and that will be replaced in the future). If no active code is found, we search by the display and the semantic tag (if it has one) and accept a candidate only if it matches with at least 70% similarity.
    • Updated the modules with the active SNOMED codes.
  2. Code Updates:

    • The script successfully updated all SNOMED codes across the modules, except for 12 codes for which no active matches were found and the codes we chose to ignore for now.
    • Out of these 12 unmatched codes, 11 are located in the "TNM_Diagnosis" module file.
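The fallback search described above (matching on display text with a 70% similarity cut-off) could look roughly like the sketch below. The script itself has not been published, so this is only an illustration: the use of Python's `difflib.SequenceMatcher`, the function names, and the candidate codes are all assumptions, and the semantic tag is simply treated as part of the display string.

```python
from difflib import SequenceMatcher

# The 70% cut-off comes from the description; everything else is hypothetical.
SIMILARITY_THRESHOLD = 0.70


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio between two display strings, in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_active_match(display, candidates, threshold=SIMILARITY_THRESHOLD):
    """Return the (code, display) candidate most similar to `display`.

    `candidates` is an iterable of (code, display) pairs for active concepts.
    Returns None when no candidate reaches the threshold.
    """
    best, best_score = None, 0.0
    for code, candidate_display in candidates:
        score = similarity(display, candidate_display)
        if score > best_score:
            best, best_score = (code, candidate_display), score
    return best if best_score >= threshold else None


# Placeholder codes, not real SNOMED CT identifiers.
active = [("A1", "Fracture of ankle (disorder)"), ("B2", "Hypertension")]
match = best_active_match("Fracture of ankle (disorder)", active)
```

A code whose display has no sufficiently similar active candidate returns `None`, which would correspond to the unmatched codes reported in this pull request.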

What are the benefits:

  • Ensures all SNOMED codes used in the Synthea modules are active, improving the integrity and reliability of the synthetic data generated.
  • The developed script (once it is published) can be reused for future updates, simplifying the maintenance process for SNOMED codes in Synthea modules.

Remaining Issues

  • Unmatched Codes:
    • 12 SNOMED codes could not be matched with active versions, in addition to the codes we decided to ignore for now.
    • 11 of these unmatched codes are located in the "TNM_Diagnosis" module.

Future Work

  • Script Contribution: While this pull request focuses on the updated SNOMED codes, the script used for these updates will be shared in a future contribution. This script will assist other users in maintaining up-to-date SNOMED codes within their Synthea modules.

Thank you for considering this pull request.

@@ -158,15 +158,15 @@
       "codes": [
         {
           "system": "SNOMED-CT",
-          "code": "384758001",
+          "code": 384758001,
Contributor

I didn't notice this before, but there are some instances in the diff like this one, where neither the code nor the display value changed, but the code was changed from a string to a number. Any idea what happened here? It doesn't seem like all codes were changed. This isn't a huge deal, but if it's possible to not change the type when there wasn't a value change, or just make them all strings instead of numbers, that would be preferable to making them numbers.
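One way to spot such type-only changes is to compare the code entries of the old and new versions of a module. This is a minimal sketch, assuming both versions have the same structure so that code entries can be paired in document order; the function names and the sample display text are hypothetical.

```python
def iter_codes(node):
    """Yield every dict that looks like a code entry (has 'system' and 'code')."""
    if isinstance(node, dict):
        if "system" in node and "code" in node:
            yield node
        for value in node.values():
            yield from iter_codes(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_codes(item)


def type_only_changes(old_module, new_module):
    """Return codes whose value is unchanged but whose JSON type differs.

    Assumes both modules contain the same code entries in the same order.
    """
    changed = []
    for old, new in zip(iter_codes(old_module), iter_codes(new_module)):
        same_value = str(old["code"]) == str(new["code"])
        if same_value and type(old["code"]) is not type(new["code"]):
            changed.append(str(old["code"]))
    return changed


# Mirrors the diff above: same code, string before and number after.
old = {"codes": [{"system": "SNOMED-CT", "code": "384758001", "display": "example"}]}
new = {"codes": [{"system": "SNOMED-CT", "code": 384758001, "display": "example"}]}
flagged = type_only_changes(old, new)
```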

@florim14
Author

@dehall - I have noticed that as well when I was writing the script, some were numbers, and some were strings, so I thought it would be better to be consistent and when I was doing the check, changed them to numbers. I can make the adjustment and make them strings, not a problem at all. Should I do it?

@dehall
Contributor

dehall commented Oct 14, 2024

Yes, if that's a simple change please do that. Thanks!

For more detail: All codes are really strings, in the case of SNOMED it's usually safe to represent them as numbers but not always, so better to keep them as strings. (And LOINC for example includes a dash so the codes are necessarily strings.) The reason many of them are stored as numbers now is our module builder UI only has one text entry component and it tries to infer the type from the content, so a string that looks numeric turns into a number. This has been a problem before: synthetichealth/module-builder#280
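Converting numeric codes back to strings can be done with a small recursive pass over a module's JSON. The sketch below is an assumption about how that might look, not the change that was actually committed; `stringify_codes` is a hypothetical helper, and the LOINC-style code in the sample is a placeholder.

```python
import json


def stringify_codes(node):
    """Convert integer 'code' values to strings in place; return how many changed."""
    converted = 0
    if isinstance(node, dict):
        code = node.get("code")
        # Only touch entries that look like code objects (they carry a 'system').
        if "system" in node and isinstance(code, int) and not isinstance(code, bool):
            node["code"] = str(code)
            converted += 1
        for value in node.values():
            converted += stringify_codes(value)
    elif isinstance(node, list):
        for item in node:
            converted += stringify_codes(item)
    return converted


module = json.loads("""
{"states": {"Example": {"codes": [
  {"system": "SNOMED-CT", "code": 384758001, "display": "example"},
  {"system": "LOINC", "code": "1234-5", "display": "placeholder"}
]}}}
""")
count = stringify_codes(module)
codes = module["states"]["Example"]["codes"]
```

Codes that are already strings, such as the dash-containing one, are left untouched, so the pass can be run repeatedly without further changes.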

@florim14
Author

@dehall - perfect, I see. I did the changes and converted them into strings, and committed the changes. If there is any other change needed, please let me know.

Contributor

dehall left a comment

Thank you for the PR, and especially for your patience as we worked through this to get it merged!

dehall merged commit 9de0460 into synthetichealth:master on Oct 16, 2024
2 checks passed
@florim14
Author

@dehall - thank you as well for the discussions and the help in making this PR great. I will reach out to you later, when you have more time, so we can discuss how to update the rest of the codes. Thank you once more, and have a great day.
