
Live waveform data in Emap #66

Open · wants to merge 108 commits into develop
Conversation

@jeremyestein (Collaborator) commented Oct 8, 2024

I think it's time to create this PR now. I would be particularly interested to see @stefpiatek's feedback.

I appreciate it's quite large, but @skeating has been reviewing it in chunks, going into the sk/waveform-dev branch from which we're now merging. So this is not the first time it's been seen by anyone except me!

waveform_hf_data.md, the main design document, is a good place to start reading.

Project board is at https://github.com/orgs/UCLH-DHCT/projects/3/views/1

Notes from the commit history (truncated excerpts):
  • …is not needed as long as the project.build.sourceEncoding property is set.
  • …the waveform queue. Fixing this a proper way was too awkward.
  • …get around Spring circular dependency problems.
  • …inserts are merged into fewer multi-row inserts.
  • …allow more data to be put in during validation.
  • …to allow for configurations where not all data sources are present.
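One of the notes above mentions merging single-row inserts into fewer multi-row inserts. As a rough illustration of the idea (a hypothetical helper, not the PR's actual implementation), a parameterised multi-row INSERT statement can be assembled like this:

```java
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch: build one parameterised multi-row INSERT so that
// many single-row inserts can be sent to the database as a single statement.
public class MultiRowInsertBuilder {
    static String buildInsert(String table, List<String> columns, int rowCount) {
        // One "(?, ?, ...)" group per row, with one placeholder per column
        String placeholders = "(" + String.join(", ",
                Collections.nCopies(columns.size(), "?")) + ")";
        String allRows = Collections.nCopies(rowCount, placeholders)
                .stream().collect(Collectors.joining(", "));
        return "INSERT INTO " + table
                + " (" + String.join(", ", columns) + ") VALUES " + allRows;
    }

    public static void main(String[] args) {
        System.out.println(buildInsert("waveform", List.of("time", "value"), 3));
        // INSERT INTO waveform (time, value) VALUES (?, ?), (?, ?), (?, ?)
    }
}
```

The statement would then be executed once via JDBC with all the row values bound as parameters, replacing many round trips with one.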

github-actions bot commented Oct 8, 2024

PR checklist

Default guide for a PR (if there are multiple PRs for the work, keep only one version of the checklist and link to it from the other PRs)

  • From the UCLH data science desktop, a validation run has been set off
  • The load times page in UCL Teams has been populated with the run information
  • During the run, glowroot has been checked for any queries which take a substantial proportion of the
    total processing time. This can be useful for identifying indexes that are required.
  • After the run, look for any unexpected errors in the etl_per_message_logging table; the error_search.sql file
    on the shared drive (\\sharefs6\UCLH6\EMAP\Shared\EmapSqlScripts\devops\error_search.sql) can be used for this.
    Create an issue if you find an unexpected exception that is not related to the changes you've made; otherwise
    fix them!
  • After the run, populate the end time in the load times page
  • Let Aasiyah know about the completed validation and give her information on the changes and where to start
    with the validation
  • Check the validation report and give feedback to Aasiyah if any changes are needed on her side;
    iterate until the validation matches at least 99% (validation and emap code).

@jeremyestein jeremyestein marked this pull request as ready for review October 9, 2024 15:29
@stefpiatek (Contributor) left a comment


Certainly a lot there; hopefully I made sense of most of it. Some comments and questions, but nothing blocking.

docs/dev/features/waveform_hf_data.md
emap-setup/emap_runner/runner.py (outdated)
core/docker-compose.yml (outdated)
emap-setup/emap_runner/validation/validation_runner.py (outdated)
emap-setup/global-configuration-EXAMPLE.yaml (outdated)
Comment on lines +21 to +26
```java
// derived from real data
1, List.of(11, 12, 14, 15, 16),
2, List.of(17, 18, 19, 20, 21),
3, List.of(22, 23, 24, 25, 26),
4, List.of(27, 28, 29, 30, 31),
5, List.of(33, 34, 35, 36));
```
Contributor:
Doesn't need to be fixed now, but it would be nice to have this as an input file rather than compiled into the code.

Collaborator (author):
I thought about it but couldn't really see the use case.

If you want it as completely external data (i.e. it can change with no Docker rebuild; an env var points to an external file or something), then I'd argue this data is still important enough that you'd need some form of version control for it (e.g. another git repo), which adds a dependency and thus complexity, and moves away from our monorepo benefits.

If you just mean moving it to a CSV but keeping it in the repo, then it will still need a Docker rebuild when it changes, and by using CSV you've lost the type/syntax checking that the Java compiler provides. Perhaps if we had a dedicated config directory containing CSVs with this sort of data, we could exclude it from the Docker build and mount it into the container instead. But that would only save us a rebuild as often as this data changes, which isn't very often.
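For reference, the CSV-in-repo variant being discussed could look something like the sketch below (hypothetical code, not from the PR; the line layout "key,v1,v2,…" is an assumption). It produces the same Map<Integer, List<Integer>> shape as the compiled-in Map.of(...) data:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical sketch: parse CSV lines of "key,v1,v2,..." into the same
// Map<Integer, List<Integer>> shape as the compiled-in data.
public class CsvDataLoader {
    static Map<Integer, List<Integer>> parse(List<String> lines) {
        return lines.stream()
                .map(line -> line.split(","))
                .collect(Collectors.toMap(
                        parts -> Integer.parseInt(parts[0].trim()),
                        parts -> Arrays.stream(parts, 1, parts.length)
                                .map(String::trim)
                                .map(Integer::parseInt)
                                .collect(Collectors.toList())));
    }

    public static void main(String[] args) {
        Map<Integer, List<Integer>> data = parse(List.of(
                "1, 11, 12, 14, 15, 16",
                "5, 33, 34, 35, 36"));
        System.out.println(data.get(5)); // [33, 34, 35, 36]
    }
}
```

As the reply notes, a malformed line would only fail at runtime here (NumberFormatException), which is the type-checking trade-off against the Map.of(...) form.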

```java
 * it seems to copy data to structures that aren't what I want as final output (Varies[]),
 * whereas this parser doesn't attempt to process the contents of any fields, allowing
 * the calling code to do as it wishes.
 * It's about 100-1000x faster.
```
Contributor:
Ha, fun. Did you try any other HL7 parsers, btw?

Collaborator (author):

None other than HAPI
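For context on the "doesn't attempt to process the contents of any fields" approach mentioned above: a field-agnostic HL7 v2 split can be sketched roughly as below (illustrative only, not the PR's actual parser):

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Illustrative sketch of field-agnostic HL7 v2 parsing: split a message
// into segments and raw field strings, leaving field contents untouched
// for the caller to interpret (unlike HAPI, which builds typed structures).
public class MinimalHl7Splitter {
    // HL7 v2 separates segments with carriage returns and fields with '|'
    // (assuming the standard delimiters declared in MSH).
    static List<List<String>> split(String message) {
        return Arrays.stream(message.split("\r"))
                .map(segment -> Arrays.asList(segment.split("\\|", -1)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String msg = "MSH|^~\\&|SENDER|||20241008||ORU^R01|1|P|2.3\r"
                   + "OBX|1|NA|HR^HeartRate||60^61^62";
        List<List<String>> segments = split(msg);
        System.out.println(segments.get(1).get(0)); // OBX
    }
}
```

Skipping component/subcomponent parsing and typed object construction is where a splitter like this gains most of its speed over a full parser.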

3 participants