-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Validation and setup script fixes #68
base: sk/waveform-dev
Are you sure you want to change the base?
Conversation
…es. Make property naming more consistent.
config file (default off)
PR checklistDefault guide for a PR (if multiple PRs for the work, only keep one version of it and link to it on the other PRs)
|
9d66ef1
to
4d84524
Compare
List<IdsMaster> msg = idsSession.createQuery( | ||
"select i from IdsMaster i where i.unid >= :fromUnid and i.persistdatetime >= :fromDatetime order by i.unid", | ||
"select i from IdsMaster i where i.persistdatetime >= :fromDatetime order by i.persistdatetime", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a feeling like when we didn't specify this, that we wouldn't get them ordered by unid in production. Might be worth doing a test of this against the IDS
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, good point. If I've got this query right, it should be doing the check:
select * from (
select
ids.unid,
ids.persistdatetime,
row_number() over (order by unid) as by_unid,
row_number() over (order by persistdatetime) as by_dt
from public.tbl_ids_master ids
) x
where
x.by_unid != x.by_dt
;
So far, the only ones that fail the test occur between 1 and 2 am on the last Sunday of each October 🙄 (at least as far as 2022)
Might suggest that the persistdatetime
is not in fact timezone aware, as it claims to be :(
I think this doesn't make any difference in the case where we use a date that is not in this ambiguous window. This is already a bug if hl7-reader should record progress that happens to fall in this window.
I will have a think about what's best here...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TLDR:
- The IDS timestamp field is broken and we have been imperfectly working around it; this change makes that workaround worse
- In practice it likely only affects validation runs, not production
- It only applies if requesting a start/end date that occurs in the ambiguous date zone between 1am and 2am at the end of October (that I sometimes refer to as "programmer's halloween" ;)
- It doesn't affect order of message delivery in any case
- It could be worked around with a little manual binary searching, but probably not worth the effort
- the advantage of the new way is that I don't have to wait an hour every time I do a validation run!
Details:
Note that this doesn't affect the order in which we process the messages, it just affects the starting/ending unid when a date bound has been requested (true only in validation rather than production I believe?). The current unid is then incremented in the expected way.
However, it has I think a ~50% chance of missing up to an hour of messages if and only if the given date bound occurs between 1am and 2am on this day of the year.
In this diagram the arrowed line represents a series of messages with monotonically increasing unids, where there is a jump back an hour because of the local datetime incorrectly being used.
Imagine that the requested bound is b1, then using my query it will chose message m1 because it's slightly nearer in time to b1, when it should have chosen m2 (which has a lower unid).
If the bound is b2, then it will correctly chose m3, because the message that occurred in BST happened to be nearer to b2 in local time.
The old query gives the correct unid for times b1 and b2, but would fail for the short time that sits in between the last message before 2am and 2am itself. In that case it would wrongly pick m4 instead of m5.
When the bug occurs, it just means that the unid is unstable with respect to small changes in the input date. However, given that the input date is inherently ambiguous due to the IDS' use of local time, it's not clear what the user would be hoping for anyway.
You could fix it better by using the old query, but first doing a binary search to search for a better :fromUnid
timestamp (targetting the timestamp at least one hour before what the user actually asked for to avoid the ambiguity). This would at least give it stable (but still wrong) behaviour.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@stefpiatek What do you want me to do about this bit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fix it in the easiest way possible that makes validation setup faster, its only for validation so we don't care much about it if we lose an hour of data.
Fixes #56 , waveform turned off in production (but optional in validation)