In the hourly files, need to pass for every hour the last 10 min value as 'var_i' to replace instantaneous values previously in tx files #285

Closed
BaptisteVandecrux opened this issue Aug 15, 2024 · 6 comments

Comments

@BaptisteVandecrux
Member

BaptisteVandecrux commented Aug 15, 2024

For older logger programs, instantaneous values were only transmitted and not saved to the logger file, because they are identical to the 10-minute averages, which are saved to the logger file.

When the tx files were trimmed, these transmitted instantaneous values were trimmed as well, and are therefore missing from the new AWS data files. If we want to keep (all) the instantaneous data in the hourly files, we need to extract the 10-minute value at the end of each hour (the exact timestamp needs to be checked) and assign it to the corresponding instantaneous variable.

Here's an illustration for CP1:
[image: CP1_16]

@PennyHow
Member

I think we decided to pass the 10-minute raw data, as it is identical to the instantaneous values. We have not yet implemented this in the bufr re-processing though. Is this correct, @ladsmund?

In the case of CP1 above, we should have the corresponding raw data, therefore we will use the 10-minute raw data.

@BaptisteVandecrux
Member Author

> We have not yet implemented this in the bufr re-processing though.

Some users might also be interested in hourly instantaneous values in the level 3 files on dataverse and THREDDS, so it should be addressed within L0toL1 or L1toL2.

@PennyHow
Member

PennyHow commented Aug 15, 2024

@ladsmund and I just had a discussion about this. In order to include the hourly instantaneous values in the Level 3 files, we would need to either:

  1. Process all Level 0 tx files --> This will drastically slow down operational processing
  2. Incorporate 10-minute raw data into the instantaneous variables at L0toL1 or L1toL2 --> This alters the definition of what the instantaneous variables are, which might become confusing

So we had another idea: we distribute all instantaneous values (i.e. 10-minute raw data AND hourly tx instantaneous values) as a separate Level 3 instantaneous data product. This would be beneficial because:

  • @ladsmund could perform BUFR re-processing from this product
  • Enables total transparency to DMI and other users of the instantaneous values
  • Provides a clear difference between instantaneous values and the averaged values (i.e. the current Level 3 product)

I'm not sure where we would implement this at the moment. I don't think it needs to be operational on an hourly level. But perhaps a weekly or monthly routine that we call to make/update a Level 3 instantaneous product.
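As a rough illustration of what such a separate instantaneous product could look like, here is a minimal pandas sketch. Everything in it is an assumption made for illustration: the function name `build_instantaneous_product`, the `source` column, and the layout of the two inputs (timestamp-indexed DataFrames, with transmitted columns carrying an `_i` suffix) are hypothetical and not part of pypromice. Where both sources have a record at the same timestamp, the sketch keeps the 10-minute raw value, following the preference stated earlier in this thread.

```python
import pandas as pd


def build_instantaneous_product(raw_10min, tx_hourly):
    """Merge 10-minute raw data and transmitted <var>_i values into one table.

    Hypothetical helper: both inputs are assumed to be DataFrames indexed
    by timestamp. A `source` column records where each row came from.
    """
    raw = raw_10min.copy()
    raw["source"] = "raw_10min"

    # Strip the "_i" suffix so both sources share the same column names.
    tx = tx_hourly.rename(columns=lambda c: c[:-2] if c.endswith("_i") else c)
    tx["source"] = "tx"

    # Stable sort keeps raw rows ahead of tx rows at identical timestamps,
    # so dropping later duplicates prefers the raw data.
    combined = pd.concat([raw, tx]).sort_index(kind="mergesort")
    return combined[~combined.index.duplicated(keep="first")]


# Toy example: raw data at 00:00 and 00:10, transmissions at 00:00 and 01:00.
raw_10min = pd.DataFrame(
    {"t_u": [1.0, 2.0]},
    index=pd.to_datetime(["2024-08-15 00:00", "2024-08-15 00:10"]),
)
tx_hourly = pd.DataFrame(
    {"t_u_i": [9.0, 3.0]},
    index=pd.to_datetime(["2024-08-15 00:00", "2024-08-15 01:00"]),
)
product = build_instantaneous_product(raw_10min, tx_hourly)
```

Keeping a `source` column is one way to preserve the transparency mentioned above, since users can see which values were logged and which were transmitted.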

@BaptisteVandecrux
Member Author

I agree that in the future, chopping the data into different files (one for core + derived data, one for instantaneous data, one for quality flags...) might be the way to go.

> Incorporate 10-minute raw data into the instantaneous variables at L0toL1 or L1toL2 --> This alters the definition of what the instantaneous variables are, which might become confusing

In a way, that is already what the logger program does:

  • it calculates 10 min averages every 10 min
  • every round hour, it takes the last 10 min value and places it under the <var>_i variable in the 60 min table

So, as an intermediate solution, we could have a function that does the same in pypromice.
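A minimal sketch of such a function, using pandas, might look like the following. It is not the pypromice implementation: the function name `add_instantaneous_from_10min` and the input layout (timestamp-indexed DataFrames) are hypothetical, and it assumes that the 10-minute record stamped exactly on the round hour is the "last 10 min value" of that hour, which, as noted in the issue description, still needs to be checked.

```python
import pandas as pd


def add_instantaneous_from_10min(hourly, ten_min, variables):
    """Return a copy of `hourly` with <var>_i columns taken from `ten_min`.

    Hypothetical helper: both inputs are assumed to be DataFrames indexed
    by timestamp.
    """
    out = hourly.copy()
    # Keep only the 10-minute records stamped exactly on a round hour.
    on_the_hour = ten_min[
        (ten_min.index.minute == 0) & (ten_min.index.second == 0)
    ]
    for var in variables:
        # Align on the hourly index; hours without a 10-minute record stay NaN.
        out[f"{var}_i"] = on_the_hour[var].reindex(out.index)
    return out


# Toy data: 10-minute values 0..12 from 00:00 to 02:00,
# and an hourly table covering 00:00 to 03:00.
ten_min = pd.DataFrame(
    {"t_u": [float(v) for v in range(13)]},
    index=pd.date_range("2024-08-15 00:00", periods=13, freq="10min"),
)
hourly = pd.DataFrame(
    {"t_u": [3.5, 9.5, 12.0, 11.0]},
    index=pd.date_range("2024-08-15 00:00", periods=4, freq="60min"),
)
result = add_instantaneous_from_10min(hourly, ten_min, ["t_u"])
```

In this toy example, `result["t_u_i"]` holds the 10-minute values stamped at 00:00, 01:00 and 02:00, and is NaN at 03:00, where no 10-minute record exists.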

@BaptisteVandecrux changed the title from "Missing instantaneous values after tx files clean-up" to "In the hourly files, need to pass for every hour the last 10 min value as 'var_i' to replace instantaneous values previously in tx files" on Aug 27, 2024
@BaptisteVandecrux
Member Author

Some important info related to #300:

  • For each hourly timestamp, the value of t_u is the average of the following hour, whereas the value of t_i corresponds to the average of the last 10 minutes.
  • When fetching instantaneous values from the 10-minute file, the upper boom is always used. This causes slight changes for some GC-Net stations at which, during the first years, instantaneous values were taken from the lower boom. Now all AWS should use the upper boom for transmitted instantaneous values.
  • Instantaneous data transmission is relatively recent. Now that we extract 10-minute data as "instantaneous" for all round hours, there are many years for which new values of t_i, wspd_i, etc. are now available.
  • In a similar way, instantaneous values have sometimes been transmitted daily, sometimes 6-hourly, and more recently, hourly. Now every round hour that has a 10-minute value will have an instantaneous value. See the illustration below, where orange is the new instantaneous values extracted from 10-minute files, while the blue triangles are the 6-hourly transmissions previously used:

[image]
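To make the difference between t_u and t_i at a given hourly timestamp concrete, here is a small pandas sketch of one reading of the first point above. It assumes that each 10-minute record is stamped at the end of its averaging window, so that "the following hour" after timestamp H covers the records in (H, H + 1 h]. That stamping convention and the toy values are assumptions made for illustration, not something confirmed in this thread.

```python
import pandas as pd

# Toy 10-minute series: values 0..12 from 00:00 to 02:00.
ten_min_t_u = pd.Series(
    [float(v) for v in range(13)],
    index=pd.date_range("2024-08-15 00:00", periods=13, freq="10min"),
)

# Hourly average labelled with the start of the hour it covers:
# the bin labelled H contains the records in (H, H + 1h].
t_u = ten_min_t_u.resample("60min", closed="right", label="left").mean()

# "Instantaneous" value: the 10-minute record stamped on the round hour.
t_i = ten_min_t_u[ten_min_t_u.index.minute == 0]

# Side-by-side view; timestamps present in only one series get NaN in the other.
comparison = pd.DataFrame({"t_u": t_u, "t_i": t_i})
```

Under these assumptions, the row at 01:00 pairs an average over 01:10 to 02:00 with the single 10-minute value ending at 01:00, so the two columns describe different time windows even though they share a timestamp.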

@BaptisteVandecrux
Member Author

Fixed in main in #302 and #304.


2 participants