first pass QC of underway data #29
Comments
Locate column header for flow meter for different cruises.
Column header for salinity: SBE45S. Column header for flow meter: FLOW. A challenge with the flow meter is that its average value (in ml/s) differs between cruises (e.g., ~50, ~80, ~130 depending on cruise). I saw zero values in several cruises, and NaN values in some. Let's discuss with Taylor re: thresholds for minimum salinity and flow meter values.
Yes, thanks for pointing that out Stace. A while ago Joe and I discussed this. We thought a running average could work, flagging any value that deviates significantly from that average. I've heard in passing from the ship's techs that the flow meter isn't super reliable, so we should take its values with a grain of salt. Basically, we're looking for deviations from consistency.
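To make the running-average idea concrete, here is a minimal sketch, not a settled method. It uses a rolling median rather than a mean (a swap on my part, since a median is less disturbed by the zero readings mentioned above) and a relative tolerance so the same setting can apply whether a cruise averages ~50 or ~130 ml/s. The function name, `window`, and `rel_tol` are all assumptions for illustration; the actual thresholds still need to be agreed with Taylor.

```python
import math
from statistics import median


def flag_flow_deviations(values, window=5, rel_tol=0.5):
    """Return one boolean per sample; True marks a suspect reading.

    A reading is suspect if it is missing, NaN, zero or negative, or if it
    differs from the median of its valid neighbours by more than
    rel_tol times that median. window and rel_tol are placeholder values.
    """
    half = window // 2
    flags = []
    for i, v in enumerate(values):
        if v is None or math.isnan(v) or v <= 0:
            flags.append(True)
            continue
        # Valid readings around sample i, excluding the sample itself.
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        neighbours = [
            x
            for j, x in enumerate(values[lo:hi], start=lo)
            if j != i and x is not None and not math.isnan(x) and x > 0
        ]
        if not neighbours:
            flags.append(True)  # nothing to compare against
            continue
        local = median(neighbours)
        flags.append(abs(v - local) > rel_tol * local)
    return flags


flow = [80, 82, 79, 0, 81, float("nan"), 80, 20, 83, 81, 80, 82]
suspect = [i for i, f in enumerate(flag_flow_deviations(flow)) if f]
print(suspect)
```

Note that with a short window the first and last couple of samples have few neighbours, so flags near the ends of a record are less trustworthy.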
Noting that in today's Zoom we discussed providing a quality flag column for each of these: salinity, flow meter, and fluorometer. We also discussed providing a comments column that would be auto-populated to alert the end user that one or more flags were applied.
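A rough sketch of what the flag and comments columns could look like for a single row, just to anchor the discussion. `SBE45S` and `FLOW` are the headers noted above; the fluorometer header `FLUOR`, the `_flag` suffix, the 0/1 flag values, the comment wording, and the check functions are all hypothetical placeholders.

```python
# Column header -> human-readable label used in the comments column.
# "FLUOR" is a made-up header; the real fluorometer column name may differ.
FLAGGED_COLUMNS = {
    "SBE45S": "salinity",
    "FLOW": "flow meter",
    "FLUOR": "fluorometer",
}


def add_flag_columns(row, checks):
    """Return a copy of row with <column>_flag fields and a comments field.

    checks maps a column header to a function that returns True when the
    value is suspect. Columns without a check, or absent from the row,
    get no flag column.
    """
    out = dict(row)
    flagged = []
    for column, label in FLAGGED_COLUMNS.items():
        check = checks.get(column)
        if check is None or column not in row:
            continue
        suspect = bool(check(row[column]))
        out[column + "_flag"] = int(suspect)  # 1 = suspect, 0 = passed
        if suspect:
            flagged.append(label)
    out["comments"] = "QC flag applied: " + ", ".join(flagged) if flagged else ""
    return out


# Placeholder thresholds only; real limits are still to be decided.
checks = {
    "SBE45S": lambda v: v < 2,
    "FLOW": lambda v: v <= 0,
    "FLUOR": lambda v: v < 0,
}
print(add_flag_columns({"SBE45S": 35.1, "FLOW": 0.0, "FLUOR": 1.2}, checks))
```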
Endeavor data does not use the column headers specified here, so the API will need to skip adding quality flags for that data. Once column names are regularized, this will not require per-cruise or per-vessel configurations (although regularization will).
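One way the skip could work without any per-vessel configuration is to check for the expected headers and pass the data through untouched when they are missing. This is only a sketch of that idea, not how the API currently behaves; `apply_flags_if_supported` and the `flagger` callback are hypothetical names, and the `salinity_psu` header in the example is invented to stand in for a non-matching file.

```python
REQUIRED_HEADERS = ("SBE45S", "FLOW")


def apply_flags_if_supported(rows, flagger, required=REQUIRED_HEADERS):
    """Apply flagger to every row only if all required headers are present.

    rows is a list of dicts sharing the same keys. When a required header
    is missing, the rows are returned unchanged, with no flag columns.
    """
    if not rows or not all(header in rows[0] for header in required):
        return rows
    return [flagger(row) for row in rows]


def demo_flagger(row):
    # Stand-in for the real QC step: flags zero or negative flow.
    return {**row, "FLOW_flag": int(row["FLOW"] <= 0)}


armstrong_like = [{"SBE45S": 35.0, "FLOW": 0.0}]
other_vessel = [{"salinity_psu": 35.0}]  # invented header, for illustration

print(apply_flags_if_supported(armstrong_like, demo_flagger))
print(apply_flags_if_supported(other_vessel, demo_flagger))
```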
Related to #30
develop QA/QC checks using Armstrong data and evaluate with Taylor
From the original description:
If these tools are not part of the REST API then we should consider tracking this work elsewhere. I'm leaving the issue open for now but addressing it may not touch this codebase.
Establish some simple filters to pass over underway data for rough QC prior to uploading to the API.
Output NA for skipped data.
When looking at max/min values use some sort of rolling average or rolling median?
If 1 parameter is bad, do we mask out all parameters? Do we group TSG together and any others separate?
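To help weigh the masking question, here is a small sketch of one possible answer: mask only the bad parameter, plus any parameters grouped with it. It does not settle whether that is the right policy. `SBE45S` is the salinity header noted in the comments; `SBE45T` as a TSG temperature header, the `"NA"` string, and the `mask_row` name are assumptions for illustration.

```python
NA = "NA"

# Hypothetical TSG grouping; "SBE45T" is an assumed temperature header.
TSG_GROUP = ("SBE45S", "SBE45T")


def mask_row(row, bad_columns, groups=(TSG_GROUP,)):
    """Return a copy of row with bad columns replaced by NA.

    If any member of a group is bad, every member of that group is masked.
    Pass groups=() to mask only the columns listed in bad_columns.
    """
    to_mask = set(bad_columns)
    for group in groups:
        if to_mask & set(group):
            to_mask |= set(group)
    return {key: (NA if key in to_mask else value) for key, value in row.items()}


row = {"SBE45S": 0.1, "SBE45T": 12.3, "FLOW": 80.0}
print(mask_row(row, ["SBE45S"]))             # salinity bad: whole TSG group masked
print(mask_row(row, ["FLOW"]))               # flow bad: only flow masked
print(mask_row(row, ["SBE45S"], groups=()))  # no grouping: only salinity masked
```

Masking every parameter whenever one is bad would be the same call with a single group containing all headers.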
Possible filters: