Add a criteria for how much data there should be when calculating daily average SST from high resolution buoy data. #26
Comments
Well, using only days that have at least one buoy SST value every two hours is way more fiddly than expected. I'm going with this algorithm (for each buoy), which will be the same for 10-minute resolution data or hourly resolution data:
Any thoughts, @schckngs, @travistai2?
That makes sense. I guess you would essentially be 'binning' the values into 2-hour intervals and finding the 2-hour mean, then filtering out days with fewer than 10 values before estimating the daily mean?
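For anyone following along, the binning idea in the comment above can be sketched roughly as below. This is a Python illustration only, not the code in `data-raw/buoys/sst-data.R`; the function name `daily_mean_sst`, the threshold constant and the sample values are all hypothetical, and reading "10 values" as "at least 10 of the 12 two-hour bins populated" is an assumption based on the comment.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean

# Assumed threshold: at least 10 of the 12 two-hour bins must have data.
MIN_BINS_PER_DAY = 10


def daily_mean_sst(readings, min_bins=MIN_BINS_PER_DAY):
    """Return {date: daily mean SST} from (datetime, sst) pairs for one buoy."""
    # Group raw readings by calendar day and two-hour bin index (0..11).
    bins = defaultdict(list)
    for when, sst in readings:
        bins[(when.date(), when.hour // 2)].append(sst)

    # Reduce each two-hour bin to its mean, collected per day.
    bin_means_by_day = defaultdict(list)
    for (day, _bin_index), values in bins.items():
        bin_means_by_day[day].append(mean(values))

    # Keep only days with enough populated bins, then average the bin means.
    return {
        day: mean(bin_means)
        for day, bin_means in sorted(bin_means_by_day.items())
        if len(bin_means) >= min_bins
    }


# Made-up hourly values: one complete day, then a day with night-time data only.
start = datetime(2023, 8, 1)
full_day = [(start + timedelta(hours=h), 12.0 + 0.1 * h) for h in range(24)]
night_only = [(start + timedelta(days=1, hours=h), 11.0) for h in range(6)]
result = daily_mean_sst(full_day + night_only)
```

Because each bin is averaged first, the same function works for 10-minute and hourly data, and a day with only a few hours of readings is dropped rather than given a biased mean.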
Done all the above (by the commit below), and went with excluding any days with daily fluctuations >5 degC.
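A rough sketch of the 5 degC exclusion, again in Python rather than the repo's R. It assumes "daily fluctuation" means the maximum minus the minimum of the raw values within a calendar day; the thread does not say whether the range is taken over raw values or over two-hour means. The names `days_within_range` and `MAX_DAILY_RANGE_DEGC` and the sample values are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Cut-off quoted in the comment above; how the range is measured is an assumption.
MAX_DAILY_RANGE_DEGC = 5.0


def days_within_range(readings, max_range=MAX_DAILY_RANGE_DEGC):
    """Return the sorted days whose (max - min) SST does not exceed max_range."""
    values_by_day = defaultdict(list)
    for when, sst in readings:
        values_by_day[when.date()].append(sst)
    return sorted(
        day
        for day, values in values_by_day.items()
        if max(values) - min(values) <= max_range
    )


# Made-up data: a calm day (1 degC range) and a wild day (10 degC range).
day_one = datetime(2023, 8, 1)
calm = [(day_one + timedelta(hours=h), 12.0 + (h % 2)) for h in range(24)]
wild = [(day_one + timedelta(days=1, hours=h), 8.0 + 10.0 * (h % 2)) for h in range(24)]
kept_days = days_within_range(calm + wild)
```

Only the calm day survives here. A real event such as a storm could also exceed the cut-off, which is the trade-off raised later in the thread.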
Comparing the buoys vignette with the version before recent commits (saved locally as buoys-2023-08-23-before-two-hour-5degC.html), the above refinements have actually removed some outlier-ish values (or ones that were the max/min for particular days), and maybe more.
Collating emails here: But, the only flags in the data are these:
Do you know what these codes mean (-10, -1 etc.; 51257 is the count)? They are different codes from Kellogg et al. Presumably 100 is good (it is the value for 92.5% of the 10-minute measurements). The other flags ( Maybe I should try keeping just the 100's for now and see if that helps (excluding the spurious-looking data), but something a bit more concrete would help.
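The "keep just the 100's" idea could be tried along these lines. This is a hypothetical Python sketch: the `(sst, flag)` record layout, the function names and the sample records are made up, and treating 100 as "good" is the presumption from the comment above, not a documented ECCC meaning.

```python
from collections import Counter

# Presumed "good" flag value; unconfirmed.
GOOD_FLAG = 100


def flag_summary(records):
    """Return {flag: (count, percent of all records)} for (sst, flag) records."""
    counts = Counter(flag for _sst, flag in records)
    total = sum(counts.values())
    return {flag: (n, 100.0 * n / total) for flag, n in counts.items()}


def keep_good(records, good_flag=GOOD_FLAG):
    """Keep only the records carrying the presumed-good flag."""
    return [(sst, flag) for sst, flag in records if flag == good_flag]


# Made-up records, not real buoy data.
records = [(12.1, 100), (12.3, 100), (19.9, 20), (12.2, 100), (20.0, 0)]
summary = flag_summary(records)
good_only = keep_good(records)
```

Tabulating the flags first makes it easy to see how much data a flag-based filter would discard before committing to it.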
Andrea @schckngs: Yes, it’s too bad the OPP data stream doesn’t have this! as well as the ECCC flags. You can see that the value 100 is likely their value for “good”, 20 probably means “suspicious” and 0 means the value is over a threshold. However, as we can directly compare this to the Hakai flags, what is flagged as 20 is generally “good” in their dataset. Similarly, looking at C46181, it looks like they have capped the temperature values at 20 °C for that buoy, which gives those values a flag of 0. So, long story short: I completely ignore the ECCC flags for these reasons! Please feel free to double-check my plots. It also seems that ECCC removes some of the very “jumpy” values (e.g. compare C46206).
Andrea: I feel like a middle ground on the q/c front might be good. For example, the spurious values you point out at the beginning of that buoy record are a change of ~8-10 degrees over the course of a day. However, it might be better to bug ECCC directly about it, or to set a later start date for that buoy, so that we don’t miss real events that may show up as large temperature fluctuations (like storms or heatwaves). Similarly, I’ll revive the thread with them about those weird once-a-day “bumps”, as that impacts all downstream users of this data. The code for the plots was off the top of my head from a while ago, but something like this should work:
Charles: How about inventing a new flag with a name like ‘deployment start-up’?
Andy: Thanks – I did start trying the manual flagging, but it can't really be automated, and seems to happen more than once (one buoy had a short period with no data, then fluctuations again). So I've stuck with the 5degC cut-off for now. Interestingly, even for that wild 10degC swing in a day, when the temps are averaged over the day, the average is pretty much the same as the daily average a few days later (for days that did not have wild swings); i.e. the wild swing during the day kind of got averaged out anyway, and it doesn't look like a weird outlier when you look at just the daily averages. Maybe that was luck in that the average of the daily air temperature (if our guessing is correct and the sensor was on the deck of a ship) was pretty close to the average SST anyway.
Also see #33
When calculating daily average SST from higher resolution values in data-raw/buoys/sst-data.R, the average will be biased if, say, only values during the night time are collected. Also, we should not calculate an average for a final incomplete day.
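To make the night-time bias concrete, here is a small Python illustration using an entirely synthetic diurnal cycle. The 12 degC mean, 1 degC amplitude and 15:00 peak are made-up assumptions chosen only to show the effect; they are not taken from any buoy.

```python
import math
from statistics import mean


def synthetic_sst(hour, daily_mean=12.0, amplitude=1.0, peak_hour=15):
    """Hypothetical diurnal SST cycle: a cosine that peaks at peak_hour."""
    return daily_mean + amplitude * math.cos(2 * math.pi * (hour - peak_hour) / 24)


# One synthetic value per hour of the day.
hourly = [synthetic_sst(h) for h in range(24)]

full_day_mean = mean(hourly)          # uses all 24 hours
night_only_mean = mean(hourly[0:6])   # uses 00:00 to 05:00 only
bias = night_only_mean - full_day_mean
```

With these assumed numbers the night-only average comes out noticeably colder than the full-day average, which is why days with poor coverage across the 24 hours should be excluded rather than averaged.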
Once we have plotting functions, delve into the values a bit more to check for outliers that should be excluded based on some criteria.