
Issue745 epoch part5 #901

Closed
wants to merge 13 commits into from

Conversation

@jhmigueles (Collaborator) commented Sep 13, 2023

Closes #745: enhances functionality to aggregate epoch size in part 5. The new argument part5_epochSizes requires two numeric values indicating the epoch length (in seconds) for the time-use and fragmentation analyses in part 5.

  • If part5_agg2_60seconds is set to TRUE, then GGIR automatically sets part5_epochSizes = c(60, 60) and returns a warning.
  • If a part5_epochSizes value is not a multiple of windowsizes[1], e.g., windowsizes[1] = 5 and part5_epochSizes = c(8, 60), then aggregation is not possible. In this case, GGIR sets part5_epochSizes = c(windowsizes[1], 60) and returns a warning. In the unlikely event that 60 is not a multiple of windowsizes[1] either, then GGIR sets part5_epochSizes = c(windowsizes[1], windowsizes[1]).
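The fallback logic described in the bullets above could be sketched roughly as follows. Note that check_part5_epochSizes is a hypothetical stand-in for illustration; the actual checks live in GGIR's check_params.R (touched by this PR) and may differ in detail.

```r
# Hypothetical sketch of the fallback logic described above; the real
# checks live in GGIR's check_params.R and may differ in detail.
check_part5_epochSizes = function(part5_epochSizes, windowsizes,
                                  part5_agg2_60seconds = FALSE) {
  ws = windowsizes[1]
  if (part5_agg2_60seconds) {
    warning("part5_agg2_60seconds is TRUE, setting part5_epochSizes = c(60, 60)")
    return(c(60, 60))
  }
  if (any(part5_epochSizes %% ws != 0)) {
    # Fall back to c(ws, 60), or c(ws, ws) when 60 is not a multiple of ws
    warning("part5_epochSizes must be multiples of windowsizes[1]; resetting")
    part5_epochSizes = c(ws, if (60 %% ws == 0) 60 else ws)
  }
  part5_epochSizes
}
```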

Checklist before merging:

  • Existing tests still work (check by running the test suite, e.g. from RStudio).
  • Added tests (if you added functionality) or fixed existing tests (if you fixed a bug).
  • Updated or expanded the documentation.
  • Updated release notes in inst/NEWS.Rd with a user-readable summary. Please, include references to relevant issues or PR discussions.
  • Added your name to the contributors lists in the DESCRIPTION and CITATION.cff files.

@codecov-commenter commented Sep 13, 2023

Codecov Report

Merging #901 (ceb9fb7) into master (859a57f) will decrease coverage by 0.01%.
The diff coverage is 88.27%.

❗ Current head ceb9fb7 differs from the pull request's most recent head 4187ca1. Consider uploading reports for commit 4187ca1 to get more accurate results.

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

@@            Coverage Diff             @@
##           master     #901      +/-   ##
==========================================
- Coverage   78.22%   78.21%   -0.01%     
==========================================
  Files          92       94       +2     
  Lines       12301    12337      +36     
==========================================
+ Hits         9623     9650      +27     
- Misses       2678     2687       +9     
Files Changed Coverage Δ
R/check_params.R 92.81% <40.74%> (-4.81%) ⬇️
R/extract_nightsi.R 85.71% <85.71%> (ø)
R/g.part5.R 81.11% <85.78%> (+0.83%) ⬆️
R/g.part5_analyseSegment.R 93.55% <93.77%> (+0.16%) ⬆️
R/aggregate_to_epoch.R 100.00% <100.00%> (ø)
R/load_params.R 99.13% <100.00%> (+<0.01%) ⬆️

@jhmigueles jhmigueles marked this pull request as draft September 13, 2023 10:22
@jhmigueles jhmigueles marked this pull request as ready for review September 13, 2023 12:19
@vincentvanhees (Member) left a comment

Thanks for having a go at this @jhmigueles. It may be easier to discuss first offline.

My concern is that the added loops and if-statements reduce readability and considerably increase the complexity of the code. Although this may work for what we currently want, it would make it hard to allow for new variations later without adding ever more if-statements and loops.

I am wondering whether it is worth all the effort, because we would eventually have to deal with all this complexity when we fix bugs and implement other enhancements. The user has the option to re-run part 5, so it is not as if GGIR does not already facilitate choosing the epoch size.

I may need to think more about this, but I am wondering whether there would be a way to create all resolutions of ts in a single place and to put them in a list ts_list, which is then forwarded to all existing functions. There we then have full freedom to use whichever resolution we may want without being constrained to where we are in the code.

Advantages:

  • No extra loops or if-statements in g.part 5
  • Full flexibility across part 5 to take advantage of multiple-resolution data. For example, we may want to do fragmentation analysis during the day at 60-second resolution and during the night at 30-second resolution, while doing time-use analysis at 5-second resolution. Similarly, we may at some point want to combine 5-second and 60-second resolution data for time-use analysis.

Disadvantage:

  • Even the approach I describe here still adds complexity as we need to consider carefully what epoch size is used for each part of the code.
  • Exporting time series currently assumes one epoch size, but in the new situation we would need to ask the user which epoch size the exported time series should have.
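The ts_list idea above could be sketched roughly as follows. This is a hypothetical illustration only: build_ts_list and toy_aggregate are made-up names, and toy_aggregate stands in for GGIR's actual aggregation code.

```r
# Hypothetical sketch of the ts_list idea: build every needed resolution
# once, in one place, then pass the list to downstream functions.
# toy_aggregate is a made-up stand-in for GGIR's aggregation code.
toy_aggregate = function(ts, to) {
  epoch = floor(ts$time_num / to) * to  # epoch start each row belongs to
  aggregate(ts["ACC"], by = list(time_num = epoch), FUN = mean)
}
build_ts_list = function(ts, epochSizes) {
  ts_list = list()
  for (to in unique(epochSizes)) {
    ts_list[[as.character(to)]] = toy_aggregate(ts, to)
  }
  ts_list
}
# Example: 5-second input aggregated once to both 30 and 60 seconds;
# downstream code picks the resolution it needs, e.g. ts_list[["60"]]
# for fragmentation and ts_list[["30"]] for time use.
ts = data.frame(time_num = seq(0, 115, by = 5), ACC = 1:24)
ts_list = build_ts_list(ts, c(30, 60))
```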

EDIT: Another route may be:

  • Add an option inside g.fragmentation to aggregate to 60 seconds only there. That way the user can use 5 seconds for time use and 60 seconds for fragmentation with only one new argument, and all code enhancements stay specific to g.fragmentation.

aggregate_to_epoch = function(ts, to, lightpeak_available, params_general) {

# Aggregate time
ts$time_num = floor(as.numeric(iso8601chartime2POSIX(ts$time, tz = params_general[["desiredtz"]])) / to) * to

Timestamp format conversion should not be here. The name of the function only refers to aggregation to epoch, and we risk doing the same conversion multiple times. I think this should be done centrally, for example inside g.part5_initialise_ts.
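The floor-based bucketing in the snippet above can be illustrated standalone, using plain numeric timestamps in place of GGIR's POSIX conversion:

```r
# Standalone illustration of floor-based epoch bucketing: every timestamp
# is mapped to the start of the target epoch it falls in.
time_num = c(0, 5, 55, 60, 65, 115, 120)  # seconds since start
to = 60                                    # target epoch length in seconds
bucket = floor(time_num / to) * to
bucket  # 0 0 0 60 60 60 120: rows sharing a bucket get aggregated together
```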

@@ -0,0 +1,21 @@
extract_nightsi = function(ts, dayborder, desiredtz, tail_expansion_log) {

Why not use or enhance g.detecmidnight? This seems to overlap.

if (ws3new == params_general[["part5_epochSizes"]][1]) classifier = c(classifier, "timeuse")
if (ws3new == params_general[["part5_epochSizes"]][2]) classifier = c(classifier, "fragmentation")
if (ws3new != ws3) {
ts = aggregate_to_epoch(ts, ws3new, lightpeak_available, params_general)
@vincentvanhees commented Sep 14, 2023

You re-use ts from the previous iteration of the loop, which could cause problems:

  • If epochi is not ranked from low to high.
  • First aggregating to, say, 30 seconds and then to 60 may not give the same result as aggregating from 5 to 60 directly. So, if you do this, I would go back to the original ts and aggregate from there.
  • If you want to export ts at high resolution.
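The suggestion in the second bullet could be sketched as follows. This is a hypothetical illustration, not the PR's actual code: toy_aggregate stands in for GGIR's aggregate_to_epoch, and the variable names are only loosely based on the diff.

```r
# Sketch of the reviewer's suggestion: always aggregate from the original
# short-epoch ts, never from the previous iteration's result, so the order
# of epoch sizes no longer matters and ts_orig stays available for export.
toy_aggregate = function(ts, to) {
  epoch = floor(ts$time_num / to) * to
  aggregate(ts["ACC"], by = list(time_num = epoch), FUN = mean)
}
ts_orig = data.frame(time_num = seq(0, 115, by = 5), ACC = rnorm(24))
ws3 = 5                        # original epoch length in seconds
for (ws3new in c(60, 30)) {    # order no longer matters
  ts = if (ws3new != ws3) toy_aggregate(ts_orig, ws3new) else ts_orig
  # ... run the time-use or fragmentation analysis on ts here ...
}
```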

@jhmigueles jhmigueles marked this pull request as draft November 23, 2023 08:37
Development

Successfully merging this pull request may close these issues.

make part5_agg2_60seconds work for fragmentation metrics only, or explain in the vignette better