Provide generalized aggregation #77
docs/_layouts/default.html
Outdated
@@ -27,7 +27,7 @@
 <script src="{{ site.baseurl }}/assets/js/vendor/moment-with-locales.min.js"></script>
 <script src="{{ site.baseurl }}/assets/js/vendor/Chart-2.7.1.min.js"></script>
 <script src="{{ site.baseurl }}/assets/js/vendor/spin-2.3.2.min.js"></script>
-<script src="{{ site.baseurl }}/assets/js/charts.js?version=1ff0187"></script>
+<script src="{{ site.baseurl }}/assets/js/charts.js?version=e7e9c5a"></script>
Is there a way to generate a random hash or a hash over a file with Jekyll (via GitHub pages)?
/cc @parkr 😄
site.github.build_revision would work on Pages. https://help.github.com/articles/repository-metadata-on-github-pages/
@parkr: Thanks for the pointer! This would generate new ?version= hashes for each commit, right? Do you know of a way to, for example, get a hash based on the actual file content? In this way, we wouldn’t force client or server cache flushes with every single commit. (I’m just asking, your solution would solve our problem pretty well already!)
No hashes on a per-file basis are generated, so that’s the best solution that will achieve cache misses based on what’s in Pages.
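For comparison, a content-based version token would need a separate build step outside of Pages. The following is only a sketch of that alternative, not something Pages or this pull request provides; the helper name `contentVersion` and the choice of a 7-character SHA-256 prefix are assumptions made for illustration.

```javascript
const crypto = require('crypto');

// Hypothetical helper: derive a short version token from a file's content,
// so that the token only changes when the content changes
function contentVersion(content)
{
    return crypto.createHash('sha256').update(content).digest('hex').slice(0, 7);
}

// Example with inline content; a real build step would pass the result of
// fs.readFileSync() for the asset in question instead
const version = contentVersion('console.log("charts");');
console.log(`charts.js?version=${version}`);
```

Identical content always yields the same token, so unrelated commits would not invalidate cached copies of the asset.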
Okay, then this is the way to go. Thanks, @parkr!
194f329 to 7dfc075
In case someone is interested, I’ll come back to this soon. I wanted to have #81 in place before finishing this.
docs/spec/charts.js
Outdated
createList,
createTable,
createSpinner,
d3,
Globals can be defined in a variety of ways in ESLint. Besides inlining the list into particular .js files (like we do here), you can also specify them via ESLint config files, similar to how we do it in the browser-context specific .eslintrc.json file (both the env and globals config properties, in their own different ways, define what globals are expected to be available). I think we should move any globals defined by external libraries that our scripts depend on, such as d3, jQuery, and moment.js, to the configuration files, rather than keeping running lists in each .js file. That way, if our webapp frontend ends up adding a new .js file, we won't have to duplicate the inline comment allowing for d3 globals.
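As a rough illustration of this suggestion, the sketch below declares library globals once in a shared config. It is written as a JavaScript object (as in an .eslintrc.js file) purely for illustration; the same properties could go into the existing .eslintrc.json. The exact list of globals and environments is an assumption, not the project's actual configuration.

```javascript
// Sketch of a browser-context ESLint config (assumed contents, for illustration only)
const browserLintConfig = {
    env: {
        // Enables predefined browser globals such as window and document
        browser: true,
        // Enables the predefined jQuery globals
        jquery: true,
    },
    globals: {
        // false marks a global as read-only
        // (newer ESLint versions also accept the string 'readonly')
        d3: false,
        moment: false,
        Chart: false,
    },
};

module.exports = browserLintConfig;
```

With a config like this in place, individual scripts would no longer need their own inline global comments for these libraries.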
Thanks, that’s a good pointer. I wasn’t happy with inlining these globals either, and I second that it would be best to configure this in the .eslintrc.json.
I’ll return to this pull request in a couple of days. I’ll have to rebase these changes on top of the other new stuff, and then I’ll also adjust the global directives.
@filmaj: I tried to change the globals declaration for d3 to true without success. I’d say let’s tidy up the global declaration in a separate pull request 🙂.
Noted, I can take that on.
That would be nice, thanks 😄!
@pluehne I added another commit to my fork's branch of this PR that should fix that, feel free to pull in to this PR: https://github.com/filmaj/hubble/tree/patrick/improved-aggregation
@filmaj: Thanks a lot! What I didn’t understand was that there are multiple .eslintrc.json files with different purposes: one for linting the assets themselves and one for linting the unit tests.
I squashed your changes into my commit above (to not have one commit do it improperly and another one fixing it right after), and I hope that this is fine with you 🙂.
@pluehne yep perfectly fine 👍
b5944e2 to 618bf5f
618bf5f to 01b488f
Huh, this code coverage comment uses the compact "header" layout. 🤔
f50edb7 to 379562a
data.sort((row1, row2) => row1['date'] - row2['date']);

const dateStart = data[0]['date'];
// Ranges are exclusive, so add one more day to include the last date
s/last date/last day/?
Referring to the “last date” was intentional, because this actually is about dates. However, “day” isn’t wrong either … not sure whether that’s worth rephrasing though.
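To make the off-by-one concern behind that code comment concrete, here is a small sketch of a half-open date range. It uses plain JavaScript dates in UTC rather than whatever range helper the pull request relies on; `dayRange` and `MS_PER_DAY` are hypothetical names.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Returns every UTC day in the half-open range [start, end)
function dayRange(start, end)
{
    const days = [];
    for (let time = start.getTime(); time < end.getTime(); time += MS_PER_DAY)
        days.push(new Date(time));
    return days;
}

const data = [
    {date: new Date(Date.UTC(2018, 0, 3)), value: 3},
    {date: new Date(Date.UTC(2018, 0, 1)), value: 1},
    {date: new Date(Date.UTC(2018, 0, 2)), value: 2},
];
data.sort((row1, row2) => row1.date - row2.date);

const dateStart = data[0].date;
const lastDate = data[data.length - 1].date;
// The range end is exclusive, so add one more day to include the last date
const dateEnd = new Date(lastDate.getTime() + MS_PER_DAY);

console.log(dayRange(dateStart, lastDate).length); // 2, the last date is missing
console.log(dayRange(dateStart, dateEnd).length); // 3, all dates are covered
```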
docs/assets/js/charts.js
Outdated
// Note that this assumes complete data in the period
// Should data points be missing, aggregation methods such as the sum will lead to results that can't be
// compared to periods with complete data
// Hence, the maintainers of the data need to ensure that the input is well-formed
micro-nit: It might be more readable if you add a . at the end of the sentence.
I didn’t add periods, because we don’t have them in comments in other places. Should we do this everywhere then? Or put periods just between sentences in multisentence comments?
Okay, then I’d say that we put periods between sentences systematically except for the end of comments (as is practiced by many print magazines, for example).
I changed this as I suggested above.
docs/assets/js/charts.js
Outdated
function(keyID, key)
{
    if (key == 'date')
        return;
micro-nit: that might be overly cautious as we use the value already in line 155 above
This check is to exclude the dates themselves from being aggregated.
The data points all have the structure:
{
"date": "yyyy-mm-dd",
"key1": "value1",
"key2": "value2",
...
}
We only want to aggregate the values of key1 and key2 but not date.
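The effect of that check can be sketched as follows. This is not the code from charts.js; `aggregateRows` is a hypothetical helper, and the values are numbers here so that they can actually be summed.

```javascript
// Rows of one period, following the structure described above
const rows = [
    {date: '2018-01-01', key1: 1, key2: 10},
    {date: '2018-01-02', key1: 2, key2: 20},
    {date: '2018-01-03', key1: 6, key2: 30},
];

// Hypothetical helper: aggregate every key except 'date' with the given method
function aggregateRows(rows, method)
{
    const result = {};
    for (const key of Object.keys(rows[0]))
    {
        // The dates identify the data points and must not be aggregated themselves
        if (key === 'date')
            continue;
        result[key] = method(rows.map(row => row[key]));
    }
    return result;
}

const sum = values => values.reduce((total, value) => total + value, 0);
console.log(aggregateRows(rows, sum)); // { key1: 9, key2: 60 }
```

Without the check, the method would also be applied to the date strings, which has no meaningful result.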
ce65f17 to 7f0abc0
This implements a generalized method for aggregating time-series data. Data can be aggregated over week or month intervals with a variety of aggregation methods to choose from.
This will be useful for providing chart views at different levels (such as two-year periods vs. just showing the last month). Additionally, the generalized form of aggregation can be used to smooth out graphs where the sampling frequency changed with an update to Hubble Enterprise.
The aggregation is done by splitting the time data into subsequent, gapless periods of time (weeks starting with Mondays or months), for each of which the aggregated values are then computed and returned.
Aggregation methods define how to aggregate the values within individual time periods. The following aggregation methods are supported:
- sum
- mean
- min
- max
- first (the chronologically first available value for that period)
- last
- median
Periods with incomplete data at the beginning or the end of the time series are excluded from the aggregation.
Finally, the pull request usage chart is changed to make use of the new aggregation facilities to reduce the granularity from daily to monthly data for now. This might be changed when we implement detail views.
I also added several unit tests to check the aggregation methods (for off-by-one errors in particular) as well as a short piece of documentation on the new configuration options.
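The description above can be illustrated with a simplified sketch. It is not the implementation in charts.js: it handles months only (no weeks), assumes exactly one numeric data point per day with dates given as 'yyyy-mm-dd' strings, and drops any month that is not fully covered, which under that assumption amounts to the partial months at the beginning and end. The names `aggregators`, `daysInMonth`, and `aggregateByMonth` are hypothetical.

```javascript
// The aggregation methods listed above; each maps a non-empty,
// chronologically ordered array of numbers to a single value
const aggregators = {
    sum: values => values.reduce((total, value) => total + value, 0),
    mean: values => aggregators.sum(values) / values.length,
    min: values => Math.min(...values),
    max: values => Math.max(...values),
    first: values => values[0],
    last: values => values[values.length - 1],
    median: values =>
    {
        const sorted = [...values].sort((a, b) => a - b);
        const middle = Math.floor(sorted.length / 2);
        return sorted.length % 2 ? sorted[middle] : (sorted[middle - 1] + sorted[middle]) / 2;
    },
};

// Number of days in the given month (month is zero-based, as in Date)
const daysInMonth = (year, month) => new Date(Date.UTC(year, month + 1, 0)).getUTCDate();

// Aggregates daily {date: 'yyyy-mm-dd', value: number} rows per calendar month
function aggregateByMonth(data, method)
{
    const periods = new Map();
    const sorted = [...data].sort((row1, row2) => row1.date.localeCompare(row2.date));
    for (const row of sorted)
    {
        const month = row.date.slice(0, 7);
        if (!periods.has(month))
            periods.set(month, []);
        periods.get(month).push(row.value);
    }

    const result = [];
    for (const [month, values] of periods)
    {
        const [year, monthNumber] = month.split('-').map(Number);
        // Skip months that aren't fully covered, such as partial months at either end
        if (values.length < daysInMonth(year, monthNumber - 1))
            continue;
        result.push({date: `${month}-01`, value: aggregators[method](values)});
    }
    return result;
}

// One data point per day from 2018-01-30 to 2018-03-02, each with the value 1
const data = [];
for (let time = Date.UTC(2018, 0, 30); time <= Date.UTC(2018, 2, 2); time += 24 * 60 * 60 * 1000)
    data.push({date: new Date(time).toISOString().slice(0, 10), value: 1});

console.log(aggregateByMonth(data, 'sum')); // [ { date: '2018-02-01', value: 28 } ]
```

In this example, only February is fully covered, so the two days of January and the two days of March are left out of the result.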
7f0abc0 to 01e714c