This repository has been archived by the owner on Dec 1, 2018. It is now read-only.

Introduce Oldtimer (with InfluxDB sink support) #1172

Merged
merged 2 commits into from
Jul 14, 2016

Conversation

DirectXMan12
Contributor

@DirectXMan12 DirectXMan12 commented May 23, 2016

This PR introduces Oldtimer, proposed in docs/proposals/old-timer.md.

Oldtimer is accessible at /api/v1/historical and mirrors the model API. It can be enabled by passing --historical_source with one of the URIs used in the --sink argument.

Currently this PR supports the InfluxDB sink. Support for the rest of the metrics sinks should be coming along shortly in separate PRs (although if someone more familiar with one of the individual sinks is interested in writing the Oldtimer support for that sink, that would be appreciated).

cc @burmanm @piosz @mwielgus @bryk

@k8s-bot

k8s-bot commented May 23, 2016

Can one of the admins verify that this patch is reasonable to test? If so, please reply "ok to test".
(Note: "add to whitelist" is no longer supported. Please update configurations in kubernetes/test-infra/jenkins/job-configs/kubernetes-jenkins-pull instead.)

This message may repeat a few times in short succession due to jenkinsci/ghprb-plugin#292. Sorry.

Otherwise, if this message is too spammy, please complain to ixdy.

@DirectXMan12
Contributor Author

As far as sinks go, this PR includes InfluxDB, and Hawkular will probably be next (IIRC, @burmanm was interested in helping out there). After that, I'm assuming GCM is the next priority. Depending on time constraints, I can then work on OpenTSDB (or Riemann or Elasticsearch, but I haven't investigated those quite yet).

@DirectXMan12
Contributor Author

Ok, I've removed the [WIP] tags from this, since it was decided that additional sinks will get their own PRs.

@DirectXMan12 DirectXMan12 changed the title [WIP] Introduce Oldtimer Introduce Oldtimer Jun 3, 2016
@DirectXMan12 DirectXMan12 changed the title Introduce Oldtimer Introduce Oldtimer (with InfluxDB sink support) Jun 3, 2016

// GetAggregation fetches the given aggregations for one or more objects (specified by metricKeys) of
// the same type, within the given time interval, calculated over a series of buckets
GetAggregation(metricName string, aggregations []string, metricKeys []HistoricalKey, start, end time.Time, bucketSize time.Duration) (map[HistoricalKey][]TimestampedAggregationValue, error)
Contributor

I'm not fond of the idea of passing []string type here. Couldn't we make it a type that automatically restricts the available values and makes it easier to read?

Now each sink writer has to know that these strings are actually vars from historical_types.go

Considering that every sink writer has to modify them to their internal naming / functions in any case.

Contributor Author

Just a type alias around string? Sure. I often think that Go could use a proper enum type, though...
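
A minimal sketch of the string-backed type suggested here. The constant names and the ParseAggregation helper are hypothetical and are not taken from historical_types.go. Note that Go still lets an untyped string literal be used as an AggregationType, so a defined type documents intent and stops plain string variables from being passed by accident, but it is not a closed enum; untrusted input still needs an explicit check.

```go
package main

import "fmt"

// AggregationType is a string-backed type, so a plain string variable
// cannot be passed where an aggregation name is expected.
type AggregationType string

// Hypothetical constant names, for illustration only.
const (
	AggregationAverage AggregationType = "average"
	AggregationMaximum AggregationType = "max"
	AggregationMinimum AggregationType = "min"
)

// knownAggregations lists the values a sink is expected to handle.
var knownAggregations = map[AggregationType]bool{
	AggregationAverage: true,
	AggregationMaximum: true,
	AggregationMinimum: true,
}

// ParseAggregation converts untrusted input (for example a query parameter)
// into an AggregationType, rejecting anything outside the known set.
func ParseAggregation(raw string) (AggregationType, error) {
	agg := AggregationType(raw)
	if !knownAggregations[agg] {
		return "", fmt.Errorf("unknown aggregation %q", raw)
	}
	return agg, nil
}

func main() {
	for _, raw := range []string{"average", "median"} {
		agg, err := ParseAggregation(raw)
		if err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("accepted:", agg)
	}
}
```

Each sink would then map these values to its own function names in one place, for example with a switch over AggregationType.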

@piosz
Contributor

piosz commented Jun 14, 2016

Please do not merge this until we cut the release 1.1 branch.


// GetAggregation fetches the given aggregations for one or more objects (specified by metricKeys) of
// the same type, within the given time interval, calculated over a series of buckets
GetAggregation(metricName string, aggregations []AggregationType, metricKeys []HistoricalKey, start, end time.Time, bucketSize time.Duration) (map[HistoricalKey][]TimestampedAggregationValue, error)
Contributor

This question applies to all of these methods. Can any of the parameters be passed with an invalid value, such as an end time of 0, a start time of 0 (or both), or a bucketSize of 0? I see there are checks for these in the example Influx implementation, but nothing in these interface definitions.

And if they can, how should they be handled? I can make a lot of assumptions (and derive something from the Influx example), but it would be nice to have these documented to make it easier to implement a proper HistoricalSource.

Contributor Author

I'll add in documentation to the interface, good catch.

Contributor Author

(but yes, they can have zero/empty values, which indicate no bounds for time, and to use only a single bucket for bucket size).
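
As a rough illustration of that convention, a sink could normalize the raw parameters before building its query. This is only a sketch: QueryBounds and Interpret are hypothetical names, and the two error checks are extra assumptions rather than anything specified in this thread. (A later comment in this conversation makes the start time mandatory, which would turn the zero-start case into an error.)

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// QueryBounds is a hypothetical, normalized view of the raw query parameters.
type QueryBounds struct {
	BoundedStart bool // false when start is the zero time (no lower bound)
	BoundedEnd   bool // false when end is the zero time (no upper bound)
	SingleBucket bool // true when bucketSize is zero (one bucket for the range)
}

// Example inputs, shared by main below.
var (
	noBound time.Time // the zero time.Time
	july1   = time.Date(2016, 7, 1, 0, 0, 0, 0, time.UTC)
)

// Interpret applies the zero-value rules: a zero time means "no bound" and a
// zero bucket size means "a single bucket". The two error checks are
// assumptions added for this sketch.
func Interpret(start, end time.Time, bucketSize time.Duration) (QueryBounds, error) {
	if bucketSize < 0 {
		return QueryBounds{}, errors.New("bucket size must not be negative")
	}
	if !start.IsZero() && !end.IsZero() && end.Before(start) {
		return QueryBounds{}, errors.New("end time is before start time")
	}
	return QueryBounds{
		BoundedStart: !start.IsZero(),
		BoundedEnd:   !end.IsZero(),
		SingleBucket: bucketSize == 0,
	}, nil
}

func main() {
	open, _ := Interpret(noBound, noBound, 0)
	fmt.Printf("%+v\n", open)

	hourly, _ := Interpret(july1, noBound, time.Hour)
	fmt.Printf("%+v\n", hourly)
}
```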

@ncdc

ncdc commented Jun 24, 2016

What's the status on this? @DirectXMan12 is there more code to do from your end? @mwielgus does your team have any other comments? I see that the release-1.1 branch exists, so how close do you think we are to merging this?

@DirectXMan12 DirectXMan12 force-pushed the feature/oldtimer branch 4 times, most recently from 0d1857a to 28b15ae Compare June 29, 2016 21:19
@k8s-bot

k8s-bot commented Jun 29, 2016

Jenkins GCE e2e

Build/test failed for commit 28b15ae.

To(metrics.InstrumentRouteFunc("podMetrics", a.podAggregations)).
Doc("Export some pod-level metric aggregations").
Operation("podAggregations").
Param(ws.PathParameter("pod-id", "The id of the pod to lookup").DataType("string")).
Contributor

What exactly is the pod id? Please document.

Contributor Author

@DirectXMan12 DirectXMan12 Jul 1, 2016

ack, will do. It's the UID of the pod (as recorded in the pod metadata in Kubernetes). I'll just change the text there to say UID.

typeSel := fmt.Sprintf("type = '%s'", key.ObjectType)
switch key.ObjectType {
case core.MetricSetTypeNode:
return fmt.Sprintf("%s AND %s = '%s'", typeSel, core.LabelNodename.Key, key.NodeName)
Contributor

same here, can we escape the parameters?

Contributor Author

@DirectXMan12 DirectXMan12 Jul 1, 2016

For Influx, key values are checked in checkSanitizedKey above, and an error is thrown if anything besides /^[a-zA-Z0-9_.-]+$/ is found.

Each of the sinks is individually responsible (instead of sanitization being at the API layer) since some sink drivers will support bound parameters, which are more flexible.
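
For readers following along, the allow-list approach can be sketched as below. This is not the PR's checkSanitizedKey itself: checkKeyValue and nodeSelector are hypothetical stand-ins, and the literal 'node' and nodename strings are assumed values for core.MetricSetTypeNode and core.LabelNodename.Key. Only the character class is taken from the comment above.

```go
package main

import (
	"fmt"
	"regexp"
)

// safeKeyValue matches the character set quoted in the discussion.
// In Go's regexp syntax, $ without the m flag matches only at the very end
// of the text, so a trailing newline is rejected as well.
var safeKeyValue = regexp.MustCompile(`^[a-zA-Z0-9_.-]+$`)

// checkKeyValue is a hypothetical stand-in for the sink's own check: it
// returns an error instead of trying to escape unexpected characters.
func checkKeyValue(value string) error {
	if !safeKeyValue.MatchString(value) {
		return fmt.Errorf("invalid characters in key value %q", value)
	}
	return nil
}

// nodeSelector builds the WHERE fragment only after the value has passed
// validation. The "node" and "nodename" literals are assumptions.
func nodeSelector(nodeName string) (string, error) {
	if err := checkKeyValue(nodeName); err != nil {
		return "", err
	}
	return fmt.Sprintf("type = 'node' AND nodename = '%s'", nodeName), nil
}

func main() {
	for _, name := range []string{"node-1.example", "x' OR '1'='1"} {
		sel, err := nodeSelector(name)
		if err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("selector:", sel)
	}
}
```

Rejecting anything outside the allow-list avoids having to reason about the query language's quoting rules, at the cost of refusing some names that a driver with bound parameters could accept.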

@k8s-bot

k8s-bot commented Jul 1, 2016

Jenkins GCE e2e

Build/test passed for commit 6a7dce2.

@k8s-bot

k8s-bot commented Jul 2, 2016

Jenkins GCE e2e

Build/test failed for commit 5834108.

@DirectXMan12
Contributor Author

Ok, I've added tests for the model handlers.

@k8s-bot

k8s-bot commented Jul 6, 2016

Jenkins GCE e2e

Build/test passed for commit e84ab8f.

@mwielgus
Contributor

mwielgus commented Jul 7, 2016

Tests fail

@DirectXMan12
Contributor Author

Oddly, it passes locally (go version go1.6.1 linux/amd64). Looks like timezones in the timestamps in the test are changing across serialization and deserialization.

@mwielgus
Contributor

Can you fix it anyhow? I cannot merge a PR that breaks unit tests.

@DirectXMan12
Contributor Author

@mwielgus yep, just need to figure out what's going on ;-)

@k8s-bot

k8s-bot commented Jul 13, 2016

Jenkins GCE e2e

Build/test passed for commit d642641.

@DirectXMan12 DirectXMan12 force-pushed the feature/oldtimer branch 2 times, most recently from f355bb9 to b36ca14 Compare July 13, 2016 20:53
Oldtimer is the Heapster historical metrics access mechanism. This
commit introduces the API for Oldtimer, which mirrors the Heapster
model API, with additional paths for retrieving metrics aggregated over
time.

In order for the historical API to function, one specified sink must
support the historical access interface defined in this commit.
This commit adds support for historical access (Oldtimer) to the
InfluxDB sink.
@DirectXMan12
Contributor Author

Ok, I've pushed one final change, which makes the start time parameter mandatory (and mandatorily non-zero), since different sinks treat a zero time differently, and some sinks have issues with excessively large durations (so requiring an explicit start time makes sense here). Thanks to @burmanm for pointing out the issue.

Thanks to @ncdc for tracking down the issue with the times. It had to do with Go having a special "Local" timezone internally that was different from the actual local timezone (but "equivalent"), so DeepEqual considered them different (not entirely sure why it worked in east coast time, though).
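
The general class of problem can be reproduced with a small sketch. It does not recreate the exact "Local" case from the failing test; it uses time.UTC and an equivalent fixed zone as an assumed stand-in, and sampleTimes and compare are hypothetical helpers. The point is that reflect.DeepEqual compares the Location stored inside a time.Time, while Time.Equal compares only the instant.

```go
package main

import (
	"fmt"
	"reflect"
	"time"
)

// sampleTimes returns the same instant twice: once in time.UTC and once in
// a fixed zone with the same name and offset.
func sampleTimes() (time.Time, time.Time) {
	utc := time.Date(2016, 7, 13, 20, 53, 0, 0, time.UTC)
	fixed := utc.In(time.FixedZone("UTC", 0))
	return utc, fixed
}

// compare reports what reflect.DeepEqual and Time.Equal say about two times.
func compare(a, b time.Time) (deepEqual, sameInstant bool) {
	return reflect.DeepEqual(a, b), a.Equal(b)
}

func main() {
	a, b := sampleTimes()

	deep, same := compare(a, b)
	fmt.Println("DeepEqual:", deep, "Equal:", same)

	// Normalizing both values to UTC before comparing makes the stored
	// locations identical, so DeepEqual agrees with Equal again.
	deepNorm, _ := compare(a.UTC(), b.UTC())
	fmt.Println("DeepEqual after UTC():", deepNorm)
}
```

In tests that round-trip timestamps through serialization, comparing with Time.Equal or normalizing with UTC() before a DeepEqual comparison is one way to sidestep this.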

@DirectXMan12
Contributor Author

@mwielgus should be good to go!

@k8s-bot

k8s-bot commented Jul 13, 2016

Jenkins GCE e2e

Build/test passed for commit f355bb9.

@k8s-bot

k8s-bot commented Jul 13, 2016

Jenkins GCE e2e

Build/test passed for commit b36ca14.

@mwielgus
Contributor

LGTM

@mwielgus mwielgus added the lgtm Indicates that a PR is ready to be merged. label Jul 14, 2016
@mwielgus mwielgus merged commit 0e1ec20 into kubernetes-retired:master Jul 14, 2016
@DirectXMan12 DirectXMan12 deleted the feature/oldtimer branch July 27, 2016 02:39
7 participants