-
Notifications
You must be signed in to change notification settings - Fork 28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update binding params #339
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really nice code clean up. I only find the drain-type
value for aggregate drains not intuitive and confusing.
bindingType = syslog.BINDING_TYPE_METRIC | ||
case "all": | ||
bindingType = syslog.BINDING_TYPE_ALL | ||
case "allWithTimers": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find the parameter value confusing. Why is it allowWithTimers
and it refers to aggregate drains? Why not simply aggregate
as drain type?
Can you please explain what does allow with timers even mean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a very good question.
the short answer of why it's allWithTimers is that...honestly I thought a work in progress name making a pr would get the idea across and I didn't want to wait forever sitting around trying to come up with something better.
Right now "ALL" type doesn't actually allow all envelopes to pass through. So we have multiple types. All let's gauges and counters and logs through. Logs let's logs through, metrics lets gauges and counters through. However, as you may know, there exists more envelope types then that.
The main one in use is
return c.toRFC5424TimerMessage(env, hostname, appID) |
The other one is
return c.toRFC5424EventMessage(env, hostname, appID) |
Aggregate drains get them because a lot of the work towards getting aggregate drains to work was for log-cache. Log-cache needed everything, and so did some people's external integrations I presume, so they were allowed to get everything. App drains were I guess restricted to only get the "good" metrics.
So that leaves us in a tough situation if we want to make things more consistant, because anyone using "all" is likely to be surprised if they suddenly start getting timers and events, but what do you call "more then all". all-with-timers is extremly verbose, but technecially accurate without breaking backwards compatibility.
I'd be open and interested if you have ideas for alternate schemes tho.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general
- we've seen some people ask for metrics only aggregate drains
and - it seems foolish to me to enforce the end of timers through inconsistant egress fiat rather then via reworking diego and router to use the new types first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(The enum AGGREGATE_DRAIN could/should probubly also be changed. Maybe should have done it. Not super comfy with "all-with-timers", but might be the best we have).
Thanks for calling attention to this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Benjamintf1 Thanks for the clarification. It's good that we are thinking tin he same direction :) I also understood allWithTimers
as everything that comes with all
+ the timers.
I find this change as a must have, but I would do it other way:
Why don't we use the same parameters as it is done in the Log Cache CLI's tail command, two parameters envelope-type
and envelope-class
? This way we'll have the same API on both places.
The other option that came to my mind if we don't want to change the current drain-type=all
behavior, is to add another include
parameter and make a union of the current all
and timers
and/or events
.
I'm personally for unification of the API and do the same what is done in the Log Cache CLI.
What do you think about this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with you, but let's have timers
as a new type and all
will return all types, logs, metrics and timers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the more natural division here is metrics(counters and gauges), logs(logs, maybe events but those kinda don't exist), and timers(aka traces).
I like this division! What if
- Replace
allWithTimers
with atimers
type, and made thelogs
type include events. - Then either introduce a breaking change or feature flag to switch
all
to really send everything.
(1) doesn't seem like a breaking change in my mind given the seeming nonexistence of events, and it allows for users to at least have some way to retrieve timers if they want them. It's less efficient for sure, but you could, in that world, have two different drains one for logs
/metrics
/all
, and another for timers
.
(2) would move us toward an all
type that makes sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤔 Possibly the simplest answer for now is to remove allWithTimers
from this PR and create a GH issue to discuss with how to incorporate timers moving forward.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed allWithTimers.
Added new parameter "drain-data". Drain data operates under "new" logs/metrics/traces/all split vs "all/logs-and-metrics for app drains/logs/metrics". Events under logs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Arguably events should be under events, but loggregator isn't a good eventing system. Events should be deprecated and removed. But events arn't used much, so it's reasonable to keep it under logs for what they're used for.
return bindingType | ||
} | ||
|
||
func getRemoveMetadataQuery(u *url.URL) string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is cool way to avoid breaking changes in the API and keep backward compatibility. Well done! :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand avoiding breaking changes is important :D
…at complies with logs/metrics/traces/all split
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that both drain-type
and drain-data
control the same thing, what will be forwarded by the syslog drain, which is confusing.
|
||
func getBindingType(u *url.URL) syslog.DrainData { | ||
drainData := syslog.LOGS | ||
switch u.Query().Get("drain-type") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Benjamintf1 I know that with this handling of the drain-type
and the addition of the drain-data
parameter you've kept the things 100% backward compatible, but now we have two parameters which control the same thing: what will be forwarded by the syslog drain. That is confusing. Especially if a user specifies values for both parameters.
I was hoping that you will add traces
as drain-type
and drain-type=all
will return logs + metrics + traces. IMHO this is an enhancement of the current all
behavior and not a breaking change. If someone needs some drain type which is not covered by the built in combinations, they can create multiple drains.
The other clean solution would be to remove the drain-type
parameter and everything will be controlled the drain-data
parameter. The users can define something like drain-data=logs,metrics
or drain-data=metrics,traces
and we parse the value and forward the envelope types which are specified with the parameter... It will a breaking change, but we could provide a script which will adjust the urls and replace the drain-type
with an equivalent value for he drain-data
parameter. If the drain-type
parameter is missing we will add an drain-data
value which is equivalent for the current drain-type
behavior.
What do you think about this?
@ctlong Any thoughts on this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On a second thought, after sleeping over it, maybe we can stay backward compatible if we make the parameters mutually exclusive.
We could have something like:
if u.Query().Get("include-metrics-deprecated") != "" {
return syslog.ALL
}
switch u.Query().Get("drain-data") {
case "value" :
return syslog_type
...
}
switch u.Query().Get("drain-type") {
case "value" :
return syslog_type
....
}
This way:
- if both parameters
drain-data
anddrain-type
are specifieddrain-data
will have priority - the currently active behavior can be still reached if only the
drain-type
parameter is specified, so no breaking changes.
I find the usage of both parameters still confusing in this case, but this might be a good compromise if we don't won't to introduce breaking changes. TBH I'm more in favor of breaking changes, and do the things clean with only one parameter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They're already mutually exclusive in the pr? The new drain-data overrides the old drain-type parameter. Also, given we havn't documented drain-type especially well, I believe most of the usage will be the old deprecated drain-cli usage, which shouldn't have mixed flags.
include-metrics-deprecated also I guess overrides both? I didn't have a paticularly good reason to make it override everything, but I believe it should mostly be being used for log-cache, which we can replace with that all type flag at some point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Benjamintf1 Thanks for the explanation. I've checked the code once again and saw that everything is as it should be.
Regarding the docs, I will adjust the Streaming app logs to log management services page and add the new parameter.
Should I maybe write that the new drain-data
parameter is the way to go and drain-type
is used for backward compatibility when someone sets its value to all
?
Description
Type of change
Testing performed?
Checklist:
main
branch, or relevant version branchIf you have any questions, or want to get attention for a PR or issue please reach out on the #logging-and-metrics channel in the cloudfoundry slack