-
Notifications
You must be signed in to change notification settings - Fork 2.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
*: Introduce config coordinator bundling config specific logic #1744
*: Introduce config coordinator bundling config specific logic #1744
Conversation
This is acting on precisely the same code @stuartnelson3 and I are currently working on. |
@beorn7 absolutely. The two other patches have precedence. |
In case I sounded too negative: I like the refactoring in general. :o) |
I think it's now “safe” to continue with this refactoring. Perhaps you should sync with @stuartnelson3 about basic ideas what should be in main and what should be pushed into packages, cf. his comment. I'll now work on implementing my ideas about making the silence and inhibition checks less expensive. But that should not touch the configuration logic too much. |
738bd85
to
b5f772f
Compare
b5f772f
to
a795818
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. But the AM experts should have their say.
config/coordinator.go
Outdated
} | ||
|
||
// NewCoordinator returns a new coordinator with the given configuration file | ||
// path. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps mention that the config file is not loaded yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Added.
config/coordinator.go
Outdated
|
||
// Coordinator coordinates Alertmanager configurations beyond the lifetime of a | ||
// single configuration. | ||
// TODO: Make Coordinator thread safe |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is that necessary? Isn't it the advantage of the coordinator with the subscription model that you don't need that anymore?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Say there are two reloads triggered via the API, this would require write access to config *Config
in a concurrent fassion. That should be synchronized, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. "Make Coordinator thread safe" made me think you want to somehow enable concurrent reloads.
I think we either can protect the Reload
method by a mutex, or we can document that Reload
must only be called from a single goroutine. (I could imagine the latter is naturally the case, but if not, a mutex might be the most convenient solution.) If that's what you meant anyway, I'd suggest to move this TODO to the Reload
method, reworded as "protect this method with a mutex" and add to the doc comment (for now) that Reload
is not concurrency-safe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that configuration reloading is not a super time sensitive thing and does not need to be parallelized I added Mutex
to protect subscribers
and config
. Let me know what you think.
The failed CircleCI check succeeded when I reran it. Flaky test... |
config/coordinator.go
Outdated
"time" | ||
|
||
"github.com/prometheus/client_golang/prometheus" | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(nit) can be removed.
config/coordinator.go
Outdated
"file", c.cfgFilePath, | ||
) | ||
|
||
conf, plainCFG, err := LoadFile(c.cfgFilePath) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(nit) s/plainCFG/plainCfg/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, Go would use plainCFG
see: https://github.com/golang/go/wiki/CodeReviewComments#initialisms.
I decided to replace all cfg
with config
to stay consistent with config.go
, let me know is you think differently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I wouldn't consider CFG
as an initialism or acronym but I'm not a linguistic expert :-)
cmd/alertmanager/main.go
Outdated
tmpl, err = template.FromGlobs(conf.Templates...) | ||
if err != nil { | ||
return err | ||
level.Error(logger).Log("err", fmt.Errorf("failed to parse templates: %v", err.Error())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we enter this branch, the error wouldn't bubble up anymore. Is that intended? Wouldn't it make sense for Coordinator.Subscribe
to accept func(*Config) error
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, that got me thinking as well. If Coordinator
accepts func(*Config) error
, it would need to handle errors returned by other components logic. What should it do with such error? It can't know what kind of errors are possible, hence it doesn't know how to handle them, right? How would you handle the errors in the Coordinator
code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the Prometheus code, we apply all "reloaders" (equivalent to subscribers), log an error message for any that fails and return a generic error message tot the caller in case one reloader failed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, good point. Did the relevant changes. Instead of returning a generic error message, I am returning and logging a specific one. Let me know if that is ok for you. Should be better for debugging.
8f2bce0
to
6a94cba
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. But as I'm not the domain expert for AM, let's wait for another approve?
config/coordinator.go
Outdated
} | ||
|
||
err = c.notifySubscribers() | ||
if err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any reason for not having the if err := foo.Thing(); err != nil
pattern?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, right. That looks better. Adjusted above as well.
Instead of handling all config specific logic inside Alertmangaer.main(), this patch introduces the config coordinator component. Tasks of the config coordinator: - Load and parse configuration - Notify subscribers on configuration changes - Register and manage configuration specific metrics Signed-off-by: Max Leonard Inden <IndenML@gmail.com>
6a94cba
to
d0cd5a0
Compare
Given that you both approved I will merge. Let me know if you want me to do any follow ups. |
Instead of handling all config specific logic inside
Alertmangaer.main(), this patch introduces the config coordinator
component.
Tasks of the config coordinator:
This patch still has a couple of
TODO
s. I am opening a pull request anyways, to get early feedback whether a refactoring like this is wanted.