*: Introduce config coordinator bundling config specific logic #1744

Merged
merged 1 commit into from
Feb 25, 2019

Conversation

mxinden
Member

@mxinden mxinden commented Feb 6, 2019

Instead of handling all config-specific logic inside
Alertmanager.main(), this patch introduces the config coordinator
component.

Tasks of the config coordinator:

  • Load and parse configuration
  • Notify subscribers on configuration changes
  • Register and manage configuration specific metrics
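For readers skimming this thread, here is a minimal sketch of what a component with these three tasks could look like. It is not the code from this patch: the `load` function stands in for reading and parsing the configuration file, the `reloadOK` field stands in for a real metric, and all names apart from `Coordinator`, `NewCoordinator`, `Subscribe`, `Reload` and `Config` are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// Config is a stand-in for the parsed Alertmanager configuration.
type Config struct {
	Receivers []string
}

// Coordinator is a simplified, hypothetical config coordinator.
type Coordinator struct {
	load        func() (*Config, error) // stand-in for loading and parsing the config file
	config      *Config
	subscribers []func(*Config) error
	reloadOK    bool // stand-in for a "last reload successful" metric
}

// NewCoordinator returns a coordinator. The configuration is not loaded
// until Reload is called.
func NewCoordinator(load func() (*Config, error)) *Coordinator {
	return &Coordinator{load: load}
}

// Subscribe registers callbacks that run after every successful load.
func (c *Coordinator) Subscribe(ss ...func(*Config) error) {
	c.subscribers = append(c.subscribers, ss...)
}

// Reload loads the configuration and notifies all subscribers in order.
// This sketch is not safe for concurrent use.
func (c *Coordinator) Reload() error {
	conf, err := c.load()
	if err != nil {
		c.reloadOK = false
		return fmt.Errorf("loading configuration: %w", err)
	}
	c.config = conf
	for _, s := range c.subscribers {
		if err := s(conf); err != nil {
			c.reloadOK = false
			return fmt.Errorf("notifying subscriber: %w", err)
		}
	}
	c.reloadOK = true
	return nil
}

func main() {
	c := NewCoordinator(func() (*Config, error) {
		return &Config{Receivers: []string{"team-pager"}}, nil
	})
	c.Subscribe(func(conf *Config) error {
		if len(conf.Receivers) == 0 {
			return errors.New("no receivers configured")
		}
		fmt.Println("applied config with receivers:", conf.Receivers)
		return nil
	})
	if err := c.Reload(); err != nil {
		fmt.Println("reload failed:", err)
	}
}
```

Injecting the loader as a function keeps the sketch independent of the file system; the real component would take a file path instead.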

This patch still has a couple of TODOs. I am opening a pull request anyway, to get early feedback on whether a refactoring like this is wanted.

@beorn7
Member

beorn7 commented Feb 6, 2019

This is acting on precisely the same code @stuartnelson3 and I are currently working on.
Can we perhaps postpone the refactoring until we are done with that?

@beorn7
Member

beorn7 commented Feb 6, 2019

See #1743 and #1733 .

@mxinden
Member Author

mxinden commented Feb 6, 2019

> This is acting on precisely the same code @stuartnelson3 and I are currently working on.
> Can we perhaps postpone the refactoring until we are done with that?

@beorn7 absolutely. The two other patches take precedence.

@beorn7
Member

beorn7 commented Feb 6, 2019

In case I sounded too negative: I like the refactoring in general. :o)

@beorn7
Member

beorn7 commented Feb 19, 2019

I think it's now “safe” to continue with this refactoring. Perhaps you should sync with @stuartnelson3 about the basic ideas of what should be in main and what should be pushed into packages, cf. his comment.

I'll now work on implementing my ideas about making the silence and inhibition checks less expensive. But that should not touch the configuration logic too much.

@mxinden mxinden force-pushed the introduce-config-coordinator branch from 738bd85 to b5f772f Compare February 20, 2019 12:46
@mxinden mxinden changed the title [WIP] *: Introduce config coordinator bundling config specific logic *: Introduce config coordinator bundling config specific logic Feb 20, 2019
@mxinden mxinden force-pushed the introduce-config-coordinator branch from b5f772f to a795818 Compare February 20, 2019 12:49
Member

@beorn7 beorn7 left a comment

Looks good to me. But the AM experts should have their say.

}

// NewCoordinator returns a new coordinator with the given configuration file
// path.
Member

Perhaps mention that the config file is not loaded yet.

Member Author

Good point. Added.


// Coordinator coordinates Alertmanager configurations beyond the lifetime of a
// single configuration.
// TODO: Make Coordinator thread safe
Member

Is that necessary? Isn't it the advantage of the coordinator with the subscription model that you don't need that anymore?

Member Author

Say two reloads are triggered via the API at the same time. That would require concurrent write access to config *Config. That should be synchronized, right?

Member

Yes. "Make Coordinator thread safe" made me think you want to somehow enable concurrent reloads.

I think we can either protect the Reload method with a mutex, or document that Reload must only be called from a single goroutine. (I could imagine the latter is naturally the case, but if not, a mutex might be the most convenient solution.) If that's what you meant anyway, I'd suggest moving this TODO to the Reload method, rewording it as "protect this method with a mutex", and adding to the doc comment (for now) that Reload is not concurrency-safe.

Member Author

Given that configuration reloading is not particularly time-sensitive and does not need to be parallelized, I added a Mutex to protect subscribers and config. Let me know what you think.
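To illustrate the mutex approach discussed in this thread, here is a small sketch in which a single `sync.Mutex` guards both the stored config and the subscriber list, so concurrent `Reload` calls are serialized rather than run in parallel. The field names and the fake "load" step are assumptions for illustration, not the code that was merged.

```go
package main

import (
	"fmt"
	"sync"
)

// Config is a stand-in for the parsed configuration.
type Config struct{ Version int }

// Coordinator guards config and subscribers with one mutex (hypothetical layout).
type Coordinator struct {
	mu          sync.Mutex
	config      *Config
	subscribers []func(*Config)
	loads       int
}

// Subscribe registers a callback; safe for concurrent use.
func (c *Coordinator) Subscribe(s func(*Config)) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.subscribers = append(c.subscribers, s)
}

// Reload holds the mutex for the whole reload, so concurrent callers
// are processed one after another.
func (c *Coordinator) Reload() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.loads++
	c.config = &Config{Version: c.loads} // stand-in for loading from disk
	for _, s := range c.subscribers {
		s(c.config)
	}
}

func main() {
	c := &Coordinator{}
	seen := 0
	// The callback runs while c.mu is held, so seen needs no extra locking.
	c.Subscribe(func(*Config) { seen++ })

	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.Reload()
		}()
	}
	wg.Wait()
	fmt.Println("reloads:", c.loads, "notifications:", seen)
}
```

One consequence of this design: a subscriber must not call `Subscribe` or `Reload` on the same coordinator from inside its callback, because `sync.Mutex` is not reentrant and the call would deadlock.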

@beorn7
Member

beorn7 commented Feb 20, 2019

The failed CircleCI check succeeded when I reran it. Flaky test...

"time"

"github.com/prometheus/client_golang/prometheus"

Member

(nit) can be removed.

"file", c.cfgFilePath,
)

conf, plainCFG, err := LoadFile(c.cfgFilePath)
Member

(nit) s/plainCFG/plainCfg/

Member Author

In general, Go would use plainCFG, see: https://github.com/golang/go/wiki/CodeReviewComments#initialisms.

I decided to replace all cfg with config to stay consistent with config.go. Let me know if you think differently.

Member

Hmm, I wouldn't consider CFG as an initialism or acronym but I'm not a linguistic expert :-)

tmpl, err = template.FromGlobs(conf.Templates...)
if err != nil {
return err
level.Error(logger).Log("err", fmt.Errorf("failed to parse templates: %v", err.Error()))
Member

If we enter this branch, the error wouldn't bubble up anymore. Is that intended? Wouldn't it make sense for Coordinator.Subscribe to accept func(*Config) error?

Member Author

Right, that got me thinking as well. If Coordinator accepts func(*Config) error, it would need to handle errors returned by other components' logic. What should it do with such an error? It can't know what kinds of errors are possible, hence it doesn't know how to handle them, right? How would you handle the errors in the Coordinator code?

Member

In the Prometheus code, we apply all "reloaders" (equivalent to subscribers), log an error message for any that fail, and return a generic error message to the caller if at least one reloader failed.

https://github.com/prometheus/prometheus/blob/e4a741cb7dfa24978097971c2c8d5c8cc12e6cef/cmd/prometheus/main.go#L714-L744

Member Author

Ok, good point. I made the relevant changes. Instead of returning a generic error message, I am returning and logging a specific one. Let me know if that is ok for you. It should be better for debugging.
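As a rough illustration of the error handling discussed in this thread, the sketch below runs every subscriber, logs each failure, and returns a specific error naming the first subscriber that failed. Whether the merged code keeps going after the first failure is not stated here, so treat that as an assumption; the `subscriber` type, `notifyAll`, and `errParse` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// Config is a stand-in for the parsed configuration.
type Config struct{}

// errParse is a sample failure used by the usage example below.
var errParse = errors.New("failed to parse templates")

// subscriber pairs a name (used in log messages) with a callback.
type subscriber struct {
	name  string
	apply func(*Config) error
}

// notifyAll runs every subscriber, logs each failure, and returns an
// error describing the first subscriber that failed, or nil.
func notifyAll(conf *Config, subs []subscriber) error {
	var firstErr error
	for _, s := range subs {
		if err := s.apply(conf); err != nil {
			log.Printf("failed to apply configuration: subscriber=%s err=%v", s.name, err)
			if firstErr == nil {
				firstErr = fmt.Errorf("subscriber %q: %w", s.name, err)
			}
		}
	}
	return firstErr
}

func main() {
	subs := []subscriber{
		{"templates", func(*Config) error { return errParse }},
		{"inhibitor", func(*Config) error { return nil }},
	}
	if err := notifyAll(&Config{}, subs); err != nil {
		fmt.Println("reload failed:", err)
	}
}
```

Wrapping with `%w` keeps the original error available to callers via `errors.Is`, while the message still says which subscriber failed.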

@mxinden mxinden force-pushed the introduce-config-coordinator branch 3 times, most recently from 8f2bce0 to 6a94cba Compare February 22, 2019 15:00
Member

@beorn7 beorn7 left a comment

LGTM. But as I'm not the domain expert for AM, let's wait for another approval?

}

err = c.notifySubscribers()
if err != nil {
Contributor

any reason for not having the if err := foo.Thing(); err != nil pattern?

Member Author

Oh, right. That looks better. Adjusted above as well.
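For anyone unfamiliar with the pattern mentioned in this thread, here is a short side-by-side sketch. The function names are made up for illustration; the point is only that the `if` statement's init clause scopes `err` to the check itself.

```go
package main

import (
	"errors"
	"fmt"
)

var errNoSubscribers = errors.New("no subscribers registered")

// notifySubscribers is a hypothetical stand-in that fails when n is zero.
func notifySubscribers(n int) error {
	if n == 0 {
		return errNoSubscribers
	}
	return nil
}

// reloadLong assigns err first and checks it in a separate statement,
// leaving err in scope for the rest of the function.
func reloadLong(n int) error {
	err := notifySubscribers(n)
	if err != nil {
		return err
	}
	return nil
}

// reloadShort uses the init clause of the if statement, so err only
// exists inside the if block.
func reloadShort(n int) error {
	if err := notifySubscribers(n); err != nil {
		return err
	}
	return nil
}

func main() {
	fmt.Println(reloadLong(0), reloadShort(0))
	fmt.Println(reloadLong(1), reloadShort(1))
}
```

Both functions behave identically; the short form just avoids a variable that outlives its use.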

Instead of handling all config-specific logic inside
Alertmanager.main(), this patch introduces the config coordinator
component.

Tasks of the config coordinator:
- Load and parse configuration
- Notify subscribers on configuration changes
- Register and manage configuration specific metrics

Signed-off-by: Max Leonard Inden <IndenML@gmail.com>
@mxinden mxinden force-pushed the introduce-config-coordinator branch from 6a94cba to d0cd5a0 Compare February 25, 2019 10:26
@mxinden
Member Author

mxinden commented Feb 25, 2019

Given that you both approved, I will merge. Let me know if you want me to do any follow-ups.

4 participants