Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

allow dynamic updating of instance scrape_configs and remote_write #145

Merged
merged 3 commits into from
Aug 6, 2020

Conversation

rfratto
Copy link
Member

@rfratto rfratto commented Aug 5, 2020

This PR introduces an Update method against the Instance that allows for applying new changes found on remote_write and scrape_configs sections. If any other field changes, Update will return ErrInvalidUpdate, which will cause the InstanceManager to do the normal "restart" (i.e., shut down old instance, bring up new one with modified config).

Future changes may loosen the restriction on what is able to be dynamically updated, but I suspect that scrape_configs and remote_write are going to be far more common than anything else.

Some code cleanup can be found as part of this PR: fakes that have been called mocks have been updated to use their proper name.

Closes #140.

This commit introduces an Update method against the Instance that allows
for applying new changes found on remote_write and scrape_configs
sections for a particular instance. If any other field changes, Update
will return ErrInvalidUpdate, which will cause the InstanceManager to
do a normal "restart" of the instance.

Future commits may loosen the restriction on what is able to be
dynamically updated.

Some code cleanup can be found as part of this commit: fakes that have
been called mocks have been updated to use their proper name.
@rfratto rfratto requested a review from gotjosh August 5, 2020 17:09
// 4. Validates that after 15 seconds, the scrape endpoint and remote_write
// endpoint has been called.
func TestInstance_Update(t *testing.T) {
prometheus.DefaultRegisterer = prometheus.NewRegistry()
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not happy that I have to do this, and I'm planning on cleaning this up with the PR for #141.

rfratto referenced this pull request in rfratto/agent Aug 5, 2020
This commit migrates InstanceManager to instance.BasicManager. This is
to reduce duplication as we add more implementations of the
instance.Manager interface (including removing ha.InstanceManager, which
was a duplicate of that interface).

While creating this commit, I noticed that there was a leaked locked
mutex when DeleteConfig fails. That has been fixed with this commit.

Related to the work for #141.
Depends on #145.
pkg/prom/instance/instance.go Outdated Show resolved Hide resolved
// failed, indicating that the instance config changed in an incompatible way
// and it must be fully stopped and replaced with a new instance for the new
// config to apply.
func (im *InstanceManager) forceApply(c instance.Config) error {
Copy link
Member Author

@rfratto rfratto Aug 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this method needs to exist. I'd remove it here, but I already took it out with #146

Copy link
Contributor

@jeschkies jeschkies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I cannot give much feedback. In any case, what is an instance?

func (i *Instance) Config() Config {
i.cfgMutex.Lock()
defer i.cfgMutex.Unlock()
return i.cfg
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a naive question. Why is the lock required here? Isn't cfg a reference?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i.cfg will be copied by value here, but the mutex is needed to prevent i.cfg being read in the middle of a write from another Goroutine (i.e., the one calling Update), which is undefined behavior in Go.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, got it. Thanks 🙂

@rfratto
Copy link
Member Author

rfratto commented Aug 6, 2020

I cannot give much feedback. In any case, what is an instance?

Any feedback is appreciated :) Instance is a (poor) name I came up with to describe the combination of the Prometheus components we run. It's the set of service discovery, scraping, remote_write, and a WAL. The Agent can run any number of instances depending on how it's configured.

Copy link
Contributor

@gotjosh gotjosh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! My comments are all nits - feel free to merge after choosing (if any) which ones to tackle.

pkg/prom/instance/instance.go Outdated Show resolved Hide resolved
pkg/prom/instance/instance.go Outdated Show resolved Hide resolved
pkg/prom/instance/instance.go Outdated Show resolved Hide resolved
pkg/prom/instance/instance.go Outdated Show resolved Hide resolved
pkg/prom/instance/instance.go Outdated Show resolved Hide resolved
@rfratto rfratto merged commit e5e440d into grafana:master Aug 6, 2020
@rfratto rfratto deleted the dynamic-update-instance branch August 6, 2020 18:46
rfratto referenced this pull request in rfratto/agent Aug 6, 2020
This commit migrates InstanceManager to instance.BasicManager. This is
to reduce duplication as we add more implementations of the
instance.Manager interface (including removing ha.InstanceManager, which
was a duplicate of that interface).

While creating this commit, I noticed that there was a leaked locked
mutex when DeleteConfig fails. That has been fixed with this commit.

Related to the work for #141.
Depends on #145.
rfratto added a commit that referenced this pull request Aug 7, 2020
* move InstanceManager to instance package

This commit migrates InstanceManager to instance.BasicManager. This is
to reduce duplication as we add more implementations of the
instance.Manager interface (including removing ha.InstanceManager, which
was a duplicate of that interface).

While creating this commit, I noticed that there was a leaked locked
mutex when DeleteConfig fails. That has been fixed with this commit.

Related to the work for #141.
Depends on #145.

* Update pkg/prom/instance/manager.go

Co-authored-by: gotjosh <josue.abreu@gmail.com>

* Update pkg/prom/instance/manager.go

Co-authored-by: gotjosh <josue.abreu@gmail.com>

* Update pkg/prom/instance/manager.go

Co-authored-by: gotjosh <josue.abreu@gmail.com>

* Update pkg/prom/instance/manager.go

Co-authored-by: gotjosh <josue.abreu@gmail.com>

* Update pkg/prom/instance/manager.go

Co-authored-by: gotjosh <josue.abreu@gmail.com>

Co-authored-by: gotjosh <josue.abreu@gmail.com>
@rfratto rfratto mentioned this pull request Feb 26, 2021
3 tasks
mattdurham pushed a commit that referenced this pull request Nov 11, 2021
)

* allow dynamic updating of instance scrape_configs and remote_write

This commit introduces an Update method against the Instance that allows
for applying new changes found on remote_write and scrape_configs
sections for a particular instance. If any other field changes, Update
will return ErrInvalidUpdate, which will cause the InstanceManager to
do a normal "restart" of the instance.

Future commits may loosen the restriction on what is able to be
dynamically updated.

Some code cleanup can be found as part of this commit: fakes that have
been called mocks have been updated to use their proper name.

* address review feedback that I gave myself

* address review feedback that I didn't give myself
mattdurham pushed a commit that referenced this pull request Nov 11, 2021
* move InstanceManager to instance package

This commit migrates InstanceManager to instance.BasicManager. This is
to reduce duplication as we add more implementations of the
instance.Manager interface (including removing ha.InstanceManager, which
was a duplicate of that interface).

While creating this commit, I noticed that there was a leaked locked
mutex when DeleteConfig fails. That has been fixed with this commit.

Related to the work for #141.
Depends on #145.

* Update pkg/prom/instance/manager.go

Co-authored-by: gotjosh <josue.abreu@gmail.com>

* Update pkg/prom/instance/manager.go

Co-authored-by: gotjosh <josue.abreu@gmail.com>

* Update pkg/prom/instance/manager.go

Co-authored-by: gotjosh <josue.abreu@gmail.com>

* Update pkg/prom/instance/manager.go

Co-authored-by: gotjosh <josue.abreu@gmail.com>

* Update pkg/prom/instance/manager.go

Co-authored-by: gotjosh <josue.abreu@gmail.com>

Co-authored-by: gotjosh <josue.abreu@gmail.com>
@github-actions github-actions bot added the frozen-due-to-age Locked due to a period of inactivity. Please open new issues or PRs if more discussion is needed. label May 3, 2024
@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 3, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
frozen-due-to-age Locked due to a period of inactivity. Please open new issues or PRs if more discussion is needed.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow dynamic updating of instances
3 participants