Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

promclient: drop version check #1163

Merged
merged 2 commits into from
May 27, 2019
Merged

promclient: drop version check #1163

merged 2 commits into from
May 27, 2019

Conversation

miekg
Copy link
Contributor

@miekg miekg commented May 17, 2019

The version check would check the prometheus version and if > 2.2.0
would query the flags endpoint to do some extra validation.

Remove this for just checking the flags endpoints and if it responds do
the validation; otherwise log the failure, but continue the startup.

Signed-off-by: Miek Gieben miek@miek.nl

Changes

Verification

Copy link
Member

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall good - as we agreed on issue, but we some suggestions to make it work.

cmd/thanos/sidecar.go Outdated Show resolved Hide resolved
"Prometheus doesn't support flags endpoint, skip validation", "version", m.version.Original())
return nil
}

flags, err := promclient.ConfiguredFlags(ctx, logger, m.promURL)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add flow for distincting 404 from that will show that either it's not Prometheus (this will be caught later on, so fine). Maybe via simple sentinel erorr?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am afraid this has to retry ): As we removed first retry which helped us to wait until Prometheus is rdy (often the case). Thanks of first retry you removed we wouldn't need to retry here, as we know Prometheus must be rdy. This is not the case now without retry you removed ):

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me know if my response is clear (:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes. The 404 is a bit fiddly, because we see this. But PTAL

}

if resp.StatusCode == 200 || resp.StatusCode == 404 { // 404, just give up
return nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say something like

var NotFoundFlags = errors.New("No flag endpoint found")

has to be returned here. And let caller decide what to do -> e.g we should log on 404 and, grab flags on 200 and error on anything else IMO (:

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for being strict ):

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I'm not sure how you want to do that with the retry logic? If I understand that code correctly it retries upon seeing an error, meaning it will retry NotFoundFlags?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well we need to pass it somehow. This question is just implementation detail. We either move retry logic to caller (I prefer that - more explicit) and just retry on certain errors or just pass error in separate var here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, moved and added a few things back to not make too many changes.

@povilasv
Copy link
Member

I think maybe we should keep the check but make the check to check for 2.x.
As this protects users from running super old Prometheus versions (1.x) and future versions.

@bwplotka
Copy link
Member

@povilasv we cannot. As users can build prom on their own so they can put whatever in a version. If they use promu and build from master it will have some commit + date + something version. Let's drop version check and try flags instead

@miekg
Copy link
Contributor Author

miekg commented May 21, 2019 via email

The version check would check the prometheus version and if > 2.2.0
would query the flags endpoint to do some extra validation.

Remove this for just checking the flags endpoints and if it responds do
the validation; otherwise log the failure, but continue the startup.

Signed-off-by: Miek Gieben <miek@miek.nl>
Copy link
Member

@povilasv povilasv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of small knits, other than that LGTM 🥇

pkg/promclient/promclient.go Outdated Show resolved Hide resolved
pkg/promclient/promclient.go Outdated Show resolved Hide resolved
Signed-off-by: Miek Gieben <miek@miek.nl>
@miekg
Copy link
Contributor Author

miekg commented May 21, 2019

ok, done.

Copy link
Member

@povilasv povilasv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 @bwplotka take another look when you have time.

@povilasv povilasv requested a review from bwplotka May 21, 2019 10:34
Copy link
Member

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but still something we could improve. WDYT? @miekg

if m.version == nil {
level.Warn(logger).Log("msg", "fetched version is nil or invalid. Unable to know whether Prometheus supports /version endpoint, skip validation")
return nil
flags := promclient.Flags{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This works, but it might be bit confusing - it's not really following the logical path, which can surprise reader. If there is no flags the whole function should just return. We workaround simple noFlagEndpoint bool with assuming some values and unneccesary validating them below - prone to bug in future. But not a blocker, we can merge this and fix it later.

@miekg
Copy link
Contributor Author

miekg commented May 22, 2019 via email

@GiedriusS
Copy link
Member

Happy to provide further fixes l, after potential merge. I'm still wondering about the whole mechanism, as you can reload prom configs on the fly...

On Wed, 22 May 2019, 13:56 Bartek Płotka, @.> wrote: @.* approved this pull request. LGTM, but still something we could improve. WDYT? @miekg https://github.com/miekg ------------------------------ In cmd/thanos/sidecar.go <#1163 (comment)> : > @@ -321,20 +305,27 @@ func runSidecar( } func validatePrometheus(ctx context.Context, logger log.Logger, m *promMetadata) error { - if m.version == nil { - level.Warn(logger).Log("msg", "fetched version is nil or invalid. Unable to know whether Prometheus supports /version endpoint, skip validation") - return nil + flags := promclient.Flags{ This works, but it might be bit confusing - it's not really following the logical path, which can surprise reader. If there is no flags the whole function should just return. We workaround simple noFlagEndpoint bool with assuming some values and unneccesary validating them below - prone to bug in future. But not a blocker, we can merge this and fix it later. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#1163?email_source=notifications&email_token=AACWIW6PRYQKQ2DCZUD5PBTPWU7I7A5CNFSM4HNVMDOKYY3PNVWWK3TUL52HS4DFWFIHK3DMKJSXC5LFON2FEZLWNFSXPKTDN5WW2ZLOORPWSZGOBZLWQVY#pullrequestreview-240609367>, or mute the thread https://github.com/notifications/unsubscribe-auth/AACWIW2NQTGGNVQ5BMZ66I3PWU7I7ANCNFSM4HNVMDOA .

Is there really a problem with it? Min/max block duration in Prometheus are configured via command line options and AFAIK you can't change them without restarting the whole process, and if you are going to restart the whole process then Sidecar will get "disconnected" from Prometheus and it will recheck those again after getting "reconnected". Have I missed something? 🤔

@miekg
Copy link
Contributor Author

miekg commented May 22, 2019

ah, I wasn't aware it's a command line flag; that makes more sense then 👍

Copy link
Member

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This works, but it might be bit confusing - it's not really following the logical path, which can surprise reader. If there is no flags the whole function should just return. We workaround simple noFlagEndpoint bool with assuming some values and unneccesary validating them below - prone to bug in future. But not a blocker, we can merge this and fix it later.

This needs to be fixed at some point, but let's merge this to unblock custom builds 👍
Thanks @miekg !

@bwplotka bwplotka merged commit 43102df into thanos-io:master May 27, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants