helm: Allow non-leader Orchestrator instances to accept requests #3665
Conversation
I haven't been able to test this yet; AKS broke my cluster and won't let me create a new 1.9.x cluster yet.
Related to issue openark/orchestrator#245 and PR openark/orchestrator#408.
LGTM
@@ -24,34 +24,6 @@ spec:
---
###################################
We should also remove the serviceName from the StatefulSet.
Actually it looks like we have to set it to empty since it's required.
I actually changed it to the main orchestrator service, since that is replacing the old headless service. I also wonder if that will solve your port problem. I'd test, but AKS is still not working.
helm/vitess/values.yaml (outdated)
 # Default values for orchestrator resources
 orchestrator:
   enabled: false
-  image: "vitess/orchestrator:3.0.6"
+  image: "vitess/orchestrator:3.0.7"
Can you update the k8s/orchestrator Dockerfile to build 3.0.7?
Done
3.0.8 is coming up very shortly with bugfixes to 3.0.7.
This should change to 3.0.9 now?
I think I got it working, but I had to add port 3000 to the per-pod orchestrator services, since that's the port that the reverse proxy is using. Is there an orc config option to change the port that followers assume the leader is serving the API on? Or maybe we should just tell orchestrator to serve directly on 80?
The leader advertises to the followers the port on which it is listening. If configured to listen on ":3000", the leader URL it advertises uses port 3000.
This computation is done independently on each node, once, upon startup, and then advertised to the followers any time the node becomes the leader.
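To make the interaction concrete, here is a rough sketch of the relevant settings (a sketch only; the names, namespace, and values are illustrative, not the chart's exact contents). The advertised leader URL combines the RaftAdvertise host with the port from ListenAddress:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: orchestrator-config        # illustrative name
data:
  orchestrator.conf.json: |
    {
      "ListenAddress": ":3000",
      "RaftEnabled": true,
      "RaftAdvertise": "orchestrator-0.vitess"
    }
```

With something like this, followers would expect the leader's API at http://orchestrator-0.vitess:3000/, i.e. the advertised host but the Pod's listen port, which is why the per-pod Services needed port 3000 exposed.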
Force-pushed from e66efac to 860e41d.
From the code that @shlomi-noach linked, it looks like you can specify the advertised host separately from the listen host, but you can't specify the advertised port separately from the listen port. Because we're using k8s Services to route port 80 (on the Service) to port 3000 (on the Pod), our advertised and listen ports are different. We could fix this by:

1. asking orc for a way to set the advertised port separately from the listen port,
2. making the ports match (e.g. have orchestrator listen on 80, or have the Services forward 3000), or
3. adding port 3000 to the per-pod Services so the advertised port is reachable.
For testing, I did (3) since it's the quickest, but (2) is probably better in the long run for reducing confusion and complexity in our config. Although (1) sounds the most flexible, I'd rather do (2) even if orc adds that feature, since it simplifies our setup.
That's for the Raft protocol; this is about the HTTP endpoint. You are correct in your analysis. I'm merely pointing out that the HTTP endpoint doesn't have a "listen vs. advertised" config in the first place.
@shlomi-noach wrote:
If the RaftAdvertise host wasn't intended to apply to the HTTP endpoint, then isn't it technically incorrect for the reverse proxy to use that host when hitting the HTTP port? If RaftAdvertise won't play double duty, it seems necessary to either auto-detect the local address (in case ListenAddress leaves the host empty) or provide a separate HttpAdvertise setting. That would actually be better for us, since HTTP traffic could bypass the per-Pod Service (which is a workaround to make Raft think our IPs are static) and go directly Pod-to-Pod. Of course, this assumes that followers will be advised of the latest Leader URI if an orc node has been restarted (in a replacement Pod) with the same RaftAdvertise host, but a different HttpAdvertise host.
I'm a bit confused about the question, so I'm going to answer what I did understand, and how and why whatever works works the way it does. Hopefully you can point me back on track?
Historically, when it was created, there was no reverse proxy, so "intended" or "not intended" are not the right words to use.
Why is it incorrect? Say the host is
Sorry, I'm not a native English speaker, and such phrases always leave me uncertain of their meaning. What does it mean for RaftAdvertise to "play double duty"?
But the local address would not necessarily be visible to outsiders, right? So whatever the host self-resolves to is irrelevant to remote spectators.
Apologies, I'm not sure what it means to bypass the per-pod service, or to go directly pod-to-pod.
It's not the case for us currently, but I see now how that's unexpected. I shouldn't have used the phrase "technically incorrect", since we're the ones who are doing weird stuff. :) If we were using RaftAdvertise because of a machine with multiple addresses, it would be appropriate to assume you could use the same address for any port. However, we are actually using RaftAdvertise because of reverse proxies (that's basically what a k8s Service is). An example might look something like this:
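(Sketch only; the names and ports here are illustrative.)

```yaml
# The per-pod Service is effectively a reverse proxy: peers reach it on
# port 80, and it forwards to orchestrator's listen port 3000 on the Pod.
# RaftAdvertise names this Service, so "same address, any port" does not
# hold -- only ports declared on the Service are reachable that way.
apiVersion: v1
kind: Service
metadata:
  name: orchestrator-0
spec:
  selector:
    statefulset.kubernetes.io/pod-name: orchestrator-0
  ports:
    - name: web
      port: 80          # exposed by the "reverse proxy"
      targetPort: 3000  # orchestrator's actual HTTP listen port
    - name: raft
      port: 10008       # raft traffic, which RaftAdvertise is really about
      targetPort: 10008
```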
Sorry for the regional idioms :) What I meant by "not play double duty" was the idea that RaftAdvertise applies only to Raft, and not to HTTP or any other port/protocol. If that's the case, I was proposing an equivalent setting for HTTP so we can set the advertised address explicitly.
Agreed. That's why I would prefer the explicit HttpAdvertise setting over trying to detect it automatically.
Basically I'm saying it would be nice if HTTP traffic could bypass the reverse proxy we set up for Raft, because it's not necessary for HTTP. We only needed the reverse proxy for Raft in order to make Raft think our IPs are static.
OK, to me this reads like a request for an HTTPAdvertise-type setting. Assuming it could be given as just a port, with the host filled in from elsewhere?
I guess it could make sense if the user only cares about changing the port, but where would you get the hostname from in that case? If the alternative to making hostname required is to take it from RaftAdvertise, I would prefer that it's required. Part of my confusion above was that I didn't expect a variable called RaftAdvertise to apply to the HTTP port.
I think that would make sense.
Please see openark/orchestrator#430 for a proposed HTTPAdvertise implementation.
@enisoc @derekperkins
Force-pushed from 4948aca to f12d5ca.
@enisoc I updated Orchestrator to 3.0.9 and pushed the corresponding Docker image. As for the HTTPAdvertise feature, I'm not totally sure what you want that configuration to look like. I added a commit that is likely using it incorrectly, but I figure that will make it easier to discuss the right config. I also wasn't sure if you wanted to open up a new port on the Orchestrator service. Once we figure that out, I'll overwrite that last commit and this should be good to merge.
Here is my Orchestrator setting:
Found a spot to change the orchestrator version to 3.0.9.
helm/vitess/values.yaml (outdated)
 # Default values for orchestrator resources
 orchestrator:
   enabled: false
-  image: "vitess/orchestrator:3.0.6"
+  image: "vitess/orchestrator:3.0.7"
Should this PR be merged or abandoned in favor of the operator?
@sougou It's worth figuring out regardless. I'm just not familiar enough with the reasons @shlomi-noach and @enisoc architected it this way to take it any further. It should be relatively simple to get merged once I know what to put into the HTTPAdvertise setting.
Force-pushed from 385edbd to 2d32a99.
I just rebased on master, updated to Orchestrator 3.0.10 + pmm-client 1.10.0, and pushed the corresponding docker images.
I'm leaving
@@ -60,7 +55,7 @@ kind: StatefulSet
 metadata:
   name: orchestrator
 spec:
-  serviceName: orchestrator-headless
+  serviceName: orchestrator
I think this needs to stay orchestrator-headless since (I believe) per-Pod DNS only works with headless services.
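For reference, a minimal sketch of the pairing (illustrative, not the chart's exact manifests): the headless Service (clusterIP: None) is what gives each Pod in the StatefulSet a stable DNS record like orchestrator-0.orchestrator-headless.<namespace>.svc.cluster.local.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: orchestrator-headless
spec:
  clusterIP: None            # headless: no virtual IP, per-Pod DNS records instead
  selector:
    app: orchestrator
  ports:
    - name: web
      port: 3000
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: orchestrator
spec:
  serviceName: orchestrator-headless   # must reference the headless Service
  replicas: 3
  selector:
    matchLabels:
      app: orchestrator
  template:
    metadata:
      labels:
        app: orchestrator
    spec:
      containers:
        - name: orchestrator
          image: vitess/orchestrator:3.0.9
          ports:
            - containerPort: 3000
```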
@@ -46,6 +46,7 @@ data:
   "HostnameResolveMethod": "none",
   "HTTPAuthPassword": "",
   "HTTPAuthUser": "",
+  "HTTPAdvertise": "orchestrator-headless.{{ $namespace }}:80",
My goal was to set HTTPAdvertise to bypass any Service, and talk directly Pod-to-Pod. I think we can do that by referring here to the per-Pod DNS entries created in the headless service, so it should look something like:
POD_NAME.orchestrator-headless.{{ $namespace }}:3000
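One way this could be rendered per Pod (an assumption on my part, not necessarily how the chart does it): in a StatefulSet the container's hostname equals the Pod name, so a small startup wrapper can splice it into the config before launching orchestrator. Paths and flags below are illustrative.

```yaml
containers:
  - name: orchestrator
    image: vitess/orchestrator:3.0.9
    command:
      - bash
      - -c
      - |
        # $HOSTNAME is the Pod name, e.g. orchestrator-0
        sed "s/POD_NAME/${HOSTNAME}/g" /conf/orchestrator.conf.json > /tmp/orchestrator.conf.json
        exec /usr/local/orchestrator/orchestrator --config=/tmp/orchestrator.conf.json http
```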
That makes more sense to me now.
I just tried it out and got this error:
FATAL If specified, HTTPAdvertise must include host name
It didn't like not having http://. I'm waiting for my cluster to cycle down so I can test it again.
https://play.golang.org/p/fxedGpvILSf
Adding http:// seems to have done the trick.
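For anyone following along, the working value ends up along these lines (only the relevant fragment of the ConfigMap is shown; POD_NAME is substituted per Pod as discussed above):

```yaml
data:
  orchestrator.conf.json: |
    {
      "HTTPAdvertise": "http://POD_NAME.orchestrator-headless.{{ $namespace }}:3000"
    }
```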
Force-pushed from c90d952 to eef85db.
This is working perfectly for me now. Once I get an LGTM from you @enisoc, I'll rebase my changes and fix the DCO.
Other than fixing HTTPAdvertise, I was able to eliminate all the per-StatefulSet services in favor of using the headless service DNS entries. I think we only added those services to get a persistent DNS record in the event that the pod moved around, but this should have the same effect.
I thought the original reason for doing Service-per-Pod was that Orchestrator's Raft library required static IPs, not just static DNS. Is that still the case?
Unsure what static DNS is?
I forgot about that, my bad. I tested deleting a pod and it failed. I'll roll those changes back. |
@enisoc I just launched a test cluster and all is well. Is there anything else we need to do before merging this aside from me rebasing?
LGTM after squash.
3.0.7 added a proxy that forwards master-only requests, so we don't have to work around that by having perpetually unready pods via the /api/leader-check endpoint. 3.0.9 added HTTPAdvertise, which lets us eliminate the open raft port.

Signed-off-by: Derek Perkins <derek@derekperkins.com>
Force-pushed from 56ad760 to a20c255.
Squashed and ready for merge.
WOOHOO!
Thanks for the help with the core Orchestrator bits @shlomi-noach! This really makes it much nicer.
Hey I actually have no idea what I've signed up for! 😆 |
Orchestrator 3.0.7 added a proxy that forwards master-only requests, so we don't have to work around that by having perpetually unready pods via the /api/leader-check endpoint.
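For context, the readiness workaround being removed looked roughly like this (a sketch, not the chart's exact probe): only the current raft leader answers /api/leader-check with 200, so follower Pods stayed permanently unready and never received traffic through the Service.

```yaml
readinessProbe:
  httpGet:
    path: /api/leader-check   # 200 only on the current raft leader
    port: 3000
  initialDelaySeconds: 10
  periodSeconds: 5
```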
cc @shlomi-noach @enisoc