helm: Allow non-leader Orchestrator instances to accept requests #3665

Merged
merged 2 commits into from
May 22, 2018

derekperkins
Member

Orchestrator 3.0.7 added a proxy that forwards master-only requests, so we no longer have to work around that by keeping pods perpetually unready via the /api/leader-check endpoint.

cc @shlomi-noach @enisoc

@derekperkins
Member Author

I haven't been able to test this yet: AKS broke my cluster and won't let me create a new 1.9.x cluster.

@derekperkins
Member Author

derekperkins commented Feb 20, 2018

Related to issue openark/orchestrator#245 and PR openark/orchestrator#408

Contributor

@shlomi-noach shlomi-noach left a comment

LGTM

@@ -24,34 +24,6 @@ spec:

---

###################################
Member

We should also remove the serviceName from the StatefulSet.

Member

Actually it looks like we have to set it to empty since it's required.

Member Author

I actually changed it to the main orchestrator service, since that is replacing the old headless service. I also wonder if that will solve your port problem. I'd test, but AKS is still not working.

 # Default values for orchestrator resources
 orchestrator:
   enabled: false
-  image: "vitess/orchestrator:3.0.6"
+  image: "vitess/orchestrator:3.0.7"
Member

Can you update the k8s/orchestrator Dockerfile to build 3.0.7?

Member Author

Done

Contributor

3.0.8 is coming up very shortly with bugfixes to 3.0.7.

Contributor

This should change to 3.0.9 now?

@enisoc
Member

enisoc commented Feb 20, 2018

I think I got it working, but I had to add port 3000 to the per-pod orchestrator services, since that's the port that the reverse proxy is using. Is there an orc config option to change the port that followers assume the leader is serving the API on? Or maybe we should just tell orchestrator to directly serve on 80?

@shlomi-noach
Contributor

> Is there an orc config option to change the port that followers assume the leader is serving the API on?

The leader advertises to the followers on which port it is listening. If configured to "ListenAddress": ":12345" then that's the port the followers will use.

@shlomi-noach
Contributor

@enisoc
Member

enisoc commented Feb 21, 2018

From the code that @shlomi-noach linked, it looks like you can specify advertised host separately from the listen host, but you can't specify advertised port separately from the listen port. Because we're using k8s Services to route port 80 (on the Service) to port 3000 (on the Pod), our advertised and listen ports are different.

We could fix this by:

  1. Asking orc to let us specify advertised port separately from listen port.
  2. Changing our config to use the same port on the Service as the listen port (either 80 for both, or 3000 for both).
  3. Adding port 3000 to the Service in addition to port 80, so both Service ports redirect to port 3000 on the Pod.

For testing, I did (3) since it's the quickest, but (2) is probably better in the long run for reducing confusion and complexity in our config. Although (1) sounds the most flexible, I'd rather do (2) even if orc adds that feature, since it simplifies our setup.
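
The port mismatch and the effect of options (2) and (3) can be sketched with a toy model. This is only an illustration under stated assumptions: the `service` type and `route` method are hypothetical and do not use any Kubernetes or orchestrator API. Option (1) is not modeled because it would need a change in orchestrator itself.

```go
package main

import "fmt"

// service is a toy model of a Kubernetes Service: it maps each exposed
// Service port to a target port on the Pod. It is illustrative only and
// does not use any Kubernetes API.
type service map[int]int

// route reports which Pod port a connection to servicePort reaches, and
// false when the Service does not expose that port at all.
func (s service) route(servicePort int) (int, bool) {
	podPort, ok := s[servicePort]
	return podPort, ok
}

// listenPort is the port orchestrator serves HTTP on inside the Pod. With
// no separate advertised port, followers dial the leader on this port.
const listenPort = 3000

func main() {
	configs := []struct {
		name string
		svc  service
	}{
		{"current: Service 80 -> Pod 3000", service{80: 3000}},
		{"option 2: Service 3000 -> Pod 3000", service{3000: 3000}},
		{"option 3: Service 80 and 3000 -> Pod 3000", service{80: 3000, 3000: 3000}},
	}
	for _, c := range configs {
		if podPort, ok := c.svc.route(listenPort); ok {
			fmt.Printf("%s: follower reaches Pod port %d\n", c.name, podPort)
		} else {
			fmt.Printf("%s: follower cannot reach the leader on port %d\n", c.name, listenPort)
		}
	}
}
```

In this model only the current configuration fails, because the Service exposes port 80 while followers dial port 3000.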

@shlomi-noach
Contributor

> it looks like you can specify advertised host separately from the listen host

That's for the raft endpoint, whereas:

> but you can't specify advertised port separately from the listen port

that's about the HTTP endpoint.

You are correct in your analysis. I'm merely pointing out that the HTTP endpoint doesn't have a "listen vs. advertised" config in the first place.

@enisoc
Member

enisoc commented Feb 22, 2018

@shlomi-noach wrote:

> I'm merely pointing out that the HTTP endpoint doesn't have a "listen vs. advertised" config in the first place.

If the RaftAdvertise host wasn't intended to apply to the HTTP endpoint, then isn't it technically incorrect for the reverse proxy to use that host when hitting the HTTP port? If RaftAdvertise won't play double duty, it seems necessary to either auto-detect the local address (in case ListenAddress leaves the host empty) or provide a separate HttpAdvertise setting.

That would actually be better for us, since HTTP traffic could bypass the per-Pod Service (which is a workaround to make Raft think our IPs are static) and go directly Pod-to-Pod. Of course, this assumes that followers will be advised of the latest Leader URI if an orc node has been restarted (in a replacement Pod) with the same RaftAdvertise host, but a different HttpAdvertise host.

@shlomi-noach
Contributor

I'm a bit confused about the question, so I'm going to answer the parts I did understand and explain how and why things work the way they do. Hopefully you can point me back on track?

> If the RaftAdvertise host wasn't intended to apply to the HTTP endpoint

Historically, when it was created, there was no reverse proxy, so "intended" or "not intended" are not the right words to use.

> then isn't it technically incorrect for the reverse proxy to use that host when hitting the HTTP port?

Why is it incorrect? Say the host is 22.33.44.55 and RaftAdvertised = 77.88.66.55. The raft members will seek raft communication on 77.88.66.55:10008. The reverse proxy mechanism will seek the HTTP server on 77.88.66.55:XXXX (assuming XXXX is the port in ListenAddr, typically 3000).
Do you think the reverse proxy should use 22.33.44.55:3000? I think not; if RaftAdvertised is specified, it's specified because the host is only visible to outsiders via some proxy/IP, and that IP (77.88.66.55 in our example) would hold true for :10008 as well as :3000. Is this not the case?
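
The address derivation described above can be sketched as follows. This is a minimal illustration, not orchestrator's actual code: the function name `leaderHTTPAddr` is hypothetical, and the sketch only assumes that the advertised host is combined with the port taken from the listen address.

```go
package main

import (
	"fmt"
	"net"
)

// leaderHTTPAddr is a hypothetical helper: it takes the advertised host and
// pairs it with the port found in the listen address (for example ":3000").
func leaderHTTPAddr(raftAdvertise, listenAddress string) (string, error) {
	// The host part of the listen address may be empty; only the port is used.
	_, port, err := net.SplitHostPort(listenAddress)
	if err != nil {
		return "", fmt.Errorf("parsing listen address %q: %w", listenAddress, err)
	}
	return net.JoinHostPort(raftAdvertise, port), nil
}

func main() {
	addr, err := leaderHTTPAddr("77.88.66.55", ":3000")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(addr) // prints 77.88.66.55:3000
}
```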

> If RaftAdvertise won't play double duty

Sorry, I'm not a native English speaker, and such phrases always leave me uncertain of their meaning. What does it mean for RaftAdvertise to not play double duty?

> it seems necessary to either auto-detect the local address

But the local address would not necessarily be visible to outsiders, right? So whatever the host self-resolves is irrelevant to remote spectators.

> That would actually be better for us, since HTTP traffic could bypass the per-Pod Service (which is a workaround to make Raft think our IPs are static) and go directly Pod-to-Pod.

Apologies, I'm not sure what it means to bypass the per-pod service, or to go directly pod-to-pod.

@enisoc
Member

enisoc commented Feb 26, 2018

> if RaftAdvertised is specified, it's specified because the host is only visible to outsiders via some proxy/IP, and that IP (77.88.66.55 in our example) would hold true for :10008 as well as :3000. Is this not the case?

It's not the case for us currently, but I see now how that's unexpected. I shouldn't have used the phrase "technically incorrect" since we're the ones who are doing weird stuff. :)

If we were using RaftAdvertised because of a machine with multiple addresses, it would be appropriate to assume you could use the same address for any port. However, we are actually using RaftAdvertised because of reverse proxies (that's basically what a k8s Service is).

An example might look something like this:

  • The host address is 22.33.44.55 and the host only has that one IP.
  • We use a reverse proxy at 77.88.66.55 to forward the raft port (10008) and only the raft port.
  • We set RaftAdvertised to 77.88.66.55.
  • Trying to access port 3000 on the RaftAdvertised address fails because that port is not forwarded.

> What does it mean for RaftAdvertise to not play double duty?

Sorry for the regional idioms :)

What I meant by "not play double duty" was the idea that RaftAdvertise applies only to Raft, and not to HTTP or any other port/protocol. If that's the case, I was proposing an equivalent setting for HTTP so we can set the advertised address explicitly.

> But the local address would not necessarily be visible to outsiders, right? So whatever the host self-resolves is irrelevant to remote spectators.

Agreed. That's why I would prefer the explicit HttpAdvertise setting over trying to detect it automatically.

> I'm not sure what it means to bypass the per-pod service, or to go directly pod-to-pod.

Basically I'm saying it would be nice if HTTP traffic could bypass the reverse proxy we set up for Raft, because it's not necessary for HTTP. We only needed the reverse proxy for Raft in order to make Raft think our IPs are static.

@shlomi-noach
Contributor

OK, to me this reads like an orchestrator feature request to support an optional HttpAdvertise.

Assuming HttpAdvertise is configured by the user:

  • Does it make sense for it to not have a hostname (can it be :8080)?
  • Does it make sense for it to not have a port (can it be my-host-12345.com)?

@enisoc
Member

enisoc commented Feb 27, 2018

> Does it make sense for it to not have a hostname (can it be :8080)?

I guess it could make sense if the user only cares about changing the port, but where would you get the hostname from in that case? If the alternative to making hostname required is to take it from RaftAdvertise, I would prefer that it's required. Part of my confusion above was that I didn't expect a variable called RaftAdvertise to apply to the HTTP port.

> Does it make sense for it to not have a port (can it be my-host-12345.com)?

I think that would make sense.

@shlomi-noach
Contributor

shlomi-noach commented Mar 6, 2018

Please see openark/orchestrator#430 for an HTTPAdvertise offering.

@shlomi-noach
Contributor

@derekperkins
Member Author

@enisoc I updated Orchestrator to 3.0.9 and pushed the vitess/orchestrator:3.0.9 image.

As for the HTTPAdvertise feature, I'm not totally sure what you're wanting that configuration to look like. I added a commit that is likely using it incorrectly, but I figure that will make it easier to discuss the right config. I also wasn't sure if you wanted to open up a new port on the Orchestrator service. Once we figure that out, I'll overwrite that last commit and this should be good to merge.

@derekperkins
Member Author

Here is my Orchestrator setting: "HTTPAdvertise": "orchestrator-headless.vitess:80"
Resulting in this error: FATAL If specified, HTTPAdvertise must include host name

Contributor

@shlomi-noach shlomi-noach left a comment

Found a spot to change orchestrator version to 3.0.9

@sougou
Contributor

sougou commented May 6, 2018

Should this PR be merged or abandoned in favor of the operator?

@derekperkins
Member Author

@sougou It's worth figuring out regardless. I'm just not familiar enough with the reasons @shlomi-noach and @enisoc architected it to take this any further. It should be relatively simple to get merged once I know what to put into the HTTPAdvertise flag.

@derekperkins derekperkins force-pushed the orchestrator-update branch 2 times, most recently from 385edbd to 2d32a99, May 10, 2018 01:17

@derekperkins
Member Author

derekperkins commented May 10, 2018

I just rebased on master, updated to Orchestrator 3.0.10 + pmm-client 1.10.0 and pushed the corresponding docker images.

@shlomi-noach
Contributor

I'm leaving HTTPAdvertise to @enisoc . I don't have the insights he does on k8s deployments.

@@ -60,7 +55,7 @@ kind: StatefulSet
 metadata:
   name: orchestrator
 spec:
-  serviceName: orchestrator-headless
+  serviceName: orchestrator
Member

I think this needs to stay orchestrator-headless since (I believe) per-Pod DNS only works with headless services.

@@ -46,6 +46,7 @@ data:
   "HostnameResolveMethod": "none",
   "HTTPAuthPassword": "",
   "HTTPAuthUser": "",
+  "HTTPAdvertise": "orchestrator-headless.{{ $namespace }}:80",
Member

My goal was to set HTTPAdvertise to bypass any Service, and talk directly Pod-to-Pod. I think we can do that by referring here to the per-Pod DNS entries created in the headless service, so it should look something like:

POD_NAME.orchestrator-headless.{{ $namespace }}:3000

Member Author

That makes more sense to me now.

Member Author

I just tried it out and got this error:
FATAL If specified, HTTPAdvertise must include host name

Member Author

@derekperkins derekperkins May 22, 2018

It didn't like not having http://. I'm waiting for my cluster to cycle down so I can test it again.
https://play.golang.org/p/fxedGpvILSf

https://github.com/github/orchestrator/blob/677a004d0374e03e78e87bca417e67f673927fa1/go/config/config.go#L566
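
Why the scheme matters can be reproduced with Go's standard `net/url` package. The sketch below is not orchestrator's actual validation code; `validateHTTPAdvertise` is a hypothetical function that only assumes the value is parsed as a URL and then checked for a host name and port. Without a scheme, `url.Parse` treats the text before the first colon as the scheme when it consists only of valid scheme characters, so the host comes back empty.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// validateHTTPAdvertise is a hypothetical check modeled on the error seen in
// this thread: the value must parse as a URL with a host name and a port.
func validateHTTPAdvertise(value string) error {
	u, err := url.Parse(value)
	if err != nil {
		return fmt.Errorf("parsing %q: %w", value, err)
	}
	if u.Hostname() == "" {
		return errors.New("HTTPAdvertise must include host name")
	}
	if u.Port() == "" {
		return errors.New("HTTPAdvertise must include port")
	}
	return nil
}

func main() {
	values := []string{
		// No scheme: "orchestrator-headless.vitess" is parsed as the scheme
		// and "80" as opaque data, so the host name is empty.
		"orchestrator-headless.vitess:80",
		// With a scheme, host name and port are parsed as expected.
		"http://orchestrator-headless.vitess:80",
	}
	for _, v := range values {
		fmt.Printf("%s -> %v\n", v, validateHTTPAdvertise(v))
	}
}
```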

Member Author

Adding http seems to have done the trick

@derekperkins derekperkins force-pushed the orchestrator-update branch from c90d952 to eef85db, May 22, 2018 01:06

@derekperkins
Member Author

This is working perfectly for me now. Once I get an LGTM from you @enisoc, I'll rebase my changes and fix the DCO.

@derekperkins
Member Author

Other than fixing HTTPAdvertise, I was able to eliminate all the per-StatefulSet services in favor of the headless service DNS entries. I think we only added those services to get a persistent DNS record in case the pod moved around, and this should have the same effect.

@enisoc
Member

enisoc commented May 22, 2018

I thought the original reason for doing Service-per-Pod was that Orchestrator's Raft library required static IPs, not just static DNS. Is that still the case?

@shlomi-noach
Contributor

> I thought the original reason for doing Service-per-Pod was that Orchestrator's Raft library required static IPs, not just static DNS. Is that still the case?

Unsure what static DNS is?

orchestrator's raft library does indeed require static IPs, at least externally facing (which is why RaftAdvertise exists), and nothing has changed in the past few months.

@derekperkins
Member Author

I forgot about that, my bad. I tested deleting a pod and it failed. I'll roll those changes back.
openark/orchestrator#253

@derekperkins
Member Author

derekperkins commented May 22, 2018

@enisoc I just launched a test cluster and all is well. Is there anything else we need to do before merging this aside from me rebasing?

Member

@enisoc enisoc left a comment

LGTM after squash.

3.0.7 added a proxy that forwards master-only requests, so we don’t have to work around that by having perpetually unready pods via the /api/leader-check endpoint

3.0.9 added HTTPAdvertise, which lets us eliminate the open raft port

Signed-off-by: Derek Perkins <derek@derekperkins.com>
Signed-off-by: Derek Perkins <derek@derekperkins.com>
@derekperkins derekperkins force-pushed the orchestrator-update branch from 56ad760 to a20c255, May 22, 2018 21:11

@derekperkins
Member Author

squashed and ready for merge

@enisoc enisoc changed the title from "helm: remove headless Orchestrator service" to "helm: Allow non-leader Orchestrator instances to accept requests" May 22, 2018
@enisoc enisoc merged commit c8e9070 into vitessio:master May 22, 2018
@shlomi-noach
Contributor

WOOHOO!

@derekperkins
Member Author

Thanks for the help with the core Orchestrator bits @shlomi-noach! This really makes it much nicer.

@shlomi-noach
Contributor

Hey I actually have no idea what I've signed up for! 😆
