This repository has been archived by the owner on Sep 30, 2024. It is now read-only.

Forwarding http requests to leader automatically. #245

Closed
dbussink opened this issue Jul 27, 2017 · 13 comments

Comments

@dbussink

Based on the discussion in #183 (comment), @shlomi-noach asked to open a new issue to track this idea.

The idea is that, instead of needing to set up something like HAProxy to forward requests to the leader based on a status endpoint, the nodes in an orchestrator cluster forward HTTP requests to the leader themselves, so that an operator doesn't have to care which node they send requests to.

@renecannao

@dbussink : the way I read your comment is that orchestrator itself should become a proxy.
Instead, I suggest that the nodes in an orchestrator cluster reply "I am not the leader, the IP of the leader is ...", and orchestrator-client can then communicate with the leader.

@shlomi-noach
Collaborator

#whynotboth?

@shlomi-noach
Collaborator

Things to note:

If the orchestrator service is to redirect to the leader:

  • This is programmatic. Some requests should not redirect to the leader. Some are trivial, like lb-check or status; a few others are less trivial.
  • If the node is outside the quorum (broken, ...) it wouldn't even know who the leader is (it can do programmatic work based on configuration to ask who the leader is).

If the orchestrator-client script is to redirect to the leader:

  • The script should have the identity of all nodes
  • Otherwise the service to which it connects may not know the identity of the leader
  • It should strongly enforce timeouts; we won't fork the script, and we don't want to hang on a blocked service (imagine something like iptables -j DROP).
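
The client-side points above can be sketched roughly as follows. This is only an illustration in Go, not the actual orchestrator-client (which is a script). It assumes that /api/leader-check answers 200 on the leader and a non-200 status elsewhere; the names probeFunc, httpProbe, findLeader and errNoLeader are hypothetical.

```go
package main

import (
	"errors"
	"net/http"
	"time"
)

// probeFunc reports whether the node at baseURL considers itself the leader.
// (Hypothetical type, introduced only for this sketch.)
type probeFunc func(baseURL string) (bool, error)

// httpProbe returns a probeFunc that asks /api/leader-check with a hard
// timeout, so a silently dropped connection cannot hang the caller.
// Assumption: 200 means "I am the leader", anything else means "I am not".
func httpProbe(timeout time.Duration) probeFunc {
	client := &http.Client{Timeout: timeout}
	return func(baseURL string) (bool, error) {
		resp, err := client.Get(baseURL + "/api/leader-check")
		if err != nil {
			return false, err
		}
		defer resp.Body.Close()
		return resp.StatusCode == http.StatusOK, nil
	}
}

var errNoLeader = errors.New("no reachable node reports itself as leader")

// findLeader walks the full list of known nodes and returns the first one
// that claims leadership. Unreachable or timed-out nodes are skipped.
func findLeader(baseURLs []string, probe probeFunc) (string, error) {
	for _, u := range baseURLs {
		isLeader, err := probe(u)
		if err != nil {
			continue // blocked or broken node: try the next one
		}
		if isLeader {
			return u, nil
		}
	}
	return "", errNoLeader
}
```

A caller would do something like `findLeader(urls, httpProbe(2*time.Second))` and then send the real request to the returned URL. Probing sequentially keeps the sketch simple; the worst case is the number of nodes times the timeout.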

@shlomi-noach
Collaborator

shlomi-noach commented Jul 31, 2017

274dee4 supports multiple URLs for orchestrator-client, which then deduces the identity of the leader.

@derekperkins
Contributor

@shlomi-noach This would help significantly in Kubernetes, and I'd prefer that Orchestrator itself proxy the requests. The way that it currently has to be set up, there is a readiness check that pings /api/leader-check. Since all the non-leaders report as unready/unhealthy, all calls to the main Orchestrator service are routed to the leader. This is functional, as seen below:

image

which results in a completely healthy cluster looking like
image

> This is programmatic. Some requests should not redirect to the leader. Some are trivial, like lb-check or status; a few others are less trivial.

This seems pretty trivial to be handled by a whitelist/blacklist for the proxy.

> If the node is outside the quorum (broken, ...) it wouldn't even know who the leader is (it can do programmatic work based on configuration to ask who the leader is).

In this instance, I would expect it to already be reporting that it is unhealthy, so it shouldn't be handling requests anyways, and I would expect it to error out the proxy.

@shlomi-noach
Collaborator

@derekperkins are you familiar with any other way to route queries to a specific node in a ReplicaSet?

To be honest I'd hate to code HTTP forwarding within orchestrator. It isn't an HTTP server and other tools can do it better.

> In this instance, I would expect it to already be reporting that it is unhealthy, so it shouldn't be handling requests anyways, and I would expect it to error out the proxy.

You're right that we can have a /api/raft-health endpoint. Since I only direct traffic to the leader, it becomes less important: if you're unhealthy you can't be the leader, so traffic obviously won't be sent your way.

@derekperkins
Contributor

There are two alternative solutions I can think of in a Kubernetes context, neither of which feels great.

  1. Add a sidecar proxy container to Orchestrator (nginx, Go, etc) that would selectively pass-through or proxy requests to the leader. The downside is that it's more complexity, and all raft events would have to get synced to that container.
  2. Dynamic pod labels - you could run kubectl label that would set an orchestrator: master label to whichever pod is the current master, and the k8s service would have that same selector applied. I'm not sure how much latency this would add on the k8s side, and it would also require a sidecar container listening for raft events.

I don't feel like either of those is significantly better than the status quo, and wouldn't serve Orchestrator users running outside of Kubernetes.

I understand the reticence to add proxying support, but especially since it is running in Go, you have some of the best tools available. :) You could probably use https://golang.org/pkg/net/http/httputil/#ReverseProxy from the stdlib with little effort.
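
To make the suggestion concrete, here is a minimal sketch of what such forwarding could look like with httputil.NewSingleHostReverseProxy. It is not orchestrator's implementation. The exact paths in localPrefixes, the isLeader and leaderURL callbacks, and the choice of 503 when the leader is unknown are all assumptions made for illustration.

```go
package main

import (
	"net/http"
	"net/http/httputil"
	"net/url"
	"strings"
)

// localPrefixes is a hypothetical allowlist of endpoints every node answers
// itself (health and load-balancer checks); the exact paths are assumptions.
var localPrefixes = []string{"/api/leader-check", "/api/lb-check", "/api/status"}

// servedLocally reports whether a request path must never be forwarded.
func servedLocally(path string) bool {
	for _, p := range localPrefixes {
		if path == p || strings.HasPrefix(path, p+"/") {
			return true
		}
	}
	return false
}

// leaderForwarder wraps the node's own handler. Followers forward everything
// outside the allowlist to the leader; if the leader is unknown (for example
// during a partition) the request fails fast with 503.
func leaderForwarder(
	local http.Handler,
	isLeader func() bool, // hypothetical hook into the raft state
	leaderURL func() (*url.URL, bool), // hypothetical: leader address, if known
) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if isLeader() || servedLocally(r.URL.Path) {
			local.ServeHTTP(w, r)
			return
		}
		target, ok := leaderURL()
		if !ok {
			http.Error(w, "leader unknown", http.StatusServiceUnavailable)
			return
		}
		// A new proxy per request keeps the sketch short; a real server
		// would likely cache one proxy per leader address.
		httputil.NewSingleHostReverseProxy(target).ServeHTTP(w, r)
	})
}
```

With this shape, the "which requests must not be forwarded" question reduces to maintaining the allowlist, and the node-outside-the-quorum case reduces to what leaderURL reports.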

@derekperkins
Contributor

cc @bbeaudreault @enisoc

@shlomi-noach
Collaborator

What would you expect to receive from a node which is outside the group, e.g. network isolated, and cannot see the leader?

@derekperkins
Contributor

I would still have a readiness check enabled, and so I would expect that in the event of a network partition, that node would report a 404 (current behavior), so I wouldn't even be making a request to be proxied.

That does leave a small window between network partition and whenever my readiness check picks up that the node is unhealthy, where some request could feasibly come in to be proxied. In that case, I would expect a 503 or something similar.

@shlomi-noach
Collaborator

Oh, there will be an api/freno-health check for that (am I the leader, or am I a follower in a healthy group?)
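
The rule in parentheses could be expressed roughly like this. This is a sketch only: raftView and its fields are hypothetical, "follower in a healthy group" is approximated as "a follower that currently knows a leader", and 500 for the unhealthy case is an assumption.

```go
package main

import "net/http"

// raftView is a hypothetical snapshot of what a node knows about the group.
type raftView struct {
	IsLeader    bool // this node is the leader
	LeaderKnown bool // this node is a follower that currently sees a leader
}

// healthStatus maps the node's view to an HTTP status: healthy if it is the
// leader or a follower in a group that has a leader, unhealthy otherwise.
func healthStatus(v raftView) int {
	if v.IsLeader || v.LeaderKnown {
		return http.StatusOK
	}
	return http.StatusInternalServerError // assumed status for "unhealthy"
}

// healthHandler exposes healthStatus as an HTTP endpoint.
func healthHandler(view func() raftView) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(healthStatus(view()))
	}
}
```

Unlike the leader check, a check like this stays green on followers, so it separates "not the leader" from "cut off from the group".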

What I meant was, what would happen if you queried some /api/cluster/my-cluster? Would you expect your request to end up with 500?

@shlomi-noach
Collaborator

@derekperkins insofar as this is an entertaining exercise: #408
