Using upstream hostname for request #294

Closed
dkryptr opened this issue May 12, 2017 · 39 comments

Comments

@dkryptr

dkryptr commented May 12, 2017

My team and I are working on figuring out a service discovery architecture for our multitude of Serverless stacks.

I've created two EC2 instances in AWS: one with Consul and the other with Fabio. I have a Serverless stack deployed with a single endpoint: /status. I registered a service in Consul with the address set to the API Gateway URL (xxxxxxx.execute-api....com) and the port set to 443. I set the tag to urlprefix-/status proto=https.

Now, maybe I'm confused on how all of this works, but how come I can't do curl <fabio-ip>:9999/status? When I run that command, I get a 403 status code and a body saying bad request from CloudFront. Adding a Trace header prints Routing to service status on https://xxxxxxx.execute-api....com:443/status. If I run curl https://xxxxxxx.execute-api....com:443/status, I get back a response from my lambda function as I expect.

It seems that in order for it to work, I need to add a Host header matching the API Gateway URL: curl -H 'Host: xxxxxxx.execute-api....com' <fabio-ip>:9999/status. That works fine and returns a response from my lambda function, but it defeats the whole purpose. If clients curl the API, I can't expect them to know the API Gateway URL, since Fabio and Consul are meant to abstract it away.

So, my main question is, what am I doing wrong? Am I thinking about this in the wrong way? Am I missing some configuration of some sort? Thanks.

@magiconair
Contributor

That should work. You're sure that your urlprefix- tag does not contain the hostname?

@mitchelldavis
Contributor

mitchelldavis commented May 12, 2017

To piggyback on @CGreenburg's question: we're hoping to get Fabio to work like a reverse proxy (I think), HTTP -> HTTPS in this case, so our users can simply hit Fabio and Fabio routes the traffic to the backend services. (An obvious use case for the tool.) However, we're stubbing our toes somewhere.

http://<fabio machine>/<service name>/<path> --> https://<service address>/<path>
  • I think I understand that a straight through proxy wouldn't work in this case because the traffic would need to be terminated in Fabio and then continued on to the service and back in order to handle the certificates.
  • We were able to get this to work if we added the Host: <service address> header to the request, which is prohibitive. If the client knew the host header, it could go straight there.

What are we missing? Are we fundamentally missing something on how this should all work, or missing something in the Fabio configuration?

@mitchelldavis
Contributor

mitchelldavis commented May 12, 2017

Here is my configuration for a Serverless REST API in Fabio:

/registrar	https://<garble>.amazonaws.com:443	proto=https tlsskipverify=true	100.00%

(purposefully garbling the url there.)

I get a Bad Request from Amazon when I run the following, even though this is the use case we're looking for:

curl http://<ip>:9999/registrar

I should also note that I'm using the latest version of the magiconair/fabio docker container to run Fabio.

@magiconair
Contributor

@mitchelldavis Please open another ticket in the future. I don't mind answering several questions but I'd prefer not to morph or merge issues. You probably need to add strip=/servicename to the urlprefix tag since fabio doesn't do automatic path stripping.
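
For readers unfamiliar with the strip option, here is a minimal sketch of what prefix stripping amounts to. The helper name stripPrefix and the choice to return "/" for an empty remainder are assumptions made for illustration; this is not fabio's actual implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// stripPrefix is a hypothetical helper (not fabio's code). It removes the
// configured strip value from the front of the request path before the
// request is forwarded upstream. Returning "/" for an empty remainder is
// an assumption of this sketch.
func stripPrefix(path, strip string) string {
	if strip == "" || !strings.HasPrefix(path, strip) {
		return path
	}
	rest := strings.TrimPrefix(path, strip)
	if rest == "" {
		return "/"
	}
	return rest
}

func main() {
	fmt.Println(stripPrefix("/registrar/users", "/registrar"))
	fmt.Println(stripPrefix("/registrar", "/registrar"))
	fmt.Println(stripPrefix("/status", "/registrar"))
}
```

With strip=/registrar, a request for /registrar/users would be forwarded as /users under these assumptions, while paths that do not start with the prefix are left untouched.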

@dkryptr
Author

dkryptr commented May 12, 2017

@magiconair yes, the tag does not contain the hostname. I've noticed when running curl -v <fabio-ip>:9999/status, the Host header gets set to <fabio-ip>:9999. If I set the Host header to only the ip, I get the same result.

I registered a service pointing to www.google.com port 443 with a tag urlprefix-/google proto=https strip=/google. When I run curl -v -L <fabio-ip>:9999/google, the Host gets set to <fabio-ip>:9999 but I don't get the html from google. If I run curl -v -L -H 'Host: <fabio-ip>' http://<fabio-ip>:9999/google without the port specified in the host header, everything works like a charm.

So.... I'm unsure as to why the API Gateway URL is so picky about the Host header...

@magiconair
Contributor

That looks like a bug.

@mitchelldavis
Contributor

@magiconair Sorry to confuse. @CGreenburg sits behind me, and we're working through the same problem.

@magiconair
Contributor

Indeed - that should have been obvious. Sorry for pushing :)

@magiconair
Contributor

What does your consul service record look like?

@mitchelldavis
Contributor

[
    {
        "Node": "api",
        "Address": "opesqzhtvl.execute-api.us-west-2.amazonaws.com",
        "TaggedAddresses": null,
        "ServiceID": "registrar",
        "ServiceName": "registrar",
        "ServiceTags": [
            "urlprefix-/registrar proto=https strip=/registrar"
        ],
        "ServiceAddress": "",
        "ServicePort": 443,
        "ServiceEnableTagOverride": false,
        "CreateIndex": 517,
        "ModifyIndex": 517
    }
]

@magiconair
Contributor

Can you try setting the ServiceAddress ?

@mitchelldavis
Contributor

I did and same result:

[
    {
        "Node": "api",
        "Address": "opesqzhtvl.execute-api.us-west-2.amazonaws.com",
        "TaggedAddresses": null,
        "ServiceID": "registrar",
        "ServiceName": "registrar",
        "ServiceTags": [
            "urlprefix-/registrar proto=https strip=/registrar"
        ],
        "ServiceAddress": "opesqzhtvl.execute-api.us-west-2.amazonaws.com",
        "ServicePort": 443,
        "ServiceEnableTagOverride": false,
        "CreateIndex": 745,
        "ModifyIndex": 745
    }
]

@magiconair
Contributor

And the service is accessible under https://opesqzhtvl.execute-api.us-west-2.amazonaws.com/ ?

@mitchelldavis
Contributor

Yes. We can curl it directly no problem.

@mitchelldavis
Contributor

you should be able to curl it at: https://opesqzhtvl.execute-api.us-west-2.amazonaws.com/dev

@magiconair
Contributor

magiconair commented May 12, 2017

Can you use access logging to log the target url:

fabio -log.access.format '$upstream_request_url' -log.access.target stdout

@mitchelldavis
Contributor

mitchelldavis commented May 12, 2017

I was doing that already actually:

"192.168.16.145 - [12/May/2017:20:24:33 +0000]
        upstream_addr: opesqzhtvl.execute-api.us-west-2.amazonaws.com:443
        upstream_host: opesqzhtvl.execute-api.us-west-2.amazonaws.com$;
        upstream_request_url: https://opesqzhtvl.execute-api.us-west-2.amazonaws.com:443/dev
        upstream_request_uri: /dev
        upstream_request_scheme: https
        upstream_service: registrar"

(That extra dollar sign on the upstream_host was a typo on the log.access.format in the properties)

@magiconair
Contributor

I'm getting a 403 since I don't have the access token.

$ curl -i https://opesqzhtvl.execute-api.us-west-2.amazonaws.com/dev
HTTP/1.1 403 Forbidden
Content-Type: application/json
Content-Length: 42
Connection: keep-alive
Date: Fri, 12 May 2017 20:30:45 GMT
x-amzn-RequestId: e07ebbf7-3751-11e7-a9b1-0b55e7aa1c29
x-amzn-ErrorType: MissingAuthenticationTokenException
X-Cache: Error from cloudfront
Via: 1.1 4f41781811f1a69022318a8d308fd9f3.cloudfront.net (CloudFront)
X-Amz-Cf-Id: rVQRObFNqJFaHmIOQjObLbqxZtrESrprFT4Ji_OaQvG2NPwMSgxDnA==

{"message":"Missing Authentication Token"}

@mitchelldavis
Contributor

That's expected, Sorry, we're not going to give you access.

@magiconair
Contributor

ok, late here (AMS, NL). I'll look at it probably on Monday.

@mitchelldavis
Contributor

No Worries. Thank you!

@mitchelldavis
Contributor

@magiconair, any more thoughts on this?

@magiconair
Contributor

@mitchelldavis Got sidetracked. I'll try to have a look later tonight or tomorrow morning.

@mitchelldavis
Contributor

@magiconair, we got a response back from the CloudFront engineers at AWS, and it turns out that it's not working with the API Gateway because we're passing the wrong Host header. In this case, we're passing the IP of the fabio machine as the Host header. CloudFront can't do anything with that value and is expecting a Host header containing the hostname of the service we're trying to hit.

Does Fabio attempt to "re-write" the host header to the URL it's routing to?

@magiconair
Contributor

No, it uses the host header for routing and then passes the request on as is.

@magiconair
Contributor

OK, so the problem is this:

# b.com advertises 'urlprefix-/status' with Address: b.com
client -> http://a.com/status -> fabio -> http://b.com/status 'Host: a.com'

b.com is supposed to either advertise a.com/status and accept requests with that hostname or advertise /status and accept requests with any hostname.

Your case is different. You register the service without a hostname but the upstream service only accepts a specific hostname. fabio cannot do this right now.

I can see two options:

  1. tell fabio through an option to replace the Host header with the hostname of the upstream server
  2. tell fabio which hostname to use when making the upstream request.

I think this should only work if the upstream service advertises a route with /.
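
The difference between passing the Host header through and replacing it with the upstream hostname (option 1) can be reproduced with Go's standard library alone. The following is a sketch, not fabio's code: newProxy and hostSeenByUpstream are hypothetical names, and the upstream is a local test server that simply echoes the Host it received.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/http/httputil"
	"net/url"
)

// newProxy returns a reverse proxy for target. When useUpstreamHost is true
// the Host header is replaced with the upstream's host; otherwise the
// client's Host header is passed through unchanged.
func newProxy(target *url.URL, useUpstreamHost bool) *httputil.ReverseProxy {
	return &httputil.ReverseProxy{
		Director: func(r *http.Request) {
			r.URL.Scheme = target.Scheme
			r.URL.Host = target.Host
			if useUpstreamHost {
				r.Host = target.Host
			}
		},
	}
}

// hostSeenByUpstream sends one request with "Host: a.com" through the proxy
// and returns the Host the upstream saw plus the upstream's real host:port.
func hostSeenByUpstream(useUpstreamHost bool) (seen, upstreamHost string, err error) {
	upstream := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, r.Host)
	}))
	defer upstream.Close()

	target, err := url.Parse(upstream.URL)
	if err != nil {
		return "", "", err
	}
	front := httptest.NewServer(newProxy(target, useUpstreamHost))
	defer front.Close()

	req, err := http.NewRequest(http.MethodGet, front.URL+"/status", nil)
	if err != nil {
		return "", "", err
	}
	req.Host = "a.com" // what the client believes it is talking to
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return "", "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", "", err
	}
	return string(body), target.Host, nil
}

func main() {
	for _, rewrite := range []bool{false, true} {
		seen, upstreamHost, err := hostSeenByUpstream(rewrite)
		if err != nil {
			panic(err)
		}
		fmt.Printf("rewrite=%v upstream saw Host=%q (upstream is %q)\n", rewrite, seen, upstreamHost)
	}
}
```

Without the rewrite the upstream sees a.com, which is exactly what an upstream that only accepts its own hostname rejects. With the rewrite it sees its own host and port.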

magiconair changed the title from "Serverless architecture using Fabio and Consul" to "Using upstream hostname for request" on May 18, 2017

@mitchelldavis
Contributor

You've nailed it @magiconair. That's the issue. Do you see this as a viable pull request that won't destroy existing functionality? If so, I may be able to spend some cycles trying to get it implemented.

magiconair added a commit that referenced this issue May 18, 2017
WIP WIP WIP WIP WIP WIP WIP WIP WIP WIP WIP WIP
-----------------------------------------------

This patch is work in progress and demonstrates the
desired behavior and adds a test to verify it. The
feature needs to be made configurable per route.

Fabio does not modify the Host header when forwarding
the request to the upstream server. This patch enables
a mode where for a specific route fabio will use the
hostname of the upstream server as the host header.

-----------------------------------------------
WIP WIP WIP WIP WIP WIP WIP WIP WIP WIP WIP WIP

Fixes #294
@magiconair
Contributor

Adding the following line to proxy/http_proxy.go gives you the behavior you want, but for all services. This needs to be exposed and made configurable, which probably isn't difficult, but I'd like to think about it a bit more. You can hack the workaround in there for now to make it work.

diff --git a/proxy/http_proxy.go b/proxy/http_proxy.go
index 4056d07..2e5d84f 100644
--- a/proxy/http_proxy.go
+++ b/proxy/http_proxy.go
@@ -99,6 +99,7 @@ func (p *HTTPProxy) ServeHTTP(w http.ResponseWriter, r *http.Request) {
        } else {
                targetURL.RawQuery = t.URL.RawQuery + "&" + r.URL.RawQuery
        }
+       r.Host = targetURL.Host

        // TODO(fs): The HasPrefix check seems redundant since the lookup function should
        // TODO(fs): have found the target based on the prefix but there may be other

@magiconair
Contributor

I've pushed a branch which is WIP but contains an integration test to verify the behavior. The added line does what you want but now it needs to become configurable.

@mitchelldavis
Contributor

This is awesome. Thank you for looking into this!

@mitchelldavis
Contributor

@magiconair, I was thinking that a tag would be a great way to configure this. Just like the tlsskipverify boolean tag, we could add a usedesthostheader boolean tag that simply wraps the line you outlined above in an if statement. Do you think that would work? Also, I'm not a Go guru, so could you give me a few hints on what needs to be updated in order for me to get this pull request started?

magiconair added a commit that referenced this issue May 24, 2017
magiconair added a commit that referenced this issue May 26, 2017
@mitchelldavis
Contributor

Thanks @magiconair, I'm currently working on a pull request. I'm having trouble getting the vault tests to pass, but we'll get it in when we can.

@magiconair
Contributor

Can you try with vault 0.6.4?

@magiconair
Contributor

The way I usually run the tests is VAULT_EXE=~/vault-0.6.4/vault make test

mitchelldavis pushed a commit to mitchelldavis/fabio that referenced this issue May 27, 2017
@mitchelldavis
Contributor

@magiconair, changing to vault 0.6.4 worked. Thank you so much for your guidance. I was able to run the changes against AWS API Gateway and it worked like a charm.

magiconair pushed a commit that referenced this issue May 31, 2017
This patch adds support for a 'host=dst' option on a route which will
trigger the proxy to use the hostname of the target host for the 
outgoing request instead of the one provided by incoming request.
This allows fabio to act as a reverse proxy for an external site.
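
As a rough illustration of how a per-route option such as host=dst could drive the Host header choice, the sketch below parses a urlprefix- tag into its options and picks the outgoing Host accordingly. The helpers parseTag and upstreamHostHeader, and the hostname api.example.com, are hypothetical and do not reproduce the merged patch.

```go
package main

import (
	"fmt"
	"strings"
)

// parseTag splits a "urlprefix-" tag into the route prefix and its
// key=value options. Hypothetical helper; fabio's own parser may differ.
func parseTag(tag string) (prefix string, opts map[string]string, ok bool) {
	const marker = "urlprefix-"
	if !strings.HasPrefix(tag, marker) {
		return "", nil, false
	}
	fields := strings.Fields(strings.TrimPrefix(tag, marker))
	if len(fields) == 0 {
		return "", nil, false
	}
	opts = make(map[string]string)
	for _, f := range fields[1:] {
		k, v, _ := strings.Cut(f, "=") // v stays empty for bare flags
		opts[k] = v
	}
	return fields[0], opts, true
}

// upstreamHostHeader picks the Host header for the outgoing request: the
// target's host when the route carries host=dst, else the incoming one.
func upstreamHostHeader(opts map[string]string, incomingHost, targetHost string) string {
	if opts["host"] == "dst" {
		return targetHost
	}
	return incomingHost
}

func main() {
	tag := "urlprefix-/registrar proto=https strip=/registrar host=dst"
	prefix, opts, ok := parseTag(tag)
	if !ok {
		panic("not a urlprefix tag")
	}
	fmt.Println(prefix)
	// api.example.com:443 is a placeholder for the real upstream.
	fmt.Println(upstreamHostHeader(opts, "10.0.0.5:9999", "api.example.com:443"))
}
```

Keeping the decision per route means services that rely on the client's original Host header are unaffected unless they opt in.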
@magiconair
Contributor

merged PR #301 to master

@magiconair
Contributor

@mitchelldavis Thanks for this patch!

@mitchelldavis
Contributor

You're very welcome! What kind of timeline are we looking at to get this released in a docker container?

@magiconair
Contributor

Early next week


3 participants