Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

reverseproxy: New copy_response handler for handle_response routes #4391

Merged
merged 4 commits into from
Mar 9, 2022

Conversation

francislavoie
Copy link
Member

@francislavoie francislavoie commented Oct 19, 2021

Followup to #4298 and #4388.

This adds a new copy_response handler which may only be used in reverse_proxy's handle_response routes, which can be used to actually copy the proxy response downstream.

Previously, if handle_response was used (with routes, not the status code mode), it was impossible to use the upstream's response body at all, because we would always close the body, expecting the routes to write a new body from scratch.

To implement this, I had to refactor h.reverseProxy() to move all the code that came after the HandleResponse loop into a new function. This new function h.finalizeResponse() takes care of preparing the response by removing extra headers, dealing with trailers, then copying the headers and body downstream.

Since basically what we want copy_response to do is invoke h.finalizeResponse() at a configurable point in time, we need to pass down the proxy handler, the response, and some other state via a new req.WithContext(ctx). Wrapping a new context is pretty much the only way we have to jump a few layers in the HTTP middleware chain and let a handler pick up this information. Feels a bit dirty, but it works.

Also fixed a bug with the http.reverse_proxy.upstream.duration placeholder, it always had the same duration as http.reverse_proxy.upstream.latency, but the former was meant to be the time taken for the roundtrip plus copying/writing the response.

Example config, copy_response here will make sure the upstream's response body of {"hello": "world"} gets copied to the client. Previously, you had to write a new response with respond or file_server etc:

{
	admin off
	debug
}

:8881 {
	reverse_proxy 127.0.0.1:8882 {
		@200 status 200
		handle_response @200 {
			header Foo bar
			copy_response
		}
	}
}

:8882 {
	header Content-Type application/json
	respond `{"hello": "world"}` 200
}

TODO: I just noticed that if no handler that writes a body is used, the Content-Length header from the upstream is still copied during h.finalizeResponse(), so we end up with a mismatch between that header and the actual body in the response (being empty). Maybe we should remove the Content-Length header (or set it to zero? idk) specifically if bodyClosed is true. Not sure yet.

Edit: Fixed the Content-Length issue.

@francislavoie francislavoie added the feature ⚙️ New feature or request label Oct 19, 2021
@francislavoie francislavoie added this to the v2.5.0 milestone Oct 19, 2021
@francislavoie francislavoie requested a review from mholt October 19, 2021 08:32
@francislavoie francislavoie force-pushed the handle-response-copy-response branch 2 times, most recently from 5fe45c7 to 274ba70 Compare October 20, 2021 05:49
@caddyserver caddyserver deleted a comment from fahadfa Oct 22, 2021
@rainerborene
Copy link
Contributor

What do you think about the idea of copying only incoming headers? Right now, we have to specify each header on handle_response.

(transparent) {
  header Content-Disposition {http.reverse_proxy.header.Content-Disposition}
  header Content-Transfer-Encoding {http.reverse_proxy.header.Content-Transfer-Encoding}
  header Content-Type {http.reverse_proxy.header.Content-Type}
  header Referrer-Policy {http.reverse_proxy.header.Referrer-Policy}
  header Set-Cookie {http.reverse_proxy.header.Set-Cookie}
  header X-Content-Type-Options {http.reverse_proxy.header.X-Content-Type-Options}
  header X-Download-Options {http.reverse_proxy.header.X-Download-Options}
  header X-Frame-Options {http.reverse_proxy.header.X-Frame-Options}
  header X-Permitted-Cross-Domain-Policies {http.reverse_proxy.header.X-Permitted-Cross-Domain-Policies}
  header X-Request-Id {http.reverse_proxy.header.X-Request-Id}
  header X-Runtime {http.reverse_proxy.header.X-Runtime}
}

reverse_proxy web:3000 {
  @accel header X-Accel-Redirect *
  handle_response @accel {
    import transparent
    rewrite {http.reverse_proxy.header.X-Accel-Redirect}
    file_server
  }
}

It would be nice if we have something like mirror_headers instead of doing it manually.

@francislavoie
Copy link
Member Author

That's actually how it behaves with this PR @rainerborene

{
    admin off
    debug
}

:8881 {
    reverse_proxy 127.0.0.1:8882 {
        @200 status 200
        handle_response @200 {
            header Foo bar
            copy_response
        }
    }
}

:8882 {
    header Content-Type application/json
    header Set-Cookie foo
    respond `{"hello": "world"}` 200
}
$ curl -v localhost:8881
*   Trying 127.0.0.1:8881...
* Connected to localhost (127.0.0.1) port 8881 (#0)
> GET / HTTP/1.1
> Host: localhost:8881
> User-Agent: curl/7.74.0
> Accept: */*
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Content-Length: 18
< Content-Type: application/json
< Date: Fri, 22 Oct 2021 18:43:58 GMT
< Foo: bar
< Server: Caddy
< Server: Caddy
< Set-Cookie: foo
< 
* Connection #0 to host localhost left intact
{"hello": "world"}

@francislavoie
Copy link
Member Author

Oh, sorry, nevermind. It does copy the headers with copy_response but it doesn't if you use respond or file_server. Not a bad idea to provide a copy_response_headers directive as well. It might need include/exclude list support in case some headers are specifically problematic.

@rainerborene
Copy link
Contributor

rainerborene commented Oct 22, 2021

@francislavoie got it. I suggest splitting it in two directives then: copy_response_body and copy_response_headers.

Edit: copy prefix could be replaced with mirror. What do you think?

@francislavoie
Copy link
Member Author

francislavoie commented Oct 22, 2021

I prefer copy (because that's what the code does). And I was thinking copy_response + copy_response_headers.

@francislavoie francislavoie force-pushed the handle-response-copy-response branch 3 times, most recently from 33c590f to 1a77bd3 Compare October 31, 2021 03:19
@francislavoie
Copy link
Member Author

Alright - I've implemented copy_response_headers with include/exclude list support.

{
    admin off
    debug
}

:8881 {
    reverse_proxy 127.0.0.1:8882 {
        @200 status 200
        handle_response @200 {
            header Foo foo
            copy_response_headers {
                include Bar Baz
            }
            respond "handled"
        }
    }
}

:8882 {
    header Bar bar
    header Baz baz
    header Content-Type application/json
    respond `{"hello": "world"}` 200
}

The above will copy just the Bar header, ignoring the Content-Type and the JSON response, writing a new response of handled instead.

I ordered the copy_response_headers directive after header, I'm not sure if that's the right place for it or not. My thought is that since header supports defer, if the user wants to make things happen after the headers are copied, they can turn on defer mode.

@francislavoie francislavoie force-pushed the handle-response-copy-response branch 4 times, most recently from 2201ad3 to 166d1fc Compare October 31, 2021 05:57
@rainerborene
Copy link
Contributor

@francislavoie Is it possible to use copy_response_headers without specifying each header manually?

@francislavoie
Copy link
Member Author

francislavoie commented Oct 31, 2021

Yes, it supports either include or exclude or neither lists at all which will copy all of them.

Feel free to build from this branch to try it out.

@francislavoie francislavoie force-pushed the handle-response-copy-response branch from 166d1fc to 10000cf Compare November 30, 2021 03:45
@francislavoie francislavoie force-pushed the handle-response-copy-response branch from 10000cf to 93f5cd8 Compare December 19, 2021 00:44
@francislavoie francislavoie force-pushed the handle-response-copy-response branch from 93f5cd8 to 00766a9 Compare March 7, 2022 00:02
Followup to #4298 and #4388.

This adds a new `copy_response` handler which may only be used in `reverse_proxy`'s `handle_response` routes, which can be used to actually copy the proxy response downstream. 

Previously, if `handle_response` was used (with routes, not the status code mode), it was impossible to use the upstream's response body at all, because we would always close the body, expecting the routes to write a new body from scratch.

To implement this, I had to refactor `h.reverseProxy()` to move all the code that came after the `HandleResponse` loop into a new function. This new function `h.finalizeResponse()` takes care of preparing the response by removing extra headers, dealing with trailers, then copying the headers and body downstream.

Since basically what we want `copy_response` to do is invoke `h.finalizeResponse()` at a configurable point in time, we need to pass down the proxy handler, the response, and some other state via a new `req.WithContext(ctx)`. Wrapping a new context is pretty much the only way we have to jump a few layers in the HTTP middleware chain and let a handler pick up this information. Feels a bit dirty, but it works.

Also fixed a bug with the `http.reverse_proxy.upstream.duration` placeholder, it always had the same duration as `http.reverse_proxy.upstream.latency`, but the former was meant to be the time taken for the roundtrip _plus_ copying/writing the response.
Fixes a bug where the Content-Length will mismatch the actual bytes written if we skipped copying the response, so we get a message like this when using curl:

```
curl: (18) transfer closed with 18 bytes remaining to read
```

To replicate:

```
{
	admin off
	debug
}

:8881 {
	reverse_proxy 127.0.0.1:8882 {
		@200 status 200
		handle_response @200 {
			header Foo bar
		}
	}
}

:8882 {
	header Content-Type application/json
	respond `{"hello": "world"}` 200
}
```
@francislavoie francislavoie force-pushed the handle-response-copy-response branch from 00766a9 to 19a9115 Compare March 7, 2022 00:45
Copy link
Member

@mholt mholt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, finally looked it over, and honestly, it's not as scary as I thought it would be, ha. Good job.

What if we mark these features as EXPERIMENTAL so we can possibly change/tweak them later if needed?

@francislavoie
Copy link
Member Author

Experimental sounds fine to me

@mholt mholt merged commit c7d6c4c into master Mar 9, 2022
@mholt mholt deleted the handle-response-copy-response branch March 9, 2022 21:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature ⚙️ New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants