Bug: Caddy v2.4.2 with specified rewrite rules causes unexpected 308 redirection #4205

Closed
ghost opened this issue Jun 14, 2021 · 19 comments

@ghost

ghost commented Jun 14, 2021

Phenomenon: When opening the web page in Firefox or Chrome, both report that "the page cannot redirect correctly". After downgrading Caddy to v2.4.1, everything works again.

Caddy configuration (Caddyfile):

${domain}
root * /path/to/html
encode zstd gzip
tls /path/to/cert.pem /path/to/key.pem {
  protocols tls1.3
  alpn h2 http/1.1
  curves x25519
}
header Strict-Transport-Security max-age=31536000
reverse_proxy /path/to/proxy 127.0.0.1:${port} {
  transport http {
    versions h2c
  }
}
@common not path /path/to/proxy
handle @common {
  rewrite * /path/to/rewrite.html
}
file_server

In Caddy's log, there are many lines like the one below:

{"level":"error","ts":1623667844.8768942,"logger":"http.handlers.reverse_proxy","msg":"aborting with incomplete response","error":"context canceled"}
@mholt
Member

mholt commented Jun 14, 2021

This seems to happen when there's a rewrite (correct?). I think we need to revert f9b5445.

@diamondburned Could you help take a look at this?

@inoblue Can you distill your config down to the minimum required to reproduce the issue, and a curl -v command that exhibits the behavior?

@mholt mholt added the bug 🐞 Something isn't working label Jun 14, 2021
@mholt mholt added this to the v2.4.3 milestone Jun 14, 2021
mholt added a commit that referenced this issue Jun 14, 2021
@mholt
Member

mholt commented Jun 14, 2021

For now, @inoblue, I've reverted what I believe to be the regression -- if you can build from source can you verify?

@diamondburned, maybe we can come up with a better solution to #4179 that doesn't cause a regression.

@ghost
Author

ghost commented Jun 14, 2021

The issue was solved after building from source.

Thanks for your help.

@diamondburned
Contributor

diamondburned commented Jun 14, 2021

Are there any debug logs available with the broken commit?

Edit: I think it has to do with the changes in modules/caddyhttp/fileserver/staticfiles.go. I wasn't too sure what the changes there had to do with my issue, and I only experienced an issue with modules/caddyhttp/fileserver/browse.go, but the original fix was for staticfiles.go.

I think I can make a commit with just the fix for browse.go and see how that goes.

@francislavoie
Member

francislavoie commented Jun 14, 2021

FWIW, what you were fixing was a bug for both browse and non-browse. I've seen it come up on a couple topics on the forums where users wanted to use file_server within handle_path as well. So ideally we fix it for both. And I don't see why that should work fundamentally differently between the two, they're related.

@mholt
Member

mholt commented Jun 14, 2021

The more I look at this and fiddle with it, the more I'm convinced that canonicalizing based on the original request path is not the correct behavior (i.e. I was wrong in the forum thread where I suggested that; or at least, wrong in the implementation).

@mholt
Member

mholt commented Jun 14, 2021

@inoblue Can you please share your full config and curl commands and logs? I need to know your exact use case. I can tell your config is not real because it has lines like rewrite * /path/to/rewrite.html but this isn't helpful in us understanding the issue.

Ideally, we need to be able to reproduce the bug in the most minimal way possible. This allows us to write regression tests to verify the fix is working. If we can't reproduce it, then you'll have to test our changes for us until it's fixed -- and then we can't add test cases, either.

I've attached a template below that will help make this easier and faster! This will require some effort on your part -- please understand that we will be dedicating time to fix the bug you are reporting if you can just help us understand it and reproduce it easily.

This template will ask for some information you've already provided; that's OK, just fill it out the best you can. 👍 I've also included some helpful tips below the template. Feel free to let me know if you have any questions!

Thank you again for your report, we look forward to resolving it!

Template

## 1. Environment

### 1a. Operating system and version

```
paste here
```


### 1b. Caddy version (run `caddy version` or paste commit SHA)

```
paste here
```


### 1c. Go version (if building Caddy from source; run `go version`)

```
paste here
```


## 2. Description

### 2a. What happens (briefly explain what is wrong)




### 2b. Why it's a bug (if it's not obvious)




### 2c. Log output

```
paste terminal output or logs here
```



### 2d. Workaround(s)




### 2e. Relevant links




## 3. Tutorial (minimal steps to reproduce the bug)




Instructions -- please heed otherwise we cannot help you (help us help you!)

  1. Environment: Please fill out your OS and Caddy versions, even if you don't think they are relevant. (They are always relevant.) If you built Caddy from source, provide the commit SHA and specify your exact Go version.

  2. Description: Describe at a high level what the bug is. What happens? Why is it a bug? Not all bugs are obvious, so convince readers that it's actually a bug.

    • 2c) Log output: Paste terminal output and/or complete logs in a code block. DO NOT REDACT INFORMATION except for credentials.
    • 2d) Workaround: What are you doing to work around the problem in the meantime? This can help others who encounter the same problem, until we implement a fix.
    • 2e) Relevant links: Please link to any related issues, pull requests, docs, and/or discussion. This can add crucial context to your report.
  3. Tutorial: What are the minimum required specific steps someone needs to take in order to experience the same bug? Your goal here is to make sure that anyone else can have the same experience with the bug as you do. You are writing a tutorial, so make sure to carry it out yourself before posting it. Please:

    • Start with an empty config. Add only the lines/parameters that are absolutely required to reproduce the bug.
    • Do not run Caddy inside containers.
    • Run Caddy manually in your terminal; do not use systemd or other init systems.
    • If making HTTP requests, avoid web browsers. Use a simpler HTTP client instead, like curl.
    • Do not redact any information from your config (except credentials). Domain names are public knowledge and often necessary for quick resolution of an issue!
    • Note that ignoring this advice may result in delays, or even in your issue being closed. 😞 Only actionable issues are kept open, and if there is not enough information or clarity to reproduce the bug, then the report is not actionable.

Example of a tutorial:

Create a config file:
{ ... }

Open terminal and run Caddy:

$ caddy ...

Make an HTTP request:

$ curl ...

Notice that the result is ___ but it should be ___.

I will close the issue since we reverted the commit, and tag this as need more info. We can reopen it once we have that information.

@mholt mholt closed this as completed Jun 14, 2021
@mholt mholt added the needs info 📭 Requires more information label Jun 14, 2021
@cam-perry

I am experiencing the same issue and downgrading to 2.4.1 worked for me. Adding repro details:

1. Environment

1a. Operating system and version

Ubuntu 18.04

1b. Caddy version (run caddy version or paste commit SHA)

v2.4.2 h1:chB106RlsIaY4mVEyq9OQM5g/9lHYVputo/LAX2ndFg=

2. Description

2a. What happens (briefly explain what is wrong)

Using file_server to serve the output of a webpack build, where we need all paths not belonging to a static file to rewrite to an index.html outside the directory of the Caddyfile. A GET to Caddy at http://localhost:3030 responds with a 308 Redirect with the header Location: /. This causes a redirect loop for any request to Caddy.

2b. Why it's a bug (if it's not obvious)

On Caddy 2.4.1 this redirect would not happen and the expected html file is served.

2c. Log output

01:28 # caddy run --config=caddy/Caddyfile
2021/06/15 01:28:56.577 INFO    using provided configuration    {"config_file": "caddy/Caddyfile", "config_adapter": ""}
2021/06/15 01:28:56.578 WARN    input is not formatted with 'caddy fmt' {"adapter": "caddyfile", "file": "caddy/Caddyfile", "line": 4}
2021/06/15 01:28:56.579 INFO    admin   admin endpoint started  {"address": "tcp/localhost:2019", "enforce_origin": false, "origins": ["127.0.0.1:2019", "localhost:2019", "[::1]:2019"]}
2021/06/15 01:28:56.579 INFO    tls.cache.maintenance   started background certificate maintenance      {"cache": "0xc00015f110"}
2021/06/15 01:28:56.580 INFO    tls     cleaning storage unit   {"description": "FileStorage:/root/.local/share/caddy"}
2021/06/15 01:28:56.580 INFO    tls     finished cleaning storage units
2021/06/15 01:28:56.581 INFO    autosaved config (load with --resume flag)      {"file": "/root/.config/caddy/autosave.json"}
2021/06/15 01:28:56.581 INFO    serving initial configuration
 2021/06/15 01:13:05.598 INFO    http.log.access.log0    handled request {"request": {"remote_addr": "[::1]:40218", "proto": "HTTP/1.1", "method": "GET", "host": "localhost:3030", "uri": "/", "headers": {"User-Agent": ["curl/7.68.0"], "Accept": ["*/*"]}}, "common_log": "::1 - - [15/Jun/2021:01:13:05 +0000] \"GET / HTTP/1.1\" 308 37", "duration": 0.000205737, "size": 37, "status": 308, "resp_headers": {"Content-Type": ["text/html; charset=utf-8"], "Server": ["Caddy"], "Location": ["/"]}}

2d. Workaround(s)

Downgrade to Caddy 2.4.1

3. Tutorial (minimal steps to reproduce the bug)

  1. The bug was first encountered in a CI environment where our repo is cloned to /root, so add a /root/index.html that we want to serve with Caddy.

  2. In a different directory, add minimum Caddyfile:

:3030 {
    root * /root
    try_files {path} /index.html
    file_server
}
  3. caddy run
  4. curl localhost:3030 => 308 Redirect to /

@mholt
Member

mholt commented Jun 15, 2021

Thanks for the repro instructions. That exhibits it for me. The commit I linked earlier is definitely a regression, because that shouldn't yield a redirect loop (edit: actually, maybe it should, see below).

(I still want to know @inoblue's exact use case & config though, since how to solve both problems without a regression is still an open question.)

@mholt
Member

mholt commented Jun 15, 2021

@cam-perry Actually, it could be argued that your config should have a redirect. Requests to /index.html (index files) are non-canonical, usually they should be to the directory itself (/).

If you change your config to use try_files {path} / you will see the redirect loop go away.

I agree it's unintuitive: since it's an internal rewrite, it's not clear whether canonicalization should be happening. At a glance, no, but it makes sense if you think about it. Few real URLs actually show index.html in them...

@cam-perry

cam-perry commented Jun 15, 2021

Point taken. I agree the redirect is not necessarily wrong, but it is not obvious. I wouldn't expect that being more explicit with /index.html would cause an issue. (edit: while the redirect is not wrong, I also don't think /index.html is wrong either?)

Perhaps worth noting, my solution was guided by this community forum and your comment where the recommended solution is try_files {path} /index.html. This is the top search result for "caddy serve single page app".

@ghost
Author

ghost commented Jun 15, 2021

1. Environment

1a. Operating system and version

Linux debian 5.10.0-0.bpo.5-amd64 #1 SMP Debian 5.10.24-1~bpo10+1 (2021-03-29) x86_64 GNU/Linux

(This is the output of uname -a.)

1b. Caddy version (run caddy version or paste commit SHA)

v2.4.2 h1:chB106RlsIaY4mVEyq9OQM5g/9lHYVputo/LAX2ndFg=

2. Description

2a. What happens (briefly explain what is wrong)

When setting up specified rewrite rules, an unexpected 308 redirection occurred.

2b. Why it's a bug (if it's not obvious)

It's obvious.

2c. Log output

{"level":"info","ts":1623725626.73482,"logger":"http.log.access.log0","msg":"handled request","request":{"remote_addr":"127.0.0.1:47614","proto":"HTTP/1.1","method":"GET","host":"127.0.0.1","uri":"/","headers":{"Accept":["*/*"],"User-Agent":["curl/7.74.0"]}},"common_log":"127.0.0.1 - - [14/Jun/2021:22:53:46 -0400] \"GET / HTTP/1.1\" 308 37","duration":0.000219615,"size":37,"status":308,"resp_headers":{"Server":["Caddy"],"Location":["/"],"Content-Type":["text/html; charset=utf-8"]}}

2d. Workaround(s)

Downgrading to v2.4.1 or building from the newest source code.

3. Tutorial (minimal steps to reproduce the bug)

Create a Caddyfile like this:

127.0.0.1:80
root * /path/to/html
rewrite / /403.html
file_server

The folder /path/to/html has index.html and 403.html.
Run Caddy:

caddy run --config /path/to/Caddyfile

When running curl -v 127.0.0.1:80, the output:

*   Trying 127.0.0.1:80...
* Connected to 127.0.0.1 port 80 (#0)
> GET / HTTP/1.1
> Host: 127.0.0.1
> User-Agent: curl/7.74.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 308 Permanent Redirect
< Content-Type: text/html; charset=utf-8
< Location: /
< Server: Caddy
< Date: Tue, 15 Jun 2021 02:53:46 GMT
< Content-Length: 37
<
<a href="/">Permanent Redirect</a>.
* Connection #0 to host 127.0.0.1 left intact

And when the rule is rewrite /private /, and I run curl -v 127.0.0.1:80/private, the issue occurred again. But when I just change it to rewrite /private /index.html, it works normally.

So, it seems that the issue happens when either the matcher or destination is the root path /.

@ghost ghost changed the title Bug: Caddy v2.4.2 with reverse proxy causes infinite redirection Bug: Caddy v2.4.2 with specified rewrite rules causes unexpected 308 redirection Jun 15, 2021
@mholt mholt reopened this Jun 15, 2021
@mholt
Member

mholt commented Jun 15, 2021

Alright, thanks. Yeah, this happens when rewriting from an index/directory to a non-index/file, since the original path has a trailing slash but the rewritten path doesn't. Hmm hmm hmm.

@mholt
Member

mholt commented Jun 16, 2021

Okay so I'm starting to understand this bug better, and to be honest, I'm torn on this one. Here's what I know:

  • The commit f9b5445 (in v2.4.2) does fix a certain, not-too-uncommon use case, described in fileserver: Redirect within the original URL #4179.

  • However, the same commit also breaks some existing sites. So far all the reports have involved rewriting the request path explicitly to an index file (e.g. try_files index.html or rewrite * /index.html or similar).

  • The file server stats the rewritten path and sees that it's a file, not a directory, so it doesn't look for index files.

  • When it doesn't look for index files, according to f9b5445 (v2.4.2), it will check to see if the original/external path (the one sent by the client) ends with / and if so, it will canonicalize the path by redirecting to the path without the /, leading to the unwanted redirect(s).

  • Rewriting to / instead of /index.html -- i.e. letting the file server find the index file implicitly -- seems to resolve the issue in v2.4.2.
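The decision described in the bullets above can be sketched in Go roughly as follows. This is a simplified illustration of the described behavior, not the code from f9b5445; the function name `redirectTargetV242` and the `lookedForIndex` flag are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// redirectTargetV242 (hypothetical) models the check described above.
// origPath is the path the client sent, before any internal rewrite.
// lookedForIndex is true when the rewritten path was a directory and the
// file server searched it for an index file.
// It returns the redirect target and whether a redirect is issued.
func redirectTargetV242(origPath string, lookedForIndex bool) (string, bool) {
	if !lookedForIndex && strings.HasSuffix(origPath, "/") {
		// A plain file is being served but the client-facing path looks
		// like a directory: strip the trailing slash.
		return strings.TrimSuffix(origPath, "/"), true
	}
	return "", false
}

func main() {
	// rewrite / /403.html : client sent "/", a plain file is served.
	to, ok := redirectTargetV242("/", false)
	fmt.Printf("%q %v\n", to, ok) // "" true

	// Rewriting to / instead: the file server finds the index itself.
	to, ok = redirectTargetV242("/", true)
	fmt.Printf("%q %v\n", to, ok) // "" false
}
```

For the root path the stripped target is empty; in the logs earlier in this thread the resulting response carries Location: /, so the client is sent back to the same URL and the rewrite fires again.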

One could reasonably say that changing configs that rewrite from / to /index.html (or whatever index file explicitly) isn't a solution. You'd be right... it's a confusing breaking change, and the result is a less explicit configuration.

However, one could also argue that rewriting to an index file explicitly (/index.html) is a misconfiguration, since index files are generally accessed with / and implicitly used; for instance, you'd never write a hyperlink (href) to example.com/index.html. In addition, index files are configurable; it might not be index.html, and the config for the possible index file names lives in the file_server. (I have a config where the index file is called template.html.) By explicitly rewriting to an index file, you are depriving the file server of its job, or in other words, breaking abstractions.

I am guilty of this in at least 2 site configs of my own so far, and it's not really something I've thought about.

But now I wonder if these kinds of configs really are misconfigurations, and should be fixed; or whether we should try to be clever to not break existing sites.

For right now, f9b5445 has been reverted in 8848df9, but I'm hoping we can come to a satisfying resolution before releasing v2.4.3...

@mholt mholt added help wanted 🆘 Extra attention is needed in progress 🏃‍♂️ Being actively worked on labels Jun 16, 2021
@mholt mholt self-assigned this Jun 16, 2021
@mholt
Member

mholt commented Jun 16, 2021

For the record, I tried using a solution that Francis suggested where we don't canonicalize (redirect) if the rewritten path is explicitly an index file. This worked for one of the use cases I tried, but not the Caddy website for example (the docs part of the site), in part due to how templates use the rewritten path (it expects the index page to have "index" in the rewritten path).

I think for now the best thing to do is keep the revert for v2.4.3. Yes, it can be problematic for URL canonicalization when used inside handle_path, but we'll need a solution that doesn't break so many sites in such complex ways.

@mholt mholt removed in progress 🏃‍♂️ Being actively worked on needs info 📭 Requires more information labels Jun 16, 2021
@mholt
Member

mholt commented Jun 17, 2021

I think I came up with a solution that works in all 3 main cases I've tested: only redirect if the base element of the path (the filename) is the same in both the original request and the rewritten request. In other words, do not redirect if the filename in the path was changed/rewritten internally. (The logic being, if the admin wanted to rewrite to the canonical path, they would have.)

This seems to prevent the file server from stepping on the toes of intentional rewrites, while still enforcing path canonicalization when, for example, only the prefix of the path is changed (as in handle_path).
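A minimal Go sketch of that rule, using the standard library's `path.Base` as a stand-in for "the base element of the path". Whether the actual commit is written this way is an assumption, and the function name `sameBaseElement` and the sample paths are hypothetical.

```go
package main

import (
	"fmt"
	"path"
)

// sameBaseElement (hypothetical) reports whether the last path element is
// unchanged between the client's original path and the rewritten path.
// Under the proposed rule, canonicalization redirects are only considered
// when this returns true.
func sameBaseElement(origPath, rewrittenPath string) bool {
	return path.Base(origPath) == path.Base(rewrittenPath)
}

func main() {
	cases := []struct{ orig, rewritten string }{
		// Explicit rewrite to a file: filename changed, so no redirect.
		{"/", "/403.html"},
		{"/expert-caddy/some-article", "/expert-caddy/template.html"},
		// handle_path /foo/* strips only the prefix: filename unchanged.
		{"/foo/photos", "/photos"},
	}
	for _, c := range cases {
		fmt.Printf("%s -> %s: %v\n", c.orig, c.rewritten, sameBaseElement(c.orig, c.rewritten))
	}
	// / -> /403.html: false
	// /expert-caddy/some-article -> /expert-caddy/template.html: false
	// /foo/photos -> /photos: true
}
```

Note that `path.Base` ignores trailing slashes, so `/foo/photos/` and `/photos/` also compare equal.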

For all the test cases below, I found it helpful to turn on debug mode:

{
    debug
}

Test case 1

This config (used on my Expert Caddy website) used to cause a redirect loop:

localhost

root * /home/matt/Dev/matt.life

templates

# article index should have trailing slash to preserve hrefs
redir /expert-caddy /expert-caddy/
try_files {path} {path}/ {path}.html

# serve all articles from the template
rewrite /expert-caddy/* /expert-caddy/template.html

file_server

(This config could actually be better, probably by specifying template.html as an index file in the file_server, but 🤷‍♂️)

Test case 2

This config is from the Caddy website. Without my proposed change, it would not render pages in the /docs/ section of the site:

localhost

root * src

file_server
templates
encode gzip

try_files {path}.html {path}

redir   /docs/json      /docs/json/
redir   /docs/modules   /docs/modules/
rewrite /docs/json/*    /docs/json/index.html
rewrite /docs/modules/* /docs/modules/index.html
rewrite /docs/*         /docs/index.html

reverse_proxy /api/* localhost:4444

With my proposed change, the docs pages still function as expected (canonicalization redirects are NOT dispatched, since I explicitly rewrote the filename to be what I want it to be).

Test case 3

A simple file server with directory listings, that does not have an index file, where we use handle_path to serve the root folder within the path of /foo/*:

localhost

root * /any/folder/that's/not/a/website

handle_path /foo/* {
	file_server browse
}

With my proposed change, this continues to issue canonicalization redirects because only the path prefix is rewritten, not the filename; and the redirects correctly preserve the /foo/ prefix and the relative hrefs continue to work because of the redirects.

I'm going to commit and push this soon, then probably release v2.4.3 later today.

@mholt mholt removed the help wanted 🆘 Extra attention is needed label Jun 17, 2021
@mholt mholt closed this as completed in fbd6560 Jun 17, 2021
mholt added a commit that referenced this issue Jun 25, 2021
@simon04
Contributor

simon04 commented Jan 8, 2022

I was puzzled by the HTTP 308 Permanent Redirect problem and found the issue with my configuration. When the address contains a trailing slash, as in http://bar.example.com/ in the example below, the 308 Permanent Redirect problem triggers.

The trailing slash is neither mentioned nor forbidden in https://caddyserver.com/docs/caddyfile/concepts#addresses

$ mkdir -p /srv/http/example.com/
$ touch /srv/http/example.com/index.html
$ touch /srv/http/example.com/map.js

$ caddy version
v2.4.6

# Caddyfile
http://foo.example.com,
http://bar.example.com/ { 
  root * /srv/http/example.com/
  file_server
}

$ curl --include foo.example.com/map.js
HTTP/1.1 200 OK

$ curl --include bar.example.com/map.js
HTTP/1.1 308 Permanent Redirect

@francislavoie
Member

francislavoie commented Jan 8, 2022

The trailing slash creates a path matcher which only matches exactly / and nothing else.

I agree this is confusing behaviour, and I plan to deprecate path matchers in site addresses soon, because it's clearer to use a handle block explicitly instead, if you're actually intending to use a path matcher.

Your issue is not related to the one that was originally reported in this issue. It's a different problem altogether.

@mholt
Member

mholt commented Jan 8, 2022

Before deprecating it, I'd like to try improving documentation around it first. It does have value and valid use cases.

jaygooby added a commit to jaygooby/caddy that referenced this issue Jan 10, 2023
jaygooby added a commit to jaygooby/caddy that referenced this issue Jan 10, 2023