Reverse proxy documentation issues with nginx example #9579
Comments
By default the regex for the nginx example ( Looking at the docs for the root directive, it looks as if |
I have seen some packages provide a default in some cases, mostly so people know they got their server working. In my opinion, for security purposes and in general, being explicit is better than relying on defaults. It's not a big deal, but it is a fine point. |
There are cases where the regex is not a viable solution. For example, I use matrix-media-repo. This required redirecting media traffic, which is impossible with the regex (regex locations get priority in location matching), so I cannot use it. Documentation for staying safe without the regex would be a good idea. I'd like to note as well that the regex looks like it would forward that with what little I know. URLs starting with |
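A minimal sketch of the regex-free approach described above, meant to sit inside an existing server block. The ports (8008 for Synapse, 8000 for the media service) are assumptions for illustration and are not taken from this thread. Among plain prefix locations nginx picks the longest matching prefix, so the media path can be routed separately without a regex location overriding it.

```nginx
# Sketch only: both upstream ports below are hypothetical.

# Longest prefix wins among non-regex locations, so media requests
# are sent to the separate media service.
location /_matrix/media {
    proxy_pass http://localhost:8000;
    proxy_set_header Host $host;
    proxy_set_header X-Forwarded-For $remote_addr;
    proxy_set_header X-Forwarded-Proto $scheme;
}

# Everything else under /_matrix goes to Synapse.
location /_matrix {
    proxy_pass http://localhost:8008;
    proxy_set_header Host $host;
    proxy_set_header X-Forwarded-For $remote_addr;
    proxy_set_header X-Forwarded-Proto $scheme;
    client_max_body_size 50M;
}

# Client-facing Synapse endpoints; /_synapse/admin is deliberately not listed.
location /_synapse/client {
    proxy_pass http://localhost:8008;
    proxy_set_header Host $host;
    proxy_set_header X-Forwarded-For $remote_addr;
    proxy_set_header X-Forwarded-Proto $scheme;
}
```

Because no regex location is involved, adding further prefix locations later does not change which of these blocks wins for a given path.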
I agree with @davehayes. In my experience going to original: modified: Then the "It works! Synapse is running" page loaded correctly with the redirect to This exposes the admin endpoint so I had to do similar to what @davehayes suggested above with the deny for It would be good to update the example as most people will blindly copy and paste this code. I checked a number of Synapse servers for users that I suspected were running their own instances of Synapse and the vast majority exposed the admin endpoint, most likely without knowing. |
Wait, that doesn't even work by default? I didn't notice... This regex is a black box to users, and that's not good. |
It didn't work for me with the regex statement on Arch Linux without the above noted modification but many other Synapse instances had the redirect working correctly so I sort of suspect it is something specific to the Arch Linux Nginx configuration. |
for reference, the documentation in question is here: https://matrix-org.github.io/synapse/latest/reverse_proxy.html#nginx, and it recommends:
this is a relatively simple regular expression, matching anything starting with either The reason we do this is because we cannot assume that synapse is the only thing running on that host (and likewise, that is the reason that we don't redirect the root to the "it works!" page by default). If you're not running anything else on the host, then feel free to forward everything. (Aside: What I do wonder is why it is I'm going to close this because afaict it's working as intended. |
This is pure safety. We use ~* so that mixed case URLs (if they somehow get to the location block) behave the same as lower case ones. Consider the URL The escape is because perl regular expressions typically use |
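For readers comparing the two modifiers discussed here, a small sketch, assuming Synapse listens on localhost:8008 (an assumption, not stated in this thread). nginx location regexes are not wrapped in slash delimiters, so an unescaped / and an escaped \/ match the same text; the unescaped form is used below only for readability.

```nginx
# Sketch only: the upstream port is hypothetical.

# ~* is case-insensitive: /_matrix/..., /_MATRIX/... and /_Matrix/...
# all reach this block.
location ~* ^(/_matrix|/_synapse/client) {
    proxy_pass http://localhost:8008;
    proxy_set_header Host $host;
    proxy_set_header X-Forwarded-For $remote_addr;
    proxy_set_header X-Forwarded-Proto $scheme;
}

# Case-sensitive alternative: with ~ instead of ~*, only the exact-case
# prefixes match and other spellings fall through to the remaining locations.
# location ~ ^(/_matrix|/_synapse/client) { ... }
```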
Since synapse will (should) 404 such a URL, it's hard to see how this is beneficial; rather it adds complexity for nginx.
Given the complaints above about the regular expression being a black box that people don't understand, no, I don't think it is. It makes it harder to read. |
Is there a rigorous way to calculate the incremental complexity of adding a "*"? Regardless, should synapse somehow break the behavior that you imply exists, this single * covers that base. If in the future, someone decides that synapse should handle mixed case URLs, a single * still covers that base. I'm pretty sure, were we to actually agree on a measurement of complexity, that the complexity in finding out that synapse rejects mixed or upper case URLs with 404 would be much greater than the addition of a "*" which most people configuring nginx as a reverse proxy should know means "case insensitive". I'd say Nginx's behavior is easily searchable using a search engine...would you agree that it is easier than searching for the implied behavior of synapse with regard to mixed case URLs?
Given that people paste things indiscriminately, while it might be slightly harder to read, I believe it's less secure. Your mileage may vary and all that. |
As someone with some experience dealing with nginx for what I self host, that regex was something new to me, and I only figured it out by starting with what you'd given and sorting out the issues I found as I went through things. It's only through the discussion here that people have talked enough about it that I can see what's going on. I'm not using the regex in my use case in the end, as the regex breaks the location matching I need. Assuming the user's skill level is either "knows pretty much everything" or "just copies and pastes" is not a great way of looking at things. While explaining everything about nginx isn't something I think is needed, I think going through what's been provided is a good idea, as well as providing more than one example in this case, as that regex does make certain things impossible. |
Description
I have two issues with your reverse proxy documentation.
First, it suggests leaving out important nginx configuration variables like root. Without a root configuration, I believe it's a compiled-in default. With the location block you have in the example, if someone blindly pastes that (I know that they shouldn't) they will expose whatever compiled-in default is there to the internet. I suggest you add something like … so people know that any other https accesses to this server block should return a 404.
Next, the blurb at the bottom of this document was unknown to me. Perhaps your examples should include a way to filter the unwanted URL accesses, e.g.:
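The example that followed is not preserved. As a hedged sketch of such a filter, assuming the unwanted URLs are the /_synapse/admin endpoints mentioned elsewhere in this thread, a block like the one below could sit inside the server block. It matters mostly when a broader location forwards everything to Synapse.

```nginx
# Sketch only: assumes /_synapse/admin is the path to keep private.
# The ^~ modifier stops nginx from checking regex locations once this
# prefix matches, so a broader regex location cannot override it.
location ^~ /_synapse/admin {
    return 404;
}
```

Returning 404 rather than 403 avoids confirming that the endpoint exists; deny all; would be the 403 variant.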
These are fine points to be sure, but I suspect new homeserver admins would appreciate seeing these changes. Thank you for reading.