Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

url_preview_ip_range_blacklist not honored #9417

Closed
dani opened this issue Feb 16, 2021 · 16 comments
Closed

url_preview_ip_range_blacklist not honored #9417

dani opened this issue Feb 16, 2021 · 16 comments
Labels
X-Needs-Info This issue is blocked awaiting information from the reporter

Comments

@dani
Copy link

dani commented Feb 16, 2021

Description

URL preview is not honoring url_preview_ip_range_blacklist anymore since 1.25 (or was it 1.26 ?)

Steps to reproduce

I have enabled url_preview in my config and configured ip blacklist so private IP are not previewed

url_preview_enabled: True
url_preview_ip_range_blacklist:
  - '127.0.0.0/8'
  - '10.0.0.0/8'
  - '172.16.0.0/12'
  - '192.168.0.0/16'
  - '100.64.0.0/10'
  - '169.254.0.0/16'
max_spider_size: 10M

This used to work as expected, but since a few versions, it's not honoring url_preview_ip_range_blacklist anymore. URL corresponding to private IP are being previewed. I haven't noted the exact version in which the problem was introduced but I think it was either 1.25.0 or 1.26.0

Version information

  • Synapse 1.27.0
  • manual install with PIP
  • CentOS Linux 7 with Python 3.6.3
@anoadragon453
Copy link
Member

I don't see too many recent, relevant changes to PreviewUrlResource. It looks like outgoing requests are still using a SimpleHttpClient with an IP blacklist set.

There have been significant changes to the IP Blacklisting config however. The IPs for url_preview_ip_range_blacklist may be getting lost somewhere, which also may make sense as to why the unit tests didn't flag anything.

@dani
Copy link
Author

dani commented Feb 16, 2021

Probably unrelated, but I'm using an outbound proxy (through HTTP_PROXY and HTTPS_PROXY env var)

@clokep
Copy link
Member

clokep commented Feb 16, 2021

Probably unrelated, but I'm using an outbound proxy (through HTTP_PROXY and HTTPS_PROXY env var)

It is up to your proxy to prevent access to internal IPs in this case since the proxy does the DNS resolution, etc. I don't think anything has changed with proxies in the past few releases but maybe we're using them in more places now?

@dani
Copy link
Author

dani commented Feb 16, 2021

I though synapse was doing the DNS resolution to handle the blacklists/whitelist before passing the request to the proxy (because it was working as expected before, and I was using the very same proxy config)

@clokep
Copy link
Member

clokep commented Feb 16, 2021

Maybe the changes in #9084 caused this, but it should have previously just connected directly to the proxy and passed on the URL to request.

@clokep
Copy link
Member

clokep commented Feb 24, 2021

To add a bit more information about this, it seems that the HTTP client used for URL previews is not shared with anything else:

self.client = SimpleHttpClient(
hs,
treq_args={"browser_like_redirects": True},
ip_whitelist=hs.config.url_preview_ip_range_whitelist,
ip_blacklist=hs.config.url_preview_ip_range_blacklist,
http_proxy=os.getenvb(b"http_proxy"),
https_proxy=os.getenvb(b"HTTPS_PROXY"),
)

This was not changed recently.

#9084 changed the ProxyAgent (which SimpleHttpClient calls) to use the "real" reactor (not the BlacklistingReactorWrapper), this was to fix an oddity where the BlacklistingReactorWrapper stopped you from connecting to the proxy due to it being on a private IP range (see some conversation around https://matrix.to/#/!XaqDhxuTIlvldquJaV:matrix.org/$IVyi3-SwCcPSUnSzprx9wcPD4Ubru-uEnddhWK1IE5g?via=matrix.org&via=vector.modular.im&via=envs.net).

In that conversation there was a discussion that no where in the code did we use blacklisting + proxying until #8821, although it seems this was slightly mistaken as we have always done this for URL previews.

As far as I can tell the change to the reactor should have no affect on either an HTTP or HTTPS proxy as the name resolver only ever runs on the proxy's host, not on the destination. I believe this is also true of the CONNECT proxy, but the code is a bit harder to follow.

I'd be surprised if this behavior changed, but regardless I think you want to be configuring your proxy to not connect to private IPs. (I'd be surprised if this wasn't the default?) Maybe we should be calling this out in the documentation. Out of curiosity, what proxy are you using?

@Bubu do you have any thoughts on whether this behavior changed with #9084?

@clokep clokep added the X-Needs-Info This issue is blocked awaiting information from the reporter label Feb 24, 2021
@dani
Copy link
Author

dani commented Feb 24, 2021

Using squid. The VM running synapse does have access to some of the private IP for technical reason (local repo mirror, local API etc.). I will indeed have to fine tune the ACL on squid (or completly disable url previewing), but it's strange that it's needed now while it was working before

@clokep
Copy link
Member

clokep commented Feb 24, 2021

it's strange that it's needed now while it was working before

It would help to confirm which version it broke in FTR! 👍

@Bubu
Copy link
Contributor

Bubu commented Feb 24, 2021

@Bubu do you have any thoughts on whether this behavior changed with #9084?

I'd have thought that this didn't change anything regarding blacklisting of the final destination but unfortunately the code around the proxy connection stuff is sufficiently complex that it's mostly guesswork without diving into the codebase again :-/.

@dani
Copy link
Author

dani commented Feb 25, 2021

I switched to url_preview_url_blacklist, which I must admit makes more sense when using a proxy. It's working well. So maybe, this would just need a small doc update to indicate that url_preview_ip_range_blacklist shouldn't be used when using a proxy.

@richvdh
Copy link
Member

richvdh commented Apr 8, 2021

I'm going to close this as the main issue seems to be resolved. If anyone fancies putting up a PR to update the docs on url_preview_ip_range_blacklist, that would be appreciated.

@richvdh richvdh closed this as completed Apr 8, 2021
@anoadragon453
Copy link
Member

So to boil this down into some useful documentation going forward, the issue was that adding an IP range to url_preview_ip_range_blacklist that includes the proxy that you're using via HTTP_PROXY or HTTPS_PROXY would cause Synapse to refuse to connect to your proxy? url_preview_ip_range_blacklist is apparently required to be set to use URL previews at all. So we should advise people to set the list to something, but ensure it doesn't cover their proxy's IP when using one?

Would it be better to just ignore this option when a proxy is in use?

@richvdh
Copy link
Member

richvdh commented Apr 14, 2021

the issue was that adding an IP range to url_preview_ip_range_blacklist that includes the proxy that you're using via HTTP_PROXY or HTTPS_PROXY would cause Synapse to refuse to connect to your proxy

Was that the actual issue? It doesn't seem to match the OP's report.

It feels like HTTP_PROXY and HTTPS_PROXY should be exempt from url_preview_ip_range_blacklist (and that you shouldn't be required to set it if you're using a proxy).

Would it be better to just ignore this option when a proxy is in use?

I'd think if someone is setting url_preview_ip_range_blacklist and also using HTTP_PROXY we should warn about it.

@anoadragon453
Copy link
Member

Was that the actual issue? It doesn't seem to match the OP's report.

Oh indeed, that was the conclusion I arrived from given later comments, but from the original report it looks like the problem is much simpler. The documentation that needs to be added would be something along the lines of "The url_preview_ip_range_blacklist is ignored if you are using an HTTP(S) proxy as all DNS requests will be forwarded to the proxy instead of being resolved locally".

That coupled with allowing url_preview_ip_range_blacklist to be unset when using a proxy and warning when they are used together sounds like the most user friendly state.

If that all sounds right I can put it into a new issue 🙂

@clokep
Copy link
Member

clokep commented Apr 14, 2021

It feels like HTTP_PROXY and HTTPS_PROXY should be exempt from url_preview_ip_range_blacklist (and that you shouldn't be required to set it if you're using a proxy).

This is already true as of #9084.

The documentation that needs to be added would be something along the lines of "The url_preview_ip_range_blacklist is ignored if you are using an HTTP(S) proxy as all DNS requests will be forwarded to the proxy instead of being resolved locally".

That coupled with allowing url_preview_ip_range_blacklist to be unset when using a proxy and warning when they are used together sounds like the most user friendly state.

I agree that this is what might make this less confusing.

@anoadragon453
Copy link
Member

@clokep Thanks for clarifying! I've made an issue with the details of what could be done to improve things: #9812

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
X-Needs-Info This issue is blocked awaiting information from the reporter
Projects
None yet
Development

No branches or pull requests

5 participants