Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for GuzzleHTTP 'no' proxy #17684

Merged
merged 1 commit into from
Mar 22, 2020

Conversation

mlatief
Copy link
Contributor

@mlatief mlatief commented Oct 25, 2019

The custom config allows to setup a proxy URI that is passed to GuzzleHTTP client as request options. Guzzle has the option to receive an array of proxies for each URI scheme as well as 'no' key value pair
to provide a list of host names that should not be proxied to.

Guzzle would automatically populate this value with environment's NO_PROXY environment variable. However, when providing a 'proxy' request option, it is needed to provide the 'no' value.

More info:
http://docs.guzzlephp.org/en/stable/request-options.html#proxy

This commit will add support for new config 'proxyexclude', which takes a list of host names to be excluded.

It will also get 'proxy' request option as and array instead of a string to Guzzle, and populate 'http' and 'https' URI schemas with proxy URI, and 'no' with 'proxyexclude' list.

Should fix: #12402

Signed-off-by: Mohammed Abdellatif m.latief@gmail.com

@kesselb
Copy link
Contributor

kesselb commented Oct 25, 2019

Thank you for taking care of this 👍

@kesselb kesselb added 3. to review Waiting for reviews enhancement labels Oct 25, 2019
@kesselb kesselb added this to the Nextcloud 18 milestone Oct 25, 2019
@kesselb kesselb requested a review from rullzer October 25, 2019 19:58
@@ -532,6 +532,13 @@
*/
'proxyuserpwd' => '',

/**
* List of host names that should not be proxied to.
* For example ``['.mit.edu', 'foo.com']``.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might add some hint like:

Use something like explode(',', getenv('NO_PROXY')) to sync this value with the global NO_PROXY option.

@kesselb
Copy link
Contributor

kesselb commented Dec 7, 2019

@mlatief could you rebase the branch? Thanks 👍

This was referenced Dec 11, 2019
@rullzer rullzer mentioned this pull request Dec 19, 2019
18 tasks
@rullzer rullzer modified the milestones: Nextcloud 18, Nextcloud 19 Dec 19, 2019
@juliusknorr
Copy link
Member

juliusknorr commented Feb 4, 2020

@mlatief Thanks a lot for this. Mind to do another rebase to trigger the CI again?

@juliusknorr juliusknorr added 4. to release Ready to be released and/or waiting for tests to finish and removed 3. to review Waiting for reviews labels Feb 20, 2020
@juliusknorr
Copy link
Member

Thanks, failures seem unrelated.

@kesselb
Copy link
Contributor

kesselb commented Feb 20, 2020

Ah. I broke NO_PROXY and HTTPS_PROXY support with #14363 🙈

  1. https://github.com/nextcloud/3rdparty/blob/d583bf2a4b2db903846bfbf1600f69f3b4b60702/guzzlehttp/guzzle/src/Client.php#L261-L268 Guzzle will read HTTPS_PROXY and NO_PROXY by default and use their values as default.

  2. A request from Nextcloud will pass a array with proxy, verify and timeout. If there is no proxy configured with Nextcloud the value for proxy is null.

  3. https://github.com/nextcloud/3rdparty/blob/d583bf2a4b2db903846bfbf1600f69f3b4b60702/guzzlehttp/guzzle/src/Client.php#L320 sometime we will hit this code where $options are the values passed from Nextcloud and $defaults are the defaults from guzzle.

Take this a example:

<?php

$defaults = [
    'proxy' => ['https' => '127.0.0.1', 'no' => 'mydomain.com']
];

$options = [
    'proxy' => null
];

var_dump($options + $defaults);

Result:

array(1) {
  ["proxy"]=>
  NULL
}

To restore the default guzzle behaviour we MUST not set proxy as request option if there is no one. I can shoot another pr if this one is merged or @mlatief is willing clean up my mess ;)

@mlatief
Copy link
Contributor Author

mlatief commented Feb 20, 2020

Would it be enough for example to modify this:

$defaults = [
RequestOptions::PROXY => $this->getProxyUri(),
RequestOptions::VERIFY => $this->getCertBundle(),
RequestOptions::TIMEOUT => 30,
];
and set proxy only if Nextcloud is configured to use one?

Also I would be interested in adding a test case for this, however I can't easily figure out a way to send the request to guzzle and see what values it used for proxy (or other options for that matter).

@kesselb
Copy link
Contributor

kesselb commented Feb 20, 2020

Yes that's the right place. I'm not sure how to realize a test. But we actually should not test that guzzle is behaving correctly. Just ensure we pass the right data.

@mlatief
Copy link
Contributor Author

mlatief commented Feb 29, 2020

Now there is a check if getProxyUri() returns null before adding 'proxy' key.

In the process, I was reconsidering if 'proxy' should revert back to be just a URI string. However, that would mean that if Nextcloud is configured to use a 'proxy', there won't be an easy way
to configure NO_PROXY hosts.

Do you think if proxyexclude is empty we should read NO_PROXY, something akin to: https://github.com/nextcloud/3rdparty/blob/7f4a8f54d850a92e50e143d8ef4d59ae8cca73ac/guzzlehttp/guzzle/src/Client.php#L265 ? For now I just added a note to proxy configuration in the sample config file.

@kesselb
Copy link
Contributor

kesselb commented Feb 29, 2020

Good point 👍

I don't know to be honest. Guzzle uses + to combine the data but this is not recursive. I think the best we can do is:

  1. We don't have any proxy information: Don't pass proxy to Guzzle so the defaults are used.
  2. A proxy is defined: Pass proxy to Guzzle and add a warning to the docs that NO_PROXY is ignored for Nextcloud requests.

We are able to read a NO_PROXY value as fallback (for example if proxyexclude is not defined) but that mixes explicit and implicit configuration. For example: HTTPS_PROXY and NO_PROXY are defined. Admin want Nextcloud to use a different proxy. The NO_PROXY value does not work for the other proxy. With a fallback we would use the NO_PROXY value also for the other proxy.

I would prefer: If a proxy is defined with Nextcloud also proxyexclude must be defined. A admin can still use getenv(NO_PROXY) to keep the value in sync with the system wide default. If we document it probably everyone should be happy.

@@ -520,6 +520,10 @@
/**
* The URL of your proxy server, for example ``proxy.example.com:8081``.
*
* Note: If a proxy is explicitly configured here, default values from proxy global
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Guzzle (the http library used by Nextcloud) is reading the environment variables HTTP_PROXY (only for cli request), HTTPS_PROXY and NO_PROXY by default.

If you configure proxy with Nextcloud any default configuration by Guzzle is overwritten. Make sure to set proxyexclude accordingly if necessary.

// Only add RequestOptions::PROXY if Nextcloud is explicitly
// configured to use a proxy. This is needed in order not to override
// Guzzle default values.
if( $proxy !== null ) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if( $proxy !== null ) {
if($proxy !== null) {

}

return $proxyUserPwd . '@' . $proxyHost;
return ['http' => $proxyHost, 'https' => $proxyHost, 'no' => $proxyExclude];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://github.com/nextcloud/3rdparty/blob/d583bf2a4b2db903846bfbf1600f69f3b4b60702/guzzlehttp/guzzle/src/Client.php#L265-L268

If getenv('NO_PROXY') is empty the no element is not set. We should probably do the same. Only set no if there is a value for proxyexclude.

The custom config allows to setup a proxy URI that is passed to
GuzzleHTTP client as request options. Guzzle has the option to receive
an array of proxies for each URI scheme as well as 'no' key value pair
to provide a list of host names that should not be proxied to.

Guzzle would automatically populate these options with HTTPS_PROXY
and NO_PROXY environment variables. However, when providing a 'proxy'
request option, default values will be overriden and it is required to
explicitly provide the 'no' value if needed.

More info:
http://docs.guzzlephp.org/en/stable/request-options.html#proxy

This commit will add support for a new config 'proxyexclude', which
takes a list of host names to be excluded.

It will also provide 'proxy' request option as an array instead of a
string to Guzzle, and populate 'http' and 'https' URI schemes with
proxy URI, and 'no' with 'proxyexclude' list.

Also, if no 'proxy' is configured, it will leave out 'proxy' request
option, so it won't override Guzzle default values.

Sample config file includes a hint on how to explicitly sync
'proxyexclude' with NO_PROXY, and a note about default values.

Signed-off-by: Mohammed Abdellatif <m.latief@gmail.com>
@rullzer rullzer merged commit 6675f9b into nextcloud:master Mar 22, 2020
@welcome
Copy link

welcome bot commented Mar 22, 2020

Thanks for your first pull request and welcome to the community! Feel free to keep them coming! If you are looking for issues to tackle then have a look at this selection: https://github.com/nextcloud/server/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22

@mlatief mlatief deleted the support-no-proxy branch March 22, 2020 20:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
4. to release Ready to be released and/or waiting for tests to finish enhancement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

NO_PROXY
4 participants