Python Selenium 4.0.0 - Does not respect no_proxy variable but respects http_proxy and https_proxy #9925

Closed
supersmo opened this issue Oct 14, 2021 · 10 comments

@supersmo

🐛 Bug Report

I upgraded to Python Selenium 4.0.0, and when the tests run, the request to the webdriver on localhost is sent to the proxy specified in the environment variable https_proxy. The no_proxy variable is specified to exclude localhost.

To Reproduce

  1. Define proxy environment variables in your shell.
  2. Run Selenium tests inside the shell.
  3. Requests to the webdriver on localhost are sent to the proxy even though no_proxy is defined.

When the request to localhost went to the proxy, it was caught by the proxy protection solution, which returned an error page saying that localhost is not categorized:

<b>Blocked Category</b>: none<br>
<b>Internet Address</b>: http://localhost&#x2F;session <br>

The request shouldn't have been sent to the proxy in the first place.

Expected behavior

Both the values in http(s)_proxy and no_proxy should be respected.

This works in python selenium 3.141.0

Test script or set of commands reproducing this issue

Define environment variables in the shell:

export https_proxy=http://server:port
export http_proxy=http://server:port
export no_proxy=localhost

Run your selenium test:

python my_test.py

Environment

OS: Windows 10
Browser: Chrome
Browser version: 94.0.4606.71
Browser Driver version: ChromeDriver 94.0.4606.61
Language Bindings version: python selenium 4.0.0

@Cito

Cito commented Oct 20, 2021

Can confirm. I believe the problem is that py\selenium\webdriver\remote\remote_connection.py evaluates the env vars http_proxy and https_proxy in _get_proxy_url(), but it does not evaluate the env var no_proxy.

The error I get when Selenium tries to communicate with the local web driver via the proxy is a "WebDriverException: A communication error occurred: Operation timed out." (Mentioning it here just for those who google for the error message).

@AutomatedTester
Member

As a stop gap you can instantiate your own RemoteConnection object and have ignore_proxy=True. See https://www.selenium.dev/selenium/docs/api/py/webdriver_remote/selenium.webdriver.remote.remote_connection.html#selenium.webdriver.remote.remote_connection.RemoteConnection

I will have that set automatically when NO_PROXY is set as an environment variable.

@Cito

Cito commented Oct 21, 2021

Thanks for the quick fix, but I fear 153298f is not a proper solution, because it interprets no_proxy as a boolean, while the de facto standard considers it to be a comma-separated list of hosts, possibly with wildcards or CIDR notation - see the discussion here.

@AutomatedTester
Member

I was worried that might be the case. Since we're using urllib3, which weirdly seems to ignore proxy variables, I'm going to add in parsing and then use that for connections.

@Cito do the URLs purely relate to driver start up?

@Cito

Cito commented Oct 21, 2021

@AutomatedTester Yes, as far as I understand it is only used for the driver (which could be on a remote host system). For the browser, proxy configuration is done like this.

And I think you're right, urllib3 ignores the _proxy env vars. However, the urllib in the standard lib and the requests library both support these variables. You may steal the code from there, or simply import the getproxies and proxy_bypass_environment functions from urllib.request.
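As a rough illustration of the standard-library route suggested above, the sketch below decides whether a given URL should go through a proxy. The helper name `pick_proxy` and the example proxy address are hypothetical, and this is not the code Selenium ended up with. It relies on `getproxies_environment` and `proxy_bypass_environment` from `urllib.request`, which match plain host names and domain suffixes from no_proxy but do not interpret CIDR ranges.

```python
from urllib.parse import urlsplit
from urllib.request import getproxies_environment, proxy_bypass_environment


def pick_proxy(url, proxies=None):
    """Return the proxy URL to use for `url`, or None for a direct connection.

    `proxies` is a mapping such as {"http": ..., "https": ..., "no": ...};
    when omitted it is read from the *_proxy environment variables.
    """
    if proxies is None:
        proxies = getproxies_environment()
    parts = urlsplit(url)
    # proxy_bypass_environment accepts "host" or "host:port" and checks it
    # against the comma-separated list stored under the "no" key.
    if parts.netloc and proxy_bypass_environment(parts.netloc, proxies):
        return None
    return proxies.get(parts.scheme)


if __name__ == "__main__":
    # Hypothetical values mirroring the variables from the bug report.
    env = {
        "http": "http://proxy.example:3128",
        "https": "http://proxy.example:3128",
        "no": "localhost,mycompany.com",
    }
    print(pick_proxy("http://localhost:9515/session", env))
    print(pick_proxy("https://www.selenium.dev/", env))
```

Passing the mapping explicitly keeps the function easy to test; in real use it would be called without the second argument so that the shell's environment is honored.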

@AutomatedTester
Member

I know that browsers need it set differently, was just checking that there wasn't something in the middle, like grid, that we would need to deal with when trying to get status.

I was going to borrow from those libraries

@AutomatedTester
Member

@Cito I have an implementation ready to go that passes but I was wondering if you had examples from your setup that I could use as tests?

@Cito

Cito commented Oct 22, 2021

Here is a complex one: localhost,127.0.0.0/8,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16,mycompany.com,::1.
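A list like this mixes host names, IPv4 CIDR ranges and an IPv6 address, so a plain suffix match is not enough. The following is only a minimal sketch of how such entries could be matched with the standard `ipaddress` module; the function name `host_in_no_proxy` is hypothetical and the matching rules are an assumption, not the implementation used in Selenium.

```python
import ipaddress


def host_in_no_proxy(host, no_proxy):
    """Return True if `host` matches any entry of a comma-separated no_proxy list."""
    host = host.strip("[]").lower()  # tolerate bracketed IPv6 literals
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        addr = None  # not an IP literal, treat it as a host name

    for entry in no_proxy.split(","):
        entry = entry.strip().lower()
        if not entry:
            continue
        if entry == "*":
            return True  # wildcard: bypass the proxy for everything
        if addr is not None:
            # Compare IP literals against single addresses or CIDR ranges.
            try:
                net = ipaddress.ip_network(entry, strict=False)
            except ValueError:
                continue  # entry is a host name, cannot match an IP literal
            if addr.version == net.version and addr in net:
                return True
        else:
            # Compare host names exactly or as a domain suffix.
            name = entry.lstrip(".")
            if host == name or host.endswith("." + name):
                return True
    return False


if __name__ == "__main__":
    rules = ("localhost,127.0.0.0/8,10.0.0.0/8,172.16.0.0/12,"
             "192.168.0.0/16,mycompany.com,::1")
    for candidate in ("localhost", "10.1.2.3", "::1", "example.org"):
        print(candidate, host_in_no_proxy(candidate, rules))
```

Host names are never resolved to addresses here, so `localhost` only matches because it is listed by name, not because it falls inside 127.0.0.0/8.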

@AutomatedTester
Member

cool, thanks, I have added these to test cases and things still look good.

@github-actions

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked and limited conversation to collaborators Nov 22, 2021
elgatov pushed a commit to elgatov/selenium that referenced this issue Jun 27, 2022
If people are setting no_proxy for certain values to get around the need
to have a proxy for localhost, mostly, then we should set the poolmanager
to proxymanager if not in no_proxy or poolmanager if it is.

Fixes SeleniumHQ#9925
Fixes SeleniumHQ#9967