Connection timeout (not Read, Wall or Total) is consistently taking twice as long #5773

Closed
MestreLion opened this issue Mar 15, 2021 · 11 comments

Comments

@MestreLion
Contributor

I'm aware that several issues related to timeouts were opened (and closed) before, so I'm trying to narrow this report down to a very specific scope: the connection timeout is consistently wrong, timing out at precisely twice the requested time.

The results below are so consistent that something must be going on here. Please don't dismiss this report before taking a look at it!

What this report is not about:

  • Total/Wall timeout:
    That would be a nice feature, but I'm fully aware this is currently outside the scope of Requests. I'm focusing on connection timeout only.

  • Read timeout:
    All my tests were using http://google.com:81, which fails to even connect. There's no read involved: the server exists but never responds, not even to refuse the connection. No data is ever transmitted. No HTTP connection is ever established. This is not about ReadTimeoutError; it's about ConnectTimeoutError.

  • Accurate timings / network fluctuations:
    I'm not asking for millisecond precision; I don't even care about whole-second imprecision. But, surprisingly, requests is being incredibly accurate... at twice the time.

Expected Result

requests.get('http://google.com:81', timeout=(4, 1)) should take approximately 4 seconds to timeout.

Actual Result

It consistently takes about 8.0 seconds to raise requests.ConnectTimeout. It always takes twice the requested time, for timeouts ranging from 1 to 100. The exception message ends with Connection to google.com timed out. (connect timeout=4), which is distinct from the read-timeout message.

Reproduction Steps

import requests, time, os, sys

# Using a known URL to test connection timeout
def test_timeout(timeout, url='http://google.com:81'):
    start = time.time()
    try:
        requests.get(url, timeout=timeout)
        print("OK!")  # will never reach this...
    except requests.ConnectTimeout:  # any other exception will bubble out
        print('{}: {:.1f}'.format(timeout, time.time()-start))

print("\n1 to 10, simple numeric timeout")
for i in range(1, 11):
    test_timeout(i)

print("\n1 to 10, (x, 1) timeout tuple")
for i in range(1, 11):
    test_timeout((i, 1))

print("\n1 to 10, (x, 10) timeout tuple")
for i in range(1, 11):
    test_timeout((i, 10))

print("\nLarge timeouts")
for i in (20, 30, 50, 100):
    test_timeout((i, 1))

Results:

Linux desktop 5.4.0-66-generic #74~18.04.2-Ubuntu SMP Fri Feb 5 11:17:31 UTC 2021 x86_64
3.6.9 (default, Jan 26 2021, 15:33:00) 
[GCC 8.4.0]
Requests: 2.25.1
Urllib3: 1.26.3

1 to 10, simple numeric timeout
1: 2.0
2: 4.0
3: 6.0
4: 8.0
5: 10.0
6: 12.0
7: 14.0
8: 16.0
9: 18.0
10: 20.0

1 to 10, (x, 1) timeout tuple
(1, 1): 2.0
(2, 1): 4.0
(3, 1): 6.0
(4, 1): 8.0
(5, 1): 10.0
(6, 1): 12.0
(7, 1): 14.0
(8, 1): 16.0
(9, 1): 18.0
(10, 1): 20.0

1 to 10, (x, 10) timeout tuple
(1, 10): 2.0
(2, 10): 4.0
(3, 10): 6.0
(4, 10): 8.0
(5, 10): 10.0
(6, 10): 12.0
(7, 10): 14.0
(8, 10): 16.0
(9, 10): 18.0
(10, 10): 20.0

Large timeouts
(20, 1): 40.0
(30, 1): 60.0
(50, 1): 100.1
(100, 1): 200.2

System Information

{
  "chardet": {
    "version": "3.0.4"
  },
  "cryptography": {
    "version": "3.2.1"
  },
  "idna": {
    "version": "2.8"
  },
  "implementation": {
    "name": "CPython",
    "version": "3.6.9"
  },
  "platform": {
    "release": "5.4.0-66-generic",
    "system": "Linux"
  },
  "pyOpenSSL": {
    "openssl_version": "1010108f",
    "version": "17.5.0"
  },
  "requests": {
    "version": "2.25.1"
  },
  "system_ssl": {
    "version": "1010100f"
  },
  "urllib3": {
    "version": "1.26.3"
  },
  "using_pyopenssl": true
}

It seems there is a single, "hidden" connection retry, performed by either requests or urllib3, somewhere along the line. It has been reported by other users on other platforms too.

@MestreLion
Contributor Author

When using urllib3 directly, it also times out very consistently, at 8 times the requested value:

>>> import urllib3, time
>>> http = urllib3.PoolManager()
>>> try:
...     start = time.time()
...     http.request('GET', 'http://google.com:81', timeout=1)
... except urllib3.exceptions.MaxRetryError:
...     print(time.time() - start)
... 
8.030268907546997
>>> try:
...     start = time.time()
...     http.request('GET', 'http://google.com:81', timeout=2)
... except urllib3.exceptions.MaxRetryError:
...     print(time.time() - start)
... 
16.021981477737427
>>> try:
...     start = time.time()
...     http.request('GET', 'http://google.com:81', timeout=3)
... except urllib3.exceptions.MaxRetryError:
...     print(time.time() - start)
... 
24.041129112243652

@MestreLion
Contributor Author

Urllib3's documentation on retrying requests says:

urllib3 can automatically retry idempotent requests. This same mechanism also handles redirects. You can control the retries using the retries parameter to request(). By default, urllib3 will retry requests 3 times and follow up to 3 redirects.

3 retries means 4 connection attempts; taking that into account still leads to the same issue: each attempt takes twice as long to time out.
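
A minimal sketch of how to rule the retry mechanism out, assuming the same test URL as above: with retries=0, urllib3 makes a single connection attempt, and the per-attempt doubling is still visible (consistent with the "BEFORE PATCH" numbers further down in this thread):

import time
import urllib3

http = urllib3.PoolManager()
start = time.time()
try:
    # retries=0: a single connection attempt, no automatic retries or redirects
    http.request('GET', 'http://google.com:81', timeout=1, retries=0)
except urllib3.exceptions.MaxRetryError:
    print(time.time() - start)  # ~2.0s, i.e. still twice the requested timeout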

@MestreLion
Contributor Author

And, regarding the last comment on duplicate issue #5760 by @sigmavirus24:

You're asking for a wall clock timeout and we can not provide that.

I'm not asking for a wall-clock timeout or a total request timeout. I'm asking for a fix to the connection timeout. That doubled time is a bug.

Further we have clearly documented how timeouts work in our docs

True, and my tests show that connection timeouts are not behaving as the documentation says they should. Where in the documentation is this doubled-time behavior explained?

@MestreLion
Contributor Author

I think I've found the culprit: IPv6! It seems requests/urllib3 is automatically trying to connect using both IPv4 and IPv6, and that accounts for the doubled time.

I'll do some more tests to properly isolate the problem, as it seems requests is trying IPv6 even when it's not available, raising a ConnectionError with Failed to establish a new connection: [Errno 101] Network is unreachable, which is undesirable.
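
One quick way to check this on a given machine (an illustrative sketch, not part of the tests above) is to see how many address families the hostname resolves to:

import socket

# On a dual-stack client, google.com typically resolves to both an IPv4
# (AF_INET) and an IPv6 (AF_INET6) address, and each address gets its own
# full connect timeout.
infos = socket.getaddrinfo('google.com', 81, proto=socket.IPPROTO_TCP)
print({info[0] for info in infos})
# e.g. {<AddressFamily.AF_INET: 2>, <AddressFamily.AF_INET6: 10>}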

@sigmavirus24
Contributor

urllib3/urllib3#2030

@MestreLion
Contributor Author

urllib3/urllib3#2030

I'm not sure that issue is related to this one. I'm not experiencing connections that are slower or faster depending on IP family; I'm experiencing an exactly doubled timeout because it tries to connect using both IPv4 and IPv6 and fails in both families.

Is this "if IPv4 fails, retry using IPv6" a known/expected behavior? Is this documented anywhere?

@MestreLion
Contributor Author

After more tests, the issue really is the dual IPv4/IPv6 connection attempts. Using the workaround proposed in a Stack Overflow answer to force IPv4-only (or IPv6-only) connections, the timeout behaves as expected:

# Monkey-patch urllib3 to force IPv4 connections only.
# Adapted from https://stackoverflow.com/a/46972341/624066
import socket
import urllib3.util.connection
def allowed_gai_family():
    return socket.AF_INET

import urllib3
import requests
import time

# Using a known URL to test connection timeout
URL = 'http://google.com:81'

http = urllib3.PoolManager()

def test_urllib3_timeout(timeout, url=URL):
    start = time.time()
    try:
        http.request('GET', url, timeout=timeout, retries=0)
        print("OK!")
    except urllib3.exceptions.MaxRetryError:
        print('{}: {:.1f}'.format(timeout, time.time()-start))

def test_requests_timeout(timeout, url=URL):
    start = time.time()
    try:
        requests.get(url, timeout=timeout)
        print("OK!")  # will never reach this...
    except requests.ConnectTimeout:  # any other exception will bubble out
        print('{}: {:.1f}'.format(timeout, time.time()-start))

def test_timeouts():
    print("\nUrllib3")
    for i in range(1, 6):
        test_urllib3_timeout(i)

    print("\nRequests")
    for i in range(1, 6):
        test_requests_timeout((i, 1))


print("BEFORE PATCH:")
test_timeouts()

urllib3.util.connection.allowed_gai_family = allowed_gai_family

print("\nAFTER PATCH:")
test_timeouts()

Results:

BEFORE PATCH:

Urllib3
1: 2.0
2: 4.0
3: 6.0
4: 8.0
5: 10.0

Requests
(1, 1): 2.0
(2, 1): 4.0
(3, 1): 6.0
(4, 1): 8.0
(5, 1): 10.0

AFTER PATCH:

Urllib3
1: 1.0
2: 2.0
3: 3.0
4: 4.0
5: 5.0

Requests
(1, 1): 1.0
(2, 1): 2.0
(3, 1): 3.0
(4, 1): 4.0
(5, 1): 5.0

@MestreLion
Contributor Author

MestreLion commented Mar 15, 2021

Still, I believe at least the requests documentation should mention this behavior. For example, https://requests.readthedocs.io/en/latest/user/advanced/#timeouts could add something like this:

When the client has IPv6 support and the server has an IPv6 DNS record (AAAA), if the IPv4 connection fails the underlying urllib3 will automatically retry using IPv6, which may lead to an effective connection timeout of twice the specified time, so take that into account when setting the connection timeout.
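
The rough arithmetic behind the numbers in this thread, as an illustrative helper rather than library code: the worst case is the connect timeout multiplied by the number of resolved address families and by the number of connection attempts (retries + 1). Requests' default adapter makes a single attempt, hence the 2x above; plain urllib3 defaults to 3 retries, hence the 8x.

def worst_case_connect_time(connect_timeout, address_families=2, attempts=1):
    """Rough upper bound on how long a failing connect can take."""
    return connect_timeout * address_families * attempts

print(worst_case_connect_time(4))              # requests, timeout=(4, 1): ~8s
print(worst_case_connect_time(1, attempts=4))  # raw urllib3, timeout=1, 3 retries: ~8s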

@sigmavirus24
Contributor

@MestreLion I would accept a PR adding an explanation of what can go wrong with timeouts like this.

@LefterisJP

LefterisJP commented Mar 15, 2021

Just popping in here to say thanks for digging into this. This would explain my double timeout question on the other issue I opened too.

I just tested by forcing IPv4, and the timeout is no longer doubled.
