Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Client attempts to reuse a closed connection if "Connection: keep-alive" header was received #7297

Closed
1 task done
filip-muller opened this issue May 25, 2023 · 6 comments · Fixed by #7363
Closed
1 task done
Labels

Comments

@filip-muller
Copy link

Describe the bug

When making requests through a ClientSession, the session will try to reuse connections, where the "Connection: keep-alive" http header was last sent, even when the connection was closed on a lower layer.

Let's say we make a request to server.com. In the response, server.com will include (misleadingly) the header "Connection: keep-alive", but will close the connection, e.g. by closing the socket. When we try to make a second request to server.com, ClientSession (or the underlying connector) will reuse this connection, even though it is closed, and the request will thus fail.

As far as I'm aware, it is not a violation of protocol to send the "Connection: keep-alive" header and then close the connection, even though it is strange behavior. The requests library can deal with this without any problems and I wonder if this type of behavior could cause other issues as well.

To Reproduce

  1. Run a simple http server implementation, that responds to all requests with a simple "test" response and a "Connection: keep-alive" header, but then closes the connection (after each request):
"""
Simple http server that sends 'Connection: keep-alive' headers,
but closes the connection after every request
"""

import socket
import select


HOSTNAME = "localhost"
PORT = 8080


def accept_connection(server_socket: socket.socket):
    """Wait for connection, then return the client_socket"""
    select.select((server_socket,), (), (), 2)
    return server_socket.accept()


def handle_client(client_socket: socket.socket):
    """Send simple response with 'Connection: keep-alive' and close connection"""
    read_ready = select.select((client_socket,), (), ())[0]
    if not read_ready:
        raise RuntimeError("Not read_ready")
    client_socket.recv(1024)
    connection_header = "Connection: keep-alive\r\n"
    content = "test"
    content_len = f"Content-Length: {len(content)}\r\n"
    client_socket.send(
        f"HTTP/1.1 200 OK\r\n{connection_header}{content_len}\r\n{content}".encode()
    )
    client_socket.close()


def main():
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind((HOSTNAME, PORT))

    s.listen(5)
    print(f"Listening on {HOSTNAME}:{PORT}")

    while True:
        client_socket, address = accept_connection(s)
        print(f"New connection from {address}")
        handle_client(client_socket)


if __name__ == "__main__":
    main()
  1. Call the server twice in a row with one client session:
"""Call the server twice in a row"""

import aiohttp
import asyncio


async def main():
    session = aiohttp.ClientSession()
    print("making first call...")
    res = await session.get("http://localhost:8080")
    print("first call successful")
    print("headers:", res.headers)
    print("content:", await res.text())
    print("-"*20)
    print("making second call...")
    res = await session.get("http://localhost:8080")
    print("second call successful")
    print("headers:", res.headers)
    print("content:", await res.text())
    print("-"*20)

    await session.close()

if __name__ == "__main__":
    asyncio.run(main())

Client output:

making first call...
first call successful
headers: <CIMultiDictProxy('Connection': 'keep-alive', 'Content-Length': '4')>
content: test
--------------------
making second call...
Traceback (most recent call last):
  File "test_aiohttp_keep_alive.py", line 44, in <module>
    asyncio.run(main())
  File "/usr/lib/python3.9/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/lib/python3.9/asyncio/base_events.py", line 642, in run_until_complete
    return future.result()
  File "test_aiohttp_keep_alive.py", line 18, in main
    res = await session.get("http://localhost:8080")
  File "venv/lib/python3.9/site-packages/aiohttp/client.py", line 560, in _request
    await resp.start(conn)
  File "venv/lib/python3.9/site-packages/aiohttp/client_reqrep.py", line 899, in start
    message, payload = await protocol.read()  # type: ignore[union-attr]
  File "venv/lib/python3.9/site-packages/aiohttp/streams.py", line 616, in read
    await self._waiter
aiohttp.client_exceptions.ServerDisconnectedError: Server disconnected
Unclosed client session
client_session: <aiohttp.client.ClientSession object at 0x7f18495c00d0>

Expected behavior

Expected output from the client script above is:

making first call..
second call successful
headers: {'Connection': 'keep-alive', 'Content-Length': '4'}
content: test
--------------------
making second call..
second call successful
headers: {'Connection': 'keep-alive', 'Content-Length': '4'}
content: test
--------------------

Which is what using the requests library yields.

import requests

session = requests.Session()
print("making first call..")
res = session.get("http://localhost:8080")
print("first call successful")
print("headers:", res.headers)
print("content:", res.text)
print("-"*20)
print("making second call..")
res = session.get("http://localhost:8080")
print("second call successful")
print("headers:", res.headers)
print("content:", res.text)
print("-"*20)

Logs/tracebacks

$ sudo tcpdump tcp -i lo port 8080
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on lo, link-type EN10MB (Ethernet), snapshot length 262144 bytes
12:00:23.426820 IP localhost.43634 > localhost.http-alt: Flags [S], seq 582969114, win 65495, options [mss 65495,sackOK,TS val 2734512433 ecr 0,nop,wscale 7], length 0
12:00:23.426838 IP localhost.http-alt > localhost.43634: Flags [S.], seq 3207724027, ack 582969115, win 65483, options [mss 65495,sackOK,TS val 2734512433 ecr 2734512433,nop,wscale 7], length 0
12:00:23.426851 IP localhost.43634 > localhost.http-alt: Flags [.], ack 1, win 512, options [nop,nop,TS val 2734512433 ecr 2734512433], length 0
12:00:23.427290 IP localhost.43634 > localhost.http-alt: Flags [P.], seq 1:124, ack 1, win 512, options [nop,nop,TS val 2734512433 ecr 2734512433], length 123: HTTP: GET / HTTP/1.1
12:00:23.427308 IP localhost.http-alt > localhost.43634: Flags [.], ack 124, win 511, options [nop,nop,TS val 2734512433 ecr 2734512433], length 0
12:00:23.427374 IP localhost.http-alt > localhost.43634: Flags [P.], seq 1:67, ack 124, win 512, options [nop,nop,TS val 2734512433 ecr 2734512433], length 66: HTTP: HTTP/1.1 200 OK
12:00:23.427389 IP localhost.43634 > localhost.http-alt: Flags [.], ack 67, win 512, options [nop,nop,TS val 2734512433 ecr 2734512433], length 0
12:00:23.427405 IP localhost.http-alt > localhost.43634: Flags [F.], seq 67, ack 124, win 512, options [nop,nop,TS val 2734512433 ecr 2734512433], length 0
12:00:23.428727 IP localhost.43634 > localhost.http-alt: Flags [P.], seq 124:247, ack 68, win 512, options [nop,nop,TS val 2734512435 ecr 2734512433], length 123: HTTP: GET / HTTP/1.1
12:00:23.428742 IP localhost.http-alt > localhost.43634: Flags [R], seq 3207724095, win 0, length 0
^C
10 packets captured
20 packets received by filter
0 packets dropped by kernel

Python Version

3.9.2

aiohttp Version

3.8.4

multidict Version

6.0.4

yarl Version

1.8.2

OS

Debian

Related component

Client

Additional context

No response

Code of Conduct

  • I agree to follow the aio-libs Code of Conduct
@Dreamsorcerer
Copy link
Member

Thanks for the reproducer. If you have some time, do you think you could turn that into a test and create a PR including it (I think there are some similar tests already)?

I think we may have fixed something relating to this recently, so if we're lucky the test may pass on master (but, good to have the test to catch a regression either way).

@filip-muller
Copy link
Author

I tried running the script on master and it resulted in the same exception. I'm not sure if I can turn this into a unit test very effectively, but feel free to point me to some similar tests. It seems to be something with transport.is_closing() not being a completely reliable indicator of whether the connection is fine or not. It stays false throughout the whole interaction above.

One interesting thing I found was when making post requests, during the second one, the headers would get written by StreamWriter as transport.is_closing() was still False and then when it tried to write the post data, transport.is_closing() was True, resulting in a ConnectionResetError.

async def main():
    session = aiohttp.ClientSession()
    print("making first call...")
    res = await session.post("http://localhost:8080", data="data")
    print("first call successful")
    print("headers:", res.headers)
    print("content:", await res.text())
    print("-"*20)
    print("making second call...")
    res = await session.post("http://localhost:8080", data="data")
    print("second call successful")
    print("headers:", res.headers)
    print("content:", await res.text())
    print("-"*20)

    await session.close()

Resulting in

--------------------
making second call...
chunk: b'POST / HTTP/1.1\r\nHost: localhost:8080\r\nAccept: */*\r\nAccept-Encoding: gzip, deflate, br\r\nUser-Agent: Python/3.10 aiohttp/4.0.0a2.dev0\r\nContent-Length: 4\r\nContent-Type: text/plain; charset=utf-8\r\n\r\n'
protocol.connected: True
transport is None: False
transport.is_cloing(): False          # !!!!!!!!!!!!!
chunk: b'data'
protocol.connected: True
transport is None: False
transport.is_cloing(): True           # !!!!!!!!!!!!!
Traceback (most recent call last):
  File aiohttp/aiohttp/client_reqrep.py", line 538, in write_bytes
    await self.body.write(writer)
  File aiohttp/aiohttp/payload.py", line 244, in write
    await writer.write(self._value)
  File aiohttp/aiohttp/http_writer.py", line 119, in write
    self._write(chunk)
  File "aiohttp/aiohttp/http_writer.py", line 79, in _write
    raise ConnectionResetError("Cannot write to closing transport")
ConnectionResetError: Cannot write to closing transport

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "test_keep_alive.py", line 48, in <module>
    asyncio.run(main())
  File "/usr/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/lib/python3.10/asyncio/base_events.py", line 646, in run_until_complete
    return future.result()
  File "test_keep_alive.py", line 21, in main
    res = await session.post("http://localhost:8080", data="data")
  File "aiohttp/aiohttp/client.py", line 545, in _request
    await resp.start(conn)
  File "aiohttp/aiohttp/client_reqrep.py", line 855, in start
    message, payload = await protocol.read()  # type: ignore[union-attr]
  File "aiohttp/streams.py", line 605, in read
    await self._waiter
aiohttp.client_exceptions.ClientOSError: [Errno None] Can not write request body for http://localhost:8080
Unclosed client session
client_session: <aiohttp.client.ClientSession object at 0x7f973c1475a0>

I added these print statements to StreamWriter._write()

print("chunk:", chunk)
print("protocol.connected:", bool(self._protocol.connected))
print("transport is None:", self.transport is None)
print("transport.is_cloing():", self.transport.is_closing())

@Dreamsorcerer
Copy link
Member

There are a load of tests that use a web application in https://github.com/aio-libs/aiohttp/blob/master/tests/test_client_functional.py

Maybe it's possible to create a test in a similar manner?

@filip-muller
Copy link
Author

I added two test cases, one succeeds but the other one fails. Perhaps that could be a lead towards the solution.

@Dreamsorcerer
Copy link
Member

Dreamsorcerer commented Jul 9, 2023

I think the behaviour is correct. When the server sends Connection: close, the client itself will close the connection immediately. With Connection: keep-alive the connection is kept open and a new request is initiated. It is only after the second call is initiated that the server manages to close the connection itself.

This is definitely true of the test in your PR as the server and client are running in the same loop, so there is no possible way for the server to execute the code to close the connection until the client code yields to the event loop in the second call. I believe it is probably also true of the original example, as it likely waits until the client has finished reading the response, at which point it becomes a race and the client starts creating a new request before it receives the close.

The RFC does suggest that we can automatically retry for idempotent methods (I'd be curious if requests has the same issue with non-idempotent methods):
https://www.rfc-editor.org/rfc/rfc9112.html#name-retrying-requests

So, maybe we should do this automatically. I've tried it out and got the test passing pretty easily.

Dreamsorcerer added a commit that referenced this issue Jan 20, 2024
Dreamsorcerer added a commit that referenced this issue Jan 20, 2024
Fixes #7297

(cherry picked from commit be9a3cc)
Dreamsorcerer added a commit that referenced this issue Jan 20, 2024
Fixes #7297

(cherry picked from commit be9a3cc)
@udayarctempus
Copy link

Whats the timeline we are looking to release this fix ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants