Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

aiohttp socket leak and incorrect connection close #1568

Closed
thehesiod opened this issue Jan 31, 2017 · 16 comments
Closed

aiohttp socket leak and incorrect connection close #1568

thehesiod opened this issue Jan 31, 2017 · 16 comments
Labels

Comments

@thehesiod
Copy link
Contributor

thehesiod commented Jan 31, 2017

Long story short

Issue 1

it appears for certain servers the ssl sockets are not released correctly.

Issue 2:

I believe aiohttp.client._RequestContextManager.__aexit__ () should not close() the connector on exception as this should be up to the client of aiohttp. Instead it should always opportunistically release the socket. Incidentally changing this to release also fixes #1 in my scenario.

Expected behaviour

Issue 1

With the sample script the number of open handles should stay relatively stable.

Issue 2

raising a custom exception from an aiohttp request contest should not cause the connector to be closed and a new one opened.

Actual behaviour

Issue 1

number of context handles grows very quickly with test script. Is resolved by changing https://github.com/KeepSafe/aiohttp/blob/master/aiohttp/connector.py#L392 to be abort(), or changing https://github.com/KeepSafe/aiohttp/blob/master/aiohttp/client.py#L555 to always release. However there appears to be an underlying issue with closing SSL sockets with certain servers.

Issue 2

tcpconnector is closed and re-opened when exception is raised from response context.

Steps to reproduce

Run sample script:

#!/usr/bin/env python3
import asyncio
import psutil
from aiohttp import web, ClientSession, TCPConnector
import ssl
from pympler.tracker import SummaryTracker

PORT = 8888

url = 'https://madis-data.ncep.noaa.gov/madisPublic1/data/archive/2001/07/01/LDAD/mesonet/netCDF/20010701_0800.gz'
# url = 'https://localhost:{}/'.format(PORT)
#url = 'https://localhost/20010701_0800.gz'

async def hello(request):
    return web.Response(status=304, content_type='application/octet-stream')


async def track_mem():
    tracker = SummaryTracker()
    proc = psutil.Process()
    while True:
        print("Number of open files: {} conns: {}".format(proc.num_fds(), len(proc.connections())))
        # tracker.print_diff()
        await asyncio.sleep(5)


async def server_startup(app):
    ssl_context = ssl.SSLContext(ssl.PROTOCOL_SSLv23)
    ssl_context.set_default_verify_paths()
    ssl_context.load_cert_chain('/tmp/cert', '/tmp/key')

    connector = TCPConnector(limit=1, conn_timeout=10, ssl_context=ssl_context)
    session = ClientSession(connector=connector)

    # start 10 workers
    for i in range(10):
        asyncio.ensure_future(worker(session))

    # asyncio.ensure_future(track_mem())


async def worker(session: ClientSession):
    loop = asyncio.get_event_loop()
    loop.set_debug(True)

    while True:
        try:
            headers = {'If-Modified-Since': 'Sun, 01 Jul 2001 10:24:33 GMT'}
            async with session.get(url, headers=headers) as response:
                print('.', end='', flush=True)
                raise Exception
        except:
            pass


def main():
    app = web.Application()
    app.router.add_get('/', hello)
    app.on_startup.append(server_startup)

    ssl_context = ssl.SSLContext(ssl.PROTOCOL_SSLv23)
    ssl_context.set_default_verify_paths()
    ssl_context.load_cert_chain('/tmp/cert', '/tmp/key')

    web.run_app(app, port=PORT, ssl_context=ssl_context)


if __name__ == '__main__':
    main()

keep track of open files:

watch -n0.5 "lsof | wc -l"

and note how the number of handles grows very quickly. It should remain stable. I was unable to reproduce this with the sample server included so it must be some kind of special behavior of the server this testcase hits.

Your environment

OSX and linux, python3.5.3 with latest aiohttp

@thehesiod
Copy link
Contributor Author

per the headers returned it's an apache server, I even tried reproducing with the OSX built-in apache server and it doesn't reproduce, so it seems like it's something particular to this server.

@thehesiod
Copy link
Contributor Author

this is in fact related to SSL as the http endpoint ends up redirecting to the https endpoint :)

@thehesiod
Copy link
Contributor Author

ends up being a asyncio issue! Here's the testcase:

import asyncio
import ssl


if True:  # madis
    HOST = 'madis-data.ncep.noaa.gov'
    PORT = 443
    PATH = '/madisPublic1/data/archive/2001/07/01/LDAD/mesonet/netCDF/20010701_0800.gz'
else:  # localhost
    HOST = 'localhost'
    PORT = 443
    PATH = '/20010701_0800.gz'


MSG = 'GET {path} HTTP/1.1\r\nIf-Modified-Since: Sun, 01 Jul 2001 10:24:33 GMT\r\nAccept-Encoding: gzip, deflate\r\nAccept: */*\r\nHost: madis-data.ncep.noaa.gov\r\nUser-Agent: Python/3.5 aiohttp/1.3.0a0\r\nContent-Length: 0\r\n\r\n'.format(path=PATH)


class HttpClient(asyncio.Protocol):
    def connection_made(self, transport):
        self.transport = transport
        self.transport.write(MSG.encode('ascii'))
        print('data sent:', MSG)

    def data_received(self, data):
        print('data received:', data)

        asyncio.get_event_loop().call_later(1.0, self.transport.close)

    def eof_received(self):
        pass

    def connection_lost(self, exc):
        print('connection lost:', exc)
        asyncio.get_event_loop().stop()


async def main():
    loop = asyncio.get_event_loop()


    ssl_context = ssl.SSLContext(ssl.PROTOCOL_SSLv23)
    ssl_context.set_default_verify_paths()
    ssl_context.load_cert_chain('/tmp/cert', '/tmp/key')

    conn = await loop.create_connection(HttpClient, HOST, PORT, ssl=ssl_context)


if __name__ == '__main__':
    _loop = asyncio.get_event_loop()
    _loop.run_until_complete(main())
    _loop.run_forever()

will write up a bug a backref here

@thehesiod
Copy link
Contributor Author

@thehesiod
Copy link
Contributor Author

created PR for #2: #1571

@fafhrd91
Copy link
Member

fafhrd91 commented Feb 1, 2017

Could you post response data that you receive in 'data_received'

@thehesiod
Copy link
Contributor Author

thehesiod commented Feb 1, 2017

madis:

data received: b'HTTP/1.1 304 Not Modified\r\nDate: Wed, 01 Feb 2017 01:15:15 GMT\r\nServer: Apache\r\nETag: "19c5ce-31166-38805b7fa9a40"\r\n\r\n'

my server:

data received: b'HTTP/1.1 304 Not Modified\r\nDate: Wed, 01 Feb 2017 01:15:51 GMT\r\nServer: Apache/2.4.23 (Unix) LibreSSL/2.2.7 PHP/5.6.23\r\nETag: "31166-5475d6f656580"\r\n\r\n'

@fafhrd91
Copy link
Member

fafhrd91 commented Feb 1, 2017

issue 1 should be fixed in master
i submitted patch for issue 2, i am not sure how much time it will take

@fafhrd91
Copy link
Member

fafhrd91 commented Feb 7, 2017

@thehesiod i added special code for closed ssl transport, in addition to close aiohttp aborts all ssl transport after 2 seconds. this should solve leaked sockets

fafhrd91 pushed a commit that referenced this issue Feb 7, 2017
@thehesiod
Copy link
Contributor Author

@fafhrd91 thank you sir!

@fafhrd91
Copy link
Member

fafhrd91 commented Feb 7, 2017

could you test if it works?

@thehesiod
Copy link
Contributor Author

will test asap.

@thehesiod
Copy link
Contributor Author

looks like the PR broke master:

Traceback (most recent call last):                                                                                                  │                                                                                                                                   
  File "/usr/local/lib/python3.5/asyncio/events.py", line 126, in _run                                                              │658                                                                                                                                
    self._callback(*self._args)                                                                                                     │                                                                                                                                   
  File "/usr/local/lib/python3.5/site-packages/aiohttp/connector.py", line 254, in _cleanup_closed                                  │                                                                                                                                   
    self._cleanup_closed_handle.close()
AttributeError: 'TimerHandle' object has no attribute 'close'

@fafhrd91
Copy link
Member

fafhrd91 commented Feb 8, 2017

should be fixed

@thehesiod
Copy link
Contributor Author

MUCH better, yay, thank you!

@mikenerone
Copy link
Contributor

Just mentioning for others that might be mislead here: they circled back later and disabled this fix by default (see 02b0951).

@lock lock bot added the outdated label Mar 27, 2020
@lock lock bot locked as resolved and limited conversation to collaborators Mar 27, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

3 participants