Potential memory leak in aiohttp server #4478

Closed
mvalkon opened this issue Dec 31, 2019 · 6 comments

mvalkon commented Dec 31, 2019

Long story short

I have a small HTTP API written with aiohttp as the backend. I am seeing a constantly growing memory footprint, which results in segmentation faults in production. The segfaults seem to occur somewhat randomly. I am not able to get core dumps at the moment for analysis, so I have resorted to debugging this locally. I do not have conclusive proof, but I am looking for some pointers on where to go from here.

The API makes two external calls per request: one to an external API using aiohttp.ClientSession and another to DynamoDB to fetch data. In both cases we maintain a separate session for the lifetime of the application.
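
For reference, a minimal sketch of the session-per-application pattern described above, assuming aiohttp's cleanup context is used to manage the lifetime (the handler, route, and key names are illustrative, not the actual application code):

import aiohttp
from aiohttp import web


async def external_api_session(app: web.Application):
    # Runs up to the yield at startup, and past it on cleanup.
    app["external_api"] = aiohttp.ClientSession()
    yield
    await app["external_api"].close()


async def handler(request: web.Request) -> web.Response:
    session: aiohttp.ClientSession = request.app["external_api"]
    async with session.get("https://example.com/upstream") as resp:
        return web.json_response(await resp.json())


app = web.Application()
app.cleanup_ctx.append(external_api_session)
app.router.add_get("/", handler)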

Expected behaviour

Stable memory consumption

Actual behaviour

Growing memory footprint at low rps. The following graph and tracemalloc data are from a short test where the API was run locally and traffic was generated at roughly 45 requests per second.

[graph: memory footprint during the local load test]

A tracemalloc snapshot comparison between the start and end of the test shows the largest memory increases in the RequestHandler.data_received() method and the TCPConnector._wrap_create_connection() method.

aiohttp/web_protocol.py:275: size=86.3 MiB (+81.0 MiB), count=316057 (+295361), average=286 B
aiohttp/connector.py:936: size=13.1 MiB (+2577 KiB), count=765 (+109), average=17.6 KiB
basictracer/text_propagator.py:30: size=2454 KiB (+2306 KiB), count=44869 (+42176), average=56 B
traceback.py:357: size=797 KiB (+207 KiB), count=9108 (+2303), average=90 B
thrift/transport/THttpClient.py:153: size=0 B (-183 KiB), count=0 (-1)
lightstep/tracer.py:106: size=287 KiB (+143 KiB), count=1296 (+670), average=227 B
traceback.py:285: size=385 KiB (+141 KiB), count=4552 (+1678), average=87 B
lightstep/thrift_converter.py:61: size=40.8 KiB (-136 KiB), count=713 (-2378), average=59 B
lightstep/util.py:37: size=37.7 KiB (-126 KiB), count=594 (-1984), average=65 B
json/decoder.py:353: size=5208 KiB (+91.4 KiB), count=51545 (+1052), average=103 B
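
For reference, a comparison like the one above can be produced with tracemalloc roughly as follows (a sketch; the frame depth and variable names are arbitrary):

import tracemalloc

tracemalloc.start(25)  # keep up to 25 frames per allocated block

snapshot_start = tracemalloc.take_snapshot()  # at test start
# ... run the load test ...
snapshot_end = tracemalloc.take_snapshot()    # at test end

# Top 10 allocation sites by size delta, grouped by source line.
for stat in snapshot_end.compare_to(snapshot_start, "lineno")[:10]:
    print(stat)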

Objgraph points to a large number of CIMultiDict objects in memory, and the following object graph can be generated (not sure how helpful this is):

[image: generated object graph of CIMultiDict objects]
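
One way to generate such a graph (a sketch using objgraph.show_backrefs; the sample size, depth, and output file name are arbitrary, and graphviz must be installed to render the PNG):

import random
import objgraph

cimultidicts = objgraph.by_type("CIMultiDict")
# Render back-references for a small random sample of the live CIMultiDicts.
objgraph.show_backrefs(
    random.sample(cimultidicts, min(3, len(cimultidicts))),
    max_depth=5,
    filename="cimultidict-backrefs.png",
)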

Additionally, I am seeing the errors reported in #3535 when traffic generation stops.

<uvloop.loop.SSLProtocol object at 0x108602bd0>: Fatal error on transport
Traceback (most recent call last):
  File "uvloop/sslproto.pyx", line 571, in uvloop.loop.SSLProtocol._do_shutdown
  File "/usr/local/opt/pyenv/versions/3.7.5/lib/python3.7/ssl.py", line 778, in unwrap
    return self._sslobj.shutdown()
ssl.SSLError: [SSL: KRB5_S_INIT] application data after close notify (_ssl.c:2629)

Any pointers on where to go from here for further debugging would be much appreciated.

Steps to reproduce

Unfortunately none at the moment. I will try to isolate a reproducible snippet.

Your environment

The memory consumption increases both on macOS (my laptop) and on an Ubuntu-based Docker image running on Kubernetes.

aiohttp version is 3.6.2 with uvloop on Python 3.7.5, for both server and client.

webknjaz (Member) commented:

Try upgrading the multidict package. There has been a huge refactoring with a number of subsequent fixes and patch releases.
You could also play with different versions and see whether the pre-rewrite releases leak as well.
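
One quick sanity check is to confirm which multidict version and implementation is actually loaded (a sketch; the module names are an assumption about multidict's layout, with the C extension in multidict._multidict and the pure-Python fallback in multidict._multidict_py):

import multidict

print(multidict.__version__)
# The defining module of CIMultiDict shows whether the C extension or the
# pure-Python fallback was imported (assumed module layout, see above).
print(multidict.CIMultiDict.__module__)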

mvalkon (Author) commented Jan 2, 2020

Thanks for the tip @webknjaz. Upgrading multidict to version 4.7.3 (was 4.7.1) changes the memory profile a lot, but does not fix the leak.

[graph: memory footprint after upgrading multidict to 4.7.3]

Using objgraph.show_most_common_types() I can still see a growing number of CIMultiDict objects (this is again from a shorter test run):

(Pdb) objgraph.show_most_common_types()
function          18050
dict              15242
CIMultiDict       14775
_KeysView         14462
tuple             10103
OrderedDict       9816
list              6121
FrameSummary      5029
weakref           4935
getset_descriptor 2881
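
A related way to watch this over time is objgraph.show_growth(), which prints only the types whose instance counts grew since the previous call (sketch, with an arbitrary limit):

import objgraph

objgraph.show_growth(limit=10)  # establish a baseline of per-type counts
# ... let the server handle some traffic ...
objgraph.show_growth(limit=10)  # shows only the types whose counts grew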

Listing the objects objgraph reports as leaking:

roots = objgraph.get_leaking_objects()
(Pdb) objgraph.show_most_common_types(objects=roots)
_KeysView  14462
dict       1309
set        241
tuple      33
list       11
SignalDict 8
weakref    5
method     5
slice      2
CTypeDescr 2

webknjaz (Member) commented Jan 2, 2020

How about downgrading?

asvetlov (Member) commented Jan 2, 2020

After testing multidict in different scenarios I was unable to detect any memory leak; everything is returned to the allocator.

Sorry, I cannot analyze this further without code that reproduces the leak.

gjcarneiro (Contributor) commented:

I have not seen this memory leak either. The fault is likely in the application code rather than in aiohttp itself. Is your app code storing requests somewhere, by any chance?

You should try to create a minimal example that reproduces the leak, and post it.

mvalkon (Author) commented Jan 7, 2020

@asvetlov @gjcarneiro @webknjaz thanks for looking into this and sorry for not being able to provide an isolated example.

I have managed to isolate this issue to a middleware function which creates an opentracing span for every incoming request. I am not sure why this causes a memory leak, but I think it has nothing to do with aiohttp, so this issue can be closed.

The middleware in question does this, and I don't see anything obvious here that leaks memory. Perhaps something in the vendor implementation causes it; I will debug further.

from typing import Callable

import opentracing
from aiohttp import web
from opentracing.propagation import Format

# Note: default_server_tags() is an application-level helper that builds the
# default tag set for the server span from the request.

@web.middleware
async def opentracing_middleware(request: web.Request, handler: Callable):
    """Tracing middleware applied to all handlers. Extracts a span context
    from the request and creates a new span using that context as the parent.
    If there is no context, starts a new span without a reference."""

    # Avoid polluting the traces by ignoring the health endpoint.
    if request.rel_url.path == "/health":
        return await handler(request)

    try:
        span_context = opentracing.tracer.extract(
            format=Format.HTTP_HEADERS, carrier=request.headers
        )
    except (
        opentracing.InvalidCarrierException,
        opentracing.SpanContextCorruptedException,
    ):
        span_context = None

    with opentracing.tracer.start_active_span(
        child_of=span_context,
        operation_name=request.match_info.handler.__name__,
        finish_on_close=True,
        tags=default_server_tags(request),
    ) as scope:  # noqa
        return await handler(request)
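
For context, the middleware is registered on the application in the usual aiohttp way (sketch):

app = web.Application(middlewares=[opentracing_middleware])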

mvalkon closed this as completed Jan 7, 2020