Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

aiohttp request exceptions can't be caught sometimes when encountering bad http message #3031

Closed
voidfyoo opened this issue May 28, 2018 · 2 comments · Fixed by #3079
Closed
Labels

Comments

@voidfyoo
Copy link

voidfyoo commented May 28, 2018

Long story short

When encountering bad http message, aiohttp request exceptions can't be caught sometimes.

For example, I use code like below to detect if a proxy is working:

import sys
import asyncio
import logging

import aiohttp


DETECT_WEBSITE = 'httpbin.org'


async def proxy_req(proxy_url):
    if proxy_url.startswith('https://'):
        protocol = 'https'
    else:
        protocol = 'http'
    detect_url = f'{protocol}://{DETECT_WEBSITE}/get'
    proxy_url = proxy_url.replace('https://', 'http://')
    print('Detect url:', detect_url)
    print('Proxy url:', proxy_url)
    try:
        async with aiohttp.ClientSession() as session:
            async with session.get(detect_url,
                                   proxy=proxy_url,
                                   headers={'User-Agent': 'Mozilla/5.0'},
                                   timeout=10) as resp:
                text = await resp.text()
                print('Response text:')
                print(text)
    except Exception as exc:
        logging.error(exc)


if __name__ == '__main__':
    proxy_url = sys.argv[1]
    loop = asyncio.get_event_loop()
    loop.run_until_complete(proxy_req(proxy_url))

Expected behaviour

In the above code, I tried to catch all exceptions when doing request, so if a request exception happened, it should always be logged normally.

Actual behaviour

When I detect some broken proxies, for that proxy, sometimes the exception can be caught normally and logged, but sometimes the exception is not caught but are thrown directly.

For example, detect the broken proxy http://218.106.205.145:8080 ( When using this broken proxy to doing requests, it will return two different groups of reponse headers ), the output may look like below ( The first execution thrown exception, the second execution caught exception and logged ):

✗ python test.py http://218.106.205.145:8080
Detect url: http://httpbin.org/get
Proxy url: http://218.106.205.145:8080
Exception in callback None()
handle: <Handle cancelled>
Traceback (most recent call last):
  File "/Users/xxx/Coding/zzz/venv/lib/python3.6/site-packages/aiohttp/client_proto.py", line 161, in data_received
    messages, upgraded, tail = self._parser.feed_data(data)
  File "aiohttp/_http_parser.pyx", line 297, in aiohttp._http_parser.HttpParser.feed_data
aiohttp.http_exceptions.BadHttpMessage: 400, message='invalid constant string'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/Cellar/python/3.6.4_4/Frameworks/Python.framework/Versions/3.6/lib/python3.6/asyncio/events.py", line 145, in _run
    self._callback(*self._args)
  File "/usr/local/Cellar/python/3.6.4_4/Frameworks/Python.framework/Versions/3.6/lib/python3.6/asyncio/selector_events.py", line 730, in _read_ready
    self._protocol.data_received(data)
  File "/Users/xxx/Coding/zzz/venv/lib/python3.6/site-packages/aiohttp/client_proto.py", line 177, in data_received
    self.transport.close()
AttributeError: 'NoneType' object has no attribute 'close'
Response text:
HTTP/1.1 500 OK
Date: Mon, 28 May 2018 09:43:07 GMT
Cache-Control: no-cache
Pragma: no-cache
Content-Type: text/html; charset="UTF-8"
Content-Length: 2173
Accep

✗ python test.py http://218.106.205.145:8080
Detect url: http://httpbin.org/get
Proxy url: http://218.106.205.145:8080
ERROR:root:400, message='invalid constant string'

Steps to reproduce

Run the code above to detect the broken proxy like http://218.106.205.145:8080

Your environment

aiohttp 3.2.1
Mac OS X
aiohttp client

@asvetlov
Copy link
Member

Broken proxy can close connection at any place.
I can fix AttributeError but unfortunately you can get an empty response body sometimes.
The second situation cannot be detected by aiohttp itself.

shikhir-arora referenced this issue in gieseladev/discord.py Aug 28, 2018
==================

Features
--------

- Add type hints (`3049 &lt;https://github.com/aio-libs/aiohttp/pull/3049&gt;`_)
- Add ``raise_for_status`` request parameter (`3073 &lt;https://github.com/aio-libs/aiohttp/pull/3073&gt;`_)
- Add type hints to HTTP client (`3092 &lt;https://github.com/aio-libs/aiohttp/pull/3092&gt;`_)
- Minor server optimizations (`3095 &lt;https://github.com/aio-libs/aiohttp/pull/3095&gt;`_)
- Preserve the cause when `HTTPException` is raised from another exception. (`3096 &lt;https://github.com/aio-libs/aiohttp/pull/3096&gt;`_)
- Add `close_boundary` option in `MultipartWriter.write` method. Support streaming (`3104 &lt;https://github.com/aio-libs/aiohttp/pull/3104&gt;`_)
- Added a ``remove_slash`` option to the ``normalize_path_middleware`` factory. (`3173 &lt;https://github.com/aio-libs/aiohttp/pull/3173&gt;`_)
- The class `AbstractRouteDef` is importable from `aiohttp.web`. (`3183 &lt;https://github.com/aio-libs/aiohttp/pull/3183&gt;`_)


Bugfixes
--------

- Prevent double closing when client connection is released before the
last ``data_received()`` callback. (`3031 &lt;https://github.com/aio-libs/aiohttp/pull/3031&gt;`_)
- Make redirect with `normalize_path_middleware` work when using url encoded paths. (`3051 &lt;https://github.com/aio-libs/aiohttp/pull/3051&gt;`_)
- Postpone web task creation to connection establishment. (`3052 &lt;https://github.com/aio-libs/aiohttp/pull/3052&gt;`_)
- Fix ``sock_read`` timeout. (`3053 &lt;https://github.com/aio-libs/aiohttp/pull/3053&gt;`_)
- When using a server-request body as the `data=` argument of a client request, iterate over the content with `readany` instead of `readline` to avoid `Line too long` errors. (`3054 &lt;https://github.com/aio-libs/aiohttp/pull/3054&gt;`_)
- fix `UrlDispatcher` has no attribute `add_options`, add `web.options` (`3062 &lt;https://github.com/aio-libs/aiohttp/pull/3062&gt;`_)
- correct filename in content-disposition with multipart body (`3064 &lt;https://github.com/aio-libs/aiohttp/pull/3064&gt;`_)
- Many HTTP proxies has buggy keepalive support.
Let&#39;s not reuse connection but close it after processing every response. (`3070 &lt;https://github.com/aio-libs/aiohttp/pull/3070&gt;`_)
- raise 413 &quot;Payload Too Large&quot; rather than raising ValueError in request.post()
Add helpful debug message to 413 responses (`3087 &lt;https://github.com/aio-libs/aiohttp/pull/3087&gt;`_)
- Fix `StreamResponse` equality, now that they are `MutableMapping` objects. (`3100 &lt;https://github.com/aio-libs/aiohttp/pull/3100&gt;`_)
- Fix server request objects comparison (`3116 &lt;https://github.com/aio-libs/aiohttp/pull/3116&gt;`_)
- Do not hang on `206 Partial Content` response with `Content-Encoding: gzip` (`3123 &lt;https://github.com/aio-libs/aiohttp/pull/3123&gt;`_)
- Fix timeout precondition checkers (`3145 &lt;https://github.com/aio-libs/aiohttp/pull/3145&gt;`_)


Improved Documentation
----------------------

- Add a new FAQ entry that clarifies that you should not reuse response
objects in middleware functions. (`3020 &lt;https://github.com/aio-libs/aiohttp/pull/3020&gt;`_)
- Add FAQ section &quot;Why is creating a ClientSession outside of an event loop dangerous?&quot; (`3072 &lt;https://github.com/aio-libs/aiohttp/pull/3072&gt;`_)
- Fix link to Rambler (`3115 &lt;https://github.com/aio-libs/aiohttp/pull/3115&gt;`_)
- Fix TCPSite documentation on the Server Reference page. (`3146 &lt;https://github.com/aio-libs/aiohttp/pull/3146&gt;`_)
- Fix documentation build configuration file for Windows. (`3147 &lt;https://github.com/aio-libs/aiohttp/pull/3147&gt;`_)
- Remove no longer existing lingering_timeout parameter of Application.make_handler from documentation. (`3151 &lt;https://github.com/aio-libs/aiohttp/pull/3151&gt;`_)
- Mention that ``app.make_handler`` is deprecated, recommend to use runners
API instead. (`3157 &lt;https://github.com/aio-libs/aiohttp/pull/3157&gt;`_)


Deprecations and Removals
-------------------------

- Drop ``loop.current_task()`` from ``helpers.current_task()`` (`2826 &lt;https://github.com/aio-libs/aiohttp/pull/2826&gt;`_)
- Drop ``reader`` parameter from ``request.multipart()``. (`3090 &lt;https://github.com/aio-libs/aiohttp/pull/3090&gt;`_)
kornicameister referenced this issue in kornicameister/korni-stats-collector Sep 17, 2018
This PR updates [aiohttp](https://pypi.org/project/aiohttp) from **3.3.2** to **3.4.4**.



<details>
  <summary>Changelog</summary>
  
  
   ### 3.4.4
   ```
   ==================

- Fix installation from sources when compiling toolkit is not available (`3241 &lt;https://github.com/aio-libs/aiohttp/pull/3241&gt;`_)
   ```
   
  
  
   ### 3.4.3
   ```
   ==================

- Add ``app.pre_frozen`` state to properly handle startup signals in sub-applications. (`3237 &lt;https://github.com/aio-libs/aiohttp/pull/3237&gt;`_)
   ```
   
  
  
   ### 3.4.2
   ```
   ==================

- Fix ``iter_chunks`` type annotation (`3230 &lt;https://github.com/aio-libs/aiohttp/pull/3230&gt;`_)
   ```
   
  
  
   ### 3.4.1
   ```
   ==================

- Fix empty header parsing regression. (`3218 &lt;https://github.com/aio-libs/aiohttp/pull/3218&gt;`_)
- Fix BaseRequest.raw_headers doc. (`3215 &lt;https://github.com/aio-libs/aiohttp/pull/3215&gt;`_)
- Fix documentation building on ReadTheDocs (`3221 &lt;https://github.com/aio-libs/aiohttp/pull/3221&gt;`_)
   ```
   
  
  
   ### 3.4.0
   ```
   ==================

Features
--------

- Add type hints (`3049 &lt;https://github.com/aio-libs/aiohttp/pull/3049&gt;`_)
- Add ``raise_for_status`` request parameter (`3073 &lt;https://github.com/aio-libs/aiohttp/pull/3073&gt;`_)
- Add type hints to HTTP client (`3092 &lt;https://github.com/aio-libs/aiohttp/pull/3092&gt;`_)
- Minor server optimizations (`3095 &lt;https://github.com/aio-libs/aiohttp/pull/3095&gt;`_)
- Preserve the cause when `HTTPException` is raised from another exception. (`3096 &lt;https://github.com/aio-libs/aiohttp/pull/3096&gt;`_)
- Add `close_boundary` option in `MultipartWriter.write` method. Support streaming (`3104 &lt;https://github.com/aio-libs/aiohttp/pull/3104&gt;`_)
- Added a ``remove_slash`` option to the ``normalize_path_middleware`` factory. (`3173 &lt;https://github.com/aio-libs/aiohttp/pull/3173&gt;`_)
- The class `AbstractRouteDef` is importable from `aiohttp.web`. (`3183 &lt;https://github.com/aio-libs/aiohttp/pull/3183&gt;`_)


Bugfixes
--------

- Prevent double closing when client connection is released before the
  last ``data_received()`` callback. (`3031 &lt;https://github.com/aio-libs/aiohttp/pull/3031&gt;`_)
- Make redirect with `normalize_path_middleware` work when using url encoded paths. (`3051 &lt;https://github.com/aio-libs/aiohttp/pull/3051&gt;`_)
- Postpone web task creation to connection establishment. (`3052 &lt;https://github.com/aio-libs/aiohttp/pull/3052&gt;`_)
- Fix ``sock_read`` timeout. (`3053 &lt;https://github.com/aio-libs/aiohttp/pull/3053&gt;`_)
- When using a server-request body as the `data=` argument of a client request, iterate over the content with `readany` instead of `readline` to avoid `Line too long` errors. (`3054 &lt;https://github.com/aio-libs/aiohttp/pull/3054&gt;`_)
- fix `UrlDispatcher` has no attribute `add_options`, add `web.options` (`3062 &lt;https://github.com/aio-libs/aiohttp/pull/3062&gt;`_)
- correct filename in content-disposition with multipart body (`3064 &lt;https://github.com/aio-libs/aiohttp/pull/3064&gt;`_)
- Many HTTP proxies has buggy keepalive support.
  Let&#39;s not reuse connection but close it after processing every response. (`3070 &lt;https://github.com/aio-libs/aiohttp/pull/3070&gt;`_)
- raise 413 &quot;Payload Too Large&quot; rather than raising ValueError in request.post()
  Add helpful debug message to 413 responses (`3087 &lt;https://github.com/aio-libs/aiohttp/pull/3087&gt;`_)
- Fix `StreamResponse` equality, now that they are `MutableMapping` objects. (`3100 &lt;https://github.com/aio-libs/aiohttp/pull/3100&gt;`_)
- Fix server request objects comparison (`3116 &lt;https://github.com/aio-libs/aiohttp/pull/3116&gt;`_)
- Do not hang on `206 Partial Content` response with `Content-Encoding: gzip` (`3123 &lt;https://github.com/aio-libs/aiohttp/pull/3123&gt;`_)
- Fix timeout precondition checkers (`3145 &lt;https://github.com/aio-libs/aiohttp/pull/3145&gt;`_)


Improved Documentation
----------------------

- Add a new FAQ entry that clarifies that you should not reuse response
  objects in middleware functions. (`3020 &lt;https://github.com/aio-libs/aiohttp/pull/3020&gt;`_)
- Add FAQ section &quot;Why is creating a ClientSession outside of an event loop dangerous?&quot; (`3072 &lt;https://github.com/aio-libs/aiohttp/pull/3072&gt;`_)
- Fix link to Rambler (`3115 &lt;https://github.com/aio-libs/aiohttp/pull/3115&gt;`_)
- Fix TCPSite documentation on the Server Reference page. (`3146 &lt;https://github.com/aio-libs/aiohttp/pull/3146&gt;`_)
- Fix documentation build configuration file for Windows. (`3147 &lt;https://github.com/aio-libs/aiohttp/pull/3147&gt;`_)
- Remove no longer existing lingering_timeout parameter of Application.make_handler from documentation. (`3151 &lt;https://github.com/aio-libs/aiohttp/pull/3151&gt;`_)
- Mention that ``app.make_handler`` is deprecated, recommend to use runners
  API instead. (`3157 &lt;https://github.com/aio-libs/aiohttp/pull/3157&gt;`_)


Deprecations and Removals
-------------------------

- Drop ``loop.current_task()`` from ``helpers.current_task()`` (`2826 &lt;https://github.com/aio-libs/aiohttp/pull/2826&gt;`_)
- Drop ``reader`` parameter from ``request.multipart()``. (`3090 &lt;https://github.com/aio-libs/aiohttp/pull/3090&gt;`_)
   ```
   
  
</details>


 

<details>
  <summary>Links</summary>
  
  - PyPI: https://pypi.org/project/aiohttp
  - Changelog: https://pyup.io/changelogs/aiohttp/
  - Repo: https://github.com/aio-libs/aiohttp
</details>
@lock
Copy link

lock bot commented Oct 28, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a [new issue] for related bugs.
If you feel like there's important points made in this discussion, please include those exceprts into that [new issue].
[new issue]: https://github.com/aio-libs/aiohttp/issues/new

@lock lock bot added the outdated label Oct 28, 2019
@lock lock bot locked as resolved and limited conversation to collaborators Oct 28, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
2 participants