Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement granular URL error hierarchy in the HTTP client #6722

Merged

Conversation

setla
Copy link
Contributor

@setla setla commented Apr 27, 2022

What do these changes do?

Added new exception type to separate cases with invalid redirect url during request. Previously ValueError or InvalidURL was raised and screening out was hard (valid url that redirects to invalid one raised the same error as invalid url).

Slightly improved test coverage.

Docs changes previews:

Are there changes in behavior for the user?

In case of invalid redirect url during request - new InvalidRedirectUrl will be raised instead of ValueError or InvalidURL, it derives from InvalidURL so it's backward compatibile

Related issue number

Resolves #2507
Resolves #2630
Resolves #3315

Checklist

  • I think the code is well written
  • Unit tests for the changes exist
  • Documentation reflects the changes
  • If you provide code modification, please add yourself to CONTRIBUTORS.txt
    • The format is <Name> <Surname>.
    • Please keep alphabetical order, the file is sorted by names.
  • Add a new news fragment into the CHANGES folder
    • name it <issue_id>.<type> for example (588.bugfix)
    • if you don't have an issue_id change it to the pr id after creating the pr
    • ensure type is one of the following:
      • .feature: Signifying a new feature.
      • .bugfix: Signifying a bug fix.
      • .doc: Signifying a documentation improvement.
      • .removal: Signifying a deprecation or removal of public API.
      • .misc: A ticket has been closed, but it is not of interest to users.
    • Make sure to use full sentences with correct case and punctuation, for example: "Fix issue with non-ascii contents in doctest text files."

@setla setla requested review from webknjaz and asvetlov as code owners April 27, 2022 13:49
@setla setla marked this pull request as draft April 27, 2022 13:52
@psf-chronographer psf-chronographer bot added the bot:chronographer:provided There is a change note present in this PR label Apr 27, 2022
@setla setla closed this Apr 27, 2022
@setla setla reopened this Apr 27, 2022
@setla setla marked this pull request as ready for review April 28, 2022 12:45
aiohttp/client.py Outdated Show resolved Hide resolved
aiohttp/client_exceptions.py Outdated Show resolved Hide resolved
aiohttp/client_exceptions.py Outdated Show resolved Hide resolved
tests/test_client_functional.py Outdated Show resolved Hide resolved
@codecov
Copy link

codecov bot commented Apr 28, 2022

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (0ec65c0) 97.51% compared to head (80b5479) 97.52%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6722      +/-   ##
==========================================
+ Coverage   97.51%   97.52%   +0.01%     
==========================================
  Files         107      107              
  Lines       32733    32802      +69     
  Branches     3824     3840      +16     
==========================================
+ Hits        31920    31991      +71     
+ Misses        612      610       -2     
  Partials      201      201              
Flag Coverage Δ
CI-GHA 97.44% <100.00%> (+0.01%) ⬆️
OS-Linux 97.12% <100.00%> (+0.01%) ⬆️
OS-Windows 95.62% <100.00%> (+<0.01%) ⬆️
OS-macOS 96.75% <100.00%> (-0.18%) ⬇️
Py-3.10.11 95.54% <100.00%> (+<0.01%) ⬆️
Py-3.10.13 96.92% <100.00%> (+0.01%) ⬆️
Py-3.11.7 ?
Py-3.11.8 96.57% <100.00%> (?)
Py-3.12.1 ?
Py-3.12.2 96.69% <100.00%> (?)
Py-3.8.10 95.51% <100.00%> (+<0.01%) ⬆️
Py-3.8.18 96.86% <100.00%> (+0.01%) ⬆️
Py-3.9.13 95.51% <100.00%> (+<0.01%) ⬆️
Py-3.9.18 96.89% <100.00%> (+0.01%) ⬆️
Py-pypy7.3.15 96.43% <100.00%> (+0.01%) ⬆️
VM-macos 96.75% <100.00%> (-0.18%) ⬇️
VM-ubuntu 97.12% <100.00%> (+0.01%) ⬆️
VM-windows 95.62% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@setla setla requested a review from Dreamsorcerer April 28, 2022 18:42
@setla
Copy link
Contributor Author

setla commented May 27, 2022

@Dreamsorcerer can I get some clue if (and when if so) it's going to be merged?

@webknjaz
Copy link
Member

@setla looks like a few files diverged since your last update — could you fix the merge conflicts?

@Dreamsorcerer has this slipped off of your review queue?

CHANGES/6722.feature Outdated Show resolved Hide resolved
@webknjaz webknjaz changed the title New exception when redirect url is invalid Implement granular URL error hierarchy in the HTTP client Feb 13, 2024
@webknjaz webknjaz merged commit fb465e1 into aio-libs:master Feb 13, 2024
30 of 34 checks passed
Copy link
Contributor

patchback bot commented Feb 13, 2024

Backport to 3.10: 💔 cherry-picking failed — conflicts found

❌ Failed to cleanly apply fb465e1 on top of patchback/backports/3.10/fb465e155b872f01489173d11e35f02ccbf3a940/pr-6722

Backporting merged PR #6722 into master

  1. Ensure you have a local repo clone of your fork. Unless you cloned it
    from the upstream, this would be your origin remote.
  2. Make sure you have an upstream repo added as a remote too. In these
    instructions you'll refer to it by the name upstream. If you don't
    have it, here's how you can add it:
    $ git remote add upstream https://github.com/aio-libs/aiohttp.git
  3. Ensure you have the latest copy of upstream and prepare a branch
    that will hold the backported code:
    $ git fetch upstream
    $ git checkout -b patchback/backports/3.10/fb465e155b872f01489173d11e35f02ccbf3a940/pr-6722 upstream/3.10
  4. Now, cherry-pick PR Implement granular URL error hierarchy in the HTTP client #6722 contents into that branch:
    $ git cherry-pick -x fb465e155b872f01489173d11e35f02ccbf3a940
    If it'll yell at you with something like fatal: Commit fb465e155b872f01489173d11e35f02ccbf3a940 is a merge but no -m option was given., add -m 1 as follows instead:
    $ git cherry-pick -m1 -x fb465e155b872f01489173d11e35f02ccbf3a940
  5. At this point, you'll probably encounter some merge conflicts. You must
    resolve them in to preserve the patch from PR Implement granular URL error hierarchy in the HTTP client #6722 as close to the
    original as possible.
  6. Push this branch to your fork on GitHub:
    $ git push origin patchback/backports/3.10/fb465e155b872f01489173d11e35f02ccbf3a940/pr-6722
  7. Create a PR, ensure that the CI is green. If it's not — update it so that
    the tests and any other checks pass. This is it!
    Now relax and wait for the maintainers to process your pull request
    when they have some cycles to do reviews. Don't worry — they'll tell you if
    any improvements are necessary when the time comes!

🤖 @patchback
I'm built with octomachinery and
my source is open — https://github.com/sanitizers/patchback-github-app.

Copy link
Contributor

patchback bot commented Feb 13, 2024

Backport to 3.9: 💔 cherry-picking failed — conflicts found

❌ Failed to cleanly apply fb465e1 on top of patchback/backports/3.9/fb465e155b872f01489173d11e35f02ccbf3a940/pr-6722

Backporting merged PR #6722 into master

  1. Ensure you have a local repo clone of your fork. Unless you cloned it
    from the upstream, this would be your origin remote.
  2. Make sure you have an upstream repo added as a remote too. In these
    instructions you'll refer to it by the name upstream. If you don't
    have it, here's how you can add it:
    $ git remote add upstream https://github.com/aio-libs/aiohttp.git
  3. Ensure you have the latest copy of upstream and prepare a branch
    that will hold the backported code:
    $ git fetch upstream
    $ git checkout -b patchback/backports/3.9/fb465e155b872f01489173d11e35f02ccbf3a940/pr-6722 upstream/3.9
  4. Now, cherry-pick PR Implement granular URL error hierarchy in the HTTP client #6722 contents into that branch:
    $ git cherry-pick -x fb465e155b872f01489173d11e35f02ccbf3a940
    If it'll yell at you with something like fatal: Commit fb465e155b872f01489173d11e35f02ccbf3a940 is a merge but no -m option was given., add -m 1 as follows instead:
    $ git cherry-pick -m1 -x fb465e155b872f01489173d11e35f02ccbf3a940
  5. At this point, you'll probably encounter some merge conflicts. You must
    resolve them in to preserve the patch from PR Implement granular URL error hierarchy in the HTTP client #6722 as close to the
    original as possible.
  6. Push this branch to your fork on GitHub:
    $ git push origin patchback/backports/3.9/fb465e155b872f01489173d11e35f02ccbf3a940/pr-6722
  7. Create a PR, ensure that the CI is green. If it's not — update it so that
    the tests and any other checks pass. This is it!
    Now relax and wait for the maintainers to process your pull request
    when they have some cycles to do reviews. Don't worry — they'll tell you if
    any improvements are necessary when the time comes!

🤖 @patchback
I'm built with octomachinery and
my source is open — https://github.com/sanitizers/patchback-github-app.

@webknjaz
Copy link
Member

@setla to get this released, backport PRs should be submitted per instructions above.

setla added a commit to setla/aiohttp that referenced this pull request Feb 13, 2024
@setla
Copy link
Contributor Author

setla commented Feb 13, 2024

@webknjaz right, should I create 2 PRs (3.9 and 3.10)? or just one (3.9)? 3.9 is already present #8154

@webknjaz
Copy link
Member

Two — the branches are separate. The backports are to made in reversed order. Besides, I know that 3.10 is definitely necessary, but not sure about 3.9 — I'll wait to see if anybody objects since bits of this might be perceived as breaking/new behaviors and not just a bugfix. Though I'll be able to merge into 3.10 right away.

@Dreamsorcerer
Copy link
Member

I'll wait to see if anybody objects since bits of this might be perceived as breaking/new behaviors and not just a bugfix.

I'm not sure it's important enough to bother going into 3.9. I think we have enough improvements in 3.10 that we should be planning a release in a couple of months anyway.

@webknjaz
Copy link
Member

I'm not sure it's important enough to bother going into 3.9. I think we have enough improvements in 3.10 that we should be planning a release in a couple of months anyway.

I want to make a 3.9 release as soon as #8089 is in. So if there's anything that could piggyback with that for free, I'd rather include it for as long as it's eligible. OTOH, the new exceptions are a new API, so per SemVer it should go into a minor release, not patch. I lost sight of it, but I just checked that the change notes are of the feature type, which makes more sense. I suppose, a 3.9 backport isn't needed indeed.

setla added a commit to setla/aiohttp that referenced this pull request Feb 14, 2024
This patch introduces 5 granular user-facing exceptions that may occur
when HTTP requests are made:
* `InvalidUrlClientError`
* `RedirectClientError`
* `NonHttpUrlClientError`
* `InvalidUrlRedirectClientError`
* `NonHttpUrlRedirectClientError`

Previously `ValueError` or `InvalidURL` was raised and screening out was
complicated (a valid URL that redirects to invalid one raised the same error
as an invalid URL).

Ref: aio-libs#6722 (comment)

PR aio-libs#6722

Resolves aio-libs#2507
Resolves aio-libs#2630
Resolves aio-libs#3315

Co-authored-by: Sviatoslav Sydorenko <sviat@redhat.com>
(cherry picked from commit fb465e1)
@setla
Copy link
Contributor Author

setla commented Feb 14, 2024

@webknjaz okay - I opened PR to 3.10 #8158

webknjaz pushed a commit that referenced this pull request Feb 14, 2024
…rchy in the HTTP client (#8158)

**This is a backport of PR #6722 as merged into master
(fb465e1).**

This patch introduces 5 granular user-facing exceptions that may occur
when HTTP requests are made:
* `InvalidUrlClientError`
* `RedirectClientError`
* `NonHttpUrlClientError`
* `InvalidUrlRedirectClientError`
* `NonHttpUrlRedirectClientError`

Previously `ValueError` or `InvalidURL` was raised and screening out was
complicated (a valid URL that redirects to invalid one raised the same
error
as an invalid URL).

Ref:
#6722 (comment)

PR #6722

Resolves #2507
Resolves #2630
Resolves #3315

Co-authored-by: Sviatoslav Sydorenko <sviat@redhat.com>
(cherry picked from commit fb465e1)
@webknjaz
Copy link
Member

@setla thank you! This improvement is what I've been wanting to get done for years and now it's mere months away.. Watch for new releases of the 3.10 series coming up this year!

@setla
Copy link
Contributor Author

setla commented Feb 15, 2024

thanks :)

@bdraco
Copy link
Member

bdraco commented Feb 16, 2024

It looks like this introduced a regression with websocket urls

2024-02-16 09:45:14.978 ERROR (MainThread) [pyunifiprotect.websocket] Websocket disconnect error: %s
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/pyunifiprotect/websocket.py", line 95, in _websocket_loop
    self._ws_connection = await session.ws_connect(
                          ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/aiohttp/client.py", line 884, in _ws_connect
    resp = await self.request(
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/aiohttp/client.py", line 472, in _request
    raise NonHttpUrlClientError(url)
aiohttp.client_exceptions.NonHttpUrlClientError: wss://192.168.209.164/proxy/protect/ws/updates?lastUpdateId=80f54055-84de-4e15-95d7-80cac46aafbd

@webknjaz
Copy link
Member

@bdraco does this mean there's no test coverage for this case? Would you look into it? What's the expected behavior? What's supposed to handle this?

@@ -167,6 +177,7 @@ class ClientTimeout:

# https://www.rfc-editor.org/rfc/rfc9110#section-9.2.2
IDEMPOTENT_METHODS = frozenset({"GET", "HEAD", "OPTIONS", "TRACE", "PUT", "DELETE"})
HTTP_SCHEMA_SET = frozenset({"http", "https", ""})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bdraco @setla would augmenting this make it work?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will analyze how its being used in the case I posted above when I get back home next week and provide additional detail

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a possible future improvement, I'd rename the constant to something like SUPPORTED_URI_SCHEMAS. Especially, if it'll stop doing listing http only.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could have two constants where one lists ws/wss and the check asserts against both in the first place but not the second one. Are websockets supposed to be disallowed in redirects?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since ws_connect always calls into self.request think the simplest solution to avoid a breaking change is:

diff --git a/aiohttp/client.py b/aiohttp/client.py
index 8d8d13f2..d608c009 100644
--- a/aiohttp/client.py
+++ b/aiohttp/client.py
@@ -178,7 +178,7 @@ DEFAULT_TIMEOUT: Final[ClientTimeout] = ClientTimeout(total=5 * 60)
 
 # https://www.rfc-editor.org/rfc/rfc9110#section-9.2.2
 IDEMPOTENT_METHODS = frozenset({"GET", "HEAD", "OPTIONS", "TRACE", "PUT", "DELETE"})
-HTTP_SCHEMA_SET = frozenset({"http", "https", ""})
+HTTP_SCHEMA_SET = frozenset({"http", "https", "ws", "wss", ""})
 
 _RetType = TypeVar("_RetType")
 _CharsetResolver = Callable[[ClientResponse, bytes], str]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we probably want that in the _request() check, but probably still want to limit it to http/https in the redirects, and disable redirects in _ws_connect() (as mentioned above).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense to me. So it seems we need to different constants

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should include wss:// as well as this breaks discord.py entirely.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I probably lost track of this, maybe nobody got round to making the change discussed above.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bot:chronographer:provided There is a change note present in this PR
Projects
None yet
6 participants