Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Normalize URL string ports per RFC 3986 section 3.2.3 #1033

Merged
merged 19 commits into from
Jul 17, 2024

Conversation

commonism
Copy link
Contributor

@commonism commonism commented Jul 3, 2024

What do these changes do?

The changes add a missing step in rfc3986 normalization - strip default port from URL str representation.

Are there changes in behavior for the user?

Yes

Related issue number

#1023 (comment)

In the future, with another PR, I think the only normalisation step currently missing is https://datatracker.ietf.org/doc/html/rfc3986.html#section-6.2.3
Which would include (for HTTP/HTTPS schemes) setting the path to / when empty and removing the port number when matching the default 80/443.

Checklist

  • I think the code is well written
  • Unit tests for the changes exist
  • Documentation reflects the changes

Copy link

codecov bot commented Jul 5, 2024

Codecov Report

Attention: Patch coverage is 85.71429% with 7 lines in your changes missing coverage. Please review.

Project coverage is 62.34%. Comparing base (4bededd) to head (4e255da).
Report is 4 commits behind head on master.

Files Patch % Lines
yarl/_url.py 80.76% 5 Missing ⚠️
tests/test_url.py 80.00% 1 Missing ⚠️
tests/test_url_update_netloc.py 94.44% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1033      +/-   ##
==========================================
+ Coverage   62.06%   62.34%   +0.28%     
==========================================
  Files          38       38              
  Lines        6550     6616      +66     
  Branches      352      357       +5     
==========================================
+ Hits         4065     4125      +60     
- Misses       2457     2463       +6     
  Partials       28       28              
Flag Coverage Δ
CI-GHA 62.31% <85.71%> (+0.28%) ⬆️
MyPy 25.56% <67.34%> (+0.54%) ⬆️
OS-Linux 99.25% <100.00%> (+<0.01%) ⬆️
OS-Windows 99.58% <100.00%> (+<0.01%) ⬆️
OS-macOS 99.01% <100.00%> (+0.01%) ⬆️
Py-3.10.11 98.90% <95.00%> (-0.02%) ⬇️
Py-3.10.14 99.10% <95.00%> (-0.02%) ⬇️
Py-3.11.9 99.10% <95.00%> (-0.02%) ⬇️
Py-3.12.4 99.10% <95.00%> (-0.02%) ⬇️
Py-3.8.10 98.87% <100.00%> (+0.01%) ⬆️
Py-3.8.18 99.07% <100.00%> (+0.01%) ⬆️
Py-3.9.13 98.87% <100.00%> (+0.01%) ⬆️
Py-3.9.19 99.07% <100.00%> (+0.01%) ⬆️
Py-pypy7.3.11 99.39% <100.00%> (+<0.01%) ⬆️
Py-pypy7.3.16 99.42% <100.00%> (+<0.01%) ⬆️
VM-macos-latest 99.01% <100.00%> (+0.01%) ⬆️
VM-ubuntu-latest 99.25% <100.00%> (+<0.01%) ⬆️
VM-windows-latest 99.58% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@commonism commonism changed the title WIP RFC3986 port normalization RFC3986 port normalization Jul 5, 2024
@commonism
Copy link
Contributor Author

@webknjaz @Dreamsorcerer - please review
In addition to getting more RFC3986 compliant by normalizing ports, this is part of a series (1/3) to implement URL.join without using urljoin.
Currently URL.join is implemented using using urllib.parse.urljoin, urljoin strips. empty segments in violation of rfc3985

@webknjaz
Copy link
Member

webknjaz commented Jul 6, 2024

Remind me on Monday evening if I forget, not earlier, plz.

yarl/_url.py Outdated Show resolved Hide resolved
yarl/_url.py Outdated Show resolved Hide resolved
commonism and others added 4 commits July 7, 2024 08:47
Co-authored-by: Sam Bull <aa6bs0@sambull.org>
Co-authored-by: Sam Bull <aa6bs0@sambull.org>
@commonism
Copy link
Contributor Author

Remind me on Monday evening if I forget, not earlier, plz.

Remind

yarl/_url.py Outdated Show resolved Hide resolved
yarl/_url.py Outdated Show resolved Hide resolved
yarl/_url.py Outdated Show resolved Hide resolved
yarl/_url.py Show resolved Hide resolved
CHANGES/1033.bugfix.rst Outdated Show resolved Hide resolved
commonism and others added 4 commits July 17, 2024 08:36
Co-authored-by: Sam Bull <aa6bs0@sambull.org>
Co-authored-by: Sviatoslav Sydorenko (Святослав Сидоренко) <wk.cvs.github@sydorenko.org.ua>
Co-authored-by: Sviatoslav Sydorenko (Святослав Сидоренко) <wk.cvs.github@sydorenko.org.ua>
Co-authored-by: Sviatoslav Sydorenko (Святослав Сидоренко) <wk.cvs.github@sydorenko.org.ua>
yarl/_url.py Outdated Show resolved Hide resolved
commonism and others added 3 commits July 17, 2024 13:25
Co-authored-by: Sviatoslav Sydorenko (Святослав Сидоренко) <wk.cvs.github@sydorenko.org.ua>
CHANGES/1033.bugfix.rst Outdated Show resolved Hide resolved
tests/test_url.py Outdated Show resolved Hide resolved
yarl/_url.py Outdated Show resolved Hide resolved
webknjaz and others added 2 commits July 17, 2024 15:38
Co-authored-by: Sviatoslav Sydorenko (Святослав Сидоренко) <wk.cvs.github@sydorenko.org.ua>
CHANGES/1033.bugfix.rst Outdated Show resolved Hide resolved
@webknjaz webknjaz added bug bot:chronographer:skip This PR does not need to include a change note labels Jul 17, 2024
@webknjaz webknjaz changed the title RFC3986 port normalization Normalize URL string ports per RFC 3986 section 3.2.3 Jul 17, 2024
@webknjaz webknjaz enabled auto-merge (squash) July 17, 2024 17:12
@webknjaz webknjaz merged commit d5c5ab3 into aio-libs:master Jul 17, 2024
41 of 44 checks passed
@webknjaz webknjaz removed the bot:chronographer:skip This PR does not need to include a change note label Jul 17, 2024
@commonism commonism deleted the rfc3986-port-normalization branch July 18, 2024 06:43
@Dreamsorcerer Dreamsorcerer mentioned this pull request Sep 2, 2024
1 task
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants