fix(url): fix host extra trailing slash issue #1173
please review
Codecov Report
Additional details and impacted files
@@ Coverage Diff @@
## main #1173 +/- ##
==========================================
- Coverage 90.40% 90.39% -0.02%
==========================================
Files 106 106
Lines 16515 16535 +20
Branches 36 36
==========================================
+ Hits 14930 14946 +16
- Misses 1578 1582 +4
Partials 7 7
CodSpeed Performance Report
Merging #1173 will degrade performances by 23.15%
Thanks for the PR! It looks like the main thread discussing all these cases is pydantic/pydantic#7186
I agree there's definitely a normalization bug there with path, so let's get that fixed 👍
Regarding the more extensive changes you've made to query and fragment, perhaps the more correct solution is to get servo/rust-url#835 implemented upstream and then we can move to a proper method to build these URLs rather than fiddling around with normalising into a string here. Maybe best to leave those off this PR for now; if you've got interest in tackling the upstream issue it sounds like there's a lot of demand for it!
if !url.ends_with('/') {
    url.push('/');
}
I feel like we should expect that url never ends with /, as that's technically not a valid hostname? I wonder if there's a case to raise an error if url_host contains /?
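One way to picture the validation floated in this comment, as a minimal Python sketch rather than anything taken from pydantic-core: reject a host that contains / before any joining happens. The helper name validate_host and the choice of ValueError are assumptions made purely for illustration.

```python
def validate_host(host: str) -> str:
    """Hypothetical helper: refuse host values that contain a slash."""
    # A slash belongs to the path component, so a host such as
    # 'example.com/' is treated as invalid input here.
    if "/" in host:
        raise ValueError(f"host must not contain '/': {host!r}")
    return host
```

With a check like this, host='example.com/' would raise instead of being silently normalised, which is the breaking change the comment is weighing.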
if path.starts_with('/') {
    url.push_str(path.trim_start_matches('/'));
} else {
    url.push_str(path);
}
I think trimming the path is too aggressive, because ///foo as a path is technically meaningful and
- if path.starts_with('/') {
-     url.push_str(path.trim_start_matches('/'));
- } else {
-     url.push_str(path);
- }
+ if !path.starts_with('/') {
+     url.push('/');
+ }
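The behaviour this suggestion aims for can be sketched in Python as follows. This is an illustration under assumptions, not the code in the PR: join_host_and_path is a hypothetical name, base is assumed to be the scheme://host prefix without a trailing slash, and appending the path after the separator check is an assumption, since the suggestion itself shows only the check.

```python
def join_host_and_path(base: str, path: str) -> str:
    """Hypothetical helper: `base` is 'scheme://host' with no trailing slash."""
    # Add the separator only when the path does not supply one itself.
    if not path.startswith("/"):
        base += "/"
    # Append the path untouched, so leading slashes such as '///foo' survive.
    return base + path
```

Under these assumptions, 'foo/bar' and '/foo/bar' both produce a single separator after the host, while '///foo' keeps all three of its slashes instead of being trimmed.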
if !query.starts_with('?') {
    url.push('?');
}
This seems like a change in functionality which is slightly different, so perhaps needs to be split out into a separate PR. Similar to my comment about hostname, there should probably be some validation here.
  if let Some(path) = path {
-     url.push('/');
-     url.push_str(path);
+     if !url.ends_with('/') {
+         url.push('/');
+     }
+     if path.starts_with('/') {
+         url.push_str(path.trim_start_matches('/'));
+     } else {
+         url.push_str(path);
+     }
  }
  if let Some(query) = query {
-     url.push('?');
+     if !query.starts_with('?') {
+         url.push('?');
+     }
      url.push_str(query);
  }
  if let Some(fragment) = fragment {
-     url.push('#');
+     if !fragment.starts_with('#') {
+         url.push('#');
+     }
Same comments as above, I guess.
# 3) with host trailing slash/no path leading slash in the input
url = Url.build(scheme='https', host='example.com/', path='foo/bar', query='baz=qux', fragment='quux')
assert_example_url(url)

# 4) with host trailing slash/with path leading slash in the input
url = Url.build(scheme='https', host='example.com/', path='/foo/bar', query='baz=qux', fragment='quux')
assert_example_url(url)
So I would be tempted to argue that these cases should never be valid, because the host isn't valid? Maybe that needs to wait until V3 given it's technically breaking? Or is it kinda broken anyway already because of the unconditional / addition?
@davidhewitt, this is in regard to all of your comments: you're right. I guess I got into a "fix every case" mood here :)
Change Summary
Currently, the Url.build and MultiHostUrl.build methods both have a bug: an extra slash is always added after host and before path, regardless of whether either of them already has a trailing or leading slash. This PR adds checks to make sure that there is only a single slash in the resulting URL. Additionally, it adds similar checks to the query and fragment appends, just in case either of them contains a leading ? or #.
Related issue number
I haven't found any related issue in this repo, but there were related issues in the pydantic/pydantic repo.
Checklist
pydantic-core (except for expected changes)
Selected Reviewer: @davidhewitt
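For readers who want to see the reported bug in isolation, here is a rough Python sketch of the unconditional concatenation described in the change summary. It mirrors only the string joining visible in the diff; the function name build_unconditional is hypothetical, and everything else the real build methods do is left out.

```python
from typing import Optional


def build_unconditional(
    scheme: str,
    host: str,
    path: Optional[str] = None,
    query: Optional[str] = None,
    fragment: Optional[str] = None,
) -> str:
    """Hypothetical model of the pre-PR joining: separators are always added."""
    url = f"{scheme}://{host}"
    if path is not None:
        url += "/" + path  # added even if path already starts with '/'
    if query is not None:
        url += "?" + query
    if fragment is not None:
        url += "#" + fragment
    return url


# A leading slash in the path ends up doubled after the host.
print(build_unconditional("https", "example.com", "/foo/bar", "baz=qux", "quux"))
# prints "https://example.com//foo/bar?baz=qux#quux"
```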