-
-
Notifications
You must be signed in to change notification settings - Fork 169
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix trailing slash dropped since v1.9.1 #1023
Conversation
When it was changed, I considered this a inconvenient change, time went by and I started using it to drop trailing slashes import yarl
yarl.__version__
# '1.9.4'
a = yarl.URL("schema://host/path/")
a
# URL('schema://host/path/')
a / ""
# URL('schema://host/path') if (nf := (yarl.URL(c).with_fragment(None) / "")) in … What is the recommended way to drop a trailing slash after this change? |
I don't think it was ever meant to be doing that, to me it's just buggy behaviour. If you need something like that, I guess you'll need to do something like: if a.path.endswith("/"):
a = a.with_path(u.path[:-1]) |
As mentioned, current behavior matches pathlib, but pathlib always removes trailing /, on the other hand internally "" was used to append a "/". I'm fine with whatever gets documented as behavior. New regressions:
|
@webknjaz any chance you can review this? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tracked down the origin.
The detection to add a trailing / in _make_child did not identify the empty string as last part as indicator.
@@ -718,9 +718,7 @@ def _make_child(self, segments, encoded=False): | |||
# keep the trailing slash if the last segment ends with / | |||
parsed = [""] if segments and segments[-1][-1:] == "/" else [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is origin.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
parsed = [""] if segments and segments[-1][-1:] == "/" else [] | |
parsed = [""] if segments and (segments[-1][-1:] == "/" or segments[-1] == "") else [] |
), | ||
"http://example.com/path/", | ||
id="trailing-slash-empty-string", | ||
), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pytest.param(
"",
(
"path",
"",
"",
),
"http://example.com/path/",
id="trailing-double-slash-empty-string",
),
pytest.param(
"",
("path", "", "", "path"),
"http://example.com/path/path",
id="embedded-double-slash-empty-string",
),
pytest.param(
"",
("path/", ""),
"http://example.com/path/",
id="path-slash-follows-embedded-slash-empty-string",
),
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don’t agree with the first 2 parametrizations. IMO they should result in double slashes.
otherwise how can you build a url with double slashes in the path?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Normalized path do not have double slashes?
And from a technical point of view,
("path/", "")
is the same as ("path", "", "")
So the third would be bad as well.
The requested behavior does not match the test_joinpath_relative[empty-element]
- the test you broke.
Line 835 in f9e3b1b
pytest.param(URL("a"), ("b", "", "c"), ("a", "b", "c"), id="empty-element"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure these would be a normalised paths?
Even if so, the lib should allow to generate non-normalised paths, since many web servers allow to differentiate for paths with and without slashes at the end for example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to RFC 3986, the path is allowed to have empty strings as segment of the path, as defined in the grammar.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@youtux how to you think about:
commonism@56db9d3
it passes all current tests and satisfies the requirements set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that all looks correct to me, with no opinion on whether the last example should have 3 or 4 slashes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Last example is meant to have 3 slashes - 4 would be a special case.
I've refactored things a little to match the wording in rfc3986 and added more tests to document expected behavior.
URL.join is inconsistent to URL.joinpath now, it uses urlib.parse.join, stripping empty segments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
adding a consistent URL.join & rfc3986 port normalization
https://github.com/commonism/yarl/compare/youtux-fix-join-trailing-slash...commonism:yarl:rfc3986-port-normalization?expand=1
How do you want this to proceed @youtux?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you should open a new pull request since it's more complete.
Also, I'd split other changes you made in different PRs (like the query params changes)
if not seg: | ||
continue | ||
if seg[0] == "/": | ||
if seg and seg[0] == "/": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
revert
if seg and seg[0] == "/": | |
if not seg: | |
continue | |
if seg and seg[0] == "/": |
Your versions fails the unit tests.
|
What do these changes do?
Add back the ability to join with an empty string
""
, resulting in the/
being appended to the URL. This was broken in v1.9.1, and since then it's not possible to create a URL with a trailing backslash (without doing string manipulation on the last segment).Are there changes in behavior for the user?
Yes, fix a regression introuduced in v1.9.1, where you can't join "" expecting to have a trailing "/" appended to the URL.
Related issue number
Fixes #984.
Fixes #926.
Checklist