Differentiate between 'url.host' and 'url.raw_host' #1590

Merged (7 commits) on Apr 23, 2021

tomchristie (Member) commented Apr 22, 2021

Refs #1275

Throughout our URL model we differentiate neatly between byte-wise cases and str cases.
We always use bytes for the encoded form, where the percent-escapes are left as-is, and str for the decoded form, where they have been resolved.

Eg...

import httpx

url = httpx.URL("https://jo%40email.com:a%20secret@example.com:1234/pa th")
assert url.username == "jo@email.com"
assert url.password == "a secret"
assert url.userinfo == b"jo%40email.com:a%20secret"
assert url.path == "/pa th"
assert url.raw_path == b"/pa%20th"
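
As a rough illustration of the same str/bytes split outside httpx, the standard library's `urllib.parse` can reproduce both forms. This is only a sketch: `quote` and `unquote` are stand-ins here, and nothing in this thread says they match what httpx does internally.

```python
from urllib.parse import quote, unquote

# Escaped userinfo, as it would appear in the byte-wise form.
raw_userinfo = b"jo%40email.com:a%20secret"

# Split on the first ":" and decode each half to get the str forms.
raw_username, _, raw_password = raw_userinfo.partition(b":")
username = unquote(raw_username.decode("ascii"))
password = unquote(raw_password.decode("ascii"))

# Going the other way: escape a str path to get its byte-wise form.
raw_path = quote("/pa th").encode("ascii")
```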

This pull request is a proposal for treating IDNA domain names similarly, so...

url = httpx.URL("https://müller.de:80")
assert url.host == "müller.de"
assert url.raw_host == b"xn--mller-kva.de"
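
For anyone unfamiliar with where the `xn--` form comes from, here is a minimal sketch using Python's built-in `idna` codec. That codec is an assumption chosen for illustration (it implements the older IDNA 2003 rules) and may not be what httpx uses under the hood, although it agrees with the example above for this hostname.

```python
# Assumption: the built-in "idna" codec stands in for the library's own encoding.
host = "müller.de"

# str -> bytes: each non-ASCII label becomes an "xn--" punycode label.
raw_host = host.encode("idna")

# bytes -> str: decoding recovers the human-readable hostname.
round_tripped = raw_host.decode("idna")
```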

For API consistency this also necessarily results in url.netloc becoming a byte interface, which actually makes sense for the contexts in which it is used.

url = httpx.URL("https://müller.de:80")
assert url.netloc == b"xn--mller-kva.de:80"
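
To sketch why a byte-wise netloc is convenient: one plausible context (an assumption, the thread does not name one) is building a `Host` header, where the value has to be bytes anyway. `build_netloc` below is a hypothetical helper, not part of httpx, and it ignores details such as IPv6 brackets and default ports.

```python
from typing import Optional


def build_netloc(raw_host: bytes, port: Optional[int] = None) -> bytes:
    """Hypothetical helper: join an encoded host and an optional port."""
    if port is None:
        return raw_host
    return raw_host + b":" + str(port).encode("ascii")


# The host is IDNA-encoded first, so the result is plain ASCII bytes.
netloc = build_netloc("müller.de".encode("idna"), 80)
host_header = (b"Host", netloc)
```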

Finally, we also introduce .raw_scheme as a byte-wise representation of the scheme, so that url.raw is consistently made up of the raw components:

url = httpx.URL("https://müller.de:80/pa th")
assert url.raw == (url.raw_scheme, url.raw_host, url.port, url.raw_path)
assert url.raw == (b"https", b"xn--mller-kva.de", 80, b"/pa%20th")
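
The same four-part tuple can be approximated with the standard library alone, which may help show what each "raw" component is. This is a sketch under stated assumptions: `raw_parts` is a hypothetical function, and `urlsplit`, `quote`, and the built-in `idna` codec are stand-ins rather than httpx's actual implementation.

```python
from urllib.parse import quote, urlsplit


def raw_parts(url: str):
    """Hypothetical helper: split a URL into byte-wise components."""
    parts = urlsplit(url)
    raw_scheme = parts.scheme.encode("ascii")
    # IDNA-encode the hostname so it is safe to put on the wire.
    raw_host = parts.hostname.encode("idna")
    # Percent-escape the path, then encode it as ASCII bytes.
    raw_path = quote(parts.path).encode("ascii")
    return (raw_scheme, raw_host, parts.port, raw_path)


raw = raw_parts("https://müller.de:80/pa th")
```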

tomchristie added the user-experience label (Ensuring that users have a good experience using the library) Apr 22, 2021
StephenBrown2 (Contributor)

Wouldn't raw indicate unencoded?

StephenBrown2 (Contributor) left a comment

Just a couple typos, and the previous question about what raw means.

httpx/_models.py: 5 review comments (outdated, resolved)
tomchristie and others added 5 commits April 23, 2021 09:03
Co-authored-by: Stephen Brown II <Stephen.Brown2@gmail.com>
Co-authored-by: Stephen Brown II <Stephen.Brown2@gmail.com>
Co-authored-by: Stephen Brown II <Stephen.Brown2@gmail.com>
Co-authored-by: Stephen Brown II <Stephen.Brown2@gmail.com>
Co-authored-by: Stephen Brown II <Stephen.Brown2@gmail.com>
tomchristie (Member, Author) commented Apr 23, 2021

Raw, as in "the raw bytes on the wire", or "the raw ingredients that make up the cake".
The raw representation of the host is the actual unaltered bytewise representation that's used to make the connection.

Or, in baking...

The raw ingredients: \xf0\x9f\x8e\x82
The cake: 🎂
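
The baking analogy can be checked directly. A minimal sketch, assuming UTF-8 as the encoding:

```python
# The raw ingredients: four UTF-8 bytes.
raw_ingredients = b"\xf0\x9f\x8e\x82"

# The cake: the single character those bytes decode to.
cake = raw_ingredients.decode("utf-8")
```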

Similar usage of "Raw" in other technical docs.

tomchristie (Member, Author)

Thanks so much for the review @StephenBrown2.
(Geez, me & my typos. 😬)

tomchristie merged commit 39d8ee6 into master Apr 23, 2021
tomchristie deleted the raw-host branch April 23, 2021 10:00
StephenBrown2 (Contributor)

> (Geez, me & my typos. 😬)

I think it was mainly just a copy-paste issue that got propagated, but no more normlized! :-p

tomchristie mentioned this pull request Apr 27, 2021