Feature request: Convert to unescaped string #157

Open
Aran-Fey opened this issue Jul 30, 2022 · 2 comments

When URLs (or parts thereof) are converted to a string, they're always escaped:

>>> url = furl('foo.bar/fire truck?hello world=#hi there')
>>> str(url)
'foo.bar/fire%20truck?hello+world=#hi%20there'
>>> str(url.path)
'foo.bar/fire%20truck'
>>> str(url.query)
'hello+world='
>>> str(url.fragment)
'hi%20there'

It would be useful if there was a way to obtain unescaped strings:

>>> url.unescaped_str()
'foo.bar/fire truck?hello world=#hi there'
>>> url.path.unescaped_str()
'foo.bar/fire truck'
>>> url.query.unescaped_str()
'hello world='
>>> url.fragment.unescaped_str()
'hi there'

gruns (Owner) commented Aug 4, 2022

let's zoom out a bit so i understand the exact problem you're trying to solve! that way we can best solve it with furl :)

to start, what are you using these unescaped strings for?

Aran-Fey commented Aug 5, 2022

Hmm, that's a bit tough to explain. Essentially, my program is a web scraper. You give it a URL as input, and it scrapes that website. You can use the #fragment to narrow down what you want it to scrape. For example, if the URL is example.com#Hello World, it looks for a <h1>Hello World</h1> and only scrapes that section. So I need the text "Hello World", and not "Hello%20World".

To put it more generally: furl is designed to output URLs. You put (unescaped) text in, and you get a valid (escaped) URL as output. But you can't do the opposite, i.e. take a URL as input and parse/destructure it into (unescaped) information.
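
For illustration, here is a minimal sketch of the "URL in, unescaped parts out" direction using only Python's standard library (urllib.parse), not furl. The helper name unescaped_parts is hypothetical and is not part of furl's API. It also assumes that decoding the whole query string at once is acceptable, which becomes ambiguous if a key or value contains an escaped & or =.

```python
from urllib.parse import urlsplit, unquote, unquote_plus


def unescaped_parts(url: str) -> dict:
    """Split a URL and percent-decode its path, query, and fragment.

    Hypothetical helper for illustration only; not a furl method.
    """
    parts = urlsplit(url)
    return {
        # Path and fragment use plain percent-decoding (%20 -> space).
        "path": unquote(parts.path),
        # Query strings conventionally also encode spaces as '+'.
        "query": unquote_plus(parts.query),
        "fragment": unquote(parts.fragment),
    }


parts = unescaped_parts("foo.bar/fire%20truck?hello+world=#hi%20there")
print(parts["path"])      # foo.bar/fire truck
print(parts["query"])     # hello world=
print(parts["fragment"])  # hi there
```

For the scraper case described above, unescaped_parts("example.com#Hello%20World")["fragment"] yields "Hello World", which could then be matched against heading text.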
