There was a previous discussion about this in one of the PRs.

I'm re-opening this for tracking, since this part of `w3lib.util.to_unicode` breaks: https://github.com/scrapy/w3lib/blob/master/w3lib/util.py#L46-L49
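For context, the check that fails is roughly along these lines (a simplified sketch of the linked `to_unicode` lines, not the verbatim w3lib implementation; the `ResponseUrl` class below is a stand-in for web_poet's str-like URL wrapper):

```python
# Simplified sketch of w3lib.util.to_unicode's type check.
def to_unicode(text, encoding=None, errors="strict"):
    if isinstance(text, str):
        return text
    if not isinstance(text, bytes):
        # Anything that is neither bytes nor str is rejected outright,
        # even if it is str-like (implements __str__).
        raise TypeError(
            f"to_unicode must receive bytes or str, got {type(text).__name__}"
        )
    return text.decode(encoding or "utf-8", errors)

# Stand-in for web_poet's URL wrapper: str-like, but not a str subclass.
class ResponseUrl:
    def __init__(self, url):
        self._url = url

    def __str__(self):
        return self._url

try:
    to_unicode(ResponseUrl("https://example.com"))
except TypeError as e:
    print(e)  # to_unicode must receive bytes or str, got ResponseUrl
```

Because the wrapper is not a `str` subclass, `isinstance(text, str)` is false and the `TypeError` fires, which is exactly what the traceback below shows.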
In particular, doing something like:

```python
from scrapy.linkextractors import LinkExtractor

link_extractor = LinkExtractor()
link_extractor.extract_links(response)
```

where `response` is a `web_poet.page_inputs.http.HttpResponse` instance and not `scrapy.http.Response`.
The full stacktrace would be:

```
File "/usr/local/lib/python3.10/site-packages/scrapy/linkextractors/lxmlhtml.py", line 239, in extract_links
  base_url = get_base_url(response)
File "/usr/local/lib/python3.10/site-packages/scrapy/utils/response.py", line 27, in get_base_url
  _baseurl_cache[response] = html.get_base_url(
File "/usr/local/lib/python3.10/site-packages/w3lib/html.py", line 323, in get_base_url
  return safe_url_string(baseurl)
File "/usr/local/lib/python3.10/site-packages/w3lib/url.py", line 141, in safe_url_string
  decoded = to_unicode(url, encoding=encoding, errors="percentencode")
File "/usr/local/lib/python3.10/site-packages/w3lib/util.py", line 47, in to_unicode
  raise TypeError(
TypeError: to_unicode must receive bytes or str, got ResponseUrl
```
An alternative would be to adjust the Scrapy code instead, casting with `str(response.url)` at every use.
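A minimal sketch of that alternative, using hypothetical names (`coerce_url`, `ResponseUrl`, `FakeResponse` are illustrations, not Scrapy or web_poet code): any call site that hands the URL to w3lib would cast it first, which works for plain strings and str-like wrappers alike.

```python
# Stand-in for web_poet's str-like URL wrapper (hypothetical).
class ResponseUrl:
    def __init__(self, url):
        self._url = url

    def __str__(self):
        return self._url


def coerce_url(response):
    # str() is a no-op for plain str URLs and calls __str__ on wrappers,
    # so the result always satisfies w3lib's bytes-or-str check.
    return str(response.url)


class FakeResponse:  # hypothetical minimal response-like object
    url = ResponseUrl("https://example.com/page")


print(coerce_url(FakeResponse()))  # https://example.com/page
```

The cost of this approach is that every such call site in Scrapy needs the cast, whereas making the URL type a `str` subclass would fix them all at once.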