lxml, google description, qwant and removing yandex. #7

fliot · 2023-02-04T22:04:09Z

Hi,
Nice project, happy to contribute.
Best regards.

EdmundMartin · 2023-02-04T22:09:16Z

Why did you remove Yandex?

EdmundMartin · 2023-02-04T22:17:49Z

searchit/scrapers/qwant.py

@@ -62,12 +63,29 @@ def _check_exceptions(self, res: ScrapeResponse) -> None:
    async def scrape(self, req: ScrapeRequest) -> List[SearchResult]:
        geo = req.geo if req.geo else "en_GB"
        urls = self._paginate(req.term, "", geo, req.count)
+        headers = {


These headers never actually get applied to the request: on line 83 they are overridden by the call to self.user_agent(), which is probably what should be providing the headers.
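A minimal sketch of what that could look like, assuming the scraper keeps a list of user agent strings and that `user_agent()` returns the full header dict passed to the request. The `HeaderProvider` class and its `extra_headers` argument are hypothetical names for illustration, not the project's actual API:

```python
import random
from typing import Dict, List, Optional


class HeaderProvider:
    """Hypothetical helper: builds request headers in a single place."""

    def __init__(
        self,
        user_agents: List[str],
        extra_headers: Optional[Dict[str, str]] = None,
    ) -> None:
        self._user_agents = list(user_agents)
        # Scraper-specific headers (e.g. the ones added for Qwant) live here,
        # so they cannot be silently replaced later in scrape().
        self._extra_headers = dict(extra_headers or {})

    def user_agent(self) -> Dict[str, str]:
        # Start from the extra headers, then set the User-Agent on top.
        headers = dict(self._extra_headers)
        headers["User-Agent"] = random.choice(self._user_agents)
        return headers
```

With this shape, `scrape()` only needs the one call to `user_agent()`, and the extra headers travel with it instead of being built separately and then dropped.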

EdmundMartin · 2023-02-04T22:20:11Z

setup.py

@@ -14,6 +14,7 @@
 install_requires = [
    'aiohttp>=3.6.2',
    'beautifulsoup4>=4.8.2',
+    'lxml'


This adds a dependency that is probably not strictly required to use the package, and its version is not pinned. It would probably be better to let the user choose the parser implementation and default to 'html.parser' when none is provided.
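As a rough sketch of that suggestion, the parser name could be resolved once and then passed to BeautifulSoup as its second argument. `resolve_parser` and `DEFAULT_PARSER` are hypothetical names, and the check for optional packages is an assumption about how a missing parser should be reported:

```python
from importlib.util import find_spec
from typing import Optional

# Built into the standard library, so it needs no extra dependency.
DEFAULT_PARSER = "html.parser"

# Parser names that rely on a third-party package, mapped to its module name.
_OPTIONAL_PARSERS = {"lxml": "lxml", "html5lib": "html5lib"}


def resolve_parser(parser: Optional[str] = None) -> str:
    """Return the parser name to hand to BeautifulSoup (hypothetical helper)."""
    if parser is None:
        return DEFAULT_PARSER
    module = _OPTIONAL_PARSERS.get(parser)
    if module is not None and find_spec(module) is None:
        # Fail early with a clear message instead of deep inside parsing.
        raise ValueError(f"parser {parser!r} requires the {module!r} package")
    return parser


# Intended use inside a scraper (not executed here):
#     soup = BeautifulSoup(html, resolve_parser(user_choice))
```

Users who want lxml would then install it themselves and pass `"lxml"`, while everyone else keeps working with the standard library parser. Declaring lxml as an optional extra in setup.py would be one way to keep it out of `install_requires`.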

kasnder · 2023-10-16T08:54:36Z

Yandex does not work for me

Francois Liot added 4 commits February 4, 2023 22:57

lxml deps

96cb35b

google description

40c5cb4

qwant update

a84406d

remove yandex & publish qwant

d588e8b

EdmundMartin reviewed Feb 4, 2023

qwant correction

9d7e8fe
