WebSurfer Updated (Selenium, Playwright, and support for many filetypes) #1929
* Add headless browser to the WebSurferAgent, closes #1481
* replace soup.get_text() with markdownify.MarkdownConverter().convert_soup(soup)
* import HeadlessChromeBrowser
* implicitly wait for 10s
* increase max. wait time to 99s
* fix: trim trailing whitespace
* test: fix headless tests
* better bing query search
* docs: add example 3 for headless option
---------
Co-authored-by: Vijay Ramesh <vijay@regrello.com>
* Based browser on mdconvert.
* Updated web_surfer.
* Renamed HeadlessChromeBrowser to SeleniumChromeBrowser
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@             Coverage Diff             @@
##             main    #1929       +/-   ##
===========================================
+ Coverage   37.94%   50.75%   +12.80%
===========================================
  Files          77       83        +6
  Lines        7784     8776      +992
  Branches     1667     2040      +373
===========================================
+ Hits         2954     4454     +1500
+ Misses       4580     3946      -634
- Partials      250      376      +126
@signalprime @vijaykramesh @INF800 With this PR, I tried to combine your Selenium browser PRs in one place. Even if it doesn't show in the commit history, I used and learned a lot from each of your contributions, and I welcome your further comments and contributions here. Once this is ready, the final PR will credit each of you, and we can perhaps co-author a blog post.
Further, I believe @INF800's and @vijaykramesh's PRs used Selenium to call Bing search, which is clever in that it simplifies the requirements for getting up and running (you don't need to register for an API key). However, I opted to leave this out in favor of the API because it is a better fit for our automated use. Bing actively discourages scraping, and supporting that approach long term would involve actively evading bot detection. I am open to adding further modularity and configurability to support other search engines that don't require an API key, perhaps DuckDuckGo, ArXiv, etc.
Why are these changes needed?
This PR adds Selenium and Playwright variants of the Markdown Web Browser used by WebSurfer. It also adds support for many additional content types and for alternate search engines.
All MarkdownBrowser variants work via the following principle:
1. Fetch a page.
2. Convert it to Markdown.
3. Operate on the Markdown.
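The three steps above can be sketched in a few lines of standard-library Python. This is only an illustration of the principle, not the code in this PR: the names `fetch`, `to_markdown`, and `find_lines` are hypothetical, and the tiny converter below handles only headings, paragraphs, and links, whereas the PR relies on much richer conversion (markdownify / mdconvert).

```python
import urllib.request
from html.parser import HTMLParser


class _MarkdownBuilder(HTMLParser):
    """Very small HTML-to-Markdown converter (headings, paragraphs, links only)."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._href = None
        self._skip = 0  # nesting depth inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
        elif tag in ("h1", "h2", "h3"):
            self.parts.append("\n\n" + "#" * int(tag[1]) + " ")
        elif tag == "p":
            self.parts.append("\n\n")
        elif tag == "a":
            self._href = dict(attrs).get("href")
            self.parts.append("[")

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self._skip = max(0, self._skip - 1)
        elif tag == "a":
            self.parts.append(f"]({self._href or ''})")
            self._href = None

    def handle_data(self, data):
        if not self._skip:  # drop script and style bodies
            self.parts.append(data)


def fetch(url: str) -> str:
    """Step 1: fetch a page (http://, https://, or file:// URLs)."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


def to_markdown(html: str) -> str:
    """Step 2: convert the HTML to Markdown."""
    builder = _MarkdownBuilder()
    builder.feed(html)
    builder.close()
    return "".join(builder.parts).strip()


def find_lines(markdown: str, needle: str) -> list[str]:
    """Step 3: operate on the Markdown, here a case-insensitive line search."""
    return [ln for ln in markdown.splitlines() if needle.lower() in ln.lower()]
```

Because everything after step 2 works on plain Markdown text, the same search, paging, or summarization logic applies regardless of whether the page came from a plain HTTP request, Selenium, or Playwright.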
Such browsers are simple and suitable for read-only agentic use -- they cannot be used to interact with complex web applications. Nevertheless, they are a great stopgap, and super useful when browsing local files (file:///user/afourney/repos/autogen) etc., because they can handle many different file formats (Office docs, PDFs, etc.) and provide a common interface for Q&A, summarization, passage extraction, etc.
Instructions
When installing AutoGen, use the [websurfer] optional dependencies.
If using Selenium, you must also run:
pip install selenium
If using Playwright, you must run both:
pip install playwright
playwright install --with-deps chromium
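To check which of the optional browser backends are importable in the current environment, something like the following may help. It is a hedged sketch: `installed_backends` is a hypothetical helper, not part of AutoGen, and it only checks that the Python packages can be found. It cannot tell whether `playwright install --with-deps chromium` has been run or whether a Chrome driver is available for Selenium.

```python
from importlib.util import find_spec

# Top-level package names installed by `pip install selenium` / `pip install playwright`.
OPTIONAL_BACKENDS = ("selenium", "playwright")


def installed_backends() -> dict[str, bool]:
    """Return a mapping of backend package name -> whether it can be imported."""
    return {name: find_spec(name) is not None for name in OPTIONAL_BACKENDS}


if __name__ == "__main__":
    for name, ok in installed_backends().items():
        print(f"{name}: {'installed' if ok else 'missing'}")
```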
Related issue number
#1481, #1534, #1733, #1832