Introduce SeleniumBrowser #1733
base: main
Commits on Feb 19, 2024
-
Tests for the new Selenium WebDriver addition
Commit 23ee145
-
Adds `SeleniumBrowserWrapper`, `SeleniumBrowser`, and several required helper functions that are part of the upcoming `ContentCollector` class and the `WebCollectionAgent`.
Commit 2daec15
-
Provides `SeleniumBrowserWrapper` as an optional drop-in replacement for `SimpleTextBrowser`, for use cases such as pages that depend on JavaScript or that block calls from `requests`. Nearly all compatibility is preserved, with the exception of page numbering.
Commit 9efb297
-
ContentAgent: Custom LLM agent for collecting online content.
The `ContentAgent` class is a custom Autogen agent that collects and stores online content from web pages. It extends the `ConversableAgent` class and adds functionality for managing a list of additional links, storing collected content in local directories, and customizing request headers. `ContentAgent` uses a `deque` to manage the list of additional links for further exploration, with a maximum depth set by the `max_depth` parameter. The collected content is stored in local directories under the specified storage path (`storage_path`). `ContentAgent` can be customized at instantiation with the `request_kwargs` and `llm_config` parameters. A default User-Agent header is used for requests, but it can be overridden by providing a new dictionary of headers under `request_kwargs`.
Commit 217ed91
-
Very minor updates prior to submitting a PR
Commit 72a165a
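The link queue described for `ContentAgent` in commit 217ed91 (a `deque` of additional links bounded by `max_depth`) can be sketched roughly as below. This is an illustration under stated assumptions, not the code from the PR: `crawl`, `get_links`, and the in-memory `SITE` mapping are hypothetical names, and no network requests are made.

```python
from collections import deque

# Hypothetical in-memory "web": page URL -> links found on that page.
SITE = {
    "https://example.com": ["https://example.com/a", "https://example.com/b"],
    "https://example.com/a": ["https://example.com/c"],
    "https://example.com/b": [],
    "https://example.com/c": ["https://example.com/d"],
    "https://example.com/d": [],
}


def get_links(url):
    """Stand-in for fetching a page and extracting its links."""
    return SITE.get(url, [])


def crawl(start_url, max_depth=2):
    """Visit pages breadth-first, following links at most max_depth hops from start_url."""
    queue = deque([(start_url, 0)])  # (url, depth) pairs still to visit
    seen = {start_url}               # URLs already queued, to avoid revisits
    visited = []
    while queue:
        url, depth = queue.popleft()
        visited.append(url)
        if depth >= max_depth:
            continue  # depth limit reached: do not queue this page's links
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return visited
```

Popping from the left of the `deque` gives breadth-first order, so pages closer to the starting URL are collected before deeper ones.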
Commits on Feb 20, 2024
-
small fix in the `fix_missing_protocol` function
Commit 46b2424
-
Small addition to maintain a dictionary of processed HTML content, keyed by the source URL.
Commit d34ae1b
-
Commit 1ba9e05
-
Unit Tests for the ContentAgent
We cover a small sample of websites, asserting expectations against a number of measurements performed on the collected content. The assertions include, but are not limited to:
  - the expected variables contain values
  - the presence of the expected output files
  - that the expected output files are not empty
Further improvements can include:
  - evaluation against all choices of WebDriver to confirm functionality
  - evaluation against a larger sample of websites
Commit 84fa1b8
-
Note that `_set_page_content`, `_split_pages`, and `viewport` are likely not yet compatible with the Selenium browser wrapper class, but they do not appear to be necessary for it at this time.
Commit 67f95bf
-
Small updates to imports that were recently refactored to other locations. Specifically:
```
from ..agent import Agent
from .. import ConversableAgent, AssistantAgent, UserProxyAgent, GroupChatManager, GroupChat
from ...oai.client import OpenAIWrapper
```
Commit 08f8ff9
-
A small change that declares `self.browser_kwargs` before initializing the parent class (`ConversableAgent`). This avoids triggering an unexpected-argument error for `browser_kwargs`.
Commit 3954412
-
Fixing the following pre-commit errors:
autogen/agentchat/contrib/content_agent.py:21:1: E402 Module level import not at top of file
autogen/agentchat/contrib/content_agent.py:34:1: E402 Module level import not at top of file
autogen/agentchat/contrib/content_agent.py:65:33: F811 Redefinition of unused `deque` from line 6
autogen/agentchat/contrib/content_agent.py:374:26: F811 Redefinition of unused `filename` from line 7
Commit 749a556
-
Fixing the redundant import of selenium webdriver within `SeleniumBrowser`
Commit 818a010
-
Small corrections based on pre-commit errors, both resulting in removed code:
content_agent.py:94:9: F821 Undefined name `f`
content_agent.py:371:26: F811 Redefinition of unused `filename` from line 21
Commit 643bad0
-
Pre-commit fixes for:
autogen/browser_utils.py:455: argumnets ==> arguments
autogen/browser_utils.py:486: compatability ==> compatibility
Commit 20cd2a6
-
Still a bit new to the unit test framework and had to remove some conditional statements that are covered elsewhere
Commit 0389387
-
Updates to include selenium in websurfer extras, webdrivers in the python-package.yml workflow, and additional small fixes to bring the PR into compliance
Commit be89b9b
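The ordering fix in commit 3954412 (setting `self.browser_kwargs` before the parent initializer runs) follows a common pattern: take the subclass-only option out of the keyword arguments before forwarding the rest. A minimal sketch, assuming a parent that rejects unknown keywords; `BaseAgent` and `CollectorAgent` are hypothetical stand-ins, not the Autogen classes.

```python
class BaseAgent:
    """Stand-in for a parent class whose __init__ accepts only known keywords."""

    def __init__(self, name, llm_config=None):
        self.name = name
        self.llm_config = llm_config


class CollectorAgent(BaseAgent):
    """Hypothetical subclass that accepts an extra browser_kwargs option."""

    def __init__(self, name, **kwargs):
        # Remove the subclass-only option first, so it is never forwarded to
        # the parent, which would raise TypeError for an unexpected keyword.
        self.browser_kwargs = dict(kwargs.pop("browser_kwargs", None) or {})
        super().__init__(name, **kwargs)
```

Usage: `CollectorAgent("collector", browser_kwargs={"headless": True}, llm_config=False)` keeps the browser options on the instance and passes only `llm_config` to the parent.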
Commits on Feb 22, 2024
-
Commit 0a40763
-
Commit 25e15e0
-
Restored to original form in official main branch. Added for clarity. Updated to account for refactoring. All updates now stable and done. Inside the dev Docker container, all test files passed and all pre-commit checks passed.
Commit 5602958
-
Further cleaned the two test files and confirmed they passed using the dev docker and the pytest library
Commit 8954fef
-
Commit 0c2202c
-
Update contrib-tests.yml for Selenium
This update should get GitHub to use the WebSurfer extras when testing test_web_surfer_selenium.py.
Commit 13ba006
-
Adding coverage within the Websurfer workflow for this PR:
  - test/agentchat/contrib/test_web_surfer_selenium.py
  - test/agentchat/contrib/test_content_agent.py
Commit e1e81f6
-
Adding `test/agentchat/contrib/test_content_agent.py --skip-openai` under the assumption that all test files must be accounted for or they will rely on the default workflow. This test requires OpenAI calls, but it still needs to be registered in this file to avoid build errors.
Commit 0b5e733
-
removed duplicate entry for test_web_surfer_selenium.py
Commit 9099b57
-
Added the missing `pillow` dependency for graphics-based web browsing and downstream tasks
Commit 7443458
-
Moving the ContentAgent import to be conditional on "not skip_oai" in the hope that it helps avoid the `markdownify` import error during build tests.
Commit 1b87acd
-
Commit 11b00e5
-
Commit 66ac7bd
-
Commit 6fbe0b8
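Commit 1b87acd makes an import conditional on a skip flag so that a missing optional dependency (there, `markdownify`) is never touched when the tests are skipped. The general idea can be sketched as below; `load_unless_skipped` is a hypothetical helper written for illustration, and the PR itself may simply guard the import statement with an `if`.

```python
import importlib


def load_unless_skipped(module_name, skip):
    """Import module_name only when skip is False; otherwise return None.

    When skip is True the import is never attempted, so a missing optional
    dependency cannot raise ImportError at collection time.
    """
    if skip:
        return None
    return importlib.import_module(module_name)
```

With `skip=True` the call returns `None` even for a module that is not installed; with `skip=False` a missing module still raises `ModuleNotFoundError`, which keeps real failures visible.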
Commits on Feb 25, 2024
-
Commit 451405b
-
Provided a more descriptive name for the agent responsible for collecting web data. Added '_' to internal functions and docstrings to the web_archiver_agent.py file.
Commit c06f6fd
Commits on Mar 26, 2024
-
Commit ef7586e
-
change _set_page_content to set_page_content
Commit 2be44bc
-
Removing the exception messages related to Selenium
Commit e64ae32
-
Commit 3e7cf18
-
Commit 841ed31