Creates a submodule for contributed functions #1572

Closed
wants to merge 18 commits into from

Collaborator

@gagb gagb commented Feb 7, 2024

Why are these changes needed?

Currently AutoGen supports contributed agents and capabilities.
Why not support a set of contributed functions?

This PR creates a new submodule in agentchat/contrib called functions.

The submodule contains many stateless functions that can be used with the function-calling API.

See the added notebook to see those functions in use.

Specifying dependencies via a FunctionWithRequirements decorator

Right now, I specify dependencies for each function via a decorator.
The decorator wraps the function and checks whether its dependencies are available.
If they are not, it installs them. For example,

@FunctionWithRequirements(python_packages=["foo", "bar==0.1.0"], secrets=["BETA"])
def gamma():
    # logic that requires foo, bar, and BETA
    ...
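
As a rough illustration of the check described above, here is a minimal sketch that uses a hypothetical `requires` decorator; it is not the PR's `FunctionWithRequirements` implementation. It assumes each package name matches its import name, strips version specifiers such as `==0.1.0` without verifying them, and reports missing requirements instead of installing anything.

```python
import functools
import importlib.util
import os
import re
from typing import Any, Callable, List


def requires(python_packages: List[str], secrets: List[str]) -> Callable:
    """Hypothetical decorator: verify packages and env vars before each call."""

    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            missing = []
            for spec in python_packages:
                # Drop any version specifier ("bar==0.1.0" -> "bar").
                name = re.split(r"[<>=!~ ]", spec, maxsplit=1)[0]
                if importlib.util.find_spec(name) is None:
                    missing.append(f"package {name}")
            for secret in secrets:
                if secret not in os.environ:
                    missing.append(f"secret {secret}")
            if missing:
                raise RuntimeError("Missing requirements: " + ", ".join(missing))
            return func(*args, **kwargs)

        return wrapper

    return decorator


@requires(python_packages=["json"], secrets=[])
def dumps_sorted(data: dict) -> str:
    import json

    return json.dumps(data, sort_keys=True)
```

Raising an error keeps the sketch free of side effects; a variant that installs packages would replace the `raise` with an installation step.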

Pending

  • Automate specification of description from doc string [done]
  • Bulk registration [done]
  • Specifying secrets [done]
  • Add warnings for other dependencies?

Limitations

  • Doesn't support stateful tools

Related issue number

#1563

Checks

@codecov-commenter

codecov-commenter commented Feb 7, 2024

Codecov Report

Attention: 152 lines in your changes are missing coverage. Please review.

Comparison is base (5d81ed4) 35.03% compared to head (084ab2e) 48.32%.
Report is 2 commits behind head on main.

Files                                                     Patch %   Lines
autogen/agentchat/contrib/functions/file_utils.py         0.00%     95 Missing ⚠️
...gen/agentchat/contrib/functions/functions_utils.py     0.00%     46 Missing ⚠️
...togen/agentchat/contrib/functions/youtube_utils.py     0.00%     11 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##             main    #1572       +/-   ##
===========================================
+ Coverage   35.03%   48.32%   +13.28%     
===========================================
  Files          44       53        +9     
  Lines        5383     5838      +455     
  Branches     1247     1429      +182     
===========================================
+ Hits         1886     2821      +935     
+ Misses       3342     2792      -550     
- Partials      155      225       +70     
Flag Coverage Δ
unittests 48.28% <0.00%> (+13.25%) ⬆️


@gagb gagb changed the title from "Contrib functions" to "Creates contrib.function_store" Feb 7, 2024
@gagb gagb changed the title from "Creates contrib.function_store" to "Creates a submodules for contributed functions" Feb 7, 2024
@gagb gagb requested a review from afourney February 7, 2024 01:25
@gagb gagb changed the title from "Creates a submodules for contributed functions" to "Creates a submodule for contributed functions" Feb 7, 2024
@gagb gagb requested a review from ekzhu February 7, 2024 01:27
@ekzhu ekzhu requested a review from davorrunje February 7, 2024 03:38
Collaborator

@ekzhu ekzhu left a comment

Just one early comment: functions instead of function_store? People typically associate the word store with either a data store or an app store.

import requests
from pdfminer.high_level import PDFResourceManager, PDFPageInterpreter
from pdfminer.converter import TextConverter
from pdfminer.pdfpage import PDFPage
Collaborator

Maybe use a function generator so the import statements are only executed once in the wrapper.

def wrapper():
    import ...

    def _actual_func():
        ...

    return _actual_func

Member

Yeah, that's a great idea.

Collaborator

To put it more clearly,

@require("...")
def function_generator() -> Callable:
    import ...
    import ...

    def actual_function():
        # implementation
        ...

    return actual_function

The function generator returns a callable, which is the actual function that the user is going to use.
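
A runnable sketch of this generator pattern, under stated assumptions: `make_word_counter` and `count_words` are hypothetical names, the standard-library `re` module stands in for a slow third-party import, and `functools.lru_cache` is one possible way to make sure the generator body (and so its imports) runs only once.

```python
import functools
from typing import Callable


@functools.lru_cache(maxsize=None)
def make_word_counter() -> Callable[[str], int]:
    """Hypothetical generator: imports run once, when the function is built."""
    import re  # stands in for a slow third-party import

    pattern = re.compile(r"\w+")

    def count_words(text: str) -> int:
        return len(pattern.findall(text))

    return count_words


# Building the function pays the import cost; later calls reuse the closure.
count_words = make_word_counter()
```

Python also caches imported modules in `sys.modules`, so the main effect of the generator is to move the first import to build time rather than to the first call.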

Maybe there is a more elegant way to do this. @davorrunje would know.

Collaborator

But I think just importing every time the function is executed is fine too -- much easier.

Member

It depends on the package. Some take a while to import. It slows things down.

Collaborator

How about this?

@require("...")
def function_generator() -> Tuple[Callable[[], None], Callable[..., Any]]:
    def _import():
        import ...
        import ...

    def _function(*args, **kwargs):
        # implementation
        ...

    return (_import, _function)

This way the executor can choose an import strategy by calling _import() when needed.
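
A minimal runnable sketch of the pair-returning generator, assuming the standard-library `statistics` module as a stand-in for a heavy dependency. The names `preload` and `mean` are hypothetical. Because names imported inside `_import()` are local to it, the sketch only warms the module cache there, and `_function` re-imports, which is cheap once the module is cached.

```python
import importlib
from typing import Any, Callable, Tuple


def function_generator() -> Tuple[Callable[[], None], Callable[..., Any]]:
    """Hypothetical generator returning an (import hook, function) pair."""

    def _import() -> None:
        # Warm the module cache; "statistics" stands in for a heavy dependency.
        importlib.import_module("statistics")

    def _function(*values: float) -> float:
        # Cheap if _import() already ran, because the module is cached.
        import statistics

        return statistics.mean(values)

    return _import, _function


preload, mean = function_generator()
preload()  # an executor could call this eagerly, lazily, or not at all
```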

Another issue is that Python packages are only part of the dependencies. Many Python libraries depend on particular C libraries being installed. We could additionally add a base Docker dependency to make sure everything needed is installed.

@jackgerrits
Member

Bundling the requirements with the function is a super good idea. However, if these functions weren't executed as "tools" but were shipped into the code execution environment, then this mode of dependency installation would not work, right? If the execute_function call could inspect whether dependencies were needed and satisfy them explicitly, I feel like it would be more flexible, because the same could then be done when the function is used like a skill in another code execution environment.
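
One way to read this suggestion, sketched under assumptions: the decorator only records requirements as metadata, and the executor inspects them before calling. All names here (`with_requirements`, `missing_packages`, `execute_with_requirements`) are hypothetical and are not AutoGen APIs; package names are assumed to match import names.

```python
import importlib.util
from typing import Any, Callable, Dict, List


def with_requirements(python_packages: List[str]) -> Callable:
    """Hypothetical decorator that only records requirements as metadata."""

    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        func.python_packages = list(python_packages)
        return func

    return decorator


def missing_packages(func: Callable[..., Any]) -> List[str]:
    """Return the declared packages that cannot be found in this environment."""
    declared = getattr(func, "python_packages", [])
    return [name for name in declared if importlib.util.find_spec(name) is None]


def execute_with_requirements(func: Callable[..., Any], arguments: Dict[str, Any]) -> Any:
    """Hypothetical executor: checks requirements explicitly, then calls."""
    missing = missing_packages(func)
    if missing:
        # A real executor might install these, or prepare a sandbox instead.
        raise RuntimeError(f"Missing packages: {missing}")
    return func(**arguments)


@with_requirements(python_packages=["math"])
def hypotenuse(a: float, b: float) -> float:
    import math

    return math.hypot(a, b)
```

Since the function itself stays unwrapped, the same metadata could be read by a different execution environment before the source is shipped there.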

@afourney
Member

afourney commented Feb 7, 2024

The @ requires decorator is super interesting.

I note that we are in the same situation for many contrib agents that are now added as optional dependencies (RAG, Teachability, WebSurfer, etc.). I wonder if we should consolidate on one mechanism here?

@gagb
Collaborator Author

gagb commented Feb 7, 2024

Bundling the requirements with the function is a super good idea. However, if these functions weren't executed as "tools" but were shipped into the code execution environment, then this mode of dependency installation would not work, right? If the execute_function call could inspect whether dependencies were needed and satisfy them explicitly, I feel like it would be more flexible, because the same could then be done when the function is used like a skill in another code execution environment.

If you remove requires, the agent will install the package after it hits the missing-package error. I have some ideas to improve this. Will report back.

from youtube_transcript_api import YouTubeTranscriptApi

# Extract video ID from the YouTube link
video_id = youtube_link.split("v=")[1]
Member

What if there are other querystring parameters like timecodes?

Maybe:

    from urllib.parse import urlparse, parse_qs
    parsed_url = urlparse(youtube_link)
    qs_params = parse_qs(parsed_url.query)
    video_id = qs_params['v'][0]
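
The suggestion above can be wrapped into a small self-contained helper. This is a sketch: `extract_video_id` is a hypothetical name, and it only handles links that carry the ID in a `v` query parameter (short `youtu.be` links, for instance, are not covered).

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_video_id(youtube_link: str) -> Optional[str]:
    """Return the `v` query parameter, or None if the link has none."""
    query = parse_qs(urlparse(youtube_link).query)
    values = query.get("v")
    return values[0] if values else None
```

For a link such as `https://www.youtube.com/watch?v=abc123&t=42s`, splitting on `"v="` would yield `abc123&t=42s`, whereas parsing the query string isolates `abc123` regardless of parameter order.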


for page in PDFPage.get_pages(file):
    interpreter.process_page(page)

Member

Why the per-page approach rather than:

pdfminer.high_level.extract_text(file)

@jackgerrits
Member

Bundling the requirements with the function is a super good idea. However, if these functions weren't executed as "tools" but were shipped into the code execution environment, then this mode of dependency installation would not work, right? If the execute_function call could inspect whether dependencies were needed and satisfy them explicitly, I feel like it would be more flexible, because the same could then be done when the function is used like a skill in another code execution environment.

If you remove requires, the agent will install the package after it hits the missing-package error. I have some ideas to improve this. Will report back.

I'm totally happy to move forward with this. I just wanted to make sure that point was raised, and we can circle back to resolving it later.



@runtime_checkable
class UserDefinedFunction(Protocol):
Member

The protocol should not actually provide an implementation

Comment on lines +20 to +24
name: str
docstring: str
code: str
python_packages: List[str]
env_vars: List[str]
Member

@jackgerrits jackgerrits Feb 29, 2024

The protocol should not mandate these attributes

python_packages: List[str]
env_vars: List[str]

def name(self) -> str:
Member

These should be properties instead of methods
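
A minimal sketch of what a protocol following these review comments might look like: no implementation in the protocol body, and read-only properties rather than methods. Which members belong in the protocol is an assumption here (`name` and `docstring` are borrowed from the snippet under review), and `EchoFunction` is a hypothetical implementation.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class UserDefinedFunction(Protocol):
    """Structural interface only: no implementation, no stored attributes."""

    @property
    def name(self) -> str: ...

    @property
    def docstring(self) -> str: ...


class EchoFunction:
    """Hypothetical implementation; it does not need to inherit the protocol."""

    @property
    def name(self) -> str:
        return "echo"

    @property
    def docstring(self) -> str:
        return "Return the input unchanged."
```

With `@runtime_checkable`, `isinstance(EchoFunction(), UserDefinedFunction)` checks only that the members exist, not their types.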

@afourney afourney mentioned this pull request Mar 14, 2024
@gagb
Collaborator Author

gagb commented Mar 25, 2024

@jackgerrits check if you need this for UDFs and take appropriate actions -- add to roadmap/close.

gitguardian bot commented Jul 20, 2024

⚠️ GitGuardian has uncovered 96 secrets following the scan of your pull request.

Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.

Since your pull request originates from a forked repository, GitGuardian is not able to associate the secrets uncovered with secret incidents on your GitGuardian dashboard.
Skipping this check run and merging your pull request will create secret incidents on your GitGuardian dashboard.

🔎 Detected hardcoded secrets in your pull request
GitGuardian id GitGuardian status Secret Commit Filename
12853598 Triggered Generic High Entropy Secret 79dbb7b test/oai/test_utils.py View secret
10404693 Triggered Generic High Entropy Secret e43a86c test/oai/test_utils.py View secret
10404693 Triggered Generic High Entropy Secret bdb40d7 test/oai/test_utils.py View secret
10404693 Triggered Generic High Entropy Secret 954ca45 test/oai/test_utils.py View secret
10404693 Triggered Generic High Entropy Secret 79dbb7b test/oai/test_utils.py View secret
10404662 Triggered Generic CLI Secret eff19ac .github/workflows/dotnet-release.yml View secret
10404662 Triggered Generic CLI Secret 06a0a5d .github/workflows/dotnet-release.yml View secret
10404662 Triggered Generic CLI Secret 0524c77 .github/workflows/dotnet-release.yml View secret
10404662 Triggered Generic CLI Secret d7ea410 .github/workflows/dotnet-release.yml View secret
10404662 Triggered Generic CLI Secret e43a86c .github/workflows/dotnet-build.yml View secret
10404662 Triggered Generic CLI Secret 841ed31 .github/workflows/dotnet-release.yml View secret
10404662 Triggered Generic CLI Secret 802f099 .github/workflows/dotnet-release.yml View secret
10404662 Triggered Generic CLI Secret 9a484d8 .github/workflows/dotnet-build.yml View secret
10404662 Triggered Generic CLI Secret e973ac3 .github/workflows/dotnet-release.yml View secret
10404662 Triggered Generic CLI Secret 89650e7 .github/workflows/dotnet-release.yml View secret
10404662 Triggered Generic CLI Secret e07b06b .github/workflows/dotnet-release.yml View secret
10404662 Triggered Generic CLI Secret abe4c41 .github/workflows/dotnet-build.yml View secret
10404662 Triggered Generic CLI Secret 7362fb9 .github/workflows/dotnet-release.yml View secret
12853599 Triggered Generic High Entropy Secret 79dbb7b test/oai/test_utils.py View secret
10404694 Triggered Generic High Entropy Secret e43a86c test/oai/test_utils.py View secret
10404694 Triggered Generic High Entropy Secret 954ca45 test/oai/test_utils.py View secret
10404694 Triggered Generic High Entropy Secret bdb40d7 test/oai/test_utils.py View secret
10404695 Triggered Generic High Entropy Secret abad9ff test/oai/test_utils.py View secret
10404695 Triggered Generic High Entropy Secret 954ca45 test/oai/test_utils.py View secret
10404695 Triggered Generic High Entropy Secret c7bb588 test/oai/test_utils.py View secret
10404695 Triggered Generic High Entropy Secret b97b99d test/oai/test_utils.py View secret
10404695 Triggered Generic High Entropy Secret e43a86c test/oai/test_utils.py View secret
12853600 Triggered Generic High Entropy Secret 79dbb7b test/oai/test_utils.py View secret
12853601 Triggered Generic High Entropy Secret 79dbb7b test/oai/test_utils.py View secret
10493810 Triggered Generic Password 49e8053 notebook/agentchat_pgvector_RetrieveChat.ipynb View secret
10493810 Triggered Generic Password 501610b notebook/agentchat_pgvector_RetrieveChat.ipynb View secret
10493810 Triggered Generic Password 49e8053 notebook/agentchat_pgvector_RetrieveChat.ipynb View secret
10493810 Triggered Generic Password 501610b notebook/agentchat_pgvector_RetrieveChat.ipynb View secret
10493810 Triggered Generic Password d422c63 notebook/agentchat_pgvector_RetrieveChat.ipynb View secret
10493810 Triggered Generic Password 97fa339 notebook/agentchat_pgvector_RetrieveChat.ipynb View secret
10493810 Triggered Generic Password 49e8053 notebook/agentchat_pgvector_RetrieveChat.ipynb View secret
10493810 Triggered Generic Password d422c63 notebook/agentchat_pgvector_RetrieveChat.ipynb View secret
10493810 Triggered Generic Password 97fa339 notebook/agentchat_pgvector_RetrieveChat.ipynb View secret
10493810 Triggered Generic Password d422c63 notebook/agentchat_pgvector_RetrieveChat.ipynb View secret
10493810 Triggered Generic Password 97fa339 notebook/agentchat_pgvector_RetrieveChat.ipynb View secret
10493810 Triggered Generic Password 501610b notebook/agentchat_pgvector_RetrieveChat.ipynb View secret
10404696 Triggered Generic High Entropy Secret 954ca45 test/oai/test_utils.py View secret
10404696 Triggered Generic High Entropy Secret bdb40d7 test/oai/test_utils.py View secret
10404696 Triggered Generic High Entropy Secret 79dbb7b test/oai/test_utils.py View secret
10404696 Triggered Generic High Entropy Secret e43a86c test/oai/test_utils.py View secret
10422482 Triggered Generic High Entropy Secret 79dbb7b test/oai/test_utils.py View secret
10422482 Triggered Generic High Entropy Secret bdb40d7 test/oai/test_utils.py View secret
12853602 Triggered Generic High Entropy Secret 79dbb7b test/oai/test_utils.py View secret
11616921 Triggered Generic High Entropy Secret a86d0fd notebook/agentchat_agentops.ipynb View secret
11616921 Triggered Generic High Entropy Secret 394561b notebook/agentchat_agentops.ipynb View secret
11616921 Triggered Generic High Entropy Secret 3eac646 notebook/agentchat_agentops.ipynb View secret
11616921 Triggered Generic High Entropy Secret f45b553 notebook/agentchat_agentops.ipynb View secret
11616921 Triggered Generic High Entropy Secret 6563248 notebook/agentchat_agentops.ipynb View secret
12853598 Triggered Generic High Entropy Secret 2b3a9ae test/oai/test_utils.py View secret
12853598 Triggered Generic High Entropy Secret c03558f test/oai/test_utils.py View secret
10404693 Triggered Generic High Entropy Secret c03558f test/oai/test_utils.py View secret
10404693 Triggered Generic High Entropy Secret 2b3a9ae test/oai/test_utils.py View secret
10404693 Triggered Generic High Entropy Secret 0a3c6c4 test/oai/test_utils.py View secret
10404693 Triggered Generic High Entropy Secret 76f5f5a test/oai/test_utils.py View secret
10404662 Triggered Generic CLI Secret 954ca45 .github/workflows/dotnet-build.yml View secret
12853599 Triggered Generic High Entropy Secret 2b3a9ae test/oai/test_utils.py View secret
12853599 Triggered Generic High Entropy Secret c03558f test/oai/test_utils.py View secret
10404694 Triggered Generic High Entropy Secret 76f5f5a test/oai/test_utils.py View secret
10404694 Triggered Generic High Entropy Secret 0a3c6c4 test/oai/test_utils.py View secret
10404695 Triggered Generic High Entropy Secret 3b79cc6 test/oai/test_utils.py View secret
10404695 Triggered Generic High Entropy Secret 11baa52 test/oai/test_utils.py View secret
12853600 Triggered Generic High Entropy Secret c03558f test/oai/test_utils.py View secret
12853600 Triggered Generic High Entropy Secret 2b3a9ae test/oai/test_utils.py View secret
12853601 Triggered Generic High Entropy Secret c03558f test/oai/test_utils.py View secret
12853601 Triggered Generic High Entropy Secret 2b3a9ae test/oai/test_utils.py View secret
10493810 Triggered Generic Password 3b79cc6 notebook/agentchat_pgvector_RetrieveChat.ipynb View secret
10493810 Triggered Generic Password 11baa52 notebook/agentchat_pgvector_RetrieveChat.ipynb View secret
10493810 Triggered Generic Password 11baa52 notebook/agentchat_pgvector_RetrieveChat.ipynb View secret
10493810 Triggered Generic Password 3b79cc6 notebook/agentchat_pgvector_RetrieveChat.ipynb View secret
10404696 Triggered Generic High Entropy Secret 0a3c6c4 test/oai/test_utils.py View secret
10404696 Triggered Generic High Entropy Secret 76f5f5a test/oai/test_utils.py View secret
10404696 Triggered Generic High Entropy Secret c03558f test/oai/test_utils.py View secret
10404696 Triggered Generic High Entropy Secret 2b3a9ae test/oai/test_utils.py View secret
10422482 Triggered Generic High Entropy Secret 2b3a9ae test/oai/test_utils.py View secret
10422482 Triggered Generic High Entropy Secret c03558f test/oai/test_utils.py View secret

and 16 others.

🛠 Guidelines to remediate hardcoded secrets
  1. Understand the implications of revoking this secret by investigating where it is used in your code.
  2. Replace and store your secrets safely. Learn here the best practices.
  3. Revoke and rotate these secrets.
  4. If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.


@gagb gagb closed this Aug 26, 2024
Labels
enhancement New feature or request function/tool suggestion and execution of function/tool call
6 participants