[MRG] added the summary agent #147

huangyz0918 · 2024-08-28T18:10:28Z

User description

We need to update the Github integration function as well, I will do this in the same PR

PR Type

enhancement

Description

Introduced a new SummaryAgent class in mle/agents/summarizer.py to summarize GitHub projects, including project information, structure, issues, and source code.
Implemented a summary method to interact with a model and return project summaries in JSON format.
Added comments in mle/function/__init__.py for a new GitHub integration function schema, preparing for future enhancements.

Changes walkthrough 📝

Relevant files

Enhancement

summarizer.py Added `SummaryAgent` class for GitHub project summarization mle/agents/summarizer.py Introduced a new `SummaryAgent` class for summarizing GitHub projects. Implemented initialization with model, GitHub repository, and console. Added system prompt and JSON output format for project summarization. Developed a `summary` method to handle model queries and return summaries.	+84/-0
__init__.py `Added comments for GitHub integration function schema` mle/function/init.py Added comments for a new GitHub integration function schema. Prepared for potential future implementation of project structure reading.	+16/-0

💡 PR-Agent usage:
Comment /help on the PR to get a list of all available PR-Agent tools and their descriptions

github-actions · 2024-08-28T18:11:22Z

PR Reviewer Guide 🔍

⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Key issues to review Code Clarity The `SummaryAgent` class initialization method contains a large block of text for `sys_prompt` and `json_mode_prompt` which reduces code readability and maintainability. Consider moving these large text blocks to external files or resources. Exception Handling The `summary` method lacks exception handling when parsing JSON, which could lead to unhandled exceptions if the JSON is malformed.

github-actions · 2024-08-28T18:11:24Z

PR Code Suggestions ✨

Category	Suggestion	Score
Possible bug	Add error handling for JSON parsing to improve robustness Use a more structured approach to handle JSON parsing to avoid potential crashes from malformed JSON strings. Implement error handling around `json.loads` to catch and handle `JSONDecodeError`. mle/agents/summarizer.py [82] -summary = json.loads(text) +try: + summary = json.loads(text) +except json.JSONDecodeError: + self.console.log("Failed to parse JSON response.") + summary = {} Suggestion importance[1-10]: 9 Why: Adding error handling for JSON parsing is crucial for robustness, as it prevents crashes due to malformed JSON, which is a significant improvement.	9
Possible issue	Handle cases of empty user content more explicitly to avoid issues in processing Instead of appending empty content directly to `chat_history`, provide a meaningful default message or handle this scenario more gracefully to avoid potential issues with processing empty strings. mle/agents/summarizer.py [73] -self.chat_history.append({"role": "user", "content": ""}) # TODO +# If user content is expected to be empty sometimes, handle it explicitly +if not some_user_input: + self.chat_history.append({"role": "user", "content": "No input provided"}) +else: + self.chat_history.append({"role": "user", "content": some_user_input}) Suggestion importance[1-10]: 8 Why: Handling empty user content explicitly can prevent potential issues during processing, making this a valuable improvement for robustness.	8
Best practice	Improve flexibility and testability by using dependency injection for the console Replace the direct instantiation of `Console` inside the `SummaryAgent` constructor with dependency injection. This change allows for better testing and flexibility, as it enables the use of different console implementations or mocks for testing. mle/agents/summarizer.py [24-25] -if not self.console: - self.console = Console() +# Assuming console is injected, no need to check or create a new instance here. Suggestion importance[1-10]: 7 Why: The suggestion to use dependency injection for the console improves testability and flexibility, but the existing code already allows for a console to be passed in, so the improvement is minor.	7
Maintainability	Improve code readability and maintainability by externalizing long string assignments Refactor the large string assignments for `sys_prompt` and `json_mode_prompt` into separate methods or external files for better readability and maintainability. mle/agents/summarizer.py [26-57] -self.sys_prompt = """ - You are a software expert tasked with summarizing the Github project provided by the user. The project may - contain the dataset, the source code, and the documentation, etc. - ... -self.json_mode_prompt = """ - JSON Output Format: - ... +self.sys_prompt = self.load_sys_prompt() +self.json_mode_prompt = self.load_json_mode_prompt() Suggestion importance[1-10]: 6 Why: Refactoring long string assignments into separate methods or files enhances readability and maintainability, but it is not a critical change.	6

huangyz0918 · 2024-09-02T19:17:59Z

How to test this PR? @leeeizhang @HuaizhengZhang

run this command under an MLE-Agent project:

mle start report

And you will be requiring to enter a GitHub repo name, please make sure you have set the github token before running:

export GITHUB_TOKEN=xxx

HuaizhengZhang · 2024-09-03T01:57:15Z

why does a public repo require a token to analyze?

HuaizhengZhang

current workflow is weired

I need to mle new first and then I can call mle start report for a non-relevant github repo in that project

HuaizhengZhang · 2024-09-03T02:04:42Z

I prefer a mle report

if users run mle report under our project without any github links, we generate reports for that project locally.
if users run mle report github-url, we create a project called mle-report-github-repo-name and put report in that project

HuaizhengZhang

we can merge and then improve next PR

leeeizhang · 2024-09-03T02:10:49Z

I prefer a mle report

if users run mle report under our project without any github links, we generate reports for that project locally.

if users run mle report github-url, we create a project called mle-report-github-repo-name and put report in that project

awesome!! i will refactor the report workflow today!

huangyz0918 · 2024-09-03T02:15:39Z

I prefer a mle report

if users run mle report under our project without any github links, we generate reports for that project locally.

if users run mle report github-url, we create a project called mle-report-github-repo-name and put report in that project

Agreed, @leeeizhang please file a PR to fix this, thank you!

✨ feat: added the summary agent

a2612e1

dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. enhancement New feature or request labels Aug 28, 2024

github-actions bot added the Review effort [1-5]: 3 label Aug 28, 2024

huangyz0918 added 5 commits August 28, 2024 11:12

🚑 quickfix: fixed the lint

4f0b360

Merge branch 'main' into feat/summary-agent

5d72551

Merge branch 'main' into feat/summary-agent

8ee2f92

✨ feat: improved the github information aggregration

e049cd1

✨ feat: added structure scanning function

b6646e9

This was linked to issues Sep 2, 2024

New summary agent for workspace summary #144

Closed

Improve the Github integration functions #146

Closed

✨ feat: added summary agent functions

a68432e

dosubot bot added size:XL This PR changes 500-999 lines, ignoring generated files. and removed size:L This PR changes 100-499 lines, ignoring generated files. labels Sep 2, 2024

huangyz0918 changed the title ~~[WIP] added the summary agent~~ [MRG] added the summary agent Sep 2, 2024

huangyz0918 requested review from leeeizhang and HuaizhengZhang September 2, 2024 19:14

🚑 quickfix: added function search paper with code

5d73630

🐛 fix: resolved conflicts

4103aff

HuaizhengZhang reviewed Sep 3, 2024

View reviewed changes

HuaizhengZhang approved these changes Sep 3, 2024

View reviewed changes

dosubot bot added the lgtm This PR has been approved by a maintainer label Sep 3, 2024

HuaizhengZhang mentioned this pull request Sep 3, 2024

mle report workflow design #164

Closed

huangyz0918 merged commit 8de4cce into main Sep 3, 2024
3 checks passed

huangyz0918 deleted the feat/summary-agent branch September 3, 2024 02:15

HuaizhengZhang mentioned this pull request Sep 8, 2024

Add summary and optimization direction to MLE Advisor #129

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[MRG] added the summary agent #147

[MRG] added the summary agent #147

huangyz0918 commented Aug 28, 2024 •

edited by github-actions bot

Loading

github-actions bot commented Aug 28, 2024

github-actions bot commented Aug 28, 2024 •

edited

Loading

huangyz0918 commented Sep 2, 2024 •

edited

Loading

HuaizhengZhang commented Sep 3, 2024

HuaizhengZhang left a comment

HuaizhengZhang commented Sep 3, 2024

HuaizhengZhang left a comment

leeeizhang commented Sep 3, 2024

huangyz0918 commented Sep 3, 2024

[MRG] added the summary agent #147

[MRG] added the summary agent #147

Conversation

huangyz0918 commented Aug 28, 2024 • edited by github-actions bot Loading

User description

PR Type

Description

Changes walkthrough 📝

github-actions bot commented Aug 28, 2024

PR Reviewer Guide 🔍

github-actions bot commented Aug 28, 2024 • edited Loading

PR Code Suggestions ✨

huangyz0918 commented Sep 2, 2024 • edited Loading

HuaizhengZhang commented Sep 3, 2024

HuaizhengZhang left a comment

Choose a reason for hiding this comment

HuaizhengZhang commented Sep 3, 2024

HuaizhengZhang left a comment

Choose a reason for hiding this comment

leeeizhang commented Sep 3, 2024

huangyz0918 commented Sep 3, 2024

huangyz0918 commented Aug 28, 2024 •

edited by github-actions bot

Loading

github-actions bot commented Aug 28, 2024 •

edited

Loading

huangyz0918 commented Sep 2, 2024 •

edited

Loading