princeton-nlp/SWE-agent: turns LMs (e.g. GPT-4) into software engineering agents that can resolve issues in real GitHub repositories. #915

Open
1 task
ShellLM opened this issue Aug 22, 2024 · 1 comment
Labels
  • AI-Agents: Autonomous AI agents using LLMs
  • CLI-UX: Command Line Interface user experience and best practices
  • code-generation: code generation models and tools like copilot and aider
  • Git-Repo: Source code repository like gitlab or gh
  • github: gh tools like cli, Actions, Issues, Pages
  • human-verified: <INST>NEVER PICK THIS LABEL</INST>
  • llm: Large Language Models
  • llm-benchmarks: testing and benchmarking large language models
  • llm-evaluation: Evaluating Large Language Models performance and behavior through human-written evaluation sets
  • MachineLearning: ML Models, Training and Inference
  • Papers: Research papers
  • Software2.0: Software development driven by AI and neural networks.
  • software-engineering: Best practice for software engineering

Comments


ShellLM commented Aug 22, 2024

SWE-agent/README.md at main · princeton-nlp/SWE-agent

SWE-agent turns LMs (e.g. GPT-4) into software engineering agents that can resolve issues in real GitHub repositories.

On SWE-bench, SWE-agent resolves 12.47% of issues, achieving state-of-the-art performance on the full test set.

We achieve these results by designing simple LM-centric commands and feedback formats that make it easier for the LM to browse the repository and to view, edit, and execute code files. We call this an Agent-Computer Interface (ACI).
Read more about it in our paper!
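
To make the idea of LM-centric commands and feedback formats concrete, here is a minimal sketch of a windowed file viewer. It is an illustration only and not SWE-agent's actual code: the class name `FileViewer`, its methods, and the window size of 5 lines are all assumptions made for this example. Each command returns a short, structured observation that a model could read instead of a full file dump.

```python
from dataclasses import dataclass


@dataclass
class FileViewer:
    """Hypothetical windowed file viewer (illustrative; not SWE-agent's API)."""

    lines: list[str]
    window: int = 5  # lines shown per observation (assumed value)
    top: int = 0     # zero-based index of the first visible line

    def _clamp(self) -> None:
        # Keep the window inside the file, even after the file shrinks.
        max_top = max(len(self.lines) - self.window, 0)
        self.top = max(0, min(self.top, max_top))

    def view(self) -> str:
        """Return the visible window with a position header and 1-based line numbers."""
        end = min(self.top + self.window, len(self.lines))
        header = f"[lines {self.top + 1}-{end} of {len(self.lines)}]"
        body = [f"{i + 1}: {self.lines[i]}" for i in range(self.top, end)]
        return "\n".join([header, *body])

    def scroll_down(self) -> str:
        """Advance the window by one page and return the new observation."""
        self.top += self.window
        self._clamp()
        return self.view()

    def edit(self, start: int, end: int, replacement: list[str]) -> str:
        """Replace 1-based inclusive lines start..end and return feedback."""
        if not (1 <= start <= end <= len(self.lines)):
            # Explicit error feedback instead of a silent failure.
            return f"Error: invalid range {start}-{end}; file has {len(self.lines)} lines."
        self.lines[start - 1:end] = replacement
        self._clamp()
        return self.view()


if __name__ == "__main__":
    viewer = FileViewer([f"line {n}" for n in range(1, 13)])
    print(viewer.view())
    print(viewer.scroll_down())
    print(viewer.edit(8, 8, ["patched"]))
```

The design choice sketched here is that every action yields compact feedback (a position header, numbered lines, or an explicit error message), so the model always knows where it is in the file and whether its last command took effect.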

SWE-agent is built and maintained by researchers from Princeton University.

[Demo video: My Movie 3]

You can use SWE-agent either through a web interface (shown above) or through the command line.

🚀 Get started!

👉 Try SWE-agent in your browser: Open in GitHub Codespaces (more information)

Read our documentation to learn more.

💫 Contributions

  • If you'd like to ask questions, learn about upcoming features, and participate in future development, join our Discord community!
  • If you'd like to contribute to the codebase, we welcome issues and pull requests!

Contact persons: John Yang and Carlos E. Jimenez (Email: johnby@stanford.edu, carlosej@princeton.edu).

📝 Citation

If you found this work helpful, please consider citing it using the following:

@misc{yang2024sweagent,
      title={SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering},
      author={John Yang and Carlos E. Jimenez and Alexander Wettig and Kilian Lieret and Shunyu Yao and Karthik Narasimhan and Ofir Press},
      year={2024},
      eprint={2405.15793},
      archivePrefix={arXiv},
      primaryClass={cs.SE}
}

🪪 License

MIT. Check LICENSE.


Suggested labels

None

ShellLM added the AI-Agents, CLI-UX, code-generation, Git-Repo, github, Papers, and software-engineering labels on Aug 22, 2024

ShellLM commented Aug 22, 2024

Related content

#758 similarity score: 0.92
#682 similarity score: 0.89
#743 similarity score: 0.89
#386 similarity score: 0.89
#762 similarity score: 0.89
#911 similarity score: 0.87

irthomasthomas added the llm, MachineLearning, llm-benchmarks, llm-evaluation, Software2.0, and human-verified labels on Aug 22, 2024
irthomasthomas changed the title from "SWE-agent/README.md at main · princeton-nlp/SWE-agent" to "princeton-nlp/SWE-agent: SWE-agent turns LMs (e.g. GPT-4) into software engineering agents that can resolve issues in real GitHub repositories." on Aug 22, 2024
irthomasthomas changed the title from "princeton-nlp/SWE-agent: SWE-agent turns LMs (e.g. GPT-4) into software engineering agents that can resolve issues in real GitHub repositories." to "princeton-nlp/SWE-agent: turns LMs (e.g. GPT-4) into software engineering agents that can resolve issues in real GitHub repositories." on Aug 22, 2024