Awesome-code-llm

- Survey
- Paper
- Projects
- Products
- Other

Survey

[2409.09030] Agents in Software Engineering: Survey, Landscape, and Vision
Can OpenSource beat ChatGPT? -- A Comparative Study of Large Language Models for Text-to-Code Generation, arXiv, 2409.04164, arxiv, pdf, citation: -1

Luis Mayer, Christian Heumann, Matthias Aßenmacher
Large Language Model-Based Agents for Software Engineering: A Survey, arXiv, 2409.02977, arxiv, pdf, citation: -1

Junwei Liu, Kaixin Wang, Yixuan Chen, Xin Peng, Zhenpeng Chen, Lingming Zhang, Yiling Lou · (Agent4SE-Paper-List - FudanSELab)
A Survey on Large Language Models for Code Generation, arXiv, 2406.00515, arxiv, pdf, citation: -1

Juyong Jiang, Fan Wang, Jiasi Shen, Sungju Kim, Sunghun Kim
Low-Cost Language Models: Survey and Performance Evaluation on Python Code Generation, arXiv, 2404.11160, arxiv, pdf, citation: -1

Jessica López Espejel, Mahaman Sanoussi Yahaya Alassan, Merieme Bouhandi, Walid Dahhane, El Hassane Ettifouri
A Survey of Neural Code Intelligence: Paradigms, Advances and Beyond, arXiv, 2403.14734, arxiv, pdf, citation: -1

Qiushi Sun, Zhirui Chen, Fangzhi Xu, Kanzhi Cheng, Chang Ma, Zhangyue Yin, Jianing Wang, Chengcheng Han, Renyu Zhu, Shuai Yuan · (NCISurvey - QiushiSun)
If LLM Is the Wizard, Then Code Is the Wand: A Survey on How Code Empowers Large Language Models to Serve as Intelligent Agents, arXiv, 2401.00812, arxiv, pdf, citation: -1

Ke Yang, Jiateng Liu, John Wu, Chaoqi Yang, Yi R. Fung, Sha Li, Zixuan Huang, Xu Cao, Xingyao Wang, Yiquan Wang
A Survey on Language Models for Code, arXiv, 2311.07989, arxiv, pdf, citation: -1

Ziyin Zhang, Chaoyu Chen, Bingchang Liu, Cong Liao, Zi Gong, Hang Yu, Jianguo Li, Rui Wang

Paper

[2409.08692] B4: Towards Optimal Assessment of Plausible Code Solutions with Plausible Tests

· (B4 - ZJU-CTAG)
[2409.12186] Qwen2.5-Coder Technical Report
How Do Your Code LLMs Perform? Empowering Code Instruction Tuning with High-Quality Data, arXiv, 2409.03810, arxiv, pdf, citation: -1

Yejie Wang, Keqing He, Dayuan Fu, Zhuoma Gongque, Heyang Xu, Yanxu Chen, Zhexu Wang, Yujia Fu, Guanting Dong, Muxi Diao · (XCoder - banksy23)
FuzzCoder: Byte-level Fuzzing Test via Large Language Model, arXiv, 2409.01944, arxiv, pdf, citation: -1

Liqun Yang, Jian Yang, Chaoren Wei, Guanglin Niu, Ge Zhang, Yunli Wang, Linzheng Chai, Wanxu Xia, Hongcheng Guo, Shun Zhang · (FUZZ-CODER - weimo3221)
Planning In Natural Language Improves LLM Search For Code Generation, arXiv, 2409.03733, arxiv, pdf, citation: -1

Evan Wang, Federico Cassano, Catherine Wu, Yunfeng Bai, Will Song, Vaskar Nath, Ziwen Han, Sean Hendryx, Summer Yue, Hugh Zhang

· (jiqizhixin)
Arctic-SnowCoder: Demystifying High-Quality Data in Code Pretraining, arXiv, 2409.02326, arxiv, pdf, citation: -1

Yuxiang Wei, Hojae Han, Rajhans Samdani
SWE-bench-java: A GitHub Issue Resolving Benchmark for Java, arXiv, 2408.14354, arxiv, pdf, citation: -1

Daoguang Zan, Zhirong Huang, Ailun Yu, Shaoxin Lin, Yifan Shi, Wei Liu, Dong Chen, Zongshuai Qi, Hao Yu, Lei Yu
CodeJudge-Eval: Can Large Language Models be Good Judges in Code Understanding?, arXiv, 2408.10718, arxiv, pdf, citation: -1

Yuwei Zhao, Ziyang Luo, Yuchen Tian, Hongzhan Lin, Weixiang Yan, Annan Li, Jing Ma · (CodeJudge-Eval - CodeLLM-Research)
Diversity Empowers Intelligence: Integrating Expertise of Software Engineering Agents, arXiv, 2408.07060, arxiv, pdf, citation: -1

Kexun Zhang, Weiran Yao, Zuxin Liu, Yihao Feng, Zhiwei Liu, Rithesh Murthy, Tian Lan, Lei Li, Renze Lou, Jiacheng Xu
CodexGraph: Bridging Large Language Models and Code Repositories via Code Graph Databases, arXiv, 2408.03910, arxiv, pdf, citation: -1

Xiangyan Liu, Bo Lan, Zhiyuan Hu, Yang Liu, Zhicheng Zhang, Wenmeng Zhou, Fei Wang, Michael Shieh · (modelscope-agent - modelscope)
AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agents, arXiv, 2407.18901, arxiv, pdf, citation: -1

Harsh Trivedi, Tushar Khot, Mareike Hartmann, Ruskin Manku, Vinty Dong, Edward Li, Shashank Gupta, Ashish Sabharwal, Niranjan Balasubramanian · (appworld - stonybrooknlp)
Scaling Granite Code Models to 128K Context, arXiv, 2407.13739, arxiv, pdf, citation: -1

Matt Stallone, Vaibhav Saxena, Leonid Karlinsky, Bridget McGinn, Tim Bula, Mayank Mishra, Adriana Meza Soria, Gaoyuan Zhang, Aditya Prasad, Yikang Shen · (granite-code-models - ibm-granite)
OpenDevin: An Open Platform for AI Software Developers as Generalist Agents, arXiv, 2407.16741, arxiv, pdf, citation: -1

Xingyao Wang, Boxuan Li, Yufan Song, Frank F. Xu, Xiangru Tang, Mingchen Zhuge, Jiayi Pan, Yueqi Song, Bowen Li, Jaskirat Singh · (OpenDevin - OpenDevin)
SciCode - SciCode Benchmark

· (scicode-bench.github.io - scicode-bench) · (SciCode - scicode-bench)
InverseCoder: Unleashing the Power of Instruction-Tuned Code LLMs with Inverse-Instruct, arXiv, 2407.05700, arxiv, pdf, citation: -1

Yutong Wu, Di Huang, Wenxuan Shi, Wei Wang, Lingzhe Gao, Shihao Liu, Ziyuan Nan, Kaizhao Yuan, Rui Zhang, Xishan Zhang
Is Programming by Example solved by LLMs?, arXiv, 2406.08316, arxiv, pdf, citation: 1

Wen-Ding Li, Kevin Ellis

· (pbe-llm.github)
LLM Critics Help Catch LLM Bugs

· (openai)
Long Code Arena: a Set of Benchmarks for Long-Context Code Models, arXiv, 2406.11612, arxiv, pdf, citation: -1

Egor Bogomolov, Aleksandra Eliseeva, Timur Galimzyanov, Evgeniy Glukhov, Anton Shapkin, Maria Tigina, Yaroslav Golubev, Alexander Kovrigin, Arie van Deursen, Maliheh Izadi · (huggingface)
AgileCoder: Dynamic Collaborative Agents for Software Development based on Agile Methodology, arXiv, 2406.11912, arxiv, pdf, citation: -1

Minh Huynh Nguyen, Thang Phan Chau, Phong X. Nguyen, Nghi D. Q. Bui · (AgileCoder - FSoft-AI4Code)
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence, arXiv, 2406.11931, arxiv, pdf, citation: -1

DeepSeek-AI, Qihao Zhu, Daya Guo, Zhihong Shao, Dejian Yang, Peiyi Wang, Runxin Xu, Y. Wu, Yukun Li, Huazuo Gao
DeepSeek-Coder-V2 - deepseek-ai

· (DeepSeek-Coder-V2 - deepseek-ai)
McEval: Massively Multilingual Code Evaluation, arXiv, 2406.07436, arxiv, pdf, citation: -1

Linzheng Chai, Shukai Liu, Jian Yang, Yuwei Yin, Ke Jin, Jiaheng Liu, Tao Sun, Ge Zhang, Changyu Ren, Hongcheng Guo · (mceval.github)
DevEval: A Manually-Annotated Code Generation Benchmark Aligned with Real-World Code Repositories, arXiv, 2405.19856, arxiv, pdf, citation: -1

Jia Li, Ge Li, Yunfei Zhao, Yongmin Li, Huanyu Liu, Hao Zhu, Lecheng Wang, Kaibo Liu, Zheng Fang, Lanshen Wang
SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering, arXiv, 2405.15793, arxiv, pdf, citation: -1

John Yang, Carlos E. Jimenez, Alexander Wettig, Kilian Lieret, Shunyu Yao, Karthik Narasimhan, Ofir Press
AutoCoder: Enhancing Code Large Language Model with AIEV-Instruct, arXiv, 2405.14906, arxiv, pdf, citation: -1

Bin Lei, Yuchen Li, Qiuwu Chen
LogoMotion: Visually Grounded Code Generation for Content-Aware Animation, arXiv, 2405.07065, arxiv, pdf, citation: -1

Vivian Liu, Rubaiat Habib Kazi, Li-Yi Wei, Matthew Fisher, Timothy Langlois, Seth Walker, Lydia Chilton
Plot2Code: A Comprehensive Benchmark for Evaluating Multi-modal Large Language Models in Code Generation from Scientific Plots, arXiv, 2405.07990, arxiv, pdf, citation: -1

Chengyue Wu, Yixiao Ge, Qiushan Guo, Jiahao Wang, Zhixuan Liang, Zeyu Lu, Ying Shan, Ping Luo · (huggingface)
NExT: Teaching Large Language Models to Reason about Code Execution, arXiv, 2404.14662, arxiv, pdf, citation: -1

Ansong Ni, Miltiadis Allamanis, Arman Cohan, Yinlin Deng, Kensen Shi, Charles Sutton, Pengcheng Yin
LLM-R2: A Large Language Model Enhanced Rule-based Rewrite System for Boosting Query Efficiency, arXiv, 2404.12872, arxiv, pdf, citation: -1

Zhaodonghui Li, Haitao Yuan, Huiming Wang, Gao Cong, Lidong Bing
How Far Can We Go with Practical Function-Level Program Repair?, arXiv, 2404.12833, arxiv, pdf, citation: -1

Jiahong Xiang, Xiaoyang Xu, Fanchu Kong, Mingyuan Wu, Haotian Zhang, Yuqun Zhang
Behavior Trees Enable Structured Programming of Language Model Agents, arXiv, 2404.07439, arxiv, pdf, citation: -1

Richard Kelley

· (dendron - richardkelley)
A Deep Dive into Large Language Models for Automated Bug Localization and Repair, arXiv, 2404.11595, arxiv, pdf, citation: -1

Soneya Binta Hossain, Nan Jiang, Qiang Zhou, Xiaopeng Li, Wen-Hao Chiang, Yingjun Lyu, Hoan Nguyen, Omer Tripp
USACO - princeton-nlp
CodeQwen1.5 - QwenLM

CodeQwen1.5 is the code version of Qwen, the large language model series developed by the Qwen team at Alibaba Cloud. · (qwenlm.github) · (huggingface)
InfiCoder-Eval: Systematically Evaluating the Question-Answering Capabilities of Code Large Language Models, arXiv, 2404.07940, arxiv, pdf, citation: -1

Linyi Li, Shijie Geng, Zhenwen Li, Yibo He, Hao Yu, Ziyue Hua, Guanghan Ning, Siwei Wang, Tao Xie, Hongxia Yang · (infi-coder.github)
AutoCodeRover: Autonomous Program Improvement, arXiv, 2404.05427, arxiv, pdf, citation: -1

Yuntong Zhang, Haifeng Ruan, Zhiyu Fan, Abhik Roychoudhury · (auto-code-rover - nus-apr) · (twitter)
CodeGemma: Open Code Models Based on Gemma

· (huggingface) · (huggingface) · (twitter)
CodeEditorBench: Evaluating Code Editing Capability of Large Language Models, arXiv, 2404.03543, arxiv, pdf, citation: -1

Jiawei Guo, Ziming Li, Xueling Liu, Kaijing Ma, Tianyu Zheng, Zhouliang Yu, Ding Pan, Yizhi LI, Ruibo Liu, Yue Wang
Language Models as Compilers: Simulating Pseudocode Execution Improves Algorithmic Reasoning in Language Models, arXiv, 2404.02575, arxiv, pdf, citation: -1

Hyungjoo Chae, Yeonghyeon Kim, Seungone Kim, Kai Tzu-iunn Ong, Beong-woo Kwak, Moohyeon Kim, Seonghwan Kim, Taeyoon Kwon, Jiwan Chung, Youngjae Yu
EvoCodeBench: An Evolving Code Generation Benchmark Aligned with Real-World Code Repositories, arXiv, 2404.00599, arxiv, pdf, citation: -1

Jia Li, Ge Li, Xuanming Zhang, Yihong Dong, Zhi Jin · (EvoCodeBench - seketeam) · (huggingface)
Improving LLM Code Generation with Grammar Augmentation, arXiv, 2403.01632, arxiv, pdf, citation: -1

Shubham Ugare, Tarun Suresh, Hangoo Kang, Sasa Misailovic, Gagandeep Singh
- SynCode is a framework that improves LLM code generation by using the programming language's grammar (essentially an efficient lookup table constructed offline) to validate syntax and to restrict the LLM's vocabulary to syntactically valid tokens.
DevBench: A Comprehensive Benchmark for Software Development, arXiv, 2403.08604, arxiv, pdf, citation: -1

Bowen Li, Wenhan Wu, Ziwei Tang, Lin Shi, John Yang, Jinyang Li, Shunyu Yao, Chen Qian, Binyuan Hui, Qicheng Zhang · (devBench - open-compass) · (qbitai)
Vulnerability Detection with Code Language Models: How Far Are We?, arXiv, 2403.18624, arxiv, pdf, citation: -1

Yangruibo Ding, Yanjun Fu, Omniyyah Ibrahim, Chawin Sitawarin, Xinyun Chen, Basel Alomair, David Wagner, Baishakhi Ray, Yizheng Chen
A comparison of Human, GPT-3.5, and GPT-4 Performance in a University-Level Coding Course, arXiv, 2403.16977, arxiv, pdf, citation: -1

Will Yeadon, Alex Peach, Craig P. Testrow
Compiler generated feedback for Large Language Models, arXiv, 2403.14714, arxiv, pdf, citation: -1

Dejan Grubisic, Chris Cummins, Volker Seeker, Hugh Leather
LLM4Decompile: Decompiling Binary Code with Large Language Models, arXiv, 2403.05286, arxiv, pdf, citation: -1

Hanzhuo Tan, Qi Luo, Jing Li, Yuqun Zhang · (LLM4Decompile - albertan017)
CodeUltraFeedback: An LLM-as-a-Judge Dataset for Aligning Large Language Models to Coding Preferences, arXiv, 2403.09032, arxiv, pdf, citation: -1

Martin Weyssow, Aton Kamanda, Houari Sahraoui

· (CodeUltraFeedback - martin-wey)
LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code, arXiv, 2403.07974, arxiv, pdf, citation: -1

Naman Jain, King Han, Alex Gu, Wen-Ding Li, Fanjia Yan, Tianjun Zhang, Sida Wang, Armando Solar-Lezama, Koushik Sen, Ion Stoica

· (livecodebench.github)
Design2Code: How Far Are We From Automating Front-End Engineering?, arXiv, 2403.03163, arxiv, pdf, citation: -1

Chenglei Si, Yanzhe Zhang, Zhengyuan Yang, Ruibo Liu, Diyi Yang

· (salt-nlp.github) · (Design2Code - NoviScl)
- Design2Code is a benchmark measuring how well multimodal LLMs convert visual designs into code, evaluated on a curated set of 484 real-world webpages; GPT-4V emerged as the top-performing model.
StarCoder 2 and The Stack v2: The Next Generation, arXiv, 2402.19173, arxiv, pdf, citation: -1

Anton Lozhkov, Raymond Li, Loubna Ben Allal, Federico Cassano, Joel Lamy-Poirier, Nouamane Tazi, Ao Tang, Dmytro Pykhtar, Jiawei Liu, Yuxiang Wei

· (starcoder2 - bigcode-project) · (huggingface)
Copilot Evaluation Harness: Evaluating LLM-Guided Software Programming, arXiv, 2402.14261, arxiv, pdf, citation: -1

Anisha Agarwal, Aaron Chan, Shubham Chandel, Jinu Jang, Shaun Miller, Roshanak Zilouchian Moghaddam, Yevhen Mohylevskyy, Neel Sundaresan, Michele Tufano
OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement, arXiv, 2402.14658, arxiv, pdf, citation: -1

Tianyu Zheng, Ge Zhang, Tianhao Shen, Xueling Liu, Bill Yuchen Lin, Jie Fu, Wenhu Chen, Xiang Yue · (opencodeinterpreter.github) · (OpenCodeInterpreter - OpenCodeInterpreter)
Automated Unit Test Improvement using Large Language Models at Meta, arXiv, 2402.09171, arxiv, pdf, citation: 1

Nadia Alshahwan, Jubin Chheda, Anastasia Finegenova, Beliz Gokkaya, Mark Harman, Inna Harper, Alexandru Marginean, Shubho Sengupta, Eddy Wang

· (cover-agent - Codium-ai)
ARKS: Active Retrieval in Knowledge Soup for Code Generation, arXiv, 2402.12317, arxiv, pdf, citation: -1

Hongjin Su, Shuyang Jiang, Yuhang Lai, Haoyuan Wu, Boao Shi, Che Liu, Qian Liu, Tao Yu
DolphCoder: Echo-Locating Code Large Language Models with Diverse and Multi-Objective Instruction Tuning, arXiv, 2402.09136, arxiv, pdf, citation: -1

Yejie Wang, Keqing He, Guanting Dong, Pei Wang, Weihao Zeng, Muxi Diao, Yutao Mou, Mengdi Zhang, Jingang Wang, Xunliang Cai
Executable Code Actions Elicit Better LLM Agents, arXiv, 2402.01030, arxiv, pdf, citation: -1

Xingyao Wang, Yangyi Chen, Lifan Yuan, Yizhe Zhang, Yunzhu Li, Hao Peng, Heng Ji · (code-act - xingyaoww)
MPIrigen: MPI Code Generation through Domain-Specific Language Models, arXiv, 2402.09126, arxiv, pdf, citation: -1

Nadav Schneider, Niranjan Hasabnis, Vy A. Vo, Tal Kadosh, Neva Krien, Mihai Capotă, Abdul Wasay, Guy Tamir, Ted Willke, Nesreen Ahmed · (MPI-rigen - Scientific-Computing-Lab-NRCN)
Code Representation Learning At Scale, arXiv, 2402.01935, arxiv, pdf, citation: -1

Dejiao Zhang, Wasi Ahmad, Ming Tan, Hantian Ding, Ramesh Nallapati, Dan Roth, Xiaofei Ma, Bing Xiang
StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback, arXiv, 2402.01391, arxiv, pdf, citation: -1

Shihan Dou, Yan Liu, Haoxiang Jia, Limao Xiong, Enyu Zhou, Wei Shen, Junjie Shan, Caishuang Huang, Xiao Wang, Xiaoran Fan
Multi-line AI-assisted Code Authoring, arXiv, 2402.04141, arxiv, pdf, citation: -1

Omer Dunay, Daniel Cheng, Adam Tait, Parth Thakkar, Peter C Rigby, Andy Chiu, Imad Ahmad, Arun Ganesan, Chandra Maddila, Vijayaraghavan Murali
ReGAL: Refactoring Programs to Discover Generalizable Abstractions, arXiv, 2401.16467, arxiv, pdf, citation: -1

Elias Stengel-Eskin, Archiki Prasad, Mohit Bansal
DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence, arXiv, 2401.14196, arxiv, pdf, citation: -1

Daya Guo, Qihao Zhu, Dejian Yang, Zhenda Xie, Kai Dong, Wentao Zhang, Guanting Chen, Xiao Bi, Y. Wu, Y. K. Li
Code Generation with AlphaCodium: From Prompt Engineering to Flow Engineering, arXiv, 2401.08500, arxiv, pdf, citation: -1

Tal Ridnik, Dedy Kredo, Itamar Friedman

· (AlphaCodium - Codium-ai)
JumpCoder: Go Beyond Autoregressive Coder via Online Modification, arXiv, 2401.07870, arxiv, pdf, citation: -1

Mouxiang Chen, Hao Tian, Zhongxin Liu, Xiaoxue Ren, Jianling Sun · (JumpCoder - Keytoyze)
DebugBench: Evaluating Debugging Capability of Large Language Models, arXiv, 2401.04621, arxiv, pdf, citation: -1

Runchu Tian, Yining Ye, Yujia Qin, Xin Cong, Yankai Lin, Zhiyuan Liu, Maosong Sun
CRUXEval: A Benchmark for Code Reasoning, Understanding and Execution, arXiv, 2401.03065, arxiv, pdf, citation: -1

Alex Gu, Baptiste Rozière, Hugh Leather, Armando Solar-Lezama, Gabriel Synnaeve, Sida I. Wang
AST-T5: Structure-Aware Pretraining for Code Generation and Understanding, arXiv, 2401.03003, arxiv, pdf, citation: -1

Linyuan Gong, Mostafa Elhoushi, Alvin Cheung · (ast_t5 - gonglinyuan)
Astraios: Parameter-Efficient Instruction Tuning Code Large Language Models, arXiv, 2401.00788, arxiv, pdf, citation: -1

Terry Yue Zhuo, Armel Zebaze, Nitchakarn Suppattarachai, Leandro von Werra, Harm de Vries, Qian Liu, Niklas Muennighoff
"I Want It That Way": Enabling Interactive Decision Support Using Large Language Models and Constraint Programming, arXiv, 2312.06908, arxiv, pdf, citation: -1

Connor Lawless, Jakob Schoeffer, Lindy Le, Kael Rowan, Shilad Sen, Cristina St. Hill, Jina Suh, Bahar Sarrafzadeh
Purple Llama CyberSecEval: A Secure Coding Benchmark for Language Models, arXiv, 2312.04724, arxiv, pdf, citation: -1

Manish Bhatt, Sahana Chennabasappa, Cyrus Nikolaidis, Shengye Wan, Ivan Evtimov, Dominik Gabi, Daniel Song, Faizan Ahmad, Cornelius Aschermann, Lorenzo Fontana
Chain of Code: Reasoning with a Language Model-Augmented Code Emulator, arXiv, 2312.04474, arxiv, pdf, citation: -1

Chengshu Li, Jacky Liang, Andy Zeng, Xinyun Chen, Karol Hausman, Dorsa Sadigh, Sergey Levine, Li Fei-Fei, Fei Xia, Brian Ichter · (chain-of-code.github)
Magicoder: Source Code Is All You Need, arXiv, 2312.02120, arxiv, pdf, citation: -1

Yuxiang Wei, Zhe Wang, Jiawei Liu, Yifeng Ding, Lingming Zhang · (magicoder - ise-uiuc)
ML-Bench: Large Language Models Leverage Open-source Libraries for Machine Learning Tasks, arXiv, 2311.09835, arxiv, pdf, citation: -1

Yuliang Liu, Xiangru Tang, Zefan Cai, Junjie Lu, Yichi Zhang, Yanjun Shao, Zexuan Deng, Helan Hu, Zengxian Yang, Kaikai An · (ML-bench - gersteinlab) · (drive.google) · (ml-bench.github)
Leveraging Large Language Models for Automated Proof Synthesis in Rust, arXiv, 2311.03739, arxiv, pdf, citation: -1

Jianan Yao, Ziqiao Zhou, Weiteng Chen, Weidong Cui
MFTCoder: Boosting Code LLMs with Multitask Fine-Tuning, arXiv, 2311.02303, arxiv, pdf, citation: -1

Bingchang Liu, Chaoyu Chen, Cong Liao, Zi Gong, Huan Wang, Zhichao Lei, Ming Liang, Dajun Chen, Min Shen, Hailian Zhou · [github]
Personalised Distillation: Empowering Open-Sourced LLMs with Adaptive Learning for Code Generation, arXiv, 2310.18628, arxiv, pdf, citation: -1

Hailin Chen, Amrita Saha, Steven Hoi, Shafiq Joty
CodeFusion: A Pre-trained Diffusion Model for Code Generation, arXiv, 2310.17680, arxiv, pdf, citation: -1

Mukul Singh, José Cambronero, Sumit Gulwani, Vu Le, Carina Negreanu, Gust Verbruggen
ChatCoder: Chat-based Refine Requirement Improves LLMs' Code Generation, arXiv, 2311.00272, arxiv, pdf, citation: -1

Zejun Wang, Jia Li, Ge Li, Zhi Jin

· (mp.weixin.qq)
Large Language Models for Software Engineering: Survey and Open Problems, arXiv, 2310.03533, arxiv, pdf, citation: 1

Angela Fan, Beliz Gokkaya, Mark Harman, Mitya Lyubarskiy, Shubho Sengupta, Shin Yoo, Jie M. Zhang
CrossCodeEval: A Diverse and Multilingual Benchmark for Cross-File Code Completion, arXiv, 2310.11248, arxiv, pdf, citation: -1

Yangruibo Ding, Zijian Wang, Wasi Uddin Ahmad, Hantian Ding, Ming Tan, Nihal Jain, Murali Krishna Ramanathan, Ramesh Nallapati, Parminder Bhatia, Dan Roth
Ranking LLM-Generated Loop Invariants for Program Verification, arXiv, 2310.09342, arxiv, pdf, citation: -1

Saikat Chakraborty, Shuvendu K. Lahiri, Sarah Fakhoury, Madanlal Musuvathi, Akash Lal, Aseem Rastogi, Aditya Senthilnathan, Rahul Sharma, Nikhil Swamy
CodeChain: Towards Modular Code Generation Through Chain of Self-revisions with Representative Sub-modules, arXiv, 2310.08992, arxiv, pdf, citation: -1

Hung Le, Hailin Chen, Amrita Saha, Akash Gokul, Doyen Sahoo, Shafiq Joty
Self-Taught Optimizer (STOP): Recursively Self-Improving Code Generation, arXiv, 2310.02304, arxiv, pdf, citation: -1

Eric Zelikman, Eliana Lorch, Lester Mackey, Adam Tauman Kalai · mp.weixin.qq
CodePlan: Repository-level Coding using LLMs and Planning, arXiv, 2309.12499, arxiv, pdf, citation: -1

Ramakrishna Bairi, Atharv Sonwane, Aditya Kanade, Vageesh D C, Arun Iyer, Suresh Parthasarathy, Sriram Rajamani, B. Ashok, Shashank Shet · mp.weixin.qq
Can ChatGPT replace StackOverflow? A Study on Robustness and Reliability of Large Language Model Code Generation, arXiv, 2308.10335, arxiv, pdf, citation: 3

Li Zhong, Zilong Wang · jiqizhixin · mp.weixin.qq
Can Programming Languages Boost Each Other via Instruction Tuning?, arXiv, 2308.16824, arxiv, pdf, citation: -1

Daoguang Zan, Ailun Yu, Bo Shen, Jiaxin Zhang, Taihong Chen, Bing Geng, Bei Chen, Jichuan Ji, Yafen Yao, Yongji Wang
SoTaNa: The Open-Source Software Development Assistant, arXiv, 2308.13416, arxiv, pdf, citation: -1

Ensheng Shi, Fengji Zhang, Yanlin Wang, Bei Chen, Lun Du, Hongyu Zhang, Shi Han, Dongmei Zhang, Hongbin Sun · github
OctoPack: Instruction Tuning Code Large Language Models, arXiv, 2308.07124, arxiv, pdf, citation: 6

Niklas Muennighoff, Qian Liu, Armel Zebaze, Qinkai Zheng, Binyuan Hui, Terry Yue Zhuo, Swayam Singh, Xiangru Tang, Leandro von Werra, Shayne Longpre
Enhancing Network Management Using Code Generated by Large Language Models, arXiv, 2308.06261, arxiv, pdf, citation: -1

Sathiya Kumaran Mani, Yajie Zhou, Kevin Hsieh, Santiago Segarra, Ranveer Chandra, Srikanth Kandula
PanGu-Coder2: Boosting Large Language Models for Code with Ranking Feedback, arXiv, 2307.14936, arxiv, pdf, citation: 9

Bo Shen, Jiaxin Zhang, Taihong Chen, Daoguang Zan, Bing Geng, An Fu, Muhan Zeng, Ailun Yu, Jichuan Ji, Jingyang Zhao · jiqizhixin
Predicting Code Coverage without Execution, arXiv, 2307.13383, arxiv, pdf, citation: 1

Michele Tufano, Shubham Chandel, Anisha Agarwal, Neel Sundaresan, Colin Clement
Communicative Agents for Software Development, arXiv, 2307.07924, arxiv, pdf, citation: 23

Chen Qian, Xin Cong, Wei Liu, Cheng Yang, Weize Chen, Yusheng Su, Yufan Dang, Jiahao Li, Juyuan Xu, Dahai Li · jiqizhixin
Software Testing with Large Language Models: Survey, Landscape, and Vision, arXiv, 2307.07221, arxiv, pdf, citation: -1

Junjie Wang, Yuchao Huang, Chunyang Chen, Zhe Liu, Song Wang, Qing Wang · (LLM4SoftwareTesting - LLM-Testing) · (qbitai)
RLTF: Reinforcement Learning from Unit Test Feedback, arXiv, 2307.04349, arxiv, pdf, citation: -1

Jiate Liu, Yiqin Zhu, Kaiwen Xiao, Qiang Fu, Xiao Han, Wei Yang, Deheng Ye · github
CodeT5+: Open Code Large Language Models for Code Understanding and Generation, arXiv, 2305.07922, arxiv, pdf, citation: 43

Yue Wang, Hung Le, Akhilesh Deepak Gotmare, Nghi D. Q. Bui, Junnan Li, Steven C. H. Hoi · jiqizhixin
Guiding Language Models of Code with Global Context using Monitors, arXiv, 2306.10763, arxiv, pdf, citation: 3

Lakshya A Agrawal, Aditya Kanade, Navin Goyal, Shuvendu K. Lahiri, Sriram K. Rajamani
RepoFusion: Training Code Models to Understand Your Repository, arXiv, 2306.10998, arxiv, pdf, citation: -1

Disha Shrivastava, Denis Kocetkov, Harm de Vries, Dzmitry Bahdanau, Torsten Scholak
Is Self-Repair a Silver Bullet for Code Generation?, arXiv, 2306.09896, arxiv, pdf, citation: 17

Theo X. Olausson, Jeevana Priya Inala, Chenglong Wang, Jianfeng Gao, Armando Solar-Lezama · mp.weixin.qq
WizardCoder: Empowering Code Large Language Models with Evol-Instruct, arXiv, 2306.08568, arxiv, pdf, citation: 44

Ziyang Luo, Can Xu, Pu Zhao, Qingfeng Sun, Xiubo Geng, Wenxiang Hu, Chongyang Tao, Jing Ma, Qingwei Lin, Daxin Jiang · jiqizhixin
Learning Transformer Programs, arXiv, 2306.01128, arxiv, pdf, citation: 2

Dan Friedman, Alexander Wettig, Danqi Chen · github
Large Language Models of Code Fail at Completing Code with Potential Bugs, arXiv, 2306.03438, arxiv, pdf, citation: 2

Tuan Dinh, Jinman Zhao, Samson Tan, Renato Negrinho, Leonard Lausen, Sheng Zha, George Karypis
Teaching Large Language Models to Self-Debug, arXiv, 2304.05128, arxiv, pdf, citation: 78

Xinyun Chen, Maxwell Lin, Nathanael Schärli, Denny Zhou
InterCode: Standardizing and Benchmarking Interactive Coding with Execution Feedback, arXiv, 2306.14898, arxiv, pdf, citation: 7

John Yang, Akshara Prabhakar, Karthik Narasimhan, Shunyu Yao · intercode-benchmark.github

Projects

GRADIO-CODER - matrixglitch 🤗
Yi-Coder-9B-Chat - 01-ai 🤗
Meet Yi-Coder: A Small but Mighty LLM for Code
DeepSeek-Coder-V2-Instruct-0724 - deepseek-ai 🤗
abacusai (Abacus.AI, Inc.)
Dracarys-Llama-3.1-70B-Instruct - abacusai 🤗
agilecoder - fsoft-ai4code

Incorporating Agile methodology into agents to create complex real-world software
Introducing SWE-bench Verified | OpenAI
Technical Report: Building Genie
Announcing BigCodeBench-Hard, and More
mbpp - google-research-datasets 🤗
mamba-codestral-7B-v0.1 - mistralai 🤗
codegeex4-all-9b - THUDM 🤗
Our Transformers Code Agent beats the GAIA benchmark 🏅
BigCodeBench: Benchmarking Large Language Models on Solving Practical and Challenging Programming Tasks

· (huggingface) · (bigcodebench - bigcode-project)
Codestral: Hello, World! | Mistral AI | Frontier AI in your hands

· (huggingface)
Devon - entropy-research

Devon: An open-source pair programmer
cover-agent - Codium-ai

CodiumAI Cover-Agent: An AI-Powered Tool for Automated Test Generation and Code Coverage Enhancement! 💻🤖🧪🐞 · (x)
code_bagel_hermes-2.5 - Replete-AI 🤗
granite-code-models - ibm-granite

Granite Code Models: A Family of Open Foundation Models for Code Intelligence · (granite-code-models - ibm-granite) · (huggingface)
codegemma-1.1-7b-it - google 🤗
repoqa - evalplus

RepoQA: Evaluating Long-Context Code Understanding · (repoqa - evalplus)
starcoder2-15b-instruct-v0.1 - bigcode 🤗

· (huggingface)

· (huggingface)
leaderboard - livecodebench 🤗

· (huggingface)
tabby - TabbyML

Self-hosted AI coding assistant
aiXcoder-7B - aixcoder-plugin

official repository of aiXcoder-7B Code Large Language Model · (aixcoder) · (qbitai)
plandex - plandex-ai

An AI coding engine for complex tasks
pip-library-etl - PipableAI

This Python package simplifies generating documentation for functions and methods in designated modules or libraries. It enables effortless function call generation from natural language input or existing signatures, and facilitates crafting new ones through the integrated model. Beyond documentation, it seamlessly generates sophisticated SQL too. · (huggingface)
SWE-agent - princeton-nlp

SWE-agent: Agent Computer Interfaces Enable Software Engineering Language Models
- The agent interacts with a specialized terminal that supports file processing and running executable tests; on SWE-bench, SWE-agent resolves 12.29% of issues, reported as state-of-the-art performance on the full test set.
· (swe-agent)
stable-code-instruct-3b - stabilityai 🤗
devika - stitionai

Devika is an Agentic AI Software Engineer that can understand high-level human instructions, break them down into steps, research relevant information, and write code to achieve the given objective. Devika aims to be a competitive open-source alternative to Devin by Cognition AI.
OpenDevin - OpenDevin
StarChat2 15B - a HuggingFaceH4 Collection
oss-fuzz-gen - google

LLM powered fuzzing via OSS-Fuzz.
DeciCoder-6B-Demo - Deci 🤗
LLM4SoftwareTesting - LLM-Testing
stable-code-3b - stabilityai 🤗
Mastering-GitHub-Copilot-for-Paired-Programming - microsoft

A 6-lesson course teaching everything you need to know about harnessing GitHub Copilot as an AI pair programming resource.
sweep - sweepai

Sweep: AI-powered Junior Developer for small features and bug fixes.
wizardlm - nlpxucan

Family of instruction-following LLMs powered by Evol-Instruct: WizardLM, WizardCoder
codellama-13b-oasst-sft-v10 - OpenAssistant 🤗
DeepSeek-Coder - deepseek-ai

DeepSeek Coder: Let the Code Write Itself
CodeT - microsoft
SolidGPT - AI-Citizen

Chat with your code repository, ask repository-level code questions, and discuss your requirements. The AI scans and learns your code repository to provide repository-level answers 🧱 🧱
codeshell - WisdomShell

A series of code large language models developed by PKU-KCL · mp.weixin.qq
replit-code-v1_5-3b - replit 🤗
gpt-pilot - Pythagora-io

PoC for a scalable dev tool that writes entire apps from scratch while the developer oversees the implementation
codellama - facebookresearch

Inference code for CodeLlama models · huggingface · huggingface · github · jiqizhixin · huggingface

· (promptingguide)
CodeLlama-70b-hf-4bit-MLX - mlx-community 🤗
sqlcoder - defog-ai

SoTA LLM for converting natural language questions to SQL queries · [jiqizhixin]
DeciCoder-1b - Deci 🤗
MiniChain - srush

A tiny library for coding with large language models.
stablecode-completion-alpha-3b - stabilityai 🤗

· qbitai
continue - continuedev

⏩ the open-source autopilot for software development—a VS Code extension that brings the power of ChatGPT to your IDE · jiqizhixin
CodeGeeX2 - THUDM

CodeGeeX2: A More Powerful Multilingual Code Generation Model
codeinterpreter-api - shroominic

Open source implementation of the ChatGPT Code Interpreter 👾
AmadeusGPT - AdaptiveMotorControlLab

We turn natural language descriptions of behaviors into machine-executable code
aider - paul-gauthier

aider is GPT-powered coding in your terminal
gpt-migrate - 0xpayne

Easily migrate your codebase from one framework or language to another.
gpt-engineer - AntonOsika

Specify what you want it to build, the AI asks for clarification, and then builds it.

Products

Announcing Replit Agent in early access
Codegen | Reshape Your Codebase
Introducing Zed AI

Other

LLMs are bad at returning code in JSON | aider
Code Droid Technical Report
Introducing Devin, the first AI software engineer
phind.com/blog/introducing-phind-70b
Personal Copilot: Train Your Own Coding Assistant
Introducing SafeCoder
SafeCoder vs. Closed-source Code Assistants
Demystifying the Code Models StarCoder & CodeLlama
AGI Arrival Faction closed-door technical meeting, 20230826 - YouTube
