awesome-llm-projects

awesome-llm-papers | awesome-llm-datasets

🤖 Our goal is to establish and cultivate a comprehensive collection of projects that demonstrate the versatility and potential of LLM applications.

Projects index:

Projects

‼️ Attention: A project name that starts with * means the project is not open source and has not yet released an application.

🦄 LLMs

Command-R: Command-R is a scalable generative model targeting RAG and Tool Use to enable production-scale AI for enterprise.
Grok-1: Grok-1 is a 314 billion parameter Mixture-of-Experts model trained from scratch by xAI.
Mistral: Mistral AI releases open-source LLMs, including Mistral 7B, Mixtral 8x7B, and Codestral.
DBRX: DBRX is an open, general-purpose LLM created by Databricks.
mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding.
OpenChat: Advancing Open-source Language Models with Imperfect Data
WizardLM: Empowering Large Pre-Trained Language Models to Follow Complex Instructions
CodeGemma-7b: An official Google release for code LLMs.
Awesome-Chinese-LLM: Includes many Open Source Chinese LLMs.
llama3: Meta's newly released LLMs.
Snowflake Arctic: Arctic is a dense-MoE Hybrid transformer architecture pre-trained from scratch by the Snowflake AI Research Team. It is evaluated on an average of Coding (HumanEval+ and MBPP+), SQL Generation (Spider), and Instruction Following (IFEval).
DeepSeek-V2-Chat: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
Qwen 1.8B,7B,14B,72B: Chat & pretrained large language model proposed by Alibaba Cloud.
Granite Code Models 3b,8b,20b,34b: Granite Code Models, IBM's open-source code models: A Family of Open Foundation Models for Code Intelligence
Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
MiniCPM-V 2.0: An Efficient End-side MLLM with Strong OCR and Understanding Capabilities
Stable Audio Open 1.0: Stable Audio Open 1.0 generates variable-length (up to 47s) stereo audio at 44.1kHz from text prompts.
Qwen2-0.5B, Qwen2-1.5B, Qwen2-7B, Qwen2-57B-A14B, and Qwen2-72B: Qwen2 is the large language model series developed by the Qwen team at Alibaba Cloud.
GLM-4-9B: GLM-4 series: Open Multilingual Multimodal Chat LMs
AutoCoder: A new model designed for the Code generation task. Its test accuracy on the HumanEval base dataset surpasses that of GPT-4 Turbo (April 2024) and GPT-4o.
Nemotron 4 340B: NVIDIA's open models for Synthetic Data Generation (SDG), including Base, Instruct, and Reward models.

🏆 Benchmarks Leaderboard

open_llm_leaderboard: The Hugging Face hub organization that maintains the Open LLM Leaderboard.
LMSys Chatbot Arena Leaderboard: A crowdsourced, randomized battle platform that uses user votes to compute Elo ratings.
MTEB Leaderboard: Massive Text Embedding Benchmark (MTEB) Leaderboard.
LLM-Perf Leaderboard: Aims to benchmark the performance (latency, throughput, and memory) of LLMs across different hardware, backends, and optimizations using Optimum-Benchmark and Optimum flavors.
Big Code Models Leaderboard: Compares the performance of base multilingual code generation models on the HumanEval benchmark and MultiPL-E.
Open ASR Leaderboard: Ranks and evaluates speech recognition models on the Hugging Face Hub.
Toolbench Leaderboard: An evaluation for LLM tool manipulation capabilities.
OpenCompass 2.0 LLM Leaderboard: Provides comprehensive, objective, and neutral scores and rankings for top-tier large language models and multimodal models.
Open Ko-LLM Leaderboard: Evaluates the performance of Korean Large Language Models (LLMs).
Occiglot Euro LLM Leaderboard: The Occiglot euro LLM leaderboard evaluates a subset of the tasks from the Open LLM Leaderboard machine-translated into the four main languages from the Okapi benchmark and Belebele (French, Italian, German and Spanish).
BigCodeBench Leaderboard: BigCodeBench evaluates LLMs with practical and challenging programming tasks.

💬 ChatBot

ChatGPT: ChatGPT is a free-to-use AI system. Use it for engaging conversations, gain insights, automate tasks, and witness the future of AI, all in one place.
Gemini: Bard is now Gemini. Get help with writing, planning, learning, and more from Google AI.
character.ai: Where intelligent agents live!
Claude: Talk with Claude, an AI assistant from Anthropic.
Mistral AI: Mistral AI aims to make frontier AI ubiquitous and to provide tailor-made AI to all builders.

🗣️ Voice

Including text to speech, speech to text, speech to speech, and voice generation:

*Vall-E: A neural codec language model for speech synthesis.
ElevenLabs: AI Voice Generator & Text to Speech
Whisper: Robust Speech Recognition via Large-Scale Weak Supervision
Krisp: Krisp cancels background noise and reduces echo during your calls.
Voicemod: Voicemod is a free real-time voice changer and soundboard available on both Windows and macOS.
*NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models.
VoiceCraft: VoiceCraft is Zero-Shot Speech Editing and Text-to-Speech in the Wild.
Parler-TTS: Parler-TTS is a lightweight text-to-speech (TTS) model that can generate high-quality, natural-sounding speech in the style of a given speaker (gender, pitch, speaking style, etc.).
Sounds: Sounds for creators, game developers, artists, and video makers. Experience the best AI Sound FX generator.
VIVA: VIVA is an AI-powered creative visual design platform.
ChatTTS: ChatTTS is a generative speech model for daily dialogue.
StreamSpeech: StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.
Dream Machine: Dream Machine is an AI model that makes high quality, realistic videos fast from text and images.

🎵 Music

Suno: Suno is an innovative tool designed for music creation, leveraging artificial intelligence to transform text input into original songs
Udio: Make your music. Discover, create, and share music with the world.

🌄 Image

Including text to image, image to image, and animate:

DALL-E: Creating images from text.
Stable Diffusion: Stable Diffusion is a deep learning, text-to-image model.
Midjourney: Midjourney is a generative artificial intelligence program and service that creates images from natural language descriptions, similar to other AI technologies like OpenAI's DALL-E and Stability AI's Stable Diffusion.
StickerBaker: StickerBaker is an open-source tool that allows users to create stickers using AI technology.
*PIXART-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation.
ResAdapter: ResAdapter is a plug-and-play resolution adapter for enabling diffusion models of arbitrary style domains to generate resolution-free images: no additional training, no additional inference and no style transfer.
FaceChain: FaceChain is a deep-learning toolchain for generating your Digital-Twin.
APISR: Anime Production Inspired Real-World Anime Super-Resolution (CVPR 2024)
OMG: Occlusion-friendly Personalized Multi-concept Generation In Diffusion Models: OMG is a framework for multi-concept image generation
BasicPBC: Learning Inclusion Matching for Animation Paint Bucket Colorization.
DesignEdit: Multi-Layered Latent Decomposition and Fusion for Unified & Accurate Image Editing.
VAR: A new visual generation method that elevates GPT-style models beyond diffusion, with scaling laws observed.
Ideogram: Ideogram is a free-to-use AI tool that generates realistic images, posters, logos and more.
MagicClothing: Focus on controllable garment-driven image synthesis.
*IntrinsicAnything: Learning Diffusion Priors for Inverse Rendering Under Unknown Illumination.
HeyBeauty: Discover Beauty with AI, Make Fashion redefined.
IC-Light: IC-Light is a project to manipulate the illumination of images.
Logo Diffusion: Create Logos in Seconds With Generative A.I.
MistoLine: A Versatile and Robust SDXL-ControlNet Model for Adaptable Line Art Conditioning
InstaDrag: Lightning Fast and Accurate Drag-based Image Editing Emerging from Videos
Omost: Omost is a project to convert LLM's coding capability to image generation (or more accurately, image composing) capability.
ToonCrafter: ToonCrafter can interpolate two cartoon images by leveraging the pre-trained image-to-video diffusion priors.
Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation
UniAnimate: Taming Unified Video Diffusion Models for Consistent Human Image Animation.
Krea: Generate and enhance images and videos using powerful AI for free.
Leonardo AI: Leonardo AI is a generative AI tool that lets you craft top-tier visual assets.
MimicBrush: Zero-shot Image Editing with Reference Imitation
SketchDeco: Decorating B&W Sketches with Colour.

🧸 3D Model

Including text to 3D model:

TripoSR: TripoSR is a fast and feed-forward 3D generative model developed in collaboration between Stability AI and Tripo AI.
PantoMatrix: Talking Face and Body Animation Generation
Gaussian Head Avatar: Ultra High-fidelity Head Avatar via Dynamic Gaussians.
*Make-It-Vivid: Dressing Your Animatable Biped Cartoon Characters from Text.
*CAT3D: CAT3D: Create Anything in 3D with Multi-View Diffusion Models
DiffTF: Large-Vocabulary 3D Diffusion Model with Transformer
DreamMat: High-quality PBR Material Generation with Geometry- and Light-aware Diffusion Models
Unique3D: High-Quality and Efficient 3D Mesh Generation from a Single Image.
Era3D: High-Resolution Multiview Diffusion using Efficient Row-wise Attention.

🎥 Video

Including text to video, image to video, video to video:

*Sora: Creating video from text. Sora is an AI model that can create realistic and imaginative scenes from text instructions.
*Emote Portrait Alive: Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions
Runway: Runway is an applied AI research company shaping the next era of art, entertainment and human creativity.
HeyGen: HeyGen is an innovative video platform that harnesses the power of generative AI to streamline your video creation process.
AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animations
MuseV: Infinite-length and High Fidelity Virtual Human Video Generation with Visual Conditioned Parallel Denoising.
CameraCtrl: Enabling Camera Control for Text-to-Video Generation.
Pika: Pika is the idea-to-video platform that sets your creativity in motion.
*VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time.
OpenVoice: Instant voice cloning by MyShell.
Veo: Veo is Google's most capable video generation model to date.
AniTalker: Animate Vivid and Diverse Talking Faces through Identity-Decoupled Facial Motion Encoding
Pandora: Towards General World Model with Natural Language Actions and Video States
EasyAnimate: An End-to-End Solution for High-Resolution and Long Video Generation Based on Transformer Diffusion.
V-Express: V-Express aims to generate a talking head video under the control of a reference image, an audio, and a sequence of V-Kps images.
MusePose: A Pose-Driven Image-to-Video Framework for Virtual Human Generation
Hedra: Hedra is a video content generation platform and social media platform that allows individuals to edit, export and share AI-generated videos and video components.
MASA: Matching Anything by Segmenting Anything

🕸️ Search Engine

Including search engine, web browser:

Phind: A search engine that generates answers from web search results and LLMs, and lets users adjust the weighting of search result sources.
Devv: The next generation AI search engine for developers. Solve your programming problems in seconds.
Perplexity: Perplexity AI unlocks the power of knowledge with information discovery and sharing.
Arc: Effortlessly organize everything you do online — work, study, hobbies — all in one window with Spaces and Profiles.
Perplexica: Perplexica is an AI-powered search engine. It is an open-source alternative to Perplexity AI.
Reor: Private & offline AI personal knowledge management app.

👩🏽‍💻 Develop Assistant

GitHub Copilot: Get AI-based suggestions in real time.
Codeium: Codeium offers best-in-class AI code completion, search, and chat, all for free. It supports 70+ languages and integrates with your favorite IDEs, with lightning-fast speeds and state-of-the-art suggestion quality.
Amazon CodeWhisperer: Amazon CodeWhisperer is an AI-powered productivity tool for the IDE and command line that generates code suggestions based on comments and existing code.
Transformer Debugger: Transformer Debugger (TDB) is a tool developed by OpenAI's Superalignment team with the goal of supporting investigations into specific behaviors of small language models. The tool combines automated interpretability techniques with sparse autoencoders.
CopilotKit: A framework for building custom AI Copilots 🤖: in-app AI chatbots, in-app AI agents, and AI-powered textareas.
Codium: CodiumAI’s first tool is an IDE extension that interacts with the developer to generate meaningful tests and code explanations for busy devs.

🧠 AI Agent

AgentGPT: Assemble, configure, and deploy autonomous AI Agents in your browser.
*Devin: Devin is presented as the first AI software engineer, setting a new state of the art on the SWE-bench coding benchmark.
OpenDevin: An autonomous AI software engineer who is capable of executing complex engineering tasks and collaborating actively with users on software development projects.
Plandex: An AI coding engine for complex tasks.
Devika: An agentic AI software engineer that can understand high-level human instructions, break them down into steps, research relevant information, and write code to achieve the given objective.
Aider: Aider is AI pair programming in your terminal.
Agent Protocol: A single common interface for communicating with agents
Devon: An open-source pair programmer
PR-Agent: CodiumAI PR-Agent: An AI-Powered 🤖 Tool for Automated Pull Request Analysis, Feedback, Suggestions and More!
FinRobot: An Open-Source AI Agent Platform for Financial Applications using LLMs
AgentQL: Build AI agents using a query language for precise web and app automation
Husky: A Unified, Open-Source Language Agent for Multi-Step Reasoning
Translation Agent: Agentic translation using reflection workflow
DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement

🤼 Multi-Agent Collaboration

MetaGPT: MetaGPT takes a one line requirement as input and outputs user stories / competitive analysis / requirements / data structures / APIs / documents, etc.
ChatDev: The primary objective of ChatDev is to offer an easy-to-use, highly customizable and extendable framework, which is based on large language models (LLMs) and serves as an ideal scenario for studying collective intelligence.

💻 Terminal

Warp: Warp is a tool designed to enhance the terminal experience by providing AI-powered assistance for command lookups and allowing users to state their objectives in plain English.
Gorilla: Gorilla CLI powers your command-line interactions with a user-centric tool.
CodeWhisperer Cli: CodeWhisperer for command line adds IDE-style completions for hundreds of popular CLIs such as Git, npm, Docker, MongoDB Atlas, and the AWS CLI. Previously known as Fig.
Open Interpreter: A natural language interface for computers.

🚀 Launcher

Raycast: Raycast is a blazingly fast, totally extendable launcher. It lets you complete tasks, calculate, share common links, and much more.

📊 PPT / Keynote

Gamma: A new medium for presenting ideas, powered by AI. Create beautiful, engaging content with none of the formatting and design work.

📰 Web Sites

Dora: Design and publish stunning 3D & animated websites effortlessly, without the need for coding.
Design2Code: How Far Are We From Automating Front-End Engineering
Tempo: Tempo generates and edits high-quality React code directly in your codebase so you can ship UIs in minutes.
OpenUI: OpenUI lets you describe UI using your imagination, then see it rendered live.
v0: Generate UI with shadcn/ui from simple text prompts and images.

🗜️ Hardware

Groq: Groq is on a mission to set the standard for GenAI inference speed, helping real-time AI applications come to life today.
*LOOI Root: Turn Your Smartphone into a Desktop Robot
Friend: Open-Source AI Wearable with 24h+ on single charge
insight: An AI wearable called insight, built from a Raspberry Pi that was lying around.
Limitless: Personalized AI powered by what you’ve seen, said, and heard.
Frame AI glasses: Open-source eyewear.
Rabbit R1: Your pocket companion.
*Haptic Source-effector: Full-body Haptics via Non-invasive Brain Stimulation
OpenGlass: Turn any glasses into AI-powered smart glasses
Octo: Octo is a transformer-based robot policy trained on a diverse mix of 800k robot trajectories.
HumanPlus: Humanoid Shadowing and Imitation from Humans
LeRobot: LeRobot: End-to-end Learning for Real-World Robotics in Pytorch

⌨️ Prompt Engineering

Prompt-Engineering-Guide: Guides, papers, lecture, notebooks and resources for prompt engineering.
Prompt Library: The prompt library of Dr. Ethan Mollick and Dr. Lilach Mollick of the Wharton School of the University of Pennsylvania.

🤯 LLMs Inference and Serving

vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs.
Text Generation Inference: Large Language Model Text Generation Inference
Ollama: Get up and running with large language models locally.
LM Studio: Discover, download, and run local LLMs.

📋 Others

Cradle: The Cradle framework is a first attempt at General Computer Control (GCC). Cradle supports agents to ace any computer task by enabling strong reasoning abilities, self-improvement, and skill curation, in a standardized general environment with minimal requirements.
LLMPerf: A tool for evaluating the performance of LLM APIs. Also provides a leaderboard for LLMs.
WebLINX: Real-world website navigation with multi-turn dialogue.
Latent Box: A collection of awesome-lists for AI, creativity and art.
LLM Transparency Tool: LLM Transparency Tool (LLM-TT), an open-source interactive toolkit for analyzing internal workings of Transformer-based language models.
LLM Visualization: A visualization and walkthrough of the LLM algorithm that backs OpenAI's ChatGPT. Explore the algorithm down to every add & multiply, seeing the whole process in action.
HippoRAG: HippoRAG is a novel RAG framework inspired by human long-term memory that enables LLMs to continuously integrate knowledge across external documents.
Vanna: Vanna is an MIT-licensed open-source Python RAG (Retrieval-Augmented Generation) framework for SQL generation and related functionality.
Rewind: Rewind is a personalized AI powered by everything you’ve seen, said, or heard. Your colleagues will wonder how you do it all.
Cursor: The AI Code Editor.
Wordware: A web-hosted IDE where non-technical domain experts work with AI Engineers to build task-specific AI agents. It approaches prompting as a new programming language rather than low/no-code blocks.

InfiniteAICreations/awesome-llm-projects
