📝 A curated list of papers on Personalized Multimodal Models and Personalized Representation Learning 📚
Problem setting: given 3-5 images of a novel concept/subject (e.g., a pet named <bo>), can we personalize Large Multimodal Models so that (1) they retain their original capabilities (e.g., "Describe a dog") while (2) gaining tailored capabilities for the novel concept (e.g., "Describe <bo>")?
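To make the setting concrete, here is a minimal, hedged sketch of the common "learn only a new concept embedding, keep the base model frozen" recipe that many of the listed works follow. Everything here is a toy stand-in (the encoder, the images, the update rule are all illustrative assumptions, not any specific paper's method):

```python
# Toy sketch of few-shot concept personalization (all names illustrative):
# given a handful of images of a novel concept "<bo>", learn ONLY a new
# concept embedding while the base model stays frozen, so that general
# capabilities are preserved.
import random

random.seed(0)
DIM = 8

# Frozen "base model": a fixed toy image encoder (stands in for the LMM's vision tower).
base_encoder_weights = [random.gauss(0, 1) for _ in range(DIM)]
frozen_copy = list(base_encoder_weights)  # snapshot, to check nothing was updated

def encode_image(pixels):
    # Toy encoder: elementwise product with the frozen weights (never trained).
    return [w * p for w, p in zip(base_encoder_weights, pixels)]

# 4 "user-provided images" of <bo> (random toy feature vectors).
user_images = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(4)]

# The ONLY trainable parameter: a new soft-token embedding for <bo>.
bo_embedding = [0.0] * DIM

lr = 0.1
for step in range(100):
    for img in user_images:
        feat = encode_image(img)
        # Squared-error gradient step pulling <bo>'s embedding toward the features.
        bo_embedding = [e - lr * 2 * (e - f) for e, f in zip(bo_embedding, feat)]

# After training: bo_embedding has adapted to the user's concept,
# while base_encoder_weights are bit-for-bit unchanged.
```

The point of the sketch is the asymmetry: the update loop only ever writes to `bo_embedding`, which is why the "Describe a dog" capability is untouched while "Describe <bo>" becomes answerable.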
Please feel free to open a pull request or an issue to add or correct anything. Any help or clarification is greatly appreciated!
*🙋‍♀️ Personalization has been extensively explored in AI/ML/CV... It's now time to personalize Large Multimodal Models! 🙋‍♀️*
Over the years, we’ve witnessed the evolution of personalization across various tasks (e.g., object segmentation, image generation). Now, with the rise of Large Multimodal Models (LMMs), we have the opportunity to personalize these generalist, large-scale AI systems. It’s time to take the leap and bring personalization into the realm of Large Multimodal Models, making them not only powerful but also user-specific!
^ The caption above was actually generated by GPT-4o: I fed it the figure and asked it to write a caption, haha!
(This figure was created by me. If anything is incorrect, please feel free to correct me. Thank you!)
Papers
⚠️ Minor note: the works listed below focus on the setting where users provide 3-5 images and the system must learn the depicted concepts. There is also research on related subtopics (e.g., role-playing, personas); for those topics, this repo might provide better coverage.